Message ID | 20190801175116.28221-1-nusiddiq@redhat.com |
---|---|
State | Accepted |
Series | [ovs-dev,branch2.12] ovn: Add a new logical switch port type - 'virtual' |
Bleep bloop.  Greetings Numan Siddique, I am a robot and I have tried out your patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.

checkpatch:
WARNING: Line is 236 characters long (recommended limit is 79)
#516 FILE: ovn/northd/ovn-northd.8.xml:530:
<code>inport == <var>P</var> && !is_chassis_resident(<var>V</var>) && ((arp.op == 1 && arp.spa == <var>VIP</var> && arp.tpa == <var>VIP</var>) || (arp.op == 2 && arp.spa == <var>VIP</var>))</code>

Lines checked: 1440, Warnings: 1, Errors: 0

Please check this out.  If you feel there has been an error, please email aconole@bytheb.org

Thanks,
0-day Robot

On Thu, 1 Aug 2019 at 10:52, <nusiddiq@redhat.com> wrote:

> From: Numan Siddique <nusiddiq@redhat.com>
>
> This new type is added for the following reasons:
>
>   - When a load balancer is created in an OpenStack deployment with Octavia
>     service, it creates a logical port 'VIP' for the virtual ip.
>
>   - This logical port is not bound to any VIF.
>
>   - Octavia service creates a service VM (with another logical port 'P' which
>     belongs to the same logical switch)
>
>   - The virtual ip 'VIP' is configured on this service VM.
>
>   - This service VM provides the load balancing for the VIP with the configured
>     backend IPs.
>
>   - Octavia service can be configured to create few service VMs with
>     active-standby mode with the active VM configured with the VIP. The VIP
>     can move between these service nodes.
>
> Presently there are few problems:
>
>   - When a floating ip (externally reachable IP) is associated to the VIP and if
>     the compute nodes have external connectivity then the external traffic cannot
>     reach the VIP using the floating ip as the VIP logical port would be down.
>     dnat_and_snat entry in NAT table for this vip will have 'external_mac' and
>     'logical_port' configured.
>
>   - The only way to make it work is to clear the 'external_mac' entry so that
>     the gateway chassis does the DNAT for the VIP.
>
> To solve these problems, this patch proposes a new logical port type - virtual.
> CMS when creating the logical port for the VIP, should
>
>   - set the type as 'virtual'
>
>   - configure the VIP in the options - Logical_Switch_Port.options:virtual-ip
>
>   - And set the virtual parents in the options
>     Logical_Switch_Port.options:virtual-parents.
>     These virtual parents are the one which can be configured with the VIP.
>
> If suppose the virtual_ip is configured to 10.0.0.10 on a virtual logical port 'sw0-vip'
> and the virtual_parents are set to - [sw0-p1, sw0-p2] then below logical flows are added in the
> lsp_in_arp_rsp logical switch pipeline
>
>  - table=11(ls_in_arp_rsp), priority=100,
>    match=(inport == "sw0-p1" && !is_chassis_resident("sw0-vip") &&
>           ((arp.op == 1 && arp.spa == 10.0.0.10 && arp.tpa == 10.0.0.10) ||
>            (arp.op == 2 && arp.spa == 10.0.0.10))),
>    action=(bind_vport("sw0-vip", inport); next;)
>  - table=11(ls_in_arp_rsp), priority=100,
>    match=(inport == "sw0-p2" && !is_chassis_resident("sw0-vip") &&
>           ((arp.op == 1 && arp.spa == 10.0.0.10 && arp.tpa == 10.0.0.10) ||
>            (arp.op == 2 && arp.spa == 10.0.0.10))),
>    action=(bind_vport("sw0-vip", inport); next;)
>
> The action bind_vport will claim the logical port - sw0-vip on the chassis where this action
> is executed. Since the port - sw0-vip is claimed by a chassis, the dnat_and_snat rule for
> the VIP will be handled by the compute node.
>
> Co-authored-by: Ben Pfaff <blp@ovn.org>
> Signed-off-by: Ben Pfaff <blp@ovn.org>
> Acked-by: Gurucharan Shetty <guru@ovn.org>
> Acked-by: Mark Michelson <mmichels@redhat.com>
> Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
>

Ben is on vacation. So I took the liberty to commit it.

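For readers following the thread, the per-parent flows listed in the commit message can be sketched in standalone C. This is only an illustration under stated assumptions, not code from the patch: `format_vip_match` and `print_vip_flows` are hypothetical helper names, the table number and stage name are copied from the example above, and unlike ovn-northd the sketch does not check that each parent exists or belongs to the same logical switch as the virtual port.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: formats the ARP match for one virtual parent.
 * Returns what snprintf() returns, so callers can detect truncation. */
static int
format_vip_match(char *buf, size_t size, const char *parent,
                 const char *vport, const char *vip)
{
    return snprintf(buf, size,
                    "inport == \"%s\" && !is_chassis_resident(\"%s\") && "
                    "((arp.op == 1 && arp.spa == %s && arp.tpa == %s) || "
                    "(arp.op == 2 && arp.spa == %s))",
                    parent, vport, vip, vip, vip);
}

/* Hypothetical helper: walks a comma-separated "virtual-parents" value and
 * prints one priority-100 flow per parent.  Empty tokens and over-long
 * names are skipped.  Returns the number of flows printed. */
static size_t
print_vip_flows(FILE *out, const char *parents, const char *vport,
                const char *vip)
{
    size_t n_flows = 0;
    const char *p = parents;

    while (*p) {
        size_t len = strcspn(p, ",");   /* Length of the current token. */
        char parent[64];

        if (len > 0 && len < sizeof parent) {
            char match[512];

            memcpy(parent, p, len);
            parent[len] = '\0';

            int n = format_vip_match(match, sizeof match, parent, vport, vip);
            if (n > 0 && (size_t) n < sizeof match) {
                fprintf(out, "table=11(ls_in_arp_rsp), priority=100, "
                        "match=(%s), "
                        "action=(bind_vport(\"%s\", inport); next;)\n",
                        match, vport);
                n_flows++;
            }
        }

        p += len;
        if (*p == ',') {
            p++;                        /* Step over the separator. */
        }
    }
    return n_flows;
}
```

Calling `print_vip_flows(stdout, "sw0-p1,sw0-p2", "sw0-vip", "10.0.0.10")` would print one flow for each of the two parents named in the commit message. The sketch splits the list with `strcspn` rather than `strtok_r` only to stay within standard C; the patch itself uses `strtok_r` on a copy of the option string.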
> > (cherry picked from ovn commit 054f4c85c413e20d893e10ba053ec52ac15db49c) > --- > NEWS | 1 + > include/ovn/actions.h | 18 ++- > ovn/controller/binding.c | 30 +++- > ovn/controller/pinctrl.c | 174 ++++++++++++++++++++ > ovn/lib/actions.c | 59 +++++++ > ovn/lib/ovn-util.c | 1 + > ovn/northd/ovn-northd.8.xml | 61 ++++++- > ovn/northd/ovn-northd.c | 306 +++++++++++++++++++++++++++--------- > ovn/ovn-nb.xml | 45 ++++++ > ovn/ovn-sb.ovsschema | 6 +- > ovn/ovn-sb.xml | 46 ++++++ > ovn/utilities/ovn-trace.c | 3 + > tests/ovn.at | 290 ++++++++++++++++++++++++++++++++++ > tests/test-ovn.c | 1 + > 14 files changed, 954 insertions(+), 87 deletions(-) > > diff --git a/NEWS b/NEWS > index 8cf850823..be3ea42b4 100644 > --- a/NEWS > +++ b/NEWS > @@ -60,6 +60,7 @@ v2.12.0 - xx xxx xxxx > logical groups which results in tunnels only been formed between > members of the same transport zone(s). > * Support for IGMP Snooping and IGMP Querier. > + * Support for new logical switch port type - 'virtual'. > - New QoS type "linux-netem" on Linux. > - Added support for TLS Server Name Indication (SNI). > - Linux datapath: > diff --git a/include/ovn/actions.h b/include/ovn/actions.h > index 63d3907d8..0ca06537c 100644 > --- a/include/ovn/actions.h > +++ b/include/ovn/actions.h > @@ -85,7 +85,8 @@ struct ovn_extend_table; > OVNACT(SET_METER, ovnact_set_meter) \ > OVNACT(OVNFIELD_LOAD, ovnact_load) \ > OVNACT(CHECK_PKT_LARGER, ovnact_check_pkt_larger) \ > - OVNACT(TRIGGER_EVENT, ovnact_controller_event) > + OVNACT(TRIGGER_EVENT, ovnact_controller_event) \ > + OVNACT(BIND_VPORT, ovnact_bind_vport) > > /* enum ovnact_type, with a member OVNACT_<ENUM> for each action. */ > enum OVS_PACKED_ENUM ovnact_type { > @@ -328,6 +329,13 @@ struct ovnact_controller_event { > size_t n_options; > }; > > +/* OVNACT_BIND_VPORT. */ > +struct ovnact_bind_vport { > + struct ovnact ovnact; > + char *vport; > + struct expr_field vport_parent; /* Logical virtual port's port > name. 
*/ > +}; > + > /* Internal use by the helpers below. */ > void ovnact_init(struct ovnact *, enum ovnact_type, size_t len); > void *ovnact_put(struct ofpbuf *, enum ovnact_type, size_t len); > @@ -505,6 +513,14 @@ enum action_opcode { > * Snoop IGMP, learn the multicast participants > */ > ACTION_OPCODE_IGMP, > + > + /* "bind_vport(vport, vport_parent)". > + * > + * 'vport' follows the action_header, in the format - 32-bit field. > + * 'vport_parent' is passed through the packet metadata as > + * MFF_LOG_INPORT. > + */ > + ACTION_OPCODE_BIND_VPORT, > }; > > /* Header. */ > diff --git a/ovn/controller/binding.c b/ovn/controller/binding.c > index ace0f811b..dfe002b60 100644 > --- a/ovn/controller/binding.c > +++ b/ovn/controller/binding.c > @@ -571,11 +571,31 @@ consider_local_datapath(struct ovsdb_idl_txn > *ovnsb_idl_txn, > sbrec_port_binding_set_encap(binding_rec, encap_rec); > } > } else if (binding_rec->chassis == chassis_rec) { > - VLOG_INFO("Releasing lport %s from this chassis.", > - binding_rec->logical_port); > - if (binding_rec->encap) > - sbrec_port_binding_set_encap(binding_rec, NULL); > - sbrec_port_binding_set_chassis(binding_rec, NULL); > + if (!strcmp(binding_rec->type, "virtual")) { > + /* pinctrl module takes care of binding the ports > + * of type 'virtual'. > + * Release such ports if their virtual parents are no > + * longer claimed by this chassis. 
*/ > + const struct sbrec_port_binding *parent > + = lport_lookup_by_name(sbrec_port_binding_by_name, > + binding_rec->virtual_parent); > + if (!parent || parent->chassis != chassis_rec) { > + VLOG_INFO("Releasing lport %s from this chassis.", > + binding_rec->logical_port); > + if (binding_rec->encap) { > + sbrec_port_binding_set_encap(binding_rec, NULL); > + } > + sbrec_port_binding_set_chassis(binding_rec, NULL); > + sbrec_port_binding_set_virtual_parent(binding_rec, > NULL); > + } > + } else { > + VLOG_INFO("Releasing lport %s from this chassis.", > + binding_rec->logical_port); > + if (binding_rec->encap) { > + sbrec_port_binding_set_encap(binding_rec, NULL); > + } > + sbrec_port_binding_set_chassis(binding_rec, NULL); > + } > } else if (our_chassis) { > static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1); > VLOG_INFO_RL(&rl, > diff --git a/ovn/controller/pinctrl.c b/ovn/controller/pinctrl.c > index d857067a5..357050eb5 100644 > --- a/ovn/controller/pinctrl.c > +++ b/ovn/controller/pinctrl.c > @@ -273,9 +273,22 @@ static void pinctrl_ip_mcast_handle_igmp( > > static bool may_inject_pkts(void); > > +static void init_put_vport_bindings(void); > +static void destroy_put_vport_bindings(void); > +static void run_put_vport_bindings( > + struct ovsdb_idl_txn *ovnsb_idl_txn, > + struct ovsdb_idl_index *sbrec_datapath_binding_by_key, > + struct ovsdb_idl_index *sbrec_port_binding_by_key, > + const struct sbrec_chassis *chassis) > + OVS_REQUIRES(pinctrl_mutex); > +static void wait_put_vport_bindings(struct ovsdb_idl_txn *ovnsb_idl_txn); > +static void pinctrl_handle_bind_vport(const struct flow *md, > + struct ofpbuf *userdata); > + > COVERAGE_DEFINE(pinctrl_drop_put_mac_binding); > COVERAGE_DEFINE(pinctrl_drop_buffered_packets_map); > COVERAGE_DEFINE(pinctrl_drop_controller_event); > +COVERAGE_DEFINE(pinctrl_drop_put_vport_binding); > > struct empty_lb_backends_event { > struct hmap_node hmap_node; > @@ -432,6 +445,7 @@ pinctrl_init(void) > 
init_buffered_packets_map(); > init_event_table(); > ip_mcast_snoop_init(); > + init_put_vport_bindings(); > pinctrl.br_int_name = NULL; > pinctrl_handler_seq = seq_create(); > pinctrl_main_seq = seq_create(); > @@ -1957,6 +1971,12 @@ process_packet_in(struct rconn *swconn, const > struct ofp_header *msg) > ovs_mutex_unlock(&pinctrl_mutex); > break; > > + case ACTION_OPCODE_BIND_VPORT: > + ovs_mutex_lock(&pinctrl_mutex); > + pinctrl_handle_bind_vport(&pin.flow_metadata.flow, &userdata); > + ovs_mutex_unlock(&pinctrl_mutex); > + break; > + > default: > VLOG_WARN_RL(&rl, "unrecognized packet-in opcode %"PRIu32, > ntohl(ah->opcode)); > @@ -2135,6 +2155,8 @@ pinctrl_run(struct ovsdb_idl_txn *ovnsb_idl_txn, > run_put_mac_bindings(ovnsb_idl_txn, sbrec_datapath_binding_by_key, > sbrec_port_binding_by_key, > sbrec_mac_binding_by_lport_ip); > + run_put_vport_bindings(ovnsb_idl_txn, sbrec_datapath_binding_by_key, > + sbrec_port_binding_by_key, chassis); > send_garp_prepare(sbrec_port_binding_by_datapath, > sbrec_port_binding_by_name, br_int, chassis, > local_datapaths, active_tunnels); > @@ -2481,6 +2503,7 @@ pinctrl_wait(struct ovsdb_idl_txn *ovnsb_idl_txn) > { > wait_put_mac_bindings(ovnsb_idl_txn); > wait_controller_event(ovnsb_idl_txn); > + wait_put_vport_bindings(ovnsb_idl_txn); > int64_t new_seq = seq_read(pinctrl_main_seq); > seq_wait(pinctrl_main_seq, new_seq); > } > @@ -2498,6 +2521,7 @@ pinctrl_destroy(void) > destroy_buffered_packets_map(); > event_table_destroy(); > destroy_put_mac_bindings(); > + destroy_put_vport_bindings(); > destroy_dns_cache(); > ip_mcast_snoop_destroy(); > seq_destroy(pinctrl_main_seq); > @@ -4341,3 +4365,153 @@ pinctrl_handle_event(struct ofpbuf *userdata) > return; > } > } > + > +struct put_vport_binding { > + struct hmap_node hmap_node; > + > + /* Key and value. */ > + uint32_t dp_key; > + uint32_t vport_key; > + > + uint32_t vport_parent_key; > +}; > + > +/* Contains "struct put_vport_binding"s. 
*/ > +static struct hmap put_vport_bindings; > + > +static void > +init_put_vport_bindings(void) > +{ > + hmap_init(&put_vport_bindings); > +} > + > +static void > +flush_put_vport_bindings(void) > +{ > + struct put_vport_binding *vport_b; > + HMAP_FOR_EACH_POP (vport_b, hmap_node, &put_vport_bindings) { > + free(vport_b); > + } > +} > + > +static void > +destroy_put_vport_bindings(void) > +{ > + flush_put_vport_bindings(); > + hmap_destroy(&put_vport_bindings); > +} > + > +static void > +wait_put_vport_bindings(struct ovsdb_idl_txn *ovnsb_idl_txn) > +{ > + if (ovnsb_idl_txn && !hmap_is_empty(&put_vport_bindings)) { > + poll_immediate_wake(); > + } > +} > + > +static struct put_vport_binding * > +pinctrl_find_put_vport_binding(uint32_t dp_key, uint32_t vport_key, > + uint32_t hash) > +{ > + struct put_vport_binding *vpb; > + HMAP_FOR_EACH_WITH_HASH (vpb, hmap_node, hash, &put_vport_bindings) { > + if (vpb->dp_key == dp_key && vpb->vport_key == vport_key) { > + return vpb; > + } > + } > + return NULL; > +} > + > +static void > +run_put_vport_binding(struct ovsdb_idl_txn *ovnsb_idl_txn OVS_UNUSED, > + struct ovsdb_idl_index > *sbrec_datapath_binding_by_key, > + struct ovsdb_idl_index *sbrec_port_binding_by_key, > + const struct sbrec_chassis *chassis, > + const struct put_vport_binding *vpb) > +{ > + /* Convert logical datapath and logical port key into lport. */ > + const struct sbrec_port_binding *pb = lport_lookup_by_key( > + sbrec_datapath_binding_by_key, sbrec_port_binding_by_key, > + vpb->dp_key, vpb->vport_key); > + if (!pb) { > + static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5); > + > + VLOG_WARN_RL(&rl, "unknown logical port with datapath %"PRIu32" " > + "and port %"PRIu32, vpb->dp_key, vpb->vport_key); > + return; > + } > + > + /* pinctrl module updates the port binding only for type 'virtual'. 
*/ > + if (!strcmp(pb->type, "virtual")) { > + const struct sbrec_port_binding *parent = lport_lookup_by_key( > + sbrec_datapath_binding_by_key, sbrec_port_binding_by_key, > + vpb->dp_key, vpb->vport_parent_key); > + if (parent) { > + VLOG_INFO("Claiming virtual lport %s for this chassis " > + "with the virtual parent %s", > + pb->logical_port, parent->logical_port); > + sbrec_port_binding_set_chassis(pb, chassis); > + sbrec_port_binding_set_virtual_parent(pb, > parent->logical_port); > + } > + } > +} > + > +/* Called by pinctrl_run(). Runs with in the main ovn-controller > + * thread context. */ > +static void > +run_put_vport_bindings(struct ovsdb_idl_txn *ovnsb_idl_txn, > + struct ovsdb_idl_index > *sbrec_datapath_binding_by_key, > + struct ovsdb_idl_index *sbrec_port_binding_by_key, > + const struct sbrec_chassis *chassis) > + OVS_REQUIRES(pinctrl_mutex) > +{ > + if (!ovnsb_idl_txn) { > + return; > + } > + > + const struct put_vport_binding *vpb; > + HMAP_FOR_EACH (vpb, hmap_node, &put_vport_bindings) { > + run_put_vport_binding(ovnsb_idl_txn, > sbrec_datapath_binding_by_key, > + sbrec_port_binding_by_key, chassis, vpb); > + } > + > + flush_put_vport_bindings(); > +} > + > +/* Called with in the pinctrl_handler thread context. */ > +static void > +pinctrl_handle_bind_vport( > + const struct flow *md, struct ofpbuf *userdata) > + OVS_REQUIRES(pinctrl_mutex) > +{ > + /* Get the datapath key from the packet metadata. */ > + uint32_t dp_key = ntohll(md->metadata); > + uint32_t vport_parent_key = md->regs[MFF_LOG_INPORT - MFF_REG0]; > + > + /* Get the virtual port key from the userdata buffer. 
*/ > + uint32_t *vport_key = ofpbuf_try_pull(userdata, sizeof *vport_key); > + > + if (!vport_key) { > + return; > + } > + > + uint32_t hash = hash_2words(dp_key, *vport_key); > + > + struct put_vport_binding *vpb > + = pinctrl_find_put_vport_binding(dp_key, *vport_key, hash); > + if (!vpb) { > + if (hmap_count(&put_vport_bindings) >= 1000) { > + COVERAGE_INC(pinctrl_drop_put_vport_binding); > + return; > + } > + > + vpb = xmalloc(sizeof *vpb); > + hmap_insert(&put_vport_bindings, &vpb->hmap_node, hash); > + } > + > + vpb->dp_key = dp_key; > + vpb->vport_key = *vport_key; > + vpb->vport_parent_key = vport_parent_key; > + > + notify_pinctrl_main(); > +} > diff --git a/ovn/lib/actions.c b/ovn/lib/actions.c > index 4eacc44ed..66916a837 100644 > --- a/ovn/lib/actions.c > +++ b/ovn/lib/actions.c > @@ -2599,6 +2599,63 @@ ovnact_check_pkt_larger_free(struct > ovnact_check_pkt_larger *cipl OVS_UNUSED) > { > } > > +static void > +parse_bind_vport(struct action_context *ctx) > +{ > + if (!lexer_force_match(ctx->lexer, LEX_T_LPAREN)) { > + return; > + } > + > + if (ctx->lexer->token.type != LEX_T_STRING) { > + lexer_syntax_error(ctx->lexer, "expecting port name string"); > + return; > + } > + > + struct ovnact_bind_vport *bind_vp = > ovnact_put_BIND_VPORT(ctx->ovnacts); > + bind_vp->vport = xstrdup(ctx->lexer->token.s); > + lexer_get(ctx->lexer); > + (void) (lexer_force_match(ctx->lexer, LEX_T_COMMA) > + && action_parse_field(ctx, 0, false, &bind_vp->vport_parent) > + && lexer_force_match(ctx->lexer, LEX_T_RPAREN)); > +} > + > +static void > +format_BIND_VPORT(const struct ovnact_bind_vport *bind_vp, > + struct ds *s ) > +{ > + ds_put_format(s, "bind_vport(\"%s\", ", bind_vp->vport); > + expr_field_format(&bind_vp->vport_parent, s); > + ds_put_cstr(s, ");"); > +} > + > +static void > +encode_BIND_VPORT(const struct ovnact_bind_vport *vp, > + const struct ovnact_encode_params *ep, > + struct ofpbuf *ofpacts) > +{ > + uint32_t vport_key; > + if (!ep->lookup_port(ep->aux, 
vp->vport, &vport_key)) { > + return; > + } > + > + const struct arg args[] = { > + { expr_resolve_field(&vp->vport_parent), MFF_LOG_INPORT }, > + }; > + encode_setup_args(args, ARRAY_SIZE(args), ofpacts); > + size_t oc_offset = > encode_start_controller_op(ACTION_OPCODE_BIND_VPORT, > + false, NX_CTLR_NO_METER, > + ofpacts); > + ofpbuf_put(ofpacts, &vport_key, sizeof(uint32_t)); > + encode_finish_controller_op(oc_offset, ofpacts); > + encode_restore_args(args, ARRAY_SIZE(args), ofpacts); > +} > + > +static void > +ovnact_bind_vport_free(struct ovnact_bind_vport *bp) > +{ > + free(bp->vport); > +} > + > /* Parses an assignment or exchange or put_dhcp_opts action. */ > static void > parse_set_action(struct action_context *ctx) > @@ -2706,6 +2763,8 @@ parse_action(struct action_context *ctx) > parse_set_meter_action(ctx); > } else if (lexer_match_id(ctx->lexer, "trigger_event")) { > parse_trigger_event(ctx, ovnact_put_TRIGGER_EVENT(ctx->ovnacts)); > + } else if (lexer_match_id(ctx->lexer, "bind_vport")) { > + parse_bind_vport(ctx); > } else { > lexer_syntax_error(ctx->lexer, "expecting action"); > } > diff --git a/ovn/lib/ovn-util.c b/ovn/lib/ovn-util.c > index 0f07d80ac..de745d73f 100644 > --- a/ovn/lib/ovn-util.c > +++ b/ovn/lib/ovn-util.c > @@ -326,6 +326,7 @@ static const char *OVN_NB_LSP_TYPES[] = { > "router", > "vtep", > "external", > + "virtual", > }; > > bool > diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml > index d2267de0e..6ff7aaff1 100644 > --- a/ovn/northd/ovn-northd.8.xml > +++ b/ovn/northd/ovn-northd.8.xml > @@ -519,6 +519,34 @@ > some additional flow cost for this and the value appears limited. 
> </li> > > + <li> > + <p> > + If inport <code>V</code> is of type <code>virtual</code> adds a > + priority-100 logical flow for each <var>P</var> configured in > the > + <ref table="Logical_Switch_Port" > column="options:virtual-parents"/> > + column with the match > + </p> > + <pre> > +<code>inport == <var>P</var> && > !is_chassis_resident(<var>V</var>) && ((arp.op == 1 && > arp.spa == <var>VIP</var> && arp.tpa == <var>VIP</var>) || (arp.op > == 2 && arp.spa == <var>VIP</var>))</code> > + </pre> > + > + <p> > + and applies the action > + </p> > + <pre> > +<code>bind_vport(<var>V</var>, inport);</code> > + </pre> > + > + <p> > + and advances the packet to the next table. > + </p> > + > + <p> > + Where <var>VIP</var> is the virtual ip configured in the column > + <ref table="Logical_Switch_Port" column="options:virtual-ip"/>. > + </p> > + </li> > + > <li> > <p> > Priority-50 flows that match ARP requests to each known IP > address > @@ -541,7 +569,8 @@ output; > > <p> > These flows are omitted for logical ports (other than router > ports or > - <code>localport</code> ports) that are down. > + <code>localport</code> ports) that are down and for logical > ports of > + type <code>virtual</code>. > </p> > </li> > > @@ -588,7 +617,8 @@ nd_na_router { > > <p> > These flows are omitted for logical ports (other than router > ports or > - <code>localport</code> ports) that are down. > + <code>localport</code> ports) that are down and for logical > ports of > + type <code>virtual</code>. > </p> > </li> > > @@ -2031,6 +2061,33 @@ next; > <code>eth.dst = <var>E</var>; next;</code>. 
> </p> > > + <p> > + For each virtual ip <var>A</var> configured on a logical port > + of type <code>virtual</code> and its virtual parent set in > + its corresponding <ref db="OVN_Southbound" > table="Port_Binding"/> > + record and the virtual parent with the Ethernet address > <var>E</var> > + and the virtual ip is reachable via the router port > <var>P</var>, a > + priority-100 flow with match <code>outport === <var>P</var> > + && reg0 == <var>A</var></code> has actions > + <code>eth.dst = <var>E</var>; next;</code>. > + </p> > + > + <p> > + For each virtual ip <var>A</var> configured on a logical port > + of type <code>virtual</code> and its virtual parent > <code>not</code> > + set in its corresponding > + <ref db="OVN_Southbound" table="Port_Binding"/> > + record and the virtual ip <var>A</var> is reachable via the > + router port <var>P</var>, a > + priority-100 flow with match <code>outport === <var>P</var> > + && reg0 == <var>A</var></code> has actions > + <code>eth.dst = <var>00:00:00:00:00:00</var>; next;</code>. > + This flow is added so that the ARP is always resolved for the > + virtual ip <var>A</var> by generating ARP request and > + <code>not</code> consulting the MAC_Binding table as it can have > + incorrect value for the virtual ip <var>A</var>. 
> + </p> > + > <p> > For each IPv6 address <var>A</var> whose host is known to have > Ethernet address <var>E</var> on router port <var>P</var>, a > diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c > index eb6c47cad..ae09cf338 100644 > --- a/ovn/northd/ovn-northd.c > +++ b/ovn/northd/ovn-northd.c > @@ -4878,96 +4878,146 @@ build_lswitch_flows(struct hmap *datapaths, > struct hmap *ports, > continue; > } > > - /* > - * Add ARP/ND reply flows if either the > - * - port is up or > - * - port type is router or > - * - port type is localport > - */ > - if (!lsp_is_up(op->nbsp) && strcmp(op->nbsp->type, "router") && > - strcmp(op->nbsp->type, "localport")) { > - continue; > - } > + if (!strcmp(op->nbsp->type, "virtual")) { > + /* Handle > + * - GARPs for virtual ip which belongs to a logical port > + * of type 'virtual' and bind that port. > + * > + * - ARP reply from the virtual ip which belongs to a logical > + * port of type 'virtual' and bind that port. > + * */ > + ovs_be32 ip; > + const char *virtual_ip = smap_get(&op->nbsp->options, > + "virtual-ip"); > + const char *virtual_parents = smap_get(&op->nbsp->options, > + "virtual-parents"); > + if (!virtual_ip || !virtual_parents || > + !ip_parse(virtual_ip, &ip)) { > + continue; > + } > > - if (lsp_is_external(op->nbsp)) { > - continue; > - } > + char *tokstr = xstrdup(virtual_parents); > + char *save_ptr = NULL; > + char *vparent; > + for (vparent = strtok_r(tokstr, ",", &save_ptr); vparent != > NULL; > + vparent = strtok_r(NULL, ",", &save_ptr)) { > + struct ovn_port *vp = ovn_port_find(ports, vparent); > + if (!vp || vp->od != op->od) { > + /* vparent name should be valid and it should belong > + * to the same logical switch. 
*/ > + continue; > + } > > - for (size_t i = 0; i < op->n_lsp_addrs; i++) { > - for (size_t j = 0; j < op->lsp_addrs[i].n_ipv4_addrs; j++) { > ds_clear(&match); > - ds_put_format(&match, "arp.tpa == %s && arp.op == 1", > - op->lsp_addrs[i].ipv4_addrs[j].addr_s); > + ds_put_format(&match, "inport == \"%s\" && " > + "!is_chassis_resident(%s) && " > + "((arp.op == 1 && arp.spa == %s && " > + "arp.tpa == %s) || (arp.op == 2 && " > + "arp.spa == %s))", > + vparent, op->json_key, virtual_ip, > virtual_ip, > + virtual_ip); > ds_clear(&actions); > ds_put_format(&actions, > - "eth.dst = eth.src; " > - "eth.src = %s; " > - "arp.op = 2; /* ARP reply */ " > - "arp.tha = arp.sha; " > - "arp.sha = %s; " > - "arp.tpa = arp.spa; " > - "arp.spa = %s; " > - "outport = inport; " > - "flags.loopback = 1; " > - "output;", > - op->lsp_addrs[i].ea_s, op->lsp_addrs[i].ea_s, > - op->lsp_addrs[i].ipv4_addrs[j].addr_s); > - ovn_lflow_add(lflows, op->od, S_SWITCH_IN_ARP_ND_RSP, 50, > + "bind_vport(%s, inport); " > + "next;", > + op->json_key); > + ovn_lflow_add(lflows, op->od, S_SWITCH_IN_ARP_ND_RSP, 100, > ds_cstr(&match), ds_cstr(&actions)); > + } > > - /* Do not reply to an ARP request from the port that owns > the > - * address (otherwise a DHCP client that ARPs to check > for a > - * duplicate address will fail). Instead, forward it the > usual > - * way. > - * > - * (Another alternative would be to simply drop the > packet. If > - * everything is working as it is configured, then this > would > - * produce equivalent results, since no one should reply > to the > - * request. But ARPing for one's own IP address is > intended to > - * detect situations where the network is not working as > - * configured, so dropping the request would frustrate > that > - * intent.) 
*/ > - ds_put_format(&match, " && inport == %s", op->json_key); > - ovn_lflow_add(lflows, op->od, S_SWITCH_IN_ARP_ND_RSP, 100, > - ds_cstr(&match), "next;"); > + free(tokstr); > + } else { > + /* > + * Add ARP/ND reply flows if either the > + * - port is up or > + * - port type is router or > + * - port type is localport > + */ > + if (!lsp_is_up(op->nbsp) && strcmp(op->nbsp->type, "router") > && > + strcmp(op->nbsp->type, "localport")) { > + continue; > } > > - /* For ND solicitations, we need to listen for both the > - * unicast IPv6 address and its all-nodes multicast address, > - * but always respond with the unicast IPv6 address. */ > - for (size_t j = 0; j < op->lsp_addrs[i].n_ipv6_addrs; j++) { > - ds_clear(&match); > - ds_put_format(&match, > - "nd_ns && ip6.dst == {%s, %s} && nd.target == %s", > - op->lsp_addrs[i].ipv6_addrs[j].addr_s, > - op->lsp_addrs[i].ipv6_addrs[j].sn_addr_s, > - op->lsp_addrs[i].ipv6_addrs[j].addr_s); > + if (lsp_is_external(op->nbsp)) { > + continue; > + } > > - ds_clear(&actions); > - ds_put_format(&actions, > - "%s { " > + for (size_t i = 0; i < op->n_lsp_addrs; i++) { > + for (size_t j = 0; j < op->lsp_addrs[i].n_ipv4_addrs; > j++) { > + ds_clear(&match); > + ds_put_format(&match, "arp.tpa == %s && arp.op == 1", > + op->lsp_addrs[i].ipv4_addrs[j].addr_s); > + ds_clear(&actions); > + ds_put_format(&actions, > + "eth.dst = eth.src; " > "eth.src = %s; " > - "ip6.src = %s; " > - "nd.target = %s; " > - "nd.tll = %s; " > + "arp.op = 2; /* ARP reply */ " > + "arp.tha = arp.sha; " > + "arp.sha = %s; " > + "arp.tpa = arp.spa; " > + "arp.spa = %s; " > "outport = inport; " > "flags.loopback = 1; " > - "output; " > - "};", > - !strcmp(op->nbsp->type, "router") ? 
> - "nd_na_router" : "nd_na", > - op->lsp_addrs[i].ea_s, > - op->lsp_addrs[i].ipv6_addrs[j].addr_s, > - op->lsp_addrs[i].ipv6_addrs[j].addr_s, > - op->lsp_addrs[i].ea_s); > - ovn_lflow_add(lflows, op->od, S_SWITCH_IN_ARP_ND_RSP, 50, > - ds_cstr(&match), ds_cstr(&actions)); > + "output;", > + op->lsp_addrs[i].ea_s, op->lsp_addrs[i].ea_s, > + op->lsp_addrs[i].ipv4_addrs[j].addr_s); > + ovn_lflow_add(lflows, op->od, S_SWITCH_IN_ARP_ND_RSP, > 50, > + ds_cstr(&match), ds_cstr(&actions)); > + > + /* Do not reply to an ARP request from the port that > owns > + * the address (otherwise a DHCP client that ARPs to > check > + * for a duplicate address will fail). Instead, > forward > + * it the usual way. > + * > + * (Another alternative would be to simply drop the > packet. > + * If everything is working as it is configured, then > this > + * would produce equivalent results, since no one > should > + * reply to the request. But ARPing for one's own IP > + * address is intended to detect situations where the > + * network is not working as configured, so dropping > the > + * request would frustrate that intent.) */ > + ds_put_format(&match, " && inport == %s", > op->json_key); > + ovn_lflow_add(lflows, op->od, S_SWITCH_IN_ARP_ND_RSP, > 100, > + ds_cstr(&match), "next;"); > + } > > - /* Do not reply to a solicitation from the port that owns > the > - * address (otherwise DAD detection will fail). */ > - ds_put_format(&match, " && inport == %s", op->json_key); > - ovn_lflow_add(lflows, op->od, S_SWITCH_IN_ARP_ND_RSP, 100, > - ds_cstr(&match), "next;"); > + /* For ND solicitations, we need to listen for both the > + * unicast IPv6 address and its all-nodes multicast > address, > + * but always respond with the unicast IPv6 address. 
*/ > + for (size_t j = 0; j < op->lsp_addrs[i].n_ipv6_addrs; > j++) { > + ds_clear(&match); > + ds_put_format(&match, > + "nd_ns && ip6.dst == {%s, %s} && nd.target == > %s", > + op->lsp_addrs[i].ipv6_addrs[j].addr_s, > + op->lsp_addrs[i].ipv6_addrs[j].sn_addr_s, > + op->lsp_addrs[i].ipv6_addrs[j].addr_s); > + > + ds_clear(&actions); > + ds_put_format(&actions, > + "%s { " > + "eth.src = %s; " > + "ip6.src = %s; " > + "nd.target = %s; " > + "nd.tll = %s; " > + "outport = inport; " > + "flags.loopback = 1; " > + "output; " > + "};", > + !strcmp(op->nbsp->type, "router") ? > + "nd_na_router" : "nd_na", > + op->lsp_addrs[i].ea_s, > + op->lsp_addrs[i].ipv6_addrs[j].addr_s, > + op->lsp_addrs[i].ipv6_addrs[j].addr_s, > + op->lsp_addrs[i].ea_s); > + ovn_lflow_add(lflows, op->od, S_SWITCH_IN_ARP_ND_RSP, > 50, > + ds_cstr(&match), ds_cstr(&actions)); > + > + /* Do not reply to a solicitation from the port that > owns > + * the address (otherwise DAD detection will fail). */ > + ds_put_format(&match, " && inport == %s", > op->json_key); > + ovn_lflow_add(lflows, op->od, S_SWITCH_IN_ARP_ND_RSP, > 100, > + ds_cstr(&match), "next;"); > + } > } > } > } > @@ -7504,7 +7554,8 @@ build_lrouter_flows(struct hmap *datapaths, struct > hmap *ports, > 100, ds_cstr(&match), > ds_cstr(&actions)); > } > } > - } else if (op->od->n_router_ports && strcmp(op->nbsp->type, > "router")) { > + } else if (op->od->n_router_ports && strcmp(op->nbsp->type, > "router") > + && strcmp(op->nbsp->type, "virtual")) { > /* This is a logical switch port that backs a VM or a > container. > * Extract its addresses. For each of the address, go through > all > * the router ports attached to the switch (to which this port > @@ -7581,6 +7632,105 @@ build_lrouter_flows(struct hmap *datapaths, struct > hmap *ports, > } > } > } > + } else if (op->od->n_router_ports && strcmp(op->nbsp->type, > "router") > + && !strcmp(op->nbsp->type, "virtual")) { > + /* This is a virtual port. 
Add ARP replies for the virtual ip > with > + * the mac of the present active virtual parent. > + * If the logical port doesn't have virtual parent set in > + * Port_Binding table, then add the flow to set eth.dst to > + * 00:00:00:00:00:00 and advance to next table so that ARP is > + * resolved by router pipeline using the arp{} action. > + * The MAC_Binding entry for the virtual ip might be invalid. > */ > + ovs_be32 ip; > + > + const char *vip = smap_get(&op->nbsp->options, > + "virtual-ip"); > + const char *virtual_parents = smap_get(&op->nbsp->options, > + "virtual-parents"); > + if (!vip || !virtual_parents || > + !ip_parse(vip, &ip) || !op->sb) { > + continue; > + } > + > + if (!op->sb->virtual_parent || !op->sb->virtual_parent[0] || > + !op->sb->chassis) { > + /* The virtual port is not claimed yet. */ > + for (size_t i = 0; i < op->od->n_router_ports; i++) { > + const char *peer_name = smap_get( > + &op->od->router_ports[i]->nbsp->options, > + "router-port"); > + if (!peer_name) { > + continue; > + } > + > + struct ovn_port *peer = ovn_port_find(ports, > peer_name); > + if (!peer || !peer->nbrp) { > + continue; > + } > + > + if (find_lrp_member_ip(peer, vip)) { > + ds_clear(&match); > + ds_put_format(&match, "outport == %s && reg0 == > %s", > + peer->json_key, vip); > + > + ds_clear(&actions); > + ds_put_format(&actions, > + "eth.dst = 00:00:00:00:00:00; > next;"); > + ovn_lflow_add(lflows, peer->od, > + S_ROUTER_IN_ARP_RESOLVE, 100, > + ds_cstr(&match), > ds_cstr(&actions)); > + break; > + } > + } > + } else { > + struct ovn_port *vp = > + ovn_port_find(ports, op->sb->virtual_parent); > + if (!vp || !vp->nbsp) { > + continue; > + } > + > + for (size_t i = 0; i < vp->n_lsp_addrs; i++) { > + bool found_vip_network = false; > + const char *ea_s = vp->lsp_addrs[i].ea_s; > + for (size_t j = 0; j < vp->od->n_router_ports; j++) { > + /* Get the Logical_Router_Port that the > + * Logical_Switch_Port is connected to, as > + * 'peer'. 
*/ > + const char *peer_name = smap_get( > + &vp->od->router_ports[j]->nbsp->options, > + "router-port"); > + if (!peer_name) { > + continue; > + } > + > + struct ovn_port *peer = > + ovn_port_find(ports, peer_name); > + if (!peer || !peer->nbrp) { > + continue; > + } > + > + if (!find_lrp_member_ip(peer, vip)) { > + continue; > + } > + > + ds_clear(&match); > + ds_put_format(&match, "outport == %s && reg0 == > %s", > + peer->json_key, vip); > + > + ds_clear(&actions); > + ds_put_format(&actions, "eth.dst = %s; next;", > ea_s); > + ovn_lflow_add(lflows, peer->od, > + S_ROUTER_IN_ARP_RESOLVE, 100, > + ds_cstr(&match), > ds_cstr(&actions)); > + found_vip_network = true; > + break; > + } > + > + if (found_vip_network) { > + break; > + } > + } > + } > } else if (!strcmp(op->nbsp->type, "router")) { > /* This is a logical switch port that connects to a router. */ > > @@ -9256,6 +9406,8 @@ main(int argc, char *argv[]) > &sbrec_port_binding_col_gateway_chassis); > ovsdb_idl_add_column(ovnsb_idl_loop.idl, > &sbrec_port_binding_col_ha_chassis_group); > + ovsdb_idl_add_column(ovnsb_idl_loop.idl, > + &sbrec_port_binding_col_virtual_parent); > ovsdb_idl_add_column(ovnsb_idl_loop.idl, > &sbrec_gateway_chassis_col_chassis); > ovsdb_idl_add_column(ovnsb_idl_loop.idl, > &sbrec_gateway_chassis_col_name); > diff --git a/ovn/ovn-nb.xml b/ovn/ovn-nb.xml > index 57b6edbf8..f5f10a5c1 100644 > --- a/ovn/ovn-nb.xml > +++ b/ovn/ovn-nb.xml > @@ -465,6 +465,31 @@ > </li> > </ul> > </dd> > + > + <dt><code>virtual</code></dt> > + <dd> > + <p> > + Represents a logical port which does not have an OVS > + port in the integration bridge and has a virtual ip > configured > + in the <ref column="options:virtual-ip"/> column. This > virtual ip > + can move around between the logical ports configured in > + the <ref column="options:virtual-parents"/> column. > + </p> > + > + <p> > + One of the use case where <code>virtual</code> > + ports can be used is. 
> + </p> > + > + <ul> > + <li> > + The <code>virtual ip</code> represents a load balancer vip > + and the <code>virtual parents</code> provide load balancer > + service in an active-standby setup with the active virtual > + parent owning the <code>virtual ip</code>. > + </li> > + </ul> > + </dd> > </dl> > </column> > </group> > @@ -618,6 +643,26 @@ > interface, in bits. > </column> > </group> > + > + <group title="Virtual port Options"> > + <p> > + These options apply when <ref column="type"/> is > + <code>virtual</code>. > + </p> > + > + <column name="options" key="virtual-ip"> > + This option represents the virtual IPv4 address. > + </column> > + > + <column name="options" key="virtual-parents"> > + This options represents a set of logical port names (with in > the same > + logical switch) which can own the <code>virtual ip</code> > configured > + in the <ref column="options:virtual-ip"/>. All these virtual > parents > + should add the <code>virtual ip</code> in the > + <ref column="port_security"/> if port security addressed are > enabled. 
> + </column> > + </group> > + > </group> > > <group title="Containers"> > diff --git a/ovn/ovn-sb.ovsschema b/ovn/ovn-sb.ovsschema > index 2b7bc57a7..5c013b17e 100644 > --- a/ovn/ovn-sb.ovsschema > +++ b/ovn/ovn-sb.ovsschema > @@ -1,7 +1,7 @@ > { > "name": "OVN_Southbound", > - "version": "2.4.0", > - "cksum": "3059284885 20260", > + "version": "2.5.0", > + "cksum": "1257419092 20387", > "tables": { > "SB_Global": { > "columns": { > @@ -173,6 +173,8 @@ > "minInteger": 1, > "maxInteger": 4095}, > "min": 0, "max": 1}}, > + "virtual_parent": {"type": {"key": "string", "min": 0, > + "max": 1}}, > "chassis": {"type": {"key": {"type": "uuid", > "refTable": "Chassis", > "refType": "weak"}, > diff --git a/ovn/ovn-sb.xml b/ovn/ovn-sb.xml > index 544a071fa..17c45bbac 100644 > --- a/ovn/ovn-sb.xml > +++ b/ovn/ovn-sb.xml > @@ -2017,6 +2017,24 @@ tcp.flags = RST; > </p> > <p><b>Prerequisite:</b> <code>igmp</code></p> > </dd> > + > + <dt><code>bind_vport(<var>V</var>, <var>P</var>);</code></dt> > + <dd> > + <p> > + <b>Parameters</b>: logical port string field <var>V</var> > + of type <code>virtual</code>, logical port string field > + <var>P</var>. > + </p> > + > + <p> > + Binds the virtual logical port <var>V</var> and sets the > + <ref table="Port_Binding" column="chassis"/> column and > + <ref table="Port_Binding" column="virtual_parent"/> of > + the table <ref table="Port_Binding"/>. > + <ref table="Port_Binding" column="virtual_parent"/> is > + set to <var>P</var>. > + </p> > + </dd> > </dl> > </column> > > @@ -2480,6 +2498,13 @@ tcp.flags = RST; > the <code>outport</code> will be reset to the value of the > distributed port. > </dd> > + > + <dt><code>virtual</code></dt> > + <dd> > + Represents a logical port with an <code>virtual ip</code>. > + This <code>virtual ip</code> can be configured on a > + logical port (which is refered as virtual parent). 
> + </dd> > </dl> > </column> > </group> > @@ -2720,6 +2745,27 @@ tcp.flags = RST; > </column> > </group> > > + <group title="Virtual ports"> > + <column name="virtual_parent"> > + <p> > + This column is set by <code>ovn-controller</code> with one of > the > + value from the > + <ref table="Logical_Switch_Port" > column="options:virtual-parents" > + db="OVN_Northbound"/> in the OVN_Northbound database's > + <ref table="Logical_Switch_Port" db="OVN_Northbound"/> table > + when the OVN action <code>bind_vport</code> is executed. > + <code>ovn-controller</code> also sets the > + <ref column="chassis"/> column when it executes this action > + with its chassis id. > + </p> > + > + <p> > + <code>ovn-controller</code> sets this column only if the > + <ref column="type"/> is "virtual". > + </p> > + </column> > + </group> > + > <group title="Naming"> > <column name="external_ids" key="name"> > <p> > diff --git a/ovn/utilities/ovn-trace.c b/ovn/utilities/ovn-trace.c > index 044eb1cc2..b532b8eaf 100644 > --- a/ovn/utilities/ovn-trace.c > +++ b/ovn/utilities/ovn-trace.c > @@ -2144,6 +2144,9 @@ trace_actions(const struct ovnact *ovnacts, size_t > ovnacts_len, > > case OVNACT_CHECK_PKT_LARGER: > break; > + > + case OVNACT_BIND_VPORT: > + break; > } > } > ds_destroy(&s); > diff --git a/tests/ovn.at b/tests/ovn.at > index cb380d275..5d6c90c5f 100644 > --- a/tests/ovn.at > +++ b/tests/ovn.at > @@ -1368,6 +1368,24 @@ reg0 = check_pkt_larger(foo); > reg0[0] = check_pkt_larger(foo); > Syntax error at `foo' expecting `;'. > > +# bind_vport > +# lsp1's port key is 0x11. > +bind_vport("lsp1", inport); > + encodes as controller(userdata=00.00.00.11.00.00.00.00.11.00.00.00) > +# lsp2 doesn't exist. So it should be encoded as drop. > +bind_vport("lsp2", inport); > + encodes as drop > +bind_vport; > + Syntax error at `;' expecting `('. > +bind_vport(; > + Syntax error at `;' expecting port name string. > +bind_vport("xyzzy"; > + Syntax error at `;' expecting `,'. 
> +bind_vport("xyzzy",; > + Syntax error at `;' expecting field name. > +bind_vport("xyzzy", inport; > + Syntax error at `;' expecting `)'. > + > # Miscellaneous negative tests. > ; > Syntax error at `;'. > @@ -14345,6 +14363,278 @@ OVN_CLEANUP([hv1],[hv2]) > > AT_CLEANUP > > +AT_SETUP([ovn -- virtual ports]) > +AT_KEYWORDS([virtual ports]) > +AT_SKIP_IF([test $HAVE_PYTHON = no]) > +ovn_start > + > +send_garp() { > + local hv=$1 inport=$2 eth_src=$3 eth_dst=$4 spa=$5 tpa=$6 > + local > request=${eth_dst}${eth_src}08060001080006040001${eth_src}${spa}${eth_dst}${tpa} > + as hv$hv ovs-appctl netdev-dummy/receive hv${hv}-vif$inport $request > +} > + > +send_arp_reply() { > + local hv=$1 inport=$2 eth_src=$3 eth_dst=$4 spa=$5 tpa=$6 > + local > request=${eth_dst}${eth_src}08060001080006040002${eth_src}${spa}${eth_dst}${tpa} > + as hv$hv ovs-appctl netdev-dummy/receive hv${hv}-vif$inport $request > +} > + > +net_add n1 > + > +sim_add hv1 > +as hv1 > +ovs-vsctl add-br br-phys > +ovn_attach n1 br-phys 192.168.0.1 > +ovs-vsctl -- add-port br-int hv1-vif1 -- \ > + set interface hv1-vif1 external-ids:iface-id=sw0-p1 \ > + options:tx_pcap=hv1/vif1-tx.pcap \ > + options:rxq_pcap=hv1/vif1-rx.pcap \ > + ofport-request=1 > +ovs-vsctl -- add-port br-int hv1-vif2 -- \ > + set interface hv1-vif2 external-ids:iface-id=sw0-p3 \ > + options:tx_pcap=hv1/vif2-tx.pcap \ > + options:rxq_pcap=hv1/vif2-rx.pcap \ > + ofport-request=2 > + > +sim_add hv2 > +as hv2 > +ovs-vsctl add-br br-phys > +ovn_attach n1 br-phys 192.168.0.2 > +ovs-vsctl -- add-port br-int hv2-vif1 -- \ > + set interface hv2-vif1 external-ids:iface-id=sw0-p2 \ > + options:tx_pcap=hv2/vif1-tx.pcap \ > + options:rxq_pcap=hv2/vif1-rx.pcap \ > + ofport-request=1 > +ovs-vsctl -- add-port br-int hv2-vif2 -- \ > + set interface hv2-vif2 external-ids:iface-id=sw1-p1 \ > + options:tx_pcap=hv2/vif2-tx.pcap \ > + options:rxq_pcap=hv2/vif2-rx.pcap \ > + ofport-request=2 > + > +ovn-nbctl ls-add sw0 > + > +ovn-nbctl lsp-add sw0 sw0-vir > 
+ovn-nbctl lsp-set-addresses sw0-vir "50:54:00:00:00:10 10.0.0.10" > +ovn-nbctl lsp-set-port-security sw0-vir "50:54:00:00:00:10 10.0.0.10" > +ovn-nbctl lsp-set-type sw0-vir virtual > +ovn-nbctl set logical_switch_port sw0-vir options:virtual-ip=10.0.0.10 > +ovn-nbctl set logical_switch_port sw0-vir > options:virtual-parents=sw0-p1,sw0-p2 > + > +ovn-nbctl lsp-add sw0 sw0-p1 > +ovn-nbctl lsp-set-addresses sw0-p1 "50:54:00:00:00:03 10.0.0.3" > +ovn-nbctl lsp-set-port-security sw0-p1 "50:54:00:00:00:03 10.0.0.3 > 10.0.0.10" > + > +ovn-nbctl lsp-add sw0 sw0-p2 > +ovn-nbctl lsp-set-addresses sw0-p2 "50:54:00:00:00:04 10.0.0.4" > +ovn-nbctl lsp-set-port-security sw0-p2 "50:54:00:00:00:04 10.0.0.4 > 10.0.0.10" > + > +ovn-nbctl lsp-add sw0 sw0-p3 > +ovn-nbctl lsp-set-addresses sw0-p3 "50:54:00:00:00:05 10.0.0.5" > +ovn-nbctl lsp-set-port-security sw0-p3 "50:54:00:00:00:05 10.0.0.5" > + > +# Create the second logical switch with one port > +ovn-nbctl ls-add sw1 > +ovn-nbctl lsp-add sw1 sw1-p1 > +ovn-nbctl lsp-set-addresses sw1-p1 "40:54:00:00:00:03 20.0.0.3" > +ovn-nbctl lsp-set-port-security sw1-p1 "40:54:00:00:00:03 20.0.0.3" > + > +# Create a logical router and attach both logical switches > +ovn-nbctl lr-add lr0 > +ovn-nbctl lrp-add lr0 lr0-sw0 00:00:00:00:ff:01 10.0.0.1/24 > +ovn-nbctl lsp-add sw0 sw0-lr0 > +ovn-nbctl lsp-set-type sw0-lr0 router > +ovn-nbctl lsp-set-addresses sw0-lr0 00:00:00:00:ff:01 > +ovn-nbctl lsp-set-options sw0-lr0 router-port=lr0-sw0 > + > +ovn-nbctl lrp-add lr0 lr0-sw1 00:00:00:00:ff:02 20.0.0.1/24 > +ovn-nbctl lsp-add sw1 sw1-lr0 > +ovn-nbctl lsp-set-type sw1-lr0 router > +ovn-nbctl lsp-set-addresses sw1-lr0 00:00:00:00:ff:02 > +ovn-nbctl lsp-set-options sw1-lr0 router-port=lr0-sw1 > + > +OVN_POPULATE_ARP > +ovn-nbctl --wait=hv sync > + > +# Check that logical flows are added for sw0-vir in lsp_in_arp_rsp > pipeline > +# with bind_vport action. 
> + > +ovn-sbctl dump-flows sw0 | grep ls_in_arp_rsp | grep bind_vport > > lflows.txt > + > +AT_CHECK([cat lflows.txt], [0], [dnl > + table=11(ls_in_arp_rsp ), priority=100 , match=(inport == > "sw0-p1" && !is_chassis_resident("sw0-vir") && ((arp.op == 1 && arp.spa == > 10.0.0.10 && arp.tpa == 10.0.0.10) || (arp.op == 2 && arp.spa == > 10.0.0.10))), action=(bind_vport("sw0-vir", inport); next;) > + table=11(ls_in_arp_rsp ), priority=100 , match=(inport == > "sw0-p2" && !is_chassis_resident("sw0-vir") && ((arp.op == 1 && arp.spa == > 10.0.0.10 && arp.tpa == 10.0.0.10) || (arp.op == 2 && arp.spa == > 10.0.0.10))), action=(bind_vport("sw0-vir", inport); next;) > +]) > + > +ovn-sbctl dump-flows lr0 | grep lr_in_arp_resolve | grep "reg0 == > 10.0.0.10" \ > +> lflows.txt > + > +# Since the sw0-vir is not claimed by any chassis, eth.dst should be set > to > +# zero if the ip4.dst is the virtual ip in the router pipeline. > +AT_CHECK([cat lflows.txt], [0], [dnl > + table=9 (lr_in_arp_resolve ), priority=100 , match=(outport == > "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 00:00:00:00:00:00; next;) > +]) > + > +ip_to_hex() { > + printf "%02x%02x%02x%02x" "$@" > +} > + > +hv1_ch_uuid=`ovn-sbctl --bare --columns _uuid find chassis name="hv1"` > +hv2_ch_uuid=`ovn-sbctl --bare --columns _uuid find chassis name="hv2"` > + > +AT_CHECK([test x$(ovn-sbctl --bare --columns chassis find port_binding \ > +logical_port=sw0-vir) = x], [0], []) > + > +AT_CHECK([test x$(ovn-sbctl --bare --columns virtual_parent find > port_binding \ > +logical_port=sw0-vir) = x]) > + > +# From sw0-p0 send GARP for 10.0.0.10. hv1 should claim sw0-vir > +# and sw0-p1 should be its virtual_parent. 
> +eth_src=505400000003 > +eth_dst=ffffffffffff > +spa=$(ip_to_hex 10 0 0 10) > +tpa=$(ip_to_hex 10 0 0 10) > +send_garp 1 1 $eth_src $eth_dst $spa $tpa > + > +OVS_WAIT_UNTIL([test x$(ovn-sbctl --bare --columns chassis find > port_binding \ > +logical_port=sw0-vir) = x$hv1_ch_uuid], [0], []) > + > +AT_CHECK([test x$(ovn-sbctl --bare --columns virtual_parent find > port_binding \ > +logical_port=sw0-vir) = xsw0-p1]) > + > +ovn-sbctl dump-flows lr0 | grep lr_in_arp_resolve | grep "reg0 == > 10.0.0.10" \ > +> lflows.txt > + > +# There should be an arp resolve flow to resolve the virtual_ip with the > +# sw0-p1's MAC. > +AT_CHECK([cat lflows.txt], [0], [dnl > + table=9 (lr_in_arp_resolve ), priority=100 , match=(outport == > "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 50:54:00:00:00:03; next;) > +]) > + > +# send the garp from sw0-p2 (in hv2). hv2 should claim sw0-vir > +# and sw0-p2 shpuld be its virtual_parent. > +eth_src=505400000004 > +eth_dst=ffffffffffff > +spa=$(ip_to_hex 10 0 0 10) > +tpa=$(ip_to_hex 10 0 0 10) > +send_garp 2 1 $eth_src $eth_dst $spa $tpa > + > +OVS_WAIT_UNTIL([test x$(ovn-sbctl --bare --columns chassis find > port_binding \ > +logical_port=sw0-vir) = x$hv2_ch_uuid], [0], []) > + > +AT_CHECK([test x$(ovn-sbctl --bare --columns virtual_parent find > port_binding \ > +logical_port=sw0-vir) = xsw0-p2]) > + > +ovn-sbctl dump-flows lr0 | grep lr_in_arp_resolve | grep "reg0 == > 10.0.0.10" \ > +> lflows.txt > + > +# There should be an arp resolve flow to resolve the virtual_ip with the > +# sw0-p2's MAC. > +AT_CHECK([cat lflows.txt], [0], [dnl > + table=9 (lr_in_arp_resolve ), priority=100 , match=(outport == > "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 50:54:00:00:00:04; next;) > +]) > + > +# Now send arp reply from sw0-p1. hv1 should claim sw0-vir > +# and sw0-p1 shpuld be its virtual_parent. 
> +eth_src=505400000003 > +eth_dst=ffffffffffff > +spa=$(ip_to_hex 10 0 0 10) > +tpa=$(ip_to_hex 10 0 0 4) > +send_arp_reply 1 1 $eth_src $eth_dst $spa $tpa > + > +OVS_WAIT_UNTIL([test x$(ovn-sbctl --bare --columns chassis find > port_binding \ > +logical_port=sw0-vir) = x$hv1_ch_uuid], [0], []) > + > +AT_CHECK([test x$(ovn-sbctl --bare --columns virtual_parent find > port_binding \ > +logical_port=sw0-vir) = xsw0-p1]) > + > +ovn-sbctl dump-flows lr0 | grep lr_in_arp_resolve | grep "reg0 == > 10.0.0.10" \ > +> lflows.txt > + > +AT_CHECK([cat lflows.txt], [0], [dnl > + table=9 (lr_in_arp_resolve ), priority=100 , match=(outport == > "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 50:54:00:00:00:03; next;) > +]) > + > +# Delete hv1-vif1 port. hv1 should release sw0-vir > +as hv1 ovs-vsctl del-port hv1-vif1 > + > +OVS_WAIT_UNTIL([test x$(ovn-sbctl --bare --columns chassis find > port_binding \ > +logical_port=sw0-vir) = x], [0], []) > + > +AT_CHECK([test x$(ovn-sbctl --bare --columns virtual_parent find > port_binding \ > +logical_port=sw0-vir) = x]) > + > +# Since the sw0-vir is not claimed by any chassis, eth.dst should be set > to > +# zero if the ip4.dst is the virtual ip. > +ovn-sbctl dump-flows lr0 | grep lr_in_arp_resolve | grep "reg0 == > 10.0.0.10" \ > +> lflows.txt > + > +AT_CHECK([cat lflows.txt], [0], [dnl > + table=9 (lr_in_arp_resolve ), priority=100 , match=(outport == > "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 00:00:00:00:00:00; next;) > +]) > + > +# Now send arp reply from sw0-p2. hv2 should claim sw0-vir > +# and sw0-p2 shpuld be its virtual_parent. 
> +eth_src=505400000004 > +eth_dst=ffffffffffff > +spa=$(ip_to_hex 10 0 0 10) > +tpa=$(ip_to_hex 10 0 0 3) > +send_arp_reply 2 1 $eth_src $eth_dst $spa $tpa > + > +OVS_WAIT_UNTIL([test x$(ovn-sbctl --bare --columns chassis find > port_binding \ > +logical_port=sw0-vir) = x$hv2_ch_uuid], [0], []) > + > +AT_CHECK([test x$(ovn-sbctl --bare --columns virtual_parent find > port_binding \ > +logical_port=sw0-vir) = xsw0-p2]) > + > +ovn-sbctl dump-flows lr0 | grep lr_in_arp_resolve | grep "reg0 == > 10.0.0.10" \ > +> lflows.txt > + > +AT_CHECK([cat lflows.txt], [0], [dnl > + table=9 (lr_in_arp_resolve ), priority=100 , match=(outport == > "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 50:54:00:00:00:04; next;) > +]) > + > +# Delete sw0-p2 logical port > +ovn-nbctl lsp-del sw0-p2 > + > +OVS_WAIT_UNTIL([test x$(ovn-sbctl --bare --columns chassis find > port_binding \ > +logical_port=sw0-vir) = x], [0], []) > + > +AT_CHECK([test x$(ovn-sbctl --bare --columns virtual_parent find > port_binding \ > +logical_port=sw0-vir) = x]) > + > +# Clear virtual_ip column of sw0-vir. There should be no bind_vport flows. > +ovn-nbctl --wait=hv remove logical_switch_port sw0-vir options virtual-ip > + > +ovn-sbctl dump-flows sw0 | grep ls_in_arp_rsp | grep bind_vport > > lflows.txt > + > +AT_CHECK([cat lflows.txt], [0], [dnl > +]) > + > +# Add back virtual_ip and clear virtual_parents. 
> +ovn-nbctl --wait=hv set logical_switch_port sw0-vir > options:virtual-ip=10.0.0.10 > + > +ovn-sbctl dump-flows sw0 | grep ls_in_arp_rsp | grep bind_vport > > lflows.txt > + > +AT_CHECK([cat lflows.txt], [0], [dnl > + table=11(ls_in_arp_rsp ), priority=100 , match=(inport == > "sw0-p1" && !is_chassis_resident("sw0-vir") && ((arp.op == 1 && arp.spa == > 10.0.0.10 && arp.tpa == 10.0.0.10) || (arp.op == 2 && arp.spa == > 10.0.0.10))), action=(bind_vport("sw0-vir", inport); next;) > +]) > + > +ovn-nbctl --wait=hv remove logical_switch_port sw0-vir options > virtual-parents > +ovn-sbctl dump-flows sw0 | grep ls_in_arp_rsp | grep bind_vport > > lflows.txt > + > +AT_CHECK([cat lflows.txt], [0], [dnl > +]) > + > +ovn-sbctl dump-flows lr0 | grep lr_in_arp_resolve | grep "reg0 == > 10.0.0.10" \ > +> lflows.txt > + > +AT_CHECK([cat lflows.txt], [0], [dnl > +]) > + > +OVN_CLEANUP([hv1], [hv2]) > +AT_CLEANUP > + > # Run ovn-nbctl in daemon mode, change to a backup database and verify > that > # an insert operation is not allowed. > AT_SETUP([ovn -- can't write to a backup database server instance]) > diff --git a/tests/test-ovn.c b/tests/test-ovn.c > index 0b9e8246e..cf1bc5432 100644 > --- a/tests/test-ovn.c > +++ b/tests/test-ovn.c > @@ -1253,6 +1253,7 @@ test_parse_actions(struct ovs_cmdl_context *ctx > OVS_UNUSED) > simap_put(&ports, "eth0", 5); > simap_put(&ports, "eth1", 6); > simap_put(&ports, "LOCAL", ofp_to_u16(OFPP_LOCAL)); > + simap_put(&ports, "lsp1", 0x11); > > ds_init(&input); > while (!ds_get_test_line(&input, stdin)) { > -- > 2.21.0 > > _______________________________________________ > dev mailing list > dev@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev >
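For anyone following the test cases above, the ARP part of the bind_vport match can be sketched as standalone C. This is only an illustration of the condition "(arp.op == 1 && arp.spa == VIP && arp.tpa == VIP) || (arp.op == 2 && arp.spa == VIP)" from the logical flow. The names arp_fields, ipv4 and arp_triggers_bind_vport are hypothetical and do not appear in the patch, and the inport and !is_chassis_resident() parts of the match are deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical type, not part of the patch: the ARP fields that the
 * ls_in_arp_rsp match inspects, with IPv4 addresses in host byte order. */
struct arp_fields {
    uint16_t op;   /* 1 = request, 2 = reply. */
    uint32_t spa;  /* Sender protocol address. */
    uint32_t tpa;  /* Target protocol address. */
};

/* Hypothetical helper: packs four octets into a host-order IPv4 address. */
uint32_t
ipv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t) a << 24) | ((uint32_t) b << 16)
           | ((uint32_t) c << 8) | d;
}

/* Returns true if the ARP packet satisfies the ARP portion of the
 * bind_vport match for virtual ip 'vip':
 *   - a gratuitous ARP request announcing the VIP (spa == tpa == VIP), or
 *   - an ARP reply whose sender address is the VIP.
 * The inport and is_chassis_resident() checks are not modelled here. */
bool
arp_triggers_bind_vport(const struct arp_fields *arp, uint32_t vip)
{
    bool garp = arp->op == 1 && arp->spa == vip && arp->tpa == vip;
    bool reply = arp->op == 2 && arp->spa == vip;
    return garp || reply;
}
```

Under these assumptions, the GARP sent from sw0-p1 (spa = tpa = 10.0.0.10) and the ARP reply with spa = 10.0.0.10, tpa = 10.0.0.4 both satisfy the condition, while an ordinary ARP request from 10.0.0.3 asking for 10.0.0.10 does not.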
On Fri, Aug 2, 2019 at 6:20 AM Guru Shetty <guru@ovn.org> wrote:
>
> On Thu, 1 Aug 2019 at 10:52, <nusiddiq@redhat.com> wrote:
>
>> From: Numan Siddique <nusiddiq@redhat.com>
>>
>> This new type is added for the following reasons:
>>
>> - When a load balancer is created in an OpenStack deployment with
>>   Octavia service, it creates a logical port 'VIP' for the virtual ip.
>>
>> - This logical port is not bound to any VIF.
>>
>> - Octavia service creates a service VM (with another logical port 'P'
>>   which belongs to the same logical switch)
>>
>> - The virtual ip 'VIP' is configured on this service VM.
>>
>> - This service VM provides the load balancing for the VIP with the
>>   configured backend IPs.
>>
>> - Octavia service can be configured to create few service VMs with
>>   active-standby mode with the active VM configured with the VIP.
>>   The VIP can move between these service nodes.
>>
>> Presently there are few problems:
>>
>> - When a floating ip (externally reachable IP) is associated to the VIP
>>   and if the compute nodes have external connectivity then the external
>>   traffic cannot reach the VIP using the floating ip as the VIP logical
>>   port would be down. dnat_and_snat entry in NAT table for this vip will
>>   have 'external_mac' and 'logical_port' configured.
>>
>> - The only way to make it work is to clear the 'external_mac' entry so
>>   that the gateway chassis does the DNAT for the VIP.
>>
>> To solve these problems, this patch proposes a new logical port type -
>> virtual.
>> CMS when creating the logical port for the VIP, should
>>
>> - set the type as 'virtual'
>>
>> - configure the VIP in the options -
>>   Logical_Switch_Port.options:virtual-ip
>>
>> - And set the virtual parents in the options
>>   Logical_Switch_Port.options:virtual-parents.
>>   These virtual parents are the one which can be configured with the VIP.
>>
>> If suppose the virtual_ip is configured to 10.0.0.10 on a virtual logical
>> port 'sw0-vip' and the virtual_parents are set to - [sw0-p1, sw0-p2] then
>> below logical flows are added in the lsp_in_arp_rsp logical switch pipeline
>>
>> - table=11(ls_in_arp_rsp), priority=100,
>>   match=(inport == "sw0-p1" && !is_chassis_resident("sw0-vip") &&
>>          ((arp.op == 1 && arp.spa == 10.0.0.10 && arp.tpa == 10.0.0.10) ||
>>           (arp.op == 2 && arp.spa == 10.0.0.10))),
>>   action=(bind_vport("sw0-vip", inport); next;)
>> - table=11(ls_in_arp_rsp), priority=100,
>>   match=(inport == "sw0-p2" && !is_chassis_resident("sw0-vip") &&
>>          ((arp.op == 1 && arp.spa == 10.0.0.10 && arp.tpa == 10.0.0.10) ||
>>           (arp.op == 2 && arp.spa == 10.0.0.10))),
>>   action=(bind_vport("sw0-vip", inport); next;)
>>
>> The action bind_vport will claim the logical port - sw0-vip on the
>> chassis where this action is executed. Since the port - sw0-vip is
>> claimed by a chassis, the dnat_and_snat rule for the VIP will be
>> handled by the compute node.
>>
>> Co-authored-by: Ben Pfaff <blp@ovn.org>
>> Signed-off-by: Ben Pfaff <blp@ovn.org>
>> Acked-by: Gurucharan Shetty <guru@ovn.org>
>> Acked-by: Mark Michelson <mmichels@redhat.com>
>> Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
>>
>
> Ben is on vacation. So I took the liberty to commit it.
>

Thanks for the review and committing the patch.
Numan > > >> >> (cherry picked from ovn commit 054f4c85c413e20d893e10ba053ec52ac15db49c) >> --- >> NEWS | 1 + >> include/ovn/actions.h | 18 ++- >> ovn/controller/binding.c | 30 +++- >> ovn/controller/pinctrl.c | 174 ++++++++++++++++++++ >> ovn/lib/actions.c | 59 +++++++ >> ovn/lib/ovn-util.c | 1 + >> ovn/northd/ovn-northd.8.xml | 61 ++++++- >> ovn/northd/ovn-northd.c | 306 +++++++++++++++++++++++++++--------- >> ovn/ovn-nb.xml | 45 ++++++ >> ovn/ovn-sb.ovsschema | 6 +- >> ovn/ovn-sb.xml | 46 ++++++ >> ovn/utilities/ovn-trace.c | 3 + >> tests/ovn.at | 290 ++++++++++++++++++++++++++++++++++ >> tests/test-ovn.c | 1 + >> 14 files changed, 954 insertions(+), 87 deletions(-) >> >> diff --git a/NEWS b/NEWS >> index 8cf850823..be3ea42b4 100644 >> --- a/NEWS >> +++ b/NEWS >> @@ -60,6 +60,7 @@ v2.12.0 - xx xxx xxxx >> logical groups which results in tunnels only been formed between >> members of the same transport zone(s). >> * Support for IGMP Snooping and IGMP Querier. >> + * Support for new logical switch port type - 'virtual'. >> - New QoS type "linux-netem" on Linux. >> - Added support for TLS Server Name Indication (SNI). >> - Linux datapath: >> diff --git a/include/ovn/actions.h b/include/ovn/actions.h >> index 63d3907d8..0ca06537c 100644 >> --- a/include/ovn/actions.h >> +++ b/include/ovn/actions.h >> @@ -85,7 +85,8 @@ struct ovn_extend_table; >> OVNACT(SET_METER, ovnact_set_meter) \ >> OVNACT(OVNFIELD_LOAD, ovnact_load) \ >> OVNACT(CHECK_PKT_LARGER, ovnact_check_pkt_larger) \ >> - OVNACT(TRIGGER_EVENT, ovnact_controller_event) >> + OVNACT(TRIGGER_EVENT, ovnact_controller_event) \ >> + OVNACT(BIND_VPORT, ovnact_bind_vport) >> >> /* enum ovnact_type, with a member OVNACT_<ENUM> for each action. */ >> enum OVS_PACKED_ENUM ovnact_type { >> @@ -328,6 +329,13 @@ struct ovnact_controller_event { >> size_t n_options; >> }; >> >> +/* OVNACT_BIND_VPORT. 
*/ >> +struct ovnact_bind_vport { >> + struct ovnact ovnact; >> + char *vport; >> + struct expr_field vport_parent; /* Logical virtual port's port >> name. */ >> +}; >> + >> /* Internal use by the helpers below. */ >> void ovnact_init(struct ovnact *, enum ovnact_type, size_t len); >> void *ovnact_put(struct ofpbuf *, enum ovnact_type, size_t len); >> @@ -505,6 +513,14 @@ enum action_opcode { >> * Snoop IGMP, learn the multicast participants >> */ >> ACTION_OPCODE_IGMP, >> + >> + /* "bind_vport(vport, vport_parent)". >> + * >> + * 'vport' follows the action_header, in the format - 32-bit field. >> + * 'vport_parent' is passed through the packet metadata as >> + * MFF_LOG_INPORT. >> + */ >> + ACTION_OPCODE_BIND_VPORT, >> }; >> >> /* Header. */ >> diff --git a/ovn/controller/binding.c b/ovn/controller/binding.c >> index ace0f811b..dfe002b60 100644 >> --- a/ovn/controller/binding.c >> +++ b/ovn/controller/binding.c >> @@ -571,11 +571,31 @@ consider_local_datapath(struct ovsdb_idl_txn >> *ovnsb_idl_txn, >> sbrec_port_binding_set_encap(binding_rec, encap_rec); >> } >> } else if (binding_rec->chassis == chassis_rec) { >> - VLOG_INFO("Releasing lport %s from this chassis.", >> - binding_rec->logical_port); >> - if (binding_rec->encap) >> - sbrec_port_binding_set_encap(binding_rec, NULL); >> - sbrec_port_binding_set_chassis(binding_rec, NULL); >> + if (!strcmp(binding_rec->type, "virtual")) { >> + /* pinctrl module takes care of binding the ports >> + * of type 'virtual'. >> + * Release such ports if their virtual parents are no >> + * longer claimed by this chassis. 
*/ >> + const struct sbrec_port_binding *parent >> + = lport_lookup_by_name(sbrec_port_binding_by_name, >> + binding_rec->virtual_parent); >> + if (!parent || parent->chassis != chassis_rec) { >> + VLOG_INFO("Releasing lport %s from this chassis.", >> + binding_rec->logical_port); >> + if (binding_rec->encap) { >> + sbrec_port_binding_set_encap(binding_rec, NULL); >> + } >> + sbrec_port_binding_set_chassis(binding_rec, NULL); >> + sbrec_port_binding_set_virtual_parent(binding_rec, >> NULL); >> + } >> + } else { >> + VLOG_INFO("Releasing lport %s from this chassis.", >> + binding_rec->logical_port); >> + if (binding_rec->encap) { >> + sbrec_port_binding_set_encap(binding_rec, NULL); >> + } >> + sbrec_port_binding_set_chassis(binding_rec, NULL); >> + } >> } else if (our_chassis) { >> static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, >> 1); >> VLOG_INFO_RL(&rl, >> diff --git a/ovn/controller/pinctrl.c b/ovn/controller/pinctrl.c >> index d857067a5..357050eb5 100644 >> --- a/ovn/controller/pinctrl.c >> +++ b/ovn/controller/pinctrl.c >> @@ -273,9 +273,22 @@ static void pinctrl_ip_mcast_handle_igmp( >> >> static bool may_inject_pkts(void); >> >> +static void init_put_vport_bindings(void); >> +static void destroy_put_vport_bindings(void); >> +static void run_put_vport_bindings( >> + struct ovsdb_idl_txn *ovnsb_idl_txn, >> + struct ovsdb_idl_index *sbrec_datapath_binding_by_key, >> + struct ovsdb_idl_index *sbrec_port_binding_by_key, >> + const struct sbrec_chassis *chassis) >> + OVS_REQUIRES(pinctrl_mutex); >> +static void wait_put_vport_bindings(struct ovsdb_idl_txn *ovnsb_idl_txn); >> +static void pinctrl_handle_bind_vport(const struct flow *md, >> + struct ofpbuf *userdata); >> + >> COVERAGE_DEFINE(pinctrl_drop_put_mac_binding); >> COVERAGE_DEFINE(pinctrl_drop_buffered_packets_map); >> COVERAGE_DEFINE(pinctrl_drop_controller_event); >> +COVERAGE_DEFINE(pinctrl_drop_put_vport_binding); >> >> struct empty_lb_backends_event { >> struct hmap_node hmap_node; >> @@ 
-432,6 +445,7 @@ pinctrl_init(void) >> init_buffered_packets_map(); >> init_event_table(); >> ip_mcast_snoop_init(); >> + init_put_vport_bindings(); >> pinctrl.br_int_name = NULL; >> pinctrl_handler_seq = seq_create(); >> pinctrl_main_seq = seq_create(); >> @@ -1957,6 +1971,12 @@ process_packet_in(struct rconn *swconn, const >> struct ofp_header *msg) >> ovs_mutex_unlock(&pinctrl_mutex); >> break; >> >> + case ACTION_OPCODE_BIND_VPORT: >> + ovs_mutex_lock(&pinctrl_mutex); >> + pinctrl_handle_bind_vport(&pin.flow_metadata.flow, &userdata); >> + ovs_mutex_unlock(&pinctrl_mutex); >> + break; >> + >> default: >> VLOG_WARN_RL(&rl, "unrecognized packet-in opcode %"PRIu32, >> ntohl(ah->opcode)); >> @@ -2135,6 +2155,8 @@ pinctrl_run(struct ovsdb_idl_txn *ovnsb_idl_txn, >> run_put_mac_bindings(ovnsb_idl_txn, sbrec_datapath_binding_by_key, >> sbrec_port_binding_by_key, >> sbrec_mac_binding_by_lport_ip); >> + run_put_vport_bindings(ovnsb_idl_txn, sbrec_datapath_binding_by_key, >> + sbrec_port_binding_by_key, chassis); >> send_garp_prepare(sbrec_port_binding_by_datapath, >> sbrec_port_binding_by_name, br_int, chassis, >> local_datapaths, active_tunnels); >> @@ -2481,6 +2503,7 @@ pinctrl_wait(struct ovsdb_idl_txn *ovnsb_idl_txn) >> { >> wait_put_mac_bindings(ovnsb_idl_txn); >> wait_controller_event(ovnsb_idl_txn); >> + wait_put_vport_bindings(ovnsb_idl_txn); >> int64_t new_seq = seq_read(pinctrl_main_seq); >> seq_wait(pinctrl_main_seq, new_seq); >> } >> @@ -2498,6 +2521,7 @@ pinctrl_destroy(void) >> destroy_buffered_packets_map(); >> event_table_destroy(); >> destroy_put_mac_bindings(); >> + destroy_put_vport_bindings(); >> destroy_dns_cache(); >> ip_mcast_snoop_destroy(); >> seq_destroy(pinctrl_main_seq); >> @@ -4341,3 +4365,153 @@ pinctrl_handle_event(struct ofpbuf *userdata) >> return; >> } >> } >> + >> +struct put_vport_binding { >> + struct hmap_node hmap_node; >> + >> + /* Key and value. 
*/ >> + uint32_t dp_key; >> + uint32_t vport_key; >> + >> + uint32_t vport_parent_key; >> +}; >> + >> +/* Contains "struct put_vport_binding"s. */ >> +static struct hmap put_vport_bindings; >> + >> +static void >> +init_put_vport_bindings(void) >> +{ >> + hmap_init(&put_vport_bindings); >> +} >> + >> +static void >> +flush_put_vport_bindings(void) >> +{ >> + struct put_vport_binding *vport_b; >> + HMAP_FOR_EACH_POP (vport_b, hmap_node, &put_vport_bindings) { >> + free(vport_b); >> + } >> +} >> + >> +static void >> +destroy_put_vport_bindings(void) >> +{ >> + flush_put_vport_bindings(); >> + hmap_destroy(&put_vport_bindings); >> +} >> + >> +static void >> +wait_put_vport_bindings(struct ovsdb_idl_txn *ovnsb_idl_txn) >> +{ >> + if (ovnsb_idl_txn && !hmap_is_empty(&put_vport_bindings)) { >> + poll_immediate_wake(); >> + } >> +} >> + >> +static struct put_vport_binding * >> +pinctrl_find_put_vport_binding(uint32_t dp_key, uint32_t vport_key, >> + uint32_t hash) >> +{ >> + struct put_vport_binding *vpb; >> + HMAP_FOR_EACH_WITH_HASH (vpb, hmap_node, hash, &put_vport_bindings) { >> + if (vpb->dp_key == dp_key && vpb->vport_key == vport_key) { >> + return vpb; >> + } >> + } >> + return NULL; >> +} >> + >> +static void >> +run_put_vport_binding(struct ovsdb_idl_txn *ovnsb_idl_txn OVS_UNUSED, >> + struct ovsdb_idl_index >> *sbrec_datapath_binding_by_key, >> + struct ovsdb_idl_index *sbrec_port_binding_by_key, >> + const struct sbrec_chassis *chassis, >> + const struct put_vport_binding *vpb) >> +{ >> + /* Convert logical datapath and logical port key into lport. 
*/ >> + const struct sbrec_port_binding *pb = lport_lookup_by_key( >> + sbrec_datapath_binding_by_key, sbrec_port_binding_by_key, >> + vpb->dp_key, vpb->vport_key); >> + if (!pb) { >> + static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5); >> + >> + VLOG_WARN_RL(&rl, "unknown logical port with datapath %"PRIu32" " >> + "and port %"PRIu32, vpb->dp_key, vpb->vport_key); >> + return; >> + } >> + >> + /* pinctrl module updates the port binding only for type 'virtual'. >> */ >> + if (!strcmp(pb->type, "virtual")) { >> + const struct sbrec_port_binding *parent = lport_lookup_by_key( >> + sbrec_datapath_binding_by_key, sbrec_port_binding_by_key, >> + vpb->dp_key, vpb->vport_parent_key); >> + if (parent) { >> + VLOG_INFO("Claiming virtual lport %s for this chassis " >> + "with the virtual parent %s", >> + pb->logical_port, parent->logical_port); >> + sbrec_port_binding_set_chassis(pb, chassis); >> + sbrec_port_binding_set_virtual_parent(pb, >> parent->logical_port); >> + } >> + } >> +} >> + >> +/* Called by pinctrl_run(). Runs with in the main ovn-controller >> + * thread context. */ >> +static void >> +run_put_vport_bindings(struct ovsdb_idl_txn *ovnsb_idl_txn, >> + struct ovsdb_idl_index >> *sbrec_datapath_binding_by_key, >> + struct ovsdb_idl_index *sbrec_port_binding_by_key, >> + const struct sbrec_chassis *chassis) >> + OVS_REQUIRES(pinctrl_mutex) >> +{ >> + if (!ovnsb_idl_txn) { >> + return; >> + } >> + >> + const struct put_vport_binding *vpb; >> + HMAP_FOR_EACH (vpb, hmap_node, &put_vport_bindings) { >> + run_put_vport_binding(ovnsb_idl_txn, >> sbrec_datapath_binding_by_key, >> + sbrec_port_binding_by_key, chassis, vpb); >> + } >> + >> + flush_put_vport_bindings(); >> +} >> + >> +/* Called with in the pinctrl_handler thread context. */ >> +static void >> +pinctrl_handle_bind_vport( >> + const struct flow *md, struct ofpbuf *userdata) >> + OVS_REQUIRES(pinctrl_mutex) >> +{ >> + /* Get the datapath key from the packet metadata. 
*/ >> + uint32_t dp_key = ntohll(md->metadata); >> + uint32_t vport_parent_key = md->regs[MFF_LOG_INPORT - MFF_REG0]; >> + >> + /* Get the virtual port key from the userdata buffer. */ >> + uint32_t *vport_key = ofpbuf_try_pull(userdata, sizeof *vport_key); >> + >> + if (!vport_key) { >> + return; >> + } >> + >> + uint32_t hash = hash_2words(dp_key, *vport_key); >> + >> + struct put_vport_binding *vpb >> + = pinctrl_find_put_vport_binding(dp_key, *vport_key, hash); >> + if (!vpb) { >> + if (hmap_count(&put_vport_bindings) >= 1000) { >> + COVERAGE_INC(pinctrl_drop_put_vport_binding); >> + return; >> + } >> + >> + vpb = xmalloc(sizeof *vpb); >> + hmap_insert(&put_vport_bindings, &vpb->hmap_node, hash); >> + } >> + >> + vpb->dp_key = dp_key; >> + vpb->vport_key = *vport_key; >> + vpb->vport_parent_key = vport_parent_key; >> + >> + notify_pinctrl_main(); >> +} >> diff --git a/ovn/lib/actions.c b/ovn/lib/actions.c >> index 4eacc44ed..66916a837 100644 >> --- a/ovn/lib/actions.c >> +++ b/ovn/lib/actions.c >> @@ -2599,6 +2599,63 @@ ovnact_check_pkt_larger_free(struct >> ovnact_check_pkt_larger *cipl OVS_UNUSED) >> { >> } >> >> +static void >> +parse_bind_vport(struct action_context *ctx) >> +{ >> + if (!lexer_force_match(ctx->lexer, LEX_T_LPAREN)) { >> + return; >> + } >> + >> + if (ctx->lexer->token.type != LEX_T_STRING) { >> + lexer_syntax_error(ctx->lexer, "expecting port name string"); >> + return; >> + } >> + >> + struct ovnact_bind_vport *bind_vp = >> ovnact_put_BIND_VPORT(ctx->ovnacts); >> + bind_vp->vport = xstrdup(ctx->lexer->token.s); >> + lexer_get(ctx->lexer); >> + (void) (lexer_force_match(ctx->lexer, LEX_T_COMMA) >> + && action_parse_field(ctx, 0, false, &bind_vp->vport_parent) >> + && lexer_force_match(ctx->lexer, LEX_T_RPAREN)); >> +} >> + >> +static void >> +format_BIND_VPORT(const struct ovnact_bind_vport *bind_vp, >> + struct ds *s ) >> +{ >> + ds_put_format(s, "bind_vport(\"%s\", ", bind_vp->vport); >> + expr_field_format(&bind_vp->vport_parent, s); >> 
+ ds_put_cstr(s, ");"); >> +} >> + >> +static void >> +encode_BIND_VPORT(const struct ovnact_bind_vport *vp, >> + const struct ovnact_encode_params *ep, >> + struct ofpbuf *ofpacts) >> +{ >> + uint32_t vport_key; >> + if (!ep->lookup_port(ep->aux, vp->vport, &vport_key)) { >> + return; >> + } >> + >> + const struct arg args[] = { >> + { expr_resolve_field(&vp->vport_parent), MFF_LOG_INPORT }, >> + }; >> + encode_setup_args(args, ARRAY_SIZE(args), ofpacts); >> + size_t oc_offset = >> encode_start_controller_op(ACTION_OPCODE_BIND_VPORT, >> + false, >> NX_CTLR_NO_METER, >> + ofpacts); >> + ofpbuf_put(ofpacts, &vport_key, sizeof(uint32_t)); >> + encode_finish_controller_op(oc_offset, ofpacts); >> + encode_restore_args(args, ARRAY_SIZE(args), ofpacts); >> +} >> + >> +static void >> +ovnact_bind_vport_free(struct ovnact_bind_vport *bp) >> +{ >> + free(bp->vport); >> +} >> + >> /* Parses an assignment or exchange or put_dhcp_opts action. */ >> static void >> parse_set_action(struct action_context *ctx) >> @@ -2706,6 +2763,8 @@ parse_action(struct action_context *ctx) >> parse_set_meter_action(ctx); >> } else if (lexer_match_id(ctx->lexer, "trigger_event")) { >> parse_trigger_event(ctx, ovnact_put_TRIGGER_EVENT(ctx->ovnacts)); >> + } else if (lexer_match_id(ctx->lexer, "bind_vport")) { >> + parse_bind_vport(ctx); >> } else { >> lexer_syntax_error(ctx->lexer, "expecting action"); >> } >> diff --git a/ovn/lib/ovn-util.c b/ovn/lib/ovn-util.c >> index 0f07d80ac..de745d73f 100644 >> --- a/ovn/lib/ovn-util.c >> +++ b/ovn/lib/ovn-util.c >> @@ -326,6 +326,7 @@ static const char *OVN_NB_LSP_TYPES[] = { >> "router", >> "vtep", >> "external", >> + "virtual", >> }; >> >> bool >> diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml >> index d2267de0e..6ff7aaff1 100644 >> --- a/ovn/northd/ovn-northd.8.xml >> +++ b/ovn/northd/ovn-northd.8.xml >> @@ -519,6 +519,34 @@ >> some additional flow cost for this and the value appears limited. 
>> </li> >> >> + <li> >> + <p> >> + If inport <code>V</code> is of type <code>virtual</code> adds a >> + priority-100 logical flow for each <var>P</var> configured in >> the >> + <ref table="Logical_Switch_Port" >> column="options:virtual-parents"/> >> + column with the match >> + </p> >> + <pre> >> +<code>inport == <var>P</var> && >> !is_chassis_resident(<var>V</var>) && ((arp.op == 1 && >> arp.spa == <var>VIP</var> && arp.tpa == <var>VIP</var>) || (arp.op >> == 2 && arp.spa == <var>VIP</var>))</code> >> + </pre> >> + >> + <p> >> + and applies the action >> + </p> >> + <pre> >> +<code>bind_vport(<var>V</var>, inport);</code> >> + </pre> >> + >> + <p> >> + and advances the packet to the next table. >> + </p> >> + >> + <p> >> + Where <var>VIP</var> is the virtual ip configured in the column >> + <ref table="Logical_Switch_Port" column="options:virtual-ip"/>. >> + </p> >> + </li> >> + >> <li> >> <p> >> Priority-50 flows that match ARP requests to each known IP >> address >> @@ -541,7 +569,8 @@ output; >> >> <p> >> These flows are omitted for logical ports (other than router >> ports or >> - <code>localport</code> ports) that are down. >> + <code>localport</code> ports) that are down and for logical >> ports of >> + type <code>virtual</code>. >> </p> >> </li> >> >> @@ -588,7 +617,8 @@ nd_na_router { >> >> <p> >> These flows are omitted for logical ports (other than router >> ports or >> - <code>localport</code> ports) that are down. >> + <code>localport</code> ports) that are down and for logical >> ports of >> + type <code>virtual</code>. >> </p> >> </li> >> >> @@ -2031,6 +2061,33 @@ next; >> <code>eth.dst = <var>E</var>; next;</code>. 
>> </p> >> >> + <p> >> + For each virtual ip <var>A</var> configured on a logical port >> + of type <code>virtual</code> and its virtual parent set in >> + its corresponding <ref db="OVN_Southbound" >> table="Port_Binding"/> >> + record and the virtual parent with the Ethernet address >> <var>E</var> >> + and the virtual ip is reachable via the router port >> <var>P</var>, a >> + priority-100 flow with match <code>outport == <var>P</var> >> + && reg0 == <var>A</var></code> has actions >> + <code>eth.dst = <var>E</var>; next;</code>. >> + </p> >> + >> + <p> >> + For each virtual ip <var>A</var> configured on a logical port >> + of type <code>virtual</code> and its virtual parent >> <code>not</code> >> + set in its corresponding >> + <ref db="OVN_Southbound" table="Port_Binding"/> >> + record and the virtual ip <var>A</var> is reachable via the >> + router port <var>P</var>, a >> + priority-100 flow with match <code>outport == <var>P</var> >> + && reg0 == <var>A</var></code> has actions >> + <code>eth.dst = <var>00:00:00:00:00:00</var>; next;</code>. >> + This flow is added so that the ARP is always resolved for the >> + virtual ip <var>A</var> by generating an ARP request and >> + <code>not</code> consulting the MAC_Binding table as it can >> have >> + an incorrect value for the virtual ip <var>A</var>.
>> + </p> >> + >> <p> >> For each IPv6 address <var>A</var> whose host is known to have >> Ethernet address <var>E</var> on router port <var>P</var>, a >> diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c >> index eb6c47cad..ae09cf338 100644 >> --- a/ovn/northd/ovn-northd.c >> +++ b/ovn/northd/ovn-northd.c >> @@ -4878,96 +4878,146 @@ build_lswitch_flows(struct hmap *datapaths, >> struct hmap *ports, >> continue; >> } >> >> - /* >> - * Add ARP/ND reply flows if either the >> - * - port is up or >> - * - port type is router or >> - * - port type is localport >> - */ >> - if (!lsp_is_up(op->nbsp) && strcmp(op->nbsp->type, "router") && >> - strcmp(op->nbsp->type, "localport")) { >> - continue; >> - } >> + if (!strcmp(op->nbsp->type, "virtual")) { >> + /* Handle >> + * - GARPs for virtual ip which belongs to a logical port >> + * of type 'virtual' and bind that port. >> + * >> + * - ARP reply from the virtual ip which belongs to a >> logical >> + * port of type 'virtual' and bind that port. >> + * */ >> + ovs_be32 ip; >> + const char *virtual_ip = smap_get(&op->nbsp->options, >> + "virtual-ip"); >> + const char *virtual_parents = smap_get(&op->nbsp->options, >> + "virtual-parents"); >> + if (!virtual_ip || !virtual_parents || >> + !ip_parse(virtual_ip, &ip)) { >> + continue; >> + } >> >> - if (lsp_is_external(op->nbsp)) { >> - continue; >> - } >> + char *tokstr = xstrdup(virtual_parents); >> + char *save_ptr = NULL; >> + char *vparent; >> + for (vparent = strtok_r(tokstr, ",", &save_ptr); vparent != >> NULL; >> + vparent = strtok_r(NULL, ",", &save_ptr)) { >> + struct ovn_port *vp = ovn_port_find(ports, vparent); >> + if (!vp || vp->od != op->od) { >> + /* vparent name should be valid and it should belong >> + * to the same logical switch. 
*/ >> + continue; >> + } >> >> - for (size_t i = 0; i < op->n_lsp_addrs; i++) { >> - for (size_t j = 0; j < op->lsp_addrs[i].n_ipv4_addrs; j++) { >> ds_clear(&match); >> - ds_put_format(&match, "arp.tpa == %s && arp.op == 1", >> - op->lsp_addrs[i].ipv4_addrs[j].addr_s); >> + ds_put_format(&match, "inport == \"%s\" && " >> + "!is_chassis_resident(%s) && " >> + "((arp.op == 1 && arp.spa == %s && " >> + "arp.tpa == %s) || (arp.op == 2 && " >> + "arp.spa == %s))", >> + vparent, op->json_key, virtual_ip, >> virtual_ip, >> + virtual_ip); >> ds_clear(&actions); >> ds_put_format(&actions, >> - "eth.dst = eth.src; " >> - "eth.src = %s; " >> - "arp.op = 2; /* ARP reply */ " >> - "arp.tha = arp.sha; " >> - "arp.sha = %s; " >> - "arp.tpa = arp.spa; " >> - "arp.spa = %s; " >> - "outport = inport; " >> - "flags.loopback = 1; " >> - "output;", >> - op->lsp_addrs[i].ea_s, op->lsp_addrs[i].ea_s, >> - op->lsp_addrs[i].ipv4_addrs[j].addr_s); >> - ovn_lflow_add(lflows, op->od, S_SWITCH_IN_ARP_ND_RSP, 50, >> + "bind_vport(%s, inport); " >> + "next;", >> + op->json_key); >> + ovn_lflow_add(lflows, op->od, S_SWITCH_IN_ARP_ND_RSP, >> 100, >> ds_cstr(&match), ds_cstr(&actions)); >> + } >> >> - /* Do not reply to an ARP request from the port that >> owns the >> - * address (otherwise a DHCP client that ARPs to check >> for a >> - * duplicate address will fail). Instead, forward it >> the usual >> - * way. >> - * >> - * (Another alternative would be to simply drop the >> packet. If >> - * everything is working as it is configured, then this >> would >> - * produce equivalent results, since no one should reply >> to the >> - * request. But ARPing for one's own IP address is >> intended to >> - * detect situations where the network is not working as >> - * configured, so dropping the request would frustrate >> that >> - * intent.) 
*/ >> - ds_put_format(&match, " && inport == %s", op->json_key); >> - ovn_lflow_add(lflows, op->od, S_SWITCH_IN_ARP_ND_RSP, >> 100, >> - ds_cstr(&match), "next;"); >> + free(tokstr); >> + } else { >> + /* >> + * Add ARP/ND reply flows if either the >> + * - port is up or >> + * - port type is router or >> + * - port type is localport >> + */ >> + if (!lsp_is_up(op->nbsp) && strcmp(op->nbsp->type, "router") >> && >> + strcmp(op->nbsp->type, "localport")) { >> + continue; >> } >> >> - /* For ND solicitations, we need to listen for both the >> - * unicast IPv6 address and its all-nodes multicast address, >> - * but always respond with the unicast IPv6 address. */ >> - for (size_t j = 0; j < op->lsp_addrs[i].n_ipv6_addrs; j++) { >> - ds_clear(&match); >> - ds_put_format(&match, >> - "nd_ns && ip6.dst == {%s, %s} && nd.target == >> %s", >> - op->lsp_addrs[i].ipv6_addrs[j].addr_s, >> - op->lsp_addrs[i].ipv6_addrs[j].sn_addr_s, >> - op->lsp_addrs[i].ipv6_addrs[j].addr_s); >> + if (lsp_is_external(op->nbsp)) { >> + continue; >> + } >> >> - ds_clear(&actions); >> - ds_put_format(&actions, >> - "%s { " >> + for (size_t i = 0; i < op->n_lsp_addrs; i++) { >> + for (size_t j = 0; j < op->lsp_addrs[i].n_ipv4_addrs; >> j++) { >> + ds_clear(&match); >> + ds_put_format(&match, "arp.tpa == %s && arp.op == 1", >> + op->lsp_addrs[i].ipv4_addrs[j].addr_s); >> + ds_clear(&actions); >> + ds_put_format(&actions, >> + "eth.dst = eth.src; " >> "eth.src = %s; " >> - "ip6.src = %s; " >> - "nd.target = %s; " >> - "nd.tll = %s; " >> + "arp.op = 2; /* ARP reply */ " >> + "arp.tha = arp.sha; " >> + "arp.sha = %s; " >> + "arp.tpa = arp.spa; " >> + "arp.spa = %s; " >> "outport = inport; " >> "flags.loopback = 1; " >> - "output; " >> - "};", >> - !strcmp(op->nbsp->type, "router") ? 
>> - "nd_na_router" : "nd_na", >> - op->lsp_addrs[i].ea_s, >> - op->lsp_addrs[i].ipv6_addrs[j].addr_s, >> - op->lsp_addrs[i].ipv6_addrs[j].addr_s, >> - op->lsp_addrs[i].ea_s); >> - ovn_lflow_add(lflows, op->od, S_SWITCH_IN_ARP_ND_RSP, 50, >> - ds_cstr(&match), ds_cstr(&actions)); >> + "output;", >> + op->lsp_addrs[i].ea_s, op->lsp_addrs[i].ea_s, >> + op->lsp_addrs[i].ipv4_addrs[j].addr_s); >> + ovn_lflow_add(lflows, op->od, >> S_SWITCH_IN_ARP_ND_RSP, 50, >> + ds_cstr(&match), ds_cstr(&actions)); >> + >> + /* Do not reply to an ARP request from the port that >> owns >> + * the address (otherwise a DHCP client that ARPs to >> check >> + * for a duplicate address will fail). Instead, >> forward >> + * it the usual way. >> + * >> + * (Another alternative would be to simply drop the >> packet. >> + * If everything is working as it is configured, >> then this >> + * would produce equivalent results, since no one >> should >> + * reply to the request. But ARPing for one's own IP >> + * address is intended to detect situations where the >> + * network is not working as configured, so dropping >> the >> + * request would frustrate that intent.) */ >> + ds_put_format(&match, " && inport == %s", >> op->json_key); >> + ovn_lflow_add(lflows, op->od, >> S_SWITCH_IN_ARP_ND_RSP, 100, >> + ds_cstr(&match), "next;"); >> + } >> >> - /* Do not reply to a solicitation from the port that >> owns the >> - * address (otherwise DAD detection will fail). */ >> - ds_put_format(&match, " && inport == %s", op->json_key); >> - ovn_lflow_add(lflows, op->od, S_SWITCH_IN_ARP_ND_RSP, >> 100, >> - ds_cstr(&match), "next;"); >> + /* For ND solicitations, we need to listen for both the >> + * unicast IPv6 address and its all-nodes multicast >> address, >> + * but always respond with the unicast IPv6 address. 
*/ >> + for (size_t j = 0; j < op->lsp_addrs[i].n_ipv6_addrs; >> j++) { >> + ds_clear(&match); >> + ds_put_format(&match, >> + "nd_ns && ip6.dst == {%s, %s} && nd.target >> == %s", >> + op->lsp_addrs[i].ipv6_addrs[j].addr_s, >> + op->lsp_addrs[i].ipv6_addrs[j].sn_addr_s, >> + op->lsp_addrs[i].ipv6_addrs[j].addr_s); >> + >> + ds_clear(&actions); >> + ds_put_format(&actions, >> + "%s { " >> + "eth.src = %s; " >> + "ip6.src = %s; " >> + "nd.target = %s; " >> + "nd.tll = %s; " >> + "outport = inport; " >> + "flags.loopback = 1; " >> + "output; " >> + "};", >> + !strcmp(op->nbsp->type, "router") ? >> + "nd_na_router" : "nd_na", >> + op->lsp_addrs[i].ea_s, >> + op->lsp_addrs[i].ipv6_addrs[j].addr_s, >> + op->lsp_addrs[i].ipv6_addrs[j].addr_s, >> + op->lsp_addrs[i].ea_s); >> + ovn_lflow_add(lflows, op->od, >> S_SWITCH_IN_ARP_ND_RSP, 50, >> + ds_cstr(&match), ds_cstr(&actions)); >> + >> + /* Do not reply to a solicitation from the port that >> owns >> + * the address (otherwise DAD detection will fail). >> */ >> + ds_put_format(&match, " && inport == %s", >> op->json_key); >> + ovn_lflow_add(lflows, op->od, >> S_SWITCH_IN_ARP_ND_RSP, 100, >> + ds_cstr(&match), "next;"); >> + } >> } >> } >> } >> @@ -7504,7 +7554,8 @@ build_lrouter_flows(struct hmap *datapaths, struct >> hmap *ports, >> 100, ds_cstr(&match), >> ds_cstr(&actions)); >> } >> } >> - } else if (op->od->n_router_ports && strcmp(op->nbsp->type, >> "router")) { >> + } else if (op->od->n_router_ports && strcmp(op->nbsp->type, >> "router") >> + && strcmp(op->nbsp->type, "virtual")) { >> /* This is a logical switch port that backs a VM or a >> container. >> * Extract its addresses. 
For each of the address, go >> through all >> * the router ports attached to the switch (to which this >> port >> @@ -7581,6 +7632,105 @@ build_lrouter_flows(struct hmap *datapaths, >> struct hmap *ports, >> } >> } >> } >> + } else if (op->od->n_router_ports && strcmp(op->nbsp->type, >> "router") >> + && !strcmp(op->nbsp->type, "virtual")) { >> + /* This is a virtual port. Add ARP replies for the virtual >> ip with >> + * the mac of the present active virtual parent. >> + * If the logical port doesn't have virtual parent set in >> + * Port_Binding table, then add the flow to set eth.dst to >> + * 00:00:00:00:00:00 and advance to next table so that ARP is >> + * resolved by router pipeline using the arp{} action. >> + * The MAC_Binding entry for the virtual ip might be >> invalid. */ >> + ovs_be32 ip; >> + >> + const char *vip = smap_get(&op->nbsp->options, >> + "virtual-ip"); >> + const char *virtual_parents = smap_get(&op->nbsp->options, >> + "virtual-parents"); >> + if (!vip || !virtual_parents || >> + !ip_parse(vip, &ip) || !op->sb) { >> + continue; >> + } >> + >> + if (!op->sb->virtual_parent || !op->sb->virtual_parent[0] || >> + !op->sb->chassis) { >> + /* The virtual port is not claimed yet. 
*/ >> + for (size_t i = 0; i < op->od->n_router_ports; i++) { >> + const char *peer_name = smap_get( >> + &op->od->router_ports[i]->nbsp->options, >> + "router-port"); >> + if (!peer_name) { >> + continue; >> + } >> + >> + struct ovn_port *peer = ovn_port_find(ports, >> peer_name); >> + if (!peer || !peer->nbrp) { >> + continue; >> + } >> + >> + if (find_lrp_member_ip(peer, vip)) { >> + ds_clear(&match); >> + ds_put_format(&match, "outport == %s && reg0 == >> %s", >> + peer->json_key, vip); >> + >> + ds_clear(&actions); >> + ds_put_format(&actions, >> + "eth.dst = 00:00:00:00:00:00; >> next;"); >> + ovn_lflow_add(lflows, peer->od, >> + S_ROUTER_IN_ARP_RESOLVE, 100, >> + ds_cstr(&match), >> ds_cstr(&actions)); >> + break; >> + } >> + } >> + } else { >> + struct ovn_port *vp = >> + ovn_port_find(ports, op->sb->virtual_parent); >> + if (!vp || !vp->nbsp) { >> + continue; >> + } >> + >> + for (size_t i = 0; i < vp->n_lsp_addrs; i++) { >> + bool found_vip_network = false; >> + const char *ea_s = vp->lsp_addrs[i].ea_s; >> + for (size_t j = 0; j < vp->od->n_router_ports; j++) { >> + /* Get the Logical_Router_Port that the >> + * Logical_Switch_Port is connected to, as >> + * 'peer'. 
*/ >> + const char *peer_name = smap_get( >> + &vp->od->router_ports[j]->nbsp->options, >> + "router-port"); >> + if (!peer_name) { >> + continue; >> + } >> + >> + struct ovn_port *peer = >> + ovn_port_find(ports, peer_name); >> + if (!peer || !peer->nbrp) { >> + continue; >> + } >> + >> + if (!find_lrp_member_ip(peer, vip)) { >> + continue; >> + } >> + >> + ds_clear(&match); >> + ds_put_format(&match, "outport == %s && reg0 == >> %s", >> + peer->json_key, vip); >> + >> + ds_clear(&actions); >> + ds_put_format(&actions, "eth.dst = %s; next;", >> ea_s); >> + ovn_lflow_add(lflows, peer->od, >> + S_ROUTER_IN_ARP_RESOLVE, 100, >> + ds_cstr(&match), >> ds_cstr(&actions)); >> + found_vip_network = true; >> + break; >> + } >> + >> + if (found_vip_network) { >> + break; >> + } >> + } >> + } >> } else if (!strcmp(op->nbsp->type, "router")) { >> /* This is a logical switch port that connects to a router. >> */ >> >> @@ -9256,6 +9406,8 @@ main(int argc, char *argv[]) >> &sbrec_port_binding_col_gateway_chassis); >> ovsdb_idl_add_column(ovnsb_idl_loop.idl, >> &sbrec_port_binding_col_ha_chassis_group); >> + ovsdb_idl_add_column(ovnsb_idl_loop.idl, >> + &sbrec_port_binding_col_virtual_parent); >> ovsdb_idl_add_column(ovnsb_idl_loop.idl, >> &sbrec_gateway_chassis_col_chassis); >> ovsdb_idl_add_column(ovnsb_idl_loop.idl, >> &sbrec_gateway_chassis_col_name); >> diff --git a/ovn/ovn-nb.xml b/ovn/ovn-nb.xml >> index 57b6edbf8..f5f10a5c1 100644 >> --- a/ovn/ovn-nb.xml >> +++ b/ovn/ovn-nb.xml >> @@ -465,6 +465,31 @@ >> </li> >> </ul> >> </dd> >> + >> + <dt><code>virtual</code></dt> >> + <dd> >> + <p> >> + Represents a logical port which does not have an OVS >> + port in the integration bridge and has a virtual ip >> configured >> + in the <ref column="options:virtual-ip"/> column. This >> virtual ip >> + can move around between the logical ports configured in >> + the <ref column="options:virtual-parents"/> column. 
>> + </p> >> + >> + <p> >> + One use case for <code>virtual</code> >> + ports is: >> + </p> >> + >> + <ul> >> + <li> >> + The <code>virtual ip</code> represents a load balancer >> vip >> + and the <code>virtual parents</code> provide load >> balancer >> + service in an active-standby setup with the active >> virtual >> + parent owning the <code>virtual ip</code>. >> + </li> >> + </ul> >> + </dd> >> </dl> >> </column> >> </group> >> @@ -618,6 +643,26 @@ >> interface, in bits. >> </column> >> </group> >> + >> + <group title="Virtual port Options"> >> + <p> >> + These options apply when <ref column="type"/> is >> + <code>virtual</code>. >> + </p> >> + >> + <column name="options" key="virtual-ip"> >> + This option represents the virtual IPv4 address. >> + </column> >> + >> + <column name="options" key="virtual-parents"> >> + This option represents a set of logical port names (within >> the same >> + logical switch) which can own the <code>virtual ip</code> >> configured >> + in the <ref column="options:virtual-ip"/>. All these virtual >> parents >> + should add the <code>virtual ip</code> in the >> + <ref column="port_security"/> if port security addresses are >> enabled.
>> + </column> >> + </group> >> + >> </group> >> >> <group title="Containers"> >> diff --git a/ovn/ovn-sb.ovsschema b/ovn/ovn-sb.ovsschema >> index 2b7bc57a7..5c013b17e 100644 >> --- a/ovn/ovn-sb.ovsschema >> +++ b/ovn/ovn-sb.ovsschema >> @@ -1,7 +1,7 @@ >> { >> "name": "OVN_Southbound", >> - "version": "2.4.0", >> - "cksum": "3059284885 20260", >> + "version": "2.5.0", >> + "cksum": "1257419092 20387", >> "tables": { >> "SB_Global": { >> "columns": { >> @@ -173,6 +173,8 @@ >> "minInteger": 1, >> "maxInteger": 4095}, >> "min": 0, "max": 1}}, >> + "virtual_parent": {"type": {"key": "string", "min": 0, >> + "max": 1}}, >> "chassis": {"type": {"key": {"type": "uuid", >> "refTable": "Chassis", >> "refType": "weak"}, >> diff --git a/ovn/ovn-sb.xml b/ovn/ovn-sb.xml >> index 544a071fa..17c45bbac 100644 >> --- a/ovn/ovn-sb.xml >> +++ b/ovn/ovn-sb.xml >> @@ -2017,6 +2017,24 @@ tcp.flags = RST; >> </p> >> <p><b>Prerequisite:</b> <code>igmp</code></p> >> </dd> >> + >> + <dt><code>bind_vport(<var>V</var>, <var>P</var>);</code></dt> >> + <dd> >> + <p> >> + <b>Parameters</b>: logical port string field <var>V</var> >> + of type <code>virtual</code>, logical port string field >> + <var>P</var>. >> + </p> >> + >> + <p> >> + Binds the virtual logical port <var>V</var> and sets the >> + <ref table="Port_Binding" column="chassis"/> column and >> + <ref table="Port_Binding" column="virtual_parent"/> of >> + the table <ref table="Port_Binding"/>. >> + <ref table="Port_Binding" column="virtual_parent"/> is >> + set to <var>P</var>. >> + </p> >> + </dd> >> </dl> >> </column> >> >> @@ -2480,6 +2498,13 @@ tcp.flags = RST; >> the <code>outport</code> will be reset to the value of the >> distributed port. >> </dd> >> + >> + <dt><code>virtual</code></dt> >> + <dd> >> + Represents a logical port with a <code>virtual ip</code>. >> + This <code>virtual ip</code> can be configured on a >> + logical port (which is referred to as the virtual parent).
>> + </dd> >> </dl> >> </column> >> </group> >> @@ -2720,6 +2745,27 @@ tcp.flags = RST; >> </column> >> </group> >> >> + <group title="Virtual ports"> >> + <column name="virtual_parent"> >> + <p> >> + This column is set by <code>ovn-controller</code> with one of >> the >> + value from the >> + <ref table="Logical_Switch_Port" >> column="options:virtual-parents" >> + db="OVN_Northbound"/> in the OVN_Northbound database's >> + <ref table="Logical_Switch_Port" db="OVN_Northbound"/> table >> + when the OVN action <code>bind_vport</code> is executed. >> + <code>ovn-controller</code> also sets the >> + <ref column="chassis"/> column when it executes this action >> + with its chassis id. >> + </p> >> + >> + <p> >> + <code>ovn-controller</code> sets this column only if the >> + <ref column="type"/> is "virtual". >> + </p> >> + </column> >> + </group> >> + >> <group title="Naming"> >> <column name="external_ids" key="name"> >> <p> >> diff --git a/ovn/utilities/ovn-trace.c b/ovn/utilities/ovn-trace.c >> index 044eb1cc2..b532b8eaf 100644 >> --- a/ovn/utilities/ovn-trace.c >> +++ b/ovn/utilities/ovn-trace.c >> @@ -2144,6 +2144,9 @@ trace_actions(const struct ovnact *ovnacts, size_t >> ovnacts_len, >> >> case OVNACT_CHECK_PKT_LARGER: >> break; >> + >> + case OVNACT_BIND_VPORT: >> + break; >> } >> } >> ds_destroy(&s); >> diff --git a/tests/ovn.at b/tests/ovn.at >> index cb380d275..5d6c90c5f 100644 >> --- a/tests/ovn.at >> +++ b/tests/ovn.at >> @@ -1368,6 +1368,24 @@ reg0 = check_pkt_larger(foo); >> reg0[0] = check_pkt_larger(foo); >> Syntax error at `foo' expecting `;'. >> >> +# bind_vport >> +# lsp1's port key is 0x11. >> +bind_vport("lsp1", inport); >> + encodes as controller(userdata=00.00.00.11.00.00.00.00.11.00.00.00) >> +# lsp2 doesn't exist. So it should be encoded as drop. >> +bind_vport("lsp2", inport); >> + encodes as drop >> +bind_vport; >> + Syntax error at `;' expecting `('. >> +bind_vport(; >> + Syntax error at `;' expecting port name string. 
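
A note on the first "encodes as" expectation above, in case it helps other readers. The following is only a minimal sketch of how I read those 12 userdata bytes. It assumes the controller-op header is a 4-byte big-endian opcode followed by 4 pad bytes, and that the port key is then copied in host byte order, as the ofpbuf_put() call in encode_BIND_VPORT suggests. The names sketch_bind_vport_userdata() and sketch_format_userdata() are hypothetical and not part of OVN.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_USERDATA_LEN 12

/* Hypothetical helper, not part of OVN.  Fills 'buf' with the assumed
 * layout: opcode (4 bytes, big-endian), pad (4 zero bytes), port key
 * (4 bytes, host byte order). */
static void
sketch_bind_vport_userdata(uint32_t opcode, uint32_t vport_key,
                           uint8_t buf[SKETCH_USERDATA_LEN])
{
    /* Opcode in network byte order. */
    buf[0] = (uint8_t) (opcode >> 24);
    buf[1] = (uint8_t) (opcode >> 16);
    buf[2] = (uint8_t) (opcode >> 8);
    buf[3] = (uint8_t) opcode;

    /* Header padding. */
    memset(buf + 4, 0, 4);

    /* Port key copied as-is, so its byte order follows the host. */
    memcpy(buf + 8, &vport_key, sizeof vport_key);
}

/* Hypothetical helper.  Formats 'buf' as dotted hex, the style used in
 * the test output.  'out' must have room for at least 3 * len + 1 bytes. */
static void
sketch_format_userdata(const uint8_t *buf, size_t len, char *out)
{
    *out = '\0';
    for (size_t i = 0; i < len; i++) {
        out += sprintf(out, "%s%02x", i ? "." : "", buf[i]);
    }
}
```

With opcode 0x11 and port key 0x11 on a little-endian host, this formats as 00.00.00.11.00.00.00.00.11.00.00.00, which matches the expectation in the test. If the host-order assumption holds, the last four bytes would come out reversed on a big-endian host.
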
>> +bind_vport("xyzzy"; >> + Syntax error at `;' expecting `,'. >> +bind_vport("xyzzy",; >> + Syntax error at `;' expecting field name. >> +bind_vport("xyzzy", inport; >> + Syntax error at `;' expecting `)'. >> + >> # Miscellaneous negative tests. >> ; >> Syntax error at `;'. >> @@ -14345,6 +14363,278 @@ OVN_CLEANUP([hv1],[hv2]) >> >> AT_CLEANUP >> >> +AT_SETUP([ovn -- virtual ports]) >> +AT_KEYWORDS([virtual ports]) >> +AT_SKIP_IF([test $HAVE_PYTHON = no]) >> +ovn_start >> + >> +send_garp() { >> + local hv=$1 inport=$2 eth_src=$3 eth_dst=$4 spa=$5 tpa=$6 >> + local >> request=${eth_dst}${eth_src}08060001080006040001${eth_src}${spa}${eth_dst}${tpa} >> + as hv$hv ovs-appctl netdev-dummy/receive hv${hv}-vif$inport $request >> +} >> + >> +send_arp_reply() { >> + local hv=$1 inport=$2 eth_src=$3 eth_dst=$4 spa=$5 tpa=$6 >> + local >> request=${eth_dst}${eth_src}08060001080006040002${eth_src}${spa}${eth_dst}${tpa} >> + as hv$hv ovs-appctl netdev-dummy/receive hv${hv}-vif$inport $request >> +} >> + >> +net_add n1 >> + >> +sim_add hv1 >> +as hv1 >> +ovs-vsctl add-br br-phys >> +ovn_attach n1 br-phys 192.168.0.1 >> +ovs-vsctl -- add-port br-int hv1-vif1 -- \ >> + set interface hv1-vif1 external-ids:iface-id=sw0-p1 \ >> + options:tx_pcap=hv1/vif1-tx.pcap \ >> + options:rxq_pcap=hv1/vif1-rx.pcap \ >> + ofport-request=1 >> +ovs-vsctl -- add-port br-int hv1-vif2 -- \ >> + set interface hv1-vif2 external-ids:iface-id=sw0-p3 \ >> + options:tx_pcap=hv1/vif2-tx.pcap \ >> + options:rxq_pcap=hv1/vif2-rx.pcap \ >> + ofport-request=2 >> + >> +sim_add hv2 >> +as hv2 >> +ovs-vsctl add-br br-phys >> +ovn_attach n1 br-phys 192.168.0.2 >> +ovs-vsctl -- add-port br-int hv2-vif1 -- \ >> + set interface hv2-vif1 external-ids:iface-id=sw0-p2 \ >> + options:tx_pcap=hv2/vif1-tx.pcap \ >> + options:rxq_pcap=hv2/vif1-rx.pcap \ >> + ofport-request=1 >> +ovs-vsctl -- add-port br-int hv2-vif2 -- \ >> + set interface hv2-vif2 external-ids:iface-id=sw1-p1 \ >> + options:tx_pcap=hv2/vif2-tx.pcap \ >> + 
options:rxq_pcap=hv2/vif2-rx.pcap \ >> + ofport-request=2 >> + >> +ovn-nbctl ls-add sw0 >> + >> +ovn-nbctl lsp-add sw0 sw0-vir >> +ovn-nbctl lsp-set-addresses sw0-vir "50:54:00:00:00:10 10.0.0.10" >> +ovn-nbctl lsp-set-port-security sw0-vir "50:54:00:00:00:10 10.0.0.10" >> +ovn-nbctl lsp-set-type sw0-vir virtual >> +ovn-nbctl set logical_switch_port sw0-vir options:virtual-ip=10.0.0.10 >> +ovn-nbctl set logical_switch_port sw0-vir >> options:virtual-parents=sw0-p1,sw0-p2 >> + >> +ovn-nbctl lsp-add sw0 sw0-p1 >> +ovn-nbctl lsp-set-addresses sw0-p1 "50:54:00:00:00:03 10.0.0.3" >> +ovn-nbctl lsp-set-port-security sw0-p1 "50:54:00:00:00:03 10.0.0.3 >> 10.0.0.10" >> + >> +ovn-nbctl lsp-add sw0 sw0-p2 >> +ovn-nbctl lsp-set-addresses sw0-p2 "50:54:00:00:00:04 10.0.0.4" >> +ovn-nbctl lsp-set-port-security sw0-p2 "50:54:00:00:00:04 10.0.0.4 >> 10.0.0.10" >> + >> +ovn-nbctl lsp-add sw0 sw0-p3 >> +ovn-nbctl lsp-set-addresses sw0-p3 "50:54:00:00:00:05 10.0.0.5" >> +ovn-nbctl lsp-set-port-security sw0-p3 "50:54:00:00:00:05 10.0.0.5" >> + >> +# Create the second logical switch with one port >> +ovn-nbctl ls-add sw1 >> +ovn-nbctl lsp-add sw1 sw1-p1 >> +ovn-nbctl lsp-set-addresses sw1-p1 "40:54:00:00:00:03 20.0.0.3" >> +ovn-nbctl lsp-set-port-security sw1-p1 "40:54:00:00:00:03 20.0.0.3" >> + >> +# Create a logical router and attach both logical switches >> +ovn-nbctl lr-add lr0 >> +ovn-nbctl lrp-add lr0 lr0-sw0 00:00:00:00:ff:01 10.0.0.1/24 >> +ovn-nbctl lsp-add sw0 sw0-lr0 >> +ovn-nbctl lsp-set-type sw0-lr0 router >> +ovn-nbctl lsp-set-addresses sw0-lr0 00:00:00:00:ff:01 >> +ovn-nbctl lsp-set-options sw0-lr0 router-port=lr0-sw0 >> + >> +ovn-nbctl lrp-add lr0 lr0-sw1 00:00:00:00:ff:02 20.0.0.1/24 >> +ovn-nbctl lsp-add sw1 sw1-lr0 >> +ovn-nbctl lsp-set-type sw1-lr0 router >> +ovn-nbctl lsp-set-addresses sw1-lr0 00:00:00:00:ff:02 >> +ovn-nbctl lsp-set-options sw1-lr0 router-port=lr0-sw1 >> + >> +OVN_POPULATE_ARP >> +ovn-nbctl --wait=hv sync >> + >> +# Check that logical flows are 
added for sw0-vir in ls_in_arp_rsp >> pipeline >> +# with bind_vport action. >> + >> +ovn-sbctl dump-flows sw0 | grep ls_in_arp_rsp | grep bind_vport > >> lflows.txt >> + >> +AT_CHECK([cat lflows.txt], [0], [dnl >> + table=11(ls_in_arp_rsp ), priority=100 , match=(inport == >> "sw0-p1" && !is_chassis_resident("sw0-vir") && ((arp.op == 1 && arp.spa == >> 10.0.0.10 && arp.tpa == 10.0.0.10) || (arp.op == 2 && arp.spa == >> 10.0.0.10))), action=(bind_vport("sw0-vir", inport); next;) >> + table=11(ls_in_arp_rsp ), priority=100 , match=(inport == >> "sw0-p2" && !is_chassis_resident("sw0-vir") && ((arp.op == 1 && arp.spa == >> 10.0.0.10 && arp.tpa == 10.0.0.10) || (arp.op == 2 && arp.spa == >> 10.0.0.10))), action=(bind_vport("sw0-vir", inport); next;) >> +]) >> + >> +ovn-sbctl dump-flows lr0 | grep lr_in_arp_resolve | grep "reg0 == >> 10.0.0.10" \ >> +> lflows.txt >> + >> +# Since the sw0-vir is not claimed by any chassis, eth.dst should be set >> to >> +# zero if the ip4.dst is the virtual ip in the router pipeline. >> +AT_CHECK([cat lflows.txt], [0], [dnl >> + table=9 (lr_in_arp_resolve ), priority=100 , match=(outport == >> "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 00:00:00:00:00:00; next;) >> +]) >> + >> +ip_to_hex() { >> + printf "%02x%02x%02x%02x" "$@" >> +} >> + >> +hv1_ch_uuid=`ovn-sbctl --bare --columns _uuid find chassis name="hv1"` >> +hv2_ch_uuid=`ovn-sbctl --bare --columns _uuid find chassis name="hv2"` >> + >> +AT_CHECK([test x$(ovn-sbctl --bare --columns chassis find port_binding \ >> +logical_port=sw0-vir) = x], [0], []) >> + >> +AT_CHECK([test x$(ovn-sbctl --bare --columns virtual_parent find >> port_binding \ >> +logical_port=sw0-vir) = x]) >> + >> +# From sw0-p1 send GARP for 10.0.0.10. hv1 should claim sw0-vir >> +# and sw0-p1 should be its virtual_parent.
>> +eth_src=505400000003 >> +eth_dst=ffffffffffff >> +spa=$(ip_to_hex 10 0 0 10) >> +tpa=$(ip_to_hex 10 0 0 10) >> +send_garp 1 1 $eth_src $eth_dst $spa $tpa >> + >> +OVS_WAIT_UNTIL([test x$(ovn-sbctl --bare --columns chassis find >> port_binding \ >> +logical_port=sw0-vir) = x$hv1_ch_uuid], [0], []) >> + >> +AT_CHECK([test x$(ovn-sbctl --bare --columns virtual_parent find >> port_binding \ >> +logical_port=sw0-vir) = xsw0-p1]) >> + >> +ovn-sbctl dump-flows lr0 | grep lr_in_arp_resolve | grep "reg0 == >> 10.0.0.10" \ >> +> lflows.txt >> + >> +# There should be an arp resolve flow to resolve the virtual_ip with the >> +# sw0-p1's MAC. >> +AT_CHECK([cat lflows.txt], [0], [dnl >> + table=9 (lr_in_arp_resolve ), priority=100 , match=(outport == >> "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 50:54:00:00:00:03; next;) >> +]) >> + >> +# send the garp from sw0-p2 (in hv2). hv2 should claim sw0-vir >> +# and sw0-p2 should be its virtual_parent. >> +eth_src=505400000004 >> +eth_dst=ffffffffffff >> +spa=$(ip_to_hex 10 0 0 10) >> +tpa=$(ip_to_hex 10 0 0 10) >> +send_garp 2 1 $eth_src $eth_dst $spa $tpa >> + >> +OVS_WAIT_UNTIL([test x$(ovn-sbctl --bare --columns chassis find >> port_binding \ >> +logical_port=sw0-vir) = x$hv2_ch_uuid], [0], []) >> + >> +AT_CHECK([test x$(ovn-sbctl --bare --columns virtual_parent find >> port_binding \ >> +logical_port=sw0-vir) = xsw0-p2]) >> + >> +ovn-sbctl dump-flows lr0 | grep lr_in_arp_resolve | grep "reg0 == >> 10.0.0.10" \ >> +> lflows.txt >> + >> +# There should be an arp resolve flow to resolve the virtual_ip with the >> +# sw0-p2's MAC. >> +AT_CHECK([cat lflows.txt], [0], [dnl >> + table=9 (lr_in_arp_resolve ), priority=100 , match=(outport == >> "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 50:54:00:00:00:04; next;) >> +]) >> + >> +# Now send arp reply from sw0-p1. hv1 should claim sw0-vir >> +# and sw0-p1 should be its virtual_parent.
>> +eth_src=505400000003 >> +eth_dst=ffffffffffff >> +spa=$(ip_to_hex 10 0 0 10) >> +tpa=$(ip_to_hex 10 0 0 4) >> +send_arp_reply 1 1 $eth_src $eth_dst $spa $tpa >> + >> +OVS_WAIT_UNTIL([test x$(ovn-sbctl --bare --columns chassis find >> port_binding \ >> +logical_port=sw0-vir) = x$hv1_ch_uuid], [0], []) >> + >> +AT_CHECK([test x$(ovn-sbctl --bare --columns virtual_parent find >> port_binding \ >> +logical_port=sw0-vir) = xsw0-p1]) >> + >> +ovn-sbctl dump-flows lr0 | grep lr_in_arp_resolve | grep "reg0 == >> 10.0.0.10" \ >> +> lflows.txt >> + >> +AT_CHECK([cat lflows.txt], [0], [dnl >> + table=9 (lr_in_arp_resolve ), priority=100 , match=(outport == >> "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 50:54:00:00:00:03; next;) >> +]) >> + >> +# Delete hv1-vif1 port. hv1 should release sw0-vir >> +as hv1 ovs-vsctl del-port hv1-vif1 >> + >> +OVS_WAIT_UNTIL([test x$(ovn-sbctl --bare --columns chassis find >> port_binding \ >> +logical_port=sw0-vir) = x], [0], []) >> + >> +AT_CHECK([test x$(ovn-sbctl --bare --columns virtual_parent find >> port_binding \ >> +logical_port=sw0-vir) = x]) >> + >> +# Since the sw0-vir is not claimed by any chassis, eth.dst should be set >> to >> +# zero if the ip4.dst is the virtual ip. >> +ovn-sbctl dump-flows lr0 | grep lr_in_arp_resolve | grep "reg0 == >> 10.0.0.10" \ >> +> lflows.txt >> + >> +AT_CHECK([cat lflows.txt], [0], [dnl >> + table=9 (lr_in_arp_resolve ), priority=100 , match=(outport == >> "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 00:00:00:00:00:00; next;) >> +]) >> + >> +# Now send arp reply from sw0-p2. hv2 should claim sw0-vir >> +# and sw0-p2 should be its virtual_parent. 
>> +eth_src=505400000004 >> +eth_dst=ffffffffffff >> +spa=$(ip_to_hex 10 0 0 10) >> +tpa=$(ip_to_hex 10 0 0 3) >> +send_arp_reply 2 1 $eth_src $eth_dst $spa $tpa >> + >> +OVS_WAIT_UNTIL([test x$(ovn-sbctl --bare --columns chassis find >> port_binding \ >> +logical_port=sw0-vir) = x$hv2_ch_uuid], [0], []) >> + >> +AT_CHECK([test x$(ovn-sbctl --bare --columns virtual_parent find >> port_binding \ >> +logical_port=sw0-vir) = xsw0-p2]) >> + >> +ovn-sbctl dump-flows lr0 | grep lr_in_arp_resolve | grep "reg0 == >> 10.0.0.10" \ >> +> lflows.txt >> + >> +AT_CHECK([cat lflows.txt], [0], [dnl >> + table=9 (lr_in_arp_resolve ), priority=100 , match=(outport == >> "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 50:54:00:00:00:04; next;) >> +]) >> + >> +# Delete sw0-p2 logical port >> +ovn-nbctl lsp-del sw0-p2 >> + >> +OVS_WAIT_UNTIL([test x$(ovn-sbctl --bare --columns chassis find >> port_binding \ >> +logical_port=sw0-vir) = x], [0], []) >> + >> +AT_CHECK([test x$(ovn-sbctl --bare --columns virtual_parent find >> port_binding \ >> +logical_port=sw0-vir) = x]) >> + >> +# Clear virtual_ip column of sw0-vir. There should be no bind_vport >> flows. >> +ovn-nbctl --wait=hv remove logical_switch_port sw0-vir options virtual-ip >> + >> +ovn-sbctl dump-flows sw0 | grep ls_in_arp_rsp | grep bind_vport > >> lflows.txt >> + >> +AT_CHECK([cat lflows.txt], [0], [dnl >> +]) >> + >> +# Add back virtual_ip and clear virtual_parents. 
>> +ovn-nbctl --wait=hv set logical_switch_port sw0-vir >> options:virtual-ip=10.0.0.10 >> + >> +ovn-sbctl dump-flows sw0 | grep ls_in_arp_rsp | grep bind_vport > >> lflows.txt >> + >> +AT_CHECK([cat lflows.txt], [0], [dnl >> + table=11(ls_in_arp_rsp ), priority=100 , match=(inport == >> "sw0-p1" && !is_chassis_resident("sw0-vir") && ((arp.op == 1 && arp.spa == >> 10.0.0.10 && arp.tpa == 10.0.0.10) || (arp.op == 2 && arp.spa == >> 10.0.0.10))), action=(bind_vport("sw0-vir", inport); next;) >> +]) >> + >> +ovn-nbctl --wait=hv remove logical_switch_port sw0-vir options >> virtual-parents >> +ovn-sbctl dump-flows sw0 | grep ls_in_arp_rsp | grep bind_vport > >> lflows.txt >> + >> +AT_CHECK([cat lflows.txt], [0], [dnl >> +]) >> + >> +ovn-sbctl dump-flows lr0 | grep lr_in_arp_resolve | grep "reg0 == >> 10.0.0.10" \ >> +> lflows.txt >> + >> +AT_CHECK([cat lflows.txt], [0], [dnl >> +]) >> + >> +OVN_CLEANUP([hv1], [hv2]) >> +AT_CLEANUP >> + >> # Run ovn-nbctl in daemon mode, change to a backup database and verify >> that >> # an insert operation is not allowed. >> AT_SETUP([ovn -- can't write to a backup database server instance]) >> diff --git a/tests/test-ovn.c b/tests/test-ovn.c >> index 0b9e8246e..cf1bc5432 100644 >> --- a/tests/test-ovn.c >> +++ b/tests/test-ovn.c >> @@ -1253,6 +1253,7 @@ test_parse_actions(struct ovs_cmdl_context *ctx >> OVS_UNUSED) >> simap_put(&ports, "eth0", 5); >> simap_put(&ports, "eth1", 6); >> simap_put(&ports, "LOCAL", ofp_to_u16(OFPP_LOCAL)); >> + simap_put(&ports, "lsp1", 0x11); >> >> ds_init(&input); >> while (!ds_get_test_line(&input, stdin)) { >> -- >> 2.21.0 >> >> _______________________________________________ >> dev mailing list >> dev@openvswitch.org >> https://mail.openvswitch.org/mailman/listinfo/ovs-dev >> >
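For readers following the test above, the match that the expected ls_in_arp_rsp flows carry can be paraphrased as a small C predicate. This is only an illustrative sketch and not code from the patch: the struct, the function name and the host-byte-order addresses are all hypothetical, and the real decision is made by the logical flow match, not by a C helper like this one.

```c
#include <stdbool.h>
#include <stdint.h>

/* ARP opcodes as they appear in the logical flow match (arp.op). */
enum { ARP_OP_REQUEST = 1, ARP_OP_REPLY = 2 };

/* Hypothetical, simplified view of the ARP fields the match inspects.
 * Addresses are plain host-order integers for the sake of the example. */
struct arp_fields {
    uint16_t op;   /* arp.op */
    uint32_t spa;  /* arp.spa, sender protocol address */
    uint32_t tpa;  /* arp.tpa, target protocol address */
};

/* Paraphrase of:
 *   inport == P && !is_chassis_resident(V) &&
 *   ((arp.op == 1 && arp.spa == VIP && arp.tpa == VIP) ||
 *    (arp.op == 2 && arp.spa == VIP))
 *
 * 'inport_is_virtual_parent' stands for "inport == P" and
 * 'vport_resident_here' for "is_chassis_resident(V)". */
static bool
arp_should_bind_vport(const struct arp_fields *arp, uint32_t vip,
                      bool inport_is_virtual_parent,
                      bool vport_resident_here)
{
    if (!inport_is_virtual_parent || vport_resident_here) {
        return false;
    }

    /* Gratuitous ARP: a request whose sender and target are both the VIP. */
    bool is_garp = arp->op == ARP_OP_REQUEST
                   && arp->spa == vip && arp->tpa == vip;

    /* ARP reply sent from the VIP, whatever the target address is. */
    bool is_reply_from_vip = arp->op == ARP_OP_REPLY && arp->spa == vip;

    return is_garp || is_reply_from_vip;
}
```

With the VIP 10.0.0.10 written as 0x0a00000a, the GARP sent by send_garp in the test and the reply sent by send_arp_reply would both satisfy this predicate, while an ordinary ARP request for the VIP from another address would not.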
diff --git a/NEWS b/NEWS index 8cf850823..be3ea42b4 100644 --- a/NEWS +++ b/NEWS @@ -60,6 +60,7 @@ v2.12.0 - xx xxx xxxx logical groups which results in tunnels only been formed between members of the same transport zone(s). * Support for IGMP Snooping and IGMP Querier. + * Support for new logical switch port type - 'virtual'. - New QoS type "linux-netem" on Linux. - Added support for TLS Server Name Indication (SNI). - Linux datapath: diff --git a/include/ovn/actions.h b/include/ovn/actions.h index 63d3907d8..0ca06537c 100644 --- a/include/ovn/actions.h +++ b/include/ovn/actions.h @@ -85,7 +85,8 @@ struct ovn_extend_table; OVNACT(SET_METER, ovnact_set_meter) \ OVNACT(OVNFIELD_LOAD, ovnact_load) \ OVNACT(CHECK_PKT_LARGER, ovnact_check_pkt_larger) \ - OVNACT(TRIGGER_EVENT, ovnact_controller_event) + OVNACT(TRIGGER_EVENT, ovnact_controller_event) \ + OVNACT(BIND_VPORT, ovnact_bind_vport) /* enum ovnact_type, with a member OVNACT_<ENUM> for each action. */ enum OVS_PACKED_ENUM ovnact_type { @@ -328,6 +329,13 @@ struct ovnact_controller_event { size_t n_options; }; +/* OVNACT_BIND_VPORT. */ +struct ovnact_bind_vport { + struct ovnact ovnact; + char *vport; + struct expr_field vport_parent; /* Logical virtual port's port name. */ +}; + /* Internal use by the helpers below. */ void ovnact_init(struct ovnact *, enum ovnact_type, size_t len); void *ovnact_put(struct ofpbuf *, enum ovnact_type, size_t len); @@ -505,6 +513,14 @@ enum action_opcode { * Snoop IGMP, learn the multicast participants */ ACTION_OPCODE_IGMP, + + /* "bind_vport(vport, vport_parent)". + * + * 'vport' follows the action_header, in the format - 32-bit field. + * 'vport_parent' is passed through the packet metadata as + * MFF_LOG_INPORT. + */ + ACTION_OPCODE_BIND_VPORT, }; /* Header. 
*/ diff --git a/ovn/controller/binding.c b/ovn/controller/binding.c index ace0f811b..dfe002b60 100644 --- a/ovn/controller/binding.c +++ b/ovn/controller/binding.c @@ -571,11 +571,31 @@ consider_local_datapath(struct ovsdb_idl_txn *ovnsb_idl_txn, sbrec_port_binding_set_encap(binding_rec, encap_rec); } } else if (binding_rec->chassis == chassis_rec) { - VLOG_INFO("Releasing lport %s from this chassis.", - binding_rec->logical_port); - if (binding_rec->encap) - sbrec_port_binding_set_encap(binding_rec, NULL); - sbrec_port_binding_set_chassis(binding_rec, NULL); + if (!strcmp(binding_rec->type, "virtual")) { + /* pinctrl module takes care of binding the ports + * of type 'virtual'. + * Release such ports if their virtual parents are no + * longer claimed by this chassis. */ + const struct sbrec_port_binding *parent + = lport_lookup_by_name(sbrec_port_binding_by_name, + binding_rec->virtual_parent); + if (!parent || parent->chassis != chassis_rec) { + VLOG_INFO("Releasing lport %s from this chassis.", + binding_rec->logical_port); + if (binding_rec->encap) { + sbrec_port_binding_set_encap(binding_rec, NULL); + } + sbrec_port_binding_set_chassis(binding_rec, NULL); + sbrec_port_binding_set_virtual_parent(binding_rec, NULL); + } + } else { + VLOG_INFO("Releasing lport %s from this chassis.", + binding_rec->logical_port); + if (binding_rec->encap) { + sbrec_port_binding_set_encap(binding_rec, NULL); + } + sbrec_port_binding_set_chassis(binding_rec, NULL); + } } else if (our_chassis) { static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1); VLOG_INFO_RL(&rl, diff --git a/ovn/controller/pinctrl.c b/ovn/controller/pinctrl.c index d857067a5..357050eb5 100644 --- a/ovn/controller/pinctrl.c +++ b/ovn/controller/pinctrl.c @@ -273,9 +273,22 @@ static void pinctrl_ip_mcast_handle_igmp( static bool may_inject_pkts(void); +static void init_put_vport_bindings(void); +static void destroy_put_vport_bindings(void); +static void run_put_vport_bindings( + struct ovsdb_idl_txn 
*ovnsb_idl_txn, + struct ovsdb_idl_index *sbrec_datapath_binding_by_key, + struct ovsdb_idl_index *sbrec_port_binding_by_key, + const struct sbrec_chassis *chassis) + OVS_REQUIRES(pinctrl_mutex); +static void wait_put_vport_bindings(struct ovsdb_idl_txn *ovnsb_idl_txn); +static void pinctrl_handle_bind_vport(const struct flow *md, + struct ofpbuf *userdata); + COVERAGE_DEFINE(pinctrl_drop_put_mac_binding); COVERAGE_DEFINE(pinctrl_drop_buffered_packets_map); COVERAGE_DEFINE(pinctrl_drop_controller_event); +COVERAGE_DEFINE(pinctrl_drop_put_vport_binding); struct empty_lb_backends_event { struct hmap_node hmap_node; @@ -432,6 +445,7 @@ pinctrl_init(void) init_buffered_packets_map(); init_event_table(); ip_mcast_snoop_init(); + init_put_vport_bindings(); pinctrl.br_int_name = NULL; pinctrl_handler_seq = seq_create(); pinctrl_main_seq = seq_create(); @@ -1957,6 +1971,12 @@ process_packet_in(struct rconn *swconn, const struct ofp_header *msg) ovs_mutex_unlock(&pinctrl_mutex); break; + case ACTION_OPCODE_BIND_VPORT: + ovs_mutex_lock(&pinctrl_mutex); + pinctrl_handle_bind_vport(&pin.flow_metadata.flow, &userdata); + ovs_mutex_unlock(&pinctrl_mutex); + break; + default: VLOG_WARN_RL(&rl, "unrecognized packet-in opcode %"PRIu32, ntohl(ah->opcode)); @@ -2135,6 +2155,8 @@ pinctrl_run(struct ovsdb_idl_txn *ovnsb_idl_txn, run_put_mac_bindings(ovnsb_idl_txn, sbrec_datapath_binding_by_key, sbrec_port_binding_by_key, sbrec_mac_binding_by_lport_ip); + run_put_vport_bindings(ovnsb_idl_txn, sbrec_datapath_binding_by_key, + sbrec_port_binding_by_key, chassis); send_garp_prepare(sbrec_port_binding_by_datapath, sbrec_port_binding_by_name, br_int, chassis, local_datapaths, active_tunnels); @@ -2481,6 +2503,7 @@ pinctrl_wait(struct ovsdb_idl_txn *ovnsb_idl_txn) { wait_put_mac_bindings(ovnsb_idl_txn); wait_controller_event(ovnsb_idl_txn); + wait_put_vport_bindings(ovnsb_idl_txn); int64_t new_seq = seq_read(pinctrl_main_seq); seq_wait(pinctrl_main_seq, new_seq); } @@ -2498,6 +2521,7 @@ 
pinctrl_destroy(void) destroy_buffered_packets_map(); event_table_destroy(); destroy_put_mac_bindings(); + destroy_put_vport_bindings(); destroy_dns_cache(); ip_mcast_snoop_destroy(); seq_destroy(pinctrl_main_seq); @@ -4341,3 +4365,153 @@ pinctrl_handle_event(struct ofpbuf *userdata) return; } } + +struct put_vport_binding { + struct hmap_node hmap_node; + + /* Key and value. */ + uint32_t dp_key; + uint32_t vport_key; + + uint32_t vport_parent_key; +}; + +/* Contains "struct put_vport_binding"s. */ +static struct hmap put_vport_bindings; + +static void +init_put_vport_bindings(void) +{ + hmap_init(&put_vport_bindings); +} + +static void +flush_put_vport_bindings(void) +{ + struct put_vport_binding *vport_b; + HMAP_FOR_EACH_POP (vport_b, hmap_node, &put_vport_bindings) { + free(vport_b); + } +} + +static void +destroy_put_vport_bindings(void) +{ + flush_put_vport_bindings(); + hmap_destroy(&put_vport_bindings); +} + +static void +wait_put_vport_bindings(struct ovsdb_idl_txn *ovnsb_idl_txn) +{ + if (ovnsb_idl_txn && !hmap_is_empty(&put_vport_bindings)) { + poll_immediate_wake(); + } +} + +static struct put_vport_binding * +pinctrl_find_put_vport_binding(uint32_t dp_key, uint32_t vport_key, + uint32_t hash) +{ + struct put_vport_binding *vpb; + HMAP_FOR_EACH_WITH_HASH (vpb, hmap_node, hash, &put_vport_bindings) { + if (vpb->dp_key == dp_key && vpb->vport_key == vport_key) { + return vpb; + } + } + return NULL; +} + +static void +run_put_vport_binding(struct ovsdb_idl_txn *ovnsb_idl_txn OVS_UNUSED, + struct ovsdb_idl_index *sbrec_datapath_binding_by_key, + struct ovsdb_idl_index *sbrec_port_binding_by_key, + const struct sbrec_chassis *chassis, + const struct put_vport_binding *vpb) +{ + /* Convert logical datapath and logical port key into lport. 
*/ + const struct sbrec_port_binding *pb = lport_lookup_by_key( + sbrec_datapath_binding_by_key, sbrec_port_binding_by_key, + vpb->dp_key, vpb->vport_key); + if (!pb) { + static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5); + + VLOG_WARN_RL(&rl, "unknown logical port with datapath %"PRIu32" " + "and port %"PRIu32, vpb->dp_key, vpb->vport_key); + return; + } + + /* pinctrl module updates the port binding only for type 'virtual'. */ + if (!strcmp(pb->type, "virtual")) { + const struct sbrec_port_binding *parent = lport_lookup_by_key( + sbrec_datapath_binding_by_key, sbrec_port_binding_by_key, + vpb->dp_key, vpb->vport_parent_key); + if (parent) { + VLOG_INFO("Claiming virtual lport %s for this chassis " + "with the virtual parent %s", + pb->logical_port, parent->logical_port); + sbrec_port_binding_set_chassis(pb, chassis); + sbrec_port_binding_set_virtual_parent(pb, parent->logical_port); + } + } +} + +/* Called by pinctrl_run(). Runs with in the main ovn-controller + * thread context. */ +static void +run_put_vport_bindings(struct ovsdb_idl_txn *ovnsb_idl_txn, + struct ovsdb_idl_index *sbrec_datapath_binding_by_key, + struct ovsdb_idl_index *sbrec_port_binding_by_key, + const struct sbrec_chassis *chassis) + OVS_REQUIRES(pinctrl_mutex) +{ + if (!ovnsb_idl_txn) { + return; + } + + const struct put_vport_binding *vpb; + HMAP_FOR_EACH (vpb, hmap_node, &put_vport_bindings) { + run_put_vport_binding(ovnsb_idl_txn, sbrec_datapath_binding_by_key, + sbrec_port_binding_by_key, chassis, vpb); + } + + flush_put_vport_bindings(); +} + +/* Called with in the pinctrl_handler thread context. */ +static void +pinctrl_handle_bind_vport( + const struct flow *md, struct ofpbuf *userdata) + OVS_REQUIRES(pinctrl_mutex) +{ + /* Get the datapath key from the packet metadata. */ + uint32_t dp_key = ntohll(md->metadata); + uint32_t vport_parent_key = md->regs[MFF_LOG_INPORT - MFF_REG0]; + + /* Get the virtual port key from the userdata buffer. 
*/ + uint32_t *vport_key = ofpbuf_try_pull(userdata, sizeof *vport_key); + + if (!vport_key) { + return; + } + + uint32_t hash = hash_2words(dp_key, *vport_key); + + struct put_vport_binding *vpb + = pinctrl_find_put_vport_binding(dp_key, *vport_key, hash); + if (!vpb) { + if (hmap_count(&put_vport_bindings) >= 1000) { + COVERAGE_INC(pinctrl_drop_put_vport_binding); + return; + } + + vpb = xmalloc(sizeof *vpb); + hmap_insert(&put_vport_bindings, &vpb->hmap_node, hash); + } + + vpb->dp_key = dp_key; + vpb->vport_key = *vport_key; + vpb->vport_parent_key = vport_parent_key; + + notify_pinctrl_main(); +} diff --git a/ovn/lib/actions.c b/ovn/lib/actions.c index 4eacc44ed..66916a837 100644 --- a/ovn/lib/actions.c +++ b/ovn/lib/actions.c @@ -2599,6 +2599,63 @@ ovnact_check_pkt_larger_free(struct ovnact_check_pkt_larger *cipl OVS_UNUSED) { } +static void +parse_bind_vport(struct action_context *ctx) +{ + if (!lexer_force_match(ctx->lexer, LEX_T_LPAREN)) { + return; + } + + if (ctx->lexer->token.type != LEX_T_STRING) { + lexer_syntax_error(ctx->lexer, "expecting port name string"); + return; + } + + struct ovnact_bind_vport *bind_vp = ovnact_put_BIND_VPORT(ctx->ovnacts); + bind_vp->vport = xstrdup(ctx->lexer->token.s); + lexer_get(ctx->lexer); + (void) (lexer_force_match(ctx->lexer, LEX_T_COMMA) + && action_parse_field(ctx, 0, false, &bind_vp->vport_parent) + && lexer_force_match(ctx->lexer, LEX_T_RPAREN)); +} + +static void +format_BIND_VPORT(const struct ovnact_bind_vport *bind_vp, + struct ds *s ) +{ + ds_put_format(s, "bind_vport(\"%s\", ", bind_vp->vport); + expr_field_format(&bind_vp->vport_parent, s); + ds_put_cstr(s, ");"); +} + +static void +encode_BIND_VPORT(const struct ovnact_bind_vport *vp, + const struct ovnact_encode_params *ep, + struct ofpbuf *ofpacts) +{ + uint32_t vport_key; + if (!ep->lookup_port(ep->aux, vp->vport, &vport_key)) { + return; + } + + const struct arg args[] = { + { expr_resolve_field(&vp->vport_parent), MFF_LOG_INPORT }, + }; + 
encode_setup_args(args, ARRAY_SIZE(args), ofpacts); + size_t oc_offset = encode_start_controller_op(ACTION_OPCODE_BIND_VPORT, + false, NX_CTLR_NO_METER, + ofpacts); + ofpbuf_put(ofpacts, &vport_key, sizeof(uint32_t)); + encode_finish_controller_op(oc_offset, ofpacts); + encode_restore_args(args, ARRAY_SIZE(args), ofpacts); +} + +static void +ovnact_bind_vport_free(struct ovnact_bind_vport *bp) +{ + free(bp->vport); +} + /* Parses an assignment or exchange or put_dhcp_opts action. */ static void parse_set_action(struct action_context *ctx) @@ -2706,6 +2763,8 @@ parse_action(struct action_context *ctx) parse_set_meter_action(ctx); } else if (lexer_match_id(ctx->lexer, "trigger_event")) { parse_trigger_event(ctx, ovnact_put_TRIGGER_EVENT(ctx->ovnacts)); + } else if (lexer_match_id(ctx->lexer, "bind_vport")) { + parse_bind_vport(ctx); } else { lexer_syntax_error(ctx->lexer, "expecting action"); } diff --git a/ovn/lib/ovn-util.c b/ovn/lib/ovn-util.c index 0f07d80ac..de745d73f 100644 --- a/ovn/lib/ovn-util.c +++ b/ovn/lib/ovn-util.c @@ -326,6 +326,7 @@ static const char *OVN_NB_LSP_TYPES[] = { "router", "vtep", "external", + "virtual", }; bool diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml index d2267de0e..6ff7aaff1 100644 --- a/ovn/northd/ovn-northd.8.xml +++ b/ovn/northd/ovn-northd.8.xml @@ -519,6 +519,34 @@ some additional flow cost for this and the value appears limited. 
</li> + <li> + <p> + If inport <code>V</code> is of type <code>virtual</code> adds a + priority-100 logical flow for each <var>P</var> configured in the + <ref table="Logical_Switch_Port" column="options:virtual-parents"/> + column with the match + </p> + <pre> +<code>inport == <var>P</var> && !is_chassis_resident(<var>V</var>) && ((arp.op == 1 && arp.spa == <var>VIP</var> && arp.tpa == <var>VIP</var>) || (arp.op == 2 && arp.spa == <var>VIP</var>))</code> + </pre> + + <p> + and applies the action + </p> + <pre> +<code>bind_vport(<var>V</var>, inport);</code> + </pre> + + <p> + and advances the packet to the next table. + </p> + + <p> + Where <var>VIP</var> is the virtual ip configured in the column + <ref table="Logical_Switch_Port" column="options:virtual-ip"/>. + </p> + </li> + <li> <p> Priority-50 flows that match ARP requests to each known IP address @@ -541,7 +569,8 @@ output; <p> These flows are omitted for logical ports (other than router ports or - <code>localport</code> ports) that are down. + <code>localport</code> ports) that are down and for logical ports of + type <code>virtual</code>. </p> </li> @@ -588,7 +617,8 @@ nd_na_router { <p> These flows are omitted for logical ports (other than router ports or - <code>localport</code> ports) that are down. + <code>localport</code> ports) that are down and for logical ports of + type <code>virtual</code>. </p> </li> @@ -2031,6 +2061,33 @@ next; <code>eth.dst = <var>E</var>; next;</code>. </p> + <p> + For each virtual ip <var>A</var> configured on a logical port + of type <code>virtual</code> and its virtual parent set in + its corresponding <ref db="OVN_Southbound" table="Port_Binding"/> + record and the virtual parent with the Ethernet address <var>E</var> + and the virtual ip is reachable via the router port <var>P</var>, a + priority-100 flow with match <code>outport === <var>P</var> + && reg0 == <var>A</var></code> has actions + <code>eth.dst = <var>E</var>; next;</code>. 
+ </p> + + <p> + For each virtual ip <var>A</var> configured on a logical port + of type <code>virtual</code> and its virtual parent <code>not</code> + set in its corresponding + <ref db="OVN_Southbound" table="Port_Binding"/> + record and the virtual ip <var>A</var> is reachable via the + router port <var>P</var>, a + priority-100 flow with match <code>outport === <var>P</var> + && reg0 == <var>A</var></code> has actions + <code>eth.dst = <var>00:00:00:00:00:00</var>; next;</code>. + This flow is added so that the ARP is always resolved for the + virtual ip <var>A</var> by generating ARP request and + <code>not</code> consulting the MAC_Binding table as it can have + incorrect value for the virtual ip <var>A</var>. + </p> + <p> For each IPv6 address <var>A</var> whose host is known to have Ethernet address <var>E</var> on router port <var>P</var>, a diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c index eb6c47cad..ae09cf338 100644 --- a/ovn/northd/ovn-northd.c +++ b/ovn/northd/ovn-northd.c @@ -4878,96 +4878,146 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports, continue; } - /* - * Add ARP/ND reply flows if either the - * - port is up or - * - port type is router or - * - port type is localport - */ - if (!lsp_is_up(op->nbsp) && strcmp(op->nbsp->type, "router") && - strcmp(op->nbsp->type, "localport")) { - continue; - } + if (!strcmp(op->nbsp->type, "virtual")) { + /* Handle + * - GARPs for virtual ip which belongs to a logical port + * of type 'virtual' and bind that port. + * + * - ARP reply from the virtual ip which belongs to a logical + * port of type 'virtual' and bind that port. 
+ * */ + ovs_be32 ip; + const char *virtual_ip = smap_get(&op->nbsp->options, + "virtual-ip"); + const char *virtual_parents = smap_get(&op->nbsp->options, + "virtual-parents"); + if (!virtual_ip || !virtual_parents || + !ip_parse(virtual_ip, &ip)) { + continue; + } - if (lsp_is_external(op->nbsp)) { - continue; - } + char *tokstr = xstrdup(virtual_parents); + char *save_ptr = NULL; + char *vparent; + for (vparent = strtok_r(tokstr, ",", &save_ptr); vparent != NULL; + vparent = strtok_r(NULL, ",", &save_ptr)) { + struct ovn_port *vp = ovn_port_find(ports, vparent); + if (!vp || vp->od != op->od) { + /* vparent name should be valid and it should belong + * to the same logical switch. */ + continue; + } - for (size_t i = 0; i < op->n_lsp_addrs; i++) { - for (size_t j = 0; j < op->lsp_addrs[i].n_ipv4_addrs; j++) { ds_clear(&match); - ds_put_format(&match, "arp.tpa == %s && arp.op == 1", - op->lsp_addrs[i].ipv4_addrs[j].addr_s); + ds_put_format(&match, "inport == \"%s\" && " + "!is_chassis_resident(%s) && " + "((arp.op == 1 && arp.spa == %s && " + "arp.tpa == %s) || (arp.op == 2 && " + "arp.spa == %s))", + vparent, op->json_key, virtual_ip, virtual_ip, + virtual_ip); ds_clear(&actions); ds_put_format(&actions, - "eth.dst = eth.src; " - "eth.src = %s; " - "arp.op = 2; /* ARP reply */ " - "arp.tha = arp.sha; " - "arp.sha = %s; " - "arp.tpa = arp.spa; " - "arp.spa = %s; " - "outport = inport; " - "flags.loopback = 1; " - "output;", - op->lsp_addrs[i].ea_s, op->lsp_addrs[i].ea_s, - op->lsp_addrs[i].ipv4_addrs[j].addr_s); - ovn_lflow_add(lflows, op->od, S_SWITCH_IN_ARP_ND_RSP, 50, + "bind_vport(%s, inport); " + "next;", + op->json_key); + ovn_lflow_add(lflows, op->od, S_SWITCH_IN_ARP_ND_RSP, 100, ds_cstr(&match), ds_cstr(&actions)); + } - /* Do not reply to an ARP request from the port that owns the - * address (otherwise a DHCP client that ARPs to check for a - * duplicate address will fail). Instead, forward it the usual - * way. 
- * - * (Another alternative would be to simply drop the packet. If - * everything is working as it is configured, then this would - * produce equivalent results, since no one should reply to the - * request. But ARPing for one's own IP address is intended to - * detect situations where the network is not working as - * configured, so dropping the request would frustrate that - * intent.) */ - ds_put_format(&match, " && inport == %s", op->json_key); - ovn_lflow_add(lflows, op->od, S_SWITCH_IN_ARP_ND_RSP, 100, - ds_cstr(&match), "next;"); + free(tokstr); + } else { + /* + * Add ARP/ND reply flows if either the + * - port is up or + * - port type is router or + * - port type is localport + */ + if (!lsp_is_up(op->nbsp) && strcmp(op->nbsp->type, "router") && + strcmp(op->nbsp->type, "localport")) { + continue; } - /* For ND solicitations, we need to listen for both the - * unicast IPv6 address and its all-nodes multicast address, - * but always respond with the unicast IPv6 address. */ - for (size_t j = 0; j < op->lsp_addrs[i].n_ipv6_addrs; j++) { - ds_clear(&match); - ds_put_format(&match, - "nd_ns && ip6.dst == {%s, %s} && nd.target == %s", - op->lsp_addrs[i].ipv6_addrs[j].addr_s, - op->lsp_addrs[i].ipv6_addrs[j].sn_addr_s, - op->lsp_addrs[i].ipv6_addrs[j].addr_s); + if (lsp_is_external(op->nbsp)) { + continue; + } - ds_clear(&actions); - ds_put_format(&actions, - "%s { " + for (size_t i = 0; i < op->n_lsp_addrs; i++) { + for (size_t j = 0; j < op->lsp_addrs[i].n_ipv4_addrs; j++) { + ds_clear(&match); + ds_put_format(&match, "arp.tpa == %s && arp.op == 1", + op->lsp_addrs[i].ipv4_addrs[j].addr_s); + ds_clear(&actions); + ds_put_format(&actions, + "eth.dst = eth.src; " "eth.src = %s; " - "ip6.src = %s; " - "nd.target = %s; " - "nd.tll = %s; " + "arp.op = 2; /* ARP reply */ " + "arp.tha = arp.sha; " + "arp.sha = %s; " + "arp.tpa = arp.spa; " + "arp.spa = %s; " "outport = inport; " "flags.loopback = 1; " - "output; " - "};", - !strcmp(op->nbsp->type, "router") ? 
- "nd_na_router" : "nd_na", - op->lsp_addrs[i].ea_s, - op->lsp_addrs[i].ipv6_addrs[j].addr_s, - op->lsp_addrs[i].ipv6_addrs[j].addr_s, - op->lsp_addrs[i].ea_s); - ovn_lflow_add(lflows, op->od, S_SWITCH_IN_ARP_ND_RSP, 50, - ds_cstr(&match), ds_cstr(&actions)); + "output;", + op->lsp_addrs[i].ea_s, op->lsp_addrs[i].ea_s, + op->lsp_addrs[i].ipv4_addrs[j].addr_s); + ovn_lflow_add(lflows, op->od, S_SWITCH_IN_ARP_ND_RSP, 50, + ds_cstr(&match), ds_cstr(&actions)); + + /* Do not reply to an ARP request from the port that owns + * the address (otherwise a DHCP client that ARPs to check + * for a duplicate address will fail). Instead, forward + * it the usual way. + * + * (Another alternative would be to simply drop the packet. + * If everything is working as it is configured, then this + * would produce equivalent results, since no one should + * reply to the request. But ARPing for one's own IP + * address is intended to detect situations where the + * network is not working as configured, so dropping the + * request would frustrate that intent.) */ + ds_put_format(&match, " && inport == %s", op->json_key); + ovn_lflow_add(lflows, op->od, S_SWITCH_IN_ARP_ND_RSP, 100, + ds_cstr(&match), "next;"); + } - /* Do not reply to a solicitation from the port that owns the - * address (otherwise DAD detection will fail). */ - ds_put_format(&match, " && inport == %s", op->json_key); - ovn_lflow_add(lflows, op->od, S_SWITCH_IN_ARP_ND_RSP, 100, - ds_cstr(&match), "next;"); + /* For ND solicitations, we need to listen for both the + * unicast IPv6 address and its all-nodes multicast address, + * but always respond with the unicast IPv6 address. 
*/ + for (size_t j = 0; j < op->lsp_addrs[i].n_ipv6_addrs; j++) { + ds_clear(&match); + ds_put_format(&match, + "nd_ns && ip6.dst == {%s, %s} && nd.target == %s", + op->lsp_addrs[i].ipv6_addrs[j].addr_s, + op->lsp_addrs[i].ipv6_addrs[j].sn_addr_s, + op->lsp_addrs[i].ipv6_addrs[j].addr_s); + + ds_clear(&actions); + ds_put_format(&actions, + "%s { " + "eth.src = %s; " + "ip6.src = %s; " + "nd.target = %s; " + "nd.tll = %s; " + "outport = inport; " + "flags.loopback = 1; " + "output; " + "};", + !strcmp(op->nbsp->type, "router") ? + "nd_na_router" : "nd_na", + op->lsp_addrs[i].ea_s, + op->lsp_addrs[i].ipv6_addrs[j].addr_s, + op->lsp_addrs[i].ipv6_addrs[j].addr_s, + op->lsp_addrs[i].ea_s); + ovn_lflow_add(lflows, op->od, S_SWITCH_IN_ARP_ND_RSP, 50, + ds_cstr(&match), ds_cstr(&actions)); + + /* Do not reply to a solicitation from the port that owns + * the address (otherwise DAD detection will fail). */ + ds_put_format(&match, " && inport == %s", op->json_key); + ovn_lflow_add(lflows, op->od, S_SWITCH_IN_ARP_ND_RSP, 100, + ds_cstr(&match), "next;"); + } } } } @@ -7504,7 +7554,8 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports, 100, ds_cstr(&match), ds_cstr(&actions)); } } - } else if (op->od->n_router_ports && strcmp(op->nbsp->type, "router")) { + } else if (op->od->n_router_ports && strcmp(op->nbsp->type, "router") + && strcmp(op->nbsp->type, "virtual")) { /* This is a logical switch port that backs a VM or a container. * Extract its addresses. For each of the address, go through all * the router ports attached to the switch (to which this port @@ -7581,6 +7632,105 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports, } } } + } else if (op->od->n_router_ports && strcmp(op->nbsp->type, "router") + && !strcmp(op->nbsp->type, "virtual")) { + /* This is a virtual port. Add ARP replies for the virtual ip with + * the mac of the present active virtual parent. 
+ * If the logical port doesn't have virtual parent set in + * Port_Binding table, then add the flow to set eth.dst to + * 00:00:00:00:00:00 and advance to next table so that ARP is + * resolved by router pipeline using the arp{} action. + * The MAC_Binding entry for the virtual ip might be invalid. */ + ovs_be32 ip; + + const char *vip = smap_get(&op->nbsp->options, + "virtual-ip"); + const char *virtual_parents = smap_get(&op->nbsp->options, + "virtual-parents"); + if (!vip || !virtual_parents || + !ip_parse(vip, &ip) || !op->sb) { + continue; + } + + if (!op->sb->virtual_parent || !op->sb->virtual_parent[0] || + !op->sb->chassis) { + /* The virtual port is not claimed yet. */ + for (size_t i = 0; i < op->od->n_router_ports; i++) { + const char *peer_name = smap_get( + &op->od->router_ports[i]->nbsp->options, + "router-port"); + if (!peer_name) { + continue; + } + + struct ovn_port *peer = ovn_port_find(ports, peer_name); + if (!peer || !peer->nbrp) { + continue; + } + + if (find_lrp_member_ip(peer, vip)) { + ds_clear(&match); + ds_put_format(&match, "outport == %s && reg0 == %s", + peer->json_key, vip); + + ds_clear(&actions); + ds_put_format(&actions, + "eth.dst = 00:00:00:00:00:00; next;"); + ovn_lflow_add(lflows, peer->od, + S_ROUTER_IN_ARP_RESOLVE, 100, + ds_cstr(&match), ds_cstr(&actions)); + break; + } + } + } else { + struct ovn_port *vp = + ovn_port_find(ports, op->sb->virtual_parent); + if (!vp || !vp->nbsp) { + continue; + } + + for (size_t i = 0; i < vp->n_lsp_addrs; i++) { + bool found_vip_network = false; + const char *ea_s = vp->lsp_addrs[i].ea_s; + for (size_t j = 0; j < vp->od->n_router_ports; j++) { + /* Get the Logical_Router_Port that the + * Logical_Switch_Port is connected to, as + * 'peer'. 
*/ + const char *peer_name = smap_get( + &vp->od->router_ports[j]->nbsp->options, + "router-port"); + if (!peer_name) { + continue; + } + + struct ovn_port *peer = + ovn_port_find(ports, peer_name); + if (!peer || !peer->nbrp) { + continue; + } + + if (!find_lrp_member_ip(peer, vip)) { + continue; + } + + ds_clear(&match); + ds_put_format(&match, "outport == %s && reg0 == %s", + peer->json_key, vip); + + ds_clear(&actions); + ds_put_format(&actions, "eth.dst = %s; next;", ea_s); + ovn_lflow_add(lflows, peer->od, + S_ROUTER_IN_ARP_RESOLVE, 100, + ds_cstr(&match), ds_cstr(&actions)); + found_vip_network = true; + break; + } + + if (found_vip_network) { + break; + } + } + } } else if (!strcmp(op->nbsp->type, "router")) { /* This is a logical switch port that connects to a router. */ @@ -9256,6 +9406,8 @@ main(int argc, char *argv[]) &sbrec_port_binding_col_gateway_chassis); ovsdb_idl_add_column(ovnsb_idl_loop.idl, &sbrec_port_binding_col_ha_chassis_group); + ovsdb_idl_add_column(ovnsb_idl_loop.idl, + &sbrec_port_binding_col_virtual_parent); ovsdb_idl_add_column(ovnsb_idl_loop.idl, &sbrec_gateway_chassis_col_chassis); ovsdb_idl_add_column(ovnsb_idl_loop.idl, &sbrec_gateway_chassis_col_name); diff --git a/ovn/ovn-nb.xml b/ovn/ovn-nb.xml index 57b6edbf8..f5f10a5c1 100644 --- a/ovn/ovn-nb.xml +++ b/ovn/ovn-nb.xml @@ -465,6 +465,31 @@ </li> </ul> </dd> + + <dt><code>virtual</code></dt> + <dd> + <p> + Represents a logical port which does not have an OVS + port in the integration bridge and has a virtual ip configured + in the <ref column="options:virtual-ip"/> column. This virtual ip + can move around between the logical ports configured in + the <ref column="options:virtual-parents"/> column. + </p> + + <p> + One of the use case where <code>virtual</code> + ports can be used is. 
+ </p> + + <ul> + <li> + The <code>virtual ip</code> represents a load balancer vip + and the <code>virtual parents</code> provide load balancer + service in an active-standby setup with the active virtual + parent owning the <code>virtual ip</code>. + </li> + </ul> + </dd> </dl> </column> </group> @@ -618,6 +643,26 @@ interface, in bits. </column> </group> + + <group title="Virtual port Options"> + <p> + These options apply when <ref column="type"/> is + <code>virtual</code>. + </p> + + <column name="options" key="virtual-ip"> + This option represents the virtual IPv4 address. + </column> + + <column name="options" key="virtual-parents"> + This option represents a set of logical port names (within the same + logical switch) that can own the <code>virtual ip</code> configured + in the <ref column="options:virtual-ip"/>. All these virtual parents + should add the <code>virtual ip</code> in the + <ref column="port_security"/> if port security addresses are enabled. + </column> + </group> + </group> <group title="Containers"> diff --git a/ovn/ovn-sb.ovsschema b/ovn/ovn-sb.ovsschema index 2b7bc57a7..5c013b17e 100644 --- a/ovn/ovn-sb.ovsschema +++ b/ovn/ovn-sb.ovsschema @@ -1,7 +1,7 @@ { "name": "OVN_Southbound", - "version": "2.4.0", - "cksum": "3059284885 20260", + "version": "2.5.0", + "cksum": "1257419092 20387", "tables": { "SB_Global": { "columns": { @@ -173,6 +173,8 @@ "minInteger": 1, "maxInteger": 4095}, "min": 0, "max": 1}}, + "virtual_parent": {"type": {"key": "string", "min": 0, + "max": 1}}, "chassis": {"type": {"key": {"type": "uuid", "refTable": "Chassis", "refType": "weak"}, diff --git a/ovn/ovn-sb.xml b/ovn/ovn-sb.xml index 544a071fa..17c45bbac 100644 --- a/ovn/ovn-sb.xml +++ b/ovn/ovn-sb.xml @@ -2017,6 +2017,24 @@ tcp.flags = RST; </p> <p><b>Prerequisite:</b> <code>igmp</code></p> </dd> + + <dt><code>bind_vport(<var>V</var>, <var>P</var>);</code></dt> + <dd> + <p> + <b>Parameters</b>: logical port string field <var>V</var> + of type 
<code>virtual</code>, logical port string field + <var>P</var>. + </p> + + <p> + Binds the virtual logical port <var>V</var> and sets the + <ref table="Port_Binding" column="chassis"/> column and + <ref table="Port_Binding" column="virtual_parent"/> of + the table <ref table="Port_Binding"/>. + <ref table="Port_Binding" column="virtual_parent"/> is + set to <var>P</var>. + </p> + </dd> </dl> </column> @@ -2480,6 +2498,13 @@ tcp.flags = RST; the <code>outport</code> will be reset to the value of the distributed port. </dd> + + <dt><code>virtual</code></dt> + <dd> + Represents a logical port with a <code>virtual ip</code>. + This <code>virtual ip</code> can be configured on a + logical port (which is referred to as the virtual parent). + </dd> </dl> </column> </group> @@ -2720,6 +2745,27 @@ tcp.flags = RST; </column> </group> + <group title="Virtual ports"> + <column name="virtual_parent"> + <p> + This column is set by <code>ovn-controller</code> with one of the + values from the + <ref table="Logical_Switch_Port" column="options:virtual-parents" + db="OVN_Northbound"/> in the OVN_Northbound database's + <ref table="Logical_Switch_Port" db="OVN_Northbound"/> table + when the OVN action <code>bind_vport</code> is executed. + <code>ovn-controller</code> also sets the + <ref column="chassis"/> column when it executes this action + with its chassis id. + </p> + + <p> + <code>ovn-controller</code> sets this column only if the + <ref column="type"/> is "virtual". 
+ </p> + </column> + </group> + <group title="Naming"> <column name="external_ids" key="name"> <p> diff --git a/ovn/utilities/ovn-trace.c b/ovn/utilities/ovn-trace.c index 044eb1cc2..b532b8eaf 100644 --- a/ovn/utilities/ovn-trace.c +++ b/ovn/utilities/ovn-trace.c @@ -2144,6 +2144,9 @@ trace_actions(const struct ovnact *ovnacts, size_t ovnacts_len, case OVNACT_CHECK_PKT_LARGER: break; + + case OVNACT_BIND_VPORT: + break; } } ds_destroy(&s); diff --git a/tests/ovn.at b/tests/ovn.at index cb380d275..5d6c90c5f 100644 --- a/tests/ovn.at +++ b/tests/ovn.at @@ -1368,6 +1368,24 @@ reg0 = check_pkt_larger(foo); reg0[0] = check_pkt_larger(foo); Syntax error at `foo' expecting `;'. +# bind_vport +# lsp1's port key is 0x11. +bind_vport("lsp1", inport); + encodes as controller(userdata=00.00.00.11.00.00.00.00.11.00.00.00) +# lsp2 doesn't exist. So it should be encoded as drop. +bind_vport("lsp2", inport); + encodes as drop +bind_vport; + Syntax error at `;' expecting `('. +bind_vport(; + Syntax error at `;' expecting port name string. +bind_vport("xyzzy"; + Syntax error at `;' expecting `,'. +bind_vport("xyzzy",; + Syntax error at `;' expecting field name. +bind_vport("xyzzy", inport; + Syntax error at `;' expecting `)'. + # Miscellaneous negative tests. ; Syntax error at `;'. 
@@ -14345,6 +14363,278 @@ OVN_CLEANUP([hv1],[hv2]) AT_CLEANUP +AT_SETUP([ovn -- virtual ports]) +AT_KEYWORDS([virtual ports]) +AT_SKIP_IF([test $HAVE_PYTHON = no]) +ovn_start + +send_garp() { + local hv=$1 inport=$2 eth_src=$3 eth_dst=$4 spa=$5 tpa=$6 + local request=${eth_dst}${eth_src}08060001080006040001${eth_src}${spa}${eth_dst}${tpa} + as hv$hv ovs-appctl netdev-dummy/receive hv${hv}-vif$inport $request +} + +send_arp_reply() { + local hv=$1 inport=$2 eth_src=$3 eth_dst=$4 spa=$5 tpa=$6 + local request=${eth_dst}${eth_src}08060001080006040002${eth_src}${spa}${eth_dst}${tpa} + as hv$hv ovs-appctl netdev-dummy/receive hv${hv}-vif$inport $request +} + +net_add n1 + +sim_add hv1 +as hv1 +ovs-vsctl add-br br-phys +ovn_attach n1 br-phys 192.168.0.1 +ovs-vsctl -- add-port br-int hv1-vif1 -- \ + set interface hv1-vif1 external-ids:iface-id=sw0-p1 \ + options:tx_pcap=hv1/vif1-tx.pcap \ + options:rxq_pcap=hv1/vif1-rx.pcap \ + ofport-request=1 +ovs-vsctl -- add-port br-int hv1-vif2 -- \ + set interface hv1-vif2 external-ids:iface-id=sw0-p3 \ + options:tx_pcap=hv1/vif2-tx.pcap \ + options:rxq_pcap=hv1/vif2-rx.pcap \ + ofport-request=2 + +sim_add hv2 +as hv2 +ovs-vsctl add-br br-phys +ovn_attach n1 br-phys 192.168.0.2 +ovs-vsctl -- add-port br-int hv2-vif1 -- \ + set interface hv2-vif1 external-ids:iface-id=sw0-p2 \ + options:tx_pcap=hv2/vif1-tx.pcap \ + options:rxq_pcap=hv2/vif1-rx.pcap \ + ofport-request=1 +ovs-vsctl -- add-port br-int hv2-vif2 -- \ + set interface hv2-vif2 external-ids:iface-id=sw1-p1 \ + options:tx_pcap=hv2/vif2-tx.pcap \ + options:rxq_pcap=hv2/vif2-rx.pcap \ + ofport-request=2 + +ovn-nbctl ls-add sw0 + +ovn-nbctl lsp-add sw0 sw0-vir +ovn-nbctl lsp-set-addresses sw0-vir "50:54:00:00:00:10 10.0.0.10" +ovn-nbctl lsp-set-port-security sw0-vir "50:54:00:00:00:10 10.0.0.10" +ovn-nbctl lsp-set-type sw0-vir virtual +ovn-nbctl set logical_switch_port sw0-vir options:virtual-ip=10.0.0.10 +ovn-nbctl set logical_switch_port sw0-vir 
options:virtual-parents=sw0-p1,sw0-p2 + +ovn-nbctl lsp-add sw0 sw0-p1 +ovn-nbctl lsp-set-addresses sw0-p1 "50:54:00:00:00:03 10.0.0.3" +ovn-nbctl lsp-set-port-security sw0-p1 "50:54:00:00:00:03 10.0.0.3 10.0.0.10" + +ovn-nbctl lsp-add sw0 sw0-p2 +ovn-nbctl lsp-set-addresses sw0-p2 "50:54:00:00:00:04 10.0.0.4" +ovn-nbctl lsp-set-port-security sw0-p2 "50:54:00:00:00:04 10.0.0.4 10.0.0.10" + +ovn-nbctl lsp-add sw0 sw0-p3 +ovn-nbctl lsp-set-addresses sw0-p3 "50:54:00:00:00:05 10.0.0.5" +ovn-nbctl lsp-set-port-security sw0-p3 "50:54:00:00:00:05 10.0.0.5" + +# Create the second logical switch with one port +ovn-nbctl ls-add sw1 +ovn-nbctl lsp-add sw1 sw1-p1 +ovn-nbctl lsp-set-addresses sw1-p1 "40:54:00:00:00:03 20.0.0.3" +ovn-nbctl lsp-set-port-security sw1-p1 "40:54:00:00:00:03 20.0.0.3" + +# Create a logical router and attach both logical switches +ovn-nbctl lr-add lr0 +ovn-nbctl lrp-add lr0 lr0-sw0 00:00:00:00:ff:01 10.0.0.1/24 +ovn-nbctl lsp-add sw0 sw0-lr0 +ovn-nbctl lsp-set-type sw0-lr0 router +ovn-nbctl lsp-set-addresses sw0-lr0 00:00:00:00:ff:01 +ovn-nbctl lsp-set-options sw0-lr0 router-port=lr0-sw0 + +ovn-nbctl lrp-add lr0 lr0-sw1 00:00:00:00:ff:02 20.0.0.1/24 +ovn-nbctl lsp-add sw1 sw1-lr0 +ovn-nbctl lsp-set-type sw1-lr0 router +ovn-nbctl lsp-set-addresses sw1-lr0 00:00:00:00:ff:02 +ovn-nbctl lsp-set-options sw1-lr0 router-port=lr0-sw1 + +OVN_POPULATE_ARP +ovn-nbctl --wait=hv sync + +# Check that logical flows are added for sw0-vir in ls_in_arp_rsp pipeline +# with bind_vport action. 
+ +ovn-sbctl dump-flows sw0 | grep ls_in_arp_rsp | grep bind_vport > lflows.txt + +AT_CHECK([cat lflows.txt], [0], [dnl + table=11(ls_in_arp_rsp ), priority=100 , match=(inport == "sw0-p1" && !is_chassis_resident("sw0-vir") && ((arp.op == 1 && arp.spa == 10.0.0.10 && arp.tpa == 10.0.0.10) || (arp.op == 2 && arp.spa == 10.0.0.10))), action=(bind_vport("sw0-vir", inport); next;) + table=11(ls_in_arp_rsp ), priority=100 , match=(inport == "sw0-p2" && !is_chassis_resident("sw0-vir") && ((arp.op == 1 && arp.spa == 10.0.0.10 && arp.tpa == 10.0.0.10) || (arp.op == 2 && arp.spa == 10.0.0.10))), action=(bind_vport("sw0-vir", inport); next;) +]) + +ovn-sbctl dump-flows lr0 | grep lr_in_arp_resolve | grep "reg0 == 10.0.0.10" \ +> lflows.txt + +# Since the sw0-vir is not claimed by any chassis, eth.dst should be set to +# zero if the ip4.dst is the virtual ip in the router pipeline. +AT_CHECK([cat lflows.txt], [0], [dnl + table=9 (lr_in_arp_resolve ), priority=100 , match=(outport == "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 00:00:00:00:00:00; next;) +]) + +ip_to_hex() { + printf "%02x%02x%02x%02x" "$@" +} + +hv1_ch_uuid=`ovn-sbctl --bare --columns _uuid find chassis name="hv1"` +hv2_ch_uuid=`ovn-sbctl --bare --columns _uuid find chassis name="hv2"` + +AT_CHECK([test x$(ovn-sbctl --bare --columns chassis find port_binding \ +logical_port=sw0-vir) = x], [0], []) + +AT_CHECK([test x$(ovn-sbctl --bare --columns virtual_parent find port_binding \ +logical_port=sw0-vir) = x]) + +# From sw0-p1 send GARP for 10.0.0.10. hv1 should claim sw0-vir +# and sw0-p1 should be its virtual_parent. 
+eth_src=505400000003 +eth_dst=ffffffffffff +spa=$(ip_to_hex 10 0 0 10) +tpa=$(ip_to_hex 10 0 0 10) +send_garp 1 1 $eth_src $eth_dst $spa $tpa + +OVS_WAIT_UNTIL([test x$(ovn-sbctl --bare --columns chassis find port_binding \ +logical_port=sw0-vir) = x$hv1_ch_uuid], [0], []) + +AT_CHECK([test x$(ovn-sbctl --bare --columns virtual_parent find port_binding \ +logical_port=sw0-vir) = xsw0-p1]) + +ovn-sbctl dump-flows lr0 | grep lr_in_arp_resolve | grep "reg0 == 10.0.0.10" \ +> lflows.txt + +# There should be an arp resolve flow to resolve the virtual_ip to +# sw0-p1's MAC. +AT_CHECK([cat lflows.txt], [0], [dnl + table=9 (lr_in_arp_resolve ), priority=100 , match=(outport == "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 50:54:00:00:00:03; next;) +]) + +# Send the GARP from sw0-p2 (in hv2). hv2 should claim sw0-vir +# and sw0-p2 should be its virtual_parent. +eth_src=505400000004 +eth_dst=ffffffffffff +spa=$(ip_to_hex 10 0 0 10) +tpa=$(ip_to_hex 10 0 0 10) +send_garp 2 1 $eth_src $eth_dst $spa $tpa + +OVS_WAIT_UNTIL([test x$(ovn-sbctl --bare --columns chassis find port_binding \ +logical_port=sw0-vir) = x$hv2_ch_uuid], [0], []) + +AT_CHECK([test x$(ovn-sbctl --bare --columns virtual_parent find port_binding \ +logical_port=sw0-vir) = xsw0-p2]) + +ovn-sbctl dump-flows lr0 | grep lr_in_arp_resolve | grep "reg0 == 10.0.0.10" \ +> lflows.txt + +# There should be an arp resolve flow to resolve the virtual_ip to +# sw0-p2's MAC. +AT_CHECK([cat lflows.txt], [0], [dnl + table=9 (lr_in_arp_resolve ), priority=100 , match=(outport == "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 50:54:00:00:00:04; next;) +]) + +# Now send arp reply from sw0-p1. hv1 should claim sw0-vir +# and sw0-p1 should be its virtual_parent. 
+eth_src=505400000003 +eth_dst=ffffffffffff +spa=$(ip_to_hex 10 0 0 10) +tpa=$(ip_to_hex 10 0 0 4) +send_arp_reply 1 1 $eth_src $eth_dst $spa $tpa + +OVS_WAIT_UNTIL([test x$(ovn-sbctl --bare --columns chassis find port_binding \ +logical_port=sw0-vir) = x$hv1_ch_uuid], [0], []) + +AT_CHECK([test x$(ovn-sbctl --bare --columns virtual_parent find port_binding \ +logical_port=sw0-vir) = xsw0-p1]) + +ovn-sbctl dump-flows lr0 | grep lr_in_arp_resolve | grep "reg0 == 10.0.0.10" \ +> lflows.txt + +AT_CHECK([cat lflows.txt], [0], [dnl + table=9 (lr_in_arp_resolve ), priority=100 , match=(outport == "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 50:54:00:00:00:03; next;) +]) + +# Delete hv1-vif1 port. hv1 should release sw0-vir +as hv1 ovs-vsctl del-port hv1-vif1 + +OVS_WAIT_UNTIL([test x$(ovn-sbctl --bare --columns chassis find port_binding \ +logical_port=sw0-vir) = x], [0], []) + +AT_CHECK([test x$(ovn-sbctl --bare --columns virtual_parent find port_binding \ +logical_port=sw0-vir) = x]) + +# Since the sw0-vir is not claimed by any chassis, eth.dst should be set to +# zero if the ip4.dst is the virtual ip. +ovn-sbctl dump-flows lr0 | grep lr_in_arp_resolve | grep "reg0 == 10.0.0.10" \ +> lflows.txt + +AT_CHECK([cat lflows.txt], [0], [dnl + table=9 (lr_in_arp_resolve ), priority=100 , match=(outport == "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 00:00:00:00:00:00; next;) +]) + +# Now send arp reply from sw0-p2. hv2 should claim sw0-vir +# and sw0-p2 should be its virtual_parent. 
+eth_src=505400000004 +eth_dst=ffffffffffff +spa=$(ip_to_hex 10 0 0 10) +tpa=$(ip_to_hex 10 0 0 3) +send_arp_reply 2 1 $eth_src $eth_dst $spa $tpa + +OVS_WAIT_UNTIL([test x$(ovn-sbctl --bare --columns chassis find port_binding \ +logical_port=sw0-vir) = x$hv2_ch_uuid], [0], []) + +AT_CHECK([test x$(ovn-sbctl --bare --columns virtual_parent find port_binding \ +logical_port=sw0-vir) = xsw0-p2]) + +ovn-sbctl dump-flows lr0 | grep lr_in_arp_resolve | grep "reg0 == 10.0.0.10" \ +> lflows.txt + +AT_CHECK([cat lflows.txt], [0], [dnl + table=9 (lr_in_arp_resolve ), priority=100 , match=(outport == "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 50:54:00:00:00:04; next;) +]) + +# Delete sw0-p2 logical port +ovn-nbctl lsp-del sw0-p2 + +OVS_WAIT_UNTIL([test x$(ovn-sbctl --bare --columns chassis find port_binding \ +logical_port=sw0-vir) = x], [0], []) + +AT_CHECK([test x$(ovn-sbctl --bare --columns virtual_parent find port_binding \ +logical_port=sw0-vir) = x]) + +# Clear virtual_ip column of sw0-vir. There should be no bind_vport flows. +ovn-nbctl --wait=hv remove logical_switch_port sw0-vir options virtual-ip + +ovn-sbctl dump-flows sw0 | grep ls_in_arp_rsp | grep bind_vport > lflows.txt + +AT_CHECK([cat lflows.txt], [0], [dnl +]) + +# Add back virtual_ip and clear virtual_parents. 
+ovn-nbctl --wait=hv set logical_switch_port sw0-vir options:virtual-ip=10.0.0.10 + +ovn-sbctl dump-flows sw0 | grep ls_in_arp_rsp | grep bind_vport > lflows.txt + +AT_CHECK([cat lflows.txt], [0], [dnl + table=11(ls_in_arp_rsp ), priority=100 , match=(inport == "sw0-p1" && !is_chassis_resident("sw0-vir") && ((arp.op == 1 && arp.spa == 10.0.0.10 && arp.tpa == 10.0.0.10) || (arp.op == 2 && arp.spa == 10.0.0.10))), action=(bind_vport("sw0-vir", inport); next;) +]) + +ovn-nbctl --wait=hv remove logical_switch_port sw0-vir options virtual-parents +ovn-sbctl dump-flows sw0 | grep ls_in_arp_rsp | grep bind_vport > lflows.txt + +AT_CHECK([cat lflows.txt], [0], [dnl +]) + +ovn-sbctl dump-flows lr0 | grep lr_in_arp_resolve | grep "reg0 == 10.0.0.10" \ +> lflows.txt + +AT_CHECK([cat lflows.txt], [0], [dnl +]) + +OVN_CLEANUP([hv1], [hv2]) +AT_CLEANUP + # Run ovn-nbctl in daemon mode, change to a backup database and verify that # an insert operation is not allowed. AT_SETUP([ovn -- can't write to a backup database server instance]) diff --git a/tests/test-ovn.c b/tests/test-ovn.c index 0b9e8246e..cf1bc5432 100644 --- a/tests/test-ovn.c +++ b/tests/test-ovn.c @@ -1253,6 +1253,7 @@ test_parse_actions(struct ovs_cmdl_context *ctx OVS_UNUSED) simap_put(&ports, "eth0", 5); simap_put(&ports, "eth1", 6); simap_put(&ports, "LOCAL", ofp_to_u16(OFPP_LOCAL)); + simap_put(&ports, "lsp1", 0x11); ds_init(&input); while (!ds_get_test_line(&input, stdin)) {
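
For readers following the test above, the ARP part of the ls_in_arp_rsp match that triggers bind_vport can be restated as a small standalone predicate. This is only an illustrative sketch, not code from the patch: the struct arp_fields type and the arp_claims_vip() name are hypothetical, addresses are assumed to be in host byte order, and the inport and is_chassis_resident() parts of the match are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of the ARP fields that the logical flow
 * matches on.  Not a structure from the OVN tree. */
struct arp_fields {
    uint16_t op;    /* 1 = ARP request, 2 = ARP reply. */
    uint32_t spa;   /* Sender protocol address, host byte order. */
    uint32_t tpa;   /* Target protocol address, host byte order. */
};

/* Returns true if 'arp' satisfies the ARP portion of the match:
 *
 *   (arp.op == 1 && arp.spa == VIP && arp.tpa == VIP)
 *       || (arp.op == 2 && arp.spa == VIP)
 *
 * The first clause is a gratuitous ARP request for the VIP, the second is
 * any ARP reply whose sender address is the VIP. */
bool
arp_claims_vip(const struct arp_fields *arp, uint32_t vip)
{
    assert(arp != NULL);

    bool garp_request = arp->op == 1 && arp->spa == vip && arp->tpa == vip;
    bool reply_from_vip = arp->op == 2 && arp->spa == vip;

    return garp_request || reply_from_vip;
}
```

With 10.0.0.10 as the VIP (0x0a00000a), the GARP sent by send_garp and the replies sent by send_arp_reply in the test both satisfy this predicate, while an ordinary ARP request from 10.0.0.3 asking for 10.0.0.10 does not.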