From patchwork Mon Sep 16 17:16:07 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Numan Siddique X-Patchwork-Id: 1163033 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=openvswitch.org (client-ip=140.211.169.12; helo=mail.linuxfoundation.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Received: from mail.linuxfoundation.org (mail.linuxfoundation.org [140.211.169.12]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 46XCbW319Mz9sPk for ; Tue, 17 Sep 2019 03:17:10 +1000 (AEST) Received: from mail.linux-foundation.org (localhost [127.0.0.1]) by mail.linuxfoundation.org (Postfix) with ESMTP id 3C4691A7B; Mon, 16 Sep 2019 17:17:09 +0000 (UTC) X-Original-To: dev@openvswitch.org Delivered-To: ovs-dev@mail.linuxfoundation.org Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org [172.17.192.35]) by mail.linuxfoundation.org (Postfix) with ESMTPS id 2EBF712E4 for ; Mon, 16 Sep 2019 17:16:30 +0000 (UTC) X-Greylist: domain auto-whitelisted by SQLgrey-1.7.6 Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) by smtp1.linuxfoundation.org (Postfix) with ESMTPS id 71F828BB for ; Mon, 16 Sep 2019 17:16:26 +0000 (UTC) Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id C0C49881344; Mon, 16 Sep 2019 17:16:25 +0000 (UTC) Received: from nummac.local (dhcp35-127.lab.eng.blr.redhat.com [10.70.35.127]) by smtp.corp.redhat.com (Postfix) with ESMTP id BB9DB5D6C8; Mon, 16 Sep 2019 17:16:22 +0000 
(UTC)
From: nusiddiq@redhat.com
To: dev@openvswitch.org
Date: Mon, 16 Sep 2019 22:46:07 +0530
Message-Id: <20190916171607.12750-1-nusiddiq@redhat.com>
MIME-Version: 1.0
X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.6.2 (mx1.redhat.com [10.5.110.69]); Mon, 16 Sep 2019 17:16:25 +0000 (UTC)
X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00, HTML_SINGLET_MANY, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on smtp1.linux-foundation.org
Cc: Haidong Li
Subject: [ovs-dev] [PATCH ovn v2] Learn the mac binding only if required
X-BeenThere: ovs-dev@openvswitch.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Sender: ovs-dev-bounces@openvswitch.org
Errors-To: ovs-dev-bounces@openvswitch.org

From: Numan Siddique

OVN has the actions - put_arp and put_nd to learn the mac bindings from the ARP/ND packets. These actions update the Southbound MAC_Binding table. These actions translate to controller actions. Whenever the pinctrl thread receives such packets, it wakes up the main ovn-controller thread. If the MAC_Binding table is already up to date, this results in unnecessary CPU cycles. There are some security implications as well. A rogue VM can flood broadcast ARP request/reply packets and this could cause DoS issues. A physical switch may send periodic GARPs and these packets hit ovn-controllers.

This patch solves these problems by learning the mac bindings only if required. There is no need to apply the put_arp/put_nd action if the Southbound MAC_Binding row is up to date.

New actions - lookup_arp and lookup_nd are added. They look up the IP, MAC pair in the mac_binding table and store the result in a register: 1 if the lookup is successful, 0 otherwise.

ovn-northd adds 2 new stages - lookup_arp and put_arp before ip_input in the router ingress pipeline.
The logical flows look something like:

table=1 (lr_in_lookup_arp), priority=100 , match=(arp), action=(reg9[4] = lookup_arp(inport, arp.spa, arp.sha); next;)
table=1 (lr_in_lookup_arp), priority=0 , match=(1), action=(next;)
...
table=2 (lr_in_put_arp ), priority=100 , match=(arp.op == 2 && reg9[4] == 0), action=(put_arp(inport, arp.spa, arp.sha);)
table=2 (lr_in_put_arp ), priority=90 , match=(arp.op == 2), action=(drop;)
table=2 (lr_in_put_arp ), priority=0 , match=(1), action=(next;)

The lflow module of ovn-controller adds OF flows in table 31 (OFTABLE_MAC_LOOKUP) for each mac_binding entry with the match reg0 = ip && eth.src = mac with the action - load:1->reg2[0]

Eg:
table=31, priority=100,arp,reg0=0xaca8006f,reg14=0x3,metadata=0x3,dl_src=00:44:00:00:00:04 actions=load:1->NXM_NX_REG2[0]

This patch should also address the issue reported in 'Reported-at'

Reported-at: https://bugzilla.redhat.com/1729846
Reported-by: Haidong Li
CC: Han ZHou
CC: Dumitru Ceara
Tested-by: Dumitru Ceara
Signed-off-by: Numan Siddique
---
v1 -> v2
=======
* Addressed review comments from Han - Storing the result of lookup_arp/lookup_nd in a register.
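
Illustrative note (not part of the patch): the lookup-then-learn idea from the commit message can be sketched as a small standalone C model. Everything here is hypothetical and exists only for illustration: `struct mb_row`, `struct mb_table`, `mb_lookup()`, `mb_put()` and `handle_arp_reply()` are not OVN code, and the `n_wakeups` counter merely stands in for the cost of a controller action waking ovn-controller. The sketch only models the ARP reply path (arp.op == 2) through the two new stages.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MB_MAX_ROWS 16

/* Hypothetical stand-in for one Southbound MAC_Binding row. */
struct mb_row {
    uint32_t port_key;      /* Logical port key (illustrative). */
    uint32_t ip;            /* IPv4 address in host byte order. */
    uint8_t mac[6];
};

/* Hypothetical table plus a counter of simulated controller wakeups. */
struct mb_table {
    struct mb_row rows[MB_MAX_ROWS];
    size_t n_rows;
    unsigned int n_wakeups;
};

/* Models lookup_arp: true only if the exact (port, ip, mac) row exists. */
static bool
mb_lookup(const struct mb_table *t, uint32_t port_key, uint32_t ip,
          const uint8_t mac[6])
{
    for (size_t i = 0; i < t->n_rows; i++) {
        const struct mb_row *r = &t->rows[i];
        if (r->port_key == port_key && r->ip == ip
            && !memcmp(r->mac, mac, sizeof r->mac)) {
            return true;
        }
    }
    return false;
}

/* Models put_arp: every call costs one simulated wakeup, then the row for
 * (port, ip) is updated in place, or appended if there is room. */
static void
mb_put(struct mb_table *t, uint32_t port_key, uint32_t ip,
       const uint8_t mac[6])
{
    t->n_wakeups++;
    for (size_t i = 0; i < t->n_rows; i++) {
        struct mb_row *r = &t->rows[i];
        if (r->port_key == port_key && r->ip == ip) {
            memcpy(r->mac, mac, sizeof r->mac);
            return;
        }
    }
    if (t->n_rows < MB_MAX_ROWS) {
        struct mb_row *r = &t->rows[t->n_rows++];
        r->port_key = port_key;
        r->ip = ip;
        memcpy(r->mac, mac, sizeof r->mac);
    }
}

/* Models tables 1 and 2 for an ARP reply (arp.op == 2).  Returns true if
 * the binding was learned, false if the packet was dropped unlearned. */
static bool
handle_arp_reply(struct mb_table *t, uint32_t port_key, uint32_t ip,
                 const uint8_t mac[6])
{
    bool found = mb_lookup(t, port_key, ip, mac);   /* Plays reg9[4]. */
    if (!found) {
        mb_put(t, port_key, ip, mac);   /* Priority-100 flow in table 2. */
        return true;
    }
    return false;                       /* Priority-90 flow: drop. */
}
```

In this model, replaying the same ARP reply leaves `n_wakeups` unchanged after the first packet, while a reply carrying a different MAC for the same IP still triggers one update. That is the behaviour the patch aims for with periodic GARPs.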
controller/lflow.c | 36 ++++- controller/lflow.h | 1 + include/ovn/actions.h | 13 ++ include/ovn/logical-fields.h | 3 + lib/actions.c | 115 ++++++++++++++ northd/ovn-northd.8.xml | 251 ++++++++++++++++++++---------- northd/ovn-northd.c | 205 ++++++++++++++----------- ovn-sb.xml | 57 +++++++ tests/ovn.at | 290 ++++++++++++++++++++++++++++++++++- tests/test-ovn.c | 1 + utilities/ovn-trace.c | 69 +++++++++ 11 files changed, 861 insertions(+), 180 deletions(-) diff --git a/controller/lflow.c b/controller/lflow.c index d0335a83a..762752753 100644 --- a/controller/lflow.c +++ b/controller/lflow.c @@ -687,6 +687,7 @@ consider_logical_flow( .egress_ptable = OFTABLE_LOG_EGRESS_PIPELINE, .output_ptable = output_ptable, .mac_bind_ptable = OFTABLE_MAC_BINDING, + .mac_lookup_ptable = OFTABLE_MAC_LOOKUP, }; ovnacts_encode(ovnacts.data, ovnacts.size, &ep, &ofpacts); ovnacts_free(ovnacts.data, ovnacts.size); @@ -777,7 +778,9 @@ consider_neighbor_flow(struct ovsdb_idl_index *sbrec_port_binding_by_name, return; } - struct match match = MATCH_CATCHALL_INITIALIZER; + struct match get_arp_match = MATCH_CATCHALL_INITIALIZER; + struct match lookup_arp_match = MATCH_CATCHALL_INITIALIZER; + if (strchr(b->ip, '.')) { ovs_be32 ip; if (!ip_parse(b->ip, &ip)) { @@ -785,7 +788,9 @@ consider_neighbor_flow(struct ovsdb_idl_index *sbrec_port_binding_by_name, VLOG_WARN_RL(&rl, "bad 'ip' %s", b->ip); return; } - match_set_reg(&match, 0, ntohl(ip)); + match_set_reg(&get_arp_match, 0, ntohl(ip)); + match_set_reg(&lookup_arp_match, 0, ntohl(ip)); + match_set_dl_type(&lookup_arp_match, htons(ETH_TYPE_ARP)); } else { struct in6_addr ip6; if (!ipv6_parse(b->ip, &ip6)) { @@ -795,17 +800,34 @@ consider_neighbor_flow(struct ovsdb_idl_index *sbrec_port_binding_by_name, } ovs_be128 value; memcpy(&value, &ip6, sizeof(value)); - match_set_xxreg(&match, 0, ntoh128(value)); + match_set_xxreg(&get_arp_match, 0, ntoh128(value)); + + match_set_xxreg(&lookup_arp_match, 0, ntoh128(value)); + 
match_set_dl_type(&lookup_arp_match, htons(ETH_TYPE_IPV6)); + match_set_nw_proto(&lookup_arp_match, 58); + match_set_icmp_code(&lookup_arp_match, 0); } - match_set_metadata(&match, htonll(pb->datapath->tunnel_key)); - match_set_reg(&match, MFF_LOG_OUTPORT - MFF_REG0, pb->tunnel_key); + match_set_metadata(&get_arp_match, htonll(pb->datapath->tunnel_key)); + match_set_reg(&get_arp_match, MFF_LOG_OUTPORT - MFF_REG0, pb->tunnel_key); + + match_set_metadata(&lookup_arp_match, htonll(pb->datapath->tunnel_key)); + match_set_reg(&lookup_arp_match, MFF_LOG_INPORT - MFF_REG0, + pb->tunnel_key); uint64_t stub[1024 / 8]; struct ofpbuf ofpacts = OFPBUF_STUB_INITIALIZER(stub); put_load(mac.ea, sizeof mac.ea, MFF_ETH_DST, 0, 48, &ofpacts); - ofctrl_add_flow(flow_table, OFTABLE_MAC_BINDING, 100, 0, &match, &ofpacts, - &b->header_.uuid); + ofctrl_add_flow(flow_table, OFTABLE_MAC_BINDING, 100, 0, &get_arp_match, + &ofpacts, &b->header_.uuid); + + ofpbuf_clear(&ofpacts); + uint8_t value = 1; + put_load(&value, sizeof value, MFF_LOG_LOOKUP_MAC, 0, 1, &ofpacts); + match_set_dl_src(&lookup_arp_match, mac); + ofctrl_add_flow(flow_table, OFTABLE_MAC_LOOKUP, 100, 0, &lookup_arp_match, + &ofpacts, &b->header_.uuid); + ofpbuf_uninit(&ofpacts); } diff --git a/controller/lflow.h b/controller/lflow.h index 54da00b49..d6d18978a 100644 --- a/controller/lflow.h +++ b/controller/lflow.h @@ -58,6 +58,7 @@ struct uuid; * you make any changes. */ #define OFTABLE_PHY_TO_LOG 0 #define OFTABLE_LOG_INGRESS_PIPELINE 8 /* First of LOG_PIPELINE_LEN tables. 
*/ +#define OFTABLE_MAC_LOOKUP 31 #define OFTABLE_REMOTE_OUTPUT 32 #define OFTABLE_LOCAL_OUTPUT 33 #define OFTABLE_CHECK_LOOPBACK 34 diff --git a/include/ovn/actions.h b/include/ovn/actions.h index 145f27f25..4e2f4d28d 100644 --- a/include/ovn/actions.h +++ b/include/ovn/actions.h @@ -73,8 +73,10 @@ struct ovn_extend_table; OVNACT(ND_NA_ROUTER, ovnact_nest) \ OVNACT(GET_ARP, ovnact_get_mac_bind) \ OVNACT(PUT_ARP, ovnact_put_mac_bind) \ + OVNACT(LOOKUP_ARP, ovnact_lookup_mac_bind) \ OVNACT(GET_ND, ovnact_get_mac_bind) \ OVNACT(PUT_ND, ovnact_put_mac_bind) \ + OVNACT(LOOKUP_ND, ovnact_lookup_mac_bind) \ OVNACT(PUT_DHCPV4_OPTS, ovnact_put_opts) \ OVNACT(PUT_DHCPV6_OPTS, ovnact_put_opts) \ OVNACT(SET_QUEUE, ovnact_set_queue) \ @@ -266,6 +268,15 @@ struct ovnact_put_mac_bind { struct expr_field mac; /* 48-bit Ethernet address. */ }; +/* OVNACT_LOOKUP_ARP, OVNACT_LOOKUP_ND. */ +struct ovnact_lookup_mac_bind { + struct ovnact ovnact; + struct expr_field dst; /* 1-bit destination field. */ + struct expr_field port; /* Logical port name. */ + struct expr_field ip; /* 32-bit or 128-bit IP address. */ + struct expr_field mac; /* 48-bit Ethernet address. */ +}; + struct ovnact_gen_option { const struct gen_opts_map *option; struct expr_constant_set value; @@ -628,6 +639,8 @@ struct ovnact_encode_params { uint8_t output_ptable; /* OpenFlow table for 'output' to resubmit. */ uint8_t mac_bind_ptable; /* OpenFlow table for 'get_arp'/'get_nd' to resubmit. */ + uint8_t mac_lookup_ptable; /* OpenFlow table for + 'lookup_arp'/'lookup_nd' to resubmit. */ }; void ovnacts_encode(const struct ovnact[], size_t ovnacts_len, diff --git a/include/ovn/logical-fields.h b/include/ovn/logical-fields.h index 9bac8e027..cf1bb539b 100644 --- a/include/ovn/logical-fields.h +++ b/include/ovn/logical-fields.h @@ -40,6 +40,9 @@ enum ovn_controller_event { #define MFF_LOG_INPORT MFF_REG14 /* Logical input port (32 bits). */ #define MFF_LOG_OUTPORT MFF_REG15 /* Logical output port (32 bits). 
*/ +#define MFF_LOG_LOOKUP_MAC MFF_REG2 /* Register to store the result of + * lookup_mac action. */ + /* Logical registers. * * Make sure these don't overlap with the logical fields! */ diff --git a/lib/actions.c b/lib/actions.c index 6a5907e1b..3efcbd418 100644 --- a/lib/actions.c +++ b/lib/actions.c @@ -1607,6 +1607,113 @@ ovnact_put_mac_bind_free(struct ovnact_put_mac_bind *put_mac OVS_UNUSED) { } +static void format_lookup_mac(const struct ovnact_lookup_mac_bind *lookup_mac, + struct ds *s, const char *name) +{ + expr_field_format(&lookup_mac->dst, s); + ds_put_format(s, " = %s(", name); + expr_field_format(&lookup_mac->port, s); + ds_put_cstr(s, ", "); + expr_field_format(&lookup_mac->ip, s); + ds_put_cstr(s, ", "); + expr_field_format(&lookup_mac->mac, s); + ds_put_cstr(s, ");"); +} + +static void +format_LOOKUP_ARP(const struct ovnact_lookup_mac_bind *lookup_mac, + struct ds *s) +{ + format_lookup_mac(lookup_mac, s, "lookup_arp"); +} + +static void +format_LOOKUP_ND(const struct ovnact_lookup_mac_bind *lookup_mac, + struct ds *s) +{ + format_lookup_mac(lookup_mac, s, "lookup_nd"); +} + +static void +encode_lookup_mac(const struct ovnact_lookup_mac_bind *lookup_mac, + enum mf_field_id ip_field, + const struct ovnact_encode_params *ep, + struct ofpbuf *ofpacts) +{ + const struct arg args[] = { + { expr_resolve_field(&lookup_mac->port), MFF_LOG_INPORT }, + { expr_resolve_field(&lookup_mac->ip), ip_field }, + { expr_resolve_field(&lookup_mac->mac), MFF_ETH_SRC}, + }; + + encode_setup_args(args, ARRAY_SIZE(args), ofpacts); + + struct mf_subfield dst = expr_resolve_field(&lookup_mac->dst); + ovs_assert(dst.field); + + put_load(0, MFF_LOG_LOOKUP_MAC, 0, 1, ofpacts); + emit_resubmit(ofpacts, ep->mac_lookup_ptable); + + if (dst.field->id != MFF_LOG_LOOKUP_MAC || dst.ofs != 0) { + struct ofpact_reg_move *orm = ofpact_put_REG_MOVE(ofpacts); + orm->dst = dst; + orm->src.field = mf_from_id(MFF_LOG_LOOKUP_MAC); + orm->src.ofs = 0; + orm->src.n_bits = 1; + } + 
encode_restore_args(args, ARRAY_SIZE(args), ofpacts); +} + +static void +encode_LOOKUP_ARP(const struct ovnact_lookup_mac_bind *lookup_mac, + const struct ovnact_encode_params *ep, + struct ofpbuf *ofpacts) +{ + encode_lookup_mac(lookup_mac, MFF_REG0, ep, ofpacts); +} + +static void +encode_LOOKUP_ND(const struct ovnact_lookup_mac_bind *lookup_mac, + const struct ovnact_encode_params *ep, + struct ofpbuf *ofpacts) +{ + encode_lookup_mac(lookup_mac, MFF_XXREG0, ep, ofpacts); +} + +static void +parse_lookup_mac_bind(struct action_context *ctx, + const struct expr_field *dst, + int width, + struct ovnact_lookup_mac_bind *lookup_mac) +{ + /* Validate that the destination is a 1-bit, modifiable field. */ + char *error = expr_type_check(dst, 1, true); + if (error) { + lexer_error(ctx->lexer, "%s", error); + free(error); + return; + } + + lexer_get(ctx->lexer); /* Skip lookup_arp/lookup_nd. */ + lexer_get(ctx->lexer); /* Skip '('. * */ + + action_parse_field(ctx, 0, false, &lookup_mac->port); + lexer_force_match(ctx->lexer, LEX_T_COMMA); + action_parse_field(ctx, width, false, &lookup_mac->ip); + lexer_force_match(ctx->lexer, LEX_T_COMMA); + action_parse_field(ctx, 48, false, &lookup_mac->mac); + lexer_force_match(ctx->lexer, LEX_T_RPAREN); + lookup_mac->dst = *dst; +} + +static void +ovnact_lookup_mac_bind_free( + struct ovnact_lookup_mac_bind *lookup_mac OVS_UNUSED) +{ + +} + + static void parse_gen_opt(struct action_context *ctx, struct ovnact_gen_option *o, const struct hmap *gen_opts, const char *opts_type) @@ -2722,6 +2829,14 @@ parse_set_action(struct action_context *ctx) && lexer_lookahead(ctx->lexer) == LEX_T_LPAREN) { parse_check_pkt_larger(ctx, &lhs, ovnact_put_CHECK_PKT_LARGER(ctx->ovnacts)); + } else if (!strcmp(ctx->lexer->token.s, "lookup_arp") + && lexer_lookahead(ctx->lexer) == LEX_T_LPAREN) { + parse_lookup_mac_bind(ctx, &lhs, 32, + ovnact_put_LOOKUP_ARP(ctx->ovnacts)); + } else if (!strcmp(ctx->lexer->token.s, "lookup_nd") + && 
lexer_lookahead(ctx->lexer) == LEX_T_LPAREN) { + parse_lookup_mac_bind(ctx, &lhs, 128, + ovnact_put_LOOKUP_ND(ctx->ovnacts)); } else { parse_assignment_action(ctx, false, &lhs); } diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml index 0f4f1c112..b62ca1a77 100644 --- a/northd/ovn-northd.8.xml +++ b/northd/ovn-northd.8.xml @@ -1218,7 +1218,164 @@ output; Other packets are implicitly dropped.

-

Ingress Table 1: IP Input

+

Ingress Table 1: ARP/ND lookup

+ +

+ For ARP and Neighbor Discovery packets, this table looks into the + MAC_Binding records to determine + if OVN needs to learn the mac bindings. The following flows are added: +

+ +
    +
  • +

    + A priority-100 flow which matches ARP packets and applies + the actions: +

    + +
    +reg9[4] = lookup_arp(inport, arp.spa, arp.sha);
    +next;
    +        
    +
  • + +
  • +

    + A priority-100 flow which matches IPv6 Neighbor Discovery + advertisement packets and applies the actions: +

    + +
    +reg9[4] = lookup_nd(inport, nd.target, nd.tll);
    +next;
    +        
    +
  • + +
  • +

    + A priority-100 flow which matches IPv6 Neighbor Discovery + solicitation packets and applies the actions: +

    + +
    +reg9[4] = lookup_nd(inport, ip6.src, nd.sll);
    +next;
    +        
    +
  • + +
  • + A priority-0 fallback flow that matches all packets + and advances to the next table. +
  • +
+ +

Ingress Table 2: MAC learning

+ +

+ This table adds flows to learn the mac bindings from the ARP and + IPv6 Neighbor Solicitation/Advertisement packets if ARP/ND lookup + failed in the previous table. +

+ +

+ reg9[4] will be 0 if the lookup_arp/lookup_nd + action in the previous table did not find a matching entry in the mac binding table. +

+ +
    +
  • + A priority-100 flow with the match arp.op == 2 && + reg9[4] == 0 and applies the action + put_arp(inport, arp.spa, arp.sha); +
  • + +
  • + A priority-90 flow with the match arp.op == 2 and + applies the action drop; +
  • + +
  • +

    + MAC learning from ARP requests. +

    + +

    + These flows populate the mac binding table of the logical router + port from the ARP request packets for the router's own IP address. + The ARP requests are handled only if the requestor's IP belongs + to the same subnet as the logical router port. + For each router port P that owns IP address A, + which belongs to subnet S with prefix length L, + and Ethernet address E, a priority-100 flow matches + inport == P && + arp.spa == S/L && arp.op == 1 + && arp.tpa == A && + reg9[4] == 0 (ARP request) with the + following actions: +

    + +
    +put_arp(inport, arp.spa, arp.sha);
    +next;
    +        
    +
  • + +
  • +

    + MAC learning from ARP requests not directed at the router's own IPs. +

    + +

    + For each router port P that owns IP address + A, which belongs to subnet S with prefix length + L, and Ethernet address E, a priority-90 flow + matches inport == P && + arp.spa == S/L && arp.op == 1 + && reg9[4] == 0 (ARP request) + with the action put_arp(inport, arp.spa, arp.sha);. +

    + +

    + If the logical router port P is a distributed gateway + router port, additional match + is_chassis_resident(cr-P) is added so that + the resident gateway chassis handles such ARP packets. +

    +
  • + +
  • +

    + MAC learning from IPv6 Neighbor Solicitation packets. +

    + +

    + A priority-100 flow with the match nd_ns && + reg9[4] == 0 applies the + actions below and then advances the packet to the next table. +

    + +
    +put_nd(inport, ip6.src, nd.sll);
    +next;
    +        
    +
  • + +
  • +

    + MAC learning from IPv6 Neighbor Advertisement packets. + This flow uses Neighbor Advertisements to populate the + logical router's mac binding table. +

    + +

    + A priority-100 flow with the match nd_na && + reg9[4] == 0 applies the + action put_nd(inport, nd.target, nd.tll); +

    +
  • +
+ +

Ingress Table 3: IP Input

This table is the core of the logical router datapath functionality. It @@ -1315,8 +1472,7 @@ next;

- These flows reply to ARP requests for the router's own IP address - and populates mac binding table of the logical router port. + These flows reply to ARP requests for the router's own IP address. The ARP requests are handled only if the requestor's IP belongs to the same subnets of the logical router port. For each router port P that owns IP address A, @@ -1329,7 +1485,6 @@ next;

-put_arp(inport, arp.spa, arp.sha);
 eth.dst = eth.src;
 eth.src = E;
 arp.op = 2; /* ARP reply. */
@@ -1365,17 +1520,6 @@ output;
         

-
  • -

    - These flows handles ARP requests not for router's own IP address. - They use the SPA and SHA to populate the logical router port's - mac binding table, with priority 80. The typical use case of - these flows are GARP requests handling. For the gateway port - on a distributed logical router, these flows are only programmed - on the gateway port instance on the redirect-chassis. -

    -
  • -
  • These flows reply to ARP requests for the virtual IP addresses @@ -1446,36 +1590,6 @@ arp.sha = external_mac;

  • -
  • -

    - ARP reply handling. Following flows are added to handle ARP replies. -

    - -

    - For each distributed gateway logical router port a priority-92 flow - with match inport == P && - is_chassis_resident(cr-P) && eth.bcast && - arp.op == 2 && arp.spa == I with the - action put_arp(inport, arp.spa, arp.sha); so that the - resident gateway chassis can learn the GARP reply, where - P is the distributed gateway router port name, - I is the logical router port's network address. -

    - -

    - For each distributed gateway logical router port a priority-92 flow - with match inport == P && - !is_chassis_resident(cr-P) && eth.bcast && - arp.op == 2 && arp.spa == I with the action - drop; so that other chassis drop this packet. -

    - -

    - A priority-90 flow with match arp.op == 2 has actions - put_arp(inport, arp.spa, arp.sha);. -

    -
  • -
  • Reply to IPv6 Neighbor Solicitations. These flows reply to @@ -1494,7 +1608,6 @@ arp.sha = external_mac;

    -put_nd(inport, ip6.src, nd.sll);
     nd_na_router {
         eth.src = E;
         ip6.src = A;
    @@ -1516,7 +1629,6 @@ nd_na_router {
             

    -put_nd(inport, ip6.src, nd.sll);
     nd_na {
         eth.src = E;
         ip6.src = A;
    @@ -1540,23 +1652,6 @@ nd_na {
             

  • -
  • - IPv6 neighbor advertisement handling. This flow uses neighbor - advertisements to populate the logical router's mac binding - table. A priority-90 flow with match nd_na - has actions put_nd(inport, nd.target, nd.tll);. -
  • - -
  • - IPv6 neighbor solicitation for non-hosted addresses handling. - This flow uses neighbor solicitations to populate the logical - router's mac binding table (ones that were directed at the - logical router would have matched the priority-90 neighbor - solicitation flow already). A priority-80 flow with match - nd_ns has actions - put_nd(inport, ip6.src, nd.sll);. -
  • -
  • UDP port unreachable. Priority-80 flows generate ICMP port @@ -1670,7 +1765,7 @@ icmp6 {

  • -

    Ingress Table 2: DEFRAG

    +

    Ingress Table 4: DEFRAG

    This is to send packets to connection tracker for tracking and @@ -1728,7 +1823,7 @@ icmp6 { -

    Ingress Table 3: UNSNAT on Distributed Routers

    +

    Ingress Table 5: UNSNAT on Distributed Routers

    • @@ -1767,7 +1862,7 @@ icmp6 {
    -

    Ingress Table 4: DNAT

    +

    Ingress Table 6: DNAT

    Packets enter the pipeline with destination IP address that needs to @@ -1775,7 +1870,7 @@ icmp6 { in the reverse direction needs to be unDNATed.

    -

    Ingress Table 4: Load balancing DNAT rules

    +

    Ingress Table 6: Load balancing DNAT rules

    Following load balancing DNAT flows are added for Gateway router or @@ -1846,7 +1941,7 @@ icmp6 { -

    Ingress Table 4: DNAT on Gateway Routers

    +

    Ingress Table 6: DNAT on Gateway Routers

    • @@ -1872,7 +1967,7 @@ icmp6 {
    -

    Ingress Table 4: DNAT on Distributed Routers

    +

    Ingress Table 6: DNAT on Distributed Routers

    On distributed routers, the DNAT table only handles packets @@ -1919,7 +2014,7 @@ icmp6 { -

    Ingress Table 5: IPv6 ND RA option processing

    +

    Ingress Table 7: IPv6 ND RA option processing

    • @@ -1949,7 +2044,7 @@ reg0[5] = put_nd_ra_opts(options);next;
    -

    Ingress Table 6: IPv6 ND RA responder

    +

    Ingress Table 8: IPv6 ND RA responder

    This table implements IPv6 ND RA responder for the IPv6 ND RA replies @@ -1994,7 +2089,7 @@ output; -

    Ingress Table 7: IP Routing

    +

    Ingress Table 9: IP Routing

    A packet that arrives at this table is an IP packet that should be @@ -2144,7 +2239,7 @@ next; -

    Ingress Table 8: ARP/ND Resolution

    +

    Ingress Table 10: ARP/ND Resolution

    Any packet that reaches this table is an IP packet whose next-hop @@ -2291,7 +2386,7 @@ next; -

    Ingress Table 9: Check packet length

    +

    Ingress Table 11: Check packet length

    For distributed logical routers with distributed gateway port configured @@ -2321,7 +2416,7 @@ REGBIT_PKT_LARGER = check_pkt_larger(L); next; and advances to the next table.

    -

    Ingress Table 10: Handle larger packets

    +

    Ingress Table 12: Handle larger packets

    For distributed logical routers with distributed gateway port configured @@ -2370,7 +2465,7 @@ icmp4 { and advances to the next table.

    -

    Ingress Table 11: Gateway Redirect

    +

    Ingress Table 13: Gateway Redirect

    For distributed logical routers where one of the logical router @@ -2432,7 +2527,7 @@ icmp4 { -

    Ingress Table 12: ARP Request

    +

    Ingress Table 14: ARP Request

    In the common case where the Ethernet destination has been resolved, this diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c index f393cebb8..930d32530 100644 --- a/northd/ovn-northd.c +++ b/northd/ovn-northd.c @@ -145,19 +145,21 @@ enum ovn_stage { \ /* Logical router ingress stages. */ \ PIPELINE_STAGE(ROUTER, IN, ADMISSION, 0, "lr_in_admission") \ - PIPELINE_STAGE(ROUTER, IN, IP_INPUT, 1, "lr_in_ip_input") \ - PIPELINE_STAGE(ROUTER, IN, DEFRAG, 2, "lr_in_defrag") \ - PIPELINE_STAGE(ROUTER, IN, UNSNAT, 3, "lr_in_unsnat") \ - PIPELINE_STAGE(ROUTER, IN, DNAT, 4, "lr_in_dnat") \ - PIPELINE_STAGE(ROUTER, IN, ND_RA_OPTIONS, 5, "lr_in_nd_ra_options") \ - PIPELINE_STAGE(ROUTER, IN, ND_RA_RESPONSE, 6, "lr_in_nd_ra_response") \ - PIPELINE_STAGE(ROUTER, IN, IP_ROUTING, 7, "lr_in_ip_routing") \ - PIPELINE_STAGE(ROUTER, IN, POLICY, 8, "lr_in_policy") \ - PIPELINE_STAGE(ROUTER, IN, ARP_RESOLVE, 9, "lr_in_arp_resolve") \ - PIPELINE_STAGE(ROUTER, IN, CHK_PKT_LEN , 10, "lr_in_chk_pkt_len") \ - PIPELINE_STAGE(ROUTER, IN, LARGER_PKTS, 11,"lr_in_larger_pkts") \ - PIPELINE_STAGE(ROUTER, IN, GW_REDIRECT, 12, "lr_in_gw_redirect") \ - PIPELINE_STAGE(ROUTER, IN, ARP_REQUEST, 13, "lr_in_arp_request") \ + PIPELINE_STAGE(ROUTER, IN, LOOKUP_ARP, 1, "lr_in_lookup_arp") \ + PIPELINE_STAGE(ROUTER, IN, PUT_ARP, 2, "lr_in_put_arp") \ + PIPELINE_STAGE(ROUTER, IN, IP_INPUT, 3, "lr_in_ip_input") \ + PIPELINE_STAGE(ROUTER, IN, DEFRAG, 4, "lr_in_defrag") \ + PIPELINE_STAGE(ROUTER, IN, UNSNAT, 5, "lr_in_unsnat") \ + PIPELINE_STAGE(ROUTER, IN, DNAT, 6, "lr_in_dnat") \ + PIPELINE_STAGE(ROUTER, IN, ND_RA_OPTIONS, 7, "lr_in_nd_ra_options") \ + PIPELINE_STAGE(ROUTER, IN, ND_RA_RESPONSE, 8, "lr_in_nd_ra_response") \ + PIPELINE_STAGE(ROUTER, IN, IP_ROUTING, 9, "lr_in_ip_routing") \ + PIPELINE_STAGE(ROUTER, IN, POLICY, 10, "lr_in_policy") \ + PIPELINE_STAGE(ROUTER, IN, ARP_RESOLVE, 11, "lr_in_arp_resolve") \ + PIPELINE_STAGE(ROUTER, IN, CHK_PKT_LEN , 12, "lr_in_chk_pkt_len") \ + PIPELINE_STAGE(ROUTER, 
IN, LARGER_PKTS, 13,"lr_in_larger_pkts") \ + PIPELINE_STAGE(ROUTER, IN, GW_REDIRECT, 14, "lr_in_gw_redirect") \ + PIPELINE_STAGE(ROUTER, IN, ARP_REQUEST, 15, "lr_in_arp_request") \ \ /* Logical router egress stages. */ \ PIPELINE_STAGE(ROUTER, OUT, UNDNAT, 0, "lr_out_undnat") \ @@ -196,6 +198,7 @@ enum ovn_stage { #define REGBIT_DISTRIBUTED_NAT "reg9[2]" /* Register to store the result of check_pkt_larger action. */ #define REGBIT_PKT_LARGER "reg9[3]" +#define REGBIT_LOOKUP_ARP_RESULT "reg9[4]" /* Returns an "enum ovn_stage" built from the arguments. */ static enum ovn_stage @@ -6375,7 +6378,105 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports, ds_cstr(&match), "next;"); } - /* Logical router ingress table 1: IP Input. */ + /* Logical router ingress table 1: LOOKUP_ARP and table 2: PUT_ARP. */ + HMAP_FOR_EACH (od, key_node, datapaths) { + if (!od->nbr) { + continue; + } + + /* Learn from ARP requests and ARP replies. A typical + * use case is GARP request handling. + * Table LOOKUP_ARP does a lookup for the (arp.spa, arp.sha) + * in the mac binding table using the 'lookup_arp' action. + * If it is present, then this action stores the mac in the eth.dst + * of the packet. Before calling 'lookup_arp' we store + * eth.dst in xxreg1. After 'lookup_arp' action is applied + * we store the searched mac - eth.dst in xxreg0 and restore + * eth.dst to its original value. + * + * Table PUT_ARP learns the mac using the action - 'put_arp' + * only if xxreg0 is 00:00:00:00:00:00. There is no need to learn + * the mac otherwise. + * + * The same thing will be done for IPv6 ND/NS packets. 
+ * */ + ovn_lflow_add(lflows, od, S_ROUTER_IN_LOOKUP_ARP, 100, "arp", + REGBIT_LOOKUP_ARP_RESULT" = " + "lookup_arp(inport, arp.spa, arp.sha); next;"); + + ovn_lflow_add(lflows, od, S_ROUTER_IN_PUT_ARP, 100, + "arp.op == 2 && "REGBIT_LOOKUP_ARP_RESULT" == 0", + "put_arp(inport, arp.spa, arp.sha);"); + + ovn_lflow_add(lflows, od, S_ROUTER_IN_PUT_ARP, 90, "arp.op == 2", + "drop;"); + + /* IPv6 ND/NS handling. */ + ovn_lflow_add(lflows, od, S_ROUTER_IN_LOOKUP_ARP, 100, "nd_na", + REGBIT_LOOKUP_ARP_RESULT" = " + "lookup_nd(inport, nd.target, nd.tll); next;"); + + ovn_lflow_add(lflows, od, S_ROUTER_IN_LOOKUP_ARP, 100, "nd_ns", + REGBIT_LOOKUP_ARP_RESULT" = " + "lookup_nd(inport, ip6.src, nd.sll); next;"); + + ovn_lflow_add(lflows, od, S_ROUTER_IN_PUT_ARP, 100, + "nd_na && "REGBIT_LOOKUP_ARP_RESULT" == 0", + "put_nd(inport, nd.target, nd.tll);"); + + ovn_lflow_add(lflows, od, S_ROUTER_IN_PUT_ARP, 100, + "nd_ns && "REGBIT_LOOKUP_ARP_RESULT" == 0", + "put_nd(inport, ip6.src, nd.sll); next;"); + + /* Pass other traffic not already handled to the next table for + * routing. */ + ovn_lflow_add(lflows, od, S_ROUTER_IN_LOOKUP_ARP, 0, "1", "next;"); + ovn_lflow_add(lflows, od, S_ROUTER_IN_PUT_ARP, 0, "1", "next;"); + } + + HMAP_FOR_EACH (op, key_node, ports) { + if (!op->nbrp) { + continue; + } + + for (int i = 0; i < op->lrp_networks.n_ipv4_addrs; i++) { + ds_clear(&match); + ds_put_format(&match, + "inport == %s && arp.spa == %s/%u && arp.tpa == %s" + " && arp.op == 1 && " + REGBIT_LOOKUP_ARP_RESULT" == 0", + op->json_key, + op->lrp_networks.ipv4_addrs[i].network_s, + op->lrp_networks.ipv4_addrs[i].plen, + op->lrp_networks.ipv4_addrs[i].addr_s); + ovn_lflow_add(lflows, op->od, S_ROUTER_IN_PUT_ARP, 100, + ds_cstr(&match), + "put_arp(inport, arp.spa, arp.sha); next; "); + } + + /* Learn from ARP requests that were not directed at us. A typical + * use case is GARP request handling. (A priority-90 flow will + * respond to request to us and learn the sender's mac address.) 
*/ + for (int i = 0; i < op->lrp_networks.n_ipv4_addrs; i++) { + ds_clear(&match); + ds_put_format(&match, + "inport == %s && arp.spa == %s/%u && arp.op == 1 && " + REGBIT_LOOKUP_ARP_RESULT" == 0", + op->json_key, + op->lrp_networks.ipv4_addrs[i].network_s, + op->lrp_networks.ipv4_addrs[i].plen); + if (op->od->l3dgw_port && op == op->od->l3dgw_port + && op->od->l3redirect_port) { + ds_put_format(&match, " && is_chassis_resident(%s)", + op->od->l3redirect_port->json_key); + } + ovn_lflow_add(lflows, op->od, S_ROUTER_IN_PUT_ARP, 90, + ds_cstr(&match), + "put_arp(inport, arp.spa, arp.sha);"); + } + } + + /* Logical router ingress table 3: IP Input. */ HMAP_FOR_EACH (od, key_node, datapaths) { if (!od->nbr) { continue; @@ -6397,11 +6498,6 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports, ovn_lflow_add(lflows, od, S_ROUTER_IN_IP_INPUT, 95, "ip4.mcast", od->mcast_info.rtr.relay ? "next;" : "drop;"); - /* ARP reply handling. Use ARP replies to populate the logical - * router's ARP table. */ - ovn_lflow_add(lflows, od, S_ROUTER_IN_IP_INPUT, 90, "arp.op == 2", - "put_arp(inport, arp.spa, arp.sha);"); - /* Drop Ethernet local broadcast. By definition this traffic should * not be forwarded.*/ ovn_lflow_add(lflows, od, S_ROUTER_IN_IP_INPUT, 50, @@ -6413,23 +6509,12 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports, ovn_lflow_add(lflows, od, S_ROUTER_IN_IP_INPUT, 30, ds_cstr(&match), "drop;"); - /* ND advertisement handling. Use advertisements to populate - * the logical router's ARP/ND table. */ - ovn_lflow_add(lflows, od, S_ROUTER_IN_IP_INPUT, 90, "nd_na", - "put_nd(inport, nd.target, nd.tll);"); - - /* Lean from neighbor solicitations that were not directed at - * us. (A priority-90 flow will respond to requests to us and - * learn the sender's mac address. */ - ovn_lflow_add(lflows, od, S_ROUTER_IN_IP_INPUT, 80, "nd_ns", - "put_nd(inport, ip6.src, nd.sll);"); - /* Pass other traffic not already handled to the next table for * routing. 
*/ ovn_lflow_add(lflows, od, S_ROUTER_IN_IP_INPUT, 0, "1", "next;"); } - /* Logical router ingress table 1: IP Input for IPv4. */ + /* Logical router ingress table 4: IP Input for IPv4. */ HMAP_FOR_EACH (op, key_node, ports) { if (!op->nbrp) { continue; @@ -6539,7 +6624,6 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports, ds_clear(&actions); ds_put_format(&actions, - "put_arp(inport, arp.spa, arp.sha); " "eth.dst = eth.src; " "eth.src = %s; " "arp.op = 2; /* ARP reply */ " @@ -6558,62 +6642,6 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports, ds_cstr(&match), ds_cstr(&actions)); } - /* Learn from ARP requests that were not directed at us. A typical - * use case is GARP request handling. (A priority-90 flow will - * respond to request to us and learn the sender's mac address.) */ - for (int i = 0; i < op->lrp_networks.n_ipv4_addrs; i++) { - ds_clear(&match); - ds_put_format(&match, - "inport == %s && arp.spa == %s/%u && arp.op == 1", - op->json_key, - op->lrp_networks.ipv4_addrs[i].network_s, - op->lrp_networks.ipv4_addrs[i].plen); - if (op->od->l3dgw_port && op == op->od->l3dgw_port - && op->od->l3redirect_port) { - ds_put_format(&match, " && is_chassis_resident(%s)", - op->od->l3redirect_port->json_key); - } - ovn_lflow_add(lflows, op->od, S_ROUTER_IN_IP_INPUT, 80, - ds_cstr(&match), - "put_arp(inport, arp.spa, arp.sha);"); - - } - - /* Handle GARP reply packets received on a distributed router gateway - * port. GARP reply broadcast packets could be sent by external - * switches. We don't want them to be handled by all the - * ovn-controllers if they receive it. So add a priority-92 flow to - * apply the put_arp action on a redirect chassis and drop it on - * other chassis. - * Note that we are already adding a priority-90 logical flow in the - * table S_ROUTER_IN_IP_INPUT to apply the put_arp action if - * arp.op == 2. 
-         * */
-        if (op->od->l3dgw_port && op == op->od->l3dgw_port
-            && op->od->l3redirect_port) {
-            for (int i = 0; i < op->lrp_networks.n_ipv4_addrs; i++) {
-                ds_clear(&match);
-                ds_put_format(&match,
-                              "inport == %s && is_chassis_resident(%s) && "
-                              "eth.bcast && arp.op == 2 && arp.spa == %s/%u",
-                              op->json_key, op->od->l3redirect_port->json_key,
-                              op->lrp_networks.ipv4_addrs[i].network_s,
-                              op->lrp_networks.ipv4_addrs[i].plen);
-                ovn_lflow_add(lflows, op->od, S_ROUTER_IN_IP_INPUT, 92,
-                              ds_cstr(&match),
-                              "put_arp(inport, arp.spa, arp.sha);");
-                ds_clear(&match);
-                ds_put_format(&match,
-                              "inport == %s && !is_chassis_resident(%s) && "
-                              "eth.bcast && arp.op == 2 && arp.spa == %s/%u",
-                              op->json_key, op->od->l3redirect_port->json_key,
-                              op->lrp_networks.ipv4_addrs[i].network_s,
-                              op->lrp_networks.ipv4_addrs[i].plen);
-                ovn_lflow_add(lflows, op->od, S_ROUTER_IN_IP_INPUT, 92,
-                              ds_cstr(&match), "drop;");
-            }
-        }
-
         /* A set to hold all load-balancer vips that need ARP responses. */
         struct sset all_ips = SSET_INITIALIZER(&all_ips);
         int addr_family;
@@ -6924,7 +6952,6 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,

             ds_clear(&actions);
             ds_put_format(&actions,
-                          "put_nd(inport, ip6.src, nd.sll); "
                           "nd_na_router { "
                           "eth.src = %s; "
                           "ip6.src = %s; "
diff --git a/ovn-sb.xml b/ovn-sb.xml
index 477e7bc7a..e5fb51a9d 100644
--- a/ovn-sb.xml
+++ b/ovn-sb.xml
@@ -1397,6 +1397,35 @@

    Example: put_arp(inport, arp.spa, arp.sha);

    +
    + R = lookup_arp(P, A, M);
    +
    + Parameters: logical port string field P, 32-bit
    + IP address field A, 48-bit MAC address field
    + M.
    +
    + Result: stored to a 1-bit subfield R.
    +
    + Looks up A and M in P's mac
    + binding table. If an entry is found, stores 1 in
    + the 1-bit subfield R, else 0.
    +
    + Example:
    + reg0[0] = lookup_arp(inport, arp.spa, arp.sha);
    +
    nd_ns { action; ... };

    @@ -1553,6 +1582,34 @@

    Example: put_nd(inport, nd.target, nd.tll);

    +
    + R = lookup_nd(P, A, M);
    +
    + Parameters: logical port string field P, 128-bit
    + IP address field A, 48-bit MAC address field
    + M.
    +
    + Result: stored to a 1-bit subfield R.
    +
    + Looks up A and M in P's mac
    + binding table. If an entry is found, stores 1 in
    + the 1-bit subfield R, else 0.
    +
    + Example:
    + reg0[0] = lookup_nd(inport, ip6.src, eth.src);
    +
    R = put_dhcp_opts(D1 = V1, D2 = V2, ..., Dn = Vn);
diff --git a/tests/ovn.at b/tests/ovn.at
index 04898dd1f..0c26f8bc7 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -1143,6 +1143,33 @@ put_arp(inport, arp.spa, arp.sha);
     encodes as push:NXM_NX_REG0[],push:NXM_OF_ETH_SRC[],push:NXM_NX_ARP_SHA[],push:NXM_OF_ARP_SPA[],pop:NXM_NX_REG0[],pop:NXM_OF_ETH_SRC[],controller(userdata=00.00.00.01.00.00.00.00),pop:NXM_OF_ETH_SRC[],pop:NXM_NX_REG0[]
     has prereqs eth.type == 0x806 && eth.type == 0x806

+# lookup_arp
+reg0[0] = lookup_arp(inport, ip4.dst, eth.src);
+    encodes as push:NXM_NX_REG0[],push:NXM_OF_IP_DST[],pop:NXM_NX_REG0[],set_field:0/0x1->reg2,resubmit(,31),move:NXM_NX_REG2[0]->NXM_NX_XXREG0[96],pop:NXM_NX_REG0[]
+    has prereqs eth.type == 0x800
+reg1[1] = lookup_arp(inport, arp.spa, arp.sha);
+    encodes as push:NXM_NX_REG0[],push:NXM_OF_ETH_SRC[],push:NXM_NX_ARP_SHA[],push:NXM_OF_ARP_SPA[],pop:NXM_NX_REG0[],pop:NXM_OF_ETH_SRC[],set_field:0/0x1->reg2,resubmit(,31),move:NXM_NX_REG2[0]->NXM_NX_XXREG0[65],pop:NXM_OF_ETH_SRC[],pop:NXM_NX_REG0[]
+    has prereqs eth.type == 0x806 && eth.type == 0x806
+
+lookup_arp;
+    Syntax error at `lookup_arp' expecting action.
+reg0[0] = lookup_arp;
+    Syntax error at `lookup_arp' expecting field name.
+reg0[0] = lookup_arp();
+    Syntax error at `)' expecting field name.
+reg0[0] = lookup_arp(inport);
+    Syntax error at `)' expecting `,'.
+reg0[0] = lookup_arp(inport ip4.dst);
+    Syntax error at `ip4.dst' expecting `,'.
+reg0[0] = lookup_arp(inport, ip4.dst;
+    Syntax error at `;' expecting `,'.
+reg0[0] = lookup_arp(inport, ip4.dst, eth.src;
+    Syntax error at `;' expecting `)'.
+reg0[0] = lookup_arp(inport, eth.dst);
+    Cannot use 48-bit field eth.dst[0..47] where 32-bit field is required.
+reg0[0] = lookup_arp(inport, ip4.src, ip4.dst);
+    Cannot use 32-bit field ip4.dst[0..31] where 48-bit field is required.
+
 # put_dhcp_opts
 reg1[0] = put_dhcp_opts(offerip = 1.2.3.4, router = 10.0.0.1);
     encodes as controller(userdata=00.00.00.02.00.00.00.00.00.01.de.10.00.00.00.40.01.02.03.04.03.04.0a.00.00.01,pause)
@@ -1243,6 +1270,35 @@ reg1[0] = put_dhcpv6_opts(ia_addr="ae70::4");
 reg1[0] = put_dhcpv6_opts(ia_addr=ae70::4, domain_search=ae70::1);
     DHCPv6 option domain_search requires string value.

+# lookup_nd
+reg2[0] = lookup_nd(inport, ip6.dst, eth.src);
+    encodes as push:NXM_NX_XXREG0[],push:NXM_NX_IPV6_DST[],pop:NXM_NX_XXREG0[],set_field:0/0x1->reg2,resubmit(,31),move:NXM_NX_REG2[0]->NXM_NX_XXREG0[32],pop:NXM_NX_XXREG0[]
+    has prereqs eth.type == 0x86dd
+reg3[0] = lookup_nd(inport, nd.target, nd.tll);
+    encodes as push:NXM_NX_XXREG0[],push:NXM_OF_ETH_SRC[],push:NXM_NX_ND_TLL[],push:NXM_NX_ND_TARGET[],pop:NXM_NX_XXREG0[],pop:NXM_OF_ETH_SRC[],set_field:0/0x1->reg2,resubmit(,31),move:NXM_NX_REG2[0]->NXM_NX_XXREG0[0],pop:NXM_OF_ETH_SRC[],pop:NXM_NX_XXREG0[]
+    has prereqs (icmp6.type == 0x87 || icmp6.type == 0x88) && eth.type == 0x86dd && ip.proto == 0x3a && (eth.type == 0x800 || eth.type == 0x86dd) && icmp6.code == 0 && eth.type == 0x86dd && ip.proto == 0x3a && (eth.type == 0x800 || eth.type == 0x86dd) && ip.ttl == 0xff && (eth.type == 0x800 || eth.type == 0x86dd) && icmp6.type == 0x88 && eth.type == 0x86dd && ip.proto == 0x3a && (eth.type == 0x800 || eth.type == 0x86dd) && icmp6.code == 0 && eth.type == 0x86dd && ip.proto == 0x3a && (eth.type == 0x800 || eth.type == 0x86dd) && ip.ttl == 0xff && (eth.type == 0x800 || eth.type == 0x86dd)
+
+lookup_nd;
+    Syntax error at `lookup_nd' expecting action.
+reg0[0] = lookup_nd;
+    Syntax error at `lookup_nd' expecting field name.
+reg0[0] = lookup_nd();
+    Syntax error at `)' expecting field name.
+reg0[0] = lookup_nd(inport);
+    Syntax error at `)' expecting `,'.
+reg0[0] = lookup_nd(inport ip6.dst);
+    Syntax error at `ip6.dst' expecting `,'.
+reg0[0] = lookup_nd(inport, ip6.dst;
+    Syntax error at `;' expecting `,'.
+reg0[0] = lookup_nd(inport, ip6.dst, eth.src;
+    Syntax error at `;' expecting `)'.
+reg0[0] = lookup_nd(inport, eth.dst);
+    Cannot use 48-bit field eth.dst[0..47] where 128-bit field is required.
+reg0[0] = lookup_nd(inport, ip4.src, ip4.dst);
+    Cannot use 32-bit field ip4.src[0..31] where 128-bit field is required.
+reg0[0] = lookup_nd(inport, ip6.src, ip6.dst);
+    Cannot use 128-bit field ip6.dst[0..127] where 48-bit field is required.
+
 # set_queue
 set_queue(0);
     encodes as set_queue:0
@@ -14528,7 +14584,7 @@ ovn-sbctl dump-flows lr0 | grep lr_in_arp_resolve | grep "reg0 == 10.0.0.10" \
 # Since the sw0-vir is not claimed by any chassis, eth.dst should be set to
 # zero if the ip4.dst is the virtual ip in the router pipeline.
 AT_CHECK([cat lflows.txt], [0], [dnl
-  table=9 (lr_in_arp_resolve ), priority=100 , match=(outport == "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 00:00:00:00:00:00; next;)
+  table=11(lr_in_arp_resolve ), priority=100 , match=(outport == "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 00:00:00:00:00:00; next;)
 ])

 ip_to_hex() {
@@ -14564,7 +14620,7 @@ ovn-sbctl dump-flows lr0 | grep lr_in_arp_resolve | grep "reg0 == 10.0.0.10" \
 # There should be an arp resolve flow to resolve the virtual_ip with the
 # sw0-p1's MAC.
 AT_CHECK([cat lflows.txt], [0], [dnl
-  table=9 (lr_in_arp_resolve ), priority=100 , match=(outport == "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 50:54:00:00:00:03; next;)
+  table=11(lr_in_arp_resolve ), priority=100 , match=(outport == "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 50:54:00:00:00:03; next;)
 ])

 # send the garp from sw0-p2 (in hv2). hv2 should claim sw0-vir
@@ -14587,7 +14643,7 @@ ovn-sbctl dump-flows lr0 | grep lr_in_arp_resolve | grep "reg0 == 10.0.0.10" \
 # There should be an arp resolve flow to resolve the virtual_ip with the
 # sw0-p2's MAC.
 AT_CHECK([cat lflows.txt], [0], [dnl
-  table=9 (lr_in_arp_resolve ), priority=100 , match=(outport == "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 50:54:00:00:00:04; next;)
+  table=11(lr_in_arp_resolve ), priority=100 , match=(outport == "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 50:54:00:00:00:04; next;)
 ])

 # Now send arp reply from sw0-p1. hv1 should claim sw0-vir
@@ -14608,7 +14664,7 @@ ovn-sbctl dump-flows lr0 | grep lr_in_arp_resolve | grep "reg0 == 10.0.0.10" \
 > lflows.txt

 AT_CHECK([cat lflows.txt], [0], [dnl
-  table=9 (lr_in_arp_resolve ), priority=100 , match=(outport == "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 50:54:00:00:00:03; next;)
+  table=11(lr_in_arp_resolve ), priority=100 , match=(outport == "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 50:54:00:00:00:03; next;)
 ])

 # Delete hv1-vif1 port. hv1 should release sw0-vir
@@ -14626,7 +14682,7 @@ ovn-sbctl dump-flows lr0 | grep lr_in_arp_resolve | grep "reg0 == 10.0.0.10" \
 > lflows.txt

 AT_CHECK([cat lflows.txt], [0], [dnl
-  table=9 (lr_in_arp_resolve ), priority=100 , match=(outport == "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 00:00:00:00:00:00; next;)
+  table=11(lr_in_arp_resolve ), priority=100 , match=(outport == "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 00:00:00:00:00:00; next;)
 ])

 # Now send arp reply from sw0-p2. hv2 should claim sw0-vir
@@ -14647,7 +14703,7 @@ ovn-sbctl dump-flows lr0 | grep lr_in_arp_resolve | grep "reg0 == 10.0.0.10" \
 > lflows.txt

 AT_CHECK([cat lflows.txt], [0], [dnl
-  table=9 (lr_in_arp_resolve ), priority=100 , match=(outport == "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 50:54:00:00:00:04; next;)
+  table=11(lr_in_arp_resolve ), priority=100 , match=(outport == "lr0-sw0" && reg0 == 10.0.0.10), action=(eth.dst = 50:54:00:00:00:04; next;)
 ])

 # Delete sw0-p2 logical port
@@ -15879,3 +15935,225 @@ as hv4 ovs-appctl fdb/show br-phys

 OVN_CLEANUP([hv1],[hv2],[hv3],[hv4])
 AT_CLEANUP
+
+AT_SETUP([ovn -- ARP lookup before learning])
+AT_KEYWORDS([virtual ports])
+AT_SKIP_IF([test $HAVE_PYTHON = no])
+ovn_start
+
+send_garp() {
+    local hv=$1 inport=$2 eth_src=$3 eth_dst=$4 spa=$5 tpa=$6
+    local request=${eth_dst}${eth_src}08060001080006040001${eth_src}${spa}${eth_dst}${tpa}
+    as hv$hv ovs-appctl netdev-dummy/receive hv${hv}-vif$inport $request
+}
+
+send_arp_reply() {
+    local hv=$1 inport=$2 eth_src=$3 eth_dst=$4 spa=$5 tpa=$6
+    local request=${eth_dst}${eth_src}08060001080006040002${eth_src}${spa}${eth_dst}${tpa}
+    as hv$hv ovs-appctl netdev-dummy/receive hv${hv}-vif$inport $request
+}
+
+net_add n1
+
+sim_add hv1
+as hv1
+ovs-vsctl add-br br-phys
+ovn_attach n1 br-phys 192.168.0.1
+ovs-vsctl -- add-port br-int hv1-vif1 -- \
+    set interface hv1-vif1 external-ids:iface-id=sw0-p1 \
+    options:tx_pcap=hv1/vif1-tx.pcap \
+    options:rxq_pcap=hv1/vif1-rx.pcap \
+    ofport-request=1
+ovs-vsctl -- add-port br-int hv1-vif2 -- \
+    set interface hv1-vif2 external-ids:iface-id=sw0-p3 \
+    options:tx_pcap=hv1/vif2-tx.pcap \
+    options:rxq_pcap=hv1/vif2-rx.pcap \
+    ofport-request=2
+
+sim_add hv2
+as hv2
+ovs-vsctl add-br br-phys
+ovn_attach n1 br-phys 192.168.0.2
+ovs-vsctl -- add-port br-int hv2-vif1 -- \
+    set interface hv2-vif1 external-ids:iface-id=sw1-p1 \
+    options:tx_pcap=hv2/vif1-tx.pcap \
+    options:rxq_pcap=hv2/vif1-rx.pcap \
+    ofport-request=1
+
+ovn-nbctl ls-add sw0
+
+ovn-nbctl lsp-add sw0 sw0-p1
+ovn-nbctl lsp-set-addresses sw0-p1 "50:54:00:00:00:03"
+
+# Create the second logical switch with one port
+ovn-nbctl ls-add sw1
+ovn-nbctl lsp-add sw1 sw1-p1
+ovn-nbctl lsp-set-addresses sw1-p1 "40:54:00:00:00:03 20.0.0.3"
+ovn-nbctl lsp-set-port-security sw1-p1 "40:54:00:00:00:03 20.0.0.3"
+
+# Create a logical router and attach both logical switches
+ovn-nbctl lr-add lr0
+ovn-nbctl lrp-add lr0 lr0-sw0 00:00:00:00:ff:01 10.0.0.1/24
+ovn-nbctl lsp-add sw0 sw0-lr0
+ovn-nbctl lsp-set-type sw0-lr0 router
+ovn-nbctl lsp-set-addresses sw0-lr0 00:00:00:00:ff:01
+ovn-nbctl lsp-set-options sw0-lr0 router-port=lr0-sw0
+
+ovn-nbctl lrp-add lr0 lr0-sw1 00:00:00:00:ff:02 20.0.0.1/24
+ovn-nbctl lsp-add sw1 sw1-lr0
+ovn-nbctl lsp-set-type sw1-lr0 router
+ovn-nbctl lsp-set-addresses sw1-lr0 00:00:00:00:ff:02
+ovn-nbctl lsp-set-options sw1-lr0 router-port=lr0-sw1
+
+OVN_POPULATE_ARP
+ovn-nbctl --wait=hv sync
+
+as hv1 ovs-appctl -t ovn-controller vlog/set dbg
+
+ip_to_hex() {
+    printf "%02x%02x%02x%02x" "$@"
+}
+
+# From sw0-p1 send GARP for 10.0.0.30.
+# ovn-controller should learn the
+# mac_binding entry
+#     port - lr0-sw0
+#     ip - 10.0.0.30
+#     mac - 50:54:00:00:00:03
+
+AT_CHECK([test 0 = `ovn-sbctl list mac_binding | wc -l`])
+eth_src=505400000003
+eth_dst=ffffffffffff
+spa=$(ip_to_hex 10 0 0 30)
+tpa=$(ip_to_hex 10 0 0 30)
+send_garp 1 1 $eth_src $eth_dst $spa $tpa
+
+OVS_WAIT_UNTIL([test 1 = `ovn-sbctl --bare --columns _uuid list mac_binding | wc -l`])
+
+AT_CHECK([ovn-sbctl --format=csv --bare --columns logical_port,ip,mac \
+list mac_binding], [0], [lr0-sw0
+10.0.0.30
+50:54:00:00:00:03
+])
+
+AT_CHECK([test 1 = `cat hv1/ovn-controller.log | grep NXT_PACKET_IN2 | wc -l`])
+AT_CHECK([test 1 = `as hv1 ovs-ofctl dump-flows br-int table=10 | grep arp | \
+grep controller | grep -v n_packets=0 | wc -l`])
+
+# Wait for an entry in table=31
+OVS_WAIT_UNTIL(
+    [test 1 = `as hv1 ovs-ofctl dump-flows br-int table=31 | grep n_packets=0 \
+| wc -l`]
+)
+
+# Send garp again. This time the packet should not be sent to ovn-controller.
+send_garp 1 1 $eth_src $eth_dst $spa $tpa
+# Wait for the packet to hit the entry in table=31
+OVS_WAIT_UNTIL([test 1 = `as hv1 ovs-ofctl dump-flows br-int table=31 | grep n_packets=1 | wc -l`])
+
+# The packet should not be sent to ovn-controller. The packet
+# count should be 1 only.
+AT_CHECK([test 1 = `cat hv1/ovn-controller.log | grep NXT_PACKET_IN2 | wc -l`])
+AT_CHECK([test 1 = `as hv1 ovs-ofctl dump-flows br-int table=10 | grep arp | \
+grep controller | grep -v n_packets=0 | wc -l`])
+
+# Now send garp packet with different mac.
+eth_src=505400000013
+eth_dst=ffffffffffff
+spa=$(ip_to_hex 10 0 0 30)
+tpa=$(ip_to_hex 10 0 0 30)
+send_garp 1 1 $eth_src $eth_dst $spa $tpa
+
+# The garp packet should be sent to ovn-controller and the mac_binding entry
+# should be updated.
+OVS_WAIT_UNTIL([test 2 = `cat hv1/ovn-controller.log | grep NXT_PACKET_IN2 | wc -l`])
+
+AT_CHECK([test 1 = `ovn-sbctl --bare --columns _uuid list mac_binding | wc -l`])
+
+AT_CHECK([ovn-sbctl --format=csv --bare --columns logical_port,ip,mac \
+list mac_binding], [0], [lr0-sw0
+10.0.0.30
+50:54:00:00:00:13
+])
+
+# Send ARP request to lrp - lr0-sw1 (20.0.0.1) using src mac 50:54:00:00:00:33
+# and src ip - 10.0.0.50 from sw0-p1.
+# ovn-controller should add the mac_binding entry
+#     logical_port - lr0-sw0
+#     IP - 10.0.0.50
+#     MAC - 50:54:00:00:00:33
+eth_src=505400000033
+eth_dst=ffffffffffff
+spa=$(ip_to_hex 10 0 0 50)
+tpa=$(ip_to_hex 20 0 0 1)
+
+send_garp 1 1 $eth_src $eth_dst $spa $tpa
+
+# The ARP request should be sent to ovn-controller and the mac_binding entry
+# should be added.
+OVS_WAIT_UNTIL([test 3 = `cat hv1/ovn-controller.log | grep NXT_PACKET_IN2 | wc -l`])
+
+OVS_WAIT_UNTIL(
+    [test 1 = `as hv1 ovs-ofctl dump-flows br-int table=31 | grep dl_src=50:54:00:00:00:33 \
+| wc -l`]
+)
+
+AT_CHECK([ovn-sbctl --format=csv --bare --columns logical_port,ip,mac \
+find mac_binding ip=10.0.0.50], [0], [lr0-sw0
+10.0.0.50
+50:54:00:00:00:33
+])
+
+# Send the same packet again.
+send_garp 1 1 $eth_src $eth_dst $spa $tpa
+
+OVS_WAIT_UNTIL(
+    [test 1 = `as hv1 ovs-ofctl dump-flows br-int table=31 | grep dl_src=50:54:00:00:00:33 \
+| grep n_packets=1 | wc -l`]
+)
+
+AT_CHECK([test 3 = `cat hv1/ovn-controller.log | grep NXT_PACKET_IN2 | wc -l`])
+
+# Now send ARP reply packet with IP - 10.0.0.40 and mac 505400000023
+eth_src=505400000023
+eth_dst=ffffffffffff
+spa=$(ip_to_hex 10 0 0 40)
+tpa=$(ip_to_hex 10 0 0 50)
+send_arp_reply 1 1 $eth_src $eth_dst $spa $tpa
+
+# ovn-controller should add the
+# mac_binding entry
+#     port - lr0-sw0
+#     ip - 10.0.0.40
+#     mac - 50:54:00:00:00:23
+
+# The ARP reply should be sent to ovn-controller and the mac_binding entry
+# should be added.
+OVS_WAIT_UNTIL([test 4 = `cat hv1/ovn-controller.log | grep NXT_PACKET_IN2 | wc -l`])
+
+# Wait for an entry in table=31 for the learnt mac_binding entry.
+
+OVS_WAIT_UNTIL(
+    [test 1 = `as hv1 ovs-ofctl dump-flows br-int table=31 | grep dl_src=50:54:00:00:00:23 \
+| wc -l`]
+)
+
+# Send the same ARP reply. This time it should not be sent to ovn-controller.
+send_arp_reply 1 1 $eth_src $eth_dst $spa $tpa
+OVS_WAIT_UNTIL(
+    [test 1 = `as hv1 ovs-ofctl dump-flows br-int table=31 | grep dl_src=50:54:00:00:00:23 \
+| grep n_packets=1 | wc -l`]
+)
+
+AT_CHECK([test 4 = `cat hv1/ovn-controller.log | grep NXT_PACKET_IN2 | wc -l`])
+
+send_arp_reply 1 1 $eth_src $eth_dst $spa $tpa
+OVS_WAIT_UNTIL(
+    [test 1 = `as hv1 ovs-ofctl dump-flows br-int table=31 | grep dl_src=50:54:00:00:00:23 \
+| grep n_packets=2 | wc -l`]
+)
+
+AT_CHECK([test 4 = `cat hv1/ovn-controller.log | grep NXT_PACKET_IN2 | wc -l`])
+
+OVN_CLEANUP([hv1], [hv2])
+AT_CLEANUP
diff --git a/tests/test-ovn.c b/tests/test-ovn.c
index 8462c21b6..e96321bd6 100644
--- a/tests/test-ovn.c
+++ b/tests/test-ovn.c
@@ -1297,6 +1297,7 @@ test_parse_actions(struct ovs_cmdl_context *ctx OVS_UNUSED)
                 .egress_ptable = 40,
                 .output_ptable = 64,
                 .mac_bind_ptable = 65,
+                .mac_lookup_ptable = 31,
             };
             struct ofpbuf ofpacts;
             ofpbuf_init(&ofpacts, 0);
diff --git a/utilities/ovn-trace.c b/utilities/ovn-trace.c
index 0583610b9..c95acb897 100644
--- a/utilities/ovn-trace.c
+++ b/utilities/ovn-trace.c
@@ -556,6 +556,22 @@ ovntrace_mac_binding_find(const struct ovntrace_datapath *dp,
     return NULL;
 }

+static const struct ovntrace_mac_binding *
+ovntrace_mac_binding_find_mac_ip(const struct ovntrace_datapath *dp,
+                                 uint16_t port_key, const struct in6_addr *ip,
+                                 struct eth_addr mac)
+{
+    const struct ovntrace_mac_binding *bind;
+    HMAP_FOR_EACH_WITH_HASH (bind, node, hash_mac_binding(port_key, ip),
+                             &dp->mac_bindings) {
+        if (bind->port_key == port_key && ipv6_addr_equals(ip, &bind->ip)
+            && eth_addr_equals(bind->mac, mac)) {
+            return bind;
+        }
+    }
+    return NULL;
+}
+
 /* If 's' ends with a UUID, returns a copy of it with the UUID truncated to
  * just the first 6 characters; otherwise, returns a copy of 's'. */
 static char *
@@ -1704,6 +1720,51 @@ execute_get_mac_bind(const struct ovnact_get_mac_bind *bind,
                          ETH_ADDR_ARGS(uflow->dl_dst));
 }

+static void
+execute_lookup_mac(const struct ovnact_lookup_mac_bind *bind,
+                   const struct ovntrace_datapath *dp,
+                   struct flow *uflow,
+                   struct ovs_list *super)
+{
+    /* Get logical port number. */
+    struct mf_subfield port_sf = expr_resolve_field(&bind->port);
+    ovs_assert(port_sf.n_bits == 32);
+    uint32_t port_key = mf_get_subfield(&port_sf, uflow);
+
+    /* Get IP address. */
+    struct mf_subfield ip_sf = expr_resolve_field(&bind->ip);
+    ovs_assert(ip_sf.n_bits == 32 || ip_sf.n_bits == 128);
+    union mf_subvalue ip_sv;
+    mf_read_subfield(&ip_sf, uflow, &ip_sv);
+    struct in6_addr ip = (ip_sf.n_bits == 32
+                          ? in6_addr_mapped_ipv4(ip_sv.ipv4)
+                          : ip_sv.ipv6);
+
+    /* Get MAC. */
+    struct mf_subfield mac_sf = expr_resolve_field(&bind->mac);
+    ovs_assert(mac_sf.n_bits == 48);
+    union mf_subvalue mac_sv;
+    mf_read_subfield(&mac_sf, uflow, &mac_sv);
+
+    const struct ovntrace_mac_binding *binding
+        = ovntrace_mac_binding_find_mac_ip(dp, port_key, &ip, mac_sv.mac);
+
+    struct mf_subfield dst = expr_resolve_field(&bind->dst);
+    uint8_t val = 0;
+
+    if (binding) {
+        val = 1;
+        ovntrace_node_append(super, OVNTRACE_NODE_ACTION,
+                             "/* MAC binding to "ETH_ADDR_FMT" found. */",
+                             ETH_ADDR_ARGS(binding->mac));
+    } else {
+        ovntrace_node_append(super, OVNTRACE_NODE_ACTION,
+                             "/* lookup failed - No MAC binding. */");
+    }
+    union mf_subvalue sv = { .u8_val = val };
+    mf_write_subfield_flow(&dst, &sv, uflow);
+}
+
 static void
 execute_put_opts(const struct ovnact_put_opts *po,
                  const char *name, struct flow *uflow,
@@ -2072,6 +2133,14 @@ trace_actions(const struct ovnact *ovnacts, size_t ovnacts_len,
             /* Nothing to do for tracing. */
             break;

+        case OVNACT_LOOKUP_ARP:
+            execute_lookup_mac(ovnact_get_LOOKUP_ARP(a), dp, uflow, super);
+            break;
+
+        case OVNACT_LOOKUP_ND:
+            execute_lookup_mac(ovnact_get_LOOKUP_ND(a), dp, uflow, super);
+            break;
+
         case OVNACT_PUT_DHCPV4_OPTS:
             execute_put_dhcp_opts(ovnact_get_PUT_DHCPV4_OPTS(a),
                                   "put_dhcp_opts", uflow, super);
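
For readers following the lookup-before-learn idea, here is a minimal standalone sketch of the decision the new actions encode. It is not code from this patch: the names mb_entry, mb_table, mb_lookup, mb_put and mb_learn_if_required are hypothetical, the table is a plain array rather than the Southbound MAC_Binding table, and only IPv4 is modelled. It assumes, as the ovn-sb.xml text describes, that the lookup succeeds only when the exact (port, IP, MAC) triple is present, and that only a failed lookup needs the expensive put step.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for one MAC_Binding row. */
struct mb_entry {
    uint32_t port_key;      /* Logical port key. */
    uint32_t ip;            /* IPv4 address, host byte order. */
    uint8_t mac[6];
};

#define MB_MAX 16

struct mb_table {
    struct mb_entry rows[MB_MAX];
    size_t n;
};

/* Models lookup_arp: returns 1 only if the exact (port, ip, mac) triple
 * is already in the table, 0 otherwise. */
static int
mb_lookup(const struct mb_table *t, uint32_t port_key, uint32_t ip,
          const uint8_t mac[6])
{
    for (size_t i = 0; i < t->n; i++) {
        const struct mb_entry *e = &t->rows[i];
        if (e->port_key == port_key && e->ip == ip
            && !memcmp(e->mac, mac, 6)) {
            return 1;
        }
    }
    return 0;
}

/* Models put_arp: inserts a row, or updates the MAC of the row keyed by
 * (port, ip).  Returns false only if the table is full. */
static bool
mb_put(struct mb_table *t, uint32_t port_key, uint32_t ip,
       const uint8_t mac[6])
{
    for (size_t i = 0; i < t->n; i++) {
        struct mb_entry *e = &t->rows[i];
        if (e->port_key == port_key && e->ip == ip) {
            memcpy(e->mac, mac, 6);
            return true;
        }
    }
    if (t->n == MB_MAX) {
        return false;
    }
    struct mb_entry *e = &t->rows[t->n++];
    e->port_key = port_key;
    e->ip = ip;
    memcpy(e->mac, mac, 6);
    return true;
}

/* Returns true if the packet had to take the slow (put) path, false if
 * the lookup hit and nothing needed to be learned. */
static bool
mb_learn_if_required(struct mb_table *t, uint32_t port_key, uint32_t ip,
                     const uint8_t mac[6])
{
    if (mb_lookup(t, port_key, ip, mac)) {
        return false;
    }
    mb_put(t, port_key, ip, mac);
    return true;
}
```

Under these assumptions, a repeated GARP with an unchanged MAC returns false from mb_learn_if_required (no slow path), while the same IP announced with a different MAC misses the lookup and updates the existing row, which matches the sequence the new ovn.at test walks through.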