From patchwork Wed Sep 2 15:05:09 2020
X-Patchwork-Submitter: Dumitru Ceara
X-Patchwork-Id: 1355881
From: Dumitru Ceara
To: dev@openvswitch.org
Date: Wed, 2 Sep 2020 17:05:09 +0200
Message-Id: <20200902150504.20965.58557.stgit@dceara.remote.csb>
In-Reply-To: <20200902150447.20965.95083.stgit@dceara.remote.csb>
References: <20200902150447.20965.95083.stgit@dceara.remote.csb>
User-Agent: StGit/0.17.1-dirty
Subject: [ovs-dev] [PATCH v3 ovn 1/2] ovn-northd: Reduce number of flows generated for stateful ACLs.

Introduce two new stages in the logical switch pipeline:
- ls_in_acl_hint
- ls_out_acl_hint

Flows in these stages match on various combinations of conntrack flags to
determine how traffic might be processed in the ACL stage. Four possible
hints are set (more than one may be set at the same time per packet):
- REGBIT_ACL_HINT_ALLOW_NEW: the packet might match an allow-related ACL,
  in which case it will have to commit or update a connection in conntrack.
- REGBIT_ACL_HINT_ALLOW: the packet might match an allow-related ACL but
  the session already exists so no commit will be needed.
- REGBIT_ACL_HINT_DROP: the packet might match a drop/reject ACL and can
  simply be dropped or rejected; no conntrack commit will be needed.
- REGBIT_ACL_HINT_BLOCK: the packet might match a drop/reject ACL on a
  previously allowed connection, in which case it will have to commit or
  update the connection in conntrack.
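As an illustration only (not part of the patch), the hint assignment can be modelled in plain C and compared against the conntrack match expressions that the patch removes from consider_acl(). The names `struct ct_state`, `acl_hints()`, `old_matches()` and `count_hint_mismatches()` are hypothetical helpers invented for this sketch. The priorities and match conditions are copied from build_acl_hints() in the diff. The sketch also assumes that an untracked packet carries no other conntrack flags.

```c
#include <assert.h>
#include <stdbool.h>

/* Hint bits, standing in for reg0[7]..reg0[10] from the patch. */
enum acl_hint {
    HINT_ALLOW_NEW = 1 << 0,    /* REGBIT_ACL_HINT_ALLOW_NEW */
    HINT_ALLOW     = 1 << 1,    /* REGBIT_ACL_HINT_ALLOW */
    HINT_DROP      = 1 << 2,    /* REGBIT_ACL_HINT_DROP */
    HINT_BLOCK     = 1 << 3,    /* REGBIT_ACL_HINT_BLOCK */
};

/* Hypothetical stand-in for the conntrack state of one packet. */
struct ct_state {
    bool trk, new_, est, rpl, blocked;
};

/* Models the hint stage: the highest-priority matching flow wins. */
static unsigned
acl_hints(const struct ct_state *ct)
{
    if (ct->new_ && !ct->est) {                                 /* prio 7 */
        return HINT_ALLOW_NEW | HINT_DROP;
    }
    if (!ct->new_ && ct->est && !ct->rpl && ct->blocked) {      /* prio 6 */
        return HINT_ALLOW_NEW | HINT_DROP;
    }
    if (!ct->trk) {                                             /* prio 5 */
        return HINT_ALLOW | HINT_DROP;
    }
    if (!ct->new_ && ct->est && !ct->rpl && !ct->blocked) {     /* prio 4 */
        return HINT_ALLOW | HINT_BLOCK;
    }
    if (!ct->est) {                                             /* prio 3 */
        return HINT_DROP;
    }
    if (ct->est && ct->blocked) {                               /* prio 2 */
        return HINT_DROP;
    }
    if (ct->est && !ct->blocked) {                              /* prio 1 */
        return HINT_BLOCK;
    }
    return 0;                                                   /* prio 0 */
}

/* The conntrack matches used in the ACL stage before the patch, one bit
 * per ACL flow type. */
static unsigned
old_matches(const struct ct_state *ct)
{
    unsigned m = 0;

    if ((ct->new_ && !ct->est)
        || (!ct->new_ && ct->est && !ct->rpl && ct->blocked)) {
        m |= HINT_ALLOW_NEW;
    }
    if (!ct->trk || (!ct->new_ && ct->est && !ct->rpl && !ct->blocked)) {
        m |= HINT_ALLOW;
    }
    if (!ct->trk || !ct->est || (ct->est && ct->blocked)) {
        m |= HINT_DROP;
    }
    if (ct->est && !ct->blocked) {
        m |= HINT_BLOCK;
    }
    return m;
}

/* Compares both models over the untracked state and all 16 tracked
 * flag combinations; returns the number of states where they differ. */
static int
count_hint_mismatches(void)
{
    struct ct_state untracked = { false, false, false, false, false };
    int mismatches = acl_hints(&untracked) != old_matches(&untracked);

    for (unsigned bits = 0; bits < 16; bits++) {
        struct ct_state ct = {
            true, (bits & 1) != 0, (bits & 2) != 0,
            (bits & 4) != 0, (bits & 8) != 0,
        };
        if (acl_hints(&ct) != old_matches(&ct)) {
            mismatches++;
        }
    }
    return mismatches;
}
```

Under these assumptions, a hint bit is set exactly when the corresponding pre-patch conntrack match would have been true, so count_hint_mismatches() is expected to return 0 when called from a test harness.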

These hints are used in the ls_in_acl/ls_out_acl tables to simplify the
match expressions of the logical flows generated for ACLs. With fewer
disjunctions in each match, the number of openflows is reduced by a factor
of 2 for allow-related ACLs and by a factor of 3 for drop/reject ACLs.

Suggested-by: Han Zhou
Signed-off-by: Dumitru Ceara
Acked-by: Mark Michelson
---
NOTE: The "ovn -- ECMP symmetric reply" system test will fail with this
patch applied until the following patch that fixes the test is also merged:
http://patchwork.ozlabs.org/project/ovn/patch/1599033403-1659-1-git-send-email-dceara@redhat.com/
---
 northd/ovn-northd.8.xml | 134 ++++++++++++++++++++++++++++------
 northd/ovn-northd.c     | 186 +++++++++++++++++++++++++++++++++++------------
 tests/ovn-northd.at     |  26 +++----
 tests/ovn.at            |  58 +++++++--------
 tests/system-ovn.at     |   4 +
 5 files changed, 292 insertions(+), 116 deletions(-)

diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
index 989e364..226afc8 100644
--- a/northd/ovn-northd.8.xml
+++ b/northd/ovn-northd.8.xml
@@ -386,7 +386,86 @@ ct_next; action.

-

Ingress table 6: from-lport ACLs

+

Ingress Table 6: from-lport ACL hints

+ +

+ This table consists of logical flows that set hints + (reg0 bits) to be used in the next stage, in the ACL + processing table. Multiple hints can be set for the same packet. + The possible hints are: +

+
    +
  • + reg0[7]: the packet might match an + allow-related ACL and might have to commit the + connection to conntrack. +
  • +
  • + reg0[8]: the packet might match an + allow-related ACL but there will be no need to commit + the connection to conntrack because it already exists. +
  • +
  • + reg0[9]: the packet might match a + drop/reject ACL. +
  • +
  • + reg0[10]: the packet might match a + drop/reject ACL but the connection was previously + allowed so it might have to be committed again with + ct_label=1/1. +
  • +
+ +

+ The table contains the following flows: +

+
    +
  • + A priority-7 flow that matches on packets that initiate a new session. + This flow sets reg0[7] and reg0[9] and + then advances to the next table. +
  • +
  • + A priority-6 flow that matches on packets that are in the request + direction of an already existing session that has been marked + as blocked. This flow sets reg0[7] and + reg0[9] and then advances to the next table. +
  • +
  • + A priority-5 flow that matches untracked packets. This flow sets + reg0[8] and reg0[9] and then advances to + the next table. +
  • +
  • + A priority-4 flow that matches on packets that are in the request + direction of an already existing session that has not been marked + as blocked. This flow sets reg0[8] and + reg0[10] and then advances to the next table. +
  • +
  • + A priority-3 flow that matches on packets that are not part of + established sessions. This flow sets reg0[9] and then + advances to the next table. +
  • +
  • + A priority-2 flow that matches on packets that are part of an + established session that has been marked as blocked. + This flow sets reg0[9] and then advances to the next + table. +
  • +
  • + A priority-1 flow that matches on packets that are part of an + established session that has not been marked as blocked. + This flow sets reg0[10] and then advances to the next + table. +
  • +
  • + A priority-0 flow to advance to the next table. +
  • +
+ +

Ingress table 7: from-lport ACLs

Logical flows in this table closely reproduce those in the @@ -494,7 +573,7 @@ -

Ingress Table 7: from-lport QoS Marking

+

Ingress Table 8: from-lport QoS Marking

Logical flows in this table closely reproduce those in the @@ -516,7 +595,7 @@ -

Ingress Table 8: from-lport QoS Meter

+

Ingress Table 9: from-lport QoS Meter

Logical flows in this table closely reproduce those in the @@ -538,7 +617,7 @@ -

Ingress Table 9: LB

+

Ingress Table 10: LB

It contains a priority-0 flow that simply moves traffic to the next @@ -564,7 +643,7 @@ connection.)

-

Ingress Table 10: Stateful

+

Ingress Table 11: Stateful

  • @@ -612,7 +691,7 @@
-

Ingress Table 11: Pre-Hairpin

+

Ingress Table 12: Pre-Hairpin

  • For all configured load balancer VIPs a priority-2 flow that @@ -632,7 +711,7 @@
-

Ingress Table 12: Hairpin

+

Ingress Table 13: Hairpin

  • A priority-1 flow that hairpins traffic matched by non-default @@ -645,7 +724,7 @@
-

Ingress Table 13: ARP/ND responder

+

Ingress Table 14: ARP/ND responder

This table implements ARP/ND responder in a logical switch for known @@ -930,7 +1009,7 @@ output; -

Ingress Table 14: DHCP option processing

+

Ingress Table 15: DHCP option processing

This table adds the DHCPv4 options to a DHCPv4 packet from the @@ -987,11 +1066,11 @@ next;

  • - A priority-0 flow that matches all packets to advances to table 15. + A priority-0 flow that matches all packets and advances to table 16.
  • -

    Ingress Table 15: DHCP responses

    +

    Ingress Table 16: DHCP responses

    This table implements DHCP responder for the DHCP replies generated by @@ -1068,11 +1147,11 @@ output;

  • - A priority-0 flow that matches all packets to advances to table 16. + A priority-0 flow that matches all packets and advances to table 17.
  • -

    Ingress Table 16 DNS Lookup

    +

    Ingress Table 17 DNS Lookup

    This table looks up and resolves the DNS names to the corresponding @@ -1101,7 +1180,7 @@ reg0[4] = dns_lookup(); next; -

    Ingress Table 17 DNS Responses

    +

    Ingress Table 18 DNS Responses

    This table implements DNS responder for the DNS replies generated by @@ -1136,7 +1215,7 @@ output; -

    Ingress table 18 External ports

    +

    Ingress table 19 External ports

    Traffic from the external logical ports enter the ingress @@ -1175,11 +1254,11 @@ output;

  • - A priority-0 flow that matches all packets to advances to table 19. + A priority-0 flow that matches all packets and advances to table 20.
  • -

    Ingress Table 19 Destination Lookup

    +

    Ingress Table 20 Destination Lookup

    This table implements switching behavior. It contains these logical @@ -1412,7 +1491,12 @@ output; This is similar to ingress table LB.

    -

    Egress Table 4: to-lport ACLs

    +

    Egress Table 4: to-lport ACL hints

    +

    + This is similar to ingress table ACL hints. +

    + +

    Egress Table 5: to-lport ACLs

    This is similar to ingress table ACLs except for @@ -1427,14 +1511,14 @@ output; A priority 34000 logical flow is added for each logical port which has DHCPv4 options defined to allow the DHCPv4 reply packet and which has DHCPv6 options defined to allow the DHCPv6 reply packet from the - Ingress Table 15: DHCP responses. + Ingress Table 16: DHCP responses.

  • A priority 34000 logical flow is added for each logical switch datapath configured with DNS records with the match udp.dst = 53 to allow the DNS reply packet from the - Ingress Table 17: DNS responses. + Ingress Table 18: DNS responses.
  • @@ -1449,28 +1533,28 @@ output;
  • -

    Egress Table 5: to-lport QoS Marking

    +

    Egress Table 6: to-lport QoS Marking

    This is similar to ingress table QoS marking except they apply to to-lport QoS rules.

    -

    Egress Table 6: to-lport QoS Meter

    +

    Egress Table 7: to-lport QoS Meter

    This is similar to ingress table QoS meter except they apply to to-lport QoS rules.

    -

    Egress Table 7: Stateful

    +

    Egress Table 8: Stateful

    This is similar to ingress table Stateful except that there are no rules added for load balancing new connections.

    -

    Egress Table 8: Egress Port Security - IP

    +

    Egress Table 9: Egress Port Security - IP

    This is similar to the port security logic in table @@ -1480,7 +1564,7 @@ output; ip4.src and ip6.src

    -

    Egress Table 9: Egress Port Security - L2

    +

    Egress Table 10: Egress Port Security - L2

    This is similar to the ingress port security logic in ingress table
diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index 7be0e85..2025446 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -138,32 +138,34 @@ enum ovn_stage {
     PIPELINE_STAGE(SWITCH, IN, PRE_ACL, 3, "ls_in_pre_acl") \
     PIPELINE_STAGE(SWITCH, IN, PRE_LB, 4, "ls_in_pre_lb") \
     PIPELINE_STAGE(SWITCH, IN, PRE_STATEFUL, 5, "ls_in_pre_stateful") \
-    PIPELINE_STAGE(SWITCH, IN, ACL, 6, "ls_in_acl") \
-    PIPELINE_STAGE(SWITCH, IN, QOS_MARK, 7, "ls_in_qos_mark") \
-    PIPELINE_STAGE(SWITCH, IN, QOS_METER, 8, "ls_in_qos_meter") \
-    PIPELINE_STAGE(SWITCH, IN, LB, 9, "ls_in_lb") \
-    PIPELINE_STAGE(SWITCH, IN, STATEFUL, 10, "ls_in_stateful") \
-    PIPELINE_STAGE(SWITCH, IN, PRE_HAIRPIN, 11, "ls_in_pre_hairpin") \
-    PIPELINE_STAGE(SWITCH, IN, HAIRPIN, 12, "ls_in_hairpin") \
-    PIPELINE_STAGE(SWITCH, IN, ARP_ND_RSP, 13, "ls_in_arp_rsp") \
-    PIPELINE_STAGE(SWITCH, IN, DHCP_OPTIONS, 14, "ls_in_dhcp_options") \
-    PIPELINE_STAGE(SWITCH, IN, DHCP_RESPONSE, 15, "ls_in_dhcp_response") \
-    PIPELINE_STAGE(SWITCH, IN, DNS_LOOKUP, 16, "ls_in_dns_lookup") \
-    PIPELINE_STAGE(SWITCH, IN, DNS_RESPONSE, 17, "ls_in_dns_response") \
-    PIPELINE_STAGE(SWITCH, IN, EXTERNAL_PORT, 18, "ls_in_external_port") \
-    PIPELINE_STAGE(SWITCH, IN, L2_LKUP, 19, "ls_in_l2_lkup") \
+    PIPELINE_STAGE(SWITCH, IN, ACL_HINT, 6, "ls_in_acl_hint") \
+    PIPELINE_STAGE(SWITCH, IN, ACL, 7, "ls_in_acl") \
+    PIPELINE_STAGE(SWITCH, IN, QOS_MARK, 8, "ls_in_qos_mark") \
+    PIPELINE_STAGE(SWITCH, IN, QOS_METER, 9, "ls_in_qos_meter") \
+    PIPELINE_STAGE(SWITCH, IN, LB, 10, "ls_in_lb") \
+    PIPELINE_STAGE(SWITCH, IN, STATEFUL, 11, "ls_in_stateful") \
+    PIPELINE_STAGE(SWITCH, IN, PRE_HAIRPIN, 12, "ls_in_pre_hairpin") \
+    PIPELINE_STAGE(SWITCH, IN, HAIRPIN, 13, "ls_in_hairpin") \
+    PIPELINE_STAGE(SWITCH, IN, ARP_ND_RSP, 14, "ls_in_arp_rsp") \
+    PIPELINE_STAGE(SWITCH, IN, DHCP_OPTIONS, 15, "ls_in_dhcp_options") \
+    PIPELINE_STAGE(SWITCH, IN, DHCP_RESPONSE, 16, "ls_in_dhcp_response") \
+    PIPELINE_STAGE(SWITCH, IN, DNS_LOOKUP, 17, "ls_in_dns_lookup") \
+    PIPELINE_STAGE(SWITCH, IN, DNS_RESPONSE, 18, "ls_in_dns_response") \
+    PIPELINE_STAGE(SWITCH, IN, EXTERNAL_PORT, 19, "ls_in_external_port") \
+    PIPELINE_STAGE(SWITCH, IN, L2_LKUP, 20, "ls_in_l2_lkup") \
     \
     /* Logical switch egress stages. */ \
     PIPELINE_STAGE(SWITCH, OUT, PRE_LB, 0, "ls_out_pre_lb") \
     PIPELINE_STAGE(SWITCH, OUT, PRE_ACL, 1, "ls_out_pre_acl") \
     PIPELINE_STAGE(SWITCH, OUT, PRE_STATEFUL, 2, "ls_out_pre_stateful") \
     PIPELINE_STAGE(SWITCH, OUT, LB, 3, "ls_out_lb") \
-    PIPELINE_STAGE(SWITCH, OUT, ACL, 4, "ls_out_acl") \
-    PIPELINE_STAGE(SWITCH, OUT, QOS_MARK, 5, "ls_out_qos_mark") \
-    PIPELINE_STAGE(SWITCH, OUT, QOS_METER, 6, "ls_out_qos_meter") \
-    PIPELINE_STAGE(SWITCH, OUT, STATEFUL, 7, "ls_out_stateful") \
-    PIPELINE_STAGE(SWITCH, OUT, PORT_SEC_IP, 8, "ls_out_port_sec_ip") \
-    PIPELINE_STAGE(SWITCH, OUT, PORT_SEC_L2, 9, "ls_out_port_sec_l2") \
+    PIPELINE_STAGE(SWITCH, OUT, ACL_HINT, 4, "ls_out_acl_hint") \
+    PIPELINE_STAGE(SWITCH, OUT, ACL, 5, "ls_out_acl") \
+    PIPELINE_STAGE(SWITCH, OUT, QOS_MARK, 6, "ls_out_qos_mark") \
+    PIPELINE_STAGE(SWITCH, OUT, QOS_METER, 7, "ls_out_qos_meter") \
+    PIPELINE_STAGE(SWITCH, OUT, STATEFUL, 8, "ls_out_stateful") \
+    PIPELINE_STAGE(SWITCH, OUT, PORT_SEC_IP, 9, "ls_out_port_sec_ip") \
+    PIPELINE_STAGE(SWITCH, OUT, PORT_SEC_L2, 10, "ls_out_port_sec_l2") \
     \
     /* Logical router ingress stages. */ \
     PIPELINE_STAGE(ROUTER, IN, ADMISSION, 0, "lr_in_admission") \
@@ -205,13 +207,17 @@ enum ovn_stage {
 #define OVN_ACL_PRI_OFFSET 1000
 /* Register definitions specific to switches. */
-#define REGBIT_CONNTRACK_DEFRAG "reg0[0]"
-#define REGBIT_CONNTRACK_COMMIT "reg0[1]"
-#define REGBIT_CONNTRACK_NAT "reg0[2]"
-#define REGBIT_DHCP_OPTS_RESULT "reg0[3]"
-#define REGBIT_DNS_LOOKUP_RESULT "reg0[4]"
-#define REGBIT_ND_RA_OPTS_RESULT "reg0[5]"
-#define REGBIT_HAIRPIN "reg0[6]"
+#define REGBIT_CONNTRACK_DEFRAG   "reg0[0]"
+#define REGBIT_CONNTRACK_COMMIT   "reg0[1]"
+#define REGBIT_CONNTRACK_NAT      "reg0[2]"
+#define REGBIT_DHCP_OPTS_RESULT   "reg0[3]"
+#define REGBIT_DNS_LOOKUP_RESULT  "reg0[4]"
+#define REGBIT_ND_RA_OPTS_RESULT  "reg0[5]"
+#define REGBIT_HAIRPIN            "reg0[6]"
+#define REGBIT_ACL_HINT_ALLOW_NEW "reg0[7]"
+#define REGBIT_ACL_HINT_ALLOW     "reg0[8]"
+#define REGBIT_ACL_HINT_DROP      "reg0[9]"
+#define REGBIT_ACL_HINT_BLOCK     "reg0[10]"
 /* Register definitions for switches and routers. */
@@ -246,11 +252,12 @@ enum ovn_stage {
  * OVS register usage:
  *
  * Logical Switch pipeline:
- * +---------+-------------------------------------+
- * | R0      | REGBIT_{CONNTRACK/DHCP/DNS/HAIRPIN} |
- * +---------+-------------------------------------+
- * | R1 - R9 |               UNUSED                |
- * +---------+-------------------------------------+
+ * +---------+----------------------------------------------+
+ * | R0      |     REGBIT_{CONNTRACK/DHCP/DNS/HAIRPIN}      |
+ * |         | REGBIT_ACL_HINT_{ALLOW_NEW/ALLOW/DROP/BLOCK} |
+ * +---------+----------------------------------------------+
+ * | R1 - R9 |                    UNUSED                    |
+ * +---------+----------------------------------------------+
  *
  * Logical Router pipeline:
  * +-----+--------------------------+---+-----------------+---+---------------+
@@ -5140,6 +5147,96 @@ build_pre_stateful(struct ovn_datapath *od, struct hmap *lflows)
 }
 static void
+build_acl_hints(struct ovn_datapath *od, struct hmap *lflows)
+{
+    /* This stage builds hints for the IN/OUT_ACL stage. Based on various
+     * combinations of ct flags packets may hit only a subset of the logical
+     * flows in the IN/OUT_ACL stage.
+     *
+     * Populating ACL hints first and storing them in registers simplifies
+     * the logical flow match expressions in the IN/OUT_ACL stage and
+     * generates fewer openflows.
+     *
+     * Certain combinations of ct flags might be valid matches for multiple
+     * types of ACL logical flows (e.g., allow/drop). In such cases hints
+     * corresponding to all potential matches are set.
+     */
+
+    enum ovn_stage stages[] = {
+        S_SWITCH_IN_ACL_HINT,
+        S_SWITCH_OUT_ACL_HINT,
+    };
+
+    for (size_t i = 0; i < ARRAY_SIZE(stages); i++) {
+        enum ovn_stage stage = stages[i];
+
+        /* New, not already established connections, may hit either allow
+         * or drop ACLs. For allow ACLs, the connection must also be committed
+         * to conntrack so we set REGBIT_ACL_HINT_ALLOW_NEW.
+         */
+        ovn_lflow_add(lflows, od, stage, 7, "ct.new && !ct.est",
+                      REGBIT_ACL_HINT_ALLOW_NEW " = 1; "
+                      REGBIT_ACL_HINT_DROP " = 1; "
+                      "next;");
+
+        /* Already established connections in the "request" direction that
+         * are already marked as "blocked" may hit either:
+         * - allow ACLs for connections that were previously allowed by a
+         *   policy that was deleted and is being readded now. In this case
+         *   the connection should be recommitted so we set
+         *   REGBIT_ACL_HINT_ALLOW_NEW.
+         * - drop ACLs.
+         */
+        ovn_lflow_add(lflows, od, stage, 6,
+                      "!ct.new && ct.est && !ct.rpl && ct_label.blocked == 1",
+                      REGBIT_ACL_HINT_ALLOW_NEW " = 1; "
+                      REGBIT_ACL_HINT_DROP " = 1; "
+                      "next;");
+
+        /* Not tracked traffic can either be allowed or dropped. */
+        ovn_lflow_add(lflows, od, stage, 5, "!ct.trk",
+                      REGBIT_ACL_HINT_ALLOW " = 1; "
+                      REGBIT_ACL_HINT_DROP " = 1; "
+                      "next;");
+
+        /* Already established connections in the "request" direction may hit
+         * either:
+         * - allow ACLs in which case the traffic should be allowed so we set
+         *   REGBIT_ACL_HINT_ALLOW.
+         * - drop ACLs in which case the traffic should be blocked and the
+         *   connection must be committed with ct_label.blocked set so we set
+         *   REGBIT_ACL_HINT_BLOCK.
+         */
+        ovn_lflow_add(lflows, od, stage, 4,
+                      "!ct.new && ct.est && !ct.rpl && ct_label.blocked == 0",
+                      REGBIT_ACL_HINT_ALLOW " = 1; "
+                      REGBIT_ACL_HINT_BLOCK " = 1; "
+                      "next;");
+
+        /* Not established or established and already blocked connections may
+         * hit drop ACLs.
+         */
+        ovn_lflow_add(lflows, od, stage, 3, "!ct.est",
+                      REGBIT_ACL_HINT_DROP " = 1; "
+                      "next;");
+        ovn_lflow_add(lflows, od, stage, 2, "ct.est && ct_label.blocked == 1",
+                      REGBIT_ACL_HINT_DROP " = 1; "
+                      "next;");
+
+        /* Established connections that were previously allowed might hit
+         * drop ACLs in which case the connection must be committed with
+         * ct_label.blocked set.
+         */
+        ovn_lflow_add(lflows, od, stage, 1, "ct.est && ct_label.blocked == 0",
+                      REGBIT_ACL_HINT_BLOCK " = 1; "
+                      "next;");
+
+        /* In any case, advance to the next stage. */
+        ovn_lflow_add(lflows, od, stage, 0, "1", "next;");
+    }
+}
+
+static void
 build_acl_log(struct ds *actions, const struct nbrec_acl *acl)
 {
     if (!acl->log) {
@@ -5197,7 +5294,7 @@ build_reject_acl_rules(struct ovn_datapath *od, struct hmap *lflows,
                   "eth.dst <-> eth.src; ip4.dst <-> ip4.src; "
                   "tcp_reset { outport <-> inport; %s };",
                   ingress ? "next(pipeline=egress,table=5);"
-                          : "next(pipeline=ingress,table=19);");
+                          : "next(pipeline=ingress,table=20);");
     ovn_lflow_add_with_hint(lflows, od, stage,
                             acl->priority + OVN_ACL_PRI_OFFSET + 10,
                             ds_cstr(&match), ds_cstr(&actions), stage_hint);
@@ -5212,7 +5309,7 @@ build_reject_acl_rules(struct ovn_datapath *od, struct hmap *lflows,
                   "eth.dst <-> eth.src; ip6.dst <-> ip6.src; "
                   "tcp_reset { outport <-> inport; %s };",
                   ingress ? "next(pipeline=egress,table=5);"
-                          : "next(pipeline=ingress,table=19);");
+                          : "next(pipeline=ingress,table=20);");
     ovn_lflow_add_with_hint(lflows, od, stage,
                             acl->priority + OVN_ACL_PRI_OFFSET + 10,
                             ds_cstr(&match), ds_cstr(&actions), stage_hint);
@@ -5232,7 +5329,7 @@ build_reject_acl_rules(struct ovn_datapath *od, struct hmap *lflows,
                   "icmp4 { eth.dst <-> eth.src; ip4.dst <-> ip4.src; "
                   "outport <-> inport; %s };",
                   ingress ? "next(pipeline=egress,table=5);"
-                          : "next(pipeline=ingress,table=19);");
+                          : "next(pipeline=ingress,table=20);");
     ovn_lflow_add_with_hint(lflows, od, stage,
                             acl->priority + OVN_ACL_PRI_OFFSET,
                             ds_cstr(&match), ds_cstr(&actions), stage_hint);
@@ -5250,7 +5347,7 @@ build_reject_acl_rules(struct ovn_datapath *od, struct hmap *lflows,
                   "eth.dst <-> eth.src; ip6.dst <-> ip6.src; "
                   "outport <-> inport; %s };",
                   ingress ? "next(pipeline=egress,table=5);"
-                          : "next(pipeline=ingress,table=19);");
+                          : "next(pipeline=ingress,table=20);");
     ovn_lflow_add_with_hint(lflows, od, stage,
                             acl->priority + OVN_ACL_PRI_OFFSET,
                             ds_cstr(&match), ds_cstr(&actions), stage_hint);
@@ -5298,10 +5395,8 @@ consider_acl(struct hmap *lflows, struct ovn_datapath *od,
              * by ct_commit in the "stateful" stage) to indicate that the
              * connection should be allowed to resume.
              */
-            ds_put_format(&match, "((ct.new && !ct.est)"
-                                  " || (!ct.new && ct.est && !ct.rpl "
-                                       "&& ct_label.blocked == 1)) "
-                                  "&& (%s)", acl->match);
+            ds_put_format(&match, REGBIT_ACL_HINT_ALLOW_NEW " == 1 && (%s)",
+                          acl->match);
             ds_put_cstr(&actions, REGBIT_CONNTRACK_COMMIT" = 1; ");
             build_acl_log(&actions, acl);
             ds_put_cstr(&actions, "next;");
@@ -5319,9 +5414,7 @@ consider_acl(struct hmap *lflows, struct ovn_datapath *od,
              * policy. Match untracked packets too. */
             ds_clear(&match);
             ds_clear(&actions);
-            ds_put_format(&match,
-                          "(!ct.trk || (!ct.new && ct.est && !ct.rpl"
-                          " && ct_label.blocked == 0)) && (%s)",
+            ds_put_format(&match, REGBIT_ACL_HINT_ALLOW " == 1 && (%s)",
                           acl->match);
             build_acl_log(&actions, acl);
@@ -5346,9 +5439,7 @@ consider_acl(struct hmap *lflows, struct ovn_datapath *od,
         if (has_stateful) {
             /* If the packet is not tracked or not part of an established
              * connection, then we can simply reject/drop it. */
-            ds_put_cstr(&match,
-                        "(!ct.trk || !ct.est"
-                        " || (ct.est && ct_label.blocked == 1))");
+            ds_put_cstr(&match, REGBIT_ACL_HINT_DROP " == 1");
             if (!strcmp(acl->action, "reject")) {
                 build_reject_acl_rules(od, lflows, stage, acl, &match,
                                        &actions, &acl->header_);
@@ -5374,7 +5465,7 @@ consider_acl(struct hmap *lflows, struct ovn_datapath *od,
              */
             ds_clear(&match);
             ds_clear(&actions);
-            ds_put_cstr(&match, "ct.est && ct_label.blocked == 0");
+            ds_put_cstr(&match, REGBIT_ACL_HINT_BLOCK " == 1");
             ds_put_cstr(&actions, "ct_commit { ct_label.blocked = 1; }; ");
             if (!strcmp(acl->action, "reject")) {
                 build_reject_acl_rules(od, lflows, stage, acl, &match,
@@ -6621,6 +6712,7 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
         build_pre_acls(od, lflows);
         build_pre_lb(od, lflows, meter_groups, lbs);
         build_pre_stateful(od, lflows);
+        build_acl_hints(od, lflows);
         build_acls(od, lflows, port_groups);
         build_qos(od, lflows);
         build_lb(od, lflows);
diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index 8344c7f..87644bd 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -1185,7 +1185,7 @@ ovn-nbctl --wait=sb ls-lb-add sw0 lb1
 ovn-sbctl dump-flows sw0 | grep ct_lb | grep priority=120 > lflows.txt
 AT_CHECK([cat lflows.txt], [0], [dnl
-  table=10(ls_in_stateful ), priority=120 , match=(ct.new && ip4.dst == 10.0.0.10 && tcp.dst == 80), action=(ct_lb(backends=10.0.0.3:80,20.0.0.3:80);)
+  table=11(ls_in_stateful ), priority=120 , match=(ct.new && ip4.dst == 10.0.0.10 && tcp.dst == 80), action=(ct_lb(backends=10.0.0.3:80,20.0.0.3:80);)
 ])
 # Delete the Load_Balancer_Health_Check
@@ -1194,7 +1194,7 @@ OVS_WAIT_UNTIL([test 0 = `ovn-sbctl list service_monitor | wc -l`])
 ovn-sbctl dump-flows sw0 | grep ct_lb | grep priority=120 > lflows.txt
 AT_CHECK([cat lflows.txt], [0], [dnl
-  table=10(ls_in_stateful ), priority=120 , match=(ct.new && ip4.dst == 10.0.0.10 && tcp.dst == 80), action=(ct_lb(backends=10.0.0.3:80,20.0.0.3:80);)
+  table=11(ls_in_stateful ), priority=120 , match=(ct.new && ip4.dst == 10.0.0.10 && tcp.dst == 80), action=(ct_lb(backends=10.0.0.3:80,20.0.0.3:80);)
 ])
 # Create the Load_Balancer_Health_Check again.
@@ -1207,7 +1207,7 @@ service_monitor | sed '/^$/d' | wc -l`])
 ovn-sbctl dump-flows sw0 | grep ct_lb | grep priority=120 > lflows.txt
 AT_CHECK([cat lflows.txt], [0], [dnl
-  table=10(ls_in_stateful ), priority=120 , match=(ct.new && ip4.dst == 10.0.0.10 && tcp.dst == 80), action=(ct_lb(backends=10.0.0.3:80,20.0.0.3:80);)
+  table=11(ls_in_stateful ), priority=120 , match=(ct.new && ip4.dst == 10.0.0.10 && tcp.dst == 80), action=(ct_lb(backends=10.0.0.3:80,20.0.0.3:80);)
 ])
 # Get the uuid of both the service_monitor
@@ -1223,7 +1223,7 @@ OVS_WAIT_UNTIL([
 ovn-sbctl dump-flows sw0 | grep ct_lb | grep priority=120 > lflows.txt
 AT_CHECK([cat lflows.txt], [0], [dnl
-  table=10(ls_in_stateful ), priority=120 , match=(ct.new && ip4.dst == 10.0.0.10 && tcp.dst == 80), action=(ct_lb(backends=10.0.0.3:80);)
+  table=11(ls_in_stateful ), priority=120 , match=(ct.new && ip4.dst == 10.0.0.10 && tcp.dst == 80), action=(ct_lb(backends=10.0.0.3:80);)
 ])
 # Set the service monitor for sw0-p1 to offline
@@ -1240,7 +1240,7 @@ AT_CHECK([cat lflows.txt], [0], [dnl
 ovn-sbctl dump-flows sw0 | grep "ip4.dst == 10.0.0.10 && tcp.dst == 80" \
 | grep priority=120 > lflows.txt
 AT_CHECK([cat lflows.txt], [0], [dnl
-  table=10(ls_in_stateful ), priority=120 , match=(ct.new && ip4.dst == 10.0.0.10 && tcp.dst == 80), action=(drop;)
+  table=11(ls_in_stateful ), priority=120 , match=(ct.new && ip4.dst == 10.0.0.10 && tcp.dst == 80), action=(drop;)
 ])
 # Set the service monitor for sw0-p1 and sw1-p1 to online
@@ -1253,7 +1253,7 @@ OVS_WAIT_UNTIL([
 ovn-sbctl dump-flows sw0 | grep ct_lb | grep priority=120 > lflows.txt
 AT_CHECK([cat lflows.txt], [0], [dnl
-  table=10(ls_in_stateful ), priority=120 , match=(ct.new && ip4.dst == 10.0.0.10 && tcp.dst == 80), action=(ct_lb(backends=10.0.0.3:80,20.0.0.3:80);)
+  table=11(ls_in_stateful ), priority=120 , match=(ct.new && ip4.dst == 10.0.0.10 && tcp.dst == 80), action=(ct_lb(backends=10.0.0.3:80,20.0.0.3:80);)
 ])
 # Set the service monitor for sw1-p1 to error
@@ -1265,7 +1265,7 @@ OVS_WAIT_UNTIL([
 ovn-sbctl dump-flows sw0 | grep "ip4.dst == 10.0.0.10 && tcp.dst == 80" \
 | grep priority=120 > lflows.txt
 AT_CHECK([cat lflows.txt], [0], [dnl
-  table=10(ls_in_stateful ), priority=120 , match=(ct.new && ip4.dst == 10.0.0.10 && tcp.dst == 80), action=(ct_lb(backends=10.0.0.3:80);)
+  table=11(ls_in_stateful ), priority=120 , match=(ct.new && ip4.dst == 10.0.0.10 && tcp.dst == 80), action=(ct_lb(backends=10.0.0.3:80);)
 ])
 # Add one more vip to lb1
@@ -1295,8 +1295,8 @@ service_monitor port=1000 | sed '/^$/d' | wc -l`])
 ovn-sbctl dump-flows sw0 | grep ct_lb | grep priority=120 > lflows.txt
 AT_CHECK([cat lflows.txt], [0], [dnl
-  table=10(ls_in_stateful ), priority=120 , match=(ct.new && ip4.dst == 10.0.0.10 && tcp.dst == 80), action=(ct_lb(backends=10.0.0.3:80);)
-  table=10(ls_in_stateful ), priority=120 , match=(ct.new && ip4.dst == 10.0.0.40 && tcp.dst == 1000), action=(ct_lb(backends=10.0.0.3:1000);)
+  table=11(ls_in_stateful ), priority=120 , match=(ct.new && ip4.dst == 10.0.0.10 && tcp.dst == 80), action=(ct_lb(backends=10.0.0.3:80);)
+  table=11(ls_in_stateful ), priority=120 , match=(ct.new && ip4.dst == 10.0.0.40 && tcp.dst == 1000), action=(ct_lb(backends=10.0.0.3:1000);)
 ])
 # Set the service monitor for sw1-p1 to online
@@ -1308,16 +1308,16 @@ OVS_WAIT_UNTIL([
 ovn-sbctl dump-flows sw0 | grep ct_lb | grep priority=120 > lflows.txt
 AT_CHECK([cat lflows.txt], [0], [dnl
-  table=10(ls_in_stateful ), priority=120 , match=(ct.new && ip4.dst == 10.0.0.10 && tcp.dst == 80), action=(ct_lb(backends=10.0.0.3:80,20.0.0.3:80);)
-  table=10(ls_in_stateful ), priority=120 , match=(ct.new && ip4.dst == 10.0.0.40 && tcp.dst == 1000), action=(ct_lb(backends=10.0.0.3:1000,20.0.0.3:80);)
+  table=11(ls_in_stateful ), priority=120 , match=(ct.new && ip4.dst == 10.0.0.10 && tcp.dst == 80), action=(ct_lb(backends=10.0.0.3:80,20.0.0.3:80);)
+  table=11(ls_in_stateful ), priority=120 , match=(ct.new && ip4.dst == 10.0.0.40 && tcp.dst == 1000), action=(ct_lb(backends=10.0.0.3:1000,20.0.0.3:80);)
 ])
 # Associate lb1 to sw1
 ovn-nbctl --wait=sb ls-lb-add sw1 lb1
 ovn-sbctl dump-flows sw1 | grep ct_lb | grep priority=120 > lflows.txt
 AT_CHECK([cat lflows.txt], [0], [dnl
-  table=10(ls_in_stateful ), priority=120 , match=(ct.new && ip4.dst == 10.0.0.10 && tcp.dst == 80), action=(ct_lb(backends=10.0.0.3:80,20.0.0.3:80);)
-  table=10(ls_in_stateful ), priority=120 , match=(ct.new && ip4.dst == 10.0.0.40 && tcp.dst == 1000), action=(ct_lb(backends=10.0.0.3:1000,20.0.0.3:80);)
+  table=11(ls_in_stateful ), priority=120 , match=(ct.new && ip4.dst == 10.0.0.10 && tcp.dst == 80), action=(ct_lb(backends=10.0.0.3:80,20.0.0.3:80);)
+  table=11(ls_in_stateful ), priority=120 , match=(ct.new && ip4.dst == 10.0.0.40 && tcp.dst == 1000), action=(ct_lb(backends=10.0.0.3:1000,20.0.0.3:80);)
 ])
 # Now create lb2 same as lb1 but udp protocol.
diff --git a/tests/ovn.at b/tests/ovn.at
index 5ad51c0..99861bf 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -14237,17 +14237,17 @@ ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
 AT_CHECK([ovn-sbctl dump-flows ls1 | grep "offerip = 10.0.0.6" | \
 wc -l], [0], [0
 ])
-AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=22 | \
+AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=23 | \
 grep controller | grep "0a.00.00.06" | wc -l], [0], [0
 ])
-AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=22 | \
+AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=23 | \
 grep controller | grep "0a.00.00.06" | wc -l], [0], [0
 ])
-AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=22 | \
+AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=23 | \
 grep controller | grep tp_src=546 | grep \
 "ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | wc -l], [0], [0
 ])
-AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=22 | \
+AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=23 | \
 grep controller | grep tp_src=546 | grep \
 "ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | wc -l], [0], [0
 ])
@@ -14278,17 +14278,17 @@ port_binding logical_port=ls1-lp_ext1`
 # No DHCPv4/v6 flows for the external port - ls1-lp_ext1 - 10.0.0.6 in hv1 and hv2
 # as no localnet port added to ls1 yet.
-AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=22 | \
+AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=23 | \
 grep controller | grep "0a.00.00.06" | wc -l], [0], [0
 ])
-AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=22 | \
+AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=23 | \
 grep controller | grep "0a.00.00.06" | wc -l], [0], [0
 ])
-AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=22 | \
+AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=23 | \
 grep controller | grep tp_src=546 | grep \
 "ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | wc -l], [0], [0
 ])
-AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=22 | \
+AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=23 | \
 grep controller | grep tp_src=546 | grep \
 "ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | wc -l], [0], [0
 ])
@@ -14310,38 +14310,38 @@ logical_port=ls1-lp_ext1`
 test "$chassis" = "$hv1_uuid"])
 # There should be DHCPv4/v6 OF flows for the ls1-lp_ext1 port in hv1
-AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=22 | \
+AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=23 | \
 grep controller | grep "0a.00.00.06" | grep reg14=0x$ln_public_key | \
 wc -l], [0], [3
 ])
-AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=22 | \
+AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=23 | \
 grep controller | grep tp_src=546 | grep \
 "ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | \
 grep reg14=0x$ln_public_key | wc -l], [0], [1
 ])
 # There should be no DHCPv4/v6 flows for ls1-lp_ext1 on hv2
-AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=22 | \
+AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=23 | \
 grep controller | grep "0a.00.00.06" | wc -l], [0], [0
 ])
-AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=22 | \
+AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=23 | \
 grep controller | grep tp_src=546 | grep \
 "ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06"
| wc -l], [0], [0 ]) # No DHCPv4/v6 flows for the external port - ls1-lp_ext2 - 10.0.0.7 in hv1 and # hv2 as requested-chassis option is not set. -AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=22 | \ +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=23 | \ grep controller | grep "0a.00.00.07" | wc -l], [0], [0 ]) -AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=22 | \ +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=23 | \ grep controller | grep "0a.00.00.07" | wc -l], [0], [0 ]) -AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=22 | \ +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=23 | \ grep controller | grep tp_src=546 | grep \ "ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.07" | wc -l], [0], [0 ]) -AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=22 | \ +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=23 | \ grep controller | grep tp_src=546 | grep \ "ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.07" | wc -l], [0], [0 ]) @@ -14593,21 +14593,21 @@ logical_port=ls1-lp_ext1` test "$chassis" = "$hv2_uuid"]) # There should be OF flows for DHCP4/v6 for the ls1-lp_ext1 port in hv2 -AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=22 | \ +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=23 | \ grep controller | grep "0a.00.00.06" | grep reg14=0x$ln_public_key | \ wc -l], [0], [3 ]) -AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=22 | \ +AT_CHECK([as hv2 ovs-ofctl dump-flows br-int | grep table=23 | \ grep controller | grep tp_src=546 | grep \ "ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | \ grep reg14=0x$ln_public_key | wc -l], [0], [1 ]) # There should be no DHCPv4/v6 flows for ls1-lp_ext1 on hv1 -AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=22 | \ +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=23 | \ grep controller | grep "0a.00.00.06" | wc -l], [0], [0 ]) -AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=22 | \ 
+AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=23 | \ grep controller | grep tp_src=546 | grep \ "ae.70.00.00.00.00.00.00.00.00.00.00.00.00.00.06" | \ grep reg14=0x$ln_public_key | wc -l], [0], [0 @@ -14873,7 +14873,7 @@ logical_port=ls1-lp_ext1` # There should be a flow in hv2 to drop traffic from ls1-lp_ext1 destined # to router mac. AT_CHECK([as hv2 ovs-ofctl dump-flows br-int \ -table=26,dl_src=f0:00:00:00:00:03,dl_dst=a0:10:00:00:00:01 | \ +table=27,dl_src=f0:00:00:00:00:03,dl_dst=a0:10:00:00:00:01 | \ grep -c "actions=drop"], [0], [1 ]) @@ -16144,9 +16144,9 @@ ovn-nbctl --wait=hv sync ovn-sbctl dump-flows sw0 | grep ls_in_arp_rsp | grep bind_vport > lflows.txt AT_CHECK([cat lflows.txt], [0], [dnl - table=13(ls_in_arp_rsp ), priority=100 , match=(inport == "sw0-p1" && ((arp.op == 1 && arp.spa == 10.0.0.10 && arp.tpa == 10.0.0.10) || (arp.op == 2 && arp.spa == 10.0.0.10))), action=(bind_vport("sw0-vir", inport); next;) - table=13(ls_in_arp_rsp ), priority=100 , match=(inport == "sw0-p2" && ((arp.op == 1 && arp.spa == 10.0.0.10 && arp.tpa == 10.0.0.10) || (arp.op == 2 && arp.spa == 10.0.0.10))), action=(bind_vport("sw0-vir", inport); next;) - table=13(ls_in_arp_rsp ), priority=100 , match=(inport == "sw0-p3" && ((arp.op == 1 && arp.spa == 10.0.0.10 && arp.tpa == 10.0.0.10) || (arp.op == 2 && arp.spa == 10.0.0.10))), action=(bind_vport("sw0-vir", inport); next;) + table=14(ls_in_arp_rsp ), priority=100 , match=(inport == "sw0-p1" && ((arp.op == 1 && arp.spa == 10.0.0.10 && arp.tpa == 10.0.0.10) || (arp.op == 2 && arp.spa == 10.0.0.10))), action=(bind_vport("sw0-vir", inport); next;) + table=14(ls_in_arp_rsp ), priority=100 , match=(inport == "sw0-p2" && ((arp.op == 1 && arp.spa == 10.0.0.10 && arp.tpa == 10.0.0.10) || (arp.op == 2 && arp.spa == 10.0.0.10))), action=(bind_vport("sw0-vir", inport); next;) + table=14(ls_in_arp_rsp ), priority=100 , match=(inport == "sw0-p3" && ((arp.op == 1 && arp.spa == 10.0.0.10 && arp.tpa == 10.0.0.10) || (arp.op == 2 
&& arp.spa == 10.0.0.10))), action=(bind_vport("sw0-vir", inport); next;) ]) ovn-sbctl dump-flows lr0 | grep lr_in_arp_resolve | grep "reg0 == 10.0.0.10" \ @@ -16356,8 +16356,8 @@ ovn-nbctl --wait=hv set logical_switch_port sw0-vir options:virtual-ip=10.0.0.10 ovn-sbctl dump-flows sw0 | grep ls_in_arp_rsp | grep bind_vport > lflows.txt AT_CHECK([cat lflows.txt], [0], [dnl - table=13(ls_in_arp_rsp ), priority=100 , match=(inport == "sw0-p1" && ((arp.op == 1 && arp.spa == 10.0.0.10 && arp.tpa == 10.0.0.10) || (arp.op == 2 && arp.spa == 10.0.0.10))), action=(bind_vport("sw0-vir", inport); next;) - table=13(ls_in_arp_rsp ), priority=100 , match=(inport == "sw0-p3" && ((arp.op == 1 && arp.spa == 10.0.0.10 && arp.tpa == 10.0.0.10) || (arp.op == 2 && arp.spa == 10.0.0.10))), action=(bind_vport("sw0-vir", inport); next;) + table=14(ls_in_arp_rsp ), priority=100 , match=(inport == "sw0-p1" && ((arp.op == 1 && arp.spa == 10.0.0.10 && arp.tpa == 10.0.0.10) || (arp.op == 2 && arp.spa == 10.0.0.10))), action=(bind_vport("sw0-vir", inport); next;) + table=14(ls_in_arp_rsp ), priority=100 , match=(inport == "sw0-p3" && ((arp.op == 1 && arp.spa == 10.0.0.10 && arp.tpa == 10.0.0.10) || (arp.op == 2 && arp.spa == 10.0.0.10))), action=(bind_vport("sw0-vir", inport); next;) ]) ovn-nbctl --wait=hv remove logical_switch_port sw0-vir options virtual-parents @@ -18340,7 +18340,7 @@ test_ip vif11 f00000000011 000001010203 $sip $dip vif-north OVN_CHECK_PACKETS_REMOVE_BROADCAST([hv4/vif-north-tx.pcap], [vif-north.expected]) # Confirm that packets did not go out via tunnel port. 
-AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=32 | grep NXM_NX_TUN_METADATA0 | grep n_packets=0 | wc -l], [0], [[0 +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep table=33 | grep NXM_NX_TUN_METADATA0 | grep n_packets=0 | wc -l], [0], [[0 ]]) # Confirm that packet went out via localnet port @@ -19087,7 +19087,7 @@ service_monitor | sed '/^$/d' | wc -l`]) ovn-sbctl dump-flows sw0 | grep ct_lb | grep priority=120 > lflows.txt AT_CHECK([cat lflows.txt], [0], [dnl - table=10(ls_in_stateful ), priority=120 , match=(ct.new && ip4.dst == 10.0.0.10 && tcp.dst == 80), action=(ct_lb(backends=10.0.0.3:80,20.0.0.3:80);) + table=11(ls_in_stateful ), priority=120 , match=(ct.new && ip4.dst == 10.0.0.10 && tcp.dst == 80), action=(ct_lb(backends=10.0.0.3:80,20.0.0.3:80);) ]) ovn-sbctl dump-flows lr0 | grep ct_lb | grep priority=120 > lflows.txt @@ -19125,7 +19125,7 @@ grep "405400000003${svc_mon_src_mac}" | wc -l`] ovn-sbctl dump-flows sw0 | grep "ip4.dst == 10.0.0.10 && tcp.dst == 80" \ | grep priority=120 > lflows.txt AT_CHECK([cat lflows.txt], [0], [dnl - table=10(ls_in_stateful ), priority=120 , match=(ct.new && ip4.dst == 10.0.0.10 && tcp.dst == 80), action=(drop;) + table=11(ls_in_stateful ), priority=120 , match=(ct.new && ip4.dst == 10.0.0.10 && tcp.dst == 80), action=(drop;) ]) ovn-sbctl dump-flows lr0 | grep lr_in_dnat | grep priority=120 > lflows.txt diff --git a/tests/system-ovn.at b/tests/system-ovn.at index 40ba6e4..b9b5eaa 100644 --- a/tests/system-ovn.at +++ b/tests/system-ovn.at @@ -2163,7 +2163,7 @@ tcp,orig=(src=172.16.1.2,dst=30.0.0.2,sport=,dport=),reply=(sr ]) check_est_flows () { - n=$(ovs-ofctl dump-flows br-int table=14 | grep \ + n=$(ovs-ofctl dump-flows br-int table=15 | grep \ "priority=120,ct_state=+est+trk,tcp,metadata=0x2,nw_dst=30.0.0.2,tp_dst=8000" \ | grep nat | sed -n 's/.*n_packets=\([[0-9]]\{1,\}\).*/\1/p') @@ -4548,7 +4548,7 @@ OVS_WAIT_UNTIL([ ]) OVS_WAIT_UNTIL([ - n_pkt=$(ovs-ofctl dump-flows br-int table=44 | grep -v 
n_packets=0 | \ + n_pkt=$(ovs-ofctl dump-flows br-int table=45 | grep -v n_packets=0 | \ grep controller | grep tp_dst=84 -c) test $n_pkt -eq 1 ])

From patchwork Wed Sep 2 15:05:24 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dumitru Ceara
X-Patchwork-Id: 1355899
From: Dumitru Ceara
To: dev@openvswitch.org
Date: Wed, 2 Sep 2020 17:05:24 +0200
Message-Id: <20200902150523.20965.3988.stgit@dceara.remote.csb>
In-Reply-To: <20200902150447.20965.95083.stgit@dceara.remote.csb>
References: <20200902150447.20965.95083.stgit@dceara.remote.csb>
Subject: [ovs-dev] [PATCH v3 ovn 2/2] ovn-northd: Support mixing stateless/stateful ACLs with Stateless_Filter.

A new table, Stateless_Filter, is added to OVN_Northbound. Users can
populate this table with records consisting of a priority and a match
expression. These records generate logical flows in the PRE_ACL stages
of the logical switch pipeline. Packets matching these flows completely
bypass connection tracking for ACL purposes.

In specific scenarios CMSs can predetermine which traffic must be
firewalled statefully and which must not, e.g., UDP vs TCP. However,
until now, if at least one stateful ACL (allow-related) was configured
on the switch, all traffic was sent to connection tracking. This incurs
a performance penalty when forwarding packets that don't need stateful
processing.

New commands (stateless-filter-*) are added to ovn-nbctl to allow users
to interact with the Stateless_Filter table.
Signed-off-by: Dumitru Ceara
---
 NEWS                          |    3
 northd/ovn-northd.8.xml       |   25 ++++
 northd/ovn-northd.c           |   95 +++++++++++++--
 ovn-nb.ovsschema              |   26 ++++
 ovn-nb.xml                    |   56 ++++++++-
 tests/ovn-nbctl.at            |   53 ++++++++
 tests/ovn-northd.at           |  263 +++++++++++++++++++++++++++++++++++++++++
 tests/system-common-macros.at |    8 +
 tests/system-ovn.at           |  113 ++++++++++++++++++
 utilities/ovn-detrace.in      |   12 ++
 utilities/ovn-nbctl.c         |  213 ++++++++++++++++++++++++++++++++-
 11 files changed, 840 insertions(+), 27 deletions(-)

diff --git a/NEWS b/NEWS
index a1ce4e8..eedd091 100644
--- a/NEWS
+++ b/NEWS
@@ -11,6 +11,9 @@ Post-v20.06.0
      called Chassis_Private now contains the nb_cfg column which is updated
      by incrementing the value in the NB_Global table, CMSes relying on
      this mechanism should update their code to use this new table.
+   - Added support for bypassing connection tracking for ACL processing for
+     specific types of traffic through the user supplied Stateless_Filter
+     configuration.

 OVN v20.06.0
 --------------------------
diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
index 226afc8..4e190d8 100644
--- a/northd/ovn-northd.8.xml
+++ b/northd/ovn-northd.8.xml
@@ -322,6 +322,16 @@

    + For each record in table Stateless_Filter in the
    + OVN_Northbound database, a flow is added whose priority is the
    + record's priority + 1000. The flow sets reg0[11] = 1
    + for traffic that matches the condition in the match
    + column and advances to the next table. reg0[11] acts as a hint
    + for tables ACL hints and ACL to avoid
    + sending this traffic to the connection tracker.
    +

    + +

    This table also has a priority-110 flow with the match eth.dst == E for all logical switch datapaths to move traffic to the next table. Where E @@ -422,6 +432,11 @@

    • + A priority-8 flow that matches on packets that have been marked + for stateless ACL processing. This flow sets reg0[8] + and reg0[9] and then advances to the next table. +
    • +
    • A priority-7 flow that matches on packets that initiate a new session. This flow sets reg0[7] and reg0[9] and then advances to the next table. @@ -1445,6 +1460,16 @@ output;

      + For each record in table Stateless_Filter in the
      + OVN_Northbound database, a flow is added whose priority is the
      + record's priority + 1000. The flow sets reg0[11] = 1
      + for traffic that matches the condition in the match
      + column and advances to the next table. reg0[11] acts as a hint
      + for tables ACL hints and ACL to avoid
      + sending this traffic to the connection tracker.
      +

      + +

      This table also has a priority-110 flow with the match eth.src == E for all logical switch datapaths to move traffic to the next table. Where E diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c index 2025446..cf27431 100644 --- a/northd/ovn-northd.c +++ b/northd/ovn-northd.c @@ -218,6 +218,7 @@ enum ovn_stage { #define REGBIT_ACL_HINT_ALLOW "reg0[8]" #define REGBIT_ACL_HINT_DROP "reg0[9]" #define REGBIT_ACL_HINT_BLOCK "reg0[10]" +#define REGBIT_SKIP_ACL_CT "reg0[11]" /* Register definitions for switches and routers. */ @@ -4889,7 +4890,47 @@ skip_port_from_conntrack(struct ovn_datapath *od, struct ovn_port *op, } static void -build_pre_acls(struct ovn_datapath *od, struct hmap *lflows) +build_stateless_filter(struct ovn_datapath *od, + const struct nbrec_stateless_filter *filter, + struct hmap *lflows) +{ + /* Stateless filters must be applied in both directions so that reply + * traffic bypasses conntrack too. + */ + ovn_lflow_add_with_hint(lflows, od, S_SWITCH_IN_PRE_ACL, + filter->priority + OVN_ACL_PRI_OFFSET, + filter->match, + REGBIT_SKIP_ACL_CT" = 1; next;", + &filter->header_); + ovn_lflow_add_with_hint(lflows, od, S_SWITCH_OUT_PRE_ACL, + filter->priority + OVN_ACL_PRI_OFFSET, + filter->match, + REGBIT_SKIP_ACL_CT" = 1; next;", + &filter->header_); +} + +static void +build_stateless_filters(struct ovn_datapath *od, struct hmap *port_groups, + struct hmap *lflows) +{ + for (size_t i = 0; i < od->nbs->n_stateless_filters; i++) { + build_stateless_filter(od, od->nbs->stateless_filters[i], lflows); + } + + struct ovn_port_group *pg; + HMAP_FOR_EACH (pg, key_node, port_groups) { + if (ovn_port_group_ls_find(pg, &od->nbs->header_.uuid)) { + for (size_t i = 0; i < pg->nb_pg->n_stateless_filters; i++) { + build_stateless_filter(od, pg->nb_pg->stateless_filters[i], + lflows); + } + } + } +} + +static void +build_pre_acls(struct ovn_datapath *od, struct hmap *port_groups, + struct hmap *lflows) { bool has_stateful = has_stateful_acl(od); @@ -4934,6 
+4975,13 @@ build_pre_acls(struct ovn_datapath *od, struct hmap *lflows) "nd || nd_rs || nd_ra || " "(udp && udp.src == 546 && udp.dst == 547)", "next;"); + /* Ingress and Egress Pre-ACL Table (Stateless_Filter). + * + * If the logical switch is configured to bypass conntrack for + * specific types of traffic, skip conntrack for that traffic. + */ + build_stateless_filters(od, port_groups, lflows); + /* Ingress and Egress Pre-ACL Table (Priority 100). * * Regardless of whether the ACL is "from-lport" or "to-lport", @@ -5170,6 +5218,15 @@ build_acl_hints(struct ovn_datapath *od, struct hmap *lflows) for (size_t i = 0; i < ARRAY_SIZE(stages); i++) { enum ovn_stage stage = stages[i]; + /* Traffic that matches a Stateless_Filter may hit both allow or + * drop ACLs but should never commit connections to conntrack. Only + * set REGBIT_ACL_HINT_ALLOW and REGBIT_ACL_HINT_DROP. + */ + ovn_lflow_add(lflows, od, stage, 8, REGBIT_SKIP_ACL_CT " == 1", + REGBIT_ACL_HINT_ALLOW " = 1; " + REGBIT_ACL_HINT_DROP " = 1; " + "next;"); + /* New, not already established connections, may hit either allow * or drop ACLs. For allow ACLs, the connection must also be committed * to conntrack so we set REGBIT_ACL_HINT_ALLOW_NEW. @@ -5383,7 +5440,7 @@ consider_acl(struct hmap *lflows, struct ovn_datapath *od, struct ds match = DS_EMPTY_INITIALIZER; struct ds actions = DS_EMPTY_INITIALIZER; - /* Commit the connection tracking entry if it's a new + /* Otherwise commit the connection tracking entry if it's a new * connection that matches this ACL. After this commit, * the reply traffic is allowed by a flow we create at * priority 65535, defined earlier. @@ -5600,11 +5657,15 @@ build_acls(struct ovn_datapath *od, struct hmap *lflows, * Subsequent packets will hit the flow at priority 0 that just * uses "next;". 
*/ ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL, 1, - "ip && (!ct.est || (ct.est && ct_label.blocked == 1))", - REGBIT_CONNTRACK_COMMIT" = 1; next;"); + REGBIT_SKIP_ACL_CT " == 0 " + "&& ip " + "&& (!ct.est || (ct.est && ct_label.blocked == 1))", + REGBIT_CONNTRACK_COMMIT" = 1; next;"); ovn_lflow_add(lflows, od, S_SWITCH_OUT_ACL, 1, - "ip && (!ct.est || (ct.est && ct_label.blocked == 1))", - REGBIT_CONNTRACK_COMMIT" = 1; next;"); + REGBIT_SKIP_ACL_CT " == 0 " + "&& ip " + "&& (!ct.est || (ct.est && ct_label.blocked == 1))", + REGBIT_CONNTRACK_COMMIT" = 1; next;"); /* Ingress and Egress ACL Table (Priority 65535). * @@ -5614,10 +5675,14 @@ build_acls(struct ovn_datapath *od, struct hmap *lflows, * * This is enforced at a higher priority than ACLs can be defined. */ ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL, UINT16_MAX, - "ct.inv || (ct.est && ct.rpl && ct_label.blocked == 1)", + REGBIT_SKIP_ACL_CT " == 0 " + "&& (ct.inv " + "|| (ct.est && ct.rpl && ct_label.blocked == 1))", "drop;"); ovn_lflow_add(lflows, od, S_SWITCH_OUT_ACL, UINT16_MAX, - "ct.inv || (ct.est && ct.rpl && ct_label.blocked == 1)", + REGBIT_SKIP_ACL_CT " == 0 " + "&& (ct.inv " + "|| (ct.est && ct.rpl && ct_label.blocked == 1))", "drop;"); /* Ingress and Egress ACL Table (Priority 65535). @@ -5630,11 +5695,13 @@ build_acls(struct ovn_datapath *od, struct hmap *lflows, * * This is enforced at a higher priority than ACLs can be defined. 
*/ ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL, UINT16_MAX, - "ct.est && !ct.rel && !ct.new && !ct.inv " + REGBIT_SKIP_ACL_CT "== 0 " + "&& ct.est && !ct.rel && !ct.new && !ct.inv " "&& ct.rpl && ct_label.blocked == 0", "next;"); ovn_lflow_add(lflows, od, S_SWITCH_OUT_ACL, UINT16_MAX, - "ct.est && !ct.rel && !ct.new && !ct.inv " + REGBIT_SKIP_ACL_CT "== 0 " + "&& ct.est && !ct.rel && !ct.new && !ct.inv " "&& ct.rpl && ct_label.blocked == 0", "next;"); @@ -5650,11 +5717,13 @@ build_acls(struct ovn_datapath *od, struct hmap *lflows, * related traffic such as an ICMP Port Unreachable through * that's generated from a non-listening UDP port. */ ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL, UINT16_MAX, - "!ct.est && ct.rel && !ct.new && !ct.inv " + REGBIT_SKIP_ACL_CT "== 0 " + "&& !ct.est && ct.rel && !ct.new && !ct.inv " "&& ct_label.blocked == 0", "next;"); ovn_lflow_add(lflows, od, S_SWITCH_OUT_ACL, UINT16_MAX, - "!ct.est && ct.rel && !ct.new && !ct.inv " + REGBIT_SKIP_ACL_CT "== 0 " + "&& !ct.est && ct.rel && !ct.new && !ct.inv " "&& ct_label.blocked == 0", "next;"); @@ -6709,7 +6778,7 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports, continue; } - build_pre_acls(od, lflows); + build_pre_acls(od, port_groups, lflows); build_pre_lb(od, lflows, meter_groups, lbs); build_pre_stateful(od, lflows); build_acl_hints(od, lflows); diff --git a/ovn-nb.ovsschema b/ovn-nb.ovsschema index 0c939b7..ef0121d 100644 --- a/ovn-nb.ovsschema +++ b/ovn-nb.ovsschema @@ -1,7 +1,7 @@ { "name": "OVN_Northbound", - "version": "5.25.0", - "cksum": "1354137211 26116", + "version": "5.26.0", + "cksum": "1450952466 27225", "tables": { "NB_Global": { "columns": { @@ -35,6 +35,12 @@ "refType": "strong"}, "min": 0, "max": "unlimited"}}, + "stateless_filters": { + "type": {"key": {"type": "uuid", + "refTable": "Stateless_Filter", + "refType": "strong"}, + "min": 0, + "max": "unlimited"}}, "acls": {"type": {"key": {"type": "uuid", "refTable": "ACL", "refType": "strong"}, @@ -150,6 +156,12 
@@ "refType": "weak"}, "min": 0, "max": "unlimited"}}, + "stateless_filters": { + "type": {"key": {"type": "uuid", + "refTable": "Stateless_Filter", + "refType": "strong"}, + "min": 0, + "max": "unlimited"}}, "acls": {"type": {"key": {"type": "uuid", "refTable": "ACL", "refType": "strong"}, @@ -201,6 +213,16 @@ "type": {"key": "string", "value": "string", "min": 0, "max": "unlimited"}}}, "isRoot": false}, + "Stateless_Filter": { + "columns": { + "priority": {"type": {"key": {"type": "integer", + "minInteger": 0, + "maxInteger": 32767}}}, + "match": {"type": "string"}, + "external_ids": { + "type": {"key": "string", "value": "string", + "min": 0, "max": "unlimited"}}}, + "isRoot": false}, "ACL": { "columns": { "name": {"type": {"key": {"type": "string", diff --git a/ovn-nb.xml b/ovn-nb.xml index 1f2dbb9..81373ce 100644 --- a/ovn-nb.xml +++ b/ovn-nb.xml @@ -271,9 +271,16 @@ ip addresses. - - Access control rules that apply to packets within the logical switch. - + + + Access control rules that apply to packets within the logical switch. + + + + Stateless filters to bypass connection tracking that apply to packets + within the logical switch. + + QoS marking and metering rules that apply to packets within the @@ -1430,6 +1437,11 @@ lswitches that the ports of the port group belong to. + + Stateless filters to bypass connection tracking that apply to the + port_group. + + See External IDs at the beginning of this document. @@ -1589,6 +1601,44 @@ + +

      + Each row in this table represents a rule to determine if traffic should
      + be processed in a stateless way in the ACL stage, without recirculating
      + through connection tracking, regardless of the type of ACL that is hit.
      +
      + In normal operation, whenever an ACL associated with a Logical_Switch
      + has action allow-related, all IP traffic is sent to
      + the connection tracker.
      +
      + If match is set to E, all ACLs that match
      + packets for which E is true are considered stateless and
      + will not generate recirculation of packets through connection tracking.
      +
      + This also implies that the CMS should add an explicit allow
      + ACL for return traffic, because return traffic will not go to conntrack
      + either, so it has to be explicitly allowed.
      +
      + This is useful when some specific types of traffic do not need
      + stateful processing.
      +

      +
      + The priority of the filter rule. Rules with numerically higher priority
      + take precedence.
      +
      +
      + The packets that the stateless filter should match, in the same
      + expression language used for the match column in the OVN Southbound database's
      + Logical_Flow table.
      +
      +
      +
      + See External IDs at the beginning of this document.
      +
      +
      +
      +

      Each row in this table represents one ACL rule for a logical switch diff --git a/tests/ovn-nbctl.at b/tests/ovn-nbctl.at index 619051d..b55ee03 100644 --- a/tests/ovn-nbctl.at +++ b/tests/ovn-nbctl.at @@ -270,6 +270,59 @@ AT_CHECK([ovn-nbctl --type=port-group acl-add ls0 to-lport 100 ip drop], [0], [i dnl --------------------------------------------------------------------- +OVN_NBCTL_TEST([ovn_nbctl_stateless_filters], [Stateless_Filters], [ +ovn_nbctl_test_stateless_filters() { + AT_CHECK([ovn-nbctl $2 stateless-filter-add $1 300 udp]) + AT_CHECK([ovn-nbctl $2 stateless-filter-add $1 200 tcp]) + AT_CHECK([ovn-nbctl $2 stateless-filter-add $1 100 ip]) + dnl Add duplicated Stateless_Filter + AT_CHECK([ovn-nbctl $2 stateless-filter-add $1 100 ip], [1], [], [stderr]) + AT_CHECK([grep 'already existed' stderr], [0], [ignore]) + AT_CHECK([ovn-nbctl $2 --may-exist stateless-filter-add $1 100 ip]) + + AT_CHECK([ovn-nbctl $2 stateless-filter-list $1], [0], [dnl + 300 (udp) + 200 (tcp) + 100 (ip) +]) + + dnl Delete all Stateless_Filters. + AT_CHECK([ovn-nbctl $2 stateless-filter-del $1]) + AT_CHECK([ovn-nbctl $2 stateless-filter-list $1], [0], [dnl +]) + + AT_CHECK([ovn-nbctl $2 stateless-filter-add $1 300 udp]) + AT_CHECK([ovn-nbctl $2 stateless-filter-add $1 200 tcp]) + AT_CHECK([ovn-nbctl $2 stateless-filter-add $1 100 ip]) + + dnl Delete a single filter. 
+    AT_CHECK([ovn-nbctl $2 stateless-filter-del $1 200 tcp])
+    AT_CHECK([ovn-nbctl $2 stateless-filter-list $1], [0], [dnl
+  300 (udp)
+  100 (ip)
+])
+}
+
+AT_CHECK([ovn-nbctl ls-add ls0])
+ovn_nbctl_test_stateless_filters ls0
+AT_CHECK([ovn-nbctl ls-add ls1])
+ovn_nbctl_test_stateless_filters ls1 --type=switch
+AT_CHECK([ovn-nbctl create port_group name=pg0], [0], [ignore])
+ovn_nbctl_test_stateless_filters pg0 --type=port-group
+
+dnl Test when port group doesn't exist
+AT_CHECK([ovn-nbctl --type=port-group stateless-filter-add pg1 100 ip], [1], [], [dnl
+ovn-nbctl: pg1: port group name not found
+])
+
+dnl Test when same name exists in logical switches and portgroups
+AT_CHECK([ovn-nbctl create port_group name=ls0], [0], [ignore])
+AT_CHECK([ovn-nbctl stateless-filter-add ls0 100 ip], [1], [], [stderr])
+AT_CHECK([grep 'exists in both' stderr], [0], [ignore])
+AT_CHECK([ovn-nbctl --type=port-group stateless-filter-add ls0 100 ip], [0], [ignore])])
+
+dnl ---------------------------------------------------------------------
+
 OVN_NBCTL_TEST([ovn_nbctl_qos], [QoS], [
 AT_CHECK([ovn-nbctl ls-add ls0])
 AT_CHECK([ovn-nbctl qos-add ls0 from-lport 600 tcp dscp=63])
diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index 87644bd..2fc7dc3 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -1781,3 +1781,266 @@ AT_CHECK([ovn-sbctl lflow-list | grep "ls_out_pre_lb.*priority=100" | grep reg0
 ])
 AT_CLEANUP
+
+AT_SETUP([ovn -- ACL Stateful Bypass - Logical_Switch])
+ovn_start
+
+ovn-nbctl ls-add ls
+ovn-nbctl lsp-add ls lsp1
+ovn-nbctl lsp-set-addresses lsp1 00:00:00:00:00:01
+ovn-nbctl lsp-add ls lsp2
+ovn-nbctl lsp-set-addresses lsp2 00:00:00:00:00:02
+
+ovn-nbctl acl-add ls from-lport 3 "tcp" allow
+ovn-nbctl acl-add ls from-lport 2 "udp" allow-related
+ovn-nbctl acl-add ls from-lport 1 "ip" drop
+ovn-nbctl --wait=sb sync
+
+flow_eth='eth.src == 00:00:00:00:00:01 && eth.dst == 00:00:00:00:00:02'
+flow_ip='ip.ttl==64 && ip4.src == 42.42.42.1 && ip4.dst == 66.66.66.66'
+flow_tcp='tcp && tcp.dst == 80'
+flow_udp='udp && udp.dst == 80'
+
+# TCP packets should go to conntrack.
+flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_tcp}"
+AT_CHECK([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl
+# tcp,reg14=0x1,vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80,tcp_flags=0
+ct_next(ct_state=new|trk) {
+    ct_next(ct_state=new|trk) {
+        output("lsp2");
+    };
+};
+])
+
+# UDP packets should go to conntrack.
+flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_udp}"
+AT_CHECK([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl
+# udp,reg14=0x1,vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80
+ct_next(ct_state=new|trk) {
+    ct_next(ct_state=new|trk) {
+        output("lsp2");
+    };
+};
+])
+
+# Enable Stateful Bypass for TCP.
+ovn-nbctl stateless-filter-add ls 1 tcp
+ovn-nbctl --wait=sb sync
+
+# TCP packets should not go to conntrack anymore.
+flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_tcp}"
+AT_CHECK([ovn-trace --minimal ls "${flow}"], [0], [dnl
+# tcp,reg14=0x1,vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80,tcp_flags=0
+output("lsp2");
+])
+
+# UDP packets still go to conntrack.
+flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_udp}"
+AT_CHECK([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl
+# udp,reg14=0x1,vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80
+ct_next(ct_state=new|trk) {
+    ct_next(ct_state=new|trk) {
+        output("lsp2");
+    };
+};
+])
+
+# Add a load balancer.
+ovn-nbctl lb-add lb-tcp 66.66.66.66:80 42.42.42.2:8080 tcp
+ovn-nbctl lb-add lb-udp 66.66.66.66:80 42.42.42.2:8080 udp
+ovn-nbctl ls-lb-add ls lb-tcp
+ovn-nbctl ls-lb-add ls lb-udp
+
+# Disable Stateful Bypass for TCP.
+ovn-nbctl stateless-filter-del ls
+ovn-nbctl --wait=sb sync
+
+# TCP packets should go to conntrack.
+flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_tcp}"
+AT_CHECK([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl
+# tcp,reg14=0x1,vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80,tcp_flags=0
+ct_next(ct_state=new|trk) {
+    ct_lb {
+        ct_next(ct_state=new|trk) {
+            output("lsp2");
+        };
+    };
+};
+])
+
+# UDP packets should go to conntrack.
+flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_udp}"
+AT_CHECK([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl
+# udp,reg14=0x1,vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80
+ct_next(ct_state=new|trk) {
+    ct_lb {
+        ct_next(ct_state=new|trk) {
+            output("lsp2");
+        };
+    };
+};
+])
+
+# Enable Stateful Bypass for TCP.
+ovn-nbctl stateless-filter-add ls 1 tcp
+ovn-nbctl --wait=sb sync
+
+# TCP packets should go to conntrack for load balancing.
+flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_tcp}"
+AT_CHECK([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl
+# tcp,reg14=0x1,vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80,tcp_flags=0
+ct_next(ct_state=new|trk) {
+    ct_lb {
+        ct_next(ct_state=new|trk) {
+            output("lsp2");
+        };
+    };
+};
+])
+
+# UDP packets still go to conntrack.
+flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_udp}"
+AT_CHECK([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl
+# udp,reg14=0x1,vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80
+ct_next(ct_state=new|trk) {
+    ct_lb {
+        ct_next(ct_state=new|trk) {
+            output("lsp2");
+        };
+    };
+};
+])
+
+AT_CLEANUP
+
+AT_SETUP([ovn -- ACL Stateful Bypass - Port_Group])
+ovn_start
+
+ovn-nbctl ls-add ls
+ovn-nbctl lsp-add ls lsp1
+ovn-nbctl lsp-set-addresses lsp1 00:00:00:00:00:01
+ovn-nbctl lsp-add ls lsp2
+ovn-nbctl lsp-set-addresses lsp2 00:00:00:00:00:02
+
+ovn-nbctl pg-add pg lsp1 lsp2
+ovn-nbctl acl-add pg from-lport 3 "tcp" allow
+ovn-nbctl acl-add pg from-lport 2 "udp" allow-related
+ovn-nbctl acl-add pg from-lport 1 "ip" drop
+ovn-nbctl --wait=sb sync
+
+flow_eth='eth.src == 00:00:00:00:00:01 && eth.dst == 00:00:00:00:00:02'
+flow_ip='ip.ttl==64 && ip4.src == 42.42.42.1 && ip4.dst == 66.66.66.66'
+flow_tcp='tcp && tcp.dst == 80'
+flow_udp='udp && udp.dst == 80'
+
+# TCP packets should go to conntrack.
+flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_tcp}"
+AT_CHECK([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl
+# tcp,reg14=0x1,vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80,tcp_flags=0
+ct_next(ct_state=new|trk) {
+    ct_next(ct_state=new|trk) {
+        output("lsp2");
+    };
+};
+])
+
+# UDP packets should go to conntrack.
+flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_udp}"
+AT_CHECK([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl
+# udp,reg14=0x1,vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80
+ct_next(ct_state=new|trk) {
+    ct_next(ct_state=new|trk) {
+        output("lsp2");
+    };
+};
+])
+
+# Enable Stateful Bypass for TCP.
+ovn-nbctl stateless-filter-add pg 1 tcp
+ovn-nbctl --wait=sb sync
+
+# TCP packets should not go to conntrack anymore.
+flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_tcp}"
+AT_CHECK([ovn-trace --minimal ls "${flow}"], [0], [dnl
+# tcp,reg14=0x1,vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80,tcp_flags=0
+output("lsp2");
+])
+
+# UDP packets still go to conntrack.
+flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_udp}"
+AT_CHECK([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl
+# udp,reg14=0x1,vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80
+ct_next(ct_state=new|trk) {
+    ct_next(ct_state=new|trk) {
+        output("lsp2");
+    };
+};
+])
+
+# Add a load balancer.
+ovn-nbctl lb-add lb-tcp 66.66.66.66:80 42.42.42.2:8080 tcp
+ovn-nbctl lb-add lb-udp 66.66.66.66:80 42.42.42.2:8080 udp
+ovn-nbctl ls-lb-add ls lb-tcp
+ovn-nbctl ls-lb-add ls lb-udp
+
+# Disable Stateful Bypass for TCP.
+ovn-nbctl stateless-filter-del pg
+ovn-nbctl --wait=sb sync
+
+# TCP packets should go to conntrack.
+flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_tcp}"
+AT_CHECK([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl
+# tcp,reg14=0x1,vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80,tcp_flags=0
+ct_next(ct_state=new|trk) {
+    ct_lb {
+        ct_next(ct_state=new|trk) {
+            output("lsp2");
+        };
+    };
+};
+])
+
+# UDP packets should go to conntrack.
+flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_udp}"
+AT_CHECK([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl
+# udp,reg14=0x1,vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80
+ct_next(ct_state=new|trk) {
+    ct_lb {
+        ct_next(ct_state=new|trk) {
+            output("lsp2");
+        };
+    };
+};
+])
+
+# Enable Stateful Bypass for TCP.
+ovn-nbctl stateless-filter-add pg 1 tcp
+ovn-nbctl --wait=sb sync
+
+# TCP packets should go to conntrack for load balancing.
+flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_tcp}"
+AT_CHECK([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl
+# tcp,reg14=0x1,vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80,tcp_flags=0
+ct_next(ct_state=new|trk) {
+    ct_lb {
+        ct_next(ct_state=new|trk) {
+            output("lsp2");
+        };
+    };
+};
+])
+
+# UDP packets still go to conntrack.
+flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_udp}"
+AT_CHECK([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl
+# udp,reg14=0x1,vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80
+ct_next(ct_state=new|trk) {
+    ct_lb {
+        ct_next(ct_state=new|trk) {
+            output("lsp2");
+        };
+    };
+};
+])
+
+AT_CLEANUP
diff --git a/tests/system-common-macros.at b/tests/system-common-macros.at
index c8fa6f0..65904ed 100644
--- a/tests/system-common-macros.at
+++ b/tests/system-common-macros.at
@@ -234,6 +234,14 @@ m4_define([FORMAT_PING], [grep "transmitted" | sed 's/time.*ms$/time 0ms/'])
 #
 m4_define([STRIP_MONITOR_CSUM], [grep "csum:" | sed 's/csum:.*/csum: /'])
+# FORMAT_CT_STATE([ip-addr])
+#
+# Strip content from the piped input which would differ from test to test
+# and limit the output to the rows containing 'ip-addr'. Don't strip state.
+#
+m4_define([FORMAT_CT_STATE],
+    [[grep "dst=$1" | sed -e 's/port=[0-9]*/port=/g' -e 's/id=[0-9]*/id=/g' | sort | uniq]])
+
 # FORMAT_CT([ip-addr])
 #
 # Strip content from the piped input which would differ from test to test
diff --git a/tests/system-ovn.at b/tests/system-ovn.at
index b9b5eaa..32f9acc 100644
--- a/tests/system-ovn.at
+++ b/tests/system-ovn.at
@@ -5397,3 +5397,116 @@ as
 OVS_TRAFFIC_VSWITCHD_STOP(["/.*error receiving.*/d
 /.*terminating with signal 15.*/d"])
 AT_CLEANUP
+
+AT_SETUP([ovn -- ACL Stateful Bypass + Load balancer])
+AT_SKIP_IF([test $HAVE_NC = no])
+AT_KEYWORDS([lb])
+AT_KEYWORDS([conntrack])
+ovn_start
+
+OVS_TRAFFIC_VSWITCHD_START()
+ADD_BR([br-int])
+
+# Set external-ids in br-int needed for ovn-controller
+ovs-vsctl \
+        -- set Open_vSwitch . external-ids:system-id=hv1 \
+        -- set Open_vSwitch . external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \
+        -- set Open_vSwitch . external-ids:ovn-encap-type=geneve \
+        -- set Open_vSwitch . external-ids:ovn-encap-ip=169.0.0.1 \
+        -- set bridge br-int fail-mode=secure other-config:disable-in-band=true
+
+# Start ovn-controller
+start_daemon ovn-controller
+
+# Logical network:
+# One logical switch with a load balancer with one backend.
+# On the LS we add "allow" ACLs for TCP and "allow-related" ACLs for UDP.
+# The "allow-related" ACL normally forces all traffic to go to conntrack.
+# We enable ACL stateful bypass for TCP so TCP traffic should not be
+# sent to conntrack for ACLs (only for LB).
+
+ovn-nbctl ls-add ls
+ovn-nbctl lsp-add ls lsp1
+ovn-nbctl lsp-set-addresses lsp1 00:00:00:00:00:01
+ovn-nbctl lsp-add ls lsp2
+ovn-nbctl lsp-set-addresses lsp2 00:00:00:00:00:02
+
+ovn-nbctl acl-add ls from-lport 3 "tcp" allow
+ovn-nbctl acl-add ls from-lport 2 "udp" allow-related
+ovn-nbctl acl-add ls from-lport 1 "ip" drop
+
+ovn-nbctl lr-add rtr
+ovn-nbctl lrp-add rtr rtr-ls 00:00:00:00:01:00 42.42.42.254/24
+ovn-nbctl lsp-add ls ls-rtr \
+    -- lsp-set-type ls-rtr router \
+    -- lsp-set-addresses ls-rtr 00:00:00:00:01:00 \
+    -- lsp-set-options ls-rtr router-port=rtr-ls
+
+# Add a load balancer.
+ovn-nbctl lb-add lb-tcp 66.66.66.66:80 42.42.42.2:8080 tcp
+ovn-nbctl lb-add lb-udp 66.66.66.66:80 42.42.42.2:8080 udp
+ovn-nbctl ls-lb-add ls lb-tcp
+ovn-nbctl ls-lb-add ls lb-udp
+
+# Enable Stateful Bypass for TCP.
+ovn-nbctl \
+    --id=@f1 create Stateless_Filter priority=1 match="tcp" -- \
+    set Logical_Switch ls stateless_filters='@f1'
+
+ADD_NAMESPACES(lsp1)
+ADD_VETH(lsp1, lsp1, br-int, "42.42.42.1/24", "00:00:00:00:00:01", \
+         "42.42.42.254")
+
+ADD_NAMESPACES(lsp2)
+ADD_VETH(lsp2, lsp2, br-int, "42.42.42.2/24", "00:00:00:00:00:02", \
+         "42.42.42.254")
+
+ovn-nbctl --wait=hv sync
+
+# Start a UDP server on lsp2.
+NETNS_DAEMONIZE([lsp2], [nc -l --no-shutdown -u 42.42.42.2 8080], [nc2.pid])
+
+# Start a UDP connection.
+NS_CHECK_EXEC([lsp1], [echo "foo" | nc --no-shutdown -u 66.66.66.66 80])
+
+# There should be 2 UDP conntrack entries:
+# - one for the allow-related ACL.
+# - one for the LB dnat.
+OVS_WAIT_UNTIL([test "$(ovs-appctl dpctl/dump-conntrack | grep udp | grep '42.42.42.1' -c)" = "2"])
+
+AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT_STATE(42.42.42.1) | grep udp | \
+sed -e 's/zone=[[0-9]]*/zone=/'], [0], [dnl
+udp,orig=(src=42.42.42.1,dst=42.42.42.2,sport=,dport=),reply=(src=42.42.42.2,dst=42.42.42.1,sport=,dport=),zone=
+udp,orig=(src=42.42.42.1,dst=66.66.66.66,sport=,dport=),reply=(src=42.42.42.2,dst=42.42.42.1,sport=,dport=),zone=,labels=0x2
+])
+
+# Start a TCP server on lsp2.
+NETNS_DAEMONIZE([lsp2], [nc -l --no-shutdown 42.42.42.2 8080], [nc0.pid])
+
+# Start a TCP connection.
+NETNS_DAEMONIZE([lsp1], [nc --no-shutdown 66.66.66.66 80], [nc1.pid])
+
+OVS_WAIT_UNTIL([test "$(ovs-appctl dpctl/dump-conntrack | grep tcp | grep '42.42.42.1' -c)" = "1"])
+
+# There should be only one TCP conntrack entry, for the LB dnat.
+AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT_STATE(42.42.42.1) | grep tcp | \
+sed -e 's/zone=[[0-9]]*/zone=/'], [0], [dnl
+tcp,orig=(src=42.42.42.1,dst=66.66.66.66,sport=,dport=),reply=(src=42.42.42.2,dst=42.42.42.1,sport=,dport=),zone=,labels=0x2,protoinfo=(state=ESTABLISHED)
+])
+
+OVS_APP_EXIT_AND_WAIT([ovn-controller])
+
+as ovn-sb
+OVS_APP_EXIT_AND_WAIT([ovsdb-server])
+
+as ovn-nb
+OVS_APP_EXIT_AND_WAIT([ovsdb-server])
+
+as northd
+OVS_APP_EXIT_AND_WAIT([ovn-northd])
+
+as
+OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-.*/d
+/connection dropped.*/d"])
+
+AT_CLEANUP
diff --git a/utilities/ovn-detrace.in b/utilities/ovn-detrace.in
index 4f8dd5f..343965d 100755
--- a/utilities/ovn-detrace.in
+++ b/utilities/ovn-detrace.in
@@ -232,6 +232,17 @@ class StaticRouteHintHandler(CookieHandlerByUUUID):
                     route.ip_prefix, route.nexthop, route.output_port,
                     route.policy))
+class StatelessFilterHintHandler(CookieHandlerByUUUID):
+    def __init__(self, ovnnb_db):
+        super(StatelessFilterHintHandler, self).__init__(ovnnb_db,
+                                                         'Stateless_Filter')
+
+    def print_record(self, s_filter):
+        output = 'Stateless_Filter: priority=%s, match=(%s)' % (
+            s_filter.priority,
+            s_filter.match.strip('"'))
+        print_h(output)
+
 class QoSHintHandler(CookieHandlerByUUUID):
     def __init__(self, ovnnb_db):
         super(QoSHintHandler, self).__init__(ovnnb_db, 'QoS')
@@ -254,6 +265,7 @@ class LogicalFlowHandler(CookieHandlerByUUUID):
             LoadBalancerHintHandler(ovnnb_db),
             NATHintHandler(ovnnb_db),
             StaticRouteHintHandler(ovnnb_db),
+            StatelessFilterHintHandler(ovnnb_db),
             QoSHintHandler(ovnnb_db),
         ]
diff --git a/utilities/ovn-nbctl.c b/utilities/ovn-nbctl.c
index d7bb4b4..7716dcd 100644
--- a/utilities/ovn-nbctl.c
+++ b/utilities/ovn-nbctl.c
@@ -601,6 +601,17 @@ ACL commands:\n\
 acl-list {SWITCH | PORTGROUP}\n\
 print ACLs for SWITCH\n\
 \n\
+Stateless filter commands:\n\
+ [--type={switch | port-group}] [--may-exist]\n\
+ stateless-filter-add {SWITCH | PORTGROUP} PRIORITY MATCH \n\
+ add a stateless filter to SWITCH/PORTGROUP\n\
+ [--type={switch | port-group}]\n\
+ stateless-filter-del {SWITCH | PORTGROUP} [PRIORITY MATCH]\n\
+ remove stateless filters from SWITCH/PORTGROUP\n\
+ [--type={switch | port-group}]\n\
+ stateless-filter-list {SWITCH | PORTGROUP}\n\
+ print stateless filters for SWITCH\n\
+\n\
 QoS commands:\n\
 qos-add SWITCH DIRECTION PRIORITY MATCH [rate=RATE [burst=BURST]] [dscp=DSCP]\n\
 add an QoS rule to SWITCH\n\
@@ -725,7 +736,8 @@ LB commands:\n\
 ls-lb-add SWITCH LB add a load-balancer to SWITCH\n\
 ls-lb-del SWITCH [LB] remove load-balancers from SWITCH\n\
 ls-lb-list SWITCH print load-balancers\n\
-\n\
+\n\n",program_name, program_name);
+    printf("\
 DHCP Options commands:\n\
 dhcp-options-create CIDR [EXTERNAL_IDS]\n\
 create a DHCP options row with CIDR\n\
@@ -743,8 +755,7 @@ Connection commands:\n\
 del-connection delete the connections\n\
 [--inactivity-probe=MSECS]\n\
 set-connection TARGET... set the list of connections to TARGET...\n\
-\n\n",program_name, program_name);
-    printf("\
+\n\
 SSL commands:\n\
 get-ssl print the SSL configuration\n\
 del-ssl delete the SSL configuration\n\
@@ -2021,9 +2032,9 @@ acl_cmp(const void *acl1_, const void *acl2_)
 }
 static char * OVS_WARN_UNUSED_RESULT
-acl_cmd_get_pg_or_ls(struct ctl_context *ctx,
-                     const struct nbrec_logical_switch **ls,
-                     const struct nbrec_port_group **pg)
+cmd_get_pg_or_ls(struct ctl_context *ctx,
+                 const struct nbrec_logical_switch **ls,
+                 const struct nbrec_port_group **pg)
 {
     const char *opt_type = shash_find_data(&ctx->options, "--type");
     char *error;
@@ -2073,7 +2084,7 @@ nbctl_acl_list(struct ctl_context *ctx)
     const struct nbrec_acl **acls;
     size_t i;
-    char *error = acl_cmd_get_pg_or_ls(ctx, &ls, &pg);
+    char *error = cmd_get_pg_or_ls(ctx, &ls, &pg);
     if (error) {
         ctx->error = error;
         return;
@@ -2173,7 +2184,7 @@ nbctl_acl_add(struct ctl_context *ctx)
     const struct nbrec_port_group *pg = NULL;
     const char *action = ctx->argv[5];
-    char *error = acl_cmd_get_pg_or_ls(ctx, &ls, &pg);
+    char *error = cmd_get_pg_or_ls(ctx, &ls, &pg);
     if (error) {
         ctx->error = error;
         return;
@@ -2264,7 +2275,7 @@ nbctl_acl_del(struct ctl_context *ctx)
     const struct nbrec_logical_switch *ls = NULL;
     const struct nbrec_port_group *pg = NULL;
-    char *error = acl_cmd_get_pg_or_ls(ctx, &ls, &pg);
+    char *error = cmd_get_pg_or_ls(ctx, &ls, &pg);
     if (error) {
         ctx->error = error;
         return;
@@ -2351,6 +2362,181 @@ nbctl_acl_del(struct ctl_context *ctx)
     }
 }
+static int
+stateless_filter_cmp(const void *filter1_, const void *filter2_)
+{
+    const struct nbrec_stateless_filter *const *filter1p = filter1_;
+    const struct nbrec_stateless_filter *const *filter2p = filter2_;
+    const struct nbrec_stateless_filter *filter1 = *filter1p;
+    const struct nbrec_stateless_filter *filter2 = *filter2p;
+
+    if (filter1->priority != filter2->priority) {
+        return filter1->priority > filter2->priority ? -1 : 1;
+    } else {
+        return strcmp(filter1->match, filter2->match);
+    }
+}
+
+static void
+nbctl_stateless_filter_list(struct ctl_context *ctx)
+{
+    const struct nbrec_logical_switch *ls = NULL;
+    const struct nbrec_port_group *pg = NULL;
+    const struct nbrec_stateless_filter **filters;
+    size_t i;
+
+    char *error = cmd_get_pg_or_ls(ctx, &ls, &pg);
+    if (error) {
+        ctx->error = error;
+        return;
+    }
+
+    size_t n_filters = pg ? pg->n_stateless_filters : ls->n_stateless_filters;
+    struct nbrec_stateless_filter **nb_filters = pg
+                                                 ? pg->stateless_filters
+                                                 : ls->stateless_filters;
+
+    filters = xmalloc(sizeof *filters * n_filters);
+    for (i = 0; i < n_filters; i++) {
+        filters[i] = nb_filters[i];
+    }
+
+    qsort(filters, n_filters, sizeof *filters, stateless_filter_cmp);
+
+    for (i = 0; i < n_filters; i++) {
+        const struct nbrec_stateless_filter *filter = filters[i];
+        ds_put_format(&ctx->output, "%5"PRId64" (%s)\n",
+                      filter->priority, filter->match);
+    }
+
+    free(filters);
+}
+
+static void
+nbctl_stateless_filter_add(struct ctl_context *ctx)
+{
+    const struct nbrec_logical_switch *ls = NULL;
+    const struct nbrec_port_group *pg = NULL;
+
+    char *error = cmd_get_pg_or_ls(ctx, &ls, &pg);
+    if (error) {
+        ctx->error = error;
+        return;
+    }
+
+    int64_t priority;
+    error = parse_priority(ctx->argv[2], &priority);
+    if (error) {
+        ctx->error = error;
+        return;
+    }
+
+    /* Create the filter. */
+    struct nbrec_stateless_filter *filter =
+        nbrec_stateless_filter_insert(ctx->txn);
+    nbrec_stateless_filter_set_priority(filter, priority);
+    nbrec_stateless_filter_set_match(filter, ctx->argv[3]);
+
+    /* Check if same filter already exists for the ls/portgroup */
+    size_t n_filters = pg ? pg->n_stateless_filters : ls->n_stateless_filters;
+    struct nbrec_stateless_filter **filters = pg
+                                              ? pg->stateless_filters
+                                              : ls->stateless_filters;
+    for (size_t i = 0; i < n_filters; i++) {
+        if (!stateless_filter_cmp(&filters[i], &filter)) {
+            bool may_exist = shash_find(&ctx->options, "--may-exist") != NULL;
+            if (!may_exist) {
+                ctl_error(ctx,
+                          "Same filter already existed on ls or pg %s.",
+                          ctx->argv[1]);
+                return;
+            }
+            return;
+        }
+    }
+
+    /* Insert the filter into the logical switch/port group. */
+    struct nbrec_stateless_filter **new_filters =
+        xmalloc(sizeof *new_filters * (n_filters + 1));
+    nullable_memcpy(new_filters, filters, sizeof *new_filters * n_filters);
+    new_filters[n_filters] = filter;
+    if (pg) {
+        nbrec_port_group_verify_stateless_filters(pg);
+        nbrec_port_group_set_stateless_filters(pg, new_filters,
+                                               n_filters + 1);
+    } else {
+        nbrec_logical_switch_verify_stateless_filters(ls);
+        nbrec_logical_switch_set_stateless_filters(ls, new_filters,
+                                                   n_filters + 1);
+    }
+    free(new_filters);
+}
+
+static void
+nbctl_stateless_filter_del(struct ctl_context *ctx)
+{
+    const struct nbrec_logical_switch *ls = NULL;
+    const struct nbrec_port_group *pg = NULL;
+
+    char *error = cmd_get_pg_or_ls(ctx, &ls, &pg);
+    if (error) {
+        ctx->error = error;
+        return;
+    }
+
+    if (ctx->argc == 2) {
+        /* If priority and match are not specified, delete filters. */
+        if (pg) {
+            nbrec_port_group_verify_stateless_filters(pg);
+            nbrec_port_group_set_stateless_filters(pg, NULL, 0);
+        } else {
+            nbrec_logical_switch_verify_stateless_filters(ls);
+            nbrec_logical_switch_set_stateless_filters(ls, NULL, 0);
+        }
+        return;
+    }
+
+    int64_t priority;
+    error = parse_priority(ctx->argv[2], &priority);
+    if (error) {
+        ctx->error = error;
+        return;
+    }
+
+    if (ctx->argc == 3) {
+        ctl_error(ctx, "cannot specify priority without match");
+        return;
+    }
+
+    size_t n_filters = pg ? pg->n_stateless_filters : ls->n_stateless_filters;
+    struct nbrec_stateless_filter **filters = pg
+                                              ? pg->stateless_filters
+                                              : ls->stateless_filters;
+
+    /* Remove the matching rule. */
+    for (size_t i = 0; i < n_filters; i++) {
+        struct nbrec_stateless_filter *filter = filters[i];
+
+        if (priority == filter->priority
+            && !strcmp(ctx->argv[3], filter->match)) {
+            struct nbrec_stateless_filter **new_filters
+                = xmemdup(filters, sizeof *new_filters * n_filters);
+            new_filters[i] = filters[n_filters - 1];
+            if (pg) {
+                nbrec_port_group_verify_stateless_filters(pg);
+                nbrec_port_group_set_stateless_filters(pg, new_filters,
+                                                       n_filters - 1);
+            } else {
+                nbrec_logical_switch_verify_stateless_filters(ls);
+                nbrec_logical_switch_set_stateless_filters(ls, new_filters,
+                                                           n_filters - 1);
+            }
+            free(new_filters);
+            return;
+        }
+    }
+}
+
 static void
 nbctl_qos_list(struct ctl_context *ctx)
 {
@@ -6283,6 +6469,15 @@ static const struct ctl_command_syntax nbctl_commands[] = {
     { "acl-list", 1, 1, "{SWITCH | PORTGROUP}",
       NULL, nbctl_acl_list, NULL, "--type=", RO },
+    /* stateless filter commands. */
+    { "stateless-filter-add", 3, 4, "{SWITCH | PORTGROUP} PRIORITY MATCH",
+      NULL, nbctl_stateless_filter_add, NULL,
+      "--may-exist,--type=", RW },
+    { "stateless-filter-del", 1, 4, "{SWITCH | PORTGROUP} [PRIORITY MATCH]",
+      NULL, nbctl_stateless_filter_del, NULL, "--type=", RW },
+    { "stateless-filter-list", 1, 1, "{SWITCH | PORTGROUP}",
+      NULL, nbctl_stateless_filter_list, NULL, "--type=", RO },
+
     /* qos commands. */
     { "qos-add", 5, 7,
       "SWITCH DIRECTION PRIORITY MATCH [rate=RATE [burst=BURST]] [dscp=DSCP]",