From patchwork Thu Sep 10 02:00:19 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Joe Stringer X-Patchwork-Id: 516079 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from archives.nicira.com (li376-54.members.linode.com [96.126.127.54]) by ozlabs.org (Postfix) with ESMTP id C97F71400CB for ; Thu, 10 Sep 2015 12:00:56 +1000 (AEST) Received: from archives.nicira.com (localhost [127.0.0.1]) by archives.nicira.com (Postfix) with ESMTP id 64CB610A87; Wed, 9 Sep 2015 19:00:43 -0700 (PDT) X-Original-To: dev@openvswitch.org Delivered-To: dev@openvswitch.org Received: from mx1e4.cudamail.com (mx1.cudamail.com [69.90.118.67]) by archives.nicira.com (Postfix) with ESMTPS id EFD5710A84 for ; Wed, 9 Sep 2015 19:00:41 -0700 (PDT) Received: from bar5.cudamail.com (unknown [192.168.21.12]) by mx1e4.cudamail.com (Postfix) with ESMTPS id 7E70C1E0294 for ; Wed, 9 Sep 2015 20:00:41 -0600 (MDT) X-ASG-Debug-ID: 1441850439-09eadd7da416bba0001-byXFYA Received: from mx1-pf2.cudamail.com ([192.168.24.2]) by bar5.cudamail.com with ESMTP id yI5U41zEnQflbSxf (version=TLSv1 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Wed, 09 Sep 2015 20:00:39 -0600 (MDT) X-Barracuda-Envelope-From: joestringer@nicira.com X-Barracuda-RBL-Trusted-Forwarder: 192.168.24.2 Received: from unknown (HELO mail-pa0-f42.google.com) (209.85.220.42) by mx1-pf2.cudamail.com with ESMTPS (RC4-SHA encrypted); 10 Sep 2015 02:00:39 -0000 Received-SPF: unknown (mx1-pf2.cudamail.com: Multiple SPF records returned) X-Barracuda-RBL-Trusted-Forwarder: 209.85.220.42 Received: by padhk3 with SMTP id hk3so26844854pad.3 for ; Wed, 09 Sep 2015 19:00:38 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references; bh=4/LebQ7YIfOvw5008prAifvOgoGgdp0DoDaSq5vuu8o=; b=IWVBENEbaZCU8DmVjYGnFlIiWMpwp1ySOmQiCcYiT/9Tzyj0brRx13PG8J4KsHErjI PuUVZ82Sogiks+NrmvG6d/BMbN8m9FsW9Eud3LlMeDrFhd1F0uNnWrzWA6pb5sLmjC9r jtmpw9I3NgTY3wSqeBHdNKL5GR6qKDaLoW6+uCC6AOensK7XO+AMxJ53A+DPh6HAEMg6 PCm4qA1gNXf/PeoQYLjFmzPAfN0plJVtF9wNLfqHOpJpykAsBjO8qKHH4F1kKVBrkFVy z3LiAfwTwl06ZKWwW+bJgLoeIUkHsk3TtnzDiiqMf3BJ7aQePGVBAccqRJgcxbqHnNgP DBqw== X-Gm-Message-State: ALoCoQntmrZqE1f3oSdeCRlaBN0J+esxFKd95Abq2vnNAWSR0Z2vNWqTU9S1exi57gnU0QC+gVm5 X-Received: by 10.68.239.69 with SMTP id vq5mr77231365pbc.111.1441850438643; Wed, 09 Sep 2015 19:00:38 -0700 (PDT) Received: from localhost.localdomain ([208.91.2.4]) by smtp.gmail.com with ESMTPSA id ga1sm8546864pbb.80.2015.09.09.19.00.37 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Wed, 09 Sep 2015 19:00:38 -0700 (PDT) X-CudaMail-Envelope-Sender: joestringer@nicira.com X-Barracuda-Apparent-Source-IP: 208.91.2.4 From: Joe Stringer To: dev@openvswitch.org X-CudaMail-Whitelist-To: dev@openvswitch.org X-CudaMail-MID: CM-E2-908106759 X-CudaMail-DTE: 090915 X-CudaMail-Originating-IP: 209.85.220.42 Date: Wed, 9 Sep 2015 19:00:19 -0700 X-ASG-Orig-Subj: [##CM-E2-908106759##][PATCH 3/8] Add support for connection tracking. Message-Id: <1441850422-33750-4-git-send-email-joestringer@nicira.com> X-Mailer: git-send-email 2.1.4 In-Reply-To: <1441850422-33750-1-git-send-email-joestringer@nicira.com> References: <1441850422-33750-1-git-send-email-joestringer@nicira.com> X-Barracuda-Connect: UNKNOWN[192.168.24.2] X-Barracuda-Start-Time: 1441850439 X-Barracuda-Encrypted: DHE-RSA-AES256-SHA X-Barracuda-URL: https://web.cudamail.com:443/cgi-mod/mark.cgi X-ASG-Whitelist: Header =?UTF-8?B?eFwtY3VkYW1haWxcLXdoaXRlbGlzdFwtdG8=?= X-Barracuda-BRTS-Status: 1 X-Virus-Scanned: by bsmtpd at cudamail.com Subject: [ovs-dev] [PATCH 3/8] Add support for connection tracking. X-BeenThere: dev@openvswitch.org X-Mailman-Version: 2.1.16 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: dev-bounces@openvswitch.org Sender: "dev" This patch adds a new action and fields to OVS that allow connection tracking to be performed. This support works in conjunction with the Linux kernel support merged into the Linux-4.3 development cycle. The new 'ct' action sends this flow through the connection tracker, which accepts the following parameters initally: - "commit": Connections that execute ct(commit) should be remembered, allowing future packets in the connection to be recognized as part of an "established" (est) connection, packets in the reply (rpl) direction, or related to an existing connection (rel). - "zone=[u16|NXM]": Perform connection tracking in the zone specified. Each zone is an independent connection tracking context. If the "commit" parameter is used, the connection will only be remembered in the specified zone, and not in other zones. This is 0 by default. - "table=NUMBER": This makes the conntrack action act like the "output" action, with a copy of the packet sent to the connection tracker. When the connection tracker has finished processing, it may return and resume pipeline processing in the specified table, with connection tracking fields populated (initially ct_state and ct_zone). This table is strongly recommended to be greater than the current table, to prevent loops. When the "table" option is used, the packet that continues processing in the specified table will have the ct_state populated. The ct_state may have any of the following flags set: - Tracked (trk): Connection tracking has occurred. - Reply (rpl): The flow is in the reply direction. - Invalid (inv): The connection tracker couldn't identify the connection. - New (new): This is the beginning of a new connection. - Established (est): This is part of an already existing connection. - Related (rel): This connection is related to an existing connection. For more information, consult the ovs-ofctl(8) man pages. Below is a simple example flow table to allow outbound TCP traffic from port 1 and drop traffic from port 2 that was not initiated by port 1: ovs-ofctl del-flows br0 ovs-ofctl add-flow br0 "arp,action=normal" ovs-ofctl add-flow br0 \ "in_port=1,conn_state=-trk,tcp,action=ct(commit,zone=9),2" ovs-ofctl add-flow br0 \ "in_port=2,conn_state=-trk,tcp,action=ct(table=1,zone=9)" ovs-ofctl add-flow br0 \ "table=1,in_port=2,conn_state=+trk+est-new,tcp,action=1" ovs-ofctl add-flow br0 \ "table=1,in_port=2,conn_state=+trk-est+new,tcp,action=drop" Based on original design by Justin Pettit, contributions from Thomas Graf and Daniele Di Proietto. Signed-off-by: Joe Stringer Acked-by: Jarno Rajahalme --- Changes since RFCv4: - OpenFlow and Netlink wire formats have been rearranged. - Zone may be specified from any NXM field. - NXM field values for ct_state, etc are different. - Extend the testsuite. - Rebase. --- NEWS | 3 + build-aux/extract-ofp-fields | 1 + datapath/flow_netlink.c | 2 +- datapath/linux/compat/include/linux/openvswitch.h | 36 ++ lib/dpif-netdev.c | 12 + lib/dpif.c | 1 + lib/flow.c | 42 ++ lib/flow.h | 6 +- lib/match.c | 55 +++ lib/match.h | 3 + lib/meta-flow.c | 77 ++++ lib/meta-flow.h | 46 +++ lib/nx-match.c | 10 + lib/odp-execute.c | 6 + lib/odp-util.c | 213 ++++++++++ lib/odp-util.h | 8 +- lib/ofp-actions.c | 260 ++++++++++++ lib/ofp-actions.h | 47 +++ lib/packets.c | 21 + lib/packets.h | 31 +- ofproto/ofproto-dpif-rid.c | 2 + ofproto/ofproto-dpif-rid.h | 1 + ofproto/ofproto-dpif-sflow.c | 3 + ofproto/ofproto-dpif-xlate.c | 90 ++++- ofproto/ofproto-dpif.c | 70 ++++ ofproto/ofproto-unixctl.man | 4 + tests/atlocal.in | 7 + tests/automake.mk | 1 + tests/dpif-netdev.at | 2 +- tests/odp.at | 12 + tests/ofproto-dpif.at | 4 +- tests/ofproto.at | 2 + tests/system-common-macros.at | 18 + tests/system-kmod-macros.at | 16 + tests/system-traffic.at | 469 ++++++++++++++++++++++ tests/system-userspace-macros.at | 9 + tests/test-l7.py | 72 ++++ tests/test-odp.c | 3 + utilities/ovs-ofctl.8.in | 85 ++++ 39 files changed, 1735 insertions(+), 15 deletions(-) create mode 100755 tests/test-l7.py diff --git a/NEWS b/NEWS index ca22c8e..45ae059 100644 --- a/NEWS +++ b/NEWS @@ -21,6 +21,9 @@ Post-v2.4.0 targets to run a new system testsuite. These tests can be run inside a Vagrant box. See INSTALL.md for details - Dropped support for GRE64 tunnel. + - Add support for connection tracking through the new "ct" action + and "ct_state"/"ct_zone" match fields. Only available on Linux kernels + with the connection tracking module loaded. v2.4.0 - 20 Aug 2015 diff --git a/build-aux/extract-ofp-fields b/build-aux/extract-ofp-fields index e0284f9..b0d9681 100755 --- a/build-aux/extract-ofp-fields +++ b/build-aux/extract-ofp-fields @@ -24,6 +24,7 @@ TYPES = {"u8": (1, False), FORMATTING = {"decimal": ("MFS_DECIMAL", 1, 8), "hexadecimal": ("MFS_HEXADECIMAL", 1, 127), + "conn state": ("MFS_CT_STATE", 1, 1), "Ethernet": ("MFS_ETHERNET", 6, 6), "IPv4": ("MFS_IPV4", 4, 4), "IPv6": ("MFS_IPV6", 16, 16), diff --git a/datapath/flow_netlink.c b/datapath/flow_netlink.c index 9115d15..8527fe0 100644 --- a/datapath/flow_netlink.c +++ b/datapath/flow_netlink.c @@ -281,7 +281,7 @@ size_t ovs_key_attr_size(void) /* Whenever adding new OVS_KEY_ FIELDS, we should consider * updating this function. */ - BUILD_BUG_ON(OVS_KEY_ATTR_TUNNEL_INFO != 22); + BUILD_BUG_ON(OVS_KEY_ATTR_TUNNEL_INFO != 24); return nla_total_size(4) /* OVS_KEY_ATTR_PRIORITY */ + nla_total_size(0) /* OVS_KEY_ATTR_TUNNEL */ diff --git a/datapath/linux/compat/include/linux/openvswitch.h b/datapath/linux/compat/include/linux/openvswitch.h index 578cd88..69bdf32 100644 --- a/datapath/linux/compat/include/linux/openvswitch.h +++ b/datapath/linux/compat/include/linux/openvswitch.h @@ -343,6 +343,8 @@ enum ovs_key_attr { OVS_KEY_ATTR_MPLS, /* array of struct ovs_key_mpls. * The implementation may restrict * the accepted length of the array. */ + OVS_KEY_ATTR_CT_STATE, /* u8 bitmask of OVS_CS_F_* */ + OVS_KEY_ATTR_CT_ZONE, /* u16 connection tracking zone. */ #ifdef __KERNEL__ /* Only used within kernel data path. */ @@ -456,6 +458,15 @@ struct ovs_key_nd { __u8 nd_tll[ETH_ALEN]; }; +/* OVS_KEY_ATTR_CT_STATE flags */ +#define OVS_CS_F_NEW 0x01 /* Beginning of a new connection. */ +#define OVS_CS_F_ESTABLISHED 0x02 /* Part of an existing connection. */ +#define OVS_CS_F_RELATED 0x04 /* Related to an established + * connection. */ +#define OVS_CS_F_INVALID 0x20 /* Could not track connection. */ +#define OVS_CS_F_REPLY_DIR 0x40 /* Flow is in the reply direction. */ +#define OVS_CS_F_TRACKED 0x80 /* Conntrack has occurred. */ + /** * enum ovs_flow_attr - attributes for %OVS_FLOW_* commands. * @OVS_FLOW_ATTR_KEY: Nested %OVS_KEY_ATTR_* attributes specifying the flow @@ -642,6 +653,28 @@ struct ovs_action_push_tnl { #endif /** + * enum ovs_ct_attr - Attributes for %OVS_ACTION_ATTR_CT action. + * @OVS_CT_ATTR_FLAGS: u32 connection tracking flags. + * @OVS_CT_ATTR_ZONE: u16 connection tracking zone. + */ +enum ovs_ct_attr { + OVS_CT_ATTR_UNSPEC, + OVS_CT_ATTR_FLAGS, /* u32 bitmask of OVS_CT_F_*. */ + OVS_CT_ATTR_ZONE, /* u16 zone id. */ + __OVS_CT_ATTR_MAX +}; + +#define OVS_CT_ATTR_MAX (__OVS_CT_ATTR_MAX - 1) + +/* + * OVS_CT_ATTR_FLAGS flags - bitmask of %OVS_CT_F_* + * @OVS_CT_F_COMMIT: Commits the flow to the conntrack table. This allows + * future packets for the same connection to be identified as 'established' + * or 'related'. + */ +#define OVS_CT_F_COMMIT 0x01 + +/** * enum ovs_action_attr - Action types. * * @OVS_ACTION_ATTR_OUTPUT: Output packet to port. @@ -672,6 +705,8 @@ struct ovs_action_push_tnl { * indicate the new packet contents. This could potentially still be * %ETH_P_MPLS if the resulting MPLS label stack is not empty. If there * is no MPLS label stack, as determined by ethertype, no action is taken. + * @OVS_ACTION_ATTR_CT: Track the connection. Populate the conntrack-related + * entries in the flow key. * * Only a single header can be set with a single %OVS_ACTION_ATTR_SET. Not all * fields within a header are modifiable, e.g. the IPv4 protocol and fragment @@ -702,6 +737,7 @@ enum ovs_action_attr { * data immediately followed by a mask. * The data must be zero for the unmasked * bits. */ + OVS_ACTION_ATTR_CT, /* One nested OVS_CT_ATTR_* . */ #ifndef __KERNEL__ OVS_ACTION_ATTR_TUNNEL_PUSH, /* struct ovs_action_push_tnl*/ diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index db76290..0fc5f8c 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -1920,6 +1920,11 @@ dpif_netdev_flow_from_nlattrs(const struct nlattr *key, uint32_t key_len, return EINVAL; } + /* Userspace datapath doesn't support conntrack. */ + if (flow->ct_state || flow->ct_zone) { + return EINVAL; + } + return 0; } @@ -3595,6 +3600,13 @@ dp_execute_cb(void *aux_, struct dp_packet **packets, int cnt, VLOG_WARN("Packet dropped. Max recirculation depth exceeded."); break; + case OVS_ACTION_ATTR_CT: + /* If a flow with this action is slow-pathed, datapath assistance is + * required to implement it. However, we don't support this action + * in the userspace datapath. */ + VLOG_WARN("Cannot execute conntrack action in userspace."); + break; + case OVS_ACTION_ATTR_PUSH_VLAN: case OVS_ACTION_ATTR_POP_VLAN: case OVS_ACTION_ATTR_PUSH_MPLS: diff --git a/lib/dpif.c b/lib/dpif.c index 9a67a24..c03aa1d 100644 --- a/lib/dpif.c +++ b/lib/dpif.c @@ -1097,6 +1097,7 @@ dpif_execute_helper_cb(void *aux_, struct dp_packet **packets, int cnt, ovs_assert(cnt == 1); switch ((enum ovs_action_attr)type) { + case OVS_ACTION_ATTR_CT: case OVS_ACTION_ATTR_OUTPUT: case OVS_ACTION_ATTR_TUNNEL_PUSH: case OVS_ACTION_ATTR_TUNNEL_POP: diff --git a/lib/flow.c b/lib/flow.c index 84048e8..032df14 100644 --- a/lib/flow.c +++ b/lib/flow.c @@ -199,6 +199,23 @@ BUILD_MESSAGE("FLOW_WC_SEQ changed: miniflow_extract() will have runtime " } \ } +#define miniflow_push_uint8_(MF, OFS, VALUE) \ +{ \ + MINIFLOW_ASSERT(MF.data < MF.end) \ + \ + if ((OFS) % 8 == 0) { \ + miniflow_set_map(MF, OFS / 8); \ + *(uint8_t *)MF.data = VALUE; \ + } else if ((OFS) % 8 < 7) { \ + miniflow_assert_in_map(MF, OFS / 8); \ + *((uint8_t *)MF.data + (OFS % 8)) = VALUE; \ + } else { \ + miniflow_assert_in_map(MF, OFS / 8); \ + *((uint8_t *)MF.data + 7) = VALUE; \ + MF.data++; \ + } \ +} + #define miniflow_pad_to_64_(MF, OFS) \ { \ MINIFLOW_ASSERT((OFS) % 8 != 0); \ @@ -262,6 +279,9 @@ BUILD_MESSAGE("FLOW_WC_SEQ changed: miniflow_extract() will have runtime " #define miniflow_push_be16(MF, FIELD, VALUE) \ miniflow_push_be16_(MF, offsetof(struct flow, FIELD), VALUE) +#define miniflow_push_uint8(MF, FIELD, VALUE) \ + miniflow_push_uint8_(MF, offsetof(struct flow, FIELD), VALUE) + #define miniflow_pad_to_64(MF, FIELD) \ miniflow_pad_to_64_(MF, offsetof(struct flow, FIELD)) @@ -482,6 +502,12 @@ miniflow_extract(struct dp_packet *packet, struct miniflow *dst) miniflow_pad_to_64(mf, conj_id); } + if (md->ct_state) { + miniflow_push_uint16(mf, ct_zone, md->ct_zone); + miniflow_push_uint8(mf, ct_state, md->ct_state); + miniflow_pad_to_64(mf, pad1); + } + /* Initialize packet's layer pointer and offsets. */ l2 = data; dp_packet_reset_offsets(packet); @@ -832,6 +858,12 @@ flow_get_metadata(const struct flow *flow, struct match *flow_metadata) } match_set_in_port(flow_metadata, flow->in_port.ofp_port); + if (flow->ct_state != 0) { + match_set_ct_state(flow_metadata, flow->ct_state); + } + if (flow->ct_zone != 0) { + match_set_ct_zone(flow_metadata, flow->ct_zone); + } } char * @@ -1107,6 +1139,12 @@ flow_format(struct ds *ds, const struct flow *flow) if (!flow->dp_hash) { WC_UNMASK_FIELD(wc, dp_hash); } + if (!flow->ct_state) { + WC_UNMASK_FIELD(wc, ct_state); + } + if (!flow->ct_zone) { + WC_UNMASK_FIELD(wc, ct_zone); + } for (int i = 0; i < FLOW_N_REGS; i++) { if (!flow->regs[i]) { WC_UNMASK_FIELD(wc, regs[i]); @@ -1181,6 +1219,8 @@ void flow_wildcards_init_for_packet(struct flow_wildcards *wc, WC_MASK_FIELD(wc, skb_priority); WC_MASK_FIELD(wc, pkt_mark); + WC_MASK_FIELD(wc, ct_state); + WC_MASK_FIELD(wc, ct_zone); WC_MASK_FIELD(wc, recirc_id); WC_MASK_FIELD(wc, dp_hash); WC_MASK_FIELD(wc, in_port); @@ -1284,6 +1324,8 @@ flow_wc_map(const struct flow *flow, struct flowmap *map) FLOWMAP_SET(map, dl_src); FLOWMAP_SET(map, dl_type); FLOWMAP_SET(map, vlan_tci); + FLOWMAP_SET(map, ct_state); + FLOWMAP_SET(map, ct_zone); /* Ethertype-dependent fields. */ if (OVS_LIKELY(flow->dl_type == htons(ETH_TYPE_IP))) { diff --git a/lib/flow.h b/lib/flow.h index d8632ff..2d6d3ee 100644 --- a/lib/flow.h +++ b/lib/flow.h @@ -103,8 +103,10 @@ struct flow { union flow_in_port in_port; /* Input port.*/ uint32_t recirc_id; /* Must be exact match. */ uint32_t conj_id; /* Conjunction ID. */ + uint16_t ct_zone; /* Connection Zone. */ + uint8_t ct_state; /* Connection state. */ + uint8_t pad1[3]; /* Pad to 64 bits. */ ofp_port_t actset_output; /* Output port in action set. */ - uint8_t pad1[6]; /* Pad to 64 bits. */ /* L2, Order the same as in the Ethernet header! (64-bit aligned) */ struct eth_addr dl_dst; /* Ethernet destination address. */ @@ -980,6 +982,8 @@ pkt_metadata_from_flow(struct pkt_metadata *md, const struct flow *flow) md->skb_priority = flow->skb_priority; md->pkt_mark = flow->pkt_mark; md->in_port = flow->in_port; + md->ct_state = flow->ct_state; + md->ct_zone = flow->ct_zone; } static inline bool is_ip_any(const struct flow *flow) diff --git a/lib/match.c b/lib/match.c index 9e465d8..87f940c 100644 --- a/lib/match.c +++ b/lib/match.c @@ -285,6 +285,26 @@ match_set_pkt_mark_masked(struct match *match, uint32_t pkt_mark, uint32_t mask) } void +match_set_ct_state(struct match *match, uint8_t ct_state) +{ + match_set_ct_state_masked(match, ct_state, UINT8_MAX); +} + +void +match_set_ct_state_masked(struct match *match, uint8_t ct_state, uint8_t mask) +{ + match->flow.ct_state = ct_state & mask; + match->wc.masks.ct_state = mask; +} + +void +match_set_ct_zone(struct match *match, uint16_t ct_zone) +{ + match->flow.ct_zone = ct_zone; + match->wc.masks.ct_zone = UINT16_MAX; +} + +void match_set_dl_type(struct match *match, ovs_be16 dl_type) { match->wc.masks.dl_type = OVS_BE16_MAX; @@ -816,6 +836,21 @@ format_ipv6_netmask(struct ds *s, const char *name, } static void +format_uint16_masked(struct ds *s, const char *name, + uint16_t value, uint16_t mask) +{ + if (mask != 0) { + ds_put_format(s, "%s=", name); + if (mask == UINT16_MAX) { + ds_put_format(s, "%"PRIu16, value); + } else { + ds_put_format(s, "0x%"PRIx16"/0x%"PRIx16, value, mask); + } + ds_put_char(s, ','); + } +} + +static void format_be16_masked(struct ds *s, const char *name, ovs_be16 value, ovs_be16 mask) { @@ -953,6 +988,26 @@ match_format(const struct match *match, struct ds *s, int priority) ds_put_char(s, ','); } + if (wc->masks.ct_state) { + if (wc->masks.ct_state == UINT8_MAX) { + ds_put_cstr(s, "ct_state="); + if (f->ct_state) { + format_flags(s, packet_ct_state_to_string, f->ct_state, + '|'); + } else { + ds_put_cstr(s, "0"); /* No state. */ + } + } else { + format_flags_masked(s, "ct_state", packet_ct_state_to_string, + f->ct_state, wc->masks.ct_state, UINT8_MAX); + } + ds_put_char(s, ','); + } + + if (wc->masks.ct_zone) { + format_uint16_masked(s, "ct_zone", f->ct_zone, wc->masks.ct_zone); + } + if (wc->masks.dl_type) { skip_type = true; if (f->dl_type == htons(ETH_TYPE_IP)) { diff --git a/lib/match.h b/lib/match.h index 3e133e5..f264551 100644 --- a/lib/match.h +++ b/lib/match.h @@ -83,6 +83,9 @@ void match_set_tun_gbp_flags(struct match *match, uint8_t flags); void match_set_in_port(struct match *, ofp_port_t ofp_port); void match_set_pkt_mark(struct match *, uint32_t pkt_mark); void match_set_pkt_mark_masked(struct match *, uint32_t pkt_mark, uint32_t mask); +void match_set_ct_state(struct match *, uint8_t ct_state); +void match_set_ct_state_masked(struct match *, uint8_t ct_state, uint8_t mask); +void match_set_ct_zone(struct match *, uint16_t ct_zone); void match_set_skb_priority(struct match *, uint32_t skb_priority); void match_set_dl_type(struct match *, ovs_be16); void match_set_dl_src(struct match *, const struct eth_addr ); diff --git a/lib/meta-flow.c b/lib/meta-flow.c index 224ba53..4a7ec7f 100644 --- a/lib/meta-flow.c +++ b/lib/meta-flow.c @@ -214,6 +214,10 @@ mf_is_all_wild(const struct mf_field *mf, const struct flow_wildcards *wc) return !wc->masks.skb_priority; case MFF_PKT_MARK: return !wc->masks.pkt_mark; + case MFF_CT_STATE: + return !wc->masks.ct_state; + case MFF_CT_ZONE: + return !wc->masks.ct_zone; CASE_MFF_REGS: return !wc->masks.regs[mf->id - MFF_REG0]; CASE_MFF_XREGS: @@ -497,6 +501,8 @@ mf_is_value_valid(const struct mf_field *mf, const union mf_value *value) case MFF_IN_PORT: case MFF_SKB_PRIORITY: case MFF_PKT_MARK: + case MFF_CT_STATE: + case MFF_CT_ZONE: CASE_MFF_REGS: CASE_MFF_XREGS: case MFF_ETH_SRC: @@ -644,6 +650,14 @@ mf_get_value(const struct mf_field *mf, const struct flow *flow, value->be32 = htonl(flow->pkt_mark); break; + case MFF_CT_STATE: + value->u8 = flow->ct_state; + break; + + case MFF_CT_ZONE: + value->be16 = htons(flow->ct_zone); + break; + CASE_MFF_REGS: value->be32 = htonl(flow->regs[mf->id - MFF_REG0]); break; @@ -876,6 +890,14 @@ mf_set_value(const struct mf_field *mf, match_set_pkt_mark(match, ntohl(value->be32)); break; + case MFF_CT_STATE: + match_set_ct_state(match, value->u8); + break; + + case MFF_CT_ZONE: + match_set_ct_zone(match, ntohs(value->be16)); + break; + CASE_MFF_REGS: match_set_reg(match, mf->id - MFF_REG0, ntohl(value->be32)); break; @@ -1160,6 +1182,14 @@ mf_set_flow_value(const struct mf_field *mf, flow->pkt_mark = ntohl(value->be32); break; + case MFF_CT_STATE: + flow->ct_state = value->u8; + break; + + case MFF_CT_ZONE: + flow->ct_zone = ntohs(value->be16); + break; + CASE_MFF_REGS: flow->regs[mf->id - MFF_REG0] = ntohl(value->be32); break; @@ -1449,6 +1479,16 @@ mf_set_wild(const struct mf_field *mf, struct match *match, char **err_str) match->wc.masks.pkt_mark = 0; break; + case MFF_CT_STATE: + match->flow.ct_state = 0; + match->wc.masks.ct_state = 0; + break; + + case MFF_CT_ZONE: + match->flow.ct_zone = 0; + match->wc.masks.ct_zone = 0; + break; + CASE_MFF_REGS: match_set_reg_masked(match, mf->id - MFF_REG0, 0, 0); break; @@ -1636,6 +1676,7 @@ mf_set(const struct mf_field *mf, } switch (mf->id) { + case MFF_CT_ZONE: case MFF_RECIRC_ID: case MFF_CONJ_ID: case MFF_IN_PORT: @@ -1711,6 +1752,10 @@ mf_set(const struct mf_field *mf, ntohl(mask->be32)); break; + case MFF_CT_STATE: + match_set_ct_state_masked(match, value->u8, mask->u8); + break; + case MFF_ETH_DST: match_set_dl_dst_masked(match, value->mac, mask->mac); break; @@ -2103,6 +2148,22 @@ mf_from_tun_flags_string(const char *s, ovs_be16 *flagsp, ovs_be16 *maskp) htons(FLOW_TNL_PUB_F_MASK), maskp); } +static char * +mf_from_ct_state_string(const char *s, uint8_t *flagsp, uint8_t *maskp) +{ + ovs_be16 flags, mask; + char *error; + + error = parse_mf_flags(s, packet_ct_state_to_string, "ct_state", &flags, + htons(CS_SUPPORTED_MASK), &mask); + if (!error) { + *flagsp = ntohs(flags); + *maskp = ntohs(mask); + } + + return error; +} + /* Parses 's', a string value for field 'mf', into 'value' and 'mask'. Returns * NULL if successful, otherwise a malloc()'d string describing the error. */ char * @@ -2124,6 +2185,11 @@ mf_parse(const struct mf_field *mf, const char *s, (uint8_t *) value, (uint8_t *) mask); break; + case MFS_CT_STATE: + ovs_assert(mf->n_bytes == sizeof(uint8_t)); + error = mf_from_ct_state_string(s, &value->u8, &mask->u8); + break; + case MFS_ETHERNET: error = mf_from_ethernet_string(mf, s, &value->mac, &mask->mac); break; @@ -2244,6 +2310,13 @@ mf_format_tcp_flags_string(ovs_be16 value, ovs_be16 mask, struct ds *s) TCP_FLAGS(mask), TCP_FLAGS(OVS_BE16_MAX)); } +static void +mf_format_ct_state_string(uint8_t value, uint8_t mask, struct ds *s) +{ + format_flags_masked(s, NULL, packet_ct_state_to_string, value, mask, + UINT8_MAX); +} + /* Appends to 's' a string representation of field 'mf' whose value is in * 'value' and 'mask'. 'mask' may be NULL to indicate an exact match. */ void @@ -2280,6 +2353,10 @@ mf_format(const struct mf_field *mf, mf_format_integer_string(mf, (uint8_t *) value, (uint8_t *) mask, s); break; + case MFS_CT_STATE: + mf_format_ct_state_string(value->u8, mask ? mask->u8 : UINT8_MAX, s); + break; + case MFS_ETHERNET: eth_format_masked(value->mac, mask ? &mask->mac : NULL, s); break; diff --git a/lib/meta-flow.h b/lib/meta-flow.h index 02272ef..44a29e9 100644 --- a/lib/meta-flow.h +++ b/lib/meta-flow.h @@ -703,6 +703,51 @@ enum OVS_PACKED_ENUM mf_field_id { */ MFF_PKT_MARK, + /* "ct_state". + * + * Connection tracking state. The field is populated by the NXAST_CT + * action. The following bit values describe the state of the connection: + * + * - New (0x01): This is the beginning of a new connection. + * - Established (0x02): This is part of an already existing connection. + * - Related (0x04): This is a separate connection that is related to an + * existing connection. + * - Invalid (0x20): This flow could not be associated with a connection. + * This could be set for a variety of reasons, + * including (but not limited to): + * - L3/L4 protocol handler is not loaded/unavailable. + * - L3/L4 protocol handler determines that the packet + * is malformed or invalid for the current FSM stage. + * - Packets are unexpected length for protocol. + * - Reply (0x40): This flow is in the reply direction, ie it did not + * initiate the connection. + * - Tracked (0x80): Connection tracking has occurred. + * + * Type: u8. + * Maskable: bitwise. + * Formatting: conn state. + * Prerequisites: none. + * Access: read-only. + * NXM: NXM_NX_CT_STATE(105) since v2.5. + * OXM: none. + */ + MFF_CT_STATE, + + /* "ct_zone". + * + * Connection tracking zone. The field is populated by the + * NXAST_CT action. + * + * Type: be16. + * Maskable: no. + * Formatting: hexadecimal. + * Prerequisites: none. + * Access: read-only. + * NXM: NXM_NX_CT_ZONE(106) since v2.5. + * OXM: none. + */ + MFF_CT_ZONE, + #if FLOW_N_REGS == 8 /* "reg". * @@ -1679,6 +1724,7 @@ enum OVS_PACKED_ENUM mf_string { MFS_HEXADECIMAL, /* Other formats. */ + MFS_CT_STATE, /* Connection tracking state */ MFS_ETHERNET, MFS_IPV4, MFS_IPV6, diff --git a/lib/nx-match.c b/lib/nx-match.c index eef2c54..4268b42 100644 --- a/lib/nx-match.c +++ b/lib/nx-match.c @@ -1036,6 +1036,16 @@ nx_put_raw(struct ofpbuf *b, enum ofp_version oxm, const struct match *match, nxm_put_32m(b, MFF_PKT_MARK, oxm, htonl(flow->pkt_mark), htonl(match->wc.masks.pkt_mark)); + /* Connection tracking. */ + if (match->wc.masks.ct_state) { + nxm_put_8m(b, MFF_CT_STATE, oxm, flow->ct_state, + match->wc.masks.ct_state); + } + if (match->wc.masks.ct_zone) { + nxm_put_16m(b, MFF_CT_ZONE, oxm, htons(flow->ct_zone), + htons(match->wc.masks.ct_zone)); + } + /* OpenFlow 1.1+ Metadata. */ nxm_put_64m(b, MFF_METADATA, oxm, flow->metadata, match->wc.masks.metadata); diff --git a/lib/odp-execute.c b/lib/odp-execute.c index 54a43cd..575631c 100644 --- a/lib/odp-execute.c +++ b/lib/odp-execute.c @@ -326,6 +326,8 @@ odp_execute_set_action(struct dp_packet *packet, const struct nlattr *a) case OVS_KEY_ATTR_ICMP: case OVS_KEY_ATTR_ICMPV6: case OVS_KEY_ATTR_TCP_FLAGS: + case OVS_KEY_ATTR_CT_STATE: + case OVS_KEY_ATTR_CT_ZONE: case __OVS_KEY_ATTR_MAX: default: OVS_NOT_REACHED(); @@ -414,6 +416,8 @@ odp_execute_masked_set_action(struct dp_packet *packet, case OVS_KEY_ATTR_TUNNEL: /* Masked data not supported for tunnel. */ case OVS_KEY_ATTR_UNSPEC: + case OVS_KEY_ATTR_CT_STATE: + case OVS_KEY_ATTR_CT_ZONE: case OVS_KEY_ATTR_ENCAP: case OVS_KEY_ATTR_ETHERTYPE: case OVS_KEY_ATTR_IN_PORT: @@ -476,6 +480,7 @@ requires_datapath_assistance(const struct nlattr *a) case OVS_ACTION_ATTR_TUNNEL_POP: case OVS_ACTION_ATTR_USERSPACE: case OVS_ACTION_ATTR_RECIRC: + case OVS_ACTION_ATTR_CT: return true; case OVS_ACTION_ATTR_SET: @@ -611,6 +616,7 @@ odp_execute_actions(void *dp, struct dp_packet **packets, int cnt, bool steal, case OVS_ACTION_ATTR_TUNNEL_POP: case OVS_ACTION_ATTR_USERSPACE: case OVS_ACTION_ATTR_RECIRC: + case OVS_ACTION_ATTR_CT: case OVS_ACTION_ATTR_UNSPEC: case __OVS_ACTION_ATTR_MAX: OVS_NOT_REACHED(); diff --git a/lib/odp-util.c b/lib/odp-util.c index d7696c9..1ce6258 100644 --- a/lib/odp-util.c +++ b/lib/odp-util.c @@ -113,6 +113,7 @@ odp_action_len(uint16_t type) case OVS_ACTION_ATTR_SET: return ATTR_LEN_VARIABLE; case OVS_ACTION_ATTR_SET_MASKED: return ATTR_LEN_VARIABLE; case OVS_ACTION_ATTR_SAMPLE: return ATTR_LEN_VARIABLE; + case OVS_ACTION_ATTR_CT: return ATTR_LEN_VARIABLE; case OVS_ACTION_ATTR_UNSPEC: case __OVS_ACTION_ATTR_MAX: @@ -134,6 +135,8 @@ ovs_key_attr_to_string(enum ovs_key_attr attr, char *namebuf, size_t bufsize) case OVS_KEY_ATTR_ENCAP: return "encap"; case OVS_KEY_ATTR_PRIORITY: return "skb_priority"; case OVS_KEY_ATTR_SKB_MARK: return "skb_mark"; + case OVS_KEY_ATTR_CT_STATE: return "ct_state"; + case OVS_KEY_ATTR_CT_ZONE: return "ct_zone"; case OVS_KEY_ATTR_TUNNEL: return "tunnel"; case OVS_KEY_ATTR_IN_PORT: return "in_port"; case OVS_KEY_ATTR_ETHERNET: return "eth"; @@ -532,6 +535,44 @@ format_odp_tnl_push_action(struct ds *ds, const struct nlattr *attr) ds_put_format(ds, ",out_port(%"PRIu32"))", data->out_port); } +static const struct nl_policy ovs_conntrack_policy[] = { + [OVS_CT_ATTR_FLAGS] = { .type = NL_A_U32, .optional = true, + .min_len = sizeof(uint32_t) }, + [OVS_CT_ATTR_ZONE] = { .type = NL_A_U16, .optional = true, + .min_len = sizeof(uint16_t)}, +}; + +static void +format_odp_conntrack_action(struct ds *ds, const struct nlattr *attr) +{ + struct nlattr *a[ARRAY_SIZE(ovs_conntrack_policy)]; + uint32_t flags; + uint16_t zone; + + if (!nl_parse_nested(attr, ovs_conntrack_policy, a, ARRAY_SIZE(a))) { + ds_put_cstr(ds, "ct(error)"); + return; + } + + flags = a[OVS_CT_ATTR_FLAGS] ? nl_attr_get_u32(a[OVS_CT_ATTR_FLAGS]) : 0; + zone = a[OVS_CT_ATTR_ZONE] ? nl_attr_get_u16(a[OVS_CT_ATTR_ZONE]) : 0; + + ds_put_format(ds, "ct"); + if (flags || zone) { + ds_put_cstr(ds, "("); + if (flags & OVS_CT_F_COMMIT) { + ds_put_format(ds, "commit"); + } + if (zone) { + if (ds_last(ds) != '(') { + ds_put_char(ds, ','); + } + ds_put_format(ds, "zone=%"PRIu16, zone); + } + ds_put_cstr(ds, ")"); + } +} + static void format_odp_action(struct ds *ds, const struct nlattr *a) { @@ -622,6 +663,9 @@ format_odp_action(struct ds *ds, const struct nlattr *a) case OVS_ACTION_ATTR_SAMPLE: format_odp_sample_action(ds, a); break; + case OVS_ACTION_ATTR_CT: + format_odp_conntrack_action(ds, a); + break; case OVS_ACTION_ATTR_UNSPEC: case __OVS_ACTION_ATTR_MAX: default: @@ -960,6 +1004,59 @@ ovs_parse_tnl_push(const char *s, struct ovs_action_push_tnl *data) } static int +parse_conntrack_action(const char *s_, struct ofpbuf *actions) +{ + const char *s = s_; + + if (ovs_scan(s, "ct")) { + uint32_t flags = 0; + uint16_t zone = 0; + size_t start; + char *end; + + s += 2; + if (ovs_scan(s, "(")) { + s++; + end = strchr(s, ')'); + if (!end) { + return -EINVAL; + } + + while (s != end) { + int n = -1; + + s += strspn(s, delimiters); + if (ovs_scan(s, "commit%n", &n)) { + flags |= OVS_CT_F_COMMIT; + s += n; + continue; + } + if (ovs_scan(s, "zone=%"SCNu16"%n", &zone, &n)) { + s += n; + continue; + } + + if (n < 0) { + return -EINVAL; + } + } + s++; + } + + start = nl_msg_start_nested(actions, OVS_ACTION_ATTR_CT); + if (flags) { + nl_msg_put_u32(actions, OVS_CT_ATTR_FLAGS, flags); + } + if (zone) { + nl_msg_put_u16(actions, OVS_CT_ATTR_ZONE, zone); + } + nl_msg_end_nested(actions, start); + } + + return s - s_; +} + +static int parse_odp_action(const char *s, const struct simap *port_names, struct ofpbuf *actions) { @@ -1117,6 +1214,15 @@ parse_odp_action(const char *s, const struct simap *port_names, } { + int retval; + + retval = parse_conntrack_action(s, actions); + if (retval) { + return retval; + } + } + + { struct ovs_action_push_tnl data; int n; @@ -1211,6 +1317,8 @@ static const struct attr_len_tbl ovs_flow_key_attr_lens[OVS_KEY_ATTR_MAX + 1] = [OVS_KEY_ATTR_ICMPV6] = { .len = sizeof(struct ovs_key_icmpv6) }, [OVS_KEY_ATTR_ARP] = { .len = sizeof(struct ovs_key_arp) }, [OVS_KEY_ATTR_ND] = { .len = sizeof(struct ovs_key_nd) }, + [OVS_KEY_ATTR_CT_STATE] = { .len = 1 }, + [OVS_KEY_ATTR_CT_ZONE] = { .len = 2 }, }; /* Returns the correct length of the payload for a flow key attribute of the @@ -2016,6 +2124,21 @@ format_frag(struct ds *ds, const char *name, uint8_t key, } } +static bool +mask_empty(const struct nlattr *ma) +{ + const void *mask; + size_t n; + + if (!ma) { + return true; + } + mask = nl_attr_get(ma); + n = nl_attr_get_size(ma); + + return is_all_zeros(mask, n); +} + static void format_odp_key_attr(const struct nlattr *a, const struct nlattr *ma, const struct hmap *portno_names, struct ds *ds, @@ -2057,6 +2180,24 @@ format_odp_key_attr(const struct nlattr *a, const struct nlattr *ma, } break; + case OVS_KEY_ATTR_CT_STATE: + if (!is_exact) { + format_flags_masked(ds, NULL, packet_ct_state_to_string, + nl_attr_get_u8(a), nl_attr_get_u8(ma), + UINT8_MAX); + } else { + format_flags(ds, packet_ct_state_to_string, + nl_attr_get_u8(a), ','); + } + break; + + case OVS_KEY_ATTR_CT_ZONE: + if (verbose || !mask_empty(ma)) { + ds_put_format(ds, "%"PRIx16, nl_attr_get_u16(a)); + } + break; + + case OVS_KEY_ATTR_TUNNEL: format_odp_tun_attr(a, ma, ds, verbose); break; @@ -2503,6 +2644,26 @@ scan_u8(const char *s, uint8_t *key, uint8_t *mask) } static int +scan_u16(const char *s, uint16_t *key, uint16_t *mask) +{ + int n; + + if (ovs_scan(s, "%"SCNi16"%n", key, &n)) { + int len = n; + + if (mask) { + if (ovs_scan(s + len, "/%"SCNi16"%n", mask, &n)) { + len += n; + } else { + *mask = UINT16_MAX; + } + } + return len; + } + return 0; +} + +static int scan_u32(const char *s, uint32_t *key, uint32_t *mask) { int n; @@ -2605,6 +2766,25 @@ scan_tcp_flags(const char *s, ovs_be16 *key, ovs_be16 *mask) } static int +scan_ct_state(const char *s, uint8_t *key, uint8_t *mask) +{ + uint32_t flags, fmask; + int n; + + n = parse_flags(s, packet_ct_state_to_string, ')', NULL, NULL, &flags, + CS_SUPPORTED_MASK, mask ? &fmask : NULL); + + if (n >= 0) { + *key = flags; + if (mask) { + *mask = fmask; + } + return n; + } + return 0; +} + +static int scan_frag(const char *s, uint8_t *key, uint8_t *mask) { int n; @@ -3127,6 +3307,9 @@ parse_odp_key_mask_attr(const char *s, const struct simap *port_names, OVS_KEY_ATTR_RECIRC_ID); SCAN_SINGLE("dp_hash(", uint32_t, u32, OVS_KEY_ATTR_DP_HASH); + SCAN_SINGLE("ct_state(", uint8_t, ct_state, OVS_KEY_ATTR_CT_STATE); + SCAN_SINGLE("ct_zone(", uint16_t, u16, OVS_KEY_ATTR_CT_ZONE); + SCAN_BEGIN_NESTED("tunnel(", OVS_KEY_ATTR_TUNNEL) { SCAN_FIELD_NESTED("tun_id=", ovs_be64, be64, OVS_TUNNEL_KEY_ATTR_ID); SCAN_FIELD_NESTED("src=", ovs_be32, ipv4, OVS_TUNNEL_KEY_ATTR_IPV4_SRC); @@ -3361,6 +3544,12 @@ odp_flow_key_from_flow__(const struct odp_flow_key_parms *parms, nl_msg_put_u32(buf, OVS_KEY_ATTR_SKB_MARK, data->pkt_mark); + if (parms->support.ct_state) { + nl_msg_put_u8(buf, OVS_KEY_ATTR_CT_STATE, data->ct_state); + } + if (parms->support.ct_zone) { + nl_msg_put_u16(buf, OVS_KEY_ATTR_CT_ZONE, data->ct_zone); + } if (parms->support.recirc) { nl_msg_put_u32(buf, OVS_KEY_ATTR_RECIRC_ID, data->recirc_id); nl_msg_put_u32(buf, OVS_KEY_ATTR_DP_HASH, data->dp_hash); @@ -3543,6 +3732,13 @@ odp_key_from_pkt_metadata(struct ofpbuf *buf, const struct pkt_metadata *md) nl_msg_put_u32(buf, OVS_KEY_ATTR_SKB_MARK, md->pkt_mark); + if (md->ct_state) { + nl_msg_put_u8(buf, OVS_KEY_ATTR_CT_STATE, md->ct_state); + if (md->ct_zone) { + nl_msg_put_u16(buf, OVS_KEY_ATTR_CT_ZONE, md->ct_zone); + } + } + /* Add an ingress port attribute if 'odp_in_port' is not the magical * value "ODPP_NONE". */ if (md->in_port.odp_port != ODPP_NONE) { @@ -3590,6 +3786,14 @@ odp_key_to_pkt_metadata(const struct nlattr *key, size_t key_len, md->pkt_mark = nl_attr_get_u32(nla); wanted_attrs &= ~(1u << OVS_KEY_ATTR_SKB_MARK); break; + case OVS_KEY_ATTR_CT_STATE: + md->ct_state = nl_attr_get_u8(nla); + wanted_attrs &= ~(1u << OVS_KEY_ATTR_CT_STATE); + break; + case OVS_KEY_ATTR_CT_ZONE: + md->ct_zone = nl_attr_get_u16(nla); + wanted_attrs &= ~(1u << OVS_KEY_ATTR_CT_ZONE); + break; case OVS_KEY_ATTR_TUNNEL: { enum odp_key_fitness res; @@ -4144,6 +4348,15 @@ odp_flow_key_to_flow__(const struct nlattr *key, size_t key_len, expected_attrs |= UINT64_C(1) << OVS_KEY_ATTR_SKB_MARK; } + if (present_attrs & (UINT64_C(1) << OVS_KEY_ATTR_CT_STATE)) { + flow->ct_state = nl_attr_get_u8(attrs[OVS_KEY_ATTR_CT_STATE]); + expected_attrs |= UINT64_C(1) << OVS_KEY_ATTR_CT_STATE; + } + if (present_attrs & (UINT64_C(1) << OVS_KEY_ATTR_CT_ZONE)) { + flow->ct_zone = nl_attr_get_u16(attrs[OVS_KEY_ATTR_CT_ZONE]); + expected_attrs |= UINT64_C(1) << OVS_KEY_ATTR_CT_ZONE; + } + if (present_attrs & (UINT64_C(1) << OVS_KEY_ATTR_TUNNEL)) { enum odp_key_fitness res; diff --git a/lib/odp-util.h b/lib/odp-util.h index bc27794..5822777 100644 --- a/lib/odp-util.h +++ b/lib/odp-util.h @@ -120,6 +120,8 @@ void odp_portno_names_destroy(struct hmap *portno_names); * OVS_KEY_ATTR_SKB_MARK 4 -- 4 8 * OVS_KEY_ATTR_DP_HASH 4 -- 4 8 * OVS_KEY_ATTR_RECIRC_ID 4 -- 4 8 + * OVS_KEY_ATTR_CONN_STATE 1 3 4 8 + * OVS_KEY_ATTR_CONN_ZONE 2 2 4 8 * OVS_KEY_ATTR_ETHERNET 12 -- 4 16 * OVS_KEY_ATTR_ETHERTYPE 2 2 4 8 (outer VLAN ethertype) * OVS_KEY_ATTR_VLAN 2 2 4 8 @@ -129,7 +131,7 @@ void odp_portno_names_destroy(struct hmap *portno_names); * OVS_KEY_ATTR_ICMPV6 2 2 4 8 * OVS_KEY_ATTR_ND 28 -- 4 32 * ---------------------------------------------------------- - * total 488 + * total 504 * * We include some slack space in case the calculation isn't quite right or we * add another field and forget to adjust this value. @@ -166,6 +168,10 @@ struct odp_support { /* If this is true, then recirculation fields will always be serialised. */ bool recirc; + + /* If true, serialise the corresponding OVS_KEY_ATTR_CONN_* field. */ + bool ct_state; + bool ct_zone; }; struct odp_flow_key_parms { diff --git a/lib/ofp-actions.c b/lib/ofp-actions.c index c1e46b5..6ea421f 100644 --- a/lib/ofp-actions.c +++ b/lib/ofp-actions.c @@ -286,6 +286,9 @@ enum ofp_raw_action_type { /* NX1.0+(34): struct nx_action_conjunction. */ NXAST_RAW_CONJUNCTION, + /* NX1.0+(35): struct nx_action_conntrack, ... */ + NXAST_RAW_CT, + /* ## ------------------ ## */ /* ## Debugging actions. ## */ /* ## ------------------ ## */ @@ -346,6 +349,10 @@ static void *ofpact_put_raw(struct ofpbuf *, enum ofp_version, static char *OVS_WARN_UNUSED_RESULT ofpacts_parse( char *str, struct ofpbuf *ofpacts, enum ofputil_protocol *usable_protocols, bool allow_instructions, enum ofpact_type outer_action); +static enum ofperr ofpacts_pull_openflow_actions__( + struct ofpbuf *openflow, unsigned int actions_len, + enum ofp_version version, uint32_t allowed_ovsinsts, + struct ofpbuf *ofpacts, enum ofpact_type outer_action); /* Pull off existing actions or instructions. Used by nesting actions to keep * ofpacts_parse() oblivious of actions nesting. */ @@ -4448,6 +4455,245 @@ format_DEBUG_RECIRC(const struct ofpact_null *a OVS_UNUSED, struct ds *s) { ds_put_cstr(s, "debug_recirc"); } + +/* Action structure for NXAST_CT. + * + * Pass traffic to the connection tracker. + * + * The "zone" specifies a context within which the tracking is done: + * + * If 'zone_src' is nonzero, this specifies that the zone should be + * sourced from a field zone_src[ofs:ofs+nbits]. The format and semantics + * of 'zone_src' and 'zone_ofs_nbits' are similar to those for the + * NXAST_REG_LOAD action. The acceptable nxm_header values for 'zone_src' + * are the same as the acceptable nxm_header values for the 'src' field of + * NXAST_REG_MOVE. + * + * If 'zone_src' is zero, then the value of 'zone_imm' will be used as the + * connection tracking zone. + * + * If "recirc_table" has a value other than NX_CT_RECIRC_NONE, then this field + * specifies that the packet should be logically cloned after the packet is + * sent through the connection tracker, and pipeline processing for the cloned + * packet will continue in the OpenFlow table specified by this value. The + * NXM_NX_CT_* fields will be populated for the subsequent processing. It is + * strongly recommended that this table is later than the current table, to + * prevent loops. + * + * A standard "resubmit" action is not sufficient to populate NXM_NX_CT_* + * fields, since connection tracking occurs outside of the classifier. + */ +struct nx_action_conntrack { + ovs_be16 type; /* OFPAT_VENDOR. */ + ovs_be16 len; /* At least 24. */ + ovs_be32 vendor; /* NX_VENDOR_ID. */ + ovs_be16 subtype; /* NXAST_CT. */ + ovs_be16 flags; /* Zero or more NX_CT_F_* flags. + * Unspecified flag bits must be zero. */ + ovs_be32 zone_src; /* Connection tracking context. */ + union { + ovs_be16 zone_ofs_nbits;/* Range to use from source field. */ + ovs_be16 zone_imm; /* Immediate value for zone. */ + }; + uint8_t recirc_table; /* Recirculate to a specific table, or + NX_CT_RECIRC_NONE for no recirculation. */ + uint8_t pad[3]; /* Zeroes */ + ovs_be16 alg; /* Well-known port number for the protocol. + * 0 indicates no ALG is required. */ + /* Followed by a sequence of ofpact elements, until the end of the action + * is reached. */ +}; +OFP_ASSERT(sizeof(struct nx_action_conntrack) == 24); + +static enum ofperr +decode_ct_zone(const struct nx_action_conntrack *nac, + struct ofpact_conntrack *out) +{ + if (nac->zone_src) { + enum ofperr error; + + out->zone_src.field = mf_from_nxm_header(ntohl(nac->zone_src)); + out->zone_src.ofs = nxm_decode_ofs(nac->zone_ofs_nbits); + out->zone_src.n_bits = nxm_decode_n_bits(nac->zone_ofs_nbits); + error = mf_check_src(&out->zone_src, NULL); + if (error) { + return error; + } + + if (out->zone_src.n_bits != 16) { + VLOG_WARN_RL(&rl, "zone n_bits %d not within valid range [16..16]", + out->zone_src.n_bits); + return OFPERR_OFPBAC_BAD_SET_LEN; + } + } else { + out->zone_src.field = NULL; + out->zone_imm = ntohs(nac->zone_imm); + } + + return 0; +} + +static enum ofperr +decode_NXAST_RAW_CT(const struct nx_action_conntrack *nac, struct ofpbuf *out) +{ + const size_t ct_offset = ofpacts_pull(out); + struct ofpact_conntrack *conntrack; + struct ofpbuf openflow; + int error = 0; + + conntrack = ofpact_put_CT(out); + conntrack->flags = ntohs(nac->flags); + error = decode_ct_zone(nac, conntrack); + if (error) { + goto out; + } + conntrack->recirc_table = nac->recirc_table; + conntrack->alg = ntohs(nac->alg); + + ofpbuf_pull(out, sizeof(*conntrack)); + + /* XXX: version */ + ofpbuf_use_const(&openflow, nac + 1, ntohs(nac->len) - sizeof(*nac)); + error = ofpacts_pull_openflow_actions__(&openflow, openflow.size, + OFP10_VERSION, + 1u << OVSINST_OFPIT11_APPLY_ACTIONS, + out, OFPACT_CT); + if (error) { + goto out; + } + + conntrack = ofpbuf_push_uninit(out, sizeof(*conntrack)); + out->header = &conntrack->ofpact; + ofpact_update_len(out, &conntrack->ofpact); + +out: + ofpbuf_push_uninit(out, ct_offset); + return error; +} + +static void +encode_CT(const struct ofpact_conntrack *conntrack, + enum ofp_version ofp_version, struct ofpbuf *out) +{ + struct nx_action_conntrack *nac; + const size_t ofs = out->size; + size_t len; + + nac = put_NXAST_CT(out); + nac->flags = htons(conntrack->flags); + if (conntrack->zone_src.field) { + nac->zone_src = htonl(mf_nxm_header(conntrack->zone_src.field->id)); + nac->zone_ofs_nbits = nxm_encode_ofs_nbits(conntrack->zone_src.ofs, + conntrack->zone_src.n_bits); + } else { + nac->zone_src = htonl(0); + nac->zone_imm = htons(conntrack->zone_imm); + } + nac->recirc_table = conntrack->recirc_table; + nac->alg = htons(conntrack->alg); + + len = ofpacts_put_openflow_actions(conntrack->actions, + ofpact_ct_get_action_len(conntrack), + out, ofp_version); + len += sizeof(*nac); + nac = ofpbuf_at(out, ofs, sizeof(*nac)); + nac->len = htons(len); +} + +/* Parses 'arg' as the argument to a "ct" action, and appends such an + * action to 'ofpacts'. + * + * Returns NULL if successful, otherwise a malloc()'d string describing the + * error. The caller is responsible for freeing the returned string. */ +static char * OVS_WARN_UNUSED_RESULT +parse_CT(char *arg, struct ofpbuf *ofpacts, + enum ofputil_protocol *usable_protocols) +{ + const size_t ct_offset = ofpacts_pull(ofpacts); + struct ofpact_conntrack *oc; + char *error = NULL; + char *key, *value; + + oc = ofpact_put_CT(ofpacts); + oc->flags = 0; + oc->recirc_table = NX_CT_RECIRC_NONE; + while (ofputil_parse_key_value(&arg, &key, &value)) { + if (!strcmp(key, "commit")) { + oc->flags |= NX_CT_F_COMMIT; + } else if (!strcmp(key, "table")) { + error = str_to_u8(value, "recirc_table", &oc->recirc_table); + } else if (!strcmp(key, "zone")) { + error = str_to_u16(value, "zone", &oc->zone_imm); + + if (error) { + free(error); + error = mf_parse_subfield(&oc->zone_src, value); + if (error) { + return error; + } + } + } else if (!strcmp(key, "exec")) { + /* Hide existing actions from ofpacts_parse_actions(), so the + * nesting can be handled transparently. */ + ofpbuf_pull(ofpacts, sizeof(*oc)); + error = ofpacts_parse_actions(value, ofpacts, usable_protocols, + OFPACT_CT); + + /* Put the CT action back on and update its length. */ + oc = ofpbuf_push_uninit(ofpacts, sizeof(*oc)); + oc->ofpact.len = ofpacts->size; + } else { + error = xasprintf("invalid argument to \"ct\" action: `%s'", key); + } + if (error) { + break; + } + } + + ofpbuf_push_uninit(ofpacts, ct_offset); + return error; +} + +static void +append_comma(struct ds *s, bool *first) +{ + if (*first) { + *first = false; + } else { + ds_put_char(s, ','); + } +} + +static void +format_CT(const struct ofpact_conntrack *a, struct ds *s) +{ + bool first = true; + + ds_put_cstr(s, "ct("); + if (a->flags & NX_CT_F_COMMIT) { + append_comma(s, &first); + ds_put_cstr(s, "commit"); + } + if (a->recirc_table != NX_CT_RECIRC_NONE) { + append_comma(s, &first); + ds_put_format(s, "table=%"PRIu8, a->recirc_table); + } + if (a->zone_src.field) { + append_comma(s, &first); + ds_put_format(s, "zone="); + mf_format_subfield(&a->zone_src, s); + } else if (a->zone_imm) { + append_comma(s, &first); + ds_put_format(s, "zone=%"PRIu16, a->zone_imm); + } + if (ofpact_ct_get_action_len(a)) { + append_comma(s, &first); + ds_put_cstr(s, "exec("); + ofpacts_format(a->actions, ofpact_ct_get_action_len(a), s); + ds_put_char(s, ')'); + } + ds_put_char(s, ')'); +} /* Meter instruction. */ @@ -4827,6 +5073,7 @@ ofpact_is_set_or_move_action(const struct ofpact *a) return true; case OFPACT_BUNDLE: case OFPACT_CLEAR_ACTIONS: + case OFPACT_CT: case OFPACT_CONTROLLER: case OFPACT_DEC_MPLS_TTL: case OFPACT_DEC_TTL: @@ -4901,6 +5148,7 @@ ofpact_is_allowed_in_actions_set(const struct ofpact *a) * in the action set is undefined. */ case OFPACT_BUNDLE: case OFPACT_CONTROLLER: + case OFPACT_CT: case OFPACT_ENQUEUE: case OFPACT_EXIT: case OFPACT_UNROLL_XLATE: @@ -5130,6 +5378,7 @@ ovs_instruction_type_from_ofpact_type(enum ofpact_type type) case OFPACT_UNROLL_XLATE: case OFPACT_SAMPLE: case OFPACT_DEBUG_RECIRC: + case OFPACT_CT: default: return OVSINST_OFPIT11_APPLY_ACTIONS; } @@ -5686,6 +5935,16 @@ ofpact_check__(enum ofputil_protocol *usable_protocols, struct ofpact *a, case OFPACT_SAMPLE: return 0; + case OFPACT_CT: { + struct ofpact_conntrack *oc = ofpact_get_CT(a); + + if (!dl_type_is_ip_any(flow->dl_type) + || (flow->ct_state & CS_INVALID && oc->flags & NX_CT_F_COMMIT)) { + inconsistent_match(usable_protocols); + } + return 0; + } + case OFPACT_CLEAR_ACTIONS: return 0; @@ -6155,6 +6414,7 @@ ofpact_outputs_to_port(const struct ofpact *ofpact, ofp_port_t port) case OFPACT_METER: case OFPACT_GROUP: case OFPACT_DEBUG_RECIRC: + case OFPACT_CT: default: return false; } diff --git a/lib/ofp-actions.h b/lib/ofp-actions.h index 818f5c8..38a9547 100644 --- a/lib/ofp-actions.h +++ b/lib/ofp-actions.h @@ -106,6 +106,7 @@ OFPACT(EXIT, ofpact_null, ofpact, "exit") \ OFPACT(SAMPLE, ofpact_sample, ofpact, "sample") \ OFPACT(UNROLL_XLATE, ofpact_unroll_xlate, ofpact, "unroll_xlate") \ + OFPACT(CT, ofpact_conntrack, ofpact, "ct") \ \ /* Debugging actions. \ * \ @@ -471,6 +472,52 @@ BUILD_ASSERT_DECL(offsetof(struct ofpact_nest, actions) % OFPACT_ALIGNTO == 0); BUILD_ASSERT_DECL(offsetof(struct ofpact_nest, actions) == sizeof(struct ofpact_nest)); +/* Bits for 'flags' in struct nx_action_conntrack. + * + * If NX_CT_F_COMMIT is set, then the connection entry is moved from the + * unconfirmed to confirmed list in the tracker. */ +enum nx_conntrack_flags { + NX_CT_F_COMMIT = 1 << 0, +}; + +/* Magic value for struct nx_action_conntrack 'recirc_table' field, to specify + * that the packet should not be recirculated. This value should commonly be + * used in conjunction with the NX_CT_F_COMMIT flag above. */ +#define NX_CT_RECIRC_NONE 0xFF + +#define CT_MEMBERS \ + uint16_t flags; \ + uint16_t zone_imm; \ + struct mf_subfield zone_src; \ + uint8_t recirc_table; \ + uint16_t alg + +/* Used to perform build-time size checking for ofpact_conntrack. */ +struct ofpact_ct_size { + CT_MEMBERS; +}; + +/* OFPACT_CT. + * + * Used for NXAST_CT. */ +struct ofpact_conntrack { + struct ofpact ofpact; + CT_MEMBERS; + uint8_t pad[PAD_SIZE(sizeof(struct ofpact) + sizeof(struct ofpact_ct_size), + OFPACT_ALIGNTO)]; + struct ofpact actions[]; +}; +BUILD_ASSERT_DECL(offsetof(struct ofpact_conntrack, actions) + % OFPACT_ALIGNTO == 0); +BUILD_ASSERT_DECL(offsetof(struct ofpact_conntrack, actions) + == sizeof(struct ofpact_conntrack)); + +static inline size_t +ofpact_ct_get_action_len(const struct ofpact_conntrack *oc) +{ + return oc->ofpact.len - offsetof(struct ofpact_conntrack, actions); +} + static inline size_t ofpact_nest_get_action_len(const struct ofpact_nest *on) { diff --git a/lib/packets.c b/lib/packets.c index a4d7854..d80f0b2 100644 --- a/lib/packets.c +++ b/lib/packets.c @@ -1062,3 +1062,24 @@ packet_csum_pseudoheader(const struct ip_header *ip) return partial; } + +const char * +packet_ct_state_to_string(uint32_t flag) +{ + switch (flag) { + case CS_REPLY_DIR: + return "rpl"; + case CS_TRACKED: + return "trk"; + case CS_NEW: + return "new"; + case CS_ESTABLISHED: + return "est"; + case CS_RELATED: + return "rel"; + case CS_INVALID: + return "inv"; + default: + return NULL; + } +} diff --git a/lib/packets.h b/lib/packets.h index 4fb1427..ed4540c 100644 --- a/lib/packets.h +++ b/lib/packets.h @@ -126,6 +126,8 @@ struct pkt_metadata { uint32_t skb_priority; /* Packet priority for QoS. */ uint32_t pkt_mark; /* Packet mark. */ union flow_in_port in_port; /* Input port. */ + uint8_t ct_state; /* Connection state. */ + uint16_t ct_zone; /* Connection zone. */ struct flow_tnl tunnel; /* Encapsulating tunnel parameters. Note that * if 'ip_dst' == 0, the rest of the fields may * be uninitialized. */ @@ -134,13 +136,18 @@ struct pkt_metadata { static inline void pkt_metadata_init(struct pkt_metadata *md, odp_port_t port) { - /* It can be expensive to zero out all of the tunnel metadata. However, - * we can just zero out ip_dst and the rest of the data will never be - * looked at. */ - memset(md, 0, offsetof(struct pkt_metadata, tunnel)); - md->tunnel.ip_dst = 0; + /* It can be expensive to zero out all metadata. Therefore: + * + * - We can just zero out 'tunnel.ip_dst' and the rest of the 'tunnel' + * field will never be looked at. + * - We can just zero out 'ct_state' and the rest of the 'ct_*' members + * will never be looked at. */ + memset(md, 0, offsetof(struct pkt_metadata, in_port)); md->in_port.odp_port = port; + + md->ct_state = 0; + md->tunnel.ip_dst = 0; } bool dpid_from_string(const char *s, uint64_t *dpidp); @@ -710,6 +717,18 @@ struct tcp_header { }; BUILD_ASSERT_DECL(TCP_HEADER_LEN == sizeof(struct tcp_header)); +/* Connection states */ +#define CS_NEW 0x01 +#define CS_ESTABLISHED 0x02 +#define CS_RELATED 0x04 +#define CS_INVALID 0x20 +#define CS_REPLY_DIR 0x40 +#define CS_TRACKED 0x80 + +/* Undefined connection state bits. */ +#define CS_UNSUPPORTED_MASK 0x18 +#define CS_SUPPORTED_MASK (~CS_UNSUPPORTED_MASK & 0xFF) + #define ARP_HRD_ETHERNET 1 #define ARP_PRO_IP 0x0800 #define ARP_OP_REQUEST 1 @@ -937,4 +956,6 @@ void compose_arp(struct dp_packet *, uint16_t arp_op, ovs_be32 arp_spa, ovs_be32 arp_tpa); uint32_t packet_csum_pseudoheader(const struct ip_header *); +const char *packet_ct_state_to_string(uint32_t flag); + #endif /* packets.h */ diff --git a/ofproto/ofproto-dpif-rid.c b/ofproto/ofproto-dpif-rid.c index 5a02bb0..eb0b0d9 100644 --- a/ofproto/ofproto-dpif-rid.c +++ b/ofproto/ofproto-dpif-rid.c @@ -137,6 +137,7 @@ recirc_metadata_hash(const struct recirc_state *state) flow_tnl_size(state->metadata.tunnel) / sizeof(uint64_t), hash); } + hash = hash_boolean(state->conntrack, hash); hash = hash_words64((const uint64_t *) &state->metadata.metadata, (sizeof state->metadata - sizeof state->metadata.tunnel) / sizeof(uint64_t), @@ -168,6 +169,7 @@ recirc_metadata_equal(const struct recirc_state *a, (!b->stack || !b->stack->size)) || (a->stack && b->stack && ofpbuf_equal(a->stack, b->stack))) && a->mirrors == b->mirrors + && a->conntrack == b->conntrack && a->action_set_len == b->action_set_len && ofpacts_equal(a->ofpacts, a->ofpacts_len, b->ofpacts, b->ofpacts_len)); diff --git a/ofproto/ofproto-dpif-rid.h b/ofproto/ofproto-dpif-rid.h index e7d95bf..7754e68 100644 --- a/ofproto/ofproto-dpif-rid.h +++ b/ofproto/ofproto-dpif-rid.h @@ -142,6 +142,7 @@ struct recirc_state { struct recirc_metadata metadata; /* Flow metadata. */ struct ofpbuf *stack; /* Stack if any. */ mirror_mask_t mirrors; /* Mirrors already output. */ + bool conntrack; /* Conntrack occurred prior to recirc. */ /* Actions to be translated on recirculation. */ uint32_t action_set_len; /* How much of 'ofpacts' consists of an diff --git a/ofproto/ofproto-dpif-sflow.c b/ofproto/ofproto-dpif-sflow.c index 260c01b..0f7d258 100644 --- a/ofproto/ofproto-dpif-sflow.c +++ b/ofproto/ofproto-dpif-sflow.c @@ -1029,6 +1029,8 @@ sflow_read_set_action(const struct nlattr *attr, case OVS_KEY_ATTR_ICMPV6: case OVS_KEY_ATTR_ARP: case OVS_KEY_ATTR_ND: + case OVS_KEY_ATTR_CT_STATE: + case OVS_KEY_ATTR_CT_ZONE: case OVS_KEY_ATTR_UNSPEC: case __OVS_KEY_ATTR_MAX: default: @@ -1137,6 +1139,7 @@ dpif_sflow_read_actions(const struct flow *flow, case OVS_ACTION_ATTR_USERSPACE: case OVS_ACTION_ATTR_RECIRC: case OVS_ACTION_ATTR_HASH: + case OVS_ACTION_ATTR_CT: break; case OVS_ACTION_ATTR_SET_MASKED: diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c index 81838be..44064a6 100644 --- a/ofproto/ofproto-dpif-xlate.c +++ b/ofproto/ofproto-dpif-xlate.c @@ -300,6 +300,11 @@ struct xlate_ctx { * the MPLS label stack that was originally present. */ bool was_mpls; + /* True if conntrack has been performed on this packet during processing + * on the current bridge. This is used to determine whether conntrack + * state from the datapath should be honored after recirculation. */ + bool conntracked; + /* OpenFlow 1.1+ action set. * * 'action_set' accumulates "struct ofpact"s added by OFPACT_WRITE_ACTIONS. @@ -2798,6 +2803,13 @@ xlate_commit_actions(struct xlate_ctx *ctx) } static void +clear_conntrack(struct flow *flow) +{ + flow->ct_state = 0; + flow->ct_zone = 0; +} + +static void compose_output_action__(struct xlate_ctx *ctx, ofp_port_t ofp_port, const struct xlate_bond_recirc *xr, bool check_stp) { @@ -2807,6 +2819,8 @@ compose_output_action__(struct xlate_ctx *ctx, ofp_port_t ofp_port, struct flow_tnl flow_tnl; ovs_be16 flow_vlan_tci; uint32_t flow_pkt_mark; + uint8_t flow_ct_state; + uint16_t flow_ct_zone; uint8_t flow_nw_tos; odp_port_t out_port, odp_port; bool tnl_push_pop_send = false; @@ -2852,6 +2866,7 @@ compose_output_action__(struct xlate_ctx *ctx, ofp_port_t ofp_port, if (xport->peer) { const struct xport *peer = xport->peer; struct flow old_flow = ctx->xin->flow; + bool old_conntrack = ctx->conntracked; bool old_was_mpls = ctx->was_mpls; cls_version_t old_version = ctx->tables_version; struct ofpbuf old_stack = ctx->stack; @@ -2867,6 +2882,8 @@ compose_output_action__(struct xlate_ctx *ctx, ofp_port_t ofp_port, memset(&flow->tunnel, 0, sizeof flow->tunnel); memset(flow->regs, 0, sizeof flow->regs); flow->actset_output = OFPP_UNSET; + ctx->conntracked = false; + clear_conntrack(flow); /* The bridge is now known so obtain its table version. */ ctx->tables_version @@ -2921,6 +2938,10 @@ compose_output_action__(struct xlate_ctx *ctx, ofp_port_t ofp_port, * bridge. */ ctx->was_mpls = old_was_mpls; + /* The peer bridge's conntrack execution should have no effect on the + * original bridge. */ + ctx->conntracked = old_conntrack; + /* The fact that the peer bridge exits (for any reason) does not mean * that the original bridge should exit. Specifically, if the peer * bridge recirculates (which typically modifies the packet), the @@ -2948,6 +2969,8 @@ compose_output_action__(struct xlate_ctx *ctx, ofp_port_t ofp_port, flow_vlan_tci = flow->vlan_tci; flow_pkt_mark = flow->pkt_mark; + flow_ct_state = flow->ct_state; + flow_ct_zone = flow->ct_zone; flow_nw_tos = flow->nw_tos; if (count_skb_priorities(xport)) { @@ -3070,6 +3093,8 @@ compose_output_action__(struct xlate_ctx *ctx, ofp_port_t ofp_port, /* Restore flow */ flow->vlan_tci = flow_vlan_tci; flow->pkt_mark = flow_pkt_mark; + flow->ct_state = flow_ct_state; + flow->ct_zone = flow_ct_zone; flow->nw_tos = flow_nw_tos; } @@ -3509,24 +3534,23 @@ execute_controller_action(struct xlate_ctx *ctx, int len, dp_packet_delete(packet); } -/* Called only when ctx->recirc_action_offset is set. */ static void -compose_recirculate_action(struct xlate_ctx *ctx) +compose_recirculate_action__(struct xlate_ctx *ctx, uint8_t table) { struct recirc_metadata md; uint32_t id; - xlate_commit_actions(ctx); recirc_metadata_from_flow(&md, &ctx->xin->flow); ovs_assert(ctx->recirc_action_offset >= 0); struct recirc_state state = { - .table_id = 0, + .table_id = table, .ofproto = ctx->xbridge->ofproto, .metadata = md, .stack = &ctx->stack, .mirrors = ctx->mirrors, + .conntrack = ctx->conntracked, .action_set_len = ctx->recirc_action_offset, .ofpacts_len = ctx->action_set.size, .ofpacts = ctx->action_set.data, @@ -3563,6 +3587,14 @@ compose_recirculate_action(struct xlate_ctx *ctx) ctx->last_unroll_offset = -1; } +/* Called only when ctx->recirc_action_offset is set. */ +static void +compose_recirculate_action(struct xlate_ctx *ctx) +{ + xlate_commit_actions(ctx); + compose_recirculate_action__(ctx, 0); +} + static void compose_mpls_push_action(struct xlate_ctx *ctx, struct ofpact_push_mpls *mpls) { @@ -4096,6 +4128,7 @@ recirc_unroll_actions(const struct ofpact *ofpacts, size_t ofpacts_len, case OFPACT_METER: case OFPACT_SAMPLE: case OFPACT_DEBUG_RECIRC: + case OFPACT_CT: break; /* These need not be copied for restoration. */ @@ -4119,6 +4152,46 @@ recirc_unroll_actions(const struct ofpact *ofpacts, size_t ofpacts_len, } static void +compose_conntrack_action(struct xlate_ctx *ctx, struct ofpact_conntrack *ofc) +{ + uint32_t flags = 0; + size_t ct_offset; + uint16_t zone; + + /* Ensure that any prior actions are applied before composing the new + * conntrack action. */ + xlate_commit_actions(ctx); + + /* Process nested actions first, to populate the key. */ + do_xlate_actions(ofc->actions, ofpact_ct_get_action_len(ofc), ctx); + + if (ofc->flags & NX_CT_F_COMMIT) { + flags |= OVS_CT_F_COMMIT; + } + if (ofc->zone_src.field) { + zone = mf_get_subfield(&ofc->zone_src, &ctx->xin->flow) & 0xFF; + } else { + zone = ofc->zone_imm; + } + + ct_offset = nl_msg_start_nested(ctx->odp_actions, OVS_ACTION_ATTR_CT); + nl_msg_put_u32(ctx->odp_actions, OVS_CT_ATTR_FLAGS, flags); + nl_msg_put_u16(ctx->odp_actions, OVS_CT_ATTR_ZONE, zone); + nl_msg_end_nested(ctx->odp_actions, ct_offset); + + if (ofc->recirc_table == NX_CT_RECIRC_NONE) { + /* If we do not recirculate as part of this action, hide the results of + * connection tracking from subsequent recirculations. */ + ctx->conntracked = false; + } else { + /* Use ct_* fields from datapath during recirculation upcall. */ + ctx->conntracked = true; + ctx_trigger_recirculation(ctx); + compose_recirculate_action__(ctx, ofc->recirc_table); + } +} + +static void do_xlate_actions(const struct ofpact *ofpacts, size_t ofpacts_len, struct xlate_ctx *ctx) { @@ -4482,6 +4555,10 @@ do_xlate_actions(const struct ofpact *ofpacts, size_t ofpacts_len, xlate_sample_action(ctx, ofpact_get_SAMPLE(a)); break; + case OFPACT_CT: + compose_conntrack_action(ctx, ofpact_get_CT(a)); + break; + case OFPACT_DEBUG_RECIRC: ctx_trigger_recirculation(ctx); a = ofpact_next(a); @@ -4786,6 +4863,7 @@ xlate_actions(struct xlate_in *xin, struct xlate_out *xout) .last_unroll_offset = -1, .was_mpls = false, + .conntracked = false, .action_set_has_group = false, .action_set = OFPBUF_STUB_INITIALIZER(action_set_stub), @@ -4855,6 +4933,10 @@ xlate_actions(struct xlate_in *xin, struct xlate_out *xout) ctx.table_id = state->table_id; xlate_report(&ctx, "- Resuming from table %"PRIu8, ctx.table_id); + if (!state->conntrack) { + clear_conntrack(flow); + } + /* Restore pipeline metadata. May change flow's in_port and other * metadata to the values that existed when recirculation was * triggered. */ diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c index e39a959..51ab3f0 100644 --- a/ofproto/ofproto-dpif.c +++ b/ofproto/ofproto-dpif.c @@ -1228,6 +1228,45 @@ check_masked_set_action(struct dpif_backer *backer) return !error; } +#define CHECK_FEATURE__(NAME, FIELD) \ +static bool \ +check_##NAME(struct dpif_backer *backer) \ +{ \ + struct flow flow; \ + struct odputil_keybuf keybuf; \ + struct ofpbuf key; \ + bool enable; \ + struct odp_flow_key_parms odp_parms = { \ + .flow = &flow, \ + .support = { \ + .NAME = true, \ + }, \ + }; \ + \ + memset(&flow, 0, sizeof flow); \ + flow.FIELD = 1; \ + \ + ofpbuf_use_stack(&key, &keybuf, sizeof keybuf); \ + odp_flow_key_from_flow(&odp_parms, &key); \ + enable = dpif_probe_feature(backer->dpif, #NAME, &key, NULL); \ + \ + if (enable) { \ + VLOG_INFO("%s: Datapath supports "#NAME, dpif_name(backer->dpif)); \ + } else { \ + VLOG_INFO("%s: Datapath does not support "#NAME, \ + dpif_name(backer->dpif)); \ + } \ + \ + return enable; \ +} +#define CHECK_FEATURE(FIELD) CHECK_FEATURE__(FIELD, FIELD) + +CHECK_FEATURE(ct_state) +CHECK_FEATURE(ct_zone) + +#undef CHECK_FEATURE +#undef CHECK_FEATURE__ + static void check_support(struct dpif_backer *backer) { @@ -1239,6 +1278,9 @@ check_support(struct dpif_backer *backer) backer->support.masked_set_action = check_masked_set_action(backer); backer->support.ufid = check_ufid(backer); backer->support.tnl_push_pop = dpif_supports_tnl_push_pop(backer->dpif); + + backer->support.odp.ct_state = check_ct_state(backer); + backer->support.odp.ct_zone = check_ct_zone(backer); } static int @@ -3935,10 +3977,38 @@ rule_dealloc(struct rule *rule_) } static enum ofperr +rule_check(struct rule *rule) +{ + struct ofproto_dpif *ofproto = ofproto_dpif_cast(rule->ofproto); + const struct odp_support *support; + struct match match; + + minimatch_expand(&rule->cr.match, &match); + support = &ofproto_dpif_get_support(ofproto)->odp; + + if ((match.wc.masks.ct_state && !support->ct_state) + || (match.wc.masks.ct_zone && !support->ct_zone)) { + return OFPERR_OFPBMC_BAD_FIELD; + } + if (match.wc.masks.ct_state & CS_UNSUPPORTED_MASK) { + return OFPERR_OFPBMC_BAD_MASK; + } + + return 0; +} + +static enum ofperr rule_construct(struct rule *rule_) OVS_NO_THREAD_SAFETY_ANALYSIS { struct rule_dpif *rule = rule_dpif_cast(rule_); + int error; + + error = rule_check(rule_); + if (error) { + return error; + } + ovs_mutex_init_adaptive(&rule->stats_mutex); rule->stats.n_packets = 0; rule->stats.n_bytes = 0; diff --git a/ofproto/ofproto-unixctl.man b/ofproto/ofproto-unixctl.man index 89013d9..892f6fc 100644 --- a/ofproto/ofproto-unixctl.man +++ b/ofproto/ofproto-unixctl.man @@ -103,6 +103,10 @@ only metadata. The metadata can be: Packet QoS priority. .IP \fIpkt_mark\fR Mark of the packet. +.IP \fIct_state\fR +Connection state of the packet. +.IP \fIct_zone\fR +Connection tracking zone for packet. .IP \fItun_id\fR The tunnel ID on which the packet arrived. .IP \fIin_port\fR diff --git a/tests/atlocal.in b/tests/atlocal.in index 5946a3c..8e9fd9b 100644 --- a/tests/atlocal.in +++ b/tests/atlocal.in @@ -110,3 +110,10 @@ fi if test "$IS_WIN32" = "yes"; then HAVE_PYTHON="no" fi + +# Conntrack test requirements +if test x`which conntrack` != x; then + HAVE_CONNTRACK="yes" +else + HAVE_CONNTRACK="no" +fi diff --git a/tests/automake.mk b/tests/automake.mk index 4198039..cef67c0 100644 --- a/tests/automake.mk +++ b/tests/automake.mk @@ -337,6 +337,7 @@ CHECK_PYFILES = \ tests/test-daemon.py \ tests/test-json.py \ tests/test-jsonrpc.py \ + tests/test-l7.py \ tests/test-ovsdb.py \ tests/test-reconnect.py \ tests/MockXenAPI.py \ diff --git a/tests/dpif-netdev.at b/tests/dpif-netdev.at index e9af63f..43108e7 100644 --- a/tests/dpif-netdev.at +++ b/tests/dpif-netdev.at @@ -82,7 +82,7 @@ AT_CHECK([cat ovs-vswitchd.log | grep -A 1 'miss upcall' | tail -n 1], [0], [dnl skb_priority(0),skb_mark(0),recirc_id(0),dp_hash(0),in_port(1),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0a),eth_type(0x0800),ipv4(src=10.0.0.2,dst=10.0.0.1,proto=1,tos=0,ttl=64,frag=no),icmp(type=8,code=0) ]) AT_CHECK([cat ovs-vswitchd.log | FILTER_FLOW_INSTALL | STRIP_XOUT], [0], [dnl -pkt_mark=0,recirc_id=0,dp_hash=0,skb_priority=0,icmp,in_port=1,vlan_tci=0x0000,dl_src=50:54:00:00:00:09,dl_dst=50:54:00:00:00:0a,nw_src=10.0.0.2,nw_dst=10.0.0.1,nw_tos=0,nw_ecn=0,nw_ttl=64,icmp_type=8,icmp_code=0, actions: +pkt_mark=0,recirc_id=0,dp_hash=0,skb_priority=0,ct_state=0,ct_zone=0,icmp,in_port=1,vlan_tci=0x0000,dl_src=50:54:00:00:00:09,dl_dst=50:54:00:00:00:0a,nw_src=10.0.0.2,nw_dst=10.0.0.1,nw_tos=0,nw_ecn=0,nw_ttl=64,icmp_type=8,icmp_code=0, actions: recirc_id=0,ip,in_port=1,vlan_tci=0x0000/0x1fff,dl_src=50:54:00:00:00:09,dl_dst=50:54:00:00:00:0a,nw_frag=no, actions: ]) diff --git a/tests/odp.at b/tests/odp.at index 61edde3..74d44ff 100644 --- a/tests/odp.at +++ b/tests/odp.at @@ -86,6 +86,11 @@ sed '/bos=0/{ s/^/ODP_FIT_TOO_LITTLE: / }' < odp-in.txt > odp-out.txt +dnl Some fields are always printed for this test, because wildcards aren't +dnl specified. We can skip these. +sed -i 's/\(skb_mark(0)\),\(ct\)/\1,ct_state(0),ct_zone(0),\2/' odp-out.txt +sed -i 's/\(skb_mark([[^)]]*)\),\(recirc\)/\1,ct_state(0),ct_zone(0),\2/' odp-out.txt + AT_CHECK_UNQUOTED([ovstest test-odp parse-keys < odp-in.txt], [0], [`cat odp-out.txt` ]) AT_CLEANUP @@ -153,6 +158,10 @@ s/\(eth([[^)]]*)\),*/\1,eth_type(0x8100),vlan(vid=99,pcp=7),encap(/ s/$/)/' odp-base.txt echo + echo '# Valid forms with conntrack fields.' + sed 's/\(eth([[^)]]*),?\)/\1,ct_state(+trk),/' odp-base.txt + + echo echo '# Valid forms with IP first fragment.' sed -n 's/,frag=no),/,frag=first),/p' odp-base.txt @@ -293,6 +302,9 @@ tnl_push(tnl_port(6),header(size=50,type=4,eth(dst=f8:bc:12:44:34:b6,src=f8:bc:1 tnl_push(tnl_port(6),header(size=50,type=5,eth(dst=f8:bc:12:44:34:b6,src=f8:bc:12:46:58:e0,dl_type=0x0800),ipv4(src=1.1.2.88,dst=1.1.2.92,proto=17,tos=0,ttl=64,frag=0x40),udp(src=0,dst=6081,csum=0x0),geneve(oam,vni=0x1c7)),out_port(1)) tnl_push(tnl_port(6),header(size=58,type=5,eth(dst=f8:bc:12:44:34:b6,src=f8:bc:12:46:58:e0,dl_type=0x0800),ipv4(src=1.1.2.88,dst=1.1.2.92,proto=17,tos=0,ttl=64,frag=0x40),udp(src=0,dst=6081,csum=0x0),geneve(crit,vni=0x1c7,options({class=0xffff,type=0x80,len=4,0xa}))),out_port(1)) tnl_push(tnl_port(6),header(size=50,type=5,eth(dst=f8:bc:12:44:34:b6,src=f8:bc:12:46:58:e0,dl_type=0x0800),ipv4(src=1.1.2.88,dst=1.1.2.92,proto=17,tos=0,ttl=64,frag=0x40),udp(src=0,dst=6081,csum=0xffff),geneve(vni=0x1c7)),out_port(1)) +ct +ct(commit) +ct(commit,zone=5) ]) AT_CHECK_UNQUOTED([ovstest test-odp parse-actions < actions.txt], [0], [`cat actions.txt` diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at index eb8647b..1e3013b 100644 --- a/tests/ofproto-dpif.at +++ b/tests/ofproto-dpif.at @@ -6486,8 +6486,8 @@ for i in 1 2 3 4; do done sleep 1 AT_CHECK([cat ovs-vswitchd.log | STRIP_UFID | FILTER_FLOW_INSTALL | STRIP_USED], [0], [dnl -pkt_mark=0,recirc_id=0,dp_hash=0,skb_priority=0,icmp,in_port=1,vlan_tci=0x0000,dl_src=50:54:00:00:00:09,dl_dst=50:54:00:00:00:0a,nw_src=10.0.0.2,nw_dst=10.0.0.1,nw_tos=0,nw_ecn=0,nw_ttl=64,icmp_type=8,icmp_code=0, actions:2 -pkt_mark=0,recirc_id=0,dp_hash=0,skb_priority=0,icmp,in_port=1,vlan_tci=0x0000,dl_src=50:54:00:00:00:0b,dl_dst=50:54:00:00:00:0c,nw_src=10.0.0.4,nw_dst=10.0.0.3,nw_tos=0,nw_ecn=0,nw_ttl=64,icmp_type=8,icmp_code=0, actions:drop +pkt_mark=0,recirc_id=0,dp_hash=0,skb_priority=0,ct_state=0,ct_zone=0,icmp,in_port=1,vlan_tci=0x0000,dl_src=50:54:00:00:00:09,dl_dst=50:54:00:00:00:0a,nw_src=10.0.0.2,nw_dst=10.0.0.1,nw_tos=0,nw_ecn=0,nw_ttl=64,icmp_type=8,icmp_code=0, actions:2 +pkt_mark=0,recirc_id=0,dp_hash=0,skb_priority=0,ct_state=0,ct_zone=0,icmp,in_port=1,vlan_tci=0x0000,dl_src=50:54:00:00:00:0b,dl_dst=50:54:00:00:00:0c,nw_src=10.0.0.4,nw_dst=10.0.0.3,nw_tos=0,nw_ecn=0,nw_ttl=64,icmp_type=8,icmp_code=0, actions:drop ]) AT_CHECK([cat ovs-vswitchd.log | STRIP_UFID | FILTER_FLOW_DUMP | grep 'packets:3'], [0], [dnl skb_priority(0),skb_mark(0),recirc_id(0),dp_hash(0),in_port(1),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0a),eth_type(0x0800),ipv4(src=10.0.0.2,dst=10.0.0.1,proto=1,tos=0,ttl=64,frag=no),icmp(type=8,code=0), packets:3, bytes:180, used:0.0s, actions:2 diff --git a/tests/ofproto.at b/tests/ofproto.at index 7e80293..c522a85 100644 --- a/tests/ofproto.at +++ b/tests/ofproto.at @@ -1609,6 +1609,8 @@ head_table () { in_port_oxm: exact match or wildcard actset_output: exact match or wildcard pkt_mark: arbitrary mask + ct_state: arbitrary mask + ct_zone: exact match or wildcard reg0: arbitrary mask reg1: arbitrary mask reg2: arbitrary mask diff --git a/tests/system-common-macros.at b/tests/system-common-macros.at index c5691e7..8f3b318 100644 --- a/tests/system-common-macros.at +++ b/tests/system-common-macros.at @@ -118,3 +118,21 @@ m4_define([ADD_NATIVE_TUNNEL], # Strip variant pieces from ping output so the output can be reliably compared. # m4_define([FORMAT_PING], [grep "transmitted" | sed 's/time.*ms$/time 0ms/']) + +# FORMAT_CT() +# +# Strip content from the piped input which would differ from test to test. +# +m4_define([FORMAT_CT], + [[grep "dst=$1" | sed -e 's/port=[0-9]*/port=/g' -e 's/ */ /g' -e 's/secctx[^ ]* //' | cut -d' ' -f4- | sort | uniq]]) + +# NETNS_DAEMONIZE([namespace], [command], [pidfile]) +# +# Run 'command' as a background process within 'namespace' and record its pid +# to 'pidfile' to allow cleanup on exit. +# +m4_define([NETNS_DAEMONIZE], + [ip netns exec $1 $2 & echo $! > $3 + echo "kill \`cat $3\`" >> cleanup + ] +) diff --git a/tests/system-kmod-macros.at b/tests/system-kmod-macros.at index 1216db8..a48e8d9 100644 --- a/tests/system-kmod-macros.at +++ b/tests/system-kmod-macros.at @@ -43,3 +43,19 @@ m4_define([OVS_TRAFFIC_VSWITCHD_STOP], [OVS_VSWITCHD_STOP([$1]) AT_CHECK([:; $2]) ]) + +# CHECK_CONNTRACK() +# +# Perform requirements checks for running conntrack tests, and flush the +# kernel conntrack tables when the test is finished. +# +m4_define([CHECK_CONNTRACK], + [AT_SKIP_IF([test $HAVE_CONNTRACK = no]) + AT_SKIP_IF([test $HAVE_PYTHON = no]) + m4_foreach([mod], [[nf_conntrack_ipv4], [nf_conntrack_ipv6]], + [modprobe mod || echo "Module mod not loaded." + on_exit 'modprobe -r mod' + ]) + on_exit 'conntrack -F' + ] +) diff --git a/tests/system-traffic.at b/tests/system-traffic.at index 7dbed68..de6b016 100644 --- a/tests/system-traffic.at +++ b/tests/system-traffic.at @@ -139,3 +139,472 @@ NS_CHECK_EXEC([at_ns0], [ping -s 3200 -q -c 3 -i 0.3 -w 2 10.1.1.100 | FORMAT_PI OVS_TRAFFIC_VSWITCHD_STOP AT_CLEANUP + +AT_SETUP([conntrack - controller]) +CHECK_CONNTRACK() +OVS_TRAFFIC_VSWITCHD_START( + [set-fail-mode br0 standalone -- ]) + +ADD_NAMESPACES(at_ns0, at_ns1) + +ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24") +ADD_VETH(p1, at_ns1, br0, "10.1.1.2/24") + +dnl Allow any traffic from ns0->ns1. Only allow nd, return traffic from ns1->ns0. +AT_DATA([flows.txt], [dnl +priority=1,action=drop +priority=10,arp,action=normal +in_port=1,udp,action=ct(commit),controller +in_port=2,ct_state=-trk,udp,action=ct(table=0) +in_port=2,ct_state=+trk+est-new,udp,action=controller +]) + +AT_CHECK([ovs-ofctl add-flows br0 flows.txt]) + +AT_CAPTURE_FILE([ofctl_monitor.log]) +AT_CHECK([ovs-ofctl monitor br0 65534 invalid_ttl --detach --no-chdir --pidfile 2> ofctl_monitor.log]) + +dnl Send an unsolicited reply from port 2. This should be dropped. +AT_CHECK([ovs-ofctl -O OpenFlow13 packet-out br0 2 ct\(table=0\) '50540000000a50540000000908004500001c00000000001100000a0101020a0101010002000100080000']) + +dnl OK, now start a new connection from port 1. +AT_CHECK([ovs-ofctl -O OpenFlow13 packet-out br0 1 ct\(commit\),controller '50540000000a50540000000908004500001c00000000001100000a0101010a0101020001000200080000']) + +dnl Now try a reply from port 2. +AT_CHECK([ovs-ofctl -O OpenFlow13 packet-out br0 2 ct\(table=0\) '50540000000a50540000000908004500001c00000000001100000a0101020a0101010002000100080000']) + +dnl Check this output. We only see the latter two packets, not the first. +AT_CHECK([cat ofctl_monitor.log], [0], [dnl +NXT_PACKET_IN (xid=0x0): total_len=42 in_port=1 (via action) data_len=42 (unbuffered) +udp,vlan_tci=0x0000,dl_src=50:54:00:00:00:09,dl_dst=50:54:00:00:00:0a,nw_src=10.1.1.1,nw_dst=10.1.1.2,nw_tos=0,nw_ecn=0,nw_ttl=0,tp_src=1,tp_dst=2 udp_csum:0 +NXT_PACKET_IN (xid=0x0): cookie=0x0 total_len=42 ct_state=est|rpl|trk,in_port=2 (via action) data_len=42 (unbuffered) +udp,vlan_tci=0x0000,dl_src=50:54:00:00:00:09,dl_dst=50:54:00:00:00:0a,nw_src=10.1.1.2,nw_dst=10.1.1.1,nw_tos=0,nw_ecn=0,nw_ttl=0,tp_src=2,tp_dst=1 udp_csum:0 +]) + +OVS_TRAFFIC_VSWITCHD_STOP +AT_CLEANUP + +AT_SETUP([conntrack - IPv4 HTTP]) +CHECK_CONNTRACK() +OVS_TRAFFIC_VSWITCHD_START( + [set-fail-mode br0 standalone -- ]) + +ADD_NAMESPACES(at_ns0, at_ns1) + +ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24") +ADD_VETH(p1, at_ns1, br0, "10.1.1.2/24") + +dnl Allow any traffic from ns0->ns1. Only allow nd, return traffic from ns1->ns0. +AT_DATA([flows.txt], [dnl +priority=1,action=drop +priority=10,arp,action=normal +priority=10,icmp,action=normal +in_port=1,tcp,action=ct(commit),2 +in_port=2,ct_state=-trk,tcp,action=ct(table=0) +in_port=2,ct_state=+trk+est-new,tcp,action=1 +]) + +AT_CHECK([ovs-ofctl add-flows br0 flows.txt]) + +dnl Basic connectivity check. +NS_CHECK_EXEC([at_ns0], [ping -q -c 3 -i 0.3 -w 2 10.1.1.2 >/dev/null]) + +dnl HTTP requests from ns0->ns1 should work fine. +NETNS_DAEMONIZE([at_ns1], [[$PYTHON $srcdir/test-l7.py]], [http0.pid]) +NS_CHECK_EXEC([at_ns0], [wget 10.1.1.2 -t 3 -T 1 --retry-connrefused -v -o wget0.log]) + +AT_CHECK([conntrack -L 2>&1 | FORMAT_CT(10.1.1.2)], [0], [dnl +TIME_WAIT src=10.1.1.1 dst=10.1.1.2 sport= dport= src=10.1.1.2 dst=10.1.1.1 sport= dport= [[ASSURED]] mark=0 use=1 +]) + +dnl HTTP requests from ns1->ns0 should fail due to network failure. +dnl Try 3 times, in 1 second intervals. +NETNS_DAEMONIZE([at_ns0], [[$PYTHON $srcdir/test-l7.py]], [http1.pid]) +NS_CHECK_EXEC([at_ns1], [wget 10.1.1.1 -t 3 -T 1 -v -o wget1.log], [4]) + +OVS_TRAFFIC_VSWITCHD_STOP +AT_CLEANUP + +AT_SETUP([conntrack - IPv6 HTTP]) +CHECK_CONNTRACK() +OVS_TRAFFIC_VSWITCHD_START( + [set-fail-mode br0 standalone -- ]) + +ADD_NAMESPACES(at_ns0, at_ns1) + +ADD_VETH(p0, at_ns0, br0, "fc00::1/96") +ADD_VETH(p1, at_ns1, br0, "fc00::2/96") + +dnl Allow any traffic from ns0->ns1. Only allow nd, return traffic from ns1->ns0. +AT_DATA([flows.txt], [dnl +priority=1,action=drop +priority=10,icmp6,action=normal +in_port=1,tcp6,action=ct(commit),2 +in_port=2,ct_state=-trk,tcp6,action=ct(table=0) +in_port=2,ct_state=+trk+est-new,tcp6,action=1 +]) + +AT_CHECK([ovs-ofctl add-flows br0 flows.txt]) + +dnl Without this sleep, we get occasional failures due to the following error: +dnl "connect: Cannot assign requested address" +sleep 2; + +dnl HTTP requests from ns0->ns1 should work fine. +NETNS_DAEMONIZE([at_ns1], [[$PYTHON $srcdir/test-l7.py http6]], [http0.pid]) + +NS_CHECK_EXEC([at_ns0], [wget http://[[fc00::2]] -t 3 -T 1 --retry-connrefused -v -o wget0.log]) + +dnl HTTP requests from ns1->ns0 should fail due to network failure. +dnl Try 3 times, in 1 second intervals. +NETNS_DAEMONIZE([at_ns0], [[$PYTHON $srcdir/test-l7.py http6]], [http1.pid]) +NS_CHECK_EXEC([at_ns1], [wget http://[[fc00::1]] -t 3 -T 1 -v -o wget1.log], [4]) + +OVS_TRAFFIC_VSWITCHD_STOP +AT_CLEANUP + +AT_SETUP([conntrack - commit, recirc]) +CHECK_CONNTRACK() +OVS_TRAFFIC_VSWITCHD_START( + [set-fail-mode br0 standalone -- ]) + +ADD_NAMESPACES(at_ns0, at_ns1, at_ns2, at_ns3) + +ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24") +ADD_VETH(p1, at_ns1, br0, "10.1.1.2/24") +ADD_VETH(p2, at_ns2, br0, "10.1.1.3/24") +ADD_VETH(p3, at_ns3, br0, "10.1.1.4/24") + +dnl Allow any traffic from ns0->ns1, ns2->ns3. +AT_DATA([flows.txt], [dnl +priority=1,action=drop +priority=10,arp,action=normal +priority=10,icmp,action=normal +in_port=1,tcp,ct_state=-trk,action=ct(commit,table=0) +in_port=1,tcp,ct_state=+trk,action=2 +in_port=2,tcp,ct_state=-trk,action=ct(table=0) +in_port=2,tcp,ct_state=+trk,action=1 +in_port=3,tcp,ct_state=-trk,action=set_field:0->metadata,ct(table=0) +in_port=3,tcp,ct_state=+trk,metadata=0,action=set_field:1->metadata,ct(commit,table=0) +in_port=3,tcp,ct_state=+trk,metadata=1,action=4 +in_port=4,tcp,ct_state=-trk,action=ct(commit,table=0) +in_port=4,tcp,ct_state=+trk,action=3 +]) + +AT_CHECK([ovs-ofctl add-flows br0 flows.txt]) + +dnl HTTP requests from p0->p1 should work fine. +NETNS_DAEMONIZE([at_ns1], [[$PYTHON $srcdir/test-l7.py]], [http0.pid]) +NS_CHECK_EXEC([at_ns0], [wget 10.1.1.2 -t 3 -T 1 --retry-connrefused -v -o wget0.log]) + +dnl HTTP requests from p2->p3 should work fine. +NETNS_DAEMONIZE([at_ns3], [[$PYTHON $srcdir/test-l7.py]], [http1.pid]) +NS_CHECK_EXEC([at_ns2], [wget 10.1.1.4 -t 3 -T 1 --retry-connrefused -v -o wget1.log]) + +OVS_TRAFFIC_VSWITCHD_STOP +AT_CLEANUP + +AT_SETUP([conntrack - preserve registers]) +CHECK_CONNTRACK() +OVS_TRAFFIC_VSWITCHD_START( + [set-fail-mode br0 standalone -- ]) + +ADD_NAMESPACES(at_ns0, at_ns1, at_ns2, at_ns3) + +ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24") +ADD_VETH(p1, at_ns1, br0, "10.1.1.2/24") +ADD_VETH(p2, at_ns2, br0, "10.1.1.3/24") +ADD_VETH(p3, at_ns3, br0, "10.1.1.4/24") + +dnl Allow any traffic from ns0->ns1, ns2->ns3. +AT_DATA([flows.txt], [dnl +priority=1,action=drop +priority=10,arp,action=normal +priority=10,icmp,action=normal +in_port=1,tcp,ct_state=-trk,action=ct(commit,table=0) +in_port=1,tcp,ct_state=+trk,action=2 +in_port=2,tcp,ct_state=-trk,action=ct(table=0) +in_port=2,tcp,ct_state=+trk,action=1 +in_port=3,tcp,ct_state=-trk,action=load:0->NXM_NX_REG0[[]],ct(table=0) +in_port=3,tcp,ct_state=+trk,reg0=0,action=load:1->NXM_NX_REG0[[]],ct(commit,table=0) +in_port=3,tcp,ct_state=+trk,reg0=1,action=4 +in_port=4,tcp,ct_state=-trk,action=ct(commit,table=0) +in_port=4,tcp,ct_state=+trk,action=3 +]) + +AT_CHECK([ovs-ofctl add-flows br0 flows.txt]) + +dnl HTTP requests from p0->p1 should work fine. +NETNS_DAEMONIZE([at_ns1], [[$PYTHON $srcdir/test-l7.py]], [http0.pid]) +NS_CHECK_EXEC([at_ns0], [wget 10.1.1.2 -t 3 -T 1 --retry-connrefused -v -o wget0.log]) + +dnl HTTP requests from p2->p3 should work fine. +NETNS_DAEMONIZE([at_ns3], [[$PYTHON $srcdir/test-l7.py]], [http1.pid]) +NS_CHECK_EXEC([at_ns2], [wget 10.1.1.4 -t 3 -T 1 --retry-connrefused -v -o wget1.log]) + +OVS_TRAFFIC_VSWITCHD_STOP +AT_CLEANUP + +AT_SETUP([conntrack - invalid]) +CHECK_CONNTRACK() +OVS_TRAFFIC_VSWITCHD_START( + [set-fail-mode br0 standalone -- ]) + +ADD_NAMESPACES(at_ns0, at_ns1, at_ns2, at_ns3) + +ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24") +ADD_VETH(p1, at_ns1, br0, "10.1.1.2/24") +ADD_VETH(p2, at_ns2, br0, "10.1.1.3/24") +ADD_VETH(p3, at_ns3, br0, "10.1.1.4/24") + +dnl Allow any traffic from ns0->ns1. Only allow nd, return traffic from ns1->ns0. +AT_DATA([flows.txt], [dnl +priority=1,action=drop +priority=10,arp,action=normal +priority=10,icmp,action=normal +in_port=1,tcp,action=ct(),2 +in_port=2,ct_state=-trk,tcp,action=ct(table=0) +in_port=2,ct_state=+trk+new,tcp,action=1 +in_port=3,tcp,action=ct(),4 +in_port=4,ct_state=-trk,tcp,action=ct(table=0) +in_port=4,ct_state=+trk+inv,tcp,action=3 +in_port=4,ct_state=+trk+new,tcp,action=3 +]) + +AT_CHECK([ovs-ofctl add-flows br0 flows.txt]) + +dnl We set up our rules to allow the request without committing. The return +dnl traffic can't be identified, because the initial request wasn't committed. +dnl For the first pair of ports, this means that the connection fails. +NETNS_DAEMONIZE([at_ns1], [[$PYTHON $srcdir/test-l7.py]], [http0.pid]) +NS_CHECK_EXEC([at_ns0], [wget 10.1.1.2 -t 3 -T 1 --retry-connrefused -v -o wget0.log], [4]) + +dnl For the second pair, we allow packets from invalid connections, so it works. +NETNS_DAEMONIZE([at_ns3], [[$PYTHON $srcdir/test-l7.py]], [http1.pid]) +NS_CHECK_EXEC([at_ns2], [wget 10.1.1.4 -t 3 -T 1 --retry-connrefused -v -o wget1.log]) + +OVS_TRAFFIC_VSWITCHD_STOP +AT_CLEANUP + +AT_SETUP([conntrack - zones]) +CHECK_CONNTRACK() +OVS_TRAFFIC_VSWITCHD_START( + [set-fail-mode br0 standalone -- ]) + +ADD_NAMESPACES(at_ns0, at_ns1, at_ns2, at_ns3) + +ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24") +ADD_VETH(p1, at_ns1, br0, "10.1.1.2/24") +ADD_VETH(p2, at_ns2, br0, "10.1.1.3/24") +ADD_VETH(p3, at_ns3, br0, "10.1.1.4/24") + +dnl Allow any traffic from ns0->ns1. Only allow nd, return traffic from ns1->ns0. +AT_DATA([flows.txt], [dnl +priority=1,action=drop +priority=10,arp,action=normal +priority=10,icmp,action=normal +in_port=1,tcp,action=ct(commit,zone=1),2 +in_port=2,ct_state=-trk,tcp,action=ct(table=0,zone=1) +in_port=2,ct_state=+trk,ct_zone=1,tcp,action=1 +in_port=3,tcp,action=ct(commit,zone=2),4 +in_port=4,ct_state=-trk,tcp,action=ct(table=0,zone=2) +in_port=4,ct_state=+trk,ct_zone=1,tcp,action=3 +]) + +AT_CHECK([ovs-ofctl add-flows br0 flows.txt]) + +dnl HTTP requests from p0->p1 should work fine. +NETNS_DAEMONIZE([at_ns1], [[$PYTHON $srcdir/test-l7.py]], [http0.pid]) +NS_CHECK_EXEC([at_ns0], [wget 10.1.1.2 -t 3 -T 1 --retry-connrefused -v -o wget0.log]) + +AT_CHECK([conntrack -L 2>&1 | FORMAT_CT(10.1.1.2)], [0], [dnl +TIME_WAIT src=10.1.1.1 dst=10.1.1.2 sport= dport= src=10.1.1.2 dst=10.1.1.1 sport= dport= [[ASSURED]] mark=0 zone=1 use=1 +]) + +dnl HTTP requests from p2->p3 should fail due to network failure. +dnl Try 3 times, in 1 second intervals. +NETNS_DAEMONIZE([at_ns3], [[$PYTHON $srcdir/test-l7.py]], [http1.pid]) +NS_CHECK_EXEC([at_ns2], [wget 10.1.1.4 -t 3 -T 1 -v -o wget1.log], [4]) + +AT_CHECK([conntrack -L 2>&1 | FORMAT_CT(10.1.1.4)], [0], [dnl +SYN_RECV src=10.1.1.3 dst=10.1.1.4 sport= dport= src=10.1.1.4 dst=10.1.1.3 sport= dport= mark=0 zone=2 use=1 +]) + +OVS_TRAFFIC_VSWITCHD_STOP +AT_CLEANUP + +AT_SETUP([conntrack - zones from field]) +CHECK_CONNTRACK() +OVS_TRAFFIC_VSWITCHD_START( + [set-fail-mode br0 standalone -- ]) + +ADD_NAMESPACES(at_ns0, at_ns1, at_ns2, at_ns3) + +ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24") +ADD_VETH(p1, at_ns1, br0, "10.1.1.2/24") +ADD_VETH(p2, at_ns2, br0, "10.1.1.3/24") +ADD_VETH(p3, at_ns3, br0, "10.1.1.4/24") + +dnl Allow any traffic from ns0->ns1. Only allow nd, return traffic from ns1->ns0. +AT_DATA([flows.txt], [dnl +priority=1,action=drop +priority=10,arp,action=normal +priority=10,icmp,action=normal +in_port=1,tcp,action=load:1->NXM_NX_REG0[[0..15]],ct(commit,zone=NXM_NX_REG0[[0..15]]),2 +in_port=2,ct_state=-trk,tcp,action=load:1->NXM_NX_REG0[[0..15]],ct(table=0,zone=NXM_NX_REG0[[0..15]]) +in_port=2,ct_state=+trk,ct_zone=1,tcp,action=1 +in_port=3,tcp,action=load:2->NXM_NX_REG0[[0..15]],ct(commit,zone=NXM_NX_REG0[[0..15]]),4 +in_port=4,ct_state=-trk,tcp,action=load:2->NXM_NX_REG0[[0..15]],ct(table=0,zone=NXM_NX_REG0[[0..15]]) +in_port=4,ct_state=+trk,ct_zone=1,tcp,action=3 +]) + +AT_CHECK([ovs-ofctl add-flows br0 flows.txt]) + +dnl HTTP requests from p0->p1 should work fine. +NETNS_DAEMONIZE([at_ns1], [[$PYTHON $srcdir/test-l7.py]], [http0.pid]) +NS_CHECK_EXEC([at_ns0], [wget 10.1.1.2 -t 3 -T 1 --retry-connrefused -v -o wget0.log]) + +AT_CHECK([conntrack -L 2>&1 | FORMAT_CT(10.1.1.2)], [0], [dnl +TIME_WAIT src=10.1.1.1 dst=10.1.1.2 sport= dport= src=10.1.1.2 dst=10.1.1.1 sport= dport= [[ASSURED]] mark=0 zone=1 use=1 +]) + +dnl HTTP requests from p2->p3 should fail due to network failure. +dnl Try 3 times, in 1 second intervals. +NETNS_DAEMONIZE([at_ns3], [[$PYTHON $srcdir/test-l7.py]], [http1.pid]) +NS_CHECK_EXEC([at_ns2], [wget 10.1.1.4 -t 3 -T 1 -v -o wget1.log], [4]) + +AT_CHECK([conntrack -L 2>&1 | FORMAT_CT(10.1.1.4)], [0], [dnl +SYN_RECV src=10.1.1.3 dst=10.1.1.4 sport= dport= src=10.1.1.4 dst=10.1.1.3 sport= dport= mark=0 zone=2 use=1 +]) + +OVS_TRAFFIC_VSWITCHD_STOP +AT_CLEANUP + +AT_SETUP([conntrack - multiple bridges]) +CHECK_CONNTRACK() +OVS_TRAFFIC_VSWITCHD_START( + [set-fail-mode br0 standalone --\ + add-br br1 --\ + add-port br0 patch+ -- set int patch+ type=patch options:peer=patch- --\ + add-port br1 patch- -- set int patch- type=patch options:peer=patch+ --]) + +ADD_NAMESPACES(at_ns0, at_ns1) + +ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24") +ADD_VETH(p1, at_ns1, br1, "10.1.1.2/24") + +dnl Allow any traffic from ns0->ns1, allow established in reverse. +AT_DATA([flows-br0.txt], [dnl +priority=1,action=drop +priority=10,arp,action=normal +priority=10,icmp,action=normal +in_port=2,tcp,ct_state=-trk,action=ct(commit,zone=1),1 +in_port=1,tcp,ct_state=-trk,action=ct(table=0,zone=1) +in_port=1,tcp,ct_state=+trk+est,ct_zone=1,action=2 +]) + +dnl Allow any traffic from ns0->ns1, allow established in reverse. +AT_DATA([flows-br1.txt], [dnl +priority=1,action=drop +priority=10,arp,action=normal +priority=10,icmp,action=normal +in_port=1,tcp,ct_state=-trk,action=ct(table=0,zone=2) +in_port=1,tcp,ct_state=+trk+new,ct_zone=2,action=ct(commit,zone=2),2 +in_port=1,tcp,ct_state=+trk+est,ct_zone=2,action=2 +in_port=2,tcp,ct_state=-trk,action=ct(table=0,zone=2) +in_port=2,tcp,ct_state=+trk+est,ct_zone=2,action=ct(commit,zone=2),1 +]) + +AT_CHECK([ovs-ofctl add-flows br0 flows-br0.txt]) +AT_CHECK([ovs-ofctl add-flows br1 flows-br1.txt]) + +dnl HTTP requests from p0->p1 should work fine. +NETNS_DAEMONIZE([at_ns1], [[$PYTHON $srcdir/test-l7.py]], [http0.pid]) +NS_CHECK_EXEC([at_ns0], [wget 10.1.1.2 -t 3 -T 1 --retry-connrefused -v -o wget0.log]) + +OVS_TRAFFIC_VSWITCHD_STOP +AT_CLEANUP + +AT_SETUP([conntrack - multiple zones]) +CHECK_CONNTRACK() +OVS_TRAFFIC_VSWITCHD_START( + [set-fail-mode br0 standalone -- ]) + +ADD_NAMESPACES(at_ns0, at_ns1, at_ns2, at_ns3) + +ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24") +ADD_VETH(p1, at_ns1, br0, "10.1.1.2/24") +ADD_VETH(p2, at_ns2, br0, "10.1.1.3/24") +ADD_VETH(p3, at_ns3, br0, "10.1.1.4/24") + +dnl Allow any traffic from ns0->ns1. Only allow nd, return traffic from ns1->ns0. +AT_DATA([flows.txt], [dnl +priority=1,action=drop +priority=10,arp,action=normal +priority=10,icmp,action=normal +in_port=1,tcp,action=ct(commit,zone=1),ct(commit,zone=2),2 +in_port=2,ct_state=-trk,tcp,action=ct(table=0,zone=2) +in_port=2,ct_state=+trk,ct_zone=2,tcp,action=1 +]) + +AT_CHECK([ovs-ofctl add-flows br0 flows.txt]) + +dnl HTTP requests from p0->p1 should work fine. +NETNS_DAEMONIZE([at_ns1], [[$PYTHON $srcdir/test-l7.py]], [http0.pid]) +NS_CHECK_EXEC([at_ns0], [wget 10.1.1.2 -t 3 -T 1 --retry-connrefused -v -o wget0.log]) + +dnl (again) HTTP requests from p0->p1 should work fine. +NS_CHECK_EXEC([at_ns0], [wget 10.1.1.2 -t 3 -T 1 --retry-connrefused -v -o wget0.log]) + +AT_CHECK([conntrack -L 2>&1 | FORMAT_CT(10.1.1.2)], [0], [dnl +SYN_SENT src=10.1.1.1 dst=10.1.1.2 sport= dport= [[UNREPLIED]] src=10.1.1.2 dst=10.1.1.1 sport= dport= mark=0 zone=1 use=1 +TIME_WAIT src=10.1.1.1 dst=10.1.1.2 sport= dport= src=10.1.1.2 dst=10.1.1.1 sport= dport= [[ASSURED]] mark=0 zone=2 use=1 +]) + +OVS_TRAFFIC_VSWITCHD_STOP +AT_CLEANUP + +AT_SETUP([conntrack - ICMP related 2]) +CHECK_CONNTRACK() +OVS_TRAFFIC_VSWITCHD_START( + [set-fail-mode br0 standalone -- ]) + +ADD_NAMESPACES(at_ns0, at_ns1) + +ADD_VETH(p0, at_ns0, br0, "172.16.0.1/24") +ADD_VETH(p1, at_ns1, br0, "172.16.0.2/24") + +dnl Allow any traffic from ns0->ns1. Only allow nd, return traffic from ns1->ns0. +AT_DATA([flows.txt], [dnl +priority=1,action=drop +priority=10,arp,action=normal +in_port=1,ct_state=-trk,udp,action=ct(commit,table=0) +in_port=1,ct_state=+trk,actions=controller +in_port=2,ct_state=-trk,action=ct(table=0) +in_port=2,ct_state=+trk-inv-new,action=controller +]) + +AT_CHECK([ovs-ofctl add-flows br0 flows.txt]) + +AT_CAPTURE_FILE([ofctl_monitor.log]) +AT_CHECK([ovs-ofctl monitor br0 65534 invalid_ttl --detach --no-chdir --pidfile 2> ofctl_monitor.log]) + +dnl 1. Send an ICMP port unreach reply for port 8738, without any previous request +AT_CHECK([ovs-ofctl -O OpenFlow13 packet-out br0 2 ct\(table=0\) 'f64c473528c9c6f54ecb72db080045c0003d2e8700004001f355ac100004ac1000030303553f0000000045000021317040004011b138ac100003ac10000411112222000d20966369616f0a']) + +dnl 2. Send and UDP packet to port 5555 +AT_CHECK([ovs-ofctl -O OpenFlow13 packet-out br0 1 ct\(commit,table=0\) 'c6f94ecb72dbe64c473528c9080045000021317040004011b138ac100001ac100002a28e15b3000d20966369616f0a']) + +dnl 3. Send an ICMP port unreach reply for port 5555, related to the first packet +AT_CHECK([ovs-ofctl -O OpenFlow13 packet-out br0 2 ct\(table=0\) 'e64c473528c9c6f94ecb72db080045c0003d2e8700004001f355ac100002ac1000010303553f0000000045000021317040004011b138ac100001ac100002a28e15b3000d20966369616f0a']) + +dnl Check this output. We only see the latter two packets, not the first. +AT_CHECK([cat ofctl_monitor.log], [0], [dnl +NXT_PACKET_IN (xid=0x0): cookie=0x0 total_len=47 ct_state=new|trk,in_port=1 (via action) data_len=47 (unbuffered) +udp,vlan_tci=0x0000,dl_src=e6:4c:47:35:28:c9,dl_dst=c6:f9:4e:cb:72:db,nw_src=172.16.0.1,nw_dst=172.16.0.2,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=41614,tp_dst=5555 udp_csum:2096 +NXT_PACKET_IN (xid=0x0): cookie=0x0 total_len=75 ct_state=rel|rpl|trk,in_port=2 (via action) data_len=75 (unbuffered) +icmp,vlan_tci=0x0000,dl_src=c6:f9:4e:cb:72:db,dl_dst=e6:4c:47:35:28:c9,nw_src=172.16.0.2,nw_dst=172.16.0.1,nw_tos=192,nw_ecn=0,nw_ttl=64,icmp_type=3,icmp_code=3 icmp_csum:553f +]) + +OVS_TRAFFIC_VSWITCHD_STOP +AT_CLEANUP diff --git a/tests/system-userspace-macros.at b/tests/system-userspace-macros.at index fca26f7..16256cb 100644 --- a/tests/system-userspace-macros.at +++ b/tests/system-userspace-macros.at @@ -37,3 +37,12 @@ m4_define([OVS_TRAFFIC_VSWITCHD_STOP], /dpif_netlink.*Generic Netlink family 'ovs_datapath' does not exist. The Open vSwitch kernel module is probably not loaded./d"]) AT_CHECK([:; $2]) ]) + +# CHECK_CONNTRACK() +# +# Perform requirements checks for running conntrack tests, and flush the +# kernel conntrack tables when the test is finished. +# +m4_define([CHECK_CONNTRACK], + [AT_SKIP_IF(true)] +) diff --git a/tests/test-l7.py b/tests/test-l7.py new file mode 100755 index 0000000..65c6c2a --- /dev/null +++ b/tests/test-l7.py @@ -0,0 +1,72 @@ +# Copyright (c) 2015 Nicira, Inc. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at: +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import argparse +import socket + +from BaseHTTPServer import HTTPServer +from SimpleHTTPServer import SimpleHTTPRequestHandler +from SocketServer import TCPServer + + +class TCPServerV6(HTTPServer): + address_family = socket.AF_INET6 + + +def get_ftpd(): + try: + from pyftpdlib.authorizers import DummyAuthorizer + from pyftpdlib.handlers import FTPHandler + from pyftpdlib.servers import FTPServer + + class OVSFTPHandler(FTPHandler): + authorizer = DummyAuthorizer() + authorizer.add_anonymous("/tmp") + server = [FTPServer, OVSFTPHandler, 21] + except ImportError: + server = None + pass + return server + + +def main(): + SERVERS = { + 'http': [TCPServer, SimpleHTTPRequestHandler, 80], + 'http6': [TCPServerV6, SimpleHTTPRequestHandler, 80], + } + + ftpd = get_ftpd() + if ftpd is not None: + SERVERS['ftp'] = ftpd + + protocols = [srv for srv in SERVERS] + parser = argparse.ArgumentParser( + description='Run basic application servers.') + parser.add_argument('proto', default='http', nargs='?', + help='protocol to serve (%s)' % protocols) + args = parser.parse_args() + + if args.proto not in SERVERS: + parser.print_help() + exit(1) + + constructor = SERVERS[args.proto][0] + handler = SERVERS[args.proto][1] + port = SERVERS[args.proto][2] + srv = constructor(('', port), handler) + srv.serve_forever() + + +if __name__ == '__main__': + main() diff --git a/tests/test-odp.c b/tests/test-odp.c index 3e7939e..cb245d6 100644 --- a/tests/test-odp.c +++ b/tests/test-odp.c @@ -57,7 +57,10 @@ parse_keys(bool wc_keys) struct odp_flow_key_parms odp_parms = { .flow = &flow, .support = { + .max_mpls_depth = SIZE_MAX, .recirc = true, + .ct_state = true, + .ct_zone = true, }, }; diff --git a/utilities/ovs-ofctl.8.in b/utilities/ovs-ofctl.8.in index 8bb3715..7e5fcaa 100644 --- a/utilities/ovs-ofctl.8.in +++ b/utilities/ovs-ofctl.8.in @@ -1305,6 +1305,65 @@ is used only with the \fBconjunction\fR action (see below). .IP This field was introduced in Open vSwitch 2.4. . +.IP \fBct_state=\fIflags\fB/\fImask\fR +.IQ \fBct_state=\fR[\fB+\fIflag\fR...][\fB-\fIflag\fR...] +Bitwise match on connection state flags. The flags are only available +after a call to the \fBct\fR action with the \fBtable\fR specified. + +.IP +The \fIflags\fR and \fImask\fR are 8-bit numbers written in decimal or +in hexadecimal prefixed by \fB0x\fR. Each 1-bit in \fImask\fR requires +that the corresponding bit in \fIflags\fR must match. Each 0-bit in +\fImask\fR causes the corresponding bit to be ignored. +.IP +Alternatively, the flags can be specified by their symbolic names +(listed below), each preceded by either \fB+\fR for a flag that must +be set, or \fB\-\fR for a flag that must be unset, without any other +delimiters between the flags. Flags not mentioned are wildcarded. For +example, \fBtcp,ct_state=+trk\-new\fR matches TCP packets that +have been run through the connection tracker and do not establish a new +flow. +.IP +The following flags describe the state of the tracking: +.RS +.IP "\fB0x80: trk\fR" +Connection tracking has occurred. +.IP "\fB0x40: rpl\fR" +The flow is in the reply direction, meaning it did not initiate the +connection. +.IP "\fB0x20: inv\fR" +The flow is invalid, meaning that the connection tracker couldn't identify the +connection. +.RS +.PP +This flag may be set for the following reasons: +.RS +L3/L4 protocol handler is not loaded/unavailable. With the Linux kernel +datapath, this may mean that the "nf_conntrack_ipv4" or "nf_conntrack_ipv6" +modules are not loaded. +.PP +L3/L4 protocol handler determines that the packet is malformed or invalid. +.PP +Packets are unexpected length for protocol. +.RE +.RE +.IP "\fB0x01: new\fR" +This is the beginning of a new connection. +.IP "\fB0x02: est\fR" +This is part of an already existing connection. +.IP "\fB0x04: rel\fR" +This is a connection that is related to an existing connection, for +instance ICMP "destination unreachable" messages or FTP data connections. +.RE +. +.PP +The following fields are data associated with the connection tracker and +can only be matched or set after running through the connection tracker +by using the \fBct\fR action. +. +.IP \fBct_zone=\fIvalue +Matches connection zone \fIvalue\fR exactly. +. .PP Defining IPv6 flows (those with \fBdl_type\fR equal to 0x86dd) requires support for NXM. The following shorthand notations are available for @@ -1542,6 +1601,32 @@ OpenFlow implementations do not support queuing at all. Restores the queue to the value it was before any \fBset_queue\fR actions were applied. . +.IP \fBct\fR +.IQ \fBct\fB(\fR[\fIargument\fR][\fB,\fIargument\fR...]\fB) +Send the packet through the connection tracker. The following arguments +are supported: + +.RS +.IP \fBcommit\fR +Commit the flow to the connection tracking module. +.IP \fBtable=\fInumber\fR +After ct processing, the packet should be re-inserted into OpenFlow pipeline +processing in table \fInumber\fR, with the \fBct_state\fR and other ct match +fields set. It is strongly recommended to use a table later than the current +table to prevent loops. +.IP \fBzone=\fIvalue\fR +.IQ \fBzone=\fIsrc\fB[\fIstart\fB..\fIend\fB]\fR +A 16-bit context id that can be used to isolate connections into separate +domains, allowing overlapping network addresses in different zones. If a zone +is not provided, then the default is to use zone zero. The \fBzone\fR may be +specified either as an immediate 16-bit \fIvalue\fR, or may be provided from an +NXM field \fIsrc\fR. The \fIstart\fR and \fIend\fR pair are inclusive, and must +specify a 16-bit range within the field. +.RE +.IP +Currently, connection tracking is only available on Linux kernels with the +conntrack module loaded. +. .IP \fBdec_ttl\fR .IQ \fBdec_ttl\fB[\fR(\fIid1,id2\fI)\fR]\fR Decrement TTL of IPv4 packet or hop limit of IPv6 packet. If the