From patchwork Mon May 27 15:12:56 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ilya Maximets
X-Patchwork-Id: 1105880
From: Ilya Maximets
To: ovs-dev@openvswitch.org
Date: Mon, 27 May 2019 18:12:56 +0300
Message-Id: <20190527151256.16434-1-i.maximets@samsung.com>
X-Mailer: git-send-email 2.17.1
Cc: Simon Horman, Ilya Maximets
Subject: [ovs-dev] [PATCH v2] dpif-netdev: Forwarding optimization for direct output flows.

There are some cases where users want to have simple forwarding or drop
rules for all packets received from a particular port, e.g.:

  "in_port=1,actions=2"
  "in_port=1,actions=IN_PORT"
  "in_port=1,actions=drop"

There are also cases where complex OF flows could be simplified down to
simple forwarding/drop datapath flows.

In theory, we don't need to parse packets at all to follow these flows.
The "direct output forwarding" optimization is intended to speed up the
above cases.

Design:

Due to various implementation restrictions, the userspace datapath always
has the following flow fields in exact match (i.e. it's required to match
at least these fields of a packet even if the OF rule doesn't need that):

  - recirc_id
  - in_port
  - packet_type
  - dl_type
  - vlan_tci
  - nw_frag (for ip packets)

Not all of these fields are related to the packet itself.  We already know
the current 'recirc_id' and the 'in_port' before starting the packet
processing.  It also seems safe to assume that we're working with Ethernet
packets.  dpif-netdev sets exact match on 'vlan_tci' to avoid issues with
flow format conversion, and we don't really need to match on it unless the
ofproto layer asks us to.
So, for a simple forwarding OF rule we need to match only on 'dl_type'
and 'nw_frag'.

'in_port', 'dl_type' and 'nw_frag' can be combined into a single 64-bit
integer (the mark) that can be hashed and used as a key in a hash map.
A new per-PMD flow table, 'direct_output_table', is introduced to store
direct output flows only.  'dp_netdev_flow_add' adds the flow to the usual
'flow_table' and also to 'direct_output_table' if the flow meets the
following constraints:

  - 'recirc_id' in flow match is 0.
  - 'packet_type' in flow match is Ethernet.
  - Flow wildcards originally had wildcarded 'vlan_tci'.
  - Flow has no actions (drop) or exactly one action equal to
    OVS_ACTION_ATTR_OUTPUT.
  - Flow wildcards contain only the minimal set of non-wildcarded fields
    (listed above).

If the number of flows for the current 'in_port' in the regular
'flow_table' equals the number of flows for the current 'in_port' in
'direct_output_table', we may use the direct output optimization, because
all the flows we have are direct output flows.  This means that we only
need to parse 'dl_type' and 'nw_frag' to perform packet matching.
We then build the unique flow mark from the 'in_port', 'dl_type' and
'nw_frag' and look it up in 'direct_output_table'.
On a successful lookup we don't need to do a full 'miniflow_extract()'.

An unsuccessful lookup technically means that there is no suitable flow in
the datapath and an upcall will be required.  We may optimize this path in
the future by bypassing the EMC, SMC and dpcls lookups in this case.

The performance improvement of this solution on 'direct output' flows
should be comparable with partial HW offloading, because it parses the same
packet fields and uses a similar flow lookup scheme.
However, unlike partial HW offloading, it works for all port types,
including virtual ones.

Signed-off-by: Ilya Maximets
---

This patch was made as a talking point for the "virtio-forwarder"
discussion:
  https://mail.openvswitch.org/pipermail/ovs-dev/2019-May/358686.html
However, it might be very useful by itself for usual cases too.

Testing is very welcome.
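
For readers who want to experiment with the key layout outside of OVS, the
mark described in the commit message can be sketched as a standalone
function.  This is only an illustration, not the code from the patch:
'example_direct_output_mark()' and the plain integer types are hypothetical
stand-ins for 'dp_netdev_direct_output_mark()', 'odp_port_t' and 'ovs_be16',
and a POSIX system is assumed for 'ntohs()'.

```c
#include <arpa/inet.h>   /* ntohs(), htons(); assumes a POSIX system. */
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the mark helper in the patch.
 *
 * Layout of the 64-bit mark:
 *   bits 63..32  in_port (host byte order)
 *   bits 31..16  dl_type (converted from network to host byte order)
 *   bits  7..0   nw_frag
 * Bits 15..8 stay zero because nw_frag is only 8 bits wide. */
static uint64_t
example_direct_output_mark(uint32_t in_port, uint16_t dl_type_be,
                           uint8_t nw_frag)
{
    uint64_t mark = ((uint64_t) in_port << 32)
                    | ((uint32_t) ntohs(dl_type_be) << 16) | nw_frag;

    /* Sanity check: the port must be recoverable from the upper half,
     * so flows from different ports can never share a mark. */
    assert((uint32_t) (mark >> 32) == in_port);
    return mark;
}
```

In this sketch, in_port 1 with the IPv4 EtherType (0x0800) and nw_frag 0
gives the mark 0x108000000.
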
I didn't run the performance tests on real systems, so I don't know the
real performance impact yet.

Version 2:
  * Updated comment about output arguments of 'parse_tcp_flags()'.
  * Fixed using uninitialized 'dl_type' and 'nw_frag' for non-IP/TCP.
  * 'dp_netdev_direct_output_enabled()' now checked once per batch.
  * Added check for 'packet_type == PT_ETH' before inserting flow.

 lib/dpif-netdev.c | 260 +++++++++++++++++++++++++++++++++++++++++++---
 lib/flow.c        |  12 ++-
 lib/flow.h        |   3 +-
 3 files changed, 256 insertions(+), 19 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 5a6f2abac..993997c3b 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -34,6 +34,7 @@

 #include "bitmap.h"
 #include "cmap.h"
+#include "ccmap.h"
 #include "conntrack.h"
 #include "coverage.h"
 #include "ct-dpif.h"
@@ -530,6 +531,8 @@ struct dp_netdev_flow {
     /* Hash table index by unmasked flow. */
     const struct cmap_node node; /* In owning dp_netdev_pmd_thread's */
                                  /* 'flow_table'. */
+    const struct cmap_node direct_output_node; /* In dp_netdev_pmd_thread's
+                                                  'direct_output_table'. */
     const struct cmap_node mark_node; /* In owning flow_mark's mark_to_flow */
     const ovs_u128 ufid;         /* Unique flow identifier. */
     const ovs_u128 mega_ufid;    /* Unique mega flow identifier. */
@@ -543,7 +546,8 @@ struct dp_netdev_flow {
     struct ovs_refcount ref_cnt;

     bool dead;
-    uint32_t mark;               /* Unique flow mark assigned to a flow */
+    uint32_t mark;               /* Unique flow mark for netdev offloading. */
+    uint64_t direct_output_mark; /* Unique flow mark for direct output. */

     /* Statistics. */
     struct dp_netdev_flow_stats stats;
@@ -658,12 +662,19 @@ struct dp_netdev_pmd_thread {

     /* Flow-Table and classifiers
      *
-     * Writers of 'flow_table' must take the 'flow_mutex'.  Corresponding
-     * changes to 'classifiers' must be made while still holding the
-     * 'flow_mutex'.
+     * Writers of 'flow_table'/'direct_output_table' and their n* ccmap's must
+     * take the 'flow_mutex'.  Corresponding changes to 'classifiers' must be
+     * made while still holding the 'flow_mutex'.
      */
     struct ovs_mutex flow_mutex;
     struct cmap flow_table OVS_GUARDED; /* Flow table. */
+    struct cmap direct_output_table OVS_GUARDED; /* Flow table with direct
+                                                    output flows only. */
+    struct ccmap n_flows OVS_GUARDED; /* Number of flows in 'flow_table'
+                                         per in_port. */
+    struct ccmap n_direct_flows OVS_GUARDED; /* Number of flows in
+                                                'direct_output_table'
+                                                per in_port. */

     /* One classifier per in_port polled by the pmd */
     struct cmap classifiers;
@@ -835,6 +846,24 @@ pmd_perf_metrics_enabled(const struct dp_netdev_pmd_thread *pmd);
 static void queue_netdev_flow_del(struct dp_netdev_pmd_thread *pmd,
                                   struct dp_netdev_flow *flow);

+static void dp_netdev_direct_output_insert(struct dp_netdev_pmd_thread *pmd,
+                                           struct dp_netdev_flow *flow)
+    OVS_REQUIRES(pmd->flow_mutex);
+static void dp_netdev_direct_output_remove(struct dp_netdev_pmd_thread *pmd,
+                                           struct dp_netdev_flow *flow)
+    OVS_REQUIRES(pmd->flow_mutex);
+
+static bool dp_netdev_flow_is_direct_output(const struct flow_wildcards *wc,
+                                            const struct nlattr *actions,
+                                            size_t actions_len);
+static bool
+dp_netdev_direct_output_enabled(const struct dp_netdev_pmd_thread *pmd,
+                                odp_port_t in_port);
+static struct dp_netdev_flow *
+dp_netdev_direct_output_lookup(const struct dp_netdev_pmd_thread *pmd,
+                               odp_port_t in_port,
+                               ovs_be16 dl_type, uint8_t nw_frag);
+
 static void
 emc_cache_init(struct emc_cache *flow_cache)
 {
@@ -2516,7 +2545,9 @@ dp_netdev_pmd_remove_flow(struct dp_netdev_pmd_thread *pmd,
     cls = dp_netdev_pmd_lookup_dpcls(pmd, in_port);
     ovs_assert(cls != NULL);
     dpcls_remove(cls, &flow->cr);
+    dp_netdev_direct_output_remove(pmd, flow);
     cmap_remove(&pmd->flow_table, node, dp_netdev_flow_hash(&flow->ufid));
+    ccmap_dec(&pmd->n_flows, odp_to_u32(in_port));
     if (flow->mark != INVALID_FLOW_MARK) {
         queue_netdev_flow_del(pmd, flow);
     }
@@ -3195,10 +3226,166 @@ dp_netdev_get_mega_ufid(const struct match *match, ovs_u128 *mega_ufid)
     dpif_flow_hash(NULL, &masked_flow, sizeof(struct flow), mega_ufid);
 }

+static uint64_t
+dp_netdev_direct_output_mark(odp_port_t in_port,
+                             ovs_be16 dl_type, uint8_t nw_frag)
+{
+    return ((uint64_t) odp_to_u32(in_port) << 32)
+           | ((uint32_t) ntohs(dl_type) << 16) | nw_frag;
+}
+
+static struct dp_netdev_flow *
+dp_netdev_direct_output_lookup(const struct dp_netdev_pmd_thread *pmd,
+                               odp_port_t in_port,
+                               ovs_be16 dl_type, uint8_t nw_frag)
+{
+    uint32_t hash;
+    uint64_t mark;
+    struct dp_netdev_flow *flow;
+
+    mark = dp_netdev_direct_output_mark(in_port, dl_type, nw_frag);
+    hash = hash_uint64(mark);
+
+    CMAP_FOR_EACH_WITH_HASH (flow, direct_output_node,
+                             hash, &pmd->direct_output_table) {
+        if (flow->direct_output_mark == mark) {
+            VLOG_DBG("Direct output lookup: "
+                     "core_id(%d),in_port(%"PRIu32"),mark(0x%"PRIx64") -> %s.",
+                     pmd->core_id, in_port, mark, "success");
+            return flow;
+        }
+    }
+    VLOG_DBG("Direct output lookup: "
+             "core_id(%d),in_port(%"PRIu32"),mark(0x%"PRIx64") -> %s.",
+             pmd->core_id, in_port, mark, "fail");
+    return NULL;
+}
+
+static bool
+dp_netdev_direct_output_enabled(const struct dp_netdev_pmd_thread *pmd,
+                                odp_port_t in_port)
+{
+    return ccmap_find(&pmd->n_flows, odp_to_u32(in_port))
+           == ccmap_find(&pmd->n_direct_flows, odp_to_u32(in_port));
+}
+
+static void
+dp_netdev_direct_output_insert(struct dp_netdev_pmd_thread *pmd,
+                               struct dp_netdev_flow *dp_flow)
+    OVS_REQUIRES(pmd->flow_mutex)
+{
+    uint32_t hash;
+    uint64_t mark;
+    uint8_t nw_frag = dp_flow->flow.nw_frag;
+    ovs_be16 dl_type = dp_flow->flow.dl_type;
+    odp_port_t in_port = dp_flow->flow.in_port.odp_port;
+
+    if (!dp_netdev_flow_ref(dp_flow)) {
+        return;
+    }
+
+    /* Avoid double insertion.  Should not happen in practice. */
+    dp_netdev_direct_output_remove(pmd, dp_flow);
+
+    mark = dp_netdev_direct_output_mark(in_port, dl_type, nw_frag);
+    hash = hash_uint64(mark);
+
+    dp_flow->direct_output_mark = mark;
+    cmap_insert(&pmd->direct_output_table,
+                CONST_CAST(struct cmap_node *, &dp_flow->direct_output_node),
+                hash);
+    ccmap_inc(&pmd->n_direct_flows, odp_to_u32(in_port));
+
+    VLOG_DBG("Direct output insert: "
+             "core_id(%d),in_port(%"PRIu32"),mark(0x%"PRIx64").",
+             pmd->core_id, in_port, mark);
+}
+
+static void
+dp_netdev_direct_output_remove(struct dp_netdev_pmd_thread *pmd,
+                               struct dp_netdev_flow *dp_flow)
+    OVS_REQUIRES(pmd->flow_mutex)
+{
+    uint32_t hash;
+    uint64_t mark;
+    struct dp_netdev_flow *flow;
+    uint8_t nw_frag = dp_flow->flow.nw_frag;
+    ovs_be16 dl_type = dp_flow->flow.dl_type;
+    odp_port_t in_port = dp_flow->flow.in_port.odp_port;
+
+    mark = dp_netdev_direct_output_mark(in_port, dl_type, nw_frag);
+    hash = hash_uint64(mark);
+
+    flow = dp_netdev_direct_output_lookup(pmd, in_port, dl_type, nw_frag);
+    if (flow) {
+        ovs_assert(dp_flow == flow);
+        VLOG_DBG("Direct output remove: "
+                 "core_id(%d),in_port(%"PRIu32"),mark(0x%"PRIx64").",
+                 pmd->core_id, in_port, mark);
+        cmap_remove(&pmd->direct_output_table,
+                    CONST_CAST(struct cmap_node *, &flow->direct_output_node),
+                    hash);
+        ccmap_dec(&pmd->n_direct_flows, odp_to_u32(in_port));
+        dp_netdev_flow_unref(flow);
+    }
+}
+
+static bool
+dp_netdev_flow_is_direct_output(const struct flow_wildcards *wc,
+                                const struct nlattr *actions,
+                                size_t actions_len)
+{
+    /* Drop flows have no explicit actions.  Treat them as direct output. */
+    if (actions && actions_len) {
+        unsigned int left, n_actions = 0;
+        const struct nlattr *a;
+
+        /* Check that there is only one action and it's an OUTPUT action. */
+        NL_ATTR_FOR_EACH (a, left, actions, actions_len) {
+            enum ovs_action_attr type = nl_attr_type(a);
+
+            if (++n_actions > 1 || type != OVS_ACTION_ATTR_OUTPUT) {
+                return false;
+            }
+        }
+    }
+
+    /* Check that flow matches only the minimal set of always-set fields. */
+    if (wc) {
+        struct flow_wildcards *minimal = xmalloc(sizeof *minimal);
+
+        flow_wildcards_init_catchall(minimal);
+        /* 'dpif-netdev' always has following in exact match:
+         *   - recirc_id                   <-- recirc_id == 0 checked on input.
+         *   - in_port                     <-- will be checked on input.
+         *   - packet_type                 <-- Assuming all packets are PT_ETH.
+         *   - dl_type                     <-- Need to match with.
+         *   - vlan_tci                    <-- No need to match if not asked.
+         *   - and nw_frag for ip packets. <-- Need to match for ip packets.
+         */
+        WC_MASK_FIELD(minimal, recirc_id);
+        WC_MASK_FIELD(minimal, in_port);
+        WC_MASK_FIELD(minimal, packet_type);
+        WC_MASK_FIELD(minimal, dl_type);
+        WC_MASK_FIELD(minimal, vlans[0].tci);
+        WC_MASK_FIELD_MASK(minimal, nw_frag, FLOW_NW_FRAG_MASK);
+
+        if (flow_wildcards_has_extra(minimal, wc)) {
+            free(minimal);
+            return false;
+        }
+        free(minimal);
+    }
+
+    return true;
+}
+
+
 static struct dp_netdev_flow *
 dp_netdev_flow_add(struct dp_netdev_pmd_thread *pmd,
                    struct match *match, const ovs_u128 *ufid,
-                   const struct nlattr *actions, size_t actions_len)
+                   const struct nlattr *actions, size_t actions_len,
+                   bool vlan_tci_wc_faked)
     OVS_REQUIRES(pmd->flow_mutex)
 {
     struct dp_netdev_flow *flow;
@@ -3246,6 +3433,14 @@ dp_netdev_flow_add(struct dp_netdev_pmd_thread *pmd,

     cmap_insert(&pmd->flow_table, CONST_CAST(struct cmap_node *, &flow->node),
                 dp_netdev_flow_hash(&flow->ufid));
+    ccmap_inc(&pmd->n_flows, odp_to_u32(in_port));
+
+    if (vlan_tci_wc_faked
+        && match->flow.recirc_id == 0
+        && match->flow.packet_type == htonl(PT_ETH)
+        && dp_netdev_flow_is_direct_output(&match->wc, actions, actions_len)) {
+        dp_netdev_direct_output_insert(pmd, flow);
+    }

     queue_netdev_flow_put(pmd, flow, match, actions, actions_len);

@@ -3302,7 +3497,8 @@ flow_put_on_pmd(struct dp_netdev_pmd_thread *pmd,
                 struct match *match,
                 ovs_u128 *ufid,
                 const struct dpif_flow_put *put,
-                struct dpif_flow_stats *stats)
+                struct dpif_flow_stats *stats,
+                bool vlan_tci_wc_faked)
 {
     struct dp_netdev_flow *netdev_flow;
     int error = 0;
@@ -3317,7 +3513,7 @@ flow_put_on_pmd(struct dp_netdev_pmd_thread *pmd,
         if (put->flags & DPIF_FP_CREATE) {
             if (cmap_count(&pmd->flow_table) < MAX_FLOWS) {
                 dp_netdev_flow_add(pmd, match, ufid, put->actions,
-                                   put->actions_len);
+                                   put->actions_len, vlan_tci_wc_faked);
                 error = 0;
             } else {
                 error = EFBIG;
@@ -3336,6 +3532,12 @@ flow_put_on_pmd(struct dp_netdev_pmd_thread *pmd,
             old_actions = dp_netdev_flow_get_actions(netdev_flow);
             ovsrcu_set(&netdev_flow->actions, new_actions);

+            if (!dp_netdev_flow_is_direct_output(NULL, new_actions->actions,
+                                                 new_actions->size)) {
+                /* New actions are not direct output. */
+                dp_netdev_direct_output_remove(pmd, netdev_flow);
+            }
+
             queue_netdev_flow_put(pmd, netdev_flow, match,
                                   put->actions, put->actions_len);

@@ -3377,6 +3579,7 @@ dpif_netdev_flow_put(struct dpif *dpif, const struct dpif_flow_put *put)
     ovs_u128 ufid;
     int error;
     bool probe = put->flags & DPIF_FP_PROBE;
+    bool vlan_tci_wc_faked = false;

     if (put->stats) {
         memset(put->stats, 0, sizeof *put->stats);
@@ -3406,6 +3609,7 @@ dpif_netdev_flow_put(struct dpif *dpif, const struct dpif_flow_put *put)
      * Netlink and struct flow representations, we have to do the same
      * here.  This must be in sync with 'match' in handle_packet_upcall(). */
     if (!match.wc.masks.vlans[0].tci) {
+        vlan_tci_wc_faked = true;
         match.wc.masks.vlans[0].tci = htons(0xffff);
     }

@@ -3424,7 +3628,7 @@ dpif_netdev_flow_put(struct dpif *dpif, const struct dpif_flow_put *put)
             int pmd_error;

             pmd_error = flow_put_on_pmd(pmd, &key, &match, &ufid, put,
-                                        &pmd_stats);
+                                        &pmd_stats, vlan_tci_wc_faked);
             if (pmd_error) {
                 error = pmd_error;
             } else if (put->stats) {
@@ -3439,7 +3643,8 @@ dpif_netdev_flow_put(struct dpif *dpif, const struct dpif_flow_put *put)
         if (!pmd) {
             return EINVAL;
         }
-        error = flow_put_on_pmd(pmd, &key, &match, &ufid, put, put->stats);
+        error = flow_put_on_pmd(pmd, &key, &match, &ufid, put, put->stats,
+                                vlan_tci_wc_faked);
         dp_netdev_pmd_unref(pmd);
     }

@@ -5907,6 +6112,9 @@ dp_netdev_configure_pmd(struct dp_netdev_pmd_thread *pmd, struct dp_netdev *dp,
     ovs_mutex_init(&pmd->flow_mutex);
     ovs_mutex_init(&pmd->port_mutex);
     cmap_init(&pmd->flow_table);
+    cmap_init(&pmd->direct_output_table);
+    ccmap_init(&pmd->n_flows);
+    ccmap_init(&pmd->n_direct_flows);
     cmap_init(&pmd->classifiers);
     pmd->ctx.last_rxq = NULL;
     pmd_thread_ctx_time_update(pmd);
@@ -5944,6 +6152,9 @@ dp_netdev_destroy_pmd(struct dp_netdev_pmd_thread *pmd)
     }
     cmap_destroy(&pmd->classifiers);
     cmap_destroy(&pmd->flow_table);
+    cmap_destroy(&pmd->direct_output_table);
+    ccmap_destroy(&pmd->n_flows);
+    ccmap_destroy(&pmd->n_direct_flows);
     ovs_mutex_destroy(&pmd->flow_mutex);
     latch_destroy(&pmd->exit_latch);
     seq_destroy(pmd->reload_seq);
@@ -6390,6 +6601,7 @@ dfc_processing(struct dp_netdev_pmd_thread *pmd,
     bool smc_enable_db;
     size_t map_cnt = 0;
     bool batch_enable = true;
+    bool direct_output_enabled = dp_netdev_direct_output_enabled(pmd, port_no);

     atomic_read_relaxed(&pmd->dp->smc_enable_db, &smc_enable_db);
     pmd_perf_update_counter(&pmd->perf_stats,
@@ -6397,7 +6609,7 @@ dfc_processing(struct dp_netdev_pmd_thread *pmd,
                             cnt);

     DP_PACKET_BATCH_REFILL_FOR_EACH (i, cnt, packet, packets_) {
-        struct dp_netdev_flow *flow;
+        struct dp_netdev_flow *flow = NULL;
         uint32_t mark;

         if (OVS_UNLIKELY(dp_packet_size(packet) < ETH_HEADER_LEN)) {
@@ -6414,13 +6626,24 @@ dfc_processing(struct dp_netdev_pmd_thread *pmd,

         if (!md_is_valid) {
             pkt_metadata_init(&packet->md, port_no);
-        }

-        if ((*recirc_depth_get() == 0) &&
-            dp_packet_has_flow_mark(packet, &mark)) {
-            flow = mark_to_flow_find(pmd, mark);
-            if (OVS_LIKELY(flow)) {
-                tcp_flags = parse_tcp_flags(packet);
+            if (dp_packet_has_flow_mark(packet, &mark)) {
+                flow = mark_to_flow_find(pmd, mark);
+                if (OVS_LIKELY(flow)) {
+                    tcp_flags = parse_tcp_flags(packet, NULL, NULL);
+                }
+            }
+
+            if (!flow && direct_output_enabled) {
+                ovs_be16 dl_type = 0;
+                uint8_t nw_frag = 0;
+
+                tcp_flags = parse_tcp_flags(packet, &dl_type, &nw_frag);
+                flow = dp_netdev_direct_output_lookup(pmd, port_no,
+                                                      dl_type, nw_frag);
+            }
+
+            if (flow) {
                 if (OVS_LIKELY(batch_enable)) {
                     dp_netdev_queue_batches(packet, flow, tcp_flags, batches,
                                             n_batches);
@@ -6508,6 +6731,7 @@ handle_packet_upcall(struct dp_netdev_pmd_thread *pmd,
     ovs_u128 ufid;
     int error;
     uint64_t cycles = cycles_counter_update(&pmd->perf_stats);
+    bool vlan_tci_wc_faked = false;

     match.tun_md.valid = false;
     miniflow_expand(&key->mf, &match.flow);
@@ -6532,6 +6756,7 @@ handle_packet_upcall(struct dp_netdev_pmd_thread *pmd,
      * here.  This must be in sync with 'match' in dpif_netdev_flow_put(). */
     if (!match.wc.masks.vlans[0].tci) {
         match.wc.masks.vlans[0].tci = htons(0xffff);
+        vlan_tci_wc_faked = true;
     }

     /* We can't allow the packet batching in the next loop to execute
@@ -6555,7 +6780,8 @@ handle_packet_upcall(struct dp_netdev_pmd_thread *pmd,
         if (OVS_LIKELY(!netdev_flow)) {
             netdev_flow = dp_netdev_flow_add(pmd, &match, &ufid,
                                              add_actions->data,
-                                             add_actions->size);
+                                             add_actions->size,
+                                             vlan_tci_wc_faked);
         }
         ovs_mutex_unlock(&pmd->flow_mutex);
         uint32_t hash = dp_netdev_flow_hash(&netdev_flow->ufid);
diff --git a/lib/flow.c b/lib/flow.c
index f39b57f5b..88c54a37b 100644
--- a/lib/flow.c
+++ b/lib/flow.c
@@ -1089,11 +1089,14 @@ parse_dl_type(const struct eth_header *data_, size_t size)

 /* Parses and return the TCP flags in 'packet', converted to host byte order.
  * If 'packet' is not an Ethernet packet embedding TCP, returns 0.
+ * 'dl_type_p' will be set only if 'packet' is an Ethernet packet.
+ * 'nw_frag_p' will be set only if 'packet' is an IP packet.
  *
  * The caller must ensure that 'packet' is at least ETH_HEADER_LEN bytes
  * long.'*/
 uint16_t
-parse_tcp_flags(struct dp_packet *packet)
+parse_tcp_flags(struct dp_packet *packet,
+                ovs_be16 *dl_type_p, uint8_t *nw_frag_p)
 {
     const void *data = dp_packet_data(packet);
     const char *frame = (const char *)data;
@@ -1109,6 +1112,9 @@ parse_tcp_flags(struct dp_packet *packet)

     data_pull(&data, &size, ETH_ADDR_LEN * 2);
     dl_type = parse_ethertype(&data, &size);
+    if (dl_type_p) {
+        *dl_type_p = dl_type;
+    }
     if (OVS_UNLIKELY(eth_type_mpls(dl_type))) {
         packet->l2_5_ofs = (char *)data - frame;
     }
@@ -1150,6 +1156,10 @@ parse_tcp_flags(struct dp_packet *packet)
         return 0;
     }

+    if (nw_frag_p) {
+        *nw_frag_p = nw_frag;
+    }
+
     packet->l4_ofs = (uint16_t)((char *)data - frame);
     if (!(nw_frag & FLOW_NW_FRAG_LATER) && nw_proto == IPPROTO_TCP
         && size >= TCP_HEADER_LEN) {
diff --git a/lib/flow.h b/lib/flow.h
index 7298c71f3..c46bf9e32 100644
--- a/lib/flow.h
+++ b/lib/flow.h
@@ -135,7 +135,8 @@ bool parse_ipv6_ext_hdrs(const void **datap, size_t *sizep, uint8_t *nw_proto,
                          const struct ovs_16aligned_ip6_frag **frag_hdr);
 ovs_be16 parse_dl_type(const struct eth_header *data_, size_t size);
 bool parse_nsh(const void **datap, size_t *sizep, struct ovs_key_nsh *key);
-uint16_t parse_tcp_flags(struct dp_packet *packet);
+uint16_t parse_tcp_flags(struct dp_packet *packet, ovs_be16 *dl_type_p,
+                         uint8_t *nw_frag_p);

 static inline uint64_t
 flow_get_xreg(const struct flow *flow, int idx)
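
A note on the enable check in this patch: 'dp_netdev_direct_output_enabled()'
compares two per-port counters, and the optimization is used only while they
are equal.  The standalone sketch below mimics that bookkeeping with plain
arrays instead of 'ccmap'.  It is an illustration under stated assumptions,
not the patch's implementation: 'struct example_counters',
'EXAMPLE_MAX_PORTS' and the 'example_*()' helpers are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EXAMPLE_MAX_PORTS 8   /* Hypothetical bound, for the sketch only. */

/* Hypothetical stand-in for the two per-port 'ccmap' counters. */
struct example_counters {
    uint32_t n_flows[EXAMPLE_MAX_PORTS];         /* All flows per in_port. */
    uint32_t n_direct_flows[EXAMPLE_MAX_PORTS];  /* Direct output flows. */
};

/* Accounts for a newly installed flow on 'port'. */
static void
example_flow_added(struct example_counters *c, uint32_t port, bool direct)
{
    assert(port < EXAMPLE_MAX_PORTS);
    c->n_flows[port]++;
    if (direct) {
        c->n_direct_flows[port]++;
    }
}

/* Accounts for a removed flow on 'port'. */
static void
example_flow_removed(struct example_counters *c, uint32_t port, bool direct)
{
    assert(port < EXAMPLE_MAX_PORTS && c->n_flows[port] > 0);
    c->n_flows[port]--;
    if (direct) {
        assert(c->n_direct_flows[port] > 0);
        c->n_direct_flows[port]--;
    }
}

/* True while every flow installed for 'port' is a direct output flow. */
static bool
example_direct_output_enabled(const struct example_counters *c, uint32_t port)
{
    assert(port < EXAMPLE_MAX_PORTS);
    return c->n_flows[port] == c->n_direct_flows[port];
}
```

In this sketch a port with no flows at all also passes the check (0 == 0).
A single non-direct flow on the port disables the fast path until that flow
is removed again.
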