From patchwork Thu Aug 22 10:09:16 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yanqin Wei X-Patchwork-Id: 1151440 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=openvswitch.org (client-ip=140.211.169.12; helo=mail.linuxfoundation.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=arm.com Received: from mail.linuxfoundation.org (mail.linuxfoundation.org [140.211.169.12]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 46DgHj469qz9sN6 for ; Thu, 22 Aug 2019 20:09:37 +1000 (AEST) Received: from mail.linux-foundation.org (localhost [127.0.0.1]) by mail.linuxfoundation.org (Postfix) with ESMTP id C1048DC1; Thu, 22 Aug 2019 10:09:29 +0000 (UTC) X-Original-To: dev@openvswitch.org Delivered-To: ovs-dev@mail.linuxfoundation.org Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org [172.17.192.35]) by mail.linuxfoundation.org (Postfix) with ESMTPS id 298B6DB6 for ; Thu, 22 Aug 2019 10:09:28 +0000 (UTC) X-Greylist: domain auto-whitelisted by SQLgrey-1.7.6 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by smtp1.linuxfoundation.org (Postfix) with ESMTP id 62A3367F for ; Thu, 22 Aug 2019 10:09:27 +0000 (UTC) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 179B515A2; Thu, 22 Aug 2019 03:09:27 -0700 (PDT) Received: from net-arm-thunderx2-01.test.ast.arm.com (net-arm-thunderx2-01.shanghai.arm.com [10.169.40.40]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id BFCF33F246; Thu, 22 Aug 2019 03:09:25 -0700 (PDT) From: Yanqin Wei To: dev@openvswitch.org Date: Thu, 22 Aug 2019 18:09:16 +0800 Message-Id: <1566468556-21195-1-git-send-email-Yanqin.Wei@arm.com> X-Mailer: git-send-email 2.7.4 X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00 autolearn=ham version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on smtp1.linux-foundation.org Cc: nd@arm.com, Gavin.Hu@arm.com Subject: [ovs-dev] [PATCH v2] flow: miniflow_extract metadata branchless optimization X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Sender: ovs-dev-bounces@openvswitch.org Errors-To: ovs-dev-bounces@openvswitch.org "miniflow_extract" is a branch heavy implementation for packet header and metadata parsing. There is a lot of meta data handling for all traffic. But this should not be applicable for packets from interface. This patch adds a layer of inline encapsulation to miniflow_extract and introduces constant "md_valid" input parameter as a branch condition. The new branch will be removed by the compiler at compile time. Two instances of miniflow_extract with different branches will be generated. This patch is tested on an arm64 platform. It improves more than 3.5% performance in P2P forwarding cases. Reviewed-by: Gavin Hu Signed-off-by: Yanqin Wei --- lib/dpif-netdev.c | 13 +++--- lib/flow.c | 116 ++++++++++++++++++++++++++++++++---------------------- lib/flow.h | 2 + 3 files changed, 79 insertions(+), 52 deletions(-) diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index d0a1c58..6686b14 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -6508,12 +6508,15 @@ dfc_processing(struct dp_netdev_pmd_thread *pmd, } } - miniflow_extract(packet, &key->mf); + if (!md_is_valid) { + miniflow_extract_firstpass(packet, &key->mf); + key->hash = + dpif_netdev_packet_get_rss_hash_orig_pkt(packet, &key->mf); + } else { + miniflow_extract(packet, &key->mf); + key->hash = dpif_netdev_packet_get_rss_hash(packet, &key->mf); + } key->len = 0; /* Not computed yet. */ - key->hash = - (md_is_valid == false) - ? dpif_netdev_packet_get_rss_hash_orig_pkt(packet, &key->mf) - : dpif_netdev_packet_get_rss_hash(packet, &key->mf); /* If EMC is disabled skip emc_lookup */ flow = (cur_min != 0) ? emc_lookup(&cache->emc_cache, key) : NULL; diff --git a/lib/flow.c b/lib/flow.c index e54fd2e..e5b554b 100644 --- a/lib/flow.c +++ b/lib/flow.c @@ -707,7 +707,8 @@ ipv6_sanity_check(const struct ovs_16aligned_ip6_hdr *nh, size_t size) } /* Initializes 'dst' from 'packet' and 'md', taking the packet type into - * account. 'dst' must have enough space for FLOW_U64S * 8 bytes. + * account. 'dst' must have enough space for FLOW_U64S * 8 bytes. Metadata + * initialization should be bypassed if "md_valid" is false. * * Initializes the layer offsets as follows: * @@ -732,8 +733,9 @@ ipv6_sanity_check(const struct ovs_16aligned_ip6_hdr *nh, size_t size) * present and the packet has at least the content used for the fields * of interest for the flow, otherwise UINT16_MAX. */ -void -miniflow_extract(struct dp_packet *packet, struct miniflow *dst) +static inline ALWAYS_INLINE void +miniflow_extract__(struct dp_packet *packet, struct miniflow *dst, + const bool md_valid) { /* Add code to this function (or its callees) to extract new fields. */ BUILD_ASSERT_DECL(FLOW_WC_SEQ == 41); @@ -752,54 +754,60 @@ miniflow_extract(struct dp_packet *packet, struct miniflow *dst) ovs_be16 ct_tp_src = 0, ct_tp_dst = 0; /* Metadata. */ - if (flow_tnl_dst_is_set(&md->tunnel)) { - miniflow_push_words(mf, tunnel, &md->tunnel, - offsetof(struct flow_tnl, metadata) / - sizeof(uint64_t)); - - if (!(md->tunnel.flags & FLOW_TNL_F_UDPIF)) { - if (md->tunnel.metadata.present.map) { - miniflow_push_words(mf, tunnel.metadata, &md->tunnel.metadata, - sizeof md->tunnel.metadata / - sizeof(uint64_t)); - } - } else { - if (md->tunnel.metadata.present.len) { - miniflow_push_words(mf, tunnel.metadata.present, - &md->tunnel.metadata.present, 1); - miniflow_push_words(mf, tunnel.metadata.opts.gnv, - md->tunnel.metadata.opts.gnv, - DIV_ROUND_UP(md->tunnel.metadata.present.len, - sizeof(uint64_t))); + if (md_valid) { + if (flow_tnl_dst_is_set(&md->tunnel)) { + miniflow_push_words(mf, tunnel, &md->tunnel, + offsetof(struct flow_tnl, metadata) / + sizeof(uint64_t)); + + if (!(md->tunnel.flags & FLOW_TNL_F_UDPIF)) { + if (md->tunnel.metadata.present.map) { + miniflow_push_words(mf, tunnel.metadata, + &md->tunnel.metadata, + sizeof md->tunnel.metadata / + sizeof(uint64_t)); + } + } else { + if (md->tunnel.metadata.present.len) { + miniflow_push_words(mf, tunnel.metadata.present, + &md->tunnel.metadata.present, 1); + miniflow_push_words(mf, tunnel.metadata.opts.gnv, + md->tunnel.metadata.opts.gnv, + DIV_ROUND_UP( + md->tunnel.metadata.present.len, + sizeof(uint64_t))); + } } } - } - if (md->skb_priority || md->pkt_mark) { - miniflow_push_uint32(mf, skb_priority, md->skb_priority); - miniflow_push_uint32(mf, pkt_mark, md->pkt_mark); - } - miniflow_push_uint32(mf, dp_hash, md->dp_hash); - miniflow_push_uint32(mf, in_port, odp_to_u32(md->in_port.odp_port)); - if (md->ct_state) { - miniflow_push_uint32(mf, recirc_id, md->recirc_id); - miniflow_push_uint8(mf, ct_state, md->ct_state); - ct_nw_proto_p = miniflow_pointer(mf, ct_nw_proto); - miniflow_push_uint8(mf, ct_nw_proto, 0); - miniflow_push_uint16(mf, ct_zone, md->ct_zone); - } else if (md->recirc_id) { - miniflow_push_uint32(mf, recirc_id, md->recirc_id); - miniflow_pad_to_64(mf, recirc_id); - } - - if (md->ct_state) { - miniflow_push_uint32(mf, ct_mark, md->ct_mark); - miniflow_push_be32(mf, packet_type, packet_type); - - if (!ovs_u128_is_zero(md->ct_label)) { - miniflow_push_words(mf, ct_label, &md->ct_label, - sizeof md->ct_label / sizeof(uint64_t)); + if (md->skb_priority || md->pkt_mark) { + miniflow_push_uint32(mf, skb_priority, md->skb_priority); + miniflow_push_uint32(mf, pkt_mark, md->pkt_mark); + } + miniflow_push_uint32(mf, dp_hash, md->dp_hash); + miniflow_push_uint32(mf, in_port, odp_to_u32(md->in_port.odp_port)); + if (md->ct_state) { + miniflow_push_uint32(mf, recirc_id, md->recirc_id); + miniflow_push_uint8(mf, ct_state, md->ct_state); + ct_nw_proto_p = miniflow_pointer(mf, ct_nw_proto); + miniflow_push_uint8(mf, ct_nw_proto, 0); + miniflow_push_uint16(mf, ct_zone, md->ct_zone); + } else if (md->recirc_id) { + miniflow_push_uint32(mf, recirc_id, md->recirc_id); + miniflow_pad_to_64(mf, recirc_id); + } + + if (md->ct_state) { + miniflow_push_uint32(mf, ct_mark, md->ct_mark); + miniflow_push_be32(mf, packet_type, packet_type); + + if (!ovs_u128_is_zero(md->ct_label)) { + miniflow_push_words(mf, ct_label, &md->ct_label, + sizeof md->ct_label / sizeof(uint64_t)); + } } } else { + miniflow_push_uint32(mf, dp_hash, md->dp_hash); + miniflow_push_uint32(mf, in_port, odp_to_u32(md->in_port.odp_port)); miniflow_pad_from_64(mf, packet_type); miniflow_push_be32(mf, packet_type, packet_type); } @@ -865,6 +873,7 @@ miniflow_extract(struct dp_packet *packet, struct miniflow *dst) /* Push both source and destination address at once. */ miniflow_push_words(mf, nw_src, &nh->ip_src, 1); + if (ct_nw_proto_p && !md->ct_orig_tuple_ipv6) { *ct_nw_proto_p = md->ct_orig_tuple.ipv4.ipv4_proto; if (*ct_nw_proto_p) { @@ -900,6 +909,7 @@ miniflow_extract(struct dp_packet *packet, struct miniflow *dst) sizeof nh->ip6_src / 8); miniflow_push_words(mf, ipv6_dst, &nh->ip6_dst, sizeof nh->ip6_dst / 8); + if (ct_nw_proto_p && md->ct_orig_tuple_ipv6) { *ct_nw_proto_p = md->ct_orig_tuple.ipv6.ipv6_proto; if (*ct_nw_proto_p) { @@ -1076,6 +1086,18 @@ miniflow_extract(struct dp_packet *packet, struct miniflow *dst) dst->map = mf.map; } +void +miniflow_extract(struct dp_packet *packet, struct miniflow *dst) +{ + miniflow_extract__(packet, dst, true); +} + +void +miniflow_extract_firstpass(struct dp_packet *packet, struct miniflow *dst) +{ + miniflow_extract__(packet, dst, false); +} + ovs_be16 parse_dl_type(const struct eth_header *data_, size_t size) { diff --git a/lib/flow.h b/lib/flow.h index 7298c71..2d38574 100644 --- a/lib/flow.h +++ b/lib/flow.h @@ -541,6 +541,8 @@ struct pkt_metadata; * 'dst->map' is ignored on input and set on output to indicate which fields * were extracted. */ void miniflow_extract(struct dp_packet *packet, struct miniflow *dst); +void miniflow_extract_firstpass(struct dp_packet *packet, + struct miniflow *dst); void miniflow_map_init(struct miniflow *, const struct flow *); void flow_wc_map(const struct flow *, struct flowmap *); size_t miniflow_alloc(struct miniflow *dsts[], size_t n,