From patchwork Fri Feb 12 17:17:16 2021
X-Patchwork-Submitter: "Van Haaren, Harry"
X-Patchwork-Id: 1439956
From: Harry van Haaren
To: ovs-dev@openvswitch.org
Cc: i.maximets@ovn.org
Date: Fri, 12 Feb 2021 17:17:16 +0000
Message-Id: <20210212171718.2189798-15-harry.van.haaren@intel.com>
In-Reply-To: <20210212171718.2189798-1-harry.van.haaren@intel.com>
References: <20210104163653.2218575-1-harry.van.haaren@intel.com>
 <20210212171718.2189798-1-harry.van.haaren@intel.com>
Subject: [ovs-dev] [PATCH v9 14/16] dpif-netdev: Optimize dp output action

This commit optimizes the output action by reducing code complexity,
which lets the compiler optimize the code better.

The core idea of this optimization is that the array-length checks have
already been performed above the copying code, so the per-packet checks
can be removed. Removing the per-packet length checks allows the
compiler to auto-vectorize the stores using SIMD registers.

Signed-off-by: Harry van Haaren
---

v8: Add NEWS entry.
---
 NEWS              |  1 +
 lib/dpif-netdev.c | 23 ++++++++++++++++++-----
 2 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/NEWS b/NEWS
index 5f1e3b5e0..2ffc155f9 100644
--- a/NEWS
+++ b/NEWS
@@ -13,6 +13,7 @@ Post-v2.15.0
      * Enable the AVX512 DPCLS implementation to use VPOPCNT instruction if the
        CPU supports it. This enhances performance by using the native vpopcount
        instructions, instead of the emulated version of vpopcount.
+     * Optimize dp_netdev_output by enhancing compiler optimization potential.
 
 v2.15.0 - xx xxx xxxx
 ---------------------
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 5e83755d7..b2cf1bd46 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -7254,12 +7254,25 @@ dp_execute_output_action(struct dp_netdev_pmd_thread *pmd,
         pmd->n_output_batches++;
     }
 
-    struct dp_packet *packet;
-    DP_PACKET_BATCH_FOR_EACH (i, packet, packets_) {
-        p->output_pkts_rxqs[dp_packet_batch_size(&p->output_pkts)] =
-            pmd->ctx.last_rxq;
-        dp_packet_batch_add(&p->output_pkts, packet);
+    /* The above checks ensure that there is enough space in the output batch.
+     * Using dp_packet_batch_add() has a branch to check if the batch is full.
+     * This branch reduces the compiler's ability to optimize efficiently. The
+     * below code implements packet movement between batches without checks,
+     * with the required semantics of output batch perhaps containing packets.
+     */
+    int batch_size = dp_packet_batch_size(packets_);
+    int out_batch_idx = dp_packet_batch_size(&p->output_pkts);
+    struct dp_netdev_rxq *rxq = pmd->ctx.last_rxq;
+    struct dp_packet_batch *output_batch = &p->output_pkts;
+
+    for (int i = 0; i < batch_size; i++) {
+        struct dp_packet *packet = packets_->packets[i];
+        p->output_pkts_rxqs[out_batch_idx] = rxq;
+        output_batch->packets[out_batch_idx] = packet;
+        out_batch_idx++;
     }
+    output_batch->count += batch_size;
+
     return true;
 }
 
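
For readers without the OVS tree to hand, the idea in this patch can be
sketched in isolation. The snippet below is a minimal illustration only,
not the OVS code: the sketch_* types and functions and the
SKETCH_BATCH_MAX capacity are hypothetical stand-ins. It contrasts a
per-packet checked add (one capacity branch per packet) with an append
whose capacity is verified once up front, leaving a plain copy loop that
a compiler may be able to auto-vectorize.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real batch types; only the fields that
 * the copy loop touches are modelled. The capacity is an assumption. */
#define SKETCH_BATCH_MAX 32

struct sketch_packet {
    int id;
};

struct sketch_batch {
    size_t count;
    struct sketch_packet *packets[SKETCH_BATCH_MAX];
};

/* Checked add: every call carries its own "is the batch full?" branch.
 * Returns 0 on success, -1 if the batch has no room. */
static inline int
sketch_batch_add(struct sketch_batch *b, struct sketch_packet *p)
{
    if (b->count >= SKETCH_BATCH_MAX) {
        return -1;
    }
    b->packets[b->count++] = p;
    return 0;
}

/* Unchecked append: the capacity is verified once, before the loop, so
 * the loop body is nothing but two stores per packet. 'dst_rxqs' is a
 * parallel array recording which queue each packet came from. */
static inline void
sketch_batch_append_unchecked(struct sketch_batch *dst, void **dst_rxqs,
                              const struct sketch_batch *src, void *rxq)
{
    size_t n = src->count;
    size_t idx = dst->count;

    /* Single up-front check replaces the per-packet branch. */
    assert(idx + n <= SKETCH_BATCH_MAX);

    for (size_t i = 0; i < n; i++) {
        dst_rxqs[idx + i] = rxq;
        dst->packets[idx + i] = src->packets[i];
    }
    dst->count += n;
}
```

A caller would fill a source batch, confirm (or flush until) the
destination has room, and then call sketch_batch_append_unchecked()
once per batch. Whether the loop is actually vectorized depends on the
compiler, flags and target, so checking the generated assembly is the
only way to be sure.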