From patchwork Sun Oct 1 07:57:36 2017
X-Patchwork-Submitter: "Bodireddy, Bhanuprakash"
X-Patchwork-Id: 820239
From: Bhanuprakash Bodireddy
To: dev@openvswitch.org
Date: Sun, 1 Oct 2017 08:57:36 +0100
Message-Id:
 <1506844660-4902-3-git-send-email-bhanuprakash.bodireddy@intel.com>
In-Reply-To: <1506844660-4902-1-git-send-email-bhanuprakash.bodireddy@intel.com>
References: <1506844660-4902-1-git-send-email-bhanuprakash.bodireddy@intel.com>
Subject: [ovs-dev] [PATCH 3/7] dpif_netdev: Refactor dp_netdev_pmd_thread structure.

This commit introduces the following changes to the dp_netdev_pmd_thread
structure:

- Mark cachelines, reordering a few members in the process to avoid holes.
- Align emc_cache to a cacheline.
- Maintain the grouping of related member variables.
- Add comments on pad bytes wherever appropriate, so that new member
  variables may be introduced to fill the holes in the future.

Below is how the structure looks with this commit:

    Member                                      Size

    OVS_CACHE_LINE_MARKER cacheline0;
    struct dp_netdev *dp;                       8
    struct cmap_node node;                      8
    pthread_cond_t cond;                        48

    OVS_CACHE_LINE_MARKER cacheline1;
    struct ovs_mutex cond_mutex;                48
    pthread_t thread;                           8
    unsigned int core_id;                       4
    int numa_id;                                4

    OVS_CACHE_LINE_MARKER cacheline2;
    struct emc_cache flow_cache;                4849672

    ### cachelineX: 64 bytes, 0 pad bytes ###
    struct cmap flow_table;                     8
    ....

    ### cachelineY: 59 bytes, 5 pad bytes ###
    struct dp_netdev_pmd_stats stats;           40
    ....

    ### cachelineZ: 48 bytes, 16 pad bytes ###
    struct ovs_mutex port_mutex;                48
    ....

This change also improves performance marginally.
Signed-off-by: Bhanuprakash Bodireddy
---
 lib/dpif-netdev.c | 160 +++++++++++++++++++++++++++++++-----------------------
 1 file changed, 91 insertions(+), 69 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index d5eb830..4cd0edf 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -547,18 +547,31 @@ struct tx_port {
  * actions in either case.
  * */
 struct dp_netdev_pmd_thread {
-    struct dp_netdev *dp;
-    struct ovs_refcount ref_cnt;    /* Every reference must be refcount'ed. */
-    struct cmap_node node;          /* In 'dp->poll_threads'. */
-
-    pthread_cond_t cond;            /* For synchronizing pmd thread reload. */
-    struct ovs_mutex cond_mutex;    /* Mutex for condition variable. */
+    PADDED_MEMBERS_CACHELINE_MARKER(CACHE_LINE_SIZE, cacheline0,
+        struct dp_netdev *dp;
+        struct cmap_node node;          /* In 'dp->poll_threads'. */
+        pthread_cond_t cond;            /* For synchronizing pmd thread
+                                           reload. */
+    );
+
+    PADDED_MEMBERS_CACHELINE_MARKER(CACHE_LINE_SIZE, cacheline1,
+        struct ovs_mutex cond_mutex;    /* Mutex for condition variable. */
+        pthread_t thread;
+        unsigned core_id;               /* CPU core id of this pmd thread. */
+        int numa_id;                    /* numa node id of this pmd thread. */
+    );
 
     /* Per thread exact-match cache.  Note, the instance for cpu core
      * NON_PMD_CORE_ID can be accessed by multiple threads, and thusly
      * need to be protected by 'non_pmd_mutex'.  Every other instance
      * will only be accessed by its own pmd thread. */
-    struct emc_cache flow_cache;
+    OVS_ALIGNED_VAR(CACHE_LINE_SIZE) struct emc_cache flow_cache;
+    struct ovs_refcount ref_cnt;    /* Every reference must be refcount'ed. */
+
+    /* Queue id used by this pmd thread to send packets on all netdevs if
+     * XPS disabled for this netdev. All static_tx_qid's are unique and less
+     * than 'cmap_count(dp->poll_threads)'. */
+    uint32_t static_tx_qid;
 
     /* Flow-Table and classifiers
      *
@@ -567,68 +580,77 @@ struct dp_netdev_pmd_thread {
      * 'flow_mutex'.
      */
     struct ovs_mutex flow_mutex;
-    struct cmap flow_table OVS_GUARDED; /* Flow table. */
-
-    /* One classifier per in_port polled by the pmd */
-    struct cmap classifiers;
-    /* Periodically sort subtable vectors according to hit frequencies */
-    long long int next_optimization;
-    /* End of the next time interval for which processing cycles
-       are stored for each polled rxq. */
-    long long int rxq_interval;
-
-    /* Statistics. */
-    struct dp_netdev_pmd_stats stats;
-
-    /* Cycles counters */
-    struct dp_netdev_pmd_cycles cycles;
-
-    /* Used to count cicles. See 'cycles_counter_end()' */
-    unsigned long long last_cycles;
-
-    struct latch exit_latch;        /* For terminating the pmd thread. */
-    struct seq *reload_seq;
-    uint64_t last_reload_seq;
-    atomic_bool reload;             /* Do we need to reload ports? */
-    pthread_t thread;
-    unsigned core_id;               /* CPU core id of this pmd thread. */
-    int numa_id;                    /* numa node id of this pmd thread. */
-    bool isolated;
-
-    /* Queue id used by this pmd thread to send packets on all netdevs if
-     * XPS disabled for this netdev. All static_tx_qid's are unique and less
-     * than 'cmap_count(dp->poll_threads)'. */
-    uint32_t static_tx_qid;
-
-    struct ovs_mutex port_mutex;    /* Mutex for 'poll_list' and 'tx_ports'. */
-    /* List of rx queues to poll. */
-    struct hmap poll_list OVS_GUARDED;
-    /* Map of 'tx_port's used for transmission.  Written by the main thread,
-     * read by the pmd thread. */
-    struct hmap tx_ports OVS_GUARDED;
-
-    /* These are thread-local copies of 'tx_ports'.  One contains only tunnel
-     * ports (that support push_tunnel/pop_tunnel), the other contains ports
-     * with at least one txq (that support send).  A port can be in both.
-     *
-     * There are two separate maps to make sure that we don't try to execute
-     * OUTPUT on a device which has 0 txqs or PUSH/POP on a non-tunnel device.
-     *
-     * The instances for cpu core NON_PMD_CORE_ID can be accessed by multiple
-     * threads, and thusly need to be protected by 'non_pmd_mutex'.  Every
-     * other instance will only be accessed by its own pmd thread. */
-    struct hmap tnl_port_cache;
-    struct hmap send_port_cache;
-
-    /* Only a pmd thread can write on its own 'cycles' and 'stats'.
-     * The main thread keeps 'stats_zero' and 'cycles_zero' as base
-     * values and subtracts them from 'stats' and 'cycles' before
-     * reporting to the user */
-    unsigned long long stats_zero[DP_N_STATS];
-    uint64_t cycles_zero[PMD_N_CYCLES];
-
-    /* Set to true if the pmd thread needs to be reloaded. */
-    bool need_reload;
+    PADDED_MEMBERS(CACHE_LINE_SIZE,
+        struct cmap flow_table OVS_GUARDED; /* Flow table. */
+
+        /* One classifier per in_port polled by the pmd */
+        struct cmap classifiers;
+        /* Periodically sort subtable vectors according to hit frequencies */
+        long long int next_optimization;
+        /* End of the next time interval for which processing cycles
+           are stored for each polled rxq. */
+        long long int rxq_interval;
+
+        /* Cycles counters */
+        struct dp_netdev_pmd_cycles cycles;
+
+        /* Used to count cycles. See 'cycles_counter_end()'. */
+        unsigned long long last_cycles;
+        struct latch exit_latch;    /* For terminating the pmd thread. */
+    );
+
+    PADDED_MEMBERS(CACHE_LINE_SIZE,
+        /* Statistics. */
+        struct dp_netdev_pmd_stats stats;
+
+        struct seq *reload_seq;
+        uint64_t last_reload_seq;
+        atomic_bool reload;         /* Do we need to reload ports? */
+        bool isolated;
+
+        /* Set to true if the pmd thread needs to be reloaded. */
+        bool need_reload;
+        /* 5 pad bytes. */
+    );
+
+    PADDED_MEMBERS(CACHE_LINE_SIZE,
+        struct ovs_mutex port_mutex;    /* Mutex for 'poll_list'
+                                           and 'tx_ports'. */
+        /* 16 pad bytes. */
+    );
+    PADDED_MEMBERS(CACHE_LINE_SIZE,
+        /* List of rx queues to poll. */
+        struct hmap poll_list OVS_GUARDED;
+        /* Map of 'tx_port's used for transmission.  Written by the main
+         * thread, read by the pmd thread. */
+        struct hmap tx_ports OVS_GUARDED;
+    );
+    PADDED_MEMBERS(CACHE_LINE_SIZE,
+        /* These are thread-local copies of 'tx_ports'.  One contains only
+         * tunnel ports (that support push_tunnel/pop_tunnel), the other
+         * contains ports with at least one txq (that support send).
+         * A port can be in both.
+         *
+         * There are two separate maps to make sure that we don't try to
+         * execute OUTPUT on a device which has 0 txqs or PUSH/POP on a
+         * non-tunnel device.
+         *
+         * The instances for cpu core NON_PMD_CORE_ID can be accessed by
+         * multiple threads and thusly need to be protected by 'non_pmd_mutex'.
+         * Every other instance will only be accessed by its own pmd thread. */
+        struct hmap tnl_port_cache;
+        struct hmap send_port_cache;
+    );
+
+    PADDED_MEMBERS(CACHE_LINE_SIZE,
+        /* Only a pmd thread can write on its own 'cycles' and 'stats'.
+         * The main thread keeps 'stats_zero' and 'cycles_zero' as base
+         * values and subtracts them from 'stats' and 'cycles' before
+         * reporting to the user */
+        unsigned long long stats_zero[DP_N_STATS];
+        uint64_t cycles_zero[PMD_N_CYCLES];
+        /* 8 pad bytes. */
+    );
 };
 
 /* Interface to netdev-based datapath. */