From patchwork Fri Sep 9 09:26:49 2016
X-Patchwork-Submitter: Ciara Loftus
X-Patchwork-Id: 667944
X-Patchwork-Delegate: diproiettod@vmware.com
From: Ciara Loftus <ciara.loftus@intel.com>
To: dev@openvswitch.org
Date: Fri, 9 Sep 2016 10:26:49 +0100
Message-Id: <1473413209-14105-1-git-send-email-ciara.loftus@intel.com>
X-Mailer: git-send-email 1.7.4.1
Subject: [ovs-dev] [PATCH] netdev-dpdk: Allow configurable queue sizes for 'dpdk' ports
The other_config:dpdk-rxq-size and dpdk-txq-size fields allow for an
integer between 1 and 4096 that reflects the number of rx/tx descriptors
to initialise 'dpdk' devices with. If no value is specified, they default
to 2048.

'dpdk-*xq-size' fields must be set before launching the daemon as changing
the queue size requires the NIC to restart.

Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
---
 INSTALL.DPDK-ADVANCED.md | 16 ++++++++++++++--
 NEWS                     |  3 +++
 lib/netdev-dpdk.c        | 35 +++++++++++++++++++++++++++++++----
 vswitchd/vswitch.xml     | 26 ++++++++++++++++++++++++++
 4 files changed, 74 insertions(+), 6 deletions(-)

diff --git a/INSTALL.DPDK-ADVANCED.md b/INSTALL.DPDK-ADVANCED.md
index 857c805..65df711 100755
--- a/INSTALL.DPDK-ADVANCED.md
+++ b/INSTALL.DPDK-ADVANCED.md
@@ -257,7 +257,19 @@ needs to be affinitized accordingly.
   The rx queues are assigned to pmd threads on the same NUMA node in a
   round-robin fashion.
 
-### 4.4 Exact Match Cache
+### 4.4 DPDK Physical Port Queue Sizes
+  `ovs-vsctl set Open_vSwitch . other_config:dpdk-rxq-size=<size>`
+  `ovs-vsctl set Open_vSwitch . other_config:dpdk-txq-size=<size>`
+
+  The commands above set the number of rx/tx descriptors that the NICs
+  associated with 'dpdk' ports will be initialised with.
+
+  Different 'dpdk-rxq-size' and 'dpdk-txq-size' configurations yield different
+  benefits in terms of throughput and latency for different scenarios.
+  Generally, smaller queue sizes can have a positive impact for latency at the
+  expense of throughput. The opposite is often true for larger queue sizes.
+
+### 4.5 Exact Match Cache
 
   Each pmd thread contains one EMC. After initial flow setup in the
   datapath, the EMC contains a single table and provides the lowest level
@@ -274,7 +286,7 @@ needs to be affinitized accordingly.
   avoiding datapath classifier lookups is to have multiple pmd threads
   running. This can be done as described in section 4.2.
 
-### 4.5 Rx Mergeable buffers
+### 4.6 Rx Mergeable buffers
 
   Rx Mergeable buffers is a virtio feature that allows chaining of multiple
   virtio descriptors to handle large packet sizes. As such, large packets
diff --git a/NEWS b/NEWS
index 8c78b36..c36a06b 100644
--- a/NEWS
+++ b/NEWS
@@ -90,6 +90,9 @@ v2.6.0 - xx xxx xxxx
      * Jumbo frame support
      * Remove dpdkvhostcuse port type.
      * OVS client mode for vHost and vHost reconnect (Requires QEMU 2.7)
+     * New 'other_config:dpdk-rxq-size' and 'other_config:dpdk-txq-size' fields
+       that specify the number of rxq and txq descriptors to initialise DPDK
+       NICs with.
    - Increase number of registers to 16.
    - ovs-benchmark: This utility has been removed due to lack of use and
      bitrot.
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 6d334db..22de003 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -132,8 +132,9 @@ BUILD_ASSERT_DECL((MAX_NB_MBUF / ROUND_DOWN_POW2(MAX_NB_MBUF/MIN_NB_MBUF))
 
 #define SOCKET0 0
 
-#define NIC_PORT_RX_Q_SIZE 2048 /* Size of Physical NIC RX Queue, Max (n+32<=4096)*/
-#define NIC_PORT_TX_Q_SIZE 2048 /* Size of Physical NIC TX Queue, Max (n+32<=4096)*/
+#define NIC_PORT_DEFAULT_RXQ_SIZE 2048 /* Default size of Physical NIC RXQ */
+#define NIC_PORT_DEFAULT_TXQ_SIZE 2048 /* Default size of Physical NIC TXQ */
+#define NIC_PORT_MAX_Q_SIZE 4096       /* Maximum size of Physical NIC Queue */
 
 #define OVS_VHOST_MAX_QUEUE_NUM 1024  /* Maximum number of vHost TX queues. */
 #define OVS_VHOST_QUEUE_MAP_UNKNOWN (-1) /* Mapping not initialized. */
@@ -142,6 +143,9 @@ BUILD_ASSERT_DECL((MAX_NB_MBUF / ROUND_DOWN_POW2(MAX_NB_MBUF/MIN_NB_MBUF))
 
 static char *vhost_sock_dir = NULL;  /* Location of vhost-user sockets */
 
+static int dpdk_rxq_size = 0;  /* Configured size of Physical NIC RX Queue */
+static int dpdk_txq_size = 0;  /* Configured size of Physical NIC TX Queue */
+
 #define VHOST_ENQ_RETRY_NUM 8
 #define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ)
 
@@ -642,7 +646,7 @@ dpdk_eth_dev_queue_setup(struct netdev_dpdk *dev, int n_rxq, int n_txq)
         }
 
         for (i = 0; i < n_txq; i++) {
-            diag = rte_eth_tx_queue_setup(dev->port_id, i, NIC_PORT_TX_Q_SIZE,
+            diag = rte_eth_tx_queue_setup(dev->port_id, i, dpdk_txq_size,
                                           dev->socket_id, NULL);
             if (diag) {
                 VLOG_INFO("Interface %s txq(%d) setup error: %s",
@@ -658,7 +662,7 @@ dpdk_eth_dev_queue_setup(struct netdev_dpdk *dev, int n_rxq, int n_txq)
         }
 
         for (i = 0; i < n_rxq; i++) {
-            diag = rte_eth_rx_queue_setup(dev->port_id, i, NIC_PORT_RX_Q_SIZE,
+            diag = rte_eth_rx_queue_setup(dev->port_id, i, dpdk_rxq_size,
                                           dev->socket_id, NULL,
                                           dev->dpdk_mp->mp);
             if (diag) {
@@ -3122,6 +3126,23 @@ process_vhost_flags(char *flag, char *default_val, int size,
     return changed;
 }
 
+static void
+process_queue_size_flags(const struct smap *ovs_other_config, char *flag,
+                         int default_size, int *new_size)
+{
+    int queue_size;
+
+    queue_size = smap_get_int(ovs_other_config, flag, 0);
+    if (queue_size > 0 && queue_size <= NIC_PORT_MAX_Q_SIZE) {
+        *new_size = queue_size;
+    } else {
+        *new_size = default_size;
+    }
+
+    VLOG_INFO("'dpdk' ports will be configured with a %s of %i",
+              flag, *new_size);
+}
+
 static char **
 grow_argv(char ***argv, size_t cur_siz, size_t grow_by)
 {
@@ -3364,6 +3385,12 @@ dpdk_init__(const struct smap *ovs_other_config)
         vhost_sock_dir = sock_dir_subcomponent;
     }
 
+    /* Determine the queue sizes to specify when initializing 'dpdk' ports */
+    process_queue_size_flags(ovs_other_config, "dpdk-rxq-size",
+                             NIC_PORT_DEFAULT_RXQ_SIZE, &dpdk_rxq_size);
+    process_queue_size_flags(ovs_other_config, "dpdk-txq-size",
+                             NIC_PORT_DEFAULT_TXQ_SIZE, &dpdk_txq_size);
+
     argv = grow_argv(&argv, 0, 1);
     argc = 1;
     argv[0] = xstrdup(ovs_get_program_name());
diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
index f64c18a..41fef76 100644
--- a/vswitchd/vswitch.xml
+++ b/vswitchd/vswitch.xml
@@ -299,6 +299,32 @@
+
+      <column name="other_config" key="dpdk-rxq-size"
+              type='{"type": "integer", "minInteger": 1, "maxInteger": 4096}'>
+        <p>
+          Specifies the queue size (number of rx descriptors) of dpdk ports.
+          Ensure that your NIC(s) can support the particular value before
+          modifying 'dpdk-rxq-size'.
+        </p>
+        <p>
+          Defaults to 2048. Maximum value is 4096. Changing this value requires
+          restarting the daemon.
+        </p>
+      </column>
+
+      <column name="other_config" key="dpdk-txq-size"
+              type='{"type": "integer", "minInteger": 1, "maxInteger": 4096}'>
+        <p>
+          Specifies the queue size (number of tx descriptors) of dpdk ports.
+          Ensure that your NIC(s) can support the particular value before
+          modifying 'dpdk-txq-size'.
+        </p>
+        <p>
+          Defaults to 2048. Maximum value is 4096. Changing this value requires
+          restarting the daemon.
+        </p>
+      </column>
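
For anyone reviewing the validation behaviour, the standalone sketch below
mirrors the range check performed by process_queue_size_flags() above: a
requested descriptor count outside 1..4096 falls back to the 2048 default.
The helper pick_queue_size() and the main() driver are illustrative only and
are not part of the patch; the NIC itself may still reject values its
hardware cannot support, as the vswitch.xml text warns.

/* Illustrative only -- not part of the patch.  pick_queue_size() mirrors
 * the range check in process_queue_size_flags(): values outside 1..4096
 * fall back to the 2048 default. */
#include <stdio.h>

#define NIC_PORT_DEFAULT_RXQ_SIZE 2048
#define NIC_PORT_MAX_Q_SIZE 4096

static int
pick_queue_size(int requested, int default_size)
{
    return (requested > 0 && requested <= NIC_PORT_MAX_Q_SIZE)
           ? requested : default_size;
}

int
main(void)
{
    int requests[] = { 512, 4096, 0, 8192 };

    for (int i = 0; i < 4; i++) {
        /* Prints 512, 4096, 2048 and 2048 respectively. */
        printf("dpdk-rxq-size=%d -> %d descriptors\n", requests[i],
               pick_queue_size(requests[i], NIC_PORT_DEFAULT_RXQ_SIZE));
    }
    return 0;
}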