From patchwork Fri Sep 9 09:26:49 2016
X-Patchwork-Submitter: Ciara Loftus
X-Patchwork-Id: 667944
X-Patchwork-Delegate: diproiettod@vmware.com
From: Ciara Loftus
To: dev@openvswitch.org
Date: Fri, 9 Sep 2016 10:26:49 +0100
Message-Id: <1473413209-14105-1-git-send-email-ciara.loftus@intel.com>
X-Mailer: git-send-email 1.7.4.1
Subject: [ovs-dev] [PATCH] netdev-dpdk: Allow configurable queue sizes for
'dpdk' ports
The other_config:dpdk-rxq-size and dpdk-txq-size fields accept an
integer between 1 and 4096 giving the number of rx/tx descriptors with
which 'dpdk' devices are initialised. If no value is specified, they
default to 2048. The 'dpdk-*xq-size' fields must be set before launching
the daemon, as changing the queue size requires the NIC to be restarted.
Signed-off-by: Ciara Loftus
---
INSTALL.DPDK-ADVANCED.md | 16 ++++++++++++++--
NEWS | 3 +++
lib/netdev-dpdk.c | 35 +++++++++++++++++++++++++++++++----
vswitchd/vswitch.xml | 26 ++++++++++++++++++++++++++
4 files changed, 74 insertions(+), 6 deletions(-)
diff --git a/INSTALL.DPDK-ADVANCED.md b/INSTALL.DPDK-ADVANCED.md
index 857c805..65df711 100755
--- a/INSTALL.DPDK-ADVANCED.md
+++ b/INSTALL.DPDK-ADVANCED.md
@@ -257,7 +257,19 @@ needs to be affinitized accordingly.
The rx queues are assigned to pmd threads on the same NUMA node in a
round-robin fashion.
-### 4.4 Exact Match Cache
+### 4.4 DPDK Physical Port Queue Sizes
+ `ovs-vsctl set Open_vSwitch . other_config:dpdk-rxq-size=`
+ `ovs-vsctl set Open_vSwitch . other_config:dpdk-txq-size=`
+
+ The commands above set the number of rx/tx descriptors with which the NICs
+ associated with 'dpdk' ports will be initialised.
+
+ Different 'dpdk-rxq-size' and 'dpdk-txq-size' configurations yield different
+ benefits in terms of throughput and latency for different scenarios.
+ Generally, smaller queue sizes can have a positive impact on latency at the
+ expense of throughput; the opposite is often true for larger queue sizes.
+
+### 4.5 Exact Match Cache
Each pmd thread contains one EMC. After initial flow setup in the
datapath, the EMC contains a single table and provides the lowest level
@@ -274,7 +286,7 @@ needs to be affinitized accordingly.
avoiding datapath classifier lookups is to have multiple pmd threads
running. This can be done as described in section 4.2.
-### 4.5 Rx Mergeable buffers
+### 4.6 Rx Mergeable buffers
Rx Mergeable buffers is a virtio feature that allows chaining of multiple
virtio descriptors to handle large packet sizes. As such, large packets
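The queue-size tuning described in the INSTALL.DPDK-ADVANCED.md hunk can be exercised as below; this is a usage sketch only, assuming ovs-vsctl is on PATH, and the sizes shown are example values, not recommendations:

```shell
# Must be set before the daemon initialises 'dpdk' ports; pick sizes
# your NIC actually supports (default 2048, maximum 4096).
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-rxq-size=1024
ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-txq-size=1024
```

Smaller sizes such as these trade throughput headroom for lower latency, per the guidance above.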
diff --git a/NEWS b/NEWS
index 8c78b36..c36a06b 100644
--- a/NEWS
+++ b/NEWS
@@ -90,6 +90,9 @@ v2.6.0 - xx xxx xxxx
* Jumbo frame support
* Remove dpdkvhostcuse port type.
* OVS client mode for vHost and vHost reconnect (Requires QEMU 2.7)
+ * New 'other_config:dpdk-rxq-size' and 'other_config:dpdk-txq-size' fields
+ that specify the number of rxq and txq descriptors to initialise DPDK
+ NICs with.
- Increase number of registers to 16.
- ovs-benchmark: This utility has been removed due to lack of use and
bitrot.
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 6d334db..22de003 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -132,8 +132,9 @@ BUILD_ASSERT_DECL((MAX_NB_MBUF / ROUND_DOWN_POW2(MAX_NB_MBUF/MIN_NB_MBUF))
#define SOCKET0 0
-#define NIC_PORT_RX_Q_SIZE 2048 /* Size of Physical NIC RX Queue, Max (n+32<=4096)*/
-#define NIC_PORT_TX_Q_SIZE 2048 /* Size of Physical NIC TX Queue, Max (n+32<=4096)*/
+#define NIC_PORT_DEFAULT_RXQ_SIZE 2048 /* Default size of Physical NIC RXQ */
+#define NIC_PORT_DEFAULT_TXQ_SIZE 2048 /* Default size of Physical NIC TXQ */
+#define NIC_PORT_MAX_Q_SIZE 4096 /* Maximum size of Physical NIC Queue */
#define OVS_VHOST_MAX_QUEUE_NUM 1024 /* Maximum number of vHost TX queues. */
#define OVS_VHOST_QUEUE_MAP_UNKNOWN (-1) /* Mapping not initialized. */
@@ -142,6 +143,9 @@ BUILD_ASSERT_DECL((MAX_NB_MBUF / ROUND_DOWN_POW2(MAX_NB_MBUF/MIN_NB_MBUF))
static char *vhost_sock_dir = NULL; /* Location of vhost-user sockets */
+static int dpdk_rxq_size = 0; /* Configured size of Physical NIC RX Queue */
+static int dpdk_txq_size = 0; /* Configured size of Physical NIC TX Queue */
+
#define VHOST_ENQ_RETRY_NUM 8
#define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ)
@@ -642,7 +646,7 @@ dpdk_eth_dev_queue_setup(struct netdev_dpdk *dev, int n_rxq, int n_txq)
}
for (i = 0; i < n_txq; i++) {
- diag = rte_eth_tx_queue_setup(dev->port_id, i, NIC_PORT_TX_Q_SIZE,
+ diag = rte_eth_tx_queue_setup(dev->port_id, i, dpdk_txq_size,
dev->socket_id, NULL);
if (diag) {
VLOG_INFO("Interface %s txq(%d) setup error: %s",
@@ -658,7 +662,7 @@ dpdk_eth_dev_queue_setup(struct netdev_dpdk *dev, int n_rxq, int n_txq)
}
for (i = 0; i < n_rxq; i++) {
- diag = rte_eth_rx_queue_setup(dev->port_id, i, NIC_PORT_RX_Q_SIZE,
+ diag = rte_eth_rx_queue_setup(dev->port_id, i, dpdk_rxq_size,
dev->socket_id, NULL,
dev->dpdk_mp->mp);
if (diag) {
@@ -3122,6 +3126,23 @@ process_vhost_flags(char *flag, char *default_val, int size,
return changed;
}
+static void
+process_queue_size_flags(const struct smap *ovs_other_config, char *flag,
+ int default_size, int *new_size)
+{
+ int queue_size;
+
+ queue_size = smap_get_int(ovs_other_config, flag, 0);
+ if (queue_size > 0 && queue_size <= NIC_PORT_MAX_Q_SIZE) {
+ *new_size = queue_size;
+ } else {
+ *new_size = default_size;
+ }
+
+ VLOG_INFO("'dpdk' ports will be configured with a %s of %i",
+ flag, *new_size);
+}
+
static char **
grow_argv(char ***argv, size_t cur_siz, size_t grow_by)
{
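The clamping behaviour of process_queue_size_flags() above can be mirrored in a small runnable sketch; the dict stands in for the smap of other_config values, and the function name and test values are illustrative, not part of the patch:

```python
NIC_PORT_MAX_Q_SIZE = 4096  # matches the patch's maximum queue size


def process_queue_size_flag(other_config, flag, default_size):
    """Return the configured queue size for 'flag', falling back to
    'default_size' when the value is missing, non-numeric, or outside
    the range (0, NIC_PORT_MAX_Q_SIZE] -- the same rule the C code uses."""
    try:
        queue_size = int(other_config.get(flag, 0))
    except (TypeError, ValueError):
        queue_size = 0  # smap_get_int() likewise yields 0 for bad input
    if 0 < queue_size <= NIC_PORT_MAX_Q_SIZE:
        return queue_size
    return default_size


# In-range values are kept; out-of-range or absent values fall back to 2048.
print(process_queue_size_flag({"dpdk-rxq-size": "1024"}, "dpdk-rxq-size", 2048))  # 1024
print(process_queue_size_flag({"dpdk-txq-size": "8192"}, "dpdk-txq-size", 2048))  # 2048
print(process_queue_size_flag({}, "dpdk-rxq-size", 2048))                         # 2048
```

Note that, like the C code, an invalid value is silently replaced by the default rather than rejected.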
@@ -3364,6 +3385,12 @@ dpdk_init__(const struct smap *ovs_other_config)
vhost_sock_dir = sock_dir_subcomponent;
}
+ /* Determine the queue sizes to specify when initializing 'dpdk' ports */
+ process_queue_size_flags(ovs_other_config, "dpdk-rxq-size",
+ NIC_PORT_DEFAULT_RXQ_SIZE, &dpdk_rxq_size);
+ process_queue_size_flags(ovs_other_config, "dpdk-txq-size",
+ NIC_PORT_DEFAULT_TXQ_SIZE, &dpdk_txq_size);
+
argv = grow_argv(&argv, 0, 1);
argc = 1;
argv[0] = xstrdup(ovs_get_program_name());
diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
index f64c18a..41fef76 100644
--- a/vswitchd/vswitch.xml
+++ b/vswitchd/vswitch.xml
@@ -299,6 +299,32 @@
+    <column name="other_config" key="dpdk-rxq-size"
+            type='{"type": "integer", "minInteger": 1, "maxInteger": 4096}'>
+      <p>
+        Specifies the queue size (number rx descriptors) of dpdk ports.
+        Ensure that your NIC(s) can support the particular value before
+        modifying 'dpdk-rxq-size'.
+      </p>
+      <p>
+        Defaults to 2048. Maximum value is 4096. Changing this value requires
+        restarting the daemon.
+      </p>
+    </column>
+
+    <column name="other_config" key="dpdk-txq-size"
+            type='{"type": "integer", "minInteger": 1, "maxInteger": 4096}'>
+      <p>
+        Specifies the queue size (number tx descriptors) of dpdk ports.
+        Ensure that your NIC(s) can support the particular value before
+        modifying 'dpdk-txq-size'.
+      </p>
+      <p>
+        Defaults to 2048. Maximum value is 4096. Changing this value requires
+        restarting the daemon.
+      </p>
+    </column>