From patchwork Wed Feb 10 15:34:05 2021
X-Patchwork-Submitter: Gaetan Rivet
X-Patchwork-Id: 1439070
From: Gaetan Rivet
To: dev@openvswitch.org
Cc: Eli Britstein
Date: Wed, 10 Feb 2021 16:34:05 +0100
Message-Id: <4f26bd9c6667264d961176f7e2be46c2a6fd492e.1612968146.git.grive@u256.net>
X-Mailer: git-send-email 2.30.0
Subject: [ovs-dev] [PATCH v1 19/23] dpif-netdev: Use lockless queue to manage offloads

The dataplane threads (PMDs) send offloading commands to a dedicated
offload management thread. The current implementation uses a lock, and
benchmarks show high contention on the queue in some cases. Under high
contention, the mutex often makes the locking thread yield and wait,
which requires a syscall. This should be avoided in a userland
dataplane.

The mpsc-queue can be used instead. It uses fewer cycles and has lower
latency. Benchmarks show better behavior when multiple revalidators and
one or more PMDs write to a single queue while another thread polls it.

One trade-off with the new scheme, however, is that the offload thread
is forced to poll the queue: without a mutex, a condition variable
cannot be used for signaling. The offload thread instead implements an
exponential backoff, sleeping in short increments when no data is
available. This makes the thread yield, at the price of some latency
when offload processing resumes after an inactivity period.
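For illustration, the consumer side of this scheme reduces to the
standalone sketch below. It assumes the mpsc-queue API used by this
series (mpsc_queue_acquire(), mpsc_queue_poll(), mpsc_queue_insert(),
the MPSC_QUEUE_EMPTY and MPSC_QUEUE_RETRY poll results) plus OVS's
xnanosleep() and CONTAINER_OF() helpers; "struct item" and
process_item() are hypothetical placeholders, and the actual loop is
dp_netdev_flow_offload_main() in the diff below.

    /* Illustrative sketch only -- not part of the patch.  Mirrors the
     * backoff-polling scheme of dp_netdev_flow_offload_main() below.
     * "struct item" and process_item() are placeholders. */
    #define BACKOFF_MIN 1   /* Shortest sleep, in ms. */
    #define BACKOFF_MAX 64  /* Longest sleep, in ms. */

    struct item {
        struct mpsc_queue_node node;
    };

    static void process_item(struct item *item);  /* Placeholder. */

    static void
    consumer_loop(struct mpsc_queue *queue)
    {
        enum mpsc_queue_poll_result res;
        struct mpsc_queue_node *node;
        uint64_t backoff;

        if (!mpsc_queue_acquire(queue)) {
            return;  /* Another thread is already the consumer. */
        }

        for (;;) {
            /* Queue is empty: sleep 1, 2, 4, ... 64 ms between polls,
             * yielding the CPU instead of busy-waiting. */
            backoff = BACKOFF_MIN;
            while ((res = mpsc_queue_poll(queue, &node)) == MPSC_QUEUE_EMPTY) {
                xnanosleep(backoff * 1E6);
                if (backoff < BACKOFF_MAX) {
                    backoff <<= 1;
                }
            }

            /* Drain without sleeping while items remain.  RETRY means a
             * producer is mid-insertion and the item will become visible
             * shortly, so spin rather than sleep. */
            do {
                while (res == MPSC_QUEUE_RETRY) {
                    res = mpsc_queue_poll(queue, &node);
                }
                process_item(CONTAINER_OF(node, struct item, node));
                res = mpsc_queue_poll(queue, &node);
            } while (res != MPSC_QUEUE_EMPTY);
        }
    }

Producers call mpsc_queue_insert(queue, &item->node) from any thread;
only the single thread that acquired the queue may poll it.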
Signed-off-by: Gaetan Rivet
Reviewed-by: Eli Britstein
---
 lib/dpif-netdev.c | 78 ++++++++++++++++++++++++++++-------------------
 1 file changed, 46 insertions(+), 32 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 8b9115609..09d62a3d5 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -52,6 +52,7 @@
 #include "id-pool.h"
 #include "ipf.h"
 #include "mov-avg.h"
+#include "mpsc-queue.h"
 #include "netdev.h"
 #include "netdev-offload.h"
 #include "netdev-provider.h"
@@ -434,22 +435,19 @@ struct dp_offload_thread_item {
     size_t actions_len;
     long long int timestamp;
 
-    struct ovs_list node;
+    struct mpsc_queue_node node;
 };
 
 struct dp_offload_thread {
-    struct ovs_mutex mutex;
-    struct ovs_list list;
-    uint64_t enqueued_item;
+    struct mpsc_queue queue;
+    atomic_uint64_t enqueued_item;
     struct mov_avg_cma cma;
     struct mov_avg_ema ema;
-    pthread_cond_t cond;
 };
 
 static struct dp_offload_thread dp_offload_thread = {
-    .mutex = OVS_MUTEX_INITIALIZER,
-    .list = OVS_LIST_INITIALIZER(&dp_offload_thread.list),
-    .enqueued_item = 0,
+    .queue = MPSC_QUEUE_INITIALIZER(&dp_offload_thread.queue),
+    .enqueued_item = ATOMIC_VAR_INIT(0),
     .cma = MOV_AVG_CMA_INITIALIZER,
     .ema = MOV_AVG_EMA_INITIALIZER(100),
 };
@@ -2649,11 +2647,8 @@ dp_netdev_free_flow_offload(struct dp_offload_thread_item *offload)
 static void
 dp_netdev_append_flow_offload(struct dp_offload_thread_item *offload)
 {
-    ovs_mutex_lock(&dp_offload_thread.mutex);
-    ovs_list_push_back(&dp_offload_thread.list, &offload->node);
-    dp_offload_thread.enqueued_item++;
-    xpthread_cond_signal(&dp_offload_thread.cond);
-    ovs_mutex_unlock(&dp_offload_thread.mutex);
+    mpsc_queue_insert(&dp_offload_thread.queue, &offload->node);
+    atomic_count_inc64(&dp_offload_thread.enqueued_item);
 }
 
 static int
@@ -2751,33 +2746,48 @@ err_free:
     return -1;
 }
 
+#define DP_NETDEV_OFFLOAD_BACKOFF_MIN 1
+#define DP_NETDEV_OFFLOAD_BACKOFF_MAX 64
 #define DP_NETDEV_OFFLOAD_QUIESCE_INTERVAL_US (10 * 1000) /* 10 ms */
 
 static void *
 dp_netdev_flow_offload_main(void *data OVS_UNUSED)
 {
     struct dp_offload_thread_item *offload;
-    struct ovs_list *list;
+    enum mpsc_queue_poll_result poll_result;
+    struct mpsc_queue_node *node;
+    struct mpsc_queue *queue;
     long long int latency_us;
     long long int next_rcu;
     long long int now;
+    uint64_t backoff;
     const char *op;
     int ret;
 
+    queue = &dp_offload_thread.queue;
+    if (!mpsc_queue_acquire(queue)) {
+        VLOG_ERR("failed to register as consumer of the offload queue");
+        return NULL;
+    }
+
+sleep_until_next:
+    backoff = DP_NETDEV_OFFLOAD_BACKOFF_MIN;
+    while ((poll_result = mpsc_queue_poll(queue, &node)) == MPSC_QUEUE_EMPTY) {
+        xnanosleep(backoff * 1E6);
+        if (backoff < DP_NETDEV_OFFLOAD_BACKOFF_MAX) {
+            backoff <<= 1;
+        }
+    }
+
     next_rcu = time_usec() + DP_NETDEV_OFFLOAD_QUIESCE_INTERVAL_US;
-    for (;;) {
-        ovs_mutex_lock(&dp_offload_thread.mutex);
-        if (ovs_list_is_empty(&dp_offload_thread.list)) {
-            ovsrcu_quiesce_start();
-            ovs_mutex_cond_wait(&dp_offload_thread.cond,
-                                &dp_offload_thread.mutex);
-            ovsrcu_quiesce_end();
-            next_rcu = time_usec() + DP_NETDEV_OFFLOAD_QUIESCE_INTERVAL_US;
-        }
-        list = ovs_list_pop_front(&dp_offload_thread.list);
-        dp_offload_thread.enqueued_item--;
-        offload = CONTAINER_OF(list, struct dp_offload_thread_item, node);
-        ovs_mutex_unlock(&dp_offload_thread.mutex);
+
+    do {
+        while (poll_result == MPSC_QUEUE_RETRY) {
+            poll_result = mpsc_queue_poll(queue, &node);
+        }
+
+        offload = CONTAINER_OF(node, struct dp_offload_thread_item, node);
+        atomic_count_dec64(&dp_offload_thread.enqueued_item);
 
         switch (offload->op) {
         case DP_NETDEV_FLOW_OFFLOAD_OP_ADD:
@@ -2813,7 +2823,11 @@ dp_netdev_flow_offload_main(void *data OVS_UNUSED)
                 next_rcu += DP_NETDEV_OFFLOAD_QUIESCE_INTERVAL_US;
             }
         }
-    }
+
+        poll_result = mpsc_queue_poll(queue, &node);
+    } while (poll_result != MPSC_QUEUE_EMPTY);
+
+    goto sleep_until_next;
 
     return NULL;
 }
@@ -2825,7 +2839,7 @@ queue_netdev_flow_del(struct dp_netdev_pmd_thread *pmd,
     struct dp_offload_thread_item *offload;
 
     if (ovsthread_once_start(&offload_thread_once)) {
-        xpthread_cond_init(&dp_offload_thread.cond, NULL);
+        mpsc_queue_init(&dp_offload_thread.queue);
         ovs_thread_create("dp_netdev_flow_offload",
                           dp_netdev_flow_offload_main, NULL);
         ovsthread_once_done(&offload_thread_once);
@@ -2850,7 +2864,7 @@ queue_netdev_flow_put(struct dp_netdev_pmd_thread *pmd,
     }
 
     if (ovsthread_once_start(&offload_thread_once)) {
-        xpthread_cond_init(&dp_offload_thread.cond, NULL);
+        mpsc_queue_init(&dp_offload_thread.queue);
         ovs_thread_create("dp_netdev_flow_offload",
                           dp_netdev_flow_offload_main, NULL);
         ovsthread_once_done(&offload_thread_once);
@@ -4295,8 +4309,8 @@ dpif_netdev_offload_stats_get(struct dpif *dpif,
     }
     ovs_mutex_unlock(&dp->port_mutex);
 
-    stats->counters[DP_NETDEV_HW_OFFLOADS_STATS_ENQUEUED].value =
-        dp_offload_thread.enqueued_item;
+    atomic_read_relaxed(&dp_offload_thread.enqueued_item,
+        &stats->counters[DP_NETDEV_HW_OFFLOADS_STATS_ENQUEUED].value);
     stats->counters[DP_NETDEV_HW_OFFLOADS_STATS_INSERTED].value = nb_offloads;
     stats->counters[DP_NETDEV_HW_OFFLOADS_STATS_LAT_CMA_MEAN].value =
         mov_avg_cma(&dp_offload_thread.cma);