From patchwork Fri Sep 16 23:10:51 2016
X-Patchwork-Submitter: Jarno Rajahalme <jarno@ovn.org>
X-Patchwork-Id: 671188
X-Patchwork-Delegate: blp@nicira.com
From: Jarno Rajahalme <jarno@ovn.org>
To: dev@openvswitch.org
Date: Fri, 16 Sep 2016 16:10:51 -0700
Message-Id: <1474067451-78603-4-git-send-email-jarno@ovn.org>
X-Mailer: git-send-email 2.1.4
In-Reply-To: <1474067451-78603-1-git-send-email-jarno@ovn.org>
References: <1474067451-78603-1-git-send-email-jarno@ovn.org>
Subject: [ovs-dev] [PATCH 3/3] seq: Add support for initial delay before waking up.

The execution time of 'ovs-ofctl add-flows' with a large number of flows
can be more than halved if revalidators are not running after each flow
mod separately.
This was first suspected when it was found that 'ovs-ofctl --bundle
add-flows' is about 10 times faster than the same command without the
'--bundle' option in a scenario where a large set of flows is being added
and there are no datapath flows at all.  One of the differences caused by
the '--bundle' option is that the revalidators are woken up only once, at
the end of the whole set of flow table changes, rather than after each
flow table change individually.

This patch limits revalidation to run at most 200 times a second by
enforcing a minimum 5 ms delay before a flow table change wakeup, after
each complete revalidation round.  This is implemented by adding a new
seq_wait_delay() function that takes a delay parameter, which, when
non-zero, causes the wakeup to be postponed by the given number of
milliseconds from the original seq_wait_delay() call time.  If nothing
happens in, say, 6 milliseconds, and a new flow table change is then
signaled, the revalidator threads wake up immediately without any further
delay.  Delay values smaller than 5 ms were found to increase the
'ovs-ofctl add-flows' execution time noticeably.

Since the revalidators no longer run after each flow mod, the overall OVS
CPU utilization during the 'ovs-ofctl add-flows' run is reduced by roughly
one core on a four-core machine.  In testing, the 'ovs-ofctl add-flows'
execution time is not significantly improved beyond this even if the
revalidators are not notified of the flow table changes at all.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
---
 lib/seq.c                     | 50 ++++++++++++++++++++++++++++++-------------
 lib/seq.h                     |  9 ++++++--
 ofproto/ofproto-dpif-upcall.c |  2 +-
 tests/ofproto-dpif.at         |  3 +++
 4 files changed, 46 insertions(+), 18 deletions(-)

diff --git a/lib/seq.c b/lib/seq.c
index 6e2f596..2b64dd3 100644
--- a/lib/seq.c
+++ b/lib/seq.c
@@ -22,9 +22,10 @@
 
 #include "coverage.h"
 #include "hash.h"
-#include "openvswitch/hmap.h"
 #include "latch.h"
+#include "openvswitch/hmap.h"
 #include "openvswitch/list.h"
+#include "timeval.h"
 #include "ovs-thread.h"
 #include "poll-loop.h"
 
@@ -45,6 +46,8 @@ struct seq_waiter {
     struct seq_thread *thread OVS_GUARDED; /* Thread preparing to wait. */
     struct ovs_list list_node OVS_GUARDED; /* In 'thread->waiters'. */
 
+    long long int delay_until_time OVS_GUARDED;
+
     uint64_t value OVS_GUARDED; /* seq->value we're waiting to change. */
 };
 
@@ -66,7 +69,8 @@ static struct seq_thread *seq_thread_get(void) OVS_REQUIRES(seq_mutex);
 static void seq_thread_exit(void *thread_) OVS_EXCLUDED(seq_mutex);
 static void seq_thread_woke(struct seq_thread *) OVS_REQUIRES(seq_mutex);
 static void seq_waiter_destroy(struct seq_waiter *) OVS_REQUIRES(seq_mutex);
-static void seq_wake_waiters(struct seq *) OVS_REQUIRES(seq_mutex);
+static void seq_wake_waiters(struct seq *, long long int now)
+    OVS_REQUIRES(seq_mutex);
 
 /* Creates and returns a new 'seq' object. */
 struct seq * OVS_EXCLUDED(seq_mutex)
@@ -94,7 +98,7 @@ seq_destroy(struct seq *seq)
     OVS_EXCLUDED(seq_mutex)
 {
     ovs_mutex_lock(&seq_mutex);
-    seq_wake_waiters(seq);
+    seq_wake_waiters(seq, LLONG_MAX);
     hmap_destroy(&seq->waiters);
     free(seq);
     ovs_mutex_unlock(&seq_mutex);
@@ -123,13 +127,13 @@ seq_unlock(void)
 
 /* Increments 'seq''s sequence number, waking up any threads that are waiting
  * on 'seq'. */
 void
-seq_change_protected(struct seq *seq)
+seq_change_protected_now(struct seq *seq, long long int now)
     OVS_REQUIRES(seq_mutex)
 {
     COVERAGE_INC(seq_change);
 
     seq->value = seq_next++;
-    seq_wake_waiters(seq);
+    seq_wake_waiters(seq, now);
 }
 
@@ -138,8 +142,12 @@ void
 seq_change(struct seq *seq)
     OVS_EXCLUDED(seq_mutex)
 {
+    /* time initialization uses a seq, so we must take 'now' before
+     * locking 'seq_mutex'. */
+    long long int now = time_msec();
+
     ovs_mutex_lock(&seq_mutex);
-    seq_change_protected(seq);
+    seq_change_protected_now(seq, now);
     ovs_mutex_unlock(&seq_mutex);
 }
@@ -173,8 +181,10 @@ seq_read(const struct seq *seq)
     return value;
 }
 
+/* Find or create a waiter for the current thread. */
 static void
-seq_wait__(struct seq *seq, uint64_t value, const char *where)
+seq_wait__(struct seq *seq, uint64_t value, long long int delay_until_time,
+           const char *where)
     OVS_REQUIRES(seq_mutex)
 {
     unsigned int id = ovsthread_id_self();
@@ -185,8 +195,9 @@ seq_wait__(struct seq *seq, uint64_t value, const char *where)
     if (waiter->ovsthread_id == id) {
         if (waiter->value != value) {
             /* The current value is different from the value we've already
-             * waited for, */
-            poll_immediate_wake_at(where);
+             * waited for.  Wakes immediately if 'waiter->delay_until_time'
+             * is in the past. */
+            poll_timer_wait_until_at(waiter->delay_until_time, where);
         } else {
             /* Already waiting on 'value', nothing more to do. */
         }
@@ -201,6 +212,7 @@ seq_wait__(struct seq *seq, uint64_t value, const char *where)
     waiter->value = value;
     waiter->thread = seq_thread_get();
     ovs_list_push_back(&waiter->thread->waiters, &waiter->list_node);
+    waiter->delay_until_time = delay_until_time;
 
     if (!waiter->thread->waiting) {
         latch_wait_at(&waiter->thread->latch, where);
@@ -220,15 +232,21 @@
  * automatically provide the caller's source file and line number for
  * 'where'.) */
 void
-seq_wait_at(const struct seq *seq_, uint64_t value, const char *where)
+seq_wait_delay_at(const struct seq *seq_, uint64_t value,
+                  unsigned int delay_time, const char *where)
     OVS_EXCLUDED(seq_mutex)
 {
     struct seq *seq = CONST_CAST(struct seq *, seq_);
+    /* time initialization uses a seq, so we must call time_msec() before
+     * locking 'seq_mutex'. */
+    long long int delay_until_time = time_msec() + delay_time;
 
     ovs_mutex_lock(&seq_mutex);
-    if (value == seq->value) {
-        seq_wait__(seq, value, where);
+    if (value == seq->value || delay_time) {
+        seq_wait__(seq, value, delay_until_time, where);
     } else {
+        /* The current value is different from the value we want to wait for
+         * and we do not want to wait. */
         poll_immediate_wake_at(where);
     }
     ovs_mutex_unlock(&seq_mutex);
@@ -316,13 +334,15 @@ seq_waiter_destroy(struct seq_waiter *waiter)
 }
 
 static void
-seq_wake_waiters(struct seq *seq)
+seq_wake_waiters(struct seq *seq, long long int now)
     OVS_REQUIRES(seq_mutex)
 {
     struct seq_waiter *waiter, *next_waiter;
 
     HMAP_FOR_EACH_SAFE (waiter, next_waiter, hmap_node, &seq->waiters) {
-        latch_set(&waiter->thread->latch);
-        seq_waiter_destroy(waiter);
+        if (now >= waiter->delay_until_time) {
+            latch_set(&waiter->thread->latch);
+            seq_waiter_destroy(waiter);
+        }
     }
 }
diff --git a/lib/seq.h b/lib/seq.h
index 221ab9a..a459921 100644
--- a/lib/seq.h
+++ b/lib/seq.h
@@ -121,7 +121,8 @@
 struct seq *seq_create(void);
 void seq_destroy(struct seq *);
 void seq_change(struct seq *);
-void seq_change_protected(struct seq *);
+void seq_change_protected_now(struct seq *, long long int now);
+#define seq_change_protected(seq) seq_change_protected_now(seq, LLONG_MAX)
 void seq_lock(void);
 int seq_try_lock(void);
 void seq_unlock(void);
@@ -130,7 +131,11 @@ void seq_unlock(void);
 
 uint64_t seq_read(const struct seq *);
 uint64_t seq_read_protected(const struct seq *);
-void seq_wait_at(const struct seq *, uint64_t value, const char *where);
+void seq_wait_delay_at(const struct seq *, uint64_t value, unsigned int delay,
+                       const char *where);
+#define seq_wait_delay(seq, value, delay) \
+    seq_wait_delay_at(seq, value, delay, OVS_SOURCE_LOCATOR)
+#define seq_wait_at(seq, value, where) seq_wait_delay_at(seq, value, 0, where)
 #define seq_wait(seq, value) seq_wait_at(seq, value, OVS_SOURCE_LOCATOR)
 
 /* For poll_block() internal use. */
diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
index eecea53..2de8e70 100644
--- a/ofproto/ofproto-dpif-upcall.c
+++ b/ofproto/ofproto-dpif-upcall.c
@@ -935,7 +935,7 @@ udpif_revalidator(void *arg)
         }
 
         poll_timer_wait_until(start_time + MIN(ofproto_max_idle, 500));
-        seq_wait(udpif->reval_seq, last_reval_seq);
+        seq_wait_delay(udpif->reval_seq, last_reval_seq, 5);
         latch_wait(&udpif->exit_latch);
         latch_wait(&udpif->pause_latch);
         poll_block();
diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at
index de57efd..2105718 100644
--- a/tests/ofproto-dpif.at
+++ b/tests/ofproto-dpif.at
@@ -4609,6 +4609,9 @@ m4_define([CHECK_CONTINUATION], [dnl
     m4_if([$3], [0], [],
       [AT_CHECK([echo "$actions1" | sed 's/pause/controller(pause)/g' | ovs-ofctl -O OpenFlow13 add-flows br1 -])])
 
+    # Wait for the revalidators to catch up.
+    ovs-appctl revalidator/wait
+
     # Run a packet through the switch.
     AT_CHECK([ovs-appctl netdev-dummy/receive p1 "$flow"], [0], [stdout])