From patchwork Sun Jun 19 22:28:32 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: sergey.fedorov@linaro.org
X-Patchwork-Id: 637755
From: Sergey Fedorov
To: qemu-devel@nongnu.org
Date: Mon, 20 Jun 2016 01:28:32 +0300
Message-Id: <1466375313-7562-8-git-send-email-sergey.fedorov@linaro.org>
In-Reply-To: <1466375313-7562-1-git-send-email-sergey.fedorov@linaro.org>
References: <1466375313-7562-1-git-send-email-sergey.fedorov@linaro.org>
Subject: [Qemu-devel] [RFC 7/8] cpu-exec-common: Introduce async_safe_run_on_cpu()
Cc: Riku Voipio , Sergey Fedorov , Peter Crosthwaite , patches@linaro.org, Paolo Bonzini , Sergey
Fedorov , Richard Henderson

From: Sergey Fedorov

This patch is based on ideas found in the work of KONRAD Frederic [1],
Alex Bennée [2], and Alvise Rigo [3].

This mechanism allows performing an operation safely in a quiescent
state. Quiescent state means: (1) no vCPU is running and (2) BQL in
system-mode or 'exclusive_lock' in user-mode emulation is held while
performing the operation. This functionality is required e.g. for
performing a translation buffer flush safely in multi-threaded user-mode
emulation.

The existing CPU work queue is used to schedule such safe operations. A
new 'safe' flag is added to struct qemu_work_item to designate the
special requirements of the safe work. An operation in a quiescent state
can be scheduled with async_safe_run_on_cpu(), which is the same as
async_run_on_cpu() except that it sets the 'safe' flag of the queued
work item to true. When this flag is set, queue_work_on_cpu() atomically
increments the global 'safe_work_pending' counter and kicks all the CPUs
instead of just the target CPU, as is done for normal CPU work. This
forces other CPUs to exit their execution loops and wait in
wait_safe_cpu_work() for the safe work to finish. When a CPU drains its
work queue and encounters a work item marked as safe, it first waits for
other CPUs to exit their execution loops, then calls the work item
function, and finally decrements the 'safe_work_pending' counter and
signals the other CPUs so that they can continue execution as soon as
all pending safe work items have been processed. 'tcg_pending_cpus',
protected by 'exclusive_lock' in user-mode or by 'qemu_global_mutex' in
system-mode emulation, is used to determine whether any CPU is running
and to wait for it to exit the execution loop. Fairness across all the
CPU work queues is ensured by draining all pending safe work items
before any CPU can run.

[1] http://lists.nongnu.org/archive/html/qemu-devel/2015-08/msg01128.html
[2] http://lists.nongnu.org/archive/html/qemu-devel/2016-04/msg02531.html
[3] http://lists.nongnu.org/archive/html/qemu-devel/2016-05/msg04792.html

Signed-off-by: Sergey Fedorov
Signed-off-by: Sergey Fedorov
---
 cpu-exec-common.c       | 45 ++++++++++++++++++++++++++++++++++++++++++++-
 cpus.c                  | 16 ++++++++++++++++
 include/exec/exec-all.h |  2 ++
 include/qom/cpu.h       | 14 ++++++++++++++
 linux-user/main.c       |  2 +-
 5 files changed, 77 insertions(+), 2 deletions(-)

diff --git a/cpu-exec-common.c b/cpu-exec-common.c
index 8184e0662cbd..3056324738f8 100644
--- a/cpu-exec-common.c
+++ b/cpu-exec-common.c
@@ -25,6 +25,7 @@
 
 bool exit_request;
 CPUState *tcg_current_cpu;
+int tcg_pending_cpus;
 
 /* exit the current TB, but without causing any exception to be raised */
 void cpu_loop_exit_noexc(CPUState *cpu)
@@ -78,6 +79,15 @@ void cpu_loop_exit_restore(CPUState *cpu, uintptr_t pc)
     siglongjmp(cpu->jmp_env, 1);
 }
 
+static int safe_work_pending;
+
+void wait_safe_cpu_work(void)
+{
+    while (atomic_mb_read(&safe_work_pending) > 0) {
+        wait_cpu_work();
+    }
+}
+
 static void queue_work_on_cpu(CPUState *cpu, struct qemu_work_item *wi)
 {
     qemu_mutex_lock(&cpu->work_mutex);
@@ -89,9 +99,18 @@ static void queue_work_on_cpu(CPUState *cpu, struct qemu_work_item *wi)
     cpu->queued_work_last = wi;
     wi->next = NULL;
     wi->done = false;
+    if (wi->safe) {
+        atomic_inc(&safe_work_pending);
+    }
     qemu_mutex_unlock(&cpu->work_mutex);
 
-    qemu_cpu_kick(cpu);
+    if (!wi->safe) {
+        qemu_cpu_kick(cpu);
+    } else {
+        CPU_FOREACH(cpu) {
+            qemu_cpu_kick(cpu);
+        }
+    }
 }
 
 void run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data)
@@ -106,6 +125,7 @@ void run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data)
     wi.func = func;
     wi.data = data;
     wi.free = false;
+    wi.safe = false;
 
     queue_work_on_cpu(cpu, &wi);
     while (!atomic_mb_read(&wi.done)) {
@@ -129,6 +149,20 @@ void async_run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data)
     wi->func = func;
     wi->data = data;
     wi->free = true;
+    wi->safe = false;
+
+    queue_work_on_cpu(cpu, wi);
+}
+
+void async_safe_run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data)
+{
+    struct qemu_work_item *wi;
+
+    wi = g_malloc0(sizeof(struct qemu_work_item));
+    wi->func = func;
+    wi->data = data;
+    wi->free = true;
+    wi->safe = true;
 
     queue_work_on_cpu(cpu, wi);
 }
@@ -148,9 +182,18 @@ void flush_queued_work(CPUState *cpu)
         if (!cpu->queued_work_first) {
             cpu->queued_work_last = NULL;
         }
+        if (wi->safe) {
+            while (tcg_pending_cpus) {
+                wait_cpu_work();
+            }
+        }
         qemu_mutex_unlock(&cpu->work_mutex);
         wi->func(cpu, wi->data);
         qemu_mutex_lock(&cpu->work_mutex);
+        if (wi->safe) {
+            atomic_dec(&safe_work_pending);
+            signal_cpu_work();
+        }
         if (wi->free) {
             g_free(wi);
         } else {
diff --git a/cpus.c b/cpus.c
index 98f60f6f98f5..bb6bd8615cfc 100644
--- a/cpus.c
+++ b/cpus.c
@@ -932,6 +932,18 @@ static void qemu_tcg_destroy_vcpu(CPUState *cpu)
 {
 }
 
+static void tcg_cpu_exec_start(CPUState *cpu)
+{
+    tcg_pending_cpus++;
+}
+
+static void tcg_cpu_exec_end(CPUState *cpu)
+{
+    if (--tcg_pending_cpus) {
+        signal_cpu_work();
+    }
+}
+
 static void qemu_wait_io_event_common(CPUState *cpu)
 {
     if (cpu->stop) {
@@ -956,6 +968,8 @@ static void qemu_tcg_wait_io_event(CPUState *cpu)
     CPU_FOREACH(cpu) {
         qemu_wait_io_event_common(cpu);
     }
+
+    wait_safe_cpu_work();
 }
 
 static void qemu_kvm_wait_io_event(CPUState *cpu)
@@ -1491,7 +1505,9 @@ static void tcg_exec_all(void)
                           (cpu->singlestep_enabled & SSTEP_NOTIMER) == 0);
 
         if (cpu_can_run(cpu)) {
+            tcg_cpu_exec_start(cpu);
             r = tcg_cpu_exec(cpu);
+            tcg_cpu_exec_end(cpu);
             if (r == EXCP_DEBUG) {
                 cpu_handle_guest_debug(cpu);
                 break;
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 23b4b50e0a45..3bc44ed81473 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -405,10 +405,12 @@ extern int singlestep;
 
 /* cpu-exec.c, accessed with atomic_mb_read/atomic_mb_set */
 extern CPUState *tcg_current_cpu;
+extern int tcg_pending_cpus;
 extern bool exit_request;
 
 void wait_cpu_work(void);
 void signal_cpu_work(void);
 void flush_queued_work(CPUState *cpu);
+void wait_safe_cpu_work(void);
 
 #endif
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 4e688f645b4a..5128fcc1745a 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -231,6 +231,7 @@ struct qemu_work_item {
     void *data;
     int done;
     bool free;
+    bool safe;
 };
 
 /**
@@ -625,6 +626,19 @@ void run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data);
 void async_run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data);
 
 /**
+ * async_safe_run_on_cpu:
+ * @cpu: The vCPU to run on.
+ * @func: The function to be executed.
+ * @data: Data to pass to the function.
+ *
+ * Schedules the function @func for execution on the vCPU @cpu asynchronously
+ * and in quiescent state. Quiescent state means: (1) all other vCPUs are
+ * halted and (2) #qemu_global_mutex (a.k.a. BQL) in system-mode or
+ * #exclusive_lock in user-mode emulation is held while @func is executing.
+ */
+void async_safe_run_on_cpu(CPUState *cpu, run_on_cpu_func func, void *data);
+
+/**
  * qemu_get_cpu:
  * @index: The CPUState@cpu_index value of the CPU to obtain.
  *
diff --git a/linux-user/main.c b/linux-user/main.c
index 5a68651159c2..6da3bb32186b 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -113,7 +113,6 @@ static pthread_cond_t exclusive_cond = PTHREAD_COND_INITIALIZER;
 static pthread_cond_t exclusive_resume = PTHREAD_COND_INITIALIZER;
 static pthread_cond_t work_cond = PTHREAD_COND_INITIALIZER;
 static bool exclusive_pending;
-static int tcg_pending_cpus;
 
 /* Make sure everything is in a consistent state for calling fork(). */
 void fork_start(void)
@@ -219,6 +218,7 @@ static inline void cpu_exec_end(CPUState *cpu)
     }
     exclusive_idle();
     flush_queued_work(cpu);
+    wait_safe_cpu_work();
     pthread_mutex_unlock(&exclusive_lock);
 }
 
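
For readers who want to follow the counter bookkeeping without the rest of QEMU, here is a minimal single-threaded sketch of the 'safe' flag handling. It is only an illustration under stated assumptions: every model_* name is hypothetical and not part of this series, and there is no locking, no threads and no CPU kicking. Where the real flush_queued_work() would block in wait_cpu_work(), the model just returns -1.

```c
/* Hypothetical model of the 'safe' work-item bookkeeping. Not QEMU code. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

typedef void (*model_work_func)(void *data);

struct model_work_item {
    struct model_work_item *next;
    model_work_func func;
    void *data;
    bool safe;              /* plays the role of the new 'safe' flag */
};

struct model_cpu {
    struct model_work_item *first, *last;
};

/* Queued but unfinished safe items, like 'safe_work_pending'. */
static int model_safe_work_pending;
/* CPUs currently inside their execution loop, like 'tcg_pending_cpus'. */
static int model_running_cpus;

/* Append a work item; safe items also bump the global pending counter. */
static void model_queue_work(struct model_cpu *cpu, model_work_func func,
                             void *data, bool safe)
{
    struct model_work_item *wi = calloc(1, sizeof(*wi));

    if (!wi) {
        abort();
    }
    wi->func = func;
    wi->data = data;
    wi->safe = safe;
    if (cpu->first == NULL) {
        cpu->first = wi;
    } else {
        cpu->last->next = wi;
    }
    cpu->last = wi;
    if (safe) {
        model_safe_work_pending++;
    }
}

/* A CPU may enter its execution loop only when no safe work is pending. */
static bool model_cpu_may_run(void)
{
    return model_safe_work_pending == 0;
}

/*
 * Drain the queue in order. Returns the number of items run, or -1 if a
 * safe item was reached while some CPU was still running (the point at
 * which the real code would wait instead of returning).
 */
static int model_flush_queued_work(struct model_cpu *cpu)
{
    int ran = 0;

    while (cpu->first != NULL) {
        struct model_work_item *wi = cpu->first;

        if (wi->safe && model_running_cpus > 0) {
            return -1;
        }
        cpu->first = wi->next;
        if (cpu->first == NULL) {
            cpu->last = NULL;
        }
        wi->func(wi->data);
        if (wi->safe) {
            model_safe_work_pending--;
        }
        free(wi);
        ran++;
    }
    return ran;
}

/* Example work function: increments the int that data points to. */
static void model_bump(void *data)
{
    ++*(int *)data;
}
```

In this model, queueing a safe item makes model_cpu_may_run() return false until the item has been executed. That corresponds to CPUs waiting in wait_safe_cpu_work() before they can run again.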