From patchwork Mon Aug 3 09:05:28 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Claudio Fontana X-Patchwork-Id: 1340212 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=suse.de Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4BKsXb3s9Lz9sTb for ; Mon, 3 Aug 2020 19:09:51 +1000 (AEST) Received: from localhost ([::1]:54312 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1k2WTd-0003Pp-7k for incoming@patchwork.ozlabs.org; Mon, 03 Aug 2020 05:09:49 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:60906) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1k2WPg-0006X6-6L for qemu-devel@nongnu.org; Mon, 03 Aug 2020 05:05:44 -0400 Received: from mx2.suse.de ([195.135.220.15]:51706) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1k2WPc-0002pt-PG for qemu-devel@nongnu.org; Mon, 03 Aug 2020 05:05:43 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 282D3B70B; Mon, 3 Aug 2020 09:05:54 +0000 (UTC) From: Claudio Fontana To: Paolo Bonzini , =?utf-8?q?Alex_Benn=C3=A9e?= , Peter Maydell , =?utf-8?q?Philippe_Mathieu-Daud?= =?utf-8?q?=C3=A9?= Subject: [RFC v3 3/8] cpus: extract out TCG-specific code to accel/tcg Date: Mon, 3 Aug 2020 11:05:28 +0200 Message-Id: <20200803090533.7410-4-cfontana@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20200803090533.7410-1-cfontana@suse.de> References: <20200803090533.7410-1-cfontana@suse.de> Received-SPF: pass client-ip=195.135.220.15; envelope-from=cfontana@suse.de; helo=mx2.suse.de X-detected-operating-system: by eggs.gnu.org: First seen = 2020/08/02 22:52:29 X-ACL-Warn: Detected OS = Linux 2.2.x-3.x (no timestamps) [generic] X-Spam_score_int: -41 X-Spam_score: -4.2 X-Spam_bar: ---- X-Spam_report: (-4.2 / 5.0 requ) BAYES_00=-1.9, RCVD_IN_DNSWL_MED=-2.3, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Laurent Vivier , Thomas Huth , Eduardo Habkost , Pavel Dovgalyuk , Marcelo Tosatti , qemu-devel@nongnu.org, Markus Armbruster , Roman Bolshakov , Wenchao Wang , Colin Xu , Claudio Fontana , haxm-team@intel.com, Sunil Muthuswamy , Richard Henderson Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" TCG is the first accelerator to register a "CpusAccel" interface on initialization, providing functions for starting a vcpu, kicking a vcpu, sychronizing state and getting virtual clock and ticks. Signed-off-by: Claudio Fontana --- accel/tcg/Makefile.objs | 1 + accel/tcg/tcg-all.c | 12 +- accel/tcg/tcg-cpus.c | 541 ++++++++++++++++++++++++++++++++++++++++++++++++ accel/tcg/tcg-cpus.h | 17 ++ softmmu/cpus.c | 500 +------------------------------------------- 5 files changed, 569 insertions(+), 502 deletions(-) create mode 100644 accel/tcg/tcg-cpus.c create mode 100644 accel/tcg/tcg-cpus.h diff --git a/accel/tcg/Makefile.objs b/accel/tcg/Makefile.objs index a92f2c454b..ecf9aa582e 100644 --- a/accel/tcg/Makefile.objs +++ b/accel/tcg/Makefile.objs @@ -1,5 +1,6 @@ obj-$(CONFIG_SOFTMMU) += tcg-all.o obj-$(CONFIG_SOFTMMU) += cputlb.o +obj-$(CONFIG_SOFTMMU) += tcg-cpus.o obj-y += tcg-runtime.o tcg-runtime-gvec.o obj-y += cpu-exec.o cpu-exec-common.o translate-all.o obj-y += translator.o diff --git a/accel/tcg/tcg-all.c b/accel/tcg/tcg-all.c index f1feea20c8..01957b130d 100644 --- a/accel/tcg/tcg-all.c +++ b/accel/tcg/tcg-all.c @@ -24,19 +24,17 @@ */ #include "qemu/osdep.h" -#include "sysemu/accel.h" +#include "qemu-common.h" #include "sysemu/tcg.h" -#include "qom/object.h" -#include "cpu.h" -#include "sysemu/cpus.h" #include "sysemu/cpu-timers.h" -#include "qemu/main-loop.h" #include "tcg/tcg.h" #include "qapi/error.h" #include "qemu/error-report.h" #include "hw/boards.h" #include "qapi/qapi-builtin-visit.h" +#include "tcg-cpus.h" + typedef struct TCGState { AccelState parent_obj; @@ -123,6 +121,8 @@ static void tcg_accel_instance_init(Object *obj) s->mttcg_enabled = default_mttcg_enabled(); } +bool mttcg_enabled; + static int tcg_init(MachineState *ms) { TCGState *s = TCG_STATE(current_accel()); @@ -130,6 +130,8 @@ static int tcg_init(MachineState *ms) tcg_exec_init(s->tb_size * 1024 * 1024); cpu_interrupt_handler = tcg_handle_interrupt; mttcg_enabled = s->mttcg_enabled; + cpus_register_accel(&tcg_cpus); + return 0; } diff --git a/accel/tcg/tcg-cpus.c b/accel/tcg/tcg-cpus.c new file mode 100644 index 0000000000..c82d142523 --- /dev/null +++ b/accel/tcg/tcg-cpus.c @@ -0,0 +1,541 @@ +/* + * QEMU System Emulator + * + * Copyright (c) 2003-2008 Fabrice Bellard + * Copyright (c) 2014 Red Hat Inc. + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this software and associated documentation files (the "Software"), to deal + * in the Software without restriction, including without limitation the rights + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + * copies of the Software, and to permit persons to whom the Software is + * furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN + * THE SOFTWARE. + */ + +#include "qemu/osdep.h" +#include "qemu-common.h" +#include "sysemu/tcg.h" +#include "sysemu/replay.h" +#include "qemu/main-loop.h" +#include "qemu/guest-random.h" +#include "exec/exec-all.h" + +#include "tcg-cpus.h" + +/* Kick all RR vCPUs */ +static void qemu_cpu_kick_rr_cpus(void) +{ + CPUState *cpu; + + CPU_FOREACH(cpu) { + cpu_exit(cpu); + }; +} + +static void tcg_kick_vcpu_thread(CPUState *cpu) +{ + if (qemu_tcg_mttcg_enabled()) { + cpu_exit(cpu); + } else { + qemu_cpu_kick_rr_cpus(); + } +} + +/* + * TCG vCPU kick timer + * + * The kick timer is responsible for moving single threaded vCPU + * emulation on to the next vCPU. If more than one vCPU is running a + * timer event with force a cpu->exit so the next vCPU can get + * scheduled. + * + * The timer is removed if all vCPUs are idle and restarted again once + * idleness is complete. + */ + +static QEMUTimer *tcg_kick_vcpu_timer; +static CPUState *tcg_current_rr_cpu; + +#define TCG_KICK_PERIOD (NANOSECONDS_PER_SECOND / 10) + +static inline int64_t qemu_tcg_next_kick(void) +{ + return qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + TCG_KICK_PERIOD; +} + +/* Kick the currently round-robin scheduled vCPU to next */ +static void qemu_cpu_kick_rr_next_cpu(void) +{ + CPUState *cpu; + do { + cpu = atomic_mb_read(&tcg_current_rr_cpu); + if (cpu) { + cpu_exit(cpu); + } + } while (cpu != atomic_mb_read(&tcg_current_rr_cpu)); +} + +static void kick_tcg_thread(void *opaque) +{ + timer_mod(tcg_kick_vcpu_timer, qemu_tcg_next_kick()); + qemu_cpu_kick_rr_next_cpu(); +} + +static void start_tcg_kick_timer(void) +{ + assert(!mttcg_enabled); + if (!tcg_kick_vcpu_timer && CPU_NEXT(first_cpu)) { + tcg_kick_vcpu_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, + kick_tcg_thread, NULL); + } + if (tcg_kick_vcpu_timer && !timer_pending(tcg_kick_vcpu_timer)) { + timer_mod(tcg_kick_vcpu_timer, qemu_tcg_next_kick()); + } +} + +static void stop_tcg_kick_timer(void) +{ + assert(!mttcg_enabled); + if (tcg_kick_vcpu_timer && timer_pending(tcg_kick_vcpu_timer)) { + timer_del(tcg_kick_vcpu_timer); + } +} + +static void qemu_tcg_destroy_vcpu(CPUState *cpu) +{ +} + +static void qemu_tcg_rr_wait_io_event(void) +{ + CPUState *cpu; + + while (all_cpu_threads_idle()) { + stop_tcg_kick_timer(); + qemu_cond_wait_iothread(first_cpu->halt_cond); + } + + start_tcg_kick_timer(); + + CPU_FOREACH(cpu) { + qemu_wait_io_event_common(cpu); + } +} + +static int64_t tcg_get_icount_limit(void) +{ + int64_t deadline; + + if (replay_mode != REPLAY_MODE_PLAY) { + /* + * Include all the timers, because they may need an attention. + * Too long CPU execution may create unnecessary delay in UI. + */ + deadline = qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL, + QEMU_TIMER_ATTR_ALL); + /* Check realtime timers, because they help with input processing */ + deadline = qemu_soonest_timeout(deadline, + qemu_clock_deadline_ns_all(QEMU_CLOCK_REALTIME, + QEMU_TIMER_ATTR_ALL)); + + /* + * Maintain prior (possibly buggy) behaviour where if no deadline + * was set (as there is no QEMU_CLOCK_VIRTUAL timer) or it is more than + * INT32_MAX nanoseconds ahead, we still use INT32_MAX + * nanoseconds. + */ + if ((deadline < 0) || (deadline > INT32_MAX)) { + deadline = INT32_MAX; + } + + return icount_round(deadline); + } else { + return replay_get_instructions(); + } +} + +static void notify_aio_contexts(void) +{ + /* Wake up other AioContexts. */ + qemu_clock_notify(QEMU_CLOCK_VIRTUAL); + qemu_clock_run_timers(QEMU_CLOCK_VIRTUAL); +} + +static void handle_icount_deadline(void) +{ + assert(qemu_in_vcpu_thread()); + if (icount_enabled()) { + int64_t deadline = qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL, + QEMU_TIMER_ATTR_ALL); + + if (deadline == 0) { + notify_aio_contexts(); + } + } +} + +static void prepare_icount_for_run(CPUState *cpu) +{ + if (icount_enabled()) { + int insns_left; + + /* + * These should always be cleared by process_icount_data after + * each vCPU execution. However u16.high can be raised + * asynchronously by cpu_exit/cpu_interrupt/tcg_handle_interrupt + */ + g_assert(cpu_neg(cpu)->icount_decr.u16.low == 0); + g_assert(cpu->icount_extra == 0); + + cpu->icount_budget = tcg_get_icount_limit(); + insns_left = MIN(0xffff, cpu->icount_budget); + cpu_neg(cpu)->icount_decr.u16.low = insns_left; + cpu->icount_extra = cpu->icount_budget - insns_left; + + replay_mutex_lock(); + + if (cpu->icount_budget == 0 && replay_has_checkpoint()) { + notify_aio_contexts(); + } + } +} + +static void process_icount_data(CPUState *cpu) +{ + if (icount_enabled()) { + /* Account for executed instructions */ + icount_update(cpu); + + /* Reset the counters */ + cpu_neg(cpu)->icount_decr.u16.low = 0; + cpu->icount_extra = 0; + cpu->icount_budget = 0; + + replay_account_executed_instructions(); + + replay_mutex_unlock(); + } +} + +static int tcg_cpu_exec(CPUState *cpu) +{ + int ret; +#ifdef CONFIG_PROFILER + int64_t ti; +#endif + + assert(tcg_enabled()); +#ifdef CONFIG_PROFILER + ti = profile_getclock(); +#endif + cpu_exec_start(cpu); + ret = cpu_exec(cpu); + cpu_exec_end(cpu); +#ifdef CONFIG_PROFILER + atomic_set(&tcg_ctx->prof.cpu_exec_time, + tcg_ctx->prof.cpu_exec_time + profile_getclock() - ti); +#endif + return ret; +} + +/* + * Destroy any remaining vCPUs which have been unplugged and have + * finished running + */ +static void deal_with_unplugged_cpus(void) +{ + CPUState *cpu; + + CPU_FOREACH(cpu) { + if (cpu->unplug && !cpu_can_run(cpu)) { + qemu_tcg_destroy_vcpu(cpu); + cpu_thread_signal_destroyed(cpu); + break; + } + } +} + +/* + * Single-threaded TCG + * + * In the single-threaded case each vCPU is simulated in turn. If + * there is more than a single vCPU we create a simple timer to kick + * the vCPU and ensure we don't get stuck in a tight loop in one vCPU. + * This is done explicitly rather than relying on side-effects + * elsewhere. + */ + +static void *tcg_rr_cpu_thread_fn(void *arg) +{ + CPUState *cpu = arg; + + assert(tcg_enabled()); + rcu_register_thread(); + tcg_register_thread(); + + qemu_mutex_lock_iothread(); + qemu_thread_get_self(cpu->thread); + + cpu->thread_id = qemu_get_thread_id(); + cpu->can_do_io = 1; + cpu_thread_signal_created(cpu); + qemu_guest_random_seed_thread_part2(cpu->random_seed); + + /* wait for initial kick-off after machine start */ + while (first_cpu->stopped) { + qemu_cond_wait_iothread(first_cpu->halt_cond); + + /* process any pending work */ + CPU_FOREACH(cpu) { + current_cpu = cpu; + qemu_wait_io_event_common(cpu); + } + } + + start_tcg_kick_timer(); + + cpu = first_cpu; + + /* process any pending work */ + cpu->exit_request = 1; + + while (1) { + qemu_mutex_unlock_iothread(); + replay_mutex_lock(); + qemu_mutex_lock_iothread(); + /* Account partial waits to QEMU_CLOCK_VIRTUAL. */ + icount_account_warp_timer(); + + /* + * Run the timers here. This is much more efficient than + * waking up the I/O thread and waiting for completion. + */ + handle_icount_deadline(); + + replay_mutex_unlock(); + + if (!cpu) { + cpu = first_cpu; + } + + while (cpu && cpu_work_list_empty(cpu) && !cpu->exit_request) { + + atomic_mb_set(&tcg_current_rr_cpu, cpu); + current_cpu = cpu; + + qemu_clock_enable(QEMU_CLOCK_VIRTUAL, + (cpu->singlestep_enabled & SSTEP_NOTIMER) == 0); + + if (cpu_can_run(cpu)) { + int r; + + qemu_mutex_unlock_iothread(); + prepare_icount_for_run(cpu); + + r = tcg_cpu_exec(cpu); + + process_icount_data(cpu); + qemu_mutex_lock_iothread(); + + if (r == EXCP_DEBUG) { + cpu_handle_guest_debug(cpu); + break; + } else if (r == EXCP_ATOMIC) { + qemu_mutex_unlock_iothread(); + cpu_exec_step_atomic(cpu); + qemu_mutex_lock_iothread(); + break; + } + } else if (cpu->stop) { + if (cpu->unplug) { + cpu = CPU_NEXT(cpu); + } + break; + } + + cpu = CPU_NEXT(cpu); + } /* while (cpu && !cpu->exit_request).. */ + + /* Does not need atomic_mb_set because a spurious wakeup is okay. */ + atomic_set(&tcg_current_rr_cpu, NULL); + + if (cpu && cpu->exit_request) { + atomic_mb_set(&cpu->exit_request, 0); + } + + if (icount_enabled() && all_cpu_threads_idle()) { + /* + * When all cpus are sleeping (e.g in WFI), to avoid a deadlock + * in the main_loop, wake it up in order to start the warp timer. + */ + qemu_notify_event(); + } + + qemu_tcg_rr_wait_io_event(); + deal_with_unplugged_cpus(); + } + + rcu_unregister_thread(); + return NULL; +} + +/* + * Multi-threaded TCG + * + * In the multi-threaded case each vCPU has its own thread. The TLS + * variable current_cpu can be used deep in the code to find the + * current CPUState for a given thread. + */ + +static void *tcg_cpu_thread_fn(void *arg) +{ + CPUState *cpu = arg; + + assert(tcg_enabled()); + g_assert(!icount_enabled()); + + rcu_register_thread(); + tcg_register_thread(); + + qemu_mutex_lock_iothread(); + qemu_thread_get_self(cpu->thread); + + cpu->thread_id = qemu_get_thread_id(); + cpu->can_do_io = 1; + current_cpu = cpu; + cpu_thread_signal_created(cpu); + qemu_guest_random_seed_thread_part2(cpu->random_seed); + + /* process any pending work */ + cpu->exit_request = 1; + + do { + if (cpu_can_run(cpu)) { + int r; + qemu_mutex_unlock_iothread(); + r = tcg_cpu_exec(cpu); + qemu_mutex_lock_iothread(); + switch (r) { + case EXCP_DEBUG: + cpu_handle_guest_debug(cpu); + break; + case EXCP_HALTED: + /* + * during start-up the vCPU is reset and the thread is + * kicked several times. If we don't ensure we go back + * to sleep in the halted state we won't cleanly + * start-up when the vCPU is enabled. + * + * cpu->halted should ensure we sleep in wait_io_event + */ + g_assert(cpu->halted); + break; + case EXCP_ATOMIC: + qemu_mutex_unlock_iothread(); + cpu_exec_step_atomic(cpu); + qemu_mutex_lock_iothread(); + default: + /* Ignore everything else? */ + break; + } + } + + atomic_mb_set(&cpu->exit_request, 0); + qemu_wait_io_event(cpu); + } while (!cpu->unplug || cpu_can_run(cpu)); + + qemu_tcg_destroy_vcpu(cpu); + cpu_thread_signal_destroyed(cpu); + qemu_mutex_unlock_iothread(); + rcu_unregister_thread(); + return NULL; +} + +static void tcg_start_vcpu_thread(CPUState *cpu) +{ + char thread_name[VCPU_THREAD_NAME_SIZE]; + static QemuCond *single_tcg_halt_cond; + static QemuThread *single_tcg_cpu_thread; + static int tcg_region_inited; + + assert(tcg_enabled()); + /* + * Initialize TCG regions--once. Now is a good time, because: + * (1) TCG's init context, prologue and target globals have been set up. + * (2) qemu_tcg_mttcg_enabled() works now (TCG init code runs before the + * -accel flag is processed, so the check doesn't work then). + */ + if (!tcg_region_inited) { + tcg_region_inited = 1; + tcg_region_init(); + } + + if (qemu_tcg_mttcg_enabled() || !single_tcg_cpu_thread) { + cpu->thread = g_malloc0(sizeof(QemuThread)); + cpu->halt_cond = g_malloc0(sizeof(QemuCond)); + qemu_cond_init(cpu->halt_cond); + + if (qemu_tcg_mttcg_enabled()) { + /* create a thread per vCPU with TCG (MTTCG) */ + parallel_cpus = true; + snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/TCG", + cpu->cpu_index); + + qemu_thread_create(cpu->thread, thread_name, tcg_cpu_thread_fn, + cpu, QEMU_THREAD_JOINABLE); + + } else { + /* share a single thread for all cpus with TCG */ + snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "ALL CPUs/TCG"); + qemu_thread_create(cpu->thread, thread_name, + tcg_rr_cpu_thread_fn, + cpu, QEMU_THREAD_JOINABLE); + + single_tcg_halt_cond = cpu->halt_cond; + single_tcg_cpu_thread = cpu->thread; + } +#ifdef _WIN32 + cpu->hThread = qemu_thread_get_handle(cpu->thread); +#endif + } else { + /* For non-MTTCG cases we share the thread */ + cpu->thread = single_tcg_cpu_thread; + cpu->halt_cond = single_tcg_halt_cond; + cpu->thread_id = first_cpu->thread_id; + cpu->can_do_io = 1; + cpu->created = true; + } +} + +static int64_t tcg_get_virtual_clock(void) +{ + if (icount_enabled()) { + return icount_get(); + } + return cpu_get_clock(); +} + +static int64_t tcg_get_elapsed_ticks(void) +{ + if (icount_enabled()) { + return icount_get(); + } + return cpu_get_ticks(); +} + +CpusAccel tcg_cpus = { + .create_vcpu_thread = tcg_start_vcpu_thread, + .kick_vcpu_thread = tcg_kick_vcpu_thread, + .get_virtual_clock = tcg_get_virtual_clock, + .get_elapsed_ticks = tcg_get_elapsed_ticks, +}; diff --git a/accel/tcg/tcg-cpus.h b/accel/tcg/tcg-cpus.h new file mode 100644 index 0000000000..af4be6a151 --- /dev/null +++ b/accel/tcg/tcg-cpus.h @@ -0,0 +1,17 @@ +/* + * Accelerator CPUS Interface + * + * Copyright 2020 SUSE LLC + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ + +#ifndef TCG_CPUS_H +#define TCG_CPUS_H + +#include "sysemu/cpus.h" + +extern CpusAccel tcg_cpus; + +#endif /* TCG_CPUS_H */ diff --git a/softmmu/cpus.c b/softmmu/cpus.c index bad6302ca3..eaed6749e3 100644 --- a/softmmu/cpus.c +++ b/softmmu/cpus.c @@ -24,27 +24,19 @@ #include "qemu/osdep.h" #include "qemu-common.h" -#include "qemu/config-file.h" -#include "qemu/cutils.h" -#include "migration/vmstate.h" #include "monitor/monitor.h" #include "qapi/error.h" #include "qapi/qapi-commands-misc.h" #include "qapi/qapi-events-run-state.h" #include "qapi/qmp/qerror.h" -#include "qemu/error-report.h" -#include "qemu/qemu-print.h" #include "sysemu/tcg.h" -#include "sysemu/block-backend.h" #include "exec/gdbstub.h" -#include "sysemu/dma.h" #include "sysemu/hw_accel.h" #include "sysemu/kvm.h" #include "sysemu/hax.h" #include "sysemu/hvf.h" #include "sysemu/whpx.h" #include "exec/exec-all.h" - #include "qemu/thread.h" #include "qemu/plugin.h" #include "sysemu/cpus.h" @@ -124,79 +116,6 @@ bool all_cpu_threads_idle(void) return true; } -bool mttcg_enabled; - - -/***********************************************************/ -/* TCG vCPU kick timer - * - * The kick timer is responsible for moving single threaded vCPU - * emulation on to the next vCPU. If more than one vCPU is running a - * timer event with force a cpu->exit so the next vCPU can get - * scheduled. - * - * The timer is removed if all vCPUs are idle and restarted again once - * idleness is complete. - */ - -static QEMUTimer *tcg_kick_vcpu_timer; -static CPUState *tcg_current_rr_cpu; - -#define TCG_KICK_PERIOD (NANOSECONDS_PER_SECOND / 10) - -static inline int64_t qemu_tcg_next_kick(void) -{ - return qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + TCG_KICK_PERIOD; -} - -/* Kick the currently round-robin scheduled vCPU to next */ -static void qemu_cpu_kick_rr_next_cpu(void) -{ - CPUState *cpu; - do { - cpu = atomic_mb_read(&tcg_current_rr_cpu); - if (cpu) { - cpu_exit(cpu); - } - } while (cpu != atomic_mb_read(&tcg_current_rr_cpu)); -} - -/* Kick all RR vCPUs */ -static void qemu_cpu_kick_rr_cpus(void) -{ - CPUState *cpu; - - CPU_FOREACH(cpu) { - cpu_exit(cpu); - }; -} - -static void kick_tcg_thread(void *opaque) -{ - timer_mod(tcg_kick_vcpu_timer, qemu_tcg_next_kick()); - qemu_cpu_kick_rr_next_cpu(); -} - -static void start_tcg_kick_timer(void) -{ - assert(!mttcg_enabled); - if (!tcg_kick_vcpu_timer && CPU_NEXT(first_cpu)) { - tcg_kick_vcpu_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, - kick_tcg_thread, NULL); - } - if (tcg_kick_vcpu_timer && !timer_pending(tcg_kick_vcpu_timer)) { - timer_mod(tcg_kick_vcpu_timer, qemu_tcg_next_kick()); - } -} - -static void stop_tcg_kick_timer(void) -{ - assert(!mttcg_enabled); - if (tcg_kick_vcpu_timer && timer_pending(tcg_kick_vcpu_timer)) { - timer_del(tcg_kick_vcpu_timer); - } -} - /***********************************************************/ void hw_error(const char *fmt, ...) { @@ -328,9 +247,7 @@ int64_t cpus_get_virtual_clock(void) if (cpus_accel && cpus_accel->get_virtual_clock) { return cpus_accel->get_virtual_clock(); } - if (icount_enabled()) { - return icount_get(); - } else if (qtest_enabled()) { /* for qtest_clock_warp */ + if (qtest_enabled()) { /* for qtest_clock_warp */ return qtest_get_virtual_clock(); } return cpu_get_clock(); @@ -338,7 +255,7 @@ int64_t cpus_get_virtual_clock(void) /* * return the time elapsed in VM between vm_start and vm_stop. Unless - * icount is active, cpu_get_ticks() uses units of the host CPU cycle + * icount is active, cpus_get_elapsed_ticks() uses units of the host CPU cycle * counter. */ int64_t cpus_get_elapsed_ticks(void) @@ -346,9 +263,6 @@ int64_t cpus_get_elapsed_ticks(void) if (cpus_accel && cpus_accel->get_elapsed_ticks) { return cpus_accel->get_elapsed_ticks(); } - if (icount_enabled()) { - return icount_get(); - } return cpu_get_ticks(); } @@ -482,10 +396,6 @@ static void qemu_kvm_destroy_vcpu(CPUState *cpu) } } -static void qemu_tcg_destroy_vcpu(CPUState *cpu) -{ -} - static void qemu_cpu_stop(CPUState *cpu, bool exit) { g_assert(qemu_cpu_is_self(cpu)); @@ -506,22 +416,6 @@ void qemu_wait_io_event_common(CPUState *cpu) process_queued_cpu_work(cpu); } -static void qemu_tcg_rr_wait_io_event(void) -{ - CPUState *cpu; - - while (all_cpu_threads_idle()) { - stop_tcg_kick_timer(); - qemu_cond_wait(first_cpu->halt_cond, &qemu_global_mutex); - } - - start_tcg_kick_timer(); - - CPU_FOREACH(cpu) { - qemu_wait_io_event_common(cpu); - } -} - void qemu_wait_io_event(CPUState *cpu) { bool slept = false; @@ -538,7 +432,7 @@ void qemu_wait_io_event(CPUState *cpu) } #ifdef _WIN32 - /* Eat dummy APC queued by qemu_cpu_kick_thread. */ + /* Eat dummy APC queued by cpus_kick_thread */ /* NB!!! Should not this be if (hax_enabled)? Is this wrong for whpx? */ if (!tcg_enabled()) { SleepEx(0, TRUE); @@ -634,259 +528,6 @@ static void *qemu_dummy_cpu_thread_fn(void *arg) #endif } -static int64_t tcg_get_icount_limit(void) -{ - int64_t deadline; - - if (replay_mode != REPLAY_MODE_PLAY) { - /* - * Include all the timers, because they may need an attention. - * Too long CPU execution may create unnecessary delay in UI. - */ - deadline = qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL, - QEMU_TIMER_ATTR_ALL); - /* Check realtime timers, because they help with input processing */ - deadline = qemu_soonest_timeout(deadline, - qemu_clock_deadline_ns_all(QEMU_CLOCK_REALTIME, - QEMU_TIMER_ATTR_ALL)); - - /* Maintain prior (possibly buggy) behaviour where if no deadline - * was set (as there is no QEMU_CLOCK_VIRTUAL timer) or it is more than - * INT32_MAX nanoseconds ahead, we still use INT32_MAX - * nanoseconds. - */ - if ((deadline < 0) || (deadline > INT32_MAX)) { - deadline = INT32_MAX; - } - - return icount_round(deadline); - } else { - return replay_get_instructions(); - } -} - -static void notify_aio_contexts(void) -{ - /* Wake up other AioContexts. */ - qemu_clock_notify(QEMU_CLOCK_VIRTUAL); - qemu_clock_run_timers(QEMU_CLOCK_VIRTUAL); -} - -static void handle_icount_deadline(void) -{ - assert(qemu_in_vcpu_thread()); - if (icount_enabled()) { - int64_t deadline = qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL, - QEMU_TIMER_ATTR_ALL); - - if (deadline == 0) { - notify_aio_contexts(); - } - } -} - -static void prepare_icount_for_run(CPUState *cpu) -{ - if (icount_enabled()) { - int insns_left; - - /* These should always be cleared by process_icount_data after - * each vCPU execution. However u16.high can be raised - * asynchronously by cpu_exit/cpu_interrupt/tcg_handle_interrupt - */ - g_assert(cpu_neg(cpu)->icount_decr.u16.low == 0); - g_assert(cpu->icount_extra == 0); - - cpu->icount_budget = tcg_get_icount_limit(); - insns_left = MIN(0xffff, cpu->icount_budget); - cpu_neg(cpu)->icount_decr.u16.low = insns_left; - cpu->icount_extra = cpu->icount_budget - insns_left; - - replay_mutex_lock(); - - if (cpu->icount_budget == 0 && replay_has_checkpoint()) { - notify_aio_contexts(); - } - } -} - -static void process_icount_data(CPUState *cpu) -{ - if (icount_enabled()) { - /* Account for executed instructions */ - icount_update(cpu); - - /* Reset the counters */ - cpu_neg(cpu)->icount_decr.u16.low = 0; - cpu->icount_extra = 0; - cpu->icount_budget = 0; - - replay_account_executed_instructions(); - - replay_mutex_unlock(); - } -} - - -static int tcg_cpu_exec(CPUState *cpu) -{ - int ret; -#ifdef CONFIG_PROFILER - int64_t ti; -#endif - - assert(tcg_enabled()); -#ifdef CONFIG_PROFILER - ti = profile_getclock(); -#endif - cpu_exec_start(cpu); - ret = cpu_exec(cpu); - cpu_exec_end(cpu); -#ifdef CONFIG_PROFILER - atomic_set(&tcg_ctx->prof.cpu_exec_time, - tcg_ctx->prof.cpu_exec_time + profile_getclock() - ti); -#endif - return ret; -} - -/* Destroy any remaining vCPUs which have been unplugged and have - * finished running - */ -static void deal_with_unplugged_cpus(void) -{ - CPUState *cpu; - - CPU_FOREACH(cpu) { - if (cpu->unplug && !cpu_can_run(cpu)) { - qemu_tcg_destroy_vcpu(cpu); - cpu_thread_signal_destroyed(cpu); - break; - } - } -} - -/* Single-threaded TCG - * - * In the single-threaded case each vCPU is simulated in turn. If - * there is more than a single vCPU we create a simple timer to kick - * the vCPU and ensure we don't get stuck in a tight loop in one vCPU. - * This is done explicitly rather than relying on side-effects - * elsewhere. - */ - -static void *qemu_tcg_rr_cpu_thread_fn(void *arg) -{ - CPUState *cpu = arg; - - assert(tcg_enabled()); - rcu_register_thread(); - tcg_register_thread(); - - qemu_mutex_lock_iothread(); - qemu_thread_get_self(cpu->thread); - - cpu->thread_id = qemu_get_thread_id(); - cpu->can_do_io = 1; - cpu_thread_signal_created(cpu); - qemu_guest_random_seed_thread_part2(cpu->random_seed); - - /* wait for initial kick-off after machine start */ - while (first_cpu->stopped) { - qemu_cond_wait(first_cpu->halt_cond, &qemu_global_mutex); - - /* process any pending work */ - CPU_FOREACH(cpu) { - current_cpu = cpu; - qemu_wait_io_event_common(cpu); - } - } - - start_tcg_kick_timer(); - - cpu = first_cpu; - - /* process any pending work */ - cpu->exit_request = 1; - - while (1) { - qemu_mutex_unlock_iothread(); - replay_mutex_lock(); - qemu_mutex_lock_iothread(); - /* Account partial waits to QEMU_CLOCK_VIRTUAL. */ - icount_account_warp_timer(); - - /* Run the timers here. This is much more efficient than - * waking up the I/O thread and waiting for completion. - */ - handle_icount_deadline(); - - replay_mutex_unlock(); - - if (!cpu) { - cpu = first_cpu; - } - - while (cpu && cpu_work_list_empty(cpu) && !cpu->exit_request) { - - atomic_mb_set(&tcg_current_rr_cpu, cpu); - current_cpu = cpu; - - qemu_clock_enable(QEMU_CLOCK_VIRTUAL, - (cpu->singlestep_enabled & SSTEP_NOTIMER) == 0); - - if (cpu_can_run(cpu)) { - int r; - - qemu_mutex_unlock_iothread(); - prepare_icount_for_run(cpu); - - r = tcg_cpu_exec(cpu); - - process_icount_data(cpu); - qemu_mutex_lock_iothread(); - - if (r == EXCP_DEBUG) { - cpu_handle_guest_debug(cpu); - break; - } else if (r == EXCP_ATOMIC) { - qemu_mutex_unlock_iothread(); - cpu_exec_step_atomic(cpu); - qemu_mutex_lock_iothread(); - break; - } - } else if (cpu->stop) { - if (cpu->unplug) { - cpu = CPU_NEXT(cpu); - } - break; - } - - cpu = CPU_NEXT(cpu); - } /* while (cpu && !cpu->exit_request).. */ - - /* Does not need atomic_mb_set because a spurious wakeup is okay. */ - atomic_set(&tcg_current_rr_cpu, NULL); - - if (cpu && cpu->exit_request) { - atomic_mb_set(&cpu->exit_request, 0); - } - - if (icount_enabled() && all_cpu_threads_idle()) { - /* - * When all cpus are sleeping (e.g in WFI), to avoid a deadlock - * in the main_loop, wake it up in order to start the warp timer. - */ - qemu_notify_event(); - } - - qemu_tcg_rr_wait_io_event(); - deal_with_unplugged_cpus(); - } - - rcu_unregister_thread(); - return NULL; -} - static void *qemu_hax_cpu_thread_fn(void *arg) { CPUState *cpu = arg; @@ -1006,76 +647,6 @@ static void CALLBACK dummy_apc_func(ULONG_PTR unused) } #endif -/* Multi-threaded TCG - * - * In the multi-threaded case each vCPU has its own thread. The TLS - * variable current_cpu can be used deep in the code to find the - * current CPUState for a given thread. - */ - -static void *qemu_tcg_cpu_thread_fn(void *arg) -{ - CPUState *cpu = arg; - - assert(tcg_enabled()); - g_assert(!icount_enabled()); - - rcu_register_thread(); - tcg_register_thread(); - - qemu_mutex_lock_iothread(); - qemu_thread_get_self(cpu->thread); - - cpu->thread_id = qemu_get_thread_id(); - cpu->can_do_io = 1; - current_cpu = cpu; - cpu_thread_signal_created(cpu); - qemu_guest_random_seed_thread_part2(cpu->random_seed); - - /* process any pending work */ - cpu->exit_request = 1; - - do { - if (cpu_can_run(cpu)) { - int r; - qemu_mutex_unlock_iothread(); - r = tcg_cpu_exec(cpu); - qemu_mutex_lock_iothread(); - switch (r) { - case EXCP_DEBUG: - cpu_handle_guest_debug(cpu); - break; - case EXCP_HALTED: - /* during start-up the vCPU is reset and the thread is - * kicked several times. If we don't ensure we go back - * to sleep in the halted state we won't cleanly - * start-up when the vCPU is enabled. - * - * cpu->halted should ensure we sleep in wait_io_event - */ - g_assert(cpu->halted); - break; - case EXCP_ATOMIC: - qemu_mutex_unlock_iothread(); - cpu_exec_step_atomic(cpu); - qemu_mutex_lock_iothread(); - default: - /* Ignore everything else? */ - break; - } - } - - atomic_mb_set(&cpu->exit_request, 0); - qemu_wait_io_event(cpu); - } while (!cpu->unplug || cpu_can_run(cpu)); - - qemu_tcg_destroy_vcpu(cpu); - cpu_thread_signal_destroyed(cpu); - qemu_mutex_unlock_iothread(); - rcu_unregister_thread(); - return NULL; -} - void cpus_kick_thread(CPUState *cpu) { #ifndef _WIN32 @@ -1106,15 +677,8 @@ void cpus_kick_thread(CPUState *cpu) void qemu_cpu_kick(CPUState *cpu) { qemu_cond_broadcast(cpu->halt_cond); - if (cpus_accel && cpus_accel->kick_vcpu_thread) { cpus_accel->kick_vcpu_thread(cpu); - } else if (tcg_enabled()) { - if (qemu_tcg_mttcg_enabled()) { - cpu_exit(cpu); - } else { - qemu_cpu_kick_rr_cpus(); - } } else { if (hax_enabled()) { /* @@ -1270,62 +834,6 @@ void cpu_remove_sync(CPUState *cpu) qemu_mutex_lock_iothread(); } -static void qemu_tcg_init_vcpu(CPUState *cpu) -{ - char thread_name[VCPU_THREAD_NAME_SIZE]; - static QemuCond *single_tcg_halt_cond; - static QemuThread *single_tcg_cpu_thread; - static int tcg_region_inited; - - assert(tcg_enabled()); - /* - * Initialize TCG regions--once. Now is a good time, because: - * (1) TCG's init context, prologue and target globals have been set up. - * (2) qemu_tcg_mttcg_enabled() works now (TCG init code runs before the - * -accel flag is processed, so the check doesn't work then). - */ - if (!tcg_region_inited) { - tcg_region_inited = 1; - tcg_region_init(); - } - - if (qemu_tcg_mttcg_enabled() || !single_tcg_cpu_thread) { - cpu->thread = g_malloc0(sizeof(QemuThread)); - cpu->halt_cond = g_malloc0(sizeof(QemuCond)); - qemu_cond_init(cpu->halt_cond); - - if (qemu_tcg_mttcg_enabled()) { - /* create a thread per vCPU with TCG (MTTCG) */ - parallel_cpus = true; - snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/TCG", - cpu->cpu_index); - - qemu_thread_create(cpu->thread, thread_name, qemu_tcg_cpu_thread_fn, - cpu, QEMU_THREAD_JOINABLE); - - } else { - /* share a single thread for all cpus with TCG */ - snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "ALL CPUs/TCG"); - qemu_thread_create(cpu->thread, thread_name, - qemu_tcg_rr_cpu_thread_fn, - cpu, QEMU_THREAD_JOINABLE); - - single_tcg_halt_cond = cpu->halt_cond; - single_tcg_cpu_thread = cpu->thread; - } -#ifdef _WIN32 - cpu->hThread = qemu_thread_get_handle(cpu->thread); -#endif - } else { - /* For non-MTTCG cases we share the thread */ - cpu->thread = single_tcg_cpu_thread; - cpu->halt_cond = single_tcg_halt_cond; - cpu->thread_id = first_cpu->thread_id; - cpu->can_do_io = 1; - cpu->created = true; - } -} - static void qemu_hax_start_vcpu(CPUState *cpu) { char thread_name[VCPU_THREAD_NAME_SIZE]; @@ -1436,8 +944,6 @@ void qemu_init_vcpu(CPUState *cpu) qemu_hax_start_vcpu(cpu); } else if (hvf_enabled()) { qemu_hvf_start_vcpu(cpu); - } else if (tcg_enabled()) { - qemu_tcg_init_vcpu(cpu); } else if (whpx_enabled()) { qemu_whpx_start_vcpu(cpu); } else {