From patchwork Mon Sep 7 08:40:20 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Pavel Dovgalyuk
X-Patchwork-Id: 515018
To: qemu-devel@nongnu.org
From: Pavel Dovgalyuk
Date: Mon, 07 Sep 2015 11:40:20 +0300
Message-ID: <20150907084019.1664.86148.stgit@PASHA-ISP>
In-Reply-To: <20150907084005.1664.19540.stgit@PASHA-ISP>
References: <20150907084005.1664.19540.stgit@PASHA-ISP>
User-Agent: StGit/0.16
Cc: edgar.iglesias@xilinx.com, peter.maydell@linaro.org, igor.rubinov@gmail.com,
 alex.bennee@linaro.org, mark.burton@greensocs.com, real@ispras.ru,
 batuzovk@ispras.ru, maria.klimushenkova@ispras.ru, pavel.dovgaluk@ispras.ru,
 pbonzini@redhat.com, hines@cert.org, fred.konrad@greensocs.com
Subject: [Qemu-devel] [PATCH v17 02/21] replay: global variables and function stubs

This patch adds global variables, defines, function declarations,
and function stubs for deterministic VM replay used by external modules.

Reviewed-by: Paolo Bonzini
Reviewed-by: Eric Blake
Signed-off-by: Pavel Dovgalyuk
---
 Makefile.target      |    1
 docs/replay.txt      |  168 ++++++++++++++++++++++++++++++++++++++++++++++++++
 qapi-schema.json     |   18 +++++
 replay/Makefile.objs |    2 +
 replay/replay.c      |   14 ++++
 replay/replay.h      |   19 ++++++
 stubs/Makefile.objs  |    1
 stubs/replay.c       |    3 +
 8 files changed, 226 insertions(+), 0 deletions(-)
 create mode 100755 docs/replay.txt
 create mode 100755 replay/Makefile.objs
 create mode 100755 replay/replay.c
 create mode 100755 replay/replay.h
 create mode 100755 stubs/replay.c

diff --git a/Makefile.target b/Makefile.target
index 3e7aafd..e149ec9 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -86,6 +86,7 @@ all: $(PROGS) stap
 # cpu emulator library
 obj-y = exec.o translate-all.o cpu-exec.o
 obj-y += tcg/tcg.o tcg/tcg-op.o tcg/optimize.o
+obj-y += replay/
 obj-$(CONFIG_TCG_INTERPRETER) += tci.o
 obj-$(CONFIG_TCG_INTERPRETER) += disas/tci.o
 obj-y += fpu/softfloat.o
diff --git a/docs/replay.txt b/docs/replay.txt
new file mode 100755
index 0000000..645d462
--- /dev/null
+++ b/docs/replay.txt
@@ -0,0 +1,168 @@
+Copyright (c) 2010-2015 Institute for System Programming
+                        of the Russian Academy of Sciences.
+
+This work is licensed under the terms of the GNU GPL, version 2 or later.
+See the COPYING file in the top-level directory.
+
+Record/replay
+-------------
+
+Record/replay functions are used for reverse execution and deterministic
+replay of QEMU execution. This implementation of deterministic replay can
+be used for deterministic debugging of guest code through a gdb remote
+interface.
+
+Execution recording writes a log of non-deterministic events, which can later
+be used to replay the execution anywhere and an unlimited number of times.
+It also supports checkpointing for faster rewinding during reverse debugging.
+Execution replaying reads the log and replays all non-deterministic events
+including external input, hardware clocks, and interrupts.
+
+Deterministic replay has the following features:
+ * Deterministically replays the whole system execution and all contents of
+   memory, the state of the hardware devices, clocks, and the screen of the VM.
+ * Writes the execution log into a file so that it can be replayed many times
+   on different machines.
+ * Supports i386, x86_64, and ARM hardware platforms.
+ * Performs deterministic replay of all operations with keyboard and mouse
+   input devices.
+
+Usage of the record/replay:
+ * First, record the execution by adding the following arguments to the command line:
+   '-icount shift=7,rr=record,rrfile=replay.bin -net none'.
+   Block device images are not actually changed in recording mode,
+   because all of the changes are written to a temporary overlay file.
+ * Then replay the execution by using another set of command
+   line options: '-icount shift=7,rr=replay,rrfile=replay.bin -net none'
+ * The '-net none' option should also be specified if the network replay
+   patches are not applied.
+
+Papers describing the deterministic replay implementation:
+http://www.computer.org/csdl/proceedings/csmr/2012/4666/00/4666a553-abs.html
+http://dl.acm.org/citation.cfm?id=2786805.2803179
+
+Modifications of QEMU include:
+ * wrappers for clock and time functions to save their return values in the log
+ * saving different asynchronous events (e.g. system shutdown) into the log
+ * synchronization of bottom halves execution
+ * synchronization of the threads from the thread pool
+ * recording/replaying user input (mouse and keyboard)
+ * adding internal checkpoints for cpu and io synchronization
+
+Non-deterministic events
+------------------------
+
+Our record/replay system is based on saving and replaying non-deterministic
+events (e.g. keyboard input) and simulating deterministic ones (e.g. reading
+from HDD or memory of the VM). Saving only non-deterministic events makes
+the log file smaller and the simulation faster, and allows reverse debugging
+even for realtime applications.
+
+The following non-deterministic data from peripheral devices is saved into
+the log: mouse and keyboard input, network packets, audio controller input,
+USB packets, serial port input, and hardware clocks (they are non-deterministic
+too, because their values are taken from the host machine). Inputs from
+simulated hardware, memory of the VM, software interrupts, and execution of
+instructions are not saved into the log, because they are deterministic and
+can be replayed by simulating the behavior of the virtual machine starting
+from the initial state.
+
+We had to solve three tasks to implement deterministic replay: recording
+non-deterministic events, replaying non-deterministic events, and checking
+that there is no divergence between record and replay modes.
+
+We changed several parts of QEMU to support event log recording and replaying.
+Device models that have non-deterministic input from external devices were
+changed to write every external event into the execution log immediately.
+E.g. network packets are written into the log when they arrive at the virtual
+network adapter.
+
+All non-deterministic events come from these devices. But to
+replay them we need to know at which moments they occur. We specify
+these moments by counting the number of instructions executed between
+every pair of consecutive events.
+
+Instruction counting
+--------------------
+
+QEMU must run in icount mode to use the record/replay feature. icount was
+designed to allow deterministic execution in the absence of external inputs
+to the virtual machine. We also use icount to control the occurrence of the
+non-deterministic events. The number of instructions elapsed since the last
+event is written to the log while recording the execution. In replay mode we
+can predict when to inject that event using the instruction counter.
+
+Timers
+------
+
+Timers are used to execute callbacks from different subsystems of QEMU
+at the specified moments of time. There are several kinds of timers:
+ * Real time clock. Based on host time and used only for callbacks that
+   do not change the virtual machine state. For this reason the real time
+   clock and its timers do not affect deterministic replay at all.
+ * Virtual clock. These timers run only during the emulation. In icount mode
+   the virtual clock value is calculated from the executed instructions counter.
+   That is why it is completely deterministic and does not have to be recorded.
+ * Host clock. This clock is used by device models that simulate real time
+   sources (e.g. real time clock chip). The host clock is one of the sources
+   of non-determinism. Host clock read operations should be logged to
+   make the execution deterministic.
+ * Real time clock for icount. This clock is similar to the real time clock but
+   it is used only for advancing the virtual clock while the virtual machine is
+   sleeping. Like the host clock, it is non-deterministic by nature
+   and has to be logged too.
+
+Checkpoints
+-----------
+
+Replaying the execution of a virtual machine is bound by sources of
+non-determinism. These are inputs from clocks and peripheral devices,
+and QEMU thread scheduling. Thread scheduling affects the processing of events
+from timers, asynchronous input-output, and bottom halves.
+
+Invocations of timers are coupled with clock reads and with changes to the
+state of the virtual machine. Reads produce non-deterministic data taken from
+the host clock. VM state changes should preserve their order: their relative
+order in replay mode must replicate the order of callbacks in record mode.
+To preserve this order we use checkpoints. When a specific clock is processed
+in record mode we save a special "checkpoint" event to the log.
+Checkpoints here do not refer to virtual machine snapshots. They are just
+record/replay events used for synchronization.
+
+QEMU in replay mode will try to invoke timer processing at an arbitrary moment
+of time. That's why we do not process a group of timers until the checkpoint
+event is read from the log. Such an event allows synchronizing CPU
+execution and timer events.
+
+Another application of checkpoints in record/replay is instruction counting
+while the virtual machine is idle. This function (qemu_clock_warp) is called
+from the wait loop. It changes the virtual machine state and therefore must be
+deterministic. That is why we added a checkpoint to this function to prevent
+its operation in replay mode when it does not correspond to record mode.
+
+Bottom halves
+-------------
+
+Disk I/O events are completely deterministic in our model, because
+in both record and replay modes we start the virtual machine from the same
+disk state. But callbacks that the virtual disk controller uses for reading and
+writing the disk may occur at different moments of time in record and replay
+modes.
+
+Reading and writing requests are created by the CPU thread of QEMU. Later these
+requests proceed to the block layer, which creates "bottom halves". A bottom
+half consists of a callback and its parameters. Bottom halves are processed when
+the main loop locks the global mutex. These locks are not synchronized with the
+replaying process because the main loop also processes events that do not
+affect the virtual machine state (like user interaction with the monitor).
+
+That is why we had to implement saving and replaying of bottom half callbacks
+synchronously with the CPU execution. When a callback is about to execute
+it is added to the queue in the replay module. This queue is written to the
+log when its callbacks are executed. In replay mode callbacks are not processed
+until the corresponding event is read from the events log file.
+
+Sometimes the block layer uses asynchronous callbacks for its internal purposes
+(like reading or writing VM snapshots or disk image cluster tables). In this
+case bottom halves are not marked as "replayable" and are not saved
+into the log.
diff --git a/qapi-schema.json b/qapi-schema.json
index 4342a08..563321c 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -3794,3 +3794,21 @@
 # Rocker ethernet network switch
 { 'include': 'qapi/rocker.json' }
+
+##
+# ReplayMode:
+#
+# Mode of the replay subsystem.
+#
+# @none: normal execution mode. Replay or record are not enabled.
+#
+# @record: record mode. All non-deterministic data is written into the
+#          replay log.
+#
+# @play: replay mode. Non-deterministic data required for system execution
+#        is read from the log.
+#
+# Since: 2.5
+##
+{ 'enum': 'ReplayMode',
+  'data': [ 'none', 'record', 'play' ] }
diff --git a/replay/Makefile.objs b/replay/Makefile.objs
new file mode 100755
index 0000000..0b9cb99
--- /dev/null
+++ b/replay/Makefile.objs
@@ -0,0 +1,2 @@
+obj-$(CONFIG_SOFTMMU) += replay.o
+obj-$(CONFIG_USER_ONLY) += replay-user.o
diff --git a/replay/replay.c b/replay/replay.c
new file mode 100755
index 0000000..5ce066f
--- /dev/null
+++ b/replay/replay.c
@@ -0,0 +1,14 @@
+/*
+ * replay.c
+ *
+ * Copyright (c) 2010-2015 Institute for System Programming
+ *                         of the Russian Academy of Sciences.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "replay.h"
+
+ReplayMode replay_mode = REPLAY_MODE_NONE;
diff --git a/replay/replay.h b/replay/replay.h
new file mode 100755
index 0000000..d6b73c3
--- /dev/null
+++ b/replay/replay.h
@@ -0,0 +1,19 @@
+#ifndef REPLAY_H
+#define REPLAY_H
+
+/*
+ * replay.h
+ *
+ * Copyright (c) 2010-2015 Institute for System Programming
+ *                         of the Russian Academy of Sciences.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qapi-types.h"
+
+extern ReplayMode replay_mode;
+
+#endif
diff --git a/stubs/Makefile.objs b/stubs/Makefile.objs
index 9937a12..200adc9 100644
--- a/stubs/Makefile.objs
+++ b/stubs/Makefile.objs
@@ -25,6 +25,7 @@ stub-obj-y += monitor-init.o
 stub-obj-y += notify-event.o
 stub-obj-$(CONFIG_SPICE) += qemu-chr-open-spice.o
 stub-obj-y += qtest.o
+stub-obj-y += replay.o
 stub-obj-y += reset.o
 stub-obj-y += runstate-check.o
 stub-obj-y += set-fd-handler.o
diff --git a/stubs/replay.c b/stubs/replay.c
new file mode 100755
index 0000000..563c777
--- /dev/null
+++ b/stubs/replay.c
@@ -0,0 +1,3 @@
+#include "replay/replay.h"
+
+ReplayMode replay_mode;
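
Illustration only, not part of the patch: the instruction-counting scheme
that docs/replay.txt describes (log the number of instructions executed since
the previous event while recording, then inject each event at the same
instruction count while replaying) can be sketched roughly as follows. All
Sketch* types and sketch_* functions are hypothetical and do not exist in
QEMU. The ReplayMode enum is redeclared locally so that the example is
self-contained; only REPLAY_MODE_NONE appears in this patch, and the names
REPLAY_MODE_RECORD and REPLAY_MODE_PLAY are assumed to follow the same
generated naming. The in-memory log is a stand-in, not the real log format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for the QAPI-generated enum; only REPLAY_MODE_NONE is
 * confirmed by the patch, the other two names are assumptions. */
typedef enum {
    REPLAY_MODE_NONE,
    REPLAY_MODE_RECORD,
    REPLAY_MODE_PLAY
} ReplayMode;

/* Hypothetical in-memory log entry, not QEMU's on-disk format. */
typedef struct {
    uint64_t icount_delta; /* instructions executed since the previous event */
    int value;             /* the non-deterministic input itself */
} SketchEvent;

#define SKETCH_LOG_MAX 16

typedef struct {
    ReplayMode mode;
    SketchEvent log[SKETCH_LOG_MAX];
    size_t len;          /* number of events in the log */
    size_t next;         /* next event to inject in play mode */
    uint64_t icount;     /* instructions executed so far */
    uint64_t last_event; /* value of icount at the previous event */
} SketchReplay;

/* Advance the instruction counter, as the CPU loop would. */
static void sketch_exec(SketchReplay *r, uint64_t insns)
{
    r->icount += insns;
}

/* Record mode: log an external input together with the number of
 * instructions executed since the previous event.
 * Returns 0 on success, -1 if not recording or if the log is full. */
static int sketch_record(SketchReplay *r, int value)
{
    if (r->mode != REPLAY_MODE_RECORD || r->len == SKETCH_LOG_MAX) {
        return -1;
    }
    r->log[r->len].icount_delta = r->icount - r->last_event;
    r->log[r->len].value = value;
    r->len++;
    r->last_event = r->icount;
    return 0;
}

/* Restart execution from the initial state and replay the recorded log. */
static void sketch_start_replay(SketchReplay *r)
{
    r->mode = REPLAY_MODE_PLAY;
    r->next = 0;
    r->icount = 0;
    r->last_event = 0;
}

/* Play mode: returns 1 and stores the input in *value when the next logged
 * event is due at the current instruction count, 0 when nothing is due
 * (too early, log exhausted, or not in play mode), and -1 when execution
 * has already run past the event, i.e. replay has diverged. */
static int sketch_replay_poll(SketchReplay *r, int *value)
{
    uint64_t elapsed;

    assert(value != NULL);
    if (r->mode != REPLAY_MODE_PLAY || r->next == r->len) {
        return 0;
    }
    elapsed = r->icount - r->last_event;
    if (elapsed < r->log[r->next].icount_delta) {
        return 0;
    }
    if (elapsed > r->log[r->next].icount_delta) {
        return -1;
    }
    *value = r->log[r->next].value;
    r->last_event = r->icount;
    r->next++;
    return 1;
}
```

A caller would run sketch_exec() and sketch_record() during recording, call
sketch_start_replay(), and then poll after each executed block. The sketch
reports an overshoot as divergence rather than injecting late, which mirrors
the third task named in the documentation (checking that record and replay do
not diverge); how the real implementation bounds execution so that it stops
exactly at an event is outside the scope of this patch.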