From patchwork Fri Sep 21 14:07:40 2012
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Michael Roth
X-Patchwork-Id: 185810
From: Michael Roth
To: qemu-devel@nongnu.org
Cc: blauwirbel@gmail.com, peter.maydell@linaro.org, aliguori@us.ibm.com, eblake@redhat.com
Date: Fri, 21 Sep 2012 09:07:40 -0500
Subject: [Qemu-devel] [PATCH 17/22] qidl: add documentation
Message-Id: <1348236465-23124-18-git-send-email-mdroth@linux.vnet.ibm.com>
In-Reply-To: <1348236465-23124-1-git-send-email-mdroth@linux.vnet.ibm.com>
References: <1348236465-23124-1-git-send-email-mdroth@linux.vnet.ibm.com>
X-Mailer: git-send-email 1.7.9.5

Signed-off-by: Michael Roth
---
 docs/qidl.txt | 347 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 347 insertions(+)
 create mode 100644 docs/qidl.txt

diff --git a/docs/qidl.txt b/docs/qidl.txt
new file mode 100644
index 0000000..1cbf21f
--- /dev/null
+++ b/docs/qidl.txt
@@ -0,0 +1,347 @@
+How to Serialize Device State with QIDL
+=======================================
+
+This document describes how to implement save/restore of a device in QEMU using
+the QIDL compiler. The QIDL compiler makes it easier to support live
+migration in devices by converging the serialization description with the
+device type declaration. It has the following features:
+
+ 1. Single description of device state and how to serialize it
+
+ 2. Fully inclusive serialization description--fields that aren't serialized
+    are explicitly marked as such, including the reason why.
+
+ 3. Optimized for the common case. Even without any special annotations,
+    many devices will Just Work out of the box.
+
+ 4. Build time schema definition. Since QIDL runs at build time, we have full
+    access to the schema during the build, which means we can fail the build if
+    the schema breaks.
+
+For the rest of the document, the following simple device will be used as an
+example.
+
+    typedef struct SerialDevice {
+        SysBusDevice parent;
+
+        uint8_t thr; /* transmit holding register */
+        uint8_t lsr; /* line status register */
+        uint8_t ier; /* interrupt enable register */
+
+        int int_pending; /* whether we have a pending queued interrupt */
+        CharDriverState *chr; /* backend */
+    } SerialDevice;
+
+Getting Started
+---------------
+
+Converting a device struct to being serializable by QIDL, in general, only
+requires the use of the QIDL_DECLARE() macro to handle the declaration.
+The above-mentioned SerialDevice struct, for instance, can be made
+serializable simply by using the following declaration format:
+
+    typedef struct SerialDevice SerialDevice;
+
+    QIDL_DECLARE(SerialDevice) {
+        SysBusDevice parent;
+
+        uint8_t thr; /* transmit holding register */
+        uint8_t lsr; /* line status register */
+        uint8_t ier; /* interrupt enable register */
+
+        int int_pending; /* whether we have a pending queued interrupt */
+        CharDriverState *chr; /* backend */
+    };
+
+Note that the typedef is required, and must be done in advance of the actual
+struct declaration.
+
+Specifying What/How State Gets Saved
+------------------------------------
+
+By default, QIDL saves every field in a structure it sees. This provides
+maximum correctness by default. However, device structures generally contain
+state that reflects state that is in some way duplicated or not guest visible.
+This more often than not reflects design implementation details.
+
+Since design implementation details change over time, saving this state makes
+compatibility hard to maintain. The proper solution is to use an intermediate
+protocol to handle cross-version compatibility (for instance, a QIDL-aware
+implementation of VMState). But we can reduce churn and streamline the
+serialization/deserialization process by explicitly marking fields with
+information that QIDL can use to determine whether or not a particular field
+will be serialized. However, a serializable device implementation that fails to
+serialize state that is required to fully restore guest state is a broken one,
+so to avoid that there are very strict rules about when this is allowed, and
+what needs to be done to ensure that this does not impact correctness.
+
+There are also occasions where we want to specify *how* a field is
+serialized. Array fields, for instance, might rely on a size value elsewhere in
+the struct to determine the size of a dynamically-allocated array or, in the
+case of a statically-allocated array, the number of elements that have
+actually been set/initialized. There are also cases where a field should only be
+serialized if another value in the struct, or even a function call, indicates
+that the field has been initialized. Markers can be used to handle these
+types of cases as well.
+
+What follows is a description of these markers/annotations and how they are
+used.
+
+### qImmutable Fields
+
+If a field is only set during device construction, based on parameters passed to
+the device's constructor, then there is no need to save and restore this
+value. We call these fields immutable and we tell QIDL about this fact by using
+a **qImmutable** marker.
+
+In our *SerialDevice* example, the *CharDriverState* pointer reflects the host
+backend that we use to send serial output to the user. This is only assigned
+during device construction and never changes. This means we can add a
+**qImmutable** marker to it:
+
+    QIDL_DECLARE(SerialDevice) {
+        SysBusDevice parent;
+
+        uint8_t thr;
+        uint8_t lsr;
+        uint8_t ier;
+
+        int int_pending;
+        CharDriverState *chr qImmutable;
+    };
+
+When reviewing patches that make use of the **qImmutable** marker, the following
+guidelines should be followed to determine if the marker is being used
+correctly.
+
+ 1. Check to see if the field is assigned anywhere other than the device
+    initialization function.
+
+ 2. Check to see if any function is being called that modifies the state of the
+    field outside of the initialization function.
+
+It can be subtle whether a field is truly immutable. A good example is a
+*QEMUTimer*. Timers will usually have their timeout modified with a call to
+*qemu_mod_timer()* even though they are only assigned in the device
+initialization function.
+
+If the timer is always modified with a fixed value that is not dependent on
+guest state, then the timer is immutable since it's unaffected by the state of
+the guest.
+
+On the other hand, if the timer is modified based on guest state (such as a
+guest programmed time out), then the timer carries state. It may be necessary
+to save/restore the timer or mark it as **qDerived** and work with it
+accordingly.
+
+### qDerived Fields
+
+If a field is set based on some other field in the device's structure, then its
+value is derived. Since this is effectively duplicate state, we can avoid
+sending it and then recompute it when we need to. Derived state requires a bit
+more handling than immutable state.
+
+In our *SerialDevice* example, our *int_pending* flag is really derived from
+two pieces of state. It is set based on whether interrupts are enabled in the
+*ier* register and whether the *THRE* flag is not set in the *lsr*
+register.
+
+To mark a field as derived, use the **qDerived** marker. To update our
+example, we would do:
+
+    QIDL_DECLARE(SerialDevice) {
+        SysBusDevice parent;
+
+        uint8_t thr;
+        uint8_t lsr;
+        uint8_t ier;
+
+        int int_pending qDerived;
+        CharDriverState *chr qImmutable;
+    };
+
+There is one other critical step needed when marking a field as derived. A
+*post_load* function must be added that updates this field after loading the
+rest of the device state. This function is implemented in the device's source
+file, not in the QIDL header. Below is an example of what this function may do:
+
+    static void serial_post_load(SerialDevice *s)
+    {
+        s->int_pending = !(s->lsr & THRE) && (s->ier & INTE);
+    }
+
+When reviewing a patch that marks a field as *derived*, the following criteria
+should be used:
+
+ 1. Does the device have a post load function?
+
+ 2. Does the post load function assign a value to all of the derived fields?
+
+ 3. Are there any obvious places where a derived field is holding unique state?
+
+### qBroken Fields
+
+QEMU does migration with a lot of devices today. When applying this methodology
+to these devices, one will quickly discover that there are a lot of fields that
+are not being saved today that are not derived or immutable state.
+
+These are all bugs. It just so happens that these bugs are usually not very
+serious. In many cases, they cause small functionality glitches that so far
+have not created any problems.
+
+Consider our *SerialDevice* example. In QEMU's real *SerialState* device, the
+*thr* register is not saved, yet we have not marked it immutable or derived.
+
+The *thr* register is a temporary holding register that the next character to
+transmit is placed in while we wait for the next baud cycle. In QEMU, we
+emulate a very fast baud rate regardless of what the guest programs. This means
+that the contents of the *thr* register only matter for a very small period of
+time (measured in microseconds).
+
+The likelihood of a migration converging in that very small period of time when
+the *thr* register has a meaningful value is very small. Moreover, the worst
+thing that can happen by not saving this register is that we lose a byte in the
+data stream. Even if this has happened in practice, the chances of someone
+noticing this as a bug are pretty small.
+
+Nonetheless, this is a bug and needs to be eventually fixed. However, it would
+be very inconvenient to constantly break migration by fixing all of these bugs
+one-by-one. Instead, QIDL has a **qBroken** marker. This indicates that a field
+is not currently saved, but should be in the future.
+
+In general, qBroken markers should never be introduced in new code, and should
+be used instead as a development aid to avoid serialization issues while
+writing new device code.
+
+Below is an update of our example to reflect our real life serial device:
+
+    QIDL_DECLARE(SerialDevice) {
+        SysBusDevice parent;
+
+        uint8_t thr qBroken;
+        uint8_t lsr;
+        uint8_t ier;
+
+        int int_pending qDerived;
+        CharDriverState *chr qImmutable;
+    };
+
+When reviewing the use of the broken marker, the following things should be
+considered:
+
+ 1. What are the ramifications of not sending this data field?
+
+ 2. If not sending this data field can cause data corruption or very poor
+    behavior within the guest, the broken marker is not appropriate to use.
+
+ 3. Assigning a default value to a field can also be used to fix a broken field
+    without significantly impacting live migration compatibility.
+
+### qElsewhere Fields
+
+In some cases state is saved-off when serializing a separate device
+structure. For example, IDEState stores a reference to an IDEBus structure:
+
+    QIDL_DECLARE(IDEState) {
+        IDEBus *bus qElsewhere;
+        uint8_t unit;
+        ...
+    };
+
+However, IDEState is actually a member of IDEBus, so it would have already been
+serialized in the process of serializing IDEBus:
+
+    QIDL_DECLARE(IDEBus) {
+        BusState qbus;
+        IDEDevice *master;
+        IDEDevice *slave;
+        IDEState ifs[2];
+        ...
+    };
+
+To handle this case we've used the *qElsewhere* marker to note that the
+IDEBus* field in IDEState should not be saved since that is handled
+elsewhere.
+
+### qOptional Fields
+
+Some state is only serialized in certain circumstances. To handle these cases
+you can specify a **qOptional** marker, which will, for a particular field
+"fieldname", tell QIDL to reference the field "has_fieldname" (of type bool)
+in the same struct to determine whether or not to serialize "fieldname". For
+example, if the data field was optionally serialized, you could do the following:
+
+    QIDL_DECLARE(SerialFIFO) {
+        bool has_data;
+        uint8_t data[UART_FIFO_LENGTH] qOptional;
+        uint8_t count;
+        uint8_t itl;
+        uint8_t tail;
+        uint8_t head;
+    };
+
+Of course, your device code will need to be updated to set has_data when
+appropriate. If has_data is set based on guest state, then it must be
+serialized as well.
+
+Arrays
+------
+
+QIDL has support for multiple types of arrays. The following sections describe
+the different rules for arrays.
+
+Fixed Sized Arrays
+------------------
+
+A fixed sized array has a size that is known at build time. A typical example
+would be:
+
+    QIDL_DECLARE(SerialFIFO) {
+        uint8_t data[UART_FIFO_LENGTH];
+        uint8_t count;
+        uint8_t itl;
+        uint8_t tail;
+        uint8_t head;
+    };
+
+In this example, *data* is a fixed sized array. No special annotation is needed
+for QIDL to marshal this array correctly. The following guidelines apply to
+fixed sized arrays:
+
+ 1. The size of the array is part of the device ABI. It should not change
+    without renaming the field.
+
+Variable Sized, Fixed Capacity Arrays
+-------------------------------------
+
+Sometimes it's desirable to have a variable sized array. QIDL currently
+supports variable sized arrays provided that the maximum capacity is fixed and
+part of the device structure memory.
+
+A typical example would be a slightly modified version of our above example:
+
+    QIDL_DECLARE(SerialFIFO) {
+        uint8_t count;
+        uint8_t data[UART_FIFO_LENGTH] qSize(count);
+        uint8_t itl;
+        uint8_t tail;
+        uint8_t head;
+    };
+
+In this example, *data* is a variable sized array with a fixed capacity of
+*UART_FIFO_LENGTH*. When we serialize, we only want to serialize *count*
+members.
+
+The ABI implications of capacity are a bit more relaxed with variable sized
+arrays. In general, you can increase or decrease the capacity without breaking
+the ABI although you may cause some instances of migration to fail between
+versions of QEMU with different capacities.
+
+When reviewing variable sized, fixed capacity arrays, keep the following things
+in mind:
+
+ 1. The variable size must occur before the array element in the state
+    structure.
+
+ 2. The capacity can change without breaking the ABI, but care should be used
+    when making these types of changes.
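
To make the two review points on variable sized, fixed capacity arrays concrete, here is a hand-written sketch of what saving and loading such an array could look like. This is not the code QIDL generates and does not use any QIDL or QEMU API: the helpers fifo_save() and fifo_load(), the flat byte-buffer wire layout, and the capacity value of 16 are all assumptions made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define UART_FIFO_LENGTH 16 /* assumed capacity, for illustration only */

typedef struct SerialFIFO {
    uint8_t count;                  /* size field comes before the array */
    uint8_t data[UART_FIFO_LENGTH]; /* only data[0..count-1] is live */
    uint8_t itl;
    uint8_t tail;
    uint8_t head;
} SerialFIFO;

/*
 * Hypothetical helper: write count, then count array elements, then the
 * remaining scalar fields. Returns the number of bytes written, or 0 if
 * the buffer is too small.
 */
static size_t fifo_save(const SerialFIFO *f, uint8_t *buf, size_t len)
{
    size_t need = 1 + (size_t)f->count + 3;

    assert(f->count <= UART_FIFO_LENGTH); /* device invariant */
    if (len < need) {
        return 0;
    }
    buf[0] = f->count;
    memcpy(&buf[1], f->data, f->count);
    buf[1 + f->count] = f->itl;
    buf[2 + f->count] = f->tail;
    buf[3 + f->count] = f->head;
    return need;
}

/*
 * Hypothetical helper: returns 0 on success, or -1 if the stream is
 * truncated or carries more elements than this build's capacity.
 */
static int fifo_load(SerialFIFO *f, const uint8_t *buf, size_t len)
{
    uint8_t count;

    if (len < 1) {
        return -1;
    }
    count = buf[0];
    if (count > UART_FIFO_LENGTH || len < 1 + (size_t)count + 3) {
        return -1;
    }
    memset(f, 0, sizeof(*f));
    f->count = count;
    memcpy(f->data, &buf[1], count);
    f->itl = buf[1 + count];
    f->tail = buf[2 + count];
    f->head = buf[3 + count];
    return 0;
}
```

Because the size is read before the array, the loader knows how many elements follow, which is why the size field has to precede the array in the structure. The capacity check in fifo_load() is where a migration between builds with different capacities would fail in this sketch: a stream from a build with a larger capacity is only rejected when it actually carries more elements than the destination can hold.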