From patchwork Tue Jul 24 17:20:47 2012
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Michael Roth
X-Patchwork-Id: 172996
From: Michael Roth
To: qemu-devel@nongnu.org
Date: Tue, 24 Jul 2012 12:20:47 -0500
Message-Id: <1343150454-4677-16-git-send-email-mdroth@linux.vnet.ibm.com>
X-Mailer: git-send-email 1.7.9.5
In-Reply-To: <1343150454-4677-1-git-send-email-mdroth@linux.vnet.ibm.com>
References: <1343150454-4677-1-git-send-email-mdroth@linux.vnet.ibm.com>
Cc: aliguori@us.ibm.com, quintela@redhat.com, owasserm@redhat.com, yamahata@valinux.co.jp, pbonzini@redhat.com, akong@redhat.com, afaerber@suse.de
Subject: [Qemu-devel] [PATCH 15/22] qidl: Add documentation

Signed-off-by: Michael Roth
---
 docs/qidl.txt | 331 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 331 insertions(+)
 create mode 100644 docs/qidl.txt

diff --git a/docs/qidl.txt b/docs/qidl.txt
new file mode 100644
index 0000000..30921b7
--- /dev/null
+++ b/docs/qidl.txt
@@ -0,0 +1,331 @@
+How to Serialize Device State with QIDL
+======================================
+
+This document describes how to implement save/restore of a device in QEMU using
+the QIDL compiler. The QIDL compiler makes it easier to support live
+migration in devices by converging the serialization description with the
+device type declaration. It has the following features:
+
+ 1. Single description of device state and how to serialize it.
+
+ 2. Fully inclusive serialization description--fields that aren't serialized
+    are explicitly marked as such including the reason why.
+
+ 3. Optimized for the common case. Even without any special annotations,
+    many devices will Just Work out of the box.
+
+ 4. Build time schema definition. Since QIDL runs at build time, we have full
+    access to the schema during the build which means we can fail the build if
+    the schema breaks.
+
+For the rest of the document, the following simple device will be used as an
+example.
+
+    typedef struct SerialDevice {
+        SysBusDevice parent;
+
+        uint8_t thr;            // transmit holding register
+        uint8_t lsr;            // line status register
+        uint8_t ier;            // interrupt enable register
+
+        int int_pending;        // whether we have a pending queued interrupt
+        CharDriverState *chr;   // backend
+    } SerialDevice;
+
+Getting Started
+---------------
+
+The first step is to move your device struct definition to a header file. This
+header file should only contain the struct definition and any preprocessor
+declarations you need to define the structure. This header file will act as
+the source for the QIDL compiler.
+
+Do not include any function declarations in this header file as QIDL does not
+understand function declarations.
+
+Determining What State Gets Saved
+---------------------------------
+
+By default, QIDL saves every field in a structure it sees. This provides maximum
+correctness by default. However, device structures generally contain state
+that is in some way duplicated or not guest visible. More often than not, this
+state reflects implementation details.
+
+Since implementation details change over time, saving this state makes
+compatibility hard to maintain since it would effectively lock down a device's
+implementation.
+
+QIDL allows a device author to suppress certain fields from being saved although
+there are very strict rules about when this is allowed and what needs to be done
+to ensure that this does not impact correctness.
+
+There are three cases where state can be suppressed: when it is **immutable**,
+**derived**, or **broken**. In addition, QIDL can decide at run time whether to
+suppress a field by assigning it a **default** value.
+
+### Immutable Fields
+
+If a field is only set during device construction, based on parameters passed to
+the device's constructor, then there is no need to save and restore this
+value. We call these fields immutable and we tell QIDL about this fact by using
+an **immutable** marker.
+
+In our *SerialDevice* example, the *CharDriverState* pointer reflects the host
+backend that we use to send serial output to the user. This is only assigned
+during device construction and never changes. This means we can add an
+**immutable** marker to it:
+
+    typedef struct SerialDevice {
+        SysBusDevice parent;
+
+        uint8_t thr;            // transmit holding register
+        uint8_t lsr;            // line status register
+        uint8_t ier;            // interrupt enable register
+
+        int int_pending;        // whether we have a pending queued interrupt
+        CharDriverState *chr QIDL(immutable);
+    } SerialDevice;
+
+When reviewing patches that make use of the **immutable** marker, the following
+guidelines should be followed to determine if the marker is being used
+correctly.
+
+ 1. Check to see if the field is assigned anywhere other than the device
+    initialization function.
+
+ 2. Check to see if any function is being called that modifies the state of the
+    field outside of the initialization function.
+
+It can be subtle whether a field is truly immutable. A good example is a
+*QEMUTimer*. Timers will usually have their timeout modified with a call to
+*qemu_mod_timer()* even though they are only assigned in the device
+initialization function.
+
+If the timer is always modified with a fixed value that is not dependent on
+guest state, then the timer is immutable since it's unaffected by the state of
+the guest.
+
+On the other hand, if the timer is modified based on guest state (such as a
+guest programmed time out), then the timer carries state. It may be necessary
+to save/restore the timer or mark it as **derived** and work with it
+accordingly.
+
+### Derived Fields
+
+If a field is set based on some other field in the device's structure, then its
+value is derived. Since this is effectively duplicate state, we can avoid
+sending it and then recompute it when we need to. Derived state requires a bit
+more handling than immutable state.
+
+In our *SerialDevice* example, our *int_pending* flag is really derived from
+two pieces of state. It is set based on whether interrupts are enabled in the
+*ier* register and whether the *THRE* flag is not set in the *lsr*
+register.
+
+To mark a field as derived, use the **derived** marker. To update our
+example, we would do:
+
+    typedef struct SerialDevice {
+        SysBusDevice parent;
+
+        uint8_t thr;            // transmit holding register
+        uint8_t lsr;            // line status register
+        uint8_t ier;            // interrupt enable register
+
+        int int_pending QIDL(derived); // whether we have a pending queued interrupt
+        CharDriverState *chr QIDL(immutable);
+    } SerialDevice;
+
+There is one other critical step needed when marking a field as derived. A
+*post_load* function must be added that updates this field after loading the
+rest of the device state. This function is implemented in the device's source
+file, not in the QIDL header. Below is an example of what this function may do:
+
+    static void serial_post_load(SerialDevice *s)
+    {
+        s->int_pending = !(s->lsr & THRE) && (s->ier & INTE);
+    }
+
+When reviewing a patch that marks a field as *derived*, the following criteria
+should be used:
+
+ 1. Does the device have a post load function?
+
+ 2. Does the post load function assign a value to all of the derived fields?
+
+ 3. Are there any obvious places where a derived field is holding unique state?
+
+### Broken State
+
+QEMU does migration with a lot of devices today. When applying this methodology
+to these devices, one will quickly discover that there are a lot of fields that
+are not being saved today that are not derived or immutable state.
+
+These are all bugs. It just so happens that these bugs are usually not very
+serious. In many cases, they cause small functionality glitches that so far
+have not created any problems.
+
+Consider our *SerialDevice* example. In QEMU's real *SerialState* device, the
+*thr* register is not saved yet we have not marked it immutable or derived.
+
+The *thr* register is a temporary holding register that the next character to
+transmit is placed in while we wait for the next baud cycle. In QEMU, we
+emulate a very fast baud rate regardless of what the guest programs. This means
+that the contents of the *thr* register only matter for a very small period of
+time (measured in microseconds).
+
+The likelihood of a migration converging in that very small period of time when
+the *thr* register has a meaningful value is very small. Moreover, the worst
+thing that can happen by not saving this register is that we lose a byte in the
+data stream. Even if this has happened in practice, the chances of someone
+noticing this as a bug are pretty small.
+
+Nonetheless, this is a bug and needs to be eventually fixed. However, it would
+be very inconvenient to constantly break migration by fixing all of these bugs
+one-by-one. Instead, QIDL has a **broken** marker. This indicates that a field
+is not currently saved, but should be in the future.
+
+The idea behind the broken marker is that we can convert a large number of
+devices without breaking migration compatibility, and then institute a flag day
+where we go through and remove broken markers en masse.
+
+Below is an update of our example to reflect our real life serial device:
+
+    typedef struct SerialDevice {
+        SysBusDevice parent;
+
+        uint8_t thr QIDL(broken); // transmit holding register
+        uint8_t lsr;              // line status register
+        uint8_t ier;              // interrupt enable register
+
+        int int_pending QIDL(derived); // whether we have a pending queued interrupt
+        CharDriverState *chr QIDL(immutable);
+    } SerialDevice;
+
+When reviewing the use of the broken marker, the following things should be
+considered:
+
+ 1. What are the ramifications of not sending this data field?
+
+ 2. If not sending this data field can cause data corruption or very poor
+    behavior within the guest, the broken marker is not appropriate to use.
+
+ 3. Assigning a default value to a field can also be used to fix a broken field
+    without significantly impacting live migration compatibility.
+
+### Default Values
+
+In many cases, a field that gets marked broken was not originally saved because
+the vast majority of the time, the field does not contain a meaningful value.
+
+In the case of our *thr* example, the field usually does not have a meaningful
+value.
+
+Instead of always saving the field, QIDL has another mechanism that allows the
+field to be saved only when it has a meaningful value. This is done using the
+**default** marker. The default marker tells QIDL that if the field currently
+has a specific value, do not save the value as part of serialization.
+
+When loading a field, QIDL will assign the default value to the field before it
+tries to load the field. If the field cannot be loaded, QIDL will ignore the
+error and rely on the default value.
+
+Using default values, we can fix broken fields while also minimizing the cases
+where we break live migration compatibility. The **default** marker can be
+used in conjunction with the **broken** marker.
+We can extend our example as follows:
+
+    typedef struct SerialDevice {
+        SysBusDevice parent;
+
+
+        uint8_t thr QIDL(default, 0); // transmit holding register
+        uint8_t lsr;                  // line status register
+        uint8_t ier;                  // interrupt enable register
+
+        int int_pending QIDL(derived); // whether we have a pending queued interrupt
+        CharDriverState *chr QIDL(immutable);
+    } SerialDevice;
+
+The following guidelines should be followed when using a default marker:
+
+ 1. Is the field set to the default value both during device initialization and
+    whenever the field is no longer in use?
+
+ 2. If the non-default value is expected to occur often, then consider using the
+    **broken** marker along with the default marker and using a flag day to
+    remove the **broken** marker.
+
+ 3. In general, setting a field to its default value during device initialization
+    is a good idea even if the field was never broken. This gives us maximum
+    flexibility in the long term.
+
+ 4. Never change a default value without renaming a field. The default value is
+    part of the device's ABI.
+
+The first guideline is particularly important. In the case of QEMU's real
+*SerialState* device, it would be necessary to add code to set the *thr*
+register to zero after the byte has been successfully transmitted. Otherwise,
+it is unlikely that it would ever contain the default value.
+
+Arrays
+------
+
+QIDL has support for multiple types of arrays. The following sections describe
+the different rules for arrays.
+
+Fixed Sized Arrays
+------------------
+
+A fixed sized array has a size that is known at build time. A typical example
+would be:
+
+    struct SerialFIFO {
+        uint8_t data[UART_FIFO_LENGTH];
+        uint8_t count;
+        uint8_t itl;
+        uint8_t tail;
+        uint8_t head;
+    };
+
+In this example, *data* is a fixed sized array. No special annotation is needed
+for QIDL to marshal this array correctly. The following guidelines apply to
+fixed sized arrays:
+
+ 1. The size of the array is part of the device ABI.
+    It should not change without renaming the field.
+
+Variable Sized, Fixed Capacity Arrays
+-------------------------------------
+
+Sometimes it's desirable to have a variable sized array. QIDL currently supports
+variable sized arrays provided that the maximum capacity is fixed and part of
+the device structure memory.
+
+A typical example would be a slightly modified version of our above example:
+
+    struct SerialFIFO {
+        uint8_t count;
+        uint8_t data[UART_FIFO_LENGTH] QIDL(size_is, count);
+        uint8_t itl;
+        uint8_t tail;
+        uint8_t head;
+    };
+
+In this example, *data* is a variable sized array with a fixed capacity of
+*UART_FIFO_LENGTH*. When we serialize, we only want to serialize *count*
+members.
+
+The ABI implications of capacity are a bit more relaxed with variable sized
+arrays. In general, you can increase or decrease the capacity without breaking
+the ABI although you may cause some instances of migration to fail between
+versions of QEMU with different capacities.
+
+When reviewing variable sized, fixed capacity arrays, keep the following things
+in mind:
+
+ 1. The size field must be declared before the array field in the state
+    structure.
+
+ 2. The capacity can change without breaking the ABI, but care should be used
+    when making these types of changes.
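The effect of a size_is annotation on the serialized form can be sketched by
hand. The code below is only an illustration and is not the code QIDL
generates: the function names serial_fifo_save and serial_fifo_load, the byte
layout (count, then count data bytes, then itl, tail, head), and the capacity
of 16 are all assumptions made for this example. It shows why the size field
has to come first, and why a load can fail when the saving side had a larger
capacity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define UART_FIFO_LENGTH 16 /* assumed capacity, for illustration only */

/* Plain C mirror of the annotated struct; the QIDL(size_is, count)
 * annotation is left out so this compiles without any QIDL headers. */
struct SerialFIFO {
    uint8_t count;
    uint8_t data[UART_FIFO_LENGTH];
    uint8_t itl;
    uint8_t tail;
    uint8_t head;
};

/* Hypothetical wire layout: count, data[0..count), itl, tail, head.
 * Returns the number of bytes written, or 0 if buf is too small. */
static size_t serial_fifo_save(const struct SerialFIFO *f,
                               uint8_t *buf, size_t len)
{
    size_t need = 1 + (size_t)f->count + 3;

    assert(f->count <= UART_FIFO_LENGTH);
    if (len < need) {
        return 0;
    }
    buf[0] = f->count;                  /* size field is written first */
    memcpy(&buf[1], f->data, f->count); /* only the live elements */
    buf[1 + f->count] = f->itl;
    buf[2 + f->count] = f->tail;
    buf[3 + f->count] = f->head;
    return need;
}

/* Returns 0 on success, -1 if the stream is truncated or the saved count
 * exceeds this build's capacity (the cross-version failure case). */
static int serial_fifo_load(struct SerialFIFO *f,
                            const uint8_t *buf, size_t len)
{
    uint8_t count;

    if (len < 1) {
        return -1;
    }
    count = buf[0];                     /* must be known before the array */
    if (count > UART_FIFO_LENGTH || len < 1 + (size_t)count + 3) {
        return -1;
    }
    memset(f, 0, sizeof(*f));
    f->count = count;
    memcpy(f->data, &buf[1], count);
    f->itl = buf[1 + count];
    f->tail = buf[2 + count];
    f->head = buf[3 + count];
    return 0;
}
```

With three bytes queued, this layout takes seven bytes instead of the twenty a
fixed sized array would need. A stream whose count is larger than the loading
side's capacity is rejected rather than truncated, which corresponds to the
migration failure between versions with different capacities mentioned above.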