Patchwork [XEN,RFC,V2,11/17] xc: modify save/restore to support multiple device models

login
register
mail settings
Submitter Julien Grall
Date Aug. 22, 2012, 12:31 p.m.
Message ID <fcf046ea782dda6cacb3bf11813bf1d16e531e6b.1345552068.git.julien.grall@citrix.com>
Download mbox | patch
Permalink /patch/179383/
State New
Headers show

Comments

Julien Grall - Aug. 22, 2012, 12:31 p.m.
- add save/restore new special pages and remove unused
    - modify save file structure to allow multiple qemu states

Signed-off-by: Julien Grall <julien.grall@citrix.com>
---
 tools/libxc/xc_domain_restore.c |  150 +++++++++++++++++++++++++++++----------
 tools/libxc/xc_domain_save.c    |    6 +-
 2 files changed, 116 insertions(+), 40 deletions(-)
Ian Campbell - Aug. 23, 2012, 1:27 p.m.
On Wed, 2012-08-22 at 13:31 +0100, Julien Grall wrote:
> - add save/restore new special pages and remove unused
>     - modify save file structure to allow multiple qemu states
> 
> Signed-off-by: Julien Grall <julien.grall@citrix.com>
> ---
>  tools/libxc/xc_domain_restore.c |  150 +++++++++++++++++++++++++++++----------
>  tools/libxc/xc_domain_save.c    |    6 +-

As you've changed the protocol olease can you update the docs in
xg_save_restore.h.

> @@ -103,6 +103,9 @@ static ssize_t rdexact(xc_interface *xch, struct restore_ctx *ctx,
>  #else
>  #define RDEXACT read_exact
>  #endif
> +
> +#define QEMUSIG_SIZE 21
> +
>  /*
>  ** In the state file (or during transfer), all page-table pages are
>  ** converted into a 'canonical' form where references to actual mfns
> @@ -467,7 +522,7 @@ static int buffer_tail_hvm(xc_interface *xch, struct restore_ctx *ctx,
>                             int vcpuextstate, uint32_t vcpuextstate_size)
>  {
>      uint8_t *tmp;
> -    unsigned char qemusig[21];
> +    unsigned char qemusig[QEMUSIG_SIZE + 1];

An extra + 1 here?

[...]
> -    qemusig[20] = '\0';
> +    qemusig[QEMUSIG_SIZE] = '\0';

This is one bigger than it used to be now.

Perhaps this is an unrelated bug fix (I haven't check the real length of
the sig), in which case please can you split it out and submit
separately?

Ian.
Julien Grall - Aug. 23, 2012, 7:13 p.m.
On 08/23/2012 02:27 PM, Ian Campbell wrote:
>
>> @@ -103,6 +103,9 @@ static ssize_t rdexact(xc_interface *xch, struct restore_ctx *ctx,
>>   #else
>>   #define RDEXACT read_exact
>>   #endif
>> +
>> +#define QEMUSIG_SIZE 21
>> +
>>   /*
>>   ** In the state file (or during transfer), all page-table pages are
>>   ** converted into a 'canonical' form where references to actual mfns
>> @@ -467,7 +522,7 @@ static int buffer_tail_hvm(xc_interface *xch, struct restore_ctx *ctx,
>>                              int vcpuextstate, uint32_t vcpuextstate_size)
>>   {
>>       uint8_t *tmp;
>> -    unsigned char qemusig[21];
>> +    unsigned char qemusig[QEMUSIG_SIZE + 1];
>>      
> An extra + 1 here?
>    
QEMUSIG_SIZE doesn't take into account the '\0'. So we need to add 1.
If an error occurred, without +1, the output log lost the last character.

> [...]
>    
>> -    qemusig[20] = '\0';
>> +    qemusig[QEMUSIG_SIZE] = '\0';
>>      
> This is one bigger than it used to be now.
>
> Perhaps this is an unrelated bug fix (I haven't check the real length of
> the sig), in which case please can you split it out and submit
> separately?
>    

#define QEMU_SIGNATURE "DeviceModelRecord0002"
Just checked, the length seems to be 21. I will send a patch with
this change.
Ian Campbell - Aug. 23, 2012, 7:52 p.m.
On Thu, 2012-08-23 at 20:13 +0100, Julien Grall wrote:
> On 08/23/2012 02:27 PM, Ian Campbell wrote:
> >
> >> @@ -103,6 +103,9 @@ static ssize_t rdexact(xc_interface *xch, struct restore_ctx *ctx,
> >>   #else
> >>   #define RDEXACT read_exact
> >>   #endif
> >> +
> >> +#define QEMUSIG_SIZE 21
> >> +
> >>   /*
> >>   ** In the state file (or during transfer), all page-table pages are
> >>   ** converted into a 'canonical' form where references to actual mfns
> >> @@ -467,7 +522,7 @@ static int buffer_tail_hvm(xc_interface *xch, struct restore_ctx *ctx,
> >>                              int vcpuextstate, uint32_t vcpuextstate_size)
> >>   {
> >>       uint8_t *tmp;
> >> -    unsigned char qemusig[21];
> >> +    unsigned char qemusig[QEMUSIG_SIZE + 1];
> >>      
> > An extra + 1 here?
> >    
> QEMUSIG_SIZE doesn't take into account the '\0'. So we need to add 1.
> If an error occurred, without +1, the output log lost the last character.

So this is just a bug fix for a pre-existing issue?

> > [...]
> >    
> >> -    qemusig[20] = '\0';
> >> +    qemusig[QEMUSIG_SIZE] = '\0';
> >>      
> > This is one bigger than it used to be now.
> >
> > Perhaps this is an unrelated bug fix (I haven't check the real length of
> > the sig), in which case please can you split it out and submit
> > separately?
> >    
> 
> #define QEMU_SIGNATURE "DeviceModelRecord0002"
> Just checked, the length seems to be 21. I will send a patch with
> this change.

Perhaps use either sizeof(QEMU_SIGNATURE) or strlen(QEMU_SIGNATURE)
(depending on which semantics you want)?

Ian.
Julien Grall - Aug. 24, 2012, 10:27 a.m.
On 08/23/2012 08:52 PM, Ian Campbell wrote:
> On Thu, 2012-08-23 at 20:13 +0100, Julien Grall wrote:
>    
>> On 08/23/2012 02:27 PM, Ian Campbell wrote:
>>      
>>>        
>>>> @@ -103,6 +103,9 @@ static ssize_t rdexact(xc_interface *xch, struct restore_ctx *ctx,
>>>>    #else
>>>>    #define RDEXACT read_exact
>>>>    #endif
>>>> +
>>>> +#define QEMUSIG_SIZE 21
>>>> +
>>>>    /*
>>>>    ** In the state file (or during transfer), all page-table pages are
>>>>    ** converted into a 'canonical' form where references to actual mfns
>>>> @@ -467,7 +522,7 @@ static int buffer_tail_hvm(xc_interface *xch, struct restore_ctx *ctx,
>>>>                               int vcpuextstate, uint32_t vcpuextstate_size)
>>>>    {
>>>>        uint8_t *tmp;
>>>> -    unsigned char qemusig[21];
>>>> +    unsigned char qemusig[QEMUSIG_SIZE + 1];
>>>>
>>>>          
>>> An extra + 1 here?
>>>
>>>        
>> QEMUSIG_SIZE doesn't take into account the '\0'. So we need to add 1.
>> If an error occurred, without +1, the output log lost the last character.
>>      
> So this is just a bug fix for a pre-existing issue?
>    
Yes.

>>> [...]
>>>
>>>        
>>>> -    qemusig[20] = '\0';
>>>> +    qemusig[QEMUSIG_SIZE] = '\0';
>>>>
>>>>          
>>> This is one bigger than it used to be now.
>>>
>>> Perhaps this is an unrelated bug fix (I haven't check the real length of
>>> the sig), in which case please can you split it out and submit
>>> separately?
>>>
>>>        
>> #define QEMU_SIGNATURE "DeviceModelRecord0002"
>> Just checked, the length seems to be 21. I will send a patch with
>> this change.
>>      
> Perhaps use either sizeof(QEMU_SIGNATURE) or strlen(QEMU_SIGNATURE)
> (depending on which semantics you want)?
>    
Here, QEMU_SIZE needs to be define as strlen (QEMU_SIGNATURE),
but QEMU_SIGNATURE is not defined in libxc. It's defined
in libxl/libxl_internal.h.
By the way, I'm wondering why QEMU save (libxl__domain_save_device_model)
is made in libxl and restore (dump_qemu) in libxc ?
Ian Campbell - Aug. 24, 2012, 10:35 a.m.
On Fri, 2012-08-24 at 11:27 +0100, Julien Grall wrote:
> On 08/23/2012 08:52 PM, Ian Campbell wrote:
> > On Thu, 2012-08-23 at 20:13 +0100, Julien Grall wrote:
> >    
> >> On 08/23/2012 02:27 PM, Ian Campbell wrote:
> >>      
> >>>        
> >>>> @@ -103,6 +103,9 @@ static ssize_t rdexact(xc_interface *xch, struct restore_ctx *ctx,
> >>>>    #else
> >>>>    #define RDEXACT read_exact
> >>>>    #endif
> >>>> +
> >>>> +#define QEMUSIG_SIZE 21
> >>>> +
> >>>>    /*
> >>>>    ** In the state file (or during transfer), all page-table pages are
> >>>>    ** converted into a 'canonical' form where references to actual mfns
> >>>> @@ -467,7 +522,7 @@ static int buffer_tail_hvm(xc_interface *xch, struct restore_ctx *ctx,
> >>>>                               int vcpuextstate, uint32_t vcpuextstate_size)
> >>>>    {
> >>>>        uint8_t *tmp;
> >>>> -    unsigned char qemusig[21];
> >>>> +    unsigned char qemusig[QEMUSIG_SIZE + 1];
> >>>>
> >>>>          
> >>> An extra + 1 here?
> >>>
> >>>        
> >> QEMUSIG_SIZE doesn't take into account the '\0'. So we need to add 1.
> >> If an error occurred, without +1, the output log lost the last character.
> >>      
> > So this is just a bug fix for a pre-existing issue?
> >    
> Yes.

Can we get it as a separate change?

> 
> >>> [...]
> >>>
> >>>        
> >>>> -    qemusig[20] = '\0';
> >>>> +    qemusig[QEMUSIG_SIZE] = '\0';
> >>>>
> >>>>          
> >>> This is one bigger than it used to be now.
> >>>
> >>> Perhaps this is an unrelated bug fix (I haven't check the real length of
> >>> the sig), in which case please can you split it out and submit
> >>> separately?
> >>>
> >>>        
> >> #define QEMU_SIGNATURE "DeviceModelRecord0002"
> >> Just checked, the length seems to be 21. I will send a patch with
> >> this change.
> >>      
> > Perhaps use either sizeof(QEMU_SIGNATURE) or strlen(QEMU_SIGNATURE)
> > (depending on which semantics you want)?
> >    
> Here, QEMU_SIZE needs to be define as strlen (QEMU_SIGNATURE),
> but QEMU_SIGNATURE is not defined in libxc. It's defined
> in libxl/libxl_internal.h.

Oh, right, this again :-/

> By the way, I'm wondering why QEMU save (libxl__domain_save_device_model)
> is made in libxl and restore (dump_qemu) in libxc ?

Mostly historical accident, we'd really like to sort this out one way or
the other but untangling the protocol and the callbacks etc is a pretty
big job.

In the meantime perhaps libxc could provide a suitable "typedef char
device_model_signature_t[21]"?

Ian.

Patch

diff --git a/tools/libxc/xc_domain_restore.c b/tools/libxc/xc_domain_restore.c
index 3fe2b12..9a49ee2 100644
--- a/tools/libxc/xc_domain_restore.c
+++ b/tools/libxc/xc_domain_restore.c
@@ -103,6 +103,9 @@  static ssize_t rdexact(xc_interface *xch, struct restore_ctx *ctx,
 #else
 #define RDEXACT read_exact
 #endif
+
+#define QEMUSIG_SIZE 21
+
 /*
 ** In the state file (or during transfer), all page-table pages are
 ** converted into a 'canonical' form where references to actual mfns
@@ -342,8 +345,11 @@  typedef struct {
                 uint32_t version;
                 uint64_t len;
             } qemuhdr;
-            uint32_t qemubufsize;
-            uint8_t* qemubuf;
+            uint32_t num_dms;
+            struct devmodel_buffer {
+                uint32_t size;
+                uint8_t* buf;
+            } *dmsbuf;
         } hvm;
     } u;
 } tailbuf_t;
@@ -392,63 +398,112 @@  static int compat_buffer_qemu(xc_interface *xch, struct restore_ctx *ctx,
         return -1;
     }
 
-    buf->qemubuf = qbuf;
-    buf->qemubufsize = dlen;
+    if ( !(buf->dmsbuf = calloc(1, sizeof(*buf->dmsbuf))) ) {
+        ERROR("Error allocating Device Model buffer");
+        free(qbuf);
+        return -1;
+    }
+
+    buf->dmsbuf[0].buf = qbuf;
+    buf->dmsbuf[0].size = dlen;
+    buf->num_dms = 1;
 
     return 0;
 }
 
 static int buffer_qemu(xc_interface *xch, struct restore_ctx *ctx,
-                       int fd, struct tailbuf_hvm *buf)
+                       uint32_t dmid, int fd, struct tailbuf_hvm *buf)
 {
     uint32_t qlen;
     uint8_t *tmp;
+    struct devmodel_buffer *dmb = &buf->dmsbuf[dmid];
 
     if ( RDEXACT(fd, &qlen, sizeof(qlen)) ) {
-        PERROR("Error reading QEMU header length");
+        PERROR("Error reading Device Model %u header length", dmid);
         return -1;
     }
 
-    if ( qlen > buf->qemubufsize ) {
-        if ( buf->qemubuf) {
-            tmp = realloc(buf->qemubuf, qlen);
+    if ( qlen > dmb->size ) {
+        if ( dmb->buf ) {
+            tmp = realloc(dmb->buf, qlen);
             if ( tmp )
-                buf->qemubuf = tmp;
+                dmb->buf = tmp;
             else {
-                ERROR("Error reallocating QEMU state buffer");
+                ERROR("Error reallocating Device Model %u state buffer", dmid);
                 return -1;
             }
         } else {
-            buf->qemubuf = malloc(qlen);
-            if ( !buf->qemubuf ) {
-                ERROR("Error allocating QEMU state buffer");
+            dmb->buf = malloc(qlen);
+            if ( !dmb->buf ) {
+                ERROR("Error allocating Device Model %u state buffer", dmid);
                 return -1;
             }
         }
     }
-    buf->qemubufsize = qlen;
+    dmb->size = qlen;
 
-    if ( RDEXACT(fd, buf->qemubuf, buf->qemubufsize) ) {
-        PERROR("Error reading QEMU state");
+    if ( RDEXACT(fd, dmb->buf, dmb->size) ) {
+        PERROR("Error reading Device Model %u state", dmid);
         return -1;
     }
 
     return 0;
 }
 
-static int dump_qemu(xc_interface *xch, uint32_t dom, struct tailbuf_hvm *buf)
+static int buffer_device_models(xc_interface *xch, struct restore_ctx *ctx,
+                                int fd, struct tailbuf_hvm *buf)
+{
+    uint32_t i, num_dms;
+    unsigned char qemusig[QEMUSIG_SIZE + 1];
+    int ret = 0;
+
+    if ( RDEXACT(fd, &num_dms, sizeof(num_dms)) ) {
+        PERROR("Error reading num dms");
+        return -1;
+    }
+
+    if ( !(buf->dmsbuf = calloc(num_dms, sizeof (*buf->dmsbuf))) ) {
+        PERROR("Error allocating Device Model buffers");
+        return -1;
+    }
+
+    buf->num_dms = num_dms;
+
+    for ( i = 0; i < num_dms; i++ ) {
+        if ( RDEXACT(fd, qemusig, QEMUSIG_SIZE) ) {
+            PERROR("Error reading Device Model %u signature", i);
+            return -1;
+        }
+
+        if ( memcmp(qemusig, "DeviceModelRecord0002", QEMUSIG_SIZE) ) {
+            qemusig[QEMUSIG_SIZE] = '\0';
+            ERROR("Invalid Device Model %u signature: %s", i, qemusig);
+            return -1;
+        }
+
+        ret = buffer_qemu(xch, ctx, i, fd, buf);
+        if ( ret )
+            return ret;
+    }
+
+    return 0;
+}
+
+static int dump_qemu(xc_interface *xch, uint32_t dom,
+                     uint32_t dmid, struct tailbuf_hvm *buf)
 {
     int saved_errno;
     char path[256];
     FILE *fp;
+    struct devmodel_buffer *dmb = &buf->dmsbuf[dmid];
 
-    sprintf(path, XC_DEVICE_MODEL_RESTORE_FILE".%u", dom);
+    sprintf(path, XC_DEVICE_MODEL_RESTORE_FILE".%u.%u", dom, dmid);
     fp = fopen(path, "wb");
     if ( !fp )
         return -1;
 
-    DPRINTF("Writing %d bytes of QEMU data\n", buf->qemubufsize);
-    if ( fwrite(buf->qemubuf, 1, buf->qemubufsize, fp) != buf->qemubufsize) {
+    DPRINTF("Writing %d bytes of Device Model %u data\n", dmb->size, dmid);
+    if ( fwrite(dmb->buf, 1, dmb->size, fp) != dmb->size ) {
         saved_errno = errno;
         fclose(fp);
         errno = saved_errno;
@@ -467,7 +522,7 @@  static int buffer_tail_hvm(xc_interface *xch, struct restore_ctx *ctx,
                            int vcpuextstate, uint32_t vcpuextstate_size)
 {
     uint8_t *tmp;
-    unsigned char qemusig[21];
+    unsigned char qemusig[QEMUSIG_SIZE + 1];
 
     if ( RDEXACT(fd, buf->magicpfns, sizeof(buf->magicpfns)) ) {
         PERROR("Error reading magic PFNs");
@@ -504,7 +559,7 @@  static int buffer_tail_hvm(xc_interface *xch, struct restore_ctx *ctx,
         return -1;
     }
 
-    if ( RDEXACT(fd, qemusig, sizeof(qemusig)) ) {
+    if ( RDEXACT(fd, qemusig, QEMUSIG_SIZE) ) {
         PERROR("Error reading QEMU signature");
         return -1;
     }
@@ -517,13 +572,22 @@  static int buffer_tail_hvm(xc_interface *xch, struct restore_ctx *ctx,
      * live-migration QEMU record and Remus which includes a length
      * prefix
      */
-    if ( !memcmp(qemusig, "QemuDeviceModelRecord", sizeof(qemusig)) )
+    if ( !memcmp(qemusig, "QemuDeviceModelRecord", QEMUSIG_SIZE) )
         return compat_buffer_qemu(xch, ctx, fd, buf);
-    else if ( !memcmp(qemusig, "DeviceModelRecord0002", sizeof(qemusig)) ||
-              !memcmp(qemusig, "RemusDeviceModelState", sizeof(qemusig)) )
-        return buffer_qemu(xch, ctx, fd, buf);
+    else if ( !memcmp(qemusig, "DeviceModelRecord0002", QEMUSIG_SIZE) ||
+              !memcmp(qemusig, "RemusDeviceModelState", QEMUSIG_SIZE) )
+    {
+        if ( !(buf->dmsbuf = calloc(1, sizeof (*buf->dmsbuf))) ) {
+            PERROR("Error allocating Device Model buffer");
+            return -1;
+        }
+        return buffer_qemu(xch, ctx, 0, fd, buf);
+    }
+    else if ( !memcmp(qemusig, "DeviceModelRecords001", QEMUSIG_SIZE) ) {
+        return buffer_device_models(xch, ctx, fd, buf);
+   }
 
-    qemusig[20] = '\0';
+    qemusig[QEMUSIG_SIZE] = '\0';
     ERROR("Invalid QEMU signature: %s", qemusig);
     return -1;
 }
@@ -629,13 +693,18 @@  static int buffer_tail(xc_interface *xch, struct restore_ctx *ctx,
 
 static void tailbuf_free_hvm(struct tailbuf_hvm *buf)
 {
+    uint32_t i;
+
     if ( buf->hvmbuf ) {
         free(buf->hvmbuf);
         buf->hvmbuf = NULL;
     }
-    if ( buf->qemubuf ) {
-        free(buf->qemubuf);
-        buf->qemubuf = NULL;
+
+    for (i = 0; i < buf->num_dms; i++)
+    {
+        if (buf->dmsbuf[i].buf)
+            free(buf->dmsbuf[i].buf);
+        buf->dmsbuf[i].buf = NULL;
     }
 }
 
@@ -2137,10 +2206,17 @@  int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
         }
     }
 
-    /* Dump the QEMU state to a state file for QEMU to load */
-    if ( dump_qemu(xch, dom, &tailbuf.u.hvm) ) {
-        PERROR("Error dumping QEMU state to file");
-        goto out;
+    /**
+     * Dump the each Device Model state to a state file for the Device
+     * Model to load
+     */
+    for ( i = 0; i < tailbuf.u.hvm.num_dms; i++)
+    {
+        if ( dump_qemu(xch, dom, i, &tailbuf.u.hvm) )
+        {
+            PERROR("Error dumping Device Model %u state to file", i);
+            goto out;
+        }
     }
 
     /* These comms pages need to be zeroed at the start of day */
@@ -2153,9 +2229,9 @@  int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
     }
 
     if ( (frc = xc_set_hvm_param(xch, dom,
-                                 HVM_PARAM_IOREQ_PFN, tailbuf.u.hvm.magicpfns[0]))
+                                 HVM_PARAM_IO_PFN_FIRST, tailbuf.u.hvm.magicpfns[0]))
          || (frc = xc_set_hvm_param(xch, dom,
-                                    HVM_PARAM_BUFIOREQ_PFN, tailbuf.u.hvm.magicpfns[1]))
+                                    HVM_PARAM_IO_PFN_LAST, tailbuf.u.hvm.magicpfns[1]))
          || (frc = xc_set_hvm_param(xch, dom,
                                     HVM_PARAM_STORE_PFN, tailbuf.u.hvm.magicpfns[2]))
          || (frc = xc_set_hvm_param(xch, dom,
diff --git a/tools/libxc/xc_domain_save.c b/tools/libxc/xc_domain_save.c
index c359649..2aa7a28 100644
--- a/tools/libxc/xc_domain_save.c
+++ b/tools/libxc/xc_domain_save.c
@@ -862,7 +862,7 @@  int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t max_iter
     uint8_t *hvm_buf = NULL;
 
     /* HVM: magic frames for ioreqs and xenstore comms. */
-    uint64_t magic_pfns[3]; /* ioreq_pfn, bufioreq_pfn, store_pfn */
+    uint64_t magic_pfns[3]; /* io_pfn_first , io_pfn_last, store_pfn */
 
     unsigned long mfn;
 
@@ -1787,9 +1787,9 @@  int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t max_iter
 
         /* Save magic-page locations. */
         memset(magic_pfns, 0, sizeof(magic_pfns));
-        xc_get_hvm_param(xch, dom, HVM_PARAM_IOREQ_PFN,
+        xc_get_hvm_param(xch, dom, HVM_PARAM_IO_PFN_FIRST,
                          (unsigned long *)&magic_pfns[0]);
-        xc_get_hvm_param(xch, dom, HVM_PARAM_BUFIOREQ_PFN,
+        xc_get_hvm_param(xch, dom, HVM_PARAM_IO_PFN_LAST,
                          (unsigned long *)&magic_pfns[1]);
         xc_get_hvm_param(xch, dom, HVM_PARAM_STORE_PFN,
                          (unsigned long *)&magic_pfns[2]);