diff mbox

[v2,for-1.6,v2,2/2] loader: put FW CFG ROM files into RAM

Message ID 1376310634-20778-3-git-send-email-mst@redhat.com
State New
Headers show

Commit Message

Michael S. Tsirkin Aug. 12, 2013, 6:16 p.m. UTC
ROM files that are put in FW CFG are copied to guest ram, by BIOS, but
they are not backed by RAM so they don't get migrated.

Each time we change two bytes in such a ROM this breaks cross-version
migration: since we can migrate after BIOS has read the first byte but
before it has read the second one, getting an inconsistent state.

Future-proof this by creating, for each such ROM,
an MR serving as the backing store.
This MR is never mapped into guest memory, but it's registered
as RAM so it's migrated with the guest.

Naturally, this only helps for -M 1.6 and up, older machine types
will still have the cross-version migration bug.
Luckily the race window for the problem to trigger is very small,
which is also likely why we didn't notice the cross-version
migration bug in testing yet.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 hw/core/loader.c    | 53 ++++++++++++++++++++++++++++++++++++++++++++++++++---
 hw/i386/pc_piix.c   |  2 ++
 hw/i386/pc_q35.c    |  2 ++
 include/hw/loader.h |  1 +
 4 files changed, 55 insertions(+), 3 deletions(-)

Comments

Peter Maydell Aug. 12, 2013, 6:37 p.m. UTC | #1
On 12 August 2013 19:16, Michael S. Tsirkin <mst@redhat.com> wrote:
> +static void *rom_set_mr(Rom *rom, Object *owner, const char *name)
> +{
> +    /*
> +     * Migration code expects that all RAM blocks are full pages.
> +     * Round MR size up to satisfy this condition.
> +     */
> +    unsigned size = ROUND_UP(rom->datasize, qemu_migration_page_size);
> +    void *data = g_malloc0(size);
> +
> +    memcpy(data, rom->data, rom->datasize);
> +
> +    rom->mr = g_malloc(sizeof(*rom->mr));
> +    memory_region_init_ram_ptr(rom->mr, owner, name, size, data);
> +    memory_region_set_readonly(rom->mr, true);
> +    vmstate_register_ram_global(rom->mr);

So having thought about this a little I think the right answer
here is "don't use memory_region_init_ram_ptr()". At the moment
in-tree we have five users of this function:

hw/display/g364fb.c
hw/i386/kvm/pci-assign.c
hw/misc/ivshmem.c
hw/misc/vfio.c
target-ppc/kvm.c

The last four of these all absolutely have to have the guest
use a specific host pointer, typically the result of mmap()ing
something [shared file, PCI device, KVM_ALLOCATE_RMA fd, etc].
The first one I think should be converted to use
memory_region_init_ram() instead, because it doesn't need
to use a particular buffer.

Similarly, what you're trying to do here doesn't require
that the guest sees any specific host pointer, so you should
just use memory_region_init_ram().

We should add an assert to the _init_ram_ptr functions that
checks that the size is OK, as well.

I seem to recall having a conversation with Paolo along these
lines a few months back (we fixed the exynos devices which
were incorrectly using the _ram_ptr function); he can correct
me if I'm off-base here.

-- PMM
Michael S. Tsirkin Aug. 12, 2013, 6:56 p.m. UTC | #2
On Mon, Aug 12, 2013 at 07:37:21PM +0100, Peter Maydell wrote:
> On 12 August 2013 19:16, Michael S. Tsirkin <mst@redhat.com> wrote:
> > +static void *rom_set_mr(Rom *rom, Object *owner, const char *name)
> > +{
> > +    /*
> > +     * Migration code expects that all RAM blocks are full pages.
> > +     * Round MR size up to satisfy this condition.
> > +     */
> > +    unsigned size = ROUND_UP(rom->datasize, qemu_migration_page_size);
> > +    void *data = g_malloc0(size);
> > +
> > +    memcpy(data, rom->data, rom->datasize);
> > +
> > +    rom->mr = g_malloc(sizeof(*rom->mr));
> > +    memory_region_init_ram_ptr(rom->mr, owner, name, size, data);
> > +    memory_region_set_readonly(rom->mr, true);
> > +    vmstate_register_ram_global(rom->mr);
> 
> So having thought about this a little I think the right answer
> here is "don't use memory_region_init_ram_ptr()". At the moment
> in-tree we have five users of this function:
> 
> hw/display/g364fb.c
> hw/i386/kvm/pci-assign.c
> hw/misc/ivshmem.c
> hw/misc/vfio.c
> target-ppc/kvm.c
> 
> The last four of these all absolutely have to have the guest
> use a specific host pointer, typically the result of mmap()ing
> something [shared file, PCI device, KVM_ALLOCATE_RMA fd, etc].
> The first one I think should be converted to use
> memory_region_init_ram() instead, because it doesn't need
> to use a particular buffer.
> 
> Similarly, what you're trying to do here doesn't require
> that the guest sees any specific host pointer, so you should
> just use memory_region_init_ram().

I was concerned that we are wasting resources here.
In particular, huge page memory might get allocated
and there's no need for it as it's never mapped
into guest.

Still if Paolo is OK with this too, I'll switch, and do
+ memory_region_init_ram_ptr(...)
+ data = memory_region_get_ram_ptr(rom->mr);

Paolo could you please confirm?


> 
> We should add an assert to the _init_ram_ptr functions that
> checks that the size is OK, as well.

At least for pci-assign and vfio it's the wrong thing
to do - they block migration so we don't need
the ram to be a multiple of migration page size.

> I seem to recall having a conversation with Paolo along these
> lines a few months back (we fixed the exynos devices which
> were incorrectly using the _ram_ptr function); he can correct
> me if I'm off-base here.
> 
> -- PMM
Paolo Bonzini Aug. 12, 2013, 10:03 p.m. UTC | #3
Il 12/08/2013 20:37, Peter Maydell ha scritto:
> 
> I seem to recall having a conversation with Paolo along these
> lines a few months back (we fixed the exynos devices which
> were incorrectly using the _ram_ptr function); he can correct
> me if I'm off-base here.

Yeah, I think you're right.

Paolo
Paolo Bonzini Aug. 13, 2013, 2:47 p.m. UTC | #4
Il 12/08/2013 20:56, Michael S. Tsirkin ha scritto:
>> > Similarly, what you're trying to do here doesn't require
>> > that the guest sees any specific host pointer, so you should
>> > just use memory_region_init_ram().
> I was concerned that we are wasting resources here.
> In particular, huge page memory might get allocated
> and there's no need for it as it's never mapped
> into guest.

You are also right, but I think this issue should be fixed separately
(e.g. only use huge pages for main RAM).

Paolo
Michael S. Tsirkin Aug. 13, 2013, 3:11 p.m. UTC | #5
On Tue, Aug 13, 2013 at 04:47:00PM +0200, Paolo Bonzini wrote:
> Il 12/08/2013 20:56, Michael S. Tsirkin ha scritto:
> >> > Similarly, what you're trying to do here doesn't require
> >> > that the guest sees any specific host pointer, so you should
> >> > just use memory_region_init_ram().
> > I was concerned that we are wasting resources here.
> > In particular, huge page memory might get allocated
> > and there's no need for it as it's never mapped
> > into guest.
> 
> You are also right, but I think this issue should be fixed separately
> (e.g. only use huge pages for main RAM).
> 
> Paolo

OK.
I switched to memory_region_init_ram for v3.
diff mbox

Patch

diff --git a/hw/core/loader.c b/hw/core/loader.c
index 6875b7e..9f83aa8 100644
--- a/hw/core/loader.c
+++ b/hw/core/loader.c
@@ -54,6 +54,8 @@ 
 
 #include <zlib.h>
 
+bool rom_file_in_ram = true;
+
 static int roms_loaded;
 
 /* return the size or -1 if error */
@@ -576,6 +578,7 @@  struct Rom {
     size_t datasize;
 
     uint8_t *data;
+    MemoryRegion *mr;
     int isrom;
     char *fw_dir;
     char *fw_file;
@@ -605,6 +608,25 @@  static void rom_insert(Rom *rom)
     QTAILQ_INSERT_TAIL(&roms, rom, next);
 }
 
+static void *rom_set_mr(Rom *rom, Object *owner, const char *name)
+{
+    /*
+     * Migration code expects that all RAM blocks are full pages.
+     * Round MR size up to satisfy this condition.
+     */
+    unsigned size = ROUND_UP(rom->datasize, qemu_migration_page_size);
+    void *data = g_malloc0(size);
+
+    memcpy(data, rom->data, rom->datasize);
+
+    rom->mr = g_malloc(sizeof(*rom->mr));
+    memory_region_init_ram_ptr(rom->mr, owner, name, size, data);
+    memory_region_set_readonly(rom->mr, true);
+    vmstate_register_ram_global(rom->mr);
+
+    return data;
+}
+
 int rom_add_file(const char *file, const char *fw_dir,
                  hwaddr addr, int32_t bootindex)
 {
@@ -646,6 +668,7 @@  int rom_add_file(const char *file, const char *fw_dir,
     if (rom->fw_file && fw_cfg) {
         const char *basename;
         char fw_file_name[56];
+        void *data;
 
         basename = strrchr(rom->fw_file, '/');
         if (basename) {
@@ -655,8 +678,15 @@  int rom_add_file(const char *file, const char *fw_dir,
         }
         snprintf(fw_file_name, sizeof(fw_file_name), "%s/%s", rom->fw_dir,
                  basename);
-        fw_cfg_add_file(fw_cfg, fw_file_name, rom->data, rom->romsize);
         snprintf(devpath, sizeof(devpath), "/rom@%s", fw_file_name);
+
+        if (rom_file_in_ram) {
+            data = rom_set_mr(rom, OBJECT(fw_cfg), devpath);
+        } else {
+            data = rom->data;
+        }
+
+        fw_cfg_add_file(fw_cfg, fw_file_name, data, rom->romsize);
     } else {
         snprintf(devpath, sizeof(devpath), "/rom@" TARGET_FMT_plx, addr);
     }
@@ -731,7 +761,12 @@  static void rom_reset(void *unused)
         if (rom->data == NULL) {
             continue;
         }
-        cpu_physical_memory_write_rom(rom->addr, rom->data, rom->datasize);
+        if (rom->mr) {
+            void *host = memory_region_get_ram_ptr(rom->mr);
+            memcpy(host, rom->data, rom->datasize);
+        } else {
+            cpu_physical_memory_write_rom(rom->addr, rom->data, rom->datasize);
+        }
         if (rom->isrom) {
             /* rom needs to be written only once */
             g_free(rom->data);
@@ -781,6 +816,9 @@  static Rom *find_rom(hwaddr addr)
         if (rom->fw_file) {
             continue;
         }
+        if (rom->mr) {
+            continue;
+        }
         if (rom->addr > addr) {
             continue;
         }
@@ -808,6 +846,9 @@  int rom_copy(uint8_t *dest, hwaddr addr, size_t size)
         if (rom->fw_file) {
             continue;
         }
+        if (rom->mr) {
+            continue;
+        }
         if (rom->addr + rom->romsize < addr) {
             continue;
         }
@@ -866,7 +907,13 @@  void do_info_roms(Monitor *mon, const QDict *qdict)
     Rom *rom;
 
     QTAILQ_FOREACH(rom, &roms, next) {
-        if (!rom->fw_file) {
+        if (rom->mr) {
+            monitor_printf(mon, "%s"
+                           " size=0x%06zx name=\"%s\"\n",
+                           rom->mr->name,
+                           rom->romsize,
+                           rom->name);
+        } else if (!rom->fw_file) {
             monitor_printf(mon, "addr=" TARGET_FMT_plx
                            " size=0x%06zx mem=%s name=\"%s\"\n",
                            rom->addr, rom->romsize,
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 95c45b8..f654977 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -25,6 +25,7 @@ 
 #include <glib.h>
 
 #include "hw/hw.h"
+#include "hw/loader.h"
 #include "hw/i386/pc.h"
 #include "hw/i386/apic.h"
 #include "hw/pci/pci.h"
@@ -257,6 +258,7 @@  static void pc_init_pci_1_6(QEMUMachineInitArgs *args)
 
 static void pc_init_pci_1_5(QEMUMachineInitArgs *args)
 {
+    rom_file_in_ram = false;
     pc_init_pci_1_6(args);
 }
 
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 6bfc2ca..6848c7b 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -28,6 +28,7 @@ 
  * THE SOFTWARE.
  */
 #include "hw/hw.h"
+#include "hw/loader.h"
 #include "sysemu/arch_init.h"
 #include "hw/i2c/smbus.h"
 #include "hw/boards.h"
@@ -225,6 +226,7 @@  static void pc_q35_init_1_6(QEMUMachineInitArgs *args)
 
 static void pc_q35_init_1_5(QEMUMachineInitArgs *args)
 {
+    rom_file_in_ram = false;
     pc_q35_init_1_6(args);
 }
 
diff --git a/include/hw/loader.h b/include/hw/loader.h
index eb9c9a3..6145736 100644
--- a/include/hw/loader.h
+++ b/include/hw/loader.h
@@ -36,6 +36,7 @@  void pstrcpy_targphys(const char *name,
                       hwaddr dest, int buf_size,
                       const char *source);
 
+extern bool rom_file_in_ram;
 
 int rom_add_file(const char *file, const char *fw_dir,
                  hwaddr addr, int32_t bootindex);