
[RFC/PATCH] migration: SMRAM dirty bitmap not fetched from kvm-kmod and not sent to destination

Message ID 57D90289.6020003@huawei.com
State New

Commit Message

Herongguang (Stephen) Sept. 14, 2016, 7:55 a.m. UTC
Hi,
We found a problem: when a RHEL 6 VM reboots (while sitting in the GRUB countdown UI), migrating the VM results in a memory difference between the source and destination sides. The difference always resides in GPA 0xA0000~0xC0000, i.e. the SMRAM area.

Occasionally this results in a VM instruction emulation error on the destination side.

After some digging, I think this is because in the migration code, in migration_bitmap_sync(), only the dirty bitmaps of the memory slots in address_space_memory are fetched from kvm-kmod; the dirty bitmap of the SMRAM memory slot, which lives in smram_address_space, is not fetched from kvm-kmod, so modifications to SMRAM on the source side are not sent to the destination side.

I tried the following patch, and the phenomenon no longer occurs. Do you think this patch is OK, or do you have a better idea? Thanks.

Comments

Paolo Bonzini Sept. 14, 2016, 9:05 a.m. UTC | #1
On 14/09/2016 09:55, Herongguang (Stephen) wrote:
> Hi,
> We found a problem: when a RHEL 6 VM reboots (while sitting in the GRUB
> countdown UI), migrating the VM results in a memory difference between
> the source and destination sides. The difference always resides in GPA
> 0xA0000~0xC0000, i.e. the SMRAM area.
> 
> Occasionally this results in a VM instruction emulation error on the
> destination side.
> 
> After some digging, I think this is because in the migration code, in
> migration_bitmap_sync(), only the dirty bitmaps of the memory slots in
> address_space_memory are fetched from kvm-kmod; the dirty bitmap of the
> SMRAM memory slot, which lives in smram_address_space, is not fetched
> from kvm-kmod, so modifications to SMRAM on the source side are not
> sent to the destination side.
> 
> I tried the following patch, and the phenomenon no longer occurs. Do
> you think this patch is OK, or do you have a better idea? Thanks.

Nice catch!

I think the right solution here is to sync all RAM memory regions
instead of the address spaces.  You can do that by putting a notifier in
MemoryRegion; register the notifier in all the RAM creation functions
(basically after every mr->ram=true or mr->rom_device=true), and
unregister it in memory_region_destructor_ram.

Thanks,

Paolo

> diff --git a/migration/ram.c b/migration/ram.c
> index a3d70c4..1cc4360 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -607,6 +607,8 @@ static void migration_bitmap_sync_init(void)
>      iterations_prev = 0;
>  }
> 
> +extern AddressSpace smram_address_space;
> +
>  static void migration_bitmap_sync(void)
>  {
>      RAMBlock *block;
> @@ -627,6 +629,7 @@ static void migration_bitmap_sync(void)
> 
>      trace_migration_bitmap_sync_start();
>      address_space_sync_dirty_bitmap(&address_space_memory);
> +    address_space_sync_dirty_bitmap(&smram_address_space);
> 
>      qemu_mutex_lock(&migration_bitmap_mutex);
>      rcu_read_lock();
> diff --git a/target-i386/kvm.c b/target-i386/kvm.c
> index d1a25c5..b98fe22 100644
> --- a/target-i386/kvm.c
> +++ b/target-i386/kvm.c
> @@ -1111,7 +1111,7 @@ static int kvm_get_supported_msrs(KVMState *s)
> 
>  static Notifier smram_machine_done;
>  static KVMMemoryListener smram_listener;
> -static AddressSpace smram_address_space;
> +AddressSpace smram_address_space;
>  static MemoryRegion smram_as_root;
>  static MemoryRegion smram_as_mem;
> 
> 
>
Herongguang (Stephen) Sept. 22, 2016, 1:16 p.m. UTC | #2
On 2016/9/14 17:05, Paolo Bonzini wrote:
>
> Nice catch!
>
> I think the right solution here is to sync all RAM memory regions
> instead of the address spaces.  You can do that by putting a notifier in
> MemoryRegion; register the notifier in all the RAM creation functions
> (basically after every mr->ram=true or mr->rom_device=true), and
> unregister it in memory_region_destructor_ram.
>
> Thanks,
>
> Paolo
>

I have some concerns:
1. For example, vhost does not know about as_id. I wonder whether guests in SMM can operate the disk or ethernet card, since in
that case vhost would not log dirty pages correctly, as it does not know the as_id.

2. If a memory region is disabled/enabled/disabled frequently: since disabled memory regions are removed
from the memory slots in kvm-kmod, their dirty pages would be discarded in kvm-kmod and QEMU while disabled, and thus go missing.
Is my assumption correct?

3. I agree with your opinion that the right solution is to get dirty-page information for all memory regions from
kvm-kmod. But I found it somewhat hard to implement, since kvm_log_sync() expects a MemoryRegionSection*
parameter. Do you have a good idea?

As to all the RAM memory regions, I think they are all in ram_list.blocks, so there is no need to create
a notifier. Is this correct?
Herongguang (Stephen) Sept. 23, 2016, 1:11 a.m. UTC | #3
On 2016/9/22 21:16, Herongguang (Stephen) wrote:
> I have some concerns:
> 1. For example, vhost does not know about as_id. I wonder whether guests in SMM can operate the disk or ethernet card, since in
> that case vhost would not log dirty pages correctly, as it does not know the as_id.
>
> 2. If a memory region is disabled/enabled/disabled frequently: since disabled memory regions are removed
> from the memory slots in kvm-kmod, their dirty pages would be discarded in kvm-kmod and QEMU while disabled, and thus go missing.
> Is my assumption correct?
After reviewing the code, I think question 2 is not an issue, as QEMU syncs the dirty pages before removing memory slots in kvm_set_phys_mem().

>
> 3. I agree with your opinion that the right solution is to get dirty-page information for all memory regions from
> kvm-kmod. But I found it somewhat hard to implement, since kvm_log_sync() expects a MemoryRegionSection*
> parameter. Do you have a good idea?
>
> As to all the RAM memory regions, I think they are all in ram_list.blocks, so there is no need to create
> a notifier. Is this correct?

Patch

diff --git a/migration/ram.c b/migration/ram.c
index a3d70c4..1cc4360 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -607,6 +607,8 @@  static void migration_bitmap_sync_init(void)
      iterations_prev = 0;
  }

+extern AddressSpace smram_address_space;
+
  static void migration_bitmap_sync(void)
  {
      RAMBlock *block;
@@ -627,6 +629,7 @@  static void migration_bitmap_sync(void)

      trace_migration_bitmap_sync_start();
      address_space_sync_dirty_bitmap(&address_space_memory);
+    address_space_sync_dirty_bitmap(&smram_address_space);

      qemu_mutex_lock(&migration_bitmap_mutex);
      rcu_read_lock();
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index d1a25c5..b98fe22 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -1111,7 +1111,7 @@  static int kvm_get_supported_msrs(KVMState *s)

  static Notifier smram_machine_done;
  static KVMMemoryListener smram_listener;
-static AddressSpace smram_address_space;
+AddressSpace smram_address_space;
  static MemoryRegion smram_as_root;
  static MemoryRegion smram_as_mem;