Message ID | 1353473965-30678-1-git-send-email-david@gibson.dropbear.id.au |
---|---|
State | New |
On Wed, Nov 21, 2012 at 03:59:25PM +1100, David Gibson wrote:
> madvise(DONTNEED) will throw away the contents of the whole page at the
> given address, even if the given length is less than the page size. One
> can argue about whether that's the correct behaviour, but that's what it's
> done for a long time in Linux at least.
>
> That means that the madvise() in ram_load(), on a setup where
> TARGET_PAGE_SIZE is smaller than the host page size, can throw away data
> in guest pages adjacent to the one it's actually processing right now,
> leading to guest memory corruption on an incoming migration.
>
> This patch therefore, disables the madvise() if the host page size is
> larger than TARGET_PAGE_SIZE. This means we don't get the benefits of that
> madvise() in this case, but a more complete fix is more difficult to
> accomplish. This at least fixes the guest memory corruption.

So, discussing the more complete fix here.

The first idea which occurred to me was to instead madvise(DONTNEED) the
entire memory block in the RAM_SAVE_FLAG_MEM_SIZE phase. Then skip the
memset() in the RAM_SAVE_FLAG_COMPRESS path if ch == 0, so that we don't
force unneeded zero pages back into host memory.

But that would be a bug in the case where the page is initially non-zero,
we migrate its contents, but then the page is zeroed on the outgoing guest
before the live migration completes. To handle that we'd need some kind of
dirty flag on the incoming side recording if a page has already been loaded
once or not. I don't know if anything suitable exists.

Any thoughts on how to implement a more thorough fix than the one in my
patch?
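To make the "dirty flag on the incoming side" idea concrete, here is a minimal standalone sketch of a per-page "already loaded" bitmap. It is not QEMU code: LoadedMap, loaded_map_test_and_set(), load_zero_page() and load_data_page() are all hypothetical names invented for illustration, and the sketch assumes the whole RAM block was discarded up front so that an untouched page still reads as zero.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of QEMU: one bit per guest page. */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

typedef struct {
    unsigned long *bits;   /* bit set => page has been loaded at least once */
    size_t npages;
} LoadedMap;

static bool loaded_map_init(LoadedMap *m, size_t npages)
{
    size_t words = (npages + BITS_PER_WORD - 1) / BITS_PER_WORD;

    m->bits = calloc(words ? words : 1, sizeof(unsigned long));
    m->npages = npages;
    return m->bits != NULL;
}

static void loaded_map_free(LoadedMap *m)
{
    free(m->bits);
    m->bits = NULL;
    m->npages = 0;
}

/* Mark the page as loaded and report whether it had been loaded before. */
static bool loaded_map_test_and_set(LoadedMap *m, size_t page)
{
    assert(page < m->npages);
    unsigned long mask = 1UL << (page % BITS_PER_WORD);
    unsigned long *word = &m->bits[page / BITS_PER_WORD];
    bool was_set = (*word & mask) != 0;

    *word |= mask;
    return was_set;
}

/*
 * Incoming all-zero page. If the page was never loaded, the up-front
 * discard already left it zero, so the memset() can be skipped.
 * Returns true if a memset() was actually needed.
 */
static bool load_zero_page(LoadedMap *m, unsigned char *ram,
                           size_t page, size_t page_size)
{
    if (!loaded_map_test_and_set(m, page)) {
        return false;
    }
    memset(ram + page * page_size, 0, page_size);
    return true;
}

/* Incoming page with real contents: copy it and record that it was loaded. */
static void load_data_page(LoadedMap *m, unsigned char *ram, size_t page,
                           size_t page_size, const unsigned char *src)
{
    memcpy(ram + page * page_size, src, page_size);
    (void)loaded_map_test_and_set(m, page);
}
```

With this shape, a page that arrives non-zero and is later re-sent as zero gets an explicit memset(), while a page that only ever arrives as zero is never touched. Whether an existing structure on the incoming side could play this role is exactly the open question above.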
On Wed, Nov 21, 2012 at 03:59:25PM +1100, David Gibson wrote:
> madvise(DONTNEED) will throw away the contents of the whole page at the
> given address, even if the given length is less than the page size. One
> can argue about whether that's the correct behaviour, but that's what it's
> done for a long time in Linux at least.
>
> That means that the madvise() in ram_load(), on a setup where
> TARGET_PAGE_SIZE is smaller than the host page size, can throw away data
> in guest pages adjacent to the one it's actually processing right now,
> leading to guest memory corruption on an incoming migration.
>
> This patch therefore, disables the madvise() if the host page size is
> larger than TARGET_PAGE_SIZE. This means we don't get the benefits of that
> madvise() in this case, but a more complete fix is more difficult to
> accomplish. This at least fixes the guest memory corruption.
>
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

Sorry, forgot to add:

Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
diff --git a/arch_init.c b/arch_init.c
index b75a4c5..83dcc53 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -840,7 +840,8 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
             memset(host, ch, TARGET_PAGE_SIZE);
 #ifndef _WIN32
             if (ch == 0 &&
-                (!kvm_enabled() || kvm_has_sync_mmu())) {
+                (!kvm_enabled() || kvm_has_sync_mmu()) &&
+                getpagesize() <= TARGET_PAGE_SIZE) {
                 qemu_madvise(host, TARGET_PAGE_SIZE, QEMU_MADV_DONTNEED);
             }
 #endif
madvise(DONTNEED) will throw away the contents of the whole page at the
given address, even if the given length is less than the page size. One
can argue about whether that's the correct behaviour, but that's what it's
done for a long time in Linux at least.

That means that the madvise() in ram_load(), on a setup where
TARGET_PAGE_SIZE is smaller than the host page size, can throw away data
in guest pages adjacent to the one it's actually processing right now,
leading to guest memory corruption on an incoming migration.

This patch therefore, disables the madvise() if the host page size is
larger than TARGET_PAGE_SIZE. This means we don't get the benefits of that
madvise() in this case, but a more complete fix is more difficult to
accomplish. This at least fixes the guest memory corruption.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 arch_init.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
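The whole-page behaviour described in the commit message can be observed outside QEMU. The following is a small standalone sketch, assuming a Linux host and a private anonymous mapping; the function name subpage_dontneed_discards_rest() is made up for this example. It fills one host page with a pattern, issues MADV_DONTNEED for only the first len bytes, and checks whether the last byte of the page was discarded too.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical demo helper, not QEMU code.
 * Returns 1 if a sub-page MADV_DONTNEED also discarded data beyond len,
 * 0 if the rest of the page survived, and -1 on setup failure.
 */
static int subpage_dontneed_discards_rest(size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    unsigned char *p;
    int discarded;

    if (page <= 0 || len == 0 || len >= (size_t)page) {
        return -1;
    }

    p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        return -1;
    }

    /* Dirty the whole page so there is data to lose. */
    memset(p, 0xAB, (size_t)page);

    /* Ask the kernel to drop only the first len bytes. */
    if (madvise(p, len, MADV_DONTNEED) != 0) {
        munmap(p, (size_t)page);
        return -1;
    }

    /* A discarded anonymous page reads back as zero on the next access. */
    discarded = (p[page - 1] == 0);

    munmap(p, (size_t)page);
    return discarded;
}
```

On Linux, calling subpage_dontneed_discards_rest(1024) is expected to return 1, which is the situation ram_load() runs into when TARGET_PAGE_SIZE is smaller than the host page size. Other operating systems may treat MADV_DONTNEED differently, so the result there is not guaranteed.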