Message ID | 1675796613-235716-1-git-send-email-steven.sistare@oracle.com |
---|---|
State | New |
Headers | show |
Series | [V2] memory: RAM_NAMED_FILE flag | expand |
On Tue, Feb 07, 2023 at 11:03:33AM -0800, Steve Sistare wrote: > migrate_ignore_shared() is an optimization that avoids copying memory > that is visible and can be mapped on the target. However, a > memory-backend-ram or a memory-backend-memfd block with the RAM_SHARED > flag set is not migrated when migrate_ignore_shared() is true. This is > wrong, because the block has no named backing store, and its contents will > be lost. To fix, ignore shared memory iff it is a named file. Define a > new flag RAM_NAMED_FILE to distinguish this case. There's also TYPE_MEMORY_BACKEND_EPC. Reading the commit message it seems it can still be used in similar ways. Pasting commit message from c6c0232: Because of its unique requirements, Linux manages EPC separately from normal memory. Similar to memfd, the device /dev/sgx_vepc can be opened to obtain a file descriptor which can in turn be used to mmap() EPC memory. I'm not sure whether it means that should apply for RAM_NAMED_FILE too, neither do I think it's super important.. Still better to define it properly. Another comment is, AFAIK this patch will modify senamtics of the old capability "x-ignore-shared". But I'd say in a sensible way. Maybe worth directly modify qapi/migration.json to reflect it (especially it's x- prefixed) to something like: # @x-ignore-shared: If enabled, QEMU will not migrate named shared memory # (since 4.0) Thanks,
On 2/7/2023 3:23 PM, Peter Xu wrote: > On Tue, Feb 07, 2023 at 11:03:33AM -0800, Steve Sistare wrote: >> migrate_ignore_shared() is an optimization that avoids copying memory >> that is visible and can be mapped on the target. However, a >> memory-backend-ram or a memory-backend-memfd block with the RAM_SHARED >> flag set is not migrated when migrate_ignore_shared() is true. This is >> wrong, because the block has no named backing store, and its contents will >> be lost. To fix, ignore shared memory iff it is a named file. Define a >> new flag RAM_NAMED_FILE to distinguish this case. > > There's also TYPE_MEMORY_BACKEND_EPC. Reading the commit message it seems > it can still be used in similar ways. Pasting commit message from c6c0232: > > Because of its unique requirements, Linux manages EPC separately from > normal memory. Similar to memfd, the device /dev/sgx_vepc can be > opened to obtain a file descriptor which can in turn be used to mmap() > EPC memory. > > I'm not sure whether it means that should apply for RAM_NAMED_FILE too, > neither do I think it's super important.. Still better to define it > properly. The RAM_NAMED_FILE flag will be false for TYPE_MEMORY_BACKEND_EPC, so ramblock_is_ignored will return false, and the contents will be copied over the socket to the target, and the segment will be recreated there. However, perhaps I do not understand your point. > Another comment is, AFAIK this patch will modify senamtics of the old > capability "x-ignore-shared". But I'd say in a sensible way. Maybe worth > directly modify qapi/migration.json to reflect it (especially it's x- > prefixed) to something like: > > # @x-ignore-shared: If enabled, QEMU will not migrate named shared memory > # (since 4.0) Good idea. I propose: # @x-ignore-shared: If enabled, QEMU will not migrate shared memory that is # accessible on the target. (since 4.0) - Steve
On Wed, Feb 08, 2023 at 01:34:18PM -0500, Steven Sistare wrote: > On 2/7/2023 3:23 PM, Peter Xu wrote: > > On Tue, Feb 07, 2023 at 11:03:33AM -0800, Steve Sistare wrote: > >> migrate_ignore_shared() is an optimization that avoids copying memory > >> that is visible and can be mapped on the target. However, a > >> memory-backend-ram or a memory-backend-memfd block with the RAM_SHARED > >> flag set is not migrated when migrate_ignore_shared() is true. This is > >> wrong, because the block has no named backing store, and its contents will > >> be lost. To fix, ignore shared memory iff it is a named file. Define a > >> new flag RAM_NAMED_FILE to distinguish this case. > > > > There's also TYPE_MEMORY_BACKEND_EPC. Reading the commit message it seems > > it can still be used in similar ways. Pasting commit message from c6c0232: > > > > Because of its unique requirements, Linux manages EPC separately from > > normal memory. Similar to memfd, the device /dev/sgx_vepc can be > > opened to obtain a file descriptor which can in turn be used to mmap() > > EPC memory. > > > > I'm not sure whether it means that should apply for RAM_NAMED_FILE too, > > neither do I think it's super important.. Still better to define it > > properly. > > The RAM_NAMED_FILE flag will be false for TYPE_MEMORY_BACKEND_EPC, so > ramblock_is_ignored will return false, and the contents will be copied > over the socket to the target, and the segment will be recreated there. > > However, perhaps I do not understand your point. My point was it looked like it should apply RAM_NAMED_FILE too, because it's also a named file. But.. I don't think another QEMU can share the same data if opening the same file. Based on my zero knowledge on it... I quickly looked up sgx_vepc_open() in the Linux impl where sgx_vepc.page_array plays a role of page cache iiuc, while it's private per vma even if VM_SHARED. So please ignore my comment.. > > > Another comment is, AFAIK this patch will modify senamtics of the old > > capability "x-ignore-shared". But I'd say in a sensible way. Maybe worth > > directly modify qapi/migration.json to reflect it (especially it's x- > > prefixed) to something like: > > > > # @x-ignore-shared: If enabled, QEMU will not migrate named shared memory > > # (since 4.0) > > Good idea. I propose: > > # @x-ignore-shared: If enabled, QEMU will not migrate shared memory that is > # accessible on the target. (since 4.0) I would use s/target/destination host/ because target can be misread as the VM itself (at least a general term in migration code base). Other than that looks good to me. Thanks,
On 2/8/2023 2:22 PM, Peter Xu wrote: > On Wed, Feb 08, 2023 at 01:34:18PM -0500, Steven Sistare wrote: >> On 2/7/2023 3:23 PM, Peter Xu wrote: >>> On Tue, Feb 07, 2023 at 11:03:33AM -0800, Steve Sistare wrote: >>>> migrate_ignore_shared() is an optimization that avoids copying memory >>>> that is visible and can be mapped on the target. However, a >>>> memory-backend-ram or a memory-backend-memfd block with the RAM_SHARED >>>> flag set is not migrated when migrate_ignore_shared() is true. This is >>>> wrong, because the block has no named backing store, and its contents will >>>> be lost. To fix, ignore shared memory iff it is a named file. Define a >>>> new flag RAM_NAMED_FILE to distinguish this case. >>> >>> There's also TYPE_MEMORY_BACKEND_EPC. Reading the commit message it seems >>> it can still be used in similar ways. Pasting commit message from c6c0232: >>> >>> Because of its unique requirements, Linux manages EPC separately from >>> normal memory. Similar to memfd, the device /dev/sgx_vepc can be >>> opened to obtain a file descriptor which can in turn be used to mmap() >>> EPC memory. >>> >>> I'm not sure whether it means that should apply for RAM_NAMED_FILE too, >>> neither do I think it's super important.. Still better to define it >>> properly. >> >> The RAM_NAMED_FILE flag will be false for TYPE_MEMORY_BACKEND_EPC, so >> ramblock_is_ignored will return false, and the contents will be copied >> over the socket to the target, and the segment will be recreated there. >> >> However, perhaps I do not understand your point. > > My point was it looked like it should apply RAM_NAMED_FILE too, because > it's also a named file. > > But.. I don't think another QEMU can share the same data if opening the > same file. Based on my zero knowledge on it... I quickly looked up > sgx_vepc_open() in the Linux impl where sgx_vepc.page_array plays a role of > page cache iiuc, while it's private per vma even if VM_SHARED. > > So please ignore my comment.. > >> >>> Another comment is, AFAIK this patch will modify senamtics of the old >>> capability "x-ignore-shared". But I'd say in a sensible way. Maybe worth >>> directly modify qapi/migration.json to reflect it (especially it's x- >>> prefixed) to something like: >>> >>> # @x-ignore-shared: If enabled, QEMU will not migrate named shared memory >>> # (since 4.0) >> >> Good idea. I propose: >> >> # @x-ignore-shared: If enabled, QEMU will not migrate shared memory that is >> # accessible on the target. (since 4.0) > > I would use s/target/destination host/ because target can be misread as the > VM itself (at least a general term in migration code base). Other than > that looks good to me. Hi Peter, I will rebase to the tip, modify the wording, and repost. Can I add your RB? - Steve
On Wed, Jun 07, 2023 at 10:05:11AM -0400, Steven Sistare wrote: > On 2/8/2023 2:22 PM, Peter Xu wrote: > > On Wed, Feb 08, 2023 at 01:34:18PM -0500, Steven Sistare wrote: > >> On 2/7/2023 3:23 PM, Peter Xu wrote: > >>> On Tue, Feb 07, 2023 at 11:03:33AM -0800, Steve Sistare wrote: > >>>> migrate_ignore_shared() is an optimization that avoids copying memory > >>>> that is visible and can be mapped on the target. However, a > >>>> memory-backend-ram or a memory-backend-memfd block with the RAM_SHARED > >>>> flag set is not migrated when migrate_ignore_shared() is true. This is > >>>> wrong, because the block has no named backing store, and its contents will > >>>> be lost. To fix, ignore shared memory iff it is a named file. Define a > >>>> new flag RAM_NAMED_FILE to distinguish this case. > >>> > >>> There's also TYPE_MEMORY_BACKEND_EPC. Reading the commit message it seems > >>> it can still be used in similar ways. Pasting commit message from c6c0232: > >>> > >>> Because of its unique requirements, Linux manages EPC separately from > >>> normal memory. Similar to memfd, the device /dev/sgx_vepc can be > >>> opened to obtain a file descriptor which can in turn be used to mmap() > >>> EPC memory. > >>> > >>> I'm not sure whether it means that should apply for RAM_NAMED_FILE too, > >>> neither do I think it's super important.. Still better to define it > >>> properly. > >> > >> The RAM_NAMED_FILE flag will be false for TYPE_MEMORY_BACKEND_EPC, so > >> ramblock_is_ignored will return false, and the contents will be copied > >> over the socket to the target, and the segment will be recreated there. > >> > >> However, perhaps I do not understand your point. > > > > My point was it looked like it should apply RAM_NAMED_FILE too, because > > it's also a named file. > > > > But.. I don't think another QEMU can share the same data if opening the > > same file. Based on my zero knowledge on it... I quickly looked up > > sgx_vepc_open() in the Linux impl where sgx_vepc.page_array plays a role of > > page cache iiuc, while it's private per vma even if VM_SHARED. > > > > So please ignore my comment.. > > > >> > >>> Another comment is, AFAIK this patch will modify senamtics of the old > >>> capability "x-ignore-shared". But I'd say in a sensible way. Maybe worth > >>> directly modify qapi/migration.json to reflect it (especially it's x- > >>> prefixed) to something like: > >>> > >>> # @x-ignore-shared: If enabled, QEMU will not migrate named shared memory > >>> # (since 4.0) > >> > >> Good idea. I propose: > >> > >> # @x-ignore-shared: If enabled, QEMU will not migrate shared memory that is > >> # accessible on the target. (since 4.0) > > > > I would use s/target/destination host/ because target can be misread as the > > VM itself (at least a general term in migration code base). Other than > > that looks good to me. > > Hi Peter, I will rebase to the tip, modify the wording, and repost. > Can I add your RB? Yes. thanks,
diff --git a/backends/hostmem-file.c b/backends/hostmem-file.c index 2514128..486d77d 100644 --- a/backends/hostmem-file.c +++ b/backends/hostmem-file.c @@ -56,6 +56,7 @@ file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp) ram_flags = backend->share ? RAM_SHARED : 0; ram_flags |= backend->reserve ? 0 : RAM_NORESERVE; ram_flags |= fb->is_pmem ? RAM_PMEM : 0; + ram_flags |= RAM_NAMED_FILE; memory_region_init_ram_from_file(&backend->mr, OBJECT(backend), name, backend->size, fb->align, ram_flags, fb->mem_path, fb->readonly, errp); diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h index 6feaa40..8173e3b 100644 --- a/include/exec/cpu-common.h +++ b/include/exec/cpu-common.h @@ -92,6 +92,7 @@ void qemu_ram_set_uf_zeroable(RAMBlock *rb); bool qemu_ram_is_migratable(RAMBlock *rb); void qemu_ram_set_migratable(RAMBlock *rb); void qemu_ram_unset_migratable(RAMBlock *rb); +bool qemu_ram_is_named_file(RAMBlock *rb); int qemu_ram_get_fd(RAMBlock *rb); size_t qemu_ram_pagesize(RAMBlock *block); diff --git a/include/exec/memory.h b/include/exec/memory.h index 2e602a2..3224e09 100644 --- a/include/exec/memory.h +++ b/include/exec/memory.h @@ -232,6 +232,9 @@ typedef struct IOMMUTLBEvent { /* RAM that isn't accessible through normal means. */ #define RAM_PROTECTED (1 << 8) +/* RAM is an mmap-ed named file */ +#define RAM_NAMED_FILE (1 << 9) + static inline void iommu_notifier_init(IOMMUNotifier *n, IOMMUNotify fn, IOMMUNotifierFlag flags, hwaddr start, hwaddr end, diff --git a/migration/ram.c b/migration/ram.c index 334309f..03f8131 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -190,7 +190,8 @@ static bool postcopy_preempt_active(void) bool ramblock_is_ignored(RAMBlock *block) { return !qemu_ram_is_migratable(block) || - (migrate_ignore_shared() && qemu_ram_is_shared(block)); + (migrate_ignore_shared() && qemu_ram_is_shared(block) && + qemu_ram_is_named_file(block)); } #undef RAMBLOCK_FOREACH diff --git a/softmmu/physmem.c b/softmmu/physmem.c index cb998cd..767a5f0 100644 --- a/softmmu/physmem.c +++ b/softmmu/physmem.c @@ -1749,6 +1749,11 @@ void qemu_ram_unset_migratable(RAMBlock *rb) rb->flags &= ~RAM_MIGRATABLE; } +bool qemu_ram_is_named_file(RAMBlock *rb) +{ + return rb->flags & RAM_NAMED_FILE; +} + int qemu_ram_get_fd(RAMBlock *rb) { return rb->fd; @@ -2059,7 +2064,7 @@ RAMBlock *qemu_ram_alloc_from_fd(ram_addr_t size, MemoryRegion *mr, /* Just support these ram flags by now. */ assert((ram_flags & ~(RAM_SHARED | RAM_PMEM | RAM_NORESERVE | - RAM_PROTECTED)) == 0); + RAM_PROTECTED | RAM_NAMED_FILE)) == 0); if (xen_enabled()) { error_setg(errp, "-mem-path not supported with Xen");
migrate_ignore_shared() is an optimization that avoids copying memory that is visible and can be mapped on the target. However, a memory-backend-ram or a memory-backend-memfd block with the RAM_SHARED flag set is not migrated when migrate_ignore_shared() is true. This is wrong, because the block has no named backing store, and its contents will be lost. To fix, ignore shared memory iff it is a named file. Define a new flag RAM_NAMED_FILE to distinguish this case. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> --- backends/hostmem-file.c | 1 + include/exec/cpu-common.h | 1 + include/exec/memory.h | 3 +++ migration/ram.c | 3 ++- softmmu/physmem.c | 7 ++++++- 5 files changed, 13 insertions(+), 2 deletions(-)