Message ID: CADiFPYJSjeXf_zO756vUcn6UMELQHYeW9kmCZ+ynDJstSuFHyQ@mail.gmail.com
State: New
Peter Feiner <peter@gridcentric.ca> wrote:
> Enables providing a backing file for the PC's ram. The file is specified by the
> new -pcram-file option. The file is mmap'd shared, so the RAMBlock that it backs
> doesn't need to be saved by vm_save / migration.
>
> Signed-off-by: Peter Feiner <peter@gridcentric.com>

Hi

Do you have any performance numbers for this? And examples of how you
are using it?

> +#ifdef __linux__
> +    new_block->host = mem_file_ram_alloc(new_block, size);
> +    if (new_block->host) {
> +        assert(!host);
> +    } else
> +#endif
>      if (host) {

This test is (at least) suspicious. Shouldn't we check first whether host
is non-NULL? (Not that I fully understand that part.)

Thanks, Juan.
> Hi

Hi Juan,

Sorry for taking so long to reply -- my email filters apparently aren't
set up correctly!

> Do you have any performance numbers for this? And examples of how you
> are using it?

The performance should depend only on the VMA backing the file, plus any
indirect overhead caused by MMU synchronization. If the file is a disk file
that gets flushed from the buffer cache frequently, then performance will be
abysmal. However, if the file is guaranteed to be in-core (e.g., mounted on a
ramfs), then KVM will hit the same kernel code paths as a file backed by an
anonymous VMA that isn't swapped out.

Our principal use case is implementing VM migration techniques. We're
particularly interested in memory migration. Right now, QEMU implements VM
migration, but QEMU's migration mechanism is inflexible with respect to
memory. That is, the entire contents of the VM's RAM are copied from the
migration source to the migration destination before the destination VM can
run. With the current VM migration implementation, it's impossible, for
instance, to allow the destination VM to start immediately and lazily fetch
its RAM. With the -pcram-file option, we could specify a file for the RAM
that's backed by a filesystem that fetches pages on demand over the network.

>> +#ifdef __linux__
>> +    new_block->host = mem_file_ram_alloc(new_block, size);
>> +    if (new_block->host) {
>> +        assert(!host);
>> +    } else
>> +#endif
>>      if (host) {
>
> This test is (at least) suspicious. Shouldn't we check first whether host
> is non-NULL? (Not that I fully understand that part.)

Here's what I'm really testing:

    (host == NULL) or (there's no ram file for this RAMBlock)

Here's my rationale: if there's a ram file for this RAMBlock, then the user
of QEMU expects the RAMBlock to be backed by some file. Presumably the VM
wouldn't run correctly otherwise (e.g., in the case of migration).
However, if QEMU passed host != NULL into qemu_ram_alloc_from_ptr, then it
expects the RAMBlock to be backed by something else; if the RAMBlock were
backed by something other than the passed-in host memory, then the VM
presumably wouldn't work properly in this case either. Hence it's an error
for host to be non-NULL while a ram file exists for this RAMBlock (a
condition indicated by mem_file_ram_alloc returning non-NULL).

It's up to the caller of add_memory_file to know whether the RAMBlock named
by idstr is normally allocated by qemu_ram_alloc_from_ptr. That's why the
exposed command-line option is "-pcram-file file" rather than
"-memory-for-arbitrary-ram-block idstr=x,file".

I hope this clears some things up!

Peter
>> Do you have any performance numbers for this? And examples of how you
>> are using it?
>
> Our principal use case is implementing VM migration techniques.

There are other uses of a RAM file interface that I can imagine:

- debugging, e.g., inspecting the memory of a VM after it has crashed
- security research, e.g., extracting passwords from a running VM
Hi,

Is there any interest in this feature?

BTW, as far as I can tell, on qemu-devel I'm not supposed to re-post the
patch or post a v2 if there haven't been any specific requests for changes
to v1. Please let me know if you'd like me to submit a new patch!

Thanks,
Peter Feiner
On 12/19/2011 12:26 PM, Peter Feiner wrote:
> Hi,
>
> Is there any interest in this feature?
>
> BTW, as far as I can tell, on qemu-devel I'm not supposed to re-post
> the patch or post a v2 if there haven't been any specific requests for
> changes to v1. Please let me know if you'd like me to submit a new
> patch!

I still don't understand what the use case is other than "we use this to
implement RAM migration outside of QEMU", which is not something I'm
terribly interested in. I'd prefer you submit patches to improve RAM
migration within QEMU.

Regards,

Anthony Liguori

> Thanks,
> Peter Feiner
diff --git a/arch_init.c b/arch_init.c
index a411fdf..96e8a28 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -122,6 +122,14 @@ static int ram_save_block(QEMUFile *f)
     if (!block)
         block = QLIST_FIRST(&ram_list.blocks);
 
+    while (block->do_not_save) {
+        last_block = block;
+        block = QLIST_NEXT(block, next);
+        if (!block) {
+            return 0;
+        }
+    }
+
     current_addr = block->offset + offset;
 
     do {
@@ -185,6 +193,9 @@ static ram_addr_t ram_save_remaining(void)
     QLIST_FOREACH(block, &ram_list.blocks, next) {
         ram_addr_t addr;
+        if (block->do_not_save) {
+            continue;
+        }
         for (addr = block->offset; addr < block->offset + block->length;
              addr += TARGET_PAGE_SIZE) {
             if (cpu_physical_memory_get_dirty(addr, MIGRATION_DIRTY_FLAG)) {
diff --git a/cpu-all.h b/cpu-all.h
index 5f47ab8..a78f38c 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -482,6 +482,7 @@ typedef struct RAMBlock {
     uint32_t flags;
     char idstr[256];
     QLIST_ENTRY(RAMBlock) next;
+    int do_not_save;
 #if defined(__linux__) && !defined(TARGET_S390X)
     int fd;
 #endif
@@ -493,6 +494,17 @@ typedef struct RAMList {
 } RAMList;
 extern RAMList ram_list;
 
+typedef struct MemFile {
+    const char *idstr;
+    const char *path;
+    QLIST_ENTRY(MemFile) next;
+} MemFile;
+
+typedef struct MemFileList {
+    QLIST_HEAD(files, MemFile) files;
+} MemFileList;
+extern MemFileList mem_file_list;
+
 extern const char *mem_path;
 extern int mem_prealloc;
diff --git a/exec.c b/exec.c
index 6b92198..9a1cbca 100644
--- a/exec.c
+++ b/exec.c
@@ -117,6 +117,8 @@ static MemoryRegion *system_io;
 #endif
 
+MemFileList mem_file_list = { .files = QLIST_HEAD_INITIALIZER(mem_file_list) };
+
 CPUState *first_cpu;
 /* current CPU in the current thread. It is only valid inside
    cpu_exec() */
@@ -2774,6 +2776,59 @@ void qemu_flush_coalesced_mmio_buffer(void)
         kvm_flush_coalesced_mmio_buffer();
 }
 
+#ifdef __linux__
+static void *mem_file_ram_alloc(RAMBlock *block,
+                                ram_addr_t memory)
+{
+    void *host;
+    MemFile *mf;
+    struct stat buf;
+    int ret;
+
+    QLIST_FOREACH(mf, &mem_file_list.files, next) {
+        if (strcmp(mf->idstr, block->idstr)) {
+            continue;
+        }
+
+        if (kvm_enabled() && !kvm_has_sync_mmu()) {
+            fprintf(stderr, "host lacks kvm mmu notifiers, "
+                    "MemFile unsupported, abort!\n");
+            abort();
+        }
+
+        block->fd = open(mf->path, O_RDWR);
+        if (block->fd == -1) {
+            fprintf(stderr, "Could not open %s for RAMBlock %s, abort!\n",
+                    mf->path, mf->idstr);
+            abort();
+        }
+        ret = fstat(block->fd, &buf);
+        if (ret != 0) {
+            fprintf(stderr, "Could not stat %s for RAMBlock %s, abort!\n",
+                    mf->path, mf->idstr);
+            abort();
+        }
+        if (buf.st_size != memory) {
+            fprintf(stderr,
+                    "File %s has size %luB. RAMBlock %s expects %luB. Abort!\n",
+                    mf->path, buf.st_size, block->idstr, memory);
+            abort();
+        }
+
+        host = mmap(NULL, memory, PROT_READ | PROT_WRITE, MAP_SHARED,
+                    block->fd, 0);
+        if (host == MAP_FAILED) {
+            fprintf(stderr, "Failed to mmap %s for RAMBlock %s, abort!\n",
+                    mf->path, mf->idstr);
+            abort();
+        }
+        block->do_not_save = 1;
+        return host;
+    }
+    return NULL;
+}
+#endif
+
 #if defined(__linux__) && !defined(TARGET_S390X)
 
 #include <sys/vfs.h>
@@ -2914,6 +2969,28 @@ static ram_addr_t last_ram_offset(void)
     return last;
 }
 
+void add_memory_file(const char *idstr, const char *path)
+{
+#ifndef __linux__
+    fprintf(stderr, "MemFile only supported on Linux, abort!\n");
+    abort();
+#else
+    MemFile *mf;
+
+    QLIST_FOREACH(mf, &mem_file_list.files, next) {
+        if (!strcmp(mf->idstr, idstr)) {
+            fprintf(stderr, "MemFile for \"%s\" already specified, abort!\n",
+                    idstr);
+            abort();
+        }
+    }
+    mf = g_malloc0(sizeof(*mf));
+    mf->idstr = idstr;
+    mf->path = path;
+    QLIST_INSERT_HEAD(&mem_file_list.files, mf, next);
+#endif
+}
+
 ram_addr_t qemu_ram_alloc_from_ptr(DeviceState *dev, const char *name,
                                    ram_addr_t size, void *host)
 {
@@ -2940,6 +3017,12 @@ ram_addr_t qemu_ram_alloc_from_ptr(DeviceState *dev, const char *name,
     }
     new_block->offset = find_ram_offset(size);
 
+#ifdef __linux__
+    new_block->host = mem_file_ram_alloc(new_block, size);
+    if (new_block->host) {
+        assert(!host);
+    } else
+#endif
     if (host) {
         new_block->host = host;
         new_block->flags |= RAM_PREALLOC_MASK;
diff --git a/qemu-common.h b/qemu-common.h
index 2ce47aa..41adbac 100644
--- a/qemu-common.h
+++ b/qemu-common.h
@@ -306,6 +306,8 @@ char *os_find_datadir(const char *argv0);
 void os_parse_cmd_args(int index, const char *optarg);
 void os_pidfile_error(void);
 
+void add_memory_file(const char *idstr, const char *path);
+
 /* Convert a byte between binary and BCD. */
 static inline uint8_t to_bcd(uint8_t val)
 {
diff --git a/qemu-options.hx b/qemu-options.hx
index 681eaf1..25b7c38 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -387,6 +387,19 @@ Preallocate memory when using -mem-path.
 ETEXI
 #endif
 
+#ifdef __linux__
+DEF("pcram-file", HAS_ARG, QEMU_OPTION_pcram_file,
+    "-pcram-file FILE provide backing storage for PC RAM\n", QEMU_ARCH_I386)
+STEXI
+@item -pcram-file @var{path}
+Populate guest PC RAM with memory mapped file @var{path}. All changes to guest
+ram are reflected in the file (i.e., it is a @code{MAP_SHARED} mapping).
+
+PC RAM is neither migrated nor saved.
+ETEXI
+#endif
+
 DEF("k", HAS_ARG, QEMU_OPTION_k,
     "-k language use keyboard layout (for example 'fr' for French)\n",
     QEMU_ARCH_ALL)
diff --git a/vl.c b/vl.c
index f5afed4..2d28797 100644
--- a/vl.c
+++ b/vl.c
@@ -2549,6 +2549,9 @@ int main(int argc, char **argv, char **envp)
                 ram_size = value;
                 break;
             }
+            case QEMU_OPTION_pcram_file:
+                add_memory_file("pc.ram", optarg);
+                break;
             case QEMU_OPTION_mempath:
                 mem_path = optarg;
                 break;
Enables providing a backing file for the PC's ram. The file is specified by
the new -pcram-file option. The file is mmap'd shared, so the RAMBlock that
it backs doesn't need to be saved by vm_save / migration.

Signed-off-by: Peter Feiner <peter@gridcentric.com>
---
We have found this small feature very useful for experimenting with memory
migration techniques. By exposing PC memory through a simple interface
(i.e., the filesystem), we can implement various memory migration techniques
independently of QEMU. For example, one can map a VM's ram to a file being
served over a network, thus implementing on-demand fetching. In the future,
RAMBlocks could be mmap'd privately to implement memory sharing.

Note that unlike the existing -mem-path option, which specifies a
(hugetlbfs) directory in which files for all RAMBlocks are to be created,
-pcram-file specifies a file to be mapped for the "pc.ram" RAMBlock.

 arch_init.c     |   11 +++++++
 cpu-all.h       |   12 ++++++++
 exec.c          |   83 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 qemu-common.h   |    2 +
 qemu-options.hx |   13 ++++++++
 vl.c            |    3 ++
 6 files changed, 124 insertions(+), 0 deletions(-)