Message ID | 5055B5C2.9040700@msgid.tls.msk.ru |
---|---|
State | New |
Ping^2 ?

Thanks,

/mjt

16.09.2012 15:19, Michael Tokarev wrote:
> So, is the patch okay?
>
> Thanks,
>
> /mjt
>
> On 15.08.2012 19:03, Michael Tokarev wrote:
>> On 15.08.2012 18:26, Avi Kivity wrote:
>>> On 08/15/2012 05:22 PM, Michael Tokarev wrote:
>>>
>>>>>
>>>>> Please provide extra info, like the setting of
>>>>> /sys/kernel/mm/transparent_hugepage/enabled.
>>>>
>>>> That was it - sort of. Default value here is enabled=madvise.
>>>> When setting it to always the effect finally started appearing,
>>>> so it is actually working.
>>>>
>>>> But can't qemu set MADV_HUGEPAGE flag too, so it works automatically?
>>>
>>> It can and should.
>>
>> Something like the attached patch?
>>
>> Thanks,
>>
>> /mjt
>
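
For readers following the quoted exchange about `/sys/kernel/mm/transparent_hugepage/enabled`: the kernel reports the available modes on one line and marks the active one with square brackets, for example `always [madvise] never`. The sketch below is only an illustration of reading that file; the helper names `thp_active_mode` and `thp_read_mode` are hypothetical and are not part of QEMU or of the patch in this thread.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy the active mode (the word inside [...]) from a
 * line such as "always [madvise] never" into out.
 * Returns 0 on success, -1 if no bracketed word is found or it does not fit. */
int thp_active_mode(const char *line, char *out, size_t outlen)
{
    assert(line != NULL && out != NULL);

    const char *open = strchr(line, '[');
    const char *close = open ? strchr(open, ']') : NULL;
    if (!open || !close) {
        return -1;
    }

    size_t len = (size_t)(close - open - 1);
    if (len + 1 > outlen) {
        return -1;
    }
    memcpy(out, open + 1, len);
    out[len] = '\0';
    return 0;
}

/* Hypothetical helper: read the sysfs file and report the active mode.
 * Returns -1 if the file is missing (for example a kernel built without
 * transparent hugepages) or cannot be parsed. */
int thp_read_mode(char *out, size_t outlen)
{
    char line[128];
    int ret = -1;
    FILE *f = fopen("/sys/kernel/mm/transparent_hugepage/enabled", "r");

    if (!f) {
        return -1;
    }
    if (fgets(line, sizeof(line), f)) {
        ret = thp_active_mode(line, out, outlen);
    }
    fclose(f);
    return ret;
}
```

With the mode reported as `madvise`, as described in the quoted text, only regions explicitly marked with `madvise(MADV_HUGEPAGE)` are candidates for huge pages, which is why the thread goes on to ask whether qemu should set that flag itself.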
Isn't ad0b5321f1f797274603ebbe20108b0750baee94 enough?

On Mon, Nov 12, 2012 at 07:18:49PM +0400, Michael Tokarev wrote:
> Ping^2 ?
>
> Thanks,
>
> /mjt
>
> 16.09.2012 15:19, Michael Tokarev wrote:
> > So, is the patch okay?
> >
> > Thanks,
> >
> > /mjt
> >
> > On 15.08.2012 19:03, Michael Tokarev wrote:
> >> On 15.08.2012 18:26, Avi Kivity wrote:
> >>> On 08/15/2012 05:22 PM, Michael Tokarev wrote:
> >>>
> >>>>>
> >>>>> Please provide extra info, like the setting of
> >>>>> /sys/kernel/mm/transparent_hugepage/enabled.
> >>>>
> >>>> That was it - sort of. Default value here is enabled=madvise.
> >>>> When setting it to always the effect finally started appearing,
> >>>> so it is actually working.
> >>>>
> >>>> But can't qemu set MADV_HUGEPAGE flag too, so it works automatically?
> >>>
> >>> It can and should.
> >>
> >> Something like the attached patch?
> >>
> >> Thanks,
> >>
> >> /mjt
> >
> >
>
On 13.11.2012 18:30, Aurelien Jarno wrote:
> Isn't ad0b5321f1f797274603ebbe20108b0750baee94 enough?

Oh. It has been applied. I expected it will be ignored just like
my patch has been.

No, it is not enough: that patch alone does nothing for the alignment
on at least x86, which is necessary for hugepages to work. My patch
_also_ fixes alignment issue.

Where to apply MADV_HUGEPAGE is a different question. I don't know
which layer it is best to apply it to.

Thanks,

/mjt

> On Mon, Nov 12, 2012 at 07:18:49PM +0400, Michael Tokarev wrote:
>> Ping^2 ?
>>
>> Thanks,
>>
>> /mjt
>>
>> 16.09.2012 15:19, Michael Tokarev wrote:
>>> So, is the patch okay?
>>>
>>> Thanks,
>>>
>>> /mjt
>>>
>>> On 15.08.2012 19:03, Michael Tokarev wrote:
>>>> On 15.08.2012 18:26, Avi Kivity wrote:
>>>>> On 08/15/2012 05:22 PM, Michael Tokarev wrote:
>>>>>
>>>>>>>
>>>>>>> Please provide extra info, like the setting of
>>>>>>> /sys/kernel/mm/transparent_hugepage/enabled.
>>>>>>
>>>>>> That was it - sort of. Default value here is enabled=madvise.
>>>>>> When setting it to always the effect finally started appearing,
>>>>>> so it is actually working.
>>>>>>
>>>>>> But can't qemu set MADV_HUGEPAGE flag too, so it works automatically?
>>>>>
>>>>> It can and should.
>>>>
>>>> Something like the attached patch?
>>>>
>>>> Thanks,
>>>>
>>>> /mjt
>>>
>>
>>
>>
>
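
To illustrate the alignment point in the reply above: a transparent huge page can only back a range that is itself aligned to the huge page size, so a buffer that starts at an arbitrary page boundary loses coverage at both ends. The sketch below counts how many fully aligned 2 MiB ranges fit inside a buffer. The 2 MiB figure matches the `512 * 4096` constant used in the patch for x86; the function name `huge_ranges_in` is hypothetical and exists only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 2 MiB, the same value as the (512 * 4096) alignment used for x86. */
#define HUGE_SIZE ((uintptr_t)512 * 4096)

/* Hypothetical helper: number of complete, HUGE_SIZE-aligned ranges that lie
 * entirely inside [start, start + size). */
size_t huge_ranges_in(uintptr_t start, size_t size)
{
    assert(size <= UINTPTR_MAX - start);   /* the range must not wrap around */

    uintptr_t end   = start + size;
    uintptr_t first = (start + HUGE_SIZE - 1) & ~(HUGE_SIZE - 1); /* round up */
    uintptr_t last  = end & ~(HUGE_SIZE - 1);                     /* round down */

    return last > first ? (size_t)((last - first) / HUGE_SIZE) : 0;
}
```

A 4 MiB buffer that starts on a 2 MiB boundary contains two such ranges, while the same buffer shifted by a single 4 KiB page contains only one, and a 2 MiB buffer shifted by one page contains none.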
From 705b3efb8c0cf06cbf087204fc61863c2bbb9e27 Mon Sep 17 00:00:00 2001
From: Michael Tokarev <mjt@tls.msk.ru>
Date: Wed, 15 Aug 2012 18:55:16 +0400
Subject: [PATCH] mark large vmalloc areas as MADV_HUGEPAGE and allow hugepages on i386

A followup to commit 36b586284e678d.

On linux only (which supports transparent hugepages), explicitly mark
large vmalloced areas with madvise(MADV_HUGEPAGE).

The patch changes previous logic a bit to allow inserting the call to
madvise(), but keeps the code the same (and saves one call to
getpagesize() per allocation).

The code also adds #include <sys/mman.h> to the linux-specific part,
to get MADV_HUGEPAGE definition.

While at it, enable transparent hugepages (alignment and the new
explicit marking with madvise()) for 32bit x86 too - it makes good
sense for, say, 32bit userspace on 64bit kernel.

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
---
 oslib-posix.c | 35 +++++++++++++++++++++++++----------
 1 file changed, 25 insertions(+), 10 deletions(-)

diff --git a/oslib-posix.c b/oslib-posix.c
index dbeb627..ab32d6d 100644
--- a/oslib-posix.c
+++ b/oslib-posix.c
@@ -35,19 +35,23 @@
 extern int daemon(int, int);
 #endif
 
-#if defined(__linux__) && defined(__x86_64__)
+#ifdef __linux__
+# include <sys/mman.h>
+
+# if defined(__x86_64__) || defined(__i386__)
    /* Use 2 MiB alignment so transparent hugepages can be used by KVM.
       Valgrind does not support alignments larger than 1 MiB,
       therefore we need special code which handles running on Valgrind. */
-#  define QEMU_VMALLOC_ALIGN (512 * 4096)
+#  define QEMU_VMALLOC_ALIGN_HUGE (512 * 4096)
 #  define CONFIG_VALGRIND
-#elif defined(__linux__) && defined(__s390x__)
+# elif defined(__s390x__)
    /* Use 1 MiB (segment size) alignment so gmap can be used by KVM. */
-#  define QEMU_VMALLOC_ALIGN (256 * 4096)
-#else
-#  define QEMU_VMALLOC_ALIGN getpagesize()
+#  define QEMU_VMALLOC_ALIGN_HUGE (256 * 4096)
+# endif
 #endif
 
+#define QEMU_VMALLOC_ALIGN getpagesize()
+
 #include "config-host.h"
 #include "sysemu.h"
 #include "trace.h"
@@ -114,7 +118,6 @@ void *qemu_memalign(size_t alignment, size_t size)
 void *qemu_vmalloc(size_t size)
 {
     void *ptr;
-    size_t align = QEMU_VMALLOC_ALIGN;
 
 #if defined(CONFIG_VALGRIND)
     if (running_on_valgrind < 0) {
@@ -125,10 +128,22 @@ void *qemu_vmalloc(size_t size)
     }
 #endif
 
-    if (size < align || running_on_valgrind) {
-        align = getpagesize();
+#ifdef QEMU_VMALLOC_ALIGN_HUGE
+    /* try to allocate as huge pages if supported and large enough */
+    if (size >= QEMU_VMALLOC_ALIGN_HUGE && !running_on_valgrind) {
+        ptr = qemu_memalign(QEMU_VMALLOC_ALIGN_HUGE, size);
+#ifdef MADV_HUGEPAGE
+#error
+        qemu_madvise(ptr, size, MADV_HUGEPAGE);
+#endif
     }
-    ptr = qemu_memalign(align, size);
+    else
+#endif
+    {
+        /* if unsupported or small, allocate pagesize-aligned */
+        ptr = qemu_memalign(QEMU_VMALLOC_ALIGN, size);
+    }
+
     trace_qemu_vmalloc(size, ptr);
     return ptr;
 }
-- 
1.7.10.4
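
The idea behind the patch can be tried outside QEMU with plain POSIX calls. The sketch below is not the QEMU code: it uses `posix_memalign()` and `madvise()` directly instead of `qemu_memalign()` and `qemu_madvise()`, it hard-codes the 2 MiB x86 alignment, and the name `alloc_maybe_huge` is hypothetical. It assumes a Linux system with glibc or a compatible libc.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* expose madvise() and MADV_HUGEPAGE on glibc */
#endif
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* 2 MiB, the same value as QEMU_VMALLOC_ALIGN_HUGE for x86 in the patch. */
#define HUGE_ALIGN ((size_t)512 * 4096)

/* Hypothetical stand-in for the patched allocation logic: large requests get
 * 2 MiB alignment plus a MADV_HUGEPAGE hint, small ones are page-aligned.
 * Returns NULL on allocation failure; release the result with free(). */
void *alloc_maybe_huge(size_t size)
{
    void *ptr = NULL;
    long page = sysconf(_SC_PAGESIZE);

    assert(size > 0 && page > 0);

    size_t align = size >= HUGE_ALIGN ? HUGE_ALIGN : (size_t)page;
    if (posix_memalign(&ptr, align, size) != 0) {
        return NULL;
    }

#ifdef MADV_HUGEPAGE
    if (align == HUGE_ALIGN) {
        /* The hint is advisory: a kernel without transparent hugepages
         * rejects it, and the memory is still usable, so the result is
         * deliberately ignored. */
        (void)madvise(ptr, size, MADV_HUGEPAGE);
    }
#endif
    return ptr;
}
```

Calling `alloc_maybe_huge(4 * 1024 * 1024)` should return a pointer that is a multiple of 2 MiB. Whether the kernel actually backs it with huge pages still depends on the `enabled` setting discussed earlier in the thread.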