Patchwork qemu and transparent huge pages

Submitter Michael Tokarev
Date Aug. 15, 2012, 12:45 p.m.
Message ID <502B99D0.8010808@msgid.tls.msk.ru>
Permalink /patch/177642/
State New

Comments

Michael Tokarev - Aug. 15, 2012, 12:45 p.m.
[Reposting with the right email address of Andrea]

Quite some time ago there was a thread on qemu-devel,
started by Andrea, about modifying qemu to better
use transparent huge pages:

 http://lists.gnu.org/archive/html/qemu-devel/2010-03/msg01250.html

That thread never reached a conclusion, but some time
later Avi implemented a similar change:

commit 36b586284e678da28df3af9fd0907d2b16f9311c
Author: Avi Kivity <avi@redhat.com>
Date:   Mon Sep 5 11:07:05 2011 +0300

    qemu_vmalloc: align properly for transparent hugepages and KVM

    To make good use of transparent hugepages, KVM requires that guest-physical
    and host-virtual addresses share the low 21 bits (as opposed to just the low
    12 bits normally required).

    Adjust qemu_vmalloc() to honor that requirement.  Ignore it for small region
    to avoid fragmentation.

    Signed-off-by: Avi Kivity <avi@redhat.com>
    Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>



(why it is 64bit-only is a different, unrelated question).
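The "low 21 bits" requirement in the commit message above can be illustrated with a small check. This is only a sketch: the helper name shares_low_21_bits and the macro names are made up for illustration and do not come from qemu or KVM. It tests whether a guest-physical address and a host-virtual address have the same offset within a 2MB huge page.

```c
#include <stdbool.h>
#include <stdint.h>

/* 2MB huge page = 2^21 bytes (512 pages of 4096 bytes on x86_64). */
#define HPAGE_SHIFT 21
#define HPAGE_MASK  ((UINT64_C(1) << HPAGE_SHIFT) - 1)

/* Hypothetical helper: true when both addresses have identical low 21 bits,
 * i.e. the same offset inside a 2MB huge page. */
static bool shares_low_21_bits(uint64_t guest_phys, uint64_t host_virt)
{
    return ((guest_phys ^ host_virt) & HPAGE_MASK) == 0;
}
```

With guest RAM starting at a 2MB-aligned guest-physical address, aligning the host allocation to 2MB is enough to satisfy this check for every page in the region.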

But apparently, THP still does not work, even with 2MB
alignment: when running a guest, AnonHugePages in
/proc/meminfo stays at 0, in both kvm mode and tcg
mode.  Any idea why?  What else is needed for THP to work?

This is quite a frequent question in the #kvm IRC channel,
and I have always suggested using -mem-path for this, but
I'm curious why it doesn't work automatically when it
probably should?

Thanks,

/mjt
Avi Kivity - Aug. 15, 2012, 12:51 p.m.
On 08/15/2012 03:45 PM, Michael Tokarev wrote:
> 
> But apparently, THP still does not work, even with 2MB
> alignment: when running a guest, AnonHugePages in
> /proc/meminfo stays at 0, in both kvm mode and tcg
> mode.  Any idea why?  What else is needed for THP to work?

It does for me:

AnonHugePages:    368640 kB

Note that the patch you reference doesn't affect THP itself, just KVM's
ability to propagate huge pages to the shadow page table.

> 
> This is quite a frequent question in the #kvm IRC channel,
> and I have always suggested using -mem-path for this, but
> I'm curious why it doesn't work automatically when it
> probably should?
> 

Please provide extra info, like the setting of
/sys/kernel/mm/transparent_hugepage/enabled.
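
The two values discussed here can be read programmatically. The following is a minimal sketch, assuming the usual sysfs format in which the active mode is shown in brackets (for example "always [madvise] never") and the /proc/meminfo line format quoted above. The function names are illustrative and not part of qemu.

```c
#include <stdio.h>
#include <string.h>

/* Copy the bracketed (active) THP mode out of a line such as
 * "always [madvise] never".  Returns 0 on success, -1 if not found. */
static int thp_active_mode(const char *line, char *out, size_t outlen)
{
    const char *l = strchr(line, '[');
    const char *r = l ? strchr(l, ']') : NULL;
    size_t n;

    if (!l || !r || outlen == 0) {
        return -1;
    }
    n = (size_t)(r - l - 1);
    if (n >= outlen) {
        return -1;
    }
    memcpy(out, l + 1, n);
    out[n] = '\0';
    return 0;
}

/* Parse a line like "AnonHugePages:    368640 kB".
 * Returns the value in kB, or -1 if the line is a different field. */
static long anon_hugepages_kb(const char *line)
{
    long kb;

    if (sscanf(line, "AnonHugePages: %ld kB", &kb) == 1) {
        return kb;
    }
    return -1;
}

/* Print the active THP mode and the current AnonHugePages value.
 * Missing files (non-Linux, or a kernel without THP) are silently skipped. */
static void report_thp_state(void)
{
    char buf[256], mode[32];
    FILE *f = fopen("/sys/kernel/mm/transparent_hugepage/enabled", "r");

    if (f) {
        if (fgets(buf, sizeof buf, f) &&
            thp_active_mode(buf, mode, sizeof mode) == 0) {
            printf("THP enabled mode: %s\n", mode);
        }
        fclose(f);
    }
    f = fopen("/proc/meminfo", "r");
    if (f) {
        while (fgets(buf, sizeof buf, f)) {
            long kb = anon_hugepages_kb(buf);
            if (kb >= 0) {
                printf("AnonHugePages: %ld kB\n", kb);
                break;
            }
        }
        fclose(f);
    }
}
```

Calling report_thp_state() before and after starting a guest shows whether AnonHugePages grows under the current mode.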
Michael Tokarev - Aug. 15, 2012, 2:22 p.m.
On 15.08.2012 16:51, Avi Kivity wrote:
> On 08/15/2012 03:45 PM, Michael Tokarev wrote:
>>
>> But apparently, THP still does not work, even with 2MB
>> alignment: when running a guest, AnonHugePages in
>> /proc/meminfo stays at 0, in both kvm mode and tcg
>> mode.  Any idea why?  What else is needed for THP to work?
> 
> It does for me:
> 
> AnonHugePages:    368640 kB
> 
> Note that the patch you reference doesn't affect THP itself, just KVM's
> ability to propagate huge pages to the shadow page table.
> 
>>
>> This is quite a frequent question in the #kvm IRC channel,
>> and I have always suggested using -mem-path for this, but
>> I'm curious why it doesn't work automatically when it
>> probably should?
>>
> 
> Please provide extra info, like the setting of
> /sys/kernel/mm/transparent_hugepage/enabled.

That was it, sort of.  The default value here is enabled=madvise.
After setting it to always, the effect finally appeared,
so THP is actually working.

But can't qemu set the MADV_HUGEPAGE flag itself, so it works automatically?

Thanks,

/mjt
Avi Kivity - Aug. 15, 2012, 2:26 p.m.
On 08/15/2012 05:22 PM, Michael Tokarev wrote:

>> 
>> Please provide extra info, like the setting of
>> /sys/kernel/mm/transparent_hugepage/enabled.
> 
> That was it, sort of.  The default value here is enabled=madvise.
> After setting it to always, the effect finally appeared,
> so THP is actually working.
> 
> But can't qemu set the MADV_HUGEPAGE flag itself, so it works automatically?

It can and should.
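
For illustration only, here is a rough sketch of what setting the hint could look like. It is not the actual qemu change: alloc_ram_with_thp_hint and HUGEPAGE_ALIGN are hypothetical names, and posix_memalign() stands in for qemu_memalign(). The allocation is 2MB-aligned as in the patch below, and madvise(MADV_HUGEPAGE) is then applied so that THP is used even when the host is set to enabled=madvise.

```c
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* 2MB, the same value as QEMU_VMALLOC_ALIGN in the patch. */
#define HUGEPAGE_ALIGN ((size_t)512 * 4096)

/* Hypothetical helper, not qemu code: allocate a region and ask the
 * kernel to back it with transparent huge pages where possible. */
static void *alloc_ram_with_thp_hint(size_t size)
{
    void *ptr = NULL;
    size_t align = HUGEPAGE_ALIGN;

    /* Small regions keep page alignment to avoid fragmentation. */
    if (size < align) {
        align = (size_t)getpagesize();
    }
    if (posix_memalign(&ptr, align, size) != 0) {
        return NULL;
    }
#ifdef MADV_HUGEPAGE
    if (size >= HUGEPAGE_ALIGN) {
        /* Advisory only: a failure (for example on a kernel built
         * without THP) is ignored and the region stays usable. */
        (void)madvise(ptr, size, MADV_HUGEPAGE);
    }
#endif
    return ptr;
}
```

The hint is ignored for small regions here on the assumption that they cannot hold a full huge page anyway; the caller releases the memory with free().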

Patch

diff --git a/oslib-posix.c b/oslib-posix.c
index 196099c..a304fb0 100644
--- a/oslib-posix.c
+++ b/oslib-posix.c
@@ -35,6 +35,13 @@ 
 extern int daemon(int, int);
 #endif

+#if defined(__linux__) && defined(__x86_64__)
+   /* Use 2MB alignment so transparent hugepages can be used by KVM */
+#  define QEMU_VMALLOC_ALIGN (512 * 4096)
+#else
+#  define QEMU_VMALLOC_ALIGN getpagesize()
+#endif
+
 #include "config-host.h"
 #include "sysemu.h"
 #include "trace.h"
@@ -80,7 +87,12 @@  void *qemu_memalign(size_t alignment, size_t size)
 void *qemu_vmalloc(size_t size)
 {
     void *ptr;
-    ptr = qemu_memalign(getpagesize(), size);
+    size_t align = QEMU_VMALLOC_ALIGN;
+
+    if (size < align) {
+        align = getpagesize();
+    }
+    ptr = qemu_memalign(align, size);
     trace_qemu_vmalloc(size, ptr);
     return ptr;
 }