From patchwork Wed Aug 15 15:03:24 2012
X-Patchwork-Submitter: Michael Tokarev
X-Patchwork-Id: 177695
Message-ID: <502BBA3C.70506@msgid.tls.msk.ru>
Date: Wed, 15 Aug 2012 19:03:24 +0400
From: Michael Tokarev
Organization: Telecom Service, JSC
To: Avi Kivity
Cc: Andrea Arcangeli, qemu-devel
References: <502B99D0.8010808@msgid.tls.msk.ru> <502B9B40.6090307@redhat.com> <502BB0C2.3040701@msgid.tls.msk.ru> <502BB1A6.7030407@redhat.com>
In-Reply-To: <502BB1A6.7030407@redhat.com>
Subject: Re: [Qemu-devel] qemu and transparent huge pages

On 15.08.2012 18:26, Avi Kivity wrote:
> On 08/15/2012 05:22 PM, Michael Tokarev wrote:
>
>>>
>>> Please provide extra info, like the setting of
>>> /sys/kernel/mm/transparent_hugepage/enabled.
>>
>> That was it - sort of. Default value here is enabled=madvise.
>> When setting it to always the effect finally started appearing,
>> so it is actually working.
>>
>> But can't qemu set MADV_HUGEPAGE flag too, so it works automatically?
>
> It can and should.

Something like the attached patch?

Thanks,

/mjt

From 705b3efb8c0cf06cbf087204fc61863c2bbb9e27 Mon Sep 17 00:00:00 2001
From: Michael Tokarev
Date: Wed, 15 Aug 2012 18:55:16 +0400
Subject: [PATCH] mark large vmalloc areas as MADV_HUGEPAGE and allow hugepages on i386

A followup to commit 36b586284e678d.

On Linux only (which supports transparent hugepages), explicitly mark
large vmalloced areas with madvise(MADV_HUGEPAGE).

The patch changes the previous logic a bit to allow inserting the call
to madvise(), but otherwise keeps the behaviour the same (and saves one
call to getpagesize() per allocation).

The code also adds #include <sys/mman.h> to the Linux-specific part, to
get the MADV_HUGEPAGE definition.

While at it, enable transparent hugepages (alignment and the new explicit
marking with madvise()) for 32bit x86 too - it makes good sense for, say,
32bit userspace on a 64bit kernel.

Signed-off-by: Michael Tokarev
---
 oslib-posix.c |   34 ++++++++++++++++++++++++----------
 1 file changed, 24 insertions(+), 10 deletions(-)

diff --git a/oslib-posix.c b/oslib-posix.c
index dbeb627..ab32d6d 100644
--- a/oslib-posix.c
+++ b/oslib-posix.c
@@ -35,19 +35,23 @@
 extern int daemon(int, int);
 #endif
 
-#if defined(__linux__) && defined(__x86_64__)
+#ifdef __linux__
+# include <sys/mman.h>
+
+# if defined(__x86_64__) || defined(__i386__)
    /* Use 2 MiB alignment so transparent hugepages can be used by KVM.
       Valgrind does not support alignments larger than 1 MiB,
       therefore we need special code which handles running on Valgrind. */
-#  define QEMU_VMALLOC_ALIGN (512 * 4096)
+#  define QEMU_VMALLOC_ALIGN_HUGE (512 * 4096)
 #  define CONFIG_VALGRIND
-#elif defined(__linux__) && defined(__s390x__)
+# elif defined(__s390x__)
    /* Use 1 MiB (segment size) alignment so gmap can be used by KVM. */
-#  define QEMU_VMALLOC_ALIGN (256 * 4096)
-#else
-#  define QEMU_VMALLOC_ALIGN getpagesize()
+#  define QEMU_VMALLOC_ALIGN_HUGE (256 * 4096)
+# endif
 #endif
 
+#define QEMU_VMALLOC_ALIGN getpagesize()
+
 #include "config-host.h"
 #include "sysemu.h"
 #include "trace.h"
@@ -114,7 +118,6 @@ void *qemu_memalign(size_t alignment, size_t size)
 void *qemu_vmalloc(size_t size)
 {
     void *ptr;
-    size_t align = QEMU_VMALLOC_ALIGN;
 
 #if defined(CONFIG_VALGRIND)
     if (running_on_valgrind < 0) {
@@ -125,10 +128,21 @@ void *qemu_vmalloc(size_t size)
     }
 #endif
 
-    if (size < align || running_on_valgrind) {
-        align = getpagesize();
+#ifdef QEMU_VMALLOC_ALIGN_HUGE
+    /* try to allocate as huge pages if supported and large enough */
+    if (size >= QEMU_VMALLOC_ALIGN_HUGE && !running_on_valgrind) {
+        ptr = qemu_memalign(QEMU_VMALLOC_ALIGN_HUGE, size);
+#ifdef MADV_HUGEPAGE
+        qemu_madvise(ptr, size, MADV_HUGEPAGE);
+#endif
     }
-    ptr = qemu_memalign(align, size);
+    else
+#endif
+    {
+        /* if unsupported or small, allocate pagesize-aligned */
+        ptr = qemu_memalign(QEMU_VMALLOC_ALIGN, size);
+    }
+
     trace_qemu_vmalloc(size, ptr);
     return ptr;
 }
-- 
1.7.10.4
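
For anyone who wants to try the mechanism outside of QEMU, here is a minimal standalone sketch of the same idea, assuming Linux with glibc. The names example_vmalloc() and EXAMPLE_HUGE_ALIGN are hypothetical and exist only for this illustration; it uses plain posix_memalign() and madvise() rather than the qemu_memalign() and qemu_madvise() wrappers the patch touches, and it leaves out the Valgrind handling.

```c
#define _GNU_SOURCE            /* make sure MADV_HUGEPAGE is visible on glibc */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>

/* Hypothetical constant: 2 MiB, the same value the patch uses on x86
 * (512 * 4096). */
#define EXAMPLE_HUGE_ALIGN ((size_t)512 * 4096)

/* Hypothetical helper, not QEMU's qemu_vmalloc(): allocate size bytes,
 * 2 MiB-aligned and marked MADV_HUGEPAGE when the request is large enough,
 * page-aligned otherwise.  Returns NULL on failure; release with free(). */
static void *example_vmalloc(size_t size)
{
    void *ptr = NULL;
    long page = sysconf(_SC_PAGESIZE);
    size_t align;
    int huge = size >= EXAMPLE_HUGE_ALIGN;

    if (page <= 0) {
        return NULL;
    }
    align = huge ? EXAMPLE_HUGE_ALIGN : (size_t)page;

    if (posix_memalign(&ptr, align, size) != 0) {
        return NULL;
    }
    assert(((uintptr_t)ptr % align) == 0);

#ifdef MADV_HUGEPAGE
    if (huge) {
        /* Purely advisory: a kernel without transparent hugepages rejects
         * the call, and the allocation is still usable, so the result is
         * deliberately ignored. */
        (void)madvise(ptr, size, MADV_HUGEPAGE);
    }
#endif
    return ptr;
}
```

With /sys/kernel/mm/transparent_hugepage/enabled set to madvise, only regions marked this way are candidates for huge pages, which is the situation described earlier in the thread. Whether the kernel actually backs a given region with huge pages is still up to the kernel.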