From patchwork Fri Jan 15 00:38:01 2010
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 42939
From: Richard Henderson
Date: Thu, 14 Jan 2010 16:38:01 -0800
To: qemu-devel@nongnu.org
Cc: aurelien@aurel32.net
Message-Id: <20100115011604.DCB1CB93@are.twiddle.net>
Subject: [Qemu-devel] [PATCH] linux-user: Align mmap memory to the target page size.

Previously, mmap_find_vma could return addresses that were not properly
aligned to the target page size.  This of course led to all sorts of
odd problems down the road.

The trivial fix, simply rejecting the unaligned address and continuing
to search the address space in increments of one page, is not a good
idea when a 64-bit address space is involved.  The kernel may well keep
returning the last available address, which we have already rejected,
while we search upward one page at a time from e.g. 2**42 to 2**64.

This patch uses a more complex search algorithm that takes the result
of the previous allocation into account.  We normally search upward,
but when we notice two consecutive identical results we start searching
downward instead.

Signed-off-by: Richard Henderson
---
 linux-user/main.c |    7 +---
 linux-user/mmap.c |   71 ++++++++++++++++++++++++++++++++++++++++++++++------
 2 files changed, 64 insertions(+), 14 deletions(-)

diff --git a/linux-user/main.c b/linux-user/main.c
index a0d8ce7..7db9fc3 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -2725,12 +2725,9 @@ int main(int argc, char **argv, char **envp)
     /*
      * Read in mmap_min_addr kernel parameter.  This value is used
      * When loading the ELF image to determine whether guest_base
-     * is needed.
-     *
-     * When user has explicitly set the quest base, we skip this
-     * test.
+     * is needed.  It is also used in mmap_find_vma.
      */
-    if (!have_guest_base) {
+    {
         FILE *fp;

         if ((fp = fopen("/proc/sys/vm/mmap_min_addr", "r")) != NULL) {
diff --git a/linux-user/mmap.c b/linux-user/mmap.c
index 144fb7c..b92fdc4 100644
--- a/linux-user/mmap.c
+++ b/linux-user/mmap.c
@@ -281,8 +281,9 @@ unsigned long last_brk;
  */
 abi_ulong mmap_find_vma(abi_ulong start, abi_ulong size)
 {
-    void *ptr;
+    void *ptr, *prev;
     abi_ulong addr;
+    int wrapped, repeat;

     size = HOST_PAGE_ALIGN(size);
     start &= qemu_host_page_mask;
@@ -292,8 +293,11 @@ abi_ulong mmap_find_vma(abi_ulong start, abi_ulong size)
         start = mmap_next_start;

     addr = start;
+    wrapped = (start == 0);
+    repeat = 0;
+    prev = 0;

-    for(;;) {
+    for (;; prev = ptr) {
         /*
          * Reserve needed memory area to avoid a race.
          * It should be discarded using:
@@ -305,20 +309,69 @@
                         MAP_ANONYMOUS|MAP_PRIVATE|MAP_NORESERVE, -1, 0);

         /* ENOMEM, if host address space has no memory */
-        if (ptr == MAP_FAILED)
+        if (ptr == MAP_FAILED) {
             return (abi_ulong)-1;
+        }

-        /* If address fits target address space we've found what we need */
-        if ((unsigned long)ptr + size - 1 <= (abi_ulong)-1)
+        /* Count the number of sequential returns of the same address.
+           This is used to modify the search algorithm below.  */
+        repeat = (ptr == prev ? repeat + 1 : 0);
+
+        if ((unsigned long)ptr & ~TARGET_PAGE_MASK) {
+            /* The address is not properly aligned for the target.  */
+            switch (repeat) {
+            case 0:
+                /* Assume the result that the kernel gave us is the
+                   first with enough free space, so start again at the
+                   next higher target page.  */
+                addr = TARGET_PAGE_ALIGN((unsigned long)ptr);
+                break;
+            case 1:
+                /* Sometimes the kernel decides to perform the allocation
+                   at the top end of memory instead.  Notice this via
+                   sequential allocations that result in the same address.  */
+                /* ??? This can be exacerbated by a successful allocation
+                   at the top of memory on one round, and storing that
+                   result in mmap_next_start.  The next allocation is sure
+                   to start at an address that's going to fail.  */
+                addr = (unsigned long)ptr & TARGET_PAGE_MASK;
+                break;
+            case 2:
+                /* Start over at low memory.  */
+                addr = 0;
+                break;
+            default:
+                /* Fail.  This unaligned block must be the only one left.  */
+                addr = -1;
+                break;
+            }
+        } else if ((unsigned long)ptr + size - 1 <= (abi_ulong)-1) {
             break;
+        } else {
+            /* Since the result the kernel gave didn't fit, start
+               again at low memory.  If any repetition, fail.  */
+            addr = (repeat ? -1 : 0);
+        }

-        /* Unmap and try again with new page */
+        /* Unmap and try again.  */
         munmap(ptr, size);
-        addr += qemu_host_page_size;

-        /* ENOMEM if we check whole of target address space */
-        if (addr == start)
+        /* ENOMEM if we checked the whole of the target address space.  */
+        if (addr == -1) {
             return (abi_ulong)-1;
+        } else if (addr == 0) {
+            if (wrapped) {
+                return (abi_ulong)-1;
+            }
+            wrapped = 1;
+            /* Don't actually use 0 when wrapping, instead indicate
+               that we'd truly like an allocation in low memory.  */
+            addr = (mmap_min_addr > TARGET_PAGE_SIZE
+                    ? TARGET_PAGE_ALIGN(mmap_min_addr)
+                    : TARGET_PAGE_SIZE);
+        } else if (wrapped && addr >= start) {
+            return (abi_ulong)-1;
+        }
     }

     /* Update default start address */
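
To see the repeat-counting heuristic in isolation, here is a minimal
standalone sketch.  It is not the QEMU code itself: the 16 KiB
TARGET_PAGE_SIZE and the find_aligned_vma() name are assumptions for
illustration, and the wrap-to-low-memory and mmap_min_addr handling from
the patch proper are omitted.

/* Minimal standalone sketch of the search heuristic above.  The 16 KiB
   TARGET_PAGE_SIZE is assumed for illustration (a target whose pages
   are larger than the host's).  */
#define _GNU_SOURCE             /* MAP_ANONYMOUS, MAP_NORESERVE */
#include <stdio.h>
#include <stdint.h>
#include <sys/mman.h>

#define TARGET_PAGE_SIZE     0x4000UL
#define TARGET_PAGE_MASK     (~(TARGET_PAGE_SIZE - 1))
#define TARGET_PAGE_ALIGN(a) (((a) + TARGET_PAGE_SIZE - 1) & TARGET_PAGE_MASK)

static uintptr_t find_aligned_vma(uintptr_t addr, size_t size)
{
    void *ptr, *prev = NULL;
    int repeat = 0;

    for (;; prev = ptr) {
        /* Reserve a candidate region; MAP_NORESERVE keeps it cheap.  */
        ptr = mmap((void *)addr, size, PROT_NONE,
                   MAP_ANONYMOUS | MAP_PRIVATE | MAP_NORESERVE, -1, 0);
        if (ptr == MAP_FAILED) {
            return (uintptr_t)-1;   /* host address space exhausted */
        }

        /* Count consecutive identical results from the kernel.  */
        repeat = (ptr == prev ? repeat + 1 : 0);

        if (((uintptr_t)ptr & ~TARGET_PAGE_MASK) == 0) {
            return (uintptr_t)ptr;  /* properly aligned: done */
        }

        munmap(ptr, size);
        switch (repeat) {
        case 0:
            /* Fresh result: retry at the next target page above it.  */
            addr = TARGET_PAGE_ALIGN((uintptr_t)ptr);
            break;
        case 1:
            /* Same address twice: the kernel is allocating top-down,
               so round down and search below it instead.  */
            addr = (uintptr_t)ptr & TARGET_PAGE_MASK;
            break;
        default:
            /* Still stuck: give up.  (The patch proper restarts at
               low memory first; omitted here for brevity.)  */
            return (uintptr_t)-1;
        }
    }
}

int main(void)
{
    uintptr_t a = find_aligned_vma(0, 4 * TARGET_PAGE_SIZE);
    if (a == (uintptr_t)-1) {
        puts("no aligned region found");
    } else {
        printf("aligned region at %p\n", (void *)a);
    }
    return 0;
}

On hosts whose mmap results are already 16 KiB aligned the loop returns
on the first pass; the repeat counter only matters when the kernel hands
back the same misaligned gap on successive calls, which is exactly the
top-of-memory behavior the commit message describes.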