From patchwork Fri Oct 29 14:48:57 2010 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nathan Froyd X-Patchwork-Id: 69603 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from lists.gnu.org (lists.gnu.org [199.232.76.165]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by ozlabs.org (Postfix) with ESMTPS id 7A766B70DC for ; Sat, 30 Oct 2010 02:12:51 +1100 (EST) Received: from localhost ([127.0.0.1]:40977 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1PBqTp-0007oC-EK for incoming@patchwork.ozlabs.org; Fri, 29 Oct 2010 11:02:57 -0400 Received: from [140.186.70.92] (port=44399 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1PBqGK-00010s-QT for qemu-devel@nongnu.org; Fri, 29 Oct 2010 10:49:03 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1PBqGI-0002ug-Rk for qemu-devel@nongnu.org; Fri, 29 Oct 2010 10:49:00 -0400 Received: from mail.codesourcery.com ([38.113.113.100]:43295) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1PBqGI-0002uF-KJ for qemu-devel@nongnu.org; Fri, 29 Oct 2010 10:48:58 -0400 Received: (qmail 25698 invoked from network); 29 Oct 2010 14:48:57 -0000 Received: from unknown (HELO localhost) (froydnj@127.0.0.2) by mail.codesourcery.com with ESMTPA; 29 Oct 2010 14:48:57 -0000 From: Nathan Froyd To: qemu-devel@nongnu.org Date: Fri, 29 Oct 2010 07:48:57 -0700 Message-Id: <1288363737-14207-1-git-send-email-froydnj@codesourcery.com> X-Mailer: git-send-email 1.6.3.2 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6, seldom 2.4 (older, 4) Subject: [Qemu-devel] [PATCH] linux-user: fix memory leaks with NPTL emulation X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: qemu-devel.nongnu.org List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Running programs that create large numbers of threads, such as this snippet from libstdc++'s pthread7-rope.cc: const int max_thread_count = 4; const int max_loop_count = 10000; ... for (int j = 0; j < max_loop_count; j++) { ... for (int i = 0; i < max_thread_count; i++) pthread_create (&tid[i], NULL, thread_main, 0); for (int i = 0; i < max_thread_count; i++) pthread_join (tid[i], NULL); } in user-mode emulation will quickly run out of memory. This is caused by a failure to free memory in do_syscall prior to thread exit: /* TODO: Free CPU state. */ pthread_exit(NULL); The first step in fixing this is to make all TaskStates used by QEMU dynamically allocated. The TaskState used by the initial thread was not, as it was allocated on main's stack. So fix that, free the cpu_env, free the TaskState, and we're home free, right? Not exactly. When we create a thread, we do: ts = qemu_mallocz(sizeof(TaskState) + NEW_STACK_SIZE); ... new_stack = ts->stack; ... ret = pthread_attr_setstack(&attr, new_stack, NEW_STACK_SIZE); If we blindly free the TaskState, then, we yank the current (host) thread's stack out from underneath it while it still has things to do, like calling pthread_exit. That causes problems, as you might expect. The solution adopted here is to let the C library allocate the thread's stack (so the C library can properly clean it up at pthread_exit) and provide a hint that we want NEW_STACK_SIZE bytes of stack. With those two changes, we're done, right? Well, almost. You see, we're creating all these host threads and their parent threads never bother to check that their children are finished. There's no good place for the parent threads to do so. Therefore, we need to create the threads in a detached state so the parent thread doesn't have to call pthread_join on the child to release the child's resources; the child does so automatically. With those three major changes, we can comfortably run programs like the above without exhausting memory. We do need to delete 'stack' from the TaskState structure. Signed-off-by: Nathan Froyd --- linux-user/main.c | 4 ++-- linux-user/qemu.h | 2 -- linux-user/syscall.c | 11 +++++++---- 3 files changed, 9 insertions(+), 8 deletions(-) diff --git a/linux-user/main.c b/linux-user/main.c index dbba8be..7d41d4a 100644 --- a/linux-user/main.c +++ b/linux-user/main.c @@ -2711,7 +2711,7 @@ int main(int argc, char **argv, char **envp) struct target_pt_regs regs1, *regs = ®s1; struct image_info info1, *info = &info1; struct linux_binprm bprm; - TaskState ts1, *ts = &ts1; + TaskState *ts; CPUState *env; int optind; const char *r; @@ -3038,7 +3038,7 @@ int main(int argc, char **argv, char **envp) } target_argv[target_argc] = NULL; - memset(ts, 0, sizeof(TaskState)); + ts = qemu_mallocz (sizeof(TaskState)); init_task_state(ts); /* build Task State */ ts->info = info; diff --git a/linux-user/qemu.h b/linux-user/qemu.h index 708021e..00c6549 100644 --- a/linux-user/qemu.h +++ b/linux-user/qemu.h @@ -126,8 +126,6 @@ typedef struct TaskState { struct sigqueue sigqueue_table[MAX_SIGQUEUE_SIZE]; /* siginfo queue */ struct sigqueue *first_free; /* first free siginfo queue entry */ int signal_pending; /* non zero if a signal may be pending */ - - uint8_t stack[0]; } __attribute__((aligned(16))) TaskState; extern char *exec_path; diff --git a/linux-user/syscall.c b/linux-user/syscall.c index d44f512..5761106 100644 --- a/linux-user/syscall.c +++ b/linux-user/syscall.c @@ -3601,9 +3601,8 @@ static int do_fork(CPUState *env, unsigned int flags, abi_ulong newsp, new_thread_info info; pthread_attr_t attr; #endif - ts = qemu_mallocz(sizeof(TaskState) + NEW_STACK_SIZE); + ts = qemu_mallocz(sizeof(TaskState)); init_task_state(ts); - new_stack = ts->stack; /* we create a new CPU instance. */ new_env = cpu_copy(env); #if defined(TARGET_I386) || defined(TARGET_SPARC) || defined(TARGET_PPC) @@ -3639,7 +3638,8 @@ static int do_fork(CPUState *env, unsigned int flags, abi_ulong newsp, info.parent_tidptr = parent_tidptr; ret = pthread_attr_init(&attr); - ret = pthread_attr_setstack(&attr, new_stack, NEW_STACK_SIZE); + ret = pthread_attr_setstacksize(&attr, NEW_STACK_SIZE); + ret = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED); /* It is not safe to deliver signals until the child has finished initializing, so temporarily block all signals. */ sigfillset(&sigmask); @@ -3667,6 +3667,7 @@ static int do_fork(CPUState *env, unsigned int flags, abi_ulong newsp, if (flags & CLONE_NPTL_FLAGS2) return -EINVAL; /* This is probably going to die very quickly, but do it anyway. */ + new_stack = qemu_mallocz (NEW_STACK_SIZE); #ifdef __ia64__ ret = __clone2(clone_func, new_stack, NEW_STACK_SIZE, flags, new_env); #else @@ -4240,7 +4241,9 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1, sys_futex(g2h(ts->child_tidptr), FUTEX_WAKE, INT_MAX, NULL, NULL, 0); } - /* TODO: Free CPU state. */ + thread_env = NULL; + qemu_free(cpu_env); + qemu_free(ts); pthread_exit(NULL); } #endif