Patchwork linux-user: bigger default stack

Submitter Riku Voipio
Date March 3, 2011, 3:37 p.m.
Message ID <20110303153737.GA23633@afflict.kos.to>
Permalink /patch/85294/
State New

Comments

Riku Voipio - March 3, 2011, 3:37 p.m.
PTHREAD_STACK_MIN (16KB) is somewhat inadequate for a new stack. Follow
the pthread_create defaults, i.e., set it to RLIMIT_STACK, or to 2MB if
that is unlimited.

Signed-off-by: Riku Voipio <riku.voipio@nokia.com>
---
 linux-user/syscall.c |   28 +++++++++++++++++++++-------
 1 files changed, 21 insertions(+), 7 deletions(-)
Nathan Froyd - March 3, 2011, 3:46 p.m.
On Thu, Mar 03, 2011 at 05:37:37PM +0200, Riku Voipio wrote:
> PTHREAD_STACK_MIN (16KB) is somewhat inadequate for a new stack. Follow
> the pthread_create defaults, i.e., set it to RLIMIT_STACK, or to 2MB if
> that is unlimited.

For what sort of cases is it inadequate?  This stack is just for QEMU's
usage and QEMU shouldn't be using very much.  The target thread could
use quite a bit of course, but that's handled elsewhere.

-Nathan
Peter Maydell - March 3, 2011, 4:01 p.m.
On 3 March 2011 15:46, Nathan Froyd <froydnj@codesourcery.com> wrote:
> On Thu, Mar 03, 2011 at 05:37:37PM +0200, Riku Voipio wrote:
>> PTHREAD_STACK_MIN (16KB) is somewhat inadequate for a new stack. Follow
>> the pthread_create defaults, i.e., set it to RLIMIT_STACK, or to 2MB if
>> that is unlimited.
>
> For what sort of cases is it inadequate?  This stack is just for QEMU's
> usage and QEMU shouldn't be using very much.  The target thread could
> use quite a bit of course, but that's handled elsewhere.

Maybe it's the fault of the alloca() usage in linux-user/syscalls.c?
Some of the syscall emulations do alloca(some size determined by the
target binary)...

-- PMM
Riku Voipio - March 3, 2011, 4:15 p.m.
On Thu, Mar 03, 2011 at 07:46:27AM -0800, Nathan Froyd wrote:
> On Thu, Mar 03, 2011 at 05:37:37PM +0200, Riku Voipio wrote:
> > PTHREAD_STACK_MIN (16KB) is somewhat inadequate for a new stack. Follow
> > the pthread_create defaults, i.e., set it to RLIMIT_STACK, or to 2MB if
> > that is unlimited.
 
> For what sort of cases is it inadequate?  This stack is just for QEMU's
> usage and QEMU shouldn't be using very much.  

QEMU linux-user calls glibc functions which, while usually very conservative
with memory usage, are not guaranteed to stay under 10KB of stack (and by
the time we reach do_syscall we are already around 5 functions deep).

The specific case where we hit this is ugly: an LD_PRELOAD library wrapping
glibc functions, with lots of messy string handling...

Riku
Nathan Froyd - March 3, 2011, 4:46 p.m.
On Thu, Mar 03, 2011 at 06:15:49PM +0200, Riku Voipio wrote:
> On Thu, Mar 03, 2011 at 07:46:27AM -0800, Nathan Froyd wrote:
> > On Thu, Mar 03, 2011 at 05:37:37PM +0200, Riku Voipio wrote:
> > > PTHREAD_STACK_MIN (16KB) is somewhat inadequate for a new stack. Follow
> > > the pthread_create defaults, i.e., set it to RLIMIT_STACK, or to 2MB if
> > > that is unlimited.
>  
> > For what sort of cases is it inadequate?  This stack is just for QEMU's
> > usage and QEMU shouldn't be using very much.  
> 
> QEMU linux-user calls glibc functions which, while usually very conservative
> with memory usage, are not guaranteed to stay under 10KB of stack (and by
> the time we reach do_syscall we are already around 5 functions deep).

Bleh.  OK, so it needs to be increased.  Could we get by with somewhat
less (256K?), to try and maximize the number of threads we can
potentially run?  Maybe it doesn't matter (people creating thousands of
simultaneous threads inside QEMU have other problems...), but not
gratuitously wasting memory would be good.

-Nathan
Riku Voipio - March 4, 2011, 8:15 a.m.
On Thu, Mar 03, 2011 at 08:46:40AM -0800, Nathan Froyd wrote:
> On Thu, Mar 03, 2011 at 06:15:49PM +0200, Riku Voipio wrote:
> > QEMU linux-user calls glibc functions which, while usually very conservative
> > with memory usage, are not guaranteed to stay under 10KB of stack (and by
> > the time we reach do_syscall we are already around 5 functions deep).
 
> Bleh.  OK, so it needs to be increased.  Could we get by with somewhat
> less (256K?), to try and maximize the number of threads we can
> potentially run?  Maybe it doesn't matter (people creating thousands of
> simultaneous threads inside QEMU have other problems...), but not
> gratuitously wasting memory would be good.

I believe 256K should be enough for everyone now; then again, we know
what happened when someone suggested the same about 640K... In this case,
however, if the limit is ever hit, increasing it again is easy.

Originally I set the limit much higher, assuming that thanks to lazy
memory allocation in the kernel it doesn't matter whether we ask for 256K
or 8M, since unused pages are never actually allocated. But I'll do some
testing with lots of threads to see whether the stack size affects how
many threads we can run.

Riku

Patch

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 23d7a63..99484ce 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -3692,8 +3692,6 @@  static abi_long do_arch_prctl(CPUX86State *env, int code, abi_ulong addr)
 
 #if defined(CONFIG_USE_NPTL)
 
-#define NEW_STACK_SIZE PTHREAD_STACK_MIN
-
 static pthread_mutex_t clone_lock = PTHREAD_MUTEX_INITIALIZER;
 typedef struct {
     CPUState *env;
@@ -3738,7 +3736,6 @@  static void *clone_func(void *arg)
 #else
 /* this stack is the equivalent of the kernel stack associated with a
    thread/process */
-#define NEW_STACK_SIZE 8192
 
 static int clone_func(void *arg)
 {
@@ -3749,6 +3746,22 @@  static int clone_func(void *arg)
 }
 #endif
 
+#define DEFAULT_STACK_SIZE 0x200000
+
+static long get_new_stack_size(void)
+{
+    struct rlimit limit;
+    if ((getrlimit(RLIMIT_STACK, &limit) == 0) &&
+        (limit.rlim_cur != RLIM_INFINITY)) {
+        return limit.rlim_cur;
+    }
+#if defined(CONFIG_USE_NPTL)
+    return DEFAULT_STACK_SIZE > PTHREAD_STACK_MIN ? DEFAULT_STACK_SIZE : PTHREAD_STACK_MIN;
+#else
+    return DEFAULT_STACK_SIZE;
+#endif
+}
+
 /* do_fork() Must return host values and target errnos (unlike most
    do_*() functions). */
 static int do_fork(CPUState *env, unsigned int flags, abi_ulong newsp,
@@ -3775,6 +3788,7 @@  static int do_fork(CPUState *env, unsigned int flags, abi_ulong newsp,
         new_thread_info info;
         pthread_attr_t attr;
 #endif
+        long new_stack_size = get_new_stack_size();
         ts = qemu_mallocz(sizeof(TaskState));
         init_task_state(ts);
         /* we create a new CPU instance. */
@@ -3812,7 +3826,7 @@  static int do_fork(CPUState *env, unsigned int flags, abi_ulong newsp,
             info.parent_tidptr = parent_tidptr;
 
         ret = pthread_attr_init(&attr);
-        ret = pthread_attr_setstacksize(&attr, NEW_STACK_SIZE);
+        ret = pthread_attr_setstacksize(&attr, new_stack_size);
         ret = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
         /* It is not safe to deliver signals until the child has finished
            initializing, so temporarily block all signals.  */
@@ -3841,11 +3855,11 @@  static int do_fork(CPUState *env, unsigned int flags, abi_ulong newsp,
         if (flags & CLONE_NPTL_FLAGS2)
             return -EINVAL;
         /* This is probably going to die very quickly, but do it anyway.  */
-        new_stack = qemu_mallocz (NEW_STACK_SIZE);
+        new_stack = qemu_mallocz (new_stack_size);
 #ifdef __ia64__
-        ret = __clone2(clone_func, new_stack, NEW_STACK_SIZE, flags, new_env);
+        ret = __clone2(clone_func, new_stack, new_stack_size, flags, new_env);
 #else
-	ret = clone(clone_func, new_stack + NEW_STACK_SIZE, flags, new_env);
+        ret = clone(clone_func, new_stack + new_stack_size, flags, new_env);
 #endif
 #endif
     } else {