diff mbox

[v2,1/3] posix: Remove dynamic memory allocation from execl{e,p}

Message ID 1454343665-1706-2-git-send-email-adhemerval.zanella@linaro.org
State New
Headers show

Commit Message

Adhemerval Zanella Netto Feb. 1, 2016, 4:21 p.m. UTC
GLIBC execl{e,p} implementation might use malloc if the total number of i
arguments exceed initial assumption size (1024).  This might lead to
issue in two situations:

1. execl/execle is stated to be async-signal-safe by POSIX [1].  However
   if execl is used in a signal handler with a large argument set (that
   may call malloc internally) and the resulting call fails, it might
   lead malloc in the program in a bad state.

2. If the functions are used in a vfork/clone(VFORK) situation it also
   might issue malloc internal bad state.

This patch fixes it by using stack allocation instead.  It also fixes
BZ#19534.

Tested on x86_64.

Changes from previous version:

- Remove arbitrary limit on stack allocation for argument handling
  (it is arbitrary and does no impose any limit since it does not 
  consider current stack size neither stack size limit).

[1] http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html

	[BZ #19534]
	* posix/execl.c (execl): Remove dynamic memory allocation.
	* posix/execle.c (execle): Likewise.
	* posix/execlp.c (execlp): Likewise.
---
 posix/execl.c  | 65 +++++++++++++++++----------------------------------------
 posix/execle.c | 66 ++++++++++++++++++----------------------------------------
 posix/execlp.c | 64 +++++++++++++++++---------------------------------------
 4 files changed, 65 insertions(+), 137 deletions(-)

Comments

Joseph Myers Feb. 1, 2016, 4:52 p.m. UTC | #1
On Mon, 1 Feb 2016, Adhemerval Zanella wrote:

> +  char *argv[argc+1];
> +  va_start (ap, arg);
> +  argv[0] = (char*) arg;
> +  for (i = 1; i < argc; i++)
> +     argv[i] = va_arg (ap, char *);
> +  argv[i] = NULL;

I don't see how you're ensuring this stack allocation is safe (i.e. if 
it's too big, it doesn't corrupt memory that's in use by other threads).  
Can't it jump beyond any guard page and start overwriting other memory, 
possibly in use by another thread, before reaching unmapped memory?  I'd 
expect safe large stack allocations (as with -fstack-check) to need to 
touch the pages in the right order (and doing that safely probably means 
using -fstack-check).
Adhemerval Zanella Netto Feb. 1, 2016, 5:18 p.m. UTC | #2
On 01-02-2016 14:52, Joseph Myers wrote:
> On Mon, 1 Feb 2016, Adhemerval Zanella wrote:
> 
>> +  char *argv[argc+1];
>> +  va_start (ap, arg);
>> +  argv[0] = (char*) arg;
>> +  for (i = 1; i < argc; i++)
>> +     argv[i] = va_arg (ap, char *);
>> +  argv[i] = NULL;
> 
> I don't see how you're ensuring this stack allocation is safe (i.e. if 
> it's too big, it doesn't corrupt memory that's in use by other threads).  
> Can't it jump beyond any guard page and start overwriting other memory, 
> possibly in use by another thread, before reaching unmapped memory?  I'd 
> expect safe large stack allocations (as with -fstack-check) to need to 
> touch the pages in the right order (and doing that safely probably means 
> using -fstack-check).
> 

Right, it is not ensuring the safeness. Is '-fstack-check' the suffice 
option to ensure it or do we need a more strict one?
Joseph Myers Feb. 1, 2016, 5:53 p.m. UTC | #3
On Mon, 1 Feb 2016, Adhemerval Zanella wrote:

> Right, it is not ensuring the safeness. Is '-fstack-check' the suffice 
> option to ensure it or do we need a more strict one?

I think it's the right option, but don't know if it works reliably across 
supported architectures and GCC versions (or, at least, does not generate 
wrong code even if the checks aren't fully safe in all cases) - it's not 
very widely used.
Adhemerval Zanella Netto Feb. 1, 2016, 6:09 p.m. UTC | #4
On 01-02-2016 15:53, Joseph Myers wrote:
> On Mon, 1 Feb 2016, Adhemerval Zanella wrote:
> 
>> Right, it is not ensuring the safeness. Is '-fstack-check' the suffice 
>> option to ensure it or do we need a more strict one?
> 
> I think it's the right option, but don't know if it works reliably across 
> supported architectures and GCC versions (or, at least, does not generate 
> wrong code even if the checks aren't fully safe in all cases) - it's not 
> very widely used.
> 

I think we can use this option, since it supported on gcc (and I also do not
have any other better option in mind that fits in these function memory
constraints).
Florian Weimer Feb. 2, 2016, 11:24 a.m. UTC | #5
On 02/01/2016 06:18 PM, Adhemerval Zanella wrote:

> Right, it is not ensuring the safeness. Is '-fstack-check' the suffice 
> option to ensure it or do we need a more strict one?

In my tests, the initial stack banging probe is sometimes more than a
page away from the current stack pointer, so it does not look safe to me.

For posix_spawn, it's probably simpler for now to compute the shell
invocation in the parent.  That is, perform two vforks in case the first
execve fails.

Florian
Szabolcs Nagy Feb. 2, 2016, 12:46 p.m. UTC | #6
On 02/02/16 11:24, Florian Weimer wrote:
> On 02/01/2016 06:18 PM, Adhemerval Zanella wrote:
> 
>> Right, it is not ensuring the safeness. Is '-fstack-check' the suffice 
>> option to ensure it or do we need a more strict one?
> 
> In my tests, the initial stack banging probe is sometimes more than a
> page away from the current stack pointer, so it does not look safe to me.

i think that can be fixed by

memset(argv, 0, sizeof argv);

even if it clobbers other threads it will hit the guard page eventually.
Adhemerval Zanella Netto Feb. 2, 2016, 12:47 p.m. UTC | #7
On 02-02-2016 09:24, Florian Weimer wrote:
> On 02/01/2016 06:18 PM, Adhemerval Zanella wrote:
> 
>> Right, it is not ensuring the safeness. Is '-fstack-check' the suffice 
>> option to ensure it or do we need a more strict one?
> 
> In my tests, the initial stack banging probe is sometimes more than a
> page away from the current stack pointer, so it does not look safe to me.

I am not aware of a better option, if any, to avoid stack overflow in case of
a exec function call with large arguments.  Also, check at least on x86_64
I do see trying to overflow the stack with new implementation does force a
segfault on the stack protector code (it tries to orq a memory beyond stack).

> 
> For posix_spawn, it's probably simpler for now to compute the shell
> invocation in the parent.  That is, perform two vforks in case the first
> execve fails.
> 
> Florian
> 

I do not think it would require to clone(VFORK) twice in parent: we can
calculate the total argument size prior any call and calculate the
total stack to mmap based on this plus a slack for possible local
variables (128 or 256 bytes or even higher). I will add some latency
in any posix_spawn{p} case.

Another option is to just create a guard page in the end of the allocated
stack page (as pthread_create does) to force a segfaults in case of
a stack allocation higher. However, as for execl using stack-check will
also force a segfault in case of stack overflow in this case.
Rich Felker Feb. 2, 2016, 4:33 p.m. UTC | #8
On Mon, Feb 01, 2016 at 04:52:15PM +0000, Joseph Myers wrote:
> On Mon, 1 Feb 2016, Adhemerval Zanella wrote:
> 
> > +  char *argv[argc+1];
> > +  va_start (ap, arg);
> > +  argv[0] = (char*) arg;
> > +  for (i = 1; i < argc; i++)
> > +     argv[i] = va_arg (ap, char *);
> > +  argv[i] = NULL;
> 
> I don't see how you're ensuring this stack allocation is safe (i.e. if 
> it's too big, it doesn't corrupt memory that's in use by other threads).  

There's no obligation to. If you pass something like a million
arguments to a variadic function, the compiler will generate code in
the caller that overflows the stack before the callee is even reached.
The size of the vla used in execl is exactly the same size as the
argument block on the stack used for passing arguments to execl from
its caller, and it's nobody's fault but the programmer's if this is
way too big. It's not a runtime variable.

Rich
Rasmus Villemoes Feb. 7, 2016, 9:28 p.m. UTC | #9
On Tue, Feb 02 2016, Rich Felker <dalias@libc.org> wrote:

> On Mon, Feb 01, 2016 at 04:52:15PM +0000, Joseph Myers wrote:
>> On Mon, 1 Feb 2016, Adhemerval Zanella wrote:
>> 
>> > +  char *argv[argc+1];
>> > +  va_start (ap, arg);
>> > +  argv[0] = (char*) arg;
>> > +  for (i = 1; i < argc; i++)
>> > +     argv[i] = va_arg (ap, char *);
>> > +  argv[i] = NULL;
>> 
>> I don't see how you're ensuring this stack allocation is safe (i.e. if 
>> it's too big, it doesn't corrupt memory that's in use by other threads).  
>
> There's no obligation to. If you pass something like a million
> arguments to a variadic function, the compiler will generate code in
> the caller that overflows the stack before the callee is even reached.
> The size of the vla used in execl is exactly the same size as the
> argument block on the stack used for passing arguments to execl from
> its caller, and it's nobody's fault but the programmer's if this is
> way too big. It's not a runtime variable.

This is true, and maybe it's not worth the extra complication, but if
we're willing to make arch-specific versions of execl and execle we can
avoid the double stack use and the time spent copying the argv
array. That won't remove the possible stack overflow, of course, but
then it'll in all likelihood happen in the user code and not glibc.

On i386 it's pretty straight-forward since all args are already passed
on the stack, while on x86-64 we can, roughly speaking, make a local
change of the ABI by just stashing the return address somewhere else so
that we can prepend the few passed-by-register to the stack-passed
parameters. I suppose something similar is possible on most other
architectures.

Something like this passes some quick testing (nevermind the z_ prefix;
I just used that to avoid any clashes with the actual functions).

i386:

.globl z_execl ; .align 4,0x90 ; z_execl:
	pushl z_environ
	lea 0x0c(%esp), %eax
	push %eax
	pushl 0x0c(%esp)

	call z_execve

	add $0x0c, %esp
	ret
.type z_execl, @function ; .size z_execl, .-z_execl

.globl z_execle ; .align 4,0x90 ; z_execle:
	lea 0x0c(%esp), %eax // eax = address of first vararg (&argv[1])
	movl (%eax), %edx // we'll test this
1:	add $0x4, %eax // move on to the next word
	test %edx,%edx // have we found NULL?
	movl (%eax), %edx // load the next word into edx
	jnz 1b
	
	pushl %edx
	lea 0x0c(%esp), %eax // same offset as above, but because of the push this is now &argv[0]
	push %eax
	pushl 0x0c(%esp)

	call z_execve

	add $0x0c, %esp
	ret
.type z_execle, @function ; .size z_execle, .-z_execle
	

x86-64:

.globl z_execl  ; .align 4,0x90 ; z_execl:
	movq (%rsp),%rax // rax is caller-saved and dead at this point
	sub $0x28, %rsp // Only allocates room for five pointers, we store six!
	movq %rax, (%rsp) // Stash the return address here.
	movq %rsi, 0x8(%rsp)  // Put the paramaters passed via registers
	movq %rdx, 0x10(%rsp) // at the bottom of the array.
	movq %rcx, 0x18(%rsp)
	movq %r8,  0x20(%rsp)
	movq %r9,  0x28(%rsp) // Overwrites return address, adjacent to stack-passed parameters (arg 7+)

	lea 0x8(%rsp),%rsi // Put the address of our argv array in rsi.
	movq z_environ, %rdx // Third argument is the global environment.
	call z_execve // %rdi is unchanged since the beginning

	// If all goes well, this never returns. Otherwise, we need to clean up.
	// %rax should be preserved, but we're free to use any other caller-saved register.
	
	movq (%rsp), %r8 // Fetch the return address.
	add $0x28, %rsp // Clean up the stack.
	movq %r8, (%rsp) // Put the return address back where it belongs.
	ret
.type z_execl, @function ; .size z_execl, .-z_execl

.globl z_execle ; .align 4,0x90 ; z_execle:
	movq (%rsp),%rax
	sub $0x28, %rsp
	movq %rax, (%rsp)
	movq %rsi, 0x8(%rsp)
	movq %rdx, 0x10(%rsp)
	movq %rcx, 0x18(%rsp)
	movq %r8,  0x20(%rsp)
	movq %r9,  0x28(%rsp)

	// Find the env argument; it is the array element after the
	// argv NULL pointer, which cannot be located before argv[1].
	lea 0x10(%rsp),%r8
	movq (%r8), %rdx
1:	add $0x8, %r8
	test %rdx, %rdx
	movq (%r8), %rdx
	jnz 1b
	
	lea 0x8(%rsp),%rsi
	call z_execve

	movq (%rsp), %r8 
	add $0x28, %rsp
	movq %r8, (%rsp)
	ret
.type z_execle, @function ; .size z_execle, .-z_execle
diff mbox

Patch

diff --git a/posix/execl.c b/posix/execl.c
index 102d19d..8b8a324 100644
--- a/posix/execl.c
+++ b/posix/execl.c
@@ -16,58 +16,31 @@ 
    <http://www.gnu.org/licenses/>.  */
 
 #include <unistd.h>
+#include <errno.h>
 #include <stdarg.h>
-#include <stddef.h>
-#include <stdlib.h>
-#include <string.h>
-
-#include <stackinfo.h>
-
+#include <sys/param.h>
 
 /* Execute PATH with all arguments after PATH until
    a NULL pointer and environment from `environ'.  */
 int
 execl (const char *path, const char *arg, ...)
 {
-#define INITIAL_ARGV_MAX 1024
-  size_t argv_max = INITIAL_ARGV_MAX;
-  const char *initial_argv[INITIAL_ARGV_MAX];
-  const char **argv = initial_argv;
-  va_list args;
-
-  argv[0] = arg;
-
-  va_start (args, arg);
-  unsigned int i = 0;
-  while (argv[i++] != NULL)
-    {
-      if (i == argv_max)
-	{
-	  argv_max *= 2;
-	  const char **nptr = realloc (argv == initial_argv ? NULL : argv,
-				       argv_max * sizeof (const char *));
-	  if (nptr == NULL)
-	    {
-	      if (argv != initial_argv)
-		free (argv);
-	      va_end (args);
-	      return -1;
-	    }
-	  if (argv == initial_argv)
-	    /* We have to copy the already filled-in data ourselves.  */
-	    memcpy (nptr, argv, i * sizeof (const char *));
-
-	  argv = nptr;
-	}
-
-      argv[i] = va_arg (args, const char *);
-    }
-  va_end (args);
-
-  int ret = __execve (path, (char *const *) argv, __environ);
-  if (argv != initial_argv)
-    free (argv);
-
-  return ret;
+  int argc;
+  va_list ap;
+  va_start (ap, arg);
+  for (argc = 1; va_arg (ap, const char *); argc++)
+    continue;
+  va_end (ap);
+
+  int i;
+  char *argv[argc+1];
+  va_start (ap, arg);
+  argv[0] = (char*) arg;
+  for (i = 1; i < argc; i++)
+     argv[i] = va_arg (ap, char *);
+  argv[i] = NULL;
+  va_end (ap);
+
+  return __execve (path, argv, __environ);
 }
 libc_hidden_def (execl)
diff --git a/posix/execle.c b/posix/execle.c
index 8edc03a..1a0c9ee 100644
--- a/posix/execle.c
+++ b/posix/execle.c
@@ -17,57 +17,31 @@ 
 
 #include <unistd.h>
 #include <stdarg.h>
-#include <stddef.h>
-#include <stdlib.h>
-#include <string.h>
-
-#include <stackinfo.h>
+#include <errno.h>
+#include <sys/param.h>
 
 /* Execute PATH with all arguments after PATH until a NULL pointer,
    and the argument after that for environment.  */
 int
 execle (const char *path, const char *arg, ...)
 {
-#define INITIAL_ARGV_MAX 1024
-  size_t argv_max = INITIAL_ARGV_MAX;
-  const char *initial_argv[INITIAL_ARGV_MAX];
-  const char **argv = initial_argv;
-  va_list args;
-  argv[0] = arg;
-
-  va_start (args, arg);
-  unsigned int i = 0;
-  while (argv[i++] != NULL)
-    {
-      if (i == argv_max)
-	{
-	  argv_max *= 2;
-	  const char **nptr = realloc (argv == initial_argv ? NULL : argv,
-				       argv_max * sizeof (const char *));
-	  if (nptr == NULL)
-	    {
-	      if (argv != initial_argv)
-		free (argv);
-	      va_end (args);
-	      return -1;
-	    }
-	  if (argv == initial_argv)
-	    /* We have to copy the already filled-in data ourselves.  */
-	    memcpy (nptr, argv, i * sizeof (const char *));
-
-	  argv = nptr;
-	}
-
-      argv[i] = va_arg (args, const char *);
-    }
-
-  const char *const *envp = va_arg (args, const char *const *);
-  va_end (args);
-
-  int ret = __execve (path, (char *const *) argv, (char *const *) envp);
-  if (argv != initial_argv)
-    free (argv);
-
-  return ret;
+  int argc;
+  va_list ap;
+  va_start (ap, arg);
+  for (argc = 1; va_arg (ap, const char *); argc++)
+    continue;
+  va_end (ap);
+
+  int i;
+  char *argv[argc+1];
+  char **envp;
+  va_start (ap, arg);
+  argv[0] = (char*) arg;
+  for (i = 1; i < argc; i++)
+     argv[i] = va_arg (ap, char *);
+  envp = va_arg (ap, char **);
+  va_end (ap);
+
+  return __execve (path, argv, envp);
 }
 libc_hidden_def (execle)
diff --git a/posix/execlp.c b/posix/execlp.c
index 6700994..a0e1859 100644
--- a/posix/execlp.c
+++ b/posix/execlp.c
@@ -17,11 +17,8 @@ 
 
 #include <unistd.h>
 #include <stdarg.h>
-#include <stddef.h>
-#include <stdlib.h>
-#include <string.h>
-
-#include <stackinfo.h>
+#include <errno.h>
+#include <sys/param.h>
 
 /* Execute FILE, searching in the `PATH' environment variable if
    it contains no slashes, with all arguments after FILE until a
@@ -29,45 +26,22 @@ 
 int
 execlp (const char *file, const char *arg, ...)
 {
-#define INITIAL_ARGV_MAX 1024
-  size_t argv_max = INITIAL_ARGV_MAX;
-  const char *initial_argv[INITIAL_ARGV_MAX];
-  const char **argv = initial_argv;
-  va_list args;
-
-  argv[0] = arg;
-
-  va_start (args, arg);
-  unsigned int i = 0;
-  while (argv[i++] != NULL)
-    {
-      if (i == argv_max)
-	{
-	  argv_max *= 2;
-	  const char **nptr = realloc (argv == initial_argv ? NULL : argv,
-				       argv_max * sizeof (const char *));
-	  if (nptr == NULL)
-	    {
-	      if (argv != initial_argv)
-		free (argv);
-	      va_end (args);
-	      return -1;
-	    }
-	  if (argv == initial_argv)
-	    /* We have to copy the already filled-in data ourselves.  */
-	    memcpy (nptr, argv, i * sizeof (const char *));
-
-	  argv = nptr;
-	}
-
-      argv[i] = va_arg (args, const char *);
-    }
-  va_end (args);
-
-  int ret = execvp (file, (char *const *) argv);
-  if (argv != initial_argv)
-    free (argv);
-
-  return ret;
+  int argc;
+  va_list ap;
+  va_start (ap, arg);
+  for (argc = 1; va_arg (ap, const char *); argc++)
+    continue;
+  va_end (ap);
+
+  int i;
+  char *argv[argc+1];
+  va_start (ap, arg);
+  argv[0] = (char*) arg;
+  for (i = 1; i < argc; i++)
+     argv[i] = va_arg (ap, char *);
+  argv[i] = NULL;
+  va_end (ap);
+
+  return __execvpe (file, argv, __environ);
 }
 libc_hidden_def (execlp)