
[2/3] posix: Use posix_spawn on popen

Message ID 20180915151622.17789-2-adhemerval.zanella@linaro.org
State New
Series [1/3] posix: Add internal symbols for posix_spawn interface

Commit Message

Adhemerval Zanella Netto Sept. 15, 2018, 3:16 p.m. UTC
This patch makes popen use posix_spawn instead of fork and execl.  On Linux
this has the advantage of much lower memory consumption (usually a minimum
of 32 KiB for the mmap'ed stack area).

Checked on x86_64-linux-gnu and i686-linux-gnu.

	* libio/iopopen.c (_IO_new_proc_open): use posix_spawn instead of
	fork and execl.
---
 ChangeLog       |  3 ++
 libio/iopopen.c | 97 +++++++++++++++++++++++++++++--------------------
 2 files changed, 61 insertions(+), 39 deletions(-)
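
To make the idea concrete outside of libio, here is a minimal user-space
sketch of a read-mode popen built on posix_spawn.  It is not the code from
the patch: spawn_read_pipe is a hypothetical helper name, it hardcodes
"/bin/sh" rather than _PATH_BSHELL, and it assumes the caller still has
stdin, stdout and stderr open, so both pipe descriptors are 3 or higher.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <spawn.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: run COMMAND through /bin/sh with its stdout
   connected to a pipe.  Returns 0 and fills *PID and *READ_FD on success,
   or an errno-style code on failure.  Assumes fds 0-2 are open in the
   caller, so neither pipe end collides with STDOUT_FILENO.  */
int
spawn_read_pipe (const char *command, pid_t *pid, int *read_fd)
{
  int fds[2];
  posix_spawn_file_actions_t fa;
  char *argv[] = { "sh", "-c", (char *) command, NULL };
  int err;

  assert (command != NULL && pid != NULL && read_fd != NULL);

  if (pipe (fds) != 0)
    return errno;

  err = posix_spawn_file_actions_init (&fa);
  if (err != 0)
    {
      close (fds[0]);
      close (fds[1]);
      return err;
    }

  /* In the child: make the write end stdout, then drop both pipe fds.  */
  err = posix_spawn_file_actions_adddup2 (&fa, fds[1], STDOUT_FILENO);
  if (err == 0)
    err = posix_spawn_file_actions_addclose (&fa, fds[0]);
  if (err == 0)
    err = posix_spawn_file_actions_addclose (&fa, fds[1]);
  if (err == 0)
    err = posix_spawn (pid, "/bin/sh", &fa, NULL, argv, environ);

  posix_spawn_file_actions_destroy (&fa);
  close (fds[1]);		/* The parent never writes.  */
  if (err != 0)
    {
      close (fds[0]);
      return err;
    }

  *read_fd = fds[0];
  return 0;
}
```

The caller reads from *read_fd until end of file, closes it, and then
reaps the child with waitpid, which is roughly what pclose does.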

Comments

David Newall Sept. 16, 2018, 5:13 a.m. UTC | #1
It seems to me that there are still reasonable questions about whether 
to use posix_spawn or vfork ("posix_spawn is a badly designed API").  
For (well over) 30 years, I've understood that vfork was the go-to call 
for a fork/exec scenario, so, what is the technical problem with using 
it for popen and system?  (I'm not asking about vfork's overall 
technical merits, I'm asking exclusively about using it for popen and 
system.)
Rich Felker Sept. 17, 2018, 2:50 p.m. UTC | #2
On Sun, Sep 16, 2018 at 02:43:02PM +0930, David Newall wrote:
> It seems to me that there are still reasonable questions about
> whether to use posix_spawn or vfork ("posix_spawn is a badly
> designed API").  For (well over) 30 years, I've understood that
> vfork was the go-to call for a fork/exec scenario, so, what is the
> technical problem with using it for popen and system?  (I'm not
> asking about vfork's overall technical merits, I'm asking
> exclusively about using it for popen and system.)

The historical contract of vfork is that you can basically do nothing
after it returns in the child except for exec or _exit, and there are
good reasons for this; sharing memory and stack with the parent has
lots of subtle issues, especially in the presence of a non-dead-stupid
compiler.

One thing to note is that vfork is completely unsafe to use as
documented if any signal handlers are installed, unless you block all
signals before calling vfork, in which case the exec'd process will
inherit a fully-blocked signal mask which is probably not what you
want. Otherwise signal handlers may wrongly run in the child that's
sharing memory with the parent.

The posix_spawn implementation already takes care of these issues by
not sharing the stack and uninstalling any signal handlers before
unmasking signals.
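
As a caller-level illustration of the same concern (not a description of
the implementation's internals), the sketch below asks posix_spawn to start
the child with an empty signal mask and default dispositions for the tty
signals, whatever the parent's mask and handlers are at the time.  The
helper name spawn_with_default_tty_signals and the choice of SIGINT,
SIGQUIT and SIGTSTP are assumptions made for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <spawn.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical helper: run COMMAND via /bin/sh with an empty signal mask
   and SIG_DFL for the tty signals, then wait for it.  Returns 0 and
   stores the raw wait status in *WAIT_STATUS, or an errno-style code.  */
int
spawn_with_default_tty_signals (const char *command, int *wait_status)
{
  posix_spawnattr_t attr;
  sigset_t defaults, mask;
  char *argv[] = { "sh", "-c", (char *) command, NULL };
  pid_t pid;
  int err;

  assert (command != NULL && wait_status != NULL);

  sigemptyset (&defaults);
  sigaddset (&defaults, SIGINT);
  sigaddset (&defaults, SIGQUIT);
  sigaddset (&defaults, SIGTSTP);
  sigemptyset (&mask);

  err = posix_spawnattr_init (&attr);
  if (err != 0)
    return err;

  err = posix_spawnattr_setsigdefault (&attr, &defaults);
  if (err == 0)
    err = posix_spawnattr_setsigmask (&attr, &mask);
  if (err == 0)
    err = posix_spawnattr_setflags (&attr, POSIX_SPAWN_SETSIGDEF
					   | POSIX_SPAWN_SETSIGMASK);
  if (err == 0)
    err = posix_spawn (&pid, "/bin/sh", NULL, &attr, argv, environ);

  posix_spawnattr_destroy (&attr);
  if (err != 0)
    return err;

  /* Retry if a signal handler in the parent interrupts the wait.  */
  while (waitpid (pid, wait_status, 0) < 0)
    if (errno != EINTR)
      return errno;
  return 0;
}
```

With plain vfork the equivalent would require resetting the mask and the
handlers by hand in the child, which is exactly the window where sharing
memory with the parent is dangerous.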

Rich
Adhemerval Zanella Netto Sept. 17, 2018, 5:32 p.m. UTC | #3
On 17/09/2018 07:50, Rich Felker wrote:
> On Sun, Sep 16, 2018 at 02:43:02PM +0930, David Newall wrote:
>> It seems to me that there are still reasonable questions about
>> whether to use posix_spawn or vfork ("posix_spawn is a badly
>> designed API").  For (well over) 30 years, I've understood that
>> vfork was the go-to call for a fork/exec scenario, so, what is the
>> technical problem with using it for popen and system?  (I'm not
>> asking about vfork's overall technical merits, I'm asking
>> exclusively about using it for popen and system.)
> 
> The historical contract of vfork is that you can basically do nothing
> after it returns in the child except for exec or _exit, and there are
> good reasons for this; sharing memory and stack with the parent has
> lots of subtle issues, especially in the presence of a non-dead-stupid
> compiler.
> 
> One thing to note is that vfork is completely unsafe to use as
> documented if any signal handlers are installed, unless you block all
> signals before calling vfork, in which case the exec'd process will
> inherit a fully-blocked signal mask which is probably not what you
> want. Otherwise signal handlers may wrongly run in the child that's
> sharing memory with the parent.
> 
> The posix_spawn implementation already takes care of these issues by
> not sharing the stack and uninstalling any signal handlers before
> unmasking signals.
> 
> Rich
> 

And the posix_spawn implementation on Linux provides the same performance
improvements that vfork aims to provide.
David Newall Sept. 18, 2018, 1:30 a.m. UTC | #4
On 18/09/18 00:20, Rich Felker wrote:
> The historical contract of vfork is that you can basically do nothing
> after it returns in the child except for exec or _exit
Yes, that is true.  My understanding is that vfork was intended only as 
a fast way of doing a fork/exec sequence.

> One thing to note is that vfork is completely unsafe to use as
> documented if any signal handlers are installed, unless you block all
> signals before calling vfork, in which case the exec'd process will
> inherit a fully-blocked signal mask which is probably not what you
> want. Otherwise signal handlers may wrongly run in the child that's
> sharing memory with the parent.

You're saying that the kernel will deliver a signal to the child pid when 
it was the parent pid that was signalled.  Can that really happen?
Rich Felker Sept. 18, 2018, 3:12 a.m. UTC | #5
On Tue, Sep 18, 2018 at 11:00:48AM +0930, David Newall wrote:
> On 18/09/18 00:20, Rich Felker wrote:
> >The historical contract of vfork is that you can basically do nothing
> >after it returns in the child except for exec or _exit
> Yes, that is true.  My understanding is that vfork was intended
> only as a fast way of doing a fork/exec sequence.
> 
> >One thing to note is that vfork is completely unsafe to use as
> >documented if any signal handlers are installed, unless you block all
> >signals before calling vfork, in which case the exec'd process will
> >inherit a fully-blocked signal mask which is probably not what you
> >want. Otherwise signal handlers may wrongly run in the child that's
> >sharing memory with the parent.
> 
> You're saying that the kernel will deliver a signal to the child pid
> when it was the parent pid that was signalled.  Can that really happen?

There are various conditions under which signals are delivered to an
entire process group; the most well-known is tty signals from ^C, ^\,
^Z, SIGWINCH, etc. to the tty's foreground process group. After vfork
these would be delivered to both the parent and child while they share
memory. The parent is suspended and won't act until the child execs or
exits, but mere execution of the signal handler in the child is
observably and dangerously wrong behavior.

Rich
Zack Weinberg Sept. 18, 2018, 6:01 p.m. UTC | #6
On Sun, Sep 16, 2018 at 1:13 AM David Newall <glibc@davidnewall.com> wrote:
> It seems to me that there are still reasonable questions about whether
> to use posix_spawn or vfork ("posix_spawn is a badly designed API").

When I said to Sergey that I would rather see the problem they
reported addressed using vfork instead of posix_spawn, I was giving
advice to a new contributor.  I really _would_ rather see it addressed
that way, and I also thought that they were more likely to succeed in
writing those patches.

Adhemerval is not a new contributor and they deeply understand the
problems in this area.  Their patches strike me as a step generally in
the right direction.  I don't have time to review them in detail, but
I don't object to them.  However, I do think some of the fine details
demonstrate why an API that allows for arbitrary computation and
system calls before exec would be preferable, such as there being "no
safe way to clear close-on-exec in the child" (because, IIUC, there's
no posix_spawn action to do that).

zw
Rich Felker Sept. 19, 2018, 5:17 a.m. UTC | #7
On Tue, Sep 18, 2018 at 02:01:29PM -0400, Zack Weinberg wrote:
> On Sun, Sep 16, 2018 at 1:13 AM David Newall <glibc@davidnewall.com> wrote:
> > It seems to me that there are still reasonable questions about whether
> > to use posix_spawn or vfork ("posix_spawn is a badly designed API").
> 
> When I said to Sergey that I would rather see the problem they
> reported addressed using vfork instead of posix_spawn, I was giving
> advice to a new contributor.  I really _would_ rather see it addressed
> that way, and I also thought that they were more likely to succeed in
> writing those patches.
> 
> Adhemerval is not a new contributor and they deeply understand the
> problems in this area.  Their patches strike me as a step generally in
> the right direction.  I don't have time to review them in detail, but
> I don't object to them.  However, I do think some of the fine details
> demonstrate why an API that allows for arbitrary computation and
> system calls before exec would be preferable, such as there being "no
> safe way to clear close-on-exec in the child" (because, IIUC, there's
> no posix_spawn action to do that).

The resolution to Austin Group issue #411 made it so adddup2(n,n) does
what you want:

http://austingroupbugs.net/view.php?id=411

Rich
Adhemerval Zanella Netto Sept. 19, 2018, 3:53 p.m. UTC | #8
On 18/09/2018 22:17, Rich Felker wrote:
> On Tue, Sep 18, 2018 at 02:01:29PM -0400, Zack Weinberg wrote:
>> On Sun, Sep 16, 2018 at 1:13 AM David Newall <glibc@davidnewall.com> wrote:
>>> It seems to me that there are still reasonable questions about whether
>>> to use posix_spawn or vfork ("posix_spawn is a badly designed API").
>>
>> When I said to Sergey that I would rather see the problem they
>> reported addressed using vfork instead of posix_spawn, I was giving
>> advice to a new contributor.  I really _would_ rather see it addressed
>> that way, and I also thought that they were more likely to succeed in
>> writing those patches.
>>
>> Adhemerval is not a new contributor and they deeply understand the
>> problems in this area.  Their patches strike me as a step generally in
>> the right direction.  I don't have time to review them in detail, but
>> I don't object to them.  However, I do think some of the fine details
>> demonstrate why an API that allows for arbitrary computation and
>> system calls before exec would be preferable, such as there being "no
>> safe way to clear close-on-exec in the child" (because, IIUC, there's
>> no posix_spawn action to do that).
> 
> The resolution to Austin Group issue #411 made it so adddup2(n,n) does
> what you want:
> 
> http://austingroupbugs.net/view.php?id=411

It has been tracked at https://sourceware.org/bugzilla/show_bug.cgi?id=23640 
as well.  I am not fond of giving adddup2(n,n) semantics that differ from
dup2(n,n), but I don't have a better straightforward solution either.
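
For reference, the patch sidesteps the adddup2(n,n) question by first
moving the descriptor to a different number when it already equals the
target, so the later adddup2 is never a same-fd duplication.  A standalone
sketch of that step follows; move_fd_off_target is a hypothetical helper
name, and unlike the patch (which passes 0 as the minimum) it asks for a
descriptor strictly above the target.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: if FD equals TARGET, duplicate it to a higher
   close-on-exec descriptor, close the original and return the new one.
   Otherwise return FD unchanged.  Returns -1 with errno set if the
   duplication fails, in which case FD is left open.  */
int
move_fd_off_target (int fd, int target)
{
  assert (fd >= 0 && target >= 0);

  if (fd != target)
    return fd;

  /* F_DUPFD_CLOEXEC sets close-on-exec atomically, so the new descriptor
     cannot leak into a child started concurrently by another thread.  */
  int tmp = fcntl (fd, F_DUPFD_CLOEXEC, target + 1);
  if (tmp < 0)
    return -1;

  close (fd);
  return tmp;
}
```

The returned descriptor can then be passed to
posix_spawn_file_actions_adddup2 with the original target; since dup2
onto a different number does not copy the close-on-exec flag, the child
ends up with a plain inheritable descriptor.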
Adhemerval Zanella Netto Oct. 17, 2018, 5:11 p.m. UTC | #9
Ping.

Adhemerval Zanella Netto Oct. 19, 2018, 8:23 p.m. UTC | #10
I also fixed BZ#17490.  Although the POSIX pthread_atfork description [1]
lists only 'fork' as the function that should run the atfork handlers, and
the popen description [2] states that:

  '[...] shall be *as if* a child process were created within the popen() 
   call using the fork() function [...]' 

other libcs/systems seem to follow the idea that atfork handlers should not
be run for popen:

libc/system     | runs atfork handlers | notes
----------------|----------------------|----------------------------------------
FreeBSD master  |        no            | uses vfork
Solaris 11      |        no            |
MacOSX (11.13)  |        no            | implemented through posix_spawn syscall
----------------|----------------------|----------------------------------------

I also agree that, as with posix_spawn and system, the idea of popen is to
spawn a different binary, so the POSIX rationale for running the atfork
handlers (avoiding inconsistent internal process state) does not really
apply, and running them might be unsafe in some cases.

[1] http://pubs.opengroup.org/onlinepubs/9699919799/
[2] http://pubs.opengroup.org/onlinepubs/9699919799/
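
A small way to check this on a given system is to count how often a
"prepare" handler fires.  The sketch below is only an assumption-laden
probe: the function names are made up for the example, the handler cannot
be unregistered once installed, and whether posix_spawn triggers it is up
to the libc being tested.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <spawn.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

static int prepare_calls;

static void
count_prepare (void)
{
  prepare_calls++;
}

/* Register the counting handler; returns 0 on success.  */
int
install_prepare_counter (void)
{
  return pthread_atfork (count_prepare, NULL, NULL);
}

/* Number of prepare-handler calls caused by one fork.  */
int
prepare_calls_for_fork (void)
{
  int before = prepare_calls;
  pid_t pid = fork ();
  if (pid == 0)
    _exit (0);			/* Child: do nothing.  */
  if (pid > 0)
    waitpid (pid, NULL, 0);
  return prepare_calls - before;
}

/* Number of prepare-handler calls caused by one posix_spawn.  */
int
prepare_calls_for_spawn (void)
{
  char *argv[] = { "sh", "-c", "exit 0", NULL };
  int before = prepare_calls;
  pid_t pid;
  if (posix_spawn (&pid, "/bin/sh", NULL, NULL, argv, environ) == 0)
    waitpid (pid, NULL, 0);
  return prepare_calls - before;
}
```

After install_prepare_counter, prepare_calls_for_fork should report one
call; the value from prepare_calls_for_spawn shows what the local
posix_spawn does, and the same probe could be pointed at popen.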


Patch

diff --git a/libio/iopopen.c b/libio/iopopen.c
index 2eff45b4c8..3cce2e5596 100644
--- a/libio/iopopen.c
+++ b/libio/iopopen.c
@@ -34,7 +34,8 @@ 
 #include <not-cancel.h>
 #include <sys/types.h>
 #include <sys/wait.h>
-#include <kernel-features.h>
+#include <spawn.h>
+#include <paths.h>
 
 struct _IO_proc_file
 {
@@ -63,9 +64,8 @@  FILE *
 _IO_new_proc_open (FILE *fp, const char *command, const char *mode)
 {
   int read_or_write;
-  int parent_end, child_end;
   int pipe_fds[2];
-  pid_t child_pid;
+  int op;
 
   int do_read = 0;
   int do_write = 0;
@@ -108,59 +108,78 @@  _IO_new_proc_open (FILE *fp, const char *command, const char *mode)
 
   if (do_read)
     {
-      parent_end = pipe_fds[0];
-      child_end = pipe_fds[1];
+      op = 0;
       read_or_write = _IO_NO_WRITES;
     }
   else
     {
-      parent_end = pipe_fds[1];
-      child_end = pipe_fds[0];
+      op = 1;
       read_or_write = _IO_NO_READS;
     }
 
-  ((_IO_proc_file *) fp)->pid = child_pid = __fork ();
-  if (child_pid == 0)
-    {
-      int child_std_end = do_read ? 1 : 0;
-      struct _IO_proc_file *p;
-
-      if (child_end != child_std_end)
-	__dup2 (child_end, child_std_end);
-      else
-	/* The descriptor is already the one we will use.  But it must
-	   not be marked close-on-exec.  Undo the effects.  */
-	__fcntl (child_end, F_SETFD, 0);
-      /* POSIX.2:  "popen() shall ensure that any streams from previous
-         popen() calls that remain open in the parent process are closed
-	 in the new child process." */
-      for (p = proc_file_chain; p; p = p->next)
-	{
-	  int fd = _IO_fileno ((FILE *) p);
+  {
+    posix_spawn_file_actions_t fa;
+    /* posix_spawn_file_actions_init does not fail.  */
+    __posix_spawn_file_actions_init (&fa);
 
-	  /* If any stream from previous popen() calls has fileno
-	     child_std_end, it has been already closed by the dup2 syscall
-	     above.  */
-	  if (fd != child_std_end)
-	    __close_nocancel (fd);
-	}
+    /* The descriptor is already the one the child will use.  In this case
+       it must be moved to another one, otherwise there is no safe way to
+       remove the close-on-exec flag in the child without creating a FD leak
+       race in the parent.  */
+    if (pipe_fds[1 - op] == 1 - op)
+      {
+	int tmp = __fcntl (1 - op, F_DUPFD_CLOEXEC, 0);
+	if (tmp < 0)
+	  goto spawn_failure;
+	__close_nocancel (pipe_fds[1 - op]);
+	pipe_fds[1 - op] = tmp;
+      }
 
-      execl ("/bin/sh", "sh", "-c", command, (char *) 0);
-      _exit (127);
-    }
-  __close_nocancel (child_end);
-  if (child_pid < 0)
+    if (__posix_spawn_file_actions_adddup2 (&fa, pipe_fds[1 - op], 1 - op)
+	!= 0)
+      goto spawn_failure;
+
+    /* POSIX.2: "popen() shall ensure that any streams from previous popen()
+       calls that remain open in the parent process are closed in the new
+       child process." */
+    for (struct _IO_proc_file *p = proc_file_chain; p; p = p->next)
+      {
+	int fd = _IO_fileno ((FILE *) p);
+
+	/* If any stream from previous popen() calls has fileno
+	   1 - op, it will already have been closed by the dup2 file
+	   action above.  */
+	if (fd != 1 - op
+	    && __posix_spawn_file_actions_addclose (&fa, fd) != 0)
+	  goto spawn_failure;
+      }
+
+    if (__posix_spawn (&((_IO_proc_file *) fp)->pid, _PATH_BSHELL, &fa, 0,
+		     (char *const[]){ (char*) "sh", (char*) "-c",
+		     (char *) command, NULL }, __environ) != 0)
+      {
+      spawn_failure:
+	__posix_spawn_file_actions_destroy (&fa);
+	__close_nocancel (pipe_fds[1 - op]);
+	__set_errno (ENOMEM);
+	return NULL;
+      }
+
+    __posix_spawn_file_actions_destroy (&fa);
+  }
+  __close_nocancel (pipe_fds[1 - op]);
+  if (((_IO_proc_file *) fp)->pid < 0)
     {
-      __close_nocancel (parent_end);
+      __close_nocancel (pipe_fds[op]);
       return NULL;
     }
 
   if (!do_cloexec)
     /* Undo the effects of the pipe2 call which set the
        close-on-exec flag.  */
-    __fcntl (parent_end, F_SETFD, 0);
+    __fcntl (pipe_fds[op], F_SETFD, 0);
 
-  _IO_fileno (fp) = parent_end;
+  _IO_fileno (fp) = pipe_fds[op];
 
   /* Link into proc_file_chain. */
 #ifdef _IO_MTSAFE_IO