Call exit directly in clone (BZ #21512)

Message ID 1498221450-8381-1-git-send-email-adhemerval.zanella@linaro.org
State New

Commit Message

Adhemerval Zanella Netto June 23, 2017, 12:37 p.m. UTC
On aarch64, alpha, arm, hppa, mips, nios2, powerpc, s390, sh,
sparc, tile, and x86_64 the clone implementation jumps to _exit
after the child function returns, and _exit ends the process by
calling exit_group.  This behavior has a small issue: a thread
created with CLONE_THREAD by calling clone directly will exit the
whole thread group instead of just the created thread.  Also,
microblaze, ia64, i386, and m68k differ by calling the exit
syscall directly.

This patch changes all architectures to call the exit syscall
directly, as microblaze, ia64, i386, and m68k do.  This does not
change glibc internal behavior in any way, since the only internal
user of the clone implementation, posix_spawn, calls _exit directly
in the created child (fork uses a direct call to the clone syscall).
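
The difference between the two syscalls can be sketched in C.  This is
only an illustration, not glibc code: every name in it is hypothetical,
it assumes Linux with a stack that grows down, and it invokes the
syscalls explicitly so the result does not depend on which glibc
version provides clone.  A scratch process starts one CLONE_THREAD
thread and then tries to exit with status 2.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <sched.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* All names below are hypothetical; none of this is glibc code.  */

/* Stack for the cloned thread; assumes the stack grows down.  */
static char thread_stack[64 * 1024] __attribute__ ((aligned (16)));

/* Thread body that ends only the calling thread (exit syscall).  */
static int
thread_exit_only (void *arg)
{
  (void) arg;
  syscall (SYS_exit, 1);
  return 0;                     /* Not reached.  */
}

/* Thread body that ends the whole thread group, which is what _exit
   does through exit_group.  */
static int
thread_exit_group (void *arg)
{
  (void) arg;
  syscall (SYS_exit_group, 1);
  return 0;                     /* Not reached.  */
}

/* Fork a scratch process, run FN in it as a CLONE_THREAD thread, and
   return the exit status of the scratch process (-1 on error).  */
static int
status_after_thread (int (*fn) (void *))
{
  pid_t pid = fork ();
  if (pid == -1)
    return -1;

  if (pid == 0)
    {
      /* Non-zero sentinel; the kernel stores 0 here and wakes the futex
         when the thread exits (CLONE_CHILD_CLEARTID).  */
      pid_t ctid = 1;
      int flags = CLONE_THREAD | CLONE_VM | CLONE_SIGHAND
                  | CLONE_CHILD_CLEARTID;

      if (clone (fn, thread_stack + sizeof thread_stack, flags, NULL,
                 /* ptid */ NULL, /* tls */ NULL, &ctid) == -1)
        _exit (3);

      /* Wait until the thread is gone, then leave with status 2.  */
      while (__atomic_load_n (&ctid, __ATOMIC_ACQUIRE) != 0)
        syscall (SYS_futex, &ctid, FUTEX_WAIT, 1, NULL, NULL, 0);

      _exit (2);
    }

  int status;
  if (waitpid (pid, &status, 0) != pid || !WIFEXITED (status))
    return -1;
  return WEXITSTATUS (status);
}
```

With thread_exit_only the scratch process should survive the thread and
report its own status 2; with thread_exit_group the thread's status 1
should win, which is the behavior BZ #21512 describes.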

Checked on x86_64-linux-gnu, i686-linux-gnu, aarch64-linux-gnu,
powerpc-linux-gnu, powerpc64le-linux-gnu, sparc64-linux-gnu,
and sparcv9-linux-gnu.

	[BZ #21512]
	* sysdeps/unix/sysv/linux/aarch64/clone.S (__clone): Call exit
	syscall instead of jump to _exit.
	(CLONE_VM_BIT): Remove unused define.
	(CLONE_VM): Likewise.
	(CLONE_THREAD_BIT): Likewise.
	(CLONE_THREAD): Likewise.
	* sysdeps/unix/sysv/linux/alpha/clone.S (__clone): Likewise.
	(CLONE_VM): Remove unused define.
	* sysdeps/unix/sysv/linux/arm/clone.S (__clone): Likewise.
	(CLONE_VM): Remove unused define.
	(CLONE_THREAD): Likewise.
	* sysdeps/unix/sysv/linux/i386/clone.S (CLONE_VM): Likewise.
	* sysdeps/unix/sysv/linux/ia64/clone2.S (__clone2): Call exit
	syscall instead of jump to _exit.
	* sysdeps/unix/sysv/linux/hppa/clone.S (__clone): Likewise.
	* sysdeps/unix/sysv/linux/mips/clone.S (__clone): Likewise.
	(CLONE_VM): Remove unused define.
	(CLONE_THREAD): Likewise.
	* sysdeps/unix/sysv/linux/nios2/clone.S (__clone): Likewise.
	(CLONE_VM): Remove unused define.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/clone.S (__clone):
	Likewise.
	(CLONE_VM): Remove unused define.
	(CLONE_THREAD): Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/clone.S (__clone):
	Likewise.
	(CLONE_VM): Remove unused define.
	(CLONE_THREAD): Likewise.
	* sysdeps/unix/sysv/linux/s390/s390-32/clone.S (__clone): Likewise.
	* sysdeps/unix/sysv/linux/s390/s390-64/clone.S (__clone): Likewise.
	* sysdeps/unix/sysv/linux/sh/clone.S  (__clone): Likewise.
	* sysdeps/unix/sysv/linux/sparc/sparc32/clone.S (__clone): Likewise.
	(CLONE_VM): Remove unused define.
	* sysdeps/unix/sysv/linux/sparc/sparc64/clone.S (__clone): Likewise.
	(CLONE_VM): Remove unused define.
	* sysdeps/unix/sysv/linux/tile/clone.S (__clone): Likewise.
	* sysdeps/unix/sysv/linux/x86_64/clone.S (__clone): Likewise.
	(CLONE_VM): Remove unused define.
	* sysdeps/unix/sysv/linux/Makefile (tests): Add tst-clone3.
	* sysdeps/unix/sysv/linux/tst-clone3.c: New file.
---
 sysdeps/unix/sysv/linux/Makefile                  |  4 +-
 sysdeps/unix/sysv/linux/aarch64/clone.S           |  9 +--
 sysdeps/unix/sysv/linux/alpha/clone.S             | 11 +--
 sysdeps/unix/sysv/linux/arm/clone.S               |  6 +-
 sysdeps/unix/sysv/linux/hppa/clone.S              |  6 +-
 sysdeps/unix/sysv/linux/i386/clone.S              |  2 -
 sysdeps/unix/sysv/linux/ia64/clone2.S             |  8 +-
 sysdeps/unix/sysv/linux/m68k/clone.S              |  2 -
 sysdeps/unix/sysv/linux/mips/clone.S              | 12 +--
 sysdeps/unix/sysv/linux/nios2/clone.S             | 17 +---
 sysdeps/unix/sysv/linux/powerpc/powerpc32/clone.S |  7 +-
 sysdeps/unix/sysv/linux/powerpc/powerpc64/clone.S | 11 +--
 sysdeps/unix/sysv/linux/s390/s390-32/clone.S      |  2 +-
 sysdeps/unix/sysv/linux/s390/s390-64/clone.S      |  2 +-
 sysdeps/unix/sysv/linux/sh/clone.S                | 21 +----
 sysdeps/unix/sysv/linux/sparc/sparc32/clone.S     |  5 +-
 sysdeps/unix/sysv/linux/sparc/sparc64/clone.S     |  5 +-
 sysdeps/unix/sysv/linux/tile/clone.S              |  6 +-
 sysdeps/unix/sysv/linux/tst-clone3.c              | 95 +++++++++++++++++++++++
 sysdeps/unix/sysv/linux/x86_64/clone.S            |  5 +-
 20 files changed, 130 insertions(+), 106 deletions(-)
 create mode 100644 sysdeps/unix/sysv/linux/tst-clone3.c

Comments

Carlos O'Donell June 23, 2017, 12:55 p.m. UTC | #1
On 06/23/2017 08:37 AM, Adhemerval Zanella wrote:
> On aarch64, alpha, arm, hppa, mips, nios2, powerpc, s390, sh,
> sparc, tile, and x86_64 the clone implementation jumps to _exit
> after the child function returns, and _exit ends the process by
> calling exit_group.  This behavior has a small issue: a thread
> created with CLONE_THREAD by calling clone directly will exit the
> whole thread group instead of just the created thread.  Also,
> microblaze, ia64, i386, and m68k differ by calling the exit
> syscall directly.
>
> This patch changes all architectures to call the exit syscall
> directly, as microblaze, ia64, i386, and m68k do.  This does not
> change glibc internal behavior in any way, since the only internal
> user of the clone implementation, posix_spawn, calls _exit directly
> in the created child (fork uses a direct call to the clone syscall).
> 
> Checked on x86_64-linux-gnu, i686-linux-gnu, aarch64-linux-gnu,
> powerpc-linux-gnu, powerpc64le-linux-gnu, sparc64-linux-gnu,
> and sparcv9-linux-gnu.

How is it that this doesn't break threading completely on the existing
architectures that call _exit? Is it because NPTL calls the clone syscall
directly instead of calling the clone() function and nobody uses clone()
because of all the other problems it has?

I admit that only an explicit call to _exit() or exit() should terminate
every thread in the thread group, *and* looking at the kernel side code
I see it does try to terminate every thread in the group (zap_other_threads()).

The patch looks good to me, but I just want some clarification about my
question "Why haven't we noticed?" :-)

> diff --git a/sysdeps/unix/sysv/linux/Makefile b/sysdeps/unix/sysv/linux/Makefile
> index 8b340d4..9d6a2de 100644
> --- a/sysdeps/unix/sysv/linux/Makefile
> +++ b/sysdeps/unix/sysv/linux/Makefile
> @@ -49,8 +49,8 @@ sysdep_headers += sys/mount.h sys/acct.h sys/sysctl.h \
>  		  bits/mman-linux.h \
>  		  bits/siginfo-arch.h bits/siginfo-consts-arch.h
>  
> -tests += tst-clone tst-clone2 tst-fanotify tst-personality tst-quota \
> -	 tst-sync_file_range test-errno-linux
> +tests += tst-clone tst-clone2 tst-clone3 tst-fanotify tst-personality \
> +	 tst-quota tst-sync_file_range test-errno-linux
>  
>  # Generate the list of SYS_* macros for the system calls (__NR_* macros).
>  
> diff --git a/sysdeps/unix/sysv/linux/aarch64/clone.S b/sysdeps/unix/sysv/linux/aarch64/clone.S
> index 259ec07..905915a 100644
> --- a/sysdeps/unix/sysv/linux/aarch64/clone.S
> +++ b/sysdeps/unix/sysv/linux/aarch64/clone.S
> @@ -23,12 +23,6 @@
>  #define _ERRNO_H	1
>  #include <bits/errno.h>
>  
> -#define CLONE_VM_BIT      8
> -#define CLONE_VM          (1 << CLONE_VM_BIT)
> -
> -#define CLONE_THREAD_BIT  16
> -#define CLONE_THREAD      (1 << CLONE_THREAD_BIT)
> -
>  /* int clone(int (*fn)(void *arg),            x0
>  	     void *child_stack,               x1
>  	     int flags,                       x2
> @@ -84,7 +78,8 @@ thread_start:
>  	blr	x10
>  
>  	/* We are done, pass the return value through x0.  */
> -	b	HIDDEN_JUMPTARGET(_exit)
> +	mov	x8, #SYS_ify(exit)
> +	svc	0x0
>  	cfi_endproc
>  	.size thread_start, .-thread_start
>  
> diff --git a/sysdeps/unix/sysv/linux/alpha/clone.S b/sysdeps/unix/sysv/linux/alpha/clone.S
> index 20ae361..550461f 100644
> --- a/sysdeps/unix/sysv/linux/alpha/clone.S
> +++ b/sysdeps/unix/sysv/linux/alpha/clone.S
> @@ -23,8 +23,6 @@
>  #define _ERRNO_H	1
>  #include <bits/errno.h>
>  
> -#define CLONE_VM	0x00000100
> -
>  /* int clone(int (*fn)(void *arg), void *child_stack, int flags,
>  	     void *arg, pid_t *ptid, void *tls, pid_t *ctid);
>  
> @@ -100,13 +98,8 @@ thread_start:
>  	jsr	ra, (pv)
>  	ldgp	gp, 0(ra)
>  
> -	/* Call _exit rather than doing it inline for breakpoint purposes.  */
> -	mov	v0, a0
> -#ifdef PIC
> -	bsr	ra, HIDDEN_JUMPTARGET(_exit)	!samegp
> -#else
> -	jsr	ra, HIDDEN_JUMPTARGET(_exit)
> -#endif
> +	ldiq	v0, __NR_exit
> +	call_pal PAL_callsys
>  
>  	/* Die horribly.  */
>  	.align	4
> diff --git a/sysdeps/unix/sysv/linux/arm/clone.S b/sysdeps/unix/sysv/linux/arm/clone.S
> index a309add..f01968a 100644
> --- a/sysdeps/unix/sysv/linux/arm/clone.S
> +++ b/sysdeps/unix/sysv/linux/arm/clone.S
> @@ -24,9 +24,6 @@
>  #define _ERRNO_H	1
>  #include <bits/errno.h>
>  
> -#define CLONE_VM      0x00000100
> -#define CLONE_THREAD  0x00010000
> -
>  /* int clone(int (*fn)(void *arg), void *child_stack, int flags, void *arg,
>  	     pid_t *ptid, struct user_desc *tls, pid_t *ctid); */
>  
> @@ -76,7 +73,8 @@ PSEUDO_END (__clone)
>  	BLX (ip)
>  
>  	@ and we are done, passing the return value through r0
> -	b	PLTJMP(HIDDEN_JUMPTARGET(_exit))
> +	ldr	r7, =SYS_ify(clone)
> +	swi	0x0
>  
>  	.fnend
>  
> diff --git a/sysdeps/unix/sysv/linux/hppa/clone.S b/sysdeps/unix/sysv/linux/hppa/clone.S
> index d36b302..8c43944 100644
> --- a/sysdeps/unix/sysv/linux/hppa/clone.S
> +++ b/sysdeps/unix/sysv/linux/hppa/clone.S
> @@ -148,10 +148,10 @@ ENTRY(__clone)
>  	copy	%r4, %r19
>  #endif
>  	/* The call to _exit needs saved r19.  */
> -	bl	_exit, %rp
> -	copy	%ret0, %arg0
> +	ble     0x100(%sr2, %r0)
> +	ldi	__NR_exit, %r20
>  
> -	/* We should not return from _exit.
> +	/* We should not return from exit.
>             We do not restore r4, or the stack state.  */
>  	iitlbp	%r0, (%sr0, %r0)
>  
> diff --git a/sysdeps/unix/sysv/linux/i386/clone.S b/sysdeps/unix/sysv/linux/i386/clone.S
> index a4ba3e2..49c82d9 100644
> --- a/sysdeps/unix/sysv/linux/i386/clone.S
> +++ b/sysdeps/unix/sysv/linux/i386/clone.S
> @@ -39,8 +39,6 @@
>  #define __NR_clone 120
>  #define SYS_clone 120
>  
> -#define CLONE_VM	0x00000100
> -
>          .text
>  ENTRY (__clone)
>  	/* Sanity check arguments.  */
> diff --git a/sysdeps/unix/sysv/linux/ia64/clone2.S b/sysdeps/unix/sysv/linux/ia64/clone2.S
> index 9b59473..3157ce9 100644
> --- a/sysdeps/unix/sysv/linux/ia64/clone2.S
> +++ b/sysdeps/unix/sysv/linux/ia64/clone2.S
> @@ -74,11 +74,11 @@ ENTRY(__clone2)
>  	mov b6=out1
>  	br.call.dptk.many rp=b6	/* Call fn(arg) in the child 	*/
>  	;;
> -	mov out0=r8		/* Argument to _exit		*/
> +	mov out0=r8		/* Argument to exit		*/
>  	mov gp=loc0
> -	.globl HIDDEN_JUMPTARGET(_exit)
> -	br.call.dpnt.many rp=HIDDEN_JUMPTARGET(_exit)
> -				/* call _exit with result from fn.	*/
> +	mov r15=SYS_ify (exit)
> +	.save rp, r0
> +	break __BREAK_SYSCALL
>  	ret			/* Not reached.		*/
>  PSEUDO_END(__clone2)
>  
> diff --git a/sysdeps/unix/sysv/linux/m68k/clone.S b/sysdeps/unix/sysv/linux/m68k/clone.S
> index a680191..0894b2a 100644
> --- a/sysdeps/unix/sysv/linux/m68k/clone.S
> +++ b/sysdeps/unix/sysv/linux/m68k/clone.S
> @@ -24,8 +24,6 @@
>  #include <bits/errno.h>
>  #include <tls.h>
>  
> -#define CLONE_VM      0x00000100
> -
>  /* int clone(int (*fn)(void *arg), void *child_stack, int flags, void *arg,
>  	     void *parent_tidptr, void *tls, void *child_tidptr) */
>  
> diff --git a/sysdeps/unix/sysv/linux/mips/clone.S b/sysdeps/unix/sysv/linux/mips/clone.S
> index 8b79457..855f972 100644
> --- a/sysdeps/unix/sysv/linux/mips/clone.S
> +++ b/sysdeps/unix/sysv/linux/mips/clone.S
> @@ -25,9 +25,6 @@
>  #include <bits/errno.h>
>  #include <tls.h>
>  
> -#define CLONE_VM      0x00000100
> -#define CLONE_THREAD  0x00010000
> -
>  /* int clone(int (*fn)(void *arg), void *child_stack, int flags, void *arg,
>  	     void *parent_tidptr, void *tls, void *child_tidptr) */
>  
> @@ -137,14 +134,9 @@ L(thread_start):
>  	/* Call the user's function.  */
>  	jal		t9
>  
> -	/* Call _exit rather than doing it inline for breakpoint purposes.  */
>  	move		a0,v0
> -#ifdef __PIC__
> -	PTR_LA		t9,_exit
> -	jalr		t9
> -#else
> -	jal		_exit
> -#endif
> +	li		v0,__NR_clone
> +	syscall
>  
>  	END(__thread_start)
>  
> diff --git a/sysdeps/unix/sysv/linux/nios2/clone.S b/sysdeps/unix/sysv/linux/nios2/clone.S
> index 7929dfa..2ba8258 100644
> --- a/sysdeps/unix/sysv/linux/nios2/clone.S
> +++ b/sysdeps/unix/sysv/linux/nios2/clone.S
> @@ -25,8 +25,6 @@
>  #include <bits/errno.h>
>  #include <tcb-offsets.h>
>  
> -#define CLONE_VM      0x00000100
> -
>  /* int clone(int (*fn)(void *arg), void *child_stack, int flags, void *arg,
>  	     void *parent_tidptr, void *tls, void *child_tidptr) */
>  
> @@ -75,18 +73,9 @@ thread_start:
>          /* Call the user's function.  */
>  	callr	r5
>  
> -	/* _exit with the result.  */
> -	mov	r4, r2
> -#ifdef PIC
> -	nextpc	r22
> -1:	movhi	r8, %hiadj(_gp_got - 1b)
> -	addi	r8, r8, %lo(_gp_got - 1b)
> -	add	r22, r22, r8
> -	ldw	r8, %call(HIDDEN_JUMPTARGET(_exit))(r22)
> -	jmp	r8
> -#else
> -	jmpi	_exit
> -#endif
> +	/* exit with the result.  */
> +	movi	r2, SYS_ify (exit)
> +	trap
>  	cfi_endproc
>  
>  	cfi_startproc
> diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/clone.S b/sysdeps/unix/sysv/linux/powerpc/powerpc32/clone.S
> index a07b7d3..e48cc5f 100644
> --- a/sysdeps/unix/sysv/linux/powerpc/powerpc32/clone.S
> +++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/clone.S
> @@ -20,10 +20,6 @@
>  #define _ERRNO_H	1
>  #include <bits/errno.h>
>  
> -#define CLONE_VM	0x00000100
> -#define CLONE_THREAD	0x00010000
> -
> -
>  /* This is the only really unusual system call in PPC linux, but not
>     because of any weirdness in the system call itself; because of
>     all the freaky stuff we have to do to make the call useful.  */
> @@ -80,8 +76,7 @@ ENTRY (__clone)
>  	mtctr	r30
>  	mr	r3,r31
>  	bctrl
> -	/* Call _exit with result from procedure.  */
> -	b	HIDDEN_JUMPTARGET(_exit)
> +	DO_CALL(SYS_ify(exit))
>  
>  L(parent):
>  	/* Parent.  Restore registers & return.  */
> diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc64/clone.S b/sysdeps/unix/sysv/linux/powerpc/powerpc64/clone.S
> index 9e5bfd2..78c353a 100644
> --- a/sysdeps/unix/sysv/linux/powerpc/powerpc64/clone.S
> +++ b/sysdeps/unix/sysv/linux/powerpc/powerpc64/clone.S
> @@ -20,9 +20,6 @@
>  #define _ERRNO_H	1
>  #include <bits/errno.h>
>  
> -#define CLONE_VM	0x00000100
> -#define CLONE_THREAD	0x00010000
> -
>  /* This is the only really unusual system call in PPC linux, but not
>     because of any weirdness in the system call itself; because of
>     all the freaky stuff we have to do to make the call useful.  */
> @@ -84,15 +81,11 @@ ENTRY (__clone)
>  	mr	r3,r31
>  	bctrl
>  	ld	r2,FRAME_TOC_SAVE(r1)
> -	/* Call _exit with result from procedure.  */
> -#ifdef SHARED
> -	b	JUMPTARGET(__GI__exit)
> -#else
> -	bl	JUMPTARGET(_exit)
> +
> +	DO_CALL(SYS_ify(exit))
>  	/* We won't ever get here but provide a nop so that the linker
>  	   will insert a toc adjusting stub if necessary.  */
>  	nop
> -#endif
>  
>  L(badargs):
>  	cfi_startproc
> diff --git a/sysdeps/unix/sysv/linux/s390/s390-32/clone.S b/sysdeps/unix/sysv/linux/s390/s390-32/clone.S
> index a8b4dbc..1588e5f 100644
> --- a/sysdeps/unix/sysv/linux/s390/s390-32/clone.S
> +++ b/sysdeps/unix/sysv/linux/s390/s390-32/clone.S
> @@ -59,7 +59,7 @@ thread_start:
>  	ahi     %r15,-96        /* make room on the stack for the save area */
>  	xc	0(4,%r15),0(%r15)
>  	basr    %r14,%r1        /* jump to fn */
> -	DO_CALL (exit, 1)
> +	svc	SYS_ify(exit)
>  
>  libc_hidden_def (__clone)
>  weak_alias (__clone, clone)
> diff --git a/sysdeps/unix/sysv/linux/s390/s390-64/clone.S b/sysdeps/unix/sysv/linux/s390/s390-64/clone.S
> index daf8a58..5843188 100644
> --- a/sysdeps/unix/sysv/linux/s390/s390-64/clone.S
> +++ b/sysdeps/unix/sysv/linux/s390/s390-64/clone.S
> @@ -60,7 +60,7 @@ thread_start:
>  	aghi	%r15,-160	/* make room on the stack for the save area */
>  	xc	0(8,%r15),0(%r15)
>  	basr	%r14,%r1	/* jump to fn */
> -	DO_CALL	(exit, 1)
> +	svc	SYS_ify(exit)
>  
>  libc_hidden_def (__clone)
>  weak_alias (__clone, clone)
> diff --git a/sysdeps/unix/sysv/linux/sh/clone.S b/sysdeps/unix/sysv/linux/sh/clone.S
> index 9063b21..b13a64b 100644
> --- a/sysdeps/unix/sysv/linux/sh/clone.S
> +++ b/sysdeps/unix/sysv/linux/sh/clone.S
> @@ -73,25 +73,8 @@ ENTRY(__clone)
>  	 mov.l	@(4,r15), r4
>  
>  	/* we are done, passing the return value through r0  */
> -	mov.l	.L3, r1
> -#ifdef SHARED
> -	mov.l	r12, @-r15
> -	sts.l	pr, @-r15
> -	mov	r0, r4
> -	mova	.LG, r0
> -	mov.l	.LG, r12
> -	add	r0, r12
> -	mova	.L3, r0
> -	add	r0, r1
> -	jsr	@r1
> -	 nop
> -	lds.l	@r15+, pr
> -	rts
> -	 mov.l	@r15+, r12
> -#else
> -	jmp	@r1
> -	 mov	r0, r4
> -#endif
> +	mov	#+SYS_ify(exit), r3
> +	trapa	#0x15
>  	.align	2
>  .LG:
>  	.long	_GLOBAL_OFFSET_TABLE_
> diff --git a/sysdeps/unix/sysv/linux/sparc/sparc32/clone.S b/sysdeps/unix/sysv/linux/sparc/sparc32/clone.S
> index 6d2f5bd..1afa26e 100644
> --- a/sysdeps/unix/sysv/linux/sparc/sparc32/clone.S
> +++ b/sysdeps/unix/sysv/linux/sparc/sparc32/clone.S
> @@ -24,8 +24,6 @@
>  #include <tcb-offsets.h>
>  #include <sysdep.h>
>  
> -#define CLONE_VM	0x00000100
> -
>  /* int clone(int (*fn)(void *arg), void *child_stack, int flags, void *arg,
>  	     pid_t *ptid, void *tls, pid_t *ctid); */
>  
> @@ -81,7 +79,8 @@ __thread_start:
>  	mov	%g0, %fp	/* terminate backtrace */
>  	call	%g2
>  	 mov	%g3,%o0
> -	call	HIDDEN_JUMPTARGET(_exit),0
> +	set	__NR_exit, %g1
> +	ta	0x10
>  	 nop
>  
>  	.size	__thread_start, .-__thread_start
> diff --git a/sysdeps/unix/sysv/linux/sparc/sparc64/clone.S b/sysdeps/unix/sysv/linux/sparc/sparc64/clone.S
> index fc28539..785ccd1 100644
> --- a/sysdeps/unix/sysv/linux/sparc/sparc64/clone.S
> +++ b/sysdeps/unix/sysv/linux/sparc/sparc64/clone.S
> @@ -24,8 +24,6 @@
>  #include <tcb-offsets.h>
>  #include <sysdep.h>
>  
> -#define CLONE_VM	0x00000100
> -
>  /* int clone(int (*fn)(void *arg), void *child_stack, int flags, void *arg,
>  	     pid_t *ptid, void *tls, pid_t *ctid); */
>  
> @@ -78,7 +76,8 @@ __thread_start:
>  	mov	%g0, %fp	/* terminate backtrace */
>  	call	%g2
>  	 mov	%g3,%o0
> -	call	HIDDEN_JUMPTARGET(_exit),0
> +	set	__NR_exit, %g1
> +	ta	0x6d
>  	 nop
>  
>  	.size	__thread_start, .-__thread_start
> diff --git a/sysdeps/unix/sysv/linux/tile/clone.S b/sysdeps/unix/sysv/linux/tile/clone.S
> index d7d2a3b..9610acd 100644
> --- a/sysdeps/unix/sysv/linux/tile/clone.S
> +++ b/sysdeps/unix/sysv/linux/tile/clone.S
> @@ -168,10 +168,8 @@ ENTRY (__clone)
>  	 move r0, r31
>  	 jalr r32
>  	}
> -	{
> -	 j HIDDEN_JUMPTARGET(_exit)
> -	 info INFO_OP_CANNOT_BACKTRACE   /* Notify backtracer to stop. */
> -	}
> +	moveli TREG_SYSCALL_NR_NAME, __NR_exit
> +	swint1
>  PSEUDO_END (__clone)
>  
>  libc_hidden_def (__clone)
> diff --git a/sysdeps/unix/sysv/linux/tst-clone3.c b/sysdeps/unix/sysv/linux/tst-clone3.c
> new file mode 100644
> index 0000000..e893e4d
> --- /dev/null
> +++ b/sysdeps/unix/sysv/linux/tst-clone3.c
> @@ -0,0 +1,95 @@
> +/* Check if clone (CLONE_THREAD) does not call exit_group (BZ #21512)
> +   Copyright (C) 2017 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <http://www.gnu.org/licenses/>.  */
> +
> +#include <string.h>
> +#include <sched.h>
> +#include <signal.h>
> +#include <unistd.h>
> +#include <sys/syscall.h>
> +#include <sys/wait.h>
> +#include <sys/types.h>
> +#include <linux/futex.h>
> +
> +#include <stackinfo.h>  /* For _STACK_GROWS_{UP,DOWN}.  */
> +#include <support/check.h>
> +
> +/* Test if clone call with CLONE_THREAD does not call exit_group.  The 'f'
> +   function returns '1', which will be used by clone thread to call the
> +   'exit' syscall directly.  If _exit is used instead, exit_group will be
> +   used and thus the thread group will finish with return value of '1'
> +   (where '2' from main thread is expected.  */

s/(where/where/g

> +
> +static int
> +f (void *a)
> +{
> +  return 1;
> +}
> +
> +/* Futex wait for TID argument, similar to pthread_join internal
> +   implementation.  */
> +#define wait_tid(tid) \
> +  do {					\
> +    __typeof (tid) __tid;		\
> +    while ((__tid = (tid)) != 0)	\
> +      futex_wait (&(tid), __tid);	\
> +  } while (0)
> +
> +static inline int
> +futex_wait (int *futexp, int val)
> +{
> +  return syscall (__NR_futex, futexp, FUTEX_WAIT, val);
> +}
> +
> +static int
> +do_test (void)
> +{
> +  char st[1024] __attribute__ ((aligned));
> +  int clone_flags = CLONE_THREAD;
> +  /* Minimum required flags to be used along with CLONE_THREAD.  */
> +  clone_flags |= CLONE_VM | CLONE_SIGHAND;
> +  /* We will use ctid with a futex call to wait for thread exit.  */
> +  clone_flags |= CLONE_CHILD_CLEARTID;
> +  pid_t ctid = 1, tid;
> +
> +#ifdef __ia64__
> +  extern int __clone2 (int (*__fn) (void *__arg), void *__child_stack_base,
> +		       size_t __child_stack_size, int __flags,
> +		       void *__arg, ...);
> +  tid = __clone2 (f, st, sizeof (st), clone_flags, NULL, /* ptid */ NULL,
> +		  /* tls */ NULL, &ctid);
> +#else
> +#if _STACK_GROWS_DOWN
> +  tid = clone (f, st + sizeof (st), clone_flags, NULL, /* ptid */ NULL,
> +	       /* tls */ NULL, &ctid);
> +#elif _STACK_GROWS_UP
> +  tid = clone (f, st, clone_flags, NULL, /* ptid */ NULL, /* tls */ NULL,
> +	       &ctid);
> +#else
> +#error "Define either _STACK_GROWS_DOWN or _STACK_GROWS_UP"
> +#endif
> +#endif
> +  if (tid == -1)
> +    FAIL_EXIT1 ("clone failed: %m");
> +
> +  wait_tid (ctid);
> +
> +  return 2;
> +}
> +
> +#define EXPECTED_STATUS 2
> +#include <support/test-driver.c>
> diff --git a/sysdeps/unix/sysv/linux/x86_64/clone.S b/sysdeps/unix/sysv/linux/x86_64/clone.S
> index d5c2d07..b10fc29 100644
> --- a/sysdeps/unix/sysv/linux/x86_64/clone.S
> +++ b/sysdeps/unix/sysv/linux/x86_64/clone.S
> @@ -23,8 +23,6 @@
>  #include <bits/errno.h>
>  #include <asm-syntax.h>
>  
> -#define CLONE_VM	0x00000100
> -
>  /* The userland implementation is:
>     int clone (int (*fn)(void *arg), void *child_stack, int flags, void *arg),
>     the kernel entry is:
> @@ -97,7 +95,8 @@ L(thread_start):
>  	call	*%rax
>  	/* Call exit with return value from function call. */
>  	movq	%rax, %rdi
> -	call	HIDDEN_JUMPTARGET (_exit)
> +	movl	$SYS_ify(exit), %eax
> +	syscall
>  	cfi_endproc;
>  
>  	cfi_startproc;
>

Joseph Myers June 23, 2017, 1:14 p.m. UTC | #2
On Fri, 23 Jun 2017, Adhemerval Zanella wrote:

> diff --git a/sysdeps/unix/sysv/linux/arm/clone.S b/sysdeps/unix/sysv/linux/arm/clone.S

> @@ -76,7 +73,8 @@ PSEUDO_END (__clone)
>  	BLX (ip)
>  
>  	@ and we are done, passing the return value through r0
> -	b	PLTJMP(HIDDEN_JUMPTARGET(_exit))
> +	ldr	r7, =SYS_ify(clone)
> +	swi	0x0

This looks like it would call the clone syscall, not exit.

> diff --git a/sysdeps/unix/sysv/linux/mips/clone.S b/sysdeps/unix/sysv/linux/mips/clone.S

> @@ -137,14 +134,9 @@ L(thread_start):
>  	/* Call the user's function.  */
>  	jal		t9
>  
> -	/* Call _exit rather than doing it inline for breakpoint purposes.  */
>  	move		a0,v0
> -#ifdef __PIC__
> -	PTR_LA		t9,_exit
> -	jalr		t9
> -#else
> -	jal		_exit
> -#endif
> +	li		v0,__NR_clone
> +	syscall

Likewise.

Adhemerval Zanella Netto June 23, 2017, 3:07 p.m. UTC | #3
> On 23 Jun 2017, at 10:14, Joseph Myers <joseph@codesourcery.com> wrote:
> 
>> On Fri, 23 Jun 2017, Adhemerval Zanella wrote:
>> 
>> diff --git a/sysdeps/unix/sysv/linux/arm/clone.S b/sysdeps/unix/sysv/linux/arm/clone.S
> 
>> @@ -76,7 +73,8 @@ PSEUDO_END (__clone)
>>    BLX (ip)
>> 
>>    @ and we are done, passing the return value through r0
>> -    b    PLTJMP(HIDDEN_JUMPTARGET(_exit))
>> +    ldr    r7, =SYS_ify(clone)
>> +    swi    0x0
> 
> This looks like it would call the clone syscall, not exit.
> 

Ugh... I will fix it.

>> diff --git a/sysdeps/unix/sysv/linux/mips/clone.S b/sysdeps/unix/sysv/linux/mips/clone.S
> 
>> @@ -137,14 +134,9 @@ L(thread_start):
>>    /* Call the user's function.  */
>>    jal        t9
>> 
>> -    /* Call _exit rather than doing it inline for breakpoint purposes.  */
>>    move        a0,v0
>> -#ifdef __PIC__
>> -    PTR_LA        t9,_exit
>> -    jalr        t9
>> -#else
>> -    jal        _exit
>> -#endif
>> +    li        v0,__NR_clone
>> +    syscall
> 
> Likewise.
> 

Same.

> -- 
> Joseph S. Myers
> joseph@codesourcery.com

Adhemerval Zanella Netto June 23, 2017, 3:14 p.m. UTC | #4
> On 23 Jun 2017, at 09:55, Carlos O'Donell <carlos@redhat.com> wrote:
> 
>> On 06/23/2017 08:37 AM, Adhemerval Zanella wrote:
>> On aarch64, alpha, arm, hppa, mips, nios2, powerpc, s390, sh,
>> sparc, tile, and x86_64 the clone implementation jumps to _exit
>> after the child function returns, and _exit ends the process by
>> calling exit_group.  This behavior has a small issue: a thread
>> created with CLONE_THREAD by calling clone directly will exit the
>> whole thread group instead of just the created thread.  Also,
>> microblaze, ia64, i386, and m68k differ by calling the exit
>> syscall directly.
>>
>> This patch changes all architectures to call the exit syscall
>> directly, as microblaze, ia64, i386, and m68k do.  This does not
>> change glibc internal behavior in any way, since the only internal
>> user of the clone implementation, posix_spawn, calls _exit directly
>> in the created child (fork uses a direct call to the clone syscall).
>> 
>> Checked on x86_64-linux-gnu, i686-linux-gnu, aarch64-linux-gnu,
>> powerpc-linux-gnu, powerpc64le-linux-gnu, sparc64-linux-gnu,
>> and sparcv9-linux-gnu.
> 
> How is it that this doesn't break threading completely on the existing
> architectures that call _exit? Is it because NPTL calls the clone syscall
> directly instead of calling the clone() function and nobody uses clone()
> because of all the other problems it has?

For thread creation within glibc, clone is invoked through the
ARCH_FORK macro, which issues a direct syscall.  Some time ago I tried
to see whether it was worth a cleanup to use the clone symbol instead,
but I got stuck on an issue whose details I don't quite remember.
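
To illustrate what a direct syscall means here, a rough C sketch
follows.  It is not the ARCH_FORK definition; raw_fork and
raw_fork_status are hypothetical names, and the raw argument order of
the clone syscall is architecture-specific.  With only the flags word
non-zero this form matches x86_64 and aarch64, but not s390, for
example.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: issue the clone syscall directly with fork-like
   flags.  Assumes the x86_64/aarch64 argument order (flags first); a
   zero stack pointer means the child keeps using a copy of the
   caller's stack.  */
static pid_t
raw_fork (void)
{
  return (pid_t) syscall (SYS_clone, (unsigned long) SIGCHLD,
                          0UL, 0UL, 0UL, 0UL);
}

/* Run a child that exits with STATUS and return the status the parent
   observes, or -1 on error.  */
static int
raw_fork_status (int status)
{
  pid_t pid = raw_fork ();
  if (pid == -1)
    return -1;

  if (pid == 0)
    /* There is no child function to return from: the child simply
       continues after the syscall, so the exit path in clone.S is
       never involved.  */
    _exit (status);

  int wstatus;
  if (waitpid (pid, &wstatus, 0) != pid || !WIFEXITED (wstatus))
    return -1;
  return WEXITSTATUS (wstatus);
}
```

Because the child resumes in the caller rather than in a function
passed to the clone symbol, code that creates children this way never
reaches the exit path this patch changes.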

> 
> I admit that only an explicit call to _exit() or exit() should terminate
> every thread in the thread group, *and* looking at the kernel side code
> I see it does try to terminate every thread in the group (zap_other_threads()).
> 
> The patch looks good to me, but I just want some clarification about my
> question "Why haven't we noticed?" :-)

Basically because we never used the clone symbol internally, and where
we do now (the new posix_spawn implementation) the child calls _exit
explicitly (spawni).  I also think no one has really tried to use clone
together with CLONE_THREAD.
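
A minimal C sketch of that pattern, for illustration only: it is not
the spawni code, spawn_child and spawn_status are hypothetical names,
and it leaves out the signal handling and error reporting a real
implementation needs.  It assumes a stack that grows down, and the
static stack makes it non-reentrant.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical names throughout; this is not glibc's spawni.  */

/* Stack for the child; assumes the stack grows down.  */
static char spawn_stack[64 * 1024] __attribute__ ((aligned (16)));

/* Child function: exec the program, or leave through an explicit
   _exit.  It never returns, so the exit path at the end of the clone
   implementation is not reached.  */
static int
spawn_child (void *arg)
{
  const char *path = arg;
  char *const argv[] = { (char *) path, NULL };

  execv (path, argv);
  _exit (127);                  /* exec failed.  */
}

/* Start PATH in a child created with the clone function and return its
   exit status, or -1 on error.  */
static int
spawn_status (const char *path)
{
  /* CLONE_VFORK suspends the caller until the child execs or exits,
     so sharing memory with CLONE_VM is safe for this short window.  */
  pid_t pid = clone (spawn_child, spawn_stack + sizeof spawn_stack,
                     CLONE_VM | CLONE_VFORK | SIGCHLD, (void *) path);
  if (pid == -1)
    return -1;

  int status;
  if (waitpid (pid, &status, 0) != pid || !WIFEXITED (status))
    return -1;
  return WEXITSTATUS (status);
}
```

Since the child is its own thread group (no CLONE_THREAD), the
exit_group issued by _exit ends only the child, so such a caller sees
the same result before and after the patch.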

> 
>> diff --git a/sysdeps/unix/sysv/linux/Makefile b/sysdeps/unix/sysv/linux/Makefile
>> index 8b340d4..9d6a2de 100644
>> --- a/sysdeps/unix/sysv/linux/Makefile
>> +++ b/sysdeps/unix/sysv/linux/Makefile
>> @@ -49,8 +49,8 @@ sysdep_headers += sys/mount.h sys/acct.h sys/sysctl.h \
>>          bits/mman-linux.h \
>>          bits/siginfo-arch.h bits/siginfo-consts-arch.h
>> 
>> -tests += tst-clone tst-clone2 tst-fanotify tst-personality tst-quota \
>> -     tst-sync_file_range test-errno-linux
>> +tests += tst-clone tst-clone2 tst-clone3 tst-fanotify tst-personality \
>> +     tst-quota tst-sync_file_range test-errno-linux
>> 
>> # Generate the list of SYS_* macros for the system calls (__NR_* macros).
>> 
>> diff --git a/sysdeps/unix/sysv/linux/aarch64/clone.S b/sysdeps/unix/sysv/linux/aarch64/clone.S
>> index 259ec07..905915a 100644
>> --- a/sysdeps/unix/sysv/linux/aarch64/clone.S
>> +++ b/sysdeps/unix/sysv/linux/aarch64/clone.S
>> @@ -23,12 +23,6 @@
>> #define _ERRNO_H    1
>> #include <bits/errno.h>
>> 
>> -#define CLONE_VM_BIT      8
>> -#define CLONE_VM          (1 << CLONE_VM_BIT)
>> -
>> -#define CLONE_THREAD_BIT  16
>> -#define CLONE_THREAD      (1 << CLONE_THREAD_BIT)
>> -
>> /* int clone(int (*fn)(void *arg),            x0
>>         void *child_stack,               x1
>>         int flags,                       x2
>> @@ -84,7 +78,8 @@ thread_start:
>>    blr    x10
>> 
>>    /* We are done, pass the return value through x0.  */
>> -    b    HIDDEN_JUMPTARGET(_exit)
>> +    mov    x8, #SYS_ify(exit)
>> +    svc    0x0
>>    cfi_endproc
>>    .size thread_start, .-thread_start
>> 
>> diff --git a/sysdeps/unix/sysv/linux/alpha/clone.S b/sysdeps/unix/sysv/linux/alpha/clone.S
>> index 20ae361..550461f 100644
>> --- a/sysdeps/unix/sysv/linux/alpha/clone.S
>> +++ b/sysdeps/unix/sysv/linux/alpha/clone.S
>> @@ -23,8 +23,6 @@
>> #define _ERRNO_H    1
>> #include <bits/errno.h>
>> 
>> -#define CLONE_VM    0x00000100
>> -
>> /* int clone(int (*fn)(void *arg), void *child_stack, int flags,
>>         void *arg, pid_t *ptid, void *tls, pid_t *ctid);
>> 
>> @@ -100,13 +98,8 @@ thread_start:
>>    jsr    ra, (pv)
>>    ldgp    gp, 0(ra)
>> 
>> -    /* Call _exit rather than doing it inline for breakpoint purposes.  */
>> -    mov    v0, a0
>> -#ifdef PIC
>> -    bsr    ra, HIDDEN_JUMPTARGET(_exit)    !samegp
>> -#else
>> -    jsr    ra, HIDDEN_JUMPTARGET(_exit)
>> -#endif
>> +    ldiq    v0, __NR_exit
>> +    call_pal PAL_callsys
>> 
>>    /* Die horribly.  */
>>    .align    4
>> diff --git a/sysdeps/unix/sysv/linux/arm/clone.S b/sysdeps/unix/sysv/linux/arm/clone.S
>> index a309add..f01968a 100644
>> --- a/sysdeps/unix/sysv/linux/arm/clone.S
>> +++ b/sysdeps/unix/sysv/linux/arm/clone.S
>> @@ -24,9 +24,6 @@
>> #define _ERRNO_H    1
>> #include <bits/errno.h>
>> 
>> -#define CLONE_VM      0x00000100
>> -#define CLONE_THREAD  0x00010000
>> -
>> /* int clone(int (*fn)(void *arg), void *child_stack, int flags, void *arg,
>>         pid_t *ptid, struct user_desc *tls, pid_t *ctid); */
>> 
>> @@ -76,7 +73,8 @@ PSEUDO_END (__clone)
>>    BLX (ip)
>> 
>>    @ and we are done, passing the return value through r0
>> -    b    PLTJMP(HIDDEN_JUMPTARGET(_exit))
>> +    ldr    r7, =SYS_ify(exit)
>> +    swi    0x0
>> 
>>    .fnend
>> 
>> diff --git a/sysdeps/unix/sysv/linux/hppa/clone.S b/sysdeps/unix/sysv/linux/hppa/clone.S
>> index d36b302..8c43944 100644
>> --- a/sysdeps/unix/sysv/linux/hppa/clone.S
>> +++ b/sysdeps/unix/sysv/linux/hppa/clone.S
>> @@ -148,10 +148,10 @@ ENTRY(__clone)
>>    copy    %r4, %r19
>> #endif
>>    /* The call to _exit needs saved r19.  */
>> -    bl    _exit, %rp
>> -    copy    %ret0, %arg0
>> +    ble     0x100(%sr2, %r0)
>> +    ldi    __NR_exit, %r20
>> 
>> -    /* We should not return from _exit.
>> +    /* We should not return from exit.
>>            We do not restore r4, or the stack state.  */
>>    iitlbp    %r0, (%sr0, %r0)
>> 
>> diff --git a/sysdeps/unix/sysv/linux/i386/clone.S b/sysdeps/unix/sysv/linux/i386/clone.S
>> index a4ba3e2..49c82d9 100644
>> --- a/sysdeps/unix/sysv/linux/i386/clone.S
>> +++ b/sysdeps/unix/sysv/linux/i386/clone.S
>> @@ -39,8 +39,6 @@
>> #define __NR_clone 120
>> #define SYS_clone 120
>> 
>> -#define CLONE_VM    0x00000100
>> -
>>         .text
>> ENTRY (__clone)
>>    /* Sanity check arguments.  */
>> diff --git a/sysdeps/unix/sysv/linux/ia64/clone2.S b/sysdeps/unix/sysv/linux/ia64/clone2.S
>> index 9b59473..3157ce9 100644
>> --- a/sysdeps/unix/sysv/linux/ia64/clone2.S
>> +++ b/sysdeps/unix/sysv/linux/ia64/clone2.S
>> @@ -74,11 +74,11 @@ ENTRY(__clone2)
>>    mov b6=out1
>>    br.call.dptk.many rp=b6    /* Call fn(arg) in the child    */
>>    ;;
>> -    mov out0=r8        /* Argument to _exit        */
>> +    mov out0=r8        /* Argument to exit        */
>>    mov gp=loc0
>> -    .globl HIDDEN_JUMPTARGET(_exit)
>> -    br.call.dpnt.many rp=HIDDEN_JUMPTARGET(_exit)
>> -                /* call _exit with result from fn.    */
>> +    mov r15=SYS_ify (exit)
>> +    .save rp, r0
>> +    break __BREAK_SYSCALL
>>    ret            /* Not reached.        */
>> PSEUDO_END(__clone2)
>> 
>> diff --git a/sysdeps/unix/sysv/linux/m68k/clone.S b/sysdeps/unix/sysv/linux/m68k/clone.S
>> index a680191..0894b2a 100644
>> --- a/sysdeps/unix/sysv/linux/m68k/clone.S
>> +++ b/sysdeps/unix/sysv/linux/m68k/clone.S
>> @@ -24,8 +24,6 @@
>> #include <bits/errno.h>
>> #include <tls.h>
>> 
>> -#define CLONE_VM      0x00000100
>> -
>> /* int clone(int (*fn)(void *arg), void *child_stack, int flags, void *arg,
>>         void *parent_tidptr, void *tls, void *child_tidptr) */
>> 
>> diff --git a/sysdeps/unix/sysv/linux/mips/clone.S b/sysdeps/unix/sysv/linux/mips/clone.S
>> index 8b79457..855f972 100644
>> --- a/sysdeps/unix/sysv/linux/mips/clone.S
>> +++ b/sysdeps/unix/sysv/linux/mips/clone.S
>> @@ -25,9 +25,6 @@
>> #include <bits/errno.h>
>> #include <tls.h>
>> 
>> -#define CLONE_VM      0x00000100
>> -#define CLONE_THREAD  0x00010000
>> -
>> /* int clone(int (*fn)(void *arg), void *child_stack, int flags, void *arg,
>>         void *parent_tidptr, void *tls, void *child_tidptr) */
>> 
>> @@ -137,14 +134,9 @@ L(thread_start):
>>    /* Call the user's function.  */
>>    jal        t9
>> 
>> -    /* Call _exit rather than doing it inline for breakpoint purposes.  */
>>    move        a0,v0
>> -#ifdef __PIC__
>> -    PTR_LA        t9,_exit
>> -    jalr        t9
>> -#else
>> -    jal        _exit
>> -#endif
>> +    li        v0,__NR_exit
>> +    syscall
>> 
>>    END(__thread_start)
>> 
>> diff --git a/sysdeps/unix/sysv/linux/nios2/clone.S b/sysdeps/unix/sysv/linux/nios2/clone.S
>> index 7929dfa..2ba8258 100644
>> --- a/sysdeps/unix/sysv/linux/nios2/clone.S
>> +++ b/sysdeps/unix/sysv/linux/nios2/clone.S
>> @@ -25,8 +25,6 @@
>> #include <bits/errno.h>
>> #include <tcb-offsets.h>
>> 
>> -#define CLONE_VM      0x00000100
>> -
>> /* int clone(int (*fn)(void *arg), void *child_stack, int flags, void *arg,
>>         void *parent_tidptr, void *tls, void *child_tidptr) */
>> 
>> @@ -75,18 +73,9 @@ thread_start:
>>         /* Call the user's function.  */
>>    callr    r5
>> 
>> -    /* _exit with the result.  */
>> -    mov    r4, r2
>> -#ifdef PIC
>> -    nextpc    r22
>> -1:    movhi    r8, %hiadj(_gp_got - 1b)
>> -    addi    r8, r8, %lo(_gp_got - 1b)
>> -    add    r22, r22, r8
>> -    ldw    r8, %call(HIDDEN_JUMPTARGET(_exit))(r22)
>> -    jmp    r8
>> -#else
>> -    jmpi    _exit
>> -#endif
>> +    /* exit with the result.  */
>> +    movi    r2, SYS_ify (exit)
>> +    trap
>>    cfi_endproc
>> 
>>    cfi_startproc
>> diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/clone.S b/sysdeps/unix/sysv/linux/powerpc/powerpc32/clone.S
>> index a07b7d3..e48cc5f 100644
>> --- a/sysdeps/unix/sysv/linux/powerpc/powerpc32/clone.S
>> +++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/clone.S
>> @@ -20,10 +20,6 @@
>> #define _ERRNO_H    1
>> #include <bits/errno.h>
>> 
>> -#define CLONE_VM    0x00000100
>> -#define CLONE_THREAD    0x00010000
>> -
>> -
>> /* This is the only really unusual system call in PPC linux, but not
>>    because of any weirdness in the system call itself; because of
>>    all the freaky stuff we have to do to make the call useful.  */
>> @@ -80,8 +76,7 @@ ENTRY (__clone)
>>    mtctr    r30
>>    mr    r3,r31
>>    bctrl
>> -    /* Call _exit with result from procedure.  */
>> -    b    HIDDEN_JUMPTARGET(_exit)
>> +    DO_CALL(SYS_ify(exit))
>> 
>> L(parent):
>>    /* Parent.  Restore registers & return.  */
>> diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc64/clone.S b/sysdeps/unix/sysv/linux/powerpc/powerpc64/clone.S
>> index 9e5bfd2..78c353a 100644
>> --- a/sysdeps/unix/sysv/linux/powerpc/powerpc64/clone.S
>> +++ b/sysdeps/unix/sysv/linux/powerpc/powerpc64/clone.S
>> @@ -20,9 +20,6 @@
>> #define _ERRNO_H    1
>> #include <bits/errno.h>
>> 
>> -#define CLONE_VM    0x00000100
>> -#define CLONE_THREAD    0x00010000
>> -
>> /* This is the only really unusual system call in PPC linux, but not
>>    because of any weirdness in the system call itself; because of
>>    all the freaky stuff we have to do to make the call useful.  */
>> @@ -84,15 +81,11 @@ ENTRY (__clone)
>>    mr    r3,r31
>>    bctrl
>>    ld    r2,FRAME_TOC_SAVE(r1)
>> -    /* Call _exit with result from procedure.  */
>> -#ifdef SHARED
>> -    b    JUMPTARGET(__GI__exit)
>> -#else
>> -    bl    JUMPTARGET(_exit)
>> +
>> +    DO_CALL(SYS_ify(exit))
>>    /* We won't ever get here but provide a nop so that the linker
>>       will insert a toc adjusting stub if necessary.  */
>>    nop
>> -#endif
>> 
>> L(badargs):
>>    cfi_startproc
>> diff --git a/sysdeps/unix/sysv/linux/s390/s390-32/clone.S b/sysdeps/unix/sysv/linux/s390/s390-32/clone.S
>> index a8b4dbc..1588e5f 100644
>> --- a/sysdeps/unix/sysv/linux/s390/s390-32/clone.S
>> +++ b/sysdeps/unix/sysv/linux/s390/s390-32/clone.S
>> @@ -59,7 +59,7 @@ thread_start:
>>    ahi     %r15,-96        /* make room on the stack for the save area */
>>    xc    0(4,%r15),0(%r15)
>>    basr    %r14,%r1        /* jump to fn */
>> -    DO_CALL (exit, 1)
>> +    svc    SYS_ify(exit)
>> 
>> libc_hidden_def (__clone)
>> weak_alias (__clone, clone)
>> diff --git a/sysdeps/unix/sysv/linux/s390/s390-64/clone.S b/sysdeps/unix/sysv/linux/s390/s390-64/clone.S
>> index daf8a58..5843188 100644
>> --- a/sysdeps/unix/sysv/linux/s390/s390-64/clone.S
>> +++ b/sysdeps/unix/sysv/linux/s390/s390-64/clone.S
>> @@ -60,7 +60,7 @@ thread_start:
>>    aghi    %r15,-160    /* make room on the stack for the save area */
>>    xc    0(8,%r15),0(%r15)
>>    basr    %r14,%r1    /* jump to fn */
>> -    DO_CALL    (exit, 1)
>> +    svc    SYS_ify(exit)
>> 
>> libc_hidden_def (__clone)
>> weak_alias (__clone, clone)
>> diff --git a/sysdeps/unix/sysv/linux/sh/clone.S b/sysdeps/unix/sysv/linux/sh/clone.S
>> index 9063b21..b13a64b 100644
>> --- a/sysdeps/unix/sysv/linux/sh/clone.S
>> +++ b/sysdeps/unix/sysv/linux/sh/clone.S
>> @@ -73,25 +73,8 @@ ENTRY(__clone)
>>     mov.l    @(4,r15), r4
>> 
>>    /* we are done, passing the return value through r0  */
>> -    mov.l    .L3, r1
>> -#ifdef SHARED
>> -    mov.l    r12, @-r15
>> -    sts.l    pr, @-r15
>> -    mov    r0, r4
>> -    mova    .LG, r0
>> -    mov.l    .LG, r12
>> -    add    r0, r12
>> -    mova    .L3, r0
>> -    add    r0, r1
>> -    jsr    @r1
>> -     nop
>> -    lds.l    @r15+, pr
>> -    rts
>> -     mov.l    @r15+, r12
>> -#else
>> -    jmp    @r1
>> -     mov    r0, r4
>> -#endif
>> +    mov    #+SYS_ify(exit), r3
>> +    trapa    #0x15
>>    .align    2
>> .LG:
>>    .long    _GLOBAL_OFFSET_TABLE_
>> diff --git a/sysdeps/unix/sysv/linux/sparc/sparc32/clone.S b/sysdeps/unix/sysv/linux/sparc/sparc32/clone.S
>> index 6d2f5bd..1afa26e 100644
>> --- a/sysdeps/unix/sysv/linux/sparc/sparc32/clone.S
>> +++ b/sysdeps/unix/sysv/linux/sparc/sparc32/clone.S
>> @@ -24,8 +24,6 @@
>> #include <tcb-offsets.h>
>> #include <sysdep.h>
>> 
>> -#define CLONE_VM    0x00000100
>> -
>> /* int clone(int (*fn)(void *arg), void *child_stack, int flags, void *arg,
>>         pid_t *ptid, void *tls, pid_t *ctid); */
>> 
>> @@ -81,7 +79,8 @@ __thread_start:
>>    mov    %g0, %fp    /* terminate backtrace */
>>    call    %g2
>>     mov    %g3,%o0
>> -    call    HIDDEN_JUMPTARGET(_exit),0
>> +    set    __NR_exit, %g1
>> +    ta    0x10
>>     nop
>> 
>>    .size    __thread_start, .-__thread_start
>> diff --git a/sysdeps/unix/sysv/linux/sparc/sparc64/clone.S b/sysdeps/unix/sysv/linux/sparc/sparc64/clone.S
>> index fc28539..785ccd1 100644
>> --- a/sysdeps/unix/sysv/linux/sparc/sparc64/clone.S
>> +++ b/sysdeps/unix/sysv/linux/sparc/sparc64/clone.S
>> @@ -24,8 +24,6 @@
>> #include <tcb-offsets.h>
>> #include <sysdep.h>
>> 
>> -#define CLONE_VM    0x00000100
>> -
>> /* int clone(int (*fn)(void *arg), void *child_stack, int flags, void *arg,
>>         pid_t *ptid, void *tls, pid_t *ctid); */
>> 
>> @@ -78,7 +76,8 @@ __thread_start:
>>    mov    %g0, %fp    /* terminate backtrace */
>>    call    %g2
>>     mov    %g3,%o0
>> -    call    HIDDEN_JUMPTARGET(_exit),0
>> +    set    __NR_exit, %g1
>> +    ta    0x6d
>>     nop
>> 
>>    .size    __thread_start, .-__thread_start
>> diff --git a/sysdeps/unix/sysv/linux/tile/clone.S b/sysdeps/unix/sysv/linux/tile/clone.S
>> index d7d2a3b..9610acd 100644
>> --- a/sysdeps/unix/sysv/linux/tile/clone.S
>> +++ b/sysdeps/unix/sysv/linux/tile/clone.S
>> @@ -168,10 +168,8 @@ ENTRY (__clone)
>>     move r0, r31
>>     jalr r32
>>    }
>> -    {
>> -     j HIDDEN_JUMPTARGET(_exit)
>> -     info INFO_OP_CANNOT_BACKTRACE   /* Notify backtracer to stop. */
>> -    }
>> +    moveli TREG_SYSCALL_NR_NAME, __NR_exit
>> +    swint1
>> PSEUDO_END (__clone)
>> 
>> libc_hidden_def (__clone)
>> diff --git a/sysdeps/unix/sysv/linux/tst-clone3.c b/sysdeps/unix/sysv/linux/tst-clone3.c
>> new file mode 100644
>> index 0000000..e893e4d
>> --- /dev/null
>> +++ b/sysdeps/unix/sysv/linux/tst-clone3.c
>> @@ -0,0 +1,95 @@
>> +/* Check if clone (CLONE_THREAD) does not call exit_group (BZ #21512)
>> +   Copyright (C) 2017 Free Software Foundation, Inc.
>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library; if not, see
>> +   <http://www.gnu.org/licenses/>.  */
>> +
>> +#include <string.h>
>> +#include <sched.h>
>> +#include <signal.h>
>> +#include <unistd.h>
>> +#include <sys/syscall.h>
>> +#include <sys/wait.h>
>> +#include <sys/types.h>
>> +#include <linux/futex.h>
>> +
>> +#include <stackinfo.h>  /* For _STACK_GROWS_{UP,DOWN}.  */
>> +#include <support/check.h>
>> +
>> +/* Test if clone call with CLONE_THREAD does not call exit_group.  The 'f'
>> +   function returns '1', which will be used by clone thread to call the
>> +   'exit' syscall directly.  If _exit is used instead, exit_group will be
>> +   used and thus the thread group will finish with return value of '1'
>> +   (where '2' from main thread is expected.  */
> 
> s/(where/where/g
> 
>> +
>> +static int
>> +f (void *a)
>> +{
>> +  return 1;
>> +}
>> +
>> +/* Futex wait for TID argument, similar to pthread_join internal
>> +   implementation.  */
>> +#define wait_tid(tid) \
>> +  do {                    \
>> +    __typeof (tid) __tid;        \
>> +    while ((__tid = (tid)) != 0)    \
>> +      futex_wait (&(tid), __tid);    \
>> +  } while (0)
>> +
>> +static inline int
>> +futex_wait (int *futexp, int val)
>> +{
>> +  return syscall (__NR_futex, futexp, FUTEX_WAIT, val);
>> +}
>> +
>> +static int
>> +do_test (void)
>> +{
>> +  char st[1024] __attribute__ ((aligned));
>> +  int clone_flags = CLONE_THREAD;
>> +  /* Minimum required flags to be used along with CLONE_THREAD.  */
>> +  clone_flags |= CLONE_VM | CLONE_SIGHAND;
>> +  /* We will use ctid to call on futex to wait for thread exit.  */
>> +  clone_flags |= CLONE_CHILD_CLEARTID;
>> +  pid_t ctid, tid;
>> +
>> +#ifdef __ia64__
>> +  extern int __clone2 (int (*__fn) (void *__arg), void *__child_stack_base,
>> +               size_t __child_stack_size, int __flags,
>> +               void *__arg, ...);
>> +  tid = __clone2 (f, st, sizeof (st), clone_flags, NULL, /* ptid */ NULL,
>> +          /* tls */ NULL, &ctid);
>> +#else
>> +#if _STACK_GROWS_DOWN
>> +  tid = clone (f, st + sizeof (st), clone_flags, NULL, /* ptid */ NULL,
>> +           /* tls */ NULL, &ctid);
>> +#elif _STACK_GROWS_UP
>> +  tid = clone (f, st, clone_flags, NULL, /* ptid */ NULL, /* tls */ NULL,
>> +           &ctid);
>> +#else
>> +#error "Define either _STACK_GROWS_DOWN or _STACK_GROWS_UP"
>> +#endif
>> +#endif
>> +  if (tid == -1)
>> +    FAIL_EXIT1 ("clone failed: %m");
>> +
>> +  wait_tid (ctid);
>> +
>> +  return 2;
>> +}
>> +
>> +#define EXPECTED_STATUS 2
>> +#include <support/test-driver.c>
>> diff --git a/sysdeps/unix/sysv/linux/x86_64/clone.S b/sysdeps/unix/sysv/linux/x86_64/clone.S
>> index d5c2d07..b10fc29 100644
>> --- a/sysdeps/unix/sysv/linux/x86_64/clone.S
>> +++ b/sysdeps/unix/sysv/linux/x86_64/clone.S
>> @@ -23,8 +23,6 @@
>> #include <bits/errno.h>
>> #include <asm-syntax.h>
>> 
>> -#define CLONE_VM    0x00000100
>> -
>> /* The userland implementation is:
>>    int clone (int (*fn)(void *arg), void *child_stack, int flags, void *arg),
>>    the kernel entry is:
>> @@ -97,7 +95,8 @@ L(thread_start):
>>    call    *%rax
>>    /* Call exit with return value from function call. */
>>    movq    %rax, %rdi
>> -    call    HIDDEN_JUMPTARGET (_exit)
>> +    movl    $SYS_ify(exit), %eax
>> +    syscall
>>    cfi_endproc;
>> 
>>    cfi_startproc;
>> 
> 
> 
> -- 
> Cheers,
> Carlos.

Patch

diff --git a/sysdeps/unix/sysv/linux/Makefile b/sysdeps/unix/sysv/linux/Makefile
index 8b340d4..9d6a2de 100644
--- a/sysdeps/unix/sysv/linux/Makefile
+++ b/sysdeps/unix/sysv/linux/Makefile
@@ -49,8 +49,8 @@  sysdep_headers += sys/mount.h sys/acct.h sys/sysctl.h \
 		  bits/mman-linux.h \
 		  bits/siginfo-arch.h bits/siginfo-consts-arch.h
 
-tests += tst-clone tst-clone2 tst-fanotify tst-personality tst-quota \
-	 tst-sync_file_range test-errno-linux
+tests += tst-clone tst-clone2 tst-clone3 tst-fanotify tst-personality \
+	 tst-quota tst-sync_file_range test-errno-linux
 
 # Generate the list of SYS_* macros for the system calls (__NR_* macros).
 
diff --git a/sysdeps/unix/sysv/linux/aarch64/clone.S b/sysdeps/unix/sysv/linux/aarch64/clone.S
index 259ec07..905915a 100644
--- a/sysdeps/unix/sysv/linux/aarch64/clone.S
+++ b/sysdeps/unix/sysv/linux/aarch64/clone.S
@@ -23,12 +23,6 @@ 
 #define _ERRNO_H	1
 #include <bits/errno.h>
 
-#define CLONE_VM_BIT      8
-#define CLONE_VM          (1 << CLONE_VM_BIT)
-
-#define CLONE_THREAD_BIT  16
-#define CLONE_THREAD      (1 << CLONE_THREAD_BIT)
-
 /* int clone(int (*fn)(void *arg),            x0
 	     void *child_stack,               x1
 	     int flags,                       x2
@@ -84,7 +78,8 @@  thread_start:
 	blr	x10
 
 	/* We are done, pass the return value through x0.  */
-	b	HIDDEN_JUMPTARGET(_exit)
+	mov	x8, #SYS_ify(exit)
+	svc	0x0
 	cfi_endproc
 	.size thread_start, .-thread_start
 
diff --git a/sysdeps/unix/sysv/linux/alpha/clone.S b/sysdeps/unix/sysv/linux/alpha/clone.S
index 20ae361..550461f 100644
--- a/sysdeps/unix/sysv/linux/alpha/clone.S
+++ b/sysdeps/unix/sysv/linux/alpha/clone.S
@@ -23,8 +23,6 @@ 
 #define _ERRNO_H	1
 #include <bits/errno.h>
 
-#define CLONE_VM	0x00000100
-
 /* int clone(int (*fn)(void *arg), void *child_stack, int flags,
 	     void *arg, pid_t *ptid, void *tls, pid_t *ctid);
 
@@ -100,13 +98,8 @@  thread_start:
 	jsr	ra, (pv)
 	ldgp	gp, 0(ra)
 
-	/* Call _exit rather than doing it inline for breakpoint purposes.  */
-	mov	v0, a0
-#ifdef PIC
-	bsr	ra, HIDDEN_JUMPTARGET(_exit)	!samegp
-#else
-	jsr	ra, HIDDEN_JUMPTARGET(_exit)
-#endif
+	ldiq	v0, __NR_exit
+	call_pal PAL_callsys
 
 	/* Die horribly.  */
 	.align	4
diff --git a/sysdeps/unix/sysv/linux/arm/clone.S b/sysdeps/unix/sysv/linux/arm/clone.S
index a309add..f01968a 100644
--- a/sysdeps/unix/sysv/linux/arm/clone.S
+++ b/sysdeps/unix/sysv/linux/arm/clone.S
@@ -24,9 +24,6 @@ 
 #define _ERRNO_H	1
 #include <bits/errno.h>
 
-#define CLONE_VM      0x00000100
-#define CLONE_THREAD  0x00010000
-
 /* int clone(int (*fn)(void *arg), void *child_stack, int flags, void *arg,
 	     pid_t *ptid, struct user_desc *tls, pid_t *ctid); */
 
@@ -76,7 +73,8 @@  PSEUDO_END (__clone)
 	BLX (ip)
 
 	@ and we are done, passing the return value through r0
-	b	PLTJMP(HIDDEN_JUMPTARGET(_exit))
+	ldr	r7, =SYS_ify(exit)
+	swi	0x0
 
 	.fnend
 
diff --git a/sysdeps/unix/sysv/linux/hppa/clone.S b/sysdeps/unix/sysv/linux/hppa/clone.S
index d36b302..8c43944 100644
--- a/sysdeps/unix/sysv/linux/hppa/clone.S
+++ b/sysdeps/unix/sysv/linux/hppa/clone.S
@@ -148,10 +148,10 @@  ENTRY(__clone)
 	copy	%r4, %r19
 #endif
 	/* The call to _exit needs saved r19.  */
-	bl	_exit, %rp
-	copy	%ret0, %arg0
+	ble     0x100(%sr2, %r0)
+	ldi	__NR_exit, %r20
 
-	/* We should not return from _exit.
+	/* We should not return from exit.
            We do not restore r4, or the stack state.  */
 	iitlbp	%r0, (%sr0, %r0)
 
diff --git a/sysdeps/unix/sysv/linux/i386/clone.S b/sysdeps/unix/sysv/linux/i386/clone.S
index a4ba3e2..49c82d9 100644
--- a/sysdeps/unix/sysv/linux/i386/clone.S
+++ b/sysdeps/unix/sysv/linux/i386/clone.S
@@ -39,8 +39,6 @@ 
 #define __NR_clone 120
 #define SYS_clone 120
 
-#define CLONE_VM	0x00000100
-
         .text
 ENTRY (__clone)
 	/* Sanity check arguments.  */
diff --git a/sysdeps/unix/sysv/linux/ia64/clone2.S b/sysdeps/unix/sysv/linux/ia64/clone2.S
index 9b59473..3157ce9 100644
--- a/sysdeps/unix/sysv/linux/ia64/clone2.S
+++ b/sysdeps/unix/sysv/linux/ia64/clone2.S
@@ -74,11 +74,11 @@  ENTRY(__clone2)
 	mov b6=out1
 	br.call.dptk.many rp=b6	/* Call fn(arg) in the child 	*/
 	;;
-	mov out0=r8		/* Argument to _exit		*/
+	mov out0=r8		/* Argument to exit		*/
 	mov gp=loc0
-	.globl HIDDEN_JUMPTARGET(_exit)
-	br.call.dpnt.many rp=HIDDEN_JUMPTARGET(_exit)
-				/* call _exit with result from fn.	*/
+	mov r15=SYS_ify (exit)
+	.save rp, r0
+	break __BREAK_SYSCALL
 	ret			/* Not reached.		*/
 PSEUDO_END(__clone2)
 
diff --git a/sysdeps/unix/sysv/linux/m68k/clone.S b/sysdeps/unix/sysv/linux/m68k/clone.S
index a680191..0894b2a 100644
--- a/sysdeps/unix/sysv/linux/m68k/clone.S
+++ b/sysdeps/unix/sysv/linux/m68k/clone.S
@@ -24,8 +24,6 @@ 
 #include <bits/errno.h>
 #include <tls.h>
 
-#define CLONE_VM      0x00000100
-
 /* int clone(int (*fn)(void *arg), void *child_stack, int flags, void *arg,
 	     void *parent_tidptr, void *tls, void *child_tidptr) */
 
diff --git a/sysdeps/unix/sysv/linux/mips/clone.S b/sysdeps/unix/sysv/linux/mips/clone.S
index 8b79457..855f972 100644
--- a/sysdeps/unix/sysv/linux/mips/clone.S
+++ b/sysdeps/unix/sysv/linux/mips/clone.S
@@ -25,9 +25,6 @@ 
 #include <bits/errno.h>
 #include <tls.h>
 
-#define CLONE_VM      0x00000100
-#define CLONE_THREAD  0x00010000
-
 /* int clone(int (*fn)(void *arg), void *child_stack, int flags, void *arg,
 	     void *parent_tidptr, void *tls, void *child_tidptr) */
 
@@ -137,14 +134,9 @@  L(thread_start):
 	/* Call the user's function.  */
 	jal		t9
 
-	/* Call _exit rather than doing it inline for breakpoint purposes.  */
 	move		a0,v0
-#ifdef __PIC__
-	PTR_LA		t9,_exit
-	jalr		t9
-#else
-	jal		_exit
-#endif
+	li		v0,__NR_exit
+	syscall
 
 	END(__thread_start)
 
diff --git a/sysdeps/unix/sysv/linux/nios2/clone.S b/sysdeps/unix/sysv/linux/nios2/clone.S
index 7929dfa..2ba8258 100644
--- a/sysdeps/unix/sysv/linux/nios2/clone.S
+++ b/sysdeps/unix/sysv/linux/nios2/clone.S
@@ -25,8 +25,6 @@ 
 #include <bits/errno.h>
 #include <tcb-offsets.h>
 
-#define CLONE_VM      0x00000100
-
 /* int clone(int (*fn)(void *arg), void *child_stack, int flags, void *arg,
 	     void *parent_tidptr, void *tls, void *child_tidptr) */
 
@@ -75,18 +73,9 @@  thread_start:
         /* Call the user's function.  */
 	callr	r5
 
-	/* _exit with the result.  */
-	mov	r4, r2
-#ifdef PIC
-	nextpc	r22
-1:	movhi	r8, %hiadj(_gp_got - 1b)
-	addi	r8, r8, %lo(_gp_got - 1b)
-	add	r22, r22, r8
-	ldw	r8, %call(HIDDEN_JUMPTARGET(_exit))(r22)
-	jmp	r8
-#else
-	jmpi	_exit
-#endif
+	/* exit with the result.  */
+	movi	r2, SYS_ify (exit)
+	trap
 	cfi_endproc
 
 	cfi_startproc
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/clone.S b/sysdeps/unix/sysv/linux/powerpc/powerpc32/clone.S
index a07b7d3..e48cc5f 100644
--- a/sysdeps/unix/sysv/linux/powerpc/powerpc32/clone.S
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/clone.S
@@ -20,10 +20,6 @@ 
 #define _ERRNO_H	1
 #include <bits/errno.h>
 
-#define CLONE_VM	0x00000100
-#define CLONE_THREAD	0x00010000
-
-
 /* This is the only really unusual system call in PPC linux, but not
    because of any weirdness in the system call itself; because of
    all the freaky stuff we have to do to make the call useful.  */
@@ -80,8 +76,7 @@  ENTRY (__clone)
 	mtctr	r30
 	mr	r3,r31
 	bctrl
-	/* Call _exit with result from procedure.  */
-	b	HIDDEN_JUMPTARGET(_exit)
+	DO_CALL(SYS_ify(exit))
 
 L(parent):
 	/* Parent.  Restore registers & return.  */
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc64/clone.S b/sysdeps/unix/sysv/linux/powerpc/powerpc64/clone.S
index 9e5bfd2..78c353a 100644
--- a/sysdeps/unix/sysv/linux/powerpc/powerpc64/clone.S
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc64/clone.S
@@ -20,9 +20,6 @@ 
 #define _ERRNO_H	1
 #include <bits/errno.h>
 
-#define CLONE_VM	0x00000100
-#define CLONE_THREAD	0x00010000
-
 /* This is the only really unusual system call in PPC linux, but not
    because of any weirdness in the system call itself; because of
    all the freaky stuff we have to do to make the call useful.  */
@@ -84,15 +81,11 @@  ENTRY (__clone)
 	mr	r3,r31
 	bctrl
 	ld	r2,FRAME_TOC_SAVE(r1)
-	/* Call _exit with result from procedure.  */
-#ifdef SHARED
-	b	JUMPTARGET(__GI__exit)
-#else
-	bl	JUMPTARGET(_exit)
+
+	DO_CALL(SYS_ify(exit))
 	/* We won't ever get here but provide a nop so that the linker
 	   will insert a toc adjusting stub if necessary.  */
 	nop
-#endif
 
 L(badargs):
 	cfi_startproc
diff --git a/sysdeps/unix/sysv/linux/s390/s390-32/clone.S b/sysdeps/unix/sysv/linux/s390/s390-32/clone.S
index a8b4dbc..1588e5f 100644
--- a/sysdeps/unix/sysv/linux/s390/s390-32/clone.S
+++ b/sysdeps/unix/sysv/linux/s390/s390-32/clone.S
@@ -59,7 +59,7 @@  thread_start:
 	ahi     %r15,-96        /* make room on the stack for the save area */
 	xc	0(4,%r15),0(%r15)
 	basr    %r14,%r1        /* jump to fn */
-	DO_CALL (exit, 1)
+	svc	SYS_ify(exit)
 
 libc_hidden_def (__clone)
 weak_alias (__clone, clone)
diff --git a/sysdeps/unix/sysv/linux/s390/s390-64/clone.S b/sysdeps/unix/sysv/linux/s390/s390-64/clone.S
index daf8a58..5843188 100644
--- a/sysdeps/unix/sysv/linux/s390/s390-64/clone.S
+++ b/sysdeps/unix/sysv/linux/s390/s390-64/clone.S
@@ -60,7 +60,7 @@  thread_start:
 	aghi	%r15,-160	/* make room on the stack for the save area */
 	xc	0(8,%r15),0(%r15)
 	basr	%r14,%r1	/* jump to fn */
-	DO_CALL	(exit, 1)
+	svc	SYS_ify(exit)
 
 libc_hidden_def (__clone)
 weak_alias (__clone, clone)
diff --git a/sysdeps/unix/sysv/linux/sh/clone.S b/sysdeps/unix/sysv/linux/sh/clone.S
index 9063b21..b13a64b 100644
--- a/sysdeps/unix/sysv/linux/sh/clone.S
+++ b/sysdeps/unix/sysv/linux/sh/clone.S
@@ -73,25 +73,8 @@  ENTRY(__clone)
 	 mov.l	@(4,r15), r4
 
 	/* we are done, passing the return value through r0  */
-	mov.l	.L3, r1
-#ifdef SHARED
-	mov.l	r12, @-r15
-	sts.l	pr, @-r15
-	mov	r0, r4
-	mova	.LG, r0
-	mov.l	.LG, r12
-	add	r0, r12
-	mova	.L3, r0
-	add	r0, r1
-	jsr	@r1
-	 nop
-	lds.l	@r15+, pr
-	rts
-	 mov.l	@r15+, r12
-#else
-	jmp	@r1
-	 mov	r0, r4
-#endif
+	mov	#+SYS_ify(exit), r3
+	trapa	#0x15
 	.align	2
 .LG:
 	.long	_GLOBAL_OFFSET_TABLE_
diff --git a/sysdeps/unix/sysv/linux/sparc/sparc32/clone.S b/sysdeps/unix/sysv/linux/sparc/sparc32/clone.S
index 6d2f5bd..1afa26e 100644
--- a/sysdeps/unix/sysv/linux/sparc/sparc32/clone.S
+++ b/sysdeps/unix/sysv/linux/sparc/sparc32/clone.S
@@ -24,8 +24,6 @@ 
 #include <tcb-offsets.h>
 #include <sysdep.h>
 
-#define CLONE_VM	0x00000100
-
 /* int clone(int (*fn)(void *arg), void *child_stack, int flags, void *arg,
 	     pid_t *ptid, void *tls, pid_t *ctid); */
 
@@ -81,7 +79,8 @@  __thread_start:
 	mov	%g0, %fp	/* terminate backtrace */
 	call	%g2
 	 mov	%g3,%o0
-	call	HIDDEN_JUMPTARGET(_exit),0
+	set	__NR_exit, %g1
+	ta	0x10
 	 nop
 
 	.size	__thread_start, .-__thread_start
diff --git a/sysdeps/unix/sysv/linux/sparc/sparc64/clone.S b/sysdeps/unix/sysv/linux/sparc/sparc64/clone.S
index fc28539..785ccd1 100644
--- a/sysdeps/unix/sysv/linux/sparc/sparc64/clone.S
+++ b/sysdeps/unix/sysv/linux/sparc/sparc64/clone.S
@@ -24,8 +24,6 @@ 
 #include <tcb-offsets.h>
 #include <sysdep.h>
 
-#define CLONE_VM	0x00000100
-
 /* int clone(int (*fn)(void *arg), void *child_stack, int flags, void *arg,
 	     pid_t *ptid, void *tls, pid_t *ctid); */
 
@@ -78,7 +76,8 @@  __thread_start:
 	mov	%g0, %fp	/* terminate backtrace */
 	call	%g2
 	 mov	%g3,%o0
-	call	HIDDEN_JUMPTARGET(_exit),0
+	set	__NR_exit, %g1
+	ta	0x6d
 	 nop
 
 	.size	__thread_start, .-__thread_start
diff --git a/sysdeps/unix/sysv/linux/tile/clone.S b/sysdeps/unix/sysv/linux/tile/clone.S
index d7d2a3b..9610acd 100644
--- a/sysdeps/unix/sysv/linux/tile/clone.S
+++ b/sysdeps/unix/sysv/linux/tile/clone.S
@@ -168,10 +168,8 @@  ENTRY (__clone)
 	 move r0, r31
 	 jalr r32
 	}
-	{
-	 j HIDDEN_JUMPTARGET(_exit)
-	 info INFO_OP_CANNOT_BACKTRACE   /* Notify backtracer to stop. */
-	}
+	moveli TREG_SYSCALL_NR_NAME, __NR_exit
+	swint1
 PSEUDO_END (__clone)
 
 libc_hidden_def (__clone)
diff --git a/sysdeps/unix/sysv/linux/tst-clone3.c b/sysdeps/unix/sysv/linux/tst-clone3.c
new file mode 100644
index 0000000..e893e4d
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/tst-clone3.c
@@ -0,0 +1,95 @@ 
+/* Check if clone (CLONE_THREAD) does not call exit_group (BZ #21512)
+   Copyright (C) 2017 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <string.h>
+#include <sched.h>
+#include <signal.h>
+#include <unistd.h>
+#include <sys/syscall.h>
+#include <sys/wait.h>
+#include <sys/types.h>
+#include <linux/futex.h>
+
+#include <stackinfo.h>  /* For _STACK_GROWS_{UP,DOWN}.  */
+#include <support/check.h>
+
+/* Test that clone with CLONE_THREAD does not call exit_group.  The 'f'
+   function returns '1', which the cloned thread passes directly to the
+   'exit' syscall.  If _exit were used instead, exit_group would be
+   called and the whole thread group would finish with status '1'
+   (where '2' from the main thread is expected).  */
+
+static int
+f (void *a)
+{
+  return 1;
+}
+
+/* Futex wait for TID argument, similar to pthread_join internal
+   implementation.  */
+#define wait_tid(tid) \
+  do {					\
+    __typeof (tid) __tid;		\
+    while ((__tid = (tid)) != 0)	\
+      futex_wait (&(tid), __tid);	\
+  } while (0)
+
+static inline int
+futex_wait (int *futexp, int val)
+{
+  return syscall (__NR_futex, futexp, FUTEX_WAIT, val);
+}
+
+static int
+do_test (void)
+{
+  char st[1024] __attribute__ ((aligned));
+  int clone_flags = CLONE_THREAD;
+  /* Minimum required flags to be used along with CLONE_THREAD.  */
+  clone_flags |= CLONE_VM | CLONE_SIGHAND;
+  /* We will use ctid as a futex word to wait for thread exit.  */
+  clone_flags |= CLONE_CHILD_CLEARTID;
+  pid_t ctid = 1, tid;  /* Kernel zeroes ctid at thread exit.  */
+
+#ifdef __ia64__
+  extern int __clone2 (int (*__fn) (void *__arg), void *__child_stack_base,
+		       size_t __child_stack_size, int __flags,
+		       void *__arg, ...);
+  tid = __clone2 (f, st, sizeof (st), clone_flags, NULL, /* ptid */ NULL,
+		  /* tls */ NULL, &ctid);
+#else
+#if _STACK_GROWS_DOWN
+  tid = clone (f, st + sizeof (st), clone_flags, NULL, /* ptid */ NULL,
+	       /* tls */ NULL, &ctid);
+#elif _STACK_GROWS_UP
+  tid = clone (f, st, clone_flags, NULL, /* ptid */ NULL, /* tls */ NULL,
+	       &ctid);
+#else
+#error "Define either _STACK_GROWS_DOWN or _STACK_GROWS_UP"
+#endif
+#endif
+  if (tid == -1)
+    FAIL_EXIT1 ("clone failed: %m");
+
+  wait_tid (ctid);
+
+  return 2;
+}
+
+#define EXPECTED_STATUS 2
+#include <support/test-driver.c>
diff --git a/sysdeps/unix/sysv/linux/x86_64/clone.S b/sysdeps/unix/sysv/linux/x86_64/clone.S
index d5c2d07..b10fc29 100644
--- a/sysdeps/unix/sysv/linux/x86_64/clone.S
+++ b/sysdeps/unix/sysv/linux/x86_64/clone.S
@@ -23,8 +23,6 @@ 
 #include <bits/errno.h>
 #include <asm-syntax.h>
 
-#define CLONE_VM	0x00000100
-
 /* The userland implementation is:
    int clone (int (*fn)(void *arg), void *child_stack, int flags, void *arg),
    the kernel entry is:
@@ -97,7 +95,8 @@  L(thread_start):
 	call	*%rax
 	/* Call exit with return value from function call. */
 	movq	%rax, %rdi
-	call	HIDDEN_JUMPTARGET (_exit)
+	movl	$SYS_ify(exit), %eax
+	syscall
 	cfi_endproc;
 
 	cfi_startproc;