[5/9] glibc: Perform rseq(2) registration at C startup and thread creation (v17)

Message ID 20200326155633.18236-6-mathieu.desnoyers@efficios.com
State New
Series Restartable Sequences enablement

Commit Message

Mathieu Desnoyers via Libc-alpha March 26, 2020, 3:56 p.m. UTC
Register rseq(2) TLS for each thread (including main), and unregister
for each thread (excluding main). "rseq" stands for Restartable
Sequences.

See the rseq(2) man page proposed here:
  https://lkml.org/lkml/2018/9/19/647

This patch depends on three patches from Florian Weimer:

- Introduce <elf_machine_sym_no_match.h>
- Implement __libc_early_init
- nptl: Start new threads with all signals blocked [BZ #25098]

Those patches are based on glibc master branch commit 1fabdb99084df004f7f4cdc7068d1be209a258be.
The rseq(2) system call was merged into Linux 4.18.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
CC: Carlos O'Donell <carlos@redhat.com>
CC: Florian Weimer <fweimer@redhat.com>
CC: Joseph Myers <joseph@codesourcery.com>
CC: Szabolcs Nagy <szabolcs.nagy@arm.com>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Ben Maurer <bmaurer@fb.com>
CC: Peter Zijlstra <peterz@infradead.org>
CC: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
CC: Boqun Feng <boqun.feng@gmail.com>
CC: Will Deacon <will.deacon@arm.com>
CC: Dave Watson <davejwatson@fb.com>
CC: Paul Turner <pjt@google.com>
CC: Rich Felker <dalias@libc.org>
CC: libc-alpha@sourceware.org
CC: linux-kernel@vger.kernel.org
CC: linux-api@vger.kernel.org
---
Changes since v1:
- Move __rseq_refcount to an extra field at the end of __rseq_abi to
  eliminate one symbol.

  All libraries/programs which try to register rseq (glibc,
  early-adopter applications, early-adopter libraries) should use the
  rseq refcount. It becomes part of the ABI within a user-space
  process, but it's not part of the ABI shared with the kernel per se.

- Restructure how this code is organized so glibc keeps building on
  non-Linux targets.

- Use non-weak symbol for __rseq_abi.

- Move rseq registration/unregistration implementation into its own
  nptl/rseq.c compile unit.

- Move __rseq_abi symbol under GLIBC_2.29.

Changes since v2:
- Move __rseq_refcount to its own symbol, which is less ugly than
  trying to play tricks with the rseq uapi.
- Move __rseq_abi from nptl to csu (C start up), so it can be used
  across glibc, including memory allocator and sched_getcpu(). The
  __rseq_refcount symbol is kept in nptl, because there is no reason
  to use it elsewhere in glibc.

Changes since v3:
- Set __rseq_refcount TLS to 1 on register/set to 0 on unregister
  because glibc is the first/last user.
- Unconditionally register/unregister rseq at thread start/exit, because
  glibc is the first/last user.
- Add missing abilist items.
- Rebase on glibc master commit a502c5294.
- Add NEWS entry.

Changes since v4:
- Do not use "weak" symbols for __rseq_abi and __rseq_refcount. Based on
  "System V Application Binary Interface", weak only affects the link
  editor, not the dynamic linker.
- Install a new sys/rseq.h system header on Linux, which contains the
  RSEQ_SIG definition, __rseq_abi declaration and __rseq_refcount
  declaration. Move those definition/declarations from rseq-internal.h
  to the installed sys/rseq.h header.
- Considering that rseq is only available on Linux, move csu/rseq.c to
  sysdeps/unix/sysv/linux/rseq-sym.c.
- Move __rseq_refcount from nptl/rseq.c to
  sysdeps/unix/sysv/linux/rseq-sym.c, so it is only defined on Linux.
- Move both ABI definitions for __rseq_abi and __rseq_refcount to
  sysdeps/unix/sysv/linux/Versions, so they only appear on Linux.
- Document __rseq_abi and __rseq_refcount volatile.
- Document the RSEQ_SIG signature define.
- Move registration functions from rseq.c to rseq-internal.h static
  inline functions. Introduce empty stubs in misc/rseq-internal.h,
  which can be overridden by architecture code in
  sysdeps/unix/sysv/linux/rseq-internal.h.
- Rename __rseq_register_current_thread and __rseq_unregister_current_thread
  to rseq_register_current_thread and rseq_unregister_current_thread,
  now that those are only visible as internal static inline functions.
- Invoke rseq_register_current_thread() from libc-start.c LIBC_START_MAIN
  rather than nptl init, so applications not linked against
  libpthread.so have rseq registered for their main() thread. Note that
  it is invoked separately for SHARED and !SHARED builds.

Changes since v5:
- Replace __rseq_refcount by __rseq_lib_abi, which contains two
  uint32_t: register_state and refcount. The "register_state" field
  allows inhibiting rseq registration from signal handlers nested on top
  of glibc registration and occurring after rseq unregistration by glibc.
- Introduce enum rseq_register_state, which contains the states allowed
  for the struct rseq_lib_abi register_state field.

Changes since v6:
- Introduce bits/rseq.h to define RSEQ_SIG for each architecture.
  The generic bits/rseq.h does not define RSEQ_SIG, meaning that each
  architecture implementing rseq needs to implement bits/rseq.h.
- Rename enum item RSEQ_REGISTER_NESTED to RSEQ_REGISTER_ONGOING.
- Port to glibc-2.29.

Changes since v7:
- Remove __rseq_lib_abi symbol, including refcount and register_state
  fields.
- Remove reference counting and nested signals handling from
  registration/unregistration functions.
- Introduce new __rseq_handled exported symbol, which is set to 1
  by glibc on C startup when it handles restartable sequences.
  This allows glibc to coexist with early adopter libraries and
  applications wishing to register restartable sequences when it
  is not handled by glibc.
- Introduce rseq_init (), which sets __rseq_handled to 1 from
  C startup.
- Update NEWS entry.
- Update comments at the beginning of new files.
- Registration depends on both __NR_rseq and RSEQ_SIG.
- Remove ARM, powerpc, MIPS RSEQ_SIG until we agree with maintainers
  on the signature choice.
- Update x86, s390 RSEQ_SIG based on discussion with arch maintainers.
- Remove rseq-internal.h from headers list of misc/Makefile, so it
  is not installed by make install.

Changes since v8:
- Introduce RSEQ_SIG_CODE and RSEQ_SIG_DATA on aarch64 to handle
  compiling with -mbig-endian.

Changes since v9:
- Update Changelog.
- Remove unneeded new file comment header newlines.

Changes since v10:
- Remove volatile from __rseq_abi declaration.
- Document that __rseq_handled is about library managing rseq
  registration, independently of whether rseq is available or not.
- Move __rseq_handled symbol to ld.so, initialize this symbol within
  the dynamic linker initialization for both shared (rtld.c) and static
  (dl-support.c) builds.
- Only register the rseq TLS on initialization once in multiple-libc
  scenarios. Use rtld_active () for this purpose.
- In the static libc case, register the rseq TLS after LD_PRELOAD
  constructors are run, so it matches the order of this initialization
  vs LD_PRELOAD constructors execution for the shared libc.
- Agreed on signature choice with powerpc and MIPS maintainers,
  re-adding those signatures.
- The main architecture still left out signature-wise is ARM32.

Changes since v11:
- Rebase on glibc 2.30.
- Re-introduce ARM RSEQ_SIG following feedback from Will Deacon.

Changes since v12:
- Remove __rseq_handled.
- Rely on OS implicit rseq unregistration on thread teardown.
- Register main thread in __libc_early_init ().
- Add Restartable Sequences entry to threads manual.

Changes since v13:
- Update following be/le abilist split for arm, microblaze, and sh.
- Update manual to add the __rseq_abi variable and RSEQ_SIG macro to
  generate manual index entries, and add missing "Restartable Sequences"
  menu entry to the threads chapter.

Changes since v14:
- Update copyright range to include 2020.
- Introduce __ASSUME_RSEQ defined for --enable-kernel=4.18.0 and higher.
- Use ifdef __ASSUME_RSEQ rather than ifdef __NR_rseq to discover rseq
  availability. This is necessary now that the system call numbers are
  integrated within glibc.

Changes since v15:
- Remove __ASSUME_RSEQ from kernel features.
- rseq internal: remove assume rseq
- remove assume rseq and struct rseq def from sysdeps/unix/sysv/linux/rseq-sym.c
- sys/rseq.h: detect rseq header, implement fallback
- sysdeps/unix/sysv/linux/sys/rseq.h include cdefs.h, add _Static_assert
  to validate struct rseq and struct rseq_cs alignment.
- sys/rseq.h: document that posix_memalign should be used rather than
  malloc if allocating struct rseq or struct rseq_cs on the heap. This
  is required to guarantee 32-byte alignment.
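
A minimal sketch of the posix_memalign recommendation in the last item
above. The names struct sketch_rseq_area and sketch_alloc_rseq_area are
hypothetical stand-ins that only mimic the 32-byte size and alignment of
struct rseq; they are not part of the patch or of sys/rseq.h.

```c
#ifndef _POSIX_C_SOURCE
# define _POSIX_C_SOURCE 200112L  /* Expose posix_memalign.  */
#endif
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct rseq: same 32-byte alignment,
   simplified fields.  */
struct sketch_rseq_area
{
  uint32_t cpu_id_start;
  uint32_t cpu_id;
  uint64_t rseq_cs;
  uint32_t flags;
} __attribute__ ((aligned (32)));

/* Return a zeroed, 32-byte-aligned area, or NULL on failure.  malloc
   only guarantees alignment suitable for fundamental types, which may
   be less than 32 bytes.  */
static struct sketch_rseq_area *
sketch_alloc_rseq_area (void)
{
  void *p = NULL;

  if (posix_memalign (&p, 32, sizeof (struct sketch_rseq_area)) != 0)
    return NULL;
  /* posix_memalign returns memory aligned to the requested boundary.  */
  assert (((uintptr_t) p & 31) == 0);
  memset (p, 0, sizeof (struct sketch_rseq_area));
  return p;
}
```

The caller releases the area with free () once it is no longer
registered with the kernel.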

Changes since v16:
- Move rseq NEWS entry under 2.32.
- Move new __rseq_abi symbol to GLIBC_2.32.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
CC: Carlos O'Donell <carlos@redhat.com>
CC: Florian Weimer <fweimer@redhat.com>
CC: Joseph Myers <joseph@codesourcery.com>
CC: Szabolcs Nagy <szabolcs.nagy@arm.com>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Ben Maurer <bmaurer@fb.com>
CC: Peter Zijlstra <peterz@infradead.org>
CC: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
CC: Boqun Feng <boqun.feng@gmail.com>
CC: Will Deacon <will.deacon@arm.com>
CC: Paul Turner <pjt@google.com>
CC: Rich Felker <dalias@libc.org>
CC: libc-alpha@sourceware.org
CC: linux-kernel@vger.kernel.org
CC: linux-api@vger.kernel.org
---
 NEWS                                          |  12 +-
 elf/libc_early_init.c                         |   3 +
 manual/threads.texi                           |  30 ++-
 misc/rseq-internal.h                          |  33 ++++
 nptl/pthread_create.c                         |  12 ++
 sysdeps/unix/sysv/linux/Makefile              |   5 +-
 sysdeps/unix/sysv/linux/Versions              |   3 +
 sysdeps/unix/sysv/linux/aarch64/bits/rseq.h   |  43 ++++
 sysdeps/unix/sysv/linux/aarch64/libc.abilist  |   1 +
 sysdeps/unix/sysv/linux/alpha/libc.abilist    |   1 +
 sysdeps/unix/sysv/linux/arm/be/libc.abilist   |   1 +
 sysdeps/unix/sysv/linux/arm/bits/rseq.h       |  83 ++++++++
 sysdeps/unix/sysv/linux/arm/le/libc.abilist   |   1 +
 sysdeps/unix/sysv/linux/bits/rseq.h           |  29 +++
 sysdeps/unix/sysv/linux/csky/libc.abilist     |   1 +
 sysdeps/unix/sysv/linux/hppa/libc.abilist     |   1 +
 sysdeps/unix/sysv/linux/i386/libc.abilist     |   1 +
 sysdeps/unix/sysv/linux/ia64/libc.abilist     |   1 +
 .../sysv/linux/m68k/coldfire/libc.abilist     |   1 +
 .../unix/sysv/linux/m68k/m680x0/libc.abilist  |   1 +
 .../sysv/linux/microblaze/be/libc.abilist     |   1 +
 .../sysv/linux/microblaze/le/libc.abilist     |   1 +
 sysdeps/unix/sysv/linux/mips/bits/rseq.h      |  62 ++++++
 .../sysv/linux/mips/mips32/fpu/libc.abilist   |   1 +
 .../sysv/linux/mips/mips32/nofpu/libc.abilist |   1 +
 .../sysv/linux/mips/mips64/n32/libc.abilist   |   1 +
 .../sysv/linux/mips/mips64/n64/libc.abilist   |   1 +
 sysdeps/unix/sysv/linux/nios2/libc.abilist    |   1 +
 sysdeps/unix/sysv/linux/powerpc/bits/rseq.h   |  37 ++++
 .../linux/powerpc/powerpc32/fpu/libc.abilist  |   1 +
 .../powerpc/powerpc32/nofpu/libc.abilist      |   1 +
 .../linux/powerpc/powerpc64/be/libc.abilist   |   1 +
 .../linux/powerpc/powerpc64/le/libc.abilist   |   1 +
 .../unix/sysv/linux/riscv/rv64/libc.abilist   |   1 +
 sysdeps/unix/sysv/linux/rseq-internal.h       |  73 +++++++
 sysdeps/unix/sysv/linux/rseq-sym.c            |  25 +++
 sysdeps/unix/sysv/linux/s390/bits/rseq.h      |  37 ++++
 .../unix/sysv/linux/s390/s390-32/libc.abilist |   1 +
 .../unix/sysv/linux/s390/s390-64/libc.abilist |   1 +
 sysdeps/unix/sysv/linux/sh/be/libc.abilist    |   1 +
 sysdeps/unix/sysv/linux/sh/le/libc.abilist    |   1 +
 .../sysv/linux/sparc/sparc32/libc.abilist     |   1 +
 .../sysv/linux/sparc/sparc64/libc.abilist     |   1 +
 sysdeps/unix/sysv/linux/sys/rseq.h            | 186 ++++++++++++++++++
 sysdeps/unix/sysv/linux/x86/bits/rseq.h       |  30 +++
 .../unix/sysv/linux/x86_64/64/libc.abilist    |   1 +
 .../unix/sysv/linux/x86_64/x32/libc.abilist   |   1 +
 47 files changed, 728 insertions(+), 5 deletions(-)
 create mode 100644 misc/rseq-internal.h
 create mode 100644 sysdeps/unix/sysv/linux/aarch64/bits/rseq.h
 create mode 100644 sysdeps/unix/sysv/linux/arm/bits/rseq.h
 create mode 100644 sysdeps/unix/sysv/linux/bits/rseq.h
 create mode 100644 sysdeps/unix/sysv/linux/mips/bits/rseq.h
 create mode 100644 sysdeps/unix/sysv/linux/powerpc/bits/rseq.h
 create mode 100644 sysdeps/unix/sysv/linux/rseq-internal.h
 create mode 100644 sysdeps/unix/sysv/linux/rseq-sym.c
 create mode 100644 sysdeps/unix/sysv/linux/s390/bits/rseq.h
 create mode 100644 sysdeps/unix/sysv/linux/sys/rseq.h
 create mode 100644 sysdeps/unix/sysv/linux/x86/bits/rseq.h

Comments

Florian Weimer April 27, 2020, 9:11 a.m. UTC | #1
* Mathieu Desnoyers via Libc-alpha:

> +* Support for automatically registering threads with the Linux rseq(2)
> +  system call has been added.  This system call is implemented starting
> +  from Linux 4.18.  The Restartable Sequences ABI accelerates user-space
> +  operations on per-cpu data.  It allows user-space to perform updates
> +  on per-cpu data without requiring heavy-weight atomic operations.
> +  Automatically registering threads allows all libraries, including libc,
> +  to make immediate use of the rseq(2) support by using the documented ABI.
> +  See 'man 2 rseq' for the details of the ABI shared between libc and the
> +  kernel.

This should refer to documentation in the glibc manual.

(It is currently a glibc project requirement to add documentation for
new Linux interfaces, something that I do not necessarily agree with.)

>  
>  Deprecated and removed features, and other changes affecting compatibility:
>  
> diff --git a/elf/libc_early_init.c b/elf/libc_early_init.c
> index 1ac66d895d..30466afea0 100644
> --- a/elf/libc_early_init.c
> +++ b/elf/libc_early_init.c
> @@ -18,10 +18,13 @@
>  
>  #include <ctype.h>
>  #include <libc-early-init.h>
> +#include <rseq-internal.h>
>  
>  void
>  __libc_early_init (void)
>  {
>    /* Initialize ctype data.  */
>    __ctype_init ();
> +  /* Register rseq ABI to the kernel.   */
> +  (void) rseq_register_current_thread ();
>  }

The cast to void should be removed (see the comment below about the
return type).

> diff --git a/manual/threads.texi b/manual/threads.texi
> index 0858ef8f92..59f634e432 100644
> --- a/manual/threads.texi
> +++ b/manual/threads.texi
> @@ -9,8 +9,10 @@ This chapter describes functions used for managing threads.
>  POSIX threads.
>  
>  @menu
> -* ISO C Threads::	Threads based on the ISO C specification.
> -* POSIX Threads::	Threads based on the POSIX specification.
> +* ISO C Threads::		Threads based on the ISO C specification.
> +* POSIX Threads::		Threads based on the POSIX specification.
> +* Restartable Sequences::	Linux-specific Restartable Sequences
> +				integration.
>  @end menu
>  
>  
> @@ -881,3 +883,27 @@ Behaves like @code{pthread_timedjoin_np} except that the absolute time in
>  @c pthread_spin_unlock
>  @c pthread_testcancel
>  @c pthread_yield
> +
> +@node Restartable Sequences
> +@section Restartable Sequences
> +@cindex rseq
> +
> +This section describes @theglibc{} Restartable Sequences integration.
> +
> +@deftypevar {struct rseq} __rseq_abi
> +@standards{GNU, sys/rseq.h}
> +@Theglibc{} implements a @code{__rseq_abi} TLS symbol to interact with the
> +Restartable Sequences system call (Linux-specific).  The layout of this
> +structure is defined by the Linux kernel @file{linux/rseq.h} UAPI.
> +Registration of each thread's @code{__rseq_abi} is performed by
> +@theglibc{} at libc initialization and pthread creation.
> +@end deftypevar
> +
> +@deftypevr Macro int RSEQ_SIG
> +@standards{GNU, sys/rseq.h}
> +Each supported architecture provide a @code{RSEQ_SIG} macro in
> +@file{sys/rseq.h} which contains a signature.  That signature is expected to be
> +present in the code before each Restartable Sequences abort handler.  Failure
> +to provide the expected signature may terminate the process with a Segmentation
> +fault.
> +@end deftypevr

This should say Linux, not GNU as the standards source, given that the
interface is not added to the GNU ABI.

Is this sufficient documentation of this feature?  What do others
think about this?

> diff --git a/misc/rseq-internal.h b/misc/rseq-internal.h
> new file mode 100644
> index 0000000000..d564cf1bc3
> --- /dev/null
> +++ b/misc/rseq-internal.h
> @@ -0,0 +1,33 @@

> +static inline int
> +rseq_register_current_thread (void)
> +{
> +  return -1;
> +}

Nothing checks the return value of this function as far as I can see,
so it can return void.

> +static inline int
> +rseq_unregister_current_thread (void)
> +{
> +  return -1;
> +}

This function is unused.  This also applies to the full version.  I
believe we switched to implicit deregistration on thread exit, so I
think you can just remove it.

> diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c
> index afd379e89a..1ff248042e 100644
> --- a/nptl/pthread_create.c
> +++ b/nptl/pthread_create.c
> @@ -33,6 +33,7 @@
>  #include <default-sched.h>
>  #include <futex-internal.h>
>  #include <tls-setup.h>
> +#include <rseq-internal.h>
>  #include "libioP.h"
>  
>  #include <shlib-compat.h>
> @@ -384,6 +385,9 @@ START_THREAD_DEFN
>    /* Initialize pointers to locale data.  */
>    __ctype_init ();
>  
> +  /* Register rseq TLS to the kernel. */
> +  (void) rseq_register_current_thread ();
> +

The cast can go.

>  #ifndef __ASSUME_SET_ROBUST_LIST
>    if (__set_robust_list_avail >= 0)
>  #endif
> @@ -578,6 +582,14 @@ START_THREAD_DEFN
>       process is really dead since 'clone' got passed the CLONE_CHILD_CLEARTID
>       flag.  The 'tid' field in the TCB will be set to zero.
>  
> +     rseq TLS is still registered at this point. Rely on implicit unregistration
> +     performed by the kernel on thread teardown. This is not a problem because the
> +     rseq TLS lives on the stack, and the stack outlives the thread. If TCB
> +     allocation is ever changed, additional steps may be required, such as
> +     performing explicit rseq unregistration before reclaiming the rseq TLS area
> +     memory. It is NOT sufficient to block signals because the kernel may write
> +     to the rseq area even without signals.
> +
>       The exit code is zero since in case all threads exit by calling
>       'pthread_exit' the exit status must be 0 (zero).  */
>    __exit_thread ();

Some of these lines are too long.  Also two spaces after . at the end
of sentences.

> diff --git a/sysdeps/unix/sysv/linux/rseq-internal.h b/sysdeps/unix/sysv/linux/rseq-internal.h
> new file mode 100644
> index 0000000000..5f7f02f1ec
> --- /dev/null
> +++ b/sysdeps/unix/sysv/linux/rseq-internal.h

> +static inline int
> +rseq_register_current_thread (void)
> +{
> +  int rc, ret = 0;
> +
> +  if (__rseq_abi.cpu_id == RSEQ_CPU_ID_REGISTRATION_FAILED)
> +    return -1;
> +  rc = INTERNAL_SYSCALL_CALL (rseq, &__rseq_abi, sizeof (struct rseq),
> +                              0, RSEQ_SIG);
> +  if (!rc)
> +    goto end;
> +  if (INTERNAL_SYSCALL_ERRNO (rc) != EBUSY)
> +    __rseq_abi.cpu_id = RSEQ_CPU_ID_REGISTRATION_FAILED;
> +  ret = -1;
> +end:
> +  return ret;
> +}

This does not seem to use INTERNAL_SYSCALL_CALL correctly.  I think
you need to use INTERNAL_SYSCALL_ERROR_P on the result to check for an
error, and only then use INTERNAL_SYSCALL_ERRNO to extract the error
code.
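
A rough sketch of the check order described above, written so that it
builds outside glibc. SKETCH_SYSCALL_ERROR_P and SKETCH_SYSCALL_ERRNO
are hypothetical stand-ins for the internal macros, under the assumption
that a raw Linux system call reports failure as a value in the range
[-4095, -1]; the helper name and the two cpu_id constants are likewise
illustrative only.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for INTERNAL_SYSCALL_ERROR_P and
   INTERNAL_SYSCALL_ERRNO, assuming the usual Linux convention that a
   raw system call returns a value in [-4095, -1] on failure.  */
#define SKETCH_SYSCALL_ERROR_P(val) ((unsigned long int) (val) > -4096UL)
#define SKETCH_SYSCALL_ERRNO(val) (-(val))

#define SKETCH_CPU_ID_UNINITIALIZED (-1)
#define SKETCH_CPU_ID_REGISTRATION_FAILED (-2)

/* Given the current cpu_id value and the raw result RC of a
   registration attempt, return the cpu_id value to keep.  EBUSY means
   the area is already registered, so it is not treated as a permanent
   failure.  */
static int32_t
sketch_cpu_id_after_register (int32_t cpu_id, long int rc)
{
  /* rseq returns 0 on success or an error code, nothing else.  */
  assert (rc == 0 || SKETCH_SYSCALL_ERROR_P (rc));

  /* Test for an error first, and only then extract the error code.  */
  if (SKETCH_SYSCALL_ERROR_P (rc) && SKETCH_SYSCALL_ERRNO (rc) != EBUSY)
    return SKETCH_CPU_ID_REGISTRATION_FAILED;
  return cpu_id;
}
```

Since the helper no longer reports success or failure to its caller,
the same structure fits a registration function returning void.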

> diff --git a/sysdeps/unix/sysv/linux/rseq-sym.c b/sysdeps/unix/sysv/linux/rseq-sym.c
> new file mode 100644
> index 0000000000..0e33fab278
> --- /dev/null
> +++ b/sysdeps/unix/sysv/linux/rseq-sym.c

> +#include <sys/syscall.h>
> +#include <stdint.h>
> +#include <kernel-features.h>
> +#include <sys/rseq.h>
> +
> +__thread struct rseq __rseq_abi = {
> +  .cpu_id = RSEQ_CPU_ID_UNINITIALIZED,
> +};

{ should go onto its own line.  I'd also add attribute_tls_model_ie,
although it's implied by the declaration in the header.
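
For illustration, a definition following both suggestions could look
roughly like the sketch below. It assumes that attribute_tls_model_ie
expands to the GCC tls_model ("initial-exec") attribute, which is
spelled out here so the sketch builds outside glibc; struct sketch_rseq
and sketch_rseq_abi are hypothetical stand-ins for the real type and
symbol.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_CPU_ID_UNINITIALIZED (-1)

/* Hypothetical stand-in for struct rseq with simplified fields.  */
struct sketch_rseq
{
  uint32_t cpu_id_start;
  uint32_t cpu_id;
  uint64_t rseq_cs;
  uint32_t flags;
} __attribute__ ((aligned (32)));

/* Assumption: attribute_tls_model_ie expands to the GCC attribute
   below.  The opening brace sits on its own line.  */
__thread struct sketch_rseq sketch_rseq_abi
  __attribute__ ((tls_model ("initial-exec"))) =
  {
    .cpu_id = (uint32_t) SKETCH_CPU_ID_UNINITIALIZED,
  };

/* A thread that has not attempted registration yet sees the
   initializer value.  */
static void
sketch_check_initial_state (void)
{
  assert (sketch_rseq_abi.cpu_id == (uint32_t) SKETCH_CPU_ID_UNINITIALIZED);
}
```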

> diff --git a/sysdeps/unix/sysv/linux/sys/rseq.h b/sysdeps/unix/sysv/linux/sys/rseq.h
> new file mode 100644
> index 0000000000..503dce4cac
> --- /dev/null
> +++ b/sysdeps/unix/sysv/linux/sys/rseq.h
> @@ -0,0 +1,186 @@

I think there is some value in making this header compatible with
inclusion from the assembler (including constants for the relevant
struct offsets), but that can be a later change.

> +#ifndef _SYS_RSEQ_H
> +#define _SYS_RSEQ_H	1
> +
> +/* Architecture-specific rseq signature.  */
> +#include <bits/rseq.h>

Maybe add a newline between the above and the following, to make clear
the comment only applies to the first #include.

> +#include <stdint.h>
> +#include <sys/cdefs.h>
> +
> +#ifdef __has_include
> +# if __has_include ("linux/rseq.h")
> +#   define __GLIBC_HAVE_KERNEL_RSEQ
> +# endif
> +#else
> +# include <linux/version.h>
> +# if LINUX_VERSION_CODE >= KERNEL_VERSION (4, 18, 0)
> +#   define __GLIBC_HAVE_KERNEL_RSEQ
> +# endif
> +#endif
> +
> +#ifdef __GLIBC_HAVE_KERNEL_RSEQ
> +/* We use the structures declarations from the kernel headers.  */
> +# include <linux/rseq.h>
> +#else
> +/* We use a copy of the include/uapi/linux/rseq.h kernel header.  */
> +
> +#include <asm/byteorder.h>
> +
> +enum rseq_cpu_id_state
> +  {
> +    RSEQ_CPU_ID_UNINITIALIZED = -1,
> +    RSEQ_CPU_ID_REGISTRATION_FAILED = -2,
> +  };
> +
> +enum rseq_flags
> +  {
> +    RSEQ_FLAG_UNREGISTER = (1 << 0),
> +  };
> +
> +enum rseq_cs_flags_bit
> +  {
> +    RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT_BIT = 0,
> +    RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL_BIT = 1,
> +    RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE_BIT = 2,
> +  };
> +
> +enum rseq_cs_flags
> +  {
> +    RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT =
> +      (1U << RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT_BIT),
> +    RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL =
> +      (1U << RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL_BIT),
> +    RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE =
> +      (1U << RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE_BIT),
> +  };
> +
> +/* struct rseq_cs is aligned on 4 * 8 bytes to ensure it is always
> +   contained within a single cache-line. It is usually declared as
> +   link-time constant data.  */
> +struct rseq_cs
> +  {
> +    /* Version of this structure.  */
> +    uint32_t version;
> +    /* enum rseq_cs_flags.  */
> +    uint32_t flags;
> +    uint64_t start_ip;
> +    /* Offset from start_ip.  */
> +    uint64_t post_commit_offset;
> +    uint64_t abort_ip;
> +} __attribute__((aligned(4 * sizeof(uint64_t))));

The comment is wrong.  32-byte alignment does not put struct rseq_cs
on its own cache line on many (most?) CPUs.  Not using the constant 32
looks like unnecessary obfuscation to me.

I still think we should avoid the alignment.  The _ip fields should
perhaps be _pc (IP is more or less specific to x86).

{ and } are not aligned.  Please do not forget to add spaces before
opening parentheses, and two spaces after the . at the end of
sentences.  The opening { should always be on its own line.  (This
also applies to the definition of struct rseq below.)
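
To make the cache-line point concrete, here is a small sketch. It
assumes a 64-byte cache line, which is common but not universal, and
uses hypothetical names (struct sketch_rseq_cs, sketch_pair,
sketch_same_line). With the alignment written as the plain constant 32,
two adjacent descriptors can still start within the same 64-byte line.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical copy of the layout under review, with the alignment
   written as the plain constant 32.  */
struct sketch_rseq_cs
{
  uint32_t version;
  uint32_t flags;
  uint64_t start_ip;
  uint64_t post_commit_offset;
  uint64_t abort_ip;
} __attribute__ ((aligned (32)));

_Static_assert (sizeof (struct sketch_rseq_cs) == 32, "size is 32 bytes");
_Static_assert (_Alignof (struct sketch_rseq_cs) == 32, "32-byte aligned");

/* Two adjacent descriptors, with the first one placed on a 64-byte
   boundary (64 is an assumed cache-line size).  */
static struct sketch_rseq_cs sketch_pair[2] __attribute__ ((aligned (64)));

/* Nonzero if A and B start in the same LINE_SIZE-byte line.
   LINE_SIZE must be a power of two.  */
static int
sketch_same_line (const void *a, const void *b, uintptr_t line_size)
{
  assert (line_size != 0 && (line_size & (line_size - 1)) == 0);
  return (uintptr_t) a / line_size == (uintptr_t) b / line_size;
}
```

Under that assumption, sketch_same_line (&sketch_pair[0],
&sketch_pair[1], 64) is nonzero: the alignment keeps each descriptor
from straddling a line, but does not give it a line of its own.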

> +
> +/* struct rseq is aligned on 4 * 8 bytes to ensure it is always
> +   contained within a single cache-line.
> +
> +   A single struct rseq per thread is allowed.  */
> +struct rseq
> +  {
> +    /* Restartable sequences cpu_id_start field. Updated by the
> +       kernel. Read by user-space with single-copy atomicity
> +       semantics. This field should only be read by the thread which
> +       registered this data structure. Aligned on 32-bit. Always

What does “Aligned on 32-bit” mean in this context?  Do you mean to
reference 32-*byte* alignment here?

> +    /* Restartable sequences rseq_cs field.
> +
> +       Contains NULL when no critical section is active for the current
> +       thread, or holds a pointer to the currently active struct rseq_cs.
> +
> +       Updated by user-space, which sets the address of the currently
> +       active rseq_cs at the beginning of assembly instruction sequence
> +       block, and set to NULL by the kernel when it restarts an assembly
> +       instruction sequence block, as well as when the kernel detects that
> +       it is preempting or delivering a signal outside of the range
> +       targeted by the rseq_cs. Also needs to be set to NULL by user-space
> +       before reclaiming memory that contains the targeted struct rseq_cs.
> +
> +       Read and set by the kernel. Set by user-space with single-copy
> +       atomicity semantics. This field should only be updated by the
> +       thread which registered this data structure. Aligned on 64-bit.  */
> +    union {
> +      uint64_t ptr64;
> +#ifdef __LP64__
> +      uint64_t ptr;
> +#else
> +      struct {
> +#if (defined(__BYTE_ORDER) && (__BYTE_ORDER == __BIG_ENDIAN)) || defined(__BIG_ENDIAN)
> +        uint32_t padding; /* Initialized to zero.  */
> +        uint32_t ptr32;
> +#else /* LITTLE */
> +        uint32_t ptr32;
> +        uint32_t padding; /* Initialized to zero.  */
> +#endif /* ENDIAN */
> +      } ptr;
> +#endif
> +    } rseq_cs;

Are these conditionals correct for x32?  Shouldn't there be a member
of type const struct rseq_cs * somewhere?
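
As a point of comparison only, and not what the patch does: user code
can fill a 64-bit field from a pointer without any __LP64__ or
byte-order conditionals by converting through uintptr_t. The helper
names and the reduced struct sketch_rseq_cs below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for struct rseq_cs.  */
struct sketch_rseq_cs
{
  uint32_t version;
};

static struct sketch_rseq_cs sketch_cs_example;

/* Widen a pointer to the 64-bit field value.  On ILP32 ABIs such as
   x32 the upper 32 bits come out as zero.  */
static uint64_t
sketch_encode_rseq_cs (const struct sketch_rseq_cs *cs)
{
  return (uint64_t) (uintptr_t) cs;
}

/* Recover the pointer from the 64-bit field value.  */
static const struct sketch_rseq_cs *
sketch_decode_rseq_cs (uint64_t field)
{
  /* The value must fit in a pointer on this ABI.  */
  assert (field <= UINTPTR_MAX);
  return (const struct sketch_rseq_cs *) (uintptr_t) field;
}
```

This only shows that the stored value is the same on LP64 and ILP32
targets; whether the header should expose a pointer-typed member is
the open question above.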

> diff --git a/sysdeps/unix/sysv/linux/x86/bits/rseq.h b/sysdeps/unix/sysv/linux/x86/bits/rseq.h
> new file mode 100644
> index 0000000000..75f52d9788
> --- /dev/null
> +++ b/sysdeps/unix/sysv/linux/x86/bits/rseq.h
> @@ -0,0 +1,30 @@
> +/* Restartable Sequences Linux x86 architecture header.
> +   Copyright (C) 2019-2020 Free Software Foundation, Inc.

Please make sure that none of the new files reference the 2019 year.
It should be 2020, per GNU policy.

The patch needs some rebasing on top of current master.
Florian Weimer April 27, 2020, 11:59 a.m. UTC | #2
* Mathieu Desnoyers via Libc-alpha:

> diff --git a/elf/libc_early_init.c b/elf/libc_early_init.c
> index 1ac66d895d..30466afea0 100644
> --- a/elf/libc_early_init.c
> +++ b/elf/libc_early_init.c
> @@ -18,10 +18,13 @@
>  
>  #include <ctype.h>
>  #include <libc-early-init.h>
> +#include <rseq-internal.h>
>  
>  void
>  __libc_early_init (void)
>  {
>    /* Initialize ctype data.  */
>    __ctype_init ();
> +  /* Register rseq ABI to the kernel.   */
> +  (void) rseq_register_current_thread ();
>  }

I think the registration must be restricted to the primary namespace.
Otherwise, LD_AUDIT will register the area to the secondary libc (in
the audit module), not the primary libc for the entire process.

I think the easiest way to implement this for now is a flag argument
for __libc_early_init (as the upstream __libc_multiple_libcs is not
entirely accurate).  I will submit a patch.
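
A sketch of what such a flag argument could look like. The function
names, the parameter name "initial", and the bookkeeping variable are
hypothetical; the registration is replaced by a stub so the sketch is
self-contained. In a real process each libc copy has its own variables,
which this single-file sketch does not model.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the real registration: it only records
   that registration was requested.  */
static bool sketch_rseq_registered;

static void
sketch_rseq_register_current_thread (void)
{
  /* The main thread is expected to be registered at most once.  */
  assert (!sketch_rseq_registered);
  sketch_rseq_registered = true;
}

/* Sketch of an early-init hook with a flag argument.  INITIAL is
   assumed to be true only for the libc in the primary namespace.  */
static void
sketch_libc_early_init (bool initial)
{
  if (initial)
    sketch_rseq_register_current_thread ();
}
```

A libc loaded into a secondary namespace (for example by an audit
module) would be initialized with the flag set to false and would leave
the registration to the primary libc.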
Mathieu Desnoyers April 27, 2020, 4:40 p.m. UTC | #3
----- On Apr 27, 2020, at 5:11 AM, Florian Weimer fw@deneb.enyo.de wrote:

> * Mathieu Desnoyers via Libc-alpha:
> 
>> +* Support for automatically registering threads with the Linux rseq(2)
>> +  system call has been added.  This system call is implemented starting
>> +  from Linux 4.18.  The Restartable Sequences ABI accelerates user-space
>> +  operations on per-cpu data.  It allows user-space to perform updates
>> +  on per-cpu data without requiring heavy-weight atomic operations.
>> +  Automatically registering threads allows all libraries, including libc,
>> +  to make immediate use of the rseq(2) support by using the documented ABI.
>> +  See 'man 2 rseq' for the details of the ABI shared between libc and the
>> +  kernel.
> 
> This should refer to documentation in the glibc manual.
> 
> (It is currently a glibc project requirement to add documentation for
> new Linux interfaces, something that I do not necessarily agree with.)

A related issue here is that editing of the rseq(2) man pages has been stalled
since March 2019. I have been waiting for a reply from Michael Kerrisk, and I
suspect he might have been side-tracked by other projects. I just bumped that
thread.

Ref.: https://lore.kernel.org/r/211707091.921.1551722548347.JavaMail.zimbra@efficios.com

So as of today, "man 2 rseq" does not exist in the kernel man pages, so I
suggest we remove that sentence. Would the following change be OK with you?

-  See 'man 2 rseq' for the details of the ABI shared between libc and the
-  kernel.
+  The GNU C Library manual has details on integration of Restartable
+  Sequences.

> 
>>  
>>  Deprecated and removed features, and other changes affecting compatibility:
>>  
>> diff --git a/elf/libc_early_init.c b/elf/libc_early_init.c
>> index 1ac66d895d..30466afea0 100644
>> --- a/elf/libc_early_init.c
>> +++ b/elf/libc_early_init.c
>> @@ -18,10 +18,13 @@
>>  
>>  #include <ctype.h>
>>  #include <libc-early-init.h>
>> +#include <rseq-internal.h>
>>  
>>  void
>>  __libc_early_init (void)
>>  {
>>    /* Initialize ctype data.  */
>>    __ctype_init ();
>> +  /* Register rseq ABI to the kernel.   */
>> +  (void) rseq_register_current_thread ();
>>  }
> 
> The cast to void should be removed (see the comment below about the
> return type).

OK

> 
>> diff --git a/manual/threads.texi b/manual/threads.texi
>> index 0858ef8f92..59f634e432 100644
>> --- a/manual/threads.texi
>> +++ b/manual/threads.texi
>> @@ -9,8 +9,10 @@ This chapter describes functions used for managing threads.
>>  POSIX threads.
>>  
>>  @menu
>> -* ISO C Threads::	Threads based on the ISO C specification.
>> -* POSIX Threads::	Threads based on the POSIX specification.
>> +* ISO C Threads::		Threads based on the ISO C specification.
>> +* POSIX Threads::		Threads based on the POSIX specification.
>> +* Restartable Sequences::	Linux-specific Restartable Sequences
>> +				integration.
>>  @end menu
>>  
>>  
>> @@ -881,3 +883,27 @@ Behaves like @code{pthread_timedjoin_np} except that the absolute time in
>>  @c pthread_spin_unlock
>>  @c pthread_testcancel
>>  @c pthread_yield
>> +
>> +@node Restartable Sequences
>> +@section Restartable Sequences
>> +@cindex rseq
>> +
>> +This section describes @theglibc{} Restartable Sequences integration.
>> +
>> +@deftypevar {struct rseq} __rseq_abi
>> +@standards{GNU, sys/rseq.h}
>> +@Theglibc{} implements a @code{__rseq_abi} TLS symbol to interact with the
>> +Restartable Sequences system call (Linux-specific).  The layout of this
>> +structure is defined by the Linux kernel @file{linux/rseq.h} UAPI.
>> +Registration of each thread's @code{__rseq_abi} is performed by
>> +@theglibc{} at libc initialization and pthread creation.
>> +@end deftypevar
>> +
>> +@deftypevr Macro int RSEQ_SIG
>> +@standards{GNU, sys/rseq.h}
>> +Each supported architecture provide a @code{RSEQ_SIG} macro in
>> +@file{sys/rseq.h} which contains a signature.  That signature is expected to be
>> +present in the code before each Restartable Sequences abort handler.  Failure
>> +to provide the expected signature may terminate the process with a Segmentation
>> +fault.
>> +@end deftypevr
> 
> This should say Linux, not GNU as the standards source, given that the
> interface is not added to the GNU ABI.

Good point, done.

> 
> Is this sufficient documentation of this feature?  What do others
> think about this?

I'll leave this question to others.

> 
>> diff --git a/misc/rseq-internal.h b/misc/rseq-internal.h
>> new file mode 100644
>> index 0000000000..d564cf1bc3
>> --- /dev/null
>> +++ b/misc/rseq-internal.h
>> @@ -0,0 +1,33 @@
> 
>> +static inline int
>> +rseq_register_current_thread (void)
>> +{
>> +  return -1;
>> +}
> 
> Nothing checks the return value of this function as far as I can see,
> so it can return void.

OK

> 
>> +static inline int
>> +rseq_unregister_current_thread (void)
>> +{
>> +  return -1;
>> +}
> 
> This function is unused.  This also applies to the full version.  I
> believe we switched to implicit deregistration on thread exit, so I
> think you can just remove it.

OK

> 
>> diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c
>> index afd379e89a..1ff248042e 100644
>> --- a/nptl/pthread_create.c
>> +++ b/nptl/pthread_create.c
>> @@ -33,6 +33,7 @@
>>  #include <default-sched.h>
>>  #include <futex-internal.h>
>>  #include <tls-setup.h>
>> +#include <rseq-internal.h>
>>  #include "libioP.h"
>>  
>>  #include <shlib-compat.h>
>> @@ -384,6 +385,9 @@ START_THREAD_DEFN
>>    /* Initialize pointers to locale data.  */
>>    __ctype_init ();
>>  
>> +  /* Register rseq TLS to the kernel. */
>> +  (void) rseq_register_current_thread ();
>> +
> 
> The cast can go.

OK

> 
>>  #ifndef __ASSUME_SET_ROBUST_LIST
>>    if (__set_robust_list_avail >= 0)
>>  #endif
>> @@ -578,6 +582,14 @@ START_THREAD_DEFN
>>       process is really dead since 'clone' got passed the CLONE_CHILD_CLEARTID
>>       flag.  The 'tid' field in the TCB will be set to zero.
>>  
>> +     rseq TLS is still registered at this point. Rely on implicit unregistration
>> +     performed by the kernel on thread teardown. This is not a problem because the
>> +     rseq TLS lives on the stack, and the stack outlives the thread. If TCB
>> +     allocation is ever changed, additional steps may be required, such as
>> +     performing explicit rseq unregistration before reclaiming the rseq TLS area
>> +     memory. It is NOT sufficient to block signals because the kernel may write
>> +     to the rseq area even without signals.
>> +
>>       The exit code is zero since in case all threads exit by calling
>>       'pthread_exit' the exit status must be 0 (zero).  */
>>    __exit_thread ();
> 
> Some of these lines are too long.  Also two spaces after . at the end
> of sentences.

OK

> 
>> diff --git a/sysdeps/unix/sysv/linux/rseq-internal.h b/sysdeps/unix/sysv/linux/rseq-internal.h
>> new file mode 100644
>> index 0000000000..5f7f02f1ec
>> --- /dev/null
>> +++ b/sysdeps/unix/sysv/linux/rseq-internal.h
> 
>> +static inline int
>> +rseq_register_current_thread (void)
>> +{
>> +  int rc, ret = 0;
>> +
>> +  if (__rseq_abi.cpu_id == RSEQ_CPU_ID_REGISTRATION_FAILED)
>> +    return -1;
>> +  rc = INTERNAL_SYSCALL_CALL (rseq, &__rseq_abi, sizeof (struct rseq),
>> +                              0, RSEQ_SIG);
>> +  if (!rc)
>> +    goto end;
>> +  if (INTERNAL_SYSCALL_ERRNO (rc) != EBUSY)
>> +    __rseq_abi.cpu_id = RSEQ_CPU_ID_REGISTRATION_FAILED;
>> +  ret = -1;
>> +end:
>> +  return ret;
>> +}
> 
> This does not seem to use INTERNAL_SYSCALL_CALL correctly.  I think
> you need to use INTERNAL_SYSCALL_ERROR_P on the result to check for an
> error, and only then use INTERNAL_SYSCALL_ERRNO to extract the error
> code.

OK
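
To illustrate the ordering requested above (test for an error first, then
extract the error code), here is a minimal stand-alone sketch.  It does not
use the glibc-internal macros: raw_result_is_error, raw_result_errno and
classify_registration are hypothetical names, and the only assumption is the
usual Linux convention that a raw system call result in [-4095, -1] encodes
-errno.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical helpers, not the glibc macros.  A raw Linux system call
   result is an error exactly when it lies in [-4095, -1].  */
static bool
raw_result_is_error (long int rc)
{
  return (unsigned long int) rc > -4096UL;
}

/* Only meaningful once raw_result_is_error has returned true.  */
static int
raw_result_errno (long int rc)
{
  return (int) -rc;
}

enum registration_state
{
  REGISTRATION_OK,
  REGISTRATION_BUSY,    /* The thread already has a registration.  */
  REGISTRATION_FAILED   /* For example ENOSYS on an old kernel.  */
};

/* Check for an error first, and only then look at the error code.  */
static enum registration_state
classify_registration (long int rc)
{
  if (!raw_result_is_error (rc))
    return REGISTRATION_OK;
  if (raw_result_errno (rc) == EBUSY)
    return REGISTRATION_BUSY;
  return REGISTRATION_FAILED;
}

/* Usage example.  */
static void
classify_registration_demo (void)
{
  assert (classify_registration (0) == REGISTRATION_OK);
  assert (classify_registration (-EBUSY) == REGISTRATION_BUSY);
  assert (classify_registration (-ENOSYS) == REGISTRATION_FAILED);
}
```

In the patch itself, only the REGISTRATION_FAILED case would store
RSEQ_CPU_ID_REGISTRATION_FAILED into __rseq_abi.cpu_id.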

> 
>> diff --git a/sysdeps/unix/sysv/linux/rseq-sym.c b/sysdeps/unix/sysv/linux/rseq-sym.c
>> new file mode 100644
>> index 0000000000..0e33fab278
>> --- /dev/null
>> +++ b/sysdeps/unix/sysv/linux/rseq-sym.c
> 
>> +#include <sys/syscall.h>
>> +#include <stdint.h>
>> +#include <kernel-features.h>
>> +#include <sys/rseq.h>
>> +
>> +__thread struct rseq __rseq_abi = {
>> +  .cpu_id = RSEQ_CPU_ID_UNINITIALIZED,
>> +};
> 
> { should go onto its own line.

OK

> I'd also add attribute_tls_model_ie,
> although it's implied by the declaration in the header.

This contradicts feedback I received from Szabolcs Nagy in September 2019:

https://public-inbox.org/libc-alpha/c58d4d6e-f22a-f5d9-e23a-5bd72cec1a86@arm.com/

"note that libpthread.so is built with -ftls-model=initial-exec

(and if it wasn't then you'd want to put the attribute on the
declaration in the internal header file, not on the definition,
so the actual tls accesses generate the right code)"

In the context of his feedback, __rseq_abi was defined within nptl/pthread_create.c.
It is now defined in sysdeps/unix/sysv/linux/rseq-sym.c, which is built into the
csu which AFAIU ends up in libc.so. His comment still applies though, because
libc.so is also built with -ftls-model=initial-exec.

So should I apply the "initial-exec" TLS model only to the __rseq_abi
declaration, or is it preferred to apply it to both the declaration
and the definition ?
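
For reference, a small sketch of what "attribute on the declaration only"
looks like with GCC or Clang.  The variable per_thread_state and the accessor
are hypothetical stand-ins for __rseq_abi; the sketch assumes the compiler
merges the tls_model attribute from the earlier declaration into the
definition, which is my understanding of how GCC handles it.

```c
#include <assert.h>

/* Declaration, as it would appear in a header.  The TLS model is stated
   here so that every translation unit accessing the variable generates
   initial-exec code.  */
extern __thread int per_thread_state
  __attribute__ ((tls_model ("initial-exec")));

/* Definition.  The attribute is not repeated; the earlier declaration is
   in scope.  */
__thread int per_thread_state = -1;

/* Hypothetical accessor, present only to have a TLS access to compile.  */
static int
read_per_thread_state (void)
{
  return per_thread_state;
}

/* Usage example.  */
static void
per_thread_state_demo (void)
{
  assert (read_per_thread_state () == -1);
  per_thread_state = 3;
  assert (read_per_thread_state () == 3);
  per_thread_state = -1;
}
```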

> 
>> diff --git a/sysdeps/unix/sysv/linux/sys/rseq.h b/sysdeps/unix/sysv/linux/sys/rseq.h
>> new file mode 100644
>> index 0000000000..503dce4cac
>> --- /dev/null
>> +++ b/sysdeps/unix/sysv/linux/sys/rseq.h
>> @@ -0,0 +1,186 @@
> 
> I think there is some value in making this header compatible with
> inclusion from the assembler (including constants for the relevant
> struct offsets), but that can be a later change.

Agreed. By "later", do you mean before merging the patch, between
merge of the patch and next glibc release, or for a subsequent glibc
release ?

> 
>> +#ifndef _SYS_RSEQ_H
>> +#define _SYS_RSEQ_H	1
>> +
>> +/* Architecture-specific rseq signature.  */
>> +#include <bits/rseq.h>
> 
> Maybe add a newline between the above and the following, to make clear
> the comment only applies to the first #include.

OK

> 
>> +#include <stdint.h>
>> +#include <sys/cdefs.h>
>> +
>> +#ifdef __has_include
>> +# if __has_include ("linux/rseq.h")
>> +#   define __GLIBC_HAVE_KERNEL_RSEQ
>> +# endif
>> +#else
>> +# include <linux/version.h>
>> +# if LINUX_VERSION_CODE >= KERNEL_VERSION (4, 18, 0)
>> +#   define __GLIBC_HAVE_KERNEL_RSEQ
>> +# endif
>> +#endif
>> +
>> +#ifdef __GLIBC_HAVE_KERNEL_RSEQ
>> +/* We use the structures declarations from the kernel headers.  */
>> +# include <linux/rseq.h>
>> +#else
>> +/* We use a copy of the include/uapi/linux/rseq.h kernel header.  */
>> +
>> +#include <asm/byteorder.h>
>> +
>> +enum rseq_cpu_id_state
>> +  {
>> +    RSEQ_CPU_ID_UNINITIALIZED = -1,
>> +    RSEQ_CPU_ID_REGISTRATION_FAILED = -2,
>> +  };
>> +
>> +enum rseq_flags
>> +  {
>> +    RSEQ_FLAG_UNREGISTER = (1 << 0),
>> +  };
>> +
>> +enum rseq_cs_flags_bit
>> +  {
>> +    RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT_BIT = 0,
>> +    RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL_BIT = 1,
>> +    RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE_BIT = 2,
>> +  };
>> +
>> +enum rseq_cs_flags
>> +  {
>> +    RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT =
>> +      (1U << RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT_BIT),
>> +    RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL =
>> +      (1U << RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL_BIT),
>> +    RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE =
>> +      (1U << RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE_BIT),
>> +  };
>> +
>> +/* struct rseq_cs is aligned on 4 * 8 bytes to ensure it is always
>> +   contained within a single cache-line. It is usually declared as
>> +   link-time constant data.  */
>> +struct rseq_cs
>> +  {
>> +    /* Version of this structure.  */
>> +    uint32_t version;
>> +    /* enum rseq_cs_flags.  */
>> +    uint32_t flags;
>> +    uint64_t start_ip;
>> +    /* Offset from start_ip.  */
>> +    uint64_t post_commit_offset;
>> +    uint64_t abort_ip;
>> +} __attribute__((aligned(4 * sizeof(uint64_t))));
> 
> The comment is wrong.  32-byte alignment does not put struct rseq_cs
> on its own cache line on many (most?) CPUs.  Not using the constant 32
> looks like unnecessary obfuscation to me.

There is a difference between "being contained within a single cache-line"
and "being the only structure in a cache-line". The objective here is the
former.

For instance, if we do not enforce any minimum alignment, the compiler could
decide to align that structure on __alignof__(uint64_t) which happens to be
4 bytes on some architectures. This can cause this frequently accessed structure
to be sitting across 2 cache-lines, thus requiring the CPU to load two cache
lines rather than one very frequently.

I think what you have in mind is "being the only structure in a cache-line",
which is useful to eliminate false-sharing. However, considering that this is
a TLS variable, we don't care about false-sharing, because it is never meant
to be updated concurrently by many threads or CPUs.

So I think my comment is correct.

I agree that the constant 32 may be clearer here. I will change to align(32).
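
A quick sketch of the "contained within a single cache-line" argument, under
the assumption of a 64-byte cache line (the line size is
architecture-specific).  struct cs_like is a hypothetical stand-in with the
same members as struct rseq_cs, and crosses_cache_line is an illustrative
helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in with the same members as struct rseq_cs.  */
struct cs_like
{
  uint32_t version;
  uint32_t flags;
  uint64_t start_ip;
  uint64_t post_commit_offset;
  uint64_t abort_ip;
} __attribute__ ((aligned (32)));

_Static_assert (sizeof (struct cs_like) == 32, "size is 32 bytes");
_Static_assert (_Alignof (struct cs_like) == 32, "alignment is 32 bytes");

/* Return true if [addr, addr + size) spans more than one cache line.  */
static bool
crosses_cache_line (uintptr_t addr, size_t size, size_t line_size)
{
  return addr / line_size != (addr + size - 1) / line_size;
}

/* Usage example, assuming 64-byte cache lines.  */
static void
cache_line_demo (void)
{
  /* Every 32-byte-aligned placement stays within one line...  */
  assert (!crosses_cache_line (0, sizeof (struct cs_like), 64));
  assert (!crosses_cache_line (32, sizeof (struct cs_like), 64));
  /* ...whereas a placement that is only 4-byte aligned may not.  */
  assert (crosses_cache_line (36, sizeof (struct cs_like), 64));
}
```

Two such structures can still share one 64-byte line, which is the
distinction from "being the only structure in a cache-line".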

> I still think we should avoid the alignment.  The _ip fields should
> perhaps be _pc (IP is more or less specific to x86).

I am concerned that removing an alignment attribute which is exposed
in a public Linux UAPI header can be an ABI breakage.

I am also concerned about changing field names for fields already
exposed in a public Linux UAPI header, especially if the change is
only for cosmetic reasons.

> 
> { and } are not aligned.  Please do not forget to add spaces before
> opening parentheses, and two spaces after the . and the end of
> sentences.  The opening { should always be on its own line.  (This
> also applies to the definition of struct rseq below.)

OK. Old coding style habits die hard ;)

> 
>> +
>> +/* struct rseq is aligned on 4 * 8 bytes to ensure it is always
>> +   contained within a single cache-line.
>> +
>> +   A single struct rseq per thread is allowed.  */
>> +struct rseq
>> +  {
>> +    /* Restartable sequences cpu_id_start field. Updated by the
>> +       kernel. Read by user-space with single-copy atomicity
>> +       semantics. This field should only be read by the thread which
>> +       registered this data structure. Aligned on 32-bit. Always
> 
> What does “Aligned on 32-bit” mean in this context?  Do you mean to
> reference 32-*byte* alignment here?

No. I really mean 32-bit (4-byte). Being aligned on 32-byte guarantees that
this field is aligned at least on 4-byte. This is required by single-copy
atomicity semantics.

Should I update this comment to state "Aligned on 4-byte" instead ?

> 
>> +    /* Restartable sequences rseq_cs field.
>> +
>> +       Contains NULL when no critical section is active for the current
>> +       thread, or holds a pointer to the currently active struct rseq_cs.
>> +
>> +       Updated by user-space, which sets the address of the currently
>> +       active rseq_cs at the beginning of assembly instruction sequence
>> +       block, and set to NULL by the kernel when it restarts an assembly
>> +       instruction sequence block, as well as when the kernel detects that
>> +       it is preempting or delivering a signal outside of the range
>> +       targeted by the rseq_cs. Also needs to be set to NULL by user-space
>> +       before reclaiming memory that contains the targeted struct rseq_cs.
>> +
>> +       Read and set by the kernel. Set by user-space with single-copy
>> +       atomicity semantics. This field should only be updated by the
>> +       thread which registered this data structure. Aligned on 64-bit.  */
>> +    union {
>> +      uint64_t ptr64;
>> +#ifdef __LP64__
>> +      uint64_t ptr;
>> +#else
>> +      struct {
>> +#if (defined(__BYTE_ORDER) && (__BYTE_ORDER == __BIG_ENDIAN)) || defined(__BIG_ENDIAN)
>> +        uint32_t padding; /* Initialized to zero.  */
>> +        uint32_t ptr32;
>> +#else /* LITTLE */
>> +        uint32_t ptr32;
>> +        uint32_t padding; /* Initialized to zero.  */
>> +#endif /* ENDIAN */
>> +      } ptr;
>> +#endif
>> +    } rseq_cs;
> 
> Are these conditionals correct for x32?

Let's see. With x86 gcc:

-m64: (__x86_64__ && __LP64__)
-m32: (__i386__)
-mx32: (__x86_64__ && __ILP32__)

So with "#ifdef __LP64__" we specifically target 64-bit pointers. The rest
falls into the "else" case, which expects 32-bit pointers. Considering that
x32 has 32-bit pointers, I don't see any issue here.
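
The macro check can be confirmed with a trivial test, assuming GCC or Clang
on Linux (where __LP64__ is predefined exactly for the LP64 data model).  The
function name is hypothetical.

```c
#include <assert.h>

/* Report which branch of the "#ifdef __LP64__" conditional is taken.  */
static int
rseq_cs_pointer_bits (void)
{
#ifdef __LP64__
  return 64;   /* -m64 */
#else
  return 32;   /* -m32 and -mx32 both land here.  */
#endif
}

/* Usage example: the selected branch matches the actual pointer size.  */
static void
pointer_bits_demo (void)
{
  assert (rseq_cs_pointer_bits () == (int) (sizeof (void *) * 8));
}
```

Building this with -m64, -m32 and -mx32 in turn exercises the three cases
listed above.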

> Shouldn't there be a member of type const struct rseq_cs * somewhere?

Having pointers within structures in kernel UAPI headers is frowned upon. Indeed
here having it in the union could possibly make some use-cases easier in
user-space, so I'm open to it. It basically depends on how much we want the
Linux UAPI header and the glibc header to stay in sync, and if other kernel
maintainers are open to this addition.

We don't mind that user-space uses that pointer, but we never want the kernel
to touch that pointer rather than the 32/64-bit-aware fields. One possibility
would be to do:

    union
      {
        uint64_t ptr64;
#ifdef __LP64__
        uint64_t ptr;
#else
        struct
          {
#if (defined (__BYTE_ORDER) && (__BYTE_ORDER == __BIG_ENDIAN)) || defined (__BIG_ENDIAN)
            uint32_t padding; /* Initialized to zero.  */
            uint32_t ptr32;
#else /* LITTLE */
            uint32_t ptr32;
            uint32_t padding; /* Initialized to zero.  */
#endif /* ENDIAN */
          } ptr;
#endif

#ifndef __KERNEL__
        const struct rseq_cs *uptr;
#endif
      } rseq_cs;

in the union, so only user-space can see that field. Thoughts ?

> 
>> diff --git a/sysdeps/unix/sysv/linux/x86/bits/rseq.h b/sysdeps/unix/sysv/linux/x86/bits/rseq.h
>> new file mode 100644
>> index 0000000000..75f52d9788
>> --- /dev/null
>> +++ b/sysdeps/unix/sysv/linux/x86/bits/rseq.h
>> @@ -0,0 +1,30 @@
>> +/* Restartable Sequences Linux x86 architecture header.
>> +   Copyright (C) 2019-2020 Free Software Foundation, Inc.
> 
> Please make sure that none of the new files reference the 2019 year.
> It should be 2020, per GNU policy.

OK

> 
> The patch needs some rebasing on top of current master.

Done.

Thanks for the review!

Mathieu
Mathieu Desnoyers April 27, 2020, 4:47 p.m. UTC | #4
----- On Apr 27, 2020, at 7:59 AM, Florian Weimer fw@deneb.enyo.de wrote:

> * Mathieu Desnoyers via Libc-alpha:
> 
>> diff --git a/elf/libc_early_init.c b/elf/libc_early_init.c
>> index 1ac66d895d..30466afea0 100644
>> --- a/elf/libc_early_init.c
>> +++ b/elf/libc_early_init.c
>> @@ -18,10 +18,13 @@
>>  
>>  #include <ctype.h>
>>  #include <libc-early-init.h>
>> +#include <rseq-internal.h>
>>  
>>  void
>>  __libc_early_init (void)
>>  {
>>    /* Initialize ctype data.  */
>>    __ctype_init ();
>> +  /* Register rseq ABI to the kernel.   */
>> +  (void) rseq_register_current_thread ();
>>  }
> 
> I think the registration must be restricted to the primary namespace.
> Otherwise, LD_AUDIT will register the area to the secondary libc (in
> the audit module), not the primary libc for the entire process.
> 
> I think the easiest way to implement this for now is a flag argument
> for __libc_early_init (as the upstream __libc_multiple_libcs is not
> entirely accurate).  I will submit a patch.

OK, once I get the patch, I will pick it up in my series.
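
To make sure I understand the proposal, here is a rough sketch of the shape
I expect.  All names are hypothetical: early_init stands in for
__libc_early_init with the proposed flag argument, and register_rseq_area
stands in for rseq_register_current_thread (it only counts calls here).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical counter standing in for the actual registration.  */
static int registration_count;

static void
register_rseq_area (void)
{
  ++registration_count;
}

/* INITIAL is true only for the libc in the primary namespace, so a
   secondary libc loaded for an audit module never registers.  */
static void
early_init (bool initial)
{
  if (initial)
    register_rseq_area ();
}

/* Usage example.  */
static void
early_init_demo (void)
{
  registration_count = 0;
  early_init (false);   /* Secondary namespace: no registration.  */
  assert (registration_count == 0);
  early_init (true);    /* Primary namespace: registers once.  */
  assert (registration_count == 1);
}
```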

Thanks,

Mathieu
Florian Weimer April 27, 2020, 4:54 p.m. UTC | #5
* Mathieu Desnoyers:

>>> +#include <sys/syscall.h>
>>> +#include <stdint.h>
>>> +#include <kernel-features.h>
>>> +#include <sys/rseq.h>
>>> +
>>> +__thread struct rseq __rseq_abi = {
>>> +  .cpu_id = RSEQ_CPU_ID_UNINITIALIZED,
>>> +};
>> 
>> { should go onto its own line.
>
> OK
>
>> I'd also add attribute_tls_model_ie,
>> although it's implied by the declaration in the header.
>
> This contradicts feedback I received from Szabolcs Nagy in September 2019:
>
> https://public-inbox.org/libc-alpha/c58d4d6e-f22a-f5d9-e23a-5bd72cec1a86@arm.com/
>
> "note that libpthread.so is built with -ftls-model=initial-exec
>
> (and if it wasn't then you'd want to put the attribute on the
> declaration in the internal header file, not on the definition,
> so the actual tls accesses generate the right code)"
>
> In the context of his feedback, __rseq_abi was defined within nptl/pthread_create.c.
> It is now defined in sysdeps/unix/sysv/linux/rseq-sym.c, which is built into the
> csu which AFAIU ends up in libc.so. His comment still applies though, because
> libc.so is also built with -ftls-model=initial-exec.
>
> So should I apply the "initial-exec" TLS model only to the __rseq_abi
> declaration, or is it preferred to apply it to both the declaration
> and the definition ?

I do not have a strong preference here.  Technically, the declaration
in the header file should be enough.

>>> diff --git a/sysdeps/unix/sysv/linux/sys/rseq.h b/sysdeps/unix/sysv/linux/sys/rseq.h
>>> new file mode 100644
>>> index 0000000000..503dce4cac
>>> --- /dev/null
>>> +++ b/sysdeps/unix/sysv/linux/sys/rseq.h
>>> @@ -0,0 +1,186 @@
>> 
>> I think there is some value in making this header compatible with
>> inclusion from the assembler (including constants for the relevant
>> struct offsets), but that can be a later change.
>
> Agreed. By "later", do you mean before merging the patch, between
> merge of the patch and next glibc release, or for a subsequent glibc
> release ?

It can happen some time after merging the patch, preferably for this
release.  But I don't think it's release-critical.

>>> +/* struct rseq_cs is aligned on 4 * 8 bytes to ensure it is always
>>> +   contained within a single cache-line. It is usually declared as
>>> +   link-time constant data.  */
>>> +struct rseq_cs
>>> +  {
>>> +    /* Version of this structure.  */
>>> +    uint32_t version;
>>> +    /* enum rseq_cs_flags.  */
>>> +    uint32_t flags;
>>> +    uint64_t start_ip;
>>> +    /* Offset from start_ip.  */
>>> +    uint64_t post_commit_offset;
>>> +    uint64_t abort_ip;
>>> +} __attribute__((aligned(4 * sizeof(uint64_t))));
>> 
>> The comment is wrong.  32-byte alignment does not put struct rseq_cs
>> on its own cache line on many (most?) CPUs.  Not using the constant 32
>> looks like unnecessary obfuscation to me.
>
> There is a difference between "being contained within a single cache-line"
> and "being the only structure in a cache-line". The objective here is the
> former.

Fair enough.

> I agree that the constant 32 may be clearer here. I will change to align(32).

With a space, please. 8-)

>>> +/* struct rseq is aligned on 4 * 8 bytes to ensure it is always
>>> +   contained within a single cache-line.
>>> +
>>> +   A single struct rseq per thread is allowed.  */
>>> +struct rseq
>>> +  {
>>> +    /* Restartable sequences cpu_id_start field. Updated by the
>>> +       kernel. Read by user-space with single-copy atomicity
>>> +       semantics. This field should only be read by the thread which
>>> +       registered this data structure. Aligned on 32-bit. Always
>> 
>> What does “Aligned on 32-bit” mean in this context?  Do you mean to
>> reference 32-*byte* alignment here?
>
> No. I really mean 32-bit (4-byte). Being aligned on 32-byte guarantees that
> this field is aligned at least on 4-byte. This is required by single-copy
> atomicity semantics.
>
> Should I update this comment to state "Aligned on 4-byte" instead ?

I think this is implied by all Linux ABIs.  And the explicit alignment
specification for struct rseq makes the alignment 32 bytes.

>>> +    /* Restartable sequences rseq_cs field.
>>> +
>>> +       Contains NULL when no critical section is active for the current
>>> +       thread, or holds a pointer to the currently active struct rseq_cs.
>>> +
>>> +       Updated by user-space, which sets the address of the currently
>>> +       active rseq_cs at the beginning of assembly instruction sequence
>>> +       block, and set to NULL by the kernel when it restarts an assembly
>>> +       instruction sequence block, as well as when the kernel detects that
>>> +       it is preempting or delivering a signal outside of the range
>>> +       targeted by the rseq_cs. Also needs to be set to NULL by user-space
>>> +       before reclaiming memory that contains the targeted struct rseq_cs.
>>> +
>>> +       Read and set by the kernel. Set by user-space with single-copy
>>> +       atomicity semantics. This field should only be updated by the
>>> +       thread which registered this data structure. Aligned on 64-bit.  */
>>> +    union {
>>> +      uint64_t ptr64;
>>> +#ifdef __LP64__
>>> +      uint64_t ptr;
>>> +#else
>>> +      struct {
>>> +#if (defined(__BYTE_ORDER) && (__BYTE_ORDER == __BIG_ENDIAN)) || defined(__BIG_ENDIAN)
>>> +        uint32_t padding; /* Initialized to zero.  */
>>> +        uint32_t ptr32;
>>> +#else /* LITTLE */
>>> +        uint32_t ptr32;
>>> +        uint32_t padding; /* Initialized to zero.  */
>>> +#endif /* ENDIAN */
>>> +      } ptr;
>>> +#endif
>>> +    } rseq_cs;
>> 
>> Are these conditionals correct for x32?
>
> Let's see. With x86 gcc:
>
> -m64: (__x86_64__ && __LP64__)
> -m32: (__i386__)
> -mx32: (__x86_64__ && __ILP32__)
>
> So with "#ifdef __LP64__" we specifically target 64-bit pointers. The rest
> falls into the "else" case, which expects 32-bit pointers. Considering that
> x32 has 32-bit pointers, I don't see any issue here.

Does the kernel have a separate 32-bit entry point for rseq on x32?
If not, it will expect the 64-bit struct layout.

> We don't mind that user-space uses that pointer, but we never want the kernel
> to touch that pointer rather than the 32/64-bit-aware fields. One possibility
> would be to do:
>
>     union
>       {
>         uint64_t ptr64;
> #ifdef __LP64__
>         uint64_t ptr;
> #else
>         struct
>           {
> #if (defined (__BYTE_ORDER) && (__BYTE_ORDER == __BIG_ENDIAN)) || defined (__BIG_ENDIAN)
>             uint32_t padding; /* Initialized to zero.  */
>             uint32_t ptr32;
> #else /* LITTLE */
>             uint32_t ptr32;
>             uint32_t padding; /* Initialized to zero.  */
> #endif /* ENDIAN */
>           } ptr;
> #endif
>
> #ifndef __KERNEL__
>      const struct rseq_cs *uptr;
> #endif
>       } rseq_cs;
>
> in the union, so only user-space can see that field. Thoughts ?

I think this depends on where the x32 question lands.
Florian Weimer April 27, 2020, 4:59 p.m. UTC | #6
* Mathieu Desnoyers:

> ----- On Apr 27, 2020, at 7:59 AM, Florian Weimer fw@deneb.enyo.de wrote:
>
>> * Mathieu Desnoyers via Libc-alpha:
>> 
>>> diff --git a/elf/libc_early_init.c b/elf/libc_early_init.c
>>> index 1ac66d895d..30466afea0 100644
>>> --- a/elf/libc_early_init.c
>>> +++ b/elf/libc_early_init.c
>>> @@ -18,10 +18,13 @@
>>>  
>>>  #include <ctype.h>
>>>  #include <libc-early-init.h>
>>> +#include <rseq-internal.h>
>>>  
>>>  void
>>>  __libc_early_init (void)
>>>  {
>>>    /* Initialize ctype data.  */
>>>    __ctype_init ();
>>> +  /* Register rseq ABI to the kernel.   */
>>> +  (void) rseq_register_current_thread ();
>>>  }
>> 
>> I think the registration must be restricted to the primary namespace.
>> Otherwise, LD_AUDIT will register the area to the secondary libc (in
>> the audit module), not the primary libc for the entire process.
>> 
>> I think the easiest way to implement this for now is a flag argument
>> for __libc_early_init (as the upstream __libc_multiple_libcs is not
>> entirely accurate).  I will submit a patch.
>
> OK, once I get the patch, I will pick it up in my series.

There should be no need for that, it can be reviewed and committed
separately:

  <https://sourceware.org/pipermail/libc-alpha/2020-April/113182.html>
Mathieu Desnoyers April 27, 2020, 5:26 p.m. UTC | #7
----- On Apr 27, 2020, at 12:54 PM, Florian Weimer fw@deneb.enyo.de wrote:

> * Mathieu Desnoyers:
> 
>>>> +#include <sys/syscall.h>
>>>> +#include <stdint.h>
>>>> +#include <kernel-features.h>
>>>> +#include <sys/rseq.h>
>>>> +
>>>> +__thread struct rseq __rseq_abi = {
>>>> +  .cpu_id = RSEQ_CPU_ID_UNINITIALIZED,
>>>> +};
>>> 
>>> { should go onto its own line.
>>
>> OK
>>
>>> I'd also add attribute_tls_model_ie,
>>> although it's implied by the declaration in the header.
>>
>> This contradicts feedback I received from Szabolcs Nagy in September 2019:
>>
>> https://public-inbox.org/libc-alpha/c58d4d6e-f22a-f5d9-e23a-5bd72cec1a86@arm.com/
>>
>> "note that libpthread.so is built with -ftls-model=initial-exec
>>
>> (and if it wasn't then you'd want to put the attribute on the
>> declaration in the internal header file, not on the definition,
>> so the actual tls accesses generate the right code)"
>>
>> In the context of his feedback, __rseq_abi was defined within
>> nptl/pthread_create.c.
>> It is now defined in sysdeps/unix/sysv/linux/rseq-sym.c, which is built into the
>> csu which AFAIU ends up in libc.so. His comment still applies though, because
>> libc.so is also built with -ftls-model=initial-exec.
>>
>> So should I apply the "initial-exec" TLS model only to the __rseq_abi
>> declaration, or is it preferred to apply it to both the declaration
>> and the definition ?
> 
> I do not have a strong preference here.  Technically, the declaration
> in the header file should be enough.

OK, so I'll just keep the attribute on the declaration in the header.

> 
>>>> diff --git a/sysdeps/unix/sysv/linux/sys/rseq.h b/sysdeps/unix/sysv/linux/sys/rseq.h
>>>> new file mode 100644
>>>> index 0000000000..503dce4cac
>>>> --- /dev/null
>>>> +++ b/sysdeps/unix/sysv/linux/sys/rseq.h
>>>> @@ -0,0 +1,186 @@
>>> 
>>> I think there is some value in making this header compatible with
>>> inclusion from the assembler (including constants for the relevant
>>> struct offsets), but that can be a later change.
>>
>> Agreed. By "later", do you mean before merging the patch, between
>> merge of the patch and next glibc release, or for a subsequent glibc
>> release ?
> 
> It can happen some time after merging the patch, preferably for this
> release.  But I don't think it's release-critical.

OK

> 
>>>> +/* struct rseq is aligned on 4 * 8 bytes to ensure it is always
>>>> +   contained within a single cache-line.
>>>> +
>>>> +   A single struct rseq per thread is allowed.  */
>>>> +struct rseq
>>>> +  {
>>>> +    /* Restartable sequences cpu_id_start field. Updated by the
>>>> +       kernel. Read by user-space with single-copy atomicity
>>>> +       semantics. This field should only be read by the thread which
>>>> +       registered this data structure. Aligned on 32-bit. Always
>>> 
>>> What does “Aligned on 32-bit” mean in this context?  Do you mean to
>>> reference 32-*byte* alignment here?
>>
>> No. I really mean 32-bit (4-byte). Being aligned on 32-byte guarantees that
>> this field is aligned at least on 4-byte. This is required by single-copy
>> atomicity semantics.
>>
>> Should I update this comment to state "Aligned on 4-byte" instead ?
> 
> I think this is implied by all Linux ABIs.  And the explicit alignment
> specification for struct rseq makes the alignment 32 bytes.

Unless a structure ends up being packed, which is of course not the case
here.

I would prefer to keep the comment about 32-bit alignment requirement on
the specific fields, because the motivation for alignment requirement is
much more strict for fields (correctness) than the motivation for alignment
of the structure (performance).
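
The two kinds of alignment can be spelled out as compile-time checks.  This
is only a sketch: struct rseq_like is a hypothetical mirror of the struct
rseq field layout (with the rseq_cs union flattened to a uint64_t), and
field_is_aligned is an illustrative helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the struct rseq field layout.  */
struct rseq_like
{
  uint32_t cpu_id_start;
  uint32_t cpu_id;
  uint64_t rseq_cs;
  uint32_t flags;
} __attribute__ ((aligned (32)));

/* Field alignment: needed for single-copy atomicity (correctness).  */
_Static_assert (offsetof (struct rseq_like, cpu_id_start) % 4 == 0,
                "cpu_id_start is 4-byte aligned");
_Static_assert (offsetof (struct rseq_like, cpu_id) % 4 == 0,
                "cpu_id is 4-byte aligned");
_Static_assert (offsetof (struct rseq_like, rseq_cs) % 8 == 0,
                "rseq_cs is 8-byte aligned");

/* Structure alignment: keeps it within one cache line (performance).  */
_Static_assert (_Alignof (struct rseq_like) == 32,
                "structure is 32-byte aligned");

static bool
field_is_aligned (const void *p, size_t alignment)
{
  return (uintptr_t) p % alignment == 0;
}

/* Usage example.  */
static void
rseq_like_demo (void)
{
  struct rseq_like area = { 0 };
  assert (field_is_aligned (&area, 32));
  assert (field_is_aligned (&area.cpu_id, 4));
  assert (field_is_aligned (&area.rseq_cs, 8));
}
```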

> 
>>>> +    /* Restartable sequences rseq_cs field.
>>>> +
>>>> +       Contains NULL when no critical section is active for the current
>>>> +       thread, or holds a pointer to the currently active struct rseq_cs.
>>>> +
>>>> +       Updated by user-space, which sets the address of the currently
>>>> +       active rseq_cs at the beginning of assembly instruction sequence
>>>> +       block, and set to NULL by the kernel when it restarts an assembly
>>>> +       instruction sequence block, as well as when the kernel detects that
>>>> +       it is preempting or delivering a signal outside of the range
>>>> +       targeted by the rseq_cs. Also needs to be set to NULL by user-space
>>>> +       before reclaiming memory that contains the targeted struct rseq_cs.
>>>> +
>>>> +       Read and set by the kernel. Set by user-space with single-copy
>>>> +       atomicity semantics. This field should only be updated by the
>>>> +       thread which registered this data structure. Aligned on 64-bit.  */
>>>> +    union {
>>>> +      uint64_t ptr64;
>>>> +#ifdef __LP64__
>>>> +      uint64_t ptr;
>>>> +#else
>>>> +      struct {
>>>> +#if (defined(__BYTE_ORDER) && (__BYTE_ORDER == __BIG_ENDIAN)) || defined(__BIG_ENDIAN)
>>>> +        uint32_t padding; /* Initialized to zero.  */
>>>> +        uint32_t ptr32;
>>>> +#else /* LITTLE */
>>>> +        uint32_t ptr32;
>>>> +        uint32_t padding; /* Initialized to zero.  */
>>>> +#endif /* ENDIAN */
>>>> +      } ptr;
>>>> +#endif
>>>> +    } rseq_cs;
>>> 
>>> Are these conditionals correct for x32?
>>
>> Let's see. With x86 gcc:
>>
>> -m64: (__x86_64__ && __LP64__)
>> -m32: (__i386__)
>> -mx32: (__x86_64__ && __ILP32__)
>>
>> So with "#ifdef __LP64__" we specifically target 64-bit pointers. The rest
>> falls into the "else" case, which expects 32-bit pointers. Considering that
>> x32 has 32-bit pointers, I don't see any issue here.
> 
> Does the kernel have a separate 32-bit entry point for rseq on x32?
> If not, it will expect the 64-bit struct layout.

No, there is a single entry point into rseq covering all of 32-bit, 64-bit and x32.
We achieve this by ensuring the layout of the linux/rseq.h structures
uses the union representation for pointers. Therefore, the kernel does not care
whether it reads a pointer from a 32-bit or 64-bit process. This is becoming the
preferred way to design Linux kernel ABIs nowadays.
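
As a sketch of why a single entry point is enough: the 32-bit view of the
field is laid out so that the 64-bit value the kernel reads equals the
zero-extended 32-bit pointer.  The union below is a hypothetical stand-alone
copy of the 32-bit branch, using the compiler-predefined __BYTE_ORDER__
(GCC/Clang) rather than the glibc/kernel byte-order macros, and
kernel_view_of is an illustrative helper.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical copy of the rseq_cs field as seen by a 32-bit process.  */
union cs_ptr_32bit_view
{
  uint64_t ptr64;          /* What the kernel reads.  */
  struct
  {
#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    uint32_t padding;      /* Initialized to zero.  */
    uint32_t ptr32;
#else
    uint32_t ptr32;
    uint32_t padding;      /* Initialized to zero.  */
#endif
  } ptr;
};

_Static_assert (sizeof (union cs_ptr_32bit_view) == 8,
                "same size in every ABI");

/* Store a 32-bit user address the way a 32-bit process would, then read
   the field back the way the kernel does.  */
static uint64_t
kernel_view_of (uint32_t user_address)
{
  union cs_ptr_32bit_view v;
  v.ptr.padding = 0;
  v.ptr.ptr32 = user_address;
  return v.ptr64;
}

/* Usage example.  */
static void
kernel_view_demo (void)
{
  assert (kernel_view_of (0) == 0);
  assert (kernel_view_of (UINT32_C (0x12345678)) == UINT64_C (0x12345678));
}
```

The padding member has to be zero for this to hold, which is why the header
comments say "Initialized to zero".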

> 
>> We don't mind that user-space uses that pointer, but we never want the kernel
>> to touch that pointer rather than the 32/64-bit-aware fields. One possibility
>> would be to do:
>>
>>     union
>>       {
>>         uint64_t ptr64;
>> #ifdef __LP64__
>>         uint64_t ptr;
>> #else
>>         struct
>>           {
>> #if (defined (__BYTE_ORDER) && (__BYTE_ORDER == __BIG_ENDIAN)) || defined (__BIG_ENDIAN)
>>             uint32_t padding; /* Initialized to zero.  */
>>             uint32_t ptr32;
>> #else /* LITTLE */
>>             uint32_t ptr32;
>>             uint32_t padding; /* Initialized to zero.  */
>> #endif /* ENDIAN */
>>           } ptr;
>> #endif
>>
>> #ifndef __KERNEL__
>>      const struct rseq_cs *uptr;
>> #endif
>>       } rseq_cs;
>>
>> in the union, so only user-space can see that field. Thoughts ?
> 
> I think this depends on where the x32 question lands.

x32 should not be an issue as explained above, so I'm very open to
add this "uptr" for user-space only.

Thanks,

Mathieu
Mathieu Desnoyers April 27, 2020, 8:27 p.m. UTC | #8
----- On Apr 27, 2020, at 1:26 PM, Mathieu Desnoyers mathieu.desnoyers@efficios.com wrote:
[...]
>>> We don't mind that user-space uses that pointer, but we never want the kernel
>>> to touch that pointer rather than the 32/64-bit-aware fields. One possibility
>>> would be to do:
>>>
>>>     union
>>>       {
>>>         uint64_t ptr64;
>>> #ifdef __LP64__
>>>         uint64_t ptr;
>>> #else
>>>         struct
>>>           {
>>> #if (defined (__BYTE_ORDER) && (__BYTE_ORDER == __BIG_ENDIAN)) || defined (__BIG_ENDIAN)
>>>             uint32_t padding; /* Initialized to zero.  */
>>>             uint32_t ptr32;
>>> #else /* LITTLE */
>>>             uint32_t ptr32;
>>>             uint32_t padding; /* Initialized to zero.  */
>>> #endif /* ENDIAN */
>>>           } ptr;
>>> #endif
>>>
>>> #ifndef __KERNEL__
>>>      const struct rseq_cs *uptr;
>>> #endif
>>>       } rseq_cs;
>>>
>>> in the union, so only user-space can see that field. Thoughts ?
>> 
>> I think this depends on where the x32 question lands.
> 
> x32 should not be an issue as explained above, so I'm very open to
> add this "uptr" for user-space only.

Actually, the snippet above is broken on 32-bit. It needs to be:

    union
      {
        uint64_t ptr64;
#ifdef __LP64__
        uint64_t ptr;
# ifndef __KERNEL__
        const struct rseq_cs *uptr;
# endif
#else   
        struct
          {
#if (defined (__BYTE_ORDER) && (__BYTE_ORDER == __BIG_ENDIAN)) || defined (__BIG_ENDIAN)
            uint32_t padding; /* Initialized to zero.  */
            uint32_t ptr32;
#else /* LITTLE */
            uint32_t ptr32;
            uint32_t padding; /* Initialized to zero.  */
#endif /* ENDIAN */
          } ptr;
# ifndef __KERNEL__
        struct
          {
#  if (defined (__BYTE_ORDER) && (__BYTE_ORDER == __BIG_ENDIAN)) || defined (__BIG_ENDIAN)
            uint32_t padding; /* Initialized to zero.  */
            const struct rseq_cs *uptr32;
#  else /* LITTLE */
            const struct rseq_cs *uptr32;
            uint32_t padding; /* Initialized to zero.  */
#  endif /* ENDIAN */
          } uptr;
# endif
#endif
      } rseq_cs;

I'll leave this out of the patchset for now as we'd need more feedback on its
usefulness.

Thanks,

Mathieu
Florian Weimer April 28, 2020, 12:02 p.m. UTC | #9
* Mathieu Desnoyers:

>>>>> +/* struct rseq is aligned on 4 * 8 bytes to ensure it is always
>>>>> +   contained within a single cache-line.
>>>>> +
>>>>> +   A single struct rseq per thread is allowed.  */
>>>>> +struct rseq
>>>>> +  {
>>>>> +    /* Restartable sequences cpu_id_start field. Updated by the
>>>>> +       kernel. Read by user-space with single-copy atomicity
>>>>> +       semantics. This field should only be read by the thread which
>>>>> +       registered this data structure. Aligned on 32-bit. Always
>>>> 
>>>> What does “Aligned on 32-bit” mean in this context?  Do you mean to
>>>> reference 32-*byte* alignment here?
>>>
>>> No. I really mean 32-bit (4-byte). Being aligned on 32-byte guarantees that
>>> this field is aligned at least on 4-byte. This is required by single-copy
>>> atomicity semantics.
>>>
>>> Should I update this comment to state "Aligned on 4-byte" instead ?
>> 
>> I think this is implied by all Linux ABIs.  And the explicit alignment
>> specification for struct rseq makes the alignment 32 bytes.
>
> Unless a structure ends up being packed, which is of course not the case
> here.
>
> I would prefer to keep the comment about 32-bit alignment requirement on
> the specific fields, because the motivation for alignment requirement is
> much more strict for fields (correctness) than the motivation for alignment
> of the structure (performance).

But the correctness is already enforced by the compiler, so I fail to
see the point of mentioning this in the comment.

Anyway, I don't want to make a big deal of it.  Please leave it in if
you think it is helpful.

> x32 should not be an issue as explained above, so I'm very open to
> add this "uptr" for user-space only.

Okay, then please use anonymous unions and structs as necessary, to
ensure that the uptr field can be reached on all platforms in the same
way.
Mathieu Desnoyers April 28, 2020, 12:33 p.m. UTC | #10
----- On Apr 28, 2020, at 8:02 AM, Florian Weimer fw@deneb.enyo.de wrote:

> * Mathieu Desnoyers:
> 
>>>>>> +/* struct rseq is aligned on 4 * 8 bytes to ensure it is always
>>>>>> +   contained within a single cache-line.
>>>>>> +
>>>>>> +   A single struct rseq per thread is allowed.  */
>>>>>> +struct rseq
>>>>>> +  {
>>>>>> +    /* Restartable sequences cpu_id_start field. Updated by the
>>>>>> +       kernel. Read by user-space with single-copy atomicity
>>>>>> +       semantics. This field should only be read by the thread which
>>>>>> +       registered this data structure. Aligned on 32-bit. Always
>>>>> 
>>>>> What does “Aligned on 32-bit” mean in this context?  Do you mean to
>>>>> reference 32-*byte* alignment here?
>>>>
>>>> No. I really mean 32-bit (4-byte). Being aligned on 32-byte guarantees that
>>>> this field is aligned at least on 4-byte. This is required by single-copy
>>>> atomicity semantics.
>>>>
>>>> Should I update this comment to state "Aligned on 4-byte" instead ?
>>> 
>>> I think this is implied by all Linux ABIs.  And the explicit alignment
>>> specification for struct rseq makes the alignment 32 bytes.
>>
>> Unless a structure ends up being packed, which is of course not the case
>> here.
>>
>> I would prefer to keep the comment about 32-bit alignment requirement on
>> the specific fields, because the motivation for alignment requirement is
>> much more strict for fields (correctness) than the motivation for alignment
>> of the structure (performance).
> 
> But the correctness is already enforced by the compiler, so I fail to
> see the point of mentioning this in the comment.
> 
> Anyway, I don't want to make a big deal of it.  Please leave it in if
> you think it is helpful.

I would prefer to leave it in, just to make the requirements clear in
case those structures are allocated on the heap (for instance).
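
A minimal sketch of the heap case, assuming a C11 toolchain with
aligned_alloc().  The struct demo_rseq type and both helper names are
hypothetical stand-ins, not the layout proposed in the patch; they only
model the 32-byte structure alignment and the 4-byte aligned 32-bit fields
discussed above:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct rseq.  Only the alignment properties
   under discussion are modeled here.  */
struct demo_rseq
  {
    uint32_t cpu_id_start;
    uint32_t cpu_id;
    uint64_t rseq_cs;
    uint32_t flags;
  } __attribute__ ((aligned (4 * sizeof (uint64_t))));

/* Return nonzero if P is aligned on ALIGN bytes.  */
static int
demo_is_aligned (const void *p, size_t align)
{
  return ((uintptr_t) p % align) == 0;
}

/* Allocate one zeroed struct demo_rseq with the alignment its type
   requests.  Plain malloc is only required to return memory aligned for
   fundamental types, which can be less than 32 bytes.  */
static struct demo_rseq *
demo_rseq_alloc (void)
{
  struct demo_rseq *p = aligned_alloc (_Alignof (struct demo_rseq),
                                       sizeof (struct demo_rseq));
  if (p != NULL)
    {
      memset (p, 0, sizeof (*p));
      /* The 32-byte structure alignment implies 4-byte field alignment.  */
      assert (demo_is_aligned (&p->cpu_id_start, sizeof (uint32_t)));
    }
  return p;
}
```

With an allocator that returns less-aligned memory, the compiler can no
longer enforce the field alignment on its own, which is the situation the
comment is meant to cover.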

> 
>> x32 should not be an issue as explained above, so I'm very open to
>> add this "uptr" for user-space only.
> 
> Okay, then please use anonymous unions and structs as necessary, to
> ensure that the uptr field can be reached on all platforms in the same
> way.

OK, will do!

One issue I'm currently facing when running "make check": because nptl/tst-rseq-nptl.c
uses pthread_cancel(), I run into an Abort with:

libgcc_s.so.1 must be installed for pthread_cancel to work
Didn't expect signal from child: got `Aborted'

So far I've tested the rest of that file with a patch on top which disables the use of
pthread_cancel (), but I'd really like to give it full coverage before sending this out.
In https://sourceware.org/glibc/wiki/Testing/Builds there is a section about
"Building glibc with intent to install" which describes that libgcc must be copied
manually. My use-case is that I just want to run "make check" in the build directory
and make sure it finds the libgcc it needs to succeed using pthread_cancel ().
How can I achieve this ?

Thanks,

Mathieu
Florian Weimer April 28, 2020, 12:35 p.m. UTC | #11
* Mathieu Desnoyers:

> One issue I'm currently facing when running "make check": because
> nptl/tst-rseq-nptl.c uses pthread_cancel(), I run into an Abort
> with:
>
> libgcc_s.so.1 must be installed for pthread_cancel to work
> Didn't expect signal from child: got `Aborted'

This is really unusual.  Is the affected test statically linked?
Mathieu Desnoyers April 28, 2020, 12:43 p.m. UTC | #12
----- On Apr 28, 2020, at 8:35 AM, Florian Weimer fw@deneb.enyo.de wrote:

> * Mathieu Desnoyers:
> 
>> One issue I'm currently facing when running "make check": because
>> nptl/tst-rseq-nptl.c uses pthread_cancel(), I run into an Abort
>> with:
>>
>> libgcc_s.so.1 must be installed for pthread_cancel to work
>> Didn't expect signal from child: got `Aborted'
> 
> This is really unusual.  Is the affected test statically linked?

I built glibc without specifying anything particular, and ran
"make check". It indeed seems to be dynamically linked to libc:

ldd tst-rseq-nptl
./tst-rseq-nptl: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found (required by ./tst-rseq-nptl)
linux-vdso.so.1 (0x00007ffd3a2f3000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f0527560000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f052716f000)
/home/efficios/glibc-test5/lib/ld-linux-x86-64.so.2 => /lib64/ld-linux-x86-64.so.2 (0x00007f0527986000)

After make check I have:

cat tst-rseq-nptl.test-result 
FAIL: nptl/tst-rseq-nptl
original exit status 134

And if I run

./tst-rseq-nptl

Then I get

libgcc_s.so.1 must be installed for pthread_cancel to work
Didn't expect signal from child: got `Aborted'
libgcc_s.so.1 must be installed for pthread_cancel to work
Aborted (core dumped)

Same result if I do ./testrun.sh nptl/tst-rseq-nptl

Thanks,

Mathieu
Florian Weimer April 28, 2020, 12:54 p.m. UTC | #13
* Mathieu Desnoyers:

> ----- On Apr 28, 2020, at 8:35 AM, Florian Weimer fw@deneb.enyo.de wrote:
>
>> * Mathieu Desnoyers:
>> 
>>> One issue I'm currently facing when running "make check": because
>>> nptl/tst-rseq-nptl.c uses pthread_cancel(), I run into an Abort
>>> with:
>>>
>>> libgcc_s.so.1 must be installed for pthread_cancel to work
>>> Didn't expect signal from child: got `Aborted'
>> 
>> This is really unusual.  Is the affected test statically linked?
>
> I built glibc without specifying anything particular, and ran
> "make check". It indeed seems to be dynamically linked to libc:
>
> ldd tst-rseq-nptl
> ./tst-rseq-nptl: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found (required by ./tst-rseq-nptl)
> linux-vdso.so.1 (0x00007ffd3a2f3000)
> libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f0527560000)
> libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f052716f000)
> /home/efficios/glibc-test5/lib/ld-linux-x86-64.so.2 => /lib64/ld-linux-x86-64.so.2 (0x00007f0527986000)

That's expected if the installed glibc is older than the built glibc.

> After make check I have:
>
> cat tst-rseq-nptl.test-result 
> FAIL: nptl/tst-rseq-nptl
> original exit status 134

What's in the tst-rseq-nptl.out file?

> And if I run
>
> ./tst-rseq-nptl
>
> Then I get
>
> libgcc_s.so.1 must be installed for pthread_cancel to work
> Didn't expect signal from child: got `Aborted'
> libgcc_s.so.1 must be installed for pthread_cancel to work
> Aborted (core dumped)

I'm puzzled why you don't get a GLIBC_2.32 version error in this case.
Do you build with --enable-hardcoded-path-in-tests?

> Same result if I do ./testrun.sh nptl/tst-rseq-nptl

That one definitely should work.

I expect you might see this if libgcc_s.so.1 is installed into a
multiarch subdirectory that upstream glibc does not search.  (The
Debian patches are unfortunately not upstream.)

I think on my system, the built glibc can find the system libgcc_s via
/etc/ld.so.cache, so I haven't seen this issue yet.
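
As far as I can tell, the abort message quoted above is printed when the
cancellation code fails to dlopen libgcc_s.so.1 at run time.  A small probe
along these lines could confirm whether a given dynamic loader finds the
library.  This is a sketch under that assumption; demo_can_load is a
hypothetical helper name, and older glibc versions need -ldl at link time:

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Return 1 if the dynamic loader of the running process can locate and
   load the shared object NAME through its normal search path, else 0.  */
static int
demo_can_load (const char *name)
{
  assert (name != NULL);
  void *handle = dlopen (name, RTLD_NOW);
  if (handle == NULL)
    return 0;
  dlclose (handle);
  return 1;
}
```

Calling demo_can_load ("libgcc_s.so.1") from a program started through
./testrun.sh would exercise the search path of the newly built glibc
rather than the one of the installed system loader.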
Mathieu Desnoyers April 28, 2020, 12:56 p.m. UTC | #14
----- On Apr 28, 2020, at 8:33 AM, Mathieu Desnoyers mathieu.desnoyers@efficios.com wrote:

> ----- On Apr 28, 2020, at 8:02 AM, Florian Weimer fw@deneb.enyo.de wrote:
> 
[...]
>> 
>>> x32 should not be an issue as explained above, so I'm very open to
>>> add this "uptr" for user-space only.
>> 
>> Okay, then please use anonymous unions and structs as necessary, to
>> ensure that the uptr field can be reached on all platforms in the same
>> way.
> 
> OK, will do!

What I came up with looks like this. User-space can use rseq_cs.uptr.ptr
both on 32-bit and 64-bit to update the pointer:

    /* Restartable sequences rseq_cs field.

       Contains NULL when no critical section is active for the current
       thread, or holds a pointer to the currently active struct rseq_cs.

       Updated by user-space, which sets the address of the currently
       active rseq_cs at the beginning of assembly instruction sequence
       block, and set to NULL by the kernel when it restarts an assembly
       instruction sequence block, as well as when the kernel detects that
       it is preempting or delivering a signal outside of the range
       targeted by the rseq_cs.  Also needs to be set to NULL by user-space
       before reclaiming memory that contains the targeted struct rseq_cs.

       Read and set by the kernel.  Set by user-space with single-copy
       atomicity semantics.  This field should only be updated by the
       thread which registered this data structure.  Aligned on 64-bit.

       User-space may perform the update through the rseq_cs.uptr.ptr
       field.  The padding needs to be initialized to zero on 32-bit.  */
    union
      {
        uint64_t ptr64;
#ifdef __LP64__
        uint64_t ptr;
#else   
        struct
          {
# if (defined (__BYTE_ORDER) && (__BYTE_ORDER == __BIG_ENDIAN)) || defined (__BIG_ENDIAN)
            uint32_t padding; /* Initialized to zero.  */
            uint32_t ptr32;
# else /* LITTLE */
            uint32_t ptr32;
            uint32_t padding; /* Initialized to zero.  */
# endif /* ENDIAN */
          } ptr;
#endif

#ifndef __KERNEL__
        struct
          {
# ifdef __LP64__
            const struct rseq_cs *ptr;
# else
#  if (defined (__BYTE_ORDER) && (__BYTE_ORDER == __BIG_ENDIAN)) || defined (__BIG_ENDIAN)
            uint32_t padding; /* Initialized to zero.  */
            const struct rseq_cs *ptr;
#  else /* LITTLE */
            const struct rseq_cs *ptr;
            uint32_t padding; /* Initialized to zero.  */
#  endif /* ENDIAN */
# endif
          } uptr;
#endif
      } rseq_cs;
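
For illustration, here is a reduced, standalone sketch of how user-space
would use the field.  The struct demo_rseq wrapper and the two helper
names are hypothetical, and the kernel-only ptr/ptr32 members are left
out.  It tests the compiler-provided __BYTE_ORDER__ macro instead of the
condition above, because glibc's <endian.h> defines __BIG_ENDIAN on every
target, which would make the "defined (__BIG_ENDIAN)" test true on
little-endian as well:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct rseq_cs;  /* Opaque in this sketch; only its address is stored.  */

/* Hypothetical wrapper holding just the field under discussion.  */
struct demo_rseq
  {
    union
      {
        uint64_t ptr64;
        struct
          {
#if defined (__LP64__)
            const struct rseq_cs *ptr;
#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
            uint32_t padding; /* Initialized to zero.  */
            const struct rseq_cs *ptr;
#else /* LITTLE */
            const struct rseq_cs *ptr;
            uint32_t padding; /* Initialized to zero.  */
#endif
          } uptr;
      } rseq_cs;
  };

_Static_assert (sizeof (struct demo_rseq) == sizeof (uint64_t),
                "rseq_cs must stay 64 bits wide");

/* Store CS through the user-space view.  A real implementation would use
   a store with single-copy atomicity guarantees; a plain assignment keeps
   the sketch short.  The padding must already be zero on 32-bit.  */
static void
demo_set_rseq_cs (struct demo_rseq *r, const struct rseq_cs *cs)
{
  r->rseq_cs.uptr.ptr = cs;
}

/* Return the 64-bit value as seen through the ptr64 view.  */
static uint64_t
demo_kernel_view (const struct demo_rseq *r)
{
  return r->rseq_cs.ptr64;
}
```

As long as the structure starts out zeroed, the 64-bit view matches the
pointer value on both 32-bit and 64-bit, and user-space code is spelled
the same way on all of them.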

Thanks,

Mathieu
Mathieu Desnoyers April 28, 2020, 2:58 p.m. UTC | #15
----- On Apr 28, 2020, at 8:54 AM, Florian Weimer fw@deneb.enyo.de wrote:

> * Mathieu Desnoyers:
> 
>> ----- On Apr 28, 2020, at 8:35 AM, Florian Weimer fw@deneb.enyo.de wrote:
>>
>>> * Mathieu Desnoyers:
>>> 
>>>> One issue I'm currently facing when running "make check": because
>>>> nptl/tst-rseq-nptl.c uses pthread_cancel(), I run into an Abort
>>>> with:
>>>>
>>>> libgcc_s.so.1 must be installed for pthread_cancel to work
>>>> Didn't expect signal from child: got `Aborted'
>>> 
>>> This is really unusual.  Is the affected test statically linked?
>>
>> I built glibc without specifying anything particular, and ran
>> "make check". It indeed seems to be dynamically linked to libc:
>>
>> ldd tst-rseq-nptl
>> ./tst-rseq-nptl: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found
>> (required by ./tst-rseq-nptl)
>> linux-vdso.so.1 (0x00007ffd3a2f3000)
>> libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f0527560000)
>> libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f052716f000)
>> /home/efficios/glibc-test5/lib/ld-linux-x86-64.so.2 =>
>> /lib64/ld-linux-x86-64.so.2 (0x00007f0527986000)
> 
> That's expected if the installed glibc is older than the built glibc.
> 
>> After make check I have:
>>
>> cat tst-rseq-nptl.test-result
>> FAIL: nptl/tst-rseq-nptl
>> original exit status 134
> 
> What's in the tst-rseq-nptl.out file?

It contains:

Didn't expect signal from child: got `Aborted'

>> And if I run
>>
>> ./tst-rseq-nptl
>>
>> Then I get
>>
>> libgcc_s.so.1 must be installed for pthread_cancel to work
>> Didn't expect signal from child: got `Aborted'
>> libgcc_s.so.1 must be installed for pthread_cancel to work
>> Aborted (core dumped)
> 
> I'm puzzled why you don't get a GLIBC_2.32 version error in this case.
> Do you build with --enable-hardcoded-path-in-tests?

Just tried with this, and it fails the same way. However, the output
of ldd nptl/tst-rseq-nptl changes:

linux-vdso.so.1 (0x00007ffc235c9000)
libpthread.so.0 => /home/efficios/git/glibc-build/nptl/libpthread.so.0 (0x00007fd308278000)
libc.so.6 => /home/efficios/git/glibc-build/libc.so.6 (0x00007fd307ebc000)
/home/efficios/git/glibc-build/elf/ld.so => /lib64/ld-linux-x86-64.so.2 (0x00007fd30869e000)


>> Same result if I do ./testrun.sh nptl/tst-rseq-nptl
> 
> That one definitely should work.
> 
> I expect you might see this if libgcc_s.so.1 is installed into a
> multiarch subdirectory that upstream glibc does not search.  (The
> Debian patches are unfortunately not upstream.)

My test environment is Ubuntu 18.04.1 LTS.

> 
> I think on my system, the built glibc can find the system libgcc_s via
> /etc/ld.so.cache, so I haven't seen this issue yet.

On my system, libgcc_s is provided here:

/lib/x86_64-linux-gnu/libgcc_s.so.1

by this package:

Package: libgcc1
Architecture: amd64
Version: 1:8.4.0-1ubuntu1~18.04

Thanks,

Mathieu
Szabolcs Nagy April 29, 2020, 8:16 a.m. UTC | #16
The 04/28/2020 10:58, Mathieu Desnoyers wrote:
> ----- On Apr 28, 2020, at 8:54 AM, Florian Weimer fw@deneb.enyo.de wrote:
> > That one definitely should work.
> > 
> > I expect you might see this if libgcc_s.so.1 is installed into a
> > multiarch subdirectory that upstream glibc does not search.  (The
> > Debian patches are unfortunately not upstream.)
> 
> My test environment is Ubuntu 18.04.1 LTS.
> 
> > 
> > I think on my system, the built glibc can find the system libgcc_s via
> > /etc/ld.so.cache, so I haven't seen this issue yet.
> 
> On my system, libgcc_s is provided here:
> 
> /lib/x86_64-linux-gnu/libgcc_s.so.1
> 
> by this package:
> 
> Package: libgcc1
> Architecture: amd64
> Version: 1:8.4.0-1ubuntu1~18.04

before running the tests

cp `$CC --print-file-name libgcc_s.so.1` glibc/build/dir
cp `$CC --print-file-name libstdc++.so.6` glibc/build/dir

so those toolchain libs are in the search path
of the newly built libc when running tests.
Florian Weimer April 29, 2020, 8:18 a.m. UTC | #17
* Szabolcs Nagy:

> The 04/28/2020 10:58, Mathieu Desnoyers wrote:
>> ----- On Apr 28, 2020, at 8:54 AM, Florian Weimer fw@deneb.enyo.de wrote:
>> > That one definitely should work.
>> > 
>> > I expect you might see this if libgcc_s.so.1 is installed into a
>> > multiarch subdirectory that upstream glibc does not search.  (The
>> > Debian patches are unfortunately not upstream.)
>> 
>> My test environment is Ubuntu 18.04.1 LTS.
>> 
>> > 
>> > I think on my system, the built glibc can find the system libgcc_s via
>> > /etc/ld.so.cache, so I haven't seen this issue yet.
>> 
>> On my system, libgcc_s is provided here:
>> 
>> /lib/x86_64-linux-gnu/libgcc_s.so.1
>> 
>> by this package:
>> 
>> Package: libgcc1
>> Architecture: amd64
>> Version: 1:8.4.0-1ubuntu1~18.04
>
> before running the tests
>
> cp `$CC --print-file-name libgcc_s.so.1` glibc/build/dir
> cp `$CC --print-file-name libstdc++.so.6` glibc/build/dir
>
> so those toolchain libs are in the search path
> of the newly built libc when running tests.

Do you actually see the need for these steps yourself?

I guess the correct fix would be to upstream the Debian multiarch
changes and activate them automatically with a configure check on
systems that use multiarch paths.
Szabolcs Nagy April 29, 2020, 8:52 a.m. UTC | #18
The 04/29/2020 10:18, Florian Weimer wrote:
> * Szabolcs Nagy:
> 
> > The 04/28/2020 10:58, Mathieu Desnoyers wrote:
> >> ----- On Apr 28, 2020, at 8:54 AM, Florian Weimer fw@deneb.enyo.de wrote:
> >> > That one definitely should work.
> >> > 
> >> > I expect you might see this if libgcc_s.so.1 is installed into a
> >> > multiarch subdirectory that upstream glibc does not search.  (The
> >> > Debian patches are unfortunately not upstream.)
> >> 
> >> My test environment is Ubuntu 18.04.1 LTS.
> >> 
> >> > 
> >> > I think on my system, the built glibc can find the system libgcc_s via
> >> > /etc/ld.so.cache, so I haven't seen this issue yet.
> >> 
> >> On my system, libgcc_s is provided here:
> >> 
> >> /lib/x86_64-linux-gnu/libgcc_s.so.1
> >> 
> >> by this package:
> >> 
> >> Package: libgcc1
> >> Architecture: amd64
> >> Version: 1:8.4.0-1ubuntu1~18.04
> >
> > before running the tests
> >
> > cp `$CC --print-file-name libgcc_s.so.1` glibc/build/dir
> > cp `$CC --print-file-name libstdc++.so.6` glibc/build/dir
> >
> > so those toolchain libs are in the search path
> > of the newly built libc when running tests.
> 
> Do you actually see the need for these steps yourself?
> 
> I guess the correct fix would be to upstream the Debian multiarch
> changes and activate them automatically with a configure check on
> systems that use multiarch paths.

cancel tests work for me on an ubuntu system because
of /etc/ld.so.cache, but that may not be present
or the system may not be glibc based at all.

i always do the cp because i build gcc myself (usually
close to current master) and don't install it to the
system path which means at compile time and runtime
different libraries are used if i don't copy
Florian Weimer April 29, 2020, 9:01 a.m. UTC | #19
* Szabolcs Nagy:

> cancel tests work for me on an ubuntu system because
> of /etc/ld.so.cache, but that may not be present
> or the system may not be glibc based at all.

I see.

> i always do the cp because i build gcc myself (usually
> close to current master) and don't install it to the
> system path which means at compile time and runtime
> different libraries are used if i don't copy

I wonder if we should do this automatically (maybe using a symbolic
link), at least if GCC and ld can find these shared objects.  Maybe
it's possible to tell ld to provide output that makes it easy to
locate the actual objects in the file system?

(Trimmed the Cc: list.)
Florian Weimer April 29, 2020, 12:19 p.m. UTC | #20
* Mathieu Desnoyers:

> ----- On Apr 28, 2020, at 8:33 AM, Mathieu Desnoyers mathieu.desnoyers@efficios.com wrote:
>
>> ----- On Apr 28, 2020, at 8:02 AM, Florian Weimer fw@deneb.enyo.de wrote:
>> 
> [...]
>>> 
>>>> x32 should not be an issue as explained above, so I'm very open to
>>>> add this "uptr" for user-space only.
>>> 
>>> Okay, then please use anonymous unions and structs as necessary, to
>>> ensure that the uptr field can be reached on all platforms in the same
>>> way.
>> 
>> OK, will do!
>
> What I came up with looks like this. User-space can use rseq_cs.uptr.ptr
> both on 32-bit and 64-bit to update the pointer:

Agreed, this should work.
Mathieu Desnoyers April 29, 2020, 12:57 p.m. UTC | #21
----- On Apr 29, 2020, at 5:01 AM, Florian Weimer fw@deneb.enyo.de wrote:

> * Szabolcs Nagy:
> 
>> cancel tests work for me on an ubuntu system because
>> of /etc/ld.so.cache, but that may not be present
>> or the system may not be glibc based at all.
> 
> I see.
> 
>> i always do the cp because i build gcc myself (usually
>> close to current master) and don't install it to the
>> system path which means at compile time and runtime
>> different libraries are used if i don't copy
> 
> I wonder if we should do this automatically (maybe using a symbolic
> link), at least if GCC and ld can find these shared objects.  Maybe
> it's possible to tell ld to provide output that makes it easy to
> locate the actual objects in the file system?
> 
> (Trimmed the Cc: list.)

For the record, here is the output I get here:

$ gcc --print-file-name libgcc_s.so.1
/usr/lib/gcc/x86_64-linux-gnu/7/libgcc_s.so.1
$ gcc --print-file-name libstdc++.so.6
/usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/libstdc++.so.6
$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/7/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 7.5.0-3ubuntu1~18.04' --with-bugurl=file:///usr/share/doc/gcc-7/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --prefix=/usr --with-gcc-major-version-only --program-suffix=-7 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --enable-default-pie --with-system-zlib --with-target-system-zlib --enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04) 

Even if this gets worked around upstream, I suspect it would be good to mention
this trick in https://sourceware.org/glibc/wiki/Testing/Builds for those
working on older glibc versions.

Considering that my build directory is in:

/home/efficios/git/glibc-build

and I configure my glibc from the build directory with:
"../glibc/configure --prefix=/home/efficios/glibc-test"

Where exactly should I copy libgcc_s.so.1 and libstdc++.so.6 ? I tried copying
them to /home/efficios/git/glibc-build/ but it does not appear to fix the issue.
I notice that other .so are pulled into /home/efficios/git/glibc-build/testroot.pristine
and then /home/efficios/git/glibc-build/testroot.root though. Is it where I should
copy them, and if so, under which subdirectory are they expected ?

Thanks,

Mathieu
Szabolcs Nagy April 30, 2020, 10:40 a.m. UTC | #22
The 04/29/2020 08:57, Mathieu Desnoyers wrote:
> ----- On Apr 29, 2020, at 5:01 AM, Florian Weimer fw@deneb.enyo.de wrote:
> 
> > * Szabolcs Nagy:
> > 
> >> cancel tests work for me on an ubuntu system because
> >> of /etc/ld.so.cache, but that may not be present
> >> or the system may not be glibc based at all.
> > 
> > I see.
> > 
> >> i always do the cp because i build gcc myself (usually
> >> close to current master) and don't install it to the
> >> system path which means at compile time and runtime
> >> different libraries are used if i don't copy
> > 
> > I wonder if we should do this automatically (maybe using a symbolic
> > link), at least if GCC and ld can find these shared objects.  Maybe
> > it's possible to tell ld to provide output that makes it easy to
> > locate the actual objects in the file system?
> > 
> > (Trimmed the Cc: list.)
> 
> For the record, here is the output I get here:
> 
> $ gcc --print-file-name libgcc_s.so.1
> /usr/lib/gcc/x86_64-linux-gnu/7/libgcc_s.so.1
> $ gcc --print-file-name libstdc++.so.6
> /usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/libstdc++.so.6
> $ gcc -v
> Using built-in specs.
> COLLECT_GCC=gcc
> COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/7/lto-wrapper
> OFFLOAD_TARGET_NAMES=nvptx-none
> OFFLOAD_TARGET_DEFAULT=1
> Target: x86_64-linux-gnu
> Configured with: ../src/configure -v --with-pkgversion='Ubuntu 7.5.0-3ubuntu1~18.04' --with-bugurl=file:///usr/share/doc/gcc-7/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --prefix=/usr --with-gcc-major-version-only --program-suffix=-7 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --enable-default-pie --with-system-zlib --with-target-system-zlib --enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
> Thread model: posix
> gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04) 
> 
> Even if this gets worked around upstream, I suspect it would be good to mention
> this trick in https://sourceware.org/glibc/wiki/Testing/Builds for those
> working on older glibc versions.
> 
> Considering that my build directory is in:
> 
> /home/efficios/git/glibc-build
> 
> and I configure my glibc from the build directory with:
> "../glibc/configure --prefix=/home/efficios/glibc-test"
> 
> Where exactly should I copy libgcc_s.so.1 and libstdc++.so.6 ? I tried copying
> them to /home/efficios/git/glibc-build/ but it does not appear to fix the issue.

hm then the issue may be something else.

the build directory should be early in the runtime
library search path so if the right libs are there
then things should work.

> I notice that other .so are pulled into /home/efficios/git/glibc-build/testroot.pristine
> and then /home/efficios/git/glibc-build/testroot.root though. Is it where I should
> copy them, and if so, under which subdirectory are they expected ?

that is used if the test is running in a container
(e.g. resolver tests that use custom /etc/resolv.conf)

i'd try to run

strace -f ./testrun.sh nptl/tst-rseq-nptl --direct 2>strace.log

and grep libgcc_s strace.log
Mathieu Desnoyers April 30, 2020, 7:04 p.m. UTC | #23
----- On Apr 30, 2020, at 6:40 AM, Szabolcs Nagy szabolcs.nagy@arm.com wrote:

> The 04/29/2020 08:57, Mathieu Desnoyers wrote:
>> ----- On Apr 29, 2020, at 5:01 AM, Florian Weimer fw@deneb.enyo.de wrote:
>> 
>> > * Szabolcs Nagy:
>> > 
>> >> cancel tests work for me on an ubuntu system because
>> >> of /etc/ld.so.cache, but that may not be present
>> >> or the system may not be glibc based at all.
>> > 
>> > I see.
>> > 
>> >> i always do the cp because i build gcc myself (usually
>> >> close to current master) and don't install it to the
>> >> system path which means at compile time and runtime
>> >> different libraries are used if i don't copy
>> > 
>> > I wonder if we should do this automatically (maybe using a symbolic
>> > link), at least if GCC and ld can find these shared objects.  Maybe
>> > it's possible to tell ld to provide output that makes it easy to
>> > locate the actual objects in the file system?
>> > 
>> > (Trimmed the Cc: list.)
>> 
>> For the record, here is the output I get here:
>> 
>> $ gcc --print-file-name libgcc_s.so.1
>> /usr/lib/gcc/x86_64-linux-gnu/7/libgcc_s.so.1
>> $ gcc --print-file-name libstdc++.so.6
>> /usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/libstdc++.so.6
>> $ gcc -v
>> Using built-in specs.
>> COLLECT_GCC=gcc
>> COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/7/lto-wrapper
>> OFFLOAD_TARGET_NAMES=nvptx-none
>> OFFLOAD_TARGET_DEFAULT=1
>> Target: x86_64-linux-gnu
>> Configured with: ../src/configure -v --with-pkgversion='Ubuntu
>> 7.5.0-3ubuntu1~18.04' --with-bugurl=file:///usr/share/doc/gcc-7/README.Bugs
>> --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --prefix=/usr
>> --with-gcc-major-version-only --program-suffix=-7
>> --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id
>> --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
>> --libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu
>> --enable-libstdcxx-debug --enable-libstdcxx-time=yes
>> --with-default-libstdcxx-abi=new --enable-gnu-unique-object
>> --disable-vtable-verify --enable-libmpx --enable-plugin --enable-default-pie
>> --with-system-zlib --with-target-system-zlib --enable-objc-gc=auto
>> --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64
>> --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic
>> --enable-offload-targets=nvptx-none --without-cuda-driver
>> --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu
>> --target=x86_64-linux-gnu
>> Thread model: posix
>> gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)
>> 
>> Even if this gets worked around upstream, I suspect it would be good to
>> mention
>> this trick in https://sourceware.org/glibc/wiki/Testing/Builds for those
>> working on older glibc versions.
>> 
>> Considering that my build directory is in:
>> 
>> /home/efficios/git/glibc-build
>> 
>> and I configure my glibc from the build directory with:
>> "../glibc/configure --prefix=/home/efficios/glibc-test"
>> 
>> Where exactly should I copy libgcc_s.so.1 and libstdc++.so.6 ? I tried copying
>> them to /home/efficios/git/glibc-build/ but it does not appear to fix the issue.
> 
> hm then the issue may be something else.
> 
> the build directory should be early in the runtime
> library search path so if the right libs are there
> then things should work.
> 
>> I notice that other .so are pulled into
>> /home/efficios/git/glibc-build/testroot.pristine
>> and then /home/efficios/git/glibc-build/testroot.root though. Is it where I
>> should
>> copy them, and if so, under which subdirectory are they expected ?
> 
> that is used if the test is running in a container
> (e.g. resolver tests that use custom /etc/resolv.conf)
> 
> i'd try to run
> 
> strace -f ./testrun.sh nptl/tst-rseq-nptl --direct 2>strace.log
> 
> and grep libgcc_s strace.log

After a few attempts, it appears that the issue was that I had already
run "make check" in my build directory beforehand, so even though I then
copied over the .so files to the build directory top level, it did not
populate them into the testroot.pristine nor testroot.root directories
for the following "make check".

If I delete both testroot.pristine and testroot.root, and make check
after making sure both .so are in the toplevel build directory, then
they are copied over to testroot.pristine, but testroot.root is not
created at all (and the test still fails).

Starting from a brand new empty build directory, and copying the .so
files before the first make check gets the test to run fine! By doing so,
the .so are copied into testroot.pristine and testroot.root.

Thanks,

Mathieu

Patch

diff --git a/NEWS b/NEWS
index e0379fc53c..cf3e05d5f9 100644
--- a/NEWS
+++ b/NEWS
@@ -9,7 +9,17 @@  Version 2.32
 
 Major new features:
 
-  * New locale added: ckb_IQ (Kurdish/Sorani spoken in Iraq)
+* New locale added: ckb_IQ (Kurdish/Sorani spoken in Iraq)
+
+* Support for automatically registering threads with the Linux rseq(2)
+  system call has been added.  This system call is implemented starting
+  from Linux 4.18.  The Restartable Sequences ABI accelerates user-space
+  operations on per-cpu data.  It allows user-space to perform updates
+  on per-cpu data without requiring heavy-weight atomic operations.
+  Automatically registering threads allows all libraries, including libc,
+  to make immediate use of the rseq(2) support by using the documented ABI.
+  See 'man 2 rseq' for the details of the ABI shared between libc and the
+  kernel.
 
 Deprecated and removed features, and other changes affecting compatibility:
 
diff --git a/elf/libc_early_init.c b/elf/libc_early_init.c
index 1ac66d895d..30466afea0 100644
--- a/elf/libc_early_init.c
+++ b/elf/libc_early_init.c
@@ -18,10 +18,13 @@ 
 
 #include <ctype.h>
 #include <libc-early-init.h>
+#include <rseq-internal.h>
 
 void
 __libc_early_init (void)
 {
   /* Initialize ctype data.  */
   __ctype_init ();
+  /* Register the rseq ABI with the kernel.  */
+  (void) rseq_register_current_thread ();
 }
diff --git a/manual/threads.texi b/manual/threads.texi
index 0858ef8f92..59f634e432 100644
--- a/manual/threads.texi
+++ b/manual/threads.texi
@@ -9,8 +9,10 @@  This chapter describes functions used for managing threads.
 POSIX threads.
 
 @menu
-* ISO C Threads::	Threads based on the ISO C specification.
-* POSIX Threads::	Threads based on the POSIX specification.
+* ISO C Threads::		Threads based on the ISO C specification.
+* POSIX Threads::		Threads based on the POSIX specification.
+* Restartable Sequences::	Linux-specific Restartable Sequences
+				integration.
 @end menu
 
 
@@ -881,3 +883,27 @@  Behaves like @code{pthread_timedjoin_np} except that the absolute time in
 @c pthread_spin_unlock
 @c pthread_testcancel
 @c pthread_yield
+
+@node Restartable Sequences
+@section Restartable Sequences
+@cindex rseq
+
+This section describes @theglibc{} Restartable Sequences integration.
+
+@deftypevar {struct rseq} __rseq_abi
+@standards{GNU, sys/rseq.h}
+@Theglibc{} implements a @code{__rseq_abi} TLS symbol to interact with the
+Restartable Sequences system call (Linux-specific).  The layout of this
+structure is defined by the Linux kernel @file{linux/rseq.h} UAPI.
+Registration of each thread's @code{__rseq_abi} is performed by
+@theglibc{} at libc initialization and pthread creation.
+@end deftypevar
+
+@deftypevr Macro int RSEQ_SIG
+@standards{GNU, sys/rseq.h}
+Each supported architecture provides a @code{RSEQ_SIG} macro in
+@file{sys/rseq.h} which contains a signature.  That signature is expected to be
+present in the code before each Restartable Sequences abort handler.  Failure
+to provide the expected signature may terminate the process with a segmentation
+fault.
+@end deftypevr
diff --git a/misc/rseq-internal.h b/misc/rseq-internal.h
new file mode 100644
index 0000000000..d564cf1bc3
--- /dev/null
+++ b/misc/rseq-internal.h
@@ -0,0 +1,33 @@ 
+/* Restartable Sequences internal API. Stub version.
+   Copyright (C) 2019-2020 Free Software Foundation, Inc.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef RSEQ_INTERNAL_H
+#define RSEQ_INTERNAL_H
+
+static inline int
+rseq_register_current_thread (void)
+{
+  return -1;
+}
+
+static inline int
+rseq_unregister_current_thread (void)
+{
+  return -1;
+}
+
+#endif /* rseq-internal.h */
diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c
index afd379e89a..1ff248042e 100644
--- a/nptl/pthread_create.c
+++ b/nptl/pthread_create.c
@@ -33,6 +33,7 @@ 
 #include <default-sched.h>
 #include <futex-internal.h>
 #include <tls-setup.h>
+#include <rseq-internal.h>
 #include "libioP.h"
 
 #include <shlib-compat.h>
@@ -384,6 +385,9 @@  START_THREAD_DEFN
   /* Initialize pointers to locale data.  */
   __ctype_init ();
 
+  /* Register the rseq TLS with the kernel.  */
+  (void) rseq_register_current_thread ();
+
 #ifndef __ASSUME_SET_ROBUST_LIST
   if (__set_robust_list_avail >= 0)
 #endif
@@ -578,6 +582,14 @@  START_THREAD_DEFN
      process is really dead since 'clone' got passed the CLONE_CHILD_CLEARTID
      flag.  The 'tid' field in the TCB will be set to zero.
 
+     rseq TLS is still registered at this point.  Rely on implicit
+     unregistration performed by the kernel on thread teardown.  This is not a
+     problem because the rseq TLS lives on the stack, and the stack outlives
+     the thread.  If TCB allocation is ever changed, additional steps may be
+     required, such as performing explicit rseq unregistration before
+     reclaiming the rseq TLS area memory.  It is NOT sufficient to block
+     signals because the kernel may write to the rseq area even without signals.
+
      The exit code is zero since in case all threads exit by calling
      'pthread_exit' the exit status must be 0 (zero).  */
   __exit_thread ();
diff --git a/sysdeps/unix/sysv/linux/Makefile b/sysdeps/unix/sysv/linux/Makefile
index 60dc5cf9e5..6c6f669d21 100644
--- a/sysdeps/unix/sysv/linux/Makefile
+++ b/sysdeps/unix/sysv/linux/Makefile
@@ -41,7 +41,7 @@  update-syscall-lists: arch-syscall.h
 endif
 
 ifeq ($(subdir),csu)
-sysdep_routines += errno-loc
+sysdep_routines += errno-loc rseq-sym
 endif
 
 ifeq ($(subdir),assert)
@@ -92,7 +92,8 @@  sysdep_headers += sys/mount.h sys/acct.h sys/sysctl.h \
 		  bits/termios-baud.h bits/termios-c_cflag.h \
 		  bits/termios-c_lflag.h bits/termios-tcflow.h \
 		  bits/termios-misc.h \
-		  bits/ipc-perm.h
+		  bits/ipc-perm.h \
+		  sys/rseq.h bits/rseq.h
 
 tests += tst-clone tst-clone2 tst-clone3 tst-fanotify tst-personality \
 	 tst-quota tst-sync_file_range tst-sysconf-iov_max tst-ttyname \
diff --git a/sysdeps/unix/sysv/linux/Versions b/sysdeps/unix/sysv/linux/Versions
index d385085c61..52ca223ab2 100644
--- a/sysdeps/unix/sysv/linux/Versions
+++ b/sysdeps/unix/sysv/linux/Versions
@@ -177,6 +177,9 @@  libc {
   GLIBC_2.30 {
     getdents64; gettid; tgkill;
   }
+  GLIBC_2.32 {
+    __rseq_abi;
+  }
   GLIBC_PRIVATE {
     # functions used in other libraries
     __syscall_rt_sigqueueinfo;
diff --git a/sysdeps/unix/sysv/linux/aarch64/bits/rseq.h b/sysdeps/unix/sysv/linux/aarch64/bits/rseq.h
new file mode 100644
index 0000000000..e272c30446
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/aarch64/bits/rseq.h
@@ -0,0 +1,43 @@ 
+/* Restartable Sequences Linux aarch64 architecture header.
+   Copyright (C) 2019-2020 Free Software Foundation, Inc.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _SYS_RSEQ_H
+# error "Never use <bits/rseq.h> directly; include <sys/rseq.h> instead."
+#endif
+
+/* RSEQ_SIG is a signature required before each abort handler code.
+
+   It is a 32-bit value that maps to actual architecture code compiled
+   into applications and libraries. It needs to be defined for each
+   architecture. When choosing this value, it needs to be taken into
+   account that generating invalid instructions may have ill effects on
+   tools like objdump, and may also have impact on the CPU speculative
+   execution efficiency in some cases.
+
+   aarch64 -mbig-endian generates mixed endianness code vs data:
+   little-endian code and big-endian data. Ensure the RSEQ_SIG signature
+   matches code endianness.  */
+
+#define RSEQ_SIG_CODE	0xd428bc00	/* BRK #0x45E0.  */
+
+#ifdef __AARCH64EB__
+# define RSEQ_SIG_DATA	0x00bc28d4	/* BRK #0x45E0.  */
+#else
+# define RSEQ_SIG_DATA	RSEQ_SIG_CODE
+#endif
+
+#define RSEQ_SIG	RSEQ_SIG_DATA
diff --git a/sysdeps/unix/sysv/linux/aarch64/libc.abilist b/sysdeps/unix/sysv/linux/aarch64/libc.abilist
index a4c31932cb..fa0702d4ac 100644
--- a/sysdeps/unix/sysv/linux/aarch64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/aarch64/libc.abilist
@@ -2145,3 +2145,4 @@  GLIBC_2.30 getdents64 F
 GLIBC_2.30 gettid F
 GLIBC_2.30 tgkill F
 GLIBC_2.30 twalk_r F
+GLIBC_2.32 __rseq_abi T 0x20
diff --git a/sysdeps/unix/sysv/linux/alpha/libc.abilist b/sysdeps/unix/sysv/linux/alpha/libc.abilist
index e7f2174ac2..482c486272 100644
--- a/sysdeps/unix/sysv/linux/alpha/libc.abilist
+++ b/sysdeps/unix/sysv/linux/alpha/libc.abilist
@@ -2225,6 +2225,7 @@  GLIBC_2.30 getdents64 F
 GLIBC_2.30 gettid F
 GLIBC_2.30 tgkill F
 GLIBC_2.30 twalk_r F
+GLIBC_2.32 __rseq_abi T 0x20
 GLIBC_2.4 _IO_fprintf F
 GLIBC_2.4 _IO_printf F
 GLIBC_2.4 _IO_sprintf F
diff --git a/sysdeps/unix/sysv/linux/arm/be/libc.abilist b/sysdeps/unix/sysv/linux/arm/be/libc.abilist
index b152c0e24a..753f643c3b 100644
--- a/sysdeps/unix/sysv/linux/arm/be/libc.abilist
+++ b/sysdeps/unix/sysv/linux/arm/be/libc.abilist
@@ -133,6 +133,7 @@  GLIBC_2.30 twalk_r F
 GLIBC_2.31 msgctl F
 GLIBC_2.31 semctl F
 GLIBC_2.31 shmctl F
+GLIBC_2.32 __rseq_abi T 0x20
 GLIBC_2.4 _Exit F
 GLIBC_2.4 _IO_2_1_stderr_ D 0xa0
 GLIBC_2.4 _IO_2_1_stdin_ D 0xa0
diff --git a/sysdeps/unix/sysv/linux/arm/bits/rseq.h b/sysdeps/unix/sysv/linux/arm/bits/rseq.h
new file mode 100644
index 0000000000..45a2118dbc
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/arm/bits/rseq.h
@@ -0,0 +1,83 @@ 
+/* Restartable Sequences Linux arm architecture header.
+   Copyright (C) 2019-2020 Free Software Foundation, Inc.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _SYS_RSEQ_H
+# error "Never use <bits/rseq.h> directly; include <sys/rseq.h> instead."
+#endif
+
+/*
+   RSEQ_SIG is a signature required before each abort handler code.
+
+   It is a 32-bit value that maps to actual architecture code compiled
+   into applications and libraries. It needs to be defined for each
+   architecture. When choosing this value, it needs to be taken into
+   account that generating invalid instructions may have ill effects on
+   tools like objdump, and may also have impact on the CPU speculative
+   execution efficiency in some cases.
+
+   - ARM little endian
+
+   RSEQ_SIG uses the udf A32 instruction with an uncommon immediate operand
+   value 0x5de3. This traps if user-space reaches this instruction by mistake,
+   and the uncommon operand ensures the kernel does not move the instruction
+   pointer to attacker-controlled code on rseq abort.
+
+   The instruction pattern in the A32 instruction set is:
+
+   e7f5def3    udf    #24035    ; 0x5de3
+
+   This translates to the following instruction pattern in the T16 instruction
+   set:
+
+   little endian:
+   def3        udf    #243      ; 0xf3
+   e7f5        b.n    <7f5>
+
+   - ARMv6+ big endian (BE8):
+
+   ARMv6+ -mbig-endian generates mixed endianness code vs data: little-endian
+   code and big-endian data. The data value of the signature needs to have its
+   byte order reversed to generate the trap instruction:
+
+   Data: 0xf3def5e7
+
+   Translates to this A32 instruction pattern:
+
+   e7f5def3    udf    #24035    ; 0x5de3
+
+   Translates to this T16 instruction pattern:
+
+   def3        udf    #243      ; 0xf3
+   e7f5        b.n    <7f5>
+
+   - Prior to ARMv6 big endian (BE32):
+
+   Prior to ARMv6, -mbig-endian generates big-endian code and data
+   (which match), so the endianness of the data representation of the
+   signature should not be reversed. However, the choice between BE32
+   and BE8 is done by the linker, so we cannot know whether code and
+   data endianness will be mixed before the linker is invoked. So rather
+   than try to play tricks with the linker, the rseq signature is simply
+   data (not a trap instruction) prior to ARMv6 on big endian. This is
+   why the signature is expressed as data (.word) rather than as
+   instruction (.inst) in assembler.  */
+
+#ifdef __ARMEB__
+# define RSEQ_SIG    0xf3def5e7      /* udf    #24035    ; 0x5de3 (ARMv6+) */
+#else
+# define RSEQ_SIG    0xe7f5def3      /* udf    #24035    ; 0x5de3 */
+#endif
diff --git a/sysdeps/unix/sysv/linux/arm/le/libc.abilist b/sysdeps/unix/sysv/linux/arm/le/libc.abilist
index 9371927927..97b1081fa5 100644
--- a/sysdeps/unix/sysv/linux/arm/le/libc.abilist
+++ b/sysdeps/unix/sysv/linux/arm/le/libc.abilist
@@ -130,6 +130,7 @@  GLIBC_2.30 getdents64 F
 GLIBC_2.30 gettid F
 GLIBC_2.30 tgkill F
 GLIBC_2.30 twalk_r F
+GLIBC_2.32 __rseq_abi T 0x20
 GLIBC_2.4 _Exit F
 GLIBC_2.4 _IO_2_1_stderr_ D 0xa0
 GLIBC_2.4 _IO_2_1_stdin_ D 0xa0
diff --git a/sysdeps/unix/sysv/linux/bits/rseq.h b/sysdeps/unix/sysv/linux/bits/rseq.h
new file mode 100644
index 0000000000..b8a63ed26d
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/bits/rseq.h
@@ -0,0 +1,29 @@ 
+/* Restartable Sequences architecture header. Stub version.
+   Copyright (C) 2019-2020 Free Software Foundation, Inc.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _SYS_RSEQ_H
+# error "Never use <bits/rseq.h> directly; include <sys/rseq.h> instead."
+#endif
+
+/* RSEQ_SIG is a signature required before each abort handler code.
+
+   It is a 32-bit value that maps to actual architecture code compiled
+   into applications and libraries. It needs to be defined for each
+   architecture. When choosing this value, it needs to be taken into
+   account that generating invalid instructions may have ill effects on
+   tools like objdump, and may also have impact on the CPU speculative
+   execution efficiency in some cases.  */
diff --git a/sysdeps/unix/sysv/linux/csky/libc.abilist b/sysdeps/unix/sysv/linux/csky/libc.abilist
index 9b3cee65bb..ed94510f23 100644
--- a/sysdeps/unix/sysv/linux/csky/libc.abilist
+++ b/sysdeps/unix/sysv/linux/csky/libc.abilist
@@ -2089,3 +2089,4 @@  GLIBC_2.30 getdents64 F
 GLIBC_2.30 gettid F
 GLIBC_2.30 tgkill F
 GLIBC_2.30 twalk_r F
+GLIBC_2.32 __rseq_abi T 0x20
diff --git a/sysdeps/unix/sysv/linux/hppa/libc.abilist b/sysdeps/unix/sysv/linux/hppa/libc.abilist
index df6d96fbae..3d60443799 100644
--- a/sysdeps/unix/sysv/linux/hppa/libc.abilist
+++ b/sysdeps/unix/sysv/linux/hppa/libc.abilist
@@ -2046,6 +2046,7 @@  GLIBC_2.30 getdents64 F
 GLIBC_2.30 gettid F
 GLIBC_2.30 tgkill F
 GLIBC_2.30 twalk_r F
+GLIBC_2.32 __rseq_abi T 0x20
 GLIBC_2.4 __confstr_chk F
 GLIBC_2.4 __fgets_chk F
 GLIBC_2.4 __fgets_unlocked_chk F
diff --git a/sysdeps/unix/sysv/linux/i386/libc.abilist b/sysdeps/unix/sysv/linux/i386/libc.abilist
index fcb625b6bf..15f0baa54f 100644
--- a/sysdeps/unix/sysv/linux/i386/libc.abilist
+++ b/sysdeps/unix/sysv/linux/i386/libc.abilist
@@ -2212,6 +2212,7 @@  GLIBC_2.30 getdents64 F
 GLIBC_2.30 gettid F
 GLIBC_2.30 tgkill F
 GLIBC_2.30 twalk_r F
+GLIBC_2.32 __rseq_abi T 0x20
 GLIBC_2.4 __confstr_chk F
 GLIBC_2.4 __fgets_chk F
 GLIBC_2.4 __fgets_unlocked_chk F
diff --git a/sysdeps/unix/sysv/linux/ia64/libc.abilist b/sysdeps/unix/sysv/linux/ia64/libc.abilist
index cb556c5998..8944cb2cc8 100644
--- a/sysdeps/unix/sysv/linux/ia64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/ia64/libc.abilist
@@ -2078,6 +2078,7 @@  GLIBC_2.30 getdents64 F
 GLIBC_2.30 gettid F
 GLIBC_2.30 tgkill F
 GLIBC_2.30 twalk_r F
+GLIBC_2.32 __rseq_abi T 0x20
 GLIBC_2.4 __confstr_chk F
 GLIBC_2.4 __fgets_chk F
 GLIBC_2.4 __fgets_unlocked_chk F
diff --git a/sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist b/sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist
index 5e3cdea246..ae100260b9 100644
--- a/sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist
+++ b/sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist
@@ -134,6 +134,7 @@  GLIBC_2.30 twalk_r F
 GLIBC_2.31 msgctl F
 GLIBC_2.31 semctl F
 GLIBC_2.31 shmctl F
+GLIBC_2.32 __rseq_abi T 0x20
 GLIBC_2.4 _Exit F
 GLIBC_2.4 _IO_2_1_stderr_ D 0x98
 GLIBC_2.4 _IO_2_1_stdin_ D 0x98
diff --git a/sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist b/sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist
index ea5e7a41af..c2b3f91d17 100644
--- a/sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist
+++ b/sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist
@@ -2158,6 +2158,7 @@  GLIBC_2.30 twalk_r F
 GLIBC_2.31 msgctl F
 GLIBC_2.31 semctl F
 GLIBC_2.31 shmctl F
+GLIBC_2.32 __rseq_abi T 0x20
 GLIBC_2.4 __confstr_chk F
 GLIBC_2.4 __fgets_chk F
 GLIBC_2.4 __fgets_unlocked_chk F
diff --git a/sysdeps/unix/sysv/linux/microblaze/be/libc.abilist b/sysdeps/unix/sysv/linux/microblaze/be/libc.abilist
index ac55b0acd7..ba9ebeae22 100644
--- a/sysdeps/unix/sysv/linux/microblaze/be/libc.abilist
+++ b/sysdeps/unix/sysv/linux/microblaze/be/libc.abilist
@@ -2140,3 +2140,4 @@  GLIBC_2.30 twalk_r F
 GLIBC_2.31 msgctl F
 GLIBC_2.31 semctl F
 GLIBC_2.31 shmctl F
+GLIBC_2.32 __rseq_abi T 0x20
diff --git a/sysdeps/unix/sysv/linux/microblaze/le/libc.abilist b/sysdeps/unix/sysv/linux/microblaze/le/libc.abilist
index f7ced487f7..a1f4f9a1c2 100644
--- a/sysdeps/unix/sysv/linux/microblaze/le/libc.abilist
+++ b/sysdeps/unix/sysv/linux/microblaze/le/libc.abilist
@@ -2137,3 +2137,4 @@  GLIBC_2.30 getdents64 F
 GLIBC_2.30 gettid F
 GLIBC_2.30 tgkill F
 GLIBC_2.30 twalk_r F
+GLIBC_2.32 __rseq_abi T 0x20
diff --git a/sysdeps/unix/sysv/linux/mips/bits/rseq.h b/sysdeps/unix/sysv/linux/mips/bits/rseq.h
new file mode 100644
index 0000000000..4eee59265a
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/mips/bits/rseq.h
@@ -0,0 +1,62 @@ 
+/* Restartable Sequences Linux mips architecture header.
+   Copyright (C) 2019-2020 Free Software Foundation, Inc.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _SYS_RSEQ_H
+# error "Never use <bits/rseq.h> directly; include <sys/rseq.h> instead."
+#endif
+
+/* RSEQ_SIG is a signature required before each abort handler code.
+
+   It is a 32-bit value that maps to actual architecture code compiled
+   into applications and libraries. It needs to be defined for each
+   architecture. When choosing this value, it needs to be taken into
+   account that generating invalid instructions may have ill effects on
+   tools like objdump, and may also have impact on the CPU speculative
+   execution efficiency in some cases.
+
+   RSEQ_SIG uses the break instruction. The instruction pattern is:
+
+   On MIPS:
+        0350000d        break     0x350
+
+   On nanoMIPS:
+        00100350        break     0x350
+
+   On microMIPS:
+        0000d407        break     0x350
+
+   For nanoMIPS32 and microMIPS, the instruction stream is encoded as
+   16-bit halfwords, so the signature halfwords need to be swapped
+   accordingly for little-endian.  */
+
+#if defined(__nanomips__)
+# ifdef __MIPSEL__
+#  define RSEQ_SIG	0x03500010
+# else
+#  define RSEQ_SIG	0x00100350
+# endif
+#elif defined(__mips_micromips)
+# ifdef __MIPSEL__
+#  define RSEQ_SIG	0xd4070000
+# else
+#  define RSEQ_SIG	0x0000d407
+# endif
+#elif defined(__mips__)
+# define RSEQ_SIG	0x0350000d
+#else
+/* Unknown MIPS architecture. */
+#endif
diff --git a/sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist b/sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist
index 06c2e64edd..333229c27a 100644
--- a/sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist
+++ b/sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist
@@ -2129,6 +2129,7 @@  GLIBC_2.30 getdents64 F
 GLIBC_2.30 gettid F
 GLIBC_2.30 tgkill F
 GLIBC_2.30 twalk_r F
+GLIBC_2.32 __rseq_abi T 0x20
 GLIBC_2.4 __confstr_chk F
 GLIBC_2.4 __fgets_chk F
 GLIBC_2.4 __fgets_unlocked_chk F
diff --git a/sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist b/sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist
index bdfd073b86..17772e9a40 100644
--- a/sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist
+++ b/sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist
@@ -2127,6 +2127,7 @@  GLIBC_2.30 getdents64 F
 GLIBC_2.30 gettid F
 GLIBC_2.30 tgkill F
 GLIBC_2.30 twalk_r F
+GLIBC_2.32 __rseq_abi T 0x20
 GLIBC_2.4 __confstr_chk F
 GLIBC_2.4 __fgets_chk F
 GLIBC_2.4 __fgets_unlocked_chk F
diff --git a/sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist b/sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist
index 3d61d4974a..497203cd44 100644
--- a/sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist
+++ b/sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist
@@ -2135,6 +2135,7 @@  GLIBC_2.30 getdents64 F
 GLIBC_2.30 gettid F
 GLIBC_2.30 tgkill F
 GLIBC_2.30 twalk_r F
+GLIBC_2.32 __rseq_abi T 0x20
 GLIBC_2.4 __confstr_chk F
 GLIBC_2.4 __fgets_chk F
 GLIBC_2.4 __fgets_unlocked_chk F
diff --git a/sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist b/sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist
index 675acca5db..f5f48ce5c6 100644
--- a/sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist
@@ -2129,6 +2129,7 @@  GLIBC_2.30 getdents64 F
 GLIBC_2.30 gettid F
 GLIBC_2.30 tgkill F
 GLIBC_2.30 twalk_r F
+GLIBC_2.32 __rseq_abi T 0x20
 GLIBC_2.4 __confstr_chk F
 GLIBC_2.4 __fgets_chk F
 GLIBC_2.4 __fgets_unlocked_chk F
diff --git a/sysdeps/unix/sysv/linux/nios2/libc.abilist b/sysdeps/unix/sysv/linux/nios2/libc.abilist
index 7fec0c9670..4152f4f734 100644
--- a/sysdeps/unix/sysv/linux/nios2/libc.abilist
+++ b/sysdeps/unix/sysv/linux/nios2/libc.abilist
@@ -2178,3 +2178,4 @@  GLIBC_2.30 getdents64 F
 GLIBC_2.30 gettid F
 GLIBC_2.30 tgkill F
 GLIBC_2.30 twalk_r F
+GLIBC_2.32 __rseq_abi T 0x20
diff --git a/sysdeps/unix/sysv/linux/powerpc/bits/rseq.h b/sysdeps/unix/sysv/linux/powerpc/bits/rseq.h
new file mode 100644
index 0000000000..9d10000a6e
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/powerpc/bits/rseq.h
@@ -0,0 +1,37 @@ 
+/* Restartable Sequences Linux powerpc architecture header.
+   Copyright (C) 2019-2020 Free Software Foundation, Inc.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _SYS_RSEQ_H
+# error "Never use <bits/rseq.h> directly; include <sys/rseq.h> instead."
+#endif
+
+/* RSEQ_SIG is a signature required before each abort handler code.
+
+   It is a 32-bit value that maps to actual architecture code compiled
+   into applications and libraries. It needs to be defined for each
+   architecture. When choosing this value, it needs to be taken into
+   account that generating invalid instructions may have ill effects on
+   tools like objdump, and may also have impact on the CPU speculative
+   execution efficiency in some cases.
+
+   RSEQ_SIG uses the following trap instruction:
+
+   powerpc-be:    0f e5 00 0b           twui   r5,11
+   powerpc64-le:  0b 00 e5 0f           twui   r5,11
+   powerpc64-be:  0f e5 00 0b           twui   r5,11  */
+
+#define RSEQ_SIG	0x0fe5000b
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist
index 1e8ff6f83e..210d1795c1 100644
--- a/sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist
@@ -2185,6 +2185,7 @@  GLIBC_2.30 getdents64 F
 GLIBC_2.30 gettid F
 GLIBC_2.30 tgkill F
 GLIBC_2.30 twalk_r F
+GLIBC_2.32 __rseq_abi T 0x20
 GLIBC_2.4 _IO_fprintf F
 GLIBC_2.4 _IO_printf F
 GLIBC_2.4 _IO_sprintf F
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist
index b5a0751d90..6a3cde5fb6 100644
--- a/sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist
@@ -2218,6 +2218,7 @@  GLIBC_2.30 getdents64 F
 GLIBC_2.30 gettid F
 GLIBC_2.30 tgkill F
 GLIBC_2.30 twalk_r F
+GLIBC_2.32 __rseq_abi T 0x20
 GLIBC_2.4 _IO_fprintf F
 GLIBC_2.4 _IO_printf F
 GLIBC_2.4 _IO_sprintf F
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc64/be/libc.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc64/be/libc.abilist
index 0c86217fc6..4f243acb2a 100644
--- a/sysdeps/unix/sysv/linux/powerpc/powerpc64/be/libc.abilist
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc64/be/libc.abilist
@@ -2048,6 +2048,7 @@  GLIBC_2.30 getdents64 F
 GLIBC_2.30 gettid F
 GLIBC_2.30 tgkill F
 GLIBC_2.30 twalk_r F
+GLIBC_2.32 __rseq_abi T 0x20
 GLIBC_2.4 _IO_fprintf F
 GLIBC_2.4 _IO_printf F
 GLIBC_2.4 _IO_sprintf F
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc64/le/libc.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc64/le/libc.abilist
index 2229a1dcc0..99d836bc11 100644
--- a/sysdeps/unix/sysv/linux/powerpc/powerpc64/le/libc.abilist
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc64/le/libc.abilist
@@ -2247,3 +2247,4 @@  GLIBC_2.30 getdents64 F
 GLIBC_2.30 gettid F
 GLIBC_2.30 tgkill F
 GLIBC_2.30 twalk_r F
+GLIBC_2.32 __rseq_abi T 0x20
diff --git a/sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist b/sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist
index 31010e6cf7..7694392cd5 100644
--- a/sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist
@@ -2107,3 +2107,4 @@  GLIBC_2.30 getdents64 F
 GLIBC_2.30 gettid F
 GLIBC_2.30 tgkill F
 GLIBC_2.30 twalk_r F
+GLIBC_2.32 __rseq_abi T 0x20
diff --git a/sysdeps/unix/sysv/linux/rseq-internal.h b/sysdeps/unix/sysv/linux/rseq-internal.h
new file mode 100644
index 0000000000..5f7f02f1ec
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/rseq-internal.h
@@ -0,0 +1,73 @@ 
+/* Restartable Sequences internal API. Linux implementation.
+   Copyright (C) 2019-2020 Free Software Foundation, Inc.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef RSEQ_INTERNAL_H
+#define RSEQ_INTERNAL_H
+
+#include <sysdep.h>
+#include <errno.h>
+#include <kernel-features.h>
+#include <sys/rseq.h>
+
+#ifdef RSEQ_SIG
+
+static inline int
+rseq_register_current_thread (void)
+{
+  int rc, ret = 0;
+
+  if (__rseq_abi.cpu_id == RSEQ_CPU_ID_REGISTRATION_FAILED)
+    return -1;
+  rc = INTERNAL_SYSCALL_CALL (rseq, &__rseq_abi, sizeof (struct rseq),
+                              0, RSEQ_SIG);
+  if (!rc)
+    goto end;
+  if (INTERNAL_SYSCALL_ERRNO (rc) != EBUSY)
+    __rseq_abi.cpu_id = RSEQ_CPU_ID_REGISTRATION_FAILED;
+  ret = -1;
+end:
+  return ret;
+}
+
+static inline int
+rseq_unregister_current_thread (void)
+{
+  int rc, ret = 0;
+
+  rc = INTERNAL_SYSCALL_CALL (rseq, &__rseq_abi, sizeof (struct rseq),
+                              RSEQ_FLAG_UNREGISTER, RSEQ_SIG);
+  if (!rc)
+    goto end;
+  ret = -1;
+end:
+  return ret;
+}
+#else
+static inline int
+rseq_register_current_thread (void)
+{
+  return -1;
+}
+
+static inline int
+rseq_unregister_current_thread (void)
+{
+  return -1;
+}
+#endif
+
+#endif /* rseq-internal.h */
diff --git a/sysdeps/unix/sysv/linux/rseq-sym.c b/sysdeps/unix/sysv/linux/rseq-sym.c
new file mode 100644
index 0000000000..0e33fab278
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/rseq-sym.c
@@ -0,0 +1,25 @@ 
+/* Restartable Sequences exported symbols. Linux Implementation.
+   Copyright (C) 2019-2020 Free Software Foundation, Inc.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <sys/syscall.h>
+#include <stdint.h>
+#include <kernel-features.h>
+#include <sys/rseq.h>
+
+__thread struct rseq __rseq_abi = {
+  .cpu_id = RSEQ_CPU_ID_UNINITIALIZED,
+};
diff --git a/sysdeps/unix/sysv/linux/s390/bits/rseq.h b/sysdeps/unix/sysv/linux/s390/bits/rseq.h
new file mode 100644
index 0000000000..c25ee67ee7
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/s390/bits/rseq.h
@@ -0,0 +1,37 @@ 
+/* Restartable Sequences Linux s390 architecture header.
+   Copyright (C) 2019-2020 Free Software Foundation, Inc.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _SYS_RSEQ_H
+# error "Never use <bits/rseq.h> directly; include <sys/rseq.h> instead."
+#endif
+
+/* RSEQ_SIG is a signature required before each abort handler code.
+
+   It is a 32-bit value that maps to actual architecture code compiled
+   into applications and libraries. It needs to be defined for each
+   architecture. When choosing this value, it needs to be taken into
+   account that generating invalid instructions may have ill effects on
+   tools like objdump, and may also have impact on the CPU speculative
+   execution efficiency in some cases.
+
+   RSEQ_SIG uses the trap4 instruction. As Linux does not make use of the
+   access-register mode nor the linkage stack, this instruction will always
+   cause a special-operation exception (the trap-enabled bit in the DUCT
+   is and will stay 0). The instruction pattern is
+       b2 ff 0f ff        trap4   4095(%r0)  */
+
+#define RSEQ_SIG	0xB2FF0FFF
diff --git a/sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist b/sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist
index 4feca641b0..eb49d11cd1 100644
--- a/sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist
+++ b/sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist
@@ -2183,6 +2183,7 @@  GLIBC_2.30 twalk_r F
 GLIBC_2.31 msgctl F
 GLIBC_2.31 semctl F
 GLIBC_2.31 shmctl F
+GLIBC_2.32 __rseq_abi T 0x20
 GLIBC_2.4 _IO_fprintf F
 GLIBC_2.4 _IO_printf F
 GLIBC_2.4 _IO_sprintf F
diff --git a/sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist b/sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist
index efe588a072..442f7e33a8 100644
--- a/sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist
@@ -2084,6 +2084,7 @@  GLIBC_2.30 getdents64 F
 GLIBC_2.30 gettid F
 GLIBC_2.30 tgkill F
 GLIBC_2.30 twalk_r F
+GLIBC_2.32 __rseq_abi T 0x20
 GLIBC_2.4 _IO_fprintf F
 GLIBC_2.4 _IO_printf F
 GLIBC_2.4 _IO_sprintf F
diff --git a/sysdeps/unix/sysv/linux/sh/be/libc.abilist b/sysdeps/unix/sysv/linux/sh/be/libc.abilist
index 6bfc2b7439..f47ec14735 100644
--- a/sysdeps/unix/sysv/linux/sh/be/libc.abilist
+++ b/sysdeps/unix/sysv/linux/sh/be/libc.abilist
@@ -2053,6 +2053,7 @@  GLIBC_2.30 twalk_r F
 GLIBC_2.31 msgctl F
 GLIBC_2.31 semctl F
 GLIBC_2.31 shmctl F
+GLIBC_2.32 __rseq_abi T 0x20
 GLIBC_2.4 __confstr_chk F
 GLIBC_2.4 __fgets_chk F
 GLIBC_2.4 __fgets_unlocked_chk F
diff --git a/sysdeps/unix/sysv/linux/sh/le/libc.abilist b/sysdeps/unix/sysv/linux/sh/le/libc.abilist
index 4b057bf4a2..8b4557d325 100644
--- a/sysdeps/unix/sysv/linux/sh/le/libc.abilist
+++ b/sysdeps/unix/sysv/linux/sh/le/libc.abilist
@@ -2050,6 +2050,7 @@  GLIBC_2.30 getdents64 F
 GLIBC_2.30 gettid F
 GLIBC_2.30 tgkill F
 GLIBC_2.30 twalk_r F
+GLIBC_2.32 __rseq_abi T 0x20
 GLIBC_2.4 __confstr_chk F
 GLIBC_2.4 __fgets_chk F
 GLIBC_2.4 __fgets_unlocked_chk F
diff --git a/sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist b/sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist
index 49cd597fd6..52f951c4fe 100644
--- a/sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist
+++ b/sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist
@@ -2174,6 +2174,7 @@  GLIBC_2.30 getdents64 F
 GLIBC_2.30 gettid F
 GLIBC_2.30 tgkill F
 GLIBC_2.30 twalk_r F
+GLIBC_2.32 __rseq_abi T 0x20
 GLIBC_2.4 _IO_fprintf F
 GLIBC_2.4 _IO_printf F
 GLIBC_2.4 _IO_sprintf F
diff --git a/sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist b/sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist
index 95e68e0ba1..2b59f52de0 100644
--- a/sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist
@@ -2101,6 +2101,7 @@  GLIBC_2.30 getdents64 F
 GLIBC_2.30 gettid F
 GLIBC_2.30 tgkill F
 GLIBC_2.30 twalk_r F
+GLIBC_2.32 __rseq_abi T 0x20
 GLIBC_2.4 __confstr_chk F
 GLIBC_2.4 __fgets_chk F
 GLIBC_2.4 __fgets_unlocked_chk F
diff --git a/sysdeps/unix/sysv/linux/sys/rseq.h b/sysdeps/unix/sysv/linux/sys/rseq.h
new file mode 100644
index 0000000000..503dce4cac
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/sys/rseq.h
@@ -0,0 +1,186 @@ 
+/* Restartable Sequences exported symbols. Linux header.
+   Copyright (C) 2019-2020 Free Software Foundation, Inc.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _SYS_RSEQ_H
+#define _SYS_RSEQ_H	1
+
+/* Architecture-specific rseq signature.  */
+#include <bits/rseq.h>
+#include <stdint.h>
+#include <sys/cdefs.h>
+
+#ifdef __has_include
+# if __has_include ("linux/rseq.h")
+#   define __GLIBC_HAVE_KERNEL_RSEQ
+# endif
+#else
+# include <linux/version.h>
+# if LINUX_VERSION_CODE >= KERNEL_VERSION (4, 18, 0)
+#   define __GLIBC_HAVE_KERNEL_RSEQ
+# endif
+#endif
+
+#ifdef __GLIBC_HAVE_KERNEL_RSEQ
+/* We use the structure declarations from the kernel headers.  */
+# include <linux/rseq.h>
+#else
+/* We use a copy of the include/uapi/linux/rseq.h kernel header.  */
+
+#include <asm/byteorder.h>
+
+enum rseq_cpu_id_state
+  {
+    RSEQ_CPU_ID_UNINITIALIZED = -1,
+    RSEQ_CPU_ID_REGISTRATION_FAILED = -2,
+  };
+
+enum rseq_flags
+  {
+    RSEQ_FLAG_UNREGISTER = (1 << 0),
+  };
+
+enum rseq_cs_flags_bit
+  {
+    RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT_BIT = 0,
+    RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL_BIT = 1,
+    RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE_BIT = 2,
+  };
+
+enum rseq_cs_flags
+  {
+    RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT =
+      (1U << RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT_BIT),
+    RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL =
+      (1U << RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL_BIT),
+    RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE =
+      (1U << RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE_BIT),
+  };
+
+/* struct rseq_cs is aligned on 4 * 8 bytes to ensure it is always
+   contained within a single cache-line. It is usually declared as
+   link-time constant data.  */
+struct rseq_cs
+  {
+    /* Version of this structure.  */
+    uint32_t version;
+    /* enum rseq_cs_flags.  */
+    uint32_t flags;
+    uint64_t start_ip;
+    /* Offset from start_ip.  */
+    uint64_t post_commit_offset;
+    uint64_t abort_ip;
+} __attribute__((aligned(4 * sizeof(uint64_t))));
+
+/* struct rseq is aligned on 4 * 8 bytes to ensure it is always
+   contained within a single cache-line.
+
+   A single struct rseq per thread is allowed.  */
+struct rseq
+  {
+    /* Restartable sequences cpu_id_start field. Updated by the
+       kernel. Read by user-space with single-copy atomicity
+       semantics. This field should only be read by the thread which
+       registered this data structure. Aligned on 32-bit. Always
+       contains a value in the range of possible CPUs, although the
+       value may not be the actual current CPU (e.g. if rseq is not
+       initialized). This CPU number value should always be compared
+       against the value of the cpu_id field before performing a rseq
+       commit or returning a value read from a data structure indexed
+       using the cpu_id_start value.  */
+    uint32_t cpu_id_start;
+    /* Restartable sequences cpu_id field. Updated by the kernel.
+       Read by user-space with single-copy atomicity semantics. This
+       field should only be read by the thread which registered this
+       data structure. Aligned on 32-bit. Values
+       RSEQ_CPU_ID_UNINITIALIZED and RSEQ_CPU_ID_REGISTRATION_FAILED
+       have a special semantic: the former means "rseq uninitialized",
+       and the latter means "rseq initialization failed". This value is
+       meant to be read within rseq critical sections and compared
+       with the cpu_id_start value previously read, before performing
+       the commit instruction, or read and compared with the
+       cpu_id_start value before returning a value loaded from a data
+       structure indexed using the cpu_id_start value.  */
+    uint32_t cpu_id;
+    /* Restartable sequences rseq_cs field.
+
+       Contains NULL when no critical section is active for the current
+       thread, or holds a pointer to the currently active struct rseq_cs.
+
+       Updated by user-space, which sets the address of the currently
+       active rseq_cs at the beginning of the assembly instruction sequence
+       block, and set to NULL by the kernel when it restarts an assembly
+       instruction sequence block, as well as when the kernel detects that
+       it is preempting or delivering a signal outside of the range
+       targeted by the rseq_cs. Also needs to be set to NULL by user-space
+       before reclaiming memory that contains the targeted struct rseq_cs.
+
+       Read and set by the kernel. Set by user-space with single-copy
+       atomicity semantics. This field should only be updated by the
+       thread which registered this data structure. Aligned on 64-bit.  */
+    union {
+      uint64_t ptr64;
+#ifdef __LP64__
+      uint64_t ptr;
+#else
+      struct {
+#if (defined(__BYTE_ORDER) && (__BYTE_ORDER == __BIG_ENDIAN)) || defined(__BIG_ENDIAN)
+        uint32_t padding; /* Initialized to zero.  */
+        uint32_t ptr32;
+#else /* LITTLE */
+        uint32_t ptr32;
+        uint32_t padding; /* Initialized to zero.  */
+#endif /* ENDIAN */
+      } ptr;
+#endif
+    } rseq_cs;
+
+    /* Restartable sequences flags field.
+
+       This field should only be updated by the thread which
+       registered this data structure. Read by the kernel.
+       Mainly used for single-stepping through rseq critical sections
+       with debuggers.
+
+       - RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT
+           Inhibit instruction sequence block restart on preemption
+           for this thread.
+       - RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL
+           Inhibit instruction sequence block restart on signal
+           delivery for this thread.
+       - RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE
+           Inhibit instruction sequence block restart on migration for
+           this thread.  */
+    uint32_t flags;
+  } __attribute__((aligned(4 * sizeof(uint64_t))));
+
+#endif
+
+/* Ensure the compiler supports __attribute__ ((aligned)).  */
+_Static_assert (__alignof__ (struct rseq_cs) >= 4 * sizeof(uint64_t),
+                "alignment");
+_Static_assert (__alignof__ (struct rseq) >= 4 * sizeof(uint64_t),
+                "alignment");
+
+/* Allocations of struct rseq and struct rseq_cs on the heap need to
+   be aligned on 32 bytes. Therefore, use of malloc is discouraged
+   because it does not guarantee 32-byte alignment. posix_memalign
+   should be used instead.  */
+
+extern __thread struct rseq __rseq_abi
+__attribute__ ((tls_model ("initial-exec")));
+
+#endif /* sys/rseq.h */
diff --git a/sysdeps/unix/sysv/linux/x86/bits/rseq.h b/sysdeps/unix/sysv/linux/x86/bits/rseq.h
new file mode 100644
index 0000000000..75f52d9788
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/x86/bits/rseq.h
@@ -0,0 +1,30 @@ 
+/* Restartable Sequences Linux x86 architecture header.
+   Copyright (C) 2019-2020 Free Software Foundation, Inc.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _SYS_RSEQ_H
+# error "Never use <bits/rseq.h> directly; include <sys/rseq.h> instead."
+#endif
+
+/* RSEQ_SIG is a signature required before each abort handler code.
+
+   RSEQ_SIG is used with the following reserved undefined instructions, which
+   trap in user-space:
+
+   x86-32:    0f b9 3d 53 30 05 53      ud1    0x53053053,%edi
+   x86-64:    0f b9 3d 53 30 05 53      ud1    0x53053053(%rip),%edi  */
+
+#define RSEQ_SIG	0x53053053
diff --git a/sysdeps/unix/sysv/linux/x86_64/64/libc.abilist b/sysdeps/unix/sysv/linux/x86_64/64/libc.abilist
index 1f2dbd1451..dfa3fe85ef 100644
--- a/sysdeps/unix/sysv/linux/x86_64/64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/x86_64/64/libc.abilist
@@ -2059,6 +2059,7 @@  GLIBC_2.30 getdents64 F
 GLIBC_2.30 gettid F
 GLIBC_2.30 tgkill F
 GLIBC_2.30 twalk_r F
+GLIBC_2.32 __rseq_abi T 0x20
 GLIBC_2.4 __confstr_chk F
 GLIBC_2.4 __fgets_chk F
 GLIBC_2.4 __fgets_unlocked_chk F
diff --git a/sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist b/sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist
index 59da85a5d8..6ff082ee4e 100644
--- a/sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist
+++ b/sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist
@@ -2158,3 +2158,4 @@  GLIBC_2.30 getdents64 F
 GLIBC_2.30 gettid F
 GLIBC_2.30 tgkill F
 GLIBC_2.30 twalk_r F
+GLIBC_2.32 __rseq_abi T 0x20