diff mbox series

x86: Use pad in pthread_unwind_buf to preserve shadow stack register

Message ID CAMe9rOpj6nYdP33emmepZ+KWFG2JDVNsHn1bve3oo5WQyFmduA@mail.gmail.com
State New
Headers show
Series x86: Use pad in pthread_unwind_buf to preserve shadow stack register | expand

Commit Message

H.J. Lu March 30, 2018, 5:41 p.m. UTC
On Thu, Mar 29, 2018 at 1:20 PM, Florian Weimer <fw@deneb.enyo.de> wrote:
> * H. J. Lu:
>
>> On Thu, Mar 29, 2018 at 1:15 PM, Florian Weimer <fw@deneb.enyo.de> wrote:
>>> * H. J. Lu:
>>>
>>>> You need to make a choice.  You either don't introduce a new symbol
>>>> version or don't save shadow stack for thread cancellation.  You
>>>> can't have both.
>>>
>>> I don't understand.  We have room to save the shadow stack pointer in
>>> the existing struct.
>>
>> No, we don't have room in struct pthread_unwind_buf:
>>
>> Note: There is an unused pointer space in pthread_unwind_buf_data.  But
>> it isn't suitable for saving and restoring shadow stack register since
>> x32 is a 64-bit process with 32-bit software pointer and kernel may
>> place x32 shadow stack above 4GB.  We need to save and restore 64-bit
>> shadow stack register for x32.
>
> We have for void * fields.  They are subsequently overwritten by
> __pthread_register_cancel.  But __sigsetjmp can write to them first
> without causing any harm.  We just need a private __longjmp_cancel
> that doesn't restore the shadow stack pointer.

Here is the patch which does that.   Any comments?

Thanks.

Comments

Carlos O'Donell April 6, 2018, 4:46 a.m. UTC | #1
On 03/30/2018 12:41 PM, H.J. Lu wrote:
> On Thu, Mar 29, 2018 at 1:20 PM, Florian Weimer <fw@deneb.enyo.de> wrote:
>> * H. J. Lu:
>>
>>> On Thu, Mar 29, 2018 at 1:15 PM, Florian Weimer <fw@deneb.enyo.de> wrote:
>>>> * H. J. Lu:
>>>>
>>>>> You need to make a choice.  You either don't introduce a new symbol
>>>>> version or don't save shadow stack for thread cancellation.  You
>>>>> can't have both.
>>>> I don't understand.  We have room to save the shadow stack pointer in
>>>> the existing struct.
>>> No, we don't have room in struct pthread_unwind_buf:
>>>
>>> Note: There is an unused pointer space in pthread_unwind_buf_data.  But
>>> it isn't suitable for saving and restoring shadow stack register since
>>> x32 is a 64-bit process with 32-bit software pointer and kernel may
>>> place x32 shadow stack above 4GB.  We need to save and restore 64-bit
>>> shadow stack register for x32.
>> We have for void * fields.  They are subsequently overwritten by
>> __pthread_register_cancel.  But __sigsetjmp can write to them first
>> without causing any harm.  We just need a private __longjmp_cancel
>> that doesn't restore the shadow stack pointer.
> Here is the patch which does that.   Any comments?
 
OK, so I have reviewed https://github.com/hjl-tools/glibc/ hjl/cet/master,
please confirm that this is the correct branch of the required implementation
for CET. It really helps to review the rest of the patch set, you should be
preparing this as a patch set instead of having it reviewed one-at-a-time.
This issue was already raised in the thread with Zack.

Architecture:

* We avoid a "flag day" with feature_1 TCB flag to switch to a new ABI,
  which we have discussed is a fragile process which should be avoided if
  a supportable alternative solution exists.

* We avoid versioned symbols, this makes CET backportable, and this has a
  bigger benefit for long-term stable distributions.

* A key design problem has been that cancellation jump buffers within glibc
  are truncated to optimize on-stack size, and this means that setjmp could
  write beyond the structure because setjmp now tries to save the shadowstack
  pointer into space that the cancellation jump buffer did not allocate.
  For the record this optimization seems premature and I'm sad we did it, this
  is a lesson we should learn from.

* We have all agreed to the following concepts:

  * The cancellation process, in particular the unwinder, never returns from
    any of the functions we call, it just keeps calling into the unwinder to
    jump to the next unwound cancellation function all the way to the thread
    start routine. Therefore because we never return from one of these functions
    we never need to restore the shadow stack, and consequently wherever it is
    stored in the cancellation jump buffer can be overwritten if we need the
    space (it's a dead store).

    * The corollary to this is that function calls made from cancellation handlers
      will continue to advance the shadowstack from the deepest point at which
      cancellation is initiated from. This means that the depth of the shadowstack
      doesn't match the depth of the real stack while we are unwinding. I don't
      know if this will have consequences on analysis tooling or not, or debug
      tools during unwinding. It's a fairly advanced situation and corner case,
      and restoring the shadowstack is not useful becuase we don't need it and
      simplifies the implementation.

  * The cancellation jump buffer has private data used for chaining the cancel
    jump buffers together such that the custom unwinder can follow them and
    call them in sequence. This space constitutes 4 void *'s which is space
    that setjmp can write to, because we will just overwrite it when we register
    the cancel handler.

  * If the new shadowstack-enabled setjmp stores the shadowstack pointer into
    the space taken by the 4 void*'s then we won't overflow the stack, and we
    don't need to change the layout of the cancellation jump buffer. The 4 void*'s
    are sufficient, even for x32 to write a 64-bit shadow stack address.

* After fixing the cancellation jump buffers the following work needs to be reviewed:

  * Add feature_1 in tcb head to track CET status and make it easily available
    to runtime for checking.

  * Save and restore shadowstack in setjmp/longjmp.

  * Add CET support to ld.so et. al. and track runtime status.

  * Adjust vfork for shadow stack usage.

  * Add ENDBR or NOTRACK where required in assembly.

  * CET and makecontext incompatible.
    - Probably need to discuss which default is appropriate.
    - Should the user get CET automatically disabled in makecontext() et. al. silently?
    - Should your current solution, which is to error out during the build, and require
      flag changes, be the default? This forces the user to review the security for their
      application.

  * prctl for CET.

* The work to review after this patch appears to be less contentious in terms of
  the kinds of changes that are required. Most of the changes are internal details
  of enabling CET and not ABI details, with the exception of the possible pain we
  might cause with makecontext() being unsupported and what default position to take
  there.

Design:

* Overall the implementation looks exactly how I might expect it to look, but some
  of the math that places the shadowstack pointer appears to need either commenting
  or fixing because I don't understand it. You need to make it easy for me to see
  that we have placed the shadowstack pointer into the 4 pad words.

Details:

* One comment needs filling out a bit more, noted below.

> 0001-x86-Use-pad-in-pthread_unwind_buf-to-preserve-shadow.patch
> 
> 
> From f222537447f5ec879427f318b7c0396362b7453a Mon Sep 17 00:00:00 2001
> From: "H.J. Lu" <hjl.tools@gmail.com>
> Date: Sat, 24 Feb 2018 17:23:54 -0800
> Subject: [PATCH] x86: Use pad in pthread_unwind_buf to preserve shadow stack
>  register
> 
> The pad array in struct pthread_unwind_buf is used by setjmp to save
> shadow stack register.  We assert that size of struct pthread_unwind_buf
> is no less than offset of shadow stack pointer + shadow stack pointer
> size.
> 

OK.

> Since functions, like LIBC_START_MAIN, START_THREAD_DEFN as well as
> these with thread cancellation, call setjmp, but never return after
> __libc_unwind_longjmp, __libc_unwind_longjmp, which is defined as
> __libc_longjmp on x86, doesn't need to restore shadow stack register.

OK.

> __libc_longjmp, which is a private interface for thread cancellation
> implementation in libpthread, is changed to call __longjmp_cancel,
> instead of __longjmp.  __longjmp_cancel is a new internal function
> in libc, which is similar to __longjmp, but doesn't restore shadow
> stack register.

OK. Good. I like the use of a __longjmp_cancel name to call out what's
going on in the API (clear semantics).

> 
> The compatibility longjmp and siglongjmp in libpthread.so are changed
> to call __libc_siglongjmp, instead of __libc_longjmp, so that they will
> restore shadow stack register.

OK.

> 
> Tested with build-many-glibcs.py.
> 
> 	* nptl/pthread_create.c (START_THREAD_DEFN): Clear previous
> 	handlers after setjmp.
> 	* setjmp/longjmp.c (__libc_longjmp): Don't define alias if
> 	defined.
> 	* sysdeps/unix/sysv/linux/x86/setjmpP.h (_JUMP_BUF_SIGSET_NSIG):
> 	Changed to 97.
> 	* sysdeps/x86/Makefile (sysdep_routines): Add __longjmp_cancel.
> 	* sysdeps/x86/__longjmp_cancel.S: New file.
> 	* sysdeps/x86/longjmp.c: Likewise.
> 	* sysdeps/x86/nptl/pt-longjmp.c: Likewise.

This looks much better.

> ---
>  nptl/pthread_create.c                 |  9 ++--
>  setjmp/longjmp.c                      |  2 +
>  sysdeps/unix/sysv/linux/x86/setjmpP.h |  4 +-
>  sysdeps/x86/Makefile                  |  4 ++
>  sysdeps/x86/__longjmp_cancel.S        | 20 ++++++++
>  sysdeps/x86/longjmp.c                 | 45 ++++++++++++++++
>  sysdeps/x86/nptl/pt-longjmp.c         | 97 +++++++++++++++++++++++++++++++++++
>  7 files changed, 176 insertions(+), 5 deletions(-)
>  create mode 100644 sysdeps/x86/__longjmp_cancel.S
>  create mode 100644 sysdeps/x86/longjmp.c
>  create mode 100644 sysdeps/x86/nptl/pt-longjmp.c
> 
> diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c
> index caaf07c134..1c5b3780c6 100644
> --- a/nptl/pthread_create.c
> +++ b/nptl/pthread_create.c
> @@ -427,12 +427,15 @@ START_THREAD_DEFN
>       compilers without that support we do use setjmp.  */
>    struct pthread_unwind_buf unwind_buf;
>  
> -  /* No previous handlers.  */
> +  int not_first_call;
> +  not_first_call = setjmp ((struct __jmp_buf_tag *) unwind_buf.cancel_jmp_buf);
> +
> +  /* No previous handlers.  NB: This must be done after setjmp since
> +     the same space may be used by setjmp to store extra data which
> +     should never be used by __libc_unwind_longjmp.  */

Suggest:
~~~
No previous handlers.  NB: This must be done after setjmp since
the private space in the unwind jump buffer may overlap space
used by setjmp to store extra architecture-specific information
which is never be used by the cancellation-specific 
__libc_unwind_longjmp.

The private space is allowed to overlap because the unwinder never
has to return through any of the jumped-to call frames, and thus
only a minimum amount of saved data need be stored, and for example,
need not include the process signal mask information. This is all
an optimization to reduce stack usage when pushing cancellation
handlers.
~~~

>    unwind_buf.priv.data.prev = NULL;
>    unwind_buf.priv.data.cleanup = NULL;
>  
> -  int not_first_call;
> -  not_first_call = setjmp ((struct __jmp_buf_tag *) unwind_buf.cancel_jmp_buf);

OK.

>    if (__glibc_likely (! not_first_call))
>      {
>        /* Store the new cleanup handler info.  */
> diff --git a/setjmp/longjmp.c b/setjmp/longjmp.c
> index a2a7065a85..453889e103 100644
> --- a/setjmp/longjmp.c
> +++ b/setjmp/longjmp.c
> @@ -40,9 +40,11 @@ __libc_siglongjmp (sigjmp_buf env, int val)
>  }
>  
>  #ifndef __libc_siglongjmp
> +# ifndef __libc_longjmp
>  /* __libc_longjmp is a private interface for cancellation implementation
>     in libpthread.  */
>  strong_alias (__libc_siglongjmp, __libc_longjmp)
> +# endif

OK.

>  weak_alias (__libc_siglongjmp, _longjmp)
>  weak_alias (__libc_siglongjmp, longjmp)
>  weak_alias (__libc_siglongjmp, siglongjmp)
> diff --git a/sysdeps/unix/sysv/linux/x86/setjmpP.h b/sysdeps/unix/sysv/linux/x86/setjmpP.h
> index c0ed767a0d..90a6bbcf32 100644
> --- a/sysdeps/unix/sysv/linux/x86/setjmpP.h
> +++ b/sysdeps/unix/sysv/linux/x86/setjmpP.h
> @@ -22,8 +22,8 @@
>  #include <bits/types/__sigset_t.h>
>  
>  /* The biggest signal number + 1.  As of kernel 4.14, x86 _NSIG is 64.
> -   Define it to 513 to leave some rooms for future use.  */
> -#define _JUMP_BUF_SIGSET_NSIG	513
> +   Define it to 97 to leave some rooms for future use.  */

OK.

> +#define _JUMP_BUF_SIGSET_NSIG	97

Please provide proof in the way of a comment or rewriting this constant
to show that it places the shadow stack pointer on both x86_64 and x32
into the range of the private pad.

Also, from commit  f33632ccd1dec3217583fcfdd965afb62954203c,
where did this math come from?

((_JUMP_BUF_SIGSET_NSIG - 1 + 7) / (8 * sizeof (unsigned long int)))

Why the +7?

>  /* Number of longs to hold all signals.  */
>  #define _JUMP_BUF_SIGSET_NWORDS \
>    ((_JUMP_BUF_SIGSET_NSIG - 1 + 7) / (8 * sizeof (unsigned long int)))
> diff --git a/sysdeps/x86/Makefile b/sysdeps/x86/Makefile
> index 0d0326c21a..d25d6f0ae4 100644
> --- a/sysdeps/x86/Makefile
> +++ b/sysdeps/x86/Makefile
> @@ -8,3 +8,7 @@ sysdep-dl-routines += dl-get-cpu-features
>  tests += tst-get-cpu-features
>  tests-static += tst-get-cpu-features-static
>  endif
> +
> +ifeq ($(subdir),setjmp)
> +sysdep_routines += __longjmp_cancel
> +endif

OK.

> diff --git a/sysdeps/x86/__longjmp_cancel.S b/sysdeps/x86/__longjmp_cancel.S
> new file mode 100644
> index 0000000000..b57dbfa376
> --- /dev/null
> +++ b/sysdeps/x86/__longjmp_cancel.S
> @@ -0,0 +1,20 @@
> +/* __longjmp_cancel for x86.
> +   Copyright (C) 2018 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <http://www.gnu.org/licenses/>.  */
> +
> +#define __longjmp __longjmp_cancel
> +#include <__longjmp.S>

OK.

> diff --git a/sysdeps/x86/longjmp.c b/sysdeps/x86/longjmp.c
> new file mode 100644
> index 0000000000..a53f31e1dd
> --- /dev/null
> +++ b/sysdeps/x86/longjmp.c
> @@ -0,0 +1,45 @@
> +/* __libc_siglongjmp for x86.
> +   Copyright (C) 2018 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <http://www.gnu.org/licenses/>.  */
> +
> +#define __libc_longjmp __redirect___libc_longjmp
> +#include <setjmp/longjmp.c>
> +#undef __libc_longjmp
> +
> +extern void __longjmp_cancel (__jmp_buf __env, int __val)
> +     __attribute__ ((__noreturn__)) attribute_hidden;
> +
> +/* Since __libc_longjmp is a private interface for cancellation
> +   implementation in libpthread, there is no need to restore shadow
> +   stack register.  */
> +
> +void
> +__libc_longjmp (sigjmp_buf env, int val)
> +{
> +  /* Perform any cleanups needed by the frames being unwound.  */
> +  _longjmp_unwind (env, val);

OK.

> +
> +  if (env[0].__mask_was_saved)
> +    /* Restore the saved signal mask.  */
> +    (void) __sigprocmask (SIG_SETMASK,
> +			  (sigset_t *) &env[0].__saved_mask,
> +			  (sigset_t *) NULL);

OK.

> +
> +  /* Call the machine-dependent function to restore machine state
> +     without shadow stack.  */
> +  __longjmp_cancel (env[0].__jmpbuf, val ?: 1);

OK.

> +}
> diff --git a/sysdeps/x86/nptl/pt-longjmp.c b/sysdeps/x86/nptl/pt-longjmp.c
> new file mode 100644
> index 0000000000..7eb8651cfe
> --- /dev/null
> +++ b/sysdeps/x86/nptl/pt-longjmp.c
> @@ -0,0 +1,97 @@
> +/* ABI compatibility for 'longjmp' and 'siglongjmp' symbols in libpthread ABI.
> +   X86 version.
> +   Copyright (C) 18 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <http://www.gnu.org/licenses/>.  */
> +
> +/* <nptl/descr.h> has
> +
> +struct pthread_unwind_buf
> +{
> +  struct
> +  {
> +    __jmp_buf jmp_buf;
> +    int mask_was_saved;
> +  } cancel_jmp_buf[1];
> +
> +  union
> +  {
> +    void *pad[4];
> +    struct
> +    {
> +      struct pthread_unwind_buf *prev;
> +      struct _pthread_cleanup_buffer *cleanup;
> +      int canceltype;
> +    } data;
> +  } priv;
> +};
> +
> +  The pad array in struct pthread_unwind_buf is used by setjmp to save

This appears to be SHADOW_STACK_POINTER_OFFSET in subsequent patches.

However on your hjl/cet/master branch it appears that this offset is not
defined to be *just after* the mask_was_saved?

> +  shadow stack register.  Assert that size of struct pthread_unwind_buf
> +  is no less than offset of shadow stack pointer plus shadow stack
> +  pointer size.
> +
> +  NB: setjmp is called in libpthread to save shadow stack register.  But
> +  __libc_unwind_longjmp doesn't restore shadow stack register since they
> +  never return after longjmp.  */
> +
> +#include <pthreadP.h>
> +#include <jmp_buf-ssp.h>
> +
> +#ifdef __x86_64__
> +# define SHADOW_STACK_POINTER_SIZE 8
> +#else
> +# define SHADOW_STACK_POINTER_SIZE 4
> +#endif
> +
> +_Static_assert ((sizeof (struct pthread_unwind_buf)
> +		 >= (SHADOW_STACK_POINTER_OFFSET
> +		     + SHADOW_STACK_POINTER_SIZE)),
> +		"size of struct pthread_unwind_buf < "
> +		"(SHADOW_STACK_POINTER_OFFSET + SHADOW_STACK_POINTER_SIZE)");

OK.

> +
> +#include <shlib-compat.h>
> +
> +/* libpthread once had its own longjmp (and siglongjmp alias), though there
> +   was no apparent reason for it.  There is no use in having a separate
> +   symbol in libpthread, but the historical ABI requires it.  For static
> +   linking, there is no need to provide anything here--the libc version
> +   will be linked in.  For shared library ABI compatibility, there must be
> +   longjmp and siglongjmp symbols in libpthread.so.
> +
> +   With an IFUNC resolver, it would be possible to avoid the indirection,
> +   but the IFUNC resolver might run before the __libc_longjmp symbol has
> +   been relocated, in which case the IFUNC resolver would not be able to
> +   provide the correct address.  */
> +
> +#if SHLIB_COMPAT (libpthread, GLIBC_2_0, GLIBC_2_22)
> +
> +static void __attribute__ ((noreturn, used))
> +longjmp_compat (jmp_buf env, int val)
> +{
> +  /* NB: We call __libc_siglongjmp,  instead of __libc_longjmp, since
> +     __libc_longjmp is a private interface for cancellation which
> +     doesn't restore shadow stack register.  */
> +  __libc_siglongjmp (env, val);

OK.

> +}
> +
> +strong_alias (longjmp_compat, longjmp_alias)
> +compat_symbol (libpthread, longjmp_alias, longjmp, GLIBC_2_0);
> +
> +strong_alias (longjmp_alias, siglongjmp_alias)
> +compat_symbol (libpthread, siglongjmp_alias, siglongjmp, GLIBC_2_0);
> +
> +#endif
> -- 2.14.3
H.J. Lu April 6, 2018, 12:59 p.m. UTC | #2
On Thu, Apr 5, 2018 at 9:46 PM, Carlos O'Donell <carlos@redhat.com> wrote:
> On 03/30/2018 12:41 PM, H.J. Lu wrote:
>> On Thu, Mar 29, 2018 at 1:20 PM, Florian Weimer <fw@deneb.enyo.de> wrote:
>>> * H. J. Lu:
>>>
>>>> On Thu, Mar 29, 2018 at 1:15 PM, Florian Weimer <fw@deneb.enyo.de> wrote:
>>>>> * H. J. Lu:
>>>>>
>>>>>> You need to make a choice.  You either don't introduce a new symbol
>>>>>> version or don't save shadow stack for thread cancellation.  You
>>>>>> can't have both.
>>>>> I don't understand.  We have room to save the shadow stack pointer in
>>>>> the existing struct.
>>>> No, we don't have room in struct pthread_unwind_buf:
>>>>
>>>> Note: There is an unused pointer space in pthread_unwind_buf_data.  But
>>>> it isn't suitable for saving and restoring shadow stack register since
>>>> x32 is a 64-bit process with 32-bit software pointer and kernel may
>>>> place x32 shadow stack above 4GB.  We need to save and restore 64-bit
>>>> shadow stack register for x32.
>>> We have for void * fields.  They are subsequently overwritten by
>>> __pthread_register_cancel.  But __sigsetjmp can write to them first
>>> without causing any harm.  We just need a private __longjmp_cancel
>>> that doesn't restore the shadow stack pointer.
>> Here is the patch which does that.   Any comments?
>
> OK, so I have reviewed https://github.com/hjl-tools/glibc/ hjl/cet/master,
> please confirm that this is the correct branch of the required implementation
> for CET. It really helps to review the rest of the patch set, you should be
> preparing this as a patch set instead of having it reviewed one-at-a-time.
> This issue was already raised in the thread with Zack.

Thanks for your feedbacks.

> Architecture:
>
> * We avoid a "flag day" with feature_1 TCB flag to switch to a new ABI,
>   which we have discussed is a fragile process which should be avoided if
>   a supportable alternative solution exists.
>
> * We avoid versioned symbols, this makes CET backportable, and this has a
>   bigger benefit for long-term stable distributions.
>
> * A key design problem has been that cancellation jump buffers within glibc
>   are truncated to optimize on-stack size, and this means that setjmp could
>   write beyond the structure because setjmp now tries to save the shadowstack
>   pointer into space that the cancellation jump buffer did not allocate.
>   For the record this optimization seems premature and I'm sad we did it, this
>   is a lesson we should learn from.
>
> * We have all agreed to the following concepts:
>
>   * The cancellation process, in particular the unwinder, never returns from
>     any of the functions we call, it just keeps calling into the unwinder to
>     jump to the next unwound cancellation function all the way to the thread
>     start routine. Therefore because we never return from one of these functions
>     we never need to restore the shadow stack, and consequently wherever it is
>     stored in the cancellation jump buffer can be overwritten if we need the
>     space (it's a dead store).
>
>     * The corollary to this is that function calls made from cancellation handlers
>       will continue to advance the shadowstack from the deepest point at which
>       cancellation is initiated from. This means that the depth of the shadowstack
>       doesn't match the depth of the real stack while we are unwinding. I don't
>       know if this will have consequences on analysis tooling or not, or debug
>       tools during unwinding. It's a fairly advanced situation and corner case,
>       and restoring the shadowstack is not useful becuase we don't need it and
>       simplifies the implementation.
>
>   * The cancellation jump buffer has private data used for chaining the cancel
>     jump buffers together such that the custom unwinder can follow them and
>     call them in sequence. This space constitutes 4 void *'s which is space
>     that setjmp can write to, because we will just overwrite it when we register
>     the cancel handler.
>
>   * If the new shadowstack-enabled setjmp stores the shadowstack pointer into
>     the space taken by the 4 void*'s then we won't overflow the stack, and we
>     don't need to change the layout of the cancellation jump buffer. The 4 void*'s
>     are sufficient, even for x32 to write a 64-bit shadow stack address.
>
> * After fixing the cancellation jump buffers the following work needs to be reviewed:
>
>   * Add feature_1 in tcb head to track CET status and make it easily available
>     to runtime for checking.
>
>   * Save and restore shadowstack in setjmp/longjmp.
>
>   * Add CET support to ld.so et. al. and track runtime status.
>
>   * Adjust vfork for shadow stack usage.
>
>   * Add ENDBR or NOTRACK where required in assembly.
>
>   * CET and makecontext incompatible.
>     - Probably need to discuss which default is appropriate.
>     - Should the user get CET automatically disabled in makecontext() et. al. silently?
>     - Should your current solution, which is to error out during the build, and require
>       flag changes, be the default? This forces the user to review the security for their
>       application.

I'd like to reserve 4 slots in ucontext for shadow stack:

https://github.com/hjl-tools/glibc/commit/9bf6aefa8fb45f8df140d42ce9cf890bb24076e1

It should be binary backward compatible.  I will investigate if there is a way
to support shadow stack with existing API.  Otherwise, we need to add a new
API for ucontext functions with shadow stack.

>   * prctl for CET.

We have been experimenting different approaches to get the best implementation.
I am expecting that this patch may change as we collect more data.

> * The work to review after this patch appears to be less contentious in terms of
>   the kinds of changes that are required. Most of the changes are internal details
>   of enabling CET and not ABI details, with the exception of the possible pain we
>   might cause with makecontext() being unsupported and what default position to take
>   there.
>
> Design:
>
> * Overall the implementation looks exactly how I might expect it to look, but some
>   of the math that places the shadowstack pointer appears to need either commenting
>   or fixing because I don't understand it. You need to make it easy for me to see
>   that we have placed the shadowstack pointer into the 4 pad words.
>
> Details:
>
> * One comment needs filling out a bit more, noted below.
>
>> 0001-x86-Use-pad-in-pthread_unwind_buf-to-preserve-shadow.patch
>>
>>
>> From f222537447f5ec879427f318b7c0396362b7453a Mon Sep 17 00:00:00 2001
>> From: "H.J. Lu" <hjl.tools@gmail.com>
>> Date: Sat, 24 Feb 2018 17:23:54 -0800
>> Subject: [PATCH] x86: Use pad in pthread_unwind_buf to preserve shadow stack
>>  register
>>
>> The pad array in struct pthread_unwind_buf is used by setjmp to save
>> shadow stack register.  We assert that size of struct pthread_unwind_buf
>> is no less than offset of shadow stack pointer + shadow stack pointer
>> size.
>>
>
> OK.
>
>> Since functions, like LIBC_START_MAIN, START_THREAD_DEFN as well as
>> these with thread cancellation, call setjmp, but never return after
>> __libc_unwind_longjmp, __libc_unwind_longjmp, which is defined as
>> __libc_longjmp on x86, doesn't need to restore shadow stack register.
>
> OK.
>
>> __libc_longjmp, which is a private interface for thread cancellation
>> implementation in libpthread, is changed to call __longjmp_cancel,
>> instead of __longjmp.  __longjmp_cancel is a new internal function
>> in libc, which is similar to __longjmp, but doesn't restore shadow
>> stack register.
>
> OK. Good. I like the use of a __longjmp_cancel name to call out what's
> going on in the API (clear semantics).
>
>>
>> The compatibility longjmp and siglongjmp in libpthread.so are changed
>> to call __libc_siglongjmp, instead of __libc_longjmp, so that they will
>> restore shadow stack register.
>
> OK.
>
>>
>> Tested with build-many-glibcs.py.
>>
>>       * nptl/pthread_create.c (START_THREAD_DEFN): Clear previous
>>       handlers after setjmp.
>>       * setjmp/longjmp.c (__libc_longjmp): Don't define alias if
>>       defined.
>>       * sysdeps/unix/sysv/linux/x86/setjmpP.h (_JUMP_BUF_SIGSET_NSIG):
>>       Changed to 97.
>>       * sysdeps/x86/Makefile (sysdep_routines): Add __longjmp_cancel.
>>       * sysdeps/x86/__longjmp_cancel.S: New file.
>>       * sysdeps/x86/longjmp.c: Likewise.
>>       * sysdeps/x86/nptl/pt-longjmp.c: Likewise.
>
> This looks much better.
>
>> ---
>>  nptl/pthread_create.c                 |  9 ++--
>>  setjmp/longjmp.c                      |  2 +
>>  sysdeps/unix/sysv/linux/x86/setjmpP.h |  4 +-
>>  sysdeps/x86/Makefile                  |  4 ++
>>  sysdeps/x86/__longjmp_cancel.S        | 20 ++++++++
>>  sysdeps/x86/longjmp.c                 | 45 ++++++++++++++++
>>  sysdeps/x86/nptl/pt-longjmp.c         | 97 +++++++++++++++++++++++++++++++++++
>>  7 files changed, 176 insertions(+), 5 deletions(-)
>>  create mode 100644 sysdeps/x86/__longjmp_cancel.S
>>  create mode 100644 sysdeps/x86/longjmp.c
>>  create mode 100644 sysdeps/x86/nptl/pt-longjmp.c
>>
>> diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c
>> index caaf07c134..1c5b3780c6 100644
>> --- a/nptl/pthread_create.c
>> +++ b/nptl/pthread_create.c
>> @@ -427,12 +427,15 @@ START_THREAD_DEFN
>>       compilers without that support we do use setjmp.  */
>>    struct pthread_unwind_buf unwind_buf;
>>
>> -  /* No previous handlers.  */
>> +  int not_first_call;
>> +  not_first_call = setjmp ((struct __jmp_buf_tag *) unwind_buf.cancel_jmp_buf);
>> +
>> +  /* No previous handlers.  NB: This must be done after setjmp since
>> +     the same space may be used by setjmp to store extra data which
>> +     should never be used by __libc_unwind_longjmp.  */
>
> Suggest:
> ~~~
> No previous handlers.  NB: This must be done after setjmp since
> the private space in the unwind jump buffer may overlap space
> used by setjmp to store extra architecture-specific information
> which is never be used by the cancellation-specific
> __libc_unwind_longjmp.
>
> The private space is allowed to overlap because the unwinder never
> has to return through any of the jumped-to call frames, and thus
> only a minimum amount of saved data need be stored, and for example,
> need not include the process signal mask information. This is all
> an optimization to reduce stack usage when pushing cancellation
> handlers.
> ~~~

Will fix it.

>>    unwind_buf.priv.data.prev = NULL;
>>    unwind_buf.priv.data.cleanup = NULL;
>>
>> -  int not_first_call;
>> -  not_first_call = setjmp ((struct __jmp_buf_tag *) unwind_buf.cancel_jmp_buf);
>
> OK.
>
>>    if (__glibc_likely (! not_first_call))
>>      {
>>        /* Store the new cleanup handler info.  */
>> diff --git a/setjmp/longjmp.c b/setjmp/longjmp.c
>> index a2a7065a85..453889e103 100644
>> --- a/setjmp/longjmp.c
>> +++ b/setjmp/longjmp.c
>> @@ -40,9 +40,11 @@ __libc_siglongjmp (sigjmp_buf env, int val)
>>  }
>>
>>  #ifndef __libc_siglongjmp
>> +# ifndef __libc_longjmp
>>  /* __libc_longjmp is a private interface for cancellation implementation
>>     in libpthread.  */
>>  strong_alias (__libc_siglongjmp, __libc_longjmp)
>> +# endif
>
> OK.
>
>>  weak_alias (__libc_siglongjmp, _longjmp)
>>  weak_alias (__libc_siglongjmp, longjmp)
>>  weak_alias (__libc_siglongjmp, siglongjmp)
>> diff --git a/sysdeps/unix/sysv/linux/x86/setjmpP.h b/sysdeps/unix/sysv/linux/x86/setjmpP.h
>> index c0ed767a0d..90a6bbcf32 100644
>> --- a/sysdeps/unix/sysv/linux/x86/setjmpP.h
>> +++ b/sysdeps/unix/sysv/linux/x86/setjmpP.h
>> @@ -22,8 +22,8 @@
>>  #include <bits/types/__sigset_t.h>
>>
>>  /* The biggest signal number + 1.  As of kernel 4.14, x86 _NSIG is 64.
>> -   Define it to 513 to leave some rooms for future use.  */
>> -#define _JUMP_BUF_SIGSET_NSIG        513
>> +   Define it to 97 to leave some rooms for future use.  */
>
> OK.
>
>> +#define _JUMP_BUF_SIGSET_NSIG        97
>
> Please provide proof in the way of a comment or rewriting this constant
> to show that it places the shadow stack pointer on both x86_64 and x32
> into the range of the private pad.

sysdeps/x86/nptl/pt-longjmp.c has

_Static_assert ((sizeof (struct pthread_unwind_buf)
>= (SHADOW_STACK_POINTER_OFFSET
     + SHADOW_STACK_POINTER_SIZE)),
"size of struct pthread_unwind_buf < "
"(SHADOW_STACK_POINTER_OFFSET + SHADOW_STACK_POINTER_SIZE)");

If shadow stack pointer is saved in the offset bigger than the size of
struct pthread_unwind_buf, assert will trigger at compile-time.

> Also, from commit  f33632ccd1dec3217583fcfdd965afb62954203c,
> where did this math come from?
>
> ((_JUMP_BUF_SIGSET_NSIG - 1 + 7) / (8 * sizeof (unsigned long int)))
>
> Why the +7?

_JUMP_BUF_SIGSET_NSIG is the biggest signal number + 1.
_JUMP_BUF_SIGSET_NSIG - 1 gives the biggest signal number.
_JUMP_BUF_SIGSET_NSIG - 1 + 7 rounds up to the number of bytes
which are needed to store the biggest signal number.

>>  /* Number of longs to hold all signals.  */
>>  #define _JUMP_BUF_SIGSET_NWORDS \
>>    ((_JUMP_BUF_SIGSET_NSIG - 1 + 7) / (8 * sizeof (unsigned long int)))
>> diff --git a/sysdeps/x86/Makefile b/sysdeps/x86/Makefile
>> index 0d0326c21a..d25d6f0ae4 100644
>> --- a/sysdeps/x86/Makefile
>> +++ b/sysdeps/x86/Makefile
>> @@ -8,3 +8,7 @@ sysdep-dl-routines += dl-get-cpu-features
>>  tests += tst-get-cpu-features
>>  tests-static += tst-get-cpu-features-static
>>  endif
>> +
>> +ifeq ($(subdir),setjmp)
>> +sysdep_routines += __longjmp_cancel
>> +endif
>
> OK.
>
>> diff --git a/sysdeps/x86/__longjmp_cancel.S b/sysdeps/x86/__longjmp_cancel.S
>> new file mode 100644
>> index 0000000000..b57dbfa376
>> --- /dev/null
>> +++ b/sysdeps/x86/__longjmp_cancel.S
>> @@ -0,0 +1,20 @@
>> +/* __longjmp_cancel for x86.
>> +   Copyright (C) 2018 Free Software Foundation, Inc.
>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library; if not, see
>> +   <http://www.gnu.org/licenses/>.  */
>> +
>> +#define __longjmp __longjmp_cancel
>> +#include <__longjmp.S>
>
> OK.
>
>> diff --git a/sysdeps/x86/longjmp.c b/sysdeps/x86/longjmp.c
>> new file mode 100644
>> index 0000000000..a53f31e1dd
>> --- /dev/null
>> +++ b/sysdeps/x86/longjmp.c
>> @@ -0,0 +1,45 @@
>> +/* __libc_siglongjmp for x86.
>> +   Copyright (C) 2018 Free Software Foundation, Inc.
>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library; if not, see
>> +   <http://www.gnu.org/licenses/>.  */
>> +
>> +#define __libc_longjmp __redirect___libc_longjmp
>> +#include <setjmp/longjmp.c>
>> +#undef __libc_longjmp
>> +
>> +extern void __longjmp_cancel (__jmp_buf __env, int __val)
>> +     __attribute__ ((__noreturn__)) attribute_hidden;
>> +
>> +/* Since __libc_longjmp is a private interface for cancellation
>> +   implementation in libpthread, there is no need to restore shadow
>> +   stack register.  */
>> +
>> +void
>> +__libc_longjmp (sigjmp_buf env, int val)
>> +{
>> +  /* Perform any cleanups needed by the frames being unwound.  */
>> +  _longjmp_unwind (env, val);
>
> OK.
>
>> +
>> +  if (env[0].__mask_was_saved)
>> +    /* Restore the saved signal mask.  */
>> +    (void) __sigprocmask (SIG_SETMASK,
>> +                       (sigset_t *) &env[0].__saved_mask,
>> +                       (sigset_t *) NULL);
>
> OK.
>
>> +
>> +  /* Call the machine-dependent function to restore machine state
>> +     without shadow stack.  */
>> +  __longjmp_cancel (env[0].__jmpbuf, val ?: 1);
>
> OK.
>
>> +}
>> diff --git a/sysdeps/x86/nptl/pt-longjmp.c b/sysdeps/x86/nptl/pt-longjmp.c
>> new file mode 100644
>> index 0000000000..7eb8651cfe
>> --- /dev/null
>> +++ b/sysdeps/x86/nptl/pt-longjmp.c
>> @@ -0,0 +1,97 @@
>> +/* ABI compatibility for 'longjmp' and 'siglongjmp' symbols in libpthread ABI.
>> +   X86 version.
>> +   Copyright (C) 18 Free Software Foundation, Inc.
>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library; if not, see
>> +   <http://www.gnu.org/licenses/>.  */
>> +
>> +/* <nptl/descr.h> has
>> +
>> +struct pthread_unwind_buf
>> +{
>> +  struct
>> +  {
>> +    __jmp_buf jmp_buf;
>> +    int mask_was_saved;
>> +  } cancel_jmp_buf[1];
>> +
>> +  union
>> +  {
>> +    void *pad[4];
>> +    struct
>> +    {
>> +      struct pthread_unwind_buf *prev;
>> +      struct _pthread_cleanup_buffer *cleanup;
>> +      int canceltype;
>> +    } data;
>> +  } priv;
>> +};
>> +
>> +  The pad array in struct pthread_unwind_buf is used by setjmp to save
>
> This appears to be SHADOW_STACK_POINTER_OFFSET in subsequent patches.
>
> However on your hjl/cet/master branch it appears that this offset is not
> defined to be *just after* the mask_was_saved?

sysdeps/unix/sysv/linux/x86/setjmpP.h has

typedef struct
  {
    unsigned long int __val[_JUMP_BUF_SIGSET_NWORDS];
  } __jmp_buf_sigset_t;

typedef union
  {
    __sigset_t __saved_mask_compat;
    struct
      {
         __jmp_buf_sigset_t __saved_mask;
          /* Used for shadow stack pointer.  */
         unsigned long int __shadow_stack_pointer;
      } __saved;
  } __jmpbuf_arch_t;

__shadow_stack_pointer is placed after __saved_mask, aka mask_was_saved.

>> +  shadow stack register.  Assert that size of struct pthread_unwind_buf
>> +  is no less than offset of shadow stack pointer plus shadow stack
>> +  pointer size.
>> +
>> +  NB: setjmp is called in libpthread to save shadow stack register.  But
>> +  __libc_unwind_longjmp doesn't restore shadow stack register since they
>> +  never return after longjmp.  */
>> +
>> +#include <pthreadP.h>
>> +#include <jmp_buf-ssp.h>
>> +
>> +#ifdef __x86_64__
>> +# define SHADOW_STACK_POINTER_SIZE 8
>> +#else
>> +# define SHADOW_STACK_POINTER_SIZE 4
>> +#endif
>> +
>> +_Static_assert ((sizeof (struct pthread_unwind_buf)
>> +              >= (SHADOW_STACK_POINTER_OFFSET
>> +                  + SHADOW_STACK_POINTER_SIZE)),
>> +             "size of struct pthread_unwind_buf < "
>> +             "(SHADOW_STACK_POINTER_OFFSET + SHADOW_STACK_POINTER_SIZE)");
>
> OK.
>
>> +
>> +#include <shlib-compat.h>
>> +
>> +/* libpthread once had its own longjmp (and siglongjmp alias), though there
>> +   was no apparent reason for it.  There is no use in having a separate
>> +   symbol in libpthread, but the historical ABI requires it.  For static
>> +   linking, there is no need to provide anything here--the libc version
>> +   will be linked in.  For shared library ABI compatibility, there must be
>> +   longjmp and siglongjmp symbols in libpthread.so.
>> +
>> +   With an IFUNC resolver, it would be possible to avoid the indirection,
>> +   but the IFUNC resolver might run before the __libc_longjmp symbol has
>> +   been relocated, in which case the IFUNC resolver would not be able to
>> +   provide the correct address.  */
>> +
>> +#if SHLIB_COMPAT (libpthread, GLIBC_2_0, GLIBC_2_22)
>> +
>> +static void __attribute__ ((noreturn, used))
>> +longjmp_compat (jmp_buf env, int val)
>> +{
>> +  /* NB: We call __libc_siglongjmp,  instead of __libc_longjmp, since
>> +     __libc_longjmp is a private interface for cancellation which
>> +     doesn't restore shadow stack register.  */
>> +  __libc_siglongjmp (env, val);
>
> OK.
>
>> +}
>> +
>> +strong_alias (longjmp_compat, longjmp_alias)
>> +compat_symbol (libpthread, longjmp_alias, longjmp, GLIBC_2_0);
>> +
>> +strong_alias (longjmp_alias, siglongjmp_alias)
>> +compat_symbol (libpthread, siglongjmp_alias, siglongjmp, GLIBC_2_0);
>> +
>> +#endif
>> -- 2.14.3
>
>
> --
> Cheers,
> Carlos.
H.J. Lu April 6, 2018, 8:26 p.m. UTC | #3
On Fri, Apr 6, 2018 at 5:59 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Thu, Apr 5, 2018 at 9:46 PM, Carlos O'Donell <carlos@redhat.com> wrote:
>> On 03/30/2018 12:41 PM, H.J. Lu wrote:
>>> On Thu, Mar 29, 2018 at 1:20 PM, Florian Weimer <fw@deneb.enyo.de> wrote:
>>>> * H. J. Lu:
>>>>
>>>>> On Thu, Mar 29, 2018 at 1:15 PM, Florian Weimer <fw@deneb.enyo.de> wrote:
>>>>>> * H. J. Lu:
>>>>>>
>>>>>>> You need to make a choice.  You either don't introduce a new symbol
>>>>>>> version or don't save shadow stack for thread cancellation.  You
>>>>>>> can't have both.
>>>>>> I don't understand.  We have room to save the shadow stack pointer in
>>>>>> the existing struct.
>>>>> No, we don't have room in struct pthread_unwind_buf:
>>>>>
>>>>> Note: There is an unused pointer space in pthread_unwind_buf_data.  But
>>>>> it isn't suitable for saving and restoring shadow stack register since
>>>>> x32 is a 64-bit process with 32-bit software pointer and kernel may
>>>>> place x32 shadow stack above 4GB.  We need to save and restore 64-bit
>>>>> shadow stack register for x32.
>>>> We have for void * fields.  They are subsequently overwritten by
>>>> __pthread_register_cancel.  But __sigsetjmp can write to them first
>>>> without causing any harm.  We just need a private __longjmp_cancel
>>>> that doesn't restore the shadow stack pointer.
>>> Here is the patch which does that.   Any comments?
>>
>> OK, so I have reviewed https://github.com/hjl-tools/glibc/ hjl/cet/master,
>> please confirm that this is the correct branch of the required implementation
>> for CET. It really helps to review the rest of the patch set, you should be
>> preparing this as a patch set instead of having it reviewed one-at-a-time.
>> This issue was already raised in the thread with Zack.
>
> Thanks for your feedbacks.
>
>> Architecture:
>>
>> * We avoid a "flag day" with feature_1 TCB flag to switch to a new ABI,
>>   which we have discussed is a fragile process which should be avoided if
>>   a supportable alternative solution exists.
>>
>> * We avoid versioned symbols, this makes CET backportable, and this has a
>>   bigger benefit for long-term stable distributions.
>>
>> * A key design problem has been that cancellation jump buffers within glibc
>>   are truncated to optimize on-stack size, and this means that setjmp could
>>   write beyond the structure because setjmp now tries to save the shadowstack
>>   pointer into space that the cancellation jump buffer did not allocate.
>>   For the record this optimization seems premature and I'm sad we did it, this
>>   is a lesson we should learn from.
>>
>> * We have all agreed to the following concepts:
>>
>>   * The cancellation process, in particular the unwinder, never returns from
>>     any of the functions we call, it just keeps calling into the unwinder to
>>     jump to the next unwound cancellation function all the way to the thread
>>     start routine. Therefore because we never return from one of these functions
>>     we never need to restore the shadow stack, and consequently wherever it is
>>     stored in the cancellation jump buffer can be overwritten if we need the
>>     space (it's a dead store).
>>
>>     * The corollary to this is that function calls made from cancellation handlers
>>       will continue to advance the shadowstack from the deepest point at which
>>       cancellation is initiated from. This means that the depth of the shadowstack
>>       doesn't match the depth of the real stack while we are unwinding. I don't
>>       know if this will have consequences on analysis tooling or not, or debug
>>       tools during unwinding. It's a fairly advanced situation and corner case,
>>       and restoring the shadowstack is not useful becuase we don't need it and
>>       simplifies the implementation.
>>
>>   * The cancellation jump buffer has private data used for chaining the cancel
>>     jump buffers together such that the custom unwinder can follow them and
>>     call them in sequence. This space constitutes 4 void *'s which is space
>>     that setjmp can write to, because we will just overwrite it when we register
>>     the cancel handler.
>>
>>   * If the new shadowstack-enabled setjmp stores the shadowstack pointer into
>>     the space taken by the 4 void*'s then we won't overflow the stack, and we
>>     don't need to change the layout of the cancellation jump buffer. The 4 void*'s
>>     are sufficient, even for x32 to write a 64-bit shadow stack address.
>>
>> * After fixing the cancellation jump buffers the following work needs to be reviewed:
>>
>>   * Add feature_1 in tcb head to track CET status and make it easily available
>>     to runtime for checking.
>>
>>   * Save and restore shadowstack in setjmp/longjmp.
>>
>>   * Add CET support to ld.so et. al. and track runtime status.
>>
>>   * Adjust vfork for shadow stack usage.
>>
>>   * Add ENDBR or NOTRACK where required in assembly.
>>
>>   * CET and makecontext incompatible.
>>     - Probably need to discuss which default is appropriate.
>>     - Should the user get CET automatically disabled in makecontext() et. al. silently?
>>     - Should your current solution, which is to error out during the build, and require
>>       flag changes, be the default? This forces the user to review the security for their
>>       application.
>
> I'd like to reserve 4 slots in ucontext for shadow stack:
>
> https://github.com/hjl-tools/glibc/commit/9bf6aefa8fb45f8df140d42ce9cf890bb24076e1
>
> It should be binary backward compatible.  I will investigate if there is a way
> to support shadow stack with existing API.  Otherwise, we need to add a new
> API for ucontext functions with shadow stack.
>
>>   * prctl for CET.
>
> We have been experimenting different approaches to get the best implementation.
> I am expecting that this patch may change as we collect more data.
>
>> * The work to review after this patch appears to be less contentious in terms of
>>   the kinds of changes that are required. Most of the changes are internal details
>>   of enabling CET and not ABI details, with the exception of the possible pain we
>>   might cause with makecontext() being unsupported and what default position to take
>>   there.
>>
>> Design:
>>
>> * Overall the implementation looks exactly how I might expect it to look, but some
>>   of the math that places the shadowstack pointer appears to need either commenting
>>   or fixing because I don't understand it. You need to make it easy for me to see
>>   that we have placed the shadowstack pointer into the 4 pad words.
>>
>> Details:
>>
>> * One comment needs filling out a bit more, noted below.
>>
>>> 0001-x86-Use-pad-in-pthread_unwind_buf-to-preserve-shadow.patch
>>>
>>>
>>> From f222537447f5ec879427f318b7c0396362b7453a Mon Sep 17 00:00:00 2001
>>> From: "H.J. Lu" <hjl.tools@gmail.com>
>>> Date: Sat, 24 Feb 2018 17:23:54 -0800
>>> Subject: [PATCH] x86: Use pad in pthread_unwind_buf to preserve shadow stack
>>>  register
>>>
>>> The pad array in struct pthread_unwind_buf is used by setjmp to save
>>> shadow stack register.  We assert that size of struct pthread_unwind_buf
>>> is no less than offset of shadow stack pointer + shadow stack pointer
>>> size.
>>>
>>
>> OK.
>>
>>> Since functions, like LIBC_START_MAIN, START_THREAD_DEFN as well as
>>> these with thread cancellation, call setjmp, but never return after
>>> __libc_unwind_longjmp, __libc_unwind_longjmp, which is defined as
>>> __libc_longjmp on x86, doesn't need to restore shadow stack register.
>>
>> OK.
>>
>>> __libc_longjmp, which is a private interface for thread cancellation
>>> implementation in libpthread, is changed to call __longjmp_cancel,
>>> instead of __longjmp.  __longjmp_cancel is a new internal function
>>> in libc, which is similar to __longjmp, but doesn't restore shadow
>>> stack register.
>>
>> OK. Good. I like the use of a __longjmp_cancel name to call out what's
>> going on in the API (clear semantics).
>>
>>>
>>> The compatibility longjmp and siglongjmp in libpthread.so are changed
>>> to call __libc_siglongjmp, instead of __libc_longjmp, so that they will
>>> restore shadow stack register.
>>
>> OK.
>>
>>>
>>> Tested with build-many-glibcs.py.
>>>
>>>       * nptl/pthread_create.c (START_THREAD_DEFN): Clear previous
>>>       handlers after setjmp.
>>>       * setjmp/longjmp.c (__libc_longjmp): Don't define alias if
>>>       defined.
>>>       * sysdeps/unix/sysv/linux/x86/setjmpP.h (_JUMP_BUF_SIGSET_NSIG):
>>>       Changed to 97.
>>>       * sysdeps/x86/Makefile (sysdep_routines): Add __longjmp_cancel.
>>>       * sysdeps/x86/__longjmp_cancel.S: New file.
>>>       * sysdeps/x86/longjmp.c: Likewise.
>>>       * sysdeps/x86/nptl/pt-longjmp.c: Likewise.
>>
>> This looks much better.
>>
>>> ---
>>>  nptl/pthread_create.c                 |  9 ++--
>>>  setjmp/longjmp.c                      |  2 +
>>>  sysdeps/unix/sysv/linux/x86/setjmpP.h |  4 +-
>>>  sysdeps/x86/Makefile                  |  4 ++
>>>  sysdeps/x86/__longjmp_cancel.S        | 20 ++++++++
>>>  sysdeps/x86/longjmp.c                 | 45 ++++++++++++++++
>>>  sysdeps/x86/nptl/pt-longjmp.c         | 97 +++++++++++++++++++++++++++++++++++
>>>  7 files changed, 176 insertions(+), 5 deletions(-)
>>>  create mode 100644 sysdeps/x86/__longjmp_cancel.S
>>>  create mode 100644 sysdeps/x86/longjmp.c
>>>  create mode 100644 sysdeps/x86/nptl/pt-longjmp.c
>>>
>>> diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c
>>> index caaf07c134..1c5b3780c6 100644
>>> --- a/nptl/pthread_create.c
>>> +++ b/nptl/pthread_create.c
>>> @@ -427,12 +427,15 @@ START_THREAD_DEFN
>>>       compilers without that support we do use setjmp.  */
>>>    struct pthread_unwind_buf unwind_buf;
>>>
>>> -  /* No previous handlers.  */
>>> +  int not_first_call;
>>> +  not_first_call = setjmp ((struct __jmp_buf_tag *) unwind_buf.cancel_jmp_buf);
>>> +
>>> +  /* No previous handlers.  NB: This must be done after setjmp since
>>> +     the same space may be used by setjmp to store extra data which
>>> +     should never be used by __libc_unwind_longjmp.  */
>>
>> Suggest:
>> ~~~
>> No previous handlers.  NB: This must be done after setjmp since
>> the private space in the unwind jump buffer may overlap space
>> used by setjmp to store extra architecture-specific information
>> which is never be used by the cancellation-specific
>> __libc_unwind_longjmp.
>>
>> The private space is allowed to overlap because the unwinder never
>> has to return through any of the jumped-to call frames, and thus
>> only a minimum amount of saved data need be stored, and for example,
>> need not include the process signal mask information. This is all
>> an optimization to reduce stack usage when pushing cancellation
>> handlers.
>> ~~~
>
> Will fix it.
>
>>>    unwind_buf.priv.data.prev = NULL;
>>>    unwind_buf.priv.data.cleanup = NULL;
>>>
>>> -  int not_first_call;
>>> -  not_first_call = setjmp ((struct __jmp_buf_tag *) unwind_buf.cancel_jmp_buf);
>>
>> OK.
>>
>>>    if (__glibc_likely (! not_first_call))
>>>      {
>>>        /* Store the new cleanup handler info.  */
>>> diff --git a/setjmp/longjmp.c b/setjmp/longjmp.c
>>> index a2a7065a85..453889e103 100644
>>> --- a/setjmp/longjmp.c
>>> +++ b/setjmp/longjmp.c
>>> @@ -40,9 +40,11 @@ __libc_siglongjmp (sigjmp_buf env, int val)
>>>  }
>>>
>>>  #ifndef __libc_siglongjmp
>>> +# ifndef __libc_longjmp
>>>  /* __libc_longjmp is a private interface for cancellation implementation
>>>     in libpthread.  */
>>>  strong_alias (__libc_siglongjmp, __libc_longjmp)
>>> +# endif
>>
>> OK.
>>
>>>  weak_alias (__libc_siglongjmp, _longjmp)
>>>  weak_alias (__libc_siglongjmp, longjmp)
>>>  weak_alias (__libc_siglongjmp, siglongjmp)
>>> diff --git a/sysdeps/unix/sysv/linux/x86/setjmpP.h b/sysdeps/unix/sysv/linux/x86/setjmpP.h
>>> index c0ed767a0d..90a6bbcf32 100644
>>> --- a/sysdeps/unix/sysv/linux/x86/setjmpP.h
>>> +++ b/sysdeps/unix/sysv/linux/x86/setjmpP.h
>>> @@ -22,8 +22,8 @@
>>>  #include <bits/types/__sigset_t.h>
>>>
>>>  /* The biggest signal number + 1.  As of kernel 4.14, x86 _NSIG is 64.
>>> -   Define it to 513 to leave some rooms for future use.  */
>>> -#define _JUMP_BUF_SIGSET_NSIG        513
>>> +   Define it to 97 to leave some rooms for future use.  */
>>
>> OK.
>>
>>> +#define _JUMP_BUF_SIGSET_NSIG        97
>>
>> Please provide proof in the way of a comment or rewriting this constant
>> to show that it places the shadow stack pointer on both x86_64 and x32
>> into the range of the private pad.
>
> sysdeps/x86/nptl/pt-longjmp.c has
>
> _Static_assert ((sizeof (struct pthread_unwind_buf)
>>= (SHADOW_STACK_POINTER_OFFSET
>      + SHADOW_STACK_POINTER_SIZE)),
> "size of struct pthread_unwind_buf < "
> "(SHADOW_STACK_POINTER_OFFSET + SHADOW_STACK_POINTER_SIZE)");
>
> If shadow stack pointer is saved in the offset bigger than the size of
> struct pthread_unwind_buf, assert will trigger at compile-time.
>
>> Also, from commit  f33632ccd1dec3217583fcfdd965afb62954203c,
>> where did this math come from?
>>
>> ((_JUMP_BUF_SIGSET_NSIG - 1 + 7) / (8 * sizeof (unsigned long int)))
>>
>> Why the +7?
>
> _JUMP_BUF_SIGSET_NSIG is the biggest signal number + 1.
> _JUMP_BUF_SIGSET_NSIG - 1 gives the biggest signal number.
> _JUMP_BUF_SIGSET_NSIG - 1 + 7 rounds up to the number of bytes
> which are needed to store the biggest signal number.
>
>>>  /* Number of longs to hold all signals.  */
>>>  #define _JUMP_BUF_SIGSET_NWORDS \
>>>    ((_JUMP_BUF_SIGSET_NSIG - 1 + 7) / (8 * sizeof (unsigned long int)))
>>> diff --git a/sysdeps/x86/Makefile b/sysdeps/x86/Makefile
>>> index 0d0326c21a..d25d6f0ae4 100644
>>> --- a/sysdeps/x86/Makefile
>>> +++ b/sysdeps/x86/Makefile
>>> @@ -8,3 +8,7 @@ sysdep-dl-routines += dl-get-cpu-features
>>>  tests += tst-get-cpu-features
>>>  tests-static += tst-get-cpu-features-static
>>>  endif
>>> +
>>> +ifeq ($(subdir),setjmp)
>>> +sysdep_routines += __longjmp_cancel
>>> +endif
>>
>> OK.
>>
>>> diff --git a/sysdeps/x86/__longjmp_cancel.S b/sysdeps/x86/__longjmp_cancel.S
>>> new file mode 100644
>>> index 0000000000..b57dbfa376
>>> --- /dev/null
>>> +++ b/sysdeps/x86/__longjmp_cancel.S
>>> @@ -0,0 +1,20 @@
>>> +/* __longjmp_cancel for x86.
>>> +   Copyright (C) 2018 Free Software Foundation, Inc.
>>> +   This file is part of the GNU C Library.
>>> +
>>> +   The GNU C Library is free software; you can redistribute it and/or
>>> +   modify it under the terms of the GNU Lesser General Public
>>> +   License as published by the Free Software Foundation; either
>>> +   version 2.1 of the License, or (at your option) any later version.
>>> +
>>> +   The GNU C Library is distributed in the hope that it will be useful,
>>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>>> +   Lesser General Public License for more details.
>>> +
>>> +   You should have received a copy of the GNU Lesser General Public
>>> +   License along with the GNU C Library; if not, see
>>> +   <http://www.gnu.org/licenses/>.  */
>>> +
>>> +#define __longjmp __longjmp_cancel
>>> +#include <__longjmp.S>
>>
>> OK.
>>
>>> diff --git a/sysdeps/x86/longjmp.c b/sysdeps/x86/longjmp.c
>>> new file mode 100644
>>> index 0000000000..a53f31e1dd
>>> --- /dev/null
>>> +++ b/sysdeps/x86/longjmp.c
>>> @@ -0,0 +1,45 @@
>>> +/* __libc_siglongjmp for x86.
>>> +   Copyright (C) 2018 Free Software Foundation, Inc.
>>> +   This file is part of the GNU C Library.
>>> +
>>> +   The GNU C Library is free software; you can redistribute it and/or
>>> +   modify it under the terms of the GNU Lesser General Public
>>> +   License as published by the Free Software Foundation; either
>>> +   version 2.1 of the License, or (at your option) any later version.
>>> +
>>> +   The GNU C Library is distributed in the hope that it will be useful,
>>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>>> +   Lesser General Public License for more details.
>>> +
>>> +   You should have received a copy of the GNU Lesser General Public
>>> +   License along with the GNU C Library; if not, see
>>> +   <http://www.gnu.org/licenses/>.  */
>>> +
>>> +#define __libc_longjmp __redirect___libc_longjmp
>>> +#include <setjmp/longjmp.c>
>>> +#undef __libc_longjmp
>>> +
>>> +extern void __longjmp_cancel (__jmp_buf __env, int __val)
>>> +     __attribute__ ((__noreturn__)) attribute_hidden;
>>> +
>>> +/* Since __libc_longjmp is a private interface for cancellation
>>> +   implementation in libpthread, there is no need to restore shadow
>>> +   stack register.  */
>>> +
>>> +void
>>> +__libc_longjmp (sigjmp_buf env, int val)
>>> +{
>>> +  /* Perform any cleanups needed by the frames being unwound.  */
>>> +  _longjmp_unwind (env, val);
>>
>> OK.
>>
>>> +
>>> +  if (env[0].__mask_was_saved)
>>> +    /* Restore the saved signal mask.  */
>>> +    (void) __sigprocmask (SIG_SETMASK,
>>> +                       (sigset_t *) &env[0].__saved_mask,
>>> +                       (sigset_t *) NULL);
>>
>> OK.
>>
>>> +
>>> +  /* Call the machine-dependent function to restore machine state
>>> +     without shadow stack.  */
>>> +  __longjmp_cancel (env[0].__jmpbuf, val ?: 1);
>>
>> OK.
>>
>>> +}
>>> diff --git a/sysdeps/x86/nptl/pt-longjmp.c b/sysdeps/x86/nptl/pt-longjmp.c
>>> new file mode 100644
>>> index 0000000000..7eb8651cfe
>>> --- /dev/null
>>> +++ b/sysdeps/x86/nptl/pt-longjmp.c
>>> @@ -0,0 +1,97 @@
>>> +/* ABI compatibility for 'longjmp' and 'siglongjmp' symbols in libpthread ABI.
>>> +   X86 version.
>>> +   Copyright (C) 18 Free Software Foundation, Inc.
>>> +   This file is part of the GNU C Library.
>>> +
>>> +   The GNU C Library is free software; you can redistribute it and/or
>>> +   modify it under the terms of the GNU Lesser General Public
>>> +   License as published by the Free Software Foundation; either
>>> +   version 2.1 of the License, or (at your option) any later version.
>>> +
>>> +   The GNU C Library is distributed in the hope that it will be useful,
>>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>>> +   Lesser General Public License for more details.
>>> +
>>> +   You should have received a copy of the GNU Lesser General Public
>>> +   License along with the GNU C Library; if not, see
>>> +   <http://www.gnu.org/licenses/>.  */
>>> +
>>> +/* <nptl/descr.h> has
>>> +
>>> +struct pthread_unwind_buf
>>> +{
>>> +  struct
>>> +  {
>>> +    __jmp_buf jmp_buf;
>>> +    int mask_was_saved;
>>> +  } cancel_jmp_buf[1];
>>> +
>>> +  union
>>> +  {
>>> +    void *pad[4];
>>> +    struct
>>> +    {
>>> +      struct pthread_unwind_buf *prev;
>>> +      struct _pthread_cleanup_buffer *cleanup;
>>> +      int canceltype;
>>> +    } data;
>>> +  } priv;
>>> +};
>>> +
>>> +  The pad array in struct pthread_unwind_buf is used by setjmp to save
>>
>> This appears to be SHADOW_STACK_POINTER_OFFSET in subsequent patches.
>>
>> However on your hjl/cet/master branch it appears that this offset is not
>> defined to be *just after* the mask_was_saved?
>
> sysdeps/unix/sysv/linux/x86/setjmpP.h has
>
> typedef struct
>   {
>     unsigned long int __val[_JUMP_BUF_SIGSET_NWORDS];
>   } __jmp_buf_sigset_t;
>
> typedef union
>   {
>     __sigset_t __saved_mask_compat;
>     struct
>       {
>          __jmp_buf_sigset_t __saved_mask;
>           /* Used for shadow stack pointer.  */
>          unsigned long int __shadow_stack_pointer;
>       } __saved;
>   } __jmpbuf_arch_t;
>
> __shadow_stack_pointer is placed after __saved_mask, aka mask_was_saved.
>
>>> +  shadow stack register.  Assert that size of struct pthread_unwind_buf
>>> +  is no less than offset of shadow stack pointer plus shadow stack
>>> +  pointer size.
>>> +
>>> +  NB: setjmp is called in libpthread to save shadow stack register.  But
>>> +  __libc_unwind_longjmp doesn't restore shadow stack register since they
>>> +  never return after longjmp.  */
>>> +
>>> +#include <pthreadP.h>
>>> +#include <jmp_buf-ssp.h>
>>> +
>>> +#ifdef __x86_64__
>>> +# define SHADOW_STACK_POINTER_SIZE 8
>>> +#else
>>> +# define SHADOW_STACK_POINTER_SIZE 4
>>> +#endif
>>> +
>>> +_Static_assert ((sizeof (struct pthread_unwind_buf)
>>> +              >= (SHADOW_STACK_POINTER_OFFSET
>>> +                  + SHADOW_STACK_POINTER_SIZE)),
>>> +             "size of struct pthread_unwind_buf < "
>>> +             "(SHADOW_STACK_POINTER_OFFSET + SHADOW_STACK_POINTER_SIZE)");
>>
>> OK.
>>
>>> +
>>> +#include <shlib-compat.h>
>>> +
>>> +/* libpthread once had its own longjmp (and siglongjmp alias), though there
>>> +   was no apparent reason for it.  There is no use in having a separate
>>> +   symbol in libpthread, but the historical ABI requires it.  For static
>>> +   linking, there is no need to provide anything here--the libc version
>>> +   will be linked in.  For shared library ABI compatibility, there must be
>>> +   longjmp and siglongjmp symbols in libpthread.so.
>>> +
>>> +   With an IFUNC resolver, it would be possible to avoid the indirection,
>>> +   but the IFUNC resolver might run before the __libc_longjmp symbol has
>>> +   been relocated, in which case the IFUNC resolver would not be able to
>>> +   provide the correct address.  */
>>> +
>>> +#if SHLIB_COMPAT (libpthread, GLIBC_2_0, GLIBC_2_22)
>>> +
>>> +static void __attribute__ ((noreturn, used))
>>> +longjmp_compat (jmp_buf env, int val)
>>> +{
>>> +  /* NB: We call __libc_siglongjmp,  instead of __libc_longjmp, since
>>> +     __libc_longjmp is a private interface for cancellation which
>>> +     doesn't restore shadow stack register.  */
>>> +  __libc_siglongjmp (env, val);
>>
>> OK.
>>
>>> +}
>>> +
>>> +strong_alias (longjmp_compat, longjmp_alias)
>>> +compat_symbol (libpthread, longjmp_alias, longjmp, GLIBC_2_0);
>>> +
>>> +strong_alias (longjmp_alias, siglongjmp_alias)
>>> +compat_symbol (libpthread, siglongjmp_alias, siglongjmp, GLIBC_2_0);
>>> +
>>> +#endif
>>> -- 2.14.3
>>
>>

Here is the updated patch.  OK for master?

I will submit a CET patch set after this patch is merged.

Thanks.
Carlos O'Donell April 12, 2018, 9:36 p.m. UTC | #4
On 04/06/2018 07:59 AM, H.J. Lu wrote:
> On Thu, Apr 5, 2018 at 9:46 PM, Carlos O'Donell <carlos@redhat.com> wrote:
>> On 03/30/2018 12:41 PM, H.J. Lu wrote:
>>> On Thu, Mar 29, 2018 at 1:20 PM, Florian Weimer <fw@deneb.enyo.de> wrote:
>>>> * H. J. Lu:
>>>>
>>>>> On Thu, Mar 29, 2018 at 1:15 PM, Florian Weimer <fw@deneb.enyo.de> wrote:
>>>>>> * H. J. Lu:
>>>>>>
>>>>>>> You need to make a choice.  You either don't introduce a new symbol
>>>>>>> version or don't save shadow stack for thread cancellation.  You
>>>>>>> can't have both.
>>>>>> I don't understand.  We have room to save the shadow stack pointer in
>>>>>> the existing struct.
>>>>> No, we don't have room in struct pthread_unwind_buf:
>>>>>
>>>>> Note: There is an unused pointer space in pthread_unwind_buf_data.  But
>>>>> it isn't suitable for saving and restoring shadow stack register since
>>>>> x32 is a 64-bit process with 32-bit software pointer and kernel may
>>>>> place x32 shadow stack above 4GB.  We need to save and restore 64-bit
>>>>> shadow stack register for x32.
>>>> We have for void * fields.  They are subsequently overwritten by
>>>> __pthread_register_cancel.  But __sigsetjmp can write to them first
>>>> without causing any harm.  We just need a private __longjmp_cancel
>>>> that doesn't restore the shadow stack pointer.
>>> Here is the patch which does that.   Any comments?
>>
>> OK, so I have reviewed https://github.com/hjl-tools/glibc/ hjl/cet/master,
>> please confirm that this is the correct branch of the required implementation
>> for CET. It really helps to review the rest of the patch set, you should be
>> preparing this as a patch set instead of having it reviewed one-at-a-time.
>> This issue was already raised in the thread with Zack.
> 
> Thanks for your feedbacks.
> 
>> Architecture:
>>
>> * We avoid a "flag day" with feature_1 TCB flag to switch to a new ABI,
>>   which we have discussed is a fragile process which should be avoided if
>>   a supportable alternative solution exists.
>>
>> * We avoid versioned symbols, this makes CET backportable, and this has a
>>   bigger benefit for long-term stable distributions.
>>
>> * A key design problem has been that cancellation jump buffers within glibc
>>   are truncated to optimize on-stack size, and this means that setjmp could
>>   write beyond the structure because setjmp now tries to save the shadowstack
>>   pointer into space that the cancellation jump buffer did not allocate.
>>   For the record this optimization seems premature and I'm sad we did it, this
>>   is a lesson we should learn from.
>>
>> * We have all agreed to the following concepts:
>>
>>   * The cancellation process, in particular the unwinder, never returns from
>>     any of the functions we call, it just keeps calling into the unwinder to
>>     jump to the next unwound cancellation function all the way to the thread
>>     start routine. Therefore because we never return from one of these functions
>>     we never need to restore the shadow stack, and consequently wherever it is
>>     stored in the cancellation jump buffer can be overwritten if we need the
>>     space (it's a dead store).
>>
>>     * The corollary to this is that function calls made from cancellation handlers
>>       will continue to advance the shadowstack from the deepest point at which
>>       cancellation is initiated from. This means that the depth of the shadowstack
>>       doesn't match the depth of the real stack while we are unwinding. I don't
>>       know if this will have consequences on analysis tooling or not, or debug
>>       tools during unwinding. It's a fairly advanced situation and corner case,
>>       and restoring the shadowstack is not useful becuase we don't need it and
>>       simplifies the implementation.
>>
>>   * The cancellation jump buffer has private data used for chaining the cancel
>>     jump buffers together such that the custom unwinder can follow them and
>>     call them in sequence. This space constitutes 4 void *'s which is space
>>     that setjmp can write to, because we will just overwrite it when we register
>>     the cancel handler.
>>
>>   * If the new shadowstack-enabled setjmp stores the shadowstack pointer into
>>     the space taken by the 4 void*'s then we won't overflow the stack, and we
>>     don't need to change the layout of the cancellation jump buffer. The 4 void*'s
>>     are sufficient, even for x32 to write a 64-bit shadow stack address.
>>
>> * After fixing the cancellation jump buffers the following work needs to be reviewed:
>>
>>   * Add feature_1 in tcb head to track CET status and make it easily available
>>     to runtime for checking.
>>
>>   * Save and restore shadowstack in setjmp/longjmp.
>>
>>   * Add CET support to ld.so et. al. and track runtime status.
>>
>>   * Adjust vfork for shadow stack usage.
>>
>>   * Add ENDBR or NOTRACK where required in assembly.
>>
>>   * CET and makecontext incompatible.
>>     - Probably need to discuss which default is appropriate.
>>     - Should the user get CET automatically disabled in makecontext() et. al. silently?
>>     - Should your current solution, which is to error out during the build, and require
>>       flag changes, be the default? This forces the user to review the security for their
>>       application.
> 
> I'd like to reserve 4 slots in ucontext for shadow stack:
> 
> https://github.com/hjl-tools/glibc/commit/9bf6aefa8fb45f8df140d42ce9cf890bb24076e1
> 
> It should be binary backward compatible.  I will investigate if there is a way
> to support shadow stack with existing API.  Otherwise, we need to add a new
> API for ucontext functions with shadow stack.

Could you explain in detail how this is binary backwards compatible?

Is the assumption that if you extend ucontext_t, that the kernel will just write
more to the stack, and users who accesss it via the void* in a signal handler
setup via sigaction + SA_SIGINFO, will not read past the size they expect?

How is the shadow stack information moved between a getcontext -> setcontext
set of API calls? The user ucontext_t in existing binaries is too small to hold the
shadow stack?

Would we again have a "flag day" where CET enablement must be coordinated with
the definition of a new ucontext_t?

>>   * prctl for CET.
> 
> We have been experimenting different approaches to get the best implementation.
> I am expecting that this patch may change as we collect more data.
> 

OK.

>> * The work to review after this patch appears to be less contentious in terms of
>>   the kinds of changes that are required. Most of the changes are internal details
>>   of enabling CET and not ABI details, with the exception of the possible pain we
>>   might cause with makecontext() being unsupported and what default position to take
>>   there.
>>
>> Design:
>>
>> * Overall the implementation looks exactly how I might expect it to look, but some
>>   of the math that places the shadowstack pointer appears to need either commenting
>>   or fixing because I don't understand it. You need to make it easy for me to see
>>   that we have placed the shadowstack pointer into the 4 pad words.
>>
>> Details:
>>
>> * One comment needs filling out a bit more, noted below.
>>
>>> 0001-x86-Use-pad-in-pthread_unwind_buf-to-preserve-shadow.patch
>>>
>>>
>>> From f222537447f5ec879427f318b7c0396362b7453a Mon Sep 17 00:00:00 2001
>>> From: "H.J. Lu" <hjl.tools@gmail.com>
>>> Date: Sat, 24 Feb 2018 17:23:54 -0800
>>> Subject: [PATCH] x86: Use pad in pthread_unwind_buf to preserve shadow stack
>>>  register
>>>
>>> The pad array in struct pthread_unwind_buf is used by setjmp to save
>>> shadow stack register.  We assert that size of struct pthread_unwind_buf
>>> is no less than offset of shadow stack pointer + shadow stack pointer
>>> size.
>>>
>>
>> OK.
>>
>>> Since functions, like LIBC_START_MAIN, START_THREAD_DEFN as well as
>>> these with thread cancellation, call setjmp, but never return after
>>> __libc_unwind_longjmp, __libc_unwind_longjmp, which is defined as
>>> __libc_longjmp on x86, doesn't need to restore shadow stack register.
>>
>> OK.
>>
>>> __libc_longjmp, which is a private interface for thread cancellation
>>> implementation in libpthread, is changed to call __longjmp_cancel,
>>> instead of __longjmp.  __longjmp_cancel is a new internal function
>>> in libc, which is similar to __longjmp, but doesn't restore shadow
>>> stack register.
>>
>> OK. Good. I like the use of a __longjmp_cancel name to call out what's
>> going on in the API (clear semantics).
>>
>>>
>>> The compatibility longjmp and siglongjmp in libpthread.so are changed
>>> to call __libc_siglongjmp, instead of __libc_longjmp, so that they will
>>> restore shadow stack register.
>>
>> OK.
>>
>>>
>>> Tested with build-many-glibcs.py.
>>>
>>>       * nptl/pthread_create.c (START_THREAD_DEFN): Clear previous
>>>       handlers after setjmp.
>>>       * setjmp/longjmp.c (__libc_longjmp): Don't define alias if
>>>       defined.
>>>       * sysdeps/unix/sysv/linux/x86/setjmpP.h (_JUMP_BUF_SIGSET_NSIG):
>>>       Changed to 97.
>>>       * sysdeps/x86/Makefile (sysdep_routines): Add __longjmp_cancel.
>>>       * sysdeps/x86/__longjmp_cancel.S: New file.
>>>       * sysdeps/x86/longjmp.c: Likewise.
>>>       * sysdeps/x86/nptl/pt-longjmp.c: Likewise.
>>
>> This looks much better.
>>
>>> ---
>>>  nptl/pthread_create.c                 |  9 ++--
>>>  setjmp/longjmp.c                      |  2 +
>>>  sysdeps/unix/sysv/linux/x86/setjmpP.h |  4 +-
>>>  sysdeps/x86/Makefile                  |  4 ++
>>>  sysdeps/x86/__longjmp_cancel.S        | 20 ++++++++
>>>  sysdeps/x86/longjmp.c                 | 45 ++++++++++++++++
>>>  sysdeps/x86/nptl/pt-longjmp.c         | 97 +++++++++++++++++++++++++++++++++++
>>>  7 files changed, 176 insertions(+), 5 deletions(-)
>>>  create mode 100644 sysdeps/x86/__longjmp_cancel.S
>>>  create mode 100644 sysdeps/x86/longjmp.c
>>>  create mode 100644 sysdeps/x86/nptl/pt-longjmp.c
>>>
>>> diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c
>>> index caaf07c134..1c5b3780c6 100644
>>> --- a/nptl/pthread_create.c
>>> +++ b/nptl/pthread_create.c
>>> @@ -427,12 +427,15 @@ START_THREAD_DEFN
>>>       compilers without that support we do use setjmp.  */
>>>    struct pthread_unwind_buf unwind_buf;
>>>
>>> -  /* No previous handlers.  */
>>> +  int not_first_call;
>>> +  not_first_call = setjmp ((struct __jmp_buf_tag *) unwind_buf.cancel_jmp_buf);
>>> +
>>> +  /* No previous handlers.  NB: This must be done after setjmp since
>>> +     the same space may be used by setjmp to store extra data which
>>> +     should never be used by __libc_unwind_longjmp.  */
>>
>> Suggest:
>> ~~~
>> No previous handlers.  NB: This must be done after setjmp since
>> the private space in the unwind jump buffer may overlap space
>> used by setjmp to store extra architecture-specific information
>> which is never be used by the cancellation-specific
>> __libc_unwind_longjmp.
>>
>> The private space is allowed to overlap because the unwinder never
>> has to return through any of the jumped-to call frames, and thus
>> only a minimum amount of saved data need be stored, and for example,
>> need not include the process signal mask information. This is all
>> an optimization to reduce stack usage when pushing cancellation
>> handlers.
>> ~~~
> 
> Will fix it.
> 
>>>    unwind_buf.priv.data.prev = NULL;
>>>    unwind_buf.priv.data.cleanup = NULL;
>>>
>>> -  int not_first_call;
>>> -  not_first_call = setjmp ((struct __jmp_buf_tag *) unwind_buf.cancel_jmp_buf);
>>
>> OK.
>>
>>>    if (__glibc_likely (! not_first_call))
>>>      {
>>>        /* Store the new cleanup handler info.  */
>>> diff --git a/setjmp/longjmp.c b/setjmp/longjmp.c
>>> index a2a7065a85..453889e103 100644
>>> --- a/setjmp/longjmp.c
>>> +++ b/setjmp/longjmp.c
>>> @@ -40,9 +40,11 @@ __libc_siglongjmp (sigjmp_buf env, int val)
>>>  }
>>>
>>>  #ifndef __libc_siglongjmp
>>> +# ifndef __libc_longjmp
>>>  /* __libc_longjmp is a private interface for cancellation implementation
>>>     in libpthread.  */
>>>  strong_alias (__libc_siglongjmp, __libc_longjmp)
>>> +# endif
>>
>> OK.
>>
>>>  weak_alias (__libc_siglongjmp, _longjmp)
>>>  weak_alias (__libc_siglongjmp, longjmp)
>>>  weak_alias (__libc_siglongjmp, siglongjmp)
>>> diff --git a/sysdeps/unix/sysv/linux/x86/setjmpP.h b/sysdeps/unix/sysv/linux/x86/setjmpP.h
>>> index c0ed767a0d..90a6bbcf32 100644
>>> --- a/sysdeps/unix/sysv/linux/x86/setjmpP.h
>>> +++ b/sysdeps/unix/sysv/linux/x86/setjmpP.h
>>> @@ -22,8 +22,8 @@
>>>  #include <bits/types/__sigset_t.h>
>>>
>>>  /* The biggest signal number + 1.  As of kernel 4.14, x86 _NSIG is 64.
>>> -   Define it to 513 to leave some rooms for future use.  */
>>> -#define _JUMP_BUF_SIGSET_NSIG        513
>>> +   Define it to 97 to leave some rooms for future use.  */
>>
>> OK.
>>
>>> +#define _JUMP_BUF_SIGSET_NSIG        97
>>
>> Please provide proof in the way of a comment or rewriting this constant
>> to show that it places the shadow stack pointer on both x86_64 and x32
>> into the range of the private pad.
> 
> sysdeps/x86/nptl/pt-longjmp.c has
> 
> _Static_assert ((sizeof (struct pthread_unwind_buf)
>> = (SHADOW_STACK_POINTER_OFFSET
>      + SHADOW_STACK_POINTER_SIZE)),
> "size of struct pthread_unwind_buf < "
> "(SHADOW_STACK_POINTER_OFFSET + SHADOW_STACK_POINTER_SIZE)");
> 
> If shadow stack pointer is saved in the offset bigger than the size of
> struct pthread_unwind_buf, assert will trigger at compile-time.
> 

Thanks, I'll review this in the patch you posted.

>> Also, from commit  f33632ccd1dec3217583fcfdd965afb62954203c,
>> where did this math come from?
>>
>> ((_JUMP_BUF_SIGSET_NSIG - 1 + 7) / (8 * sizeof (unsigned long int)))
>>
>> Why the +7?
> 
> _JUMP_BUF_SIGSET_NSIG is the biggest signal number + 1.

Agreed.

> _JUMP_BUF_SIGSET_NSIG - 1 gives the biggest signal number.

Agreed.

> _JUMP_BUF_SIGSET_NSIG - 1 + 7 rounds up to the number of bytes
> which are needed to store the biggest signal number.

It does not.

Result	# of signals	# of bits	Current #	Expected #
FAIL	1		64		0		1
FAIL	2		64		0		1
FAIL	3		64		0		1
FAIL	4		64		0		1
FAIL	5		64		0		1
FAIL	6		64		0		1
FAIL	7		64		0		1
FAIL	8		64		0		1
FAIL	9		64		0		1
FAIL	10		64		0		1
FAIL	11		64		0		1
FAIL	12		64		0		1
FAIL	13		64		0		1
FAIL	14		64		0		1
FAIL	15		64		0		1
FAIL	16		64		0		1
FAIL	17		64		0		1
FAIL	18		64		0		1
FAIL	19		64		0		1
FAIL	20		64		0		1
FAIL	21		64		0		1
FAIL	22		64		0		1
FAIL	23		64		0		1
FAIL	24		64		0		1
FAIL	25		64		0		1
FAIL	26		64		0		1
FAIL	27		64		0		1
FAIL	28		64		0		1
FAIL	29		64		0		1
FAIL	30		64		0		1
FAIL	31		64		0		1
FAIL	32		64		0		1
FAIL	33		64		0		1
FAIL	34		64		0		1
FAIL	35		64		0		1
FAIL	36		64		0		1
FAIL	37		64		0		1
FAIL	38		64		0		1
FAIL	39		64		0		1
FAIL	40		64		0		1
FAIL	41		64		0		1
FAIL	42		64		0		1
FAIL	43		64		0		1
FAIL	44		64		0		1
FAIL	45		64		0		1
FAIL	46		64		0		1
FAIL	47		64		0		1
FAIL	48		64		0		1
FAIL	49		64		0		1
FAIL	50		64		0		1
FAIL	51		64		0		1
FAIL	52		64		0		1
FAIL	53		64		0		1
FAIL	54		64		0		1
FAIL	55		64		0		1
FAIL	56		64		0		1
FAIL	57		64		0		1
PASS	58		64		1		1
PASS	59		64		1		1
PASS	60		64		1		1
PASS	61		64		1		1
PASS	62		64		1		1
PASS	63		64		1		1
PASS	64		64		1		1
FAIL	65		128		1		2
FAIL	66		128		1		2
FAIL	67		128		1		2
FAIL	68		128		1		2
FAIL	69		128		1		2
FAIL	70		128		1		2
FAIL	71		128		1		2
FAIL	72		128		1		2
FAIL	73		128		1		2
FAIL	74		128		1		2
FAIL	75		128		1		2
FAIL	76		128		1		2
FAIL	77		128		1		2
FAIL	78		128		1		2
FAIL	79		128		1		2
FAIL	80		128		1		2
FAIL	81		128		1		2
FAIL	82		128		1		2
FAIL	83		128		1		2
FAIL	84		128		1		2
FAIL	85		128		1		2
FAIL	86		128		1		2
FAIL	87		128		1		2
FAIL	88		128		1		2
FAIL	89		128		1		2
FAIL	90		128		1		2
FAIL	91		128		1		2
FAIL	92		128		1		2
FAIL	93		128		1		2
FAIL	94		128		1		2
FAIL	95		128		1		2
FAIL	96		128		1		2
FAIL	97		128		1		2
FAIL	98		128		1		2
FAIL	99		128		1		2
FAIL	100		128		1		2
FAIL	101		128		1		2
FAIL	102		128		1		2
FAIL	103		128		1		2
FAIL	104		128		1		2
FAIL	105		128		1		2
FAIL	106		128		1		2
FAIL	107		128		1		2
FAIL	108		128		1		2
FAIL	109		128		1		2
FAIL	110		128		1		2
FAIL	111		128		1		2
FAIL	112		128		1		2
FAIL	113		128		1		2
FAIL	114		128		1		2
FAIL	115		128		1		2
FAIL	116		128		1		2
FAIL	117		128		1		2
FAIL	118		128		1		2
FAIL	119		128		1		2
FAIL	120		128		1		2
FAIL	121		128		1		2
PASS	122		128		2		2
PASS	123		128		2		2
PASS	124		128		2		2
PASS	125		128		2		2
PASS	126		128		2		2
PASS	127		128		2		2
PASS	128		128		2		2
FAIL	129		192		2		3
FAIL	130		192		2		3
FAIL	131		192		2		3
FAIL	132		192		2		3
FAIL	133		192		2		3
FAIL	134		192		2		3
FAIL	135		192		2		3
FAIL	136		192		2		3
FAIL	137		192		2		3
FAIL	138		192		2		3
FAIL	139		192		2		3
FAIL	140		192		2		3
FAIL	141		192		2		3
FAIL	142		192		2		3
FAIL	143		192		2		3
FAIL	144		192		2		3
FAIL	145		192		2		3
FAIL	146		192		2		3
FAIL	147		192		2		3
FAIL	148		192		2		3
FAIL	149		192		2		3
FAIL	150		192		2		3
FAIL	151		192		2		3
FAIL	152		192		2		3
FAIL	153		192		2		3
FAIL	154		192		2		3
FAIL	155		192		2		3
FAIL	156		192		2		3
FAIL	157		192		2		3
FAIL	158		192		2		3
FAIL	159		192		2		3
FAIL	160		192		2		3
FAIL	161		192		2		3
FAIL	162		192		2		3
FAIL	163		192		2		3
FAIL	164		192		2		3
FAIL	165		192		2		3
FAIL	166		192		2		3
FAIL	167		192		2		3
FAIL	168		192		2		3
FAIL	169		192		2		3
FAIL	170		192		2		3
FAIL	171		192		2		3
FAIL	172		192		2		3
FAIL	173		192		2		3
FAIL	174		192		2		3
FAIL	175		192		2		3
FAIL	176		192		2		3
FAIL	177		192		2		3
FAIL	178		192		2		3
FAIL	179		192		2		3
FAIL	180		192		2		3
FAIL	181		192		2		3
FAIL	182		192		2		3
FAIL	183		192		2		3
FAIL	184		192		2		3
FAIL	185		192		2		3
PASS	186		192		3		3
PASS	187		192		3		3
PASS	188		192		3		3
PASS	189		192		3		3
PASS	190		192		3		3
PASS	191		192		3		3
PASS	192		192		3		3
FAIL	193		256		3		4
FAIL	194		256		3		4
FAIL	195		256		3		4
FAIL	196		256		3		4
FAIL	197		256		3		4
FAIL	198		256		3		4
FAIL	199		256		3		4
FAIL	200		256		3		4
FAIL	201		256		3		4
FAIL	202		256		3		4
FAIL	203		256		3		4
FAIL	204		256		3		4
FAIL	205		256		3		4
FAIL	206		256		3		4
FAIL	207		256		3		4
FAIL	208		256		3		4
FAIL	209		256		3		4
FAIL	210		256		3		4
FAIL	211		256		3		4
FAIL	212		256		3		4
FAIL	213		256		3		4
FAIL	214		256		3		4
FAIL	215		256		3		4
FAIL	216		256		3		4
FAIL	217		256		3		4
FAIL	218		256		3		4
FAIL	219		256		3		4
FAIL	220		256		3		4
FAIL	221		256		3		4
FAIL	222		256		3		4
FAIL	223		256		3		4
FAIL	224		256		3		4
FAIL	225		256		3		4
FAIL	226		256		3		4
FAIL	227		256		3		4
FAIL	228		256		3		4
FAIL	229		256		3		4
FAIL	230		256		3		4
FAIL	231		256		3		4
FAIL	232		256		3		4
FAIL	233		256		3		4
FAIL	234		256		3		4
FAIL	235		256		3		4
FAIL	236		256		3		4
FAIL	237		256		3		4
FAIL	238		256		3		4
FAIL	239		256		3		4
FAIL	240		256		3		4
FAIL	241		256		3		4
FAIL	242		256		3		4
FAIL	243		256		3		4
FAIL	244		256		3		4
FAIL	245		256		3		4
FAIL	246		256		3		4
FAIL	247		256		3		4
FAIL	248		256		3		4
FAIL	249		256		3		4
PASS	250		256		4		4
PASS	251		256		4		4
PASS	252		256		4		4
PASS	253		256		4		4
PASS	254		256		4		4
PASS	255		256		4		4
PASS	256		256		4		4
FAIL	257		320		4		5
FAIL	258		320		4		5
FAIL	259		320		4		5
FAIL	260		320		4		5
FAIL	261		320		4		5
FAIL	262		320		4		5
FAIL	263		320		4		5
FAIL	264		320		4		5
FAIL	265		320		4		5
FAIL	266		320		4		5
FAIL	267		320		4		5
FAIL	268		320		4		5
FAIL	269		320		4		5
FAIL	270		320		4		5
FAIL	271		320		4		5
FAIL	272		320		4		5
FAIL	273		320		4		5
FAIL	274		320		4		5
FAIL	275		320		4		5
FAIL	276		320		4		5
FAIL	277		320		4		5
FAIL	278		320		4		5
FAIL	279		320		4		5
FAIL	280		320		4		5
FAIL	281		320		4		5
FAIL	282		320		4		5
FAIL	283		320		4		5
FAIL	284		320		4		5
FAIL	285		320		4		5
FAIL	286		320		4		5
FAIL	287		320		4		5
FAIL	288		320		4		5
FAIL	289		320		4		5
FAIL	290		320		4		5
FAIL	291		320		4		5
FAIL	292		320		4		5
FAIL	293		320		4		5
FAIL	294		320		4		5
FAIL	295		320		4		5
FAIL	296		320		4		5
FAIL	297		320		4		5
FAIL	298		320		4		5
FAIL	299		320		4		5
FAIL	300		320		4		5
FAIL	301		320		4		5
FAIL	302		320		4		5
FAIL	303		320		4		5
FAIL	304		320		4		5
FAIL	305		320		4		5
FAIL	306		320		4		5
FAIL	307		320		4		5
FAIL	308		320		4		5
FAIL	309		320		4		5
FAIL	310		320		4		5
FAIL	311		320		4		5
FAIL	312		320		4		5
FAIL	313		320		4		5
PASS	314		320		5		5
PASS	315		320		5		5
PASS	316		320		5		5
PASS	317		320		5		5
PASS	318		320		5		5
PASS	319		320		5		5
PASS	320		320		5		5
FAIL	321		384		5		6
FAIL	322		384		5		6
FAIL	323		384		5		6
FAIL	324		384		5		6
FAIL	325		384		5		6
FAIL	326		384		5		6
FAIL	327		384		5		6
FAIL	328		384		5		6
FAIL	329		384		5		6
FAIL	330		384		5		6
FAIL	331		384		5		6
FAIL	332		384		5		6
FAIL	333		384		5		6
FAIL	334		384		5		6
FAIL	335		384		5		6
FAIL	336		384		5		6
FAIL	337		384		5		6
FAIL	338		384		5		6
FAIL	339		384		5		6
FAIL	340		384		5		6
FAIL	341		384		5		6
FAIL	342		384		5		6
FAIL	343		384		5		6
FAIL	344		384		5		6
FAIL	345		384		5		6
FAIL	346		384		5		6
FAIL	347		384		5		6
FAIL	348		384		5		6
FAIL	349		384		5		6
FAIL	350		384		5		6
FAIL	351		384		5		6
FAIL	352		384		5		6
FAIL	353		384		5		6
FAIL	354		384		5		6
FAIL	355		384		5		6
FAIL	356		384		5		6
FAIL	357		384		5		6
FAIL	358		384		5		6
FAIL	359		384		5		6
FAIL	360		384		5		6
FAIL	361		384		5		6
FAIL	362		384		5		6
FAIL	363		384		5		6
FAIL	364		384		5		6
FAIL	365		384		5		6
FAIL	366		384		5		6
FAIL	367		384		5		6
FAIL	368		384		5		6
FAIL	369		384		5		6
FAIL	370		384		5		6
FAIL	371		384		5		6
FAIL	372		384		5		6
FAIL	373		384		5		6
FAIL	374		384		5		6
FAIL	375		384		5		6
FAIL	376		384		5		6
FAIL	377		384		5		6
PASS	378		384		6		6
PASS	379		384		6		6
PASS	380		384		6		6
PASS	381		384		6		6
PASS	382		384		6		6
PASS	383		384		6		6
PASS	384		384		6		6
FAIL	385		448		6		7
FAIL	386		448		6		7
FAIL	387		448		6		7
FAIL	388		448		6		7
FAIL	389		448		6		7
FAIL	390		448		6		7
FAIL	391		448		6		7
FAIL	392		448		6		7
FAIL	393		448		6		7
FAIL	394		448		6		7
FAIL	395		448		6		7
FAIL	396		448		6		7
FAIL	397		448		6		7
FAIL	398		448		6		7
FAIL	399		448		6		7
FAIL	400		448		6		7
FAIL	401		448		6		7
FAIL	402		448		6		7
FAIL	403		448		6		7
FAIL	404		448		6		7
FAIL	405		448		6		7
FAIL	406		448		6		7
FAIL	407		448		6		7
FAIL	408		448		6		7
FAIL	409		448		6		7
FAIL	410		448		6		7
FAIL	411		448		6		7
FAIL	412		448		6		7
FAIL	413		448		6		7
FAIL	414		448		6		7
FAIL	415		448		6		7
FAIL	416		448		6		7
FAIL	417		448		6		7
FAIL	418		448		6		7
FAIL	419		448		6		7
FAIL	420		448		6		7
FAIL	421		448		6		7
FAIL	422		448		6		7
FAIL	423		448		6		7
FAIL	424		448		6		7
FAIL	425		448		6		7
FAIL	426		448		6		7
FAIL	427		448		6		7
FAIL	428		448		6		7
FAIL	429		448		6		7
FAIL	430		448		6		7
FAIL	431		448		6		7
FAIL	432		448		6		7
FAIL	433		448		6		7
FAIL	434		448		6		7
FAIL	435		448		6		7
FAIL	436		448		6		7
FAIL	437		448		6		7
FAIL	438		448		6		7
FAIL	439		448		6		7
FAIL	440		448		6		7
FAIL	441		448		6		7
PASS	442		448		7		7
PASS	443		448		7		7
PASS	444		448		7		7
PASS	445		448		7		7
PASS	446		448		7		7
PASS	447		448		7		7
PASS	448		448		7		7
FAIL	449		512		7		8
FAIL	450		512		7		8
FAIL	451		512		7		8
FAIL	452		512		7		8
FAIL	453		512		7		8
FAIL	454		512		7		8
FAIL	455		512		7		8
FAIL	456		512		7		8
FAIL	457		512		7		8
FAIL	458		512		7		8
FAIL	459		512		7		8
FAIL	460		512		7		8
FAIL	461		512		7		8
FAIL	462		512		7		8
FAIL	463		512		7		8
FAIL	464		512		7		8
FAIL	465		512		7		8
FAIL	466		512		7		8
FAIL	467		512		7		8
FAIL	468		512		7		8
FAIL	469		512		7		8
FAIL	470		512		7		8
FAIL	471		512		7		8
FAIL	472		512		7		8
FAIL	473		512		7		8
FAIL	474		512		7		8
FAIL	475		512		7		8
FAIL	476		512		7		8
FAIL	477		512		7		8
FAIL	478		512		7		8
FAIL	479		512		7		8
FAIL	480		512		7		8
FAIL	481		512		7		8
FAIL	482		512		7		8
FAIL	483		512		7		8
FAIL	484		512		7		8
FAIL	485		512		7		8
FAIL	486		512		7		8
FAIL	487		512		7		8
FAIL	488		512		7		8
FAIL	489		512		7		8
FAIL	490		512		7		8
FAIL	491		512		7		8
FAIL	492		512		7		8
FAIL	493		512		7		8
FAIL	494		512		7		8
FAIL	495		512		7		8
FAIL	496		512		7		8
FAIL	497		512		7		8
FAIL	498		512		7		8
FAIL	499		512		7		8
FAIL	500		512		7		8
FAIL	501		512		7		8
FAIL	502		512		7		8
FAIL	503		512		7		8
FAIL	504		512		7		8
FAIL	505		512		7		8
PASS	506		512		8		8
PASS	507		512		8		8
PASS	508		512		8		8
PASS	509		512		8		8
PASS	510		512		8		8
PASS	511		512		8		8
PASS	512		512		8		8

Luckily for 512 signals (513) the math works out.

For 96 signals it does not.

(96 - 1 + 7) = 102
102 / 64 = 1

That's only a signal word, that only supports 64 signals, not 96.

Why doesn't the static assert trigger? Because you a < the size of the
pthread_unwind_buf, and too small actually, writing into other parts of
the buffer!

I would expect us to use something like this:

diff --git a/sysdeps/unix/sysv/linux/x86/setjmpP.h b/sysdeps/unix/sysv/linux/x86/setjmpP.h
index c0ed767a0d..6e1e8f901c 100644
--- a/sysdeps/unix/sysv/linux/x86/setjmpP.h
+++ b/sysdeps/unix/sysv/linux/x86/setjmpP.h
@@ -20,13 +20,15 @@
 #define        _SETJMPP_H      1
 
 #include <bits/types/__sigset_t.h>
+#include <libc-pointer-arith.h>
 
-/* The biggest signal number + 1.  As of kernel 4.14, x86 _NSIG is 64.
-   Define it to 513 to leave some rooms for future use.  */
-#define _JUMP_BUF_SIGSET_NSIG  513
+/* As of kernel 4.14, x86 _NSIG is 64.
+   Define it to 512 to leave some rooms for future use.  */
+#define _JUMP_BUF_SIGSET_NSIG  512
 /* Number of longs to hold all signals.  */
 #define _JUMP_BUF_SIGSET_NWORDS \
-  ((_JUMP_BUF_SIGSET_NSIG - 1 + 7) / (8 * sizeof (unsigned long int)))
+  (ALIGN_UP(_JUMP_BUF_SIGSET_NSIG, (8 * sizeof (unsigned long int)))   \
+   / (8 * sizeof (unsigned long int)))
 
 typedef struct
   {
---
... but with the size you need and explain *why* it's 96.

You need a comment like this:
/* The layout looks like this:
   - N words for this.
   - N words for that.
   - N words for shadow stack.
   Total 96 signals.  */

Align the number of signals up to multiple of signals you can store in
a hardware word, and then figure out how many words that takes up.

Please review my math.

>>>  /* Number of longs to hold all signals.  */
>>>  #define _JUMP_BUF_SIGSET_NWORDS \
>>>    ((_JUMP_BUF_SIGSET_NSIG - 1 + 7) / (8 * sizeof (unsigned long int)))
>>> diff --git a/sysdeps/x86/Makefile b/sysdeps/x86/Makefile
>>> index 0d0326c21a..d25d6f0ae4 100644
>>> --- a/sysdeps/x86/Makefile
>>> +++ b/sysdeps/x86/Makefile
>>> @@ -8,3 +8,7 @@ sysdep-dl-routines += dl-get-cpu-features
>>>  tests += tst-get-cpu-features
>>>  tests-static += tst-get-cpu-features-static
>>>  endif
>>> +
>>> +ifeq ($(subdir),setjmp)
>>> +sysdep_routines += __longjmp_cancel
>>> +endif
>>
>> OK.
>>
>>> diff --git a/sysdeps/x86/__longjmp_cancel.S b/sysdeps/x86/__longjmp_cancel.S
>>> new file mode 100644
>>> index 0000000000..b57dbfa376
>>> --- /dev/null
>>> +++ b/sysdeps/x86/__longjmp_cancel.S
>>> @@ -0,0 +1,20 @@
>>> +/* __longjmp_cancel for x86.
>>> +   Copyright (C) 2018 Free Software Foundation, Inc.
>>> +   This file is part of the GNU C Library.
>>> +
>>> +   The GNU C Library is free software; you can redistribute it and/or
>>> +   modify it under the terms of the GNU Lesser General Public
>>> +   License as published by the Free Software Foundation; either
>>> +   version 2.1 of the License, or (at your option) any later version.
>>> +
>>> +   The GNU C Library is distributed in the hope that it will be useful,
>>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>>> +   Lesser General Public License for more details.
>>> +
>>> +   You should have received a copy of the GNU Lesser General Public
>>> +   License along with the GNU C Library; if not, see
>>> +   <http://www.gnu.org/licenses/>.  */
>>> +
>>> +#define __longjmp __longjmp_cancel
>>> +#include <__longjmp.S>
>>
>> OK.
>>
>>> diff --git a/sysdeps/x86/longjmp.c b/sysdeps/x86/longjmp.c
>>> new file mode 100644
>>> index 0000000000..a53f31e1dd
>>> --- /dev/null
>>> +++ b/sysdeps/x86/longjmp.c
>>> @@ -0,0 +1,45 @@
>>> +/* __libc_siglongjmp for x86.
>>> +   Copyright (C) 2018 Free Software Foundation, Inc.
>>> +   This file is part of the GNU C Library.
>>> +
>>> +   The GNU C Library is free software; you can redistribute it and/or
>>> +   modify it under the terms of the GNU Lesser General Public
>>> +   License as published by the Free Software Foundation; either
>>> +   version 2.1 of the License, or (at your option) any later version.
>>> +
>>> +   The GNU C Library is distributed in the hope that it will be useful,
>>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>>> +   Lesser General Public License for more details.
>>> +
>>> +   You should have received a copy of the GNU Lesser General Public
>>> +   License along with the GNU C Library; if not, see
>>> +   <http://www.gnu.org/licenses/>.  */
>>> +
>>> +#define __libc_longjmp __redirect___libc_longjmp
>>> +#include <setjmp/longjmp.c>
>>> +#undef __libc_longjmp
>>> +
>>> +extern void __longjmp_cancel (__jmp_buf __env, int __val)
>>> +     __attribute__ ((__noreturn__)) attribute_hidden;
>>> +
>>> +/* Since __libc_longjmp is a private interface for cancellation
>>> +   implementation in libpthread, there is no need to restore shadow
>>> +   stack register.  */
>>> +
>>> +void
>>> +__libc_longjmp (sigjmp_buf env, int val)
>>> +{
>>> +  /* Perform any cleanups needed by the frames being unwound.  */
>>> +  _longjmp_unwind (env, val);
>>
>> OK.
>>
>>> +
>>> +  if (env[0].__mask_was_saved)
>>> +    /* Restore the saved signal mask.  */
>>> +    (void) __sigprocmask (SIG_SETMASK,
>>> +                       (sigset_t *) &env[0].__saved_mask,
>>> +                       (sigset_t *) NULL);
>>
>> OK.
>>
>>> +
>>> +  /* Call the machine-dependent function to restore machine state
>>> +     without shadow stack.  */
>>> +  __longjmp_cancel (env[0].__jmpbuf, val ?: 1);
>>
>> OK.
>>
>>> +}
>>> diff --git a/sysdeps/x86/nptl/pt-longjmp.c b/sysdeps/x86/nptl/pt-longjmp.c
>>> new file mode 100644
>>> index 0000000000..7eb8651cfe
>>> --- /dev/null
>>> +++ b/sysdeps/x86/nptl/pt-longjmp.c
>>> @@ -0,0 +1,97 @@
>>> +/* ABI compatibility for 'longjmp' and 'siglongjmp' symbols in libpthread ABI.
>>> +   X86 version.
>>> +   Copyright (C) 18 Free Software Foundation, Inc.
>>> +   This file is part of the GNU C Library.
>>> +
>>> +   The GNU C Library is free software; you can redistribute it and/or
>>> +   modify it under the terms of the GNU Lesser General Public
>>> +   License as published by the Free Software Foundation; either
>>> +   version 2.1 of the License, or (at your option) any later version.
>>> +
>>> +   The GNU C Library is distributed in the hope that it will be useful,
>>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>>> +   Lesser General Public License for more details.
>>> +
>>> +   You should have received a copy of the GNU Lesser General Public
>>> +   License along with the GNU C Library; if not, see
>>> +   <http://www.gnu.org/licenses/>.  */
>>> +
>>> +/* <nptl/descr.h> has
>>> +
>>> +struct pthread_unwind_buf
>>> +{
>>> +  struct
>>> +  {
>>> +    __jmp_buf jmp_buf;
>>> +    int mask_was_saved;
>>> +  } cancel_jmp_buf[1];
>>> +
>>> +  union
>>> +  {
>>> +    void *pad[4];
>>> +    struct
>>> +    {
>>> +      struct pthread_unwind_buf *prev;
>>> +      struct _pthread_cleanup_buffer *cleanup;
>>> +      int canceltype;
>>> +    } data;
>>> +  } priv;
>>> +};
>>> +
>>> +  The pad array in struct pthread_unwind_buf is used by setjmp to save
>>
>> This appears to be SHADOW_STACK_POINTER_OFFSET in subsequent patches.
>>
>> However on your hjl/cet/master branch it appears that this offset is not
>> defined to be *just after* the mask_was_saved?
> 
> sysdeps/unix/sysv/linux/x86/setjmpP.h has
> 
> typedef struct
>   {
>     unsigned long int __val[_JUMP_BUF_SIGSET_NWORDS];
>   } __jmp_buf_sigset_t;
> 
> typedef union
>   {
>     __sigset_t __saved_mask_compat;
>     struct
>       {
>          __jmp_buf_sigset_t __saved_mask;
>           /* Used for shadow stack pointer.  */
>          unsigned long int __shadow_stack_pointer;
>       } __saved;
>   } __jmpbuf_arch_t;
> 
> __shadow_stack_pointer is placed after __saved_mask, aka mask_was_saved.

OK, I'll re-review.

>>> +  shadow stack register.  Assert that size of struct pthread_unwind_buf
>>> +  is no less than offset of shadow stack pointer plus shadow stack
>>> +  pointer size.
>>> +
>>> +  NB: setjmp is called in libpthread to save shadow stack register.  But
>>> +  __libc_unwind_longjmp doesn't restore shadow stack register since they
>>> +  never return after longjmp.  */
>>> +
>>> +#include <pthreadP.h>
>>> +#include <jmp_buf-ssp.h>
>>> +
>>> +#ifdef __x86_64__
>>> +# define SHADOW_STACK_POINTER_SIZE 8
>>> +#else
>>> +# define SHADOW_STACK_POINTER_SIZE 4
>>> +#endif
>>> +
>>> +_Static_assert ((sizeof (struct pthread_unwind_buf)
>>> +              >= (SHADOW_STACK_POINTER_OFFSET
>>> +                  + SHADOW_STACK_POINTER_SIZE)),
>>> +             "size of struct pthread_unwind_buf < "
>>> +             "(SHADOW_STACK_POINTER_OFFSET + SHADOW_STACK_POINTER_SIZE)");
>>
>> OK.
>>
>>> +
>>> +#include <shlib-compat.h>
>>> +
>>> +/* libpthread once had its own longjmp (and siglongjmp alias), though there
>>> +   was no apparent reason for it.  There is no use in having a separate
>>> +   symbol in libpthread, but the historical ABI requires it.  For static
>>> +   linking, there is no need to provide anything here--the libc version
>>> +   will be linked in.  For shared library ABI compatibility, there must be
>>> +   longjmp and siglongjmp symbols in libpthread.so.
>>> +
>>> +   With an IFUNC resolver, it would be possible to avoid the indirection,
>>> +   but the IFUNC resolver might run before the __libc_longjmp symbol has
>>> +   been relocated, in which case the IFUNC resolver would not be able to
>>> +   provide the correct address.  */
>>> +
>>> +#if SHLIB_COMPAT (libpthread, GLIBC_2_0, GLIBC_2_22)
>>> +
>>> +static void __attribute__ ((noreturn, used))
>>> +longjmp_compat (jmp_buf env, int val)
>>> +{
>>> +  /* NB: We call __libc_siglongjmp,  instead of __libc_longjmp, since
>>> +     __libc_longjmp is a private interface for cancellation which
>>> +     doesn't restore shadow stack register.  */
>>> +  __libc_siglongjmp (env, val);
>>
>> OK.
>>
>>> +}
>>> +
>>> +strong_alias (longjmp_compat, longjmp_alias)
>>> +compat_symbol (libpthread, longjmp_alias, longjmp, GLIBC_2_0);
>>> +
>>> +strong_alias (longjmp_alias, siglongjmp_alias)
>>> +compat_symbol (libpthread, siglongjmp_alias, siglongjmp, GLIBC_2_0);
>>> +
>>> +#endif
>>> -- 2.14.3
>>
>>
>> --
>> Cheers,
>> Carlos.
> 
> 
> 

I look forward to a v2 with correct rounding and a new comment.
Carlos O'Donell April 12, 2018, 9:36 p.m. UTC | #5
On 04/06/2018 03:26 PM, H.J. Lu wrote:
> Here is the updated patch.  OK for master?
> 
> I will submit a CET patch set after this patch is merged.
> 
> Thanks.
> 

I will wait for a conclusion around the incorrect
_JUMP_BUF_SIGSET_NWORDS computation.
H.J. Lu April 12, 2018, 11:50 p.m. UTC | #6
On Thu, Apr 12, 2018 at 2:36 PM, Carlos O'Donell <carlos@redhat.com> wrote:
> On 04/06/2018 07:59 AM, H.J. Lu wrote:
>> On Thu, Apr 5, 2018 at 9:46 PM, Carlos O'Donell <carlos@redhat.com> wrote:
>>> On 03/30/2018 12:41 PM, H.J. Lu wrote:
>>>> On Thu, Mar 29, 2018 at 1:20 PM, Florian Weimer <fw@deneb.enyo.de> wrote:
>>>>> * H. J. Lu:
>>>>>
>>>>>> On Thu, Mar 29, 2018 at 1:15 PM, Florian Weimer <fw@deneb.enyo.de> wrote:
>>>>>>> * H. J. Lu:
>>>>>>>
>>>>>>>> You need to make a choice.  You either don't introduce a new symbol
>>>>>>>> version or don't save shadow stack for thread cancellation.  You
>>>>>>>> can't have both.
>>>>>>> I don't understand.  We have room to save the shadow stack pointer in
>>>>>>> the existing struct.
>>>>>> No, we don't have room in struct pthread_unwind_buf:
>>>>>>
>>>>>> Note: There is an unused pointer space in pthread_unwind_buf_data.  But
>>>>>> it isn't suitable for saving and restoring shadow stack register since
>>>>>> x32 is a 64-bit process with 32-bit software pointer and kernel may
>>>>>> place x32 shadow stack above 4GB.  We need to save and restore 64-bit
>>>>>> shadow stack register for x32.
>>>>> We have for void * fields.  They are subsequently overwritten by
>>>>> __pthread_register_cancel.  But __sigsetjmp can write to them first
>>>>> without causing any harm.  We just need a private __longjmp_cancel
>>>>> that doesn't restore the shadow stack pointer.
>>>> Here is the patch which does that.   Any comments?
>>>
>>> OK, so I have reviewed https://github.com/hjl-tools/glibc/ hjl/cet/master,
>>> please confirm that this is the correct branch of the required implementation
>>> for CET. It really helps to review the rest of the patch set, you should be
>>> preparing this as a patch set instead of having it reviewed one-at-a-time.
>>> This issue was already raised in the thread with Zack.
>>
>> Thanks for your feedbacks.
>>
>>> Architecture:
>>>
>>> * We avoid a "flag day" with feature_1 TCB flag to switch to a new ABI,
>>>   which we have discussed is a fragile process which should be avoided if
>>>   a supportable alternative solution exists.
>>>
>>> * We avoid versioned symbols, this makes CET backportable, and this has a
>>>   bigger benefit for long-term stable distributions.
>>>
>>> * A key design problem has been that cancellation jump buffers within glibc
>>>   are truncated to optimize on-stack size, and this means that setjmp could
>>>   write beyond the structure because setjmp now tries to save the shadowstack
>>>   pointer into space that the cancellation jump buffer did not allocate.
>>>   For the record this optimization seems premature and I'm sad we did it, this
>>>   is a lesson we should learn from.
>>>
>>> * We have all agreed to the following concepts:
>>>
>>>   * The cancellation process, in particular the unwinder, never returns from
>>>     any of the functions we call, it just keeps calling into the unwinder to
>>>     jump to the next unwound cancellation function all the way to the thread
>>>     start routine. Therefore because we never return from one of these functions
>>>     we never need to restore the shadow stack, and consequently wherever it is
>>>     stored in the cancellation jump buffer can be overwritten if we need the
>>>     space (it's a dead store).
>>>
>>>     * The corollary to this is that function calls made from cancellation handlers
>>>       will continue to advance the shadowstack from the deepest point at which
>>>       cancellation is initiated from. This means that the depth of the shadowstack
>>>       doesn't match the depth of the real stack while we are unwinding. I don't
>>>       know if this will have consequences on analysis tooling or not, or debug
>>>       tools during unwinding. It's a fairly advanced situation and corner case,
>>>       and restoring the shadowstack is not useful becuase we don't need it and
>>>       simplifies the implementation.
>>>
>>>   * The cancellation jump buffer has private data used for chaining the cancel
>>>     jump buffers together such that the custom unwinder can follow them and
>>>     call them in sequence. This space constitutes 4 void *'s which is space
>>>     that setjmp can write to, because we will just overwrite it when we register
>>>     the cancel handler.
>>>
>>>   * If the new shadowstack-enabled setjmp stores the shadowstack pointer into
>>>     the space taken by the 4 void*'s then we won't overflow the stack, and we
>>>     don't need to change the layout of the cancellation jump buffer. The 4 void*'s
>>>     are sufficient, even for x32 to write a 64-bit shadow stack address.
>>>
>>> * After fixing the cancellation jump buffers the following work needs to be reviewed:
>>>
>>>   * Add feature_1 in tcb head to track CET status and make it easily available
>>>     to runtime for checking.
>>>
>>>   * Save and restore shadowstack in setjmp/longjmp.
>>>
>>>   * Add CET support to ld.so et. al. and track runtime status.
>>>
>>>   * Adjust vfork for shadow stack usage.
>>>
>>>   * Add ENDBR or NOTRACK where required in assembly.
>>>
>>>   * CET and makecontext incompatible.
>>>     - Probably need to discuss which default is appropriate.
>>>     - Should the user get CET automatically disabled in makecontext() et. al. silently?
>>>     - Should your current solution, which is to error out during the build, and require
>>>       flag changes, be the default? This forces the user to review the security for their
>>>       application.
>>
>> I'd like to reserve 4 slots in ucontext for shadow stack:
>>
>> https://github.com/hjl-tools/glibc/commit/9bf6aefa8fb45f8df140d42ce9cf890bb24076e1
>>
>> It should be binary backward compatible.  I will investigate if there is a way
>> to support shadow stack with existing API.  Otherwise, we need to add a new
>> API for ucontext functions with shadow stack.
>
> Could you explain in detail how this is binary backwards compatible?
>
> Is the assumption that if you extend ucontext_t, that the kernel will just write
> more to the stack, and users who accesss it via the void* in a signal handler
> setup via sigaction + SA_SIGINFO, will not read past the size they expect?
>
> How is the shadow stack information moved between a getcontext -> setcontext
> set of API calls? The user ucontext_t in existing binaries is too small to hold the
> shadow stack?
>
> Would we again have a "flag day" where CET enablement must be coordinated with
> the definition of a new ucontext_t?
>
>>>   * prctl for CET.
>>
>> We have been experimenting different approaches to get the best implementation.
>> I am expecting that this patch may change as we collect more data.
>>
>
> OK.
>
>>> * The work to review after this patch appears to be less contentious in terms of
>>>   the kinds of changes that are required. Most of the changes are internal details
>>>   of enabling CET and not ABI details, with the exception of the possible pain we
>>>   might cause with makecontext() being unsupported and what default position to take
>>>   there.
>>>
>>> Design:
>>>
>>> * Overall the implementation looks exactly how I might expect it to look, but some
>>>   of the math that places the shadowstack pointer appears to need either commenting
>>>   or fixing because I don't understand it. You need to make it easy for me to see
>>>   that we have placed the shadowstack pointer into the 4 pad words.
>>>
>>> Details:
>>>
>>> * One comment needs filling out a bit more, noted below.
>>>
>>>> 0001-x86-Use-pad-in-pthread_unwind_buf-to-preserve-shadow.patch
>>>>
>>>>
>>>> From f222537447f5ec879427f318b7c0396362b7453a Mon Sep 17 00:00:00 2001
>>>> From: "H.J. Lu" <hjl.tools@gmail.com>
>>>> Date: Sat, 24 Feb 2018 17:23:54 -0800
>>>> Subject: [PATCH] x86: Use pad in pthread_unwind_buf to preserve shadow stack
>>>>  register
>>>>
>>>> The pad array in struct pthread_unwind_buf is used by setjmp to save
>>>> shadow stack register.  We assert that size of struct pthread_unwind_buf
>>>> is no less than offset of shadow stack pointer + shadow stack pointer
>>>> size.
>>>>
>>>
>>> OK.
>>>
>>>> Since functions, like LIBC_START_MAIN, START_THREAD_DEFN as well as
>>>> these with thread cancellation, call setjmp, but never return after
>>>> __libc_unwind_longjmp, __libc_unwind_longjmp, which is defined as
>>>> __libc_longjmp on x86, doesn't need to restore shadow stack register.
>>>
>>> OK.
>>>
>>>> __libc_longjmp, which is a private interface for thread cancellation
>>>> implementation in libpthread, is changed to call __longjmp_cancel,
>>>> instead of __longjmp.  __longjmp_cancel is a new internal function
>>>> in libc, which is similar to __longjmp, but doesn't restore shadow
>>>> stack register.
>>>
>>> OK. Good. I like the use of a __longjmp_cancel name to call out what's
>>> going on in the API (clear semantics).
>>>
>>>>
>>>> The compatibility longjmp and siglongjmp in libpthread.so are changed
>>>> to call __libc_siglongjmp, instead of __libc_longjmp, so that they will
>>>> restore shadow stack register.
>>>
>>> OK.
>>>
>>>>
>>>> Tested with build-many-glibcs.py.
>>>>
>>>>       * nptl/pthread_create.c (START_THREAD_DEFN): Clear previous
>>>>       handlers after setjmp.
>>>>       * setjmp/longjmp.c (__libc_longjmp): Don't define alias if
>>>>       defined.
>>>>       * sysdeps/unix/sysv/linux/x86/setjmpP.h (_JUMP_BUF_SIGSET_NSIG):
>>>>       Changed to 97.
>>>>       * sysdeps/x86/Makefile (sysdep_routines): Add __longjmp_cancel.
>>>>       * sysdeps/x86/__longjmp_cancel.S: New file.
>>>>       * sysdeps/x86/longjmp.c: Likewise.
>>>>       * sysdeps/x86/nptl/pt-longjmp.c: Likewise.
>>>
>>> This looks much better.
>>>
>>>> ---
>>>>  nptl/pthread_create.c                 |  9 ++--
>>>>  setjmp/longjmp.c                      |  2 +
>>>>  sysdeps/unix/sysv/linux/x86/setjmpP.h |  4 +-
>>>>  sysdeps/x86/Makefile                  |  4 ++
>>>>  sysdeps/x86/__longjmp_cancel.S        | 20 ++++++++
>>>>  sysdeps/x86/longjmp.c                 | 45 ++++++++++++++++
>>>>  sysdeps/x86/nptl/pt-longjmp.c         | 97 +++++++++++++++++++++++++++++++++++
>>>>  7 files changed, 176 insertions(+), 5 deletions(-)
>>>>  create mode 100644 sysdeps/x86/__longjmp_cancel.S
>>>>  create mode 100644 sysdeps/x86/longjmp.c
>>>>  create mode 100644 sysdeps/x86/nptl/pt-longjmp.c
>>>>
>>>> diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c
>>>> index caaf07c134..1c5b3780c6 100644
>>>> --- a/nptl/pthread_create.c
>>>> +++ b/nptl/pthread_create.c
>>>> @@ -427,12 +427,15 @@ START_THREAD_DEFN
>>>>       compilers without that support we do use setjmp.  */
>>>>    struct pthread_unwind_buf unwind_buf;
>>>>
>>>> -  /* No previous handlers.  */
>>>> +  int not_first_call;
>>>> +  not_first_call = setjmp ((struct __jmp_buf_tag *) unwind_buf.cancel_jmp_buf);
>>>> +
>>>> +  /* No previous handlers.  NB: This must be done after setjmp since
>>>> +     the same space may be used by setjmp to store extra data which
>>>> +     should never be used by __libc_unwind_longjmp.  */
>>>
>>> Suggest:
>>> ~~~
>>> No previous handlers.  NB: This must be done after setjmp since
>>> the private space in the unwind jump buffer may overlap space
>>> used by setjmp to store extra architecture-specific information
>>> which is never be used by the cancellation-specific
>>> __libc_unwind_longjmp.
>>>
>>> The private space is allowed to overlap because the unwinder never
>>> has to return through any of the jumped-to call frames, and thus
>>> only a minimum amount of saved data need be stored, and for example,
>>> need not include the process signal mask information. This is all
>>> an optimization to reduce stack usage when pushing cancellation
>>> handlers.
>>> ~~~
>>
>> Will fix it.
>>
>>>>    unwind_buf.priv.data.prev = NULL;
>>>>    unwind_buf.priv.data.cleanup = NULL;
>>>>
>>>> -  int not_first_call;
>>>> -  not_first_call = setjmp ((struct __jmp_buf_tag *) unwind_buf.cancel_jmp_buf);
>>>
>>> OK.
>>>
>>>>    if (__glibc_likely (! not_first_call))
>>>>      {
>>>>        /* Store the new cleanup handler info.  */
>>>> diff --git a/setjmp/longjmp.c b/setjmp/longjmp.c
>>>> index a2a7065a85..453889e103 100644
>>>> --- a/setjmp/longjmp.c
>>>> +++ b/setjmp/longjmp.c
>>>> @@ -40,9 +40,11 @@ __libc_siglongjmp (sigjmp_buf env, int val)
>>>>  }
>>>>
>>>>  #ifndef __libc_siglongjmp
>>>> +# ifndef __libc_longjmp
>>>>  /* __libc_longjmp is a private interface for cancellation implementation
>>>>     in libpthread.  */
>>>>  strong_alias (__libc_siglongjmp, __libc_longjmp)
>>>> +# endif
>>>
>>> OK.
>>>
>>>>  weak_alias (__libc_siglongjmp, _longjmp)
>>>>  weak_alias (__libc_siglongjmp, longjmp)
>>>>  weak_alias (__libc_siglongjmp, siglongjmp)
>>>> diff --git a/sysdeps/unix/sysv/linux/x86/setjmpP.h b/sysdeps/unix/sysv/linux/x86/setjmpP.h
>>>> index c0ed767a0d..90a6bbcf32 100644
>>>> --- a/sysdeps/unix/sysv/linux/x86/setjmpP.h
>>>> +++ b/sysdeps/unix/sysv/linux/x86/setjmpP.h
>>>> @@ -22,8 +22,8 @@
>>>>  #include <bits/types/__sigset_t.h>
>>>>
>>>>  /* The biggest signal number + 1.  As of kernel 4.14, x86 _NSIG is 64.
>>>> -   Define it to 513 to leave some rooms for future use.  */
>>>> -#define _JUMP_BUF_SIGSET_NSIG        513
>>>> +   Define it to 97 to leave some rooms for future use.  */
>>>
>>> OK.
>>>
>>>> +#define _JUMP_BUF_SIGSET_NSIG        97
>>>
>>> Please provide proof in the way of a comment or rewriting this constant
>>> to show that it places the shadow stack pointer on both x86_64 and x32
>>> into the range of the private pad.
>>
>> sysdeps/x86/nptl/pt-longjmp.c has
>>
>> _Static_assert ((sizeof (struct pthread_unwind_buf)
>>> = (SHADOW_STACK_POINTER_OFFSET
>>      + SHADOW_STACK_POINTER_SIZE)),
>> "size of struct pthread_unwind_buf < "
>> "(SHADOW_STACK_POINTER_OFFSET + SHADOW_STACK_POINTER_SIZE)");
>>
>> If shadow stack pointer is saved in the offset bigger than the size of
>> struct pthread_unwind_buf, assert will trigger at compile-time.
>>
>
> Thanks, I'll review this in the patch you posted.
>
>>> Also, from commit  f33632ccd1dec3217583fcfdd965afb62954203c,
>>> where did this math come from?
>>>
>>> ((_JUMP_BUF_SIGSET_NSIG - 1 + 7) / (8 * sizeof (unsigned long int)))
>>>
>>> Why the +7?
>>
>> _JUMP_BUF_SIGSET_NSIG is the biggest signal number + 1.
>
> Agreed.
>
>> _JUMP_BUF_SIGSET_NSIG - 1 gives the biggest signal number.
>
> Agreed.
>
>> _JUMP_BUF_SIGSET_NSIG - 1 + 7 rounds up to the number of bytes
>> which are needed to store the biggest signal number.
>
> It does not.

I changed to

/* Number of bits per long.  */
#define _JUMP_BUF_SIGSET_BITS_PER_WORD (8 * sizeof (unsigned long int))
/* The biggest signal number.  As of kernel 4.14, x86 _NSIG is 64.
   Define it to 96 to leave some rooms for future use.  */
#define _JUMP_BUF_SIGSET_NSIG 96
/* Number of longs to hold all signals.  */
#define _JUMP_BUF_SIGSET_NWORDS \
  (ALIGN_UP (_JUMP_BUF_SIGSET_NSIG, _JUMP_BUF_SIGSET_BITS_PER_WORD) \
   / _JUMP_BUF_SIGSET_BITS_PER_WORD)

> Result  # of signals    # of bits       Current #       Expected #
...

> Luckily for 512 signals (513) the math works out.
>
> For 96 signals it does not.
>
> (96 - 1 + 7) = 102
> 102 / 64 = 1

True.

> That's only a signal word, that only supports 64 signals, not 96.

Lucky for me.  Since unsigned long int __shadow_stack_pointer is aligned
as unsigned long, there is padding before __shadow_stack_pointer.

> Why doesn't the static assert trigger? Because you a < the size of the
> pthread_unwind_buf, and too small actually, writing into other parts of
> the buffer!

Assert

_Static_assert ((sizeof (struct pthread_unwind_buf)
>= (SHADOW_STACK_POINTER_OFFSET
     + SHADOW_STACK_POINTER_SIZE)),
"size of struct pthread_unwind_buf < "
"(SHADOW_STACK_POINTER_OFFSET + SHADOW_STACK_POINTER_SIZE)");

is correct.

> I would expect us to use something like this:
>
> diff --git a/sysdeps/unix/sysv/linux/x86/setjmpP.h b/sysdeps/unix/sysv/linux/x86/setjmpP.h
> index c0ed767a0d..6e1e8f901c 100644
> --- a/sysdeps/unix/sysv/linux/x86/setjmpP.h
> +++ b/sysdeps/unix/sysv/linux/x86/setjmpP.h
> @@ -20,13 +20,15 @@
>  #define        _SETJMPP_H      1
>
>  #include <bits/types/__sigset_t.h>
> +#include <libc-pointer-arith.h>
>
> -/* The biggest signal number + 1.  As of kernel 4.14, x86 _NSIG is 64.
> -   Define it to 513 to leave some rooms for future use.  */
> -#define _JUMP_BUF_SIGSET_NSIG  513
> +/* As of kernel 4.14, x86 _NSIG is 64.
> +   Define it to 512 to leave some rooms for future use.  */
> +#define _JUMP_BUF_SIGSET_NSIG  512
>  /* Number of longs to hold all signals.  */
>  #define _JUMP_BUF_SIGSET_NWORDS \
> -  ((_JUMP_BUF_SIGSET_NSIG - 1 + 7) / (8 * sizeof (unsigned long int)))
> +  (ALIGN_UP(_JUMP_BUF_SIGSET_NSIG, (8 * sizeof (unsigned long int)))   \
> +   / (8 * sizeof (unsigned long int)))
>
>  typedef struct
>    {
> ---
> ... but with the size you need and explain *why* it's 96.
>
> You need a comment like this:
> /* The layout looks like this:
>    - N words for this.
>    - N words for that.
>    - N words for shadow stack.
>    Total 96 signals.  */
>
> Align the number of signals up to multiple of signals you can store in
> a hardware word, and then figure out how many words that takes up.

I put comments in sysdeps/x86/nptl/pt-longjmp.c together with static
assert.

> Please review my math.
>
>>>>  /* Number of longs to hold all signals.  */
>>>>  #define _JUMP_BUF_SIGSET_NWORDS \
>>>>    ((_JUMP_BUF_SIGSET_NSIG - 1 + 7) / (8 * sizeof (unsigned long int)))
>>>> diff --git a/sysdeps/x86/Makefile b/sysdeps/x86/Makefile
>>>> index 0d0326c21a..d25d6f0ae4 100644
>>>> --- a/sysdeps/x86/Makefile
>>>> +++ b/sysdeps/x86/Makefile
>>>> @@ -8,3 +8,7 @@ sysdep-dl-routines += dl-get-cpu-features
>>>>  tests += tst-get-cpu-features
>>>>  tests-static += tst-get-cpu-features-static
>>>>  endif
>>>> +
>>>> +ifeq ($(subdir),setjmp)
>>>> +sysdep_routines += __longjmp_cancel
>>>> +endif
>>>
>>> OK.
>>>
>>>> diff --git a/sysdeps/x86/__longjmp_cancel.S b/sysdeps/x86/__longjmp_cancel.S
>>>> new file mode 100644
>>>> index 0000000000..b57dbfa376
>>>> --- /dev/null
>>>> +++ b/sysdeps/x86/__longjmp_cancel.S
>>>> @@ -0,0 +1,20 @@
>>>> +/* __longjmp_cancel for x86.
>>>> +   Copyright (C) 2018 Free Software Foundation, Inc.
>>>> +   This file is part of the GNU C Library.
>>>> +
>>>> +   The GNU C Library is free software; you can redistribute it and/or
>>>> +   modify it under the terms of the GNU Lesser General Public
>>>> +   License as published by the Free Software Foundation; either
>>>> +   version 2.1 of the License, or (at your option) any later version.
>>>> +
>>>> +   The GNU C Library is distributed in the hope that it will be useful,
>>>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>>>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>>>> +   Lesser General Public License for more details.
>>>> +
>>>> +   You should have received a copy of the GNU Lesser General Public
>>>> +   License along with the GNU C Library; if not, see
>>>> +   <http://www.gnu.org/licenses/>.  */
>>>> +
>>>> +#define __longjmp __longjmp_cancel
>>>> +#include <__longjmp.S>
>>>
>>> OK.
>>>
>>>> diff --git a/sysdeps/x86/longjmp.c b/sysdeps/x86/longjmp.c
>>>> new file mode 100644
>>>> index 0000000000..a53f31e1dd
>>>> --- /dev/null
>>>> +++ b/sysdeps/x86/longjmp.c
>>>> @@ -0,0 +1,45 @@
>>>> +/* __libc_siglongjmp for x86.
>>>> +   Copyright (C) 2018 Free Software Foundation, Inc.
>>>> +   This file is part of the GNU C Library.
>>>> +
>>>> +   The GNU C Library is free software; you can redistribute it and/or
>>>> +   modify it under the terms of the GNU Lesser General Public
>>>> +   License as published by the Free Software Foundation; either
>>>> +   version 2.1 of the License, or (at your option) any later version.
>>>> +
>>>> +   The GNU C Library is distributed in the hope that it will be useful,
>>>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>>>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>>>> +   Lesser General Public License for more details.
>>>> +
>>>> +   You should have received a copy of the GNU Lesser General Public
>>>> +   License along with the GNU C Library; if not, see
>>>> +   <http://www.gnu.org/licenses/>.  */
>>>> +
>>>> +#define __libc_longjmp __redirect___libc_longjmp
>>>> +#include <setjmp/longjmp.c>
>>>> +#undef __libc_longjmp
>>>> +
>>>> +extern void __longjmp_cancel (__jmp_buf __env, int __val)
>>>> +     __attribute__ ((__noreturn__)) attribute_hidden;
>>>> +
>>>> +/* Since __libc_longjmp is a private interface for cancellation
>>>> +   implementation in libpthread, there is no need to restore shadow
>>>> +   stack register.  */
>>>> +
>>>> +void
>>>> +__libc_longjmp (sigjmp_buf env, int val)
>>>> +{
>>>> +  /* Perform any cleanups needed by the frames being unwound.  */
>>>> +  _longjmp_unwind (env, val);
>>>
>>> OK.
>>>
>>>> +
>>>> +  if (env[0].__mask_was_saved)
>>>> +    /* Restore the saved signal mask.  */
>>>> +    (void) __sigprocmask (SIG_SETMASK,
>>>> +                       (sigset_t *) &env[0].__saved_mask,
>>>> +                       (sigset_t *) NULL);
>>>
>>> OK.
>>>
>>>> +
>>>> +  /* Call the machine-dependent function to restore machine state
>>>> +     without shadow stack.  */
>>>> +  __longjmp_cancel (env[0].__jmpbuf, val ?: 1);
>>>
>>> OK.
>>>
>>>> +}
>>>> diff --git a/sysdeps/x86/nptl/pt-longjmp.c b/sysdeps/x86/nptl/pt-longjmp.c
>>>> new file mode 100644
>>>> index 0000000000..7eb8651cfe
>>>> --- /dev/null
>>>> +++ b/sysdeps/x86/nptl/pt-longjmp.c
>>>> @@ -0,0 +1,97 @@
>>>> +/* ABI compatibility for 'longjmp' and 'siglongjmp' symbols in libpthread ABI.
>>>> +   X86 version.
>>>> +   Copyright (C) 18 Free Software Foundation, Inc.
>>>> +   This file is part of the GNU C Library.
>>>> +
>>>> +   The GNU C Library is free software; you can redistribute it and/or
>>>> +   modify it under the terms of the GNU Lesser General Public
>>>> +   License as published by the Free Software Foundation; either
>>>> +   version 2.1 of the License, or (at your option) any later version.
>>>> +
>>>> +   The GNU C Library is distributed in the hope that it will be useful,
>>>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>>>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>>>> +   Lesser General Public License for more details.
>>>> +
>>>> +   You should have received a copy of the GNU Lesser General Public
>>>> +   License along with the GNU C Library; if not, see
>>>> +   <http://www.gnu.org/licenses/>.  */
>>>> +
>>>> +/* <nptl/descr.h> has
>>>> +
>>>> +struct pthread_unwind_buf
>>>> +{
>>>> +  struct
>>>> +  {
>>>> +    __jmp_buf jmp_buf;
>>>> +    int mask_was_saved;
>>>> +  } cancel_jmp_buf[1];
>>>> +
>>>> +  union
>>>> +  {
>>>> +    void *pad[4];
>>>> +    struct
>>>> +    {
>>>> +      struct pthread_unwind_buf *prev;
>>>> +      struct _pthread_cleanup_buffer *cleanup;
>>>> +      int canceltype;
>>>> +    } data;
>>>> +  } priv;
>>>> +};
>>>> +
>>>> +  The pad array in struct pthread_unwind_buf is used by setjmp to save
>>>
>>> This appears to be SHADOW_STACK_POINTER_OFFSET in subsequent patches.
>>>
>>> However on your hjl/cet/master branch it appears that this offset is not
>>> defined to be *just after* the mask_was_saved?
>>
>> sysdeps/unix/sysv/linux/x86/setjmpP.h has
>>
>> typedef struct
>>   {
>>     unsigned long int __val[_JUMP_BUF_SIGSET_NWORDS];
>>   } __jmp_buf_sigset_t;
>>
>> typedef union
>>   {
>>     __sigset_t __saved_mask_compat;
>>     struct
>>       {
>>          __jmp_buf_sigset_t __saved_mask;
>>           /* Used for shadow stack pointer.  */
>>          unsigned long int __shadow_stack_pointer;
>>       } __saved;
>>   } __jmpbuf_arch_t;
>>
>> __shadow_stack_pointer is placed after __saved_mask, aka mask_was_saved.
>
> OK, I'll re-review.
>

>
> I look forward to a v2 with correct rounding and a new comment.
>

Here is the updated patch.  Please see if

sysdeps/unix/sysv/linux/x86/setjmpP.h
sysdeps/x86/nptl/pt-longjmp.c

address your concerns.

Thanks.
Joseph Myers April 17, 2018, 8:02 p.m. UTC | #7
On Fri, 6 Apr 2018, H.J. Lu wrote:

> https://github.com/hjl-tools/glibc/commit/9bf6aefa8fb45f8df140d42ce9cf890bb24076e1
> 
> It should be binary backward compatible.  I will investigate if there is a way

Increasing the size of a public type is always dangerous, because you can 
end up with one part of a program expecting the new, larger size but 
another part only allocating the old, smaller size.

It might in some cases be compatible to the extent that existing linked 
programs and shared libraries work with new glibc, if new glibc will never 
try to write into the unallocated part of such objects allocated by an 
existing linked program or shared library.  However, any such change would 
need a careful analysis of how the type gets written to, and to what 
extent external libraries have interfaces that depend on the size of the 
type, and would need a NEWS entry explaining the change and discussing the 
compatibility issues with it.
H.J. Lu April 17, 2018, 11:20 p.m. UTC | #8
On Tue, Apr 17, 2018 at 1:02 PM, Joseph Myers <joseph@codesourcery.com> wrote:
> On Fri, 6 Apr 2018, H.J. Lu wrote:
>
>> https://github.com/hjl-tools/glibc/commit/9bf6aefa8fb45f8df140d42ce9cf890bb24076e1
>>
>> It should be binary backward compatible.  I will investigate if there is a way
>
> Increasing the size of a public type is always dangerous, because you can
> end up with one part of a program expecting the new, larger size but
> another part only allocating the old, smaller size.

That is true.  The allocated ucontext size must be no less than the size
expected by ucontext consumer.

> It might in some cases be compatible to the extent that existing linked
> programs and shared libraries work with new glibc, if new glibc will never

This is done by checking CET properties.  Both linker and dynamic linker
clear CET property bits if any module doesn't have CET bits set.  Glibc
should access new extended fields only if CET bits are set, which means
the new ucontext is used in all .o files.  That is why I want to extend
ucontext before we have found a solution so that if an object file has
CET bits set, it must use the new ucontext.

> try to write into the unallocated part of such objects allocated by an
> existing linked program or shared library.  However, any such change would
> need a careful analysis of how the type gets written to, and to what
> extent external libraries have interfaces that depend on the size of the
> type, and would need a NEWS entry explaining the change and discussing the
> compatibility issues with it.
>

Agreed,
Carlos O'Donell April 21, 2018, 3:28 a.m. UTC | #9
Lets call this v2. Subject adjusted. Please keep incrementing the version
number on your patches to make the review easier by myself and others.

On 04/12/2018 06:50 PM, H.J. Lu wrote:
> From 1ce2b4f199070a63d9a60bd9b7ca9e33013d4208 Mon Sep 17 00:00:00 2001
> From: "H.J. Lu" <hjl.tools@gmail.com>
> Date: Sat, 24 Feb 2018 17:23:54 -0800
> Subject: [PATCH] x86: Use pad in pthread_unwind_buf to preserve shadow stack
>  register
> 
> The pad array in struct pthread_unwind_buf is used by setjmp to save
> shadow stack register.  We assert that size of struct pthread_unwind_buf
> is no less than offset of shadow stack pointer + shadow stack pointer
> size.
> 
> Since functions, like LIBC_START_MAIN, START_THREAD_DEFN as well as
> these with thread cancellation, call setjmp, but never return after
> __libc_unwind_longjmp, __libc_unwind_longjmp, which is defined as
> __libc_longjmp on x86, doesn't need to restore shadow stack register.
> __libc_longjmp, which is a private interface for thread cancellation
> implementation in libpthread, is changed to call __longjmp_cancel,
> instead of __longjmp.  __longjmp_cancel is a new internal function
> in libc, which is similar to __longjmp, but doesn't restore shadow
> stack register.
> 
> The compatibility longjmp and siglongjmp in libpthread.so are changed
> to call __libc_siglongjmp, instead of __libc_longjmp, so that they will
> restore shadow stack register.
> 
> Tested with build-many-glibcs.py.
> 
> 	* nptl/pthread_create.c (START_THREAD_DEFN): Clear previous
> 	handlers after setjmp.
> 	* setjmp/longjmp.c (__libc_longjmp): Don't define alias if
> 	defined.
> 	* sysdeps/unix/sysv/linux/x86/setjmpP.h: Include
> 	<libc-pointer-arith.h>.
> 	(_JUMP_BUF_SIGSET_BITS_PER_WORD): New.
> 	(_JUMP_BUF_SIGSET_NSIG): Changed to 96.
> 	(_JUMP_BUF_SIGSET_NWORDS): Changed to use ALIGN_UP and
> 	_JUMP_BUF_SIGSET_BITS_PER_WORD.
> 	* sysdeps/x86/Makefile (sysdep_routines): Add __longjmp_cancel.
> 	* sysdeps/x86/__longjmp_cancel.S: New file.
> 	* sysdeps/x86/longjmp.c: Likewise.
> 	* sysdeps/x86/nptl/pt-longjmp.c: Likewise.
> ---
>  nptl/pthread_create.c                 | 18 +++++--
>  setjmp/longjmp.c                      |  2 +
>  sysdeps/unix/sysv/linux/x86/setjmpP.h | 12 +++--
>  sysdeps/x86/Makefile                  |  4 ++
>  sysdeps/x86/__longjmp_cancel.S        | 20 ++++++++
>  sysdeps/x86/longjmp.c                 | 45 ++++++++++++++++
>  sysdeps/x86/nptl/pt-longjmp.c         | 97 +++++++++++++++++++++++++++++++++++
>  7 files changed, 191 insertions(+), 7 deletions(-)
>  create mode 100644 sysdeps/x86/__longjmp_cancel.S
>  create mode 100644 sysdeps/x86/longjmp.c
>  create mode 100644 sysdeps/x86/nptl/pt-longjmp.c
> 
> diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c
> index caaf07c134..8b1f06599d 100644
> --- a/nptl/pthread_create.c
> +++ b/nptl/pthread_create.c
> @@ -427,12 +427,24 @@ START_THREAD_DEFN
>       compilers without that support we do use setjmp.  */
>    struct pthread_unwind_buf unwind_buf;
>  
> -  /* No previous handlers.  */
> +  int not_first_call;
> +  not_first_call = setjmp ((struct __jmp_buf_tag *) unwind_buf.cancel_jmp_buf);
> +
> +  /* No previous handlers.  NB: This must be done after setjmp since
> +     the private space in the unwind jump buffer may overlap space
> +     used by setjmp to store extra architecture-specific information
> +     which is never be used by the cancellation-specific

s/be//g

> +     __libc_unwind_longjmp.
> +
> +     The private space is allowed to overlap because the unwinder never
> +     has to return through any of the jumped-to call frames, and thus
> +     only a minimum amount of saved data need be stored, and for example,
> +     need not include the process signal mask information. This is all
> +     an optimization to reduce stack usage when pushing cancellation
> +     handlers.  */
>    unwind_buf.priv.data.prev = NULL;
>    unwind_buf.priv.data.cleanup = NULL;
>  
> -  int not_first_call;
> -  not_first_call = setjmp ((struct __jmp_buf_tag *) unwind_buf.cancel_jmp_buf);

OK.

>    if (__glibc_likely (! not_first_call))
>      {
>        /* Store the new cleanup handler info.  */
> diff --git a/setjmp/longjmp.c b/setjmp/longjmp.c
> index a2a7065a85..453889e103 100644
> --- a/setjmp/longjmp.c
> +++ b/setjmp/longjmp.c
> @@ -40,9 +40,11 @@ __libc_siglongjmp (sigjmp_buf env, int val)
>  }
>  
>  #ifndef __libc_siglongjmp
> +# ifndef __libc_longjmp
>  /* __libc_longjmp is a private interface for cancellation implementation
>     in libpthread.  */
>  strong_alias (__libc_siglongjmp, __libc_longjmp)
> +# endif

OK.

>  weak_alias (__libc_siglongjmp, _longjmp)
>  weak_alias (__libc_siglongjmp, longjmp)
>  weak_alias (__libc_siglongjmp, siglongjmp)
> diff --git a/sysdeps/unix/sysv/linux/x86/setjmpP.h b/sysdeps/unix/sysv/linux/x86/setjmpP.h
> index c0ed767a0d..24f87da204 100644
> --- a/sysdeps/unix/sysv/linux/x86/setjmpP.h
> +++ b/sysdeps/unix/sysv/linux/x86/setjmpP.h
> @@ -20,13 +20,17 @@
>  #define	_SETJMPP_H	1
>  
>  #include <bits/types/__sigset_t.h>
> +#include <libc-pointer-arith.h>
>  
> -/* The biggest signal number + 1.  As of kernel 4.14, x86 _NSIG is 64.
> -   Define it to 513 to leave some rooms for future use.  */
> -#define _JUMP_BUF_SIGSET_NSIG	513
> +/* Number of bits per long.  */
> +#define _JUMP_BUF_SIGSET_BITS_PER_WORD (8 * sizeof (unsigned long int))
> +/* The biggest signal number.  As of kernel 4.14, x86 _NSIG is 64.
> +   Define it to 96 to leave some rooms for future use.  */

Why 96?

Sure on x86_64 the cancel jump buf has 4 x void*'s and each is 64-bits,
so you have 128 signals worth of space and then the shadow stack pointer.

Does this work on x32?

On x32 you have only 4 void*'s in the private pad.

Your presently structured sigset_t looks like this:

typedef union
  {
    __sigset_t __saved_mask_compat;
    struct
      {
        __jmp_buf_sigset_t __saved_mask;
        /* Used for shadow stack pointer.  */
        unsigned long int __shadow_stack_pointer;
      } __saved;
  } __jmpbuf_arch_t;

Which means you have a sigset_t *before* the __shadow_stack_pointer.

On x32 to save a 64-bit shadow stack pointer, you'll only have 2 x 32-bit
words left? That's only space for 64 signals?

Are you counting on one additional 32-bit word of padding between the
__mask_was_saved and the rest of the long arguments?

If so, then this still needs spelling out in an a comment why 96 signals
works on both i686, x32, and x86_64.

Also it should be explained if 96 is the *maximum* we can do with the current
layout, or if more is possible. In which case picking 96 is *not* arbitrary.

> +#define _JUMP_BUF_SIGSET_NSIG	96
>  /* Number of longs to hold all signals.  */
>  #define _JUMP_BUF_SIGSET_NWORDS \
> -  ((_JUMP_BUF_SIGSET_NSIG - 1 + 7) / (8 * sizeof (unsigned long int)))
> +  (ALIGN_UP (_JUMP_BUF_SIGSET_NSIG, _JUMP_BUF_SIGSET_BITS_PER_WORD) \
> +   / _JUMP_BUF_SIGSET_BITS_PER_WORD)
>  
>  typedef struct
>    {
> diff --git a/sysdeps/x86/Makefile b/sysdeps/x86/Makefile
> index 0d0326c21a..d25d6f0ae4 100644
> --- a/sysdeps/x86/Makefile
> +++ b/sysdeps/x86/Makefile
> @@ -8,3 +8,7 @@ sysdep-dl-routines += dl-get-cpu-features
>  tests += tst-get-cpu-features
>  tests-static += tst-get-cpu-features-static
>  endif
> +
> +ifeq ($(subdir),setjmp)
> +sysdep_routines += __longjmp_cancel
> +endif
> diff --git a/sysdeps/x86/__longjmp_cancel.S b/sysdeps/x86/__longjmp_cancel.S
> new file mode 100644
> index 0000000000..b57dbfa376
> --- /dev/null
> +++ b/sysdeps/x86/__longjmp_cancel.S
> @@ -0,0 +1,20 @@
> +/* __longjmp_cancel for x86.
> +   Copyright (C) 2018 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <http://www.gnu.org/licenses/>.  */
> +
> +#define __longjmp __longjmp_cancel
> +#include <__longjmp.S>

OK.

> diff --git a/sysdeps/x86/longjmp.c b/sysdeps/x86/longjmp.c
> new file mode 100644
> index 0000000000..a53f31e1dd
> --- /dev/null
> +++ b/sysdeps/x86/longjmp.c
> @@ -0,0 +1,45 @@
> +/* __libc_siglongjmp for x86.
> +   Copyright (C) 2018 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <http://www.gnu.org/licenses/>.  */
> +
> +#define __libc_longjmp __redirect___libc_longjmp
> +#include <setjmp/longjmp.c>
> +#undef __libc_longjmp
> +
> +extern void __longjmp_cancel (__jmp_buf __env, int __val)
> +     __attribute__ ((__noreturn__)) attribute_hidden;
> +
> +/* Since __libc_longjmp is a private interface for cancellation
> +   implementation in libpthread, there is no need to restore shadow
> +   stack register.  */
> +
> +void
> +__libc_longjmp (sigjmp_buf env, int val)
> +{
> +  /* Perform any cleanups needed by the frames being unwound.  */
> +  _longjmp_unwind (env, val);
> +
> +  if (env[0].__mask_was_saved)
> +    /* Restore the saved signal mask.  */
> +    (void) __sigprocmask (SIG_SETMASK,
> +			  (sigset_t *) &env[0].__saved_mask,
> +			  (sigset_t *) NULL);
> +
> +  /* Call the machine-dependent function to restore machine state
> +     without shadow stack.  */
> +  __longjmp_cancel (env[0].__jmpbuf, val ?: 1);
> +}

OK.

> diff --git a/sysdeps/x86/nptl/pt-longjmp.c b/sysdeps/x86/nptl/pt-longjmp.c
> new file mode 100644
> index 0000000000..7eb8651cfe
> --- /dev/null
> +++ b/sysdeps/x86/nptl/pt-longjmp.c
> @@ -0,0 +1,97 @@
> +/* ABI compatibility for 'longjmp' and 'siglongjmp' symbols in libpthread ABI.
> +   X86 version.
> +   Copyright (C) 18 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <http://www.gnu.org/licenses/>.  */
> +
> +/* <nptl/descr.h> has
> +
> +struct pthread_unwind_buf
> +{
> +  struct
> +  {
> +    __jmp_buf jmp_buf;
> +    int mask_was_saved;
> +  } cancel_jmp_buf[1];
> +
> +  union
> +  {
> +    void *pad[4];
> +    struct
> +    {
> +      struct pthread_unwind_buf *prev;
> +      struct _pthread_cleanup_buffer *cleanup;
> +      int canceltype;
> +    } data;
> +  } priv;
> +};
> +
> +  The pad array in struct pthread_unwind_buf is used by setjmp to save
> +  shadow stack register.  Assert that size of struct pthread_unwind_buf
> +  is no less than offset of shadow stack pointer plus shadow stack
> +  pointer size.
> +
> +  NB: setjmp is called in libpthread to save shadow stack register.  But
> +  __libc_unwind_longjmp doesn't restore shadow stack register since they
> +  never return after longjmp.  */

Suggest:
~~~
The pad array in struct pthread_unwind_buf is used by setjmp to save
the shadow stack register.  Assert that the size of struct pthread_unwind_buf
is no less than the offset of the shadow stack pointer plus the shadow stack
pointer size.

NB: We use setjmp in thread cancellation and this saves the shadow stack
register, but __libc_unwind_longjmp doesn't restore the shadow stack register
since cancellation never returns after longjmp.
~~~

> +
> +#include <pthreadP.h>
> +#include <jmp_buf-ssp.h>
> +
> +#ifdef __x86_64__
> +# define SHADOW_STACK_POINTER_SIZE 8
> +#else
> +# define SHADOW_STACK_POINTER_SIZE 4
> +#endif
> +
> +_Static_assert ((sizeof (struct pthread_unwind_buf)
> +		 >= (SHADOW_STACK_POINTER_OFFSET
> +		     + SHADOW_STACK_POINTER_SIZE)),
> +		"size of struct pthread_unwind_buf < "
> +		"(SHADOW_STACK_POINTER_OFFSET + SHADOW_STACK_POINTER_SIZE)");

This assertion is too loose.

The assertion you need is that the shadow stack pointer itself falls within
the range of the private padding. This would have caught the previous bug
with the rounded up size.

Please adjust the assertion to be as tight as possible or add new assertions.

Completely untested, but just to show what I'm thinking:

_Static_assert ((offsetof (struct pthread_unwind_buf, cancel_jump_buf[0].mask_was_saved)
		 + sizeof (int) < SHADOW_STACK_POINTER_OFFSET
		 && (offsetof (struct pthread_unwind_buf, priv.pad[4])
		    > (SHADOW_STACK_POINTER_OFFSET + SHADOW_STACK_POINTER_SIZE),
		"Shadow stack pointer is not within private storage of pthread_unwind_buf.");

> +
> +#include <shlib-compat.h>
> +
> +/* libpthread once had its own longjmp (and siglongjmp alias), though there
> +   was no apparent reason for it.  There is no use in having a separate
> +   symbol in libpthread, but the historical ABI requires it.  For static
> +   linking, there is no need to provide anything here--the libc version
> +   will be linked in.  For shared library ABI compatibility, there must be
> +   longjmp and siglongjmp symbols in libpthread.so.
> +
> +   With an IFUNC resolver, it would be possible to avoid the indirection,
> +   but the IFUNC resolver might run before the __libc_longjmp symbol has
> +   been relocated, in which case the IFUNC resolver would not be able to
> +   provide the correct address.  */
> +
> +#if SHLIB_COMPAT (libpthread, GLIBC_2_0, GLIBC_2_22)
> +
> +static void __attribute__ ((noreturn, used))
> +longjmp_compat (jmp_buf env, int val)
> +{
> +  /* NB: We call __libc_siglongjmp,  instead of __libc_longjmp, since
> +     __libc_longjmp is a private interface for cancellation which
> +     doesn't restore shadow stack register.  */
> +  __libc_siglongjmp (env, val);
> +}
> +
> +strong_alias (longjmp_compat, longjmp_alias)
> +compat_symbol (libpthread, longjmp_alias, longjmp, GLIBC_2_0);
> +
> +strong_alias (longjmp_alias, siglongjmp_alias)
> +compat_symbol (libpthread, siglongjmp_alias, siglongjmp, GLIBC_2_0);
> +
> +#endif
> -- 2.14.3

Looking forward to a v3.
diff mbox series

Patch

From f222537447f5ec879427f318b7c0396362b7453a Mon Sep 17 00:00:00 2001
From: "H.J. Lu" <hjl.tools@gmail.com>
Date: Sat, 24 Feb 2018 17:23:54 -0800
Subject: [PATCH] x86: Use pad in pthread_unwind_buf to preserve shadow stack
 register

The pad array in struct pthread_unwind_buf is used by setjmp to save
shadow stack register.  We assert that size of struct pthread_unwind_buf
is no less than offset of shadow stack pointer + shadow stack pointer
size.

Since functions, like LIBC_START_MAIN, START_THREAD_DEFN as well as
these with thread cancellation, call setjmp, but never return after
__libc_unwind_longjmp, __libc_unwind_longjmp, which is defined as
__libc_longjmp on x86, doesn't need to restore shadow stack register.
__libc_longjmp, which is a private interface for thread cancellation
implementation in libpthread, is changed to call __longjmp_cancel,
instead of __longjmp.  __longjmp_cancel is a new internal function
in libc, which is similar to __longjmp, but doesn't restore shadow
stack register.

The compatibility longjmp and siglongjmp in libpthread.so are changed
to call __libc_siglongjmp, instead of __libc_longjmp, so that they will
restore shadow stack register.

Tested with build-many-glibcs.py.

	* nptl/pthread_create.c (START_THREAD_DEFN): Clear previous
	handlers after setjmp.
	* setjmp/longjmp.c (__libc_longjmp): Don't define alias if
	defined.
	* sysdeps/unix/sysv/linux/x86/setjmpP.h (_JUMP_BUF_SIGSET_NSIG):
	Changed to 97.
	* sysdeps/x86/Makefile (sysdep_routines): Add __longjmp_cancel.
	* sysdeps/x86/__longjmp_cancel.S: New file.
	* sysdeps/x86/longjmp.c: Likewise.
	* sysdeps/x86/nptl/pt-longjmp.c: Likewise.
---
 nptl/pthread_create.c                 |  9 ++--
 setjmp/longjmp.c                      |  2 +
 sysdeps/unix/sysv/linux/x86/setjmpP.h |  4 +-
 sysdeps/x86/Makefile                  |  4 ++
 sysdeps/x86/__longjmp_cancel.S        | 20 ++++++++
 sysdeps/x86/longjmp.c                 | 45 ++++++++++++++++
 sysdeps/x86/nptl/pt-longjmp.c         | 97 +++++++++++++++++++++++++++++++++++
 7 files changed, 176 insertions(+), 5 deletions(-)
 create mode 100644 sysdeps/x86/__longjmp_cancel.S
 create mode 100644 sysdeps/x86/longjmp.c
 create mode 100644 sysdeps/x86/nptl/pt-longjmp.c

diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c
index caaf07c134..1c5b3780c6 100644
--- a/nptl/pthread_create.c
+++ b/nptl/pthread_create.c
@@ -427,12 +427,15 @@  START_THREAD_DEFN
      compilers without that support we do use setjmp.  */
   struct pthread_unwind_buf unwind_buf;
 
-  /* No previous handlers.  */
+  int not_first_call;
+  not_first_call = setjmp ((struct __jmp_buf_tag *) unwind_buf.cancel_jmp_buf);
+
+  /* No previous handlers.  NB: This must be done after setjmp since
+     the same space may be used by setjmp to store extra data which
+     should never be used by __libc_unwind_longjmp.  */
   unwind_buf.priv.data.prev = NULL;
   unwind_buf.priv.data.cleanup = NULL;
 
-  int not_first_call;
-  not_first_call = setjmp ((struct __jmp_buf_tag *) unwind_buf.cancel_jmp_buf);
   if (__glibc_likely (! not_first_call))
     {
       /* Store the new cleanup handler info.  */
diff --git a/setjmp/longjmp.c b/setjmp/longjmp.c
index a2a7065a85..453889e103 100644
--- a/setjmp/longjmp.c
+++ b/setjmp/longjmp.c
@@ -40,9 +40,11 @@  __libc_siglongjmp (sigjmp_buf env, int val)
 }
 
 #ifndef __libc_siglongjmp
+# ifndef __libc_longjmp
 /* __libc_longjmp is a private interface for cancellation implementation
    in libpthread.  */
 strong_alias (__libc_siglongjmp, __libc_longjmp)
+# endif
 weak_alias (__libc_siglongjmp, _longjmp)
 weak_alias (__libc_siglongjmp, longjmp)
 weak_alias (__libc_siglongjmp, siglongjmp)
diff --git a/sysdeps/unix/sysv/linux/x86/setjmpP.h b/sysdeps/unix/sysv/linux/x86/setjmpP.h
index c0ed767a0d..90a6bbcf32 100644
--- a/sysdeps/unix/sysv/linux/x86/setjmpP.h
+++ b/sysdeps/unix/sysv/linux/x86/setjmpP.h
@@ -22,8 +22,8 @@ 
 #include <bits/types/__sigset_t.h>
 
 /* The biggest signal number + 1.  As of kernel 4.14, x86 _NSIG is 64.
-   Define it to 513 to leave some rooms for future use.  */
-#define _JUMP_BUF_SIGSET_NSIG	513
+   Define it to 97 to leave some rooms for future use.  */
+#define _JUMP_BUF_SIGSET_NSIG	97
 /* Number of longs to hold all signals.  */
 #define _JUMP_BUF_SIGSET_NWORDS \
   ((_JUMP_BUF_SIGSET_NSIG - 1 + 7) / (8 * sizeof (unsigned long int)))
diff --git a/sysdeps/x86/Makefile b/sysdeps/x86/Makefile
index 0d0326c21a..d25d6f0ae4 100644
--- a/sysdeps/x86/Makefile
+++ b/sysdeps/x86/Makefile
@@ -8,3 +8,7 @@  sysdep-dl-routines += dl-get-cpu-features
 tests += tst-get-cpu-features
 tests-static += tst-get-cpu-features-static
 endif
+
+ifeq ($(subdir),setjmp)
+sysdep_routines += __longjmp_cancel
+endif
diff --git a/sysdeps/x86/__longjmp_cancel.S b/sysdeps/x86/__longjmp_cancel.S
new file mode 100644
index 0000000000..b57dbfa376
--- /dev/null
+++ b/sysdeps/x86/__longjmp_cancel.S
@@ -0,0 +1,20 @@ 
+/* __longjmp_cancel for x86.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#define __longjmp __longjmp_cancel
+#include <__longjmp.S>
diff --git a/sysdeps/x86/longjmp.c b/sysdeps/x86/longjmp.c
new file mode 100644
index 0000000000..a53f31e1dd
--- /dev/null
+++ b/sysdeps/x86/longjmp.c
@@ -0,0 +1,45 @@ 
+/* __libc_siglongjmp for x86.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#define __libc_longjmp __redirect___libc_longjmp
+#include <setjmp/longjmp.c>
+#undef __libc_longjmp
+
+extern void __longjmp_cancel (__jmp_buf __env, int __val)
+     __attribute__ ((__noreturn__)) attribute_hidden;
+
+/* Since __libc_longjmp is a private interface for cancellation
+   implementation in libpthread, there is no need to restore shadow
+   stack register.  */
+
+void
+__libc_longjmp (sigjmp_buf env, int val)
+{
+  /* Perform any cleanups needed by the frames being unwound.  */
+  _longjmp_unwind (env, val);
+
+  if (env[0].__mask_was_saved)
+    /* Restore the saved signal mask.  */
+    (void) __sigprocmask (SIG_SETMASK,
+			  (sigset_t *) &env[0].__saved_mask,
+			  (sigset_t *) NULL);
+
+  /* Call the machine-dependent function to restore machine state
+     without shadow stack.  */
+  __longjmp_cancel (env[0].__jmpbuf, val ?: 1);
+}
diff --git a/sysdeps/x86/nptl/pt-longjmp.c b/sysdeps/x86/nptl/pt-longjmp.c
new file mode 100644
index 0000000000..7eb8651cfe
--- /dev/null
+++ b/sysdeps/x86/nptl/pt-longjmp.c
@@ -0,0 +1,97 @@ 
+/* ABI compatibility for 'longjmp' and 'siglongjmp' symbols in libpthread ABI.
+   X86 version.
+   Copyright (C) 18 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+/* <nptl/descr.h> has
+
+struct pthread_unwind_buf
+{
+  struct
+  {
+    __jmp_buf jmp_buf;
+    int mask_was_saved;
+  } cancel_jmp_buf[1];
+
+  union
+  {
+    void *pad[4];
+    struct
+    {
+      struct pthread_unwind_buf *prev;
+      struct _pthread_cleanup_buffer *cleanup;
+      int canceltype;
+    } data;
+  } priv;
+};
+
+  The pad array in struct pthread_unwind_buf is used by setjmp to save
+  shadow stack register.  Assert that size of struct pthread_unwind_buf
+  is no less than offset of shadow stack pointer plus shadow stack
+  pointer size.
+
+  NB: setjmp is called in libpthread to save shadow stack register.  But
+  __libc_unwind_longjmp doesn't restore shadow stack register since they
+  never return after longjmp.  */
+
+#include <pthreadP.h>
+#include <jmp_buf-ssp.h>
+
+#ifdef __x86_64__
+# define SHADOW_STACK_POINTER_SIZE 8
+#else
+# define SHADOW_STACK_POINTER_SIZE 4
+#endif
+
+_Static_assert ((sizeof (struct pthread_unwind_buf)
+		 >= (SHADOW_STACK_POINTER_OFFSET
+		     + SHADOW_STACK_POINTER_SIZE)),
+		"size of struct pthread_unwind_buf < "
+		"(SHADOW_STACK_POINTER_OFFSET + SHADOW_STACK_POINTER_SIZE)");
+
+#include <shlib-compat.h>
+
+/* libpthread once had its own longjmp (and siglongjmp alias), though there
+   was no apparent reason for it.  There is no use in having a separate
+   symbol in libpthread, but the historical ABI requires it.  For static
+   linking, there is no need to provide anything here--the libc version
+   will be linked in.  For shared library ABI compatibility, there must be
+   longjmp and siglongjmp symbols in libpthread.so.
+
+   With an IFUNC resolver, it would be possible to avoid the indirection,
+   but the IFUNC resolver might run before the __libc_longjmp symbol has
+   been relocated, in which case the IFUNC resolver would not be able to
+   provide the correct address.  */
+
+#if SHLIB_COMPAT (libpthread, GLIBC_2_0, GLIBC_2_22)
+
+static void __attribute__ ((noreturn, used))
+longjmp_compat (jmp_buf env, int val)
+{
+  /* NB: We call __libc_siglongjmp,  instead of __libc_longjmp, since
+     __libc_longjmp is a private interface for cancellation which
+     doesn't restore shadow stack register.  */
+  __libc_siglongjmp (env, val);
+}
+
+strong_alias (longjmp_compat, longjmp_alias)
+compat_symbol (libpthread, longjmp_alias, longjmp, GLIBC_2_0);
+
+strong_alias (longjmp_alias, siglongjmp_alias)
+compat_symbol (libpthread, siglongjmp_alias, siglongjmp, GLIBC_2_0);
+
+#endif
-- 
2.14.3