diff mbox series

[v4,2/2] arm64: signal: Report signal frame size to userspace via auxv

Message ID 1527097616-25214-3-git-send-email-Dave.Martin@arm.com
State New
Headers show
Series arm64: signal: Report signal frame size to userspace via auxv | expand

Commit Message

Dave Martin May 23, 2018, 5:46 p.m. UTC
Stateful CPU architecture extensions may require the signal frame
to grow to a size that exceeds the arch's MINSIGSTKSZ #define.
However, changing this #define is an ABI break.

To allow userspace the option of determining the signal frame size
in a more forwards-compatible way, this patch adds a new auxv entry
tagged with AT_MINSIGSTKSZ, which provides the maximum signal frame
size that the process can observe during its lifetime.

If AT_MINSIGSTKSZ is absent from the aux vector, the caller can
assume that the MINSIGSTKSZ #define is sufficient.  This allows for
a consistent interface with older kernels that do not provide
AT_MINSIGSTKSZ.

The idea is that libc could expose this via sysconf() or some
similar mechanism.

There is deliberately no AT_SIGSTKSZ.  The kernel knows nothing
about userspace's own stack overheads and should not pretend to
know.

For arm64:

The primary motivation for this interface is the Scalable Vector
Extension, which can require at least 4KB or so of extra space
in the signal frame for the largest hardware implementations.

To determine the correct value, a "Christmas tree" mode (via the
add_all argument) is added to setup_sigframe_layout(), to simulate
addition of all possible records to the signal frame at maximum
possible size.

If this procedure goes wrong somehow, resulting in a stupidly large
frame layout and hence failure of sigframe_alloc() to allocate a
record to the frame, then this is indicative of a kernel bug: the
kernel's internal SIGFRAME_MAXSZ is supposed to sanity-check
against generting frames that we consider _impossibly_ large.  If
we hit this case, SIGFRAME_MAXSZ is used as our best guess, and we
WARN().

For arm64 SVE:

The SVE context block in the signal frame needs to be considered
too when computing the maximum possible signal frame size.

Because the size of this block depends on the vector length, this
patch computes the size based not on the thread's current vector
length but instead on the maximum possible vector length: this
determines the maximum size of SVE context block that can be
observed in any signal frame for the lifetime of the process.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Alex Bennée <alex.bennee@linaro.org>

---

Changes since v3:

 * Fall back to SIGFRAME_MAXSZ, not SIGSTKSZ in case of
   setup_sigframe_layout() failure, since exceeding SIGFRAME_MAXSZ is
   the only way this can fail: therefore SIGFRAME_MAXSZ is our best
   guess at the required size.

 * Commit message updated to reflect the above change.

Requested by Will Deacon:

 * Remove superfluous WARN()s from ARCH_DL_INFO and
   setup_sigframe_layout().  If something went wrong and we still have
   no value for signal_minsigstksz by the time we exec some user
   process, AT_MINSIGSTKSZ is omitted from the aux vector.  Userspace
   should fall back to the MINSIGSTKSZ #define anyway in this case
   (i.e., do things the POSIX way).

 * Change the type of signal_minsigstksz to unsigned long, to match the
   auxv and sigframe_size() types.  (sigframe_size uses size_t, but it's
   somewhat moot exactly what the types are, providing that
   SIGFRAME_MAXSZ fits in all of them).

Also requested by Mark Rutland:

 * Merge #ifndef __ASSEMBLY__ block introduced in <asm/elf.h> into
   the existing one.  The only things above are #defines, which
   shouldn't be affected by this.
---
 arch/arm64/include/asm/elf.h         | 11 ++++++++
 arch/arm64/include/asm/processor.h   |  5 ++++
 arch/arm64/include/uapi/asm/auxvec.h |  3 ++-
 arch/arm64/kernel/cpufeature.c       |  1 +
 arch/arm64/kernel/signal.c           | 52 +++++++++++++++++++++++++++++++-----
 5 files changed, 64 insertions(+), 8 deletions(-)

Comments

Will Deacon May 24, 2018, 12:49 p.m. UTC | #1
On Wed, May 23, 2018 at 06:46:56PM +0100, Dave Martin wrote:
> Stateful CPU architecture extensions may require the signal frame
> to grow to a size that exceeds the arch's MINSIGSTKSZ #define.
> However, changing this #define is an ABI break.
> 
> To allow userspace the option of determining the signal frame size
> in a more forwards-compatible way, this patch adds a new auxv entry
> tagged with AT_MINSIGSTKSZ, which provides the maximum signal frame
> size that the process can observe during its lifetime.
> 
> If AT_MINSIGSTKSZ is absent from the aux vector, the caller can
> assume that the MINSIGSTKSZ #define is sufficient.  This allows for
> a consistent interface with older kernels that do not provide
> AT_MINSIGSTKSZ.
> 
> The idea is that libc could expose this via sysconf() or some
> similar mechanism.
> 
> There is deliberately no AT_SIGSTKSZ.  The kernel knows nothing
> about userspace's own stack overheads and should not pretend to
> know.

[...]

> diff --git a/arch/arm64/include/asm/elf.h b/arch/arm64/include/asm/elf.h
> index fac1c4d..9c18f0e 100644
> --- a/arch/arm64/include/asm/elf.h
> +++ b/arch/arm64/include/asm/elf.h
> @@ -121,6 +121,9 @@
>  
>  #ifndef __ASSEMBLY__
>  
> +#include <linux/bug.h>
> +#include <asm/processor.h> /* for signal_minsigstksz, used by ARCH_DLINFO */
> +
>  typedef unsigned long elf_greg_t;
>  
>  #define ELF_NGREG (sizeof(struct user_pt_regs) / sizeof(elf_greg_t))
> @@ -148,6 +151,14 @@ typedef struct user_fpsimd_state elf_fpregset_t;
>  do {									\
>  	NEW_AUX_ENT(AT_SYSINFO_EHDR,					\
>  		    (elf_addr_t)current->mm->context.vdso);		\
> +									\
> +	/*								\
> +	 * Should always be nonzero unless there's a kernel bug.  If	\
> +	 * the we haven't determined a sensible value to give to	\

"If the we"?

> +	 * userspace, omit the entry:					\
> +	 */								\
> +	if (likely(signal_minsigstksz))					\
> +		NEW_AUX_ENT(AT_MINSIGSTKSZ, signal_minsigstksz);	\
>  } while (0)
>  
>  #define ARCH_HAS_SETUP_ADDITIONAL_PAGES
> diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
> index 7675989..65ab83e 100644
> --- a/arch/arm64/include/asm/processor.h
> +++ b/arch/arm64/include/asm/processor.h
> @@ -35,6 +35,8 @@
>  #ifdef __KERNEL__
>  
>  #include <linux/build_bug.h>
> +#include <linux/cache.h>
> +#include <linux/init.h>
>  #include <linux/stddef.h>
>  #include <linux/string.h>
>  
> @@ -244,6 +246,9 @@ void cpu_enable_pan(const struct arm64_cpu_capabilities *__unused);
>  void cpu_enable_cache_maint_trap(const struct arm64_cpu_capabilities *__unused);
>  void cpu_clear_disr(const struct arm64_cpu_capabilities *__unused);
>  
> +extern unsigned long __ro_after_init signal_minsigstksz; /* sigframe size */
> +extern void __init minsigstksz_setup(void);
> +
>  /* Userspace interface for PR_SVE_{SET,GET}_VL prctl()s: */
>  #define SVE_SET_VL(arg)	sve_set_current_vl(arg)
>  #define SVE_GET_VL()	sve_get_current_vl()
> diff --git a/arch/arm64/include/uapi/asm/auxvec.h b/arch/arm64/include/uapi/asm/auxvec.h
> index ec0a86d..743c0b8 100644
> --- a/arch/arm64/include/uapi/asm/auxvec.h
> +++ b/arch/arm64/include/uapi/asm/auxvec.h
> @@ -19,7 +19,8 @@
>  
>  /* vDSO location */
>  #define AT_SYSINFO_EHDR	33
> +#define AT_MINSIGSTKSZ	51	/* stack needed for signal delivery */

Curious: but how do we avoid/detect conflicts at -rc1? I guess somebody just
needs to remember to run grep? (I know you have another series consolidating
the ID allocations).

> -#define AT_VECTOR_SIZE_ARCH 1 /* entries in ARCH_DLINFO */
> +#define AT_VECTOR_SIZE_ARCH 2 /* entries in ARCH_DLINFO */
>  
>  #endif
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index 9d1b06d..0e0b53d 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -1619,6 +1619,7 @@ void __init setup_cpu_features(void)
>  		pr_info("emulated: Privileged Access Never (PAN) using TTBR0_EL1 switching\n");
>  
>  	sve_setup();
> +	minsigstksz_setup();
>  
>  	/* Advertise that we have computed the system capabilities */
>  	set_sys_caps_initialised();
> diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
> index 154b7d3..00b9990 100644
> --- a/arch/arm64/kernel/signal.c
> +++ b/arch/arm64/kernel/signal.c
> @@ -17,6 +17,7 @@
>   * along with this program.  If not, see <http://www.gnu.org/licenses/>.
>   */
>  
> +#include <linux/cache.h>
>  #include <linux/compat.h>
>  #include <linux/errno.h>
>  #include <linux/kernel.h>
> @@ -570,8 +571,15 @@ asmlinkage long sys_rt_sigreturn(struct pt_regs *regs)
>  	return 0;
>  }
>  
> -/* Determine the layout of optional records in the signal frame */
> -static int setup_sigframe_layout(struct rt_sigframe_user_layout *user)
> +/*
> + * Determine the layout of optional records in the signal frame
> + *
> + * add_all: if true, lays out the biggest possible signal frame for
> + *	this task; otherwise, generates a layout for the current state
> + *	of the task.
> + */
> +static int setup_sigframe_layout(struct rt_sigframe_user_layout *user,
> +				 bool add_all)
>  {
>  	int err;
>  
> @@ -581,7 +589,7 @@ static int setup_sigframe_layout(struct rt_sigframe_user_layout *user)
>  		return err;
>  
>  	/* fault information, if valid */
> -	if (current->thread.fault_code) {
> +	if (add_all || current->thread.fault_code) {
>  		err = sigframe_alloc(user, &user->esr_offset,
>  				     sizeof(struct esr_context));
>  		if (err)
> @@ -591,8 +599,14 @@ static int setup_sigframe_layout(struct rt_sigframe_user_layout *user)
>  	if (system_supports_sve()) {
>  		unsigned int vq = 0;
>  
> -		if (test_thread_flag(TIF_SVE))
> -			vq = sve_vq_from_vl(current->thread.sve_vl);
> +		if (add_all || test_thread_flag(TIF_SVE)) {
> +			int vl = sve_max_vl;
> +
> +			if (!add_all)
> +				vl = current->thread.sve_vl;
> +
> +			vq = sve_vq_from_vl(vl);
> +		}
>  
>  		err = sigframe_alloc(user, &user->sve_offset,
>  				     SVE_SIG_CONTEXT_SIZE(vq));
> @@ -603,7 +617,6 @@ static int setup_sigframe_layout(struct rt_sigframe_user_layout *user)
>  	return sigframe_alloc_end(user);
>  }
>  
> -
>  static int setup_sigframe(struct rt_sigframe_user_layout *user,
>  			  struct pt_regs *regs, sigset_t *set)
>  {
> @@ -701,7 +714,7 @@ static int get_sigframe(struct rt_sigframe_user_layout *user,
>  	int err;
>  
>  	init_user_layout(user);
> -	err = setup_sigframe_layout(user);
> +	err = setup_sigframe_layout(user, false);
>  	if (err)
>  		return err;
>  
> @@ -936,3 +949,28 @@ asmlinkage void do_notify_resume(struct pt_regs *regs,
>  		thread_flags = READ_ONCE(current_thread_info()->flags);
>  	} while (thread_flags & _TIF_WORK_MASK);
>  }
> +
> +unsigned long __ro_after_init signal_minsigstksz;
> +
> +/*
> + * Determine the stack space required for guaranteed signal devliery.
> + * This function is used to populate AT_MINSIGSTKSZ at process startup.
> + * cpufeatures setup is assumed to be complete.
> + */
> +void __init minsigstksz_setup(void)
> +{
> +	struct rt_sigframe_user_layout user;
> +
> +	init_user_layout(&user);
> +
> +	/*
> +	 * If this fails, SIGFRAME_MAXSZ needs to be enlarged.  It won't
> +	 * be big enough, but it's our best guess:
> +	 */
> +	if (WARN_ON(setup_sigframe_layout(&user, true)))
> +		signal_minsigstksz = SIGFRAME_MAXSZ;

Can we not leave signal_minsigstksz as zero in this case?

Will
Dave Martin May 24, 2018, 3:55 p.m. UTC | #2
On Thu, May 24, 2018 at 01:49:21PM +0100, Will Deacon wrote:
> On Wed, May 23, 2018 at 06:46:56PM +0100, Dave Martin wrote:
> > Stateful CPU architecture extensions may require the signal frame
> > to grow to a size that exceeds the arch's MINSIGSTKSZ #define.
> > However, changing this #define is an ABI break.
> > 
> > To allow userspace the option of determining the signal frame size
> > in a more forwards-compatible way, this patch adds a new auxv entry
> > tagged with AT_MINSIGSTKSZ, which provides the maximum signal frame
> > size that the process can observe during its lifetime.
> > 
> > If AT_MINSIGSTKSZ is absent from the aux vector, the caller can
> > assume that the MINSIGSTKSZ #define is sufficient.  This allows for
> > a consistent interface with older kernels that do not provide
> > AT_MINSIGSTKSZ.
> > 
> > The idea is that libc could expose this via sysconf() or some
> > similar mechanism.
> > 
> > There is deliberately no AT_SIGSTKSZ.  The kernel knows nothing
> > about userspace's own stack overheads and should not pretend to
> > know.
> 
> [...]
> 
> > diff --git a/arch/arm64/include/asm/elf.h b/arch/arm64/include/asm/elf.h
> > index fac1c4d..9c18f0e 100644
> > --- a/arch/arm64/include/asm/elf.h
> > +++ b/arch/arm64/include/asm/elf.h
> > @@ -121,6 +121,9 @@
> >  
> >  #ifndef __ASSEMBLY__
> >  
> > +#include <linux/bug.h>
> > +#include <asm/processor.h> /* for signal_minsigstksz, used by ARCH_DLINFO */
> > +
> >  typedef unsigned long elf_greg_t;
> >  
> >  #define ELF_NGREG (sizeof(struct user_pt_regs) / sizeof(elf_greg_t))
> > @@ -148,6 +151,14 @@ typedef struct user_fpsimd_state elf_fpregset_t;
> >  do {									\
> >  	NEW_AUX_ENT(AT_SYSINFO_EHDR,					\
> >  		    (elf_addr_t)current->mm->context.vdso);		\
> > +									\
> > +	/*								\
> > +	 * Should always be nonzero unless there's a kernel bug.  If	\
> > +	 * the we haven't determined a sensible value to give to	\
> 
> "If the we"?

Dang, fixed locally now.

[...]

> > diff --git a/arch/arm64/include/uapi/asm/auxvec.h b/arch/arm64/include/uapi/asm/auxvec.h
> > index ec0a86d..743c0b8 100644
> > --- a/arch/arm64/include/uapi/asm/auxvec.h
> > +++ b/arch/arm64/include/uapi/asm/auxvec.h
> > @@ -19,7 +19,8 @@
> >  
> >  /* vDSO location */
> >  #define AT_SYSINFO_EHDR	33
> > +#define AT_MINSIGSTKSZ	51	/* stack needed for signal delivery */
> 
> Curious: but how do we avoid/detect conflicts at -rc1? I guess somebody just
> needs to remember to run grep? (I know you have another series consolidating
> the ID allocations).

We basically can't.  These are spread over various arch headers today,
so the solution is to (a) grep, and (b) know that you needed to do that.

This is the main motivation for collecting the definitions together.

Short of having some script that checks these at build-time, I couldn't
see another obvious solution.  It's nonetheless a bit ugly because of
things like AT_VECTOR_SIZE_ARCH which is masquerading a tag but isn't
one, and obviously does vary across arches...

[...]

> > diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
> > index 154b7d3..00b9990 100644
> > --- a/arch/arm64/kernel/signal.c
> > +++ b/arch/arm64/kernel/signal.c

[...]

> > @@ -936,3 +949,28 @@ asmlinkage void do_notify_resume(struct pt_regs *regs,
> >  		thread_flags = READ_ONCE(current_thread_info()->flags);
> >  	} while (thread_flags & _TIF_WORK_MASK);
> >  }
> > +
> > +unsigned long __ro_after_init signal_minsigstksz;
> > +
> > +/*
> > + * Determine the stack space required for guaranteed signal devliery.
> > + * This function is used to populate AT_MINSIGSTKSZ at process startup.
> > + * cpufeatures setup is assumed to be complete.
> > + */
> > +void __init minsigstksz_setup(void)
> > +{
> > +	struct rt_sigframe_user_layout user;
> > +
> > +	init_user_layout(&user);
> > +
> > +	/*
> > +	 * If this fails, SIGFRAME_MAXSZ needs to be enlarged.  It won't
> > +	 * be big enough, but it's our best guess:
> > +	 */
> > +	if (WARN_ON(setup_sigframe_layout(&user, true)))
> > +		signal_minsigstksz = SIGFRAME_MAXSZ;
> 
> Can we not leave signal_minsigstksz as zero in this case?

I prefer to distinguish the "kernel went wrong" case (where we just omit
AT_MINSIGSTKSZ for backwards compatibilty) from the "sigframe too
large" case.

Thanks to the vagueries of C stack sizing is rarely an exact science,
so there is merit in telling userspace a size that is approimately
correct even if it's not quite big enough.  So if the frame would be
larger than SIGFRAME_MAXSZ then it seems preferable to at least steer
userspace towards allocating bigger stacks rather than just letting
userspace fall back to MINSIGSTKSZ (which is likely to be much too
small in this scenario).

These are things that Should Not Happen (tm), so this distinction
might be viewed as overkill, but that was my rationale anyway.

What do you think?

Cheers
---Dave
Will Deacon May 24, 2018, 4:50 p.m. UTC | #3
On Thu, May 24, 2018 at 04:55:17PM +0100, Dave Martin wrote:
> On Thu, May 24, 2018 at 01:49:21PM +0100, Will Deacon wrote:
> > On Wed, May 23, 2018 at 06:46:56PM +0100, Dave Martin wrote:
> > > Stateful CPU architecture extensions may require the signal frame
> > > to grow to a size that exceeds the arch's MINSIGSTKSZ #define.
> > > However, changing this #define is an ABI break.
> > > 
> > > To allow userspace the option of determining the signal frame size
> > > in a more forwards-compatible way, this patch adds a new auxv entry
> > > tagged with AT_MINSIGSTKSZ, which provides the maximum signal frame
> > > size that the process can observe during its lifetime.
> > > 
> > > If AT_MINSIGSTKSZ is absent from the aux vector, the caller can
> > > assume that the MINSIGSTKSZ #define is sufficient.  This allows for
> > > a consistent interface with older kernels that do not provide
> > > AT_MINSIGSTKSZ.
> > > 
> > > The idea is that libc could expose this via sysconf() or some
> > > similar mechanism.
> > > 
> > > There is deliberately no AT_SIGSTKSZ.  The kernel knows nothing
> > > about userspace's own stack overheads and should not pretend to
> > > know.
> > 
> > [...]
> > 
> > > diff --git a/arch/arm64/include/asm/elf.h b/arch/arm64/include/asm/elf.h
> > > index fac1c4d..9c18f0e 100644
> > > --- a/arch/arm64/include/asm/elf.h
> > > +++ b/arch/arm64/include/asm/elf.h
> > > @@ -121,6 +121,9 @@
> > >  
> > >  #ifndef __ASSEMBLY__
> > >  
> > > +#include <linux/bug.h>
> > > +#include <asm/processor.h> /* for signal_minsigstksz, used by ARCH_DLINFO */
> > > +
> > >  typedef unsigned long elf_greg_t;
> > >  
> > >  #define ELF_NGREG (sizeof(struct user_pt_regs) / sizeof(elf_greg_t))
> > > @@ -148,6 +151,14 @@ typedef struct user_fpsimd_state elf_fpregset_t;
> > >  do {									\
> > >  	NEW_AUX_ENT(AT_SYSINFO_EHDR,					\
> > >  		    (elf_addr_t)current->mm->context.vdso);		\
> > > +									\
> > > +	/*								\
> > > +	 * Should always be nonzero unless there's a kernel bug.  If	\
> > > +	 * the we haven't determined a sensible value to give to	\
> > 
> > "If the we"?
> 
> Dang, fixed locally now.
> 
> [...]
> 
> > > diff --git a/arch/arm64/include/uapi/asm/auxvec.h b/arch/arm64/include/uapi/asm/auxvec.h
> > > index ec0a86d..743c0b8 100644
> > > --- a/arch/arm64/include/uapi/asm/auxvec.h
> > > +++ b/arch/arm64/include/uapi/asm/auxvec.h
> > > @@ -19,7 +19,8 @@
> > >  
> > >  /* vDSO location */
> > >  #define AT_SYSINFO_EHDR	33
> > > +#define AT_MINSIGSTKSZ	51	/* stack needed for signal delivery */
> > 
> > Curious: but how do we avoid/detect conflicts at -rc1? I guess somebody just
> > needs to remember to run grep? (I know you have another series consolidating
> > the ID allocations).
> 
> We basically can't.  These are spread over various arch headers today,
> so the solution is to (a) grep, and (b) know that you needed to do that.
> 
> This is the main motivation for collecting the definitions together.
> 
> Short of having some script that checks these at build-time, I couldn't
> see another obvious solution.  It's nonetheless a bit ugly because of
> things like AT_VECTOR_SIZE_ARCH which is masquerading a tag but isn't
> one, and obviously does vary across arches...
> 
> [...]
> 
> > > diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
> > > index 154b7d3..00b9990 100644
> > > --- a/arch/arm64/kernel/signal.c
> > > +++ b/arch/arm64/kernel/signal.c
> 
> [...]
> 
> > > @@ -936,3 +949,28 @@ asmlinkage void do_notify_resume(struct pt_regs *regs,
> > >  		thread_flags = READ_ONCE(current_thread_info()->flags);
> > >  	} while (thread_flags & _TIF_WORK_MASK);
> > >  }
> > > +
> > > +unsigned long __ro_after_init signal_minsigstksz;
> > > +
> > > +/*
> > > + * Determine the stack space required for guaranteed signal devliery.
> > > + * This function is used to populate AT_MINSIGSTKSZ at process startup.
> > > + * cpufeatures setup is assumed to be complete.
> > > + */
> > > +void __init minsigstksz_setup(void)
> > > +{
> > > +	struct rt_sigframe_user_layout user;
> > > +
> > > +	init_user_layout(&user);
> > > +
> > > +	/*
> > > +	 * If this fails, SIGFRAME_MAXSZ needs to be enlarged.  It won't
> > > +	 * be big enough, but it's our best guess:
> > > +	 */
> > > +	if (WARN_ON(setup_sigframe_layout(&user, true)))
> > > +		signal_minsigstksz = SIGFRAME_MAXSZ;
> > 
> > Can we not leave signal_minsigstksz as zero in this case?
> 
> I prefer to distinguish the "kernel went wrong" case (where we just omit
> AT_MINSIGSTKSZ for backwards compatibilty) from the "sigframe too
> large" case.

Hmm, so I'm confused as to the distinction here. Wouldn't an allocation
failure in setup_sigframe_layout be indicative of "kernel went wrong"?

To put it another way, if we could determine the maximum sigframe size
at build time, surely we'd fail the build if SIGFRAME_MAXSZ wasn't big
enough? In that case, detecting this at runtime is also pretty bad (hence
the WARN_ON) and I think we should drop the aux entry rather than provide
a value that is known to be incorrect.

Will
Dave Martin May 24, 2018, 5:07 p.m. UTC | #4
On Thu, May 24, 2018 at 05:50:48PM +0100, Will Deacon wrote:
> On Thu, May 24, 2018 at 04:55:17PM +0100, Dave Martin wrote:
> > On Thu, May 24, 2018 at 01:49:21PM +0100, Will Deacon wrote:
> > > On Wed, May 23, 2018 at 06:46:56PM +0100, Dave Martin wrote:
> > > > Stateful CPU architecture extensions may require the signal frame
> > > > to grow to a size that exceeds the arch's MINSIGSTKSZ #define.
> > > > However, changing this #define is an ABI break.
> > > > 
> > > > To allow userspace the option of determining the signal frame size
> > > > in a more forwards-compatible way, this patch adds a new auxv entry
> > > > tagged with AT_MINSIGSTKSZ, which provides the maximum signal frame
> > > > size that the process can observe during its lifetime.
> > > > 
> > > > If AT_MINSIGSTKSZ is absent from the aux vector, the caller can
> > > > assume that the MINSIGSTKSZ #define is sufficient.  This allows for
> > > > a consistent interface with older kernels that do not provide
> > > > AT_MINSIGSTKSZ.
> > > > 
> > > > The idea is that libc could expose this via sysconf() or some
> > > > similar mechanism.
> > > > 
> > > > There is deliberately no AT_SIGSTKSZ.  The kernel knows nothing
> > > > about userspace's own stack overheads and should not pretend to
> > > > know.
> > > 
> > > [...]
> > > 
> > > > diff --git a/arch/arm64/include/asm/elf.h b/arch/arm64/include/asm/elf.h
> > > > index fac1c4d..9c18f0e 100644
> > > > --- a/arch/arm64/include/asm/elf.h
> > > > +++ b/arch/arm64/include/asm/elf.h
> > > > @@ -121,6 +121,9 @@
> > > >  
> > > >  #ifndef __ASSEMBLY__
> > > >  
> > > > +#include <linux/bug.h>
> > > > +#include <asm/processor.h> /* for signal_minsigstksz, used by ARCH_DLINFO */
> > > > +
> > > >  typedef unsigned long elf_greg_t;
> > > >  
> > > >  #define ELF_NGREG (sizeof(struct user_pt_regs) / sizeof(elf_greg_t))
> > > > @@ -148,6 +151,14 @@ typedef struct user_fpsimd_state elf_fpregset_t;
> > > >  do {									\
> > > >  	NEW_AUX_ENT(AT_SYSINFO_EHDR,					\
> > > >  		    (elf_addr_t)current->mm->context.vdso);		\
> > > > +									\
> > > > +	/*								\
> > > > +	 * Should always be nonzero unless there's a kernel bug.  If	\
> > > > +	 * the we haven't determined a sensible value to give to	\
> > > 
> > > "If the we"?
> > 
> > Dang, fixed locally now.
> > 
> > [...]
> > 
> > > > diff --git a/arch/arm64/include/uapi/asm/auxvec.h b/arch/arm64/include/uapi/asm/auxvec.h
> > > > index ec0a86d..743c0b8 100644
> > > > --- a/arch/arm64/include/uapi/asm/auxvec.h
> > > > +++ b/arch/arm64/include/uapi/asm/auxvec.h
> > > > @@ -19,7 +19,8 @@
> > > >  
> > > >  /* vDSO location */
> > > >  #define AT_SYSINFO_EHDR	33
> > > > +#define AT_MINSIGSTKSZ	51	/* stack needed for signal delivery */
> > > 
> > > Curious: but how do we avoid/detect conflicts at -rc1? I guess somebody just
> > > needs to remember to run grep? (I know you have another series consolidating
> > > the ID allocations).
> > 
> > We basically can't.  These are spread over various arch headers today,
> > so the solution is to (a) grep, and (b) know that you needed to do that.
> > 
> > This is the main motivation for collecting the definitions together.
> > 
> > Short of having some script that checks these at build-time, I couldn't
> > see another obvious solution.  It's nonetheless a bit ugly because of
> > things like AT_VECTOR_SIZE_ARCH which is masquerading a tag but isn't
> > one, and obviously does vary across arches...
> > 
> > [...]
> > 
> > > > diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
> > > > index 154b7d3..00b9990 100644
> > > > --- a/arch/arm64/kernel/signal.c
> > > > +++ b/arch/arm64/kernel/signal.c
> > 
> > [...]
> > 
> > > > @@ -936,3 +949,28 @@ asmlinkage void do_notify_resume(struct pt_regs *regs,
> > > >  		thread_flags = READ_ONCE(current_thread_info()->flags);
> > > >  	} while (thread_flags & _TIF_WORK_MASK);
> > > >  }
> > > > +
> > > > +unsigned long __ro_after_init signal_minsigstksz;
> > > > +
> > > > +/*
> > > > + * Determine the stack space required for guaranteed signal devliery.
> > > > + * This function is used to populate AT_MINSIGSTKSZ at process startup.
> > > > + * cpufeatures setup is assumed to be complete.
> > > > + */
> > > > +void __init minsigstksz_setup(void)
> > > > +{
> > > > +	struct rt_sigframe_user_layout user;
> > > > +
> > > > +	init_user_layout(&user);
> > > > +
> > > > +	/*
> > > > +	 * If this fails, SIGFRAME_MAXSZ needs to be enlarged.  It won't
> > > > +	 * be big enough, but it's our best guess:
> > > > +	 */
> > > > +	if (WARN_ON(setup_sigframe_layout(&user, true)))
> > > > +		signal_minsigstksz = SIGFRAME_MAXSZ;
> > > 
> > > Can we not leave signal_minsigstksz as zero in this case?
> > 
> > I prefer to distinguish the "kernel went wrong" case (where we just omit
> > AT_MINSIGSTKSZ for backwards compatibilty) from the "sigframe too
> > large" case.
> 
> Hmm, so I'm confused as to the distinction here. Wouldn't an allocation
> failure in setup_sigframe_layout be indicative of "kernel went wrong"?
> 
> To put it another way, if we could determine the maximum sigframe size
> at build time, surely we'd fail the build if SIGFRAME_MAXSZ wasn't big
> enough? In that case, detecting this at runtime is also pretty bad (hence

Yup

> the WARN_ON) and I think we should drop the aux entry rather than provide
> a value that is known to be incorrect.

Telling userspace the signal frame size is not optional: by omitting
AT_MINSIGSTKSZ we implicitly tell userspace than MINSIGSTKSZ
is sufficient.  But in this case we not only know that this is false, we
know that SIGFRAME_MAXSZ is not sufficient either.  But we also know
that SIGFRAME_MAXSZ is a closer estimate to the true requirement, because
it's the larger value.

This falls under the heading of "being no more wrong than necessary".

Either way, this is trying to paper over a kernel bug, by telling
userspace something "sensible".  This may not be a sensible course
of action...

So if you feel strongly I'm happy to not distinguish the two cases and
just WARN() in minsigstksz_setup() as at present.

Cheers
---Dave
Will Deacon May 25, 2018, 11:32 a.m. UTC | #5
On Thu, May 24, 2018 at 06:07:13PM +0100, Dave Martin wrote:
> On Thu, May 24, 2018 at 05:50:48PM +0100, Will Deacon wrote:
> > On Thu, May 24, 2018 at 04:55:17PM +0100, Dave Martin wrote:
> > > On Thu, May 24, 2018 at 01:49:21PM +0100, Will Deacon wrote:
> > > > On Wed, May 23, 2018 at 06:46:56PM +0100, Dave Martin wrote:
> > > > > @@ -936,3 +949,28 @@ asmlinkage void do_notify_resume(struct pt_regs *regs,
> > > > >  		thread_flags = READ_ONCE(current_thread_info()->flags);
> > > > >  	} while (thread_flags & _TIF_WORK_MASK);
> > > > >  }
> > > > > +
> > > > > +unsigned long __ro_after_init signal_minsigstksz;
> > > > > +
> > > > > +/*
> > > > > + * Determine the stack space required for guaranteed signal devliery.
> > > > > + * This function is used to populate AT_MINSIGSTKSZ at process startup.
> > > > > + * cpufeatures setup is assumed to be complete.
> > > > > + */
> > > > > +void __init minsigstksz_setup(void)
> > > > > +{
> > > > > +	struct rt_sigframe_user_layout user;
> > > > > +
> > > > > +	init_user_layout(&user);
> > > > > +
> > > > > +	/*
> > > > > +	 * If this fails, SIGFRAME_MAXSZ needs to be enlarged.  It won't
> > > > > +	 * be big enough, but it's our best guess:
> > > > > +	 */
> > > > > +	if (WARN_ON(setup_sigframe_layout(&user, true)))
> > > > > +		signal_minsigstksz = SIGFRAME_MAXSZ;
> > > > 
> > > > Can we not leave signal_minsigstksz as zero in this case?
> > > 
> > > I prefer to distinguish the "kernel went wrong" case (where we just omit
> > > AT_MINSIGSTKSZ for backwards compatibilty) from the "sigframe too
> > > large" case.
> > 
> > Hmm, so I'm confused as to the distinction here. Wouldn't an allocation
> > failure in setup_sigframe_layout be indicative of "kernel went wrong"?
> > 
> > To put it another way, if we could determine the maximum sigframe size
> > at build time, surely we'd fail the build if SIGFRAME_MAXSZ wasn't big
> > enough? In that case, detecting this at runtime is also pretty bad (hence
> 
> Yup

Good, I was starting to worry I was missing something!

> > the WARN_ON) and I think we should drop the aux entry rather than provide
> > a value that is known to be incorrect.
> 
> Telling userspace the signal frame size is not optional: by omitting
> AT_MINSIGSTKSZ we implicitly tell userspace than MINSIGSTKSZ
> is sufficient.  But in this case we not only know that this is false, we
> know that SIGFRAME_MAXSZ is not sufficient either.  But we also know
> that SIGFRAME_MAXSZ is a closer estimate to the true requirement, because
> it's the larger value.
> 
> This falls under the heading of "being no more wrong than necessary".

I think I just prefer distinguishing between "AT_MINSIGSTKSZ isn't
present, I'll assume MINSIGSTKSZ is sufficient but it might not be" and
"AT_MINSIGSTKSZ is present, I know that it's sufficient".

> Either way, this is trying to paper over a kernel bug, by telling
> userspace something "sensible".  This may not be a sensible course
> of action...
> 
> So if you feel strongly I'm happy to not distinguish the two cases and
> just WARN() in minsigstksz_setup() as at present.

Yes, please.

Will
Dave Martin May 25, 2018, 2:39 p.m. UTC | #6
On Fri, May 25, 2018 at 12:32:51PM +0100, Will Deacon wrote:
> On Thu, May 24, 2018 at 06:07:13PM +0100, Dave Martin wrote:
> > On Thu, May 24, 2018 at 05:50:48PM +0100, Will Deacon wrote:
> > > On Thu, May 24, 2018 at 04:55:17PM +0100, Dave Martin wrote:
> > > > On Thu, May 24, 2018 at 01:49:21PM +0100, Will Deacon wrote:
> > > > > On Wed, May 23, 2018 at 06:46:56PM +0100, Dave Martin wrote:
> > > > > > @@ -936,3 +949,28 @@ asmlinkage void do_notify_resume(struct pt_regs *regs,
> > > > > >  		thread_flags = READ_ONCE(current_thread_info()->flags);
> > > > > >  	} while (thread_flags & _TIF_WORK_MASK);
> > > > > >  }
> > > > > > +
> > > > > > +unsigned long __ro_after_init signal_minsigstksz;
> > > > > > +
> > > > > > +/*
> > > > > > + * Determine the stack space required for guaranteed signal devliery.
> > > > > > + * This function is used to populate AT_MINSIGSTKSZ at process startup.
> > > > > > + * cpufeatures setup is assumed to be complete.
> > > > > > + */
> > > > > > +void __init minsigstksz_setup(void)
> > > > > > +{
> > > > > > +	struct rt_sigframe_user_layout user;
> > > > > > +
> > > > > > +	init_user_layout(&user);
> > > > > > +
> > > > > > +	/*
> > > > > > +	 * If this fails, SIGFRAME_MAXSZ needs to be enlarged.  It won't
> > > > > > +	 * be big enough, but it's our best guess:
> > > > > > +	 */
> > > > > > +	if (WARN_ON(setup_sigframe_layout(&user, true)))
> > > > > > +		signal_minsigstksz = SIGFRAME_MAXSZ;
> > > > > 
> > > > > Can we not leave signal_minsigstksz as zero in this case?
> > > > 
> > > > I prefer to distinguish the "kernel went wrong" case (where we just omit
> > > > AT_MINSIGSTKSZ for backwards compatibilty) from the "sigframe too
> > > > large" case.
> > > 
> > > Hmm, so I'm confused as to the distinction here. Wouldn't an allocation
> > > failure in setup_sigframe_layout be indicative of "kernel went wrong"?
> > > 
> > > To put it another way, if we could determine the maximum sigframe size
> > > at build time, surely we'd fail the build if SIGFRAME_MAXSZ wasn't big
> > > enough? In that case, detecting this at runtime is also pretty bad (hence
> > 
> > Yup
> 
> Good, I was starting to worry I was missing something!
> 
> > > the WARN_ON) and I think we should drop the aux entry rather than provide
> > > a value that is known to be incorrect.
> > 
> > Telling userspace the signal frame size is not optional: by omitting
> > AT_MINSIGSTKSZ we implicitly tell userspace than MINSIGSTKSZ
> > is sufficient.  But in this case we not only know that this is false, we
> > know that SIGFRAME_MAXSZ is not sufficient either.  But we also know
> > that SIGFRAME_MAXSZ is a closer estimate to the true requirement, because
> > it's the larger value.
> > 
> > This falls under the heading of "being no more wrong than necessary".
> 
> I think I just prefer distinguishing between "AT_MINSIGSTKSZ isn't
> present, I'll assume MINSIGSTKSZ is sufficient but it might not be" and
> "AT_MINSIGSTKSZ is present, I know that it's sufficient".
> 
> > Either way, this is trying to paper over a kernel bug, by telling
> > userspace something "sensible".  This may not be a sensible course
> > of action...
> > 
> > So if you feel strongly I'm happy to not distinguish the two cases and
> > just WARN() in minsigstksz_setup() as at present.
> 
> Yes, please.

OK, will do.  This saves on explanations at least.

Cheers
---Dave
diff mbox series

Patch

diff --git a/arch/arm64/include/asm/elf.h b/arch/arm64/include/asm/elf.h
index fac1c4d..9c18f0e 100644
--- a/arch/arm64/include/asm/elf.h
+++ b/arch/arm64/include/asm/elf.h
@@ -121,6 +121,9 @@ 
 
 #ifndef __ASSEMBLY__
 
+#include <linux/bug.h>
+#include <asm/processor.h> /* for signal_minsigstksz, used by ARCH_DLINFO */
+
 typedef unsigned long elf_greg_t;
 
 #define ELF_NGREG (sizeof(struct user_pt_regs) / sizeof(elf_greg_t))
@@ -148,6 +151,14 @@  typedef struct user_fpsimd_state elf_fpregset_t;
 do {									\
 	NEW_AUX_ENT(AT_SYSINFO_EHDR,					\
 		    (elf_addr_t)current->mm->context.vdso);		\
+									\
+	/*								\
+	 * Should always be nonzero unless there's a kernel bug.  If	\
+	 * the we haven't determined a sensible value to give to	\
+	 * userspace, omit the entry:					\
+	 */								\
+	if (likely(signal_minsigstksz))					\
+		NEW_AUX_ENT(AT_MINSIGSTKSZ, signal_minsigstksz);	\
 } while (0)
 
 #define ARCH_HAS_SETUP_ADDITIONAL_PAGES
diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index 7675989..65ab83e 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -35,6 +35,8 @@ 
 #ifdef __KERNEL__
 
 #include <linux/build_bug.h>
+#include <linux/cache.h>
+#include <linux/init.h>
 #include <linux/stddef.h>
 #include <linux/string.h>
 
@@ -244,6 +246,9 @@  void cpu_enable_pan(const struct arm64_cpu_capabilities *__unused);
 void cpu_enable_cache_maint_trap(const struct arm64_cpu_capabilities *__unused);
 void cpu_clear_disr(const struct arm64_cpu_capabilities *__unused);
 
+extern unsigned long __ro_after_init signal_minsigstksz; /* sigframe size */
+extern void __init minsigstksz_setup(void);
+
 /* Userspace interface for PR_SVE_{SET,GET}_VL prctl()s: */
 #define SVE_SET_VL(arg)	sve_set_current_vl(arg)
 #define SVE_GET_VL()	sve_get_current_vl()
diff --git a/arch/arm64/include/uapi/asm/auxvec.h b/arch/arm64/include/uapi/asm/auxvec.h
index ec0a86d..743c0b8 100644
--- a/arch/arm64/include/uapi/asm/auxvec.h
+++ b/arch/arm64/include/uapi/asm/auxvec.h
@@ -19,7 +19,8 @@ 
 
 /* vDSO location */
 #define AT_SYSINFO_EHDR	33
+#define AT_MINSIGSTKSZ	51	/* stack needed for signal delivery */
 
-#define AT_VECTOR_SIZE_ARCH 1 /* entries in ARCH_DLINFO */
+#define AT_VECTOR_SIZE_ARCH 2 /* entries in ARCH_DLINFO */
 
 #endif
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 9d1b06d..0e0b53d 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1619,6 +1619,7 @@  void __init setup_cpu_features(void)
 		pr_info("emulated: Privileged Access Never (PAN) using TTBR0_EL1 switching\n");
 
 	sve_setup();
+	minsigstksz_setup();
 
 	/* Advertise that we have computed the system capabilities */
 	set_sys_caps_initialised();
diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index 154b7d3..00b9990 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -17,6 +17,7 @@ 
  * along with this program.  If not, see <http://www.gnu.org/licenses/>.
  */
 
+#include <linux/cache.h>
 #include <linux/compat.h>
 #include <linux/errno.h>
 #include <linux/kernel.h>
@@ -570,8 +571,15 @@  asmlinkage long sys_rt_sigreturn(struct pt_regs *regs)
 	return 0;
 }
 
-/* Determine the layout of optional records in the signal frame */
-static int setup_sigframe_layout(struct rt_sigframe_user_layout *user)
+/*
+ * Determine the layout of optional records in the signal frame
+ *
+ * add_all: if true, lays out the biggest possible signal frame for
+ *	this task; otherwise, generates a layout for the current state
+ *	of the task.
+ */
+static int setup_sigframe_layout(struct rt_sigframe_user_layout *user,
+				 bool add_all)
 {
 	int err;
 
@@ -581,7 +589,7 @@  static int setup_sigframe_layout(struct rt_sigframe_user_layout *user)
 		return err;
 
 	/* fault information, if valid */
-	if (current->thread.fault_code) {
+	if (add_all || current->thread.fault_code) {
 		err = sigframe_alloc(user, &user->esr_offset,
 				     sizeof(struct esr_context));
 		if (err)
@@ -591,8 +599,14 @@  static int setup_sigframe_layout(struct rt_sigframe_user_layout *user)
 	if (system_supports_sve()) {
 		unsigned int vq = 0;
 
-		if (test_thread_flag(TIF_SVE))
-			vq = sve_vq_from_vl(current->thread.sve_vl);
+		if (add_all || test_thread_flag(TIF_SVE)) {
+			int vl = sve_max_vl;
+
+			if (!add_all)
+				vl = current->thread.sve_vl;
+
+			vq = sve_vq_from_vl(vl);
+		}
 
 		err = sigframe_alloc(user, &user->sve_offset,
 				     SVE_SIG_CONTEXT_SIZE(vq));
@@ -603,7 +617,6 @@  static int setup_sigframe_layout(struct rt_sigframe_user_layout *user)
 	return sigframe_alloc_end(user);
 }
 
-
 static int setup_sigframe(struct rt_sigframe_user_layout *user,
 			  struct pt_regs *regs, sigset_t *set)
 {
@@ -701,7 +714,7 @@  static int get_sigframe(struct rt_sigframe_user_layout *user,
 	int err;
 
 	init_user_layout(user);
-	err = setup_sigframe_layout(user);
+	err = setup_sigframe_layout(user, false);
 	if (err)
 		return err;
 
@@ -936,3 +949,28 @@  asmlinkage void do_notify_resume(struct pt_regs *regs,
 		thread_flags = READ_ONCE(current_thread_info()->flags);
 	} while (thread_flags & _TIF_WORK_MASK);
 }
+
+unsigned long __ro_after_init signal_minsigstksz;
+
+/*
+ * Determine the stack space required for guaranteed signal devliery.
+ * This function is used to populate AT_MINSIGSTKSZ at process startup.
+ * cpufeatures setup is assumed to be complete.
+ */
+void __init minsigstksz_setup(void)
+{
+	struct rt_sigframe_user_layout user;
+
+	init_user_layout(&user);
+
+	/*
+	 * If this fails, SIGFRAME_MAXSZ needs to be enlarged.  It won't
+	 * be big enough, but it's our best guess:
+	 */
+	if (WARN_ON(setup_sigframe_layout(&user, true)))
+		signal_minsigstksz = SIGFRAME_MAXSZ;
+	else
+		signal_minsigstksz = sigframe_size(&user) +
+			round_up(sizeof(struct frame_record), 16) +
+			16; /* max alignment padding */
+}