Message ID: 20210706145839.1658623-2-adhemerval.zanella@linaro.org
State: New
Series: Add close_range, closefrom, and posix_spawn_file_actions_closefrom_np
* Adhemerval Zanella via Libc-alpha:

> The code to allocate a stack from xsigstack is refactored so it can
> be more generic.  The new support_stack_alloc() also set PROT_EXEC
> if DEFAULT_STACK_PERMS has PF_X.  This is required on some
> architectures (hppa for instance) and trying to access the rtld
> global from testsuite will require more intrusive refactoring
> in the ldsodefs.h header.

DEFAULT_STACK_PERMS is misnamed, it's really HISTORIC_STACK_PERMS.
All architectures override it to RW permissions in the toolchain
(maybe with the exception of Hurd, which uses trampolines for nested
functions).

I have a cstack_allocate version that handles this.  It can only be done
from within glibc proper because we do not export the stack execution
status directly.  But I think it's out of scope for glibc 2.34 by now.

> +  /* The guard bands need to be large enough to intercept offset
> +     accesses from a stack address that might otherwise hit another
> +     mapping.  Make them at least twice as big as the stack itself, to
> +     defend against an offset by the entire size of a large
> +     stack-allocated array.  The minimum is 1MiB, which is arbitrarily
> +     chosen to be larger than any "typical" wild pointer offset.
> +     Again, no matter what the number is, round it up to a whole
> +     number of pages.  */
> +  size_t guardsize = roundup (MAX (2 * stacksize, 1024 * 1024), pagesize);
> +  size_t alloc_size = guardsize + stacksize + guardsize;
> +  /* Use MAP_NORESERVE so that RAM will not be wasted on the guard
> +     bands; touch all the pages of the actual stack before returning,
> +     so we know they are allocated.  */
> +  void *alloc_base = xmmap (0,
> +                            alloc_size,
> +                            PROT_NONE,
> +                            MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE|MAP_STACK,
> +                            -1);
> +  /* PF_X can be overridden if PT_GNU_STACK is present.  */
> +  int prot = PROT_READ | PROT_WRITE
> +             | (DEFAULT_STACK_PERMS & PF_X ? PROT_EXEC : 0);
> +  xmprotect (alloc_base + guardsize, stacksize, prot);
> +  memset (alloc_base + guardsize, 0xA5, stacksize);
> +  return (struct support_stack) { alloc_base + guardsize, stacksize, guardsize };

This doesn't handle different stack growth directions.

Thanks,
Florian
On 07/07/2021 07:17, Florian Weimer wrote:
> * Adhemerval Zanella via Libc-alpha:
>
>> The code to allocate a stack from xsigstack is refactored so it can
>> be more generic.  The new support_stack_alloc() also set PROT_EXEC
>> if DEFAULT_STACK_PERMS has PF_X.  This is required on some
>> architectures (hppa for instance) and trying to access the rtld
>> global from testsuite will require more intrusive refactoring
>> in the ldsodefs.h header.
>
> DEFAULT_STACK_PERMS is misnamed, it's really HISTORIC_STACK_PERMS.
> All architectures override it to RW permissions in the toolchain
> (maybe with the exception of Hurd, which uses trampolines for nested
> functions).

These are in fact two different requirements; this gnulib thread gives
a nice summary of the permissions required for trampolines [1].
Another requirement is how Linux lays out the signal return code for the
signal handler stack.  It seems that hppa still requires executable
stacks, since tst-xsigstack does fail without an executable stack even
on a recent 5.10.46-1 kernel.

> I have a cstack_allocate version that handles this.  It can only be done
> from within glibc proper because we do not export the stack execution
> status directly.  But I think it's out of scope for glibc 2.34 by now.

We can in theory access the ldsodefs.h fields directly and then use the
GL (dl_stack_flags) information to set the stack executable or not.
The problem is that ldsodefs.h is quite convoluted and it would require
more refactoring to use it outside libc.so code.  But I agree with you
that a less hacky way to obtain this information would be better.

So are you ok with the current approach, or with being conservative and
using DEFAULT_STACK_PERMS in libsupport?

>> +  /* The guard bands need to be large enough to intercept offset
>> +     accesses from a stack address that might otherwise hit another
>> +     mapping.  Make them at least twice as big as the stack itself, to
>> +     defend against an offset by the entire size of a large
>> +     stack-allocated array.  The minimum is 1MiB, which is arbitrarily
>> +     chosen to be larger than any "typical" wild pointer offset.
>> +     Again, no matter what the number is, round it up to a whole
>> +     number of pages.  */
>> +  size_t guardsize = roundup (MAX (2 * stacksize, 1024 * 1024), pagesize);
>> +  size_t alloc_size = guardsize + stacksize + guardsize;
>> +  /* Use MAP_NORESERVE so that RAM will not be wasted on the guard
>> +     bands; touch all the pages of the actual stack before returning,
>> +     so we know they are allocated.  */
>> +  void *alloc_base = xmmap (0,
>> +                            alloc_size,
>> +                            PROT_NONE,
>> +                            MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE|MAP_STACK,
>> +                            -1);
>> +  /* PF_X can be overridden if PT_GNU_STACK is present.  */
>> +  int prot = PROT_READ | PROT_WRITE
>> +             | (DEFAULT_STACK_PERMS & PF_X ? PROT_EXEC : 0);
>> +  xmprotect (alloc_base + guardsize, stacksize, prot);
>> +  memset (alloc_base + guardsize, 0xA5, stacksize);
>> +  return (struct support_stack) { alloc_base + guardsize, stacksize, guardsize };
>
> This doesn't handle different stack growth directions.

At least for the current users of the routine it does not require any
adjustment: xsigaltstack and xclone will handle it.  I saw no regression
for tst-xsigaltstack and tst-clone_range.

[1] https://lists.gnu.org/archive/html/bug-gnulib/2021-05/msg00080.html
* Adhemerval Zanella:

> On 07/07/2021 07:17, Florian Weimer wrote:
>> * Adhemerval Zanella via Libc-alpha:
>>
>>> The code to allocate a stack from xsigstack is refactored so it can
>>> be more generic.  The new support_stack_alloc() also set PROT_EXEC
>>> if DEFAULT_STACK_PERMS has PF_X.  This is required on some
>>> architectures (hppa for instance) and trying to access the rtld
>>> global from testsuite will require more intrusive refactoring
>>> in the ldsodefs.h header.
>>
>> DEFAULT_STACK_PERMS is misnamed, it's really HISTORIC_STACK_PERMS.
>> All architectures override it to RW permissions in the toolchain
>> (maybe with the exception of Hurd, which uses trampolines for nested
>> functions).
>
> This is in fact two different requirements, this gnulib thread gives
> a nice summary about the permission required from trampolines [1].
> Another requirement is how Linux layout the signal return code for the
> signal handler stack.  It seems that hppa still requires executable
> stacks, since tst-xsigstack does fails without a executable stack even
> on a recent 5.10.46-1 kernel.

Ugh, okay.

>> I have a cstack_allocate version that handles this.  It can only be done
>> from within glibc proper because we do not export the stack execution
>> status directly.  But I think it's out of scope for glibc 2.34 by now.
>
> We can in theory access the ldsodes.h fields directly and then
> use GL (dl_stack_flags) information to set the stack executable or not.
> The problem is ldsodefs.h is quite convoluted and it would require more
> refactoring to use outside libc.so code.  But I agree with you that
> having less hacky way to obtain this information is better.
>
> So are you ok with the current approach or being conservative and use
> DEFAULT_STACK_PERMS on libsupport?

DEFAULT_STACK_PERMS with a comment is fine.

I will resubmit my cstack_allocate patches for glibc 2.35, and they
will fully handle executable stacks.

>>> +  /* The guard bands need to be large enough to intercept offset
>>> +     accesses from a stack address that might otherwise hit another
>>> +     mapping.  Make them at least twice as big as the stack itself, to
>>> +     defend against an offset by the entire size of a large
>>> +     stack-allocated array.  The minimum is 1MiB, which is arbitrarily
>>> +     chosen to be larger than any "typical" wild pointer offset.
>>> +     Again, no matter what the number is, round it up to a whole
>>> +     number of pages.  */
>>> +  size_t guardsize = roundup (MAX (2 * stacksize, 1024 * 1024), pagesize);
>>> +  size_t alloc_size = guardsize + stacksize + guardsize;
>>> +  /* Use MAP_NORESERVE so that RAM will not be wasted on the guard
>>> +     bands; touch all the pages of the actual stack before returning,
>>> +     so we know they are allocated.  */
>>> +  void *alloc_base = xmmap (0,
>>> +                            alloc_size,
>>> +                            PROT_NONE,
>>> +                            MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE|MAP_STACK,
>>> +                            -1);
>>> +  /* PF_X can be overridden if PT_GNU_STACK is present.  */
>>> +  int prot = PROT_READ | PROT_WRITE
>>> +             | (DEFAULT_STACK_PERMS & PF_X ? PROT_EXEC : 0);
>>> +  xmprotect (alloc_base + guardsize, stacksize, prot);
>>> +  memset (alloc_base + guardsize, 0xA5, stacksize);
>>> +  return (struct support_stack) { alloc_base + guardsize, stacksize, guardsize };
>>
>> This doesn't handle different stack growth directions.
>
> At least for the usages of the routine it does not require any adjustment:
> xsigaltstack and xclone will handle it.  I saw no regression for
> tst-xsigaltstack and tst-clone_range.

Huh, I would expect the guard area to be outside of the stack region
returned by stack allocation.  That's how the cstack_allocate API does
it.  If the current tests expect something else, then the approach in
the patch is okay (with a comment for DEFAULT_STACK_PERMS).

Thanks,
Florian
On 07/07/2021 14:15, Florian Weimer wrote:
> * Adhemerval Zanella:
>
>> On 07/07/2021 07:17, Florian Weimer wrote:
>>> * Adhemerval Zanella via Libc-alpha:
>>>
>>>> The code to allocate a stack from xsigstack is refactored so it can
>>>> be more generic.  The new support_stack_alloc() also set PROT_EXEC
>>>> if DEFAULT_STACK_PERMS has PF_X.  This is required on some
>>>> architectures (hppa for instance) and trying to access the rtld
>>>> global from testsuite will require more intrusive refactoring
>>>> in the ldsodefs.h header.
>>>
>>> DEFAULT_STACK_PERMS is misnamed, it's really HISTORIC_STACK_PERMS.
>>> All architectures override it to RW permissions in the toolchain
>>> (maybe with the exception of Hurd, which uses trampolines for nested
>>> functions).
>>
>> This is in fact two different requirements, this gnulib thread gives
>> a nice summary about the permission required from trampolines [1].
>> Another requirement is how Linux layout the signal return code for the
>> signal handler stack.  It seems that hppa still requires executable
>> stacks, since tst-xsigstack does fails without a executable stack even
>> on a recent 5.10.46-1 kernel.
>
> Ugh, okay.
>
>>> I have a cstack_allocate version that handles this.  It can only be done
>>> from within glibc proper because we do not export the stack execution
>>> status directly.  But I think it's out of scope for glibc 2.34 by now.
>>
>> We can in theory access the ldsodes.h fields directly and then
>> use GL (dl_stack_flags) information to set the stack executable or not.
>> The problem is ldsodefs.h is quite convoluted and it would require more
>> refactoring to use outside libc.so code.  But I agree with you that
>> having less hacky way to obtain this information is better.
>>
>> So are you ok with the current approach or being conservative and use
>> DEFAULT_STACK_PERMS on libsupport?
>
> DEFAULT_STACK_PERMS with a comment is fine.

I have added:

  /* Some architecture still requires executable stack for the signal return
     trampoline, although PF_X could be overridden if PT_GNU_STACK is present.
     However since there is glibc does not export such information with a
     proper ABI, it uses the historical permissions.  */

> I will resubmit my cstack_allocate patches for glibc 2.35 patches, and
> they will fully handle executable stacks.

I think once we get cstack_allocate we might use it in
'support_stack_alloc' instead.

>>>> +  /* The guard bands need to be large enough to intercept offset
>>>> +     accesses from a stack address that might otherwise hit another
>>>> +     mapping.  Make them at least twice as big as the stack itself, to
>>>> +     defend against an offset by the entire size of a large
>>>> +     stack-allocated array.  The minimum is 1MiB, which is arbitrarily
>>>> +     chosen to be larger than any "typical" wild pointer offset.
>>>> +     Again, no matter what the number is, round it up to a whole
>>>> +     number of pages.  */
>>>> +  size_t guardsize = roundup (MAX (2 * stacksize, 1024 * 1024), pagesize);
>>>> +  size_t alloc_size = guardsize + stacksize + guardsize;
>>>> +  /* Use MAP_NORESERVE so that RAM will not be wasted on the guard
>>>> +     bands; touch all the pages of the actual stack before returning,
>>>> +     so we know they are allocated.  */
>>>> +  void *alloc_base = xmmap (0,
>>>> +                            alloc_size,
>>>> +                            PROT_NONE,
>>>> +                            MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE|MAP_STACK,
>>>> +                            -1);
>>>> +  /* PF_X can be overridden if PT_GNU_STACK is present.  */
>>>> +  int prot = PROT_READ | PROT_WRITE
>>>> +             | (DEFAULT_STACK_PERMS & PF_X ? PROT_EXEC : 0);
>>>> +  xmprotect (alloc_base + guardsize, stacksize, prot);
>>>> +  memset (alloc_base + guardsize, 0xA5, stacksize);
>>>> +  return (struct support_stack) { alloc_base + guardsize, stacksize, guardsize };
>>>
>>> This doesn't handle different stack growth directions.
>>
>> At least for the usages of the routine it does not require any adjustment:
>> xsigaltstack and xclone will handle it.  I saw no regression for
>> tst-xsigaltstack and tst-clone_range.
>
> Huh, I would expect the guard area to be outside of the stack region
> returned by stack allocation.  That's how the cstack_allocate API does
> it.  If the current tests expect something else, then the approach in
> the patch is okay (with a comment for DEFAULT_STACK_PERMS).

It seems that at least for hppa, sigaltstack already handles it.  For
clone(), xclone does the adjustment explicitly.
* Adhemerval Zanella:

> I have added:
>
>   /* Some architecture still requires executable stack for the signal return
>      trampoline, although PF_X could be overridden if PT_GNU_STACK is present.
>      However since there is glibc does not export such information with a
>      proper ABI, it uses the historical permissions.  */

“since [] glibc does not export”

Thanks,
Florian
On 08/07/2021 02:43, Florian Weimer wrote:
> * Adhemerval Zanella:
>
>> I have added:
>>
>>   /* Some architecture still requires executable stack for the signal return
>>      trampoline, although PF_X could be overridden if PT_GNU_STACK is present.
>>      However since there is glibc does not export such information with a
>>      proper ABI, it uses the historical permissions.  */
>
> “since [] glibc does not export”

Ack.
diff --git a/support/Makefile b/support/Makefile
index 5c69f0de4b..a462781718 100644
--- a/support/Makefile
+++ b/support/Makefile
@@ -39,6 +39,7 @@ libsupport-routines = \
   resolv_response_context_free \
   resolv_test \
   set_fortify_handler \
+  support_stack_alloc \
   support-xfstat \
   support-xfstat-time64 \
   support-xstat \
diff --git a/support/support.h b/support/support.h
index 9ec8ecb8d7..dbd270c78d 100644
--- a/support/support.h
+++ b/support/support.h
@@ -164,6 +164,25 @@ timer_t support_create_timer (uint64_t sec, long int nsec, bool repeat,
 /* Disable the timer TIMER.  */
 void support_delete_timer (timer_t timer);
 
+struct support_stack
+{
+  void *stack;
+  size_t size;
+  size_t guardsize;
+};
+
+/* Allocate stack suitable to used with xclone or sigaltstack call.  The
+   stack will have a minimum size of SIZE + MINSIGSTKSZ bytes, rounded up
+   to a whole number of pages.  There will be a large (at least 1 MiB)
+   inaccessible guard bands on either side of it.
+   The returned value on ALLOC_BASE and ALLOC_SIZE will be the usable
+   stack region, excluding the GUARD_SIZE allocated area.
+   It also terminates the process on error.  */
+struct support_stack support_stack_alloc (size_t size);
+
+/* Deallocate the STACK.  */
+void support_stack_free (struct support_stack *stack);
+
 __END_DECLS
 
 #endif /* SUPPORT_H */
diff --git a/support/support_stack_alloc.c b/support/support_stack_alloc.c
new file mode 100644
index 0000000000..db0d522f2f
--- /dev/null
+++ b/support/support_stack_alloc.c
@@ -0,0 +1,79 @@
+/* Allocate a stack suitable to be used with xclone or xsigaltstack.
+   Copyright (C) 2021 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <support/check.h>
+#include <support/support.h>
+#include <support/xunistd.h>
+#include <stdint.h>
+#include <string.h>
+#include <stackinfo.h>
+#include <sys/mman.h>
+#include <sys/param.h> /* roundup, MAX */
+
+#ifndef MAP_NORESERVE
+# define MAP_NORESERVE 0
+#endif
+#ifndef MAP_STACK
+# define MAP_STACK 0
+#endif
+
+struct support_stack
+support_stack_alloc (size_t size)
+{
+  size_t pagesize = sysconf (_SC_PAGESIZE);
+  if (pagesize == -1)
+    FAIL_EXIT1 ("sysconf (_SC_PAGESIZE): %m\n");
+
+  /* Always supply at least MINSIGSTKSZ space; passing 0 as size means
+     only that much space.  No matter what the number is, round it up
+     to a whole number of pages.  */
+  size_t stacksize = roundup (size + MINSIGSTKSZ, pagesize);
+
+  /* The guard bands need to be large enough to intercept offset
+     accesses from a stack address that might otherwise hit another
+     mapping.  Make them at least twice as big as the stack itself, to
+     defend against an offset by the entire size of a large
+     stack-allocated array.  The minimum is 1MiB, which is arbitrarily
+     chosen to be larger than any "typical" wild pointer offset.
+     Again, no matter what the number is, round it up to a whole
+     number of pages.  */
+  size_t guardsize = roundup (MAX (2 * stacksize, 1024 * 1024), pagesize);
+  size_t alloc_size = guardsize + stacksize + guardsize;
+  /* Use MAP_NORESERVE so that RAM will not be wasted on the guard
+     bands; touch all the pages of the actual stack before returning,
+     so we know they are allocated.  */
+  void *alloc_base = xmmap (0,
+                            alloc_size,
+                            PROT_NONE,
+                            MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE|MAP_STACK,
+                            -1);
+  /* PF_X can be overridden if PT_GNU_STACK is present.  */
+  int prot = PROT_READ | PROT_WRITE
+             | (DEFAULT_STACK_PERMS & PF_X ? PROT_EXEC : 0);
+  xmprotect (alloc_base + guardsize, stacksize, prot);
+  memset (alloc_base + guardsize, 0xA5, stacksize);
+  return (struct support_stack) { alloc_base + guardsize, stacksize, guardsize };
+}
+
+void
+support_stack_free (struct support_stack *stack)
+{
+  void *alloc_base = (void *)((uintptr_t) stack->stack - stack->guardsize);
+  size_t alloc_size = stack->size + 2 * stack->guardsize;
+  xmunmap (alloc_base, alloc_size);
+}
diff --git a/support/xsigstack.c b/support/xsigstack.c
index a2f0e3269a..a471c853cb 100644
--- a/support/xsigstack.c
+++ b/support/xsigstack.c
@@ -37,8 +37,7 @@
    structures.  */
 struct sigstack_desc
 {
-  void *alloc_base;  /* Base address of the complete allocation.  */
-  size_t alloc_size; /* Size of the complete allocation.  */
+  struct support_stack stack;
   stack_t alt_stack; /* The address and size of the stack itself.  */
   stack_t old_stack; /* The previous signal stack.  */
 };
@@ -46,43 +45,11 @@ struct sigstack_desc
 void *
 xalloc_sigstack (size_t size)
 {
-  size_t pagesize = sysconf (_SC_PAGESIZE);
-  if (pagesize == -1)
-    FAIL_EXIT1 ("sysconf (_SC_PAGESIZE): %m\n");
-
-  /* Always supply at least MINSIGSTKSZ space; passing 0 as size means
-     only that much space.  No matter what the number is, round it up
-     to a whole number of pages.  */
-  size_t stacksize = roundup (size + MINSIGSTKSZ, pagesize);
-
-  /* The guard bands need to be large enough to intercept offset
-     accesses from a stack address that might otherwise hit another
-     mapping.  Make them at least twice as big as the stack itself, to
-     defend against an offset by the entire size of a large
-     stack-allocated array.  The minimum is 1MiB, which is arbitrarily
-     chosen to be larger than any "typical" wild pointer offset.
-     Again, no matter what the number is, round it up to a whole
-     number of pages.  */
-  size_t guardsize = roundup (MAX (2 * stacksize, 1024 * 1024), pagesize);
-
   struct sigstack_desc *desc = xmalloc (sizeof (struct sigstack_desc));
-  desc->alloc_size = guardsize + stacksize + guardsize;
-  /* Use MAP_NORESERVE so that RAM will not be wasted on the guard
-     bands; touch all the pages of the actual stack before returning,
-     so we know they are allocated.  */
-  desc->alloc_base = xmmap (0,
-                            desc->alloc_size,
-                            PROT_READ|PROT_WRITE,
-                            MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE|MAP_STACK,
-                            -1);
-
-  xmprotect (desc->alloc_base, guardsize, PROT_NONE);
-  xmprotect (desc->alloc_base + guardsize + stacksize, guardsize, PROT_NONE);
-  memset (desc->alloc_base + guardsize, 0xA5, stacksize);
-
-  desc->alt_stack.ss_sp    = desc->alloc_base + guardsize;
+  desc->stack = support_stack_alloc (size);
+  desc->alt_stack.ss_sp    = desc->stack.stack;
   desc->alt_stack.ss_flags = 0;
-  desc->alt_stack.ss_size  = stacksize;
+  desc->alt_stack.ss_size  = desc->stack.size;
 
   if (sigaltstack (&desc->alt_stack, &desc->old_stack))
     FAIL_EXIT1 ("sigaltstack (new stack: sp=%p, size=%zu, flags=%u): %m\n",
@@ -101,7 +68,7 @@ xfree_sigstack (void *stack)
     FAIL_EXIT1 ("sigaltstack (restore old stack: sp=%p, size=%zu, flags=%u): "
                 "%m\n",
                 desc->old_stack.ss_sp, desc->old_stack.ss_size,
                 desc->old_stack.ss_flags);
-  xmunmap (desc->alloc_base, desc->alloc_size);
+  support_stack_free (&desc->stack);
   free (desc);
 }