Message ID | 87y49rkuye.fsf@esperi.org.uk |
---|---|
State | New |
Headers | show |
On 10 Mar 2016 01:00, Nix wrote: > On 9 Mar 2016, Mike Frysinger uttered the following: > > On 08 Mar 2016 13:50, Nix wrote: > >> This one is a bit nasty. Now that we are initializing TLS earlier for > >> the stack canary's sake, existing memcpy() implementations become > >> problematic. We can use the multiarch implementations, but they might > >> not always be present, and even if they are present they might not always > >> be in assembler, so might be compiled with stack-protection. We cannot > >> use posix/memcpy.c without marking both it and */wordcopy.c as non-stack- > >> protected, which for memcpy() of all things seems like a seriously bad > >> idea: if any function in glibc should be stack-protected, it's memcpy() > >> (though stack-protecting the many optimized assembly versions is not done > >> in this patch series). > >> > >> So we have two real options: hack up the guts of posix/memcpy.c and > >> */wordcopy.c so that they can be #included (renamed and declared static) > >> inside libc-tls.c, or simply open-code the memcpy(). For simplicity's > >> sake, this patch open-codes it, on the grounds that static binaries are > >> relatively rare and quasi-deprecated anyway, and static binaries with > >> large TLS sections are yet rarer, and not worth the complexity of hacking > >> up all the arch-dependent wordcopy files. > > > > can we utilize _HAVE_STRING_ARCH_memcpy here ? the string includes will > > sometimes provide inlined asm versions ... > > I think so, *if* we can be sure that arches that define > _HAVE_STRING_ARCH_memcpy will never expand memcpy() to macros that > sometimes call functions, but only to pure inline asm stuff. Since there > is only one definer of that (x86), I'm not sure that's guaranteed... i guess look at the other arch inlines to see what they do. i honestly don't know where or how well these APIs are defined. > + /* sbrk gives us zero'd memory, so we don't need to clear the remainder. > + > + Use inlined asm implementation if available: otherwise, copy by hand, > + because memcpy() is stack-protected and is often multiarch too. */ > + > +#if defined _HAVE_STRING_ARCH_memcpy > memcpy (_dl_static_dtv[2].pointer.val, initimage, filesz); should this also depend on SSP not being enabled ? #if defined _HAVE_STRING_ARCH_memcpy || !defined <whatever SSP define> otherwise, you'd want to use "#ifdef". -mike
On 10-03-2016 09:28, Mike Frysinger wrote: > On 10 Mar 2016 01:00, Nix wrote: >> On 9 Mar 2016, Mike Frysinger uttered the following: >>> On 08 Mar 2016 13:50, Nix wrote: >>>> This one is a bit nasty. Now that we are initializing TLS earlier for >>>> the stack canary's sake, existing memcpy() implementations become >>>> problematic. We can use the multiarch implementations, but they might >>>> not always be present, and even if they are present they might not always >>>> be in assembler, so might be compiled with stack-protection. We cannot >>>> use posix/memcpy.c without marking both it and */wordcopy.c as non-stack- >>>> protected, which for memcpy() of all things seems like a seriously bad >>>> idea: if any function in glibc should be stack-protected, it's memcpy() >>>> (though stack-protecting the many optimized assembly versions is not done >>>> in this patch series). >>>> >>>> So we have two real options: hack up the guts of posix/memcpy.c and >>>> */wordcopy.c so that they can be #included (renamed and declared static) >>>> inside libc-tls.c, or simply open-code the memcpy(). For simplicity's >>>> sake, this patch open-codes it, on the grounds that static binaries are >>>> relatively rare and quasi-deprecated anyway, and static binaries with >>>> large TLS sections are yet rarer, and not worth the complexity of hacking >>>> up all the arch-dependent wordcopy files. >>> >>> can we utilize _HAVE_STRING_ARCH_memcpy here ? the string includes will >>> sometimes provide inlined asm versions ... >> >> I think so, *if* we can be sure that arches that define >> _HAVE_STRING_ARCH_memcpy will never expand memcpy() to macros that >> sometimes call functions, but only to pure inline asm stuff. Since there >> is only one definer of that (x86), I'm not sure that's guaranteed... > > i guess look at the other arch inlines to see what they do. i honestly > don't know where or how well these APIs are defined. > >> + /* sbrk gives us zero'd memory, so we don't need to clear the remainder. >> + >> + Use inlined asm implementation if available: otherwise, copy by hand, >> + because memcpy() is stack-protected and is often multiarch too. */ >> + >> +#if defined _HAVE_STRING_ARCH_memcpy >> memcpy (_dl_static_dtv[2].pointer.val, initimage, filesz); > > should this also depend on SSP not being enabled ? > #if defined _HAVE_STRING_ARCH_memcpy || !defined <whatever SSP define> > > otherwise, you'd want to use "#ifdef". > -mike > Is _HAVE_STRING_ARCH_memcpy really doing a better job than gcc? I would prefer to just use the open memcpy and let it optimize it.
On 10 Mar 2016, Adhemerval Zanella stated: > On 10-03-2016 09:28, Mike Frysinger wrote: >> otherwise, you'd want to use "#ifdef". >> -mike > > Is _HAVE_STRING_ARCH_memcpy really doing a better job than gcc? I would prefer > to just use the open memcpy and let it optimize it. In most cases I'd agree with you but the thing about _HAVE_STRING_ARCH_memcpy is that GCC can decide *not* to optimize it: though we have suppressed generation of libcalls in this function, not even a call to __builtin_memcpy() is guaranteed to generate inline asm with no function calls, as far as I know. (We could start messing about with __attribute__((__optimize__("-mmemcpy-strategy=STRATEGY"))), but this is really starting to get a bit over the top, I think... particularly given that the only platform that works on is *also* the only platform we have ARCH_memcpy for, so the only platform we don't need that sort of thing on.)
On 10 Mar 2016, Mike Frysinger verbalised: > On 10 Mar 2016 01:00, Nix wrote: >> I think so, *if* we can be sure that arches that define >> _HAVE_STRING_ARCH_memcpy will never expand memcpy() to macros that >> sometimes call functions, but only to pure inline asm stuff. Since there >> is only one definer of that (x86), I'm not sure that's guaranteed... > > i guess look at the other arch inlines to see what they do. i honestly > don't know where or how well these APIs are defined. The only platform that defines *any* string arch operations is x86. None of those fall back to out-of-line functions at all, even for huge inputs (which for some functions and some larger inputs strikes me as a terrible implementation choice, but ah well.) >> + /* sbrk gives us zero'd memory, so we don't need to clear the remainder. >> + >> + Use inlined asm implementation if available: otherwise, copy by hand, >> + because memcpy() is stack-protected and is often multiarch too. */ >> + >> +#if defined _HAVE_STRING_ARCH_memcpy >> memcpy (_dl_static_dtv[2].pointer.val, initimage, filesz); > > should this also depend on SSP not being enabled ? > #if defined _HAVE_STRING_ARCH_memcpy || !defined <whatever SSP define> Probably a good idea: it makes things less invasive for the common case. (Though your belief that there is just *one* SSP define is alas untrue: we'll have to check all three.) Adding.
On 10 Mar 2016, nix@esperi.org.uk stated: > On 10 Mar 2016, Mike Frysinger verbalised: >> should this also depend on SSP not being enabled ? >> #if defined _HAVE_STRING_ARCH_memcpy || !defined <whatever SSP define> > > Probably a good idea: it makes things less invasive for the common case. > (Though your belief that there is just *one* SSP define is alas untrue: > we'll have to check all three.) Heh. This has additional fun complications: all of csu/ is built without stack protection, so we cannot rely on the macros the compiler predefines to indicate that stack protection is on: for csu/, it's not, ever. So I do have to introduce a new #define here, to indicate that stack-protection is globally enabled even if it's off for some file in particular (easy to do in configure.ac etc: folding that part into patch 1).
diff --git a/csu/libc-tls.c b/csu/libc-tls.c index 3d67a64..af0d3e6 100644 --- a/csu/libc-tls.c +++ b/csu/libc-tls.c @@ -102,6 +102,7 @@ init_static_tls (size_t memsz, size_t align) } void +inhibit_loop_to_libcall __libc_setup_tls (size_t tcbsize, size_t tcbalign) { void *tlsblock; @@ -176,8 +177,22 @@ __libc_setup_tls (size_t tcbsize, size_t tcbalign) # error "Either TLS_TCB_AT_TP or TLS_DTV_AT_TP must be defined" #endif _dl_static_dtv[2].pointer.is_static = true; - /* sbrk gives us zero'd memory, so we don't need to clear the remainder. */ + + /* sbrk gives us zero'd memory, so we don't need to clear the remainder. + + Use inlined asm implementation if available: otherwise, copy by hand, + because memcpy() is stack-protected and is often multiarch too. */ + +#if defined _HAVE_STRING_ARCH_memcpy memcpy (_dl_static_dtv[2].pointer.val, initimage, filesz); +#else + char *dst = (char *) _dl_static_dtv[2].pointer.val; + char *src = (char *) initimage; + size_t i; + + for (i = 0; i < filesz; dst++, src++, i++) + *dst = *src; +#endif /* Install the pointer to the dtv. */