diff mbox series

[libphobos] Work around lack of dlpi_tls_modid before Solaris 11.5

Message ID ydd8sz3zqfb.fsf@CeBiTec.Uni-Bielefeld.DE
State New
Headers show
Series [libphobos] Work around lack of dlpi_tls_modid before Solaris 11.5 | expand

Commit Message

Rainer Orth Jan. 29, 2019, 10:52 a.m. UTC
Before Solaris 11.5, struct dl_phdr_info lacked the dlpi_tls_modid
member.  While the support might be backported to Solaris 11.4, it
certainly won't to previous Solaris releases.  To work around this, I've
used the following patch.  Again, it's pretty straightforward.

Point of note:

* On Solaris, FreeBSD and NetBSD, dl_phdr_info is always visible in
  <link.h>, while on Linux one needs to define _GNU_SOURCE.
  AC_USE_SYSTEM_EXTENSION takes care of this, but needs to be called
  early (i.e. not in DRUNTIME_OS_DLPI_TLS_MODID) to avoid an autoconf
  warning:

configure.ac:129: warning: AC_COMPILE_IFELSE was called before AC_USE_SYSTEM_EXTENSIONS
m4/druntime/os.m4:190: DRUNTIME_OS_DLPI_TLS_MODID is expanded from...
configure.ac:129: the top level

Tested on i386-pc-solaris2.11 (Solaris 11.4) and x86_64-pc-linux-gnu.

Not unexpectedly, there are a couple of regressions compared to the
Solaris 11.5/x86 results:

+FAIL: gdc.test/runnable/testaa.d   execution test
+FAIL: gdc.test/runnable/testaa.d -fPIC   execution test
+FAIL: gdc.test/runnable/testaa.d -fPIC -shared-libphobos   execution test
+FAIL: gdc.test/runnable/testaa.d -shared-libphobos   execution test

  32 and 64-bit

+FAIL: gdc.test/runnable/xtest55.d   execution test
+FAIL: gdc.test/runnable/xtest55.d -shared-libphobos   execution test

  64-bit only

+FAIL: libphobos.shared/link_linkdep.d -I/vol/gcc/src/hg/trunk/local/libphobos/t
estsuite/libphobos.shared liblinkdep.so lib.so -shared-libphobos execution test

+FAIL: libphobos.shared/load_linkdep.d -shared-libphobos -ldl execution test
+FAIL: libphobos.shared/load_loaddep.d -shared-libphobos -ldl execution test

+FAIL: libphobos.shared/linkDR.c -shared-libphobos -ldl -pthread execution test
+FAIL: libphobos.shared/host.c -ldl -pthread execution test
+FAIL: libphobos.shared/loadDR.c -ldl -pthread -g execution test

  32 and 64-bit

+FAIL: libphobos.shared/link.d -I/vol/gcc/src/hg/trunk/local/libphobos/testsuite/libphobos.shared lib.so -shared-libphobos execution test

  64-bit only

I haven't looked much closer yet, just posting this to see if anything
along those lines could be acceptable at all.

	Rainer

Comments

Johannes Pfau Jan. 31, 2019, 6:40 p.m. UTC | #1
Hi Rainer,

I'd recommend not using such a workaround:

This means getTLSRange will always return an empty range, but the GC 
uses this to scan TLS memory. This means a GC collection can delete 
objects which are still pointed to from TLS. This leads to hard to debug 
errors, and if I remember correctly, the testsuite will not catch these 
errors. I think we have code in phobos though which references objects 
only from TLS and this will break after a GC collection.

I just noticed we now have similar code in upstream druntime for uclibc 
though so I also posted a comment there:
https://github.com/dlang/druntime/pull/2157

I'm not sure what's a good solution here. EmuTLS has got the same 
problem, but I'll post a RFC patch next weekend which would allow to 
scan the emuTLS memory. If we somehow make that work, I'd recommend to 
use emuTLS instead of native TLS if there's no way to scan the native TLS*.

FYI Martin Nowak(in CC) wrote most of the original code for rt.sections 
so he's the expert we'd have to ask.

* Maybe we could implement a more runtime-independent approach to scan 
native TLS?
1) We somehow need to bracket the TLS section (it would have to be
    per-shared-library though, we basically need thread-local, hidden
    __start_tls and __stop_tls symbols).
2) We need to emit a hidden _dso_scan_tls function into each D library.
    A pointer to  this DSO specific function then has to be passed in
    CompilerDSOData to _d_dso_registry.
3) tlsRange has to forward to the correct, DSO specific _dso_scan_tls.

2 and 3 are easy but I'm not sure if we can do 1.

Best regards,
Johannes

Am 29.01.19 um 11:52 schrieb Rainer Orth:
> Before Solaris 11.5, struct dl_phdr_info lacked the dlpi_tls_modid
> member.  While the support might be backported to Solaris 11.4, it
> certainly won't to previous Solaris releases.  To work around this, I've
> used the following patch.  Again, it's pretty straightforward.
> 
> Point of note:
> 
> * On Solaris, FreeBSD and NetBSD, dl_phdr_info is always visible in
>    <link.h>, while on Linux one needs to define _GNU_SOURCE.
>    AC_USE_SYSTEM_EXTENSION takes care of this, but needs to be called
>    early (i.e. not in DRUNTIME_OS_DLPI_TLS_MODID) to avoid an autoconf
>    warning:
> 
> configure.ac:129: warning: AC_COMPILE_IFELSE was called before AC_USE_SYSTEM_EXTENSIONS
> m4/druntime/os.m4:190: DRUNTIME_OS_DLPI_TLS_MODID is expanded from...
> configure.ac:129: the top level
> 
> Tested on i386-pc-solaris2.11 (Solaris 11.4) and x86_64-pc-linux-gnu.
> 
> Not unexpectedly, there are a couple of regressions compared to the
> Solaris 11.5/x86 results:
> 
> +FAIL: gdc.test/runnable/testaa.d   execution test
> +FAIL: gdc.test/runnable/testaa.d -fPIC   execution test
> +FAIL: gdc.test/runnable/testaa.d -fPIC -shared-libphobos   execution test
> +FAIL: gdc.test/runnable/testaa.d -shared-libphobos   execution test
> 
>    32 and 64-bit
> 
> +FAIL: gdc.test/runnable/xtest55.d   execution test
> +FAIL: gdc.test/runnable/xtest55.d -shared-libphobos   execution test
> 
>    64-bit only
> 
> +FAIL: libphobos.shared/link_linkdep.d -I/vol/gcc/src/hg/trunk/local/libphobos/t
> estsuite/libphobos.shared liblinkdep.so lib.so -shared-libphobos execution test
> 
> +FAIL: libphobos.shared/load_linkdep.d -shared-libphobos -ldl execution test
> +FAIL: libphobos.shared/load_loaddep.d -shared-libphobos -ldl execution test
> 
> +FAIL: libphobos.shared/linkDR.c -shared-libphobos -ldl -pthread execution test
> +FAIL: libphobos.shared/host.c -ldl -pthread execution test
> +FAIL: libphobos.shared/loadDR.c -ldl -pthread -g execution test
> 
>    32 and 64-bit
> 
> +FAIL: libphobos.shared/link.d -I/vol/gcc/src/hg/trunk/local/libphobos/testsuite/libphobos.shared lib.so -shared-libphobos execution test
> 
>    64-bit only
> 
> I haven't looked much closer yet, just posting this to see if anything
> along those lines could be acceptable at all.
> 
> 	Rainer
>
Rainer Orth Feb. 1, 2019, 2:38 p.m. UTC | #2
Hi Johannes,

> I'd recommend not using such a workaround:
>
> This means getTLSRange will always return an empty range, but the GC uses
> this to scan TLS memory. This means a GC collection can delete objects
> which are still pointed to from TLS. This leads to hard to debug errors,
> and if I remember correctly, the testsuite will not catch these errors. I
> think we have code in phobos though which references objects only from TLS
> and this will break after a GC collection.

I fully admit to have been wary about such an approach myself, but was
astonished how far it seemed to get me.

I suspect the two testsuite regressions (compared to a build with
dlpi_tls_modid present) I mentioned are exactly of the kind you mention:

e.g. the gdc.test/runnable/testaa.d failures are like this

core.exception.RangeError@gdc.test/runnable/testaa.d(410): Range violation
----------------
/vol/gcc/src/hg/trunk/local/libphobos/libdruntime/core/exception.d:496 onRangeError [0x80f0d2c]
/vol/gcc/src/hg/trunk/local/libphobos/libdruntime/core/exception.d:672 _d_arraybounds [0x80f132f]
??:? void testaa.test15() [0x80d7ae4]
??:? _Dmain [0x80dd3fc]
before test 1

and gdc.test/runnable/xtest55.d fails like so:

core.exception.AssertError@gdc.test/runnable/xtest55.d(19): Assertion failure
----------------
/vol/gcc/src/hg/trunk/local/libphobos/libdruntime/core/exception.d:441 onAssertError [0x7fff55dd3b56]
??:? _Dmain [0x418959]
7FFFBEB00000    7FFFBEB00000

It's a small set admittedly (but there are the libphobos failures as
well), but a compiler that leaves its users with a feeling of
unreliablity is probably worse than none at all.

Just for the record, I saw the same regressions on Linux/x86_64 when I
accidentally didn't define _GNU_SOURCE in the configure test for
dlpi_tls_modid, producing an equivalent configuration.  So this isn't
Solaris-specific in any way.

> I'm not sure what's a good solution here. EmuTLS has got the same problem,
> but I'll post a RFC patch next weekend which would allow to scan the emuTLS
> memory. If we somehow make that work, I'd recommend to use emuTLS instead
> of native TLS if there's no way to scan the native TLS*.

The problem here is that we'd probably need to build gcc twice in this
case: once with native TLS for all non-D languages, and a second time
with --disable-tls for D.  AFAICS TARGET_HAVE_TLS needs to be a
compile-time constant and cannot depend on the language being compiled
for.

> FYI Martin Nowak(in CC) wrote most of the original code for rt.sections so
> he's the expert we'd have to ask.
>
> * Maybe we could implement a more runtime-independent approach to scan
> native TLS?
> 1) We somehow need to bracket the TLS section (it would have to be
>    per-shared-library though, we basically need thread-local, hidden
>    __start_tls and __stop_tls symbols).
> 2) We need to emit a hidden _dso_scan_tls function into each D library.
>    A pointer to  this DSO specific function then has to be passed in
>    CompilerDSOData to _d_dso_registry.
> 3) tlsRange has to forward to the correct, DSO specific _dso_scan_tls.
>
> 2 and 3 are easy but I'm not sure if we can do 1.

Right: I suspect 1 would we way more difficult than the
__start_minfo/__stop_minfo stuff.

I failed to mention another approach in my patch submission, though I
alluded to it in PR d/88150: the ldc fork of libdruntime

	https://github.com/ldc-developers/druntime

has in src/rt/sections_ldc.d an implementation of getTLSRange for
Illumos/Solaris without dlpi_tls_modid.  I managed to adapt it to
sections_elf_shared.d, but apart from the fact that it uses undocumented
libc internals (which probably don't change between Solaris 10 and 11.4,
so that shouldn't be too bad) that implementation only gets you the TLS
range for the main executable, so isn't very useful AFAICS.

	Rainer
Johannes Pfau Feb. 2, 2019, 8:35 p.m. UTC | #3
Hi Rainer,

> 
> I suspect the two testsuite regressions (compared to a build with
> dlpi_tls_modid present) I mentioned are exactly of the kind you mention:
> 
> e.g. the gdc.test/runnable/testaa.d failures are like this
> 
> core.exception.RangeError@gdc.test/runnable/testaa.d(410): Range violation
> ----------------
> /vol/gcc/src/hg/trunk/local/libphobos/libdruntime/core/exception.d:496 onRangeError [0x80f0d2c]
> /vol/gcc/src/hg/trunk/local/libphobos/libdruntime/core/exception.d:672 _d_arraybounds [0x80f132f]
> ??:? void testaa.test15() [0x80d7ae4]
> ??:? _Dmain [0x80dd3fc]
> before test 1
> 
> and gdc.test/runnable/xtest55.d fails like so:
> 
> core.exception.AssertError@gdc.test/runnable/xtest55.d(19): Assertion failure
> ----------------
> /vol/gcc/src/hg/trunk/local/libphobos/libdruntime/core/exception.d:441 onAssertError [0x7fff55dd3b56]
> ??:? _Dmain [0x418959]
> 7FFFBEB00000    7FFFBEB00000

If you want to verify whether it's really a GC problem, you can add this 
in the main function to disable GC collections:

import core.memory;
GC.disable();

This should be fine for the test suite. If you want to do this for the 
unit tests it's slightly more complicated as the main functions is 
executed _after_ all unit tests. IIRC adding it in a shared static 
this() module constructor would work there.

Best regards,
Johannes
Rainer Orth Feb. 4, 2019, 1:09 p.m. UTC | #4
Hi Johannes,

>> I suspect the two testsuite regressions (compared to a build with
>> dlpi_tls_modid present) I mentioned are exactly of the kind you mention:
>>
>> e.g. the gdc.test/runnable/testaa.d failures are like this
>>
>> core.exception.RangeError@gdc.test/runnable/testaa.d(410): Range violation
>> ----------------
>> /vol/gcc/src/hg/trunk/local/libphobos/libdruntime/core/exception.d:496
>> onRangeError [0x80f0d2c]
>> /vol/gcc/src/hg/trunk/local/libphobos/libdruntime/core/exception.d:672
>> _d_arraybounds [0x80f132f]
>> ??:? void testaa.test15() [0x80d7ae4]
>> ??:? _Dmain [0x80dd3fc]
>> before test 1
>>
>> and gdc.test/runnable/xtest55.d fails like so:
>>
>> core.exception.AssertError@gdc.test/runnable/xtest55.d(19): Assertion failure
>> ----------------
>> /vol/gcc/src/hg/trunk/local/libphobos/libdruntime/core/exception.d:441
>> onAssertError [0x7fff55dd3b56]
>> ??:? _Dmain [0x418959]
>> 7FFFBEB00000    7FFFBEB00000
>
> If you want to verify whether it's really a GC problem, you can add this in
> the main function to disable GC collections:
>
> import core.memory;
> GC.disable();

thanks for the hint.  With that change, the xtest55.d failure vanishes,
while the testaa.d one remains unchanged.

> This should be fine for the test suite. If you want to do this for the unit
> tests it's slightly more complicated as the main functions is executed
> _after_ all unit tests. IIRC adding it in a shared static this() module
> constructor would work there.

The unittest failures also occur when dlpi_tls_modid is present.

The libphobos testsuite regressions are like this:

+FAIL: libphobos.shared/link.d -I/vol/gcc/src/hg/trunk/local/libphobos/testsuite/libphobos.shared lib.so -shared-libphobos execution test

  64-bit only

Thread 2 received signal SIGSEGV, Segmentation fault.
[Switching to Thread 1 (LWP 1)]
0x00000000005064f0 in ?? ()
(gdb) where
#0  0x00000000005064f0 in ?? ()
#1  0x00007fff38903c84 in lib.tls_access() ()
    at /vol/gcc/src/hg/trunk/solaris/libphobos/testsuite/libphobos.shared/lib.d:26
#2  0x000000000040298d in link.testGC() ()
    at /vol/gcc/src/hg/trunk/solaris/libphobos/testsuite/libphobos.shared/link.d:24
#3  0x0000000000402e36 in D main ()
    at /vol/gcc/src/hg/trunk/solaris/libphobos/testsuite/libphobos.shared/link.d:62

The rest occurs for both 32 and 64-bit:

+FAIL: libphobos.shared/link_linkdep.d -I/vol/gcc/src/hg/trunk/local/libphobos/testsuite/libphobos.shared liblinkdep.so lib.so -shared-libphobos execution test

Thread 2 received signal SIGSEGV, Segmentation fault.
[Switching to Thread 1 (LWP 1)]
0x080620ad in ?? ()
(gdb) where
#0  0x080620ad in ?? ()
#1  0xfe652f2e in lib.stack!(lib.tls_access()).stack() ()
    at /vol/gcc/src/hg/trunk/solaris/libphobos/testsuite/libphobos.shared/lib.d:33
#2  0xfe6528fd in lib.testGC() ()
    at /vol/gcc/src/hg/trunk/solaris/libphobos/testsuite/libphobos.shared/lib.d:45
#3  0xfe652af1 in lib.runTestsImpl() ()
    at /vol/gcc/src/hg/trunk/solaris/libphobos/testsuite/libphobos.shared/lib.d:80
#4  0xfe652a1e in runTests ()
    at /vol/gcc/src/hg/trunk/solaris/libphobos/testsuite/libphobos.shared/lib.d:62
#5  0xfe670be4 in runDepTests ()
    at /vol/gcc/src/hg/trunk/solaris/libphobos/testsuite/libphobos.shared/liblinkdep.d:5
#6  0x08050efa in D main ()
    at /vol/gcc/src/hg/trunk/solaris/libphobos/testsuite/libphobos.shared/link_linkdep.d:5

Except for the last one, the rest is similar:

+FAIL: libphobos.shared/load_linkdep.d -shared-libphobos -ldl execution test
+FAIL: libphobos.shared/load_loaddep.d -shared-libphobos -ldl execution test
+FAIL: libphobos.shared/linkDR.c -shared-libphobos -ldl -pthread execution test
+FAIL: libphobos.shared/host.c -ldl -pthread execution test
+FAIL: libphobos.shared/loadDR.c -ldl -pthread -g execution test

This one is a bit of the odd man out:

Thread 2 received signal SIGILL, Illegal instruction.
[Switching to Thread 1 (LWP 1)]
0x08061f85 in ?? ()
(gdb) where
#0  0x08061f85 in ?? ()
#1  0xfe4b2f2e in lib.stack!(lib.tls_access()).stack() ()
    at /vol/gcc/src/hg/trunk/solaris/libphobos/testsuite/libphobos.shared/lib.d:33
#2  0xe8fe4b28 in ?? ()
[...]
#63 0x00fe6ec9 in ?? ()
#64 0x00000000 in ?? ()

	Rainer
Rainer Orth April 9, 2019, 7:27 p.m. UTC | #5
Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE> writes:

> Before Solaris 11.5, struct dl_phdr_info lacked the dlpi_tls_modid
> member.  While the support might be backported to Solaris 11.4, it
> certainly won't to previous Solaris releases.  To work around this, I've
> used the following patch.  Again, it's pretty straightforward.
>
> Point of note:
>
> * On Solaris, FreeBSD and NetBSD, dl_phdr_info is always visible in
>   <link.h>, while on Linux one needs to define _GNU_SOURCE.
>   AC_USE_SYSTEM_EXTENSION takes care of this, but needs to be called
>   early (i.e. not in DRUNTIME_OS_DLPI_TLS_MODID) to avoid an autoconf
>   warning:
>
> configure.ac:129: warning: AC_COMPILE_IFELSE was called before AC_USE_SYSTEM_EXTENSIONS
> m4/druntime/os.m4:190: DRUNTIME_OS_DLPI_TLS_MODID is expanded from...
> configure.ac:129: the top level
>
> Tested on i386-pc-solaris2.11 (Solaris 11.4) and x86_64-pc-linux-gnu.
>
> Not unexpectedly, there are a couple of regressions compared to the
> Solaris 11.5/x86 results:
>
> +FAIL: gdc.test/runnable/testaa.d   execution test
> +FAIL: gdc.test/runnable/testaa.d -fPIC   execution test
> +FAIL: gdc.test/runnable/testaa.d -fPIC -shared-libphobos   execution test
> +FAIL: gdc.test/runnable/testaa.d -shared-libphobos   execution test
>
>   32 and 64-bit
>
> +FAIL: gdc.test/runnable/xtest55.d   execution test
> +FAIL: gdc.test/runnable/xtest55.d -shared-libphobos   execution test
>
>   64-bit only
>
> +FAIL: libphobos.shared/link_linkdep.d -I/vol/gcc/src/hg/trunk/local/libphobos/t
> estsuite/libphobos.shared liblinkdep.so lib.so -shared-libphobos execution test
>
> +FAIL: libphobos.shared/load_linkdep.d -shared-libphobos -ldl execution test
> +FAIL: libphobos.shared/load_loaddep.d -shared-libphobos -ldl execution test
>
> +FAIL: libphobos.shared/linkDR.c -shared-libphobos -ldl -pthread execution test
> +FAIL: libphobos.shared/host.c -ldl -pthread execution test
> +FAIL: libphobos.shared/loadDR.c -ldl -pthread -g execution test
>
>   32 and 64-bit
>
> +FAIL: libphobos.shared/link.d -I/vol/gcc/src/hg/trunk/local/libphobos/testsuite/libphobos.shared lib.so -shared-libphobos execution test
>
>   64-bit only
>
> I haven't looked much closer yet, just posting this to see if anything
> along those lines could be acceptable at all.

Fortunately, Iain just discovered a way better way to work around the
lack of dlpi_tls_modid: dlinfo(RTLD_DI_LINKMAP) returns not only a
pointer to a Link_map * as documented in the man page, but rather a
pointer to a struct Rt_map which also includes a rt_tlsmodid member.
It has been this way since at least Solaris 9 and the Solaris/Illumos
sources ($SRC/cmd/sgs/include/rtld.h) explicitly have a comment around
the start of the structure stateing

	Exposed to rtld_db - don't move, don't delete

which pretty much guarantees that this can be relied on.

The only other quirk of note is that before dlpi_tls_modid got added,
tlsmodid started at 0 for the main executable, not 1 as the glibc spec
required and libdruntime assumes.  I account for this by adjusting
_tlsMod on dlinfo and undoing the adjustment before passing it to
__tls_get_adddr, the only consumer.

With the revised patch below, I get clean gdc testresults on Solaris
11.3 and 11.4 (and still on 11.5 which uses the dlpi_tls_modid path),
and almost identical libphobos testresults on all three versions.

I hope this is good to go in (the sections_elf_shared.d part via
upstream, the rest directly) now?

Thanks a lot to Iain for discovering this.

	Rainer
Iain Buclaw April 13, 2019, 3:37 p.m. UTC | #6
On Tue, 9 Apr 2019 at 21:27, Rainer Orth <ro@cebitec.uni-bielefeld.de> wrote:
>
> Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE> writes:
>
> > Before Solaris 11.5, struct dl_phdr_info lacked the dlpi_tls_modid
> > member.  While the support might be backported to Solaris 11.4, it
> > certainly won't to previous Solaris releases.  To work around this, I've
> > used the following patch.  Again, it's pretty straightforward.
> >
> > Point of note:
> >
> > * On Solaris, FreeBSD and NetBSD, dl_phdr_info is always visible in
> >   <link.h>, while on Linux one needs to define _GNU_SOURCE.
> >   AC_USE_SYSTEM_EXTENSION takes care of this, but needs to be called
> >   early (i.e. not in DRUNTIME_OS_DLPI_TLS_MODID) to avoid an autoconf
> >   warning:
> >
> > configure.ac:129: warning: AC_COMPILE_IFELSE was called before AC_USE_SYSTEM_EXTENSIONS
> > m4/druntime/os.m4:190: DRUNTIME_OS_DLPI_TLS_MODID is expanded from...
> > configure.ac:129: the top level
> >
> > Tested on i386-pc-solaris2.11 (Solaris 11.4) and x86_64-pc-linux-gnu.
> >
> > Not unexpectedly, there are a couple of regressions compared to the
> > Solaris 11.5/x86 results:
> >
> > +FAIL: gdc.test/runnable/testaa.d   execution test
> > +FAIL: gdc.test/runnable/testaa.d -fPIC   execution test
> > +FAIL: gdc.test/runnable/testaa.d -fPIC -shared-libphobos   execution test
> > +FAIL: gdc.test/runnable/testaa.d -shared-libphobos   execution test
> >
> >   32 and 64-bit
> >
> > +FAIL: gdc.test/runnable/xtest55.d   execution test
> > +FAIL: gdc.test/runnable/xtest55.d -shared-libphobos   execution test
> >
> >   64-bit only
> >
> > +FAIL: libphobos.shared/link_linkdep.d -I/vol/gcc/src/hg/trunk/local/libphobos/t
> > estsuite/libphobos.shared liblinkdep.so lib.so -shared-libphobos execution test
> >
> > +FAIL: libphobos.shared/load_linkdep.d -shared-libphobos -ldl execution test
> > +FAIL: libphobos.shared/load_loaddep.d -shared-libphobos -ldl execution test
> >
> > +FAIL: libphobos.shared/linkDR.c -shared-libphobos -ldl -pthread execution test
> > +FAIL: libphobos.shared/host.c -ldl -pthread execution test
> > +FAIL: libphobos.shared/loadDR.c -ldl -pthread -g execution test
> >
> >   32 and 64-bit
> >
> > +FAIL: libphobos.shared/link.d -I/vol/gcc/src/hg/trunk/local/libphobos/testsuite/libphobos.shared lib.so -shared-libphobos execution test
> >
> >   64-bit only
> >
> > I haven't looked much closer yet, just posting this to see if anything
> > along those lines could be acceptable at all.
>
> Fortunately, Iain just discovered a way better way to work around the
> lack of dlpi_tls_modid: dlinfo(RTLD_DI_LINKMAP) returns not only a
> pointer to a Link_map * as documented in the man page, but rather a
> pointer to a struct Rt_map which also includes a rt_tlsmodid member.
> It has been this way since at least Solaris 9 and the Solaris/Illumos
> sources ($SRC/cmd/sgs/include/rtld.h) explicitly have a comment around
> the start of the structure stateing
>
>         Exposed to rtld_db - don't move, don't delete
>
> which pretty much guarantees that this can be relied on.
>
> The only other quirk of note is that before dlpi_tls_modid got added,
> tlsmodid started at 0 for the main executable, not 1 as the glibc spec
> required and libdruntime assumes.  I account for this by adjusting
> _tlsMod on dlinfo and undoing the adjustment before passing it to
> __tls_get_adddr, the only consumer.
>
> With the revised patch below, I get clean gdc testresults on Solaris
> 11.3 and 11.4 (and still on 11.5 which uses the dlpi_tls_modid path),
> and almost identical libphobos testresults on all three versions.
>
> I hope this is good to go in (the sections_elf_shared.d part via
> upstream, the rest directly) now?
>
> Thanks a lot to Iain for discovering this.
>

I've finally managed to get around to moving the guts of rt.sections_*
to gcc.sections.*, with the patch amended to apply to
gcc/sections/elf_shared.d, this is OK.
Rainer Orth April 14, 2019, 10:20 a.m. UTC | #7
Hi Iain,

> On Tue, 9 Apr 2019 at 21:27, Rainer Orth <ro@cebitec.uni-bielefeld.de> wrote:
>>
>> Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE> writes:
>>
>> > Before Solaris 11.5, struct dl_phdr_info lacked the dlpi_tls_modid
>> > member.  While the support might be backported to Solaris 11.4, it
>> > certainly won't to previous Solaris releases.  To work around this, I've
>> > used the following patch.  Again, it's pretty straightforward.
>> >
>> > Point of note:
>> >
>> > * On Solaris, FreeBSD and NetBSD, dl_phdr_info is always visible in
>> >   <link.h>, while on Linux one needs to define _GNU_SOURCE.
>> >   AC_USE_SYSTEM_EXTENSION takes care of this, but needs to be called
>> >   early (i.e. not in DRUNTIME_OS_DLPI_TLS_MODID) to avoid an autoconf
>> >   warning:
>> >
>> > configure.ac:129: warning: AC_COMPILE_IFELSE was called before
>> > AC_USE_SYSTEM_EXTENSIONS
>> > m4/druntime/os.m4:190: DRUNTIME_OS_DLPI_TLS_MODID is expanded from...
>> > configure.ac:129: the top level
>> >
>> > Tested on i386-pc-solaris2.11 (Solaris 11.4) and x86_64-pc-linux-gnu.
>> >
>> > Not unexpectedly, there are a couple of regressions compared to the
>> > Solaris 11.5/x86 results:
>> >
>> > +FAIL: gdc.test/runnable/testaa.d   execution test
>> > +FAIL: gdc.test/runnable/testaa.d -fPIC   execution test
>> > +FAIL: gdc.test/runnable/testaa.d -fPIC -shared-libphobos   execution test
>> > +FAIL: gdc.test/runnable/testaa.d -shared-libphobos   execution test
>> >
>> >   32 and 64-bit
>> >
>> > +FAIL: gdc.test/runnable/xtest55.d   execution test
>> > +FAIL: gdc.test/runnable/xtest55.d -shared-libphobos   execution test
>> >
>> >   64-bit only
>> >
>> > +FAIL: libphobos.shared/link_linkdep.d
>> > -I/vol/gcc/src/hg/trunk/local/libphobos/t
>> > estsuite/libphobos.shared liblinkdep.so lib.so -shared-libphobos
>> > execution test
>> >
>> > +FAIL: libphobos.shared/load_linkdep.d -shared-libphobos -ldl execution test
>> > +FAIL: libphobos.shared/load_loaddep.d -shared-libphobos -ldl execution test
>> >
>> > +FAIL: libphobos.shared/linkDR.c -shared-libphobos -ldl -pthread
>> > execution test
>> > +FAIL: libphobos.shared/host.c -ldl -pthread execution test
>> > +FAIL: libphobos.shared/loadDR.c -ldl -pthread -g execution test
>> >
>> >   32 and 64-bit
>> >
>> > +FAIL: libphobos.shared/link.d
>> > -I/vol/gcc/src/hg/trunk/local/libphobos/testsuite/libphobos.shared
>> > lib.so -shared-libphobos execution test
>> >
>> >   64-bit only
>> >
>> > I haven't looked much closer yet, just posting this to see if anything
>> > along those lines could be acceptable at all.
>>
>> Fortunately, Iain just discovered a way better way to work around the
>> lack of dlpi_tls_modid: dlinfo(RTLD_DI_LINKMAP) returns not only a
>> pointer to a Link_map * as documented in the man page, but rather a
>> pointer to a struct Rt_map which also includes a rt_tlsmodid member.
>> It has been this way since at least Solaris 9 and the Solaris/Illumos
>> sources ($SRC/cmd/sgs/include/rtld.h) explicitly have a comment around
>> the start of the structure stateing
>>
>>         Exposed to rtld_db - don't move, don't delete
>>
>> which pretty much guarantees that this can be relied on.
>>
>> The only other quirk of note is that before dlpi_tls_modid got added,
>> tlsmodid started at 0 for the main executable, not 1 as the glibc spec
>> required and libdruntime assumes.  I account for this by adjusting
>> _tlsMod on dlinfo and undoing the adjustment before passing it to
>> __tls_get_adddr, the only consumer.
>>
>> With the revised patch below, I get clean gdc testresults on Solaris
>> 11.3 and 11.4 (and still on 11.5 which uses the dlpi_tls_modid path),
>> and almost identical libphobos testresults on all three versions.
>>
>> I hope this is good to go in (the sections_elf_shared.d part via
>> upstream, the rest directly) now?
>>
>> Thanks a lot to Iain for discovering this.
>>
>
> I've finally managed to get around to moving the guts of rt.sections_*
> to gcc.sections.*, with the patch amended to apply to
> gcc/sections/elf_shared.d, this is OK.

Here's the rebased patch for this.  Tested as the previous one and
installed on mainline.

	Rainer
diff mbox series

Patch

# HG changeset patch
# Parent  517609f2fc6f40a6cbf15addd7f1006864256064
Work around lack of dlpi_tls_modid before Solaris 11.5

diff --git a/libphobos/configure.ac b/libphobos/configure.ac
--- a/libphobos/configure.ac
+++ b/libphobos/configure.ac
@@ -32,6 +32,7 @@  AC_CONFIG_HEADERS(config.h)
 
 AM_ENABLE_MULTILIB(, ..)
 AC_CANONICAL_SYSTEM
+AC_USE_SYSTEM_EXTENSIONS
 
 target_alias=${target_alias-$target}
 AC_SUBST(target_alias)
@@ -126,6 +127,7 @@  DRUNTIME_OS_SOURCES
 DRUNTIME_OS_THREAD_MODEL
 DRUNTIME_OS_ARM_EABI_UNWINDER
 DRUNTIME_OS_MINFO_BRACKETING
+DRUNTIME_OS_DLPI_TLS_MODID
 DRUNTIME_OS_LINK_SPEC
 
 WITH_LOCAL_DRUNTIME([
diff --git a/libphobos/libdruntime/gcc/config.d.in b/libphobos/libdruntime/gcc/config.d.in
--- a/libphobos/libdruntime/gcc/config.d.in
+++ b/libphobos/libdruntime/gcc/config.d.in
@@ -38,6 +38,9 @@  enum ThreadModel GNU_Thread_Model = Thre
 // Whether the linker provides __start_minfo and __stop_minfo symbols
 enum Minfo_Bracketing = @DCFG_MINFO_BRACKETING@;
 
+// Whether struct dl_phdr_info has dlpi_tls_modid member.
+enum OS_Have_Dlpi_Tls_Modid = @DCFG_DLPI_TLS_MODID@;
+
 // Whether target has support for builtin atomics.
 enum GNU_Have_Atomics = @DCFG_HAVE_ATOMIC_BUILTINS@;
 
diff --git a/libphobos/libdruntime/rt/sections_elf_shared.d b/libphobos/libdruntime/rt/sections_elf_shared.d
--- a/libphobos/libdruntime/rt/sections_elf_shared.d
+++ b/libphobos/libdruntime/rt/sections_elf_shared.d
@@ -751,8 +751,16 @@  void scanSegments(in ref dl_phdr_info in
 
         case PT_TLS: // TLS segment
             assert(!pdso._tlsSize); // is unique per DSO
-            pdso._tlsMod = info.dlpi_tls_modid;
-            pdso._tlsSize = phdr.p_memsz;
+            static if (OS_Have_Dlpi_Tls_Modid)
+            {
+                pdso._tlsMod = info.dlpi_tls_modid;
+                pdso._tlsSize = phdr.p_memsz;
+            }
+            else
+            {
+                pdso._tlsMod = 0;
+                pdso._tlsSize = 0;
+            }
             break;
 
         default:
diff --git a/libphobos/m4/druntime/os.m4 b/libphobos/m4/druntime/os.m4
--- a/libphobos/m4/druntime/os.m4
+++ b/libphobos/m4/druntime/os.m4
@@ -183,6 +183,20 @@  AC_DEFUN([DRUNTIME_OS_MINFO_BRACKETING],
   AC_LANG_POP([C])
 ])
 
+# DRUNTIME_OS_DLPI_TLS_MODID
+# ----------------------------
+# Check if struct dl_phdr_info includes the dlpi_tls_modid member and  
+# substitute DCFG_DLPI_TLS_MODID.
+AC_DEFUN([DRUNTIME_OS_DLPI_TLS_MODID],
+[
+  AC_LANG_PUSH([C])
+  AC_CHECK_MEMBER([struct dl_phdr_info.dlpi_tls_modid],
+		  [DCFG_DLPI_TLS_MODID=true], [DCFG_DLPI_TLS_MODID=false],
+		  [[#include <link.h>]])
+  AC_SUBST(DCFG_DLPI_TLS_MODID)
+  AC_LANG_POP([C])
+])
+
 # DRUNTIME_OS_LINK_SPEC
 # ---------------------
 # Add target-specific link options to link_spec.