Message ID | 20220601174617.72002-1-murphyp@linux.ibm.com |
---|---|
State | New |
Headers | show |
Series | nptl_db: do not link libthread_db.so with dt-relr | expand |
On 01/06/2022 14:46, Paul E. Murphy via Libc-alpha wrote: > Should libthread_db.so be dlopen-able by gdb running with a slightly > older glibc without dt-relr support? If not, I can rework this to mark > these tests unsupported in such cases. > > ---8<--- > > This may be loaded by a host gdb linked against a libc which does not > support dt-relr. tst-gdb-pthread-attach{,-static} fails on powerpc64 > whose linker supports dt-relr, but the host libc does not. > > Trying host libthread_db library: .../build/glibc/nptl_db/libthread_db.so.1. > dlopen failed: /lib64/libc.so.6: version `GLIBC_ABI_DT_RELR' not found (required by .../build/glibc/nptl_db/libthread_db.so.1). I think it does make sense to disable DT_RELR for libthread_db.so mainly as an improvement for the testsuite regression tests, although I am not sure if the scenario of trying to use a gdb without DT_RELR with a binary that requires DT_RELR would be that usual (it would either require the binary or the gdb to have non standard loader paths). > --- > nptl_db/Makefile | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/nptl_db/Makefile b/nptl_db/Makefile > index ed923a41e5..29830cae4a 100644 > --- a/nptl_db/Makefile > +++ b/nptl_db/Makefile > @@ -49,6 +49,9 @@ libthread_db-inhibit-o = $(filter-out .os,$(object-suffixes)) > # The ps_* callback functions are not defined. > libthread_db.so-no-z-defs = yes > > +# The host GDB may be linked against an older libc which may not support dt-relr. > +libthread_db.so-no-dt-relr = yes > + > tests-special += $(objpfx)db-symbols.out > > include ../Rules
On 6/1/22 1:56 PM, Adhemerval Zanella wrote: > > > On 01/06/2022 14:46, Paul E. Murphy via Libc-alpha wrote: >> Should libthread_db.so be dlopen-able by gdb running with a slightly >> older glibc without dt-relr support? If not, I can rework this to mark >> these tests unsupported in such cases. >> >> ---8<--- >> >> This may be loaded by a host gdb linked against a libc which does not >> support dt-relr. tst-gdb-pthread-attach{,-static} fails on powerpc64 >> whose linker supports dt-relr, but the host libc does not. >> >> Trying host libthread_db library: .../build/glibc/nptl_db/libthread_db.so.1. >> dlopen failed: /lib64/libc.so.6: version `GLIBC_ABI_DT_RELR' not found (required by .../build/glibc/nptl_db/libthread_db.so.1). > > I think it does make sense to disable DT_RELR for libthread_db.so mainly > as an improvement for the testsuite regression tests, although I am not > sure if the scenario of trying to use a gdb without DT_RELR with a binary > that requires DT_RELR would be that usual (it would either require the > binary or the gdb to have non standard loader paths). It is an unlikely combination, but not so improbable that we've run into with two of our CI configurations. Is this OK to submit? I don't think this failure is unique to ppc64. However, I am still curious about what, if any, compatibility guarantees nptl_db has.
On 6/2/22 10:12, Paul E Murphy via Libc-alpha wrote: > > > On 6/1/22 1:56 PM, Adhemerval Zanella wrote: >> >> >> On 01/06/2022 14:46, Paul E. Murphy via Libc-alpha wrote: >>> Should libthread_db.so be dlopen-able by gdb running with a >>> slightly older glibc without dt-relr support? If not, I can >>> rework this to mark these tests unsupported in such cases. >>> >>> ---8<--- >>> >>> This may be loaded by a host gdb linked against a libc which does >>> not support dt-relr. tst-gdb-pthread-attach{,-static} fails on >>> powerpc64 whose linker supports dt-relr, but the host libc does >>> not. >>> >>> Trying host libthread_db library: >>> .../build/glibc/nptl_db/libthread_db.so.1. dlopen failed: >>> /lib64/libc.so.6: version `GLIBC_ABI_DT_RELR' not found (required >>> by .../build/glibc/nptl_db/libthread_db.so.1). >> >> I think it does make sense to disable DT_RELR for libthread_db.so >> mainly as an improvement for the testsuite regression tests, >> although I am not sure if the scenario of trying to use a gdb >> without DT_RELR with a binary that requires DT_RELR would be that >> usual (it would either require the binary or the gdb to have non >> standard loader paths). > > It is an unlikely combination, but not so improbable that we've run > into with two of our CI configurations. > > Is this OK to submit? I don't think this failure is unique to ppc64. > However, I am still curious about what, if any, compatibility > guarantees nptl_db has. Your CI configurations are incorrectly setup. You must match nptl_db with the runtime. We only support a matched loader, runtime, and nptl_db. When we make transitional changes like these we find all the corner cases in our deployments where we have mismatched the host and test environments.
* Carlos O'Donell via Libc-alpha: > On 6/2/22 10:12, Paul E Murphy via Libc-alpha wrote: >> >> >> On 6/1/22 1:56 PM, Adhemerval Zanella wrote: >>> >>> >>> On 01/06/2022 14:46, Paul E. Murphy via Libc-alpha wrote: >>>> Should libthread_db.so be dlopen-able by gdb running with a >>>> slightly older glibc without dt-relr support? If not, I can >>>> rework this to mark these tests unsupported in such cases. >>>> >>>> ---8<--- >>>> >>>> This may be loaded by a host gdb linked against a libc which does >>>> not support dt-relr. tst-gdb-pthread-attach{,-static} fails on >>>> powerpc64 whose linker supports dt-relr, but the host libc does >>>> not. >>>> >>>> Trying host libthread_db library: >>>> .../build/glibc/nptl_db/libthread_db.so.1. dlopen failed: >>>> /lib64/libc.so.6: version `GLIBC_ABI_DT_RELR' not found (required >>>> by .../build/glibc/nptl_db/libthread_db.so.1). >>> >>> I think it does make sense to disable DT_RELR for libthread_db.so >>> mainly as an improvement for the testsuite regression tests, >>> although I am not sure if the scenario of trying to use a gdb >>> without DT_RELR with a binary that requires DT_RELR would be that >>> usual (it would either require the binary or the gdb to have non >>> standard loader paths). >> >> It is an unlikely combination, but not so improbable that we've run >> into with two of our CI configurations. >> >> Is this OK to submit? I don't think this failure is unique to ppc64. >> However, I am still curious about what, if any, compatibility >> guarantees nptl_db has. > > Your CI configurations are incorrectly setup. > > You must match nptl_db with the runtime. > > We only support a matched loader, runtime, and nptl_db. Uhm … I'm not sure if I agree. Looking at nptl/tst-pthread-gdb-attach.c, we do not run gdb under the glibc being tested. Instead, the test assumes that the libthread_db we just built can be loaded by the system glibc. But it may not support DT_RELR, so it could lead to test failure. I expect that the pretty-printer tests are similar. So it's not the CI setup is wrong, it's our testsuite. I think we should do something to avoid this spurious failure. Running GDB under an explicit loader invocation will not necessarily work if the built glibc is older than the system glibc. Given that, Paul's patch is not unreasonable. We really should avoid such spurious test failures. Alternatively, we could try to get DT_RELR into those distributions that can be used easily for glibc development. Thanks, Florian
On 02/06/2022 11:54, Florian Weimer wrote: > * Carlos O'Donell via Libc-alpha: > >> On 6/2/22 10:12, Paul E Murphy via Libc-alpha wrote: >>> >>> >>> On 6/1/22 1:56 PM, Adhemerval Zanella wrote: >>>> >>>> >>>> On 01/06/2022 14:46, Paul E. Murphy via Libc-alpha wrote: >>>>> Should libthread_db.so be dlopen-able by gdb running with a >>>>> slightly older glibc without dt-relr support? If not, I can >>>>> rework this to mark these tests unsupported in such cases. >>>>> >>>>> ---8<--- >>>>> >>>>> This may be loaded by a host gdb linked against a libc which does >>>>> not support dt-relr. tst-gdb-pthread-attach{,-static} fails on >>>>> powerpc64 whose linker supports dt-relr, but the host libc does >>>>> not. >>>>> >>>>> Trying host libthread_db library: >>>>> .../build/glibc/nptl_db/libthread_db.so.1. dlopen failed: >>>>> /lib64/libc.so.6: version `GLIBC_ABI_DT_RELR' not found (required >>>>> by .../build/glibc/nptl_db/libthread_db.so.1). >>>> >>>> I think it does make sense to disable DT_RELR for libthread_db.so >>>> mainly as an improvement for the testsuite regression tests, >>>> although I am not sure if the scenario of trying to use a gdb >>>> without DT_RELR with a binary that requires DT_RELR would be that >>>> usual (it would either require the binary or the gdb to have non >>>> standard loader paths). >>> >>> It is an unlikely combination, but not so improbable that we've run >>> into with two of our CI configurations. >>> >>> Is this OK to submit? I don't think this failure is unique to ppc64. >>> However, I am still curious about what, if any, compatibility >>> guarantees nptl_db has. >> >> Your CI configurations are incorrectly setup. >> >> You must match nptl_db with the runtime. >> >> We only support a matched loader, runtime, and nptl_db. > > Uhm … I'm not sure if I agree. > > Looking at nptl/tst-pthread-gdb-attach.c, we do not run gdb under the > glibc being tested. Instead, the test assumes that the libthread_db we > just built can be loaded by the system glibc. But it may not support > DT_RELR, so it could lead to test failure. I expect that the > pretty-printer tests are similar. > > So it's not the CI setup is wrong, it's our testsuite. I think we > should do something to avoid this spurious failure. Running GDB under > an explicit loader invocation will not necessarily work if the built > glibc is older than the system glibc. Given that, Paul's patch is not > unreasonable. We really should avoid such spurious test failures. > > Alternatively, we could try to get DT_RELR into those distributions that > can be used easily for glibc development. Or we can set the tests unsupported if gdb "throws version `GLIBC_ABI_DT_RELR' not found".
On 6/2/22 10:54, Florian Weimer wrote: > * Carlos O'Donell via Libc-alpha: > >> On 6/2/22 10:12, Paul E Murphy via Libc-alpha wrote: >>> >>> >>> On 6/1/22 1:56 PM, Adhemerval Zanella wrote: >>>> >>>> >>>> On 01/06/2022 14:46, Paul E. Murphy via Libc-alpha wrote: >>>>> Should libthread_db.so be dlopen-able by gdb running with a >>>>> slightly older glibc without dt-relr support? If not, I can >>>>> rework this to mark these tests unsupported in such cases. >>>>> >>>>> ---8<--- >>>>> >>>>> This may be loaded by a host gdb linked against a libc which does >>>>> not support dt-relr. tst-gdb-pthread-attach{,-static} fails on >>>>> powerpc64 whose linker supports dt-relr, but the host libc does >>>>> not. >>>>> >>>>> Trying host libthread_db library: >>>>> .../build/glibc/nptl_db/libthread_db.so.1. dlopen failed: >>>>> /lib64/libc.so.6: version `GLIBC_ABI_DT_RELR' not found (required >>>>> by .../build/glibc/nptl_db/libthread_db.so.1). >>>> >>>> I think it does make sense to disable DT_RELR for libthread_db.so >>>> mainly as an improvement for the testsuite regression tests, >>>> although I am not sure if the scenario of trying to use a gdb >>>> without DT_RELR with a binary that requires DT_RELR would be that >>>> usual (it would either require the binary or the gdb to have non >>>> standard loader paths). >>> >>> It is an unlikely combination, but not so improbable that we've run >>> into with two of our CI configurations. >>> >>> Is this OK to submit? I don't think this failure is unique to ppc64. >>> However, I am still curious about what, if any, compatibility >>> guarantees nptl_db has. >> >> Your CI configurations are incorrectly setup. >> >> You must match nptl_db with the runtime. >> >> We only support a matched loader, runtime, and nptl_db. > > Uhm … I'm not sure if I agree. The just built nptl_db must be matched to the libc you just built. > Looking at nptl/tst-pthread-gdb-attach.c, we do not run gdb under the > glibc being tested. Instead, the test assumes that the libthread_db we > just built can be loaded by the system glibc. But it may not support > DT_RELR, so it could lead to test failure. I expect that the > pretty-printer tests are similar. We should fix this. > So it's not the CI setup is wrong, it's our testsuite. I think we > should do something to avoid this spurious failure. Running GDB under > an explicit loader invocation will not necessarily work if the built > glibc is older than the system glibc. Given that, Paul's patch is not > unreasonable. We really should avoid such spurious test failures. I'm OK with Paul's patch if it has a big comment that says this is not the correct solution, and that the right solution is to run everything, gdb included, under an explicit loader invocation, and maybe even look at 'set exec-wrapper.' We *can* run gdb under an explicit loader invocation, but if we can't start that gdb then those tests will have to fail as unsupported. This is a consequence of having a dependency on system tools that themselves must run in the same execution image with the just-built runtime. In order to decouple this we need to get rid of nptl_db :-) > Alternatively, we could try to get DT_RELR into those distributions that > can be used easily for glibc development. Absolutely.
On 6/2/22 1:16 PM, Carlos O'Donell wrote: > On 6/2/22 10:54, Florian Weimer wrote: >> * Carlos O'Donell via Libc-alpha: >> >>> On 6/2/22 10:12, Paul E Murphy via Libc-alpha wrote: >>>> >>>> >>>> On 6/1/22 1:56 PM, Adhemerval Zanella wrote: >>>>> >>>>> >>>>> On 01/06/2022 14:46, Paul E. Murphy via Libc-alpha wrote: >>>>>> Should libthread_db.so be dlopen-able by gdb running with a >>>>>> slightly older glibc without dt-relr support? If not, I can >>>>>> rework this to mark these tests unsupported in such cases. >>>>>> >>>>>> ---8<--- >>>>>> >>>>>> This may be loaded by a host gdb linked against a libc which does >>>>>> not support dt-relr. tst-gdb-pthread-attach{,-static} fails on >>>>>> powerpc64 whose linker supports dt-relr, but the host libc does >>>>>> not. >>>>>> >>>>>> Trying host libthread_db library: >>>>>> .../build/glibc/nptl_db/libthread_db.so.1. dlopen failed: >>>>>> /lib64/libc.so.6: version `GLIBC_ABI_DT_RELR' not found (required >>>>>> by .../build/glibc/nptl_db/libthread_db.so.1). >>>>> >>>>> I think it does make sense to disable DT_RELR for libthread_db.so >>>>> mainly as an improvement for the testsuite regression tests, >>>>> although I am not sure if the scenario of trying to use a gdb >>>>> without DT_RELR with a binary that requires DT_RELR would be that >>>>> usual (it would either require the binary or the gdb to have non >>>>> standard loader paths). >>>> >>>> It is an unlikely combination, but not so improbable that we've run >>>> into with two of our CI configurations. >>>> >>>> Is this OK to submit? I don't think this failure is unique to ppc64. >>>> However, I am still curious about what, if any, compatibility >>>> guarantees nptl_db has. >>> >>> Your CI configurations are incorrectly setup. >>> >>> You must match nptl_db with the runtime. >>> >>> We only support a matched loader, runtime, and nptl_db. >> >> Uhm … I'm not sure if I agree. > > The just built nptl_db must be matched to the libc you just built. > >> Looking at nptl/tst-pthread-gdb-attach.c, we do not run gdb under the >> glibc being tested. Instead, the test assumes that the libthread_db we >> just built can be loaded by the system glibc. But it may not support >> DT_RELR, so it could lead to test failure. I expect that the >> pretty-printer tests are similar. > > We should fix this. > >> So it's not the CI setup is wrong, it's our testsuite. I think we >> should do something to avoid this spurious failure. Running GDB under >> an explicit loader invocation will not necessarily work if the built >> glibc is older than the system glibc. Given that, Paul's patch is not >> unreasonable. We really should avoid such spurious test failures. > > I'm OK with Paul's patch if it has a big comment that says this is not the correct > solution, and that the right solution is to run everything, gdb included, under > an explicit loader invocation, and maybe even look at 'set exec-wrapper.' > > We *can* run gdb under an explicit loader invocation, but if we can't start that gdb > then those tests will have to fail as unsupported. This is a consequence of > having a dependency on system tools that themselves must run in the same execution > image with the just-built runtime. That seems like a more ideal workaround, but not without its corner cases. If I understand the usage of this library, the benefits of building it with DT_RELR are limited. Disabling it seems less troublesome. Is this following wordsmithing for the Makefile change OK? diff --git a/nptl_db/Makefile b/nptl_db/Makefile index ed923a41e5..2360e53c88 100644 --- a/nptl_db/Makefile +++ b/nptl_db/Makefile @@ -49,6 +49,12 @@ libthread_db-inhibit-o = $(filter-out .os,$(object-suffixes)) # The ps_* callback functions are not defined. libthread_db.so-no-z-defs = yes +# This is a hack. This is not the correct solution. When this glibc +# is tested, the gdb used could be loaded by the host glibc, which +# may not support DT_RELR, and report a failure instead of unsupported. +# For now, build this without DT_RELR support to avoid this situation. +libthread_db.so-no-dt-relr = yes + tests-special += $(objpfx)db-symbols.out include ../Rules
diff --git a/nptl_db/Makefile b/nptl_db/Makefile index ed923a41e5..29830cae4a 100644 --- a/nptl_db/Makefile +++ b/nptl_db/Makefile @@ -49,6 +49,9 @@ libthread_db-inhibit-o = $(filter-out .os,$(object-suffixes)) # The ps_* callback functions are not defined. libthread_db.so-no-z-defs = yes +# The host GDB may be linked against an older libc which may not support dt-relr. +libthread_db.so-no-dt-relr = yes + tests-special += $(objpfx)db-symbols.out include ../Rules