diff mbox series

nptl_db: do not link libthread_db.so with dt-relr

Message ID 20220601174617.72002-1-murphyp@linux.ibm.com
State New
Headers show
Series nptl_db: do not link libthread_db.so with dt-relr | expand

Commit Message

Paul E Murphy June 1, 2022, 5:46 p.m. UTC
Should libthread_db.so be dlopen-able by gdb running with a slightly
older glibc without dt-relr support?  If not, I can rework this to mark
these tests unsupported in such cases.

---8<---

This may be loaded by a host gdb linked against a libc which does not
support dt-relr.  tst-gdb-pthread-attach{,-static} fails on powerpc64
whose linker supports dt-relr, but the host libc does not.

  Trying host libthread_db library: .../build/glibc/nptl_db/libthread_db.so.1.
  dlopen failed: /lib64/libc.so.6: version `GLIBC_ABI_DT_RELR' not found (required by .../build/glibc/nptl_db/libthread_db.so.1).
---
 nptl_db/Makefile | 3 +++
 1 file changed, 3 insertions(+)

Comments

Adhemerval Zanella Netto June 1, 2022, 6:56 p.m. UTC | #1
On 01/06/2022 14:46, Paul E. Murphy via Libc-alpha wrote:
> Should libthread_db.so be dlopen-able by gdb running with a slightly
> older glibc without dt-relr support?  If not, I can rework this to mark
> these tests unsupported in such cases.
> 
> ---8<---
> 
> This may be loaded by a host gdb linked against a libc which does not
> support dt-relr.  tst-gdb-pthread-attach{,-static} fails on powerpc64
> whose linker supports dt-relr, but the host libc does not.
> 
>   Trying host libthread_db library: .../build/glibc/nptl_db/libthread_db.so.1.
>   dlopen failed: /lib64/libc.so.6: version `GLIBC_ABI_DT_RELR' not found (required by .../build/glibc/nptl_db/libthread_db.so.1).

I think it does make sense to disable DT_RELR for libthread_db.so mainly
as an improvement for the testsuite regression tests, although I am not 
sure if the scenario of trying to use a gdb without DT_RELR with a binary 
that requires DT_RELR would be that usual (it would either require the 
binary or the gdb to have non standard loader paths).

> ---
>  nptl_db/Makefile | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/nptl_db/Makefile b/nptl_db/Makefile
> index ed923a41e5..29830cae4a 100644
> --- a/nptl_db/Makefile
> +++ b/nptl_db/Makefile
> @@ -49,6 +49,9 @@ libthread_db-inhibit-o = $(filter-out .os,$(object-suffixes))
>  # The ps_* callback functions are not defined.
>  libthread_db.so-no-z-defs = yes
>  
> +# The host GDB may be linked against an older libc which may not support dt-relr.
> +libthread_db.so-no-dt-relr = yes
> +
>  tests-special += $(objpfx)db-symbols.out
>  
>  include ../Rules
Paul E Murphy June 2, 2022, 2:12 p.m. UTC | #2
On 6/1/22 1:56 PM, Adhemerval Zanella wrote:
> 
> 
> On 01/06/2022 14:46, Paul E. Murphy via Libc-alpha wrote:
>> Should libthread_db.so be dlopen-able by gdb running with a slightly
>> older glibc without dt-relr support?  If not, I can rework this to mark
>> these tests unsupported in such cases.
>>
>> ---8<---
>>
>> This may be loaded by a host gdb linked against a libc which does not
>> support dt-relr.  tst-gdb-pthread-attach{,-static} fails on powerpc64
>> whose linker supports dt-relr, but the host libc does not.
>>
>>    Trying host libthread_db library: .../build/glibc/nptl_db/libthread_db.so.1.
>>    dlopen failed: /lib64/libc.so.6: version `GLIBC_ABI_DT_RELR' not found (required by .../build/glibc/nptl_db/libthread_db.so.1).
> 
> I think it does make sense to disable DT_RELR for libthread_db.so mainly
> as an improvement for the testsuite regression tests, although I am not
> sure if the scenario of trying to use a gdb without DT_RELR with a binary
> that requires DT_RELR would be that usual (it would either require the
> binary or the gdb to have non standard loader paths).

It is an unlikely combination, but not so improbable that we've run into 
with two of our CI configurations.

Is this OK to submit?  I don't think this failure is unique to ppc64. 
However, I am still curious about what, if any, compatibility guarantees 
nptl_db has.
Carlos O'Donell June 2, 2022, 2:34 p.m. UTC | #3
On 6/2/22 10:12, Paul E Murphy via Libc-alpha wrote:
> 
> 
> On 6/1/22 1:56 PM, Adhemerval Zanella wrote:
>> 
>> 
>> On 01/06/2022 14:46, Paul E. Murphy via Libc-alpha wrote:
>>> Should libthread_db.so be dlopen-able by gdb running with a
>>> slightly older glibc without dt-relr support?  If not, I can
>>> rework this to mark these tests unsupported in such cases.
>>> 
>>> ---8<---
>>> 
>>> This may be loaded by a host gdb linked against a libc which does
>>> not support dt-relr.  tst-gdb-pthread-attach{,-static} fails on
>>> powerpc64 whose linker supports dt-relr, but the host libc does
>>> not.
>>> 
>>> Trying host libthread_db library:
>>> .../build/glibc/nptl_db/libthread_db.so.1. dlopen failed:
>>> /lib64/libc.so.6: version `GLIBC_ABI_DT_RELR' not found (required
>>> by .../build/glibc/nptl_db/libthread_db.so.1).
>> 
>> I think it does make sense to disable DT_RELR for libthread_db.so
>> mainly as an improvement for the testsuite regression tests,
>> although I am not sure if the scenario of trying to use a gdb
>> without DT_RELR with a binary that requires DT_RELR would be that
>> usual (it would either require the binary or the gdb to have non
>> standard loader paths).
> 
> It is an unlikely combination, but not so improbable that we've run
> into with two of our CI configurations.
> 
> Is this OK to submit?  I don't think this failure is unique to ppc64.
> However, I am still curious about what, if any, compatibility
> guarantees nptl_db has.
 
Your CI configurations are incorrectly setup.

You must match nptl_db with the runtime.

We only support a matched loader, runtime, and nptl_db.

When we make transitional changes like these we find all the corner cases in our
deployments where we have mismatched the host and test environments.
Florian Weimer June 2, 2022, 2:54 p.m. UTC | #4
* Carlos O'Donell via Libc-alpha:

> On 6/2/22 10:12, Paul E Murphy via Libc-alpha wrote:
>> 
>> 
>> On 6/1/22 1:56 PM, Adhemerval Zanella wrote:
>>> 
>>> 
>>> On 01/06/2022 14:46, Paul E. Murphy via Libc-alpha wrote:
>>>> Should libthread_db.so be dlopen-able by gdb running with a
>>>> slightly older glibc without dt-relr support?  If not, I can
>>>> rework this to mark these tests unsupported in such cases.
>>>> 
>>>> ---8<---
>>>> 
>>>> This may be loaded by a host gdb linked against a libc which does
>>>> not support dt-relr.  tst-gdb-pthread-attach{,-static} fails on
>>>> powerpc64 whose linker supports dt-relr, but the host libc does
>>>> not.
>>>> 
>>>> Trying host libthread_db library:
>>>> .../build/glibc/nptl_db/libthread_db.so.1. dlopen failed:
>>>> /lib64/libc.so.6: version `GLIBC_ABI_DT_RELR' not found (required
>>>> by .../build/glibc/nptl_db/libthread_db.so.1).
>>> 
>>> I think it does make sense to disable DT_RELR for libthread_db.so
>>> mainly as an improvement for the testsuite regression tests,
>>> although I am not sure if the scenario of trying to use a gdb
>>> without DT_RELR with a binary that requires DT_RELR would be that
>>> usual (it would either require the binary or the gdb to have non
>>> standard loader paths).
>> 
>> It is an unlikely combination, but not so improbable that we've run
>> into with two of our CI configurations.
>> 
>> Is this OK to submit?  I don't think this failure is unique to ppc64.
>> However, I am still curious about what, if any, compatibility
>> guarantees nptl_db has.
>  
> Your CI configurations are incorrectly setup.
>
> You must match nptl_db with the runtime.
>
> We only support a matched loader, runtime, and nptl_db.

Uhm … I'm not sure if I agree.

Looking at nptl/tst-pthread-gdb-attach.c, we do not run gdb under the
glibc being tested.  Instead, the test assumes that the libthread_db we
just built can be loaded by the system glibc.  But it may not support
DT_RELR, so it could lead to test failure.  I expect that the
pretty-printer tests are similar.

So it's not the CI setup is wrong, it's our testsuite.  I think we
should do something to avoid this spurious failure.  Running GDB under
an explicit loader invocation will not necessarily work if the built
glibc is older than the system glibc.  Given that, Paul's patch is not
unreasonable.  We really should avoid such spurious test failures.

Alternatively, we could try to get DT_RELR into those distributions that
can be used easily for glibc development.

Thanks,
Florian
Adhemerval Zanella Netto June 2, 2022, 4:27 p.m. UTC | #5
On 02/06/2022 11:54, Florian Weimer wrote:
> * Carlos O'Donell via Libc-alpha:
> 
>> On 6/2/22 10:12, Paul E Murphy via Libc-alpha wrote:
>>>
>>>
>>> On 6/1/22 1:56 PM, Adhemerval Zanella wrote:
>>>>
>>>>
>>>> On 01/06/2022 14:46, Paul E. Murphy via Libc-alpha wrote:
>>>>> Should libthread_db.so be dlopen-able by gdb running with a
>>>>> slightly older glibc without dt-relr support?  If not, I can
>>>>> rework this to mark these tests unsupported in such cases.
>>>>>
>>>>> ---8<---
>>>>>
>>>>> This may be loaded by a host gdb linked against a libc which does
>>>>> not support dt-relr.  tst-gdb-pthread-attach{,-static} fails on
>>>>> powerpc64 whose linker supports dt-relr, but the host libc does
>>>>> not.
>>>>>
>>>>> Trying host libthread_db library:
>>>>> .../build/glibc/nptl_db/libthread_db.so.1. dlopen failed:
>>>>> /lib64/libc.so.6: version `GLIBC_ABI_DT_RELR' not found (required
>>>>> by .../build/glibc/nptl_db/libthread_db.so.1).
>>>>
>>>> I think it does make sense to disable DT_RELR for libthread_db.so
>>>> mainly as an improvement for the testsuite regression tests,
>>>> although I am not sure if the scenario of trying to use a gdb
>>>> without DT_RELR with a binary that requires DT_RELR would be that
>>>> usual (it would either require the binary or the gdb to have non
>>>> standard loader paths).
>>>
>>> It is an unlikely combination, but not so improbable that we've run
>>> into with two of our CI configurations.
>>>
>>> Is this OK to submit?  I don't think this failure is unique to ppc64.
>>> However, I am still curious about what, if any, compatibility
>>> guarantees nptl_db has.
>>  
>> Your CI configurations are incorrectly setup.
>>
>> You must match nptl_db with the runtime.
>>
>> We only support a matched loader, runtime, and nptl_db.
> 
> Uhm … I'm not sure if I agree.
> 
> Looking at nptl/tst-pthread-gdb-attach.c, we do not run gdb under the
> glibc being tested.  Instead, the test assumes that the libthread_db we
> just built can be loaded by the system glibc.  But it may not support
> DT_RELR, so it could lead to test failure.  I expect that the
> pretty-printer tests are similar.
> 
> So it's not the CI setup is wrong, it's our testsuite.  I think we
> should do something to avoid this spurious failure.  Running GDB under
> an explicit loader invocation will not necessarily work if the built
> glibc is older than the system glibc.  Given that, Paul's patch is not
> unreasonable.  We really should avoid such spurious test failures.
> 
> Alternatively, we could try to get DT_RELR into those distributions that
> can be used easily for glibc development.

Or we can set the tests unsupported if gdb "throws version `GLIBC_ABI_DT_RELR'
not found".
Carlos O'Donell June 2, 2022, 6:16 p.m. UTC | #6
On 6/2/22 10:54, Florian Weimer wrote:
> * Carlos O'Donell via Libc-alpha:
> 
>> On 6/2/22 10:12, Paul E Murphy via Libc-alpha wrote:
>>>
>>>
>>> On 6/1/22 1:56 PM, Adhemerval Zanella wrote:
>>>>
>>>>
>>>> On 01/06/2022 14:46, Paul E. Murphy via Libc-alpha wrote:
>>>>> Should libthread_db.so be dlopen-able by gdb running with a
>>>>> slightly older glibc without dt-relr support?  If not, I can
>>>>> rework this to mark these tests unsupported in such cases.
>>>>>
>>>>> ---8<---
>>>>>
>>>>> This may be loaded by a host gdb linked against a libc which does
>>>>> not support dt-relr.  tst-gdb-pthread-attach{,-static} fails on
>>>>> powerpc64 whose linker supports dt-relr, but the host libc does
>>>>> not.
>>>>>
>>>>> Trying host libthread_db library:
>>>>> .../build/glibc/nptl_db/libthread_db.so.1. dlopen failed:
>>>>> /lib64/libc.so.6: version `GLIBC_ABI_DT_RELR' not found (required
>>>>> by .../build/glibc/nptl_db/libthread_db.so.1).
>>>>
>>>> I think it does make sense to disable DT_RELR for libthread_db.so
>>>> mainly as an improvement for the testsuite regression tests,
>>>> although I am not sure if the scenario of trying to use a gdb
>>>> without DT_RELR with a binary that requires DT_RELR would be that
>>>> usual (it would either require the binary or the gdb to have non
>>>> standard loader paths).
>>>
>>> It is an unlikely combination, but not so improbable that we've run
>>> into with two of our CI configurations.
>>>
>>> Is this OK to submit?  I don't think this failure is unique to ppc64.
>>> However, I am still curious about what, if any, compatibility
>>> guarantees nptl_db has.
>>  
>> Your CI configurations are incorrectly setup.
>>
>> You must match nptl_db with the runtime.
>>
>> We only support a matched loader, runtime, and nptl_db.
> 
> Uhm … I'm not sure if I agree.

The just built nptl_db must be matched to the libc you just built.
 
> Looking at nptl/tst-pthread-gdb-attach.c, we do not run gdb under the
> glibc being tested.  Instead, the test assumes that the libthread_db we
> just built can be loaded by the system glibc.  But it may not support
> DT_RELR, so it could lead to test failure.  I expect that the
> pretty-printer tests are similar.

We should fix this.

> So it's not the CI setup is wrong, it's our testsuite.  I think we
> should do something to avoid this spurious failure.  Running GDB under
> an explicit loader invocation will not necessarily work if the built
> glibc is older than the system glibc.  Given that, Paul's patch is not
> unreasonable.  We really should avoid such spurious test failures.

I'm OK with Paul's patch if it has a big comment that says this is not the correct
solution, and that the right solution is to run everything, gdb included, under
an explicit loader invocation, and maybe even look at 'set exec-wrapper.'

We *can* run gdb under an explicit loader invocation, but if we can't start that gdb
then those tests will have to fail as unsupported. This is a consequence of
having a dependency on system tools that themselves must run in the same execution
image with the just-built runtime.

In order to decouple this we need to get rid of nptl_db :-)

> Alternatively, we could try to get DT_RELR into those distributions that
> can be used easily for glibc development.

Absolutely.
Paul E Murphy June 2, 2022, 7:11 p.m. UTC | #7
On 6/2/22 1:16 PM, Carlos O'Donell wrote:
> On 6/2/22 10:54, Florian Weimer wrote:
>> * Carlos O'Donell via Libc-alpha:
>>
>>> On 6/2/22 10:12, Paul E Murphy via Libc-alpha wrote:
>>>>
>>>>
>>>> On 6/1/22 1:56 PM, Adhemerval Zanella wrote:
>>>>>
>>>>>
>>>>> On 01/06/2022 14:46, Paul E. Murphy via Libc-alpha wrote:
>>>>>> Should libthread_db.so be dlopen-able by gdb running with a
>>>>>> slightly older glibc without dt-relr support?  If not, I can
>>>>>> rework this to mark these tests unsupported in such cases.
>>>>>>
>>>>>> ---8<---
>>>>>>
>>>>>> This may be loaded by a host gdb linked against a libc which does
>>>>>> not support dt-relr.  tst-gdb-pthread-attach{,-static} fails on
>>>>>> powerpc64 whose linker supports dt-relr, but the host libc does
>>>>>> not.
>>>>>>
>>>>>> Trying host libthread_db library:
>>>>>> .../build/glibc/nptl_db/libthread_db.so.1. dlopen failed:
>>>>>> /lib64/libc.so.6: version `GLIBC_ABI_DT_RELR' not found (required
>>>>>> by .../build/glibc/nptl_db/libthread_db.so.1).
>>>>>
>>>>> I think it does make sense to disable DT_RELR for libthread_db.so
>>>>> mainly as an improvement for the testsuite regression tests,
>>>>> although I am not sure if the scenario of trying to use a gdb
>>>>> without DT_RELR with a binary that requires DT_RELR would be that
>>>>> usual (it would either require the binary or the gdb to have non
>>>>> standard loader paths).
>>>>
>>>> It is an unlikely combination, but not so improbable that we've run
>>>> into with two of our CI configurations.
>>>>
>>>> Is this OK to submit?  I don't think this failure is unique to ppc64.
>>>> However, I am still curious about what, if any, compatibility
>>>> guarantees nptl_db has.
>>>   
>>> Your CI configurations are incorrectly setup.
>>>
>>> You must match nptl_db with the runtime.
>>>
>>> We only support a matched loader, runtime, and nptl_db.
>>
>> Uhm … I'm not sure if I agree.
> 
> The just built nptl_db must be matched to the libc you just built.
>   
>> Looking at nptl/tst-pthread-gdb-attach.c, we do not run gdb under the
>> glibc being tested.  Instead, the test assumes that the libthread_db we
>> just built can be loaded by the system glibc.  But it may not support
>> DT_RELR, so it could lead to test failure.  I expect that the
>> pretty-printer tests are similar.
> 
> We should fix this.
> 
>> So it's not the CI setup is wrong, it's our testsuite.  I think we
>> should do something to avoid this spurious failure.  Running GDB under
>> an explicit loader invocation will not necessarily work if the built
>> glibc is older than the system glibc.  Given that, Paul's patch is not
>> unreasonable.  We really should avoid such spurious test failures.
> 
> I'm OK with Paul's patch if it has a big comment that says this is not the correct
> solution, and that the right solution is to run everything, gdb included, under
> an explicit loader invocation, and maybe even look at 'set exec-wrapper.'
> 
> We *can* run gdb under an explicit loader invocation, but if we can't start that gdb
> then those tests will have to fail as unsupported. This is a consequence of
> having a dependency on system tools that themselves must run in the same execution
> image with the just-built runtime.

That seems like a more ideal workaround, but not without its corner 
cases.  If I understand the usage of this library, the benefits of 
building it with DT_RELR are limited. Disabling it seems less troublesome.

Is this following wordsmithing for the Makefile change OK?

diff --git a/nptl_db/Makefile b/nptl_db/Makefile
index ed923a41e5..2360e53c88 100644
--- a/nptl_db/Makefile
+++ b/nptl_db/Makefile
@@ -49,6 +49,12 @@ libthread_db-inhibit-o = $(filter-out 
.os,$(object-suffixes))
  # The ps_* callback functions are not defined.
  libthread_db.so-no-z-defs = yes

+# This is a hack.  This is not the correct solution.  When this glibc
+# is tested, the gdb used could be loaded by the host glibc, which
+# may not support DT_RELR, and report a failure instead of unsupported.
+# For now, build this without DT_RELR support to avoid this situation.
+libthread_db.so-no-dt-relr = yes
+
  tests-special += $(objpfx)db-symbols.out

  include ../Rules
diff mbox series

Patch

diff --git a/nptl_db/Makefile b/nptl_db/Makefile
index ed923a41e5..29830cae4a 100644
--- a/nptl_db/Makefile
+++ b/nptl_db/Makefile
@@ -49,6 +49,9 @@  libthread_db-inhibit-o = $(filter-out .os,$(object-suffixes))
 # The ps_* callback functions are not defined.
 libthread_db.so-no-z-defs = yes
 
+# The host GDB may be linked against an older libc which may not support dt-relr.
+libthread_db.so-no-dt-relr = yes
+
 tests-special += $(objpfx)db-symbols.out
 
 include ../Rules