Using libthread_db.so with single-threaded programs, for TLS access (was: Re: [RFC PATCH 0/3] Pretty-printing for errno)

Message ID 72a8ac9b-2429-c8bd-83b9-d758224571c5@redhat.com
State New
Headers show
Series
  • Using libthread_db.so with single-threaded programs, for TLS access (was: Re: [RFC PATCH 0/3] Pretty-printing for errno)
Related show

Commit Message

Pedro Alves Sept. 13, 2017, 11:22 a.m.
On 09/07/2017 12:34 PM, Pedro Alves wrote:
> On 09/06/2017 10:03 PM, Zack Weinberg wrote:
> 
>> So, changes to both gdb and libthread_db seem to be required here.  I
>> do think that _in principle_ it ought to be possible to use
>> libthread_db to retrieve the address of thread-local data even if the
>> inferior is not linked with libpthread; glibc has quite a few
>> thread-specific variables (errno most prominent, of course, but also
>> h_errno, _res, etc), and so might any library which can be used from
>> both single- and multithreaded programs.
>>
>> This is really not code I feel comfortable hacking up, though, and
>> it's probably more of a project than I have time for, in any case.
> 
> Sounds like a promising approach though.  I'd like to see this path
> explored a bit more.  I'll keep this in my TODO, even though it's
> not likely to bubble up very soon.  Thanks for the discussion/ideas!

So I played with this a bit more on the plane back from Cauldron,
to try to see if we'd hit some major roadblock.  I also chatted
with Carlos a bit about this back at the Cauldron, and seemingly
there's no major reason this can't be made to work,
TLS-internals-wise.

Seems like that it's mainly a case of moving libthread_db.so-related
symbols from libpthread.so elsewhere.  More below.

I hacked libthread_db.so to disable the nptl_version check, so that
it always successfully loads with non-threaded programs.  And then
I tweaked GDB enough to make it actually reach libthread_db.so's
td_thr_tls_get_addr in that scenario too.  That's when I hit
another snag: the symbols that libthread_db.so needs which describe
the necessary offsets of internal data structures for getting at the
TLS blocks are also in libpthread.so...  In particular, the first
we stumble on is "_thread_db_link_map_l_tls_modid".  I made GDB
print the symbol lookups to make it easier to debug.  Vis:

 (gdb) p errno
 ps_pglobal_lookup: name="__stack_user" => PS_NOSYM
 warning: Cannot find user-level thread for LWP 31772: generic error
 ps_pglobal_lookup: name="_thread_db_link_map_l_tls_modid" => PS_NOSYM
 Cannot find thread-local storage for process 31772, shared library /lib64/libc.so.6:
 operation not applicable to

The lookup is coming from here:

 (top-gdb) bt
 #0  ps_pglobal_lookup (ph=0x1f65fe0, obj=0x7fffe58f93ae "libpthread.so.0", name=0x7fffe58f9e48 "_thread_db_link_map_l_tls_modid", sym_addr=0x7fffffffc428) at src/gdb/proc-service.c:115
 #1  0x00007fffe58f88a8 in td_mod_lookup (ps=<optimized out>, mod=mod@entry=0x7fffe58f93ae "libpthread.so.0", idx=<optimized out>, sym_addr=sym_addr@entry=0x7fffffffc428)
     at td_symbol_list.c:48
 #2  0x00007fffe58f8f45 in _td_locate_field (ta=ta@entry=0x1f84df0, desc=desc@entry=0x1f84fbc, descriptor_name=descriptor_name@entry=43, idx=idx@entry=0x0, 
     address=address@entry=0x7fffffffc458) at fetch-value.c:54
 #3  0x00007fffe58f8ff0 in _td_fetch_value (ta=0x1f84df0, desc=0x1f84fbc, descriptor_name=descriptor_name@entry=43, idx=idx@entry=0x0, address=0x7ffff7ff7658, 
     result=result@entry=0x7fffffffc498) at fetch-value.c:94
 #4  0x00007fffe58f8ddf in td_thr_tls_get_addr (th=0x7fffffffc4e0, map_address=<optimized out>, offset=16, address=0x7fffffffc4f8) at td_thr_tls_get_addr.c:31
 ...

So we'd need to move that symbol (and maybe others) to one of ld.so/libc.so
instead.  AFAICT, those magic symbols are described in nptl_db/structs.def.
I haven't looked enough to figure out what ends up expanding those macros
in libpthread.so.  This is where I stopped.

I'm attaching the gdb and libthread_db.so patches I used, both against current
master in their respective projects.  See comments within the patches.
I've also pushed the gdb patch to a "users/palves/tls-nonthreaded" branch.
(I don't think I have write access to glibc's git.)

Thanks,
Pedro Alves

Comments

Philippe Waroquiers Sept. 13, 2017, 7:26 p.m. | #1
On Wed, 2017-09-13 at 12:22 +0100, Pedro Alves wrote:
> On 09/07/2017 12:34 PM, Pedro Alves wrote:
> > On 09/06/2017 10:03 PM, Zack Weinberg wrote:
> > 
> > > So, changes to both gdb and libthread_db seem to be required
> > > here.  I
> > > do think that _in principle_ it ought to be possible to use
> > > libthread_db to retrieve the address of thread-local data even if
> > > the
> > > inferior is not linked with libpthread; glibc has quite a few
> > > thread-specific variables (errno most prominent, of course, but
> > > also
> > > h_errno, _res, etc), and so might any library which can be used
> > > from
> > > both single- and multithreaded programs.
> > > 
> > > This is really not code I feel comfortable hacking up, though,
> > > and
> > > it's probably more of a project than I have time for, in any
> > > case.
> > 
> > Sounds like a promising approach though.  I'd like to see this path
> > explored a bit more.  I'll keep this in my TODO, even though it's
> > not likely to bubble up very soon.  Thanks for the
> > discussion/ideas!
> 
> So I played with this a bit more on the plane back from Cauldron,
> to try to see if we'd hit some major roadblock.  I also chatted
> with Carlos a bit about this back at the Cauldron, and seemingly
> there's no major reason this can't be made to work,
> TLS-internals-wise.
> 
> Seems like that it's mainly a case of moving libthread_db.so-related
> symbols from libpthread.so elsewhere.  More below.
Note that in the valgrind gdbserver, I had to handle the same problem
i.e. find the address of a tls variable without access to any
library (valgrind cannot make use of any library including glibc).

So, I finally end-ed up implementing the minimum logic for that.
It is based on some real ugly hacks, e.g. to get the offset of
lm_modid in struct link_map.
There is also some arch dependent 1 or 2 lines of code to get the dtv.


This is all somewhat fragile, was done in 2014, not broken (yet).
But some more recent changes might have broken the hack,
as I have a test failing after upgrading to Debian 9.

See valgrind  coregrind/m_gdbserver/server.c handling of qGetTLSAddr
for
the gory/hacky details.

Better (even partial) support for such things without the need of a
library would significantly improve my life :)

Philippe
Pedro Alves Sept. 14, 2017, 12:02 a.m. | #2
On 09/13/2017 12:22 PM, Pedro Alves wrote:
> On 09/07/2017 12:34 PM, Pedro Alves wrote:
>> On 09/06/2017 10:03 PM, Zack Weinberg wrote:
>>
>>> So, changes to both gdb and libthread_db seem to be required here.  I
>>> do think that _in principle_ it ought to be possible to use
>>> libthread_db to retrieve the address of thread-local data even if the
>>> inferior is not linked with libpthread; glibc has quite a few
>>> thread-specific variables (errno most prominent, of course, but also
>>> h_errno, _res, etc), and so might any library which can be used from
>>> both single- and multithreaded programs.
>>>
>>> This is really not code I feel comfortable hacking up, though, and
>>> it's probably more of a project than I have time for, in any case.
>>
>> Sounds like a promising approach though.  I'd like to see this path
>> explored a bit more.  I'll keep this in my TODO, even though it's
>> not likely to bubble up very soon.  Thanks for the discussion/ideas!
> 
> So I played with this a bit more on the plane back from Cauldron,
> to try to see if we'd hit some major roadblock.  I also chatted
> with Carlos a bit about this back at the Cauldron, and seemingly
> there's no major reason this can't be made to work,
> TLS-internals-wise.
> 
> Seems like that it's mainly a case of moving libthread_db.so-related
> symbols from libpthread.so elsewhere.  More below.
> 
> I hacked libthread_db.so to disable the nptl_version check, so that
> it always successfully loads with non-threaded programs.  And then
> I tweaked GDB enough to make it actually reach libthread_db.so's
> td_thr_tls_get_addr in that scenario too.  That's when I hit
> another snag: the symbols that libthread_db.so needs which describe
> the necessary offsets of internal data structures for getting at the
> TLS blocks are also in libpthread.so...  In particular, the first
> we stumble on is "_thread_db_link_map_l_tls_modid".  I made GDB
> print the symbol lookups to make it easier to debug.  Vis:
> 
>  (gdb) p errno
>  ps_pglobal_lookup: name="__stack_user" => PS_NOSYM
>  warning: Cannot find user-level thread for LWP 31772: generic error
>  ps_pglobal_lookup: name="_thread_db_link_map_l_tls_modid" => PS_NOSYM
>  Cannot find thread-local storage for process 31772, shared library /lib64/libc.so.6:
>  operation not applicable to
> 
> The lookup is coming from here:
> 
>  (top-gdb) bt
>  #0  ps_pglobal_lookup (ph=0x1f65fe0, obj=0x7fffe58f93ae "libpthread.so.0", name=0x7fffe58f9e48 "_thread_db_link_map_l_tls_modid", sym_addr=0x7fffffffc428) at src/gdb/proc-service.c:115
>  #1  0x00007fffe58f88a8 in td_mod_lookup (ps=<optimized out>, mod=mod@entry=0x7fffe58f93ae "libpthread.so.0", idx=<optimized out>, sym_addr=sym_addr@entry=0x7fffffffc428)
>      at td_symbol_list.c:48
>  #2  0x00007fffe58f8f45 in _td_locate_field (ta=ta@entry=0x1f84df0, desc=desc@entry=0x1f84fbc, descriptor_name=descriptor_name@entry=43, idx=idx@entry=0x0, 
>      address=address@entry=0x7fffffffc458) at fetch-value.c:54
>  #3  0x00007fffe58f8ff0 in _td_fetch_value (ta=0x1f84df0, desc=0x1f84fbc, descriptor_name=descriptor_name@entry=43, idx=idx@entry=0x0, address=0x7ffff7ff7658, 
>      result=result@entry=0x7fffffffc498) at fetch-value.c:94
>  #4  0x00007fffe58f8ddf in td_thr_tls_get_addr (th=0x7fffffffc4e0, map_address=<optimized out>, offset=16, address=0x7fffffffc4f8) at td_thr_tls_get_addr.c:31
>  ...
> 
> So we'd need to move that symbol (and maybe others) to one of ld.so/libc.so
> instead.  AFAICT, those magic symbols are described in nptl_db/structs.def.
> I haven't looked enough to figure out what ends up expanding those macros
> in libpthread.so.  This is where I stopped.
> 
> I'm attaching the gdb and libthread_db.so patches I used, both against current
> master in their respective projects.  See comments within the patches.
> I've also pushed the gdb patch to a "users/palves/tls-nonthreaded" branch.
> (I don't think I have write access to glibc's git.)

I played some more with this tonight, and got it working.
I found that libpthread.so gets the symbols from structs.def
via this bit in pthread_create.c:

~~~
 /* Information for libthread_db.  */

 #include "../nptl_db/db_info.c"
~~~

So I added the same thing to elf/dl-tls.c (because TLS stuff),
with a tweak to structs.def to make it skip some symbols.

Compared to the other version, I also defined nptl_version so that
libthread_db.so's version check works.  And I made td_ta_map_lwp2thr
return the initial thread's fake descriptor when __stack_used isn't
found, so that gdb gets back a descriptor instead of an error.

I haven't tried running glibc's testsuite, nor gdb's.

But, with:

 GLIBC=/home/pedro/src/glibc/build/ gcc \
   -Wl,-rpath=${GLIBC}:${GLIBC}/math:${GLIBC}/elf:${GLIBC}/dlfcn:${GLIBC}/nss:${GLIBC}/nis:${GLIBC}/rt:${GLIBC}/resolv:${GLIBC}/crypt:${GLIBC}/nptl:${GLIBC}/dfp \
   -Wl,--dynamic-linker=${GLIBC}/elf/ld.so \
   -g -O0 \
   errno.c -o errno

 $ gdb 
    -ex "set debug libthread-db 1" \
    -ex "set auto-load safe-path /home/pedro/src/glibc/build/nptl_db/" \
    -ex "set libthread-db-search-path /home/pedro/src/glibc/build/nptl_db/" \
   ~/tmp/errno

I get:

 ...
 Temporary breakpoint 1, main () at errno.c:6
 6         errno = 2;
 (gdb) p errno
 $1 = 0
 (gdb) n
 7         return errno;
 (gdb) p errno
 $2 = 2
 (gdb) p &errno
 $3 = (int *) 0x7ffff7ff36c0

So WDYT?  Do you think there's a chance we could get something
like this merged?

Thanks,
Pedro Alves
From 0edee198c4edb446bb5e676d6c49e1bef6e15ad1 Mon Sep 17 00:00:00 2001
From: Pedro Alves <palves@redhat.com>
Date: Wed, 13 Sep 2017 11:43:15 +0100
Subject: [PATCH] Use libthread_db.so with non-threaded programs, for TLS

---
 gdb/linux-thread-db.c | 42 ++++++++++++++++++++++++++++++++----------
 gdb/proc-service.c    | 19 ++++++++++++++++---
 2 files changed, 48 insertions(+), 13 deletions(-)

diff --git a/gdb/linux-thread-db.c b/gdb/linux-thread-db.c
index 0e16f6a..4f627cb 100644
--- a/gdb/linux-thread-db.c
+++ b/gdb/linux-thread-db.c
@@ -104,7 +104,7 @@ set_libthread_db_search_path (char *ignored, int from_tty,
 
 /* If non-zero, print details of libthread_db processing.  */
 
-static unsigned int libthread_db_debug;
+unsigned int libthread_db_debug;
 
 static void
 show_libthread_db_debug (struct ui_file *file, int from_tty,
@@ -354,13 +354,19 @@ thread_from_lwp (ptid_t ptid)
   err = info->td_ta_map_lwp2thr_p (info->thread_agent, ptid_get_lwp (ptid),
 				   &th);
   if (err != TD_OK)
-    error (_("Cannot find user-level thread for LWP %ld: %s"),
-	   ptid_get_lwp (ptid), thread_db_err_str (err));
+    {
+      warning (_("Cannot find user-level thread for LWP %ld: %s"),
+	       ptid_get_lwp (ptid), thread_db_err_str (err));
+      return NULL;
+    }
 
   err = info->td_thr_get_info_p (&th, &ti);
   if (err != TD_OK)
-    error (_("thread_get_info_callback: cannot get thread info: %s"),
-	   thread_db_err_str (err));
+    {
+      warning (_("thread_get_info_callback: cannot get thread info: %s"),
+	       thread_db_err_str (err));
+      return NULL;
+    }
 
   /* Fill the cache.  */
   tp = find_thread_ptid (ptid);
@@ -1429,14 +1435,32 @@ thread_db_get_thread_local_address (struct target_ops *ops,
   if (thread_info != NULL && thread_info->priv == NULL)
     thread_info = thread_from_lwp (ptid);
 
-  if (thread_info != NULL && thread_info->priv != NULL)
+  if (thread_info != NULL)
     {
       td_err_e err;
       psaddr_t address;
       struct thread_db_info *info;
+      td_thrhandle_t main_thr_th;
 
       info = get_thread_db_info (ptid_get_pid (ptid));
 
+      /* Handle executables that don't link with pthread.
+	 Historically we didn't use to use libthread_db.so with those,
+	 but nowadays we do in order to be able to access TLS
+	 variables in non-threaded programs, like "errno".  If the
+	 program is not threaded, the initial thread's handle's tid
+	 field remains 0, and we won't have cached the handle /
+	 created the priv field.  */
+      td_thrhandle_t *th;
+      if (thread_info->priv == NULL)
+	{
+	  main_thr_th.th_ta_p = info->thread_agent;
+	  main_thr_th.th_unique = 0;
+	  th = &main_thr_th;
+	}
+      else
+	th = &thread_info->priv->th;
+
       /* Finally, get the address of the variable.  */
       if (lm != 0)
 	{
@@ -1448,8 +1472,7 @@ thread_db_get_thread_local_address (struct target_ops *ops,
 	  /* Note the cast through uintptr_t: this interface only works if
 	     a target address fits in a psaddr_t, which is a host pointer.
 	     So a 32-bit debugger can not access 64-bit TLS through this.  */
-	  err = info->td_thr_tls_get_addr_p (&thread_info->priv->th,
-					     (psaddr_t)(uintptr_t) lm,
+	  err = info->td_thr_tls_get_addr_p (th, (psaddr_t)(uintptr_t) lm,
 					     offset, &address);
 	}
       else
@@ -1466,8 +1489,7 @@ thread_db_get_thread_local_address (struct target_ops *ops,
 	     PR libc/16831 due to GDB PR threads/16954 LOAD_MODULE is also NULL.
 	     The constant number 1 depends on GNU __libc_setup_tls
 	     initialization of l_tls_modid to 1.  */
-	  err = info->td_thr_tlsbase_p (&thread_info->priv->th,
-					1, &address);
+	  err = info->td_thr_tlsbase_p (th, 1, &address);
 	  address = (char *) address + offset;
 	}
 
diff --git a/gdb/proc-service.c b/gdb/proc-service.c
index 4620fea..940993c 100644
--- a/gdb/proc-service.c
+++ b/gdb/proc-service.c
@@ -102,6 +102,8 @@ ps_xfer_memory (const struct ps_prochandle *ph, psaddr_t addr,
 }
 
 
+extern unsigned int libthread_db_debug;
+
 /* Search for the symbol named NAME within the object named OBJ within
    the target process PH.  If the symbol is found the address of the
    symbol is stored in SYM_ADDR.  */
@@ -119,9 +121,20 @@ ps_pglobal_lookup (gdb_ps_prochandle_t ph, const char *obj,
   /* FIXME: kettenis/2000-09-03: What should we do with OBJ?  */
   bound_minimal_symbol ms = lookup_minimal_symbol (name, NULL, NULL);
   if (ms.minsym == NULL)
-    return PS_NOSYM;
-
-  *sym_addr = core_addr_to_ps_addr (BMSYMBOL_VALUE_ADDRESS (ms));
+    {
+      if (libthread_db_debug)
+	fprintf_unfiltered (gdb_stdlog,
+			    "ps_pglobal_lookup: name=\"%s\" => PS_NOSYM\n",
+			    name);
+      return PS_NOSYM;
+    }
+
+  CORE_ADDR ms_addr = BMSYMBOL_VALUE_ADDRESS (ms);
+  if (libthread_db_debug)
+    fprintf_unfiltered (gdb_stdlog,
+			"ps_pglobal_lookup: name=\"%s\" => PS_OK, %s\n", name,
+			paddress (target_gdbarch (), ms_addr));
+  *sym_addr = core_addr_to_ps_addr (ms_addr);
   return PS_OK;
 }
Carlos O'Donell Sept. 18, 2017, 1:17 p.m. | #3
On 09/13/2017 06:02 PM, Pedro Alves wrote:
> On 09/13/2017 12:22 PM, Pedro Alves wrote:
>> On 09/07/2017 12:34 PM, Pedro Alves wrote:
>>> On 09/06/2017 10:03 PM, Zack Weinberg wrote:
>>>
>>>> So, changes to both gdb and libthread_db seem to be required here.  I
>>>> do think that _in principle_ it ought to be possible to use
>>>> libthread_db to retrieve the address of thread-local data even if the
>>>> inferior is not linked with libpthread; glibc has quite a few
>>>> thread-specific variables (errno most prominent, of course, but also
>>>> h_errno, _res, etc), and so might any library which can be used from
>>>> both single- and multithreaded programs.
>>>>
>>>> This is really not code I feel comfortable hacking up, though, and
>>>> it's probably more of a project than I have time for, in any case.
>>>
>>> Sounds like a promising approach though.  I'd like to see this path
>>> explored a bit more.  I'll keep this in my TODO, even though it's
>>> not likely to bubble up very soon.  Thanks for the discussion/ideas!
>>
>> So I played with this a bit more on the plane back from Cauldron,
>> to try to see if we'd hit some major roadblock.  I also chatted
>> with Carlos a bit about this back at the Cauldron, and seemingly
>> there's no major reason this can't be made to work,
>> TLS-internals-wise.
>>
>> Seems like that it's mainly a case of moving libthread_db.so-related
>> symbols from libpthread.so elsewhere.  More below.
>>
>> I hacked libthread_db.so to disable the nptl_version check, so that
>> it always successfully loads with non-threaded programs.  And then
>> I tweaked GDB enough to make it actually reach libthread_db.so's
>> td_thr_tls_get_addr in that scenario too.  That's when I hit
>> another snag: the symbols that libthread_db.so needs which describe
>> the necessary offsets of internal data structures for getting at the
>> TLS blocks are also in libpthread.so...  In particular, the first
>> we stumble on is "_thread_db_link_map_l_tls_modid".  I made GDB
>> print the symbol lookups to make it easier to debug.  Vis:
>>
>>  (gdb) p errno
>>  ps_pglobal_lookup: name="__stack_user" => PS_NOSYM
>>  warning: Cannot find user-level thread for LWP 31772: generic error
>>  ps_pglobal_lookup: name="_thread_db_link_map_l_tls_modid" => PS_NOSYM
>>  Cannot find thread-local storage for process 31772, shared library /lib64/libc.so.6:
>>  operation not applicable to
>>
>> The lookup is coming from here:
>>
>>  (top-gdb) bt
>>  #0  ps_pglobal_lookup (ph=0x1f65fe0, obj=0x7fffe58f93ae "libpthread.so.0", name=0x7fffe58f9e48 "_thread_db_link_map_l_tls_modid", sym_addr=0x7fffffffc428) at src/gdb/proc-service.c:115
>>  #1  0x00007fffe58f88a8 in td_mod_lookup (ps=<optimized out>, mod=mod@entry=0x7fffe58f93ae "libpthread.so.0", idx=<optimized out>, sym_addr=sym_addr@entry=0x7fffffffc428)
>>      at td_symbol_list.c:48
>>  #2  0x00007fffe58f8f45 in _td_locate_field (ta=ta@entry=0x1f84df0, desc=desc@entry=0x1f84fbc, descriptor_name=descriptor_name@entry=43, idx=idx@entry=0x0, 
>>      address=address@entry=0x7fffffffc458) at fetch-value.c:54
>>  #3  0x00007fffe58f8ff0 in _td_fetch_value (ta=0x1f84df0, desc=0x1f84fbc, descriptor_name=descriptor_name@entry=43, idx=idx@entry=0x0, address=0x7ffff7ff7658, 
>>      result=result@entry=0x7fffffffc498) at fetch-value.c:94
>>  #4  0x00007fffe58f8ddf in td_thr_tls_get_addr (th=0x7fffffffc4e0, map_address=<optimized out>, offset=16, address=0x7fffffffc4f8) at td_thr_tls_get_addr.c:31
>>  ...
>>
>> So we'd need to move that symbol (and maybe others) to one of ld.so/libc.so
>> instead.  AFAICT, those magic symbols are described in nptl_db/structs.def.
>> I haven't looked enough to figure out what ends up expanding those macros
>> in libpthread.so.  This is where I stopped.
>>
>> I'm attaching the gdb and libthread_db.so patches I used, both against current
>> master in their respective projects.  See comments within the patches.
>> I've also pushed the gdb patch to a "users/palves/tls-nonthreaded" branch.
>> (I don't think I have write access to glibc's git.)
> 
> I played some more with this tonight, and got it working.
> I found that libpthread.so gets the symbols from structs.def
> via this bit in pthread_create.c:
> 
> ~~~
>  /* Information for libthread_db.  */
> 
>  #include "../nptl_db/db_info.c"
> ~~~
> 
> So I added the same thing to elf/dl-tls.c (because TLS stuff),
> with a tweak to structs.def to make it skip some symbols.
> 
> Compared to the other version, I also defined nptl_version so that
> libthread_db.so's version check works.  And I made td_ta_map_lwp2thr
> return the initial thread's fake descriptor when __stack_used isn't
> found, so that gdb gets back a descriptor instead of an error.
> 
> I haven't tried running glibc's testsuite, nor gdb's.
> 
> But, with:
> 
>  GLIBC=/home/pedro/src/glibc/build/ gcc \
>    -Wl,-rpath=${GLIBC}:${GLIBC}/math:${GLIBC}/elf:${GLIBC}/dlfcn:${GLIBC}/nss:${GLIBC}/nis:${GLIBC}/rt:${GLIBC}/resolv:${GLIBC}/crypt:${GLIBC}/nptl:${GLIBC}/dfp \
>    -Wl,--dynamic-linker=${GLIBC}/elf/ld.so \
>    -g -O0 \
>    errno.c -o errno
> 
>  $ gdb 
>     -ex "set debug libthread-db 1" \
>     -ex "set auto-load safe-path /home/pedro/src/glibc/build/nptl_db/" \
>     -ex "set libthread-db-search-path /home/pedro/src/glibc/build/nptl_db/" \
>    ~/tmp/errno
> 
> I get:
> 
>  ...
>  Temporary breakpoint 1, main () at errno.c:6
>  6         errno = 2;
>  (gdb) p errno
>  $1 = 0
>  (gdb) n
>  7         return errno;
>  (gdb) p errno
>  $2 = 2
>  (gdb) p &errno
>  $3 = (int *) 0x7ffff7ff36c0
> 
> So WDYT?  Do you think there's a chance we could get something
> like this merged?

From the glibc side I think your patch goes in a very desirable direction.
We want this to look the same if there is one thread (main thread 0) or
more than one thread. Therefore the abstraction is valuable.

At the implementation level I would want to see the changes to dl-tls.c
much more restricted. Including db_info.c seems like just an expedient
airplane hack. We need a cleaner way to define the symbols you need.

So the patch isn't ready, but the idea is solid. I don't have time to look
at this right now, but I will in maybe a month (yes my queue is that deep
right now).
Pedro Alves Sept. 18, 2017, 2:28 p.m. | #4
Hi Carlos,

On 09/18/2017 02:17 PM, Carlos O'Donell wrote:
> On 09/13/2017 06:02 PM, Pedro Alves wrote:

> From the glibc side I think your patch goes in a very desirable direction.
> We want this to look the same if there is one thread (main thread 0) or
> more than one thread. Therefore the abstraction is valuable.

Great to hear, thanks!

> At the implementation level I would want to see the changes to dl-tls.c
> much more restricted. Including db_info.c seems like just an expedient
> airplane hack. We need a cleaner way to define the symbols you need.

Yeah, I was aiming for the "minimal viable product", and just
followed what nptl/pthread_create.c does.  It's possible that we
could trim the defined symbols, I haven't looked much beyond
the surface.

I suspect that nptl/pthread_create.c includes ../nptl_db/db_info.c
directly for static links (-static).  I.e., so that the
thread_db-related symbols are defined (and are only defined)
with programs that call pthread_create.  It's possible that
my prototype breaks -static linking with multiple definition
errors, I haven't tried it.

> 
> So the patch isn't ready, but the idea is solid. I don't have time to look
> at this right now, but I will in maybe a month (yes my queue is that deep
> right now).

Likewise.  If someone more familiar with glibc internals wants to
run with this, they're more than welcome.

Thanks,
Pedro Alves

Patch

From 386b8dc8ef16197b3efa38f4bbbc98833ce7c2c6 Mon Sep 17 00:00:00 2001
From: Pedro Alves <palves@redhat.com>
Date: Mon, 11 Sep 2017 13:48:04 +0100
Subject: [PATCH] remove version checks hack

---
 nptl_db/td_ta_map_lwp2thr.c | 10 ++++++++--
 nptl_db/td_ta_new.c         |  9 ++++++++-
 2 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/nptl_db/td_ta_map_lwp2thr.c b/nptl_db/td_ta_map_lwp2thr.c
index d34711d..85620cb 100644
--- a/nptl_db/td_ta_map_lwp2thr.c
+++ b/nptl_db/td_ta_map_lwp2thr.c
@@ -185,11 +185,17 @@  td_ta_map_lwp2thr (const td_thragent_t *ta_arg,
      sometimes contain garbage that would confuse us, left by the kernel
      at exec.  So if it looks like initialization is incomplete, we only
      fake a special descriptor for the initial thread.  */
-
   psaddr_t list;
   td_err_e err = DB_GET_SYMBOL (list, ta, __stack_user);
   if (err != TD_OK)
-    return err;
+    {
+      /* '__stack_user' is in pthread.so, so this always fails with
+	 non-threaded programs.  GDB hardcodes/assumes th_unique==0
+	 for the main thread - maybe we should instead return the fake
+	 special descriptor for the initial thread here too.  See
+	 below.  */
+      return err;
+    }
 
   err = DB_GET_FIELD (list, ta, list, list_t, next, 0);
   if (err != TD_OK)
diff --git a/nptl_db/td_ta_new.c b/nptl_db/td_ta_new.c
index aec2356..40424ad 100644
--- a/nptl_db/td_ta_new.c
+++ b/nptl_db/td_ta_new.c
@@ -33,12 +33,18 @@  LIST_HEAD (__td_agent_list);
 td_err_e
 td_ta_new (struct ps_prochandle *ps, td_thragent_t **ta)
 {
+#if 0
   psaddr_t versaddr;
   char versbuf[sizeof (VERSION)];
+#endif
 
   LOG ("td_ta_new");
 
-  /* Check whether the versions match.  */
+  /* Check whether the versions match.
+
+     XXX: Disabled because "nptl_version" currently lives in
+     libpthread.so.  */
+#if 0
   if (td_lookup (ps, SYM_nptl_version, &versaddr) != PS_OK)
     return TD_NOLIBTHREAD;
   if (ps_pdread (ps, versaddr, versbuf, sizeof (versbuf)) != PS_OK)
@@ -47,6 +53,7 @@  td_ta_new (struct ps_prochandle *ps, td_thragent_t **ta)
   if (memcmp (versbuf, VERSION, sizeof VERSION) != 0)
     /* Not the right version.  */
     return TD_VERSION;
+#endif
 
   /* Fill in the appropriate information.  */
   *ta = (td_thragent_t *) calloc (1, sizeof (td_thragent_t));
-- 
2.5.5