[review] Avoid late dlopen failure due to scope, TLS slotinfo updates [BZ #25112]
diff mbox series

Message ID gerrit.1572549639000.I240c58387dabda3ca1bcab48b02115175fa83d6c@gnutoolchain-gerrit.osci.io
State New
Headers show
Series
  • [review] Avoid late dlopen failure due to scope, TLS slotinfo updates [BZ #25112]
Related show

Commit Message

Carlos O'Donell (Code Review) Oct. 31, 2019, 7:20 p.m. UTC
Change URL: https://gnutoolchain-gerrit.osci.io/r/c/glibc/+/468
......................................................................

Avoid late dlopen failure due to scope, TLS slotinfo updates [BZ #25112]

This change splits the scope and TLS slotinfo updates in dlopen into
two parts: one to resize the data structures, and one to actually apply
the update.  The call to add_to_global_resize in dl_open_worker is moved
before the demarcation point at which no further memory allocations are
allowed.

_dl_add_to_slotinfo is adjusted to make the list update optional.  There
is some optimization possibility here because we could grow the slotinfo
list of arrays in a single call, one the largest TLS modid is known.

This commit does not fix the fatal meory allocation failure in
_dl_update_slotinfo.  Ideally, this error during dlopen should be
recoverable.

The update order of scopes and TLS data structures is retained, although
it appears to be more correct to fully initialize TLS first, and then
expose symbols in the newly loaded objects via the scope update.

Tested on x86_64-linux-gnu.

Change-Id: I240c58387dabda3ca1bcab48b02115175fa83d6c
---
M elf/dl-open.c
M elf/dl-tls.c
M elf/rtld.c
M sysdeps/generic/ldsodefs.h
4 files changed, 240 insertions(+), 128 deletions(-)

Comments

Carlos O'Donell (Code Review) Nov. 13, 2019, 12:57 p.m. UTC | #1
Florian Weimer has posted comments on this change.

Change URL: https://gnutoolchain-gerrit.osci.io/r/c/glibc/+/468
......................................................................


Patch Set 2:

This change is ready for review.
Carlos O'Donell (Code Review) Nov. 26, 2019, 10:11 p.m. UTC | #2
Carlos O'Donell has posted comments on this change.

Change URL: https://gnutoolchain-gerrit.osci.io/r/c/glibc/+/468
......................................................................


Patch Set 3:

(25 comments)

This is *great* work and some of the best loader cleanup I've seen yet. I really like the refactoring and placing into various functions. I called out one function which does too much and I think we should split. For v2 I mostly want the comments cleaned up, since they form the basis of how we will continue to fixup the code. Logically the code makes sense to me, and it took so long becuase I started auditing from all slotinfo uses upwards across the call chains, and also l_scope uses (and I even audited the gscope locking uses too).

Looking forward to a v2. I think we're almost ready to commit this.

| --- /dev/null
| +++ /COMMIT_MSG
| @@ -1,0 +1,29 @@ 
| +Parent:     0d636cf2 (Lazy binding failures during dlopen/dlclose must be fatal [BZ #24304])
| +Author:     Florian Weimer <fweimer@redhat.com>
| +AuthorDate: 2019-10-31 13:12:32 +0100
| +Commit:     Florian Weimer <fweimer@redhat.com>
| +CommitDate: 2019-11-15 16:21:18 +0100
| +
| +Avoid late dlopen failure due to scope, TLS slotinfo updates [BZ #25112]

PS3, Line 7:

OK. The idea here is that we do the slotinfo updates early.

| +
| +This change splits the scope and TLS slotinfo updates in dlopen into
| +two parts: one to resize the data structures, and one to actually apply
| +the update.  The call to add_to_global_resize in dl_open_worker is moved
| +before the demarcation point at which no further memory allocations are
| +allowed.
| +
| +_dl_add_to_slotinfo is adjusted to make the list update optional.  There
| +is some optimization possibility here because we could grow the slotinfo
| +list of arrays in a single call, one the largest TLS modid is known.
| +
| +This commit does not fix the fatal meory allocation failure in
| +_dl_update_slotinfo.  Ideally, this error during dlopen should be
| +recoverable.

PS3, Line 21:

Correct, I agree, but it requires more surgery.

| +
| +The update order of scopes and TLS data structures is retained, although
| +it appears to be more correct to fully initialize TLS first, and then
| +expose symbols in the newly loaded objects via the scope update.
| +
| +Tested on x86_64-linux-gnu.
| +
| +Change-Id: I240c58387dabda3ca1bcab48b02115175fa83d6c
| --- elf/dl-open.c
| +++ elf/dl-open.c
| @@ -27,18 +27,19 @@ #include <unistd.h>
|  #include <sys/mman.h>		/* Check whether MAP_COPY is defined.  */
|  #include <sys/param.h>
|  #include <libc-lock.h>
|  #include <ldsodefs.h>
|  #include <sysdep-cancel.h>
|  #include <tls.h>
|  #include <stap-probe.h>
|  #include <atomic.h>
|  #include <libc-internal.h>
| +#include <array_length.h>

PS3, Line 36:

OK.

|  
|  #include <dl-dst.h>
|  #include <dl-prop.h>
|  
|  
|  /* We must be careful not to leave us in an inconsistent state.  Thus we
|     catch any error and re-raise it after cleaning up.  */
|  
|  struct dl_open_args

 ...

| @@ -208,9 +209,87 @@ _dl_find_dso_for_object (const ElfW(Addr) addr)
|  	      || _dl_addr_inside_object (l, (ElfW(Addr)) addr)))
|  	{
|  	  assert (ns == l->l_ns);
|  	  return l;
|  	}
|    return NULL;
|  }
|  rtld_hidden_def (_dl_find_dso_for_object);
|  
| +/* Returned from scope_size if the new map was found in the existing
| +   scope.  */
| +enum { scope_size_found_new = (size_t) -1 };

PS3, Line 220:

See comments below.

| +
| +/* Return the length of the scope for MAP.  If NEW->l_searchlist is
| +   found in the scope, return scope_size_found_new.  */
| +static size_t
| +scope_size (struct link_map *map, struct link_map *new)

PS3, Line 225:

My opinion is that this is a micro-optimization that makes this code
harder to maintain or undrestand. The compiler should handle this when
it inlines this function. Please split this into scope_size which
always returns size and can't fail, and a scope_find which searches
just for the pointer to the dependency tree of the map in question
inside the scope of the other object. Clearly comment each function
and what it does.

| +{
| +  size_t cnt;
| +  for (cnt = 0; map->l_scope[cnt] != NULL; ++cnt)
| +    if (map->l_scope[cnt] == &new->l_searchlist)
| +      return scope_size_found_new;
| +  return cnt;
| +}
| +
| +/* Resize the scopes of depended-upon objects, so that the new object
| +   can be added later without further allocation of memory.  This
| +   function can raise an exceptions due to malloc failure.  */
| +static void
| +resize_scopes (struct link_map *new)
| +{

PS3, Line 239:

OK, good comment and semantics are clear.

| +  /* If the file is not loaded now as a dependency, add the search
| +     list of the newly loaded object to the scope.  */
| +  for (unsigned int i = 0; i < new->l_searchlist.r_nlist; ++i)
| +    {
| +      struct link_map *imap = new->l_searchlist.r_list[i];
| +
| +      /* If the initializer has been called already, the object has
| +	 not been loaded here and now.  */
| +      if (imap->l_init_called && imap->l_type == lt_loaded)
| +	{
| +	  size_t cnt = scope_size (imap, new);
| +	  if (cnt == scope_size_found_new)
| +	    /* Avoid duplicates.  */
| +	    continue;

PS3, Line 253:

Instead of reusing count we'd call:
if (scope_find(imap, new))
  continue;
No need to have any scope_size_found_new enum.

| +
| +	  if (__glibc_unlikely (cnt + 1 >= imap->l_scope_max))

PS3, Line 255:

OK. We want to make sure that cnt + 1 (the size of the used entries +
the one new entry) does not exceed the currently allocated size.

| +	    {
| +	      /* The 'r_scope' array is too small.  Allocate a new one
| +		 dynamically.  */

PS3, Line 258:

I know you're just refactoring this code, but there is no such thing
as an r_scope array (even back in 2006 with
1100f84983f22e570a5081cbe79b0ef8fe4952d7 when this was added). The
'r_scope' refers to l_scope AFAIU. The comments should say "The
'l_scope' array is too small. Allocate a new one dynamically." We
should fix this since it makes the comment make sense.

| +	      size_t new_size;
| +	      struct r_scope_elem **newp;
| +
| +	      if (imap->l_scope != imap->l_scope_mem
| +		  && imap->l_scope_max < array_length (imap->l_scope_mem))
| +		{
| +		  new_size = array_length (imap->l_scope_mem);
| +		  newp = imap->l_scope_mem;

PS3, Line 266:

This needs a comment. What we're doing here is as follows:
/* If the current l_scope memory is not pointing to the static memory
in the structure, but the static memory in the structure is large
enough to use for "cnt + 1" scope entries, then switch to using the
static memory. */
The old memory is deallocated on line 286.

| +		}
| +	      else
| +		{
| +		  new_size = imap->l_scope_max * 2;

PS3, Line 270:

The x2 scope increase is likely a waste that will never fit into any
previous allocation causing fragmentation during large dlopen l_scope
growth (python with lots of modules?). I reviewed the usage of
l_scope_max and r_nlist, and neither needs to be power of 2. Could we
get away with a x1.5 growth here? This is the only place we resize the
scope array. I know it will cost slightly more to compute x1.5, but it
shouldn't be too bad and the utilization will go up in malloc. In dl-
scope.c we queue up 50 scopes to free in one go (because we're waiting
to do the deferred work when we finally acquire the global scope
lock). We do consolidate in free (backward and forward bordering
consolidation or fastbin consolidation (size >
FASTBIN_CONSOLIDATION_THRESHOLD==65536UL triggering
malloc_consolidated()).

| +		  newp = (struct r_scope_elem **)
| +		    malloc (new_size * sizeof (struct r_scope_elem *));
| +		  if (newp == NULL)
| +		    _dl_signal_error (ENOMEM, "dlopen", NULL,
| +				      N_("cannot create scope list"));
| +		}
| +
| +	      /* Copy the array and the terminating NULL.  */
| +	      memcpy (newp, imap->l_scope,
| +		      (cnt + 1) * sizeof (imap->l_scope[0]));
| +	      struct r_scope_elem **old = imap->l_scope;
| +
| +	      imap->l_scope = newp;
| +
| +	      if (old != imap->l_scope_mem)
| +		_dl_scope_free (old);

PS3, Line 286:

OK. Free the scope that isn't static memory.

| +
| +	      imap->l_scope_max = new_size;
| +	    }
| +	}
| +    }
| +}
| +
| +/* Second stage of resize_scopes: Add NEW to the scopes.  Also print
| +   debugging information about scopes if requested.

 ...

| @@ -217,0 +300,59 @@ static void
| +update_scopes (struct link_map *new)
| +{
| +  for (unsigned int i = 0; i < new->l_searchlist.r_nlist; ++i)
| +    {
| +      struct link_map *imap = new->l_searchlist.r_list[i];
| +      int from_scope = 0;
| +
| +      if (imap->l_init_called && imap->l_type == lt_loaded)
| +	{
| +	  size_t cnt = scope_size (imap, new);
| +	  if (cnt == scope_size_found_new)
| +	    /* Avoid duplicates.  */
| +	    continue;

PS3, Line 312:

Change as suggested above. Split into two function calls.

| +
| +	  assert (cnt + 1 < imap->l_scope_max);

PS3, Line 314:

Please add a comment about this assert.

| +
| +	  /* First terminate the extended list.  Otherwise a thread
| +	     might use the new last element and then use the garbage
| +	     at offset IDX+1.  */
| +	  imap->l_scope[cnt + 1] = NULL;
| +	  atomic_write_barrier ();
| +	  imap->l_scope[cnt] = &new->l_searchlist;
| +
| +	  from_scope = cnt;
| +	}
| +
| +      /* Print scope information.  */
| +      if (__glibc_unlikely (GLRO(dl_debug_mask) & DL_DEBUG_SCOPES))
| +	_dl_show_scope (imap, from_scope);

PS3, Line 328:

OK.

| +    }
| +}
| +
| +/* Call _dl_add_to_slotinfo with DO_ADD set to false, to allocate
| +   space in GL (dl_tls_dtv_slotinfo_list).  This can raise an
| +   exception.  */

PS3, Line 334:

Please update comment to explain what the boolean return value means.

| +static bool
| +resize_tls_slotinfo (struct link_map *new)
| +{
| +  bool any_tls = false;
| +  for (unsigned int i = 0; i < new->l_searchlist.r_nlist; ++i)
| +    {
| +      struct link_map *imap = new->l_searchlist.r_list[i];
| +
| +      /* Only add TLS memory if this object is loaded now and
| +	 therefore is not yet initialized.  */
| +      if (! imap->l_init_called && imap->l_tls_blocksize > 0)
| +	{
| +	  _dl_add_to_slotinfo (imap, false);
| +	  any_tls = true;
| +	}

PS3, Line 349:

OK, if we aren't initialized and have a non-zero tls blocksize then
add the size to slotinfo but don't update.

| +    }
| +  return any_tls;
| +}
| +
| +/* Second stage of TLS update, after resize_tls_slotinfo.  This
| +   function does not raise any exception.  It should only be called if
| +   resize_tls_slotinfo returned true.  */
| +static void
| +update_tls_slotinfo (struct link_map *new)

 ...

| @@ -217,0 +360,19 @@ update_tls_slotinfo (struct link_map *new)
| +  unsigned int first_static_tls = new->l_searchlist.r_nlist;
| +  for (unsigned int i = 0; i < new->l_searchlist.r_nlist; ++i)
| +    {
| +      struct link_map *imap = new->l_searchlist.r_list[i];
| +
| +      /* Only add TLS memory if this object is loaded now and
| +	 therefore is not yet initialized.  */
| +      if (! imap->l_init_called && imap->l_tls_blocksize > 0)
| +	{
| +	  _dl_add_to_slotinfo (imap, true);

PS3, Line 369:

OK, call _dl_add_to_slotinfo (imap, true); with true.

| +
| +	  if (imap->l_need_tls_init
| +	      && first_static_tls == new->l_searchlist.r_nlist)
| +	    first_static_tls = i;
| +	}
| +    }
| +
| +  if (__builtin_expect (++GL(dl_tls_generation) == 0, 0))
| +    _dl_fatal_printf (N_("\

 ...

| @@ -217,0 +390,23 @@ TLS generation counter wrapped!  Please report this."));
| +	  && imap->l_tls_blocksize > 0)
| +	{
| +	  /* For static TLS we have to allocate the memory here and
| +	     now, but we can delay updating the DTV.  */
| +	  imap->l_need_tls_init = 0;
| +#ifdef SHARED
| +	  /* Update the slot information data for at least the
| +	     generation of the DSO we are allocating data for.  */
| +
| +	  /* FIXME: This can terminate the process on memory
| +	     allocation failure.  It is not possible to raise
| +	     exceptions from this context; to fix this bug,
| +	     _dl_update_slotinfo would have to be split into two
| +	     operations.  */

PS3, Line 403:

I would expand this a bit: "... split into two operations e.g.
resize_scopes vs. update_scopes " Or something similar.

| +	  _dl_update_slotinfo (imap->l_tls_modid);
| +#endif
| +
| +	  GL(dl_init_static_tls) (imap);
| +	  assert (imap->l_need_tls_init == 0);
| +	}
| +    }
| +}
| +

 ...

| @@ -555,18 +633,42 @@ #ifdef SHARED
| -	  /* Update the slot information data for at least the
| -	     generation of the DSO we are allocating data for.  */
| -	  _dl_update_slotinfo (imap->l_tls_modid);
| -#endif
| -
| -	  GL(dl_init_static_tls) (imap);
| -	  assert (imap->l_need_tls_init == 0);
| -	}
| -    }
| +  /* This only performs the memory allocations.  The actual update of
| +     the scopes happens below, after failure is impossible.  */
| +  resize_scopes (new);
| +
| +  /* Increase the size of the GL (dl_tls_dtv_slotinfo_list) data
| +     structure.  */
| +  bool any_tls = resize_tls_slotinfo (new);
| +
| +  /* Perform the necessary allocations for adding new global objects
| +     to the global scope below.  */
| +  if (mode & RTLD_GLOBAL)
| +    add_to_global_resize (new);
| +
| +  /* Demarcation point: After this, no recoverable errors are allowed.
| +     All memory allocations for new objects must have happened
| +     before.  */
| +
| +  /* Second stage after resize_scopes: Actually perform the scope
| +     update.  After this, dlsym and lazy binding can bind to new
| +     objects.  */
| +  update_scopes (new);
| +
| +  /* FIXME: It is unclear whether the order here is correct.
| +     Shouldn't new objects be made available for binding (and thus
| +     execution) only after there TLS data has been set up
| +     correctly?  */
| +
| +  /* Second stage after resize_tls_slotinfo: Update the slotinfo data
| +     structures.  */
| +  if (any_tls)
| +    /* FIXME: This calls _dl_update_slotinfo, which aborts the process
| +       on memory allocation failure.  */
| +    update_tls_slotinfo (new);

PS3, Line 665:

Excellent cleanup and refactoring. We reszie_scopes to get the new
size. Then we resize slotinfo (recoverable). Then we add to the global
scope. Then we update the scopes (can't fail now).

Regarding your FIXME and TLS, I think the design is supposed to
tolerate calling __tls_get_addr() with minimal setup of TLS (which has
already happened at startup). Eventually we want to fix this though, I
think we absolutely want to do what you're doing with scopes, which is
to move the static TLS allocation (all TLS, slotinfo, static blocks,
dyanmic blocks etc) ahead of the demarcation point to avoid writing to
a TLS variable causing an out of memory error (makes TLS var access
AS-safe). So I think you should update your FIXME or maybe say
something like:
~~~
FIXME: We would like all TLS allocations to be done upfront to avoid
being able to fail when writing to a TLS variable for the first time
but we don't currently do this. This is bug 16134.
~~~
I think your second FIXME is similar in nature. Maybe talk about
splitting
_dl_update_slotinfo in the comment?

|  
|    /* Notify the debugger all new objects have been relocated.  */
|    if (relocation_in_progress)
|      LIBC_PROBE (reloc_complete, 3, args->nsid, r, new);
|  
|  #ifndef SHARED
|    DL_STATIC_INIT (new);
|  #endif
|  
| --- elf/dl-tls.c
| +++ elf/dl-tls.c
| @@ -878,18 +878,18 @@ _dl_tls_get_addr_soft (struct link_map *l)
|         to allocate this module's segment.  */
|      data = NULL;
|  
|    return data;
|  }
|  
|  
|  void
| -_dl_add_to_slotinfo (struct link_map *l)
| +_dl_add_to_slotinfo (struct link_map *l, bool do_add)

PS3, Line 886:

OK, add bool.

|  {
|    /* Now that we know the object is loaded successfully add
|       modules containing TLS data to the dtv info table.  We
|       might have to increase its size.  */
|    struct dtv_slotinfo_list *listp;
|    struct dtv_slotinfo_list *prevp;
|    size_t idx = l->l_tls_modid;
|  
|    /* Find the place in the dtv slotinfo list.  */

 ...

| @@ -939,6 +939,9 @@ cannot create TLS data structures"));
|      }
|  
|    /* Add the information into the slotinfo data structure.  */
| -  listp->slotinfo[idx].map = l;
| -  listp->slotinfo[idx].gen = GL(dl_tls_generation) + 1;
| +  if (do_add)
| +    {
| +      listp->slotinfo[idx].map = l;
| +      listp->slotinfo[idx].gen = GL(dl_tls_generation) + 1;
| +    }

PS3, Line 946:

OK. Used by dlopen to do the actual update.

|  }
| --- elf/rtld.c
| +++ elf/rtld.c
| @@ -2210,18 +2210,18 @@ #define VERNEEDTAG (DT_NUM + DT_THISPROCNUM + DT_VERSIONTAGIDX (DT_VERNEED))
|        for (struct link_map *l = main_map; l != NULL; l = l->l_next)
|  	{
|  	  l->l_relocated = 1;
|  	  if (l->l_relro_size)
|  	    _dl_protect_relro (l);
|  
|  	  /* Add object to slot information data if necessasy.  */
|  	  if (l->l_tls_blocksize != 0 && tls_init_tp_called)
| -	    _dl_add_to_slotinfo (l);
| +	    _dl_add_to_slotinfo (l, true);

PS3, Line 2218:

OK. Yup, we want to do the update here.

|  	}
|      }
|    else
|      {
|        /* Now we have all the objects loaded.  Relocate them all except for
|  	 the dynamic linker itself.  We do this in reverse order so that copy
|  	 relocs of earlier objects overwrite the data written by later
|  	 objects.  We do not re-relocate the dynamic linker itself in this
|  	 loop because that could result in the GOT entries for functions we

 ...

| @@ -2255,18 +2255,18 @@ #define VERNEEDTAG (DT_NUM + DT_THISPROCNUM + DT_VERSIONTAGIDX (DT_VERNEED))
|  	  l->l_free_initfini = 0;
|  
|  	  if (l != &GL(dl_rtld_map))
|  	    _dl_relocate_object (l, l->l_scope, GLRO(dl_lazy) ? RTLD_LAZY : 0,
|  				 consider_profiling);
|  
|  	  /* Add object to slot information data if necessasy.  */
|  	  if (l->l_tls_blocksize != 0 && tls_init_tp_called)
| -	    _dl_add_to_slotinfo (l);
| +	    _dl_add_to_slotinfo (l, true);

PS3, Line 2263:

OK. Likewise we want to do the update here too.

|  	}
|        rtld_timer_stop (&relocate_time, start);
|  
|        /* Now enable profiling if needed.  Like the previous call,
|  	 this has to go here because the calls it makes should use the
|  	 rtld versions of the functions (particularly calloc()), but it
|  	 needs to have _dl_profile_map set up by the relocator.  */
|        if (__glibc_unlikely (GL(dl_profile_map) != NULL))
|  	/* We must prepare the profiling.  */
| --- sysdeps/generic/ldsodefs.h
| +++ sysdeps/generic/ldsodefs.h
| @@ -1145,10 +1144,18 @@ /* Add module to slot information data.  */
| -extern void _dl_add_to_slotinfo (struct link_map  *l) attribute_hidden;
| +
| +/* Add module to slot information data.  If DO_ADD is false, only the
| +   required memory is allocated.  Must be called with GL
| +   (dl_load_lock) acquired.  If the function has already been called
| +   for the link map L with !do_add, then this function will not raise
| +   an exception, otherwise it is possible that it encounters a memory
| +   allocation failure.  */
| +extern void _dl_add_to_slotinfo (struct link_map *l, bool do_add)
| +  attribute_hidden;

PS3, Line 1152:

OK, just adding info about do_add.

|  
|  /* Update slot information data for at least the generation of the
|     module with the given index.  */
|  extern struct link_map *_dl_update_slotinfo (unsigned long int req_modid)
|       attribute_hidden;
|  
|  /* Look up the module's TLS block as for __tls_get_addr,
|     but never touch anything.  Return null if it's not allocated yet.  */
|  extern void *_dl_tls_get_addr_soft (struct link_map *l) attribute_hidden;
Carlos O'Donell (Code Review) Nov. 27, 2019, 3:56 p.m. UTC | #3
Florian Weimer has posted comments on this change.

Change URL: https://gnutoolchain-gerrit.osci.io/r/c/glibc/+/468
......................................................................


Patch Set 4:

(11 comments)

| --- elf/dl-open.c
| +++ elf/dl-open.c
| @@ -208,9 +209,26 @@ _dl_find_dso_for_object (const ElfW(Addr) addr)
|  	      || _dl_addr_inside_object (l, (ElfW(Addr)) addr)))
|  	{
|  	  assert (ns == l->l_ns);
|  	  return l;
|  	}
|    return NULL;
|  }
|  rtld_hidden_def (_dl_find_dso_for_object);
|  
| +/* Returned from scope_size if the new map was found in the existing
| +   scope.  */
| +enum { scope_size_found_new = (size_t) -1 };

PS3, Line 220:

Done

| +
| +/* Return the length of the scope for MAP.  If NEW->l_searchlist is
| +   found in the scope, return scope_size_found_new.  */
| +static size_t
| +scope_size (struct link_map *map, struct link_map *new)

PS3, Line 225:

I would be surprised if GCC can optimize this, but I've made the
change.

| +{
| +  size_t cnt;
| +  for (cnt = 0; map->l_scope[cnt] != NULL; ++cnt)
| +    if (map->l_scope[cnt] == &new->l_searchlist)
| +      return scope_size_found_new;
| +  return cnt;
| +}
| +
| +/* Resize the scopes of depended-upon objects, so that the new object

 ...

| @@ -217,0 +242,38 @@ resize_scopes (struct link_map *new)
| +  for (unsigned int i = 0; i < new->l_searchlist.r_nlist; ++i)
| +    {
| +      struct link_map *imap = new->l_searchlist.r_list[i];
| +
| +      /* If the initializer has been called already, the object has
| +	 not been loaded here and now.  */
| +      if (imap->l_init_called && imap->l_type == lt_loaded)
| +	{
| +	  size_t cnt = scope_size (imap, new);
| +	  if (cnt == scope_size_found_new)
| +	    /* Avoid duplicates.  */
| +	    continue;

PS3, Line 253:

Done

| +
| +	  if (__glibc_unlikely (cnt + 1 >= imap->l_scope_max))
| +	    {
| +	      /* The 'r_scope' array is too small.  Allocate a new one
| +		 dynamically.  */

PS3, Line 258:

Done

| +	      size_t new_size;
| +	      struct r_scope_elem **newp;
| +
| +	      if (imap->l_scope != imap->l_scope_mem
| +		  && imap->l_scope_max < array_length (imap->l_scope_mem))
| +		{
| +		  new_size = array_length (imap->l_scope_mem);
| +		  newp = imap->l_scope_mem;

PS3, Line 266:

Done

| +		}
| +	      else
| +		{
| +		  new_size = imap->l_scope_max * 2;

PS3, Line 270:

I think this should be a separate change. It's unrelated to the
NODELETE bug.

| +		  newp = (struct r_scope_elem **)
| +		    malloc (new_size * sizeof (struct r_scope_elem *));
| +		  if (newp == NULL)
| +		    _dl_signal_error (ENOMEM, "dlopen", NULL,
| +				      N_("cannot create scope list"));
| +		}
| +
| +	      /* Copy the array and the terminating NULL.  */
| +	      memcpy (newp, imap->l_scope,

 ...

| @@ -217,0 +300,44 @@ static void
| +update_scopes (struct link_map *new)
| +{
| +  for (unsigned int i = 0; i < new->l_searchlist.r_nlist; ++i)
| +    {
| +      struct link_map *imap = new->l_searchlist.r_list[i];
| +      int from_scope = 0;
| +
| +      if (imap->l_init_called && imap->l_type == lt_loaded)
| +	{
| +	  size_t cnt = scope_size (imap, new);
| +	  if (cnt == scope_size_found_new)
| +	    /* Avoid duplicates.  */
| +	    continue;

PS3, Line 312:

Done

| +
| +	  assert (cnt + 1 < imap->l_scope_max);

PS3, Line 314:

Done

| +
| +	  /* First terminate the extended list.  Otherwise a thread
| +	     might use the new last element and then use the garbage
| +	     at offset IDX+1.  */
| +	  imap->l_scope[cnt + 1] = NULL;
| +	  atomic_write_barrier ();
| +	  imap->l_scope[cnt] = &new->l_searchlist;
| +
| +	  from_scope = cnt;
| +	}
| +
| +      /* Print scope information.  */
| +      if (__glibc_unlikely (GLRO(dl_debug_mask) & DL_DEBUG_SCOPES))
| +	_dl_show_scope (imap, from_scope);
| +    }
| +}
| +
| +/* Call _dl_add_to_slotinfo with DO_ADD set to false, to allocate
| +   space in GL (dl_tls_dtv_slotinfo_list).  This can raise an
| +   exception.  */

PS3, Line 334:

Done

| +static bool
| +resize_tls_slotinfo (struct link_map *new)
| +{
| +  bool any_tls = false;
| +  for (unsigned int i = 0; i < new->l_searchlist.r_nlist; ++i)
| +    {
| +      struct link_map *imap = new->l_searchlist.r_list[i];
| +
| +      /* Only add TLS memory if this object is loaded now and

 ...

| @@ -217,0 +390,23 @@ TLS generation counter wrapped!  Please report this."));
| +	  && imap->l_tls_blocksize > 0)
| +	{
| +	  /* For static TLS we have to allocate the memory here and
| +	     now, but we can delay updating the DTV.  */
| +	  imap->l_need_tls_init = 0;
| +#ifdef SHARED
| +	  /* Update the slot information data for at least the
| +	     generation of the DSO we are allocating data for.  */
| +
| +	  /* FIXME: This can terminate the process on memory
| +	     allocation failure.  It is not possible to raise
| +	     exceptions from this context; to fix this bug,
| +	     _dl_update_slotinfo would have to be split into two
| +	     operations.  */

PS3, Line 403:

Done

| +	  _dl_update_slotinfo (imap->l_tls_modid);
| +#endif
| +
| +	  GL(dl_init_static_tls) (imap);
| +	  assert (imap->l_need_tls_init == 0);
| +	}
| +    }
| +}
| +

 ...

| @@ -555,18 +633,42 @@ #ifdef SHARED
| -	  /* Update the slot information data for at least the
| -	     generation of the DSO we are allocating data for.  */
| -	  _dl_update_slotinfo (imap->l_tls_modid);
| -#endif
| -
| -	  GL(dl_init_static_tls) (imap);
| -	  assert (imap->l_need_tls_init == 0);
| -	}
| -    }
| +  /* This only performs the memory allocations.  The actual update of
| +     the scopes happens below, after failure is impossible.  */
| +  resize_scopes (new);
| +
| +  /* Increase the size of the GL (dl_tls_dtv_slotinfo_list) data
| +     structure.  */
| +  bool any_tls = resize_tls_slotinfo (new);
| +
| +  /* Perform the necessary allocations for adding new global objects
| +     to the global scope below.  */
| +  if (mode & RTLD_GLOBAL)
| +    add_to_global_resize (new);
| +
| +  /* Demarcation point: After this, no recoverable errors are allowed.
| +     All memory allocations for new objects must have happened
| +     before.  */
| +
| +  /* Second stage after resize_scopes: Actually perform the scope
| +     update.  After this, dlsym and lazy binding can bind to new
| +     objects.  */
| +  update_scopes (new);
| +
| +  /* FIXME: It is unclear whether the order here is correct.
| +     Shouldn't new objects be made available for binding (and thus
| +     execution) only after there TLS data has been set up
| +     correctly?  */
| +
| +  /* Second stage after resize_tls_slotinfo: Update the slotinfo data
| +     structures.  */
| +  if (any_tls)
| +    /* FIXME: This calls _dl_update_slotinfo, which aborts the process
| +       on memory allocation failure.  */
| +    update_tls_slotinfo (new);

PS3, Line 665:

Done

|  
|    /* Notify the debugger all new objects have been relocated.  */
|    if (relocation_in_progress)
|      LIBC_PROBE (reloc_complete, 3, args->nsid, r, new);
|  
|  #ifndef SHARED
|    DL_STATIC_INIT (new);
|  #endif
|
Carlos O'Donell (Code Review) Nov. 27, 2019, 4:16 p.m. UTC | #4
Carlos O'Donell has posted comments on this change.

Change URL: https://gnutoolchain-gerrit.osci.io/r/c/glibc/+/468
......................................................................


Patch Set 4: Code-Review+2

(1 comment)

This version looks good. Thanks for fixing the few nits and correcting comments. Sorry this took so long to review, this is some difficult code with few comments. Your work here has made this better for all future reviewers.

OK for master.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

| --- elf/dl-open.c
| +++ elf/dl-open.c
| @@ -217,0 +261,19 @@ resize_scopes (struct link_map *new)
| +
| +	      if (imap->l_scope != imap->l_scope_mem
| +		  && imap->l_scope_max < array_length (imap->l_scope_mem))
| +		{
| +		  new_size = array_length (imap->l_scope_mem);
| +		  newp = imap->l_scope_mem;
| +		}
| +	      else
| +		{
| +		  new_size = imap->l_scope_max * 2;

PS3, Line 270:

Your right. Sounds good to me.

| +		  newp = (struct r_scope_elem **)
| +		    malloc (new_size * sizeof (struct r_scope_elem *));
| +		  if (newp == NULL)
| +		    _dl_signal_error (ENOMEM, "dlopen", NULL,
| +				      N_("cannot create scope list"));
| +		}
| +
| +	      /* Copy the array and the terminating NULL.  */
| +	      memcpy (newp, imap->l_scope,

Patch
diff mbox series

diff --git a/elf/dl-open.c b/elf/dl-open.c
index aad22ef..a305fef 100644
--- a/elf/dl-open.c
+++ b/elf/dl-open.c
@@ -33,6 +33,7 @@ 
 #include <stap-probe.h>
 #include <atomic.h>
 #include <libc-internal.h>
+#include <array_length.h>
 
 #include <dl-dst.h>
 #include <dl-prop.h>
@@ -214,6 +215,201 @@ 
 }
 rtld_hidden_def (_dl_find_dso_for_object);
 
+/* Returned from scope_size if the new map was found in the existing
+   scope.  */
+enum { scope_size_found_new = (size_t) -1 };
+
+/* Return the length of the scope for MAP.  If NEW->l_searchlist is
+   found in the scope, return scope_size_found_new.  */
+static size_t
+scope_size (struct link_map *map, struct link_map *new)
+{
+  size_t cnt;
+  for (cnt = 0; map->l_scope[cnt] != NULL; ++cnt)
+    if (map->l_scope[cnt] == &new->l_searchlist)
+      return scope_size_found_new;
+  return cnt;
+}
+
+/* Resize the scopes of depended-upon objects, so that the new object
+   can be added later without further allocation of memory.  This
+   function can raise an exceptions due to malloc failure.  */
+static void
+resize_scopes (struct link_map *new)
+{
+  /* If the file is not loaded now as a dependency, add the search
+     list of the newly loaded object to the scope.  */
+  for (unsigned int i = 0; i < new->l_searchlist.r_nlist; ++i)
+    {
+      struct link_map *imap = new->l_searchlist.r_list[i];
+
+      /* If the initializer has been called already, the object has
+	 not been loaded here and now.  */
+      if (imap->l_init_called && imap->l_type == lt_loaded)
+	{
+	  size_t cnt = scope_size (imap, new);
+	  if (cnt == scope_size_found_new)
+	    /* Avoid duplicates.  */
+	    continue;
+
+	  if (__glibc_unlikely (cnt + 1 >= imap->l_scope_max))
+	    {
+	      /* The 'r_scope' array is too small.  Allocate a new one
+		 dynamically.  */
+	      size_t new_size;
+	      struct r_scope_elem **newp;
+
+	      if (imap->l_scope != imap->l_scope_mem
+		  && imap->l_scope_max < array_length (imap->l_scope_mem))
+		{
+		  new_size = array_length (imap->l_scope_mem);
+		  newp = imap->l_scope_mem;
+		}
+	      else
+		{
+		  new_size = imap->l_scope_max * 2;
+		  newp = (struct r_scope_elem **)
+		    malloc (new_size * sizeof (struct r_scope_elem *));
+		  if (newp == NULL)
+		    _dl_signal_error (ENOMEM, "dlopen", NULL,
+				      N_("cannot create scope list"));
+		}
+
+	      /* Copy the array and the terminating NULL.  */
+	      memcpy (newp, imap->l_scope,
+		      (cnt + 1) * sizeof (imap->l_scope[0]));
+	      struct r_scope_elem **old = imap->l_scope;
+
+	      imap->l_scope = newp;
+
+	      if (old != imap->l_scope_mem)
+		_dl_scope_free (old);
+
+	      imap->l_scope_max = new_size;
+	    }
+	}
+    }
+}
+
+/* Second stage of resize_scopes: Add NEW to the scopes.  Also print
+   debugging information about scopes if requested.
+
+   This function cannot raise an exception because all required memory
+   has been allocated by a previous call to resize_scopes.  */
+static void
+update_scopes (struct link_map *new)
+{
+  for (unsigned int i = 0; i < new->l_searchlist.r_nlist; ++i)
+    {
+      struct link_map *imap = new->l_searchlist.r_list[i];
+      int from_scope = 0;
+
+      if (imap->l_init_called && imap->l_type == lt_loaded)
+	{
+	  size_t cnt = scope_size (imap, new);
+	  if (cnt == scope_size_found_new)
+	    /* Avoid duplicates.  */
+	    continue;
+
+	  assert (cnt + 1 < imap->l_scope_max);
+
+	  /* First terminate the extended list.  Otherwise a thread
+	     might use the new last element and then use the garbage
+	     at offset IDX+1.  */
+	  imap->l_scope[cnt + 1] = NULL;
+	  atomic_write_barrier ();
+	  imap->l_scope[cnt] = &new->l_searchlist;
+
+	  from_scope = cnt;
+	}
+
+      /* Print scope information.  */
+      if (__glibc_unlikely (GLRO(dl_debug_mask) & DL_DEBUG_SCOPES))
+	_dl_show_scope (imap, from_scope);
+    }
+}
+
+/* Call _dl_add_to_slotinfo with DO_ADD set to false, to allocate
+   space in GL (dl_tls_dtv_slotinfo_list).  This can raise an
+   exception.  */
+static bool
+resize_tls_slotinfo (struct link_map *new)
+{
+  bool any_tls = false;
+  for (unsigned int i = 0; i < new->l_searchlist.r_nlist; ++i)
+    {
+      struct link_map *imap = new->l_searchlist.r_list[i];
+
+      /* Only add TLS memory if this object is loaded now and
+	 therefore is not yet initialized.  */
+      if (! imap->l_init_called && imap->l_tls_blocksize > 0)
+	{
+	  _dl_add_to_slotinfo (imap, false);
+	  any_tls = true;
+	}
+    }
+  return any_tls;
+}
+
+/* Second stage of TLS update, after resize_tls_slotinfo.  This
+   function does not raise any exception.  It should only be called if
+   resize_tls_slotinfo returned true.  */
+static void
+update_tls_slotinfo (struct link_map *new)
+{
+  unsigned int first_static_tls = new->l_searchlist.r_nlist;
+  for (unsigned int i = 0; i < new->l_searchlist.r_nlist; ++i)
+    {
+      struct link_map *imap = new->l_searchlist.r_list[i];
+
+      /* Only add TLS memory if this object is loaded now and
+	 therefore is not yet initialized.  */
+      if (! imap->l_init_called && imap->l_tls_blocksize > 0)
+	{
+	  _dl_add_to_slotinfo (imap, true);
+
+	  if (imap->l_need_tls_init
+	      && first_static_tls == new->l_searchlist.r_nlist)
+	    first_static_tls = i;
+	}
+    }
+
+  if (__builtin_expect (++GL(dl_tls_generation) == 0, 0))
+    _dl_fatal_printf (N_("\
+TLS generation counter wrapped!  Please report this."));
+
+  /* We need a second pass for static tls data, because
+     _dl_update_slotinfo must not be run while calls to
+     _dl_add_to_slotinfo are still pending.  */
+  for (unsigned int i = first_static_tls; i < new->l_searchlist.r_nlist; ++i)
+    {
+      struct link_map *imap = new->l_searchlist.r_list[i];
+
+      if (imap->l_need_tls_init
+	  && ! imap->l_init_called
+	  && imap->l_tls_blocksize > 0)
+	{
+	  /* For static TLS we have to allocate the memory here and
+	     now, but we can delay updating the DTV.  */
+	  imap->l_need_tls_init = 0;
+#ifdef SHARED
+	  /* Update the slot information data for at least the
+	     generation of the DSO we are allocating data for.  */
+
+	  /* FIXME: This can terminate the process on memory
+	     allocation failure.  It is not possible to raise
+	     exceptions from this context; to fix this bug,
+	     _dl_update_slotinfo would have to be split into two
+	     operations.  */
+	  _dl_update_slotinfo (imap->l_tls_modid);
+#endif
+
+	  GL(dl_init_static_tls) (imap);
+	  assert (imap->l_need_tls_init == 0);
+	}
+    }
+}
+
 /* struct dl_init_args and call_dl_init are used to call _dl_init with
    exception handling disabled.  */
 struct dl_init_args
@@ -431,133 +627,39 @@ 
      relocation.  */
   _dl_open_check (new);
 
-  /* If the file is not loaded now as a dependency, add the search
-     list of the newly loaded object to the scope.  */
-  bool any_tls = false;
-  unsigned int first_static_tls = new->l_searchlist.r_nlist;
-  for (unsigned int i = 0; i < new->l_searchlist.r_nlist; ++i)
-    {
-      struct link_map *imap = new->l_searchlist.r_list[i];
-      int from_scope = 0;
+  /* This only performs the memory allocations.  The actual update of
+     the scopes happens below, after failure is impossible.  */
+  resize_scopes (new);
 
-      /* If the initializer has been called already, the object has
-	 not been loaded here and now.  */
-      if (imap->l_init_called && imap->l_type == lt_loaded)
-	{
-	  struct r_scope_elem **runp = imap->l_scope;
-	  size_t cnt = 0;
+  /* Increase the size of the GL (dl_tls_dtv_slotinfo_list) data
+     structure.  */
+  bool any_tls = resize_tls_slotinfo (new);
 
-	  while (*runp != NULL)
-	    {
-	      if (*runp == &new->l_searchlist)
-		break;
-	      ++cnt;
-	      ++runp;
-	    }
+  /* Perform the necessary allocations for adding new global objects
+     to the global scope below.  */
+  if (mode & RTLD_GLOBAL)
+    add_to_global_resize (new);
 
-	  if (*runp != NULL)
-	    /* Avoid duplicates.  */
-	    continue;
+  /* Demarcation point: After this, no recoverable errors are allowed.
+     All memory allocations for new objects must have happened
+     before.  */
 
-	  if (__glibc_unlikely (cnt + 1 >= imap->l_scope_max))
-	    {
-	      /* The 'r_scope' array is too small.  Allocate a new one
-		 dynamically.  */
-	      size_t new_size;
-	      struct r_scope_elem **newp;
+  /* Second stage after resize_scopes: Actually perform the scope
+     update.  After this, dlsym and lazy binding can bind to new
+     objects.  */
+  update_scopes (new);
 
-#define SCOPE_ELEMS(imap) \
-  (sizeof (imap->l_scope_mem) / sizeof (imap->l_scope_mem[0]))
+  /* FIXME: It is unclear whether the order here is correct.
+     Shouldn't new objects be made available for binding (and thus
+     execution) only after there TLS data has been set up
+     correctly?  */
 
-	      if (imap->l_scope != imap->l_scope_mem
-		  && imap->l_scope_max < SCOPE_ELEMS (imap))
-		{
-		  new_size = SCOPE_ELEMS (imap);
-		  newp = imap->l_scope_mem;
-		}
-	      else
-		{
-		  new_size = imap->l_scope_max * 2;
-		  newp = (struct r_scope_elem **)
-		    malloc (new_size * sizeof (struct r_scope_elem *));
-		  if (newp == NULL)
-		    _dl_signal_error (ENOMEM, "dlopen", NULL,
-				      N_("cannot create scope list"));
-		}
-
-	      memcpy (newp, imap->l_scope, cnt * sizeof (imap->l_scope[0]));
-	      struct r_scope_elem **old = imap->l_scope;
-
-	      imap->l_scope = newp;
-
-	      if (old != imap->l_scope_mem)
-		_dl_scope_free (old);
-
-	      imap->l_scope_max = new_size;
-	    }
-
-	  /* First terminate the extended list.  Otherwise a thread
-	     might use the new last element and then use the garbage
-	     at offset IDX+1.  */
-	  imap->l_scope[cnt + 1] = NULL;
-	  atomic_write_barrier ();
-	  imap->l_scope[cnt] = &new->l_searchlist;
-
-	  /* Print only new scope information.  */
-	  from_scope = cnt;
-	}
-      /* Only add TLS memory if this object is loaded now and
-	 therefore is not yet initialized.  */
-      else if (! imap->l_init_called
-	       /* Only if the module defines thread local data.  */
-	       && __builtin_expect (imap->l_tls_blocksize > 0, 0))
-	{
-	  /* Now that we know the object is loaded successfully add
-	     modules containing TLS data to the slot info table.  We
-	     might have to increase its size.  */
-	  _dl_add_to_slotinfo (imap);
-
-	  if (imap->l_need_tls_init
-	      && first_static_tls == new->l_searchlist.r_nlist)
-	    first_static_tls = i;
-
-	  /* We have to bump the generation counter.  */
-	  any_tls = true;
-	}
-
-      /* Print scope information.  */
-      if (__glibc_unlikely (GLRO(dl_debug_mask) & DL_DEBUG_SCOPES))
-	_dl_show_scope (imap, from_scope);
-    }
-
-  /* Bump the generation number if necessary.  */
-  if (any_tls && __builtin_expect (++GL(dl_tls_generation) == 0, 0))
-    _dl_fatal_printf (N_("\
-TLS generation counter wrapped!  Please report this."));
-
-  /* We need a second pass for static tls data, because _dl_update_slotinfo
-     must not be run while calls to _dl_add_to_slotinfo are still pending.  */
-  for (unsigned int i = first_static_tls; i < new->l_searchlist.r_nlist; ++i)
-    {
-      struct link_map *imap = new->l_searchlist.r_list[i];
-
-      if (imap->l_need_tls_init
-	  && ! imap->l_init_called
-	  && imap->l_tls_blocksize > 0)
-	{
-	  /* For static TLS we have to allocate the memory here and
-	     now, but we can delay updating the DTV.  */
-	  imap->l_need_tls_init = 0;
-#ifdef SHARED
-	  /* Update the slot information data for at least the
-	     generation of the DSO we are allocating data for.  */
-	  _dl_update_slotinfo (imap->l_tls_modid);
-#endif
-
-	  GL(dl_init_static_tls) (imap);
-	  assert (imap->l_need_tls_init == 0);
-	}
-    }
+  /* Second stage after resize_tls_slotinfo: Update the slotinfo data
+     structures.  */
+  if (any_tls)
+    /* FIXME: This calls _dl_update_slotinfo, which aborts the process
+       on memory allocation failure.  */
+    update_tls_slotinfo (new);
 
   /* Notify the debugger all new objects have been relocated.  */
   if (relocation_in_progress)
diff --git a/elf/dl-tls.c b/elf/dl-tls.c
index a4b0529..65d3520 100644
--- a/elf/dl-tls.c
+++ b/elf/dl-tls.c
@@ -883,7 +883,7 @@ 
 
 
 void
-_dl_add_to_slotinfo (struct link_map *l)
+_dl_add_to_slotinfo (struct link_map *l, bool do_add)
 {
   /* Now that we know the object is loaded successfully add
      modules containing TLS data to the dtv info table.  We
@@ -939,6 +939,9 @@ 
     }
 
   /* Add the information into the slotinfo data structure.  */
-  listp->slotinfo[idx].map = l;
-  listp->slotinfo[idx].gen = GL(dl_tls_generation) + 1;
+  if (do_add)
+    {
+      listp->slotinfo[idx].map = l;
+      listp->slotinfo[idx].gen = GL(dl_tls_generation) + 1;
+    }
 }
diff --git a/elf/rtld.c b/elf/rtld.c
index 8a6e1a1..0f185e7 100644
--- a/elf/rtld.c
+++ b/elf/rtld.c
@@ -2214,7 +2214,7 @@ 
 
 	  /* Add object to slot information data if necessasy.  */
 	  if (l->l_tls_blocksize != 0 && tls_init_tp_called)
-	    _dl_add_to_slotinfo (l);
+	    _dl_add_to_slotinfo (l, true);
 	}
     }
   else
@@ -2259,7 +2259,7 @@ 
 
 	  /* Add object to slot information data if necessasy.  */
 	  if (l->l_tls_blocksize != 0 && tls_init_tp_called)
-	    _dl_add_to_slotinfo (l);
+	    _dl_add_to_slotinfo (l, true);
 	}
       rtld_timer_stop (&relocate_time, start);
 
diff --git a/sysdeps/generic/ldsodefs.h b/sysdeps/generic/ldsodefs.h
index ed0aed7..c546757 100644
--- a/sysdeps/generic/ldsodefs.h
+++ b/sysdeps/generic/ldsodefs.h
@@ -1140,8 +1140,15 @@ 
    old scope, OLD can't be freed until no thread is using it.  */
 extern int _dl_scope_free (void *) attribute_hidden;
 
-/* Add module to slot information data.  */
-extern void _dl_add_to_slotinfo (struct link_map  *l) attribute_hidden;
+
+/* Add module to slot information data.  If DO_ADD is false, only the
+   required memory is allocated.  Must be called with GL
+   (dl_load_lock) acquired.  If the function has already been called
+   for the link map L with !do_add, then this function will not raise
+   an exception, otherwise it is possible that it encounters a memory
+   allocation failure.  */
+extern void _dl_add_to_slotinfo (struct link_map *l, bool do_add)
+  attribute_hidden;
 
 /* Update slot information data for at least the generation of the
    module with the given index.  */