Message ID | 20210209171839.7911-1-vivek@collabora.com |
---|---|
Series | Implementation of RTLD_SHARED for dlmopen |
On Tue, 9 Feb 2021, Vivek Das Mohapatra via Libc-alpha wrote:

> This is a revision of a previous patchset that I posted here
> regarding https://sourceware.org/bugzilla/show_bug.cgi?id=22745

Please mention the bug number in the proposed commit messages so that
commits get properly filed in Bugzilla. Given the size of the patch
series I wonder if it should also add a NEWS entry for the user-visible
change (beyond the automatically generated entry for a bug fix, which is
produced if the bug is RESOLVED/FIXED with the correct target milestone
set at the time the release is made).

> Please mention the bug number in the proposed commit messages so that
> commits get properly filed in Bugzilla. Given the size of the patch

Any particular format? And do you want the bug # in every commit message?

On Wed, 10 Feb 2021, Vivek Das Mohapatra via Libc-alpha wrote:

> > Please mention the bug number in the proposed commit messages so that
> > commits get properly filed in Bugzilla. Given the size of the patch
>
> Any particular format? And do you want the bug # in every commit message?

There are various forms such as "bug 12345" or "BZ #12345" that are
accepted by the commit processing. Any commit that can be considered to
be the one fixing the bug should definitely mention the number and be
explicit that it is fixing that bug. Any commit mentioning the number
without fixing the bug should be clear that it is relevant to the bug,
but not by itself fixing the bug.

On 09/02/2021 14:18, Vivek Das Mohapatra via Libc-alpha wrote:

> This is a revision of a previous patchset that I posted here
> regarding https://sourceware.org/bugzilla/show_bug.cgi?id=22745
>
> Introduction:
>
> =======================================================================
> As discussed in the URL above dlmopen requires a mechanism for
> [optionally] sharing some objects between more than one namespace.
>
> The following patchset provides an implementation for this: If an
> object is loaded with the new RTLD_SHARED flag we instead ensure
> that a "master" copy exists (and is flagged as no-delete) in the
> main namespace and a thin wrapper or clone is placed in the target
> namespace.
>
> This patch series should address all the comments received on the
> earlier (v1) series, and fixes a bug in the previous (v2) series
> which left the r_debug struct in an inconsistent state when creating
> a proxy triggered the initial load of a DSO into the main namespace.
> =======================================================================
>
> In addition this patch series implements the following:
>
>  - dlmopen will implicitly apply RTLD_SHARED to the libc/libpthread group
>    (requires a patched binutils/ld so that the libc family DSOs can
>    be flagged as requiring this behaviour)
>
>    - binutils patchset accepted upstream;
>      - https://sourceware.org/git/?p=binutils-gdb.git
>      - commit 8a87b2791181eb7fc1533ffaeb95df8d87d41493
>
>  - LD_AUDIT paths will NOT apply this implicit sharing rule:
>    audit libraries will continue to be completely isolated.
>
>  - The mechanism for tagging DSOs as implicitly shared has been changed
>    from a DT_FLAGS_1 flag to a DT_VALRNGHI/LO range dynamic section tag.
>    (Based on feedback on the binutils side of this patch series).
>
>    - DT_GNU_FLAGS_1/DF_GNU_1_UNIQUE
>
>  - A flag RTLD_ISOLATE which is used internally to suppress RTLD_SHARED
>    behaviour when audit libraries are being loaded, and is also made
>    available to users who really want a completely separate copy of
>    glibc in their new namespace.
>
>  - Tests for the new dlmopen behaviour
>
>  - Adds the unique dso flag to htl/libpthread.so as well as nptl

I will try to start the review of this patchset next week. I am still
reading all the history and the provided links, but the idea is sound,
and my sniff tests did not trigger any warnings. Carlos started to
review it in previous iterations, so I take it he agrees this should be
a good addition as well.

> I have not yet implemented, but plan to address once this series is
> accepted/acceptable:
>
>  - Sensible RTLD_GLOBAL semantics for dlmopened DSOs in non-base namespaces

Could you expand on what semantics you have in mind and why such an
extension would be useful?

>  - dl_iterate_ns_phdr (cf dl_iterate_phdr but taking a namespace argument)

I think if we agree to add this extension we might also work around the
current issues of dl_iterate_phdr. If I recall correctly, Florian has
raised the current scalability issues and proposed a better replacement
in a previous Cauldron.

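The master/proxy arrangement described in the cover letter can be pictured with a small toy model. This is only an illustrative sketch: `toy_map`, `toy_ns`, `toy_load_shared` and the fixed-size tables are hypothetical names invented here, and they make no claim about glibc's real `struct link_map` or about the code in the patches. Namespace 0 stands in for the main namespace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_MAPS 8   /* arbitrary capacity, toy model only */
#define TOY_MAX_NS   4   /* namespace 0 plays the main namespace */

/* Hypothetical stand-in for a link map entry. */
struct toy_map {
    const char *name;
    bool nodelete;          /* set on the master copy */
    struct toy_map *real;   /* non-NULL for a proxy: points at the master */
};

struct toy_ns {
    struct toy_map maps[TOY_MAX_MAPS];
    size_t count;
};

static struct toy_ns toy_namespaces[TOY_MAX_NS];

/* Find an entry by name in one namespace, or return NULL. */
static struct toy_map *toy_find(struct toy_ns *ns, const char *name)
{
    for (size_t i = 0; i < ns->count; i++)
        if (strcmp(ns->maps[i].name, name) == 0)
            return &ns->maps[i];
    return NULL;
}

/* Append a fresh, non-proxy entry; returns NULL when the table is full. */
static struct toy_map *toy_add(struct toy_ns *ns, const char *name)
{
    if (ns->count == TOY_MAX_MAPS)
        return NULL;
    struct toy_map *m = &ns->maps[ns->count++];
    m->name = name;
    m->nodelete = false;
    m->real = NULL;
    return m;
}

/* Model of a shared load into namespace nsid: make sure a no-delete
   master exists in namespace 0, then hand out a thin proxy that refers
   to it from the target namespace. */
static struct toy_map *toy_load_shared(size_t nsid, const char *name)
{
    assert(nsid < TOY_MAX_NS);
    struct toy_ns *main_ns = &toy_namespaces[0];

    struct toy_map *master = toy_find(main_ns, name);
    if (master == NULL) {
        master = toy_add(main_ns, name);
        if (master == NULL)
            return NULL;
    }
    master->nodelete = true;

    if (nsid == 0)
        return master;   /* no proxy needed in the main namespace */

    struct toy_ns *ns = &toy_namespaces[nsid];
    struct toy_map *proxy = toy_find(ns, name);
    if (proxy == NULL) {
        proxy = toy_add(ns, name);
        if (proxy == NULL)
            return NULL;
        proxy->real = master;
    }
    return proxy;
}
```

Loading the same name into two secondary namespaces yields two distinct proxies whose `real` pointers refer to the single master in namespace 0, which is the property the cover letter describes.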
* Adhemerval Zanella:

>> - dl_iterate_ns_phdr (cf dl_iterate_phdr but taking a namespace argument)
>
> I think if we agree to add this extension we might also work around the
> current issues of dl_iterate_phdr. If I recall correctly, Florian has
> raised the current scalability issues and proposed a better replacement
> in a previous Cauldron.

Iterating over link maps in a namespace is still independently useful.
I just don't think it's the right interface for locating exception
unwinding information based on a code address.

Thanks,
Florian

On 12/02/2021 15:56, Florian Weimer wrote:

> * Adhemerval Zanella:
>
>>> - dl_iterate_ns_phdr (cf dl_iterate_phdr but taking a namespace argument)
>>
>> I think if we agree to add this extension we might also work around the
>> current issues of dl_iterate_phdr. If I recall correctly, Florian has
>> raised the current scalability issues and proposed a better replacement
>> in a previous Cauldron.
>
> Iterating over link maps in a namespace is still independently useful.
> I just don't think it's the right interface for locating exception
> unwinding information based on a code address.

If I recall correctly you also brought up a scalability issue due to the
internal locking, or am I missing something?

* Adhemerval Zanella:

> On 12/02/2021 15:56, Florian Weimer wrote:
>> * Adhemerval Zanella:
>>
>>>> - dl_iterate_ns_phdr (cf dl_iterate_phdr but taking a namespace argument)
>>>
>>> I think if we agree to add this extension we might also work around the
>>> current issues of dl_iterate_phdr. If I recall correctly, Florian has
>>> raised the current scalability issues and proposed a better replacement
>>> in a previous Cauldron.
>>
>> Iterating over link maps in a namespace is still independently useful.
>> I just don't think it's the right interface for locating exception
>> unwinding information based on a code address.
>
> If I recall correctly you also brought up a scalability issue due to the
> internal locking, or am I missing something?

Yes, but it's hard to tell if it matters to other use cases for iteration
without looking at them individually.

Thanks,
Florian

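For readers who have not used the existing interface that dl_iterate_ns_phdr would be modelled on, here is a minimal sketch of a dl_iterate_phdr walk. Only `dl_iterate_phdr` and `struct dl_phdr_info` are real glibc API; the helper names `count_object` and `count_loaded_objects` are made up for this example. To my understanding glibc reports only the objects of the caller's own namespace here, which is the gap a namespace-aware variant would fill, but treat that as an assumption rather than a statement from the thread.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <link.h>
#include <stddef.h>
#include <stdio.h>

/* Callback invoked once per loaded object; returning 0 continues the walk. */
static int count_object(struct dl_phdr_info *info, size_t size, void *data)
{
    (void) size;
    size_t *count = data;

    /* The main program is reported with an empty name. */
    const char *name = (info->dlpi_name != NULL && info->dlpi_name[0] != '\0')
                           ? info->dlpi_name
                           : "(main program)";
    printf("%s: %u program headers\n", name, (unsigned) info->dlpi_phnum);

    ++*count;
    return 0;
}

/* Count the objects that dl_iterate_phdr reports to this caller. */
static size_t count_loaded_objects(void)
{
    size_t count = 0;
    int rc = dl_iterate_phdr(count_object, &count);
    assert(rc == 0);   /* our callback never asks to stop early */
    return count;
}
```

Calling `count_loaded_objects()` from a normal program prints one line per object and returns at least 1, since the main program itself is always reported.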
>> I have not yet implemented, but plan to address once this series is
>> accepted/acceptable:
>>
>> - Sensible RTLD_GLOBAL semantics for dlmopened DSOs in non-base namespaces
>
> Could you expand on what semantics you have in mind and why such an
> extension would be useful?

Right now RTLD_GLOBAL is well defined for the main namespace.
It isn't defined at all for secondary namespaces (and tends to
result in a segfault when passed to dlmopen).

I would propose something like this:

  Main namespace RTLD_GLOBAL - works as at present. The new
  (or newly promoted) DSO is implicitly available for symbol
  resolution from/by all main namespace DSOs.

  Secondary namespace RTLD_GLOBAL - the new or newly promoted
  DSO is implicitly available for symbol resolution from/by
  all DSOs _in that namespace_.

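The proposal amounts to keeping one global scope per namespace. The following toy model sketches that idea and nothing more: `toy_dso`, `toy_scope`, `toy_promote_global` and `toy_lookup` are hypothetical names, each toy DSO exports a single symbol, and none of this reflects how glibc actually stores scopes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_NS_COUNT  3   /* namespace 0 plays the main namespace */
#define TOY_SCOPE_MAX 8   /* arbitrary capacity, toy model only */

/* Hypothetical DSO record: a name and the one symbol it exports. */
struct toy_dso {
    const char *name;
    const char *symbol;
};

/* One global scope per namespace. */
struct toy_scope {
    const struct toy_dso *members[TOY_SCOPE_MAX];
    size_t count;
};

static struct toy_scope toy_global_scope[TOY_NS_COUNT];

/* Model of opening (or promoting) dso with RTLD_GLOBAL in namespace nsid:
   the DSO joins the global scope of that namespace only. */
static bool toy_promote_global(size_t nsid, const struct toy_dso *dso)
{
    assert(nsid < TOY_NS_COUNT);
    struct toy_scope *scope = &toy_global_scope[nsid];

    for (size_t i = 0; i < scope->count; i++)
        if (scope->members[i] == dso)
            return true;   /* already global in this namespace */

    if (scope->count == TOY_SCOPE_MAX)
        return false;
    scope->members[scope->count++] = dso;
    return true;
}

/* Resolve symbol as seen from namespace nsid: only the global scope of
   that namespace is searched, never another namespace's. */
static const struct toy_dso *toy_lookup(size_t nsid, const char *symbol)
{
    assert(nsid < TOY_NS_COUNT);
    const struct toy_scope *scope = &toy_global_scope[nsid];

    for (size_t i = 0; i < scope->count; i++)
        if (strcmp(scope->members[i]->symbol, symbol) == 0)
            return scope->members[i];
    return NULL;
}
```

In this model a DSO promoted in namespace 1 is found by lookups from namespace 1 and stays invisible to namespaces 0 and 2, matching the "in that namespace" wording of the proposal.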
* Vivek Das Mohapatra via Libc-alpha:

> I would propose something like this:
>
>   Main namespace RTLD_GLOBAL - works as at present. The new
>   (or newly promoted) DSO is implicitly available for symbol
>   resolution from/by all main namespace DSOs.
>
>   Secondary namespace RTLD_GLOBAL - the new or newly promoted
>   DSO is implicitly available for symbol resolution from/by
>   all DSOs _in that namespace_.

That makes sense to me.

We currently delay the update of the global scope after ELF constructors
of new objects are invoked. Is this something we should keep (in both
cases)?

Thanks,
Florian

> We currently delay the update of the global scope after ELF constructors
> of new objects are invoked. Is this something we should keep (in both
> cases)?

I think so - make sure everything is consistent and settled before exposing
it to the rest of the world.

* Vivek Das Mohapatra:

>> We currently delay the update of the global scope after ELF constructors
>> of new objects are invoked. Is this something we should keep (in both
>> cases)?
>
> I think so - make sure everything is consistent and settled before exposing
> it to the rest of the world.

But the shared objects are available on the local scope before
initialization (and it has to be this way). So there's still an
inconsistency.

The technical problem here is that adding things to the global scope may
require memory allocation, and that can fail. But after we have called
any ELF constructors, dlopen must not fail. So we have to pre-allocate
any changes, and that code is a bit weird, particularly due to recursive
dlopen.

Thanks,
Florian

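The constraint Florian describes is a classic reserve-then-commit pattern: do every allocation that can fail before the point of no return, then publish without allocating. The sketch below illustrates only that ordering. `scope_list`, `scope_reserve`, `scope_commit` and `toy_open_global` are hypothetical names, and the model deliberately ignores recursive dlopen, which is the part the mail calls weird.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical growable list standing in for a global scope. */
struct scope_list {
    const void **items;
    size_t used;
    size_t capacity;
};

/* Phase 1 (may fail): make room for `extra` more entries. */
static bool scope_reserve(struct scope_list *s, size_t extra)
{
    if (s->capacity - s->used >= extra)
        return true;

    size_t wanted = s->used + extra;   /* overflow ignored in this toy */
    const void **grown = realloc(s->items, wanted * sizeof *grown);
    if (grown == NULL)
        return false;   /* nothing has run yet, so the load can be aborted */

    s->items = grown;
    s->capacity = wanted;
    return true;
}

/* Phase 2 (cannot fail): publish into space reserved earlier. */
static void scope_commit(struct scope_list *s, const void *item)
{
    assert(s->used < s->capacity);
    s->items[s->used++] = item;
}

/* Model of the ordering: reserve, run the constructor, then publish.
   The only failure path is before the constructor is called. */
static bool toy_open_global(struct scope_list *global, const void *object,
                            void (*constructor)(void))
{
    if (!scope_reserve(global, 1))
        return false;

    if (constructor != NULL)
        constructor();   /* object is not yet in the global scope here */

    scope_commit(global, object);
    return true;
}
```

The point of the split is that `scope_commit` has no error return at all, so once constructors have run there is no remaining step that can make the open fail.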
> But the shared objects are available on the local scope before
> initialization (and it has to be this way). So there's still an
> inconsistency.
>
> The technical problem here is that adding things to the global scope may
> require memory allocation, and that can fail. But after we have called
> any ELF constructors, dlopen must not fail. So we have to pre-allocate
> any changes, and that code is a bit weird, particularly due to recursive
> dlopen.

Hm. Well, I guess my answer would be that it should work the same way
for both cases: Otherwise we're introducing a weird (as well as obscure
and surprising) inconsistency between primary and secondary namespaces.

On 09/02/2021 14:18, Vivek Das Mohapatra via Libc-alpha wrote:

> This is a revision of a previous patchset that I posted here
> regarding https://sourceware.org/bugzilla/show_bug.cgi?id=22745
>
> [...]
>
> Vivek Das Mohapatra (20):
>   Declare and describe the dlmopen RTLD_SHARED flag
>   include/link.h: Update the link_map struct to allow proxies
>   elf/dl-object.c: Implement a helper function to proxy link_map entries
>   elf/dl-load.c, elf-dl-open.c: Implement RTLD_SHARED dlmopen proxying
>   elf/dl-fini.c: Handle proxy link_map entries in the shutdown path
>   elf/dl-init.c: Skip proxied link map entries in the dl init path
>   elf/dl-open.c: Don't try libc linit in namespaces with no libc mapping
>   elf/dl-open.c: when creating a proxy check the libc_map in NS 0
>   Define a new dynamic section tag - DT_GNU_FLAGS_1
>   Abstract the loaded-DSO search code into a private helper function
>   Compare loaded DSOs by file ID and check for DF_GNU_1_UNIQUE
>   Use the new DSO finder helper function since we have it
>   Use the DSO search helper to check for preloaded DT_GNU_UNIQUE DSOs
>   When loading DSOs into alternate namespaces check for DT_GNU_UNIQUE
>   Suppress audit calls when a (new) namespace is empty
>   Suppress inter-namespace DSO sharing for audit libraries
>   dlsym, dlvsym should be able to look up symbols via DSO proxies
>   Add DT_GNU_FLAGS_1/DF_GNU_1_UNIQUE dynamic section+flag to glibc DSOs
>   Add dlmopen / RTLD_SHARED tests
>   Restore separate libc loading for the TLS/namespace storage test

Now that I have reviewed all the patches, I think the set should be
reorganized so that each patch is logically consistent and the series
does not require all patches to be applied in bulk to get
RTLD_SHARED/RTLD_ISOLATE support fully implemented.

So besides fixing all the implicit and style issues (missing space,
attribute out of 'if', etc.) I think the series should be logically
organized as:

  1. Move 09/20 to first in the set (it adds the new binutils
     definitions and sets l_gnu_flags_1). The new definitions are
     used only internally, and the new flag is only set, not used,
     in that patch.

  2. Move 10/20 to second in the set. It adds a new function that is
     used by the code refactoring in a subsequent patch.

  3. Split 12/20 into a patch that does *just* the refactor that uses
     _dl_find_dso, and move the RTLD_ISOLATE part to the patch that
     actually enables RTLD_SHARED.

  4. Add a patch that adds the DT_GNU_FLAGS_1 dynamic tag to the
     required libraries (it should be safer since there is no logic
     yet that consumes it).

  5. Combine all the remaining patches that enable RTLD_SHARED and
     RTLD_ISOLATE into a single patch. It would be a large patch, but
     it is more logically consistent and easier to revert or backport.

  6. Add the tests.