Message ID | 20181127232555.GA3099@redhat.com |
---|---|
State | New |
Headers | show |
Series | PR libstdc++/67843 set shared_ptr lock policy at build-time | expand |
On Wed, 28 Nov 2018 at 00:25, Jonathan Wakely <jwakely@redhat.com> wrote: > > This resolves a longstanding issue where the lock policy for shared_ptr > reference counting depends on compilation options when the header is > included, so that different -march options can cause ABI changes. For > example, objects compiled with -march=armv7 will use atomics to > synchronize reference counts, and objects compiled with -march=armv5t > will use a mutex. That means the shared_ptr control block will have a > different layout in different objects, causing ODR violations and > undefined behaviour. This was the root cause of PR libstdc++/42734 as > well as PR libstdc++/67843. > > The solution is to decide on the lock policy at build time, when > libstdc++ is configured. The configure script checks for the > availability of the necessary atomic built-ins for the target and fixes > that choice permanently. Different -march flags used to compile user > code will not cause changes to the lock policy. This results in an ABI > change for certain compilations, but only where there was already an ABI > incompatibility between the libstdc++.so library and objects built with > an incompatible -march option. In general, this means a more stable ABI > that isn't silently altered when -march flags make addition atomic ops > available. > > To force a target to use "atomic" or "mutex" the new configure option > --with-libstdcxx-lock-policy can be used. > > In order to turn ODR violations into linker errors, the uses of > shared_ptr in filesystem directory iterators have been replaced > with __shared_ptr, and explicit instantiations are declared. This > ensures that object files using those types cannot link to libstdc++ > libs unless they use the same lock policy. > > PR libstdc++/67843 > * acinclude.m4 (GLIBCXX_ENABLE_LOCK_POLICY): Add new macro > that defines _GLIBCXX_HAVE_ATOMIC_LOCK_POLICY. > * config.h.in: Regenerate. > * configure: Regenerate. > * configure.ac: Use GLIBCXX_ENABLE_LOCK_POLICY. > * doc/xml/manual/configure.xml: Document new configure option. > * include/bits/fs_dir.h (directory_iterator): Use __shared_ptr > instead of shared_ptr. > (recursive_directory_iterator): Likewise. > (__shared_ptr<_Dir>): Add explicit instantiation declaration. > (__shared_ptr<recursive_directory_iterator::_Dir_stack>): Likewise. > * include/bits/shared_ptr_base.h (__allocate_shared, __make_shared): > Add default template argument for _Lock_policy template parameter. > * include/ext/concurrence.h (__default_lock_policy): Check macro > _GLIBCXX_HAVE_ATOMIC_LOCK_POLICY instead of checking if the current > target supports the builtins for compare-and-swap. > * src/filesystem/std-dir.cc (__shared_ptr<_Dir>): Add explicit > instantiation definition. > (__shared_ptr<recursive_directory_iterator::_Dir_stack>): Likewise. > (directory_iterator, recursive_directory_iterator): Use __make_shared > instead of make_shared. > > Tested x86_64-linux and aarch64-linux, committed to trunk. > > I would appreciate testing on ARM, especially Christophe's > -march=armv5t set up. > Hi Jonathan, Your patch passed my tests, thanks! directory_iterator.cc tests used to be killed by a timeout until last night where now they do PASS in my configurations involving -march=armv5t. That being said, I'm not sure I fully understand the fix. In my tests involving -march=armv5t, I still configure GCC --with-cpu=cortex-a9, and I pass -march=armv5t as an override with the --target-board runtest option. In these cases libstdc++ configure detects support for "atomic", so I suspect the tests pass only because I use QEMU with --cpu cortex-a9. I think if I used a different QEMU cpu (eg arm926), the tests would try to use atomics, and fail? The reason I'm still using cortex-a9 at QEMU level is that many tests override -mcpu/-march, and I used that as a compromise. However, I do have configurations using --with-cpu=arm10tdmi or --with-cpu=default and QEMU --cpu arm926. In these 2 cases, the tests do PASS, but they used to before your patch. libstdc++ configure does detect "mutex" in these cases. To be hopefully clearer, the subset of relevant configurations I use is as follows: GCC target / with-cpu / RUNTEST target-board / QEMU cpu 1- arm-none-linux-gnueabi[hf] / cortex-a9 / -march=armv5t / cortex-a9 2- arm-none-linux-gnueabihf / arm10tdmi / "" / arm926 3- arm-none-eabi / default / "" / arm926 (1) uses atomics (2) and (3) use mutex All of them now say "PASS", but maybe I should give a try switch to arm926 for QEMU in (1) ? Thanks, Christophe
On Wed, Nov 28, 2018 at 12:26 AM Jonathan Wakely <jwakely@redhat.com> wrote: > > This resolves a longstanding issue where the lock policy for shared_ptr > reference counting depends on compilation options when the header is > included, so that different -march options can cause ABI changes. For > example, objects compiled with -march=armv7 will use atomics to > synchronize reference counts, and objects compiled with -march=armv5t > will use a mutex. That means the shared_ptr control block will have a > different layout in different objects, causing ODR violations and > undefined behaviour. This was the root cause of PR libstdc++/42734 as > well as PR libstdc++/67843. > > The solution is to decide on the lock policy at build time, when > libstdc++ is configured. The configure script checks for the > availability of the necessary atomic built-ins for the target and fixes > that choice permanently. Different -march flags used to compile user > code will not cause changes to the lock policy. This results in an ABI > change for certain compilations, but only where there was already an ABI > incompatibility between the libstdc++.so library and objects built with > an incompatible -march option. In general, this means a more stable ABI > that isn't silently altered when -march flags make addition atomic ops > available. > > To force a target to use "atomic" or "mutex" the new configure option > --with-libstdcxx-lock-policy can be used. > > In order to turn ODR violations into linker errors, the uses of > shared_ptr in filesystem directory iterators have been replaced > with __shared_ptr, and explicit instantiations are declared. This > ensures that object files using those types cannot link to libstdc++ > libs unless they use the same lock policy. Would it be possible to have both in libstdc++ and with differnet mangling of different kind ensure compatibility between different user CUs? Or is that too awkward for the user to get right? Richard. > PR libstdc++/67843 > * acinclude.m4 (GLIBCXX_ENABLE_LOCK_POLICY): Add new macro > that defines _GLIBCXX_HAVE_ATOMIC_LOCK_POLICY. > * config.h.in: Regenerate. > * configure: Regenerate. > * configure.ac: Use GLIBCXX_ENABLE_LOCK_POLICY. > * doc/xml/manual/configure.xml: Document new configure option. > * include/bits/fs_dir.h (directory_iterator): Use __shared_ptr > instead of shared_ptr. > (recursive_directory_iterator): Likewise. > (__shared_ptr<_Dir>): Add explicit instantiation declaration. > (__shared_ptr<recursive_directory_iterator::_Dir_stack>): Likewise. > * include/bits/shared_ptr_base.h (__allocate_shared, __make_shared): > Add default template argument for _Lock_policy template parameter. > * include/ext/concurrence.h (__default_lock_policy): Check macro > _GLIBCXX_HAVE_ATOMIC_LOCK_POLICY instead of checking if the current > target supports the builtins for compare-and-swap. > * src/filesystem/std-dir.cc (__shared_ptr<_Dir>): Add explicit > instantiation definition. > (__shared_ptr<recursive_directory_iterator::_Dir_stack>): Likewise. > (directory_iterator, recursive_directory_iterator): Use __make_shared > instead of make_shared. > > Tested x86_64-linux and aarch64-linux, committed to trunk. > > I would appreciate testing on ARM, especially Christophe's > -march=armv5t set up. >
On 28/11/18 11:46 +0100, Richard Biener wrote: >On Wed, Nov 28, 2018 at 12:26 AM Jonathan Wakely <jwakely@redhat.com> wrote: >> >> This resolves a longstanding issue where the lock policy for shared_ptr >> reference counting depends on compilation options when the header is >> included, so that different -march options can cause ABI changes. For >> example, objects compiled with -march=armv7 will use atomics to >> synchronize reference counts, and objects compiled with -march=armv5t >> will use a mutex. That means the shared_ptr control block will have a >> different layout in different objects, causing ODR violations and >> undefined behaviour. This was the root cause of PR libstdc++/42734 as >> well as PR libstdc++/67843. >> >> The solution is to decide on the lock policy at build time, when >> libstdc++ is configured. The configure script checks for the >> availability of the necessary atomic built-ins for the target and fixes >> that choice permanently. Different -march flags used to compile user >> code will not cause changes to the lock policy. This results in an ABI >> change for certain compilations, but only where there was already an ABI >> incompatibility between the libstdc++.so library and objects built with >> an incompatible -march option. In general, this means a more stable ABI >> that isn't silently altered when -march flags make addition atomic ops >> available. >> >> To force a target to use "atomic" or "mutex" the new configure option >> --with-libstdcxx-lock-policy can be used. >> >> In order to turn ODR violations into linker errors, the uses of >> shared_ptr in filesystem directory iterators have been replaced >> with __shared_ptr, and explicit instantiations are declared. This >> ensures that object files using those types cannot link to libstdc++ >> libs unless they use the same lock policy. > >Would it be possible to have both in libstdc++ and with differnet mangling of >different kind ensure compatibility between different user CUs? Or is >that too awkward for the user to get right? It would mean duplicating a lot more code, which is already duplicated once for the std::string ABI, so we'd have four permuations! It still wouldn't ensure compatibility between different user CUs, only between any user CU and any build of libstdc++. Different user CUs would still disagree on the ABI of the types, and so couldn't pass them between CUs. I see no advantage to supporting that for the std::filesystem library (unlike std::string and std::iostream, which are probably used in the majority of CUs). I do not want to get to the point where every type in libstdc++ exists multiple times and you select some combination via command-line flags. It's already becoming unmanageable with multiple std::string and long double ABIs.
On 28/11/18 10:54 +0100, Christophe Lyon wrote: >On Wed, 28 Nov 2018 at 00:25, Jonathan Wakely <jwakely@redhat.com> wrote: >> >> This resolves a longstanding issue where the lock policy for shared_ptr >> reference counting depends on compilation options when the header is >> included, so that different -march options can cause ABI changes. For >> example, objects compiled with -march=armv7 will use atomics to >> synchronize reference counts, and objects compiled with -march=armv5t >> will use a mutex. That means the shared_ptr control block will have a >> different layout in different objects, causing ODR violations and >> undefined behaviour. This was the root cause of PR libstdc++/42734 as >> well as PR libstdc++/67843. >> >> The solution is to decide on the lock policy at build time, when >> libstdc++ is configured. The configure script checks for the >> availability of the necessary atomic built-ins for the target and fixes >> that choice permanently. Different -march flags used to compile user >> code will not cause changes to the lock policy. This results in an ABI >> change for certain compilations, but only where there was already an ABI >> incompatibility between the libstdc++.so library and objects built with >> an incompatible -march option. In general, this means a more stable ABI >> that isn't silently altered when -march flags make addition atomic ops >> available. >> >> To force a target to use "atomic" or "mutex" the new configure option >> --with-libstdcxx-lock-policy can be used. >> >> In order to turn ODR violations into linker errors, the uses of >> shared_ptr in filesystem directory iterators have been replaced >> with __shared_ptr, and explicit instantiations are declared. This >> ensures that object files using those types cannot link to libstdc++ >> libs unless they use the same lock policy. >> >> PR libstdc++/67843 >> * acinclude.m4 (GLIBCXX_ENABLE_LOCK_POLICY): Add new macro >> that defines _GLIBCXX_HAVE_ATOMIC_LOCK_POLICY. >> * config.h.in: Regenerate. >> * configure: Regenerate. >> * configure.ac: Use GLIBCXX_ENABLE_LOCK_POLICY. >> * doc/xml/manual/configure.xml: Document new configure option. >> * include/bits/fs_dir.h (directory_iterator): Use __shared_ptr >> instead of shared_ptr. >> (recursive_directory_iterator): Likewise. >> (__shared_ptr<_Dir>): Add explicit instantiation declaration. >> (__shared_ptr<recursive_directory_iterator::_Dir_stack>): Likewise. >> * include/bits/shared_ptr_base.h (__allocate_shared, __make_shared): >> Add default template argument for _Lock_policy template parameter. >> * include/ext/concurrence.h (__default_lock_policy): Check macro >> _GLIBCXX_HAVE_ATOMIC_LOCK_POLICY instead of checking if the current >> target supports the builtins for compare-and-swap. >> * src/filesystem/std-dir.cc (__shared_ptr<_Dir>): Add explicit >> instantiation definition. >> (__shared_ptr<recursive_directory_iterator::_Dir_stack>): Likewise. >> (directory_iterator, recursive_directory_iterator): Use __make_shared >> instead of make_shared. >> >> Tested x86_64-linux and aarch64-linux, committed to trunk. >> >> I would appreciate testing on ARM, especially Christophe's >> -march=armv5t set up. >> > >Hi Jonathan, > >Your patch passed my tests, thanks! Thanks for checking it. >directory_iterator.cc tests used to be killed by a timeout until last >night where now they do PASS in my configurations involving >-march=armv5t. > >That being said, I'm not sure I fully understand the fix. In my tests >involving -march=armv5t, I still configure GCC --with-cpu=cortex-a9, >and I pass -march=armv5t as an override with the --target-board runtest option. Yup, and that difference between the arch used to build libstdc++ and the arch used to build the tests was what caused the FAILures. And that's what my patch addresses, by ensuring that the arch used to build libstdc++ is what dictates the type of synchronization for shared_ptr. >In these cases libstdc++ configure detects support for "atomic", so I >suspect the tests pass only because I use QEMU with --cpu cortex-a9. >I think if I used a different QEMU cpu (eg arm926), the tests would >try to use atomics, and fail? I don't know if a libstdc++ configured for cortex-a9 would run on an arm296 QEMU. It might depend on cortex-a9 instructions (unrelated to the use of atomics in my patch). See the end of this mail. >The reason I'm still using cortex-a9 at QEMU level is that many tests >override -mcpu/-march, and I used that as a compromise. That means you're not fully testing armv5t, because you're still linking to a libstdc++.so (and libstd++fs.a) that use the cortex-a9 instruction set. Usually that doesn't matter, because most libstdc++ code doesn't care about which specific instructions it gets compiled to. That >However, I do have configurations using --with-cpu=arm10tdmi or >--with-cpu=default and QEMU --cpu arm926. >In these 2 cases, the tests do PASS, but they used to before your >patch. libstdc++ configure does detect "mutex" in these cases. What matters is that the tests and the libstdc++ libraries agree on the lock policy. It doesn't matter if they both use "atomic" or they both use "mutex", only that they agree. Previously the lock policy was selected by the preprocessor in every translation unit. That made it very easy to get incompatible sets of objects. Now the selection is made once per build of GCC, and is not affected by -march options when compiling user code. >To be hopefully clearer, the subset of relevant configurations I use >is as follows: >GCC target / with-cpu / RUNTEST target-board / QEMU cpu >1- arm-none-linux-gnueabi[hf] / cortex-a9 / -march=armv5t / cortex-a9 >2- arm-none-linux-gnueabihf / arm10tdmi / "" / arm926 >3- arm-none-eabi / default / "" / arm926 > >(1) uses atomics >(2) and (3) use mutex > >All of them now say "PASS", but maybe I should give a try switch to >arm926 for QEMU in (1) ? That would be a useful test. If I understand correctly, that will still work because the atomic ops will use the kernel helper in libgcc. Previously it would have failed the same way as (1) was already failing, because --with-cpu=cortex-a9 when libstdc++ was built chose atomics, and the -march=armv5t target board chose mutexes, and so the testsuite files were incompatible with the library. Since my patch, only the --with-cpu option matters, and the testsuite files will still use atomics. I think the arm926 cpu running under QEMU will implement those atomics via libgcc's kernel helpers. However, that test might fail for other reasons, if the libstdc++ libs configured for cortex-a9 use other insns that are not supported by arm296 then you'd get a SIGILL when running the tests. The kernel helpers only implement the atomic operations, not any cortex-a9 instructions that are not supported by arm296.
On 28/11/18 11:11 +0000, Jonathan Wakely wrote: >On 28/11/18 11:46 +0100, Richard Biener wrote: >>On Wed, Nov 28, 2018 at 12:26 AM Jonathan Wakely <jwakely@redhat.com> wrote: >>> >>>This resolves a longstanding issue where the lock policy for shared_ptr >>>reference counting depends on compilation options when the header is >>>included, so that different -march options can cause ABI changes. For >>>example, objects compiled with -march=armv7 will use atomics to >>>synchronize reference counts, and objects compiled with -march=armv5t >>>will use a mutex. That means the shared_ptr control block will have a >>>different layout in different objects, causing ODR violations and >>>undefined behaviour. This was the root cause of PR libstdc++/42734 as >>>well as PR libstdc++/67843. >>> >>>The solution is to decide on the lock policy at build time, when >>>libstdc++ is configured. The configure script checks for the >>>availability of the necessary atomic built-ins for the target and fixes >>>that choice permanently. Different -march flags used to compile user >>>code will not cause changes to the lock policy. This results in an ABI >>>change for certain compilations, but only where there was already an ABI >>>incompatibility between the libstdc++.so library and objects built with >>>an incompatible -march option. In general, this means a more stable ABI >>>that isn't silently altered when -march flags make addition atomic ops >>>available. >>> >>>To force a target to use "atomic" or "mutex" the new configure option >>>--with-libstdcxx-lock-policy can be used. >>> >>>In order to turn ODR violations into linker errors, the uses of >>>shared_ptr in filesystem directory iterators have been replaced >>>with __shared_ptr, and explicit instantiations are declared. This >>>ensures that object files using those types cannot link to libstdc++ >>>libs unless they use the same lock policy. >> >>Would it be possible to have both in libstdc++ and with differnet mangling of >>different kind ensure compatibility between different user CUs? Or is >>that too awkward for the user to get right? > >It would mean duplicating a lot more code, which is already duplicated >once for the std::string ABI, so we'd have four permuations! > >It still wouldn't ensure compatibility between different user CUs, >only between any user CU and any build of libstdc++. Different user >CUs would still disagree on the ABI of the types, and so couldn't pass >them between CUs. I see no advantage to supporting that for the >std::filesystem library (unlike std::string and std::iostream, which >are probably used in the majority of CUs). > >I do not want to get to the point where every type in libstdc++ exists >multiple times and you select some combination via command-line flags. >It's already becoming unmanageable with multiple std::string and long >double ABIs. Also, in practice, I don't see a need. The common cases where this bug arises are limited to 32-bit ARM, and for arm-linux the kernel helpers mean that you can always use "atomic". For bare metal ARM toolchains you probably don't want to be mixing CUs built against different versions of libstdc++ anyway. You have your toolchain for the board, and you use that. If it is configured to use "atomic", then that's what you get. If it's configured to use "mutex", then that's what you get. You probably don't want to use a toolchain configured for a different board, but you could use --with-libstdcxx-lock-policy to ensure a consistent policy across those toolchains if needed. It's also possible for problems to arise on i386. If you build GCC for i386 then it will choose "mutex" and be incompatible with a libstdc++ built for i486 or later. In that case, you could use the option --with-libstdcxx-lock-policy=atomic to force the use of "atomic" (and then you'd need to link to libatomic to provide the ops, as there are no kernel helpers for i386 in libgcc). There are a few other things we could do though: 1) As well as the __default_lock_policy value, we could add a "library's preferred lock policy" value, which could be different from __default_lock_policy. That policy would get used for the __shared_ptr objects in the libstdc++ API/ABI, and would have to be consistent for all CUs in the library and using its headers. That solves the problem of user CUs being incompatible with the library because they used -march. It doesn't change the fact that a libstdc++.so using "atomic" is not compatible with one using "mutex". For this to work the library always needs to use std::__shared_ptr<X, __library_lock_policy> instead of std::shared_ptr<X> so that the policy is explicit (which I think is a good rule anyway). 1a) Then (if we wanted) we could restore the old behaviour where __default_lock_policy is chosen by the preprocessor and can vary between user CUs. If users want to compile different CUs with different lock policies, that's their choice. They risk creating incompatible shared_ptr objects which cannot be passed between CUs, but it becomes their responsibility to get it right. The library always uses its preferred policy, and so won't get it wrong. This seems easy for users to get wrong, but would arguably be more backwards compatible (by still allowing users to get silent UB that only fails at runtime). 1b) Or (if we wanted) we could still fix __default_lock_policy at configure-time, but add a second version of the new option --with-libstdcxx-lock-policy, so the default policy for user CUs and the library's preferred policy can be chosen independently. You could say you want the library to always use "mutex" but have two builds of GCC, one where user CUs use "atomic" and one where they use "mutex". The libstdc++.so used by those two GCC builds would be compatible, but user CUs that use std::shared_ptr would not be. Options 1a and 1b are mutually exclusive. 2) We could consider using __attribute__((__abi_tag__("..."))) on std::shared_ptr when the lock policy chosen by the preprocessor doesn't match what would have been chosen when libstdc++ was built. So if a user compiles with a -march option that changes the lock policy, we change the mangling of std::shared_ptr. This would still allow incompatible std::shared_ptr objects, but they would cause linker errors. For PR 42734 and PR 67843 that would have meant the programs fail to link, instead of having runtime UB. The solution I committed makes them link and run correctly, which seems better than just saying "no, you can't do that". 3) We could change libstdc++'s configure to automatically override --with-libstdcxx-lock-policy for arm-linux so it always uses "atomic" (because we know the kernel helpers are available). That causes an ABI change for a GCC built for --target=armv5t-*-linux* so I didn't do that by default. Packagers who want that can use the --with option explicitly to build a libstdc++ with atomic refcounting that can be used on any armv5t or later CPU, allowing users to choose a newer -march for their own code. (Until my patch that didn't work, because Option 1) on its own seems to have no advantage over what I committed. Option 1a) is still too easy for users to get wrong and get UB, only now it's all their fault not my fault. That doesn't really help users. Option 1b) makes it harder for users to get wrong (they have to mix CUs built by different toolchains, not just built with different -march options). I'm not sure there's much benefit to being able to control the library policy separately from the user policy though. Option 2) makes the failures happen earlier (when linking, not runtime) but doesn't actually make anything work that failed previously. Option 3) is not my call to make. If the ARM maintainers want to change the default older arm-linux targets I have no objections.
On 28/11/18 11:35 +0000, Jonathan Wakely wrote: >On 28/11/18 10:54 +0100, Christophe Lyon wrote: >>The reason I'm still using cortex-a9 at QEMU level is that many tests >>override -mcpu/-march, and I used that as a compromise. > >That means you're not fully testing armv5t, because you're still >linking to a libstdc++.so (and libstd++fs.a) that use the cortex-a9 >instruction set. Usually that doesn't matter, because most libstdc++ >code doesn't care about which specific instructions it gets compiled >to. That Oops, sorry, unfinished thought. I was going to say: That is not true for the shared_ptr code, which depends on whether atomic instructions are supported. There are also some parts of <random> that depend on specific instructions being supported (but some of them do runtime checks for the CPU).
On 28/11/18 12:30 +0000, Jonathan Wakely wrote: >3) We could change libstdc++'s configure to automatically override >--with-libstdcxx-lock-policy for arm-linux so it always uses "atomic" >(because we know the kernel helpers are available). That causes an ABI >change for a GCC built for --target=armv5t-*-linux* so I didn't do >that by default. Packagers who want that can use the --with option >explicitly to build a libstdc++ with atomic refcounting that can be >used on any armv5t or later CPU, allowing users to choose a newer >-march for their own code. (Until my patch that didn't work, because [...] >Option 3) is not my call to make. If the ARM maintainers want to >change the default older arm-linux targets I have no objections. Another way to approach option 3) that we discussed at Cauldron, and which I want to start a separate discussion about on gcc@gcc.gnu.org, is to change the value of __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4 et al when we have kernel helpers to implement those operations. The shared_ptr code doesn't care if an atomic CAS comes from the CPU or a kernel helper in libgcc. If the atomic CAS is supported via kernel helpers, let's use it! But those predefined macros only tell you about native CPU support (for the current target selected by -march). It's worth noting that all this shared_ptr code predates libatomic, so we couldn't just say "always use atomics, and link to libatomic if needed". If __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4 was not defined, we had to assume no CAS *at all*. Not native, not provided by libatomic, nothing. I'd love to simply rely on std::atomic in std::shared_ptr, but that would be an ABI break for targets currently using a mutex, and would add new dependencies on libatomic for some targets. (It might also pessimize some single-threaded programs on targets that do use atomic ref-counts, because currently some updates are non-atomic when __gthread_active_p() == false).
On Wed, Nov 28, 2018 at 1:30 PM Jonathan Wakely <jwakely@redhat.com> wrote: > > On 28/11/18 11:11 +0000, Jonathan Wakely wrote: > >On 28/11/18 11:46 +0100, Richard Biener wrote: > >>On Wed, Nov 28, 2018 at 12:26 AM Jonathan Wakely <jwakely@redhat.com> wrote: > >>> > >>>This resolves a longstanding issue where the lock policy for shared_ptr > >>>reference counting depends on compilation options when the header is > >>>included, so that different -march options can cause ABI changes. For > >>>example, objects compiled with -march=armv7 will use atomics to > >>>synchronize reference counts, and objects compiled with -march=armv5t > >>>will use a mutex. That means the shared_ptr control block will have a > >>>different layout in different objects, causing ODR violations and > >>>undefined behaviour. This was the root cause of PR libstdc++/42734 as > >>>well as PR libstdc++/67843. > >>> > >>>The solution is to decide on the lock policy at build time, when > >>>libstdc++ is configured. The configure script checks for the > >>>availability of the necessary atomic built-ins for the target and fixes > >>>that choice permanently. Different -march flags used to compile user > >>>code will not cause changes to the lock policy. This results in an ABI > >>>change for certain compilations, but only where there was already an ABI > >>>incompatibility between the libstdc++.so library and objects built with > >>>an incompatible -march option. In general, this means a more stable ABI > >>>that isn't silently altered when -march flags make addition atomic ops > >>>available. > >>> > >>>To force a target to use "atomic" or "mutex" the new configure option > >>>--with-libstdcxx-lock-policy can be used. > >>> > >>>In order to turn ODR violations into linker errors, the uses of > >>>shared_ptr in filesystem directory iterators have been replaced > >>>with __shared_ptr, and explicit instantiations are declared. This > >>>ensures that object files using those types cannot link to libstdc++ > >>>libs unless they use the same lock policy. > >> > >>Would it be possible to have both in libstdc++ and with differnet mangling of > >>different kind ensure compatibility between different user CUs? Or is > >>that too awkward for the user to get right? > > > >It would mean duplicating a lot more code, which is already duplicated > >once for the std::string ABI, so we'd have four permuations! > > > >It still wouldn't ensure compatibility between different user CUs, > >only between any user CU and any build of libstdc++. Different user > >CUs would still disagree on the ABI of the types, and so couldn't pass > >them between CUs. I see no advantage to supporting that for the > >std::filesystem library (unlike std::string and std::iostream, which > >are probably used in the majority of CUs). > > > >I do not want to get to the point where every type in libstdc++ exists > >multiple times and you select some combination via command-line flags. > >It's already becoming unmanageable with multiple std::string and long > >double ABIs. > > Also, in practice, I don't see a need. The common cases where this bug > arises are limited to 32-bit ARM, and for arm-linux the kernel helpers > mean that you can always use "atomic". For bare metal ARM toolchains > you probably don't want to be mixing CUs built against different > versions of libstdc++ anyway. You have your toolchain for the board, > and you use that. If it is configured to use "atomic", then that's > what you get. If it's configured to use "mutex", then that's what you > get. You probably don't want to use a toolchain configured for a > different board, but you could use --with-libstdcxx-lock-policy to > ensure a consistent policy across those toolchains if needed. > > It's also possible for problems to arise on i386. If you build GCC for > i386 then it will choose "mutex" and be incompatible with a libstdc++ > built for i486 or later. In that case, you could use the option > --with-libstdcxx-lock-policy=atomic to force the use of "atomic" (and > then you'd need to link to libatomic to provide the ops, as there are > no kernel helpers for i386 in libgcc). I think the default should be part of the platform ABI somehow, like the i386 incompatibility should better not arise. I suppose libstdc++ configury doesn't read in gcc/config.gcc but does it have sth similar where it's easy to see defaults for plaforms? There's configure.host but I'm not sure this is related at all. Richard. > There are a few other things we could do though: > > 1) As well as the __default_lock_policy value, we could add a > "library's preferred lock policy" value, which could be different from > __default_lock_policy. That policy would get used for the __shared_ptr > objects in the libstdc++ API/ABI, and would have to be consistent for > all CUs in the library and using its headers. That solves the problem > of user CUs being incompatible with the library because they used > -march. It doesn't change the fact that a libstdc++.so using "atomic" > is not compatible with one using "mutex". For this to work the library > always needs to use std::__shared_ptr<X, __library_lock_policy> > instead of std::shared_ptr<X> so that the policy is explicit (which I > think is a good rule anyway). > > 1a) Then (if we wanted) we could restore the old behaviour where > __default_lock_policy is chosen by the preprocessor and can vary > between user CUs. If users want to compile different CUs with > different lock policies, that's their choice. They risk creating > incompatible shared_ptr objects which cannot be passed between CUs, > but it becomes their responsibility to get it right. The library > always uses its preferred policy, and so won't get it wrong. This > seems easy for users to get wrong, but would arguably be more > backwards compatible (by still allowing users to get silent UB that > only fails at runtime). > > 1b) Or (if we wanted) we could still fix __default_lock_policy at > configure-time, but add a second version of the new option > --with-libstdcxx-lock-policy, so the default policy for user CUs and > the library's preferred policy can be chosen independently. You could > say you want the library to always use "mutex" but have two builds of > GCC, one where user CUs use "atomic" and one where they use "mutex". > The libstdc++.so used by those two GCC builds would be compatible, but > user CUs that use std::shared_ptr would not be. > > Options 1a and 1b are mutually exclusive. > > 2) We could consider using __attribute__((__abi_tag__("..."))) on > std::shared_ptr when the lock policy chosen by the preprocessor > doesn't match what would have been chosen when libstdc++ was built. So > if a user compiles with a -march option that changes the lock > policy, we change the mangling of std::shared_ptr. This would still > allow incompatible std::shared_ptr objects, but they would cause > linker errors. For PR 42734 and PR 67843 that would have meant the > programs fail to link, instead of having runtime UB. The solution I > committed makes them link and run correctly, which seems better than > just saying "no, you can't do that". > > 3) We could change libstdc++'s configure to automatically override > --with-libstdcxx-lock-policy for arm-linux so it always uses "atomic" > (because we know the kernel helpers are available). That causes an ABI > change for a GCC built for --target=armv5t-*-linux* so I didn't do > that by default. Packagers who want that can use the --with option > explicitly to build a libstdc++ with atomic refcounting that can be > used on any armv5t or later CPU, allowing users to choose a newer > -march for their own code. (Until my patch that didn't work, because > > > Option 1) on its own seems to have no advantage over what I committed. > > Option 1a) is still too easy for users to get wrong and get UB, only > now it's all their fault not my fault. That doesn't really help users. > > Option 1b) makes it harder for users to get wrong (they have to mix > CUs built by different toolchains, not just built with different > -march options). I'm not sure there's much benefit to being able to > control the library policy separately from the user policy though. > > Option 2) makes the failures happen earlier (when linking, not > runtime) but doesn't actually make anything work that failed > previously. > > Option 3) is not my call to make. If the ARM maintainers want to > change the default older arm-linux targets I have no objections. > >
On 28/11/18 14:46 +0100, Richard Biener wrote: >On Wed, Nov 28, 2018 at 1:30 PM Jonathan Wakely <jwakely@redhat.com> wrote: >> >> On 28/11/18 11:11 +0000, Jonathan Wakely wrote: >> >On 28/11/18 11:46 +0100, Richard Biener wrote: >> >>On Wed, Nov 28, 2018 at 12:26 AM Jonathan Wakely <jwakely@redhat.com> wrote: >> >>> >> >>>This resolves a longstanding issue where the lock policy for shared_ptr >> >>>reference counting depends on compilation options when the header is >> >>>included, so that different -march options can cause ABI changes. For >> >>>example, objects compiled with -march=armv7 will use atomics to >> >>>synchronize reference counts, and objects compiled with -march=armv5t >> >>>will use a mutex. That means the shared_ptr control block will have a >> >>>different layout in different objects, causing ODR violations and >> >>>undefined behaviour. This was the root cause of PR libstdc++/42734 as >> >>>well as PR libstdc++/67843. >> >>> >> >>>The solution is to decide on the lock policy at build time, when >> >>>libstdc++ is configured. The configure script checks for the >> >>>availability of the necessary atomic built-ins for the target and fixes >> >>>that choice permanently. Different -march flags used to compile user >> >>>code will not cause changes to the lock policy. This results in an ABI >> >>>change for certain compilations, but only where there was already an ABI >> >>>incompatibility between the libstdc++.so library and objects built with >> >>>an incompatible -march option. In general, this means a more stable ABI >> >>>that isn't silently altered when -march flags make addition atomic ops >> >>>available. >> >>> >> >>>To force a target to use "atomic" or "mutex" the new configure option >> >>>--with-libstdcxx-lock-policy can be used. >> >>> >> >>>In order to turn ODR violations into linker errors, the uses of >> >>>shared_ptr in filesystem directory iterators have been replaced >> >>>with __shared_ptr, and explicit instantiations are declared. This >> >>>ensures that object files using those types cannot link to libstdc++ >> >>>libs unless they use the same lock policy. >> >> >> >>Would it be possible to have both in libstdc++ and with differnet mangling of >> >>different kind ensure compatibility between different user CUs? Or is >> >>that too awkward for the user to get right? >> > >> >It would mean duplicating a lot more code, which is already duplicated >> >once for the std::string ABI, so we'd have four permuations! >> > >> >It still wouldn't ensure compatibility between different user CUs, >> >only between any user CU and any build of libstdc++. Different user >> >CUs would still disagree on the ABI of the types, and so couldn't pass >> >them between CUs. I see no advantage to supporting that for the >> >std::filesystem library (unlike std::string and std::iostream, which >> >are probably used in the majority of CUs). >> > >> >I do not want to get to the point where every type in libstdc++ exists >> >multiple times and you select some combination via command-line flags. >> >It's already becoming unmanageable with multiple std::string and long >> >double ABIs. >> >> Also, in practice, I don't see a need. The common cases where this bug >> arises are limited to 32-bit ARM, and for arm-linux the kernel helpers >> mean that you can always use "atomic". For bare metal ARM toolchains >> you probably don't want to be mixing CUs built against different >> versions of libstdc++ anyway. You have your toolchain for the board, >> and you use that. If it is configured to use "atomic", then that's >> what you get. If it's configured to use "mutex", then that's what you >> get. You probably don't want to use a toolchain configured for a >> different board, but you could use --with-libstdcxx-lock-policy to >> ensure a consistent policy across those toolchains if needed. >> >> It's also possible for problems to arise on i386. If you build GCC for >> i386 then it will choose "mutex" and be incompatible with a libstdc++ >> built for i486 or later. In that case, you could use the option >> --with-libstdcxx-lock-policy=atomic to force the use of "atomic" (and >> then you'd need to link to libatomic to provide the ops, as there are >> no kernel helpers for i386 in libgcc). > >I think the default should be part of the platform ABI somehow, >like the i386 incompatibility should better not arise. What do you suggest to ensure that? Just imake i386 default to the "atomic" lock policy, and require anybody silly enough to configure for i386 to either link to libatomic, or explicitly use --with-libstdcxx-lock-policy=mutex if they really want to be incompatible with default i486 builds? >I suppose >libstdc++ configury doesn't read in gcc/config.gcc but does it >have sth similar where it's easy to see defaults for plaforms? >There's configure.host but I'm not sure this is related at all. No, I don't think we do. When there are target-specific things hardcoded it's spread out around acinclude.m4 and configure.ac
On Wed, Nov 28, 2018 at 3:09 PM Jonathan Wakely <jwakely@redhat.com> wrote: > > On 28/11/18 14:46 +0100, Richard Biener wrote: > >On Wed, Nov 28, 2018 at 1:30 PM Jonathan Wakely <jwakely@redhat.com> wrote: > >> > >> On 28/11/18 11:11 +0000, Jonathan Wakely wrote: > >> >On 28/11/18 11:46 +0100, Richard Biener wrote: > >> >>On Wed, Nov 28, 2018 at 12:26 AM Jonathan Wakely <jwakely@redhat.com> wrote: > >> >>> > >> >>>This resolves a longstanding issue where the lock policy for shared_ptr > >> >>>reference counting depends on compilation options when the header is > >> >>>included, so that different -march options can cause ABI changes. For > >> >>>example, objects compiled with -march=armv7 will use atomics to > >> >>>synchronize reference counts, and objects compiled with -march=armv5t > >> >>>will use a mutex. That means the shared_ptr control block will have a > >> >>>different layout in different objects, causing ODR violations and > >> >>>undefined behaviour. This was the root cause of PR libstdc++/42734 as > >> >>>well as PR libstdc++/67843. > >> >>> > >> >>>The solution is to decide on the lock policy at build time, when > >> >>>libstdc++ is configured. The configure script checks for the > >> >>>availability of the necessary atomic built-ins for the target and fixes > >> >>>that choice permanently. Different -march flags used to compile user > >> >>>code will not cause changes to the lock policy. This results in an ABI > >> >>>change for certain compilations, but only where there was already an ABI > >> >>>incompatibility between the libstdc++.so library and objects built with > >> >>>an incompatible -march option. In general, this means a more stable ABI > >> >>>that isn't silently altered when -march flags make addition atomic ops > >> >>>available. > >> >>> > >> >>>To force a target to use "atomic" or "mutex" the new configure option > >> >>>--with-libstdcxx-lock-policy can be used. > >> >>> > >> >>>In order to turn ODR violations into linker errors, the uses of > >> >>>shared_ptr in filesystem directory iterators have been replaced > >> >>>with __shared_ptr, and explicit instantiations are declared. This > >> >>>ensures that object files using those types cannot link to libstdc++ > >> >>>libs unless they use the same lock policy. > >> >> > >> >>Would it be possible to have both in libstdc++ and with differnet mangling of > >> >>different kind ensure compatibility between different user CUs? Or is > >> >>that too awkward for the user to get right? > >> > > >> >It would mean duplicating a lot more code, which is already duplicated > >> >once for the std::string ABI, so we'd have four permuations! > >> > > >> >It still wouldn't ensure compatibility between different user CUs, > >> >only between any user CU and any build of libstdc++. Different user > >> >CUs would still disagree on the ABI of the types, and so couldn't pass > >> >them between CUs. I see no advantage to supporting that for the > >> >std::filesystem library (unlike std::string and std::iostream, which > >> >are probably used in the majority of CUs). > >> > > >> >I do not want to get to the point where every type in libstdc++ exists > >> >multiple times and you select some combination via command-line flags. > >> >It's already becoming unmanageable with multiple std::string and long > >> >double ABIs. > >> > >> Also, in practice, I don't see a need. The common cases where this bug > >> arises are limited to 32-bit ARM, and for arm-linux the kernel helpers > >> mean that you can always use "atomic". For bare metal ARM toolchains > >> you probably don't want to be mixing CUs built against different > >> versions of libstdc++ anyway. You have your toolchain for the board, > >> and you use that. If it is configured to use "atomic", then that's > >> what you get. If it's configured to use "mutex", then that's what you > >> get. You probably don't want to use a toolchain configured for a > >> different board, but you could use --with-libstdcxx-lock-policy to > >> ensure a consistent policy across those toolchains if needed. > >> > >> It's also possible for problems to arise on i386. If you build GCC for > >> i386 then it will choose "mutex" and be incompatible with a libstdc++ > >> built for i486 or later. In that case, you could use the option > >> --with-libstdcxx-lock-policy=atomic to force the use of "atomic" (and > >> then you'd need to link to libatomic to provide the ops, as there are > >> no kernel helpers for i386 in libgcc). > > > >I think the default should be part of the platform ABI somehow, > >like the i386 incompatibility should better not arise. > > What do you suggest to ensure that? > > Just imake i386 default to the "atomic" lock policy, and require anybody > silly enough to configure for i386 to either link to libatomic, or > explicitly use --with-libstdcxx-lock-policy=mutex if they really want > to be incompatible with default i486 builds? Something like that, yes. > >I suppose > >libstdc++ configury doesn't read in gcc/config.gcc but does it > >have sth similar where it's easy to see defaults for plaforms? > >There's configure.host but I'm not sure this is related at all. > > No, I don't think we do. When there are target-specific things > hardcoded it's spread out around acinclude.m4 and configure.ac > >
On 28/11/2018 12:35, Jonathan Wakely wrote: > On 28/11/18 10:54 +0100, Christophe Lyon wrote: >> On Wed, 28 Nov 2018 at 00:25, Jonathan Wakely <jwakely@redhat.com> wrote: >>> >>> This resolves a longstanding issue where the lock policy for shared_ptr >>> reference counting depends on compilation options when the header is >>> included, so that different -march options can cause ABI changes. For >>> example, objects compiled with -march=armv7 will use atomics to >>> synchronize reference counts, and objects compiled with -march=armv5t >>> will use a mutex. That means the shared_ptr control block will have a >>> different layout in different objects, causing ODR violations and >>> undefined behaviour. This was the root cause of PR libstdc++/42734 as >>> well as PR libstdc++/67843. >>> >>> The solution is to decide on the lock policy at build time, when >>> libstdc++ is configured. The configure script checks for the >>> availability of the necessary atomic built-ins for the target and fixes >>> that choice permanently. Different -march flags used to compile user >>> code will not cause changes to the lock policy. This results in an ABI >>> change for certain compilations, but only where there was already an ABI >>> incompatibility between the libstdc++.so library and objects built with >>> an incompatible -march option. In general, this means a more stable ABI >>> that isn't silently altered when -march flags make addition atomic ops >>> available. >>> >>> To force a target to use "atomic" or "mutex" the new configure option >>> --with-libstdcxx-lock-policy can be used. >>> >>> In order to turn ODR violations into linker errors, the uses of >>> shared_ptr in filesystem directory iterators have been replaced >>> with __shared_ptr, and explicit instantiations are declared. This >>> ensures that object files using those types cannot link to libstdc++ >>> libs unless they use the same lock policy. >>> >>> PR libstdc++/67843 >>> * acinclude.m4 (GLIBCXX_ENABLE_LOCK_POLICY): Add new macro >>> that defines _GLIBCXX_HAVE_ATOMIC_LOCK_POLICY. >>> * config.h.in: Regenerate. >>> * configure: Regenerate. >>> * configure.ac: Use GLIBCXX_ENABLE_LOCK_POLICY. >>> * doc/xml/manual/configure.xml: Document new configure option. >>> * include/bits/fs_dir.h (directory_iterator): Use __shared_ptr >>> instead of shared_ptr. >>> (recursive_directory_iterator): Likewise. >>> (__shared_ptr<_Dir>): Add explicit instantiation declaration. >>> (__shared_ptr<recursive_directory_iterator::_Dir_stack>): Likewise. >>> * include/bits/shared_ptr_base.h (__allocate_shared, __make_shared): >>> Add default template argument for _Lock_policy template parameter. >>> * include/ext/concurrence.h (__default_lock_policy): Check macro >>> _GLIBCXX_HAVE_ATOMIC_LOCK_POLICY instead of checking if the current >>> target supports the builtins for compare-and-swap. >>> * src/filesystem/std-dir.cc (__shared_ptr<_Dir>): Add explicit >>> instantiation definition. >>> (__shared_ptr<recursive_directory_iterator::_Dir_stack>): Likewise. >>> (directory_iterator, recursive_directory_iterator): Use __make_shared >>> instead of make_shared. >>> >>> Tested x86_64-linux and aarch64-linux, committed to trunk. >>> >>> I would appreciate testing on ARM, especially Christophe's >>> -march=armv5t set up. >>> >> >> Hi Jonathan, >> >> Your patch passed my tests, thanks! > > Thanks for checking it. > >> directory_iterator.cc tests used to be killed by a timeout until last >> night where now they do PASS in my configurations involving >> -march=armv5t. >> >> That being said, I'm not sure I fully understand the fix. In my tests >> involving -march=armv5t, I still configure GCC --with-cpu=cortex-a9, >> and I pass -march=armv5t as an override with the --target-board runtest option. > > Yup, and that difference between the arch used to build libstdc++ and > the arch used to build the tests was what caused the FAILures. And > that's what my patch addresses, by ensuring that the arch used to > build libstdc++ is what dictates the type of synchronization for > shared_ptr. > >> In these cases libstdc++ configure detects support for "atomic", so I >> suspect the tests pass only because I use QEMU with --cpu cortex-a9. >> I think if I used a different QEMU cpu (eg arm926), the tests would >> try to use atomics, and fail? > > I don't know if a libstdc++ configured for cortex-a9 would run on an > arm296 QEMU. It might depend on cortex-a9 instructions (unrelated to > the use of atomics in my patch). See the end of this mail. > >> The reason I'm still using cortex-a9 at QEMU level is that many tests >> override -mcpu/-march, and I used that as a compromise. > > That means you're not fully testing armv5t, because you're still > linking to a libstdc++.so (and libstd++fs.a) that use the cortex-a9 > instruction set. Usually that doesn't matter, because most libstdc++ > code doesn't care about which specific instructions it gets compiled > to. That >> However, I do have configurations using --with-cpu=arm10tdmi or >> --with-cpu=default and QEMU --cpu arm926. >> In these 2 cases, the tests do PASS, but they used to before your >> patch. libstdc++ configure does detect "mutex" in these cases. > > What matters is that the tests and the libstdc++ libraries agree on the > lock policy. It doesn't matter if they both use "atomic" or they both > use "mutex", only that they agree. Previously the lock policy was > selected by the preprocessor in every translation unit. That made it > very easy to get incompatible sets of objects. Now the selection is > made once per build of GCC, and is not affected by -march options when > compiling user code. > >> To be hopefully clearer, the subset of relevant configurations I use >> is as follows: >> GCC target / with-cpu / RUNTEST target-board / QEMU cpu >> 1- arm-none-linux-gnueabi[hf] / cortex-a9 / -march=armv5t / cortex-a9 >> 2- arm-none-linux-gnueabihf / arm10tdmi / "" / arm926 >> 3- arm-none-eabi / default / "" / arm926 >> >> (1) uses atomics >> (2) and (3) use mutex >> >> All of them now say "PASS", but maybe I should give a try switch to >> arm926 for QEMU in (1) ? > > That would be a useful test. > > If I understand correctly, that will still work because the atomic ops > will use the kernel helper in libgcc. Previously it would have failed > the same way as (1) was already failing, because --with-cpu=cortex-a9 > when libstdc++ was built chose atomics, and the -march=armv5t target > board chose mutexes, and so the testsuite files were incompatible with > the library. > > Since my patch, only the --with-cpu option matters, and the testsuite > files will still use atomics. I think the arm926 cpu running under > QEMU will implement those atomics via libgcc's kernel helpers. > > However, that test might fail for other reasons, if the libstdc++ libs > configured for cortex-a9 use other insns that are not supported by > arm296 then you'd get a SIGILL when running the tests. The kernel > helpers only implement the atomic operations, not any cortex-a9 > instructions that are not supported by arm296. > So, I didn't forget about this, it's just that it takes quite a bit of time to setup, run and check the logs... This is exactly what happens :( (and probably why I sticked to cortex-a9 in qemu despite -march=armv5t in runtestflags) Almost all execution tests fail in _dl_start, because it uses "movw". Surprisingly my logs show a few execution tests passing despite qemu exiting with "Illegal instruction". I don't understand why dejagnu sometimes sees exit code 132 as FAIL, and sometimes as PASS (both cases showing the same qemu execution trace) In the fortran testsuite, I also see qemu exiting with this trace: IN: 0x40801800: e59fa094 ldr sl, [pc, #0x94] 0x40801804: e59f4094 ldr r4, [pc, #0x94] 0x40801808: e1a0000d mov r0, sp 0x4080180c: fa000c66 blx #0x408049ac ---------------- IN: _dl_start 0x408049ac: e92d .byte 0x2d, 0xe9 ---------------- IN: process_dl_audit 0x40801a68: feeaf00f cdp2 p0, #0xe, c15, c10, c15, #0 ---------------- IN: process_dl_audit 0x40801a6c: b1b04604 lslslt r4, r4, #0xc 0x40801a70: 2b007823 blhs #0x4081fb04 ---------------- IN: process_dl_audit 0x40801a74: f8d8d0f6 .byte 0xf6, 0xd0, 0xd8, 0xf8 (this is qemu-2.11.0, I'm surprised some instructions are not decoded) So well, all in all it seems we can't test what I'm trying to test, or rather that -march=armv5t is broken if the toolchain is built --with-cpu more recent than that. Not very user friendly :-) Christophe
Firstly thanks a lot for working on this. On 28/11/2018 12:49, Jonathan Wakely wrote: > On 28/11/18 12:30 +0000, Jonathan Wakely wrote: >> 3) We could change libstdc++'s configure to automatically override >> --with-libstdcxx-lock-policy for arm-linux so it always uses "atomic" >> (because we know the kernel helpers are available). That causes an ABI >> change for a GCC built for --target=armv5t-*-linux* so I didn't do >> that by default. Packagers who want that can use the --with option >> explicitly to build a libstdc++ with atomic refcounting that can be >> used on any armv5t or later CPU, allowing users to choose a newer >> -march for their own code. (Until my patch that didn't work, because > > [...] > >> Option 3) is not my call to make. If the ARM maintainers want to >> change the default older arm-linux targets I have no objections. We could change the defaults when(if) we next bump up the libstdc++ ABI in general :-/ and if we remember to do this. I don't feel comfortable changing the defaults silently and I don't view this as something we can decide on by ourselves as the Arm maintainers as this really falls in the space of folks distributing Linux using by defaults versions of the architecture that result in the use of mutexes rather than the atomics. It's also not clear to me if we can use any other magic tricks to make this co-exist and whether that is worth it. > > Another way to approach option 3) that we discussed at Cauldron, and > which I want to start a separate discussion about on gcc@gcc.gnu.org, > is to change the value of __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4 et al > when we have kernel helpers to implement those operations. > > The shared_ptr code doesn't care if an atomic CAS comes from the CPU > or a kernel helper in libgcc. If the atomic CAS is supported via > kernel helpers, let's use it! But those predefined macros only tell > you about native CPU support (for the current target selected by > -march). > > > It's worth noting that all this shared_ptr code predates libatomic, so > we couldn't just say "always use atomics, and link to libatomic if > needed". If __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4 was not defined, we had > to assume no CAS *at all*. Not native, not provided by libatomic, > nothing. I'd love to simply rely on std::atomic in std::shared_ptr, > but that would be an ABI break for targets currently using a mutex, > and would add new dependencies on libatomic for some targets. (It > might also pessimize some single-threaded programs on targets that do > use atomic ref-counts, because currently some updates are non-atomic > when __gthread_active_p() == false). Yep, though you want it to go to the kernel helpers or indeed libatomic. regards Ramana > >
diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4 index 82a25e5f2f1..6bcd29dc8c3 100644 --- a/libstdc++-v3/acinclude.m4 +++ b/libstdc++-v3/acinclude.m4 @@ -3562,6 +3562,72 @@ EOF ]) +dnl +dnl Set default lock policy for synchronizing shared_ptr reference counting. +dnl +dnl --with-libstdcxx-lock-policy=auto +dnl Use atomic operations for shared_ptr reference counting only if +dnl the default target supports atomic compare-and-swap. +dnl --with-libstdcxx-lock-policy=atomic +dnl Use atomic operations for shared_ptr reference counting. +dnl --with-libstdcxx-lock-policy=mutex +dnl Use a mutex to synchronize shared_ptr reference counting. +dnl +dnl This controls the value of __gnu_cxx::__default_lock_policy, which +dnl determines how shared_ptr reference counts are synchronized. +dnl The option "atomic" means that atomic operations should be used, +dnl "mutex" means that a mutex will be used. The default option, "auto", +dnl will check if the target supports the compiler-generated builtins +dnl for atomic compare-and-swap operations for 2-byte and 4-byte integers, +dnl and will use "atomic" if supported, "mutex" otherwise. +dnl This option is ignored if the thread model used by GCC is "single", +dnl as no synchronization is used at all in that case. +dnl This option affects the library ABI (except in the "single" thread model). +dnl +dnl Defines _GLIBCXX_HAVE_ATOMIC_LOCK_POLICY to 1 if atomics should be used. +dnl +AC_DEFUN([GLIBCXX_ENABLE_LOCK_POLICY], [ + + AC_ARG_WITH([libstdcxx-lock-policy], + AC_HELP_STRING([--with-libstdcxx-lock-policy={atomic,mutex,auto}], + [synchronization policy for shared_ptr reference counting [default=auto]]), + [libstdcxx_atomic_lock_policy=$withval], + [libstdcxx_atomic_lock_policy=auto]) + + case "$libstdcxx_atomic_lock_policy" in + atomic|mutex|auto) ;; + *) AC_MSG_ERROR([Invalid argument for --with-libstdcxx-lock-policy]) ;; + esac + AC_MSG_CHECKING([for lock policy for shared_ptr reference counts]) + + if test x"$libstdcxx_atomic_lock_policy" = x"auto"; then + AC_LANG_SAVE + AC_LANG_CPLUSPLUS + ac_save_CXXFLAGS="$CXXFLAGS" + + dnl Why do we care about 2-byte CAS on targets with 4-byte _Atomic_word?! + AC_TRY_COMPILE([ + #if ! defined __GCC_HAVE_SYNC_COMPARE_AND_SWAP_2 + # error "No 2-byte compare-and-swap" + #elif ! defined __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4 + # error "No 4-byte compare-and-swap" + #endif + ],, + [libstdcxx_atomic_lock_policy=atomic], + [libstdcxx_atomic_lock_policy=mutex]) + AC_LANG_RESTORE + CXXFLAGS="$ac_save_CXXFLAGS" + fi + + if test x"$libstdcxx_atomic_lock_policy" = x"atomic"; then + AC_MSG_RESULT(atomic) + AC_DEFINE(HAVE_ATOMIC_LOCK_POLICY,1, + [Defined if shared_ptr reference counting should use atomic operations.]) + else + AC_MSG_RESULT(mutex) + fi + +]) dnl dnl Allow visibility attributes to be used on namespaces, objects, etc. diff --git a/libstdc++-v3/configure.ac b/libstdc++-v3/configure.ac index a657a726778..ad5b4117cfd 100644 --- a/libstdc++-v3/configure.ac +++ b/libstdc++-v3/configure.ac @@ -149,6 +149,7 @@ GLIBCXX_ENABLE_VERBOSE GLIBCXX_ENABLE_PCH($is_hosted) GLIBCXX_ENABLE_THREADS GLIBCXX_ENABLE_ATOMIC_BUILTINS +GLIBCXX_ENABLE_LOCK_POLICY GLIBCXX_ENABLE_DECIMAL_FLOAT GLIBCXX_ENABLE_INT128_FLOAT128 if test "$enable_float128" = yes; then diff --git a/libstdc++-v3/doc/xml/manual/configure.xml b/libstdc++-v3/doc/xml/manual/configure.xml index ac383cf7fa7..d296c8d8a49 100644 --- a/libstdc++-v3/doc/xml/manual/configure.xml +++ b/libstdc++-v3/doc/xml/manual/configure.xml @@ -399,6 +399,28 @@ </para> </listitem></varlistentry> + <varlistentry><term><code>--with-libstdcxx-lock-policy=OPTION</code></term> + <listitem><para>Sets the lock policy that controls how + <classname>shared_ptr</classname> reference counting is + synchronized. + The choice OPTION=atomic enables use of atomics for updates to + <classname>shared_ptr</classname> reference counts. + The choice OPTION=mutex enables use of a mutex to synchronize updates + to <classname>shared_ptr</classname> reference counts. + If the compiler's thread model is "single" then this option has no + effect, as no synchronization is used for the reference counts. + The default is OPTION=auto, which checks for the availability of + compiler built-ins for 2-byte and 4-byte atomic compare-and-swap, + and uses OPTION=atomic if they're available, OPTION=mutex otherwise. + This option can change the library ABI. + If the library is configured to use atomics and user programs are + compiled using a target that doesn't natively support the atomic + operations (e.g. the library is configured for armv7 and then code + is compiled with <option>-march=armv5t</option>) then the program + might rely on support in libgcc to provide the atomics. + </para> + </listitem></varlistentry> + <varlistentry><term><code>--enable-vtable-verify</code>[default]</term> <listitem> <para>Use <code>-fvtable-verify=std</code> to compile the C++ diff --git a/libstdc++-v3/include/bits/fs_dir.h b/libstdc++-v3/include/bits/fs_dir.h index 9ee1cb66b61..2f81a1709e4 100644 --- a/libstdc++-v3/include/bits/fs_dir.h +++ b/libstdc++-v3/include/bits/fs_dir.h @@ -403,7 +403,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11 friend class recursive_directory_iterator; - std::shared_ptr<_Dir> _M_dir; + std::__shared_ptr<_Dir> _M_dir; }; inline directory_iterator @@ -494,7 +494,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11 const recursive_directory_iterator& __rhs); struct _Dir_stack; - std::shared_ptr<_Dir_stack> _M_dirs; + std::__shared_ptr<_Dir_stack> _M_dirs; directory_options _M_options = {}; bool _M_pending = false; }; @@ -525,6 +525,14 @@ _GLIBCXX_END_NAMESPACE_CXX11 // @} group filesystem } // namespace filesystem + // Use explicit instantiations of these types. Any inconsistency in the + // value of __default_lock_policy between code including this header and + // the library will cause a linker error. + extern template class + __shared_ptr<filesystem::_Dir>; + extern template class + __shared_ptr<filesystem::recursive_directory_iterator::_Dir_stack>; + _GLIBCXX_END_NAMESPACE_VERSION } // namespace std diff --git a/libstdc++-v3/include/bits/shared_ptr_base.h b/libstdc++-v3/include/bits/shared_ptr_base.h index 46ff4a7cf29..21debf362aa 100644 --- a/libstdc++-v3/include/bits/shared_ptr_base.h +++ b/libstdc++-v3/include/bits/shared_ptr_base.h @@ -1803,7 +1803,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION mutable __weak_ptr<_Tp, _Lp> _M_weak_this; }; - template<typename _Tp, _Lock_policy _Lp, typename _Alloc, typename... _Args> + template<typename _Tp, _Lock_policy _Lp = __default_lock_policy, + typename _Alloc, typename... _Args> inline __shared_ptr<_Tp, _Lp> __allocate_shared(const _Alloc& __a, _Args&&... __args) { @@ -1811,7 +1812,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION std::forward<_Args>(__args)...); } - template<typename _Tp, _Lock_policy _Lp, typename... _Args> + template<typename _Tp, _Lock_policy _Lp = __default_lock_policy, + typename... _Args> inline __shared_ptr<_Tp, _Lp> __make_shared(_Args&&... __args) { diff --git a/libstdc++-v3/include/ext/concurrence.h b/libstdc++-v3/include/ext/concurrence.h index 302cddfa473..33ad9e06c9f 100644 --- a/libstdc++-v3/include/ext/concurrence.h +++ b/libstdc++-v3/include/ext/concurrence.h @@ -51,16 +51,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION // Compile time constant that indicates prefered locking policy in // the current configuration. static const _Lock_policy __default_lock_policy = -#ifdef __GTHREADS -#if (defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_2) \ - && defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_4)) +#ifndef __GTHREADS + _S_single; +#elif defined _GLIBCXX_HAVE_ATOMIC_LOCK_POLICY _S_atomic; #else _S_mutex; #endif -#else - _S_single; -#endif // NB: As this is used in libsupc++, need to only depend on // exception. No stdexception classes, no use of std::string. diff --git a/libstdc++-v3/src/filesystem/std-dir.cc b/libstdc++-v3/src/filesystem/std-dir.cc index 4c9a287ad80..038f635a712 100644 --- a/libstdc++-v3/src/filesystem/std-dir.cc +++ b/libstdc++-v3/src/filesystem/std-dir.cc @@ -39,6 +39,9 @@ namespace fs = std::filesystem; namespace posix = std::filesystem::__gnu_posix; +template class std::__shared_ptr<fs::_Dir>; +template class std::__shared_ptr<fs::recursive_directory_iterator::_Dir_stack>; + struct fs::_Dir : _Dir_base { _Dir(const fs::path& p, bool skip_permission_denied, error_code& ec) @@ -125,7 +128,7 @@ directory_iterator(const path& p, directory_options options, error_code* ecptr) if (dir.dirp) { - auto sp = std::make_shared<fs::_Dir>(std::move(dir)); + auto sp = std::__make_shared<fs::_Dir>(std::move(dir)); if (sp->advance(skip_permission_denied, ec)) _M_dir.swap(sp); } @@ -185,7 +188,7 @@ recursive_directory_iterator(const path& p, directory_options options, { if (ecptr) ecptr->clear(); - auto sp = std::make_shared<_Dir_stack>(); + auto sp = std::__make_shared<_Dir_stack>(); sp->push(_Dir{ dirp, p }); if (ecptr ? sp->top().advance(*ecptr) : sp->top().advance()) _M_dirs.swap(sp);