diff mbox series

PR libstdc++/67843 set shared_ptr lock policy at build-time

Message ID 20181127232555.GA3099@redhat.com
State New
Headers show
Series PR libstdc++/67843 set shared_ptr lock policy at build-time | expand

Commit Message

Jonathan Wakely Nov. 27, 2018, 11:25 p.m. UTC
This resolves a longstanding issue where the lock policy for shared_ptr
reference counting depends on compilation options when the header is
included, so that different -march options can cause ABI changes. For
example, objects compiled with -march=armv7 will use atomics to
synchronize reference counts, and objects compiled with -march=armv5t
will use a mutex. That means the shared_ptr control block will have a
different layout in different objects, causing ODR violations and
undefined behaviour. This was the root cause of PR libstdc++/42734 as
well as PR libstdc++/67843.

The solution is to decide on the lock policy at build time, when
libstdc++ is configured. The configure script checks for the
availability of the necessary atomic built-ins for the target and fixes
that choice permanently. Different -march flags used to compile user
code will not cause changes to the lock policy. This results in an ABI
change for certain compilations, but only where there was already an ABI
incompatibility between the libstdc++.so library and objects built with
an incompatible -march option. In general, this means a more stable ABI
that isn't silently altered when -march flags make addition atomic ops
available.

To force a target to use "atomic" or "mutex" the new configure option
--with-libstdcxx-lock-policy can be used.

In order to turn ODR violations into linker errors, the uses of
shared_ptr in filesystem directory iterators have been replaced
with __shared_ptr, and explicit instantiations are declared. This
ensures that object files using those types cannot link to libstdc++
libs unless they use the same lock policy.

	PR libstdc++/67843
	* acinclude.m4 (GLIBCXX_ENABLE_LOCK_POLICY): Add new macro
	that defines _GLIBCXX_HAVE_ATOMIC_LOCK_POLICY.
	* config.h.in: Regenerate.
	* configure: Regenerate.
	* configure.ac: Use GLIBCXX_ENABLE_LOCK_POLICY.
	* doc/xml/manual/configure.xml: Document new configure option.
	* include/bits/fs_dir.h (directory_iterator): Use __shared_ptr
	instead of shared_ptr.
	(recursive_directory_iterator): Likewise.
	(__shared_ptr<_Dir>): Add explicit instantiation declaration.
	(__shared_ptr<recursive_directory_iterator::_Dir_stack>): Likewise.
	* include/bits/shared_ptr_base.h (__allocate_shared, __make_shared):
	Add default template argument for _Lock_policy template parameter.
	* include/ext/concurrence.h (__default_lock_policy): Check macro
	_GLIBCXX_HAVE_ATOMIC_LOCK_POLICY instead of checking if the current
	target supports the builtins for compare-and-swap.
	* src/filesystem/std-dir.cc (__shared_ptr<_Dir>): Add explicit
	instantiation definition.
	(__shared_ptr<recursive_directory_iterator::_Dir_stack>): Likewise.
	(directory_iterator, recursive_directory_iterator): Use __make_shared
	instead of make_shared.

Tested x86_64-linux and aarch64-linux, committed to trunk.

I would appreciate testing on ARM, especially Christophe's
-march=armv5t set up.
commit 5ae8a02becb13b5d9c0cdd72bce5312efd6bc569
Author: Jonathan Wakely <jwakely@redhat.com>
Date:   Tue Nov 27 12:13:56 2018 +0000

    PR libstdc++/67843 set shared_ptr lock policy at build-time
    
    This resolves a longstanding issue where the lock policy for shared_ptr
    reference counting depends on compilation options when the header is
    included, so that different -march options can cause ABI changes. For
    example, objects compiled with -march=armv7 will use atomics to
    synchronize reference counts, and objects compiled with -march=armv5t
    will use a mutex. That means the shared_ptr control block will have a
    different layout in different objects, causing ODR violations and
    undefined behaviour. This was the root cause of PR libstdc++/42734 as
    well as PR libstdc++/67843.
    
    The solution is to decide on the lock policy at build time, when
    libstdc++ is configured. The configure script checks for the
    availability of the necessary atomic built-ins for the target and fixes
    that choice permanently. Different -march flags used to compile user
    code will not cause changes to the lock policy. This results in an ABI
    change for certain compilations, but only where there was already an ABI
    incompatibility between the libstdc++.so library and objects built with
    an incompatible -march option. In general, this means a more stable ABI
    that isn't silently altered when -march flags make addition atomic ops
    available.
    
    To force a target to use "atomic" or "mutex" the new configure option
    --with-libstdcxx-lock-policy can be used.
    
    In order to turn ODR violations into linker errors, the uses of
    shared_ptr in filesystem directory iterators have been replaced
    with __shared_ptr, and explicit instantiations are declared. This
    ensures that object files using those types cannot link to libstdc++
    libs unless they use the same lock policy.
    
            PR libstdc++/67843
            * acinclude.m4 (GLIBCXX_ENABLE_LOCK_POLICY): Add new macro
            that defines _GLIBCXX_HAVE_ATOMIC_LOCK_POLICY.
            * config.h.in: Regenerate.
            * configure: Regenerate.
            * configure.ac: Use GLIBCXX_ENABLE_LOCK_POLICY.
            * doc/xml/manual/configure.xml: Document new configure option.
            * include/bits/fs_dir.h (directory_iterator): Use __shared_ptr
            instead of shared_ptr.
            (recursive_directory_iterator): Likewise.
            (__shared_ptr<_Dir>): Add explicit instantiation declaration.
            (__shared_ptr<recursive_directory_iterator::_Dir_stack>): Likewise.
            * include/bits/shared_ptr_base.h (__allocate_shared, __make_shared):
            Add default template argument for _Lock_policy template parameter.
            * include/ext/concurrence.h (__default_lock_policy): Check macro
            _GLIBCXX_HAVE_ATOMIC_LOCK_POLICY instead of checking if the current
            target supports the builtins for compare-and-swap.
            * src/filesystem/std-dir.cc (__shared_ptr<_Dir>): Add explicit
            instantiation definition.
            (__shared_ptr<recursive_directory_iterator::_Dir_stack>): Likewise.
            (directory_iterator, recursive_directory_iterator): Use __make_shared
            instead of make_shared.

Comments

Christophe Lyon Nov. 28, 2018, 9:54 a.m. UTC | #1
On Wed, 28 Nov 2018 at 00:25, Jonathan Wakely <jwakely@redhat.com> wrote:
>
> This resolves a longstanding issue where the lock policy for shared_ptr
> reference counting depends on compilation options when the header is
> included, so that different -march options can cause ABI changes. For
> example, objects compiled with -march=armv7 will use atomics to
> synchronize reference counts, and objects compiled with -march=armv5t
> will use a mutex. That means the shared_ptr control block will have a
> different layout in different objects, causing ODR violations and
> undefined behaviour. This was the root cause of PR libstdc++/42734 as
> well as PR libstdc++/67843.
>
> The solution is to decide on the lock policy at build time, when
> libstdc++ is configured. The configure script checks for the
> availability of the necessary atomic built-ins for the target and fixes
> that choice permanently. Different -march flags used to compile user
> code will not cause changes to the lock policy. This results in an ABI
> change for certain compilations, but only where there was already an ABI
> incompatibility between the libstdc++.so library and objects built with
> an incompatible -march option. In general, this means a more stable ABI
> that isn't silently altered when -march flags make addition atomic ops
> available.
>
> To force a target to use "atomic" or "mutex" the new configure option
> --with-libstdcxx-lock-policy can be used.
>
> In order to turn ODR violations into linker errors, the uses of
> shared_ptr in filesystem directory iterators have been replaced
> with __shared_ptr, and explicit instantiations are declared. This
> ensures that object files using those types cannot link to libstdc++
> libs unless they use the same lock policy.
>
>         PR libstdc++/67843
>         * acinclude.m4 (GLIBCXX_ENABLE_LOCK_POLICY): Add new macro
>         that defines _GLIBCXX_HAVE_ATOMIC_LOCK_POLICY.
>         * config.h.in: Regenerate.
>         * configure: Regenerate.
>         * configure.ac: Use GLIBCXX_ENABLE_LOCK_POLICY.
>         * doc/xml/manual/configure.xml: Document new configure option.
>         * include/bits/fs_dir.h (directory_iterator): Use __shared_ptr
>         instead of shared_ptr.
>         (recursive_directory_iterator): Likewise.
>         (__shared_ptr<_Dir>): Add explicit instantiation declaration.
>         (__shared_ptr<recursive_directory_iterator::_Dir_stack>): Likewise.
>         * include/bits/shared_ptr_base.h (__allocate_shared, __make_shared):
>         Add default template argument for _Lock_policy template parameter.
>         * include/ext/concurrence.h (__default_lock_policy): Check macro
>         _GLIBCXX_HAVE_ATOMIC_LOCK_POLICY instead of checking if the current
>         target supports the builtins for compare-and-swap.
>         * src/filesystem/std-dir.cc (__shared_ptr<_Dir>): Add explicit
>         instantiation definition.
>         (__shared_ptr<recursive_directory_iterator::_Dir_stack>): Likewise.
>         (directory_iterator, recursive_directory_iterator): Use __make_shared
>         instead of make_shared.
>
> Tested x86_64-linux and aarch64-linux, committed to trunk.
>
> I would appreciate testing on ARM, especially Christophe's
> -march=armv5t set up.
>

Hi Jonathan,

Your patch passed my tests, thanks!
directory_iterator.cc tests used to be killed by a timeout until last
night where now they do PASS in my configurations involving
-march=armv5t.

That being said, I'm not sure I fully understand the fix. In my tests
involving -march=armv5t, I still configure GCC --with-cpu=cortex-a9,
and I pass -march=armv5t as an override with the --target-board runtest option.

In these cases libstdc++ configure detects support for "atomic", so I
suspect the tests pass only because I use QEMU with --cpu cortex-a9.
I think if I used a different QEMU cpu (eg arm926), the tests would
try to use atomics, and fail?

The reason I'm still using cortex-a9 at QEMU level is that many tests
override -mcpu/-march, and I used that as a compromise.

However, I do have configurations using --with-cpu=arm10tdmi or
--with-cpu=default and QEMU --cpu arm926.
In these 2 cases, the tests do PASS, but they used to before your
patch. libstdc++ configure does detect "mutex" in these cases.

To be hopefully clearer, the subset of relevant configurations I use
is as follows:
GCC target / with-cpu / RUNTEST target-board / QEMU cpu
1- arm-none-linux-gnueabi[hf]  / cortex-a9 / -march=armv5t / cortex-a9
2- arm-none-linux-gnueabihf / arm10tdmi / "" / arm926
3- arm-none-eabi / default / "" / arm926

(1) uses atomics
(2) and (3) use mutex

All of them now say "PASS", but maybe I should give a try switch to
arm926 for QEMU in (1) ?

Thanks,

Christophe
Richard Biener Nov. 28, 2018, 10:46 a.m. UTC | #2
On Wed, Nov 28, 2018 at 12:26 AM Jonathan Wakely <jwakely@redhat.com> wrote:
>
> This resolves a longstanding issue where the lock policy for shared_ptr
> reference counting depends on compilation options when the header is
> included, so that different -march options can cause ABI changes. For
> example, objects compiled with -march=armv7 will use atomics to
> synchronize reference counts, and objects compiled with -march=armv5t
> will use a mutex. That means the shared_ptr control block will have a
> different layout in different objects, causing ODR violations and
> undefined behaviour. This was the root cause of PR libstdc++/42734 as
> well as PR libstdc++/67843.
>
> The solution is to decide on the lock policy at build time, when
> libstdc++ is configured. The configure script checks for the
> availability of the necessary atomic built-ins for the target and fixes
> that choice permanently. Different -march flags used to compile user
> code will not cause changes to the lock policy. This results in an ABI
> change for certain compilations, but only where there was already an ABI
> incompatibility between the libstdc++.so library and objects built with
> an incompatible -march option. In general, this means a more stable ABI
> that isn't silently altered when -march flags make addition atomic ops
> available.
>
> To force a target to use "atomic" or "mutex" the new configure option
> --with-libstdcxx-lock-policy can be used.
>
> In order to turn ODR violations into linker errors, the uses of
> shared_ptr in filesystem directory iterators have been replaced
> with __shared_ptr, and explicit instantiations are declared. This
> ensures that object files using those types cannot link to libstdc++
> libs unless they use the same lock policy.

Would it be possible to have both in libstdc++ and with differnet mangling of
different kind ensure compatibility between different user CUs?  Or is
that too awkward for the user to get right?

Richard.

>         PR libstdc++/67843
>         * acinclude.m4 (GLIBCXX_ENABLE_LOCK_POLICY): Add new macro
>         that defines _GLIBCXX_HAVE_ATOMIC_LOCK_POLICY.
>         * config.h.in: Regenerate.
>         * configure: Regenerate.
>         * configure.ac: Use GLIBCXX_ENABLE_LOCK_POLICY.
>         * doc/xml/manual/configure.xml: Document new configure option.
>         * include/bits/fs_dir.h (directory_iterator): Use __shared_ptr
>         instead of shared_ptr.
>         (recursive_directory_iterator): Likewise.
>         (__shared_ptr<_Dir>): Add explicit instantiation declaration.
>         (__shared_ptr<recursive_directory_iterator::_Dir_stack>): Likewise.
>         * include/bits/shared_ptr_base.h (__allocate_shared, __make_shared):
>         Add default template argument for _Lock_policy template parameter.
>         * include/ext/concurrence.h (__default_lock_policy): Check macro
>         _GLIBCXX_HAVE_ATOMIC_LOCK_POLICY instead of checking if the current
>         target supports the builtins for compare-and-swap.
>         * src/filesystem/std-dir.cc (__shared_ptr<_Dir>): Add explicit
>         instantiation definition.
>         (__shared_ptr<recursive_directory_iterator::_Dir_stack>): Likewise.
>         (directory_iterator, recursive_directory_iterator): Use __make_shared
>         instead of make_shared.
>
> Tested x86_64-linux and aarch64-linux, committed to trunk.
>
> I would appreciate testing on ARM, especially Christophe's
> -march=armv5t set up.
>
Jonathan Wakely Nov. 28, 2018, 11:11 a.m. UTC | #3
On 28/11/18 11:46 +0100, Richard Biener wrote:
>On Wed, Nov 28, 2018 at 12:26 AM Jonathan Wakely <jwakely@redhat.com> wrote:
>>
>> This resolves a longstanding issue where the lock policy for shared_ptr
>> reference counting depends on compilation options when the header is
>> included, so that different -march options can cause ABI changes. For
>> example, objects compiled with -march=armv7 will use atomics to
>> synchronize reference counts, and objects compiled with -march=armv5t
>> will use a mutex. That means the shared_ptr control block will have a
>> different layout in different objects, causing ODR violations and
>> undefined behaviour. This was the root cause of PR libstdc++/42734 as
>> well as PR libstdc++/67843.
>>
>> The solution is to decide on the lock policy at build time, when
>> libstdc++ is configured. The configure script checks for the
>> availability of the necessary atomic built-ins for the target and fixes
>> that choice permanently. Different -march flags used to compile user
>> code will not cause changes to the lock policy. This results in an ABI
>> change for certain compilations, but only where there was already an ABI
>> incompatibility between the libstdc++.so library and objects built with
>> an incompatible -march option. In general, this means a more stable ABI
>> that isn't silently altered when -march flags make addition atomic ops
>> available.
>>
>> To force a target to use "atomic" or "mutex" the new configure option
>> --with-libstdcxx-lock-policy can be used.
>>
>> In order to turn ODR violations into linker errors, the uses of
>> shared_ptr in filesystem directory iterators have been replaced
>> with __shared_ptr, and explicit instantiations are declared. This
>> ensures that object files using those types cannot link to libstdc++
>> libs unless they use the same lock policy.
>
>Would it be possible to have both in libstdc++ and with differnet mangling of
>different kind ensure compatibility between different user CUs?  Or is
>that too awkward for the user to get right?

It would mean duplicating a lot more code, which is already duplicated
once for the std::string ABI, so we'd have four permuations!

It still wouldn't ensure compatibility between different user CUs,
only between any user CU and any build of libstdc++. Different user
CUs would still disagree on the ABI of the types, and so couldn't pass
them between CUs. I see no advantage to supporting that for the
std::filesystem library (unlike std::string and std::iostream, which
are probably used in the majority of CUs).

I do not want to get to the point where every type in libstdc++ exists
multiple times and you select some combination via command-line flags.
It's already becoming unmanageable with multiple std::string and long
double ABIs.
Jonathan Wakely Nov. 28, 2018, 11:35 a.m. UTC | #4
On 28/11/18 10:54 +0100, Christophe Lyon wrote:
>On Wed, 28 Nov 2018 at 00:25, Jonathan Wakely <jwakely@redhat.com> wrote:
>>
>> This resolves a longstanding issue where the lock policy for shared_ptr
>> reference counting depends on compilation options when the header is
>> included, so that different -march options can cause ABI changes. For
>> example, objects compiled with -march=armv7 will use atomics to
>> synchronize reference counts, and objects compiled with -march=armv5t
>> will use a mutex. That means the shared_ptr control block will have a
>> different layout in different objects, causing ODR violations and
>> undefined behaviour. This was the root cause of PR libstdc++/42734 as
>> well as PR libstdc++/67843.
>>
>> The solution is to decide on the lock policy at build time, when
>> libstdc++ is configured. The configure script checks for the
>> availability of the necessary atomic built-ins for the target and fixes
>> that choice permanently. Different -march flags used to compile user
>> code will not cause changes to the lock policy. This results in an ABI
>> change for certain compilations, but only where there was already an ABI
>> incompatibility between the libstdc++.so library and objects built with
>> an incompatible -march option. In general, this means a more stable ABI
>> that isn't silently altered when -march flags make addition atomic ops
>> available.
>>
>> To force a target to use "atomic" or "mutex" the new configure option
>> --with-libstdcxx-lock-policy can be used.
>>
>> In order to turn ODR violations into linker errors, the uses of
>> shared_ptr in filesystem directory iterators have been replaced
>> with __shared_ptr, and explicit instantiations are declared. This
>> ensures that object files using those types cannot link to libstdc++
>> libs unless they use the same lock policy.
>>
>>         PR libstdc++/67843
>>         * acinclude.m4 (GLIBCXX_ENABLE_LOCK_POLICY): Add new macro
>>         that defines _GLIBCXX_HAVE_ATOMIC_LOCK_POLICY.
>>         * config.h.in: Regenerate.
>>         * configure: Regenerate.
>>         * configure.ac: Use GLIBCXX_ENABLE_LOCK_POLICY.
>>         * doc/xml/manual/configure.xml: Document new configure option.
>>         * include/bits/fs_dir.h (directory_iterator): Use __shared_ptr
>>         instead of shared_ptr.
>>         (recursive_directory_iterator): Likewise.
>>         (__shared_ptr<_Dir>): Add explicit instantiation declaration.
>>         (__shared_ptr<recursive_directory_iterator::_Dir_stack>): Likewise.
>>         * include/bits/shared_ptr_base.h (__allocate_shared, __make_shared):
>>         Add default template argument for _Lock_policy template parameter.
>>         * include/ext/concurrence.h (__default_lock_policy): Check macro
>>         _GLIBCXX_HAVE_ATOMIC_LOCK_POLICY instead of checking if the current
>>         target supports the builtins for compare-and-swap.
>>         * src/filesystem/std-dir.cc (__shared_ptr<_Dir>): Add explicit
>>         instantiation definition.
>>         (__shared_ptr<recursive_directory_iterator::_Dir_stack>): Likewise.
>>         (directory_iterator, recursive_directory_iterator): Use __make_shared
>>         instead of make_shared.
>>
>> Tested x86_64-linux and aarch64-linux, committed to trunk.
>>
>> I would appreciate testing on ARM, especially Christophe's
>> -march=armv5t set up.
>>
>
>Hi Jonathan,
>
>Your patch passed my tests, thanks!

Thanks for checking it.

>directory_iterator.cc tests used to be killed by a timeout until last
>night where now they do PASS in my configurations involving
>-march=armv5t.
>
>That being said, I'm not sure I fully understand the fix. In my tests
>involving -march=armv5t, I still configure GCC --with-cpu=cortex-a9,
>and I pass -march=armv5t as an override with the --target-board runtest option.

Yup, and that difference between the arch used to build libstdc++ and
the arch used to build the tests was what caused the FAILures. And
that's what my patch addresses, by ensuring that the arch used to
build libstdc++ is what dictates the type of synchronization for
shared_ptr.

>In these cases libstdc++ configure detects support for "atomic", so I
>suspect the tests pass only because I use QEMU with --cpu cortex-a9.
>I think if I used a different QEMU cpu (eg arm926), the tests would
>try to use atomics, and fail?

I don't know if a libstdc++ configured for cortex-a9 would run on an
arm296 QEMU. It might depend on cortex-a9 instructions (unrelated to
the use of atomics in my patch). See the end of this mail.

>The reason I'm still using cortex-a9 at QEMU level is that many tests
>override -mcpu/-march, and I used that as a compromise.

That means you're not fully testing armv5t, because you're still
linking to a libstdc++.so (and libstd++fs.a) that use the cortex-a9
instruction set. Usually that doesn't matter, because most libstdc++
code doesn't care about which specific instructions it gets compiled
to. That 

>However, I do have configurations using --with-cpu=arm10tdmi or
>--with-cpu=default and QEMU --cpu arm926.
>In these 2 cases, the tests do PASS, but they used to before your
>patch. libstdc++ configure does detect "mutex" in these cases.

What matters is that the tests and the libstdc++ libraries agree on the
lock policy. It doesn't matter if they both use "atomic" or they both
use "mutex", only that they agree. Previously the lock policy was
selected by the preprocessor in every translation unit. That made it
very easy to get incompatible sets of objects. Now the selection is
made once per build of GCC, and is not affected by -march options when
compiling user code.

>To be hopefully clearer, the subset of relevant configurations I use
>is as follows:
>GCC target / with-cpu / RUNTEST target-board / QEMU cpu
>1- arm-none-linux-gnueabi[hf]  / cortex-a9 / -march=armv5t / cortex-a9
>2- arm-none-linux-gnueabihf / arm10tdmi / "" / arm926
>3- arm-none-eabi / default / "" / arm926
>
>(1) uses atomics
>(2) and (3) use mutex
>
>All of them now say "PASS", but maybe I should give a try switch to
>arm926 for QEMU in (1) ?

That would be a useful test.

If I understand correctly, that will still work because the atomic ops
will use the kernel helper in libgcc. Previously it would have failed
the same way as (1) was already failing, because --with-cpu=cortex-a9
when libstdc++ was built chose atomics, and the -march=armv5t target
board chose mutexes, and so the testsuite files were incompatible with
the library.

Since my patch, only the --with-cpu option matters, and the testsuite
files will still use atomics. I think the arm926 cpu running under
QEMU will implement those atomics via libgcc's kernel helpers.

However, that test might fail for other reasons, if the libstdc++ libs
configured for cortex-a9 use other insns that are not supported by
arm296 then you'd get a SIGILL when running the tests.  The kernel
helpers only implement the atomic operations, not any cortex-a9
instructions that are not supported by arm296.
Jonathan Wakely Nov. 28, 2018, 12:30 p.m. UTC | #5
On 28/11/18 11:11 +0000, Jonathan Wakely wrote:
>On 28/11/18 11:46 +0100, Richard Biener wrote:
>>On Wed, Nov 28, 2018 at 12:26 AM Jonathan Wakely <jwakely@redhat.com> wrote:
>>>
>>>This resolves a longstanding issue where the lock policy for shared_ptr
>>>reference counting depends on compilation options when the header is
>>>included, so that different -march options can cause ABI changes. For
>>>example, objects compiled with -march=armv7 will use atomics to
>>>synchronize reference counts, and objects compiled with -march=armv5t
>>>will use a mutex. That means the shared_ptr control block will have a
>>>different layout in different objects, causing ODR violations and
>>>undefined behaviour. This was the root cause of PR libstdc++/42734 as
>>>well as PR libstdc++/67843.
>>>
>>>The solution is to decide on the lock policy at build time, when
>>>libstdc++ is configured. The configure script checks for the
>>>availability of the necessary atomic built-ins for the target and fixes
>>>that choice permanently. Different -march flags used to compile user
>>>code will not cause changes to the lock policy. This results in an ABI
>>>change for certain compilations, but only where there was already an ABI
>>>incompatibility between the libstdc++.so library and objects built with
>>>an incompatible -march option. In general, this means a more stable ABI
>>>that isn't silently altered when -march flags make addition atomic ops
>>>available.
>>>
>>>To force a target to use "atomic" or "mutex" the new configure option
>>>--with-libstdcxx-lock-policy can be used.
>>>
>>>In order to turn ODR violations into linker errors, the uses of
>>>shared_ptr in filesystem directory iterators have been replaced
>>>with __shared_ptr, and explicit instantiations are declared. This
>>>ensures that object files using those types cannot link to libstdc++
>>>libs unless they use the same lock policy.
>>
>>Would it be possible to have both in libstdc++ and with differnet mangling of
>>different kind ensure compatibility between different user CUs?  Or is
>>that too awkward for the user to get right?
>
>It would mean duplicating a lot more code, which is already duplicated
>once for the std::string ABI, so we'd have four permuations!
>
>It still wouldn't ensure compatibility between different user CUs,
>only between any user CU and any build of libstdc++. Different user
>CUs would still disagree on the ABI of the types, and so couldn't pass
>them between CUs. I see no advantage to supporting that for the
>std::filesystem library (unlike std::string and std::iostream, which
>are probably used in the majority of CUs).
>
>I do not want to get to the point where every type in libstdc++ exists
>multiple times and you select some combination via command-line flags.
>It's already becoming unmanageable with multiple std::string and long
>double ABIs.

Also, in practice, I don't see a need. The common cases where this bug
arises are limited to 32-bit ARM, and for arm-linux the kernel helpers
mean that you can always use "atomic". For bare metal ARM toolchains
you probably don't want to be mixing CUs built against different
versions of libstdc++ anyway. You have your toolchain for the board,
and you use that. If it is configured to use "atomic", then that's
what you get. If it's configured to use "mutex", then that's what you
get. You probably don't want to use a toolchain configured for a
different board, but you could use --with-libstdcxx-lock-policy to
ensure a consistent policy across those toolchains if needed.

It's also possible for problems to arise on i386. If you build GCC for
i386 then it will choose "mutex" and be incompatible with a libstdc++
built for i486 or later. In that case, you could use the option
--with-libstdcxx-lock-policy=atomic to force the use of "atomic" (and
then you'd need to link to libatomic to provide the ops, as there are
no kernel helpers for i386 in libgcc).

There are a few other things we could do though:

1) As well as the __default_lock_policy value, we could add a
"library's preferred lock policy" value, which could be different from
__default_lock_policy. That policy would get used for the __shared_ptr
objects in the libstdc++ API/ABI, and would have to be consistent for
all CUs in the library and using its headers. That solves the problem
of user CUs being incompatible with the library because they used
-march. It doesn't change the fact that a libstdc++.so using "atomic"
is not compatible with one using "mutex". For this to work the library
always needs to use std::__shared_ptr<X, __library_lock_policy>
instead of std::shared_ptr<X> so that the policy is explicit (which I
think is a good rule anyway).

1a) Then (if we wanted) we could restore the old behaviour where
__default_lock_policy is chosen by the preprocessor and can vary
between user CUs. If users want to compile different CUs with
different lock policies, that's their choice. They risk creating
incompatible shared_ptr objects which cannot be passed between CUs,
but it becomes their responsibility to get it right. The library
always uses its preferred policy, and so won't get it wrong. This
seems easy for users to get wrong, but would arguably be more
backwards compatible (by still allowing users to get silent UB that
only fails at runtime).

1b) Or (if we wanted) we could still fix __default_lock_policy at
configure-time, but add a second version of the new option
--with-libstdcxx-lock-policy, so the default policy for user CUs and
the library's preferred policy can be chosen independently. You could
say you want the library to always use "mutex" but have two builds of
GCC, one where user CUs use "atomic" and one where they use "mutex".
The libstdc++.so used by those two GCC builds would be compatible, but
user CUs that use std::shared_ptr would not be.

Options 1a and 1b are mutually exclusive.

2) We could consider using __attribute__((__abi_tag__("..."))) on
std::shared_ptr when the lock policy chosen by the preprocessor
doesn't match what would have been chosen when libstdc++ was built. So
if a user compiles with a -march option that changes the lock
policy, we change the mangling of std::shared_ptr. This would still
allow incompatible std::shared_ptr objects, but they would cause
linker errors. For PR 42734 and PR 67843 that would have meant the
programs fail to link, instead of having runtime UB. The solution I
committed makes them link and run correctly, which seems better than
just saying "no, you can't do that".

3) We could change libstdc++'s configure to automatically override
--with-libstdcxx-lock-policy for arm-linux so it always uses "atomic"
(because we know the kernel helpers are available). That causes an ABI
change for a GCC built for --target=armv5t-*-linux* so I didn't do
that by default. Packagers who want that can use the --with option
explicitly to build a libstdc++ with atomic refcounting that can be
used on any armv5t or later CPU, allowing users to choose a newer
-march for their own code. (Until my patch that didn't work, because


Option 1) on its own seems to have no advantage over what I committed.

Option 1a) is still too easy for users to get wrong and get UB, only
now it's all their fault not my fault. That doesn't really help users.

Option 1b) makes it harder for users to get wrong (they have to mix
CUs built by different toolchains, not just built with different
-march options). I'm not sure there's much benefit to being able to
control the library policy separately from the user policy though.

Option 2) makes the failures happen earlier (when linking, not
runtime) but doesn't actually make anything work that failed
previously.

Option 3) is not my call to make. If the ARM maintainers want to
change the default older arm-linux targets I have no objections.
Jonathan Wakely Nov. 28, 2018, 12:39 p.m. UTC | #6
On 28/11/18 11:35 +0000, Jonathan Wakely wrote:
>On 28/11/18 10:54 +0100, Christophe Lyon wrote:
>>The reason I'm still using cortex-a9 at QEMU level is that many tests
>>override -mcpu/-march, and I used that as a compromise.
>
>That means you're not fully testing armv5t, because you're still
>linking to a libstdc++.so (and libstd++fs.a) that use the cortex-a9
>instruction set. Usually that doesn't matter, because most libstdc++
>code doesn't care about which specific instructions it gets compiled
>to. That

Oops, sorry, unfinished thought. I was going to say:

That is not true for the shared_ptr code, which depends on whether
atomic instructions are supported. There are also some parts of
<random> that depend on specific instructions being supported (but
some of them do runtime checks for the CPU).
Jonathan Wakely Nov. 28, 2018, 12:49 p.m. UTC | #7
On 28/11/18 12:30 +0000, Jonathan Wakely wrote:
>3) We could change libstdc++'s configure to automatically override
>--with-libstdcxx-lock-policy for arm-linux so it always uses "atomic"
>(because we know the kernel helpers are available). That causes an ABI
>change for a GCC built for --target=armv5t-*-linux* so I didn't do
>that by default. Packagers who want that can use the --with option
>explicitly to build a libstdc++ with atomic refcounting that can be
>used on any armv5t or later CPU, allowing users to choose a newer
>-march for their own code. (Until my patch that didn't work, because

[...]

>Option 3) is not my call to make. If the ARM maintainers want to
>change the default older arm-linux targets I have no objections.

Another way to approach option 3) that we discussed at Cauldron, and
which I want to start a separate discussion about on gcc@gcc.gnu.org,
is to change the value of __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4 et al
when we have kernel helpers to implement those operations.

The shared_ptr code doesn't care if an atomic CAS comes from the CPU
or a kernel helper in libgcc. If the atomic CAS is supported via
kernel helpers, let's use it! But those predefined macros only tell
you about native CPU support (for the current target selected by
-march).


It's worth noting that all this shared_ptr code predates libatomic, so
we couldn't just say "always use atomics, and link to libatomic if
needed". If __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4 was not defined, we had
to assume no CAS *at all*. Not native, not provided by libatomic,
nothing. I'd love to simply rely on std::atomic in std::shared_ptr,
but that would be an ABI break for targets currently using a mutex,
and would add new dependencies on libatomic for some targets. (It
might also pessimize some single-threaded programs on targets that do
use atomic ref-counts, because currently some updates are non-atomic
when __gthread_active_p() == false).
Richard Biener Nov. 28, 2018, 1:46 p.m. UTC | #8
On Wed, Nov 28, 2018 at 1:30 PM Jonathan Wakely <jwakely@redhat.com> wrote:
>
> On 28/11/18 11:11 +0000, Jonathan Wakely wrote:
> >On 28/11/18 11:46 +0100, Richard Biener wrote:
> >>On Wed, Nov 28, 2018 at 12:26 AM Jonathan Wakely <jwakely@redhat.com> wrote:
> >>>
> >>>This resolves a longstanding issue where the lock policy for shared_ptr
> >>>reference counting depends on compilation options when the header is
> >>>included, so that different -march options can cause ABI changes. For
> >>>example, objects compiled with -march=armv7 will use atomics to
> >>>synchronize reference counts, and objects compiled with -march=armv5t
> >>>will use a mutex. That means the shared_ptr control block will have a
> >>>different layout in different objects, causing ODR violations and
> >>>undefined behaviour. This was the root cause of PR libstdc++/42734 as
> >>>well as PR libstdc++/67843.
> >>>
> >>>The solution is to decide on the lock policy at build time, when
> >>>libstdc++ is configured. The configure script checks for the
> >>>availability of the necessary atomic built-ins for the target and fixes
> >>>that choice permanently. Different -march flags used to compile user
> >>>code will not cause changes to the lock policy. This results in an ABI
> >>>change for certain compilations, but only where there was already an ABI
> >>>incompatibility between the libstdc++.so library and objects built with
> >>>an incompatible -march option. In general, this means a more stable ABI
> >>>that isn't silently altered when -march flags make addition atomic ops
> >>>available.
> >>>
> >>>To force a target to use "atomic" or "mutex" the new configure option
> >>>--with-libstdcxx-lock-policy can be used.
> >>>
> >>>In order to turn ODR violations into linker errors, the uses of
> >>>shared_ptr in filesystem directory iterators have been replaced
> >>>with __shared_ptr, and explicit instantiations are declared. This
> >>>ensures that object files using those types cannot link to libstdc++
> >>>libs unless they use the same lock policy.
> >>
> >>Would it be possible to have both in libstdc++ and with differnet mangling of
> >>different kind ensure compatibility between different user CUs?  Or is
> >>that too awkward for the user to get right?
> >
> >It would mean duplicating a lot more code, which is already duplicated
> >once for the std::string ABI, so we'd have four permuations!
> >
> >It still wouldn't ensure compatibility between different user CUs,
> >only between any user CU and any build of libstdc++. Different user
> >CUs would still disagree on the ABI of the types, and so couldn't pass
> >them between CUs. I see no advantage to supporting that for the
> >std::filesystem library (unlike std::string and std::iostream, which
> >are probably used in the majority of CUs).
> >
> >I do not want to get to the point where every type in libstdc++ exists
> >multiple times and you select some combination via command-line flags.
> >It's already becoming unmanageable with multiple std::string and long
> >double ABIs.
>
> Also, in practice, I don't see a need. The common cases where this bug
> arises are limited to 32-bit ARM, and for arm-linux the kernel helpers
> mean that you can always use "atomic". For bare metal ARM toolchains
> you probably don't want to be mixing CUs built against different
> versions of libstdc++ anyway. You have your toolchain for the board,
> and you use that. If it is configured to use "atomic", then that's
> what you get. If it's configured to use "mutex", then that's what you
> get. You probably don't want to use a toolchain configured for a
> different board, but you could use --with-libstdcxx-lock-policy to
> ensure a consistent policy across those toolchains if needed.
>
> It's also possible for problems to arise on i386. If you build GCC for
> i386 then it will choose "mutex" and be incompatible with a libstdc++
> built for i486 or later. In that case, you could use the option
> --with-libstdcxx-lock-policy=atomic to force the use of "atomic" (and
> then you'd need to link to libatomic to provide the ops, as there are
> no kernel helpers for i386 in libgcc).

I think the default should be part of the platform ABI somehow,
like the i386 incompatibility should better not arise.  I suppose
libstdc++ configury doesn't read in gcc/config.gcc but does it
have sth similar where it's easy to see defaults for plaforms?
There's configure.host but I'm not sure this is related at all.

Richard.

> There are a few other things we could do though:
>
> 1) As well as the __default_lock_policy value, we could add a
> "library's preferred lock policy" value, which could be different from
> __default_lock_policy. That policy would get used for the __shared_ptr
> objects in the libstdc++ API/ABI, and would have to be consistent for
> all CUs in the library and using its headers. That solves the problem
> of user CUs being incompatible with the library because they used
> -march. It doesn't change the fact that a libstdc++.so using "atomic"
> is not compatible with one using "mutex". For this to work the library
> always needs to use std::__shared_ptr<X, __library_lock_policy>
> instead of std::shared_ptr<X> so that the policy is explicit (which I
> think is a good rule anyway).
>
> 1a) Then (if we wanted) we could restore the old behaviour where
> __default_lock_policy is chosen by the preprocessor and can vary
> between user CUs. If users want to compile different CUs with
> different lock policies, that's their choice. They risk creating
> incompatible shared_ptr objects which cannot be passed between CUs,
> but it becomes their responsibility to get it right. The library
> always uses its preferred policy, and so won't get it wrong. This
> seems easy for users to get wrong, but would arguably be more
> backwards compatible (by still allowing users to get silent UB that
> only fails at runtime).
>
> 1b) Or (if we wanted) we could still fix __default_lock_policy at
> configure-time, but add a second version of the new option
> --with-libstdcxx-lock-policy, so the default policy for user CUs and
> the library's preferred policy can be chosen independently. You could
> say you want the library to always use "mutex" but have two builds of
> GCC, one where user CUs use "atomic" and one where they use "mutex".
> The libstdc++.so used by those two GCC builds would be compatible, but
> user CUs that use std::shared_ptr would not be.
>
> Options 1a and 1b are mutually exclusive.
>
> 2) We could consider using __attribute__((__abi_tag__("..."))) on
> std::shared_ptr when the lock policy chosen by the preprocessor
> doesn't match what would have been chosen when libstdc++ was built. So
> if a user compiles with a -march option that changes the lock
> policy, we change the mangling of std::shared_ptr. This would still
> allow incompatible std::shared_ptr objects, but they would cause
> linker errors. For PR 42734 and PR 67843 that would have meant the
> programs fail to link, instead of having runtime UB. The solution I
> committed makes them link and run correctly, which seems better than
> just saying "no, you can't do that".
>
> 3) We could change libstdc++'s configure to automatically override
> --with-libstdcxx-lock-policy for arm-linux so it always uses "atomic"
> (because we know the kernel helpers are available). That causes an ABI
> change for a GCC built for --target=armv5t-*-linux* so I didn't do
> that by default. Packagers who want that can use the --with option
> explicitly to build a libstdc++ with atomic refcounting that can be
> used on any armv5t or later CPU, allowing users to choose a newer
> -march for their own code. (Until my patch that didn't work, because
>
>
> Option 1) on its own seems to have no advantage over what I committed.
>
> Option 1a) is still too easy for users to get wrong and get UB, only
> now it's all their fault not my fault. That doesn't really help users.
>
> Option 1b) makes it harder for users to get wrong (they have to mix
> CUs built by different toolchains, not just built with different
> -march options). I'm not sure there's much benefit to being able to
> control the library policy separately from the user policy though.
>
> Option 2) makes the failures happen earlier (when linking, not
> runtime) but doesn't actually make anything work that failed
> previously.
>
> Option 3) is not my call to make. If the ARM maintainers want to
> change the default older arm-linux targets I have no objections.
>
>
Jonathan Wakely Nov. 28, 2018, 2:09 p.m. UTC | #9
On 28/11/18 14:46 +0100, Richard Biener wrote:
>On Wed, Nov 28, 2018 at 1:30 PM Jonathan Wakely <jwakely@redhat.com> wrote:
>>
>> On 28/11/18 11:11 +0000, Jonathan Wakely wrote:
>> >On 28/11/18 11:46 +0100, Richard Biener wrote:
>> >>On Wed, Nov 28, 2018 at 12:26 AM Jonathan Wakely <jwakely@redhat.com> wrote:
>> >>>
>> >>>This resolves a longstanding issue where the lock policy for shared_ptr
>> >>>reference counting depends on compilation options when the header is
>> >>>included, so that different -march options can cause ABI changes. For
>> >>>example, objects compiled with -march=armv7 will use atomics to
>> >>>synchronize reference counts, and objects compiled with -march=armv5t
>> >>>will use a mutex. That means the shared_ptr control block will have a
>> >>>different layout in different objects, causing ODR violations and
>> >>>undefined behaviour. This was the root cause of PR libstdc++/42734 as
>> >>>well as PR libstdc++/67843.
>> >>>
>> >>>The solution is to decide on the lock policy at build time, when
>> >>>libstdc++ is configured. The configure script checks for the
>> >>>availability of the necessary atomic built-ins for the target and fixes
>> >>>that choice permanently. Different -march flags used to compile user
>> >>>code will not cause changes to the lock policy. This results in an ABI
>> >>>change for certain compilations, but only where there was already an ABI
>> >>>incompatibility between the libstdc++.so library and objects built with
>> >>>an incompatible -march option. In general, this means a more stable ABI
>> >>>that isn't silently altered when -march flags make addition atomic ops
>> >>>available.
>> >>>
>> >>>To force a target to use "atomic" or "mutex" the new configure option
>> >>>--with-libstdcxx-lock-policy can be used.
>> >>>
>> >>>In order to turn ODR violations into linker errors, the uses of
>> >>>shared_ptr in filesystem directory iterators have been replaced
>> >>>with __shared_ptr, and explicit instantiations are declared. This
>> >>>ensures that object files using those types cannot link to libstdc++
>> >>>libs unless they use the same lock policy.
>> >>
>> >>Would it be possible to have both in libstdc++ and with differnet mangling of
>> >>different kind ensure compatibility between different user CUs?  Or is
>> >>that too awkward for the user to get right?
>> >
>> >It would mean duplicating a lot more code, which is already duplicated
>> >once for the std::string ABI, so we'd have four permuations!
>> >
>> >It still wouldn't ensure compatibility between different user CUs,
>> >only between any user CU and any build of libstdc++. Different user
>> >CUs would still disagree on the ABI of the types, and so couldn't pass
>> >them between CUs. I see no advantage to supporting that for the
>> >std::filesystem library (unlike std::string and std::iostream, which
>> >are probably used in the majority of CUs).
>> >
>> >I do not want to get to the point where every type in libstdc++ exists
>> >multiple times and you select some combination via command-line flags.
>> >It's already becoming unmanageable with multiple std::string and long
>> >double ABIs.
>>
>> Also, in practice, I don't see a need. The common cases where this bug
>> arises are limited to 32-bit ARM, and for arm-linux the kernel helpers
>> mean that you can always use "atomic". For bare metal ARM toolchains
>> you probably don't want to be mixing CUs built against different
>> versions of libstdc++ anyway. You have your toolchain for the board,
>> and you use that. If it is configured to use "atomic", then that's
>> what you get. If it's configured to use "mutex", then that's what you
>> get. You probably don't want to use a toolchain configured for a
>> different board, but you could use --with-libstdcxx-lock-policy to
>> ensure a consistent policy across those toolchains if needed.
>>
>> It's also possible for problems to arise on i386. If you build GCC for
>> i386 then it will choose "mutex" and be incompatible with a libstdc++
>> built for i486 or later. In that case, you could use the option
>> --with-libstdcxx-lock-policy=atomic to force the use of "atomic" (and
>> then you'd need to link to libatomic to provide the ops, as there are
>> no kernel helpers for i386 in libgcc).
>
>I think the default should be part of the platform ABI somehow,
>like the i386 incompatibility should better not arise.

What do you suggest to ensure that?

Just imake i386 default to the "atomic" lock policy, and require anybody
silly enough to configure for i386 to either link to libatomic, or
explicitly use --with-libstdcxx-lock-policy=mutex if they really want
to be incompatible with default i486 builds?

>I suppose
>libstdc++ configury doesn't read in gcc/config.gcc but does it
>have sth similar where it's easy to see defaults for plaforms?
>There's configure.host but I'm not sure this is related at all.

No, I don't think we do. When there are target-specific things
hardcoded it's spread out around acinclude.m4 and configure.ac
Richard Biener Nov. 29, 2018, 1:51 p.m. UTC | #10
On Wed, Nov 28, 2018 at 3:09 PM Jonathan Wakely <jwakely@redhat.com> wrote:
>
> On 28/11/18 14:46 +0100, Richard Biener wrote:
> >On Wed, Nov 28, 2018 at 1:30 PM Jonathan Wakely <jwakely@redhat.com> wrote:
> >>
> >> On 28/11/18 11:11 +0000, Jonathan Wakely wrote:
> >> >On 28/11/18 11:46 +0100, Richard Biener wrote:
> >> >>On Wed, Nov 28, 2018 at 12:26 AM Jonathan Wakely <jwakely@redhat.com> wrote:
> >> >>>
> >> >>>This resolves a longstanding issue where the lock policy for shared_ptr
> >> >>>reference counting depends on compilation options when the header is
> >> >>>included, so that different -march options can cause ABI changes. For
> >> >>>example, objects compiled with -march=armv7 will use atomics to
> >> >>>synchronize reference counts, and objects compiled with -march=armv5t
> >> >>>will use a mutex. That means the shared_ptr control block will have a
> >> >>>different layout in different objects, causing ODR violations and
> >> >>>undefined behaviour. This was the root cause of PR libstdc++/42734 as
> >> >>>well as PR libstdc++/67843.
> >> >>>
> >> >>>The solution is to decide on the lock policy at build time, when
> >> >>>libstdc++ is configured. The configure script checks for the
> >> >>>availability of the necessary atomic built-ins for the target and fixes
> >> >>>that choice permanently. Different -march flags used to compile user
> >> >>>code will not cause changes to the lock policy. This results in an ABI
> >> >>>change for certain compilations, but only where there was already an ABI
> >> >>>incompatibility between the libstdc++.so library and objects built with
> >> >>>an incompatible -march option. In general, this means a more stable ABI
> >> >>>that isn't silently altered when -march flags make addition atomic ops
> >> >>>available.
> >> >>>
> >> >>>To force a target to use "atomic" or "mutex" the new configure option
> >> >>>--with-libstdcxx-lock-policy can be used.
> >> >>>
> >> >>>In order to turn ODR violations into linker errors, the uses of
> >> >>>shared_ptr in filesystem directory iterators have been replaced
> >> >>>with __shared_ptr, and explicit instantiations are declared. This
> >> >>>ensures that object files using those types cannot link to libstdc++
> >> >>>libs unless they use the same lock policy.
> >> >>
> >> >>Would it be possible to have both in libstdc++ and with differnet mangling of
> >> >>different kind ensure compatibility between different user CUs?  Or is
> >> >>that too awkward for the user to get right?
> >> >
> >> >It would mean duplicating a lot more code, which is already duplicated
> >> >once for the std::string ABI, so we'd have four permuations!
> >> >
> >> >It still wouldn't ensure compatibility between different user CUs,
> >> >only between any user CU and any build of libstdc++. Different user
> >> >CUs would still disagree on the ABI of the types, and so couldn't pass
> >> >them between CUs. I see no advantage to supporting that for the
> >> >std::filesystem library (unlike std::string and std::iostream, which
> >> >are probably used in the majority of CUs).
> >> >
> >> >I do not want to get to the point where every type in libstdc++ exists
> >> >multiple times and you select some combination via command-line flags.
> >> >It's already becoming unmanageable with multiple std::string and long
> >> >double ABIs.
> >>
> >> Also, in practice, I don't see a need. The common cases where this bug
> >> arises are limited to 32-bit ARM, and for arm-linux the kernel helpers
> >> mean that you can always use "atomic". For bare metal ARM toolchains
> >> you probably don't want to be mixing CUs built against different
> >> versions of libstdc++ anyway. You have your toolchain for the board,
> >> and you use that. If it is configured to use "atomic", then that's
> >> what you get. If it's configured to use "mutex", then that's what you
> >> get. You probably don't want to use a toolchain configured for a
> >> different board, but you could use --with-libstdcxx-lock-policy to
> >> ensure a consistent policy across those toolchains if needed.
> >>
> >> It's also possible for problems to arise on i386. If you build GCC for
> >> i386 then it will choose "mutex" and be incompatible with a libstdc++
> >> built for i486 or later. In that case, you could use the option
> >> --with-libstdcxx-lock-policy=atomic to force the use of "atomic" (and
> >> then you'd need to link to libatomic to provide the ops, as there are
> >> no kernel helpers for i386 in libgcc).
> >
> >I think the default should be part of the platform ABI somehow,
> >like the i386 incompatibility should better not arise.
>
> What do you suggest to ensure that?
>
> Just imake i386 default to the "atomic" lock policy, and require anybody
> silly enough to configure for i386 to either link to libatomic, or
> explicitly use --with-libstdcxx-lock-policy=mutex if they really want
> to be incompatible with default i486 builds?

Something like that, yes.

> >I suppose
> >libstdc++ configury doesn't read in gcc/config.gcc but does it
> >have sth similar where it's easy to see defaults for plaforms?
> >There's configure.host but I'm not sure this is related at all.
>
> No, I don't think we do. When there are target-specific things
> hardcoded it's spread out around acinclude.m4 and configure.ac
>
>
Christophe Lyon Nov. 29, 2018, 3:39 p.m. UTC | #11
On 28/11/2018 12:35, Jonathan Wakely wrote:
> On 28/11/18 10:54 +0100, Christophe Lyon wrote:
>> On Wed, 28 Nov 2018 at 00:25, Jonathan Wakely <jwakely@redhat.com> wrote:
>>>
>>> This resolves a longstanding issue where the lock policy for shared_ptr
>>> reference counting depends on compilation options when the header is
>>> included, so that different -march options can cause ABI changes. For
>>> example, objects compiled with -march=armv7 will use atomics to
>>> synchronize reference counts, and objects compiled with -march=armv5t
>>> will use a mutex. That means the shared_ptr control block will have a
>>> different layout in different objects, causing ODR violations and
>>> undefined behaviour. This was the root cause of PR libstdc++/42734 as
>>> well as PR libstdc++/67843.
>>>
>>> The solution is to decide on the lock policy at build time, when
>>> libstdc++ is configured. The configure script checks for the
>>> availability of the necessary atomic built-ins for the target and fixes
>>> that choice permanently. Different -march flags used to compile user
>>> code will not cause changes to the lock policy. This results in an ABI
>>> change for certain compilations, but only where there was already an ABI
>>> incompatibility between the libstdc++.so library and objects built with
>>> an incompatible -march option. In general, this means a more stable ABI
>>> that isn't silently altered when -march flags make addition atomic ops
>>> available.
>>>
>>> To force a target to use "atomic" or "mutex" the new configure option
>>> --with-libstdcxx-lock-policy can be used.
>>>
>>> In order to turn ODR violations into linker errors, the uses of
>>> shared_ptr in filesystem directory iterators have been replaced
>>> with __shared_ptr, and explicit instantiations are declared. This
>>> ensures that object files using those types cannot link to libstdc++
>>> libs unless they use the same lock policy.
>>>
>>>         PR libstdc++/67843
>>>         * acinclude.m4 (GLIBCXX_ENABLE_LOCK_POLICY): Add new macro
>>>         that defines _GLIBCXX_HAVE_ATOMIC_LOCK_POLICY.
>>>         * config.h.in: Regenerate.
>>>         * configure: Regenerate.
>>>         * configure.ac: Use GLIBCXX_ENABLE_LOCK_POLICY.
>>>         * doc/xml/manual/configure.xml: Document new configure option.
>>>         * include/bits/fs_dir.h (directory_iterator): Use __shared_ptr
>>>         instead of shared_ptr.
>>>         (recursive_directory_iterator): Likewise.
>>>         (__shared_ptr<_Dir>): Add explicit instantiation declaration.
>>> (__shared_ptr<recursive_directory_iterator::_Dir_stack>): Likewise.
>>>         * include/bits/shared_ptr_base.h (__allocate_shared, __make_shared):
>>>         Add default template argument for _Lock_policy template parameter.
>>>         * include/ext/concurrence.h (__default_lock_policy): Check macro
>>>         _GLIBCXX_HAVE_ATOMIC_LOCK_POLICY instead of checking if the current
>>>         target supports the builtins for compare-and-swap.
>>>         * src/filesystem/std-dir.cc (__shared_ptr<_Dir>): Add explicit
>>>         instantiation definition.
>>> (__shared_ptr<recursive_directory_iterator::_Dir_stack>): Likewise.
>>>         (directory_iterator, recursive_directory_iterator): Use __make_shared
>>>         instead of make_shared.
>>>
>>> Tested x86_64-linux and aarch64-linux, committed to trunk.
>>>
>>> I would appreciate testing on ARM, especially Christophe's
>>> -march=armv5t set up.
>>>
>>
>> Hi Jonathan,
>>
>> Your patch passed my tests, thanks!
>
> Thanks for checking it.
>
>> directory_iterator.cc tests used to be killed by a timeout until last
>> night where now they do PASS in my configurations involving
>> -march=armv5t.
>>
>> That being said, I'm not sure I fully understand the fix. In my tests
>> involving -march=armv5t, I still configure GCC --with-cpu=cortex-a9,
>> and I pass -march=armv5t as an override with the --target-board runtest option.
>
> Yup, and that difference between the arch used to build libstdc++ and
> the arch used to build the tests was what caused the FAILures. And
> that's what my patch addresses, by ensuring that the arch used to
> build libstdc++ is what dictates the type of synchronization for
> shared_ptr.
>
>> In these cases libstdc++ configure detects support for "atomic", so I
>> suspect the tests pass only because I use QEMU with --cpu cortex-a9.
>> I think if I used a different QEMU cpu (eg arm926), the tests would
>> try to use atomics, and fail?
>
> I don't know if a libstdc++ configured for cortex-a9 would run on an
> arm296 QEMU. It might depend on cortex-a9 instructions (unrelated to
> the use of atomics in my patch). See the end of this mail.
>
>> The reason I'm still using cortex-a9 at QEMU level is that many tests
>> override -mcpu/-march, and I used that as a compromise.
>
> That means you're not fully testing armv5t, because you're still
> linking to a libstdc++.so (and libstd++fs.a) that use the cortex-a9
> instruction set. Usually that doesn't matter, because most libstdc++
> code doesn't care about which specific instructions it gets compiled
> to. That
>> However, I do have configurations using --with-cpu=arm10tdmi or
>> --with-cpu=default and QEMU --cpu arm926.
>> In these 2 cases, the tests do PASS, but they used to before your
>> patch. libstdc++ configure does detect "mutex" in these cases.
>
> What matters is that the tests and the libstdc++ libraries agree on the
> lock policy. It doesn't matter if they both use "atomic" or they both
> use "mutex", only that they agree. Previously the lock policy was
> selected by the preprocessor in every translation unit. That made it
> very easy to get incompatible sets of objects. Now the selection is
> made once per build of GCC, and is not affected by -march options when
> compiling user code.
>
>> To be hopefully clearer, the subset of relevant configurations I use
>> is as follows:
>> GCC target / with-cpu / RUNTEST target-board / QEMU cpu
>> 1- arm-none-linux-gnueabi[hf]  / cortex-a9 / -march=armv5t / cortex-a9
>> 2- arm-none-linux-gnueabihf / arm10tdmi / "" / arm926
>> 3- arm-none-eabi / default / "" / arm926
>>
>> (1) uses atomics
>> (2) and (3) use mutex
>>
>> All of them now say "PASS", but maybe I should give a try switch to
>> arm926 for QEMU in (1) ?
>
> That would be a useful test.
>
> If I understand correctly, that will still work because the atomic ops
> will use the kernel helper in libgcc. Previously it would have failed
> the same way as (1) was already failing, because --with-cpu=cortex-a9
> when libstdc++ was built chose atomics, and the -march=armv5t target
> board chose mutexes, and so the testsuite files were incompatible with
> the library.
>
> Since my patch, only the --with-cpu option matters, and the testsuite
> files will still use atomics. I think the arm926 cpu running under
> QEMU will implement those atomics via libgcc's kernel helpers.
>
> However, that test might fail for other reasons, if the libstdc++ libs
> configured for cortex-a9 use other insns that are not supported by
> arm296 then you'd get a SIGILL when running the tests.  The kernel
> helpers only implement the atomic operations, not any cortex-a9
> instructions that are not supported by arm296.
>

So, I didn't forget about this, it's just that it takes quite a bit of time
to setup, run and check the logs...

This is exactly what happens :( (and probably why I sticked to cortex-a9 in qemu despite -march=armv5t in runtestflags)

Almost all execution tests fail in _dl_start, because it uses "movw".

Surprisingly my logs show a few execution tests passing despite qemu exiting with "Illegal instruction". I don't understand why dejagnu sometimes sees exit code 132 as FAIL, and sometimes as PASS (both cases showing the same qemu execution trace)

In the fortran testsuite, I also see qemu exiting with this trace:
IN:
0x40801800:  e59fa094  ldr      sl, [pc, #0x94]
0x40801804:  e59f4094  ldr      r4, [pc, #0x94]
0x40801808:  e1a0000d  mov      r0, sp
0x4080180c:  fa000c66  blx      #0x408049ac

----------------
IN: _dl_start
0x408049ac:  e92d       .byte    0x2d, 0xe9

----------------
IN: process_dl_audit
0x40801a68:  feeaf00f  cdp2     p0, #0xe, c15, c10, c15, #0

----------------
IN: process_dl_audit
0x40801a6c:  b1b04604  lslslt   r4, r4, #0xc
0x40801a70:  2b007823  blhs     #0x4081fb04

----------------
IN: process_dl_audit
0x40801a74:  f8d8d0f6  .byte    0xf6, 0xd0, 0xd8, 0xf8

(this is qemu-2.11.0, I'm surprised some instructions are not decoded)

So well, all in all it seems we can't test what I'm trying to test,
or rather that -march=armv5t is broken if the toolchain is
built --with-cpu more recent than that. Not very user friendly :-)

Christophe
Ramana Radhakrishnan Nov. 30, 2018, 11:49 a.m. UTC | #12
Firstly thanks a lot for working on this.

On 28/11/2018 12:49, Jonathan Wakely wrote:
> On 28/11/18 12:30 +0000, Jonathan Wakely wrote:
>> 3) We could change libstdc++'s configure to automatically override
>> --with-libstdcxx-lock-policy for arm-linux so it always uses "atomic"
>> (because we know the kernel helpers are available). That causes an ABI
>> change for a GCC built for --target=armv5t-*-linux* so I didn't do
>> that by default. Packagers who want that can use the --with option
>> explicitly to build a libstdc++ with atomic refcounting that can be
>> used on any armv5t or later CPU, allowing users to choose a newer
>> -march for their own code. (Until my patch that didn't work, because
> 
> [...]
> 
>> Option 3) is not my call to make. If the ARM maintainers want to
>> change the default older arm-linux targets I have no objections.

We could change the defaults when(if) we next bump up the libstdc++ ABI 
in general :-/ and if we remember to do this. I don't feel comfortable 
changing the defaults silently and I don't view this as something we can 
decide on by ourselves as the Arm maintainers as this really falls in 
the space of folks distributing Linux using by defaults versions of the 
architecture that result in the use of mutexes rather than the atomics.

It's also not clear to me if we can use any other magic tricks to make 
this co-exist and whether that is worth it.
> 
> Another way to approach option 3) that we discussed at Cauldron, and
> which I want to start a separate discussion about on gcc@gcc.gnu.org,
> is to change the value of __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4 et al
> when we have kernel helpers to implement those operations.
> 
> The shared_ptr code doesn't care if an atomic CAS comes from the CPU
> or a kernel helper in libgcc. If the atomic CAS is supported via
> kernel helpers, let's use it! But those predefined macros only tell
> you about native CPU support (for the current target selected by
> -march).
> 
> 
> It's worth noting that all this shared_ptr code predates libatomic, so
> we couldn't just say "always use atomics, and link to libatomic if
> needed". If __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4 was not defined, we had
> to assume no CAS *at all*. Not native, not provided by libatomic,
> nothing. I'd love to simply rely on std::atomic in std::shared_ptr,
> but that would be an ABI break for targets currently using a mutex,
> and would add new dependencies on libatomic for some targets. (It
> might also pessimize some single-threaded programs on targets that do
> use atomic ref-counts, because currently some updates are non-atomic
> when __gthread_active_p() == false).

Yep, though you want it to go to the kernel helpers or indeed libatomic.

regards
Ramana

> 
>
diff mbox series

Patch

diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
index 82a25e5f2f1..6bcd29dc8c3 100644
--- a/libstdc++-v3/acinclude.m4
+++ b/libstdc++-v3/acinclude.m4
@@ -3562,6 +3562,72 @@  EOF
 
 ])
 
+dnl
+dnl Set default lock policy for synchronizing shared_ptr reference counting.
+dnl
+dnl --with-libstdcxx-lock-policy=auto
+dnl	Use atomic operations for shared_ptr reference counting only if
+dnl	the default target supports atomic compare-and-swap.
+dnl --with-libstdcxx-lock-policy=atomic
+dnl	Use atomic operations for shared_ptr reference counting.
+dnl --with-libstdcxx-lock-policy=mutex
+dnl	Use a mutex to synchronize shared_ptr reference counting.
+dnl
+dnl This controls the value of __gnu_cxx::__default_lock_policy, which
+dnl determines how shared_ptr reference counts are synchronized.
+dnl The option "atomic" means that atomic operations should be used,
+dnl "mutex" means that a mutex will be used. The default option, "auto",
+dnl will check if the target supports the compiler-generated builtins
+dnl for atomic compare-and-swap operations for 2-byte and 4-byte integers,
+dnl and will use "atomic" if supported, "mutex" otherwise.
+dnl This option is ignored if the thread model used by GCC is "single",
+dnl as no synchronization is used at all in that case.
+dnl This option affects the library ABI (except in the "single" thread model).
+dnl
+dnl Defines _GLIBCXX_HAVE_ATOMIC_LOCK_POLICY to 1 if atomics should be used.
+dnl
+AC_DEFUN([GLIBCXX_ENABLE_LOCK_POLICY], [
+
+  AC_ARG_WITH([libstdcxx-lock-policy],
+    AC_HELP_STRING([--with-libstdcxx-lock-policy={atomic,mutex,auto}],
+      [synchronization policy for shared_ptr reference counting [default=auto]]),
+              [libstdcxx_atomic_lock_policy=$withval],
+              [libstdcxx_atomic_lock_policy=auto])
+
+  case "$libstdcxx_atomic_lock_policy" in
+    atomic|mutex|auto) ;;
+    *) AC_MSG_ERROR([Invalid argument for --with-libstdcxx-lock-policy]) ;;
+  esac
+  AC_MSG_CHECKING([for lock policy for shared_ptr reference counts])
+
+  if test x"$libstdcxx_atomic_lock_policy" = x"auto"; then
+    AC_LANG_SAVE
+    AC_LANG_CPLUSPLUS
+    ac_save_CXXFLAGS="$CXXFLAGS"
+
+    dnl Why do we care about 2-byte CAS on targets with 4-byte _Atomic_word?!
+    AC_TRY_COMPILE([
+    #if ! defined __GCC_HAVE_SYNC_COMPARE_AND_SWAP_2
+    # error "No 2-byte compare-and-swap"
+    #elif ! defined __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4
+    # error "No 4-byte compare-and-swap"
+    #endif
+    ],,
+    [libstdcxx_atomic_lock_policy=atomic],
+    [libstdcxx_atomic_lock_policy=mutex])
+    AC_LANG_RESTORE
+    CXXFLAGS="$ac_save_CXXFLAGS"
+  fi
+
+  if test x"$libstdcxx_atomic_lock_policy" = x"atomic"; then
+    AC_MSG_RESULT(atomic)
+    AC_DEFINE(HAVE_ATOMIC_LOCK_POLICY,1,
+      [Defined if shared_ptr reference counting should use atomic operations.])
+  else
+    AC_MSG_RESULT(mutex)
+  fi
+
+])
 
 dnl
 dnl Allow visibility attributes to be used on namespaces, objects, etc.
diff --git a/libstdc++-v3/configure.ac b/libstdc++-v3/configure.ac
index a657a726778..ad5b4117cfd 100644
--- a/libstdc++-v3/configure.ac
+++ b/libstdc++-v3/configure.ac
@@ -149,6 +149,7 @@  GLIBCXX_ENABLE_VERBOSE
 GLIBCXX_ENABLE_PCH($is_hosted)
 GLIBCXX_ENABLE_THREADS
 GLIBCXX_ENABLE_ATOMIC_BUILTINS
+GLIBCXX_ENABLE_LOCK_POLICY
 GLIBCXX_ENABLE_DECIMAL_FLOAT
 GLIBCXX_ENABLE_INT128_FLOAT128
 if test "$enable_float128" = yes; then
diff --git a/libstdc++-v3/doc/xml/manual/configure.xml b/libstdc++-v3/doc/xml/manual/configure.xml
index ac383cf7fa7..d296c8d8a49 100644
--- a/libstdc++-v3/doc/xml/manual/configure.xml
+++ b/libstdc++-v3/doc/xml/manual/configure.xml
@@ -399,6 +399,28 @@ 
    </para>
  </listitem></varlistentry>
 
+ <varlistentry><term><code>--with-libstdcxx-lock-policy=OPTION</code></term>
+ <listitem><para>Sets the lock policy that controls how
+        <classname>shared_ptr</classname> reference counting is
+        synchronized.
+        The choice OPTION=atomic enables use of atomics for updates to
+        <classname>shared_ptr</classname> reference counts.
+        The choice OPTION=mutex enables use of a mutex to synchronize updates
+        to <classname>shared_ptr</classname> reference counts.
+        If the compiler's thread model is "single" then this option has no
+        effect, as no synchronization is used for the reference counts.
+	The default is OPTION=auto, which checks for the availability of
+        compiler built-ins for 2-byte and 4-byte atomic compare-and-swap,
+        and uses OPTION=atomic if they're available, OPTION=mutex otherwise.
+        This option can change the library ABI.
+        If the library is configured to use atomics and user programs are
+        compiled using a target that doesn't natively support the atomic
+        operations (e.g. the library is configured for armv7 and then code
+        is compiled with <option>-march=armv5t</option>) then the program
+        might rely on support in libgcc to provide the atomics.
+    </para>
+ </listitem></varlistentry>
+
  <varlistentry><term><code>--enable-vtable-verify</code>[default]</term>
  <listitem>
     <para>Use <code>-fvtable-verify=std</code> to compile the C++
diff --git a/libstdc++-v3/include/bits/fs_dir.h b/libstdc++-v3/include/bits/fs_dir.h
index 9ee1cb66b61..2f81a1709e4 100644
--- a/libstdc++-v3/include/bits/fs_dir.h
+++ b/libstdc++-v3/include/bits/fs_dir.h
@@ -403,7 +403,7 @@  _GLIBCXX_BEGIN_NAMESPACE_CXX11
 
     friend class recursive_directory_iterator;
 
-    std::shared_ptr<_Dir> _M_dir;
+    std::__shared_ptr<_Dir> _M_dir;
   };
 
   inline directory_iterator
@@ -494,7 +494,7 @@  _GLIBCXX_BEGIN_NAMESPACE_CXX11
                const recursive_directory_iterator& __rhs);
 
     struct _Dir_stack;
-    std::shared_ptr<_Dir_stack> _M_dirs;
+    std::__shared_ptr<_Dir_stack> _M_dirs;
     directory_options _M_options = {};
     bool _M_pending = false;
   };
@@ -525,6 +525,14 @@  _GLIBCXX_END_NAMESPACE_CXX11
   // @} group filesystem
 } // namespace filesystem
 
+  // Use explicit instantiations of these types. Any inconsistency in the
+  // value of __default_lock_policy between code including this header and
+  // the library will cause a linker error.
+  extern template class
+    __shared_ptr<filesystem::_Dir>;
+  extern template class
+    __shared_ptr<filesystem::recursive_directory_iterator::_Dir_stack>;
+
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace std
 
diff --git a/libstdc++-v3/include/bits/shared_ptr_base.h b/libstdc++-v3/include/bits/shared_ptr_base.h
index 46ff4a7cf29..21debf362aa 100644
--- a/libstdc++-v3/include/bits/shared_ptr_base.h
+++ b/libstdc++-v3/include/bits/shared_ptr_base.h
@@ -1803,7 +1803,8 @@  _GLIBCXX_BEGIN_NAMESPACE_VERSION
       mutable __weak_ptr<_Tp, _Lp>  _M_weak_this;
     };
 
-  template<typename _Tp, _Lock_policy _Lp, typename _Alloc, typename... _Args>
+  template<typename _Tp, _Lock_policy _Lp = __default_lock_policy,
+	   typename _Alloc, typename... _Args>
     inline __shared_ptr<_Tp, _Lp>
     __allocate_shared(const _Alloc& __a, _Args&&... __args)
     {
@@ -1811,7 +1812,8 @@  _GLIBCXX_BEGIN_NAMESPACE_VERSION
 				    std::forward<_Args>(__args)...);
     }
 
-  template<typename _Tp, _Lock_policy _Lp, typename... _Args>
+  template<typename _Tp, _Lock_policy _Lp = __default_lock_policy,
+	   typename... _Args>
     inline __shared_ptr<_Tp, _Lp>
     __make_shared(_Args&&... __args)
     {
diff --git a/libstdc++-v3/include/ext/concurrence.h b/libstdc++-v3/include/ext/concurrence.h
index 302cddfa473..33ad9e06c9f 100644
--- a/libstdc++-v3/include/ext/concurrence.h
+++ b/libstdc++-v3/include/ext/concurrence.h
@@ -51,16 +51,13 @@  _GLIBCXX_BEGIN_NAMESPACE_VERSION
   // Compile time constant that indicates prefered locking policy in
   // the current configuration.
   static const _Lock_policy __default_lock_policy = 
-#ifdef __GTHREADS
-#if (defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_2) \
-     && defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_4))
+#ifndef __GTHREADS
+  _S_single;
+#elif defined _GLIBCXX_HAVE_ATOMIC_LOCK_POLICY
   _S_atomic;
 #else
   _S_mutex;
 #endif
-#else
-  _S_single;
-#endif
 
   // NB: As this is used in libsupc++, need to only depend on
   // exception. No stdexception classes, no use of std::string.
diff --git a/libstdc++-v3/src/filesystem/std-dir.cc b/libstdc++-v3/src/filesystem/std-dir.cc
index 4c9a287ad80..038f635a712 100644
--- a/libstdc++-v3/src/filesystem/std-dir.cc
+++ b/libstdc++-v3/src/filesystem/std-dir.cc
@@ -39,6 +39,9 @@ 
 namespace fs = std::filesystem;
 namespace posix = std::filesystem::__gnu_posix;
 
+template class std::__shared_ptr<fs::_Dir>;
+template class std::__shared_ptr<fs::recursive_directory_iterator::_Dir_stack>;
+
 struct fs::_Dir : _Dir_base
 {
   _Dir(const fs::path& p, bool skip_permission_denied, error_code& ec)
@@ -125,7 +128,7 @@  directory_iterator(const path& p, directory_options options, error_code* ecptr)
 
   if (dir.dirp)
     {
-      auto sp = std::make_shared<fs::_Dir>(std::move(dir));
+      auto sp = std::__make_shared<fs::_Dir>(std::move(dir));
       if (sp->advance(skip_permission_denied, ec))
 	_M_dir.swap(sp);
     }
@@ -185,7 +188,7 @@  recursive_directory_iterator(const path& p, directory_options options,
     {
       if (ecptr)
 	ecptr->clear();
-      auto sp = std::make_shared<_Dir_stack>();
+      auto sp = std::__make_shared<_Dir_stack>();
       sp->push(_Dir{ dirp, p });
       if (ecptr ? sp->top().advance(*ecptr) : sp->top().advance())
 	_M_dirs.swap(sp);