Message ID | 8736rldyzm.fsf@oldenburg.str.redhat.com |
---|---|
State | New |
Headers | show |
Series | Linux: Implement membarrier function | expand |
On Wed, 2018-11-28 at 16:05 +0100, Florian Weimer wrote: > This is essentially a repost of last year's patch, rebased to the glibc > 2.29 symbol version and reflecting the introduction of > MEMBARRIER_CMD_GLOBAL. > > I'm not including any changes to manual/ here because the set of > supported operations is evolving rapidly, we could not get consensus for > the language I proposed the last time, and I do not want to contribute > to the manual for the time being. Fair enough. Nonetheless, can you summarize how far you're along with properly defining the semantics (eg, based on the C/C++ memory model)?
* Torvald Riegel: > On Wed, 2018-11-28 at 16:05 +0100, Florian Weimer wrote: >> This is essentially a repost of last year's patch, rebased to the glibc >> 2.29 symbol version and reflecting the introduction of >> MEMBARRIER_CMD_GLOBAL. >> >> I'm not including any changes to manual/ here because the set of >> supported operations is evolving rapidly, we could not get consensus for >> the language I proposed the last time, and I do not want to contribute >> to the manual for the time being. > > Fair enough. Nonetheless, can you summarize how far you're along with > properly defining the semantics (eg, based on the C/C++ memory model)? I wrote down what you could, but no one liked it. <https://sourceware.org/ml/libc-alpha/2017-12/msg00796.html> I expect that a formalization would interact in non-trivial ways with any potential formalization of usable relaxed memory order semantics, and I'm not sure if anyone knows how to do the latter today. Thanks, Florian
----- On Nov 29, 2018, at 8:50 AM, Florian Weimer fweimer@redhat.com wrote: > * Torvald Riegel: > >> On Wed, 2018-11-28 at 16:05 +0100, Florian Weimer wrote: >>> This is essentially a repost of last year's patch, rebased to the glibc >>> 2.29 symbol version and reflecting the introduction of >>> MEMBARRIER_CMD_GLOBAL. >>> >>> I'm not including any changes to manual/ here because the set of >>> supported operations is evolving rapidly, we could not get consensus for >>> the language I proposed the last time, and I do not want to contribute >>> to the manual for the time being. >> >> Fair enough. Nonetheless, can you summarize how far you're along with >> properly defining the semantics (eg, based on the C/C++ memory model)? > > I wrote down what you could, but no one liked it. > > <https://sourceware.org/ml/libc-alpha/2017-12/msg00796.html> > > I expect that a formalization would interact in non-trivial ways with > any potential formalization of usable relaxed memory order semantics, > and I'm not sure if anyone knows how to do the latter today. Adding Paul E. McKenney in CC. Thanks, Mathieu > > Thanks, > Florian
On Thu, Nov 29, 2018 at 09:44:22AM -0500, Mathieu Desnoyers wrote: > ----- On Nov 29, 2018, at 8:50 AM, Florian Weimer fweimer@redhat.com wrote: > > > * Torvald Riegel: > > > >> On Wed, 2018-11-28 at 16:05 +0100, Florian Weimer wrote: > >>> This is essentially a repost of last year's patch, rebased to the glibc > >>> 2.29 symbol version and reflecting the introduction of > >>> MEMBARRIER_CMD_GLOBAL. > >>> > >>> I'm not including any changes to manual/ here because the set of > >>> supported operations is evolving rapidly, we could not get consensus for > >>> the language I proposed the last time, and I do not want to contribute > >>> to the manual for the time being. > >> > >> Fair enough. Nonetheless, can you summarize how far you're along with > >> properly defining the semantics (eg, based on the C/C++ memory model)? > > > > I wrote down what you could, but no one liked it. > > > > <https://sourceware.org/ml/libc-alpha/2017-12/msg00796.html> > > > > I expect that a formalization would interact in non-trivial ways with > > any potential formalization of usable relaxed memory order semantics, > > and I'm not sure if anyone knows how to do the latter today. > > Adding Paul E. McKenney in CC. There is some prototype C++ memory model wording from David Goldblatt (CCed) here (search for "Standarese"): http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p1202r0.pdf David's key insight is that (in Linuxese) light fences cannot pair with each other. Thanx, Paul
One note with the suggested patch is that `atomic_thread_fence(memory_order_acq_rel)` should probably be `atomic_thread_fence (memory_order_seq_cst)` (otherwise the call would be a no-op on, say, x86, which it very much isn't). The non-transitivity thing makes the resulting description arguably incorrect, but this is informal enough that it might not be a big deal to add something after "For these threads, the membarrier function call turns an existing compiler barrier (see above) executed by these threads into full memory barriers" that clarifies it. E.g. you could make it into "turns an existing compiler barrier [...] into full memory barriers, with respect to the calling thread". Since this is targeting the description of the OS call (and doesn't have to concern itself with also being implementable by other asymmetric techniques or degrading to architectural barriers), I think that the description in "approach 2" in P1202 would also make sense for a formal description of the syscall. (Of course, without the kernel itself committing to a rigorous semantics, anything specified on top of it will be on slightly shaky ground). - David On Thu, Nov 29, 2018 at 7:04 AM Paul E. McKenney <paulmck@linux.ibm.com> wrote: > > On Thu, Nov 29, 2018 at 09:44:22AM -0500, Mathieu Desnoyers wrote: > > ----- On Nov 29, 2018, at 8:50 AM, Florian Weimer fweimer@redhat.com wrote: > > > > > * Torvald Riegel: > > > > > >> On Wed, 2018-11-28 at 16:05 +0100, Florian Weimer wrote: > > >>> This is essentially a repost of last year's patch, rebased to the glibc > > >>> 2.29 symbol version and reflecting the introduction of > > >>> MEMBARRIER_CMD_GLOBAL. > > >>> > > >>> I'm not including any changes to manual/ here because the set of > > >>> supported operations is evolving rapidly, we could not get consensus for > > >>> the language I proposed the last time, and I do not want to contribute > > >>> to the manual for the time being. > > >> > > >> Fair enough. Nonetheless, can you summarize how far you're along with > > >> properly defining the semantics (eg, based on the C/C++ memory model)? > > > > > > I wrote down what you could, but no one liked it. > > > > > > <https://sourceware.org/ml/libc-alpha/2017-12/msg00796.html> > > > > > > I expect that a formalization would interact in non-trivial ways with > > > any potential formalization of usable relaxed memory order semantics, > > > and I'm not sure if anyone knows how to do the latter today. > > > > Adding Paul E. McKenney in CC. > > There is some prototype C++ memory model wording from David Goldblatt (CCed) > here (search for "Standarese"): > > http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p1202r0.pdf > > David's key insight is that (in Linuxese) light fences cannot pair with > each other. > > Thanx, Paul >
* Florian Weimer: > This is essentially a repost of last year's patch, rebased to the glibc > 2.29 symbol version and reflecting the introduction of > MEMBARRIER_CMD_GLOBAL. > > I'm not including any changes to manual/ here because the set of > supported operations is evolving rapidly, we could not get consensus for > the language I proposed the last time, and I do not want to contribute > to the manual for the time being. Any comments on the technical aspects of the patch? I would like to commit this. <https://sourceware.org/ml/libc-alpha/2018-11/msg00750.html> Thanks, Florian
On 28/11/2018 13:05, Florian Weimer wrote: > This is essentially a repost of last year's patch, rebased to the glibc > 2.29 symbol version and reflecting the introduction of > MEMBARRIER_CMD_GLOBAL. > > I'm not including any changes to manual/ here because the set of > supported operations is evolving rapidly, we could not get consensus for > the language I proposed the last time, and I do not want to contribute > to the manual for the time being. I agree that documentation might be added in a subsequent patch. The only issue I have with this patch is how to handle the kernel header, see comments below. > > Thanks, > Florian > > 2018-11-28 Florian Weimer <fweimer@redhat.com> > > Linux: Implement membarrier function. > * sysdeps/unix/sysv/linux/Makefile (sysdep_headers): Add > sys/membarrier.h. > (tests): Add tst-membarrier. > * sysdeps/unix/sysv/linux/Versions (GLIBC_2.27): Add membarrier. > * sysdeps/unix/sysv/linux/sys/membarrier.h: New file. > * sysdeps/unix/sysv/linux/tst-membarrier.c: Likewise. > * sysdeps/unix/sysv/linux/aarch64/libc.abilist (GLIBC_2.29): Add > membarrier. > * sysdeps/unix/sysv/linux/alpha/libc.abilist (GLIBC_2.29): > Likewise. > * sysdeps/unix/sysv/linux/arm/libc.abilist (GLIBC_2.29): Likewise. > * sysdeps/unix/sysv/linux/hppa/libc.abilist (GLIBC_2.29): > Likewise. > * sysdeps/unix/sysv/linux/i386/libc.abilist (GLIBC_2.29): > Likewise. > * sysdeps/unix/sysv/linux/ia64/libc.abilist (GLIBC_2.29): > Likewise. > * sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist (GLIBC_2.29): > Likewise. > * sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist (GLIBC_2.29): > Likewise. > * sysdeps/unix/sysv/linux/microblaze/libc.abilist (GLIBC_2.29): > Likewise. > * sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist > (GLIBC_2.29): Likewise. > * sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist > (GLIBC_2.29): Likewise. > * sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist > (GLIBC_2.29): Likewise. > * sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist > (GLIBC_2.29): Likewise. > * sysdeps/unix/sysv/linux/nios2/libc.abilist (GLIBC_2.29): > Likewise. > * sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist > (GLIBC_2.29): Likewise. > * sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist > (GLIBC_2.29): Likewise. > * sysdeps/unix/sysv/linux/powerpc/powerpc64/libc-le.abilist > (GLIBC_2.29): Likewise. > * sysdeps/unix/sysv/linux/powerpc/powerpc64/libc.abilist > (GLIBC_2.29): Likewise. > * sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist (GLIBC_2.29): > Likewise. > * sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist (GLIBC_2.29): > Likewise. > * sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist (GLIBC_2.29): > Likewise. > * sysdeps/unix/sysv/linux/sh/libc.abilist (GLIBC_2.29): Likewise. > * sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist (GLIBC_2.29): > Likewise. > * sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist (GLIBC_2.29): > Likewise. > * sysdeps/unix/sysv/linux/x86_64/64/libc.abilist (GLIBC_2.29): > Likewise. > * sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist (GLIBC_2.29): > Likewise. > > diff --git a/NEWS b/NEWS > index 1098be1afb..d5786f4eab 100644 > --- a/NEWS > +++ b/NEWS > @@ -35,6 +35,9 @@ Major new features: > different directory. This is a GNU extension and similar to the > Solaris function of the same name. > > +* The membarrier function and the <sys/membarrier.h> header file have been > + added. > + > Deprecated and removed features, and other changes affecting compatibility: > > * The glibc.tune tunable namespace has been renamed to glibc.cpu and the Please make explicit this is a Linux only interface. > diff --git a/sysdeps/unix/sysv/linux/Makefile b/sysdeps/unix/sysv/linux/Makefile > index 362cf3b950..f8c4843ab6 100644 > --- a/sysdeps/unix/sysv/linux/Makefile > +++ b/sysdeps/unix/sysv/linux/Makefile > @@ -43,12 +43,13 @@ sysdep_headers += sys/mount.h sys/acct.h sys/sysctl.h \ > bits/siginfo-arch.h bits/siginfo-consts-arch.h \ > bits/procfs.h bits/procfs-id.h bits/procfs-extra.h \ > bits/procfs-prregset.h bits/mman-map-flags-generic.h \ > - bits/msq-pad.h bits/sem-pad.h bits/shmlba.h bits/shm-pad.h > + bits/msq-pad.h bits/sem-pad.h bits/shmlba.h bits/shm-pad.h \ > + sys/membarrier.h > > tests += tst-clone tst-clone2 tst-clone3 tst-fanotify tst-personality \ > tst-quota tst-sync_file_range tst-sysconf-iov_max tst-ttyname \ > test-errno-linux tst-memfd_create tst-mlock2 tst-pkey \ > - tst-rlimit-infinity tst-ofdlocks > + tst-rlimit-infinity tst-ofdlocks tst-membarrier > tests-internal += tst-ofdlocks-compat > > Ok. > diff --git a/sysdeps/unix/sysv/linux/Versions b/sysdeps/unix/sysv/linux/Versions > index 336c13b57d..86db06f403 100644 > --- a/sysdeps/unix/sysv/linux/Versions > +++ b/sysdeps/unix/sysv/linux/Versions > @@ -171,6 +171,9 @@ libc { > mlock2; > pkey_alloc; pkey_free; pkey_set; pkey_get; pkey_mprotect; > } > + GLIBC_2.29 { > + membarrier; > + } > GLIBC_PRIVATE { > # functions used in other libraries > __syscall_rt_sigqueueinfo; Ok. > diff --git a/sysdeps/unix/sysv/linux/aarch64/libc.abilist b/sysdeps/unix/sysv/linux/aarch64/libc.abilist > index e66c741d04..c73c731eec 100644 > --- a/sysdeps/unix/sysv/linux/aarch64/libc.abilist > +++ b/sysdeps/unix/sysv/linux/aarch64/libc.abilist > @@ -2138,4 +2138,5 @@ GLIBC_2.28 thrd_current F > GLIBC_2.28 thrd_equal F > GLIBC_2.28 thrd_sleep F > GLIBC_2.28 thrd_yield F > +GLIBC_2.29 membarrier F > GLIBC_2.29 posix_spawn_file_actions_addchdir_np F Ok. > diff --git a/sysdeps/unix/sysv/linux/alpha/libc.abilist b/sysdeps/unix/sysv/linux/alpha/libc.abilist > index 8df162fe99..c9488b3c18 100644 > --- a/sysdeps/unix/sysv/linux/alpha/libc.abilist > +++ b/sysdeps/unix/sysv/linux/alpha/libc.abilist > @@ -2033,6 +2033,7 @@ GLIBC_2.28 thrd_current F > GLIBC_2.28 thrd_equal F > GLIBC_2.28 thrd_sleep F > GLIBC_2.28 thrd_yield F > +GLIBC_2.29 membarrier F > GLIBC_2.29 posix_spawn_file_actions_addchdir_np F > GLIBC_2.3 __ctype_b_loc F > GLIBC_2.3 __ctype_tolower_loc F Ok. > diff --git a/sysdeps/unix/sysv/linux/arm/libc.abilist b/sysdeps/unix/sysv/linux/arm/libc.abilist > index 43c804f9dc..2524b6545b 100644 > --- a/sysdeps/unix/sysv/linux/arm/libc.abilist > +++ b/sysdeps/unix/sysv/linux/arm/libc.abilist > @@ -123,6 +123,7 @@ GLIBC_2.28 thrd_current F > GLIBC_2.28 thrd_equal F > GLIBC_2.28 thrd_sleep F > GLIBC_2.28 thrd_yield F > +GLIBC_2.29 membarrier F > GLIBC_2.29 posix_spawn_file_actions_addchdir_np F > GLIBC_2.4 _Exit F > GLIBC_2.4 _IO_2_1_stderr_ D 0xa0 Ok. > diff --git a/sysdeps/unix/sysv/linux/hppa/libc.abilist b/sysdeps/unix/sysv/linux/hppa/libc.abilist > index 88b01c2e75..9baaa34b62 100644 > --- a/sysdeps/unix/sysv/linux/hppa/libc.abilist > +++ b/sysdeps/unix/sysv/linux/hppa/libc.abilist > @@ -1880,6 +1880,7 @@ GLIBC_2.28 thrd_current F > GLIBC_2.28 thrd_equal F > GLIBC_2.28 thrd_sleep F > GLIBC_2.28 thrd_yield F > +GLIBC_2.29 membarrier F > GLIBC_2.29 posix_spawn_file_actions_addchdir_np F > GLIBC_2.3 __ctype_b_loc F > GLIBC_2.3 __ctype_tolower_loc F Ok. > diff --git a/sysdeps/unix/sysv/linux/i386/libc.abilist b/sysdeps/unix/sysv/linux/i386/libc.abilist > index 6d02f31612..1b91873f65 100644 > --- a/sysdeps/unix/sysv/linux/i386/libc.abilist > +++ b/sysdeps/unix/sysv/linux/i386/libc.abilist > @@ -2045,6 +2045,7 @@ GLIBC_2.28 thrd_current F > GLIBC_2.28 thrd_equal F > GLIBC_2.28 thrd_sleep F > GLIBC_2.28 thrd_yield F > +GLIBC_2.29 membarrier F > GLIBC_2.29 posix_spawn_file_actions_addchdir_np F > GLIBC_2.3 __ctype_b_loc F > GLIBC_2.3 __ctype_tolower_loc F Ok. > diff --git a/sysdeps/unix/sysv/linux/ia64/libc.abilist b/sysdeps/unix/sysv/linux/ia64/libc.abilist > index 4249712611..1b3465c6f4 100644 > --- a/sysdeps/unix/sysv/linux/ia64/libc.abilist > +++ b/sysdeps/unix/sysv/linux/ia64/libc.abilist > @@ -1914,6 +1914,7 @@ GLIBC_2.28 thrd_current F > GLIBC_2.28 thrd_equal F > GLIBC_2.28 thrd_sleep F > GLIBC_2.28 thrd_yield F > +GLIBC_2.29 membarrier F > GLIBC_2.29 posix_spawn_file_actions_addchdir_np F > GLIBC_2.3 __ctype_b_loc F > GLIBC_2.3 __ctype_tolower_loc F Ok. > diff --git a/sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist b/sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist > index d47b808862..db22eb4a12 100644 > --- a/sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist > +++ b/sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist > @@ -124,6 +124,7 @@ GLIBC_2.28 thrd_current F > GLIBC_2.28 thrd_equal F > GLIBC_2.28 thrd_sleep F > GLIBC_2.28 thrd_yield F > +GLIBC_2.29 membarrier F > GLIBC_2.29 posix_spawn_file_actions_addchdir_np F > GLIBC_2.4 _Exit F > GLIBC_2.4 _IO_2_1_stderr_ D 0x98 Ok. > diff --git a/sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist b/sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist > index d5e38308be..5a93b98e76 100644 > --- a/sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist > +++ b/sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist > @@ -1989,6 +1989,7 @@ GLIBC_2.28 thrd_current F > GLIBC_2.28 thrd_equal F > GLIBC_2.28 thrd_sleep F > GLIBC_2.28 thrd_yield F > +GLIBC_2.29 membarrier F > GLIBC_2.29 posix_spawn_file_actions_addchdir_np F > GLIBC_2.3 __ctype_b_loc F > GLIBC_2.3 __ctype_tolower_loc F Ok. > diff --git a/sysdeps/unix/sysv/linux/microblaze/libc.abilist b/sysdeps/unix/sysv/linux/microblaze/libc.abilist > index 8596b84399..0142b573ca 100644 > --- a/sysdeps/unix/sysv/linux/microblaze/libc.abilist > +++ b/sysdeps/unix/sysv/linux/microblaze/libc.abilist > @@ -2130,4 +2130,5 @@ GLIBC_2.28 thrd_current F > GLIBC_2.28 thrd_equal F > GLIBC_2.28 thrd_sleep F > GLIBC_2.28 thrd_yield F > +GLIBC_2.29 membarrier F > GLIBC_2.29 posix_spawn_file_actions_addchdir_np F Ok. > diff --git a/sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist b/sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist > index 88e0f896d5..a84db6e3f4 100644 > --- a/sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist > +++ b/sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist > @@ -1967,6 +1967,7 @@ GLIBC_2.28 thrd_current F > GLIBC_2.28 thrd_equal F > GLIBC_2.28 thrd_sleep F > GLIBC_2.28 thrd_yield F > +GLIBC_2.29 membarrier F > GLIBC_2.29 posix_spawn_file_actions_addchdir_np F > GLIBC_2.3 __ctype_b_loc F > GLIBC_2.3 __ctype_tolower_loc F Ok. > diff --git a/sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist b/sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist > index aff7462c34..aed9d20fa5 100644 > --- a/sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist > +++ b/sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist > @@ -1965,6 +1965,7 @@ GLIBC_2.28 thrd_current F > GLIBC_2.28 thrd_equal F > GLIBC_2.28 thrd_sleep F > GLIBC_2.28 thrd_yield F > +GLIBC_2.29 membarrier F > GLIBC_2.29 posix_spawn_file_actions_addchdir_np F > GLIBC_2.3 __ctype_b_loc F > GLIBC_2.3 __ctype_tolower_loc F Ok. > diff --git a/sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist b/sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist > index 71d82444aa..d117ad299e 100644 > --- a/sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist > +++ b/sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist > @@ -1973,6 +1973,7 @@ GLIBC_2.28 thrd_current F > GLIBC_2.28 thrd_equal F > GLIBC_2.28 thrd_sleep F > GLIBC_2.28 thrd_yield F > +GLIBC_2.29 membarrier F > GLIBC_2.29 posix_spawn_file_actions_addchdir_np F > GLIBC_2.3 __ctype_b_loc F > GLIBC_2.3 __ctype_tolower_loc F Ok. > diff --git a/sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist b/sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist > index de6c53d293..3598a33eca 100644 > --- a/sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist > +++ b/sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist > @@ -1968,6 +1968,7 @@ GLIBC_2.28 thrd_current F > GLIBC_2.28 thrd_equal F > GLIBC_2.28 thrd_sleep F > GLIBC_2.28 thrd_yield F > +GLIBC_2.29 membarrier F > GLIBC_2.29 posix_spawn_file_actions_addchdir_np F > GLIBC_2.3 __ctype_b_loc F > GLIBC_2.3 __ctype_tolower_loc F Ok. > diff --git a/sysdeps/unix/sysv/linux/nios2/libc.abilist b/sysdeps/unix/sysv/linux/nios2/libc.abilist > index e724bab9fb..7ce4aa1841 100644 > --- a/sysdeps/unix/sysv/linux/nios2/libc.abilist > +++ b/sysdeps/unix/sysv/linux/nios2/libc.abilist > @@ -2171,4 +2171,5 @@ GLIBC_2.28 thrd_current F > GLIBC_2.28 thrd_equal F > GLIBC_2.28 thrd_sleep F > GLIBC_2.28 thrd_yield F > +GLIBC_2.29 membarrier F > GLIBC_2.29 posix_spawn_file_actions_addchdir_np F Ok. > diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist > index e9ecbccb71..50fedd0fab 100644 > --- a/sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist > +++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist > @@ -1993,6 +1993,7 @@ GLIBC_2.28 thrd_current F > GLIBC_2.28 thrd_equal F > GLIBC_2.28 thrd_sleep F > GLIBC_2.28 thrd_yield F > +GLIBC_2.29 membarrier F > GLIBC_2.29 posix_spawn_file_actions_addchdir_np F > GLIBC_2.3 __ctype_b_loc F > GLIBC_2.3 __ctype_tolower_loc F Ok. > diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist > index da83ea6028..86862b6e16 100644 > --- a/sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist > +++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist > @@ -1997,6 +1997,7 @@ GLIBC_2.28 thrd_current F > GLIBC_2.28 thrd_equal F > GLIBC_2.28 thrd_sleep F > GLIBC_2.28 thrd_yield F > +GLIBC_2.29 membarrier F > GLIBC_2.29 posix_spawn_file_actions_addchdir_np F > GLIBC_2.3 __ctype_b_loc F > GLIBC_2.3 __ctype_tolower_loc F Ok. > diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc64/libc-le.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc64/libc-le.abilist > index 4535b40d15..57c0e67347 100644 > --- a/sysdeps/unix/sysv/linux/powerpc/powerpc64/libc-le.abilist > +++ b/sysdeps/unix/sysv/linux/powerpc/powerpc64/libc-le.abilist > @@ -2228,4 +2228,5 @@ GLIBC_2.28 thrd_current F > GLIBC_2.28 thrd_equal F > GLIBC_2.28 thrd_sleep F > GLIBC_2.28 thrd_yield F > +GLIBC_2.29 membarrier F > GLIBC_2.29 posix_spawn_file_actions_addchdir_np F Ok. > diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc64/libc.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc64/libc.abilist > index 65725de4f0..7658e2c2b9 100644 > --- a/sysdeps/unix/sysv/linux/powerpc/powerpc64/libc.abilist > +++ b/sysdeps/unix/sysv/linux/powerpc/powerpc64/libc.abilist > @@ -123,6 +123,7 @@ GLIBC_2.28 thrd_current F > GLIBC_2.28 thrd_equal F > GLIBC_2.28 thrd_sleep F > GLIBC_2.28 thrd_yield F > +GLIBC_2.29 membarrier F > GLIBC_2.29 posix_spawn_file_actions_addchdir_np F > GLIBC_2.3 _Exit F > GLIBC_2.3 _IO_2_1_stderr_ D 0xe0 Ok. > diff --git a/sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist b/sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist > index bbb3c4a8e7..f4020e881d 100644 > --- a/sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist > +++ b/sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist > @@ -2100,4 +2100,5 @@ GLIBC_2.28 thrd_current F > GLIBC_2.28 thrd_equal F > GLIBC_2.28 thrd_sleep F > GLIBC_2.28 thrd_yield F > +GLIBC_2.29 membarrier F > GLIBC_2.29 posix_spawn_file_actions_addchdir_np F Ok. > diff --git a/sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist b/sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist > index e85ac2a178..9476770e5b 100644 > --- a/sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist > +++ b/sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist > @@ -2002,6 +2002,7 @@ GLIBC_2.28 thrd_current F > GLIBC_2.28 thrd_equal F > GLIBC_2.28 thrd_sleep F > GLIBC_2.28 thrd_yield F > +GLIBC_2.29 membarrier F > GLIBC_2.29 posix_spawn_file_actions_addchdir_np F > GLIBC_2.3 __ctype_b_loc F > GLIBC_2.3 __ctype_tolower_loc F Ok. > diff --git a/sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist b/sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist > index d56931022c..ee9a87659f 100644 > --- a/sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist > +++ b/sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist > @@ -1908,6 +1908,7 @@ GLIBC_2.28 thrd_equal F > GLIBC_2.28 thrd_sleep F > GLIBC_2.28 thrd_yield F > GLIBC_2.29 __fentry__ F > +GLIBC_2.29 membarrier F > GLIBC_2.29 posix_spawn_file_actions_addchdir_np F > GLIBC_2.3 __ctype_b_loc F > GLIBC_2.3 __ctype_tolower_loc F Ok. > diff --git a/sysdeps/unix/sysv/linux/sh/libc.abilist b/sysdeps/unix/sysv/linux/sh/libc.abilist > index ff939a15c4..c2dd2a5f7e 100644 > --- a/sysdeps/unix/sysv/linux/sh/libc.abilist > +++ b/sysdeps/unix/sysv/linux/sh/libc.abilist > @@ -1884,6 +1884,7 @@ GLIBC_2.28 thrd_current F > GLIBC_2.28 thrd_equal F > GLIBC_2.28 thrd_sleep F > GLIBC_2.28 thrd_yield F > +GLIBC_2.29 membarrier F > GLIBC_2.29 posix_spawn_file_actions_addchdir_np F > GLIBC_2.3 __ctype_b_loc F > GLIBC_2.3 __ctype_tolower_loc F Ok. > diff --git a/sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist b/sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist > index 64fa9e10a5..55c6496b96 100644 > --- a/sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist > +++ b/sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist > @@ -1996,6 +1996,7 @@ GLIBC_2.28 thrd_current F > GLIBC_2.28 thrd_equal F > GLIBC_2.28 thrd_sleep F > GLIBC_2.28 thrd_yield F > +GLIBC_2.29 membarrier F > GLIBC_2.29 posix_spawn_file_actions_addchdir_np F > GLIBC_2.3 __ctype_b_loc F > GLIBC_2.3 __ctype_tolower_loc F Ok. > diff --git a/sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist b/sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist > index db909d1506..a0e7f2f221 100644 > --- a/sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist > +++ b/sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist > @@ -1937,6 +1937,7 @@ GLIBC_2.28 thrd_current F > GLIBC_2.28 thrd_equal F > GLIBC_2.28 thrd_sleep F > GLIBC_2.28 thrd_yield F > +GLIBC_2.29 membarrier F > GLIBC_2.29 posix_spawn_file_actions_addchdir_np F > GLIBC_2.3 __ctype_b_loc F > GLIBC_2.3 __ctype_tolower_loc F Ok. > diff --git a/sysdeps/unix/sysv/linux/sys/membarrier.h b/sysdeps/unix/sysv/linux/sys/membarrier.h > new file mode 100644 > index 0000000000..4c3e6164f7 > --- /dev/null > +++ b/sysdeps/unix/sysv/linux/sys/membarrier.h > @@ -0,0 +1,55 @@ > +/* Memory barriers. > + Copyright (C) 2018 Free Software Foundation, Inc. > + This file is part of the GNU C Library. > + > + The GNU C Library is free software; you can redistribute it and/or > + modify it under the terms of the GNU Lesser General Public > + License as published by the Free Software Foundation; either > + version 2.1 of the License, or (at your option) any later version. > + > + The GNU C Library is distributed in the hope that it will be useful, > + but WITHOUT ANY WARRANTY; without even the implied warranty of > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > + Lesser General Public License for more details. > + > + You should have received a copy of the GNU Lesser General Public > + License along with the GNU C Library; if not, see > + <http://www.gnu.org/licenses/>. */ > + > +#ifndef _SYS_MEMBARRIER_H > +#define _SYS_MEMBARRIER_H 1 > + > +#include <features.h> > + > +__BEGIN_DECLS > + > +/* Perform a memory barrier on multiple threads. */ > +int membarrier (int __op, int __flags) __THROW; > + > +__END_DECLS > + > +/* Obtain the definitions of the MEMBARRIER_CMD_* constants. */ > + > +#include <linux/version.h> > +#if LINUX_VERSION_CODE >= KERNEL_VERSION(4, 16, 0) > +# include <linux/membarrier.h> > +#else > + > +/* Definitions from Linux 4.16 follow. */ > + > +enum membarrier_cmd > +{ > + MEMBARRIER_CMD_QUERY = 0, > + MEMBARRIER_CMD_SHARED = 1, > + MEMBARRIER_CMD_GLOBAL = 1, > + MEMBARRIER_CMD_GLOBAL_EXPEDITED = 2, > + MEMBARRIER_CMD_REGISTER_GLOBAL_EXPEDITED = 4, > + MEMBARRIER_CMD_PRIVATE_EXPEDITED = 8, > + MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED = 16, > + MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE = 32, > + MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_SYNC_CORE = 64, > +}; > + > +#endif If we are replicating the values, meaning the idea is to keep it sync at least when we have the minimum supported kernel of 4.16, why just not add a comment to add linux/membarrier.h once the minimum supported kernel provides this header and not rely on linux/membarrier.h? Also I think the minimum kernel that provides this header is 4.3, however by using 4.3 as the condition to include the kernel header in add another issue which is glibc will have different semantic depending of the installed header. This fallback enum definition is also lacking MEMBARRIER_CMD_SHARED, and although is provided by kernel headers just for compatibility, it is another interface difference it has depending of the installed kernel header. Personally I prefer to decouple glibc headers from kernel ones, so I would go just with define membarrier_cmd (and the flags once its is implemented by kernel) with the burden of keep them in sync with kernel releases (as we do for various syscalls and headers). > + > +#endif /* _SYS_MEMBARRIER_H */ > diff --git a/sysdeps/unix/sysv/linux/syscalls.list b/sysdeps/unix/sysv/linux/syscalls.list > index e24ea29e35..3deee2bc19 100644 > --- a/sysdeps/unix/sysv/linux/syscalls.list > +++ b/sysdeps/unix/sysv/linux/syscalls.list > @@ -112,3 +112,4 @@ process_vm_writev EXTRA process_vm_writev i:ipipii process_vm_writev > memfd_create EXTRA memfd_create i:si memfd_create > pkey_alloc EXTRA pkey_alloc i:ii pkey_alloc > pkey_free EXTRA pkey_free i:i pkey_free > +membarrier EXTRA membarrier i:ii membarrier Ok. > diff --git a/sysdeps/unix/sysv/linux/tst-membarrier.c b/sysdeps/unix/sysv/linux/tst-membarrier.c > new file mode 100644 > index 0000000000..aeaccad578 > --- /dev/null > +++ b/sysdeps/unix/sysv/linux/tst-membarrier.c > @@ -0,0 +1,61 @@ > +/* Tests for the membarrier function. > + Copyright (C) 2017 Free Software Foundation, Inc. Update copyright year. > + This file is part of the GNU C Library. > + > + The GNU C Library is free software; you can redistribute it and/or > + modify it under the terms of the GNU Lesser General Public > + License as published by the Free Software Foundation; either > + version 2.1 of the License, or (at your option) any later version. > + > + The GNU C Library is distributed in the hope that it will be useful, > + but WITHOUT ANY WARRANTY; without even the implied warranty of > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > + Lesser General Public License for more details. > + > + You should have received a copy of the GNU Lesser General Public > + License along with the GNU C Library; if not, see > + <http://www.gnu.org/licenses/>. */ > + > +#include <errno.h> > +#include <stdio.h> > +#include <support/check.h> > +#include <sys/membarrier.h> > + > +static int > +do_test (void) > +{ > + int supported = membarrier (MEMBARRIER_CMD_QUERY, 0); > + if (supported == -1) > + { > + if (errno == ENOSYS) > + FAIL_UNSUPPORTED ("membarrier system call not implemented"); > + else > + FAIL_EXIT1 ("membarrier: %m"); > + } > + > + if ((supported & MEMBARRIER_CMD_GLOBAL) == 0) > + FAIL_UNSUPPORTED ("global memory barriers not supported"); > + > + puts ("info: membarrier is supported on this system"); > + > + /* The global barrier is always implemented. */ > + TEST_COMPARE (supported & MEMBARRIER_CMD_GLOBAL, MEMBARRIER_CMD_GLOBAL); > + TEST_COMPARE (membarrier (MEMBARRIER_CMD_GLOBAL, 0), 0); > + > + /* If the private-expedited barrier is advertised, execute it after > + registering the intent. */ > + if (supported & MEMBARRIER_CMD_PRIVATE_EXPEDITED) > + { > + TEST_COMPARE (supported & MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED, > + MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED); > + TEST_COMPARE (membarrier (MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED, 0), > + 0); > + TEST_COMPARE (membarrier (MEMBARRIER_CMD_PRIVATE_EXPEDITED, 0), 0); > + } > + else > + puts ("info: MEMBARRIER_CMD_PRIVATE_EXPEDITED not supported"); > + > + return 0; > +} > + > +#include <support/test-driver.c> Ok. > diff --git a/sysdeps/unix/sysv/linux/x86_64/64/libc.abilist b/sysdeps/unix/sysv/linux/x86_64/64/libc.abilist > index 3b175f104b..f097eb86e2 100644 > --- a/sysdeps/unix/sysv/linux/x86_64/64/libc.abilist > +++ b/sysdeps/unix/sysv/linux/x86_64/64/libc.abilist > @@ -1895,6 +1895,7 @@ GLIBC_2.28 thrd_current F > GLIBC_2.28 thrd_equal F > GLIBC_2.28 thrd_sleep F > GLIBC_2.28 thrd_yield F > +GLIBC_2.29 membarrier F > GLIBC_2.29 posix_spawn_file_actions_addchdir_np F > GLIBC_2.3 __ctype_b_loc F > GLIBC_2.3 __ctype_tolower_loc F Ok. > diff --git a/sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist b/sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist > index 1b57710477..afb1a196fa 100644 > --- a/sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist > +++ b/sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist > @@ -2146,4 +2146,5 @@ GLIBC_2.28 thrd_current F > GLIBC_2.28 thrd_equal F > GLIBC_2.28 thrd_sleep F > GLIBC_2.28 thrd_yield F > +GLIBC_2.29 membarrier F > GLIBC_2.29 posix_spawn_file_actions_addchdir_np F > Ok.
* Adhemerval Zanella: > If we are replicating the values, meaning the idea is to keep it sync at least > when we have the minimum supported kernel of 4.16, why just not add a comment > to add linux/membarrier.h once the minimum supported kernel provides this > header and not rely on linux/membarrier.h? This way, we can compile the test with any supported kernel headers for glibc. If we defer to <linux/membarrier.h> unconditionally, we cannot build the test with all kernel headers. The current approach definitely makes the test case much cleaner. > Also I think the minimum kernel that provides this header is 4.3, however > by using 4.3 as the condition to include the kernel header in add another > issue which is glibc will have different semantic depending of the installed > header. This fallback enum definition is also lacking MEMBARRIER_CMD_SHARED, > and although is provided by kernel headers just for compatibility, it is > another interface difference it has depending of the installed kernel > header. MEMBARRIER_CMD_SHARED is included. Thanks, Florian
Are you going to open a bug report in Bugzilla for the lack of documentation (discussing exactly what the issues are that need to be clarified to document the function)? If in any particular case there is sufficient justification for not documenting a new function, I'd expect such a bug report as a minimum to say what needs to be done to add documentation. (The most common case for not adding documentation is probably that the function joins a whole family of existing functions lacking documentation, in which case one bug report for the whole family suffices, like bug 10891 covering all the *_l functions missing documentation.)
On 05/12/2018 12:51, Florian Weimer wrote: > * Adhemerval Zanella: > >> If we are replicating the values, meaning the idea is to keep it sync at least >> when we have the minimum supported kernel of 4.16, why just not add a comment >> to add linux/membarrier.h once the minimum supported kernel provides this >> header and not rely on linux/membarrier.h? > > This way, we can compile the test with any supported kernel headers for > glibc. > > If we defer to <linux/membarrier.h> unconditionally, we cannot build the > test with all kernel headers. The current approach definitely makes the > test case much cleaner. > My point is exactly to *not* rely on kernel headers. >> Also I think the minimum kernel that provides this header is 4.3, however >> by using 4.3 as the condition to include the kernel header in add another >> issue which is glibc will have different semantic depending of the installed >> header. This fallback enum definition is also lacking MEMBARRIER_CMD_SHARED, >> and although is provided by kernel headers just for compatibility, it is >> another interface difference it has depending of the installed kernel >> header. > > MEMBARRIER_CMD_SHARED is included. > Right, my mistake here.
* Adhemerval Zanella: > On 05/12/2018 12:51, Florian Weimer wrote: >> * Adhemerval Zanella: >> >>> If we are replicating the values, meaning the idea is to keep it sync at least >>> when we have the minimum supported kernel of 4.16, why just not add a comment >>> to add linux/membarrier.h once the minimum supported kernel provides this >>> header and not rely on linux/membarrier.h? >> >> This way, we can compile the test with any supported kernel headers for >> glibc. >> >> If we defer to <linux/membarrier.h> unconditionally, we cannot build the >> test with all kernel headers. The current approach definitely makes the >> test case much cleaner. >> > > My point is exactly to *not* rely on kernel headers. I thought there was a general desire to move in the opposite direction, avoiding copying declarations and definitions in clean/new headers that are dedicated to a specific purpose? I don't think we can add much value by copying the contents of those headers. We can even relax the version check if we use __has_include (which has to be remain optional in installed headers, of course). Thanks, Florian
On 05/12/2018 16:11, Florian Weimer wrote: > * Adhemerval Zanella: > >> On 05/12/2018 12:51, Florian Weimer wrote: >>> * Adhemerval Zanella: >>> >>>> If we are replicating the values, meaning the idea is to keep it sync at least >>>> when we have the minimum supported kernel of 4.16, why just not add a comment >>>> to add linux/membarrier.h once the minimum supported kernel provides this >>>> header and not rely on linux/membarrier.h? >>> >>> This way, we can compile the test with any supported kernel headers for >>> glibc. >>> >>> If we defer to <linux/membarrier.h> unconditionally, we cannot build the >>> test with all kernel headers. The current approach definitely makes the >>> test case much cleaner. >>> >> >> My point is exactly to *not* rely on kernel headers. > > I thought there was a general desire to move in the opposite direction, > avoiding copying declarations and definitions in clean/new headers that > are dedicated to a specific purpose? I don't recall this discussion, do you have the link? In any case I still think mixing header inclusion with the definition duplication is just a double effort, it will still require syncing the definitions on each kernel release. > > I don't think we can add much value by copying the contents of those > headers. We can even relax the version check if we use __has_include > (which has to be remain optional in installed headers, of course). This example give us why trying to relying in kernel header to provide glibc exported interfaces are at least tricky, imho. The issues I can think of are: - The header would require to be include safe in all possible releases and for all possible permutations, which means if something is broken we need to rely on kernel to actually fix it. - glibc might provide different semantic to users depending of the underlying installed kernel header version. It means for two identical glibc versions, some programs might fail to build depending of the installed kernel version.
* Adhemerval Zanella: > On 05/12/2018 16:11, Florian Weimer wrote: >> * Adhemerval Zanella: >> >>> On 05/12/2018 12:51, Florian Weimer wrote: >>>> * Adhemerval Zanella: >>>> >>>>> If we are replicating the values, meaning the idea is to keep it sync at least >>>>> when we have the minimum supported kernel of 4.16, why just not add a comment >>>>> to add linux/membarrier.h once the minimum supported kernel provides this >>>>> header and not rely on linux/membarrier.h? >>>> >>>> This way, we can compile the test with any supported kernel headers for >>>> glibc. >>>> >>>> If we defer to <linux/membarrier.h> unconditionally, we cannot build the >>>> test with all kernel headers. The current approach definitely makes the >>>> test case much cleaner. >>>> >>> >>> My point is exactly to *not* rely on kernel headers. >> >> I thought there was a general desire to move in the opposite direction, >> avoiding copying declarations and definitions in clean/new headers that >> are dedicated to a specific purpose? > > I don't recall this discussion, do you have the link? I will try to find it. > In any case I still think mixing header inclusion with the definition > duplication is just a double effort, it will still require syncing the > definitions on each kernel release. I think that's not actually true. We only need to do this if we need newer definitions for building glibc itself (including the test suite). >> I don't think we can add much value by copying the contents of those >> headers. We can even relax the version check if we use __has_include >> (which has to be remain optional in installed headers, of course). > This example give us why trying to relying in kernel header to provide > glibc exported interfaces are at least tricky, imho. The issues I can > think of are: > > - The header would require to be include safe in all possible releases > and for all possible permutations, which means if something is broken > we need to rely on kernel to actually fix it. I'm not too worried about this for new headers with a dedicated purpose. Using the UAPI will actually make things work in more cases; see below. > - glibc might provide different semantic to users depending of the > underlying installed kernel header version. It means for two > identical glibc versions, some programs might fail to build > depending of the installed kernel version. Well, that already happens with the duplicated header approach if users want to use UAPI headers, too. It requires complicated synchronization to make it work. We can tweak the conditional for the header inclusion further, like this: +#if LINUX_VERSION_CODE >= KERNEL_VERSION(4, 16, 0) \ + || __glibc_has_include (<linux/membarrier.h>) \ +# include <linux/membarrier.h> +#else With GCC 5 and later, this would always prefer the kernel header if available. There would never be a conflict between the definitions, irrespective of header file inclusion order. What do you think? Thanks, Florian
* Joseph Myers: > Are you going to open a bug report in Bugzilla for the lack of > documentation (discussing exactly what the issues are that need to be > clarified to document the function)? A filed a bug for the general issue of lack of documentation for the memory model: <https://sourceware.org/bugzilla/show_bug.cgi?id=23955> Thanks, Florian
On 05/12/2018 17:53, Florian Weimer wrote: > * Adhemerval Zanella: > >> On 05/12/2018 16:11, Florian Weimer wrote: >>> * Adhemerval Zanella: >>> >>>> On 05/12/2018 12:51, Florian Weimer wrote: >>>>> * Adhemerval Zanella: >>>>> >>>>>> If we are replicating the values, meaning the idea is to keep it sync at least >>>>>> when we have the minimum supported kernel of 4.16, why just not add a comment >>>>>> to add linux/membarrier.h once the minimum supported kernel provides this >>>>>> header and not rely on linux/membarrier.h? >>>>> >>>>> This way, we can compile the test with any supported kernel headers for >>>>> glibc. >>>>> >>>>> If we defer to <linux/membarrier.h> unconditionally, we cannot build the >>>>> test with all kernel headers. The current approach definitely makes the >>>>> test case much cleaner. >>>>> >>>> >>>> My point is exactly to *not* rely on kernel headers. >>> >>> I thought there was a general desire to move in the opposite direction, >>> avoiding copying declarations and definitions in clean/new headers that >>> are dedicated to a specific purpose? >> >> I don't recall this discussion, do you have the link? > > I will try to find it. > >> In any case I still think mixing header inclusion with the definition >> duplication is just a double effort, it will still require syncing the >> definitions on each kernel release. > > I think that's not actually true. We only need to do this if we need > newer definitions for building glibc itself (including the test suite). > >>> I don't think we can add much value by copying the contents of those >>> headers. We can even relax the version check if we use __has_include >>> (which has to be remain optional in installed headers, of course). > >> This example give us why trying to relying in kernel header to provide >> glibc exported interfaces are at least tricky, imho. The issues I can >> think of are: >> >> - The header would require to be include safe in all possible releases >> and for all possible permutations, which means if something is broken >> we need to rely on kernel to actually fix it. > > I'm not too worried about this for new headers with a dedicated purpose. > Using the UAPI will actually make things work in more cases; see below. > >> - glibc might provide different semantic to users depending of the >> underlying installed kernel header version. It means for two >> identical glibc versions, some programs might fail to build >> depending of the installed kernel version. > > Well, that already happens with the duplicated header approach if users > want to use UAPI headers, too. It requires complicated synchronization > to make it work. But at least glibc is not imposing it, if user want to use kernel uapi header he will explicit include it and it is up to kernel provide a sane implementation. > > We can tweak the conditional for the header inclusion further, like > this: > > +#if LINUX_VERSION_CODE >= KERNEL_VERSION(4, 16, 0) \ > + || __glibc_has_include (<linux/membarrier.h>) \ > +# include <linux/membarrier.h> > +#else > > With GCC 5 and later, this would always prefer the kernel header if > available. There would never be a conflict between the definitions, > irrespective of header file inclusion order. > > What do you think? In any case I don't think my suggestion should be a blocker, we already rely on linux header for some cases and it seems I am alone in trying to decouple glibc from kernel headers.
Hello, David, I took a crack at extending LKMM to accommodate what I think would support what you have in your paper. Please see the very end of this email for a patch against the "dev" branch of my -rcu tree. This gives the expected result for the following three litmus tests, but is probably deficient or otherwise misguided in other ways. I have added the LKMM maintainers on CC for their amusement. ;-) Thoughts? Thanx, Paul ------------------------------------------------------------------------ C C-Goldblat-memb-1 { } P0(int *x0, int *x1) { WRITE_ONCE(*x0, 1); r1 = READ_ONCE(*x1); } P1(int *x0, int *x1) { WRITE_ONCE(*x1, 1); smp_memb(); r2 = READ_ONCE(*x0); } exists (0:r1=0 /\ 1:r2=0) ------------------------------------------------------------------------ C C-Goldblat-memb-2 { } P0(int *x0, int *x1) { WRITE_ONCE(*x0, 1); r1 = READ_ONCE(*x1); } P1(int *x1, int *x2) { WRITE_ONCE(*x1, 1); smp_memb(); r1 = READ_ONCE(*x2); } P2(int *x2, int *x0) { WRITE_ONCE(*x2, 1); r1 = READ_ONCE(*x0); } exists (0:r1=0 /\ 1:r1=0 /\ 2:r1=0) ------------------------------------------------------------------------ C C-Goldblat-memb-3 { } P0(int *x0, int *x1) { WRITE_ONCE(*x0, 1); r1 = READ_ONCE(*x1); } P1(int *x1, int *x2) { WRITE_ONCE(*x1, 1); smp_memb(); r1 = READ_ONCE(*x2); } P2(int *x2, int *x3) { WRITE_ONCE(*x2, 1); r1 = READ_ONCE(*x3); } P3(int *x3, int *x0) { WRITE_ONCE(*x3, 1); smp_memb(); r1 = READ_ONCE(*x0); } exists (0:r1=0 /\ 1:r1=0 /\ 2:r1=0 /\ 3:r1=0) ------------------------------------------------------------------------ On Thu, Nov 29, 2018 at 11:02:17AM -0800, David Goldblatt wrote: > One note with the suggested patch is that > `atomic_thread_fence(memory_order_acq_rel)` should probably be > `atomic_thread_fence (memory_order_seq_cst)` (otherwise the call would > be a no-op on, say, x86, which it very much isn't). > > The non-transitivity thing makes the resulting description arguably > incorrect, but this is informal enough that it might not be a big deal > to add something after "For these threads, the membarrier function > call turns an existing compiler barrier (see above) executed by these > threads into full memory barriers" that clarifies it. E.g. you could > make it into "turns an existing compiler barrier [...] into full > memory barriers, with respect to the calling thread". > > Since this is targeting the description of the OS call (and doesn't > have to concern itself with also being implementable by other > asymmetric techniques or degrading to architectural barriers), I think > that the description in "approach 2" in P1202 would also make sense > for a formal description of the syscall. (Of course, without the > kernel itself committing to a rigorous semantics, anything specified > on top of it will be on slightly shaky ground). > > - David > > On Thu, Nov 29, 2018 at 7:04 AM Paul E. McKenney <paulmck@linux.ibm.com> wrote: > > > > On Thu, Nov 29, 2018 at 09:44:22AM -0500, Mathieu Desnoyers wrote: > > > ----- On Nov 29, 2018, at 8:50 AM, Florian Weimer fweimer@redhat.com wrote: > > > > > > > * Torvald Riegel: > > > > > > > >> On Wed, 2018-11-28 at 16:05 +0100, Florian Weimer wrote: > > > >>> This is essentially a repost of last year's patch, rebased to the glibc > > > >>> 2.29 symbol version and reflecting the introduction of > > > >>> MEMBARRIER_CMD_GLOBAL. > > > >>> > > > >>> I'm not including any changes to manual/ here because the set of > > > >>> supported operations is evolving rapidly, we could not get consensus for > > > >>> the language I proposed the last time, and I do not want to contribute > > > >>> to the manual for the time being. > > > >> > > > >> Fair enough. Nonetheless, can you summarize how far you're along with > > > >> properly defining the semantics (eg, based on the C/C++ memory model)? > > > > > > > > I wrote down what you could, but no one liked it. > > > > > > > > <https://sourceware.org/ml/libc-alpha/2017-12/msg00796.html> > > > > > > > > I expect that a formalization would interact in non-trivial ways with > > > > any potential formalization of usable relaxed memory order semantics, > > > > and I'm not sure if anyone knows how to do the latter today. > > > > > > Adding Paul E. McKenney in CC. > > > > There is some prototype C++ memory model wording from David Goldblatt (CCed) > > here (search for "Standarese"): > > > > http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p1202r0.pdf > > > > David's key insight is that (in Linuxese) light fences cannot pair with > > each other. ------------------------------------------------------------------------ commit 17e3b6b60e57d1cb791f68a1a6a36e942cb2baad Author: Paul E. McKenney <paulmck@linux.ibm.com> Date: Thu Dec 6 13:40:40 2018 -0800 EXP tools/memory-model: Add semantics for sys_membarrier() This prototype commit extends LKMM to accommodate sys_membarrier(), which is a asymmetric barrier with a limited ability to insert full ordering into tasks that provide only compiler ordering. This commit currently uses the "po" relation for this purpose, but something more sophisticated will be required when plain accesses are added, which the compiler can reorder. For more detail, please see David Goldblatt's C++ working paper: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p1202r0.pdf Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com> diff --git a/tools/memory-model/linux-kernel.bell b/tools/memory-model/linux-kernel.bell index 9c42cd9ddcb4..4ef41453f569 100644 --- a/tools/memory-model/linux-kernel.bell +++ b/tools/memory-model/linux-kernel.bell @@ -24,6 +24,7 @@ instructions RMW[{'once,'acquire,'release}] enum Barriers = 'wmb (*smp_wmb*) || 'rmb (*smp_rmb*) || 'mb (*smp_mb*) || + 'memb (*sys_membarrier*) || 'rcu-lock (*rcu_read_lock*) || 'rcu-unlock (*rcu_read_unlock*) || 'sync-rcu (*synchronize_rcu*) || diff --git a/tools/memory-model/linux-kernel.cat b/tools/memory-model/linux-kernel.cat index 8dcb37835b61..837c3ee20bea 100644 --- a/tools/memory-model/linux-kernel.cat +++ b/tools/memory-model/linux-kernel.cat @@ -33,9 +33,10 @@ let mb = ([M] ; fencerel(Mb) ; [M]) | ([M] ; po? ; [LKW] ; fencerel(After-spinlock) ; [M]) | ([M] ; po ; [UL] ; (co | po) ; [LKW] ; fencerel(After-unlock-lock) ; [M]) +let memb = [M] ; fencerel(Memb) ; [M] let gp = po ; [Sync-rcu | Sync-srcu] ; po? -let strong-fence = mb | gp +let strong-fence = mb | gp | memb (* Release Acquire *) let acq-po = [Acquire] ; po ; [M] @@ -86,6 +87,13 @@ acyclic hb as happens-before let pb = prop ; strong-fence ; hb* acyclic pb as propagation +(********************) +(* sys_membarrier() *) +(********************) + +let memb-step = ( prop ; po ; prop )? ; memb +acyclic memb-step as memb-before + (*******) (* RCU *) (*******) diff --git a/tools/memory-model/linux-kernel.def b/tools/memory-model/linux-kernel.def index 1d6a120cde14..9ff0691c5f2c 100644 --- a/tools/memory-model/linux-kernel.def +++ b/tools/memory-model/linux-kernel.def @@ -17,6 +17,7 @@ rcu_dereference(X) __load{once}(X) smp_store_mb(X,V) { __store{once}(X,V); __fence{mb}; } // Fences +smp_memb() { __fence{memb}; } smp_mb() { __fence{mb}; } smp_rmb() { __fence{rmb}; } smp_wmb() { __fence{wmb}; }
On Thu, 6 Dec 2018, Paul E. McKenney wrote: > Hello, David, > > I took a crack at extending LKMM to accommodate what I think would > support what you have in your paper. Please see the very end of this > email for a patch against the "dev" branch of my -rcu tree. > > This gives the expected result for the following three litmus tests, > but is probably deficient or otherwise misguided in other ways. I have > added the LKMM maintainers on CC for their amusement. ;-) > > Thoughts? Since sys_membarrier() provides a heavyweight barrier comparable to synchronize_rcu(), the memory model should treat the two in the same way. That's what this patch does. The corresponding critical section would be any region of code bounded by compiler barriers. Since the LKMM doesn't currently handle plain accesses, the effect is the same as if a compiler barrier were present between each pair of instructions. Basically, each instruction acts as its own critical section. Therefore the patch below defines memb-rscsi as the trivial identity relation. When plain accesses and compiler barriers are added to the memory model, a different definition will be needed. This gives the correct results for the three C-Goldblat-memb-* litmus tests in Paul's email. Alan PS: The patch below is meant to apply on top of the SRCU patches, which are not yet in the mainline kernel. Index: usb-4.x/tools/memory-model/linux-kernel.bell =================================================================== --- usb-4.x.orig/tools/memory-model/linux-kernel.bell +++ usb-4.x/tools/memory-model/linux-kernel.bell @@ -24,6 +24,7 @@ instructions RMW[{'once,'acquire,'releas enum Barriers = 'wmb (*smp_wmb*) || 'rmb (*smp_rmb*) || 'mb (*smp_mb*) || + 'memb (*sys_membarrier*) || 'rcu-lock (*rcu_read_lock*) || 'rcu-unlock (*rcu_read_unlock*) || 'sync-rcu (*synchronize_rcu*) || Index: usb-4.x/tools/memory-model/linux-kernel.cat =================================================================== --- usb-4.x.orig/tools/memory-model/linux-kernel.cat +++ usb-4.x/tools/memory-model/linux-kernel.cat @@ -33,7 +33,7 @@ let mb = ([M] ; fencerel(Mb) ; [M]) | ([M] ; po? ; [LKW] ; fencerel(After-spinlock) ; [M]) | ([M] ; po ; [UL] ; (co | po) ; [LKW] ; fencerel(After-unlock-lock) ; [M]) -let gp = po ; [Sync-rcu | Sync-srcu] ; po? +let gp = po ; [Sync-rcu | Sync-srcu | Memb] ; po? let strong-fence = mb | gp @@ -102,8 +102,10 @@ acyclic pb as propagation *) let rcu-gp = [Sync-rcu] (* Compare with gp *) let srcu-gp = [Sync-srcu] +let memb-gp = [Memb] let rcu-rscsi = rcu-rscs^-1 let srcu-rscsi = srcu-rscs^-1 +let memb-rscsi = id (* * The synchronize_rcu() strong fence is special in that it can order not @@ -119,15 +121,19 @@ let rcu-link = po? ; hb* ; pb* ; prop ; * the synchronize_srcu() and srcu_read_[un]lock() calls refer to the same * struct srcu_struct location. *) -let rec rcu-fence = rcu-gp | srcu-gp | +let rec rcu-fence = rcu-gp | srcu-gp | memb-gp | (rcu-gp ; rcu-link ; rcu-rscsi) | ((srcu-gp ; rcu-link ; srcu-rscsi) & loc) | + (memb-gp ; rcu-link ; memb-rscsi) | (rcu-rscsi ; rcu-link ; rcu-gp) | ((srcu-rscsi ; rcu-link ; srcu-gp) & loc) | + (memb-rscsi ; rcu-link ; memb-gp) | (rcu-gp ; rcu-link ; rcu-fence ; rcu-link ; rcu-rscsi) | ((srcu-gp ; rcu-link ; rcu-fence ; rcu-link ; srcu-rscsi) & loc) | + (memb-gp ; rcu-link ; rcu-fence ; rcu-link ; memb-rscsi) | (rcu-rscsi ; rcu-link ; rcu-fence ; rcu-link ; rcu-gp) | ((srcu-rscsi ; rcu-link ; rcu-fence ; rcu-link ; srcu-gp) & loc) | + (memb-rscsi ; rcu-link ; rcu-fence ; rcu-link ; memb-gp) | (rcu-fence ; rcu-link ; rcu-fence) (* rb orders instructions just as pb does *) Index: usb-4.x/tools/memory-model/linux-kernel.def =================================================================== --- usb-4.x.orig/tools/memory-model/linux-kernel.def +++ usb-4.x/tools/memory-model/linux-kernel.def @@ -20,6 +20,7 @@ smp_store_mb(X,V) { __store{once}(X,V); smp_mb() { __fence{mb}; } smp_rmb() { __fence{rmb}; } smp_wmb() { __fence{wmb}; } +smp_memb() { __fence{memb}; } smp_mb__before_atomic() { __fence{before-atomic}; } smp_mb__after_atomic() { __fence{after-atomic}; } smp_mb__after_spinlock() { __fence{after-spinlock}; }
On Mon, Dec 10, 2018 at 11:22:31AM -0500, Alan Stern wrote: > On Thu, 6 Dec 2018, Paul E. McKenney wrote: > > > Hello, David, > > > > I took a crack at extending LKMM to accommodate what I think would > > support what you have in your paper. Please see the very end of this > > email for a patch against the "dev" branch of my -rcu tree. > > > > This gives the expected result for the following three litmus tests, > > but is probably deficient or otherwise misguided in other ways. I have > > added the LKMM maintainers on CC for their amusement. ;-) > > > > Thoughts? > > Since sys_membarrier() provides a heavyweight barrier comparable to > synchronize_rcu(), the memory model should treat the two in the same > way. That's what this patch does. > > The corresponding critical section would be any region of code bounded > by compiler barriers. Since the LKMM doesn't currently handle plain > accesses, the effect is the same as if a compiler barrier were present > between each pair of instructions. Basically, each instruction acts as > its own critical section. Therefore the patch below defines memb-rscsi > as the trivial identity relation. When plain accesses and compiler > barriers are added to the memory model, a different definition will be > needed. > > This gives the correct results for the three C-Goldblat-memb-* litmus > tests in Paul's email. Yow!!! My first reaction was that this cannot possibly be correct because sys_membarrier(), which is probably what we should call it, does not wait for anything. But your formulation has the corresponding readers being "id", which as you say above is just a single event. But what makes this work for the following litmus test? ------------------------------------------------------------------------ C membrcu { } P0(intptr_t *x0, intptr_t *x1) { WRITE_ONCE(*x0, 2); smp_memb(); intptr_t r2 = READ_ONCE(*x1); } P1(intptr_t *x1, intptr_t *x2) { WRITE_ONCE(*x1, 2); smp_memb(); intptr_t r2 = READ_ONCE(*x2); } P2(intptr_t *x2, intptr_t *x3) { WRITE_ONCE(*x2, 2); smp_memb(); intptr_t r2 = READ_ONCE(*x3); } P3(intptr_t *x3, intptr_t *x4) { rcu_read_lock(); WRITE_ONCE(*x3, 2); intptr_t r2 = READ_ONCE(*x4); rcu_read_unlock(); } P4(intptr_t *x4, intptr_t *x5) { rcu_read_lock(); WRITE_ONCE(*x4, 2); intptr_t r2 = READ_ONCE(*x5); rcu_read_unlock(); } P5(intptr_t *x0, intptr_t *x5) { rcu_read_lock(); WRITE_ONCE(*x5, 2); intptr_t r2 = READ_ONCE(*x0); rcu_read_unlock(); } exists (5:r2=0 /\ 0:r2=0 /\ 1:r2=0 /\ 2:r2=0 /\ 3:r2=0 /\ 4:r2=0) ------------------------------------------------------------------------ For this, herd gives "Never". Of course, if I reverse the write and read in any of P3(), P4(), or P5(), I get "Sometimes", which does make sense. But what is preserving the order between P3() and P4() and between P4() and P5()? I am not immediately seeing how the analogy with RCU carries over to this case. Thanx, Paul > Alan > > PS: The patch below is meant to apply on top of the SRCU patches, which > are not yet in the mainline kernel. > > > > Index: usb-4.x/tools/memory-model/linux-kernel.bell > =================================================================== > --- usb-4.x.orig/tools/memory-model/linux-kernel.bell > +++ usb-4.x/tools/memory-model/linux-kernel.bell > @@ -24,6 +24,7 @@ instructions RMW[{'once,'acquire,'releas > enum Barriers = 'wmb (*smp_wmb*) || > 'rmb (*smp_rmb*) || > 'mb (*smp_mb*) || > + 'memb (*sys_membarrier*) || > 'rcu-lock (*rcu_read_lock*) || > 'rcu-unlock (*rcu_read_unlock*) || > 'sync-rcu (*synchronize_rcu*) || > Index: usb-4.x/tools/memory-model/linux-kernel.cat > =================================================================== > --- usb-4.x.orig/tools/memory-model/linux-kernel.cat > +++ usb-4.x/tools/memory-model/linux-kernel.cat > @@ -33,7 +33,7 @@ let mb = ([M] ; fencerel(Mb) ; [M]) | > ([M] ; po? ; [LKW] ; fencerel(After-spinlock) ; [M]) | > ([M] ; po ; [UL] ; (co | po) ; [LKW] ; > fencerel(After-unlock-lock) ; [M]) > -let gp = po ; [Sync-rcu | Sync-srcu] ; po? > +let gp = po ; [Sync-rcu | Sync-srcu | Memb] ; po? > > let strong-fence = mb | gp > > @@ -102,8 +102,10 @@ acyclic pb as propagation > *) > let rcu-gp = [Sync-rcu] (* Compare with gp *) > let srcu-gp = [Sync-srcu] > +let memb-gp = [Memb] > let rcu-rscsi = rcu-rscs^-1 > let srcu-rscsi = srcu-rscs^-1 > +let memb-rscsi = id > > (* > * The synchronize_rcu() strong fence is special in that it can order not > @@ -119,15 +121,19 @@ let rcu-link = po? ; hb* ; pb* ; prop ; > * the synchronize_srcu() and srcu_read_[un]lock() calls refer to the same > * struct srcu_struct location. > *) > -let rec rcu-fence = rcu-gp | srcu-gp | > +let rec rcu-fence = rcu-gp | srcu-gp | memb-gp | > (rcu-gp ; rcu-link ; rcu-rscsi) | > ((srcu-gp ; rcu-link ; srcu-rscsi) & loc) | > + (memb-gp ; rcu-link ; memb-rscsi) | > (rcu-rscsi ; rcu-link ; rcu-gp) | > ((srcu-rscsi ; rcu-link ; srcu-gp) & loc) | > + (memb-rscsi ; rcu-link ; memb-gp) | > (rcu-gp ; rcu-link ; rcu-fence ; rcu-link ; rcu-rscsi) | > ((srcu-gp ; rcu-link ; rcu-fence ; rcu-link ; srcu-rscsi) & loc) | > + (memb-gp ; rcu-link ; rcu-fence ; rcu-link ; memb-rscsi) | > (rcu-rscsi ; rcu-link ; rcu-fence ; rcu-link ; rcu-gp) | > ((srcu-rscsi ; rcu-link ; rcu-fence ; rcu-link ; srcu-gp) & loc) | > + (memb-rscsi ; rcu-link ; rcu-fence ; rcu-link ; memb-gp) | > (rcu-fence ; rcu-link ; rcu-fence) > > (* rb orders instructions just as pb does *) > Index: usb-4.x/tools/memory-model/linux-kernel.def > =================================================================== > --- usb-4.x.orig/tools/memory-model/linux-kernel.def > +++ usb-4.x/tools/memory-model/linux-kernel.def > @@ -20,6 +20,7 @@ smp_store_mb(X,V) { __store{once}(X,V); > smp_mb() { __fence{mb}; } > smp_rmb() { __fence{rmb}; } > smp_wmb() { __fence{wmb}; } > +smp_memb() { __fence{memb}; } > smp_mb__before_atomic() { __fence{before-atomic}; } > smp_mb__after_atomic() { __fence{after-atomic}; } > smp_mb__after_spinlock() { __fence{after-spinlock}; } >
Hi Paul, thank you for thinking about all this. I think the modelling you suggest captures most of the algorithms I would want to write. I think it's slightly too weak, though, to implement the model suggested in P1202R0[1], which permits the SC outcome to be recovered in C-Goldblat-memb-2[2] by inserting a second smp_memb() after the first, which is a rather nice property (and I believe is supported by the underlying implementation options). I afraid though that I'm not familiar enough with the Linux herd definitions to suggest a tweak (or know how easy a tweak might be). - David [1] Which I think may be strengthened a little bit more even in R1. [2] As a nit, my name has two "t"'s in it, although I'd throw into the ring "memb-pairwise", "memb-nontransitive", and "memb-sequenced" if these get non-placeholder names. On Thu, Dec 6, 2018 at 1:54 PM Paul E. McKenney <paulmck@linux.ibm.com> wrote: > > Hello, David, > > I took a crack at extending LKMM to accommodate what I think would > support what you have in your paper. Please see the very end of this > email for a patch against the "dev" branch of my -rcu tree. > > This gives the expected result for the following three litmus tests, > but is probably deficient or otherwise misguided in other ways. I have > added the LKMM maintainers on CC for their amusement. ;-) > > Thoughts? > > Thanx, Paul > > ------------------------------------------------------------------------ > > C C-Goldblat-memb-1 > { > } > > P0(int *x0, int *x1) > { > WRITE_ONCE(*x0, 1); > r1 = READ_ONCE(*x1); > } > > > P1(int *x0, int *x1) > { > WRITE_ONCE(*x1, 1); > smp_memb(); > r2 = READ_ONCE(*x0); > } > > exists (0:r1=0 /\ 1:r2=0) > > ------------------------------------------------------------------------ > > C C-Goldblat-memb-2 > { > } > > P0(int *x0, int *x1) > { > WRITE_ONCE(*x0, 1); > r1 = READ_ONCE(*x1); > } > > > P1(int *x1, int *x2) > { > WRITE_ONCE(*x1, 1); > smp_memb(); > r1 = READ_ONCE(*x2); > } > > P2(int *x2, int *x0) > { > WRITE_ONCE(*x2, 1); > r1 = READ_ONCE(*x0); > } > > exists (0:r1=0 /\ 1:r1=0 /\ 2:r1=0) > > ------------------------------------------------------------------------ > > C C-Goldblat-memb-3 > { > } > > P0(int *x0, int *x1) > { > WRITE_ONCE(*x0, 1); > r1 = READ_ONCE(*x1); > } > > > P1(int *x1, int *x2) > { > WRITE_ONCE(*x1, 1); > smp_memb(); > r1 = READ_ONCE(*x2); > } > > P2(int *x2, int *x3) > { > WRITE_ONCE(*x2, 1); > r1 = READ_ONCE(*x3); > } > > P3(int *x3, int *x0) > { > WRITE_ONCE(*x3, 1); > smp_memb(); > r1 = READ_ONCE(*x0); > } > > exists (0:r1=0 /\ 1:r1=0 /\ 2:r1=0 /\ 3:r1=0) > > ------------------------------------------------------------------------ > > On Thu, Nov 29, 2018 at 11:02:17AM -0800, David Goldblatt wrote: > > One note with the suggested patch is that > > `atomic_thread_fence(memory_order_acq_rel)` should probably be > > `atomic_thread_fence (memory_order_seq_cst)` (otherwise the call would > > be a no-op on, say, x86, which it very much isn't). > > > > The non-transitivity thing makes the resulting description arguably > > incorrect, but this is informal enough that it might not be a big deal > > to add something after "For these threads, the membarrier function > > call turns an existing compiler barrier (see above) executed by these > > threads into full memory barriers" that clarifies it. E.g. you could > > make it into "turns an existing compiler barrier [...] into full > > memory barriers, with respect to the calling thread". > > > > Since this is targeting the description of the OS call (and doesn't > > have to concern itself with also being implementable by other > > asymmetric techniques or degrading to architectural barriers), I think > > that the description in "approach 2" in P1202 would also make sense > > for a formal description of the syscall. (Of course, without the > > kernel itself committing to a rigorous semantics, anything specified > > on top of it will be on slightly shaky ground). > > > > - David > > > > On Thu, Nov 29, 2018 at 7:04 AM Paul E. McKenney <paulmck@linux.ibm.com> wrote: > > > > > > On Thu, Nov 29, 2018 at 09:44:22AM -0500, Mathieu Desnoyers wrote: > > > > ----- On Nov 29, 2018, at 8:50 AM, Florian Weimer fweimer@redhat.com wrote: > > > > > > > > > * Torvald Riegel: > > > > > > > > > >> On Wed, 2018-11-28 at 16:05 +0100, Florian Weimer wrote: > > > > >>> This is essentially a repost of last year's patch, rebased to the glibc > > > > >>> 2.29 symbol version and reflecting the introduction of > > > > >>> MEMBARRIER_CMD_GLOBAL. > > > > >>> > > > > >>> I'm not including any changes to manual/ here because the set of > > > > >>> supported operations is evolving rapidly, we could not get consensus for > > > > >>> the language I proposed the last time, and I do not want to contribute > > > > >>> to the manual for the time being. > > > > >> > > > > >> Fair enough. Nonetheless, can you summarize how far you're along with > > > > >> properly defining the semantics (eg, based on the C/C++ memory model)? > > > > > > > > > > I wrote down what you could, but no one liked it. > > > > > > > > > > <https://sourceware.org/ml/libc-alpha/2017-12/msg00796.html> > > > > > > > > > > I expect that a formalization would interact in non-trivial ways with > > > > > any potential formalization of usable relaxed memory order semantics, > > > > > and I'm not sure if anyone knows how to do the latter today. > > > > > > > > Adding Paul E. McKenney in CC. > > > > > > There is some prototype C++ memory model wording from David Goldblatt (CCed) > > > here (search for "Standarese"): > > > > > > http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p1202r0.pdf > > > > > > David's key insight is that (in Linuxese) light fences cannot pair with > > > each other. > > ------------------------------------------------------------------------ > > commit 17e3b6b60e57d1cb791f68a1a6a36e942cb2baad > Author: Paul E. McKenney <paulmck@linux.ibm.com> > Date: Thu Dec 6 13:40:40 2018 -0800 > > EXP tools/memory-model: Add semantics for sys_membarrier() > > This prototype commit extends LKMM to accommodate sys_membarrier(), > which is a asymmetric barrier with a limited ability to insert full > ordering into tasks that provide only compiler ordering. This commit > currently uses the "po" relation for this purpose, but something more > sophisticated will be required when plain accesses are added, which > the compiler can reorder. > > For more detail, please see David Goldblatt's C++ working paper: > http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p1202r0.pdf > > Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com> > > diff --git a/tools/memory-model/linux-kernel.bell b/tools/memory-model/linux-kernel.bell > index 9c42cd9ddcb4..4ef41453f569 100644 > --- a/tools/memory-model/linux-kernel.bell > +++ b/tools/memory-model/linux-kernel.bell > @@ -24,6 +24,7 @@ instructions RMW[{'once,'acquire,'release}] > enum Barriers = 'wmb (*smp_wmb*) || > 'rmb (*smp_rmb*) || > 'mb (*smp_mb*) || > + 'memb (*sys_membarrier*) || > 'rcu-lock (*rcu_read_lock*) || > 'rcu-unlock (*rcu_read_unlock*) || > 'sync-rcu (*synchronize_rcu*) || > diff --git a/tools/memory-model/linux-kernel.cat b/tools/memory-model/linux-kernel.cat > index 8dcb37835b61..837c3ee20bea 100644 > --- a/tools/memory-model/linux-kernel.cat > +++ b/tools/memory-model/linux-kernel.cat > @@ -33,9 +33,10 @@ let mb = ([M] ; fencerel(Mb) ; [M]) | > ([M] ; po? ; [LKW] ; fencerel(After-spinlock) ; [M]) | > ([M] ; po ; [UL] ; (co | po) ; [LKW] ; > fencerel(After-unlock-lock) ; [M]) > +let memb = [M] ; fencerel(Memb) ; [M] > let gp = po ; [Sync-rcu | Sync-srcu] ; po? > > -let strong-fence = mb | gp > +let strong-fence = mb | gp | memb > > (* Release Acquire *) > let acq-po = [Acquire] ; po ; [M] > @@ -86,6 +87,13 @@ acyclic hb as happens-before > let pb = prop ; strong-fence ; hb* > acyclic pb as propagation > > +(********************) > +(* sys_membarrier() *) > +(********************) > + > +let memb-step = ( prop ; po ; prop )? ; memb > +acyclic memb-step as memb-before > + > (*******) > (* RCU *) > (*******) > diff --git a/tools/memory-model/linux-kernel.def b/tools/memory-model/linux-kernel.def > index 1d6a120cde14..9ff0691c5f2c 100644 > --- a/tools/memory-model/linux-kernel.def > +++ b/tools/memory-model/linux-kernel.def > @@ -17,6 +17,7 @@ rcu_dereference(X) __load{once}(X) > smp_store_mb(X,V) { __store{once}(X,V); __fence{mb}; } > > // Fences > +smp_memb() { __fence{memb}; } > smp_mb() { __fence{mb}; } > smp_rmb() { __fence{rmb}; } > smp_wmb() { __fence{wmb}; } >
On Mon, Dec 10, 2018 at 10:42:25PM -0800, David Goldblatt wrote: > Hi Paul, thank you for thinking about all this. > > I think the modelling you suggest captures most of the algorithms I > would want to write. I think it's slightly too weak, though, to > implement the model suggested in P1202R0[1], which permits the SC > outcome to be recovered in C-Goldblat-memb-2[2] by inserting a second > smp_memb() after the first, which is a rather nice property (and I > believe is supported by the underlying implementation options). I > afraid though that I'm not familiar enough with the Linux herd > definitions to suggest a tweak (or know how easy a tweak might be). Actually, there has been an offlist discussion on exactly this. What is the general rule? Is it that a given cycle have at least as many heavy barriers as it does light ones? Either way, why? Gah! I updated the tests to add the second "t", apologies!!! Thanx, Paul > - David > > [1] Which I think may be strengthened a little bit more even in R1. > [2] As a nit, my name has two "t"'s in it, although I'd throw into the > ring "memb-pairwise", "memb-nontransitive", and "memb-sequenced" if > these get non-placeholder names. > > On Thu, Dec 6, 2018 at 1:54 PM Paul E. McKenney <paulmck@linux.ibm.com> wrote: > > > > Hello, David, > > > > I took a crack at extending LKMM to accommodate what I think would > > support what you have in your paper. Please see the very end of this > > email for a patch against the "dev" branch of my -rcu tree. > > > > This gives the expected result for the following three litmus tests, > > but is probably deficient or otherwise misguided in other ways. I have > > added the LKMM maintainers on CC for their amusement. ;-) > > > > Thoughts? > > > > Thanx, Paul > > > > ------------------------------------------------------------------------ > > > > C C-Goldblat-memb-1 > > { > > } > > > > P0(int *x0, int *x1) > > { > > WRITE_ONCE(*x0, 1); > > r1 = READ_ONCE(*x1); > > } > > > > > > P1(int *x0, int *x1) > > { > > WRITE_ONCE(*x1, 1); > > smp_memb(); > > r2 = READ_ONCE(*x0); > > } > > > > exists (0:r1=0 /\ 1:r2=0) > > > > ------------------------------------------------------------------------ > > > > C C-Goldblat-memb-2 > > { > > } > > > > P0(int *x0, int *x1) > > { > > WRITE_ONCE(*x0, 1); > > r1 = READ_ONCE(*x1); > > } > > > > > > P1(int *x1, int *x2) > > { > > WRITE_ONCE(*x1, 1); > > smp_memb(); > > r1 = READ_ONCE(*x2); > > } > > > > P2(int *x2, int *x0) > > { > > WRITE_ONCE(*x2, 1); > > r1 = READ_ONCE(*x0); > > } > > > > exists (0:r1=0 /\ 1:r1=0 /\ 2:r1=0) > > > > ------------------------------------------------------------------------ > > > > C C-Goldblat-memb-3 > > { > > } > > > > P0(int *x0, int *x1) > > { > > WRITE_ONCE(*x0, 1); > > r1 = READ_ONCE(*x1); > > } > > > > > > P1(int *x1, int *x2) > > { > > WRITE_ONCE(*x1, 1); > > smp_memb(); > > r1 = READ_ONCE(*x2); > > } > > > > P2(int *x2, int *x3) > > { > > WRITE_ONCE(*x2, 1); > > r1 = READ_ONCE(*x3); > > } > > > > P3(int *x3, int *x0) > > { > > WRITE_ONCE(*x3, 1); > > smp_memb(); > > r1 = READ_ONCE(*x0); > > } > > > > exists (0:r1=0 /\ 1:r1=0 /\ 2:r1=0 /\ 3:r1=0) > > > > ------------------------------------------------------------------------ > > > > On Thu, Nov 29, 2018 at 11:02:17AM -0800, David Goldblatt wrote: > > > One note with the suggested patch is that > > > `atomic_thread_fence(memory_order_acq_rel)` should probably be > > > `atomic_thread_fence (memory_order_seq_cst)` (otherwise the call would > > > be a no-op on, say, x86, which it very much isn't). > > > > > > The non-transitivity thing makes the resulting description arguably > > > incorrect, but this is informal enough that it might not be a big deal > > > to add something after "For these threads, the membarrier function > > > call turns an existing compiler barrier (see above) executed by these > > > threads into full memory barriers" that clarifies it. E.g. you could > > > make it into "turns an existing compiler barrier [...] into full > > > memory barriers, with respect to the calling thread". > > > > > > Since this is targeting the description of the OS call (and doesn't > > > have to concern itself with also being implementable by other > > > asymmetric techniques or degrading to architectural barriers), I think > > > that the description in "approach 2" in P1202 would also make sense > > > for a formal description of the syscall. (Of course, without the > > > kernel itself committing to a rigorous semantics, anything specified > > > on top of it will be on slightly shaky ground). > > > > > > - David > > > > > > On Thu, Nov 29, 2018 at 7:04 AM Paul E. McKenney <paulmck@linux.ibm.com> wrote: > > > > > > > > On Thu, Nov 29, 2018 at 09:44:22AM -0500, Mathieu Desnoyers wrote: > > > > > ----- On Nov 29, 2018, at 8:50 AM, Florian Weimer fweimer@redhat.com wrote: > > > > > > > > > > > * Torvald Riegel: > > > > > > > > > > > >> On Wed, 2018-11-28 at 16:05 +0100, Florian Weimer wrote: > > > > > >>> This is essentially a repost of last year's patch, rebased to the glibc > > > > > >>> 2.29 symbol version and reflecting the introduction of > > > > > >>> MEMBARRIER_CMD_GLOBAL. > > > > > >>> > > > > > >>> I'm not including any changes to manual/ here because the set of > > > > > >>> supported operations is evolving rapidly, we could not get consensus for > > > > > >>> the language I proposed the last time, and I do not want to contribute > > > > > >>> to the manual for the time being. > > > > > >> > > > > > >> Fair enough. Nonetheless, can you summarize how far you're along with > > > > > >> properly defining the semantics (eg, based on the C/C++ memory model)? > > > > > > > > > > > > I wrote down what you could, but no one liked it. > > > > > > > > > > > > <https://sourceware.org/ml/libc-alpha/2017-12/msg00796.html> > > > > > > > > > > > > I expect that a formalization would interact in non-trivial ways with > > > > > > any potential formalization of usable relaxed memory order semantics, > > > > > > and I'm not sure if anyone knows how to do the latter today. > > > > > > > > > > Adding Paul E. McKenney in CC. > > > > > > > > There is some prototype C++ memory model wording from David Goldblatt (CCed) > > > > here (search for "Standarese"): > > > > > > > > http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p1202r0.pdf > > > > > > > > David's key insight is that (in Linuxese) light fences cannot pair with > > > > each other. > > > > ------------------------------------------------------------------------ > > > > commit 17e3b6b60e57d1cb791f68a1a6a36e942cb2baad > > Author: Paul E. McKenney <paulmck@linux.ibm.com> > > Date: Thu Dec 6 13:40:40 2018 -0800 > > > > EXP tools/memory-model: Add semantics for sys_membarrier() > > > > This prototype commit extends LKMM to accommodate sys_membarrier(), > > which is a asymmetric barrier with a limited ability to insert full > > ordering into tasks that provide only compiler ordering. This commit > > currently uses the "po" relation for this purpose, but something more > > sophisticated will be required when plain accesses are added, which > > the compiler can reorder. > > > > For more detail, please see David Goldblatt's C++ working paper: > > http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p1202r0.pdf > > > > Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com> > > > > diff --git a/tools/memory-model/linux-kernel.bell b/tools/memory-model/linux-kernel.bell > > index 9c42cd9ddcb4..4ef41453f569 100644 > > --- a/tools/memory-model/linux-kernel.bell > > +++ b/tools/memory-model/linux-kernel.bell > > @@ -24,6 +24,7 @@ instructions RMW[{'once,'acquire,'release}] > > enum Barriers = 'wmb (*smp_wmb*) || > > 'rmb (*smp_rmb*) || > > 'mb (*smp_mb*) || > > + 'memb (*sys_membarrier*) || > > 'rcu-lock (*rcu_read_lock*) || > > 'rcu-unlock (*rcu_read_unlock*) || > > 'sync-rcu (*synchronize_rcu*) || > > diff --git a/tools/memory-model/linux-kernel.cat b/tools/memory-model/linux-kernel.cat > > index 8dcb37835b61..837c3ee20bea 100644 > > --- a/tools/memory-model/linux-kernel.cat > > +++ b/tools/memory-model/linux-kernel.cat > > @@ -33,9 +33,10 @@ let mb = ([M] ; fencerel(Mb) ; [M]) | > > ([M] ; po? ; [LKW] ; fencerel(After-spinlock) ; [M]) | > > ([M] ; po ; [UL] ; (co | po) ; [LKW] ; > > fencerel(After-unlock-lock) ; [M]) > > +let memb = [M] ; fencerel(Memb) ; [M] > > let gp = po ; [Sync-rcu | Sync-srcu] ; po? > > > > -let strong-fence = mb | gp > > +let strong-fence = mb | gp | memb > > > > (* Release Acquire *) > > let acq-po = [Acquire] ; po ; [M] > > @@ -86,6 +87,13 @@ acyclic hb as happens-before > > let pb = prop ; strong-fence ; hb* > > acyclic pb as propagation > > > > +(********************) > > +(* sys_membarrier() *) > > +(********************) > > + > > +let memb-step = ( prop ; po ; prop )? ; memb > > +acyclic memb-step as memb-before > > + > > (*******) > > (* RCU *) > > (*******) > > diff --git a/tools/memory-model/linux-kernel.def b/tools/memory-model/linux-kernel.def > > index 1d6a120cde14..9ff0691c5f2c 100644 > > --- a/tools/memory-model/linux-kernel.def > > +++ b/tools/memory-model/linux-kernel.def > > @@ -17,6 +17,7 @@ rcu_dereference(X) __load{once}(X) > > smp_store_mb(X,V) { __store{once}(X,V); __fence{mb}; } > > > > // Fences > > +smp_memb() { __fence{memb}; } > > smp_mb() { __fence{mb}; } > > smp_rmb() { __fence{rmb}; } > > smp_wmb() { __fence{wmb}; } > > >
On Mon, 10 Dec 2018, Paul E. McKenney wrote: > On Mon, Dec 10, 2018 at 11:22:31AM -0500, Alan Stern wrote: > > On Thu, 6 Dec 2018, Paul E. McKenney wrote: > > > > > Hello, David, > > > > > > I took a crack at extending LKMM to accommodate what I think would > > > support what you have in your paper. Please see the very end of this > > > email for a patch against the "dev" branch of my -rcu tree. > > > > > > This gives the expected result for the following three litmus tests, > > > but is probably deficient or otherwise misguided in other ways. I have > > > added the LKMM maintainers on CC for their amusement. ;-) > > > > > > Thoughts? > > > > Since sys_membarrier() provides a heavyweight barrier comparable to > > synchronize_rcu(), the memory model should treat the two in the same > > way. That's what this patch does. > > > > The corresponding critical section would be any region of code bounded > > by compiler barriers. Since the LKMM doesn't currently handle plain > > accesses, the effect is the same as if a compiler barrier were present > > between each pair of instructions. Basically, each instruction acts as > > its own critical section. Therefore the patch below defines memb-rscsi > > as the trivial identity relation. When plain accesses and compiler > > barriers are added to the memory model, a different definition will be > > needed. > > > > This gives the correct results for the three C-Goldblat-memb-* litmus > > tests in Paul's email. > > Yow!!! > > My first reaction was that this cannot possibly be correct because > sys_membarrier(), which is probably what we should call it, does not > wait for anything. But your formulation has the corresponding readers > being "id", which as you say above is just a single event. > > But what makes this work for the following litmus test? > > ------------------------------------------------------------------------ > > C membrcu > > { > } > > P0(intptr_t *x0, intptr_t *x1) > { > WRITE_ONCE(*x0, 2); > smp_memb(); > intptr_t r2 = READ_ONCE(*x1); > } > > > P1(intptr_t *x1, intptr_t *x2) > { > WRITE_ONCE(*x1, 2); > smp_memb(); > intptr_t r2 = READ_ONCE(*x2); > } > > > P2(intptr_t *x2, intptr_t *x3) > { > WRITE_ONCE(*x2, 2); > smp_memb(); > intptr_t r2 = READ_ONCE(*x3); > } > > > P3(intptr_t *x3, intptr_t *x4) > { > rcu_read_lock(); > WRITE_ONCE(*x3, 2); > intptr_t r2 = READ_ONCE(*x4); > rcu_read_unlock(); > } > > > P4(intptr_t *x4, intptr_t *x5) > { > rcu_read_lock(); > WRITE_ONCE(*x4, 2); > intptr_t r2 = READ_ONCE(*x5); > rcu_read_unlock(); > } > > > P5(intptr_t *x0, intptr_t *x5) > { > rcu_read_lock(); > WRITE_ONCE(*x5, 2); > intptr_t r2 = READ_ONCE(*x0); > rcu_read_unlock(); > } > > exists > (5:r2=0 /\ 0:r2=0 /\ 1:r2=0 /\ 2:r2=0 /\ 3:r2=0 /\ 4:r2=0) > > ------------------------------------------------------------------------ > > For this, herd gives "Never". Of course, if I reverse the write and > read in any of P3(), P4(), or P5(), I get "Sometimes", which does make > sense. But what is preserving the order between P3() and P4() and > between P4() and P5()? I am not immediately seeing how the analogy > with RCU carries over to this case. That isn't how it works. Nothing preserves the orders you mentioned. It's more like: the order between P1 and P4 is preserved, as is the order between P0 and P5. You'll see below... (I readily agree that this result is not simple or obvious. It took me quite a while to formulate the following analysis.) To begin with, since there aren't any synchronize_rcu calls in the test, the rcu_read_lock and rcu_read_unlock calls do nothing. They can be eliminated. Also, I find the variable names "x0" - "x5" to be a little hard to work with. If you don't mind, I'll replace them with "a" - "f". Now, a little digression on how sys_membarrier works. It starts by executing a full memory barrier. Then it injects memory barriers into the instruction streams of all the other CPUs and waits for them all to complete. Then it executes an ending memory barrier. These barriers are ordered as described. Therefore we have mb0s < mb05 < mb0e, mb1s < mb14 < mb1e, and mb2s < mb23 < mb2e, where mb0s is the starting barrier of the sys_memb call on P0, mb05 is the barrier that it injects into P5, mb0e is the ending barrier of the call, and similarly for the other sys_memb calls. The '<' signs mean that the thing on their left finishes before the thing on their right does. Rewriting the litmus test in these terms gives: P0 P1 P2 P3 P4 P5 Wa=2 Wb=2 Wc=2 [mb23] [mb14] [mb05] mb0s mb1s mb2s Wd=2 We=2 Wf=2 mb0e mb1e mb2e Re=0 Rf=0 Ra=0 Rb=0 Rc=0 Rd=0 Here the brackets in "[mb23]", "[mb14]", and "[mb05]" mean that the positions of these barriers in their respective threads' program orderings is undetermined; they need not come at the top as shown. (Also, in case David is unfamiliar with it, the "Wa=2" notation is shorthand for "Write 2 to a" and "Rb=0" is short for "Read 0 from b".) Finally, here are a few facts which may be well known and obvious, but I'll state them anyway: A CPU cannot reorder instructions across a memory barrier. If x is po-after a barrier then x executes after the barrier is finished. If a store is po-before a barrier then the store propagates to every CPU before the barrier finishes. If a store propagates to some CPU before a load on that CPU reads from the same location, then the load will obtain the value from that store or a co-later store. This implies that if a load obtains a value co-earlier than some store then the load must have executed before the store propagated to the load's CPU. The proof consists of three main stages, each requiring three steps. Using the facts that b - f are all read as 0, I'll show that P1 executes Rc before P3 executes Re, then that P0 executes Rb before P4 executes Rf, and lastly that P5's Ra must obtain 2, not 0. This will demonstrate that the litmus test is not allowed. 1. Suppose that mb23 ends up coming po-later than Wd in P3. Then we would have: Wd propagates to P2 < mb23 < mb2e < Rd, and so Rd would obtain 2, not 0. Hence mb23 must come po-before Wd (as shown in the listing): mb23 < Wd. 2. Since mb23 therefore occurs po-before Re and instructions cannot be reordered across barriers, mb23 < Re. 3. Since Rc obtains 0, we must have: Rc < Wc propagates to P1 < mb2s < mb23 < Re. Thus Rc < Re. 4. Suppose that mb14 ends up coming po-later than We in P4. Then we would have: We propagates to P3 < mb14 < mb1e < Rc < Re, and so Re would obtain 2, not 0. Hence mb14 must come po-before We (as shown in the listing): mb14 < We. 5. Since mb14 therefore occurs po-before Rf and instructions cannot be reordered across barriers, mb14 < Rf. 6. Since Rb obtains 0, we must have: Rb < Wb propagates to P0 < mb1s < mb14 < Rf. Thus Rb < Rf. 7. Suppose that mb05 ends up coming po-later than Wf in P5. Then we would have: Wf propagates to P4 < mb05 < mb0e < Rb < Rf, and so Rf would obtain 2, not 0. Hence mb05 must come po-before Wf (as shown in the listing): mb05 < Wf. 8. Since mb05 therefore occurs po-before Ra and instructions cannot be reordered across barriers, mb05 < Ra. 9. Now we have: Wa propagates to P5 < mb0s < mb05 < Ra, and so Ra must obtain 2, not 0. QED. Alan
On Tue, Dec 11, 2018 at 11:21:15AM -0500, Alan Stern wrote: > On Mon, 10 Dec 2018, Paul E. McKenney wrote: > > > On Mon, Dec 10, 2018 at 11:22:31AM -0500, Alan Stern wrote: > > > On Thu, 6 Dec 2018, Paul E. McKenney wrote: > > > > > > > Hello, David, > > > > > > > > I took a crack at extending LKMM to accommodate what I think would > > > > support what you have in your paper. Please see the very end of this > > > > email for a patch against the "dev" branch of my -rcu tree. > > > > > > > > This gives the expected result for the following three litmus tests, > > > > but is probably deficient or otherwise misguided in other ways. I have > > > > added the LKMM maintainers on CC for their amusement. ;-) > > > > > > > > Thoughts? > > > > > > Since sys_membarrier() provides a heavyweight barrier comparable to > > > synchronize_rcu(), the memory model should treat the two in the same > > > way. That's what this patch does. > > > > > > The corresponding critical section would be any region of code bounded > > > by compiler barriers. Since the LKMM doesn't currently handle plain > > > accesses, the effect is the same as if a compiler barrier were present > > > between each pair of instructions. Basically, each instruction acts as > > > its own critical section. Therefore the patch below defines memb-rscsi > > > as the trivial identity relation. When plain accesses and compiler > > > barriers are added to the memory model, a different definition will be > > > needed. > > > > > > This gives the correct results for the three C-Goldblat-memb-* litmus > > > tests in Paul's email. > > > > Yow!!! > > > > My first reaction was that this cannot possibly be correct because > > sys_membarrier(), which is probably what we should call it, does not > > wait for anything. But your formulation has the corresponding readers > > being "id", which as you say above is just a single event. > > > > But what makes this work for the following litmus test? > > > > ------------------------------------------------------------------------ > > > > C membrcu > > > > { > > } > > > > P0(intptr_t *x0, intptr_t *x1) > > { > > WRITE_ONCE(*x0, 2); > > smp_memb(); > > intptr_t r2 = READ_ONCE(*x1); > > } > > > > > > P1(intptr_t *x1, intptr_t *x2) > > { > > WRITE_ONCE(*x1, 2); > > smp_memb(); > > intptr_t r2 = READ_ONCE(*x2); > > } > > > > > > P2(intptr_t *x2, intptr_t *x3) > > { > > WRITE_ONCE(*x2, 2); > > smp_memb(); > > intptr_t r2 = READ_ONCE(*x3); > > } > > > > > > P3(intptr_t *x3, intptr_t *x4) > > { > > rcu_read_lock(); > > WRITE_ONCE(*x3, 2); > > intptr_t r2 = READ_ONCE(*x4); > > rcu_read_unlock(); > > } > > > > > > P4(intptr_t *x4, intptr_t *x5) > > { > > rcu_read_lock(); > > WRITE_ONCE(*x4, 2); > > intptr_t r2 = READ_ONCE(*x5); > > rcu_read_unlock(); > > } > > > > > > P5(intptr_t *x0, intptr_t *x5) > > { > > rcu_read_lock(); > > WRITE_ONCE(*x5, 2); > > intptr_t r2 = READ_ONCE(*x0); > > rcu_read_unlock(); > > } > > > > exists > > (5:r2=0 /\ 0:r2=0 /\ 1:r2=0 /\ 2:r2=0 /\ 3:r2=0 /\ 4:r2=0) > > > > ------------------------------------------------------------------------ > > > > For this, herd gives "Never". Of course, if I reverse the write and > > read in any of P3(), P4(), or P5(), I get "Sometimes", which does make > > sense. But what is preserving the order between P3() and P4() and > > between P4() and P5()? I am not immediately seeing how the analogy > > with RCU carries over to this case. > > That isn't how it works. Nothing preserves the orders you mentioned. > It's more like: the order between P1 and P4 is preserved, as is the > order between P0 and P5. You'll see below... > > (I readily agree that this result is not simple or obvious. It took me > quite a while to formulate the following analysis.) For whatever it is worth, David Goldblatt agrees with you to at least some extent. I have sent him an inquiry. ;-) > To begin with, since there aren't any synchronize_rcu calls in the > test, the rcu_read_lock and rcu_read_unlock calls do nothing. They > can be eliminated. Agreed. I was just being lazy. > Also, I find the variable names "x0" - "x5" to be a little hard to > work with. If you don't mind, I'll replace them with "a" - "f". Easy enough to translate, so have at it! > Now, a little digression on how sys_membarrier works. It starts by > executing a full memory barrier. Then it injects memory barriers into > the instruction streams of all the other CPUs and waits for them all > to complete. Then it executes an ending memory barrier. > > These barriers are ordered as described. Therefore we have > > mb0s < mb05 < mb0e, > mb1s < mb14 < mb1e, and > mb2s < mb23 < mb2e, > > where mb0s is the starting barrier of the sys_memb call on P0, mb05 is > the barrier that it injects into P5, mb0e is the ending barrier of the > call, and similarly for the other sys_memb calls. The '<' signs mean > that the thing on their left finishes before the thing on their right > does. > > Rewriting the litmus test in these terms gives: > > P0 P1 P2 P3 P4 P5 > Wa=2 Wb=2 Wc=2 [mb23] [mb14] [mb05] > mb0s mb1s mb2s Wd=2 We=2 Wf=2 > mb0e mb1e mb2e Re=0 Rf=0 Ra=0 > Rb=0 Rc=0 Rd=0 > > Here the brackets in "[mb23]", "[mb14]", and "[mb05]" mean that the > positions of these barriers in their respective threads' program > orderings is undetermined; they need not come at the top as shown. > > (Also, in case David is unfamiliar with it, the "Wa=2" notation is > shorthand for "Write 2 to a" and "Rb=0" is short for "Read 0 from b".) > > Finally, here are a few facts which may be well known and obvious, but > I'll state them anyway: > > A CPU cannot reorder instructions across a memory barrier. > If x is po-after a barrier then x executes after the barrier > is finished. > > If a store is po-before a barrier then the store propagates > to every CPU before the barrier finishes. > > If a store propagates to some CPU before a load on that CPU > reads from the same location, then the load will obtain the > value from that store or a co-later store. This implies that > if a load obtains a value co-earlier than some store then the > load must have executed before the store propagated to the > load's CPU. > > The proof consists of three main stages, each requiring three steps. > Using the facts that b - f are all read as 0, I'll show that P1 > executes Rc before P3 executes Re, then that P0 executes Rb before P4 > executes Rf, and lastly that P5's Ra must obtain 2, not 0. This will > demonstrate that the litmus test is not allowed. > > 1. Suppose that mb23 ends up coming po-later than Wd in P3. > Then we would have: > > Wd propagates to P2 < mb23 < mb2e < Rd, > > and so Rd would obtain 2, not 0. Hence mb23 must come > po-before Wd (as shown in the listing): mb23 < Wd. > > 2. Since mb23 therefore occurs po-before Re and instructions > cannot be reordered across barriers, mb23 < Re. > > 3. Since Rc obtains 0, we must have: > > Rc < Wc propagates to P1 < mb2s < mb23 < Re. > > Thus Rc < Re. > > 4. Suppose that mb14 ends up coming po-later than We in P4. > Then we would have: > > We propagates to P3 < mb14 < mb1e < Rc < Re, > > and so Re would obtain 2, not 0. Hence mb14 must come > po-before We (as shown in the listing): mb14 < We. > > 5. Since mb14 therefore occurs po-before Rf and instructions > cannot be reordered across barriers, mb14 < Rf. > > 6. Since Rb obtains 0, we must have: > > Rb < Wb propagates to P0 < mb1s < mb14 < Rf. > > Thus Rb < Rf. > > 7. Suppose that mb05 ends up coming po-later than Wf in P5. > Then we would have: > > Wf propagates to P4 < mb05 < mb0e < Rb < Rf, > > and so Rf would obtain 2, not 0. Hence mb05 must come > po-before Wf (as shown in the listing): mb05 < Wf. > > 8. Since mb05 therefore occurs po-before Ra and instructions > cannot be reordered across barriers, mb05 < Ra. > > 9. Now we have: > > Wa propagates to P5 < mb0s < mb05 < Ra, > > and so Ra must obtain 2, not 0. QED. Like this, then, with maximal reordering of P3-P5's reads? P0 P1 P2 P3 P4 P5 Wa=2 mb0s [mb05] mb0e Ra=0 Rb=0 Wb=2 mb1s [mb14] mb1e Rf=0 Rc=0 Wc=2 Wf=2 mb2s [mb23] mb2e Re=0 Rd=0 We=2 Wd=2 But don't the sys_membarrier() calls affect everyone, especially given the shared-variable communication? If so, why wouldn't this more strict variant hold? P0 P1 P2 P3 P4 P5 Wa=2 mb0s [mb05] [mb05] [mb05] mb0e Rb=0 Wb=2 mb1s [mb14] [mb14] [mb14] mb1e Rc=0 Wc=2 mb2s [mb23] [mb23] [mb23] mb2e Re=0 Rf=0 Ra=0 Rd=0 We=2 Wf=2 Wd=2 In which case, wouldn't this cycle be forbidden even if it had only one sys_membarrier() call? Ah, but the IPIs are not necessarily synchronized across the CPUs, so that the following could happen: P0 P1 P2 P3 P4 P5 Wa=2 mb0s [mb05] [mb05] [mb05] mb0e Ra=0 Rb=0 Wb=2 mb1s [mb14] [mb14] Rf=0 Wf=2 [mb14] mb1e Rc=0 Wc=2 mb2s [mb23] Re=0 We=2 [mb23] [mb23] mb2e Rd=0 Wd=2 I guess in light of this post in 2001, I really don't have an excuse, do I? ;-) https://lists.gt.net/linux/kernel/223555 Or am I still missing something here? Thanx, Paul
On Tue, 11 Dec 2018, Paul E. McKenney wrote: > > Rewriting the litmus test in these terms gives: > > > > P0 P1 P2 P3 P4 P5 > > Wa=2 Wb=2 Wc=2 [mb23] [mb14] [mb05] > > mb0s mb1s mb2s Wd=2 We=2 Wf=2 > > mb0e mb1e mb2e Re=0 Rf=0 Ra=0 > > Rb=0 Rc=0 Rd=0 > > > > Here the brackets in "[mb23]", "[mb14]", and "[mb05]" mean that the > > positions of these barriers in their respective threads' program > > orderings is undetermined; they need not come at the top as shown. > > > > (Also, in case David is unfamiliar with it, the "Wa=2" notation is > > shorthand for "Write 2 to a" and "Rb=0" is short for "Read 0 from b".) > > > > Finally, here are a few facts which may be well known and obvious, but > > I'll state them anyway: > > > > A CPU cannot reorder instructions across a memory barrier. > > If x is po-after a barrier then x executes after the barrier > > is finished. > > > > If a store is po-before a barrier then the store propagates > > to every CPU before the barrier finishes. > > > > If a store propagates to some CPU before a load on that CPU > > reads from the same location, then the load will obtain the > > value from that store or a co-later store. This implies that > > if a load obtains a value co-earlier than some store then the > > load must have executed before the store propagated to the > > load's CPU. > > > > The proof consists of three main stages, each requiring three steps. > > Using the facts that b - f are all read as 0, I'll show that P1 > > executes Rc before P3 executes Re, then that P0 executes Rb before P4 > > executes Rf, and lastly that P5's Ra must obtain 2, not 0. This will > > demonstrate that the litmus test is not allowed. > > > > 1. Suppose that mb23 ends up coming po-later than Wd in P3. > > Then we would have: > > > > Wd propagates to P2 < mb23 < mb2e < Rd, > > > > and so Rd would obtain 2, not 0. Hence mb23 must come > > po-before Wd (as shown in the listing): mb23 < Wd. > > > > 2. Since mb23 therefore occurs po-before Re and instructions > > cannot be reordered across barriers, mb23 < Re. > > > > 3. Since Rc obtains 0, we must have: > > > > Rc < Wc propagates to P1 < mb2s < mb23 < Re. > > > > Thus Rc < Re. > > > > 4. Suppose that mb14 ends up coming po-later than We in P4. > > Then we would have: > > > > We propagates to P3 < mb14 < mb1e < Rc < Re, > > > > and so Re would obtain 2, not 0. Hence mb14 must come > > po-before We (as shown in the listing): mb14 < We. > > > > 5. Since mb14 therefore occurs po-before Rf and instructions > > cannot be reordered across barriers, mb14 < Rf. > > > > 6. Since Rb obtains 0, we must have: > > > > Rb < Wb propagates to P0 < mb1s < mb14 < Rf. > > > > Thus Rb < Rf. > > > > 7. Suppose that mb05 ends up coming po-later than Wf in P5. > > Then we would have: > > > > Wf propagates to P4 < mb05 < mb0e < Rb < Rf, > > > > and so Rf would obtain 2, not 0. Hence mb05 must come > > po-before Wf (as shown in the listing): mb05 < Wf. > > > > 8. Since mb05 therefore occurs po-before Ra and instructions > > cannot be reordered across barriers, mb05 < Ra. > > > > 9. Now we have: > > > > Wa propagates to P5 < mb0s < mb05 < Ra, > > > > and so Ra must obtain 2, not 0. QED. > > Like this, then, with maximal reordering of P3-P5's reads? > > P0 P1 P2 P3 P4 P5 > Wa=2 > mb0s > [mb05] > mb0e Ra=0 > Rb=0 Wb=2 > mb1s > [mb14] > mb1e Rf=0 > Rc=0 Wc=2 Wf=2 > mb2s > [mb23] > mb2e Re=0 > Rd=0 We=2 > Wd=2 Yes, that's right. This shows how P5's Ra must obtain 2 instead of 0. > But don't the sys_membarrier() calls affect everyone, especially given > the shared-variable communication? They do, but the other effects are irrelevant for this proof. > If so, why wouldn't this more strict > variant hold? > > P0 P1 P2 P3 P4 P5 > Wa=2 > mb0s > [mb05] [mb05] [mb05] You have misunderstood the naming scheme. mb05 is the barrier injected by P0's sys_membarrier call into P5. So the three barriers above should be named "mb03", "mb04", and "mb05". And you left out mb01 and mb02. > mb0e > Rb=0 Wb=2 > mb1s > [mb14] [mb14] [mb14] > mb1e > Rc=0 Wc=2 > mb2s > [mb23] [mb23] [mb23] > mb2e Re=0 Rf=0 Ra=0 > Rd=0 We=2 Wf=2 > Wd=2 Yes, this does hold. But since it doesn't affect the end result, there's no point in mentioning all those other barriers. > In which case, wouldn't this cycle be forbidden even if it had only one > sys_membarrier() call? No, it wouldn't. I don't understand why you might think it would. This is just like RCU, if you imagine a tiny critical section between each adjacent pair of instructions. You wouldn't expect RCU to enforce ordering among six CPUs with only one synchronize_rcu call. > Ah, but the IPIs are not necessarily synchronized across the CPUs, > so that the following could happen: > > P0 P1 P2 P3 P4 P5 > Wa=2 > mb0s > [mb05] [mb05] [mb05] > mb0e Ra=0 > Rb=0 Wb=2 > mb1s > [mb14] [mb14] > Rf=0 > Wf=2 > [mb14] > mb1e > Rc=0 Wc=2 > mb2s > [mb23] > Re=0 > We=2 > [mb23] [mb23] > mb2e > Rd=0 > Wd=2 Yes it could. But even in this execution you would end up with Ra=2 instead of Ra=0. > I guess in light of this post in 2001, I really don't have an excuse, > do I? ;-) > > https://lists.gt.net/linux/kernel/223555 > > Or am I still missing something here? You tell me... Alan
On Tue, Dec 11, 2018 at 03:09:33PM -0500, Alan Stern wrote: > On Tue, 11 Dec 2018, Paul E. McKenney wrote: > > > > Rewriting the litmus test in these terms gives: > > > > > > P0 P1 P2 P3 P4 P5 > > > Wa=2 Wb=2 Wc=2 [mb23] [mb14] [mb05] > > > mb0s mb1s mb2s Wd=2 We=2 Wf=2 > > > mb0e mb1e mb2e Re=0 Rf=0 Ra=0 > > > Rb=0 Rc=0 Rd=0 > > > > > > Here the brackets in "[mb23]", "[mb14]", and "[mb05]" mean that the > > > positions of these barriers in their respective threads' program > > > orderings is undetermined; they need not come at the top as shown. > > > > > > (Also, in case David is unfamiliar with it, the "Wa=2" notation is > > > shorthand for "Write 2 to a" and "Rb=0" is short for "Read 0 from b".) > > > > > > Finally, here are a few facts which may be well known and obvious, but > > > I'll state them anyway: > > > > > > A CPU cannot reorder instructions across a memory barrier. > > > If x is po-after a barrier then x executes after the barrier > > > is finished. > > > > > > If a store is po-before a barrier then the store propagates > > > to every CPU before the barrier finishes. > > > > > > If a store propagates to some CPU before a load on that CPU > > > reads from the same location, then the load will obtain the > > > value from that store or a co-later store. This implies that > > > if a load obtains a value co-earlier than some store then the > > > load must have executed before the store propagated to the > > > load's CPU. > > > > > > The proof consists of three main stages, each requiring three steps. > > > Using the facts that b - f are all read as 0, I'll show that P1 > > > executes Rc before P3 executes Re, then that P0 executes Rb before P4 > > > executes Rf, and lastly that P5's Ra must obtain 2, not 0. This will > > > demonstrate that the litmus test is not allowed. > > > > > > 1. Suppose that mb23 ends up coming po-later than Wd in P3. > > > Then we would have: > > > > > > Wd propagates to P2 < mb23 < mb2e < Rd, > > > > > > and so Rd would obtain 2, not 0. Hence mb23 must come > > > po-before Wd (as shown in the listing): mb23 < Wd. > > > > > > 2. Since mb23 therefore occurs po-before Re and instructions > > > cannot be reordered across barriers, mb23 < Re. > > > > > > 3. Since Rc obtains 0, we must have: > > > > > > Rc < Wc propagates to P1 < mb2s < mb23 < Re. > > > > > > Thus Rc < Re. > > > > > > 4. Suppose that mb14 ends up coming po-later than We in P4. > > > Then we would have: > > > > > > We propagates to P3 < mb14 < mb1e < Rc < Re, > > > > > > and so Re would obtain 2, not 0. Hence mb14 must come > > > po-before We (as shown in the listing): mb14 < We. > > > > > > 5. Since mb14 therefore occurs po-before Rf and instructions > > > cannot be reordered across barriers, mb14 < Rf. > > > > > > 6. Since Rb obtains 0, we must have: > > > > > > Rb < Wb propagates to P0 < mb1s < mb14 < Rf. > > > > > > Thus Rb < Rf. > > > > > > 7. Suppose that mb05 ends up coming po-later than Wf in P5. > > > Then we would have: > > > > > > Wf propagates to P4 < mb05 < mb0e < Rb < Rf, > > > > > > and so Rf would obtain 2, not 0. Hence mb05 must come > > > po-before Wf (as shown in the listing): mb05 < Wf. > > > > > > 8. Since mb05 therefore occurs po-before Ra and instructions > > > cannot be reordered across barriers, mb05 < Ra. > > > > > > 9. Now we have: > > > > > > Wa propagates to P5 < mb0s < mb05 < Ra, > > > > > > and so Ra must obtain 2, not 0. QED. > > > > Like this, then, with maximal reordering of P3-P5's reads? > > > > P0 P1 P2 P3 P4 P5 > > Wa=2 > > mb0s > > [mb05] > > mb0e Ra=0 > > Rb=0 Wb=2 > > mb1s > > [mb14] > > mb1e Rf=0 > > Rc=0 Wc=2 Wf=2 > > mb2s > > [mb23] > > mb2e Re=0 > > Rd=0 We=2 > > Wd=2 > > Yes, that's right. This shows how P5's Ra must obtain 2 instead of 0. > > > But don't the sys_membarrier() calls affect everyone, especially given > > the shared-variable communication? > > They do, but the other effects are irrelevant for this proof. If I understand correctly, the shared-variable communication within sys_membarrier() is included in your proof in the form of ordering between memory barriers in the mainline sys_membarrier() code and in the IPI handlers. > > If so, why wouldn't this more strict > > variant hold? > > > > P0 P1 P2 P3 P4 P5 > > Wa=2 > > mb0s > > [mb05] [mb05] [mb05] > > You have misunderstood the naming scheme. mb05 is the barrier injected > by P0's sys_membarrier call into P5. So the three barriers above > should be named "mb03", "mb04", and "mb05". And you left out mb01 and > mb02. The former is a copy-and-paste error on my part, the latter was intentional because the IPIs among P0, P1, and P2 don't seem to strengthen the ordering. > > mb0e > > Rb=0 Wb=2 > > mb1s > > [mb14] [mb14] [mb14] > > mb1e > > Rc=0 Wc=2 > > mb2s > > [mb23] [mb23] [mb23] > > mb2e Re=0 Rf=0 Ra=0 > > Rd=0 We=2 Wf=2 > > Wd=2 > > Yes, this does hold. But since it doesn't affect the end result, > there's no point in mentioning all those other barriers. > > > In which case, wouldn't this cycle be forbidden even if it had only one > > sys_membarrier() call? > > No, it wouldn't. I don't understand why you might think it would. Because I hadn't yet thought of the scenario I showed below. > This is just like RCU, if you imagine a tiny critical section between > each adjacent pair of instructions. You wouldn't expect RCU to enforce > ordering among six CPUs with only one synchronize_rcu call. Yes, I do now agree in light of the scenario shown below. > > Ah, but the IPIs are not necessarily synchronized across the CPUs, > > so that the following could happen: > > > > P0 P1 P2 P3 P4 P5 > > Wa=2 > > mb0s > > [mb05] [mb05] [mb05] > > mb0e Ra=0 > > Rb=0 Wb=2 > > mb1s > > [mb14] [mb14] > > Rf=0 > > Wf=2 > > [mb14] > > mb1e > > Rc=0 Wc=2 > > mb2s > > [mb23] > > Re=0 > > We=2 > > [mb23] [mb23] > > mb2e > > Rd=0 > > Wd=2 > > Yes it could. But even in this execution you would end up with Ra=2 > instead of Ra=0. Agreed. Or I should have said that the above execution is forbidden, either way. > > I guess in light of this post in 2001, I really don't have an excuse, > > do I? ;-) > > > > https://lists.gt.net/linux/kernel/223555 > > > > Or am I still missing something here? > > You tell me... I think I am on board. ;-) Thanx, Paul
On Tue, Dec 11, 2018 at 01:22:04PM -0800, Paul E. McKenney wrote: > On Tue, Dec 11, 2018 at 03:09:33PM -0500, Alan Stern wrote: > > On Tue, 11 Dec 2018, Paul E. McKenney wrote: > > > > > > Rewriting the litmus test in these terms gives: > > > > > > > > P0 P1 P2 P3 P4 P5 > > > > Wa=2 Wb=2 Wc=2 [mb23] [mb14] [mb05] > > > > mb0s mb1s mb2s Wd=2 We=2 Wf=2 > > > > mb0e mb1e mb2e Re=0 Rf=0 Ra=0 > > > > Rb=0 Rc=0 Rd=0 > > > > > > > > Here the brackets in "[mb23]", "[mb14]", and "[mb05]" mean that the > > > > positions of these barriers in their respective threads' program > > > > orderings is undetermined; they need not come at the top as shown. > > > > > > > > (Also, in case David is unfamiliar with it, the "Wa=2" notation is > > > > shorthand for "Write 2 to a" and "Rb=0" is short for "Read 0 from b".) > > > > > > > > Finally, here are a few facts which may be well known and obvious, but > > > > I'll state them anyway: > > > > > > > > A CPU cannot reorder instructions across a memory barrier. > > > > If x is po-after a barrier then x executes after the barrier > > > > is finished. > > > > > > > > If a store is po-before a barrier then the store propagates > > > > to every CPU before the barrier finishes. > > > > > > > > If a store propagates to some CPU before a load on that CPU > > > > reads from the same location, then the load will obtain the > > > > value from that store or a co-later store. This implies that > > > > if a load obtains a value co-earlier than some store then the > > > > load must have executed before the store propagated to the > > > > load's CPU. > > > > > > > > The proof consists of three main stages, each requiring three steps. > > > > Using the facts that b - f are all read as 0, I'll show that P1 > > > > executes Rc before P3 executes Re, then that P0 executes Rb before P4 > > > > executes Rf, and lastly that P5's Ra must obtain 2, not 0. This will > > > > demonstrate that the litmus test is not allowed. > > > > > > > > 1. Suppose that mb23 ends up coming po-later than Wd in P3. > > > > Then we would have: > > > > > > > > Wd propagates to P2 < mb23 < mb2e < Rd, > > > > > > > > and so Rd would obtain 2, not 0. Hence mb23 must come > > > > po-before Wd (as shown in the listing): mb23 < Wd. > > > > > > > > 2. Since mb23 therefore occurs po-before Re and instructions > > > > cannot be reordered across barriers, mb23 < Re. > > > > > > > > 3. Since Rc obtains 0, we must have: > > > > > > > > Rc < Wc propagates to P1 < mb2s < mb23 < Re. > > > > > > > > Thus Rc < Re. > > > > > > > > 4. Suppose that mb14 ends up coming po-later than We in P4. > > > > Then we would have: > > > > > > > > We propagates to P3 < mb14 < mb1e < Rc < Re, > > > > > > > > and so Re would obtain 2, not 0. Hence mb14 must come > > > > po-before We (as shown in the listing): mb14 < We. > > > > > > > > 5. Since mb14 therefore occurs po-before Rf and instructions > > > > cannot be reordered across barriers, mb14 < Rf. > > > > > > > > 6. Since Rb obtains 0, we must have: > > > > > > > > Rb < Wb propagates to P0 < mb1s < mb14 < Rf. > > > > > > > > Thus Rb < Rf. > > > > > > > > 7. Suppose that mb05 ends up coming po-later than Wf in P5. > > > > Then we would have: > > > > > > > > Wf propagates to P4 < mb05 < mb0e < Rb < Rf, > > > > > > > > and so Rf would obtain 2, not 0. Hence mb05 must come > > > > po-before Wf (as shown in the listing): mb05 < Wf. > > > > > > > > 8. Since mb05 therefore occurs po-before Ra and instructions > > > > cannot be reordered across barriers, mb05 < Ra. > > > > > > > > 9. Now we have: > > > > > > > > Wa propagates to P5 < mb0s < mb05 < Ra, > > > > > > > > and so Ra must obtain 2, not 0. QED. > > > > > > Like this, then, with maximal reordering of P3-P5's reads? > > > > > > P0 P1 P2 P3 P4 P5 > > > Wa=2 > > > mb0s > > > [mb05] > > > mb0e Ra=0 > > > Rb=0 Wb=2 > > > mb1s > > > [mb14] > > > mb1e Rf=0 > > > Rc=0 Wc=2 Wf=2 > > > mb2s > > > [mb23] > > > mb2e Re=0 > > > Rd=0 We=2 > > > Wd=2 > > > > Yes, that's right. This shows how P5's Ra must obtain 2 instead of 0. > > > > > But don't the sys_membarrier() calls affect everyone, especially given > > > the shared-variable communication? > > > > They do, but the other effects are irrelevant for this proof. > > If I understand correctly, the shared-variable communication within > sys_membarrier() is included in your proof in the form of ordering > between memory barriers in the mainline sys_membarrier() code and > in the IPI handlers. > > > > If so, why wouldn't this more strict > > > variant hold? > > > > > > P0 P1 P2 P3 P4 P5 > > > Wa=2 > > > mb0s > > > [mb05] [mb05] [mb05] > > > > You have misunderstood the naming scheme. mb05 is the barrier injected > > by P0's sys_membarrier call into P5. So the three barriers above > > should be named "mb03", "mb04", and "mb05". And you left out mb01 and > > mb02. > > The former is a copy-and-paste error on my part, the latter was > intentional because the IPIs among P0, P1, and P2 don't seem to > strengthen the ordering. > > > > mb0e > > > Rb=0 Wb=2 > > > mb1s > > > [mb14] [mb14] [mb14] > > > mb1e > > > Rc=0 Wc=2 > > > mb2s > > > [mb23] [mb23] [mb23] > > > mb2e Re=0 Rf=0 Ra=0 > > > Rd=0 We=2 Wf=2 > > > Wd=2 > > > > Yes, this does hold. But since it doesn't affect the end result, > > there's no point in mentioning all those other barriers. > > > > > In which case, wouldn't this cycle be forbidden even if it had only one > > > sys_membarrier() call? > > > > No, it wouldn't. I don't understand why you might think it would. > > Because I hadn't yet thought of the scenario I showed below. > > > This is just like RCU, if you imagine a tiny critical section between > > each adjacent pair of instructions. You wouldn't expect RCU to enforce > > ordering among six CPUs with only one synchronize_rcu call. > > Yes, I do now agree in light of the scenario shown below. > > > > Ah, but the IPIs are not necessarily synchronized across the CPUs, > > > so that the following could happen: > > > > > > P0 P1 P2 P3 P4 P5 > > > Wa=2 > > > mb0s > > > [mb05] [mb05] [mb05] > > > mb0e Ra=0 > > > Rb=0 Wb=2 > > > mb1s > > > [mb14] [mb14] > > > Rf=0 > > > Wf=2 > > > [mb14] > > > mb1e > > > Rc=0 Wc=2 > > > mb2s > > > [mb23] > > > Re=0 > > > We=2 > > > [mb23] [mb23] > > > mb2e > > > Rd=0 > > > Wd=2 > > > > Yes it could. But even in this execution you would end up with Ra=2 > > instead of Ra=0. > > Agreed. Or I should have said that the above execution is forbidden, > either way. > > > > I guess in light of this post in 2001, I really don't have an excuse, > > > do I? ;-) > > > > > > https://lists.gt.net/linux/kernel/223555 > > > > > > Or am I still missing something here? > > > > You tell me... > > I think I am on board. ;-) And more to the point, here is a three-process variant showing a cycle that is permitted: P0 P1 P2 Wa=2 Wb=2 Wc=2 mb0s [mb01] [mb02] mb0e Rb=0 Rc=0 Ra=0 As can be seen by reordering it as follows: P0 P1 P2 Ra=0 Wa=2 mb0s [mb01] Rc=0 Wc=2 [mb02] mb0e Rb=0 Wb=2 Make sense? Thanx, Paul
On Wed, 12 Dec 2018, Paul E. McKenney wrote: > > > > Or am I still missing something here? > > > > > > You tell me... > > > > I think I am on board. ;-) > > And more to the point, here is a three-process variant showing a cycle > that is permitted: > > > P0 P1 P2 > Wa=2 Wb=2 Wc=2 > mb0s > [mb01] [mb02] > mb0e > Rb=0 Rc=0 Ra=0 > > As can be seen by reordering it as follows: > > P0 P1 P2 > Ra=0 > Wa=2 > mb0s > [mb01] > Rc=0 > Wc=2 > [mb02] > mb0e > Rb=0 > Wb=2 > > Make sense? You got it! Alan
On Wed, Dec 12, 2018 at 01:04:44PM -0500, Alan Stern wrote: > On Wed, 12 Dec 2018, Paul E. McKenney wrote: > > > > > > Or am I still missing something here? > > > > > > > > You tell me... > > > > > > I think I am on board. ;-) > > > > And more to the point, here is a three-process variant showing a cycle > > that is permitted: > > > > > > P0 P1 P2 > > Wa=2 Wb=2 Wc=2 > > mb0s > > [mb01] [mb02] > > mb0e > > Rb=0 Rc=0 Ra=0 > > > > As can be seen by reordering it as follows: > > > > P0 P1 P2 > > Ra=0 > > Wa=2 > > mb0s > > [mb01] > > Rc=0 > > Wc=2 > > [mb02] > > mb0e > > Rb=0 > > Wb=2 > > > > Make sense? > > You got it! OK. How about this one? P0 P1 P2 P3 Wa=2 rcu_read_lock() Wc=2 Wd=2 memb Wb=2 Rd=0 synchronize_rcu(); Rb=0 Rc=0 Ra=0 rcu_read_unlock() The model should say that it is allowed. Taking a look... P0 P1 P2 P3 Rd=0 Wd=2 synchronize_rcu(); Ra=0 Wa=2 membs rcu_read_lock() [m01] Rc=0 Wc=2 [m02] [m03] membe Rb=0 Wb=2 rcu_read_unlock() Looks allowed to me. If the synchronization of P1 and P2 were interchanged, it should be forbidden: P0 P1 P2 P3 Wa=2 Wb=2 rcu_read_lock() Wd=2 memb Rc=0 Wc=2 synchronize_rcu(); Rb=0 Rd=0 Ra=0 rcu_read_unlock() Taking a look... P0 P1 P2 P3 rcu_read_lock() Rd=0 Wa=2 Wb=2 Wd=2 membs synchronize_rcu(); [m01] Rc=0 Wc=2 rcu_read_unlock() [m02] Ra=0 [Forbidden?] membe Rb=0 I believe that this ordering forbids the cycle: Wa=1 > membs -> [m01] -> Rc=0 -> Wc=2 -> rcu_read_unlock() -> return from synchronize_rcu() -> Ra Does this make sense, or am I missing something? Thanx, Paul
On Wed, 12 Dec 2018, Paul E. McKenney wrote: > OK. How about this one? > > P0 P1 P2 P3 > Wa=2 rcu_read_lock() Wc=2 Wd=2 > memb Wb=2 Rd=0 synchronize_rcu(); > Rb=0 Rc=0 Ra=0 > rcu_read_unlock() > > The model should say that it is allowed. Taking a look... > > P0 P1 P2 P3 > Rd=0 > Wd=2 > synchronize_rcu(); > Ra=0 > Wa=2 > membs > rcu_read_lock() > [m01] > Rc=0 > Wc=2 > [m02] [m03] > membe > Rb=0 > Wb=2 > rcu_read_unlock() > > Looks allowed to me. If the synchronization of P1 and P2 were > interchanged, it should be forbidden: > > P0 P1 P2 P3 > Wa=2 Wb=2 rcu_read_lock() Wd=2 > memb Rc=0 Wc=2 synchronize_rcu(); > Rb=0 Rd=0 Ra=0 > rcu_read_unlock() > > Taking a look... > > P0 P1 P2 P3 > rcu_read_lock() > Rd=0 > Wa=2 Wb=2 Wd=2 > membs synchronize_rcu(); > [m01] > Rc=0 > Wc=2 > rcu_read_unlock() > [m02] Ra=0 [Forbidden?] > membe > Rb=0 Have you tried writing these as real litmus tests and running them through herd? > I believe that this ordering forbids the cycle: > > Wa=1 > membs -> [m01] -> Rc=0 -> Wc=2 -> rcu_read_unlock() -> > return from synchronize_rcu() -> Ra > > Does this make sense, or am I missing something? It's hard to tell. What you have written here isn't justified by the litmus test source code, since the position of m01 in P1's program order is undetermined. How do you justify m01 -> Rc, for example? Write it this way instead, using the relations defined in the sys_membarrier patch for linux-kernel.cat: memb ->memb-gp memb ->rcu-link Rc ->memb-rscsi Rc ->rcu-link rcu_read_unlock ->rcu-rscsi rcu_read_lock ->rcu-link synchronize_rcu ->rcu-gp synchronize_rcu ->rcu-link memb Recall that: memb-gp is the identity relation on sys_membarrier events, rcu-link includes (po? ; fre ; po), memb-rscsi is the identity relation on all events, rcu-rscsi links unlocks to their corresponding locks, and rcu-gp is the identity relation on synchronize_rcu events. These facts justify the cycle above. Leaving off the final rcu-link step, the sequence matches the definition of rcu-fence (the relations are memb-gp, memb-rscsi, rcu-rscsi, rcu-gp with rcu-links in between). Therefore the cycle is forbidden. Alan
On Wed, Dec 12, 2018 at 04:32:50PM -0500, Alan Stern wrote: > On Wed, 12 Dec 2018, Paul E. McKenney wrote: > > > OK. How about this one? > > > > P0 P1 P2 P3 > > Wa=2 rcu_read_lock() Wc=2 Wd=2 > > memb Wb=2 Rd=0 synchronize_rcu(); > > Rb=0 Rc=0 Ra=0 > > rcu_read_unlock() > > > > The model should say that it is allowed. Taking a look... > > > > P0 P1 P2 P3 > > Rd=0 > > Wd=2 > > synchronize_rcu(); > > Ra=0 > > Wa=2 > > membs > > rcu_read_lock() > > [m01] > > Rc=0 > > Wc=2 > > [m02] [m03] > > membe > > Rb=0 > > Wb=2 > > rcu_read_unlock() > > > > Looks allowed to me. If the synchronization of P1 and P2 were > > interchanged, it should be forbidden: > > > > P0 P1 P2 P3 > > Wa=2 Wb=2 rcu_read_lock() Wd=2 > > memb Rc=0 Wc=2 synchronize_rcu(); > > Rb=0 Rd=0 Ra=0 > > rcu_read_unlock() > > > > Taking a look... > > > > P0 P1 P2 P3 > > rcu_read_lock() > > Rd=0 > > Wa=2 Wb=2 Wd=2 > > membs synchronize_rcu(); > > [m01] > > Rc=0 > > Wc=2 > > rcu_read_unlock() > > [m02] Ra=0 [Forbidden?] > > membe > > Rb=0 For one thing, Wb=2 needs to be down here, apologies! Which then ... > Have you tried writing these as real litmus tests and running them > through herd? That comes later, but yes, I will do that. > > I believe that this ordering forbids the cycle: > > > > Wa=1 > membs -> [m01] -> Rc=0 -> Wc=2 -> rcu_read_unlock() -> > > return from synchronize_rcu() -> Ra > > > > Does this make sense, or am I missing something? > > It's hard to tell. What you have written here isn't justified by the > litmus test source code, since the position of m01 in P1's program > order is undetermined. How do you justify m01 -> Rc, for example? ... justifies Rc=0 following [m01]. > Write it this way instead, using the relations defined in the > sys_membarrier patch for linux-kernel.cat: > > memb ->memb-gp memb ->rcu-link Rc ->memb-rscsi Rc ->rcu-link > > rcu_read_unlock ->rcu-rscsi rcu_read_lock ->rcu-link > > synchronize_rcu ->rcu-gp synchronize_rcu ->rcu-link memb > > Recall that: > > memb-gp is the identity relation on sys_membarrier events, > > rcu-link includes (po? ; fre ; po), > > memb-rscsi is the identity relation on all events, > > rcu-rscsi links unlocks to their corresponding locks, and > > rcu-gp is the identity relation on synchronize_rcu events. > > These facts justify the cycle above. > > Leaving off the final rcu-link step, the sequence matches the > definition of rcu-fence (the relations are memb-gp, memb-rscsi, > rcu-rscsi, rcu-gp with rcu-links in between). Therefore the cycle is > forbidden. Understood, but that would be using the model to check the model. ;-) Thanx, Paul
On Wed, 12 Dec 2018, Paul E. McKenney wrote: > > > I believe that this ordering forbids the cycle: > > > > > > Wa=1 > membs -> [m01] -> Rc=0 -> Wc=2 -> rcu_read_unlock() -> > > > return from synchronize_rcu() -> Ra > > > > > > Does this make sense, or am I missing something? > > > > It's hard to tell. What you have written here isn't justified by the > > litmus test source code, since the position of m01 in P1's program > > order is undetermined. How do you justify m01 -> Rc, for example? > > ... justifies Rc=0 following [m01]. > > > Write it this way instead, using the relations defined in the > > sys_membarrier patch for linux-kernel.cat: > > > > memb ->memb-gp memb ->rcu-link Rc ->memb-rscsi Rc ->rcu-link > > > > rcu_read_unlock ->rcu-rscsi rcu_read_lock ->rcu-link > > > > synchronize_rcu ->rcu-gp synchronize_rcu ->rcu-link memb > > > > Recall that: > > > > memb-gp is the identity relation on sys_membarrier events, > > > > rcu-link includes (po? ; fre ; po), > > > > memb-rscsi is the identity relation on all events, > > > > rcu-rscsi links unlocks to their corresponding locks, and > > > > rcu-gp is the identity relation on synchronize_rcu events. > > > > These facts justify the cycle above. > > > > Leaving off the final rcu-link step, the sequence matches the > > definition of rcu-fence (the relations are memb-gp, memb-rscsi, > > rcu-rscsi, rcu-gp with rcu-links in between). Therefore the cycle is > > forbidden. > > Understood, but that would be using the model to check the model. ;-) Well, what are you trying to accomplish? Do you want to find an argument similar to the one I posted for the 6-CPU test to show that this test should be forbidden? Alan
On Wed, Dec 12, 2018 at 01:52:45PM -0800, Paul E. McKenney wrote: > On Wed, Dec 12, 2018 at 04:32:50PM -0500, Alan Stern wrote: > > On Wed, 12 Dec 2018, Paul E. McKenney wrote: > > > > > OK. How about this one? > > > > > > P0 P1 P2 P3 > > > Wa=2 rcu_read_lock() Wc=2 Wd=2 > > > memb Wb=2 Rd=0 synchronize_rcu(); > > > Rb=0 Rc=0 Ra=0 > > > rcu_read_unlock() > > > > > > The model should say that it is allowed. Taking a look... > > > > > > P0 P1 P2 P3 > > > Rd=0 > > > Wd=2 > > > synchronize_rcu(); > > > Ra=0 > > > Wa=2 > > > membs > > > rcu_read_lock() > > > [m01] > > > Rc=0 > > > Wc=2 > > > [m02] [m03] > > > membe > > > Rb=0 > > > Wb=2 > > > rcu_read_unlock() > > > > > > Looks allowed to me. If the synchronization of P1 and P2 were > > > interchanged, it should be forbidden: > > > > > > P0 P1 P2 P3 > > > Wa=2 Wb=2 rcu_read_lock() Wd=2 > > > memb Rc=0 Wc=2 synchronize_rcu(); > > > Rb=0 Rd=0 Ra=0 > > > rcu_read_unlock() > > > > > > Taking a look... > > > > > > P0 P1 P2 P3 > > > rcu_read_lock() > > > Rd=0 > > > Wa=2 Wb=2 Wd=2 > > > membs synchronize_rcu(); > > > [m01] > > > Rc=0 > > > Wc=2 > > > rcu_read_unlock() > > > [m02] Ra=0 [Forbidden?] > > > membe > > > Rb=0 > > For one thing, Wb=2 needs to be down here, apologies! Which then ... > > > Have you tried writing these as real litmus tests and running them > > through herd? > > That comes later, but yes, I will do that. > > > > I believe that this ordering forbids the cycle: > > > > > > Wa=1 > membs -> [m01] -> Rc=0 -> Wc=2 -> rcu_read_unlock() -> > > > return from synchronize_rcu() -> Ra > > > > > > Does this make sense, or am I missing something? > > > > It's hard to tell. What you have written here isn't justified by the > > litmus test source code, since the position of m01 in P1's program > > order is undetermined. How do you justify m01 -> Rc, for example? > > ... justifies Rc=0 following [m01]. > > > Write it this way instead, using the relations defined in the > > sys_membarrier patch for linux-kernel.cat: > > > > memb ->memb-gp memb ->rcu-link Rc ->memb-rscsi Rc ->rcu-link > > > > rcu_read_unlock ->rcu-rscsi rcu_read_lock ->rcu-link > > > > synchronize_rcu ->rcu-gp synchronize_rcu ->rcu-link memb > > > > Recall that: > > > > memb-gp is the identity relation on sys_membarrier events, > > > > rcu-link includes (po? ; fre ; po), > > > > memb-rscsi is the identity relation on all events, > > > > rcu-rscsi links unlocks to their corresponding locks, and > > > > rcu-gp is the identity relation on synchronize_rcu events. > > > > These facts justify the cycle above. > > > > Leaving off the final rcu-link step, the sequence matches the > > definition of rcu-fence (the relations are memb-gp, memb-rscsi, > > rcu-rscsi, rcu-gp with rcu-links in between). Therefore the cycle is > > forbidden. > > Understood, but that would be using the model to check the model. ;-) And here are the litmus tests in the same order as above. They do give the results we both called out above, which is encouraging. Thanx, Paul ------------------------------------------------------------------------ C C-memb-RCU-1 (* * Result: Sometimes *) { } P0(int *x0, int *x1) { WRITE_ONCE(*x0, 1); smp_memb(); r1 = READ_ONCE(*x1); } P1(int *x1, int *x2) { rcu_read_lock(); WRITE_ONCE(*x1, 1); r1 = READ_ONCE(*x2); rcu_read_unlock(); } P2(int *x2, int *x3) { WRITE_ONCE(*x2, 1); r1 = READ_ONCE(*x3); } P3(int *x3, int *x0) { WRITE_ONCE(*x3, 1); synchronize_rcu(); r1 = READ_ONCE(*x0); } exists (0:r1=0 /\ 1:r1=0 /\ 2:r1=0 /\ 3:r1=0) ------------------------------------------------------------------------ C C-memb-RCU-1 (* * Result: Never *) { } P0(int *x0, int *x1) { WRITE_ONCE(*x0, 1); smp_memb(); r1 = READ_ONCE(*x1); } P1(int *x1, int *x2) { WRITE_ONCE(*x1, 1); r1 = READ_ONCE(*x2); } P2(int *x2, int *x3) { rcu_read_lock(); WRITE_ONCE(*x2, 1); r1 = READ_ONCE(*x3); rcu_read_unlock(); } P3(int *x3, int *x0) { WRITE_ONCE(*x3, 1); synchronize_rcu(); r1 = READ_ONCE(*x0); } exists (0:r1=0 /\ 1:r1=0 /\ 2:r1=0 /\ 3:r1=0)
On Wed, Dec 12, 2018 at 05:12:18PM -0500, Alan Stern wrote: > On Wed, 12 Dec 2018, Paul E. McKenney wrote: > > > > > I believe that this ordering forbids the cycle: > > > > > > > > Wa=1 > membs -> [m01] -> Rc=0 -> Wc=2 -> rcu_read_unlock() -> > > > > return from synchronize_rcu() -> Ra > > > > > > > > Does this make sense, or am I missing something? > > > > > > It's hard to tell. What you have written here isn't justified by the > > > litmus test source code, since the position of m01 in P1's program > > > order is undetermined. How do you justify m01 -> Rc, for example? > > > > ... justifies Rc=0 following [m01]. > > > > > Write it this way instead, using the relations defined in the > > > sys_membarrier patch for linux-kernel.cat: > > > > > > memb ->memb-gp memb ->rcu-link Rc ->memb-rscsi Rc ->rcu-link > > > > > > rcu_read_unlock ->rcu-rscsi rcu_read_lock ->rcu-link > > > > > > synchronize_rcu ->rcu-gp synchronize_rcu ->rcu-link memb > > > > > > Recall that: > > > > > > memb-gp is the identity relation on sys_membarrier events, > > > > > > rcu-link includes (po? ; fre ; po), > > > > > > memb-rscsi is the identity relation on all events, > > > > > > rcu-rscsi links unlocks to their corresponding locks, and > > > > > > rcu-gp is the identity relation on synchronize_rcu events. > > > > > > These facts justify the cycle above. > > > > > > Leaving off the final rcu-link step, the sequence matches the > > > definition of rcu-fence (the relations are memb-gp, memb-rscsi, > > > rcu-rscsi, rcu-gp with rcu-links in between). Therefore the cycle is > > > forbidden. > > > > Understood, but that would be using the model to check the model. ;-) > > Well, what are you trying to accomplish? Do you want to find an > argument similar to the one I posted for the 6-CPU test to show that > this test should be forbidden? I am trying to check odd corner cases. Your sys_membarrier() model is quite nice and certainly fits nicely with the rest of the model, but where I come from, that is actually reason for suspicion. ;-) All kidding aside, your argument for the 6-CPU test was extremely valuable, as it showed me a way to think of that test from an implementation viewpoint. Then the question is whether or not that viewpoint actually matches the model, which seems to be the case thus far. A good next step would be to automatically generate random tests along with an automatically generated prediction, like I did for RCU a few years back. I should be able to generalize my time-based cheat for RCU to also cover SRCU, though sys_membarrier() will require a bit more thought. (The time-based cheat was to have fixed duration RCU grace periods and RCU read-side critical sections, with the grace period duration being slightly longer than that of the critical sections. The number of processes is of course limited by the chosen durations, but that limit can easily be made insanely large.) I guess that I still haven't gotten over being a bit surprised that the RCU counting rule also applies to sys_membarrier(). ;-) Thanx, Paul
On Wed, 12 Dec 2018, Paul E. McKenney wrote: > > Well, what are you trying to accomplish? Do you want to find an > > argument similar to the one I posted for the 6-CPU test to show that > > this test should be forbidden? > > I am trying to check odd corner cases. Your sys_membarrier() model > is quite nice and certainly fits nicely with the rest of the model, > but where I come from, that is actually reason for suspicion. ;-) > > All kidding aside, your argument for the 6-CPU test was extremely > valuable, as it showed me a way to think of that test from an > implementation viewpoint. Then the question is whether or not that > viewpoint actually matches the model, which seems to be the case thus far. It should, since I formulated the reasoning behind that viewpoint directly from the model. The basic idea is this: By induction, show that whenever we have A ->rcu-fence B then anything po-before A executes before anything po-after B, and furthermore, any write which propagates to A's CPU before A executes will propagate to every CPU before B finishes (i.e., before anything po-after B executes). Using this, show that whenever X ->rb Y holds then X must execute before Y. That's what the 6-CPU argument did. In that litmus test we have mb2 ->rcu-fence mb23, Rc ->rb Re, mb1 ->rcu-fence mb14, Rb ->rb Rf, mb0 ->rcu-fence mb05, and lastly Ra ->rb Ra. The last one is what shows that the test is forbidden. > A good next step would be to automatically generate random tests along > with an automatically generated prediction, like I did for RCU a few > years back. I should be able to generalize my time-based cheat for RCU to > also cover SRCU, though sys_membarrier() will require a bit more thought. > (The time-based cheat was to have fixed duration RCU grace periods and > RCU read-side critical sections, with the grace period duration being > slightly longer than that of the critical sections. The number of > processes is of course limited by the chosen durations, but that limit > can easily be made insanely large.) Imagine that each sys_membarrier call takes a fixed duration and each other instruction takes slightly less (the idea being that each instruction is a critical section). Instructions can be reordered (although not across a sys_membarrier call), but no matter how the reordering is done, the result is disallowed. > I guess that I still haven't gotten over being a bit surprised that the > RCU counting rule also applies to sys_membarrier(). ;-) Why not? They are both synchronization mechanisms with heavy-weight write sides and light-weight read sides, and most importantly, they provide the same Guarantee. Alan
On Thu, Dec 13, 2018 at 10:49:49AM -0500, Alan Stern wrote: > On Wed, 12 Dec 2018, Paul E. McKenney wrote: > > > > Well, what are you trying to accomplish? Do you want to find an > > > argument similar to the one I posted for the 6-CPU test to show that > > > this test should be forbidden? > > > > I am trying to check odd corner cases. Your sys_membarrier() model > > is quite nice and certainly fits nicely with the rest of the model, > > but where I come from, that is actually reason for suspicion. ;-) > > > > All kidding aside, your argument for the 6-CPU test was extremely > > valuable, as it showed me a way to think of that test from an > > implementation viewpoint. Then the question is whether or not that > > viewpoint actually matches the model, which seems to be the case thus far. > > It should, since I formulated the reasoning behind that viewpoint > directly from the model. The basic idea is this: > > By induction, show that whenever we have A ->rcu-fence B then > anything po-before A executes before anything po-after B, and > furthermore, any write which propagates to A's CPU before A > executes will propagate to every CPU before B finishes (i.e., > before anything po-after B executes). > > Using this, show that whenever X ->rb Y holds then X must > execute before Y. > > That's what the 6-CPU argument did. In that litmus test we have > mb2 ->rcu-fence mb23, Rc ->rb Re, mb1 ->rcu-fence mb14, Rb ->rb Rf, > mb0 ->rcu-fence mb05, and lastly Ra ->rb Ra. The last one is what > shows that the test is forbidden. I really am not trying to be difficult. Well, no more difficult than I normally am, anyway. Which admittedly isn't saying much. ;-) > > A good next step would be to automatically generate random tests along > > with an automatically generated prediction, like I did for RCU a few > > years back. I should be able to generalize my time-based cheat for RCU to > > also cover SRCU, though sys_membarrier() will require a bit more thought. > > (The time-based cheat was to have fixed duration RCU grace periods and > > RCU read-side critical sections, with the grace period duration being > > slightly longer than that of the critical sections. The number of > > processes is of course limited by the chosen durations, but that limit > > can easily be made insanely large.) > > Imagine that each sys_membarrier call takes a fixed duration and each > other instruction takes slightly less (the idea being that each > instruction is a critical section). Instructions can be reordered > (although not across a sys_membarrier call), but no matter how the > reordering is done, the result is disallowed. It gets a bit trickier with interleavings of different combinations of RCU, SRCU, and sys_membarrier(). Yes, your cat code very elegantly sorts this out, but my goal is to be able to explain a given example to someone. > > I guess that I still haven't gotten over being a bit surprised that the > > RCU counting rule also applies to sys_membarrier(). ;-) > > Why not? They are both synchronization mechanisms with heavy-weight > write sides and light-weight read sides, and most importantly, they > provide the same Guarantee. True, but I do feel the need to poke at it. The zero-size sys_membarrier() read-side critical sections do make things act a bit differently, for example, interchanging the accesses in an RCU read-side critical section has no effect, while doing so in a sys_membarrier() reader can cause the result to be allowed. One key point is that everything before the end of a read-side critical section of any type is ordered before any later grace period of that same type, and vice versa. This is why reordering accesses matters for sys_membarrier() readers but not for RCU and SRCU readers -- in the case of RCU and SRCU readers, the accesses are inside the read-side critical section, while for sys_membarrier() readers, the read-side critical sections don't have an inside. So yes, ordering also matters in the case of SRCU and RCU readers for accesses outside of the read-side critical sections. The reason sys_membarrier() seems surprising to me isn't because it is any different in theoretical structure, but rather because the practice is to put RCU and SRCU read-side accesses inside a read-side critical sections, which is impossible for sys_membarrier(). The other thing that took some time to get used to is the possibility of long delays during sys_membarrier() execution, allowing significant execution and reordering between different CPUs' IPIs. This was key to my understanding of the six-process example, and probably needs to be clearly called out, including in an example or two. The interleaving restrictions are straightforward for me, but the fixed-time approach does have some interesting cross-talk potential between sys_membarrier() and RCU read-side critical sections whose accesses have been reversed. I don't believe that it is possible to leverage this "order the other guy's read-side critical sections" effect in the general case, but I could be missing something. If you are claiming that I am worrying unnecessarily, you are probably right. But if I didn't worry unnecessarily, RCU wouldn't work at all! ;-) Thanx, Paul
On Thu, 13 Dec 2018, Paul E. McKenney wrote: > > > A good next step would be to automatically generate random tests along > > > with an automatically generated prediction, like I did for RCU a few > > > years back. I should be able to generalize my time-based cheat for RCU to > > > also cover SRCU, though sys_membarrier() will require a bit more thought. > > > (The time-based cheat was to have fixed duration RCU grace periods and > > > RCU read-side critical sections, with the grace period duration being > > > slightly longer than that of the critical sections. The number of > > > processes is of course limited by the chosen durations, but that limit > > > can easily be made insanely large.) > > > > Imagine that each sys_membarrier call takes a fixed duration and each > > other instruction takes slightly less (the idea being that each > > instruction is a critical section). Instructions can be reordered > > (although not across a sys_membarrier call), but no matter how the > > reordering is done, the result is disallowed. This turns out not to be right. Instead, imagine that each sys_membarrier call takes a fixed duration, T. Other instructions can take arbitrary amounts of time and can be reordered abitrarily, with two restrictions: Instructions cannot be reordered past a sys_membarrier call; If instructions A and B are reordered then the time duration from B to A must be less than T. If you prefer, you can replace the second restriction with something a little more liberal: If A and B are reordered and A ends up executing after a sys_membarrier call (on any CPU) then B cannot execute before that sys_membarrier call. Of course, this form is a consequence of the more restrictive form. > It gets a bit trickier with interleavings of different combinations > of RCU, SRCU, and sys_membarrier(). Yes, your cat code very elegantly > sorts this out, but my goal is to be able to explain a given example > to someone. I don't think you're going to be able to fit different combinations of RCU, SRCU, and sys_membarrier into this picture. How would you allow tests with incorrect interleaving, such as GP - memb - RSCS - nothing, while forbidding similar tests with correct interleaving? Alan
On Thu, Dec 13, 2018 at 09:26:47PM -0500, Alan Stern wrote: > On Thu, 13 Dec 2018, Paul E. McKenney wrote: > > > > > A good next step would be to automatically generate random tests along > > > > with an automatically generated prediction, like I did for RCU a few > > > > years back. I should be able to generalize my time-based cheat for RCU to > > > > also cover SRCU, though sys_membarrier() will require a bit more thought. > > > > (The time-based cheat was to have fixed duration RCU grace periods and > > > > RCU read-side critical sections, with the grace period duration being > > > > slightly longer than that of the critical sections. The number of > > > > processes is of course limited by the chosen durations, but that limit > > > > can easily be made insanely large.) > > > > > > Imagine that each sys_membarrier call takes a fixed duration and each > > > other instruction takes slightly less (the idea being that each > > > instruction is a critical section). Instructions can be reordered > > > (although not across a sys_membarrier call), but no matter how the > > > reordering is done, the result is disallowed. > > This turns out not to be right. Instead, imagine that each > sys_membarrier call takes a fixed duration, T. Other instructions can > take arbitrary amounts of time and can be reordered abitrarily, with > two restrictions: > > Instructions cannot be reordered past a sys_membarrier call; > > If instructions A and B are reordered then the time duration > from B to A must be less than T. > > If you prefer, you can replace the second restriction with something a > little more liberal: > > If A and B are reordered and A ends up executing after a > sys_membarrier call (on any CPU) then B cannot execute before > that sys_membarrier call. > > Of course, this form is a consequence of the more restrictive form. Makes sense. And the zero-size critical sections are why sys_membarrier() cannot be directly used for classic deferred reclamation. > > It gets a bit trickier with interleavings of different combinations > > of RCU, SRCU, and sys_membarrier(). Yes, your cat code very elegantly > > sorts this out, but my goal is to be able to explain a given example > > to someone. > > I don't think you're going to be able to fit different combinations of > RCU, SRCU, and sys_membarrier into this picture. How would you allow > tests with incorrect interleaving, such as GP - memb - RSCS - nothing, > while forbidding similar tests with correct interleaving? Well, no, I cannot do a simple linear scan tracking time, which is what the current scripts do. I must instead find longest sequence with all operations of the same type (RCU, SRCU, or memb) and work out their worst-case timing. If the overall effect of a given sequence is to go backwards in time, the result is allowed. Otherwise eliminate that sequence from the cycle and repeat. If everything is eliminated, the cycle is forbidden. Which can be thought of as an iterative process similar to something called "rcu-fence", can't it? ;-) Thanx, Paul
On Thu, 13 Dec 2018, Paul E. McKenney wrote: > > > I guess that I still haven't gotten over being a bit surprised that the > > > RCU counting rule also applies to sys_membarrier(). ;-) > > > > Why not? They are both synchronization mechanisms with heavy-weight > > write sides and light-weight read sides, and most importantly, they > > provide the same Guarantee. > > True, but I do feel the need to poke at it. > > The zero-size sys_membarrier() read-side critical sections do make > things act a bit differently, for example, interchanging the accesses > in an RCU read-side critical section has no effect, while doing so in > a sys_membarrier() reader can cause the result to be allowed. One key > point is that everything before the end of a read-side critical section > of any type is ordered before any later grace period of that same type, > and vice versa. > > This is why reordering accesses matters for sys_membarrier() readers but > not for RCU and SRCU readers -- in the case of RCU and SRCU readers, > the accesses are inside the read-side critical section, while for > sys_membarrier() readers, the read-side critical sections don't have > an inside. So yes, ordering also matters in the case of SRCU and > RCU readers for accesses outside of the read-side critical sections. > The reason sys_membarrier() seems surprising to me isn't because it is > any different in theoretical structure, but rather because the practice > is to put RCU and SRCU read-side accesses inside a read-side critical > sections, which is impossible for sys_membarrier(). RCU and sys_membarrier are more similar than you might think at first. For one thing, if there were primitives for blocking and unblocking reception of IPIs, those primitives would delimit critical sections for sys_membarrier. (Maybe such things do exist; I wouldn't know.) For another, the way we model RCU isn't fully accurate for the Linux kernel, as you know. Since individual instructions cannot be preempted, each instruction is a tiny read-side critical section. Thus, litmus tests like this one: P0 P1 Wa=1 Wb=1 synchronize_rcu() Ra=0 Rb=0 actually are forbidden in the kernel (provided P1 isn't part of the idle loop!), even though the LKMM allows them. However, it wouldn't be forbidden if the accesses in P1 were swapped -- just like with sys_membarrier. Put these two observations together and you see that sys_membarrier is almost exactly the same as RCU without explicit read-side critical sections. Perhaps this isn't surprising, given that the initial implementation of sys_membarrier() was pretty much the same as synchronize_rcu(). > The other thing that took some time to get used to is the possibility > of long delays during sys_membarrier() execution, allowing significant > execution and reordering between different CPUs' IPIs. This was key > to my understanding of the six-process example, and probably needs to > be clearly called out, including in an example or two. In all the examples I'm aware of, no more than one of the IPIs generated by each sys_membarrier call really matters. (Of course, there's no way to know in advance which one it will be, so you have to send an IPI to every CPU.) The execution delays and reordering between different CPUs' IPIs don't appear to be significant. > The interleaving restrictions are straightforward for me, but the > fixed-time approach does have some interesting cross-talk potential > between sys_membarrier() and RCU read-side critical sections whose > accesses have been reversed. I don't believe that it is possible to > leverage this "order the other guy's read-side critical sections" effect > in the general case, but I could be missing something. I regard the fixed-time approach as nothing more than a heuristic aid. It's not an accurate explaination of what's really going on. > If you are claiming that I am worrying unnecessarily, you are probably > right. But if I didn't worry unnecessarily, RCU wouldn't work at all! ;-) Alan
On Fri, Dec 14, 2018 at 10:31:51AM -0500, Alan Stern wrote: > On Thu, 13 Dec 2018, Paul E. McKenney wrote: > > > > > I guess that I still haven't gotten over being a bit surprised that the > > > > RCU counting rule also applies to sys_membarrier(). ;-) > > > > > > Why not? They are both synchronization mechanisms with heavy-weight > > > write sides and light-weight read sides, and most importantly, they > > > provide the same Guarantee. > > > > True, but I do feel the need to poke at it. > > > > The zero-size sys_membarrier() read-side critical sections do make > > things act a bit differently, for example, interchanging the accesses > > in an RCU read-side critical section has no effect, while doing so in > > a sys_membarrier() reader can cause the result to be allowed. One key > > point is that everything before the end of a read-side critical section > > of any type is ordered before any later grace period of that same type, > > and vice versa. > > > > This is why reordering accesses matters for sys_membarrier() readers but > > not for RCU and SRCU readers -- in the case of RCU and SRCU readers, > > the accesses are inside the read-side critical section, while for > > sys_membarrier() readers, the read-side critical sections don't have > > an inside. So yes, ordering also matters in the case of SRCU and > > RCU readers for accesses outside of the read-side critical sections. > > The reason sys_membarrier() seems surprising to me isn't because it is > > any different in theoretical structure, but rather because the practice > > is to put RCU and SRCU read-side accesses inside a read-side critical > > sections, which is impossible for sys_membarrier(). > > RCU and sys_membarrier are more similar than you might think at first. > For one thing, if there were primitives for blocking and unblocking > reception of IPIs, those primitives would delimit critical sections for > sys_membarrier. (Maybe such things do exist; I wouldn't know.) Within the kernel of course local_irq_disable() and friends. In userspace, there have been proposals to make the IPI handler interact with rseq or equivalent, which would have a roughly similar effect. > For another, the way we model RCU isn't fully accurate for the Linux > kernel, as you know. Since individual instructions cannot be > preempted, each instruction is a tiny read-side critical section. > Thus, litmus tests like this one: > > P0 P1 > Wa=1 Wb=1 > synchronize_rcu() Ra=0 > Rb=0 > > actually are forbidden in the kernel (provided P1 isn't part of the > idle loop!), even though the LKMM allows them. However, it wouldn't > be forbidden if the accesses in P1 were swapped -- just like with > sys_membarrier. And that P1 isn't executing on a CPU that RCU believes to be offline, but yes. But this is an implementation choice, and SRCU makes a different choice, which would allow the litmus test shown above. And it would be good to keep this freedom for the implementation, in other words, this difference is a good thing, so let's please keep it. ;-) > Put these two observations together and you see that sys_membarrier is > almost exactly the same as RCU without explicit read-side critical > sections. Perhaps this isn't surprising, given that the initial > implementation of sys_membarrier() was pretty much the same as > synchronize_rcu(). Heh! The initial implementation in the Linux kernel was exactly synchronize_sched(). ;-) I would say that sys_membarrier() has zero-sized read-side critical sections, either comprising a single instruction (as is the case for synchronize_sched(), actually), preempt-disable regions of code (which are irrelevant to userspace execution), or the spaces between consecutive pairs of instructions (as is the case for the newer IPI-based implementation). The model picks the single-instruction option, and I haven't yet found a problem with this -- which is no surprise given that, as you say, an actual implementation makes this same choice. > > The other thing that took some time to get used to is the possibility > > of long delays during sys_membarrier() execution, allowing significant > > execution and reordering between different CPUs' IPIs. This was key > > to my understanding of the six-process example, and probably needs to > > be clearly called out, including in an example or two. > > In all the examples I'm aware of, no more than one of the IPIs > generated by each sys_membarrier call really matters. (Of course, > there's no way to know in advance which one it will be, so you have to > send an IPI to every CPU.) The execution delays and reordering > between different CPUs' IPIs don't appear to be significant. Well, there are litmus tests that are allowed in which the allowed execution is more easily explained in terms of delays between different CPUs' IPIs, so it seems worth keeping track of. There might be a litmus test that can tell the difference between simultaneous and non-simultaneous IPIs, but I cannot immediately think of one that matters. Might be a failure of imagination on my part, though. > > The interleaving restrictions are straightforward for me, but the > > fixed-time approach does have some interesting cross-talk potential > > between sys_membarrier() and RCU read-side critical sections whose > > accesses have been reversed. I don't believe that it is possible to > > leverage this "order the other guy's read-side critical sections" effect > > in the general case, but I could be missing something. > > I regard the fixed-time approach as nothing more than a heuristic > aid. It's not an accurate explaination of what's really going on. Agreed, albeit a useful heuristic aid in scripts generating litmus tests. Thanx, Paul > > If you are claiming that I am worrying unnecessarily, you are probably > > right. But if I didn't worry unnecessarily, RCU wouldn't work at all! ;-) > > Alan >
On Fri, 14 Dec 2018, Paul E. McKenney wrote: > I would say that sys_membarrier() has zero-sized read-side critical > sections, either comprising a single instruction (as is the case for > synchronize_sched(), actually), preempt-disable regions of code > (which are irrelevant to userspace execution), or the spaces between > consecutive pairs of instructions (as is the case for the newer > IPI-based implementation). > > The model picks the single-instruction option, and I haven't yet found > a problem with this -- which is no surprise given that, as you say, > an actual implementation makes this same choice. I believe that for RCU tests the LKMM gives the same results for length-zero critical sections interspersed between all the instructions and length-one critical sections surrounding all instructions (except synchronize_rcu). But the proof is tricky and I haven't checked it carefully. > > > The other thing that took some time to get used to is the possibility > > > of long delays during sys_membarrier() execution, allowing significant > > > execution and reordering between different CPUs' IPIs. This was key > > > to my understanding of the six-process example, and probably needs to > > > be clearly called out, including in an example or two. > > > > In all the examples I'm aware of, no more than one of the IPIs > > generated by each sys_membarrier call really matters. (Of course, > > there's no way to know in advance which one it will be, so you have to > > send an IPI to every CPU.) The execution delays and reordering > > between different CPUs' IPIs don't appear to be significant. > > Well, there are litmus tests that are allowed in which the allowed > execution is more easily explained in terms of delays between different > CPUs' IPIs, so it seems worth keeping track of. > > There might be a litmus test that can tell the difference between > simultaneous and non-simultaneous IPIs, but I cannot immediately think of > one that matters. Might be a failure of imagination on my part, though. P0 P1 P2 Wc=1 [mb01] Rb=1 memb Wa=1 Rc=0 Ra=0 Wb=1 [mb02] The IPIs have to appear in the positions shown, which means they cannot be simultaneous. The test is allowed because P2's reads can be reordered. Alan
On Fri, Dec 14, 2018 at 04:39:34PM -0500, Alan Stern wrote: > On Fri, 14 Dec 2018, Paul E. McKenney wrote: > > > I would say that sys_membarrier() has zero-sized read-side critical > > sections, either comprising a single instruction (as is the case for > > synchronize_sched(), actually), preempt-disable regions of code > > (which are irrelevant to userspace execution), or the spaces between > > consecutive pairs of instructions (as is the case for the newer > > IPI-based implementation). > > > > The model picks the single-instruction option, and I haven't yet found > > a problem with this -- which is no surprise given that, as you say, > > an actual implementation makes this same choice. > > I believe that for RCU tests the LKMM gives the same results for > length-zero critical sections interspersed between all the instructions > and length-one critical sections surrounding all instructions (except > synchronize_rcu). But the proof is tricky and I haven't checked it > carefully. That assertion is completely consistent with my implementation experience, give or take the usual caveats about idle and offline execution. > > > > The other thing that took some time to get used to is the possibility > > > > of long delays during sys_membarrier() execution, allowing significant > > > > execution and reordering between different CPUs' IPIs. This was key > > > > to my understanding of the six-process example, and probably needs to > > > > be clearly called out, including in an example or two. > > > > > > In all the examples I'm aware of, no more than one of the IPIs > > > generated by each sys_membarrier call really matters. (Of course, > > > there's no way to know in advance which one it will be, so you have to > > > send an IPI to every CPU.) The execution delays and reordering > > > between different CPUs' IPIs don't appear to be significant. > > > > Well, there are litmus tests that are allowed in which the allowed > > execution is more easily explained in terms of delays between different > > CPUs' IPIs, so it seems worth keeping track of. > > > > There might be a litmus test that can tell the difference between > > simultaneous and non-simultaneous IPIs, but I cannot immediately think of > > one that matters. Might be a failure of imagination on my part, though. > > P0 P1 P2 > Wc=1 [mb01] Rb=1 > memb Wa=1 Rc=0 > Ra=0 Wb=1 [mb02] > > The IPIs have to appear in the positions shown, which means they cannot > be simultaneous. The test is allowed because P2's reads can be > reordered. OK, so "simultaneous" IPIs could be emulated in a real implementation by having sys_membarrier() send each IPI (but not wait for a response), then execute a full memory barrier and set a shared variable. Each IPI handler would spin waiting for the shared variable to be set, then execute a full memory barrier and atomically increment yet another shared variable and return from interrupt. When that other shared variable's value reached the number of IPIs sent, the sys_membarrier() would execute its final (already existing) full memory barrier and return. Horribly expensive and definitely not recommended, but eminently doable. The difference between current sys_membarrier() and the "simultaneous" variant described above is similar to the difference between non-multicopy-atomic and multicopy-atomic memory ordering. So, after thinking it through, my guess is that pretty much any litmus test that can discern between multicopy-atomic and non-multicopy-atomic should be transformable into something that can distinguish between the current and the "simultaneous" sys_membarrier() implementation. Seem reasonable? Or alternatively, may I please apply your Signed-off-by to your earlier sys_membarrier() patch so that I can queue it? I will probably also change smp_memb() to membarrier() or some such. Again, within the Linux kernel, membarrier() can be emulated with smp_call_function() invoking a handler that does smp_mb(). Thanx, Paul
On Sun, 16 Dec 2018, Paul E. McKenney wrote: > OK, so "simultaneous" IPIs could be emulated in a real implementation by > having sys_membarrier() send each IPI (but not wait for a response), then > execute a full memory barrier and set a shared variable. Each IPI handler > would spin waiting for the shared variable to be set, then execute a full > memory barrier and atomically increment yet another shared variable and > return from interrupt. When that other shared variable's value reached > the number of IPIs sent, the sys_membarrier() would execute its final > (already existing) full memory barrier and return. Horribly expensive > and definitely not recommended, but eminently doable. I don't think that's right. What would make the IPIs "simultaneous" would be if none of the handlers return until all of them have started executing. For example, you could have each handler increment a shared variable and then spin, waiting for the variable to reach the number of CPUs, before returning. What you wrote was to have each handler wait until all the IPIs had been sent, which is not the same thing at all. > The difference between current sys_membarrier() and the "simultaneous" > variant described above is similar to the difference between > non-multicopy-atomic and multicopy-atomic memory ordering. So, after > thinking it through, my guess is that pretty much any litmus test that > can discern between multicopy-atomic and non-multicopy-atomic should > be transformable into something that can distinguish between the current > and the "simultaneous" sys_membarrier() implementation. > > Seem reasonable? Yes. > Or alternatively, may I please apply your Signed-off-by to your earlier > sys_membarrier() patch so that I can queue it? I will probably also > change smp_memb() to membarrier() or some such. Again, within the > Linux kernel, membarrier() can be emulated with smp_call_function() > invoking a handler that does smp_mb(). Do you really want to put sys_membarrier into the LKMM? I'm not so sure it's appropriate. Alan
* Adhemerval Zanella: >> Well, that already happens with the duplicated header approach if users >> want to use UAPI headers, too. It requires complicated synchronization >> to make it work. > > But at least glibc is not imposing it, if user want to use kernel uapi > header he will explicit include it and it is up to kernel provide a sane > implementation. This only works if the header does not include system call wrapper. >> We can tweak the conditional for the header inclusion further, like >> this: >> >> +#if LINUX_VERSION_CODE >= KERNEL_VERSION(4, 16, 0) \ >> + || __glibc_has_include (<linux/membarrier.h>) \ >> +# include <linux/membarrier.h> >> +#else >> >> With GCC 5 and later, this would always prefer the kernel header if >> available. There would never be a conflict between the definitions, >> irrespective of header file inclusion order. >> >> What do you think? > > In any case I don't think my suggestion should be a blocker, we > already rely on linux header for some cases and it seems I am alone in > trying to decouple glibc from kernel headers. Okay. I've reconsidered the header baseline approach and I'm now proposing to go with the 4.3 header version, the first which had the system call. With a later baseline, you'd get an error if you include <linux/membarrier.h> before <sys/membarrier.h> and the kernel is older than the chosen baseline. This cannot happen if we pick the first supported kernel version as the baseline. (This will not matter if we eventually choose the use the __has_include trick and the compiler supports __has_include.) I had to adjust the test to define a private copy of the subsequently added constants. Thanks, Florian 2018-12-17 Florian Weimer <fweimer@redhat.com> Linux: Implement membarrier function. * sysdeps/unix/sysv/linux/Makefile (sysdep_headers): Add sys/membarrier.h. (tests): Add tst-membarrier. * sysdeps/unix/sysv/linux/Versions (GLIBC_2.27): Add membarrier. * sysdeps/unix/sysv/linux/sys/membarrier.h: New file. * sysdeps/unix/sysv/linux/tst-membarrier.c: Likewise. * sysdeps/unix/sysv/linux/aarch64/libc.abilist (GLIBC_2.29): Add membarrier. * sysdeps/unix/sysv/linux/alpha/libc.abilist (GLIBC_2.29): Likewise. * sysdeps/unix/sysv/linux/arm/libc.abilist (GLIBC_2.29): Likewise. * sysdeps/unix/sysv/linux/hppa/libc.abilist (GLIBC_2.29): Likewise. * sysdeps/unix/sysv/linux/i386/libc.abilist (GLIBC_2.29): Likewise. * sysdeps/unix/sysv/linux/ia64/libc.abilist (GLIBC_2.29): Likewise. * sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist (GLIBC_2.29): Likewise. * sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist (GLIBC_2.29): Likewise. * sysdeps/unix/sysv/linux/microblaze/libc.abilist (GLIBC_2.29): Likewise. * sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist (GLIBC_2.29): Likewise. * sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist (GLIBC_2.29): Likewise. * sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist (GLIBC_2.29): Likewise. * sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist (GLIBC_2.29): Likewise. * sysdeps/unix/sysv/linux/nios2/libc.abilist (GLIBC_2.29): Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist (GLIBC_2.29): Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist (GLIBC_2.29): Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/be/libc.abilist (GLIBC_2.29): Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/le/libc.abilist (GLIBC_2.29): Likewise. * sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist (GLIBC_2.29): Likewise. * sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist (GLIBC_2.29): Likewise. * sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist (GLIBC_2.29): Likewise. * sysdeps/unix/sysv/linux/sh/libc.abilist (GLIBC_2.29): Likewise. * sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist (GLIBC_2.29): Likewise. * sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist (GLIBC_2.29): Likewise. * sysdeps/unix/sysv/linux/x86_64/64/libc.abilist (GLIBC_2.29): Likewise. * sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist (GLIBC_2.29): Likewise. diff --git a/NEWS b/NEWS index ae80818df4..d8cefe90dd 100644 --- a/NEWS +++ b/NEWS @@ -46,6 +46,9 @@ Major new features: incosistent mutex state after fork call in multithread environment. In both popen and system there is no direct access to user-defined mutexes. +* On Linux, The membarrier function and the <sys/membarrier.h> header file + have been added. + Deprecated and removed features, and other changes affecting compatibility: * The glibc.tune tunable namespace has been renamed to glibc.cpu and the diff --git a/sysdeps/unix/sysv/linux/Makefile b/sysdeps/unix/sysv/linux/Makefile index 2c1a7dd274..74138429d6 100644 --- a/sysdeps/unix/sysv/linux/Makefile +++ b/sysdeps/unix/sysv/linux/Makefile @@ -43,12 +43,13 @@ sysdep_headers += sys/mount.h sys/acct.h sys/sysctl.h \ bits/siginfo-arch.h bits/siginfo-consts-arch.h \ bits/procfs.h bits/procfs-id.h bits/procfs-extra.h \ bits/procfs-prregset.h bits/mman-map-flags-generic.h \ - bits/msq-pad.h bits/sem-pad.h bits/shmlba.h bits/shm-pad.h + bits/msq-pad.h bits/sem-pad.h bits/shmlba.h bits/shm-pad.h \ + sys/membarrier.h tests += tst-clone tst-clone2 tst-clone3 tst-fanotify tst-personality \ tst-quota tst-sync_file_range tst-sysconf-iov_max tst-ttyname \ test-errno-linux tst-memfd_create tst-mlock2 tst-pkey \ - tst-rlimit-infinity tst-ofdlocks + tst-rlimit-infinity tst-ofdlocks tst-membarrier tests-internal += tst-ofdlocks-compat diff --git a/sysdeps/unix/sysv/linux/Versions b/sysdeps/unix/sysv/linux/Versions index f1e12d9c69..39a61b29eb 100644 --- a/sysdeps/unix/sysv/linux/Versions +++ b/sysdeps/unix/sysv/linux/Versions @@ -173,6 +173,7 @@ libc { } GLIBC_2.29 { getcpu; + membarrier; } GLIBC_PRIVATE { # functions used in other libraries diff --git a/sysdeps/unix/sysv/linux/aarch64/libc.abilist b/sysdeps/unix/sysv/linux/aarch64/libc.abilist index 9c330f325e..eff4b1d055 100644 --- a/sysdeps/unix/sysv/linux/aarch64/libc.abilist +++ b/sysdeps/unix/sysv/linux/aarch64/libc.abilist @@ -2139,5 +2139,6 @@ GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F GLIBC_2.29 getcpu F +GLIBC_2.29 membarrier F GLIBC_2.29 posix_spawn_file_actions_addchdir_np F GLIBC_2.29 posix_spawn_file_actions_addfchdir_np F diff --git a/sysdeps/unix/sysv/linux/alpha/libc.abilist b/sysdeps/unix/sysv/linux/alpha/libc.abilist index f630fa4c6f..6f37593acc 100644 --- a/sysdeps/unix/sysv/linux/alpha/libc.abilist +++ b/sysdeps/unix/sysv/linux/alpha/libc.abilist @@ -2034,6 +2034,7 @@ GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F GLIBC_2.29 getcpu F +GLIBC_2.29 membarrier F GLIBC_2.29 posix_spawn_file_actions_addchdir_np F GLIBC_2.29 posix_spawn_file_actions_addfchdir_np F GLIBC_2.3 __ctype_b_loc F diff --git a/sysdeps/unix/sysv/linux/arm/libc.abilist b/sysdeps/unix/sysv/linux/arm/libc.abilist index b96f45590f..b2e5cc4113 100644 --- a/sysdeps/unix/sysv/linux/arm/libc.abilist +++ b/sysdeps/unix/sysv/linux/arm/libc.abilist @@ -124,6 +124,7 @@ GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F GLIBC_2.29 getcpu F +GLIBC_2.29 membarrier F GLIBC_2.29 posix_spawn_file_actions_addchdir_np F GLIBC_2.29 posix_spawn_file_actions_addfchdir_np F GLIBC_2.4 _Exit F diff --git a/sysdeps/unix/sysv/linux/hppa/libc.abilist b/sysdeps/unix/sysv/linux/hppa/libc.abilist index 088a8ee369..a45c266c33 100644 --- a/sysdeps/unix/sysv/linux/hppa/libc.abilist +++ b/sysdeps/unix/sysv/linux/hppa/libc.abilist @@ -1881,6 +1881,7 @@ GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F GLIBC_2.29 getcpu F +GLIBC_2.29 membarrier F GLIBC_2.29 posix_spawn_file_actions_addchdir_np F GLIBC_2.29 posix_spawn_file_actions_addfchdir_np F GLIBC_2.3 __ctype_b_loc F diff --git a/sysdeps/unix/sysv/linux/i386/libc.abilist b/sysdeps/unix/sysv/linux/i386/libc.abilist index f7ff2c57b9..3597737fb2 100644 --- a/sysdeps/unix/sysv/linux/i386/libc.abilist +++ b/sysdeps/unix/sysv/linux/i386/libc.abilist @@ -2046,6 +2046,7 @@ GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F GLIBC_2.29 getcpu F +GLIBC_2.29 membarrier F GLIBC_2.29 posix_spawn_file_actions_addchdir_np F GLIBC_2.29 posix_spawn_file_actions_addfchdir_np F GLIBC_2.3 __ctype_b_loc F diff --git a/sysdeps/unix/sysv/linux/ia64/libc.abilist b/sysdeps/unix/sysv/linux/ia64/libc.abilist index becd8b1033..908e8b4d52 100644 --- a/sysdeps/unix/sysv/linux/ia64/libc.abilist +++ b/sysdeps/unix/sysv/linux/ia64/libc.abilist @@ -1915,6 +1915,7 @@ GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F GLIBC_2.29 getcpu F +GLIBC_2.29 membarrier F GLIBC_2.29 posix_spawn_file_actions_addchdir_np F GLIBC_2.29 posix_spawn_file_actions_addfchdir_np F GLIBC_2.3 __ctype_b_loc F diff --git a/sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist b/sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist index 74e42a5209..8fc66fd450 100644 --- a/sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist +++ b/sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist @@ -125,6 +125,7 @@ GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F GLIBC_2.29 getcpu F +GLIBC_2.29 membarrier F GLIBC_2.29 posix_spawn_file_actions_addchdir_np F GLIBC_2.29 posix_spawn_file_actions_addfchdir_np F GLIBC_2.4 _Exit F diff --git a/sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist b/sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist index 4af5a74e8a..e4c60d6f83 100644 --- a/sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist +++ b/sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist @@ -1990,6 +1990,7 @@ GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F GLIBC_2.29 getcpu F +GLIBC_2.29 membarrier F GLIBC_2.29 posix_spawn_file_actions_addchdir_np F GLIBC_2.29 posix_spawn_file_actions_addfchdir_np F GLIBC_2.3 __ctype_b_loc F diff --git a/sysdeps/unix/sysv/linux/microblaze/libc.abilist b/sysdeps/unix/sysv/linux/microblaze/libc.abilist index ccef673fd2..377e5d7049 100644 --- a/sysdeps/unix/sysv/linux/microblaze/libc.abilist +++ b/sysdeps/unix/sysv/linux/microblaze/libc.abilist @@ -2131,5 +2131,6 @@ GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F GLIBC_2.29 getcpu F +GLIBC_2.29 membarrier F GLIBC_2.29 posix_spawn_file_actions_addchdir_np F GLIBC_2.29 posix_spawn_file_actions_addfchdir_np F diff --git a/sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist b/sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist index 1054bb599e..4f93740486 100644 --- a/sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist +++ b/sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist @@ -1968,6 +1968,7 @@ GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F GLIBC_2.29 getcpu F +GLIBC_2.29 membarrier F GLIBC_2.29 posix_spawn_file_actions_addchdir_np F GLIBC_2.29 posix_spawn_file_actions_addfchdir_np F GLIBC_2.3 __ctype_b_loc F diff --git a/sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist b/sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist index 4f5b5ffebf..c8e956a922 100644 --- a/sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist +++ b/sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist @@ -1966,6 +1966,7 @@ GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F GLIBC_2.29 getcpu F +GLIBC_2.29 membarrier F GLIBC_2.29 posix_spawn_file_actions_addchdir_np F GLIBC_2.29 posix_spawn_file_actions_addfchdir_np F GLIBC_2.3 __ctype_b_loc F diff --git a/sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist b/sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist index 943aee58d4..6f69ce6478 100644 --- a/sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist +++ b/sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist @@ -1974,6 +1974,7 @@ GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F GLIBC_2.29 getcpu F +GLIBC_2.29 membarrier F GLIBC_2.29 posix_spawn_file_actions_addchdir_np F GLIBC_2.29 posix_spawn_file_actions_addfchdir_np F GLIBC_2.3 __ctype_b_loc F diff --git a/sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist b/sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist index 17a5d17ef9..eaf6fdd1a4 100644 --- a/sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist +++ b/sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist @@ -1969,6 +1969,7 @@ GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F GLIBC_2.29 getcpu F +GLIBC_2.29 membarrier F GLIBC_2.29 posix_spawn_file_actions_addchdir_np F GLIBC_2.29 posix_spawn_file_actions_addfchdir_np F GLIBC_2.3 __ctype_b_loc F diff --git a/sysdeps/unix/sysv/linux/nios2/libc.abilist b/sysdeps/unix/sysv/linux/nios2/libc.abilist index 4d62a540fd..d6becb39db 100644 --- a/sysdeps/unix/sysv/linux/nios2/libc.abilist +++ b/sysdeps/unix/sysv/linux/nios2/libc.abilist @@ -2172,5 +2172,6 @@ GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F GLIBC_2.29 getcpu F +GLIBC_2.29 membarrier F GLIBC_2.29 posix_spawn_file_actions_addchdir_np F GLIBC_2.29 posix_spawn_file_actions_addfchdir_np F diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist index ecc2d6fa13..2b45a9dedf 100644 --- a/sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist +++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist @@ -1994,6 +1994,7 @@ GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F GLIBC_2.29 getcpu F +GLIBC_2.29 membarrier F GLIBC_2.29 posix_spawn_file_actions_addchdir_np F GLIBC_2.29 posix_spawn_file_actions_addfchdir_np F GLIBC_2.3 __ctype_b_loc F diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist index f5830f9c33..a9cb22ec2e 100644 --- a/sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist +++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist @@ -1998,6 +1998,7 @@ GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F GLIBC_2.29 getcpu F +GLIBC_2.29 membarrier F GLIBC_2.29 posix_spawn_file_actions_addchdir_np F GLIBC_2.29 posix_spawn_file_actions_addfchdir_np F GLIBC_2.3 __ctype_b_loc F diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc64/be/libc.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc64/be/libc.abilist index 633d8f4792..26216a660b 100644 --- a/sysdeps/unix/sysv/linux/powerpc/powerpc64/be/libc.abilist +++ b/sysdeps/unix/sysv/linux/powerpc/powerpc64/be/libc.abilist @@ -124,6 +124,7 @@ GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F GLIBC_2.29 getcpu F +GLIBC_2.29 membarrier F GLIBC_2.29 posix_spawn_file_actions_addchdir_np F GLIBC_2.29 posix_spawn_file_actions_addfchdir_np F GLIBC_2.3 _Exit F diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc64/le/libc.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc64/le/libc.abilist index 2c712636ef..066985b4a4 100644 --- a/sysdeps/unix/sysv/linux/powerpc/powerpc64/le/libc.abilist +++ b/sysdeps/unix/sysv/linux/powerpc/powerpc64/le/libc.abilist @@ -2229,5 +2229,6 @@ GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F GLIBC_2.29 getcpu F +GLIBC_2.29 membarrier F GLIBC_2.29 posix_spawn_file_actions_addchdir_np F GLIBC_2.29 posix_spawn_file_actions_addfchdir_np F diff --git a/sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist b/sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist index 195bc8b2cf..6b52ed0eeb 100644 --- a/sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist +++ b/sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist @@ -2101,5 +2101,6 @@ GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F GLIBC_2.29 getcpu F +GLIBC_2.29 membarrier F GLIBC_2.29 posix_spawn_file_actions_addchdir_np F GLIBC_2.29 posix_spawn_file_actions_addfchdir_np F diff --git a/sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist b/sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist index 334def033c..80db25db0c 100644 --- a/sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist +++ b/sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist @@ -2003,6 +2003,7 @@ GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F GLIBC_2.29 getcpu F +GLIBC_2.29 membarrier F GLIBC_2.29 posix_spawn_file_actions_addchdir_np F GLIBC_2.29 posix_spawn_file_actions_addfchdir_np F GLIBC_2.3 __ctype_b_loc F diff --git a/sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist b/sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist index 536f4c4ced..8fcc88ef9a 100644 --- a/sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist +++ b/sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist @@ -1909,6 +1909,7 @@ GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F GLIBC_2.29 __fentry__ F GLIBC_2.29 getcpu F +GLIBC_2.29 membarrier F GLIBC_2.29 posix_spawn_file_actions_addchdir_np F GLIBC_2.29 posix_spawn_file_actions_addfchdir_np F GLIBC_2.3 __ctype_b_loc F diff --git a/sysdeps/unix/sysv/linux/sh/libc.abilist b/sysdeps/unix/sysv/linux/sh/libc.abilist index 30ae3b6ebb..8a49fa8a4b 100644 --- a/sysdeps/unix/sysv/linux/sh/libc.abilist +++ b/sysdeps/unix/sysv/linux/sh/libc.abilist @@ -1885,6 +1885,7 @@ GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F GLIBC_2.29 getcpu F +GLIBC_2.29 membarrier F GLIBC_2.29 posix_spawn_file_actions_addchdir_np F GLIBC_2.29 posix_spawn_file_actions_addfchdir_np F GLIBC_2.3 __ctype_b_loc F diff --git a/sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist b/sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist index 68b107d080..e17b9e72da 100644 --- a/sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist +++ b/sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist @@ -1997,6 +1997,7 @@ GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F GLIBC_2.29 getcpu F +GLIBC_2.29 membarrier F GLIBC_2.29 posix_spawn_file_actions_addchdir_np F GLIBC_2.29 posix_spawn_file_actions_addfchdir_np F GLIBC_2.3 __ctype_b_loc F diff --git a/sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist b/sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist index e5b6a4da50..34f741d00b 100644 --- a/sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist +++ b/sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist @@ -1938,6 +1938,7 @@ GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F GLIBC_2.29 getcpu F +GLIBC_2.29 membarrier F GLIBC_2.29 posix_spawn_file_actions_addchdir_np F GLIBC_2.29 posix_spawn_file_actions_addfchdir_np F GLIBC_2.3 __ctype_b_loc F diff --git a/sysdeps/unix/sysv/linux/sys/membarrier.h b/sysdeps/unix/sysv/linux/sys/membarrier.h new file mode 100644 index 0000000000..0e0941bc1c --- /dev/null +++ b/sysdeps/unix/sysv/linux/sys/membarrier.h @@ -0,0 +1,48 @@ +/* Memory barriers. + Copyright (C) 2018 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <http://www.gnu.org/licenses/>. */ + +#ifndef _SYS_MEMBARRIER_H +#define _SYS_MEMBARRIER_H 1 + +#include <features.h> + +__BEGIN_DECLS + +/* Perform a memory barrier on multiple threads. */ +int membarrier (int __op, int __flags) __THROW; + +__END_DECLS + +/* Obtain the definitions of the MEMBARRIER_CMD_* constants. */ + +#include <linux/version.h> +#if LINUX_VERSION_CODE >= KERNEL_VERSION (4, 3, 0) +# include <linux/membarrier.h> +#else + +/* Definitions from Linux 4.3 follow. */ + +enum membarrier_cmd +{ + MEMBARRIER_CMD_QUERY = 0, + MEMBARRIER_CMD_SHARED = 1 +}; + +#endif + +#endif /* _SYS_MEMBARRIER_H */ diff --git a/sysdeps/unix/sysv/linux/syscalls.list b/sysdeps/unix/sysv/linux/syscalls.list index e24ea29e35..3deee2bc19 100644 --- a/sysdeps/unix/sysv/linux/syscalls.list +++ b/sysdeps/unix/sysv/linux/syscalls.list @@ -112,3 +112,4 @@ process_vm_writev EXTRA process_vm_writev i:ipipii process_vm_writev memfd_create EXTRA memfd_create i:si memfd_create pkey_alloc EXTRA pkey_alloc i:ii pkey_alloc pkey_free EXTRA pkey_free i:i pkey_free +membarrier EXTRA membarrier i:ii membarrier diff --git a/sysdeps/unix/sysv/linux/tst-membarrier.c b/sysdeps/unix/sysv/linux/tst-membarrier.c new file mode 100644 index 0000000000..48f4cddf02 --- /dev/null +++ b/sysdeps/unix/sysv/linux/tst-membarrier.c @@ -0,0 +1,66 @@ +/* Tests for the membarrier function. + Copyright (C) 2018 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <http://www.gnu.org/licenses/>. */ + +#include <errno.h> +#include <stdio.h> +#include <support/check.h> +#include <sys/membarrier.h> + +static int +do_test (void) +{ + int supported = membarrier (MEMBARRIER_CMD_QUERY, 0); + if (supported == -1) + { + if (errno == ENOSYS) + FAIL_UNSUPPORTED ("membarrier system call not implemented"); + else + FAIL_EXIT1 ("membarrier: %m"); + } + + if ((supported & MEMBARRIER_CMD_SHARED) == 0) + FAIL_UNSUPPORTED ("shared memory barriers not supported"); + + puts ("info: membarrier is supported on this system"); + + /* This was not included in the original implementation in Linux + 4.3. */ + const enum membarrier_cmd cmd_private_expedited = 8; + const enum membarrier_cmd cmd_register_private_expedited = 16; + + /* The shared barrier is always implemented. */ + TEST_COMPARE (supported & MEMBARRIER_CMD_SHARED, MEMBARRIER_CMD_SHARED); + TEST_COMPARE (membarrier (MEMBARRIER_CMD_SHARED, 0), 0); + + /* If the private-expedited barrier is advertised, execute it after + registering the intent. */ + if (supported & cmd_private_expedited) + { + puts ("info: MEMBARRIER_CMD_PRIVATE_EXPEDITED is supported"); + TEST_COMPARE (supported & cmd_register_private_expedited, + cmd_register_private_expedited); + TEST_COMPARE (membarrier (cmd_register_private_expedited, 0), 0); + TEST_COMPARE (membarrier (cmd_private_expedited, 0), 0); + } + else + puts ("info: MEMBARRIER_CMD_PRIVATE_EXPEDITED not supported"); + + return 0; +} + +#include <support/test-driver.c> diff --git a/sysdeps/unix/sysv/linux/x86_64/64/libc.abilist b/sysdeps/unix/sysv/linux/x86_64/64/libc.abilist index 86dfb0c94d..6f15b7b371 100644 --- a/sysdeps/unix/sysv/linux/x86_64/64/libc.abilist +++ b/sysdeps/unix/sysv/linux/x86_64/64/libc.abilist @@ -1896,6 +1896,7 @@ GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F GLIBC_2.29 getcpu F +GLIBC_2.29 membarrier F GLIBC_2.29 posix_spawn_file_actions_addchdir_np F GLIBC_2.29 posix_spawn_file_actions_addfchdir_np F GLIBC_2.3 __ctype_b_loc F diff --git a/sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist b/sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist index dd688263aa..88e1e1f36d 100644 --- a/sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist +++ b/sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist @@ -2147,5 +2147,6 @@ GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F GLIBC_2.29 getcpu F +GLIBC_2.29 membarrier F GLIBC_2.29 posix_spawn_file_actions_addchdir_np F GLIBC_2.29 posix_spawn_file_actions_addfchdir_np F
On Mon, Dec 17, 2018 at 11:02:40AM -0500, Alan Stern wrote: > On Sun, 16 Dec 2018, Paul E. McKenney wrote: > > > OK, so "simultaneous" IPIs could be emulated in a real implementation by > > having sys_membarrier() send each IPI (but not wait for a response), then > > execute a full memory barrier and set a shared variable. Each IPI handler > > would spin waiting for the shared variable to be set, then execute a full > > memory barrier and atomically increment yet another shared variable and > > return from interrupt. When that other shared variable's value reached > > the number of IPIs sent, the sys_membarrier() would execute its final > > (already existing) full memory barrier and return. Horribly expensive > > and definitely not recommended, but eminently doable. > > I don't think that's right. What would make the IPIs "simultaneous" > would be if none of the handlers return until all of them have started > executing. For example, you could have each handler increment a shared > variable and then spin, waiting for the variable to reach the number of > CPUs, before returning. > > What you wrote was to have each handler wait until all the IPIs had > been sent, which is not the same thing at all. You are right, the handlers need to do the atomic increment before waiting for the shared variable to be set, and the sys_membarrier() must wait for the incremented variable to reach its final value before setting the shared variable. > > The difference between current sys_membarrier() and the "simultaneous" > > variant described above is similar to the difference between > > non-multicopy-atomic and multicopy-atomic memory ordering. So, after > > thinking it through, my guess is that pretty much any litmus test that > > can discern between multicopy-atomic and non-multicopy-atomic should > > be transformable into something that can distinguish between the current > > and the "simultaneous" sys_membarrier() implementation. > > > > Seem reasonable? > > Yes. > > > Or alternatively, may I please apply your Signed-off-by to your earlier > > sys_membarrier() patch so that I can queue it? I will probably also > > change smp_memb() to membarrier() or some such. Again, within the > > Linux kernel, membarrier() can be emulated with smp_call_function() > > invoking a handler that does smp_mb(). > > Do you really want to put sys_membarrier into the LKMM? I'm not so > sure it's appropriate. We do need it for the benefit of the C++ folks, but you are right that it need not be accepted into the kernel to be useful to them. So agreed, let's hold off for the time being. Thanx, Paul
On Wed, Nov 28, 2018 at 04:05:01PM +0100, Florian Weimer wrote: > This is essentially a repost of last year's patch, rebased to the glibc > 2.29 symbol version and reflecting the introduction of > MEMBARRIER_CMD_GLOBAL. > > I'm not including any changes to manual/ here because the set of > supported operations is evolving rapidly, we could not get consensus for > the language I proposed the last time, and I do not want to contribute > to the manual for the time being. > > Thanks, > Florian > > 2018-11-28 Florian Weimer <fweimer@redhat.com> > > Linux: Implement membarrier function. > * sysdeps/unix/sysv/linux/Makefile (sysdep_headers): Add > sys/membarrier.h. > (tests): Add tst-membarrier. > * sysdeps/unix/sysv/linux/Versions (GLIBC_2.27): Add membarrier. > * sysdeps/unix/sysv/linux/sys/membarrier.h: New file. I notice that the declaration moved from sys/mman.h to this new header since the previous version of this patch. Is this an intentional change, and is it where everyone now intends/agrees for it to be when it gets merged? Rich
* Rich Felker: > On Wed, Nov 28, 2018 at 04:05:01PM +0100, Florian Weimer wrote: >> This is essentially a repost of last year's patch, rebased to the glibc >> 2.29 symbol version and reflecting the introduction of >> MEMBARRIER_CMD_GLOBAL. >> >> I'm not including any changes to manual/ here because the set of >> supported operations is evolving rapidly, we could not get consensus for >> the language I proposed the last time, and I do not want to contribute >> to the manual for the time being. >> >> Thanks, >> Florian >> >> 2018-11-28 Florian Weimer <fweimer@redhat.com> >> >> Linux: Implement membarrier function. >> * sysdeps/unix/sysv/linux/Makefile (sysdep_headers): Add >> sys/membarrier.h. >> (tests): Add tst-membarrier. >> * sysdeps/unix/sysv/linux/Versions (GLIBC_2.27): Add membarrier. >> * sysdeps/unix/sysv/linux/sys/membarrier.h: New file. > > I notice that the declaration moved from sys/mman.h to this new header > since the previous version of this patch. Is this an intentional > change, Yes, it makes it clearer how we avoid maintaining a separate list of constants for this. > and is it where everyone now intends/agrees for it to be when > it gets merged? I don't know if there is consensus, sorry. Thanks, Florian
diff --git a/NEWS b/NEWS index 1098be1afb..d5786f4eab 100644 --- a/NEWS +++ b/NEWS @@ -35,6 +35,9 @@ Major new features: different directory. This is a GNU extension and similar to the Solaris function of the same name. +* The membarrier function and the <sys/membarrier.h> header file have been + added. + Deprecated and removed features, and other changes affecting compatibility: * The glibc.tune tunable namespace has been renamed to glibc.cpu and the diff --git a/sysdeps/unix/sysv/linux/Makefile b/sysdeps/unix/sysv/linux/Makefile index 362cf3b950..f8c4843ab6 100644 --- a/sysdeps/unix/sysv/linux/Makefile +++ b/sysdeps/unix/sysv/linux/Makefile @@ -43,12 +43,13 @@ sysdep_headers += sys/mount.h sys/acct.h sys/sysctl.h \ bits/siginfo-arch.h bits/siginfo-consts-arch.h \ bits/procfs.h bits/procfs-id.h bits/procfs-extra.h \ bits/procfs-prregset.h bits/mman-map-flags-generic.h \ - bits/msq-pad.h bits/sem-pad.h bits/shmlba.h bits/shm-pad.h + bits/msq-pad.h bits/sem-pad.h bits/shmlba.h bits/shm-pad.h \ + sys/membarrier.h tests += tst-clone tst-clone2 tst-clone3 tst-fanotify tst-personality \ tst-quota tst-sync_file_range tst-sysconf-iov_max tst-ttyname \ test-errno-linux tst-memfd_create tst-mlock2 tst-pkey \ - tst-rlimit-infinity tst-ofdlocks + tst-rlimit-infinity tst-ofdlocks tst-membarrier tests-internal += tst-ofdlocks-compat diff --git a/sysdeps/unix/sysv/linux/Versions b/sysdeps/unix/sysv/linux/Versions index 336c13b57d..86db06f403 100644 --- a/sysdeps/unix/sysv/linux/Versions +++ b/sysdeps/unix/sysv/linux/Versions @@ -171,6 +171,9 @@ libc { mlock2; pkey_alloc; pkey_free; pkey_set; pkey_get; pkey_mprotect; } + GLIBC_2.29 { + membarrier; + } GLIBC_PRIVATE { # functions used in other libraries __syscall_rt_sigqueueinfo; diff --git a/sysdeps/unix/sysv/linux/aarch64/libc.abilist b/sysdeps/unix/sysv/linux/aarch64/libc.abilist index e66c741d04..c73c731eec 100644 --- a/sysdeps/unix/sysv/linux/aarch64/libc.abilist +++ b/sysdeps/unix/sysv/linux/aarch64/libc.abilist @@ -2138,4 +2138,5 @@ GLIBC_2.28 thrd_current F GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F +GLIBC_2.29 membarrier F GLIBC_2.29 posix_spawn_file_actions_addchdir_np F diff --git a/sysdeps/unix/sysv/linux/alpha/libc.abilist b/sysdeps/unix/sysv/linux/alpha/libc.abilist index 8df162fe99..c9488b3c18 100644 --- a/sysdeps/unix/sysv/linux/alpha/libc.abilist +++ b/sysdeps/unix/sysv/linux/alpha/libc.abilist @@ -2033,6 +2033,7 @@ GLIBC_2.28 thrd_current F GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F +GLIBC_2.29 membarrier F GLIBC_2.29 posix_spawn_file_actions_addchdir_np F GLIBC_2.3 __ctype_b_loc F GLIBC_2.3 __ctype_tolower_loc F diff --git a/sysdeps/unix/sysv/linux/arm/libc.abilist b/sysdeps/unix/sysv/linux/arm/libc.abilist index 43c804f9dc..2524b6545b 100644 --- a/sysdeps/unix/sysv/linux/arm/libc.abilist +++ b/sysdeps/unix/sysv/linux/arm/libc.abilist @@ -123,6 +123,7 @@ GLIBC_2.28 thrd_current F GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F +GLIBC_2.29 membarrier F GLIBC_2.29 posix_spawn_file_actions_addchdir_np F GLIBC_2.4 _Exit F GLIBC_2.4 _IO_2_1_stderr_ D 0xa0 diff --git a/sysdeps/unix/sysv/linux/hppa/libc.abilist b/sysdeps/unix/sysv/linux/hppa/libc.abilist index 88b01c2e75..9baaa34b62 100644 --- a/sysdeps/unix/sysv/linux/hppa/libc.abilist +++ b/sysdeps/unix/sysv/linux/hppa/libc.abilist @@ -1880,6 +1880,7 @@ GLIBC_2.28 thrd_current F GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F +GLIBC_2.29 membarrier F GLIBC_2.29 posix_spawn_file_actions_addchdir_np F GLIBC_2.3 __ctype_b_loc F GLIBC_2.3 __ctype_tolower_loc F diff --git a/sysdeps/unix/sysv/linux/i386/libc.abilist b/sysdeps/unix/sysv/linux/i386/libc.abilist index 6d02f31612..1b91873f65 100644 --- a/sysdeps/unix/sysv/linux/i386/libc.abilist +++ b/sysdeps/unix/sysv/linux/i386/libc.abilist @@ -2045,6 +2045,7 @@ GLIBC_2.28 thrd_current F GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F +GLIBC_2.29 membarrier F GLIBC_2.29 posix_spawn_file_actions_addchdir_np F GLIBC_2.3 __ctype_b_loc F GLIBC_2.3 __ctype_tolower_loc F diff --git a/sysdeps/unix/sysv/linux/ia64/libc.abilist b/sysdeps/unix/sysv/linux/ia64/libc.abilist index 4249712611..1b3465c6f4 100644 --- a/sysdeps/unix/sysv/linux/ia64/libc.abilist +++ b/sysdeps/unix/sysv/linux/ia64/libc.abilist @@ -1914,6 +1914,7 @@ GLIBC_2.28 thrd_current F GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F +GLIBC_2.29 membarrier F GLIBC_2.29 posix_spawn_file_actions_addchdir_np F GLIBC_2.3 __ctype_b_loc F GLIBC_2.3 __ctype_tolower_loc F diff --git a/sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist b/sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist index d47b808862..db22eb4a12 100644 --- a/sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist +++ b/sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist @@ -124,6 +124,7 @@ GLIBC_2.28 thrd_current F GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F +GLIBC_2.29 membarrier F GLIBC_2.29 posix_spawn_file_actions_addchdir_np F GLIBC_2.4 _Exit F GLIBC_2.4 _IO_2_1_stderr_ D 0x98 diff --git a/sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist b/sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist index d5e38308be..5a93b98e76 100644 --- a/sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist +++ b/sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist @@ -1989,6 +1989,7 @@ GLIBC_2.28 thrd_current F GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F +GLIBC_2.29 membarrier F GLIBC_2.29 posix_spawn_file_actions_addchdir_np F GLIBC_2.3 __ctype_b_loc F GLIBC_2.3 __ctype_tolower_loc F diff --git a/sysdeps/unix/sysv/linux/microblaze/libc.abilist b/sysdeps/unix/sysv/linux/microblaze/libc.abilist index 8596b84399..0142b573ca 100644 --- a/sysdeps/unix/sysv/linux/microblaze/libc.abilist +++ b/sysdeps/unix/sysv/linux/microblaze/libc.abilist @@ -2130,4 +2130,5 @@ GLIBC_2.28 thrd_current F GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F +GLIBC_2.29 membarrier F GLIBC_2.29 posix_spawn_file_actions_addchdir_np F diff --git a/sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist b/sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist index 88e0f896d5..a84db6e3f4 100644 --- a/sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist +++ b/sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist @@ -1967,6 +1967,7 @@ GLIBC_2.28 thrd_current F GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F +GLIBC_2.29 membarrier F GLIBC_2.29 posix_spawn_file_actions_addchdir_np F GLIBC_2.3 __ctype_b_loc F GLIBC_2.3 __ctype_tolower_loc F diff --git a/sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist b/sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist index aff7462c34..aed9d20fa5 100644 --- a/sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist +++ b/sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist @@ -1965,6 +1965,7 @@ GLIBC_2.28 thrd_current F GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F +GLIBC_2.29 membarrier F GLIBC_2.29 posix_spawn_file_actions_addchdir_np F GLIBC_2.3 __ctype_b_loc F GLIBC_2.3 __ctype_tolower_loc F diff --git a/sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist b/sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist index 71d82444aa..d117ad299e 100644 --- a/sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist +++ b/sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist @@ -1973,6 +1973,7 @@ GLIBC_2.28 thrd_current F GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F +GLIBC_2.29 membarrier F GLIBC_2.29 posix_spawn_file_actions_addchdir_np F GLIBC_2.3 __ctype_b_loc F GLIBC_2.3 __ctype_tolower_loc F diff --git a/sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist b/sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist index de6c53d293..3598a33eca 100644 --- a/sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist +++ b/sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist @@ -1968,6 +1968,7 @@ GLIBC_2.28 thrd_current F GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F +GLIBC_2.29 membarrier F GLIBC_2.29 posix_spawn_file_actions_addchdir_np F GLIBC_2.3 __ctype_b_loc F GLIBC_2.3 __ctype_tolower_loc F diff --git a/sysdeps/unix/sysv/linux/nios2/libc.abilist b/sysdeps/unix/sysv/linux/nios2/libc.abilist index e724bab9fb..7ce4aa1841 100644 --- a/sysdeps/unix/sysv/linux/nios2/libc.abilist +++ b/sysdeps/unix/sysv/linux/nios2/libc.abilist @@ -2171,4 +2171,5 @@ GLIBC_2.28 thrd_current F GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F +GLIBC_2.29 membarrier F GLIBC_2.29 posix_spawn_file_actions_addchdir_np F diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist index e9ecbccb71..50fedd0fab 100644 --- a/sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist +++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist @@ -1993,6 +1993,7 @@ GLIBC_2.28 thrd_current F GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F +GLIBC_2.29 membarrier F GLIBC_2.29 posix_spawn_file_actions_addchdir_np F GLIBC_2.3 __ctype_b_loc F GLIBC_2.3 __ctype_tolower_loc F diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist index da83ea6028..86862b6e16 100644 --- a/sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist +++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist @@ -1997,6 +1997,7 @@ GLIBC_2.28 thrd_current F GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F +GLIBC_2.29 membarrier F GLIBC_2.29 posix_spawn_file_actions_addchdir_np F GLIBC_2.3 __ctype_b_loc F GLIBC_2.3 __ctype_tolower_loc F diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc64/libc-le.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc64/libc-le.abilist index 4535b40d15..57c0e67347 100644 --- a/sysdeps/unix/sysv/linux/powerpc/powerpc64/libc-le.abilist +++ b/sysdeps/unix/sysv/linux/powerpc/powerpc64/libc-le.abilist @@ -2228,4 +2228,5 @@ GLIBC_2.28 thrd_current F GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F +GLIBC_2.29 membarrier F GLIBC_2.29 posix_spawn_file_actions_addchdir_np F diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc64/libc.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc64/libc.abilist index 65725de4f0..7658e2c2b9 100644 --- a/sysdeps/unix/sysv/linux/powerpc/powerpc64/libc.abilist +++ b/sysdeps/unix/sysv/linux/powerpc/powerpc64/libc.abilist @@ -123,6 +123,7 @@ GLIBC_2.28 thrd_current F GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F +GLIBC_2.29 membarrier F GLIBC_2.29 posix_spawn_file_actions_addchdir_np F GLIBC_2.3 _Exit F GLIBC_2.3 _IO_2_1_stderr_ D 0xe0 diff --git a/sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist b/sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist index bbb3c4a8e7..f4020e881d 100644 --- a/sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist +++ b/sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist @@ -2100,4 +2100,5 @@ GLIBC_2.28 thrd_current F GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F +GLIBC_2.29 membarrier F GLIBC_2.29 posix_spawn_file_actions_addchdir_np F diff --git a/sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist b/sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist index e85ac2a178..9476770e5b 100644 --- a/sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist +++ b/sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist @@ -2002,6 +2002,7 @@ GLIBC_2.28 thrd_current F GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F +GLIBC_2.29 membarrier F GLIBC_2.29 posix_spawn_file_actions_addchdir_np F GLIBC_2.3 __ctype_b_loc F GLIBC_2.3 __ctype_tolower_loc F diff --git a/sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist b/sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist index d56931022c..ee9a87659f 100644 --- a/sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist +++ b/sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist @@ -1908,6 +1908,7 @@ GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F GLIBC_2.29 __fentry__ F +GLIBC_2.29 membarrier F GLIBC_2.29 posix_spawn_file_actions_addchdir_np F GLIBC_2.3 __ctype_b_loc F GLIBC_2.3 __ctype_tolower_loc F diff --git a/sysdeps/unix/sysv/linux/sh/libc.abilist b/sysdeps/unix/sysv/linux/sh/libc.abilist index ff939a15c4..c2dd2a5f7e 100644 --- a/sysdeps/unix/sysv/linux/sh/libc.abilist +++ b/sysdeps/unix/sysv/linux/sh/libc.abilist @@ -1884,6 +1884,7 @@ GLIBC_2.28 thrd_current F GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F +GLIBC_2.29 membarrier F GLIBC_2.29 posix_spawn_file_actions_addchdir_np F GLIBC_2.3 __ctype_b_loc F GLIBC_2.3 __ctype_tolower_loc F diff --git a/sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist b/sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist index 64fa9e10a5..55c6496b96 100644 --- a/sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist +++ b/sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist @@ -1996,6 +1996,7 @@ GLIBC_2.28 thrd_current F GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F +GLIBC_2.29 membarrier F GLIBC_2.29 posix_spawn_file_actions_addchdir_np F GLIBC_2.3 __ctype_b_loc F GLIBC_2.3 __ctype_tolower_loc F diff --git a/sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist b/sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist index db909d1506..a0e7f2f221 100644 --- a/sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist +++ b/sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist @@ -1937,6 +1937,7 @@ GLIBC_2.28 thrd_current F GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F +GLIBC_2.29 membarrier F GLIBC_2.29 posix_spawn_file_actions_addchdir_np F GLIBC_2.3 __ctype_b_loc F GLIBC_2.3 __ctype_tolower_loc F diff --git a/sysdeps/unix/sysv/linux/sys/membarrier.h b/sysdeps/unix/sysv/linux/sys/membarrier.h new file mode 100644 index 0000000000..4c3e6164f7 --- /dev/null +++ b/sysdeps/unix/sysv/linux/sys/membarrier.h @@ -0,0 +1,55 @@ +/* Memory barriers. + Copyright (C) 2018 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <http://www.gnu.org/licenses/>. */ + +#ifndef _SYS_MEMBARRIER_H +#define _SYS_MEMBARRIER_H 1 + +#include <features.h> + +__BEGIN_DECLS + +/* Perform a memory barrier on multiple threads. */ +int membarrier (int __op, int __flags) __THROW; + +__END_DECLS + +/* Obtain the definitions of the MEMBARRIER_CMD_* constants. */ + +#include <linux/version.h> +#if LINUX_VERSION_CODE >= KERNEL_VERSION(4, 16, 0) +# include <linux/membarrier.h> +#else + +/* Definitions from Linux 4.16 follow. */ + +enum membarrier_cmd +{ + MEMBARRIER_CMD_QUERY = 0, + MEMBARRIER_CMD_SHARED = 1, + MEMBARRIER_CMD_GLOBAL = 1, + MEMBARRIER_CMD_GLOBAL_EXPEDITED = 2, + MEMBARRIER_CMD_REGISTER_GLOBAL_EXPEDITED = 4, + MEMBARRIER_CMD_PRIVATE_EXPEDITED = 8, + MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED = 16, + MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE = 32, + MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_SYNC_CORE = 64, +}; + +#endif + +#endif /* _SYS_MEMBARRIER_H */ diff --git a/sysdeps/unix/sysv/linux/syscalls.list b/sysdeps/unix/sysv/linux/syscalls.list index e24ea29e35..3deee2bc19 100644 --- a/sysdeps/unix/sysv/linux/syscalls.list +++ b/sysdeps/unix/sysv/linux/syscalls.list @@ -112,3 +112,4 @@ process_vm_writev EXTRA process_vm_writev i:ipipii process_vm_writev memfd_create EXTRA memfd_create i:si memfd_create pkey_alloc EXTRA pkey_alloc i:ii pkey_alloc pkey_free EXTRA pkey_free i:i pkey_free +membarrier EXTRA membarrier i:ii membarrier diff --git a/sysdeps/unix/sysv/linux/tst-membarrier.c b/sysdeps/unix/sysv/linux/tst-membarrier.c new file mode 100644 index 0000000000..aeaccad578 --- /dev/null +++ b/sysdeps/unix/sysv/linux/tst-membarrier.c @@ -0,0 +1,61 @@ +/* Tests for the membarrier function. + Copyright (C) 2017 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <http://www.gnu.org/licenses/>. */ + +#include <errno.h> +#include <stdio.h> +#include <support/check.h> +#include <sys/membarrier.h> + +static int +do_test (void) +{ + int supported = membarrier (MEMBARRIER_CMD_QUERY, 0); + if (supported == -1) + { + if (errno == ENOSYS) + FAIL_UNSUPPORTED ("membarrier system call not implemented"); + else + FAIL_EXIT1 ("membarrier: %m"); + } + + if ((supported & MEMBARRIER_CMD_GLOBAL) == 0) + FAIL_UNSUPPORTED ("global memory barriers not supported"); + + puts ("info: membarrier is supported on this system"); + + /* The global barrier is always implemented. */ + TEST_COMPARE (supported & MEMBARRIER_CMD_GLOBAL, MEMBARRIER_CMD_GLOBAL); + TEST_COMPARE (membarrier (MEMBARRIER_CMD_GLOBAL, 0), 0); + + /* If the private-expedited barrier is advertised, execute it after + registering the intent. */ + if (supported & MEMBARRIER_CMD_PRIVATE_EXPEDITED) + { + TEST_COMPARE (supported & MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED, + MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED); + TEST_COMPARE (membarrier (MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED, 0), + 0); + TEST_COMPARE (membarrier (MEMBARRIER_CMD_PRIVATE_EXPEDITED, 0), 0); + } + else + puts ("info: MEMBARRIER_CMD_PRIVATE_EXPEDITED not supported"); + + return 0; +} + +#include <support/test-driver.c> diff --git a/sysdeps/unix/sysv/linux/x86_64/64/libc.abilist b/sysdeps/unix/sysv/linux/x86_64/64/libc.abilist index 3b175f104b..f097eb86e2 100644 --- a/sysdeps/unix/sysv/linux/x86_64/64/libc.abilist +++ b/sysdeps/unix/sysv/linux/x86_64/64/libc.abilist @@ -1895,6 +1895,7 @@ GLIBC_2.28 thrd_current F GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F +GLIBC_2.29 membarrier F GLIBC_2.29 posix_spawn_file_actions_addchdir_np F GLIBC_2.3 __ctype_b_loc F GLIBC_2.3 __ctype_tolower_loc F diff --git a/sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist b/sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist index 1b57710477..afb1a196fa 100644 --- a/sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist +++ b/sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist @@ -2146,4 +2146,5 @@ GLIBC_2.28 thrd_current F GLIBC_2.28 thrd_equal F GLIBC_2.28 thrd_sleep F GLIBC_2.28 thrd_yield F +GLIBC_2.29 membarrier F GLIBC_2.29 posix_spawn_file_actions_addchdir_np F