Message ID | 1483446882-2593-1-git-send-email-tuliom@linux.vnet.ibm.com
State      | New
On Tue, 2017-01-03 at 10:34 -0200, Tulio Magno Quites Machado Filho wrote:
> Changes since v3:
>  - Use atomics to update adapt_count in __lll_unlock_elision.
>  - Improved the source code comments in __lll_unlock_elision to mention
>    the mutex destroy requirements.
>
> Changes since v2:
>  - Fixed coding style.
>
> Changes since v1:
>  - Removed references to data race by the actual error: write-after-destroy.
>  - Enclosed adapt_count accesses in C11 atomics.
>
> --- 8< ---

This is okay, thanks.

For the next time, or when you change non-arch-specific code, please remember to document memory orders; a brief comment is often sufficient but reveals the intent behind a specific bit of implementation.  For example:

> diff --git a/sysdeps/unix/sysv/linux/powerpc/elision-lock.c b/sysdeps/unix/sysv/linux/powerpc/elision-lock.c
> index 4589491..044803a 100644
> --- a/sysdeps/unix/sysv/linux/powerpc/elision-lock.c
> +++ b/sysdeps/unix/sysv/linux/powerpc/elision-lock.c
> @@ -45,7 +45,7 @@
>  int
>  __lll_lock_elision (int *lock, short *adapt_count, EXTRAARG int pshared)
>  {
> -  if (*adapt_count > 0)
> +  if (atomic_load_relaxed (adapt_count) > 0)

Here, you could have just added:
  /* adapt_count is accessed concurrently but is just a hint.  Thus,
     use atomic accesses but relaxed MO is sufficient.  */

This helps others to more quickly understand what was intended.  This isn't critical in this case because we have well-documented similar code for other architectures, but generally it is better to document why a choice was made if it is not immediately obvious, or cannot be checked for consistency based on just local information (e.g., the source code lines around the statement in question); concurrent code obviously can't be cross-checked with just local information...
Torvald Riegel <triegel@redhat.com> writes:

> For the next time, or when you change non-arch-specific code, please
> remember to document memory orders; a brief comment is often sufficient
> but reveals the intent behind a specific bit of implementation.  For
> example:
>
>> diff --git a/sysdeps/unix/sysv/linux/powerpc/elision-lock.c b/sysdeps/unix/sysv/linux/powerpc/elision-lock.c
>> index 4589491..044803a 100644
>> --- a/sysdeps/unix/sysv/linux/powerpc/elision-lock.c
>> +++ b/sysdeps/unix/sysv/linux/powerpc/elision-lock.c
>> @@ -45,7 +45,7 @@
>>  int
>>  __lll_lock_elision (int *lock, short *adapt_count, EXTRAARG int pshared)
>>  {
>> -  if (*adapt_count > 0)
>> +  if (atomic_load_relaxed (adapt_count) > 0)
>
> Here, you could have just added:
>   /* adapt_count is accessed concurrently but is just a hint.  Thus,
>      use atomic accesses but relaxed MO is sufficient.  */
>
> This helps others to more quickly understand what was intended.  This
> isn't critical in this case because we have well-documented similar code
> for other architectures, but generally it is better to document why a
> choice was made if it is not immediately obvious, or cannot be checked
> for consistency based on just local information (e.g., the source code
> lines around the statement in question); concurrent code obviously can't
> be cross-checked with just local information...

I agree with you.  I added this comment in this patch and pushed it as e9a96ea.

Thanks!
On Jan 11 2017, "Ulrich Weigand" <uweigand@de.ibm.com> wrote:

> Tulio Magno Quites Machado Filho wrote:
>
>> +      /* Update adapt_count in the critical section to prevent a
>> +	 write-after-destroy error as mentioned in BZ 20822.  The
>> +	 following update of adapt_count has to be contained within
>> +	 the critical region of the fall-back lock in order to not violate
>> +	 the mutex destruction requirements.  */
>> +      short __tmp = atomic_load_relaxed (adapt_count);
>> +      if (__tmp > 0)
>> +	atomic_store_relaxed (adapt_count, __tmp--);
>
> Doesn't this have to be --__tmp to have any effect?

It should be just __tmp - 1.

Andreas.
diff --git a/sysdeps/unix/sysv/linux/powerpc/elision-lock.c b/sysdeps/unix/sysv/linux/powerpc/elision-lock.c
index 4589491..044803a 100644
--- a/sysdeps/unix/sysv/linux/powerpc/elision-lock.c
+++ b/sysdeps/unix/sysv/linux/powerpc/elision-lock.c
@@ -45,7 +45,7 @@
 int
 __lll_lock_elision (int *lock, short *adapt_count, EXTRAARG int pshared)
 {
-  if (*adapt_count > 0)
+  if (atomic_load_relaxed (adapt_count) > 0)
     {
       goto use_lock;
     }
@@ -67,7 +67,8 @@ __lll_lock_elision (int *lock, short *adapt_count, EXTRAARG int pshared)
 	  if (_TEXASRU_FAILURE_PERSISTENT (__builtin_get_texasru ()))
 	    {
 	      if (aconf.skip_lock_internal_abort > 0)
-		*adapt_count = aconf.skip_lock_internal_abort;
+		atomic_store_relaxed (adapt_count,
+				      aconf.skip_lock_internal_abort);
 	      goto use_lock;
 	    }
 	}
@@ -75,7 +76,8 @@ __lll_lock_elision (int *lock, short *adapt_count, EXTRAARG int pshared)
 
   /* Fall back to locks for a bit if retries have been exhausted */
   if (aconf.try_tbegin > 0 && aconf.skip_lock_out_of_tbegin_retries > 0)
-    *adapt_count = aconf.skip_lock_out_of_tbegin_retries;
+    atomic_store_relaxed (adapt_count,
+			  aconf.skip_lock_out_of_tbegin_retries);
 
 use_lock:
   return LLL_LOCK ((*lock), pshared);
diff --git a/sysdeps/unix/sysv/linux/powerpc/elision-trylock.c b/sysdeps/unix/sysv/linux/powerpc/elision-trylock.c
index 1e5cbe8..ed244d3 100644
--- a/sysdeps/unix/sysv/linux/powerpc/elision-trylock.c
+++ b/sysdeps/unix/sysv/linux/powerpc/elision-trylock.c
@@ -34,7 +34,7 @@ __lll_trylock_elision (int *futex, short *adapt_count)
     __libc_tabort (_ABORT_NESTED_TRYLOCK);
 
   /* Only try a transaction if it's worth it.  */
-  if (*adapt_count > 0)
+  if (atomic_load_relaxed (adapt_count) > 0)
     {
       goto use_lock;
     }
@@ -49,7 +49,7 @@ __lll_trylock_elision (int *futex, short *adapt_count)
 	  __libc_tend (0);
 
 	  if (aconf.skip_lock_busy > 0)
-	    *adapt_count = aconf.skip_lock_busy;
+	    atomic_store_relaxed (adapt_count, aconf.skip_lock_busy);
 	}
       else
 	{
@@ -59,7 +59,8 @@ __lll_trylock_elision (int *futex, short *adapt_count)
 	     result in another failure.  Use normal locking now and
 	     for the next couple of calls.  */
 	  if (aconf.skip_trylock_internal_abort > 0)
-	    *adapt_count = aconf.skip_trylock_internal_abort;
+	    atomic_store_relaxed (adapt_count,
+				  aconf.skip_trylock_internal_abort);
 	}
     }
diff --git a/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c b/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
index 6f45a9c..759c146 100644
--- a/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
+++ b/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
@@ -28,13 +28,16 @@ __lll_unlock_elision (int *lock, short *adapt_count, int pshared)
     __libc_tend (0);
   else
     {
-      lll_unlock ((*lock), pshared);
+      /* Update adapt_count in the critical section to prevent a
+	 write-after-destroy error as mentioned in BZ 20822.  The
+	 following update of adapt_count has to be contained within
+	 the critical region of the fall-back lock in order to not violate
+	 the mutex destruction requirements.  */
+      short __tmp = atomic_load_relaxed (adapt_count);
+      if (__tmp > 0)
+	atomic_store_relaxed (adapt_count, __tmp--);
 
-      /* Update the adapt count AFTER completing the critical section.
-	 Doing this here prevents unneeded stalling when entering
-	 a critical section.  Saving about 8% runtime on P8.  */
-      if (*adapt_count > 0)
-	(*adapt_count)--;
+      lll_unlock ((*lock), pshared);
     }
   return 0;
 }