Patchwork Fix __lll_timedlock_wait busy-wait issue

login
register
mail settings
Submitter bniebuhr@efjohnson.com
Date March 21, 2014, 1:50 p.m.
Message ID <1395409800-4457-1-git-send-email-bniebuhr@efjohnson.com>
Download mbox | patch
Permalink /patch/332656/
State New
Headers show

Comments

bniebuhr@efjohnson.com - March 21, 2014, 1:50 p.m.
From: Brian Niebuhr <bniebuhr@efjohnson.com>

__lll_timedlock_wait has a bug that is exposed in these conditions:

1. Thread 1 acquires the lock
2. Thread 2 attempts to acquire the lock and waits via futex syscall
3. Thread 1 unlocks the lock and wakes the waiting thread
4. Thread 1 re-aquires the lock before thread 2 gets the CPU

What happens in this case is that the futex value is set to '1' since
Thread 1 reaquired the lock through the fast path (since it had just been
released).  The problem is that Thread 2 is in the loop in
__lll_timedlock_wait and it expects the futex value to be '2'.
atomic_compare_and_exchange_bool_acq only sets the futex value to '2' if
it is already set to '0', and since Thread 1 reaquired the lock, the
futex remains set to '1'.

When Thread 2 attempts to wait on the futex, the operating system returns
-EWOULDBLOCK since the futex value is not '2'.  This causes a busy wait
condition where Thread 2 continuously attempts to wait on the futex and
the kernel immediately returns -EWOULDBLOCK.  This continues until Thread
1 releases the lock.

The fix is to use atomic_exchange_acq instead of
atomic_compare_and_exchange_bool_acq which will force the futex value to
'2' on each loop iteration.  This change makes uClibc line up with
glibc's implementation.

Signed-off-by: Brian Niebuhr <bniebuhr@efjohnson.com>
---
 libpthread/nptl/sysdeps/unix/sysv/linux/arm/lowlevellock.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)
Maxim Kuvyrkov - March 27, 2014, 8:41 p.m.
[Re-sending from subscribed address.]

On Mar 22, 2014, at 2:50 AM, bniebuhr@efjohnson.com wrote:

> From: Brian Niebuhr <bniebuhr@efjohnson.com>
> 
> __lll_timedlock_wait has a bug that is exposed in these conditions:
> 
> 1. Thread 1 acquires the lock
> 2. Thread 2 attempts to acquire the lock and waits via futex syscall
> 3. Thread 1 unlocks the lock and wakes the waiting thread
> 4. Thread 1 re-aquires the lock before thread 2 gets the CPU
> 
> What happens in this case is that the futex value is set to '1' since
> Thread 1 reaquired the lock through the fast path (since it had just been
> released).  The problem is that Thread 2 is in the loop in
> __lll_timedlock_wait and it expects the futex value to be '2'.
> atomic_compare_and_exchange_bool_acq only sets the futex value to '2' if
> it is already set to '0', and since Thread 1 reaquired the lock, the
> futex remains set to '1'.
> 
> When Thread 2 attempts to wait on the futex, the operating system returns
> -EWOULDBLOCK since the futex value is not '2'.  This causes a busy wait
> condition where Thread 2 continuously attempts to wait on the futex and
> the kernel immediately returns -EWOULDBLOCK.  This continues until Thread
> 1 releases the lock.
> 
> The fix is to use atomic_exchange_acq instead of
> atomic_compare_and_exchange_bool_acq which will force the futex value to
> '2' on each loop iteration.

You know, I started this reply as a you-are-wrong-here-is-why, but after looking at both glibc and uclibc, I agree that you are correct.  Just to reiterate for others, the problem here is not correctness, but busy-wait instead of sleep under certain conditions.

> This change makes uClibc line up with
> glibc's implementation.

This problem is fixed in glibc's ./nptl/sysdeps/unix/sysv/linux/lowlevellock.c, but still present in glibc's ARM and sparc32 lowlevellock.c.  Do you plan to fix these too?

Interestingly, glibc's hppa lowlevellock.c has an alternative solution to this problem.  I can't quite figure out which solution is faster, so would appreciate a second pair of eyes.  My feeling is that the generic (the one in your patch) version is faster.

Thank you,

--
Maxim Kuvyrkov
www.linaro.org

> 
> Signed-off-by: Brian Niebuhr <bniebuhr@efjohnson.com>
> ---
> libpthread/nptl/sysdeps/unix/sysv/linux/arm/lowlevellock.c | 6 +-----
> 1 file changed, 1 insertion(+), 5 deletions(-)
> 
> diff --git a/libpthread/nptl/sysdeps/unix/sysv/linux/arm/lowlevellock.c b/libpthread/nptl/sysdeps/unix/sysv/linux/arm/lowlevellock.c
> index af864b3..4f7e890 100644
> --- a/libpthread/nptl/sysdeps/unix/sysv/linux/arm/lowlevellock.c
> +++ b/libpthread/nptl/sysdeps/unix/sysv/linux/arm/lowlevellock.c
> @@ -60,10 +60,7 @@ __lll_timedlock_wait (int *futex, const struct timespec *abstime, int private)
>    return EINVAL;
> 
>  /* Upgrade the lock.  */
> -  if (atomic_exchange_acq (futex, 2) == 0)
> -    return 0;
> -
> -  do
> +  while (atomic_exchange_acq (futex, 2) != 0)
>    {
>      struct timeval tv;
> 
> @@ -86,7 +83,6 @@ __lll_timedlock_wait (int *futex, const struct timespec *abstime, int private)
>      // XYZ: Lost the lock to check whether it was private.
>      lll_futex_timed_wait (futex, 2, &rt, private);
>    }
> -  while (atomic_compare_and_exchange_bool_acq (futex, 2, 0) != 0);
> 
>  return 0;
> }
> -- 
> 1.8.3.2
> 
> _______________________________________________
> uClibc mailing list
> uClibc@uclibc.org
> http://lists.busybox.net/mailman/listinfo/uclibc
Joseph S. Myers - March 27, 2014, 10:01 p.m.
I don't know how this might relate to 
<https://sourceware.org/bugzilla/show_bug.cgi?id=15119> (see 
<https://sourceware.org/ml/libc-ports/2013-01/msg00084.html> and 
<https://sourceware.org/ml/libc-ports/2013-02/msg00021.html> and the rest 
of that thread).  But my preference for how to address this is definitely 
to move to unifying lowlevellock.[ch] files across as many architectures 
as possible - which requires someone to understand the differences and 
produce a careful analysis that shows what the best form for generic files 
is and what cases actually require architecture-specific files to override 
those generic files (preferably overriding only the bits that need 
overriding).
Joseph S. Myers - March 27, 2014, 10:19 p.m.
On Fri, 28 Mar 2014, Maxim Kuvyrkov wrote:

> On Mar 28, 2014, at 11:01 AM, Joseph S. Myers <joseph@codesourcery.com> wrote:
> 
> > I don't know how this might relate to 
> > <https://sourceware.org/bugzilla/show_bug.cgi?id=15119> (see 
> > <https://sourceware.org/ml/libc-ports/2013-01/msg00084.html> and 
> > <https://sourceware.org/ml/libc-ports/2013-02/msg00021.html> and the rest 
> > of that thread).  But my preference for how to address this is definitely 
> > to move to unifying lowlevellock.[ch] files across as many architectures 
> > as possible - which requires someone to understand the differences and 
> > produce a careful analysis that shows what the best form for generic files 
> > is and what cases actually require architecture-specific files to override 
> > those generic files (preferably overriding only the bits that need 
> > overriding).
> 
> Yeap, it's the same issue in the PR and same solution as in this thread.  
> Unfortunately, the previous discussion veered off towards sparc away 
> from ARM and got forgotten.

The present thread is specifically discussing lowlevellock.c, but Carlos 
suggested in the previous discussion that the real issue was in 
__lll_timedlock in lowlevellock.h.  I think both files need unification 
across architectures.
Brian Niebuhr - March 28, 2014, 2:26 p.m.
> From: Joseph Myers [mailto:joseph@codesourcery.com]
> On Fri, 28 Mar 2014, Maxim Kuvyrkov wrote:
>
> > On Mar 28, 2014, at 11:01 AM, Joseph S. Myers
> <joseph@codesourcery.com> wrote:
> >
> > > I don't know how this might relate to
> > > <https://sourceware.org/bugzilla/show_bug.cgi?id=15119> (see
> > > <https://sourceware.org/ml/libc-ports/2013-01/msg00084.html> and
> > > <https://sourceware.org/ml/libc-ports/2013-02/msg00021.html> and the
> rest
> > > of that thread).  But my preference for how to address this is definitely
> > > to move to unifying lowlevellock.[ch] files across as many architectures
> > > as possible - which requires someone to understand the differences and
> > > produce a careful analysis that shows what the best form for generic files
> > > is and what cases actually require architecture-specific files to override
> > > those generic files (preferably overriding only the bits that need
> > > overriding).
> >
> > Yeap, it's the same issue in the PR and same solution as in this thread.
> > Unfortunately, the previous discussion veered off towards sparc away
> > from ARM and got forgotten.
>
> The present thread is specifically discussing lowlevellock.c, but Carlos
> suggested in the previous discussion that the real issue was in
> __lll_timedlock in lowlevellock.h.  I think both files need unification
> across architectures.

On the topic of the original patch I submitted to uclibc:  I can certainly submit a similar patch to glibc, but it sounds as if people would prefer to refactor those files instead.  If that's the case, I don't feel that I know the code well enough to safely refactor it and merge all of the architectures, so I will probably let someone more knowledgeable do that.  However if you all would like me to submit a patch to __lll_timedlock_wait in glibc instead, just let me know.
This e-mail transmission, and any documents, files or previous e-mail messages attached to it, may contain confidential information. If you are not the intended recipient, or a person responsible for delivering it to the intended recipient, you are hereby notified that any disclosure, distribution, review, copy or use of any of the information contained in or attached to this message is STRICTLY PROHIBITED. If you have received this transmission in error, please immediately notify us by reply e-mail, and destroy the original transmission and its attachments without reading them or saving them to disk. Thank you.

Patch

diff --git a/libpthread/nptl/sysdeps/unix/sysv/linux/arm/lowlevellock.c b/libpthread/nptl/sysdeps/unix/sysv/linux/arm/lowlevellock.c
index af864b3..4f7e890 100644
--- a/libpthread/nptl/sysdeps/unix/sysv/linux/arm/lowlevellock.c
+++ b/libpthread/nptl/sysdeps/unix/sysv/linux/arm/lowlevellock.c
@@ -60,10 +60,7 @@  __lll_timedlock_wait (int *futex, const struct timespec *abstime, int private)
     return EINVAL;
 
   /* Upgrade the lock.  */
-  if (atomic_exchange_acq (futex, 2) == 0)
-    return 0;
-
-  do
+  while (atomic_exchange_acq (futex, 2) != 0)
     {
       struct timeval tv;
 
@@ -86,7 +83,6 @@  __lll_timedlock_wait (int *futex, const struct timespec *abstime, int private)
       // XYZ: Lost the lock to check whether it was private.
       lll_futex_timed_wait (futex, 2, &rt, private);
     }
-  while (atomic_compare_and_exchange_bool_acq (futex, 2, 0) != 0);
 
   return 0;
 }