
[v2] Add and use new glibc-internal futex API.

Message ID 1434575216.5250.204.camel@localhost.localdomain
State New

Commit Message

Torvald Riegel June 17, 2015, 9:06 p.m. UTC
This adds new functions for futex operations, starting with wait,
abstimed_wait, reltimed_wait, wake.  They add documentation and error
checking according to the current draft of the Linux kernel futex
manpage.

Waiting with absolute or relative timeouts is split into separate
functions.  This allows for removing a few cases of code duplication in
pthreads code, which uses absolute timeouts; also, it allows us to put
platform-specific code to go from an absolute to a relative timeout into
the platform-specific futex abstractions.  The latter is done by adding
lll_futex_abstimed_wait.  I expect that we will refactor this later on,
depending on how we do the lll_ parts.
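
To illustrate the conversion from an absolute to a relative timeout mentioned above, here is a minimal sketch.  It is not code from the patch: the helper name abs_to_rel_timeout and its signature are assumptions made for this example, and a real implementation would also have to read the appropriate clock itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical helper, not from the patch: convert an absolute deadline
   into a relative timeout, given the current time.  Returns false if the
   deadline has already passed, in which case a caller would report a
   timeout without waiting at all.  */
static bool
abs_to_rel_timeout (const struct timespec *abstime,
                    const struct timespec *now,
                    struct timespec *rel)
{
  /* This sketch expects normalized timespec values.  */
  assert (abstime->tv_nsec >= 0 && abstime->tv_nsec < 1000000000L);
  assert (now->tv_nsec >= 0 && now->tv_nsec < 1000000000L);

  rel->tv_sec = abstime->tv_sec - now->tv_sec;
  rel->tv_nsec = abstime->tv_nsec - now->tv_nsec;
  if (rel->tv_nsec < 0)
    {
      /* Borrow one second so that tv_nsec stays in [0, 1e9).  */
      rel->tv_nsec += 1000000000L;
      --rel->tv_sec;
    }
  /* A negative number of seconds means the deadline is in the past.  */
  return rel->tv_sec >= 0;
}
```

Keeping this step behind the platform-specific abstraction means that callers in the pthreads code only ever deal with absolute timeouts.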

Futex operations that can be canceled are also split out into separate
functions suffixed by "_cancelable".

There are separate versions for both Linux and NaCl; while they
currently differ only slightly, my expectation is that the separate
versions of lowlevellock-futex.h will eventually be merged into
futex-internal.h when we get to move the lll_ functions over to the new
futex API.
    
The sanity checks regarding whether shared futexes are supported abort
instead of returning errors because POSIX error specs don't really
consider that there could be no support for shared futexes.  Aborting is
better than returning an unspecified error or an error specified for a
different condition; only NaCl has no support for shared futexes.

This is a revision of
https://sourceware.org/ml/libc-alpha/2015-06/msg00284.html

I have transformed all the lll_futex_* uses that I'm aware of except the
following:
* Anything related to lowlevellock or mutexes.
* sparc-specific files: I'll send a follow-up patch so this can be
reviewed and tested separately.
* pthread condvar: I'll send a follow-up patch on top of my revised
condvar implementation.  This will use
futex_supports_exact_relative_timeouts() (as required by the
CLOCK_MONOTONIC clock setting).
* All tls.h files: Siddhesh wants to look at this area in more detail,
so I'll work with him to see how to best move this to the new API.

Interacting with futex words requires atomic accesses, which isn't done
by most of glibc's current futex callers.  I did not fix these in this
patch to keep the patch easier to review: Using the new futex API is in
most cases a pretty mechanical change, which I didn't want to obfuscate
by lots of other changes to atomics.  The core motivation behind this
patch is to add error handling and improve the internal futex API, not
to change any synchronization.
Nonetheless, using atomics where they are needed is on my list of things
to do, so don't worry :)
Specifically, this is already done in my new condvar implementation
and in the new semaphore.  It will get done for rwlock in the new
implementation I'm working on.  I also plan to update the barrier
implementation.  For TLS, this is something Siddhesh has on his radar,
AFAIK.  Adhemerval is working on a new cancellation scheme.

I kept the old semaphore code unchanged for now because I don't have a
testing setup for this ready.

Roland, okay for NaCl?  I decided to not try to "optimize" the
shared/private setting at data structure initialization time because I
didn't see a good way to specify the error conditions for the futex_*
functions then: We do want those to sanity check shared/private and not
just rely on the shared/private initialization to do the right thing --
but if we do that, we can as well transform private into shared if
that's actually necessary.
For support of shared mutexes, I have added sanity checks (see above).
The actual __ASSUME_FUTEX_* flags can be removed later, once we have tackled
all the remaining uses (lll_*, TLS, ...).
I'll leave it to you to merge NaCl lowlevellock-futex.h into
futex-internal.h because you can do the testing, and know which errors
the NaCl futex functions actually return.  I suppose doing so is a good
thing even though there might be some duplication with
lowlevellock-futex.h as long as that one still exists.

I'll adapt the SPARC version after this patch has been ack'ed.

Tested on x86_64-linux.


2015-06-17  Torvald Riegel  <triegel@redhat.com>

	* sysdeps/nptl/futex-internal.h: New file.
	* sysdeps/nacl/futex-internal.h: New file.
	* sysdeps/unix/sysv/linux/futex-internal.h: New file.
	* nptl/allocatestack.c (setxid_mark_thread): Use futex wrappers with
	error checking.
	(setxid_unmark_thread): Likewise.
	(__nptl_setxid): Likewise.
	(__wait_lookup_done): Likewise.
	* nptl/cancellation.c (__pthread_disable_asynccancel): Likewise.
	* nptl/nptl-init.c (sighandler_setxid): Likewise.
	* nptl/pthread_create.c (START_THREAD_DEFN): Likewise.
	* nptl/pthread_once.c (clear_once_control): Likewise.
	(__pthread_once_slow): Likewise.
	* nptl/pthread_rwlock_rdlock.c (__pthread_rwlock_rdlock_slow):
	Likewise.
	(__pthread_rwlock_rdlock): Likewise.
	* nptl/pthread_rwlock_timedrdlock.c (pthread_rwlock_timedrdlock):
	Likewise.
	* nptl/pthread_rwlock_timedwrlock.c (pthread_rwlock_timedwrlock):
	Likewise.
	* nptl/pthread_rwlock_tryrdlock.c (__pthread_rwlock_tryrdlock):
	Likewise.
	* nptl/pthread_rwlock_unlock.c (__pthread_rwlock_unlock): Likewise.
	* nptl/pthread_rwlock_wrlock.c (__pthread_rwlock_wrlock_slow):
	Likewise.
	* nptl/unregister-atfork.c (__unregister_atfork): Likewise.
	* sysdeps/nacl/exit-thread.h (__exit_thread): Likewise.
	* sysdeps/nptl/aio_misc.h (AIO_MISC_NOTIFY, AIO_MISC_WAIT): Likewise.
	* sysdeps/nptl/fork.c (__libc_fork): Likewise.
	* sysdeps/nptl/gai_misc.h (GAI_MISC_NOTIFY, GAI_MISC_WAIT): Likewise.
	* nptl/pthread_barrier_wait.c (pthread_barrier_wait): Likewise.
	* nptl/pthread_barrier_init.c (pthread_barrier_init): Add comments
	and abort if attribute initializer failed to sanitize inputs.
	* nptl/pthread_barrierattr_setpshared.c
	(pthread_barrierattr_setpshared): Add sanity check.
	* nptl/sem_init.c (futex_private_if_supported): Remove.
	(__new_sem_init): Adapt and add sanity check.
	* nptl/sem_post.c (futex_wake): Likewise.
	* nptl/sem_waitcommon.c (futex_abstimed_wait, futex_wake): Likewise.
	(do_futex_wait): Use futex wrappers with error checking.
	* nptl/sem_wait.c: Include lowlevellock.h.
	* nptl/sem_open.c (sem_open): Use FUTEX_SHARED.
	* sysdeps/nptl/lowlevellock-futex.h (lll_futex_abstimed_wait): New.
	* sysdeps/unix/sysv/linux/lowlevellock-futex.h
	(lll_futex_abstimed_wait): New.
	* sysdeps/nacl/lowlevellock-futex.h (lll_futex_abstimed_wait): New.

Comments

Roland McGrath June 17, 2015, 10:46 p.m. UTC | #1
> Waiting with absolute or relative timeouts is split into separate
> functions.  This allows for removing a few cases of code duplication in
> pthreads code, which uses absolute timeouts; also, it allows us to put
> platform-specific code to go from an absolute to a relative timeout into
> the platform-specific futex abstractions.  The latter is done by adding
> lll_futex_abstimed_wait.  I expect that we will refactor this later on,
> depending on how we do the lll_ parts.

I don't understand the motivation for adding lll_futex_abstimed_wait now
at all.  Its only users are in futex-internal.h implementations, which
are already OS-specific.  So why make the new files have identical
copies of the wrappers around the implementations living in the moribund
files?  What is the downside to simply having each futex-internal.h's
futex_abstimed_wait{,_cancelable} be the real implementation?

> There are separate versions for both Linux and NaCl; while they
> currently differ only slightly, my expectation is that the separate
> versions of lowlevellock-futex.h will eventually be merged into
> futex-internal.h when we get to move the lll_ functions over to the new
> futex API.

This is not just an expectation, it's the core plan and the whole effort
would be pointless if we failed to actually do this in the future.

> The sanity checks regarding whether shared futexes are supported abort
> instead of returning errors because POSIX error specs don't really
> consider that there could be no support for shared futexes.  Aborting is
> better than returning an unspecified error or an error specified for a
> different condition; only NaCl has no support for shared futexes.

This is not the right behavior.  It is indeed improper to use errno code
E in function F for condition X when POSIX specifies that F returns E
for condition Y.  It is also improper to use errno code E in function F
for condition X when POSIX specifies errno code E2 for condition X in
function F.  But it is entirely proper to return an errno code that
POSIX does not specify for a given function to diagnose a condition that
POSIX does not specify shall or may be diagnosed (see 2.3 Error Numbers).

For pthread_*attr_setpshared, the sensible thing to do for an
unsupported (but valid) value is to return ENOTSUP.  That's what these
functions should do for PTHREAD_PROCESS_SHARED on NaCl.  Conversely,
it's entirely reasonable not to account for any possibility of
PTHREAD_PROCESS_PRIVATE not being supported, because to request that
is to request the default behavior.

> Interacting with futex words requires atomic accesses, which isn't done
> by most of glibc's current futex callers. [...]

I certainly concur with leaving this until later.  What I think will be
reasonable to do eventually is to use the <stdatomic.h> type names in
all our internal interfaces (probably only atomic_int or atomic_uint
should be used for futex stuff).  When building with older compilers
that don't have those, our own internal headers can typedef the ones
we use to the simple types (or perhaps volatile-qualified ones?).

> Roland, okay for NaCl?  I decided to not try to "optimize" the
> shared/private setting at data structure initialization time because I
> didn't see a good way to specify the error conditions for the futex_*
> functions then: We do want those to sanity check shared/private and not
> just rely on the shared/private initialization to do the right thing --
> but if we do that, we can as well transform private into shared if
> that's actually necessary.

I don't follow your logic here.  Why do you think we want any
argument-validity checks applied inside internal interfaces?  It's
reasonable enough to have asserts inside the futex-internal.h
functions if you feel like it.  Those can only fire when other libc
code has a bug or there is memory clobberation.  Those are both cases
where it's nice to fail catastrophically rather than mysteriously, but
also both cases where we don't spend extra effort to diagnose the
(supposedly) impossible situations and certainly where we never
propagate such an error back to the user as a return or errno value.

What I have in mind is (names and signatures are straw men):

* In pthread_*attr_setpshared:
	error = futex_pshared (pshared, &iattr->pshared);
	if (__glibc_unlikely (error))
	  return error;
  This transforms the public ABI values PTHREAD_PROCESS_{PRIVATE,SHARED}
  in PSHARED into the internal form in IATTR->pshared.  On Linux the
  internal form would be FUTEX_PRIVATE_FLAG or 0, to be simply OR'd
  into the operation in the syscall.  On NaCl the internal form would
  just be some constant value, only stored at all to avoid confusing
  valgrind et al.

* In *_init:
	object->private = iattr->pshared;
  There's really no need for checks here, since it's undefined
  behavior to pass a bogus attributes object.  

  It can't be literally that alone since there has to be a default for
  a null attributes pointer.  If it seems worthwhile, we could have it
  sanity check the value in the attributes struct so as to detect
  clobbered or uninitialized attributes; that would be done by a
  futex-internal.h function to transfer attributes-field format into
  object-field format (it would be up to the particular implementation
  to decide if those are the same or different).  But I don't see a
  real need for that.

* In the actual uses:
	futex_wake (&obj->futex, nw, obj->private);
  That is, nothing special.  All the checking was done before.
  If the futex-internal.h implementation cares to do redundant checks
  for clobberation, it can assert inside there.  There should be no
  provision for returning errors to indicate the private field has an
  invalid value.
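
  The three steps above could be sketched roughly as follows.  This is
  only an illustration under stated assumptions: every type, constant and
  function name here is a hypothetical stand-in for the straw-man names in
  this message, and the variable platform_supports_shared merely models a
  platform without shared futexes (such as NaCl).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* All names are hypothetical; none of this is taken from the patch.  */
enum { PROCESS_PRIVATE = 0, PROCESS_SHARED = 1 };     /* public values */
enum { INTERNAL_SHARED = 0, INTERNAL_PRIVATE = 128 }; /* internal form */

struct attr { int pshared; };               /* holds the internal form */
struct object { int futex; int private; };

/* Set to false to model a platform without shared futexes.  */
static bool platform_supports_shared = true;

/* Step 1: validate the public value and convert it exactly once.  */
static int
futex_pshared (int pshared, int *internal)
{
  switch (pshared)
    {
    case PROCESS_PRIVATE:
      *internal = INTERNAL_PRIVATE;
      return 0;
    case PROCESS_SHARED:
      if (!platform_supports_shared)
        return ENOTSUP;
      *internal = INTERNAL_SHARED;
      return 0;
    default:
      return EINVAL;
    }
}

static int
attr_setpshared (struct attr *iattr, int pshared)
{
  return futex_pshared (pshared, &iattr->pshared);
}

/* Step 2: init just copies the internal form; a null attributes
   pointer selects the default, which is private.  */
static void
object_init (struct object *obj, const struct attr *iattr)
{
  obj->futex = 0;
  obj->private = iattr != NULL ? iattr->pshared : INTERNAL_PRIVATE;
}

/* Step 3: users pass the stored value straight through.  The assert
   only catches clobbered memory or libc bugs and reports nothing
   to the caller.  */
static int
object_futex_flags (const struct object *obj)
{
  assert (obj->private == INTERNAL_PRIVATE
          || obj->private == INTERNAL_SHARED);
  return obj->private;
}
```

  In this arrangement the only place that can fail is the attribute
  setter; initialization and the wake/wait paths do no validation.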

What don't you like about that (rough) picture?


Thanks,
Roland
Joseph Myers June 17, 2015, 10:55 p.m. UTC | #2
On Wed, 17 Jun 2015, Roland McGrath wrote:

> > Interacting with futex words requires atomic accesses, which isn't done
> > by most of glibc's current futex callers. [...]
> 
> I certainly concur with leaving this until later.  What I think will be
> reasonable to do eventually is to use the <stdatomic.h> type names in
> all our internal interfaces (probably only atomic_int or atomic_uint
> should be used for futex stuff).  When building with older compilers
> that don't have those, our own internal headers can typedef the ones
> we use to the simple types (or perhaps volatile-qualified ones?).

Those types imply seq_cst memory order for plain loads and stores, which 
isn't what's wanted in most places in glibc (one might expect many places 
presently using plain loads and stores actually want relaxed memory 
order).  Operations on _Atomic types may also bring in libatomic 
dependencies depending on processor support, causing obvious problems with 
circular dependencies (libatomic depends on libpthread).
Roland McGrath June 18, 2015, 1:49 a.m. UTC | #3
> Those types imply seq_cst memory order for plain loads and stores, which 
> isn't what's wanted in most places in glibc (one might expect many places 
> presently using plain loads and stores actually want relaxed memory 
> order).  Operations on _Atomic types may also bring in libatomic 
> dependencies depending on processor support, causing obvious problems with 
> circular dependencies (libatomic depends on libpthread).

OK.  I guess we'll have to go with our own typedef names when we clean this up.
Torvald Riegel June 18, 2015, 10:06 a.m. UTC | #4
On Wed, 2015-06-17 at 22:55 +0000, Joseph Myers wrote:
> On Wed, 17 Jun 2015, Roland McGrath wrote:
> 
> > > Interacting with futex words requires atomic accesses, which isn't done
> > > by most of glibc's current futex callers. [...]
> > 
> > I certainly concur with leaving this until later.  What I think will be
> > reasonable to do eventually is to use the <stdatomic.h> type names in
> > all our internal interfaces (probably only atomic_int or atomic_uint
> > should be used for futex stuff).  When building with older compilers
> > that don't have those, our own internal headers can typedef the ones
> > we use to the simple types (or perhaps volatile-qualified ones?).
> 
> Those types imply seq_cst memory order for plain loads and stores, which 
> isn't what's wanted in most places in glibc (one might expect many places 
> presently using plain loads and stores actually want relaxed memory 
> order).

I agree.  We should always have explicit memory orders for all atomic
accesses.  The default seq_cst MO is very likely to be just unnecessary
runtime overhead, yet if we pick a weaker MO it should always be a
conscious choice.
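
As a small illustration of the difference between the implicit default and an explicit memory order, here is a sketch using C11 <stdatomic.h>.  That header is used only to make the example self-contained; as noted in this thread, glibc may not be able to use these types internally, and the names done_word, read_default, read_acquire and signal_done are made up for this example.

```c
#include <assert.h>
#include <stdatomic.h>

/* Stands in for a futex word; zero-initialized.  */
static atomic_uint done_word;

/* A plain read of an _Atomic object is an implicit seq_cst load.  */
static unsigned int
read_default (void)
{
  return done_word;
}

/* The same read with the memory order spelled out, so that the
   choice is visible at the point of use.  */
static unsigned int
read_acquire (void)
{
  return atomic_load_explicit (&done_word, memory_order_acquire);
}

/* Set the word to 1 with release semantics; in this sketch it may
   be signalled only once.  */
static void
signal_done (void)
{
  unsigned int old
    = atomic_exchange_explicit (&done_word, 1u, memory_order_release);
  assert (old == 0);
  (void) old;
}
```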

> Operations on _Atomic types may also bring in libatomic 
> dependencies depending on processor support, causing obvious problems with 
> circular dependencies (libatomic depends on libpthread).

Agreed.  Though if looking at just the required functionality, we expect
the glibc atomics to always either use atomic HW instructions or a
kernel helper; if we'd end up using locks to synchronize because there's
no other support for the atomics we're using, we'd be doing something
wrong in glibc.
Torvald Riegel June 18, 2015, 2:13 p.m. UTC | #5
On Wed, 2015-06-17 at 15:46 -0700, Roland McGrath wrote:
> > Waiting with absolute or relative timeouts is split into separate
> > functions.  This allows for removing a few cases of code duplication in
> > pthreads code, which uses absolute timeouts; also, it allows us to put
> > platform-specific code to go from an absolute to a relative timeout into
> > the platform-specific futex abstractions.  The latter is done by adding
> > lll_futex_abstimed_wait.  I expect that we will refactor this later on,
> > depending on how we do the lll_ parts.
> 
> I don't understand the motivation for adding lll_futex_abstimed_wait now
> at all.  Its only users are in futex-internal.h implementations, which
> are already OS-specific.  So why make the new files have identical
> copies of the wrappers around the implementations living in the moribund
> files?  What is the downside to simply having each futex-internal.h's
> futex_abstimed_wait{,_cancelable} be the real implementation?

There's no downside, it's simply a matter of which incremental steps we
take, and how we organize the work.  Remember that futex-internal.h
wasn't OS-specific before.  There are three options now that it is:
(1) Merge in all of lowlevellock-futex.h into futex-internal.h.
lowlevellock-futex.h has to still remain until we have moved over all the
uses (lowlevellock itself is the largest and will take more time).
Thus, quite a bit of code duplication until we'd finished all of this.
(2) Move just lll_futex_abstimed_wait into futex-internal.h.  Then we
have a mix of lll_futex_* usage and direct futex syscall / NaCl uses in
futex-internal.h
(3) Do as we have now, and merge + remove lowlevellock-futex.h later on.

I didn't like option (2) too much because of the mixed merge / futex
usage, and thought (3) would be a cleaner intermediate step.  That's why
I picked that option.  I don't mind doing (2) now, or even (1), though
-- I think it's mostly a question of which steps / churn we prefer.

> > There are separate versions for both Linux and NaCl; while they
> > currently differ only slightly, my expectation is that the separate
> > versions of lowlevellock-futex.h will eventually be merged into
> > futex-internal.h when we get to move the lll_ functions over to the new
> > futex API.
> 
> This is not just an expectation, it's the core plan and the whole effort
> would be pointless if we failed to actually do this in the future.

That's what I'd think as well, but we have no patches for this yet, so
I didn't want to make foregone conclusions ... :)

> > The sanity checks regarding whether shared futexes are supported abort
> > instead of returning errors because POSIX error specs don't really
> > consider that there could be no support for shared futexes.  Aborting is
> > better than returning an unspecified error or an error specified for a
> > different condition; only NaCl has no support for shared futexes.
> 
> This is not the right behavior.  It is indeed improper to use errno code
> E in function F for condition X when POSIX specifies that F returns E
> for condition Y.  It is also improper to use errno code E in function F
> for condition X when POSIX specifies errno code E2 for condition X in
> function F.  But it is entirely proper to return an errno code that
> POSIX does not specify for a given function to diagnose a condition that
> POSIX does not specify shall or may be diagnosed (see 2.3 Error Numbers).

That's what POSIX allows, I agree.  But I thought that your motivation
was to make existing software robust on NaCl.  If you add a new error
code, how much of the existing software do you think will be prepared to
handle it sensibly, or check for it at all?  I don't remember seeing any
code that would do something useful on errors not specified by POSIX
(e.g., an "else { puts("unknown error"); exit(...); }" everywhere...).

Thus, if a new error code is simply most likely to be not acted upon,
that's not better than failing fast, and in a way that can't simply be
ignored by the program.

> For pthread_*attr_setpshared, the sensible thing to do for an
> unsupported (but valid) value is to return ENOTSUP.  That's what these
> functions should do for PTHREAD_PROCESS_SHARED on NaCl.

If you think that's helpful for NaCl, we can do that.  Can you please
provide the documentation patch that mentions this additional error
condition on NaCl, or do you want it to remain undocumented?  (This
should be documented as a NaCl-only error condition, IMO.)

> Conversely,
> it's entirely reasonable not to account for any possibility of
> PTHREAD_PROCESS_PRIVATE not being supported, because to request that
> is to request the default behavior.
> 
> > Interacting with futex words requires atomic accesses, which isn't done
> > by most of glibc's current futex callers. [...]
> 
> I certainly concur with leaving this until later.  What I think will be
> reasonable to do eventually is to use the <stdatomic.h> type names in
> all our internal interfaces (probably only atomic_int or atomic_uint
> should be used for futex stuff).  When building with older compilers
> that don't have those, our own internal headers can typedef the ones
> we use to the simple types (or perhaps volatile-qualified ones?).

Joseph commented already on potential practical issues with that
(although I think we may be able to solve them or have them not trigger
in practice).

Annotating the types used for atomic accesses is something I considered.
We could do it for data not exposed to users (e.g., on internal
interfaces as you say), but then we have this weird (IMHO) mix of some
data being atomic-typed and some not.  This would mean that a variable
not having an atomic type wouldn't be sufficient to infer that it
doesn't need atomic accesses.

The compromise that I thought would be useful as a first step was to
require explicitly atomic accesses (through atomic_*) for all data that
needs it, and not change the types for now.  We want to have the
explicit accesses anyway to have explicit MO choices as I mentioned
elsewhere in the thread, so doing that is something we'd keep doing
anyway.
Changing the types of (some of) all atomically accessed data later on
would be a fairly mechanical change, I suppose.

Nonetheless, if there is a preference in the project to use atomic types
where possible right from the start, I wouldn't be opposed to that.

> > Roland, okay for NaCl?  I decided to not try to "optimize" the
> > shared/private setting at data structure initialization time because I
> > didn't see a good way to specify the error conditions for the futex_*
> > functions then: We do want those to sanity check shared/private and not
> > just rely on the shared/private initialization to do the right thing --
> > but if we do that, we can as well transform private into shared if
> > that's actually necessary.
> 
> I don't follow your logic here.  Why do you think we want any
> argument-validity checks applied inside internal interfaces?  It's
> reasonable enough to have asserts inside the futex-internal.h
> functions if you feel like it.

Yes, the sanity checks I was talking about are assertions, not checks
that alter the conditions under which the futex wrappers return to the
caller.  Carlos specifically requested that the futex API calls abort
when the futex syscall returns an error that can only happen if glibc or
the program are buggy.

> Those can only fire when other libc
> code has a bug or there is memory clobberation.  Those are both cases
> where it's nice to fail catastrophically rather than mysteriously, but
> also both cases where we don't spend extra effort to diagnose the
> (supposedly) impossible situations and certainly where we never
> propagate such an error back to the user as a return or errno value.

Yeah, no (new) errors.  That's what I meant to refer to when speaking
about "error conditions".

But I think it's worth distinguishing between private and shared.  I
suppose we agree that private must always be supported, potentially
through the implementation picking shared instead (though that would be
unlikely in practice).

What the existing code does is to select SHARED early if PRIVATE isn't
natively supported (but that's no longer the case for either Linux or
NaCl).  We can expect that all callers do that, but if we then add
an assertion (private != PRIVATE), we can as well do this inside of this
hypothetical futex function:  if (private == PRIVATE) private = SHARED;
That's what I meant when saying that if we do an assertion, we can do
the conversion as well in this case.

SHARED is different, because we don't expect that all platforms support
it.  Therefore, the patch has futex_supports_shared, and if that returns
false, futex_* can assert that private != SHARED.
This makes sense because SHARED isn't the common case, so handling it
asymmetrically works well.

Does that explain my reasoning?

> What I have in mind is (names and signatures are straw men):
> 
> * In pthread_*attr_setpshared:
> 	error = futex_pshared (pshared, &iattr->pshared);
> 	if (__glibc_unlikely (error))
> 	  return error;
>   This transforms the public ABI values PTHREAD_PROCESS_{PRIVATE,SHARED}
>   in PSHARED into the internal form in IATTR->pshared.

PTHREAD_PROCESS_* is used by *_setpshared, but not by semaphore.
If we transform into the internal form, we need to have another function
that transforms back for _getpshared, so it's better to do the
transformation when initializing using the attributes.

Then the check at attr setting time would be:
  if (pshared == PTHREAD_PROCESS_SHARED && !futex_supports_shared ())
    return ENOTSUP;
and when using the attr it would be:
  foo = iattr->pshared == PTHREAD_PROCESS_SHARED ? FUTEX_SHARED :
                                                   FUTEX_PRIVATE;
(i.e., same as in the patch).

The only substantial benefit I can see is to have the return of ENOTSUP
be part of a NaCl sysdep.  Having a futex_check_shared that does the
check above would be good for that.
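
A rough sketch of this alternative, in which the attribute keeps the public value and the conversion happens when the attribute is used, might look like the following.  All names with an sk_ or SK_ prefix are hypothetical and exist only for this example; futex_check_shared is the name suggested above, but its signature here is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the public and internal constants.  */
enum { SK_PROCESS_PRIVATE, SK_PROCESS_SHARED };
enum { SK_FUTEX_PRIVATE, SK_FUTEX_SHARED };

struct sk_attr { int pshared; };   /* stores the public value */

/* Models futex_supports_shared (); false corresponds to NaCl.  */
static bool sk_shared_supported = false;

/* Sysdep hook: the only place that knows about ENOTSUP.  */
static int
futex_check_shared (int pshared)
{
  if (pshared == SK_PROCESS_SHARED && !sk_shared_supported)
    return ENOTSUP;
  return 0;
}

static int
sk_setpshared (struct sk_attr *iattr, int pshared)
{
  if (pshared != SK_PROCESS_PRIVATE && pshared != SK_PROCESS_SHARED)
    return EINVAL;
  int err = futex_check_shared (pshared);
  if (err != 0)
    return err;
  iattr->pshared = pshared;
  return 0;
}

/* No back-conversion is needed because the public value is stored.  */
static int
sk_getpshared (const struct sk_attr *iattr)
{
  return iattr->pshared;
}

/* Conversion to the internal form happens at initialization time.  */
static int
sk_init_flag (const struct sk_attr *iattr)
{
  assert (iattr->pshared == SK_PROCESS_PRIVATE
          || iattr->pshared == SK_PROCESS_SHARED);
  return iattr->pshared == SK_PROCESS_SHARED ? SK_FUTEX_SHARED
                                             : SK_FUTEX_PRIVATE;
}
```

The trade-off against storing the internal form is one extra comparison at initialization time in exchange for a trivial _getpshared.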

Anyway, I don't care strongly about that.  We're talking about 4
occurrences of setpshared.  Which option do you want to have?

> On Linux the
>   internal form would be FUTEX_PRIVATE_FLAG or 0, to be simply OR'd
>   into the operation in the syscall.  On NaCl the internal form would
>   just be some constant value, only stored at all to avoid confusing
>   valgrind et al.

This abstraction already exists.  See FUTEX_PRIVATE and FUTEX_SHARED.

> * In *_init:
> 	object->private = iattr->pshared;
>   There's really no need for checks here, since it's undefined
>   behavior to pass a bogus attributes object.  

Yes, except that I'm not sure having FUTEX_PRIVATE/SHARED in
iattr->pshared already is a real improvement.

>   It can't be literally that alone since there has to be a default for
>   a null attributes pointer.  If it seems worthwhile, we could have it
>   sanity check the value in the attributes struct so as to detect
>   clobbered or uninitialized attributes; that would be done by a
>   futex-internal.h function to transfer attributes-field format into
>   object-field format (it would be up to the particular implementation
>   to decide if those are the same or different).  But I don't see a
>   real need for that.
> 
> * In the actual uses:
> 	futex_wake (&obj->futex, nw, obj->private);
>   That is, nothing special.  All the checking was done before.
>   If the futex-internal.h implementation cares to do redundant checks
>   for clobberation, it can assert inside there.  There should be no
>   provision for returning errors to indicate the private field has an
>   invalid value.

No errors, but assertion in case of NaCl when it's passed FUTEX_SHARED.
(I mean, I don't really care about NaCl details at that level, but
that's the scheme I had in mind.)

Have you looked at the actual patch yet?
Roland McGrath June 18, 2015, 10:26 p.m. UTC | #6
> I didn't like option (2) too much because of the mixed merge / futex
> usage, and thought (3) would be a cleaner intermediate step.  That's why
> I picked that option.  I don't mind doing (2) now, or even (1), though
> -- I think it's mostly a question of which steps / churn we prefer.

OK, I understand your thinking now.  I think (2) is best.  It avoids the
gratuitous churn whose only benefit is making the intermediate state
look well-layered.  I think it's better to have things go as quickly as
possible to what their expected end state is, with fewer intermediate
steps.  I agree (1) is bad because we'd have duplicated code/logic and
because it's better to migrate implementation and users in one step (or
closer to one).  To make things incremental, I think it makes most sense
to address one macro at a time (or small sets that go together): replace
lll_futex_foo macro with futex_foo function and change all users;
iterate.  But we can play it by ear.  I just think that adding new
things we know we'll remove soon is the wrong kind of churn.

> That's what POSIX allows, I agree.  But I thought that your motivation
> was to make existing software robust on NaCl.  If you add a new error
> code, how much of the existing software do you think will be prepared to
> handle it sensibly, or check for it at all?  I don't remember seeing any
> code that would do something useful on errors not specified by POSIX
> (e.g., an "else { puts("unknown error"); exit(...); }" everywhere...).

Most code just checks for failure and reports all errors the same way
(using strerror et al) rather than examining errno for specific cases.

> If you think that's helpful for NaCl, we can do that.  

I think it is the right choice, yes.

> Can you please provide the documentation patch that mentions this
> additional error condition on NaCl, or do you want it to remain
> undocumented?  (This should be documented as a NaCl-only error condition,
> IMO.)

We don't have documentation for these functions at all.  If we did, I
think I'd probably leave out any NaCl-specific issues for now anyway.

> Annotating the types used for atomic accesses is something I considered.

My remark about that was an aside and I don't think we should do
anything like this until later (after the futex cleanup is complete, and
perhaps other things).  So let's not muddy this thread by discussing
details further now.

> Yes, the sanity checks I was talking about are assertions, not checks
> that alter the conditions under which the futex wrappers return to the
> caller.  Carlos specifically requested that the futex API calls abort
> when the futex syscall returns an error that can only happen if glibc or
> the program are buggy.

Yes, we had strong consensus about sanity-checking the errors reported
by the kernel.  That seems orthogonal to sanity-checking values in
internal APIs before calling into the kernel, which I think is the only
thing we're talking about here.

> But I think it's worth distinguishing between private and shared.  I
> suppose we agree that private must always be supported, potentially
> through the implementation picking shared instead (though that would be
> unlikely in practice).

I do agree that private must always be supported.  But I'm not at all
sure I'm following exactly what kind of "distinguishing" you mean.

I had forgotten about the *_getpshared functions.  Clearly to support
those the attributes object has to actually store something on
configurations where shared is ever supported.  But on NaCl there is no
need to store anything; the function to extract the flag from the
internal form would just yield a constant.  This is the only sense of
distinguishing private from shared that I think is necessary.

pthread_mutex_init and pthread_cond_init are far more common than the
attribute-fiddling calls.  So they should not do any extra work that
could be offloaded to the attribute-fiddling calls.  Since the
attribute-fiddling calls need to validate their argument, it's optimal
to roll the validation in with the conversion to internal form and so do
it there.  (For sem_init, there is no such distinction.)

Actual use of the objects, that leads to the futex calls, is of course
the most common thing.  So it should not do any extra work at all that
can be avoided.  (I realize it's a slow path, but still.)  That says
that any stored values should be kept in internal form so that the
actual futex calls are just using them directly.  Any assertions to
validate arguments to futex-internal.h calls are just checking for
memory clobberation or libc bugs, so I don't see a real motivation for
having them (though, like all assertions, it's never completely
unreasonable to have them).

> What the existing code does is to select SHARED early if PRIVATE isn't
> natively supported (but that's no longer the case for either Linux
> or NaCl).  We can expect that all callers do that, but if we then add
> an assertion (private != PRIVATE), we might as well do this inside this
> hypothetical futex function:  if (private == PRIVATE) private = SHARED;
> That's what I meant when saying that if we do an assertion, we can do
> the conversion as well in this case.

I don't think that's at all desirable.  (This is academic, since we
don't actually have any configurations that need to "degrade" private to
shared in this way.  But I'll elaborate anyway.)  Remember, assertions
can always be disabled (and this is what most Linux distros do for
production builds).  So you could have an assertion inside futex_wake or
whatnot, fine (though as I've said above, I don't see a good reason to
want one).  But it's wrong logic to think that because you'll have an
assertion looking at the value you might as well have other logic on the
value in the same place.  With assertions disabled, there would be no
need for any examination of the value inside futex_wake beyond just
OR'ing it into the operation argument to the syscall.

If we did need to handle this case, then I think the best thing would be
to have another sysdeps call that transforms an internal form stored by
*_setpshared into the internal form actually used in futex_wake et al.
Then pthread_mutex_init et al would use that call instead of simply
copying the field from attributes object to synchronization object.  (I
don't have any objection to structuring it this way now even though all
the sysdeps implementations today would in fact just copy the value.)

> PTHREAD_PROCESS_* is used by *_setpshared, but not by semaphore.
> If we transform into the internal form, we need to have another function
> that transforms back for _getpshared, so it's better to do the
> transformation when initializing using the attributes.

I disagree.  This pair of functions is very straightforward to define.
I've explained above why delaying transformation any later than necessary
is suboptimal.

> Anyway, I don't care strongly about that.  We're talking about 4
> occurrences of setpshared.  Which option do you want to have?

error = futex_init_pshared (&attr->pshared, pshared_argument);
pshared = futex_get_pshared (&attr->pshared);

sem_init can just pass (pshared ? PTHREAD_PROCESS_SHARED :
PTHREAD_PROCESS_PRIVATE) as the argument, and inlining/constant-folding
should turn it back into optimal.

So, just what I said before, but expanded to cover getpshared.
(I don't care about the functions' names or their exact signatures.)
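
A minimal sketch of what such a pair could look like, for illustration
only.  The FUTEX_* values, the SKETCH_SUPPORTS_SHARED knob, the choice
of ENOTSUP and the exact signatures are assumptions made for this
example, not anything taken from the patch:

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical internal values; only their being distinct matters here.  */
#define FUTEX_PRIVATE 0
#define FUTEX_SHARED  128

/* Hypothetical sysdeps knob: 1 where shared futexes exist, 0 on a
   NaCl-like configuration.  */
#ifndef SKETCH_SUPPORTS_SHARED
# define SKETCH_SUPPORTS_SHARED 1
#endif

/* Validate PSHARED and store it in internal form.  Returns 0 on success
   or an error code.  ENOTSUP for "no shared support" is an assumption.  */
static int
futex_init_pshared (int *stored, int pshared)
{
  if (pshared == PTHREAD_PROCESS_PRIVATE)
    {
      *stored = FUTEX_PRIVATE;
      return 0;
    }
  if (pshared == PTHREAD_PROCESS_SHARED)
    {
      if (!SKETCH_SUPPORTS_SHARED)
        return ENOTSUP;
      *stored = FUTEX_SHARED;
      return 0;
    }
  return EINVAL;
}

/* Convert the stored internal form back for *_getpshared.  */
static int
futex_get_pshared (const int *stored)
{
  return (*stored == FUTEX_SHARED
          ? PTHREAD_PROCESS_SHARED : PTHREAD_PROCESS_PRIVATE);
}
```

With this shape, all validation happens once in the attribute setter,
and the init functions only copy the already-converted value.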

> No errors, but assertion in case of NaCl when it's passed FUTEX_SHARED.
> (I mean, I don't really care about NaCl details at that level, but
> that's the scheme I had in mind.)

Sensible enough.

> Have you looked at the actual patch yet?

I skimmed it and didn't see deep issues beyond what I raised.  There
were a couple of trivial style things (whitespace and the like) that I
didn't bother to point out.  You'll probably see them if you just read
over the whole patch.  I'll do the fine-tooth comb review on the next
iteration once we've agreed on the details we're discussing here.


Thanks,
Roland
Torvald Riegel June 19, 2015, 12:26 p.m. UTC | #7
On Thu, 2015-06-18 at 15:26 -0700, Roland McGrath wrote:
> > I didn't like option (2) too much because of the mixed merge / futex
> > usage, and thought (3) would be a cleaner intermediate step.  That's why
> > I picked that option.  I don't mind doing (2) now, or even (1), though
> > -- I think it's mostly a question of which steps / churn we prefer.
> 
> OK, I understand your thinking now.  I think (2) is best.

Alright.

> > Yes, the sanity checks I was talking about are assertions, not checks
> > that alter the conditions under which the futex wrappers return to the
> > caller.  Carlos specifically requested that the futex API calls abort
> > when the futex syscall returns an error that can only happen if glibc or
> > the program are buggy.
> 
> Yes, we had strong consensus about sanity-checking the errors reported
> by the kernel.  That seems orthogonal to sanity-checking values in
> internal APIs before calling into the kernel, which I think is the only
> thing we're talking about here.

I'm talking about some input to the futex functions resulting in a call
to abort, either before it goes to the kernel / nacl runtime / ..., or
after because the kernel / ... returns a certain error.

> > But I think it's worth distinguishing between private and shared.  I
> > suppose we agree that private must always be supported, potentially
> > through the implementation picking shared instead (though that would be
> > unlikely in practice).
> 
> I do agree that private must always be supported.  But I'm not at all
> sure I'm following exactly what kind of "distinguishing" you mean.
> 
> I had forgotten about the *_getpshared functions.  Clearly to support
> those the attributes object has to actually store something on
> configurations where shared is ever supported.  But on NaCl there is no
> need to store anything; the function to extract the flag from the
> internal form would just yield a constant.

But that's only possible because NaCl doesn't support shared, so no
conversion is necessary.

If we had a platform that would support private through translation to
shared (or whatever internal form it had), the attributes would still
have to use two separate values:  If one calls setpshared(private), a
subsequent getpshared() has to return private -- not the merged internal
form (shared in this case).  That's similar to the lock elision issues
we had.

> This is the only sense of
> distinguishing private from shared that I think is necessary.
> 
> pthread_mutex_init and pthread_cond_init are far more common than the
> attribute-fiddling calls.  So they should not do any extra work that
> could be offloaded to the attribute-fiddling calls.  Since the
> attribute-fiddling calls need to validate their argument, it's optimal
> to roll the validation in with the conversion to internal form and so do
> it there.  (For sem_init, there is no such distinction.)

A single branch, or even a load of some config flag, is irrelevant
compared to the cost of actually sharing the mutex/condvar/... with
another thread -- which you need to do to actually have a need for the
mutex.  There's really no need to optimize this.  And we don't even have
a need to do anything there currently, for Linux or NaCl.

> Actual use of the objects, that leads to the futex calls, is of course
> the most common thing.  So it should not do any extra work at all that
> can be avoided.  (I realize it's a slow path, but still.)

Yes, it is a slow path.  But consider futex_wake: The rest of the slow
path enters the kernel, grabs a lock, likely cache misses on the lock
and/or the waitqueue (if there's an actual waiter, or has been), and
then potentially wakes another thread or just releases a lock, and
returns.  In the NaCl runtime, this will be similar except for the
syscall overhead.
Some minor conversion like a branch is irrelevant compared to that.
NaCl wouldn't have any conversion; Linux may just replace the values.

> That says
> that any stored values should be kept in internal form so that the
> actual futex calls are just using them directly.

But we have plenty of futex calls that are not used for PThreads
synchronization data structures.  I don't think it clarifies the code if
for those, we can't simply pass in FUTEX_PRIVATE, but have to first do
another call that transforms it to whatever futex_wake or such would
like to see internally.  So now we have those calls littered throughout
the code, and it would be cleanest to call this on every callsite.  That
just seems wrong.

> Any assertions to
> validate arguments to futex-internal.h calls are just checking for
> memory clobberation or libc bugs, so I don't see a real motivation for
> having them (though, like all assertions, it's never completely
> unreasonable to have them).

See above.  It's not assertions on input, it's just that the function
can abort on bad input.  You can simply decide to not do that in NaCl.

> > What the existing code does is to select SHARED early if PRIVATE isn't
> > natively supported (but that's no longer the case for either Linux
> > or NaCl).  We can expect that all callers do that, but if we then add
> > an assertion (private != PRIVATE), we might as well do this inside this
> > hypothetical futex function:  if (private == PRIVATE) private = SHARED;
> > That's what I meant when saying that if we do an assertion, we can do
> > the conversion as well in this case.
> 
> I don't think that's at all desirable.  (This is academic, since we
> don't actually have any configurations that need to "degrade" private to
> shared in this way.  But I'll elaborate anyway.)  Remember, assertions
> can always be disabled (and this is what most Linux distros do for
> production builds).

Well, the "assertions" that we have in the patch currently are not
assert(), but "if (unexpected_error) abort();".  IIRC having abort()
instead of assert() was one request regarding the error checking.
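
To make the distinction concrete, here is a small sketch of the
"if (unexpected_error) abort();" pattern.  The helper names are
hypothetical, and the set of errors treated as expected for an untimed
wait (0, EAGAIN, EINTR) is an assumption for illustration:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical predicate: errors a caller of an untimed futex wait is
   assumed to handle itself.  */
static int
sketch_wait_error_is_expected (int err)
{
  return err == 0 || err == EAGAIN || err == EINTR;
}

/* Pass expected results through; abort on anything else.  Unlike
   assert (), this check is still compiled in with -DNDEBUG.  */
static int
sketch_check_wait_result (int err)
{
  if (!sketch_wait_error_is_expected (err))
    abort ();
  return err;
}
```

An assert () in the same place would vanish in NDEBUG builds, which is
why the two kinds of checks are not interchangeable.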

> So you could have an assertion inside futex_wake or
> whatnot, fine (though as I've said above, I don't see a good reason to
> want one).  But it's wrong logic to think that because you'll have an
> assertion looking at the value you might as well have other logic on the
> value in the same place.  With assertions disabled, there would be no
> need for any examination of the value inside futex_wake beyond just
> OR'ing it into the operation argument to the syscall.

See above.

> If we did need to handle this case, then I think the best thing would be
> to have another sysdeps call that transforms an internal form stored by
> *_setpshared into the internal form actually used in futex_wake et al.
> Then pthread_mutex_init et al would use that call instead of simply
> copying the field from attributes object to synchronization object.  (I
> don't have any objection to structuring it this way now even though all
> the sysdeps implementations today would in fact just copy the value.)
> 
> > PTHREAD_PROCESS_* is used by *_setpshared, but not by semaphore.
> > If we transform into the internal form, we need to have another function
> > that transforms back for _getpshared, so it's better to do the
> > transformation when initializing using the attributes.
> 
> I disagree.  This pair of functions is very straightforward to define.
> I've explained above why delaying transformation any later than necessary
> is suboptimal.
> 
> > Anyway, I don't care strongly about that.  We're talking about 4
> > occurrences of setpshared.  Which option do you want to have?
> 
> error = futex_init_pshared (&attr->pshared, pshared_argument);
> pshared = futex_get_pshared (&attr->pshared);

As mentioned above, the internal form will have to preserve two values,
if a user can set either successfully.

It also doesn't handle the default attribute.  We can have a macro /
constexpr that transforms the default attribute in place, or we need a
special case for when NULL is passed as the attribute.  It would be
simpler to transform at this time, not at getpshared/setpshared time, if
we can't agree on simply doing it at futex_* call time.

> sem_init can just pass (pshared ? PTHREAD_PROCESS_SHARED :
> PTHREAD_PROCESS_PRIVATE) as the argument, and inlining/constant-folding
> should turn it back into optimal.
> 
> So, just what I said before, but expanded to cover getpshared.
> (I don't care about the functions' names or their exact signatures.)

What would you do for the other calls to futex_* that pass in
FUTEX_PRIVATE directly?  (There are about 20 such calls in the patch
that I sent...).

I'll prepare a patch that does (2) above and the ENOTSUP for NaCl, but
keeps the remaining checks as is.  I think it would be worthwhile if you
would look at this patch specifically and check whether you do have
concerns about any of the actually generated code (regarding runtime
overheads).
Roland McGrath June 19, 2015, 9:27 p.m. UTC | #8
> If we had a platform that would support private through translation to
> shared (or whatever internal form it had), the attributes would still
> have to use two separate values:  If one calls setpshared(private), a
> subsequent getpshared() has to return private -- not the merged internal
> form (shared in this case).

Yes, I understand that.  I mentioned this case below.

> That's similar to the lock elision issues we had.

I don't know what this refers to, but perhaps it's not really relevant.

> A single branch, or even a load of some config flag, is irrelevant [...]

I understand that the cost is lost in the noise.  That does mean that
it's not important to spend effort on avoiding that cost.  But it does
not mean that when you have multiple choices with similar amounts of
work to implement and all else being equal, you should not choose one
that has less cost.  Less is still less.

> Well, the "assertions" that we have in the patch currently are not
> assert(), but "if (unexpected_error) abort();".  IIRC having abort()
> instead of assert() was one request regarding the error checking.

There are different classes of checks.  For things that could only be
due to libc bugs or memory clobberation, assert is right.  The only
places we want explicit checks even under -DNDEBUG are those where we
are checking return values from other components of the system such as
the kernel.  (Carlos posted a more thorough taxonomy of cases/checks
and I think we had consensus about how to treat each one.)

> But we have plenty of futex calls that are not used for PThreads
> synchronization data structures.  I don't think it clarifies the code if
> for those, we can't simply pass in FUTEX_PRIVATE, but have to first do
> another call that transforms it to whatever futex_wake or such would
> like to see internally.  So now we have those calls littered throughout
> the code, and it would be cleanest to call this on every callsite.  That
> just seems wrong.

What I was calling "internal form" is exactly FUTEX_PRIVATE et al--the
arguments to futex-internal.h functions.  I never proposed that direct
uses of futex-internal.h would do anything different.  (For NaCl, the
FUTEX_{PRIVATE,SHARED} macros might well just have the same value,
since the values would never actually be examined.)

It sort of seems like you missed this paragraph, though you quoted it:
> > If we did need to handle this case, then I think the best thing would be
> > to have another sysdeps call that transforms an internal form stored by
> > *_setpshared into the internal form actually used in futex_wake et al.
> > Then pthread_mutex_init et al would use that call instead of simply
> > copying the field from attributes object to synchronization object.  (I
> > don't have any objection to structuring it this way now even though all
> > the sysdeps implementations today would in fact just copy the value.)

As I said there, if we were to handle the "degrade to shared"
situation, then there could be a third form (a second internal form).
For such a configuration, that form would be the only one that needs
to distinguish requested-private from requested-shared just so that
*_getpshared can work.  In configurations without that need (which is
all we have today), then presumably the sysdeps call mentioned above
would simply use "final" internal form (i.e. FUTEX_* values passed to
futex-internal.h functions) as the intermediate internal form
(i.e. what's stored in attributes objects).
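
A sketch of that hypothetical "degrade to shared" configuration, purely
for illustration.  No such configuration exists today; every name and
value below is invented for the example:

```c
#include <assert.h>

/* Hypothetical final internal values passed to the futex calls.  */
#define FUTEX_PRIVATE 0
#define FUTEX_SHARED  128

/* Hypothetical knob: 0 means private futexes are emulated via shared.  */
#ifndef SKETCH_PRIVATE_IS_NATIVE
# define SKETCH_PRIVATE_IS_NATIVE 0
#endif

struct sketch_attr   { int requested; };  /* what *_setpshared stored */
struct sketch_object { int private; };    /* what futex calls will use */

/* The extra sysdeps call: intermediate form -> final form.  */
static int
sketch_futex_effective (int requested)
{
  if (requested == FUTEX_PRIVATE && !SKETCH_PRIVATE_IS_NATIVE)
    return FUTEX_SHARED;
  return requested;
}

/* The init function uses the call instead of copying the field.  */
static void
sketch_object_init (struct sketch_object *obj,
                    const struct sketch_attr *attr)
{
  obj->private = sketch_futex_effective (attr->requested);
}
```

The attributes object keeps the requested value, so *_getpshared still
reports private even though the synchronization object uses shared.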

Since we don't need to handle that case today, I think this is all
moot.  For now, just having final internal form (FUTEX_* values) be
what's stored in attributes objects as well as synchronization objects
will work fine.

> It also doesn't handle the default attribute.  We can have a macro /
> constexpr that transforms the default attribute in place, or we need a
> special case for when NULL is passed as the attribute.  It would be
> simpler to transform at this time, not at getpshared/setpshared time, if
> we can't agree on simply doing it at futex_* call time.

Let's simplify things by not considering the case we don't actually
have.  So it's universal that the only things that get stored are the
final FUTEX_* values.  That means it's simple to just do:

	object->private = attr == NULL ? FUTEX_PRIVATE : attr->pshared;

(Remember, futex_get_pshared is only there for the *_getpshared
implementations to use.  It's never involved in either initialization
or use of synchronization objects.)
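
Spelled out as a compilable sketch, with hypothetical struct names and
FUTEX_* values standing in for the real ones:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical final internal values.  */
#define FUTEX_PRIVATE 0
#define FUTEX_SHARED  128

struct sketch_attr   { int pshared; };  /* already holds a FUTEX_* value */
struct sketch_object { int private; };

/* A NULL attribute means "default attributes", i.e. process-private.  */
static void
sketch_object_init (struct sketch_object *object,
                    const struct sketch_attr *attr)
{
  object->private = attr == NULL ? FUTEX_PRIVATE : attr->pshared;
}
```

No conversion happens at init time; the only branch is the NULL check
for the default attribute.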

> I'll prepare a patch that does (2) above and the ENOTSUP for NaCl, but
> keeps the remaining checks as is.  I think it would be worthwhile if you
> would look at this patch specifically and check whether you do have
> concerns about any of the actually generated code (regarding runtime
> overheads).

I hope I've now convinced you of two more things.  But sure, I will
look at the concrete patch and try not to quibble overmuch.


Thanks,
Roland
Torvald Riegel June 20, 2015, 4:20 p.m. UTC | #9
On Fri, 2015-06-19 at 14:27 -0700, Roland McGrath wrote:
> > But we have plenty of futex calls that are not used for PThreads
> > synchronization data structures.  I don't think it clarifies the code if
> > for those, we can't simply pass in FUTEX_PRIVATE, but have to first do
> > another call that transforms it to whatever futex_wake or such would
> > like to see internally.  So now we have those calls littered throughout
> > the code, and it would be cleanest to call this on every callsite.  That
> > just seems wrong.
> 
> What I was calling "internal form" is exactly FUTEX_PRIVATE et al--the
> arguments to futex-internal.h functions.  I never proposed that direct
> uses of futex-internal.h would do anything different.

Ah -- so we're not as much in disagreement as I thought we might be.

> (For NaCl, the
> FUTEX_{PRIVATE,SHARED} macros might well just have the same value,
> since the values would never actually be examined.)

> It sort of seems like you missed this paragraph, though you quoted it:
> > > If we did need to handle this case, then I think the best thing would be
> > > to have another sysdeps call that transforms an internal form stored by
> > > *_setpshared into the internal form actually used in futex_wake et al.
> > > Then pthread_mutex_init et al would use that call instead of simply
> > > copying the field from attributes object to synchronization object.  (I
> > > don't have any objection to structuring it this way now even though all
> > > the sysdeps implementations today would in fact just copy the value.)
> 
> As I said there, if we were to handle the "degrade to shared"
> situation, then there could be a third form (a second internal form).
> For such a configuration, that form would be the only one that needs
> to distinguish requested-private from requested-shared just so that
> *_getpshared can work.  In configurations without that need (which is
> all we have today), then presumably the sysdeps call mentioned above
> would simply use "final" internal form (i.e. FUTEX_* values passed to
> futex-internal.h functions) as the intermediate internal form
> (i.e. what's stored in attributes objects).
> 
> Since we don't need to handle that case today, I think this is all
> moot.  For now, just having final internal form (FUTEX_* values) be
> what's stored in attributes objects as well as synchronization objects
> will work fine.

If we had sufficient space in the attribute structures to just store an
int that is just FUTEX_PRIVATE or FUTEX_SHARED, I'd agree that this
would work for now.  (And if we don't consider the "degrade to shared"
case.)

However, on x86 for example, the attributes structs are 4 bytes for
condvar, barrier, and mutex and 8 bytes for rwlock.  Barrier just has a
shared attribute; condvars need to store shared and which clock.  Mutex
attributes use several bits (kind, shared, robust, ...).  rwlock has
kind and shared.  Thus, only barrier and rwlock could use an int just
for the internal form of shared.

We could try to make the internal form compatible with the other
things that condvar and mutex need to put into the same int, but then
we're restricting the actual values the futex API can use.
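
To illustrate the space problem, here is a sketch of a 4-byte attribute
word that packs a shared flag together with a clock ID.  The bit layout
and helper names are assumptions for this example and are not meant to
describe the actual structs:

```c
#include <assert.h>

/* Hypothetical final internal values.  */
#define FUTEX_PRIVATE 0
#define FUTEX_SHARED  128

/* Hypothetical layout: bit 0 = shared flag, bits 1 and up = clock ID.
   CLOCK_ID is assumed to be a small non-negative number.  */
static int
sketch_condattr_pack (int futex_shared, int clock_id)
{
  return (clock_id << 1) | (futex_shared == FUTEX_SHARED);
}

/* Recover the FUTEX_* value from the packed word.  */
static int
sketch_condattr_shared (int value)
{
  return (value & 1) ? FUTEX_SHARED : FUTEX_PRIVATE;
}

/* Recover the clock ID from the packed word.  */
static int
sketch_condattr_clock (int value)
{
  return value >> 1;
}
```

Because only one bit is available, the FUTEX_* value cannot be stored
verbatim and has to be converted on the way in and out.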

We have one such restriction already (which I need to document):
FUTEX_PRIVATE must be zero, because the initializers for rwlock,
condvar, and mutex all have zeros for the shared field (or bit).
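
One way this restriction could be documented and enforced at compile
time is sketched below.  The struct, the initializer macro and the
FUTEX_* values are hypothetical stand-ins:

```c
#include <assert.h>

/* Hypothetical final internal values.  */
#define FUTEX_PRIVATE 0
#define FUTEX_SHARED  128

/* Stand-in for a synchronization object with a static initializer.  */
struct sketch_cond
{
  unsigned int futex;
  int shared;
};
#define SKETCH_COND_INITIALIZER { 0, 0 }

/* Static initializers zero the shared field, so a statically
   initialized object is process-private only if FUTEX_PRIVATE is 0.  */
_Static_assert (FUTEX_PRIVATE == 0,
                "static initializers require FUTEX_PRIVATE to be zero");

static struct sketch_cond sketch_default_cond = SKETCH_COND_INITIALIZER;
```

If someone later changed FUTEX_PRIVATE to a non-zero value, the build
would fail instead of silently turning default objects into shared ones.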


In the interest of making progress on this, I suggest that we treat the
attribute initialization separately.  I'm going to work with you towards
bringing that into a shape that we're both happy with.  I agree that
eventually, we'll want to use FUTEX_PRIVATE / FUTEX_SHARED to store
shared or not in data structures such as pthread_barrier.  However,
right now, we can't yet do that everywhere, or not easily in this patch:
* barrier still has assembly implementations that expect other values;
* for condvar I have posted a new algorithm, and we'd do the conversion
in that patch;
* mutex encodes shared together with other bits, and we have the
dependencies on what the lll_* code does;
* for rwlock, a new algorithm is WIP.

Patch

commit e5572e4568a2e44b6cc5a570c14c6f342a167f90
Author: Torvald Riegel <triegel@redhat.com>
Date:   Thu Dec 4 14:12:23 2014 +0100

    Add and use new glibc-internal futex API.
    
    This adds new functions for futex operations, starting with wait,
    abstimed_wait, reltimed_wait, wake.  They add documentation and error
    checking according to the current draft of the Linux kernel futex manpage.
    
    Waiting with absolute or relative timeouts is split into separate functions.
    This allows for removing a few cases of code duplication in pthreads code,
    which uses absolute timeouts; also, it allows us to put platform-specific
    code to go from an absolute to a relative timeout into the platform-specific
    futex abstractions.  The latter is done by adding lll_futex_abstimed_wait.
    I expect that we will refactor this later on, depending on how we do the
    lll_ parts.
    
    Futex operations that can be canceled are also split out into separate
    functions suffixed by "_cancelable".
    
    There are separate versions for both Linux and NaCl; while they currently
    differ only slightly, my expectation is that the separate versions of
    lowlevellock-futex.h will eventually be merged into futex-internal.h
    when we get to move the lll_ functions over to the new futex API.
    
    The sanity checks regarding whether shared futexes are supported abort
    instead of returning errors because POSIX error specs don't really consider
    that there could be no support for shared futexes.  Aborting is better
    than returning an unspecified error or an error specified for a different
    condition; only NaCl has no support for shared futexes.

diff --git a/nptl/allocatestack.c b/nptl/allocatestack.c
index 8e620c4..7595186 100644
--- a/nptl/allocatestack.c
+++ b/nptl/allocatestack.c
@@ -29,6 +29,7 @@ 
 #include <tls.h>
 #include <list.h>
 #include <lowlevellock.h>
+#include <futex-internal.h>
 #include <kernel-features.h>
 #include <stack-aliasing.h>
 
@@ -987,7 +988,8 @@  setxid_mark_thread (struct xid_command *cmdp, struct pthread *t)
   if (t->setxid_futex == -1
       && ! atomic_compare_and_exchange_bool_acq (&t->setxid_futex, -2, -1))
     do
-      lll_futex_wait (&t->setxid_futex, -2, LLL_PRIVATE);
+      futex_wait_simple ((unsigned int *) &t->setxid_futex, -2,
+			 FUTEX_PRIVATE);
     while (t->setxid_futex == -2);
 
   /* Don't let the thread exit before the setxid handler runs.  */
@@ -1005,7 +1007,7 @@  setxid_mark_thread (struct xid_command *cmdp, struct pthread *t)
 	  if ((ch & SETXID_BITMASK) == 0)
 	    {
 	      t->setxid_futex = 1;
-	      lll_futex_wake (&t->setxid_futex, 1, LLL_PRIVATE);
+	      futex_wake ((unsigned int *) &t->setxid_futex, 1, FUTEX_PRIVATE);
 	    }
 	  return;
 	}
@@ -1032,7 +1034,7 @@  setxid_unmark_thread (struct xid_command *cmdp, struct pthread *t)
 
   /* Release the futex just in case.  */
   t->setxid_futex = 1;
-  lll_futex_wake (&t->setxid_futex, 1, LLL_PRIVATE);
+  futex_wake ((unsigned int *) &t->setxid_futex, 1, FUTEX_PRIVATE);
 }
 
 
@@ -1141,7 +1143,8 @@  __nptl_setxid (struct xid_command *cmdp)
       int cur = cmdp->cntr;
       while (cur != 0)
 	{
-	  lll_futex_wait (&cmdp->cntr, cur, LLL_PRIVATE);
+	  futex_wait_simple ((unsigned int *) &cmdp->cntr, cur,
+			     FUTEX_PRIVATE);
 	  cur = cmdp->cntr;
 	}
     }
@@ -1251,7 +1254,8 @@  __wait_lookup_done (void)
 	continue;
 
       do
-	lll_futex_wait (gscope_flagp, THREAD_GSCOPE_FLAG_WAIT, LLL_PRIVATE);
+	futex_wait_simple ((unsigned int *) gscope_flagp,
+			   THREAD_GSCOPE_FLAG_WAIT, FUTEX_PRIVATE);
       while (*gscope_flagp == THREAD_GSCOPE_FLAG_WAIT);
     }
 
@@ -1273,7 +1277,8 @@  __wait_lookup_done (void)
 	continue;
 
       do
-	lll_futex_wait (gscope_flagp, THREAD_GSCOPE_FLAG_WAIT, LLL_PRIVATE);
+	futex_wait_simple ((unsigned int *) gscope_flagp,
+			   THREAD_GSCOPE_FLAG_WAIT, FUTEX_PRIVATE);
       while (*gscope_flagp == THREAD_GSCOPE_FLAG_WAIT);
     }
 
diff --git a/nptl/cancellation.c b/nptl/cancellation.c
index deac1eb..ea1bbc9 100644
--- a/nptl/cancellation.c
+++ b/nptl/cancellation.c
@@ -19,6 +19,7 @@ 
 #include <setjmp.h>
 #include <stdlib.h>
 #include "pthreadP.h"
+#include <futex-internal.h>
 
 
 /* The next two functions are similar to pthread_setcanceltype() but
@@ -93,7 +94,7 @@  __pthread_disable_asynccancel (int oldtype)
   while (__builtin_expect ((newval & (CANCELING_BITMASK | CANCELED_BITMASK))
 			   == CANCELING_BITMASK, 0))
     {
-      lll_futex_wait (&self->cancelhandling, newval, LLL_PRIVATE);
+      futex_wait_simple (&self->cancelhandling, newval, FUTEX_PRIVATE);
       newval = THREAD_GETMEM (self, cancelhandling);
     }
 }
diff --git a/nptl/nptl-init.c b/nptl/nptl-init.c
index 8a51161..c875d3d 100644
--- a/nptl/nptl-init.c
+++ b/nptl/nptl-init.c
@@ -34,6 +34,7 @@ 
 #include <shlib-compat.h>
 #include <smp.h>
 #include <lowlevellock.h>
+#include <futex-internal.h>
 #include <kernel-features.h>
 #include <libc-internal.h>
 #include <pthread-pids.h>
@@ -279,10 +280,10 @@  sighandler_setxid (int sig, siginfo_t *si, void *ctx)
 
   /* And release the futex.  */
   self->setxid_futex = 1;
-  lll_futex_wake (&self->setxid_futex, 1, LLL_PRIVATE);
+  futex_wake ((unsigned int *) &self->setxid_futex, 1, FUTEX_PRIVATE);
 
   if (atomic_decrement_val (&__xidcmd->cntr) == 0)
-    lll_futex_wake (&__xidcmd->cntr, 1, LLL_PRIVATE);
+    futex_wake ((unsigned int *) &__xidcmd->cntr, 1, FUTEX_PRIVATE);
 }
 #endif
 
diff --git a/nptl/pthread_barrier_init.c b/nptl/pthread_barrier_init.c
index 82e13fb..374a2f7 100644
--- a/nptl/pthread_barrier_init.c
+++ b/nptl/pthread_barrier_init.c
@@ -36,6 +36,7 @@  pthread_barrier_init (barrier, attr, count)
 {
   struct pthread_barrier *ibarrier;
 
+  /* XXX EINVAL is not specified by POSIX as a possible error code.  */
   if (__glibc_unlikely (count == 0))
     return EINVAL;
 
@@ -46,8 +47,9 @@  pthread_barrier_init (barrier, attr, count)
 
   if (iattr->pshared != PTHREAD_PROCESS_PRIVATE
       && __builtin_expect (iattr->pshared != PTHREAD_PROCESS_SHARED, 0))
-    /* Invalid attribute.  */
-    return EINVAL;
+    /* Invalid attribute, but the initializer of the attributes has to
+       check that.  */
+    abort ();
 
   ibarrier = (struct pthread_barrier *) barrier;
 
@@ -57,6 +59,8 @@  pthread_barrier_init (barrier, attr, count)
   ibarrier->init_count = count;
   ibarrier->curr_event = 0;
 
+  /* XXX Don't use FUTEX_SHARED or FUTEX_PRIVATE as long as there are still
+     assembly implementations that expect the value determined below.  */
 #ifdef __ASSUME_PRIVATE_FUTEX
   ibarrier->private = (iattr->pshared != PTHREAD_PROCESS_PRIVATE
 		       ? 0 : FUTEX_PRIVATE_FLAG);
diff --git a/nptl/pthread_barrier_wait.c b/nptl/pthread_barrier_wait.c
index 9e7090f..b2fed86 100644
--- a/nptl/pthread_barrier_wait.c
+++ b/nptl/pthread_barrier_wait.c
@@ -19,6 +19,7 @@ 
 #include <errno.h>
 #include <sysdep.h>
 #include <lowlevellock.h>
+#include <futex-internal.h>
 #include <pthreadP.h>
 
 
@@ -29,9 +30,12 @@  pthread_barrier_wait (barrier)
 {
   struct pthread_barrier *ibarrier = (struct pthread_barrier *) barrier;
   int result = 0;
+  int lll_private = ibarrier->private ^ FUTEX_PRIVATE_FLAG;
+  int futex_private = (lll_private == LLL_PRIVATE)
+		      ? FUTEX_PRIVATE : FUTEX_SHARED;
 
   /* Make sure we are alone.  */
-  lll_lock (ibarrier->lock, ibarrier->private ^ FUTEX_PRIVATE_FLAG);
+  lll_lock (ibarrier->lock, lll_private);
 
   /* One more arrival.  */
   --ibarrier->left;
@@ -44,8 +48,7 @@  pthread_barrier_wait (barrier)
       ++ibarrier->curr_event;
 
       /* Wake up everybody.  */
-      lll_futex_wake (&ibarrier->curr_event, INT_MAX,
-		      ibarrier->private ^ FUTEX_PRIVATE_FLAG);
+      futex_wake (&ibarrier->curr_event, INT_MAX, futex_private);
 
       /* This is the thread which finished the serialization.  */
       result = PTHREAD_BARRIER_SERIAL_THREAD;
@@ -57,12 +60,11 @@  pthread_barrier_wait (barrier)
       unsigned int event = ibarrier->curr_event;
 
       /* Before suspending, make the barrier available to others.  */
-      lll_unlock (ibarrier->lock, ibarrier->private ^ FUTEX_PRIVATE_FLAG);
+      lll_unlock (ibarrier->lock, lll_private);
 
       /* Wait for the event counter of the barrier to change.  */
       do
-	lll_futex_wait (&ibarrier->curr_event, event,
-			ibarrier->private ^ FUTEX_PRIVATE_FLAG);
+	futex_wait_simple (&ibarrier->curr_event, event, futex_private);
       while (event == ibarrier->curr_event);
     }
 
@@ -72,7 +74,7 @@  pthread_barrier_wait (barrier)
   /* If this was the last woken thread, unlock.  */
   if (atomic_increment_val (&ibarrier->left) == init_count)
     /* We are done.  */
-    lll_unlock (ibarrier->lock, ibarrier->private ^ FUTEX_PRIVATE_FLAG);
+    lll_unlock (ibarrier->lock, lll_private);
 
   return result;
 }
diff --git a/nptl/pthread_barrierattr_setpshared.c b/nptl/pthread_barrierattr_setpshared.c
index 86d72c5..35c3b14 100644
--- a/nptl/pthread_barrierattr_setpshared.c
+++ b/nptl/pthread_barrierattr_setpshared.c
@@ -18,6 +18,7 @@ 
 
 #include <errno.h>
 #include "pthreadP.h"
+#include <futex-internal.h>
 
 
 int
@@ -30,6 +31,12 @@  pthread_barrierattr_setpshared (attr, pshared)
   if (pshared != PTHREAD_PROCESS_PRIVATE
       && __builtin_expect (pshared != PTHREAD_PROCESS_SHARED, 0))
     return EINVAL;
+  /* Sanity check for systems that do not support shared futexes.  POSIX
+     does not specify an error code for that; EINVAL can only be returned
+     if the previous check fails.  Thus, we cannot expect programs to
+     expect this error, and thus aborting is safer.  */
+  if (pshared == PTHREAD_PROCESS_SHARED && !futex_supports_shared ())
+    abort ();
 
   iattr = (struct pthread_barrierattr *) attr;
 
diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c
index 71a5619..b23deb2 100644
--- a/nptl/pthread_create.c
+++ b/nptl/pthread_create.c
@@ -31,6 +31,7 @@ 
 #include <kernel-features.h>
 #include <exit-thread.h>
 #include <default-sched.h>
+#include <futex-internal.h>
 
 #include <shlib-compat.h>
 
@@ -269,7 +270,7 @@  START_THREAD_DEFN
 
   /* Allow setxid from now onwards.  */
   if (__glibc_unlikely (atomic_exchange_acq (&pd->setxid_futex, 0) == -2))
-    lll_futex_wake (&pd->setxid_futex, 1, LLL_PRIVATE);
+    futex_wake ((unsigned int *) &pd->setxid_futex, 1, FUTEX_PRIVATE);
 
 #ifdef __NR_set_robust_list
 # ifndef __ASSUME_SET_ROBUST_LIST
@@ -414,7 +415,7 @@  START_THREAD_DEFN
 	  this->__list.__next = NULL;
 
 	  atomic_or (&this->__lock, FUTEX_OWNER_DIED);
-	  lll_futex_wake (&this->__lock, 1, /* XYZ */ LLL_SHARED);
+	  futex_wake (&this->__lock, 1, /* XYZ */ FUTEX_SHARED);
 	}
       while (robust != (void *) &pd->robust_head);
     }
@@ -442,7 +443,12 @@  START_THREAD_DEFN
       /* Some other thread might call any of the setXid functions and expect
 	 us to reply.  In this case wait until we did that.  */
       do
-	lll_futex_wait (&pd->setxid_futex, 0, LLL_PRIVATE);
+	/* XXX This differs from the typical futex_wait_simple pattern in that
+	   the futex_wait condition (setxid_futex) is different from the
+	   condition used in the surrounding loop (cancelhandling).  We need
+	   to check and document why this is correct.  */
+	futex_wait_simple ((unsigned int *) &pd->setxid_futex, 0,
+			   FUTEX_PRIVATE);
       while (pd->cancelhandling & SETXID_BITMASK);
 
       /* Reset the value so that the stack can be reused.  */
@@ -683,7 +689,7 @@  __pthread_create_2_1 (newthread, attr, start_routine, arg)
 	     stillborn thread.  */
 	  if (__glibc_unlikely (atomic_exchange_acq (&pd->setxid_futex, 0)
 				== -2))
-	    lll_futex_wake (&pd->setxid_futex, 1, LLL_PRIVATE);
+	    futex_wake ((unsigned int *) &pd->setxid_futex, 1, FUTEX_PRIVATE);
 
 	  /* Free the resources.  */
 	  __deallocate_stack (pd);
diff --git a/nptl/pthread_once.c b/nptl/pthread_once.c
index fe6d923..642730b 100644
--- a/nptl/pthread_once.c
+++ b/nptl/pthread_once.c
@@ -17,7 +17,7 @@ 
    <http://www.gnu.org/licenses/>.  */
 
 #include "pthreadP.h"
-#include <lowlevellock.h>
+#include <futex-internal.h>
 #include <atomic.h>
 
 
@@ -35,7 +35,7 @@  clear_once_control (void *arg)
      get interrupted (see __pthread_once), so all we need to relay to other
      threads is the state being reset again.  */
   atomic_store_relaxed (once_control, 0);
-  lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE);
+  futex_wake ((unsigned int *) once_control, INT_MAX, FUTEX_PRIVATE);
 }
 
 
@@ -100,8 +100,10 @@  __pthread_once_slow (pthread_once_t *once_control, void (*init_routine) (void))
 	     is set and __PTHREAD_ONCE_DONE is not.  */
 	  if (val == newval)
 	    {
-	      /* Same generation, some other thread was faster. Wait.  */
-	      lll_futex_wait (once_control, newval, LLL_PRIVATE);
+	      /* Same generation, some other thread was faster.  Wait and
+		 retry.  */
+	      futex_wait_simple ((unsigned int *) once_control,
+				 (unsigned int) newval, FUTEX_PRIVATE);
 	      continue;
 	    }
 	}
@@ -122,7 +124,7 @@  __pthread_once_slow (pthread_once_t *once_control, void (*init_routine) (void))
       atomic_store_release (once_control, __PTHREAD_ONCE_DONE);
 
       /* Wake up all other threads.  */
-      lll_futex_wake (once_control, INT_MAX, LLL_PRIVATE);
+      futex_wake ((unsigned int *) once_control, INT_MAX, FUTEX_PRIVATE);
       break;
     }
 
diff --git a/nptl/pthread_rwlock_rdlock.c b/nptl/pthread_rwlock_rdlock.c
index 004a386..aa1593d 100644
--- a/nptl/pthread_rwlock_rdlock.c
+++ b/nptl/pthread_rwlock_rdlock.c
@@ -19,6 +19,7 @@ 
 #include <errno.h>
 #include <sysdep.h>
 #include <lowlevellock.h>
+#include <futex-internal.h>
 #include <pthread.h>
 #include <pthreadP.h>
 #include <stap-probe.h>
@@ -32,6 +33,8 @@  __pthread_rwlock_rdlock_slow (pthread_rwlock_t *rwlock)
 {
   int result = 0;
   bool wake = false;
+  int futex_shared =
+      (rwlock->__data.__shared == LLL_PRIVATE) ? FUTEX_PRIVATE : FUTEX_SHARED;
 
   /* Lock is taken in caller.  */
 
@@ -60,9 +63,10 @@  __pthread_rwlock_rdlock_slow (pthread_rwlock_t *rwlock)
       /* Free the lock.  */
       lll_unlock (rwlock->__data.__lock, rwlock->__data.__shared);
 
-      /* Wait for the writer to finish.  */
-      lll_futex_wait (&rwlock->__data.__readers_wakeup, waitval,
-		      rwlock->__data.__shared);
+      /* Wait for the writer to finish.  We do not check the return value
+	 because we decide how to continue based on the state of the rwlock.  */
+      futex_wait_simple (&rwlock->__data.__readers_wakeup, waitval,
+			 futex_shared);
 
       /* Get the lock.  */
       lll_lock (rwlock->__data.__lock, rwlock->__data.__shared);
@@ -103,8 +107,7 @@  __pthread_rwlock_rdlock_slow (pthread_rwlock_t *rwlock)
   lll_unlock (rwlock->__data.__lock, rwlock->__data.__shared);
 
   if (wake)
-    lll_futex_wake (&rwlock->__data.__readers_wakeup, INT_MAX,
-		    rwlock->__data.__shared);
+    futex_wake (&rwlock->__data.__readers_wakeup, INT_MAX, futex_shared);
 
   return result;
 }
@@ -117,6 +120,8 @@  __pthread_rwlock_rdlock (pthread_rwlock_t *rwlock)
 {
   int result = 0;
   bool wake = false;
+  int futex_shared =
+      (rwlock->__data.__shared == LLL_PRIVATE) ? FUTEX_PRIVATE : FUTEX_SHARED;
 
   LIBC_PROBE (rdlock_entry, 1, rwlock);
 
@@ -164,8 +169,7 @@  __pthread_rwlock_rdlock (pthread_rwlock_t *rwlock)
       lll_unlock (rwlock->__data.__lock, rwlock->__data.__shared);
 
       if (wake)
-	lll_futex_wake (&rwlock->__data.__readers_wakeup, INT_MAX,
-			rwlock->__data.__shared);
+	futex_wake (&rwlock->__data.__readers_wakeup, INT_MAX, futex_shared);
 
       return result;
     }
diff --git a/nptl/pthread_rwlock_timedrdlock.c b/nptl/pthread_rwlock_timedrdlock.c
index 63fb313..207918e 100644
--- a/nptl/pthread_rwlock_timedrdlock.c
+++ b/nptl/pthread_rwlock_timedrdlock.c
@@ -19,10 +19,10 @@ 
 #include <errno.h>
 #include <sysdep.h>
 #include <lowlevellock.h>
+#include <futex-internal.h>
 #include <pthread.h>
 #include <pthreadP.h>
 #include <sys/time.h>
-#include <kernel-features.h>
 #include <stdbool.h>
 
 
@@ -34,6 +34,8 @@  pthread_rwlock_timedrdlock (rwlock, abstime)
 {
   int result = 0;
   bool wake = false;
+  int futex_shared =
+      (rwlock->__data.__shared == LLL_PRIVATE) ? FUTEX_PRIVATE : FUTEX_SHARED;
 
   /* Make sure we are alone.  */
   lll_lock(rwlock->__data.__lock, rwlock->__data.__shared);
@@ -91,38 +93,6 @@  pthread_rwlock_timedrdlock (rwlock, abstime)
 	  break;
 	}
 
-      /* Work around the fact that the kernel rejects negative timeout values
-	 despite them being valid.  */
-      if (__glibc_unlikely (abstime->tv_sec < 0))
-	{
-	  result = ETIMEDOUT;
-	  break;
-	}
-
-#if (!defined __ASSUME_FUTEX_CLOCK_REALTIME \
-     || !defined lll_futex_timed_wait_bitset)
-      /* Get the current time.  So far we support only one clock.  */
-      struct timeval tv;
-      (void) __gettimeofday (&tv, NULL);
-
-      /* Convert the absolute timeout value to a relative timeout.  */
-      struct timespec rt;
-      rt.tv_sec = abstime->tv_sec - tv.tv_sec;
-      rt.tv_nsec = abstime->tv_nsec - tv.tv_usec * 1000;
-      if (rt.tv_nsec < 0)
-	{
-	  rt.tv_nsec += 1000000000;
-	  --rt.tv_sec;
-	}
-      /* Did we already time out?  */
-      if (rt.tv_sec < 0)
-	{
-	  /* Yep, return with an appropriate error.  */
-	  result = ETIMEDOUT;
-	  break;
-	}
-#endif
-
       /* Remember that we are a reader.  */
       if (++rwlock->__data.__nr_readers_queued == 0)
 	{
@@ -137,17 +107,11 @@  pthread_rwlock_timedrdlock (rwlock, abstime)
       /* Free the lock.  */
       lll_unlock (rwlock->__data.__lock, rwlock->__data.__shared);
 
-      /* Wait for the writer to finish.  */
-#if (!defined __ASSUME_FUTEX_CLOCK_REALTIME \
-     || !defined lll_futex_timed_wait_bitset)
-      err = lll_futex_timed_wait (&rwlock->__data.__readers_wakeup,
-				  waitval, &rt, rwlock->__data.__shared);
-#else
-      err = lll_futex_timed_wait_bitset (&rwlock->__data.__readers_wakeup,
-					 waitval, abstime,
-					 FUTEX_CLOCK_REALTIME,
-					 rwlock->__data.__shared);
-#endif
+      /* Wait for the writer to finish.  We handle ETIMEDOUT below; on other
+	 return values, we decide how to continue based on the state of the
+	 rwlock.  */
+      err = futex_abstimed_wait (&rwlock->__data.__readers_wakeup, waitval,
+				 abstime, futex_shared);
 
       /* Get the lock.  */
       lll_lock (rwlock->__data.__lock, rwlock->__data.__shared);
@@ -155,7 +119,7 @@  pthread_rwlock_timedrdlock (rwlock, abstime)
       --rwlock->__data.__nr_readers_queued;
 
       /* Did the futex call time out?  */
-      if (err == -ETIMEDOUT)
+      if (err == ETIMEDOUT)
 	{
 	  /* Yep, report it.  */
 	  result = ETIMEDOUT;
@@ -167,8 +131,7 @@  pthread_rwlock_timedrdlock (rwlock, abstime)
   lll_unlock (rwlock->__data.__lock, rwlock->__data.__shared);
 
   if (wake)
-    lll_futex_wake (&rwlock->__data.__readers_wakeup, INT_MAX,
-		    rwlock->__data.__shared);
+    futex_wake (&rwlock->__data.__readers_wakeup, INT_MAX, futex_shared);
 
   return result;
 }
diff --git a/nptl/pthread_rwlock_timedwrlock.c b/nptl/pthread_rwlock_timedwrlock.c
index c542534..2f30022 100644
--- a/nptl/pthread_rwlock_timedwrlock.c
+++ b/nptl/pthread_rwlock_timedwrlock.c
@@ -19,10 +19,10 @@ 
 #include <errno.h>
 #include <sysdep.h>
 #include <lowlevellock.h>
+#include <futex-internal.h>
 #include <pthread.h>
 #include <pthreadP.h>
 #include <sys/time.h>
-#include <kernel-features.h>
 #include <stdbool.h>
 
 
@@ -34,6 +34,8 @@  pthread_rwlock_timedwrlock (rwlock, abstime)
 {
   int result = 0;
   bool wake_readers = false;
+  int futex_shared =
+      (rwlock->__data.__shared == LLL_PRIVATE) ? FUTEX_PRIVATE : FUTEX_SHARED;
 
   /* Make sure we are alone.  */
   lll_lock (rwlock->__data.__lock, rwlock->__data.__shared);
@@ -71,37 +73,6 @@  pthread_rwlock_timedwrlock (rwlock, abstime)
 	  break;
 	}
 
-      /* Work around the fact that the kernel rejects negative timeout values
-	 despite them being valid.  */
-      if (__glibc_unlikely (abstime->tv_sec < 0))
-	{
-	  result = ETIMEDOUT;
-	  break;
-	}
-
-#if (!defined __ASSUME_FUTEX_CLOCK_REALTIME \
-     || !defined lll_futex_timed_wait_bitset)
-      /* Get the current time.  So far we support only one clock.  */
-      struct timeval tv;
-      (void) __gettimeofday (&tv, NULL);
-
-      /* Convert the absolute timeout value to a relative timeout.  */
-      struct timespec rt;
-      rt.tv_sec = abstime->tv_sec - tv.tv_sec;
-      rt.tv_nsec = abstime->tv_nsec - tv.tv_usec * 1000;
-      if (rt.tv_nsec < 0)
-	{
-	  rt.tv_nsec += 1000000000;
-	  --rt.tv_sec;
-	}
-      /* Did we already time out?  */
-      if (rt.tv_sec < 0)
-	{
-	  result = ETIMEDOUT;
-	  break;
-	}
-#endif
-
       /* Remember that we are a writer.  */
       if (++rwlock->__data.__nr_writers_queued == 0)
 	{
@@ -116,17 +87,11 @@  pthread_rwlock_timedwrlock (rwlock, abstime)
       /* Free the lock.  */
       lll_unlock (rwlock->__data.__lock, rwlock->__data.__shared);
 
-      /* Wait for the writer or reader(s) to finish.  */
-#if (!defined __ASSUME_FUTEX_CLOCK_REALTIME \
-     || !defined lll_futex_timed_wait_bitset)
-      err = lll_futex_timed_wait (&rwlock->__data.__writer_wakeup,
-				  waitval, &rt, rwlock->__data.__shared);
-#else
-      err = lll_futex_timed_wait_bitset (&rwlock->__data.__writer_wakeup,
-					 waitval, abstime,
-					 FUTEX_CLOCK_REALTIME,
-					 rwlock->__data.__shared);
-#endif
+      /* Wait for the writer or reader(s) to finish.  We handle ETIMEDOUT
+	 below; on other return values, we decide how to continue based on
+	 the state of the rwlock.  */
+      err = futex_abstimed_wait (&rwlock->__data.__writer_wakeup, waitval,
+				 abstime, futex_shared);
 
       /* Get the lock.  */
       lll_lock (rwlock->__data.__lock, rwlock->__data.__shared);
@@ -135,7 +100,7 @@  pthread_rwlock_timedwrlock (rwlock, abstime)
       --rwlock->__data.__nr_writers_queued;
 
       /* Did the futex call time out?  */
-      if (err == -ETIMEDOUT)
+      if (err == ETIMEDOUT)
 	{
 	  result = ETIMEDOUT;
 	  /* If we prefer writers, it can have happened that readers blocked
@@ -166,8 +131,7 @@  pthread_rwlock_timedwrlock (rwlock, abstime)
 
   /* Might be required after timeouts.  */
   if (wake_readers)
-    lll_futex_wake (&rwlock->__data.__readers_wakeup, INT_MAX,
-	rwlock->__data.__shared);
+    futex_wake (&rwlock->__data.__readers_wakeup, INT_MAX, futex_shared);
 
   return result;
 }
diff --git a/nptl/pthread_rwlock_tryrdlock.c b/nptl/pthread_rwlock_tryrdlock.c
index cde123f..ee0ab1f 100644
--- a/nptl/pthread_rwlock_tryrdlock.c
+++ b/nptl/pthread_rwlock_tryrdlock.c
@@ -19,6 +19,7 @@ 
 #include <errno.h>
 #include "pthreadP.h"
 #include <lowlevellock.h>
+#include <futex-internal.h>
 #include <elide.h>
 #include <stdbool.h>
 
@@ -28,6 +29,8 @@  __pthread_rwlock_tryrdlock (pthread_rwlock_t *rwlock)
 {
   int result = EBUSY;
   bool wake = false;
+  int futex_shared =
+      (rwlock->__data.__shared == LLL_PRIVATE) ? FUTEX_PRIVATE : FUTEX_SHARED;
 
   if (ELIDE_TRYLOCK (rwlock->__data.__rwelision,
 		     rwlock->__data.__lock == 0
@@ -63,8 +66,7 @@  __pthread_rwlock_tryrdlock (pthread_rwlock_t *rwlock)
   lll_unlock (rwlock->__data.__lock, rwlock->__data.__shared);
 
   if (wake)
-    lll_futex_wake (&rwlock->__data.__readers_wakeup, INT_MAX,
-		    rwlock->__data.__shared);
+    futex_wake (&rwlock->__data.__readers_wakeup, INT_MAX, futex_shared);
 
   return result;
 }
diff --git a/nptl/pthread_rwlock_unlock.c b/nptl/pthread_rwlock_unlock.c
index d2ad4b0..b41c6ba 100644
--- a/nptl/pthread_rwlock_unlock.c
+++ b/nptl/pthread_rwlock_unlock.c
@@ -19,6 +19,7 @@ 
 #include <errno.h>
 #include <sysdep.h>
 #include <lowlevellock.h>
+#include <futex-internal.h>
 #include <pthread.h>
 #include <pthreadP.h>
 #include <stap-probe.h>
@@ -29,6 +30,9 @@ 
 int
 __pthread_rwlock_unlock (pthread_rwlock_t *rwlock)
 {
+  int futex_shared =
+      (rwlock->__data.__shared == LLL_PRIVATE) ? FUTEX_PRIVATE : FUTEX_SHARED;
+
   LIBC_PROBE (rwlock_unlock, 1, rwlock);
 
   if (ELIDE_UNLOCK (rwlock->__data.__writer == 0
@@ -51,16 +55,15 @@  __pthread_rwlock_unlock (pthread_rwlock_t *rwlock)
 	{
 	  ++rwlock->__data.__writer_wakeup;
 	  lll_unlock (rwlock->__data.__lock, rwlock->__data.__shared);
-	  lll_futex_wake (&rwlock->__data.__writer_wakeup, 1,
-			  rwlock->__data.__shared);
+	  futex_wake (&rwlock->__data.__writer_wakeup, 1, futex_shared);
 	  return 0;
 	}
       else if (rwlock->__data.__nr_readers_queued)
 	{
 	  ++rwlock->__data.__readers_wakeup;
 	  lll_unlock (rwlock->__data.__lock, rwlock->__data.__shared);
-	  lll_futex_wake (&rwlock->__data.__readers_wakeup, INT_MAX,
-			  rwlock->__data.__shared);
+	  futex_wake (&rwlock->__data.__readers_wakeup, INT_MAX,
+		      futex_shared);
 	  return 0;
 	}
     }
diff --git a/nptl/pthread_rwlock_wrlock.c b/nptl/pthread_rwlock_wrlock.c
index 835a62f..9c495d8 100644
--- a/nptl/pthread_rwlock_wrlock.c
+++ b/nptl/pthread_rwlock_wrlock.c
@@ -19,6 +19,7 @@ 
 #include <errno.h>
 #include <sysdep.h>
 #include <lowlevellock.h>
+#include <futex-internal.h>
 #include <pthread.h>
 #include <pthreadP.h>
 #include <stap-probe.h>
@@ -30,6 +31,8 @@  static int __attribute__((noinline))
 __pthread_rwlock_wrlock_slow (pthread_rwlock_t *rwlock)
 {
   int result = 0;
+  int futex_shared =
+      (rwlock->__data.__shared == LLL_PRIVATE) ? FUTEX_PRIVATE : FUTEX_SHARED;
 
   /* Caller has taken the lock.  */
 
@@ -58,9 +61,11 @@  __pthread_rwlock_wrlock_slow (pthread_rwlock_t *rwlock)
       /* Free the lock.  */
       lll_unlock (rwlock->__data.__lock, rwlock->__data.__shared);
 
-      /* Wait for the writer or reader(s) to finish.  */
-      lll_futex_wait (&rwlock->__data.__writer_wakeup, waitval,
-		      rwlock->__data.__shared);
+      /* Wait for the writer or reader(s) to finish.  We do not check the
+	 return value because we decide how to continue based on the state of
+	 the rwlock.  */
+      futex_wait_simple (&rwlock->__data.__writer_wakeup, waitval,
+			 futex_shared);
 
       /* Get the lock.  */
       lll_lock (rwlock->__data.__lock, rwlock->__data.__shared);
diff --git a/nptl/sem_init.c b/nptl/sem_init.c
index 575b661..bb41620 100644
--- a/nptl/sem_init.c
+++ b/nptl/sem_init.c
@@ -21,22 +21,7 @@ 
 #include <shlib-compat.h>
 #include "semaphoreP.h"
 #include <kernel-features.h>
-
-/* Returns FUTEX_PRIVATE if pshared is zero and private futexes are supported;
-   returns FUTEX_SHARED otherwise.
-   TODO Remove when cleaning up the futex API throughout glibc.  */
-static __always_inline int
-futex_private_if_supported (int pshared)
-{
-  if (pshared != 0)
-    return LLL_SHARED;
-#ifdef __ASSUME_PRIVATE_FUTEX
-  return LLL_PRIVATE;
-#else
-  return THREAD_GETMEM (THREAD_SELF, header.private_futex)
-      ^ FUTEX_PRIVATE_FLAG;
-#endif
-}
+#include <futex-internal.h>
 
 
 int
@@ -48,6 +33,12 @@  __new_sem_init (sem_t *sem, int pshared, unsigned int value)
       __set_errno (EINVAL);
       return -1;
     }
+  if (pshared != 0 && !futex_supports_shared ())
+    /* POSIX doesn't explicitly associate an error with having no support
+       for shared futexes.  We could perhaps use another allowed error code
+       such as EPERM, but aborting is consistent with what we do elsewhere
+       (e.g., in pthread_barrierattr_setpshared).  */
+    abort ();
 
   /* Map to the internal type.  */
   struct new_sem *isem = (struct new_sem *) sem;
@@ -60,7 +51,7 @@  __new_sem_init (sem_t *sem, int pshared, unsigned int value)
   isem->nwaiters = 0;
 #endif
 
-  isem->private = futex_private_if_supported (pshared);
+  isem->private = (pshared == 0) ? FUTEX_PRIVATE : FUTEX_SHARED;
 
   return 0;
 }
diff --git a/nptl/sem_open.c b/nptl/sem_open.c
index bfd2dea..2e053ad 100644
--- a/nptl/sem_open.c
+++ b/nptl/sem_open.c
@@ -30,6 +30,7 @@ 
 #include <sys/stat.h>
 #include "semaphoreP.h"
 #include <shm-directory.h>
+#include <futex-internal.h>
 
 
 /* Comparison function for search of existing mapping.  */
@@ -200,7 +201,7 @@  sem_open (const char *name, int oflag, ...)
       sem.newsem.nwaiters = 0;
 #endif
       /* This always is a shared semaphore.  */
-      sem.newsem.private = LLL_SHARED;
+      sem.newsem.private = FUTEX_SHARED;
 
       /* Initialize the remaining bytes as well.  */
       memset ((char *) &sem.initsem + sizeof (struct new_sem), '\0',
diff --git a/nptl/sem_post.c b/nptl/sem_post.c
index b6d30b5..06d8359 100644
--- a/nptl/sem_post.c
+++ b/nptl/sem_post.c
@@ -20,37 +20,13 @@ 
 #include <atomic.h>
 #include <errno.h>
 #include <sysdep.h>
-#include <lowlevellock.h>
+#include <lowlevellock.h>	/* lll_futex* used by the old code.  */
+#include <futex-internal.h>
 #include <internaltypes.h>
 #include <semaphore.h>
 
 #include <shlib-compat.h>
 
-/* Wrapper for lll_futex_wake, with error checking.
-   TODO Remove when cleaning up the futex API throughout glibc.  */
-static __always_inline void
-futex_wake (unsigned int* futex, int processes_to_wake, int private)
-{
-  int res = lll_futex_wake (futex, processes_to_wake, private);
-  /* No error.  Ignore the number of woken processes.  */
-  if (res >= 0)
-    return;
-  switch (res)
-    {
-    case -EFAULT: /* Could have happened due to memory reuse.  */
-    case -EINVAL: /* Could be either due to incorrect alignment (a bug in
-		     glibc or in the application) or due to memory being
-		     reused for a PI futex.  We cannot distinguish between the
-		     two causes, and one of them is correct use, so we do not
-		     act in this case.  */
-      return;
-    case -ENOSYS: /* Must have been caused by a glibc bug.  */
-    /* No other errors are documented at this time.  */
-    default:
-      abort ();
-    }
-}
-
 
 /* See sem_wait for an explanation of the algorithm.  */
 int
diff --git a/nptl/sem_wait.c b/nptl/sem_wait.c
index c1fd10c..fce7ed4 100644
--- a/nptl/sem_wait.c
+++ b/nptl/sem_wait.c
@@ -17,6 +17,7 @@ 
    License along with the GNU C Library; if not, see
    <http://www.gnu.org/licenses/>.  */
 
+#include <lowlevellock.h>	/* lll_futex* used by the old code.  */
 #include "sem_waitcommon.c"
 
 int
diff --git a/nptl/sem_waitcommon.c b/nptl/sem_waitcommon.c
index 772425d..d3702c7 100644
--- a/nptl/sem_waitcommon.c
+++ b/nptl/sem_waitcommon.c
@@ -20,7 +20,7 @@ 
 #include <kernel-features.h>
 #include <errno.h>
 #include <sysdep.h>
-#include <lowlevellock.h>
+#include <futex-internal.h>
 #include <internaltypes.h>
 #include <semaphore.h>
 #include <sys/time.h>
@@ -29,110 +29,6 @@ 
 #include <shlib-compat.h>
 #include <atomic.h>
 
-/* Wrapper for lll_futex_wait with absolute timeout and error checking.
-   TODO Remove when cleaning up the futex API throughout glibc.  */
-static __always_inline int
-futex_abstimed_wait (unsigned int* futex, unsigned int expected,
-		     const struct timespec* abstime, int private, bool cancel)
-{
-  int err, oldtype;
-  if (abstime == NULL)
-    {
-      if (cancel)
-	oldtype = __pthread_enable_asynccancel ();
-      err = lll_futex_wait (futex, expected, private);
-      if (cancel)
-	__pthread_disable_asynccancel (oldtype);
-    }
-  else
-    {
-#if (defined __ASSUME_FUTEX_CLOCK_REALTIME	\
-     && defined lll_futex_timed_wait_bitset)
-      /* The Linux kernel returns EINVAL for this, but in userspace
-	 such a value is valid.  */
-      if (abstime->tv_sec < 0)
-	return ETIMEDOUT;
-#else
-      struct timeval tv;
-      struct timespec rt;
-      int sec, nsec;
-
-      /* Get the current time.  */
-      __gettimeofday (&tv, NULL);
-
-      /* Compute relative timeout.  */
-      sec = abstime->tv_sec - tv.tv_sec;
-      nsec = abstime->tv_nsec - tv.tv_usec * 1000;
-      if (nsec < 0)
-        {
-          nsec += 1000000000;
-          --sec;
-        }
-
-      /* Already timed out?  */
-      if (sec < 0)
-        return ETIMEDOUT;
-
-      /* Do wait.  */
-      rt.tv_sec = sec;
-      rt.tv_nsec = nsec;
-#endif
-      if (cancel)
-	oldtype = __pthread_enable_asynccancel ();
-#if (defined __ASSUME_FUTEX_CLOCK_REALTIME	\
-     && defined lll_futex_timed_wait_bitset)
-      err = lll_futex_timed_wait_bitset (futex, expected, abstime,
-					 FUTEX_CLOCK_REALTIME, private);
-#else
-      err = lll_futex_timed_wait (futex, expected, &rt, private);
-#endif
-      if (cancel)
-	__pthread_disable_asynccancel (oldtype);
-    }
-  switch (err)
-    {
-    case 0:
-    case -EAGAIN:
-    case -EINTR:
-    case -ETIMEDOUT:
-      return -err;
-
-    case -EFAULT: /* Must have been caused by a glibc or application bug.  */
-    case -EINVAL: /* Either due to wrong alignment or due to the timeout not
-		     being normalized.  Must have been caused by a glibc or
-		     application bug.  */
-    case -ENOSYS: /* Must have been caused by a glibc bug.  */
-    /* No other errors are documented at this time.  */
-    default:
-      abort ();
-    }
-}
-
-/* Wrapper for lll_futex_wake, with error checking.
-   TODO Remove when cleaning up the futex API throughout glibc.  */
-static __always_inline void
-futex_wake (unsigned int* futex, int processes_to_wake, int private)
-{
-  int res = lll_futex_wake (futex, processes_to_wake, private);
-  /* No error.  Ignore the number of woken processes.  */
-  if (res >= 0)
-    return;
-  switch (res)
-    {
-    case -EFAULT: /* Could have happened due to memory reuse.  */
-    case -EINVAL: /* Could be either due to incorrect alignment (a bug in
-		     glibc or in the application) or due to memory being
-		     reused for a PI futex.  We cannot distinguish between the
-		     two causes, and one of them is correct use, so we do not
-		     act in this case.  */
-      return;
-    case -ENOSYS: /* Must have been caused by a glibc bug.  */
-    /* No other errors are documented at this time.  */
-    default:
-      abort ();
-    }
-}
-
 
 /* The semaphore provides two main operations: sem_post adds a token to the
    semaphore; sem_wait grabs a token from the semaphore, potentially waiting
@@ -220,11 +116,12 @@  do_futex_wait (struct new_sem *sem, const struct timespec *abstime)
   int err;
 
 #if __HAVE_64B_ATOMICS
-  err = futex_abstimed_wait ((unsigned int *) &sem->data + SEM_VALUE_OFFSET, 0,
-			     abstime, sem->private, true);
+  err = futex_abstimed_wait_cancelable (
+      (unsigned int *) &sem->data + SEM_VALUE_OFFSET, 0, abstime,
+      sem->private);
 #else
-  err = futex_abstimed_wait (&sem->value, SEM_NWAITERS_MASK, abstime,
-			     sem->private, true);
+  err = futex_abstimed_wait_cancelable (&sem->value, SEM_NWAITERS_MASK,
+					abstime, sem->private);
 #endif
 
   return err;
diff --git a/nptl/unregister-atfork.c b/nptl/unregister-atfork.c
index 3838cb7..6d08ed7 100644
--- a/nptl/unregister-atfork.c
+++ b/nptl/unregister-atfork.c
@@ -20,6 +20,7 @@ 
 #include <stdlib.h>
 #include <fork.h>
 #include <atomic.h>
+#include <futex-internal.h>
 
 
 void
@@ -114,7 +115,7 @@  __unregister_atfork (dso_handle)
       atomic_decrement (&deleted->handler->refcntr);
       unsigned int val;
       while ((val = deleted->handler->refcntr) != 0)
-	lll_futex_wait (&deleted->handler->refcntr, val, LLL_PRIVATE);
+	futex_wait_simple (&deleted->handler->refcntr, val, FUTEX_PRIVATE);
 
       deleted = deleted->next;
     }
diff --git a/sysdeps/nacl/exit-thread.h b/sysdeps/nacl/exit-thread.h
index c809405..a092462 100644
--- a/sysdeps/nacl/exit-thread.h
+++ b/sysdeps/nacl/exit-thread.h
@@ -18,7 +18,7 @@ 
 
 #include <assert.h>
 #include <atomic.h>
-#include <lowlevellock.h>
+#include <futex-internal.h>
 #include <nacl-interfaces.h>
 #include <nptl/pthreadP.h>
 
@@ -64,7 +64,7 @@  __exit_thread (void)
       assert (NACL_EXITING_TID > 0);
 
       atomic_store_relaxed (&pd->tid, NACL_EXITING_TID);
-      lll_futex_wake (&pd->tid, 1, LLL_PRIVATE);
+      futex_wake (&pd->tid, 1, FUTEX_PRIVATE);
     }
 
   /* This clears PD->tid some time after the thread stack can never
diff --git a/sysdeps/nacl/futex-internal.h b/sysdeps/nacl/futex-internal.h
new file mode 100644
index 0000000..09c2a1c
--- /dev/null
+++ b/sysdeps/nacl/futex-internal.h
@@ -0,0 +1,247 @@ 
+/* futex operations for glibc-internal use.  NaCl version.
+   Copyright (C) 2014-2015 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef FUTEX_INTERNAL_H
+#define FUTEX_INTERNAL_H
+
+#include <errno.h>
+#include <lowlevellock-futex.h>
+#include <sys/time.h>
+#include <stdlib.h>
+#include <stdbool.h>
+#include <nptl/pthreadP.h>
+#include <libc-internal.h>
+
+/* See sysdeps/nptl/futex-internal.h for documentation; this file only
+   contains NaCl-specific comments.
+
+   There is no support yet for shared futexes or for exact relative
+   timeouts.  */
+
+/* Defined this way for interoperability with lowlevellock.  */
+#define FUTEX_PRIVATE LLL_PRIVATE
+#define FUTEX_SHARED  LLL_SHARED
+
+/* FUTEX_SHARED is not yet supported.  */
+static __always_inline bool
+futex_supports_shared (void)
+{
+  return false;
+}
+
+/* Relative timeouts are only emulated via absolute timeouts using the
+   system clock.  */
+static __always_inline bool
+futex_supports_exact_relative_timeouts (void)
+{
+  return false;
+}
+
+/* See sysdeps/nptl/futex-internal.h for details.  */
+static __always_inline int
+futex_wait (unsigned int *futex_word, unsigned int expected, int private)
+{
+  int err = lll_futex_timed_wait (futex_word, expected, NULL, private);
+  switch (err)
+    {
+    case 0:
+    case -EAGAIN:
+    case -EINTR:
+      return -err;
+
+    case -ETIMEDOUT: /* Cannot have happened as we provided no timeout.  */
+    case -EFAULT: /* Must have been caused by a glibc or application bug.  */
+    case -EINVAL: /* Either due to wrong alignment or due to the timeout not
+		     being normalized.  Must have been caused by a glibc or
+		     application bug.  */
+    case -ENOSYS: /* Must have been caused by a glibc bug.  */
+    /* No other errors are documented at this time.  */
+    default:
+      abort ();
+    }
+}
+
+/* See sysdeps/nptl/futex-internal.h for details.  */
+static __always_inline void
+futex_wait_simple (unsigned int *futex_word, unsigned int expected,
+		   int private)
+{
+  ignore_value (futex_wait (futex_word, expected, private));
+}
+
+/* See sysdeps/nptl/futex-internal.h for details.  */
+static __always_inline int
+futex_wait_cancelable (unsigned int *futex_word, unsigned int expected,
+		       int private)
+{
+  int oldtype;
+  oldtype = __pthread_enable_asynccancel ();
+  int err = lll_futex_timed_wait (futex_word, expected, NULL, private);
+  __pthread_disable_asynccancel (oldtype);
+  switch (err)
+    {
+    case 0:
+    case -EAGAIN:
+    case -EINTR:
+      return -err;
+
+    case -ETIMEDOUT: /* Cannot have happened as we provided no timeout.  */
+    case -EFAULT: /* Must have been caused by a glibc or application bug.  */
+    case -EINVAL: /* Either due to wrong alignment or due to the timeout not
+		     being normalized.  Must have been caused by a glibc or
+		     application bug.  */
+    case -ENOSYS: /* Must have been caused by a glibc bug.  */
+    /* No other errors are documented at this time.  */
+    default:
+      abort ();
+    }
+}
+
+/* See sysdeps/nptl/futex-internal.h for details.  */
+static __always_inline int
+futex_reltimed_wait (unsigned int* futex_word, unsigned int expected,
+		     const struct timespec* reltime, int private)
+{
+  int err = lll_futex_timed_wait (futex_word, expected, reltime, private);
+  switch (err)
+    {
+    case 0:
+    case -EAGAIN:
+    case -EINTR:
+    case -ETIMEDOUT:
+      return -err;
+
+    case -EFAULT: /* Must have been caused by a glibc or application bug.  */
+    case -EINVAL: /* Either due to wrong alignment or due to the timeout not
+		     being normalized.  Must have been caused by a glibc or
+		     application bug.  */
+    case -ENOSYS: /* Must have been caused by a glibc bug.  */
+    /* No other errors are documented at this time.  */
+    default:
+      abort ();
+    }
+}
+
+/* See sysdeps/nptl/futex-internal.h for details.  */
+static __always_inline int
+futex_reltimed_wait_cancelable (unsigned int* futex_word,
+				unsigned int expected,
+				const struct timespec* reltime, int private)
+{
+  int oldtype;
+  oldtype = __pthread_enable_asynccancel ();
+  int err = lll_futex_timed_wait (futex_word, expected, reltime, private);
+  __pthread_disable_asynccancel (oldtype);
+  switch (err)
+    {
+    case 0:
+    case -EAGAIN:
+    case -EINTR:
+    case -ETIMEDOUT:
+      return -err;
+
+    case -EFAULT: /* Must have been caused by a glibc or application bug.  */
+    case -EINVAL: /* Either due to wrong alignment or due to the timeout not
+		     being normalized.  Must have been caused by a glibc or
+		     application bug.  */
+    case -ENOSYS: /* Must have been caused by a glibc bug.  */
+    /* No other errors are documented at this time.  */
+    default:
+      abort ();
+    }
+}
+
+/* See sysdeps/nptl/futex-internal.h for details.  */
+static __always_inline int
+futex_abstimed_wait (unsigned int* futex_word, unsigned int expected,
+		     const struct timespec* abstime, int private)
+{
+  int err = lll_futex_abstimed_wait (futex_word, expected, abstime, private);
+  switch (err)
+    {
+    case 0:
+    case -EAGAIN:
+    case -EINTR:
+    case -ETIMEDOUT:
+      return -err;
+
+    case -EFAULT: /* Must have been caused by a glibc or application bug.  */
+    case -EINVAL: /* Either due to wrong alignment or due to the timeout not
+		     being normalized.  Must have been caused by a glibc or
+		     application bug.  */
+    case -ENOSYS: /* Must have been caused by a glibc bug.  */
+    /* No other errors are documented at this time.  */
+    default:
+      abort ();
+    }
+}
+
+/* See sysdeps/nptl/futex-internal.h for details.  */
+static __always_inline int
+futex_abstimed_wait_cancelable (unsigned int* futex_word,
+				unsigned int expected,
+			        const struct timespec* abstime, int private)
+{
+  int oldtype;
+  oldtype = __pthread_enable_asynccancel ();
+  int err = lll_futex_abstimed_wait (futex_word, expected, abstime, private);
+  __pthread_disable_asynccancel (oldtype);
+  switch (err)
+    {
+    case 0:
+    case -EAGAIN:
+    case -EINTR:
+    case -ETIMEDOUT:
+      return -err;
+
+    case -EFAULT: /* Must have been caused by a glibc or application bug.  */
+    case -EINVAL: /* Either due to wrong alignment or due to the timeout not
+		     being normalized.  Must have been caused by a glibc or
+		     application bug.  */
+    case -ENOSYS: /* Must have been caused by a glibc bug.  */
+    /* No other errors are documented at this time.  */
+    default:
+      abort ();
+    }
+}
+
+/* See sysdeps/nptl/futex-internal.h for details.  */
+static __always_inline void
+futex_wake (unsigned int* futex_word, int processes_to_wake, int private)
+{
+  int res = lll_futex_wake (futex_word, processes_to_wake, private);
+  /* No error.  Ignore the number of woken processes.  */
+  if (res >= 0)
+    return;
+  switch (res)
+    {
+    case -EFAULT: /* Could have happened due to memory reuse.  */
+    case -EINVAL: /* Could be either due to incorrect alignment (a bug in
+		     glibc or in the application) or due to memory being
+		     reused for a PI futex.  We cannot distinguish between the
+		     two causes, and one of them is correct use, so we do not
+		     act in this case.  */
+      return;
+    case -ENOSYS: /* Must have been caused by a glibc bug.  */
+    /* No other errors are documented at this time.  */
+    default:
+      abort ();
+    }
+}
+
+#endif  /* futex-internal.h */
diff --git a/sysdeps/nacl/lowlevellock-futex.h b/sysdeps/nacl/lowlevellock-futex.h
index b614ac8..c8fd0d8 100644
--- a/sysdeps/nacl/lowlevellock-futex.h
+++ b/sysdeps/nacl/lowlevellock-futex.h
@@ -63,6 +63,11 @@ 
     -_err;                                                              \
   })
 
+/* Wait until a lll_futex_wake call on FUTEXP, or time ABSTIME has passed.  */
+#define lll_futex_abstimed_wait(futexp, val, abstime, private)   \
+  (- __nacl_irt_futex.futex_wait_abs ((volatile int *) (futexp), \
+				      val, (abstime)))
+
 /* Wake up up to NR waiters on FUTEXP.  */
 #define lll_futex_wake(futexp, nr, private)                     \
   ({                                                            \
diff --git a/sysdeps/nptl/aio_misc.h b/sysdeps/nptl/aio_misc.h
index fb69b0f..b648ac4 100644
--- a/sysdeps/nptl/aio_misc.h
+++ b/sysdeps/nptl/aio_misc.h
@@ -22,14 +22,14 @@ 
 
 #include <assert.h>
 #include <nptl/pthreadP.h>
-#include <lowlevellock.h>
+#include <futex-internal.h>
 
 #define DONT_NEED_AIO_MISC_COND	1
 
 #define AIO_MISC_NOTIFY(waitlist) \
   do {									      \
     if (*waitlist->counterp > 0 && --*waitlist->counterp == 0)		      \
-      lll_futex_wake (waitlist->counterp, 1, LLL_PRIVATE);		      \
+      futex_wake ((unsigned int *) waitlist->counterp, 1, FUTEX_PRIVATE);     \
   } while (0)
 
 #define AIO_MISC_WAIT(result, futex, timeout, cancel)			      \
@@ -48,9 +48,9 @@ 
 	int status;							      \
 	do								      \
 	  {								      \
-	    status = lll_futex_timed_wait (futexaddr, oldval, timeout,	      \
-					   LLL_PRIVATE);		      \
-	    if (status != -EWOULDBLOCK)					      \
+	    status = futex_reltimed_wait ((unsigned int *) futexaddr,	      \
+					  oldval, timeout, FUTEX_PRIVATE);    \
+	    if (status != EAGAIN)					      \
 	      break;							      \
 									      \
 	    oldval = *futexaddr;					      \
@@ -60,12 +60,12 @@ 
 	if (cancel)							      \
 	  LIBC_CANCEL_RESET (oldtype);					      \
 									      \
-	if (status == -EINTR)						      \
+	if (status == EINTR)						      \
 	  result = EINTR;						      \
-	else if (status == -ETIMEDOUT)					      \
+	else if (status == ETIMEDOUT)					      \
 	  result = EAGAIN;						      \
 	else								      \
-	  assert (status == 0 || status == -EWOULDBLOCK);		      \
+	  assert (status == 0 || status == EAGAIN);			      \
 									      \
 	pthread_mutex_lock (&__aio_requests_mutex);			      \
       }									      \
diff --git a/sysdeps/nptl/fork.c b/sysdeps/nptl/fork.c
index 74482b7..2b9ae4b 100644
--- a/sysdeps/nptl/fork.c
+++ b/sysdeps/nptl/fork.c
@@ -30,6 +30,7 @@ 
 #include <nptl/pthreadP.h>
 #include <fork.h>
 #include <arch-fork.h>
+#include <futex-internal.h>
 
 
 static void
@@ -219,7 +220,7 @@  __libc_fork (void)
 
 	  if (atomic_decrement_and_test (&allp->handler->refcntr)
 	      && allp->handler->need_signal)
-	    lll_futex_wake (&allp->handler->refcntr, 1, LLL_PRIVATE);
+	    futex_wake (&allp->handler->refcntr, 1, FUTEX_PRIVATE);
 
 	  allp = allp->next;
 	}
diff --git a/sysdeps/nptl/futex-internal.h b/sysdeps/nptl/futex-internal.h
new file mode 100644
index 0000000..292aa39
--- /dev/null
+++ b/sysdeps/nptl/futex-internal.h
@@ -0,0 +1,188 @@ 
+/* futex operations for glibc-internal use.  Stub version.
+   Copyright (C) 2014-2015 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.	 See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef FUTEX_INTERNAL_H
+#define FUTEX_INTERNAL_H
+
+#include <errno.h>
+#include <lowlevellock-futex.h>
+#include <sys/time.h>
+#include <stdlib.h>
+#include <stdbool.h>
+#include <nptl/pthreadP.h>
+
+/* This file defines futex operations used internally in glibc.  A futex
+   consists of the so-called futex word in userspace, which is of type unsigned
+   and represents an application-specific condition, and kernel state
+   associated with this particular futex word (e.g., wait queues).  The futex
+   operations we provide are wrappers for the futex syscalls and add
+   glibc-specific error checking of the syscall return value.  We abort on
+   error codes that are caused by bugs in glibc or in the calling application,
+   or when an error code is not known.  We return error codes that can arise
+   in correct executions to the caller.  Each operation calls out exactly the
+   return values that callers need to handle.
+
+   The private flag must be either FUTEX_PRIVATE or FUTEX_SHARED.
+   FUTEX_PRIVATE is always supported, and the implementation can internally
+   use FUTEX_SHARED when FUTEX_PRIVATE is requested.  FUTEX_SHARED is not
+   necessarily supported (use futex_supports_shared to detect this).
+
+   We expect callers to only use these operations if futexes and the
+   specific futex operations being used are supported (e.g., FUTEX_SHARED).
+
+   Given that waking other threads waiting on a futex involves concurrent
+   accesses to the futex word, you must use atomic operations to access the
+   futex word.
+
+   Both absolute and relative timeouts can be used.  An absolute timeout
+   expires when the given specific point in time on the CLOCK_REALTIME clock
+   passes, or when it already has passed.  A relative timeout expires when
+   the given duration of time on the CLOCK_MONOTONIC clock passes.  Relative
+   timeouts may be imprecise (see futex_supports_exact_relative_timeouts).
+
+   Due to POSIX requirements on when synchronization data structures such
+   as mutexes or semaphores can be destroyed and due to the futex design
+   having separate fast/slow paths for wake-ups, we need to consider that
+   futex_wake calls might effectively target a data structure that has been
+   destroyed and reused for another object, or unmapped; thus, some
+   errors or spurious wake-ups can happen in correct executions that would
+   not be possible in a program using just a single futex whose lifetime
+   does not end before the program terminates.  For background, see:
+   https://sourceware.org/ml/libc-alpha/2014-04/msg00075.html
+   https://lkml.org/lkml/2014/11/27/472  */
+
+/* Defined this way for interoperability with lowlevellock.  */
+#define FUTEX_PRIVATE LLL_PRIVATE
+#define FUTEX_SHARED  LLL_SHARED
+
+/* Returns true iff FUTEX_SHARED is supported for all futex operations.  */
+static __always_inline bool
+futex_supports_shared (void);
+/* Returns true if relative timeouts are robust to concurrent changes to the
+   system clock.  If this returns false, relative timeouts can still be used
+   but might be effectively longer or shorter than requested.  */
+static __always_inline bool
+futex_supports_exact_relative_timeouts (void);
+
+/* Atomically wrt other futex operations on the same futex, this blocks iff
+   the value *FUTEX_WORD matches the expected value.  This is
+   semantically equivalent to:
+     l = <get lock associated with futex> (FUTEX_WORD);
+     wait_flag = <get wait_flag associated with futex> (FUTEX_WORD);
+     lock (l);
+     val = atomic_load_relaxed (FUTEX_WORD);
+     if (val != expected) { unlock (l); return EAGAIN; }
+     atomic_store_relaxed (wait_flag, true);
+     unlock (l);
+     // Now block; can time out in futex_reltimed_wait (see below)
+     while (atomic_load_relaxed(wait_flag) && !<spurious wake-up>);
+
+   Note that no guarantee of a happens-before relation between a woken
+   futex_wait and a futex_wake is documented; however, this does not matter
+   in practice because we have to consider spurious wake-ups (see below),
+   and thus would not be able to reliably reason about which futex_wake woke
+   us.
+
+   Returns 0 if woken by a futex operation or spuriously.  (Note that due to
+   the POSIX requirements mentioned above, we need to conservatively assume
+   that unrelated futex_wake operations could wake this futex; it is easiest
+   to just be prepared for spurious wake-ups.)
+   Returns EAGAIN if the futex word did not match the expected value.
+   Returns EINTR if waiting was interrupted by a signal.
+
+   Note that some previous code in glibc assumed the underlying futex
+   operation (e.g., syscall) to start with or include the equivalent of a
+   seq_cst fence; this allows one to avoid an explicit seq_cst fence before
+   a futex_wait call when synchronizing similar to Dekker synchronization.
+   However, we make no such guarantee here.  */
+static __always_inline int
+futex_wait (unsigned int *futex_word, unsigned int expected, int private);
+
+/* Like futex_wait but does not provide any indication why we stopped waiting.
+   Thus, when this function returns, you have to always check FUTEX_WORD to
+   determine whether you need to continue waiting, and you cannot detect
+   whether the waiting was interrupted by a signal.  Example use:
+     while (atomic_load_relaxed (&futex_word) == 23)
+       futex_wait_simple (&futex_word, 23, FUTEX_PRIVATE);
+   This is common enough to make providing this wrapper worthwhile.  */
+static __always_inline void
+futex_wait_simple (unsigned int *futex_word, unsigned int expected,
+		   int private);
+
+/* Like futex_wait but cancelable.  */
+static __always_inline int
+futex_wait_cancelable (unsigned int *futex_word, unsigned int expected,
+		       int private);
+
+/* Like futex_wait, but will eventually time out (i.e., stop being
+   blocked) after the duration of time provided (i.e., RELTIME) has
+   passed.  The caller must provide a normalized RELTIME.  RELTIME can also
+   equal NULL, in which case this function behaves equivalently to futex_wait.
+
+   Returns 0 if woken by a futex operation or spuriously.  (Note that due to
+   the POSIX requirements mentioned above, we need to conservatively assume
+   that unrelated futex_wake operations could wake this futex; it is easiest
+   to just be prepared for spurious wake-ups.)
+   Returns EAGAIN if the futex word did not match the expected value.
+   Returns EINTR if waiting was interrupted by a signal.
+   Returns ETIMEDOUT if the timeout expired.
+   */
+static __always_inline int
+futex_reltimed_wait (unsigned int* futex_word, unsigned int expected,
+		     const struct timespec* reltime, int private);
+
+/* Like futex_reltimed_wait, but cancelable.  */
+static __always_inline int
+futex_reltimed_wait_cancelable (unsigned int* futex_word,
+				unsigned int expected,
+			        const struct timespec* reltime, int private);
+
+/* Like futex_reltimed_wait, but the provided timeout (ABSTIME) is an
+   absolute point in time; a call will time out after this point in time.  */
+static __always_inline int
+futex_abstimed_wait (unsigned int* futex_word, unsigned int expected,
+		     const struct timespec* abstime, int private);
+
+/* Like futex_abstimed_wait, but cancelable.  */
+static __always_inline int
+futex_abstimed_wait_cancelable (unsigned int* futex_word,
+				unsigned int expected,
+			        const struct timespec* abstime, int private);
+
+/* Atomically wrt other futex operations on the same futex, this unblocks the
+   specified number of processes, or all processes blocked on this futex if
+   there are fewer than the specified number.  Semantically, this is
+   equivalent to:
+     l = <get lock associated with futex> (FUTEX_WORD);
+     lock (l);
+     for (res = 0; PROCESSES_TO_WAKE > 0; PROCESSES_TO_WAKE--, res++) {
+       if (<no process blocked on futex>) break;
+       wf = <get wait_flag of a process blocked on futex> (FUTEX_WORD);
+       // No happens-before guarantee with woken futex_wait (see above)
+       atomic_store_relaxed (wf, 0);
+     }
+     return res;
+
+   Note that we need to support futex_wake calls to past futexes whose memory
+   has potentially been reused due to POSIX' requirements on synchronization
+   object destruction (see above); therefore, we must not report or abort
+   on most errors.  */
+static __always_inline void
+futex_wake (unsigned int* futex_word, int processes_to_wake, int private);
+
+#endif  /* futex-internal.h */
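
The return-value contract documented in the stub header above can be tried outside glibc with a small stand-in. The following is only a sketch under stated assumptions: the sketch_* names are hypothetical and not part of this patch, it calls the raw Linux futex syscall instead of the lll_* macros, and it always uses private futexes. It mirrors the documented behaviour: errors that can occur in correct executions are returned as positive errno values, and anything else aborts.

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE
#endif
#include <errno.h>
#include <linux/futex.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical stand-in for futex_reltimed_wait (private futexes only).
   Unlike the lll_* macros, syscall () reports failure as -1 plus errno,
   so the switch inspects errno instead of a negative return value.  */
static int
sketch_reltimed_wait (unsigned int *futex_word, unsigned int expected,
		      const struct timespec *reltime)
{
  if (syscall (SYS_futex, futex_word, FUTEX_WAIT_PRIVATE, expected,
	       reltime, NULL, 0) == 0)
    return 0;	/* Woken by a futex operation or spuriously.  */
  switch (errno)
    {
    case EAGAIN:	/* *FUTEX_WORD did not match EXPECTED.  */
    case EINTR:		/* Interrupted by a signal.  */
    case ETIMEDOUT:	/* RELTIME expired.  */
      return errno;
    default:		/* Treated as a bug in this sketch or its caller.  */
      abort ();
    }
}

/* Hypothetical stand-in for futex_wake.  EFAULT and EINVAL are ignored
   because they can arise from memory reuse in correct executions.  */
static void
sketch_wake (unsigned int *futex_word, int processes_to_wake)
{
  if (syscall (SYS_futex, futex_word, FUTEX_WAKE_PRIVATE,
	       processes_to_wake, NULL, NULL, 0) >= 0)
    return;
  if (errno != EFAULT && errno != EINVAL)
    abort ();
}

/* Block while *FUTEX_WORD equals VAL.  Every return from the wait is
   followed by a re-check of the futex word, so spurious wake-ups, EAGAIN
   and EINTR all take the same path.  Simplification: RELTIME is re-armed
   on every iteration instead of being reduced by the time already spent.  */
static int
sketch_wait_while_equal (unsigned int *futex_word, unsigned int val,
			 const struct timespec *reltime)
{
  while (__atomic_load_n (futex_word, __ATOMIC_RELAXED) == val)
    if (sketch_reltimed_wait (futex_word, val, reltime) == ETIMEDOUT)
      return ETIMEDOUT;
  return 0;
}
```

A caller that does not care why the wait ended would use sketch_wait_while_equal; a caller that needs to distinguish a timeout from a signal would call sketch_reltimed_wait directly and handle each documented return value.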
diff --git a/sysdeps/nptl/gai_misc.h b/sysdeps/nptl/gai_misc.h
index bb83dca..f2dc327 100644
--- a/sysdeps/nptl/gai_misc.h
+++ b/sysdeps/nptl/gai_misc.h
@@ -23,14 +23,14 @@ 
 #include <assert.h>
 #include <signal.h>
 #include <nptl/pthreadP.h>
-#include <lowlevellock.h>
+#include <futex-internal.h>
 
 #define DONT_NEED_GAI_MISC_COND	1
 
 #define GAI_MISC_NOTIFY(waitlist) \
   do {									      \
     if (*waitlist->counterp > 0 && --*waitlist->counterp == 0)		      \
-      lll_futex_wake (waitlist->counterp, 1, LLL_PRIVATE);		      \
+      futex_wake ((unsigned int *) waitlist->counterp, 1, FUTEX_PRIVATE);     \
   } while (0)
 
 #define GAI_MISC_WAIT(result, futex, timeout, cancel) \
@@ -49,9 +49,9 @@ 
 	int status;							      \
 	do								      \
 	  {								      \
-	    status = lll_futex_timed_wait (futexaddr, oldval, timeout,	      \
-					   LLL_PRIVATE);		      \
-	    if (status != -EWOULDBLOCK)					      \
+	    status = futex_reltimed_wait ((unsigned int *)futexaddr,	      \
+					  oldval, timeout, FUTEX_PRIVATE);    \
+	    if (status != EAGAIN)					      \
 	      break;							      \
 									      \
 	    oldval = *futexaddr;					      \
@@ -61,12 +61,12 @@ 
 	if (cancel)							      \
 	  LIBC_CANCEL_RESET (oldtype);					      \
 									      \
-	if (status == -EINTR)						      \
+	if (status == EINTR)						      \
 	  result = EINTR;						      \
-	else if (status == -ETIMEDOUT)					      \
+	else if (status == ETIMEDOUT)					      \
 	  result = EAGAIN;						      \
 	else								      \
-	  assert (status == 0 || status == -EWOULDBLOCK);		      \
+	  assert (status == 0 || status == EAGAIN);			      \
 									      \
 	pthread_mutex_lock (&__gai_requests_mutex);			      \
       }									      \
diff --git a/sysdeps/nptl/lowlevellock-futex.h b/sysdeps/nptl/lowlevellock-futex.h
index a095ad9..883f144 100644
--- a/sysdeps/nptl/lowlevellock-futex.h
+++ b/sysdeps/nptl/lowlevellock-futex.h
@@ -43,6 +43,11 @@ 
 #define lll_futex_timed_wait(futexp, val, timeout, private)             \
   -ENOSYS
 
+/* Wait until a lll_futex_wake call on FUTEXP, or ABSTIME, a point in time
+   counted by CLOCK_REALTIME, has passed.  ABSTIME must be normalized.  */
+#define lll_futex_abstimed_wait(futexp, val, abstime, private) \
+  -ENOSYS
+
 /* This macro should be defined only if FUTEX_CLOCK_REALTIME is also defined.
    If CLOCKBIT is zero, this is identical to lll_futex_timed_wait.
    If CLOCKBIT has FUTEX_CLOCK_REALTIME set, then it's the same but
diff --git a/sysdeps/unix/sysv/linux/futex-internal.h b/sysdeps/unix/sysv/linux/futex-internal.h
new file mode 100644
index 0000000..9e36177
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/futex-internal.h
@@ -0,0 +1,251 @@ 
+/* futex operations for glibc-internal use.  Linux version.
+   Copyright (C) 2014-2015 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.	 See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef FUTEX_INTERNAL_H
+#define FUTEX_INTERNAL_H
+
+#include <errno.h>
+#include <lowlevellock-futex.h>
+#include <sys/time.h>
+#include <stdlib.h>
+#include <stdbool.h>
+#include <nptl/pthreadP.h>
+#include <libc-internal.h>
+
+/* See sysdeps/nptl/futex-internal.h for documentation; this file only
+   contains Linux-specific comments.
+
+   The Linux kernel provides absolute timeouts based on the
+   CLOCK_REALTIME clock and relative timeouts measured against the
+   CLOCK_MONOTONIC clock.
+
+   We expect a Linux kernel version of 2.6.22 or more recent (since this
+   version, EINTR is not returned on spurious wake-ups anymore).  */
+
+/* Defined this way for interoperability with lowlevellock.  */
+#define FUTEX_PRIVATE LLL_PRIVATE
+#define FUTEX_SHARED  LLL_SHARED
+
+/* FUTEX_SHARED is always supported by the Linux kernel.  */
+static __always_inline bool
+futex_supports_shared (void)
+{
+  return true;
+}
+
+/* The Linux kernel supports relative timeouts measured against the
+   CLOCK_MONOTONIC clock.  */
+static __always_inline bool
+futex_supports_exact_relative_timeouts (void)
+{
+  return true;
+}
+
+/* See sysdeps/nptl/futex-internal.h for details.  */
+static __always_inline int
+futex_wait (unsigned int *futex_word, unsigned int expected, int private)
+{
+  int err = lll_futex_timed_wait (futex_word, expected, NULL, private);
+  switch (err)
+    {
+    case 0:
+    case -EAGAIN:
+    case -EINTR:
+      return -err;
+
+    case -ETIMEDOUT: /* Cannot have happened as we provided no timeout.  */
+    case -EFAULT: /* Must have been caused by a glibc or application bug.  */
+    case -EINVAL: /* Either due to wrong alignment or due to the timeout not
+		     being normalized.  Must have been caused by a glibc or
+		     application bug.  */
+    case -ENOSYS: /* Must have been caused by a glibc bug.  */
+    /* No other errors are documented at this time.  */
+    default:
+      abort ();
+    }
+}
+
+/* See sysdeps/nptl/futex-internal.h for details.  */
+static __always_inline void
+futex_wait_simple (unsigned int *futex_word, unsigned int expected,
+		   int private)
+{
+  ignore_value (futex_wait (futex_word, expected, private));
+}
+
+/* See sysdeps/nptl/futex-internal.h for details.  */
+static __always_inline int
+futex_wait_cancelable (unsigned int *futex_word, unsigned int expected,
+		       int private)
+{
+  int oldtype;
+  oldtype = __pthread_enable_asynccancel ();
+  int err = lll_futex_timed_wait (futex_word, expected, NULL, private);
+  __pthread_disable_asynccancel (oldtype);
+  switch (err)
+    {
+    case 0:
+    case -EAGAIN:
+    case -EINTR:
+      return -err;
+
+    case -ETIMEDOUT: /* Cannot have happened as we provided no timeout.  */
+    case -EFAULT: /* Must have been caused by a glibc or application bug.  */
+    case -EINVAL: /* Either due to wrong alignment or due to the timeout not
+		     being normalized.  Must have been caused by a glibc or
+		     application bug.  */
+    case -ENOSYS: /* Must have been caused by a glibc bug.  */
+    /* No other errors are documented at this time.  */
+    default:
+      abort ();
+    }
+}
+
+/* See sysdeps/nptl/futex-internal.h for details.  */
+static __always_inline int
+futex_reltimed_wait (unsigned int* futex_word, unsigned int expected,
+		     const struct timespec* reltime, int private)
+{
+  int err = lll_futex_timed_wait (futex_word, expected, reltime, private);
+  switch (err)
+    {
+    case 0:
+    case -EAGAIN:
+    case -EINTR:
+    case -ETIMEDOUT:
+      return -err;
+
+    case -EFAULT: /* Must have been caused by a glibc or application bug.  */
+    case -EINVAL: /* Either due to wrong alignment or due to the timeout not
+		     being normalized.  Must have been caused by a glibc or
+		     application bug.  */
+    case -ENOSYS: /* Must have been caused by a glibc bug.  */
+    /* No other errors are documented at this time.  */
+    default:
+      abort ();
+    }
+}
+
+/* See sysdeps/nptl/futex-internal.h for details.  */
+static __always_inline int
+futex_reltimed_wait_cancelable (unsigned int* futex_word,
+				unsigned int expected,
+			        const struct timespec* reltime, int private)
+{
+  int oldtype;
+  oldtype = __pthread_enable_asynccancel ();
+  int err = lll_futex_timed_wait (futex_word, expected, reltime, private);
+  __pthread_disable_asynccancel (oldtype);
+  switch (err)
+    {
+    case 0:
+    case -EAGAIN:
+    case -EINTR:
+    case -ETIMEDOUT:
+      return -err;
+
+    case -EFAULT: /* Must have been caused by a glibc or application bug.  */
+    case -EINVAL: /* Either due to wrong alignment or due to the timeout not
+		     being normalized.  Must have been caused by a glibc or
+		     application bug.  */
+    case -ENOSYS: /* Must have been caused by a glibc bug.  */
+    /* No other errors are documented at this time.  */
+    default:
+      abort ();
+    }
+}
+
+/* See sysdeps/nptl/futex-internal.h for details.  */
+static __always_inline int
+futex_abstimed_wait (unsigned int* futex_word, unsigned int expected,
+		     const struct timespec* abstime, int private)
+{
+  int err = lll_futex_abstimed_wait (futex_word, expected, abstime, private);
+  switch (err)
+    {
+    case 0:
+    case -EAGAIN:
+    case -EINTR:
+    case -ETIMEDOUT:
+      return -err;
+
+    case -EFAULT: /* Must have been caused by a glibc or application bug.  */
+    case -EINVAL: /* Either due to wrong alignment or due to the timeout not
+		     being normalized.  Must have been caused by a glibc or
+		     application bug.  */
+    case -ENOSYS: /* Must have been caused by a glibc bug.  */
+    /* No other errors are documented at this time.  */
+    default:
+      abort ();
+    }
+}
+
+/* See sysdeps/nptl/futex-internal.h for details.  */
+static __always_inline int
+futex_abstimed_wait_cancelable (unsigned int* futex_word,
+				unsigned int expected,
+			        const struct timespec* abstime, int private)
+{
+  int oldtype;
+  oldtype = __pthread_enable_asynccancel ();
+  int err = lll_futex_abstimed_wait (futex_word, expected, abstime, private);
+  __pthread_disable_asynccancel (oldtype);
+  switch (err)
+    {
+    case 0:
+    case -EAGAIN:
+    case -EINTR:
+    case -ETIMEDOUT:
+      return -err;
+
+    case -EFAULT: /* Must have been caused by a glibc or application bug.  */
+    case -EINVAL: /* Either due to wrong alignment or due to the timeout not
+		     being normalized.  Must have been caused by a glibc or
+		     application bug.  */
+    case -ENOSYS: /* Must have been caused by a glibc bug.  */
+    /* No other errors are documented at this time.  */
+    default:
+      abort ();
+    }
+}
+
+/* See sysdeps/nptl/futex-internal.h for details.  */
+static __always_inline void
+futex_wake (unsigned int* futex_word, int processes_to_wake, int private)
+{
+  int res = lll_futex_wake (futex_word, processes_to_wake, private);
+  /* No error.  Ignore the number of woken processes.  */
+  if (res >= 0)
+    return;
+  switch (res)
+    {
+    case -EFAULT: /* Could have happened due to memory reuse.  */
+    case -EINVAL: /* Could be either due to incorrect alignment (a bug in
+		     glibc or in the application) or due to memory being
+		     reused for a PI futex.  We cannot distinguish between the
+		     two causes, and one of them is correct use, so we do not
+		     act in this case.  */
+      return;
+    case -ENOSYS: /* Must have been caused by a glibc bug.  */
+    /* No other errors are documented at this time.  */
+    default:
+      abort ();
+    }
+}
+
+#endif  /* futex-internal.h */
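
For absolute timeouts, futex_abstimed_wait above relies on the lll_futex_abstimed_wait macro that this patch adds to the Linux lowlevellock-futex.h: a negative tv_sec is turned into ETIMEDOUT without entering the kernel, and everything else becomes a FUTEX_WAIT_BITSET call measured against CLOCK_REALTIME. A rough user-space equivalent is sketched below. The sketch_* names are hypothetical, private futexes are assumed, and the raw syscall is used in place of the glibc-internal macros.

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE
#endif
#include <errno.h>
#include <linux/futex.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical stand-in for futex_abstimed_wait (private futexes only).
   ABSTIME is a normalized point in time on CLOCK_REALTIME, or NULL.  */
static int
sketch_abstimed_wait (unsigned int *futex_word, unsigned int expected,
		      const struct timespec *abstime)
{
  /* The kernel rejects negative timeouts, so report them as expired
     here, as the macro in the patch does.  */
  if (abstime != NULL && abstime->tv_sec < 0)
    return ETIMEDOUT;
  if (syscall (SYS_futex, futex_word,
	       FUTEX_WAIT_BITSET | FUTEX_CLOCK_REALTIME | FUTEX_PRIVATE_FLAG,
	       expected, abstime, NULL, FUTEX_BITSET_MATCH_ANY) == 0)
    return 0;	/* Woken by a futex operation or spuriously.  */
  switch (errno)
    {
    case EAGAIN:
    case EINTR:
    case ETIMEDOUT:
      return errno;
    default:	/* Treated as a bug in this sketch or its caller.  */
      abort ();
    }
}

/* Build a normalized absolute deadline MS milliseconds from now.  */
static struct timespec
sketch_deadline_after_ms (long ms)
{
  struct timespec ts;
  if (clock_gettime (CLOCK_REALTIME, &ts) != 0)
    abort ();
  ts.tv_sec += ms / 1000;
  ts.tv_nsec += (ms % 1000) * 1000000L;
  if (ts.tv_nsec >= 1000000000L)
    {
      ts.tv_nsec -= 1000000000L;
      ts.tv_sec++;
    }
  return ts;
}
```

Because the deadline is absolute, a caller that loops on spurious wake-ups can pass the same ABSTIME on every iteration, which is the reason the patch keeps absolute and relative waits as separate functions.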
diff --git a/sysdeps/unix/sysv/linux/lowlevellock-futex.h b/sysdeps/unix/sysv/linux/lowlevellock-futex.h
index 59f6627..202b7ae 100644
--- a/sysdeps/unix/sysv/linux/lowlevellock-futex.h
+++ b/sysdeps/unix/sysv/linux/lowlevellock-futex.h
@@ -22,6 +22,7 @@ 
 #ifndef __ASSEMBLER__
 #include <sysdep.h>
 #include <tls.h>
+#include <sys/time.h>
 #include <kernel-features.h>
 #endif
 
@@ -92,6 +93,19 @@ 
 		     __lll_private_flag (FUTEX_WAIT, private),  \
 		     val, timeout)
 
+#define lll_futex_abstimed_wait(futexp, val, abstime, private)		    \
+  ({									    \
+    /* Work around the fact that the kernel rejects negative timeout values \
+       despite them being valid.  */					    \
+    int ret;								    \
+    if (__glibc_unlikely (((abstime) != NULL) && ((abstime)->tv_sec < 0)))  \
+      ret = -ETIMEDOUT;							    \
+    else								    \
+      ret = lll_futex_timed_wait_bitset (futexp, val, abstime,		    \
+					 FUTEX_CLOCK_REALTIME, private);    \
+    ret;								    \
+  })
+
 #define lll_futex_timed_wait_bitset(futexp, val, timeout, clockbit, private) \
   lll_futex_syscall (6, futexp,                                         \
 		     __lll_private_flag (FUTEX_WAIT_BITSET | (clockbit), \