Message ID | 20171017232440.5134-1-mathieu.desnoyers@efficios.com |
---|---|
State | New |
Headers | show |
Series | [RFC,v2] pthread_setspecific: Provide signal-safety across keys | expand |
On 10/17/2017 04:24 PM, Mathieu Desnoyers wrote: > The intent here is to provide signal-safety against callers to > pthread_{get/set}specific that work on different pthread keys, > without hurting performance of the normal use-cases that do not > care about signal-safety. ... I just wanted to note 2 things before we go any further: (1) We should be able to accept your patch even though you don't have a copyright assignment (that I can see) with the FSF. The reason for this is that the patch is small and is a limited number of lines. (2) After this patch, because submitted lines of changes are cumulative, we are going to need a copyright assignment to accept further changes. You can view the project Contribution Checklist here: https://sourceware.org/glibc/wiki/Contribution%20checklist We'll help you work through the more esoteric aspects of the list, so don't worry about those. However, I wanted to make the copyright aspect as clear as possible. It would be awesome to continue to receive contributions from someone as experienced as yourself, particularly for parallelism and concurrency issues, or issues that impact lttng. So if you need any help with assignments, please feel free to email me and I can help answer any questions as a steward for the project.
* Carlos O'Donell: > On 10/17/2017 04:24 PM, Mathieu Desnoyers wrote: >> The intent here is to provide signal-safety against callers to >> pthread_{get/set}specific that work on different pthread keys, >> without hurting performance of the normal use-cases that do not >> care about signal-safety. > > ... > > I just wanted to note 2 things before we go any further: > > (1) We should be able to accept your patch even though you don't > have a copyright assignment (that I can see) with the FSF. > The reason for this is that the patch is small and is a limited > number of lines. A complete patch will likely be over the threshold, though. And the assignment is by no means a guarantee we will accept the patch, of course.
* Mathieu Desnoyers: > @@ -327,7 +328,7 @@ __nptl_deallocate_tsd (void) > { > /* The first block is allocated as part of the thread > descriptor. */ > - free (level2); > + (void) munmap (level2, PTHREAD_KEY_2NDLEVEL_SIZE * sizeof (*level2)); > THREAD_SETMEM_NC (self, specific, cnt, NULL); I think __nptl_deallocate_tsd needs to disable signals as well, even when the level2 pointer is NULL, and keep them disabled until the thread exits. This is not a localized change; I don't know if it is safe.
On 18/10/17 06:33, Florian Weimer wrote: > * Mathieu Desnoyers: > >> @@ -327,7 +328,7 @@ __nptl_deallocate_tsd (void) >> { >> /* The first block is allocated as part of the thread >> descriptor. */ >> - free (level2); >> + (void) munmap (level2, PTHREAD_KEY_2NDLEVEL_SIZE * sizeof (*level2)); >> THREAD_SETMEM_NC (self, specific, cnt, NULL); > > I think __nptl_deallocate_tsd needs to disable signals as well, even > when the level2 pointer is NULL, and keep them disabled until the > thread exits. This is not a localized change; I don't know if it is > safe. > dtors must not run with signals blocked (posix does not say that dtors may see changed signal mask) i think it's the user's fault if there is async pthread_getspecific call while a thread is exiting. the user can block the signals before thread exit if that's the intended behaviour.
----- On Oct 18, 2017, at 1:33 AM, Florian Weimer fw@deneb.enyo.de wrote: > * Mathieu Desnoyers: > >> @@ -327,7 +328,7 @@ __nptl_deallocate_tsd (void) >> { >> /* The first block is allocated as part of the thread >> descriptor. */ >> - free (level2); >> + (void) munmap (level2, PTHREAD_KEY_2NDLEVEL_SIZE * sizeof (*level2)); >> THREAD_SETMEM_NC (self, specific, cnt, NULL); > > I think __nptl_deallocate_tsd needs to disable signals as well, even > when the level2 pointer is NULL, and keep them disabled until the > thread exits. This is not a localized change; I don't know if it is > safe. I'd hate to slow down thread exit for everyone though. Disabling and re-enabling signals is not exactly a cheap operation. How about we add a new "thread_exiting" flag to struct pthread, and do something like the following ? The main change there would affect pthread_{get,set}specific called within pthread_key destr_function handlers: pthread_getspecific would return NULL, and pthread_setspecific would return EPERM (which would be a non-POSIX return value). This would also ensure that if pthread_{get,set}specific are called within signal handlers nested over pthread teardown, those return NULL and EPERM. Thoughts ? Thanks, Mathieu diff --git a/nptl/descr.h b/nptl/descr.h index c5ad0c8dba..6a4ef4559a 100644 --- a/nptl/descr.h +++ b/nptl/descr.h @@ -335,6 +335,9 @@ struct pthread /* True if thread must stop at startup time. */ bool stopped_start; + /* True if thread is exiting. */ + bool thread_exiting; + /* The parent's cancel handling at the time of the pthread_create call. This might be needed to undo the effects of a cancellation. */ int parent_cancelhandling; diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c index 78abc07f91..fba5e2dbde 100644 --- a/nptl/pthread_create.c +++ b/nptl/pthread_create.c @@ -464,6 +464,8 @@ START_THREAD_DEFN THREAD_SETMEM (pd, result, pd->start_routine (pd->arg)); } + THREAD_SETMEM (pd, thread_exiting, true); + /* Call destructors for the thread_local TLS variables. */ #ifndef SHARED if (&__call_tls_dtors != NULL) diff --git a/nptl/pthread_getspecific.c b/nptl/pthread_getspecific.c index 114d6da29b..351ddd486f 100644 --- a/nptl/pthread_getspecific.c +++ b/nptl/pthread_getspecific.c @@ -25,6 +25,10 @@ __pthread_getspecific (pthread_key_t key) { struct pthread_key_data *data; + /* Operation is not permitted while thread exits. */ + if (THREAD_GETMEM_NC (THREAD_SELF, thread_exiting)) + return NULL; + /* Special case access to the first 2nd-level block. This is the usual case. */ if (__glibc_likely (key < PTHREAD_KEY_2NDLEVEL_SIZE)) diff --git a/nptl/pthread_setspecific.c b/nptl/pthread_setspecific.c index 2beb7f97fc..1c0a14dce8 100644 --- a/nptl/pthread_setspecific.c +++ b/nptl/pthread_setspecific.c @@ -33,6 +33,10 @@ __pthread_setspecific (pthread_key_t key, const void *value) self = THREAD_SELF; + /* Operation is not permitted while thread exits. */ + if (THREAD_GETMEM_NC (self, thread_exiting)) + return EPERM; + /* Special case access to the first 2nd-level block. This is the usual case. */ if (__glibc_likely (key < PTHREAD_KEY_2NDLEVEL_SIZE))
----- On Oct 18, 2017, at 7:37 AM, Szabolcs Nagy szabolcs.nagy@arm.com wrote: > On 18/10/17 06:33, Florian Weimer wrote: >> * Mathieu Desnoyers: >> >>> @@ -327,7 +328,7 @@ __nptl_deallocate_tsd (void) >>> { >>> /* The first block is allocated as part of the thread >>> descriptor. */ >>> - free (level2); >>> + (void) munmap (level2, PTHREAD_KEY_2NDLEVEL_SIZE * sizeof (*level2)); >>> THREAD_SETMEM_NC (self, specific, cnt, NULL); >> >> I think __nptl_deallocate_tsd needs to disable signals as well, even >> when the level2 pointer is NULL, and keep them disabled until the >> thread exits. This is not a localized change; I don't know if it is >> safe. >> > > dtors must not run with signals blocked (posix does not > say that dtors may see changed signal mask) Would it be more acceptable to have a "thread_exiting" flag, and have pthread_{get,set}specific return NULL and EPERM respectively when invoked from dtors and from signal handlers nested over thread teardown ? > > i think it's the user's fault if there is async > pthread_getspecific call while a thread is exiting. > the user can block the signals before thread exit > if that's the intended behaviour. In the lttng-ust tracer use-case, I cannot change the application. A simple library could be instrumented, and use lttng-ust for tracing, without any change to the application. So unfortunately the solution you are proposing does not work in my case. Thanks, Mathieu
On 18/10/17 17:31, Mathieu Desnoyers wrote: > ----- On Oct 18, 2017, at 7:37 AM, Szabolcs Nagy szabolcs.nagy@arm.com wrote: >> dtors must not run with signals blocked (posix does not >> say that dtors may see changed signal mask) > > Would it be more acceptable to have a "thread_exiting" flag, > and have pthread_{get,set}specific return NULL and EPERM > respectively when invoked from dtors and from signal handlers > nested over thread teardown ? > >> >> i think it's the user's fault if there is async >> pthread_getspecific call while a thread is exiting. >> the user can block the signals before thread exit >> if that's the intended behaviour. > > In the lttng-ust tracer use-case, I cannot change the application. > A simple library could be instrumented, and use lttng-ust for > tracing, without any change to the application. So unfortunately > the solution you are proposing does not work in my case. > instead of adding hacks to the libc and abusing the tsd apis, why not use tls? in c11 that is as-safe (unfortunately glibc does not implement this correctly, but e.g. in musl libc all tls accesses are as-safe so it can be done, with glibc you would have to use initial-exec tls, so it won't work with dlopen, but that's a glibc bug) if you need dtors then the tsd apis can be used for that and you can block signals in your dtors before cleaning tls state. (this would be portable to any conforming libc implementation)
----- On Oct 19, 2017, at 4:55 AM, Szabolcs Nagy szabolcs.nagy@arm.com wrote: > On 18/10/17 17:31, Mathieu Desnoyers wrote: >> ----- On Oct 18, 2017, at 7:37 AM, Szabolcs Nagy szabolcs.nagy@arm.com wrote: >>> dtors must not run with signals blocked (posix does not >>> say that dtors may see changed signal mask) >> >> Would it be more acceptable to have a "thread_exiting" flag, >> and have pthread_{get,set}specific return NULL and EPERM >> respectively when invoked from dtors and from signal handlers >> nested over thread teardown ? >> >>> >>> i think it's the user's fault if there is async >>> pthread_getspecific call while a thread is exiting. >>> the user can block the signals before thread exit >>> if that's the intended behaviour. >> >> In the lttng-ust tracer use-case, I cannot change the application. >> A simple library could be instrumented, and use lttng-ust for >> tracing, without any change to the application. So unfortunately >> the solution you are proposing does not work in my case. >> > > instead of adding hacks to the libc and abusing the tsd apis, > why not use tls? in c11 that is as-safe (unfortunately glibc > does not implement this correctly, but e.g. in musl libc all > tls accesses are as-safe so it can be done, with glibc you > would have to use initial-exec tls, so it won't work with > dlopen, but that's a glibc bug) I indeed try to use TLS whenever possible. Unfortunately, I also need to perform lazy thread registration (e.g. add thread to a linked list for urcu-bp, or register to the kernel new rseq system call), which needs to be paired with an unregistration before the thread exits. I found that pthread_key is a good match for that use-case. The main issue is its non-signal-safety. > > if you need dtors then the tsd apis can be used for that and > you can block signals in your dtors before cleaning tls state. > (this would be portable to any conforming libc implementation) I'm not sure I understand your statement here. I need to set a dtor from a signal handler, but AFAIU in the previous statement you said tsd apis were not async signal safe. Perhaps I mis-associate "tsd apis" to pthread_{get,set}specific and you have something else in mind ? By the way, rather than adding a new "EPERM" return code to pthread_setspecific to handle the case where it is invoked from a thread dtors (which would be non-posix and unexpected), perhaps we should introduce a pthread_setspecific_np() ? Thanks, Mathieu
On 19/10/17 11:37, Mathieu Desnoyers wrote: > ----- On Oct 19, 2017, at 4:55 AM, Szabolcs Nagy szabolcs.nagy@arm.com wrote: >> if you need dtors then the tsd apis can be used for that and >> you can block signals in your dtors before cleaning tls state. >> (this would be portable to any conforming libc implementation) > > I'm not sure I understand your statement here. I need to set a > dtor from a signal handler, but AFAIU in the previous statement > you said tsd apis were not async signal safe. Perhaps I > mis-associate "tsd apis" to pthread_{get,set}specific and you > have something else in mind ? > well the idea was that you use tls access in signal handler and setup a tsd dtor in non-signal handler. (at tls access you can check some flag to see if the tsd for the thread is initialized or destructed) but this assumes you can make calls in the thread outside of a signal handler. > By the way, rather than adding a new "EPERM" return code to > pthread_setspecific to handle the case where it is invoked > from a thread dtors (which would be non-posix and unexpected), > perhaps we should introduce a pthread_setspecific_np() ? pthread_setspecific has well defined behaviour in dtors so EPERM is not ok. but i dont think _np api for this is appropriate either.
----- On Oct 19, 2017, at 8:03 AM, Szabolcs Nagy szabolcs.nagy@arm.com wrote: > On 19/10/17 11:37, Mathieu Desnoyers wrote: >> ----- On Oct 19, 2017, at 4:55 AM, Szabolcs Nagy szabolcs.nagy@arm.com wrote: >>> if you need dtors then the tsd apis can be used for that and >>> you can block signals in your dtors before cleaning tls state. >>> (this would be portable to any conforming libc implementation) >> >> I'm not sure I understand your statement here. I need to set a >> dtor from a signal handler, but AFAIU in the previous statement >> you said tsd apis were not async signal safe. Perhaps I >> mis-associate "tsd apis" to pthread_{get,set}specific and you >> have something else in mind ? >> > > well the idea was that you use tls access in signal handler > and setup a tsd dtor in non-signal handler. > (at tls access you can check some flag to see if the tsd > for the thread is initialized or destructed) > but this assumes you can make calls in the thread outside > of a signal handler. Unfortuntately, I need to setup the tsd dtor from the signal handler. > >> By the way, rather than adding a new "EPERM" return code to >> pthread_setspecific to handle the case where it is invoked >> from a thread dtors (which would be non-posix and unexpected), >> perhaps we should introduce a pthread_setspecific_np() ? > > pthread_setspecific has well defined behaviour in dtors > so EPERM is not ok. > > but i dont think _np api for this is appropriate either. So far the _np API is the less unappealing solution I have found to solve the problem at hand. Do you have alternative solutions to propose ? Thanks, Mathieu
On 19/10/17 13:59, Mathieu Desnoyers wrote: > ----- On Oct 19, 2017, at 8:03 AM, Szabolcs Nagy szabolcs.nagy@arm.com wrote: >>> By the way, rather than adding a new "EPERM" return code to >>> pthread_setspecific to handle the case where it is invoked >>> from a thread dtors (which would be non-posix and unexpected), >>> perhaps we should introduce a pthread_setspecific_np() ? >> >> pthread_setspecific has well defined behaviour in dtors >> so EPERM is not ok. >> >> but i dont think _np api for this is appropriate either. > > So far the _np API is the less unappealing solution I have > found to solve the problem at hand. Do you have alternative > solutions to propose ? > the original approach tried to make pthread_setspecific as-safe as an extension which has non-trivial requirement on thread exit as discussed: thread exit retries the tsd dtors until all tsd values are NULL, if setspecific may run in a signal handler that means this loop should only stop if all tsd values are still NULL after signals are masked (and signals remain masked until the thread exits). this way when your library is called you can always do the 'register thread' operation and then the dtor with 'unregister thread' operation will be run even if the call happened in a signal interrupting tsd dtors. (you may need to mask signals in your own dtor though) currently the dtor retry loop only runs a fixed number of iterations (dtors of non-NULL tsd values after the last iteration are not run), but posix allows an infinite loop there, which would be appropriate for as-safe setspecific. (i don't know if that breaks anything) your second approach exposes whether tsd dtors have started running and then you skip 'register thread' if it seems the thread is exiting. there are various ways to expose this information (return value of pthread_setspecific_np may not be the best one) and it's not clear if this is useful for others as well or just for this specific application (which does not seem to care about library calls that happen after the thread is exiting)
----- On Oct 19, 2017, at 12:55 PM, Szabolcs Nagy szabolcs.nagy@arm.com wrote: > On 19/10/17 13:59, Mathieu Desnoyers wrote: >> ----- On Oct 19, 2017, at 8:03 AM, Szabolcs Nagy szabolcs.nagy@arm.com wrote: >>>> By the way, rather than adding a new "EPERM" return code to >>>> pthread_setspecific to handle the case where it is invoked >>>> from a thread dtors (which would be non-posix and unexpected), >>>> perhaps we should introduce a pthread_setspecific_np() ? >>> >>> pthread_setspecific has well defined behaviour in dtors >>> so EPERM is not ok. >>> >>> but i dont think _np api for this is appropriate either. >> >> So far the _np API is the less unappealing solution I have >> found to solve the problem at hand. Do you have alternative >> solutions to propose ? >> > > the original approach tried to make pthread_setspecific > as-safe as an extension which has non-trivial requirement > on thread exit as discussed: > > thread exit retries the tsd dtors until all tsd values > are NULL, if setspecific may run in a signal handler > that means this loop should only stop if all tsd values > are still NULL after signals are masked (and signals > remain masked until the thread exits). > > this way when your library is called you can always > do the 'register thread' operation and then the dtor > with 'unregister thread' operation will be run even > if the call happened in a signal interrupting tsd dtors. > (you may need to mask signals in your own dtor though) > > currently the dtor retry loop only runs a fixed number > of iterations (dtors of non-NULL tsd values after the > last iteration are not run), but posix allows an infinite > loop there, which would be appropriate for as-safe > setspecific. (i don't know if that breaks anything) > > your second approach exposes whether tsd dtors have > started running and then you skip 'register thread' > if it seems the thread is exiting. there are various > ways to expose this information (return value of > pthread_setspecific_np may not be the best one) and > it's not clear if this is useful for others as well > or just for this specific application (which does > not seem to care about library calls that happen > after the thread is exiting) Good points! This brings me to a third approach that would fit my use-case, and perhaps be more generic than the pthread_setspecific_np approach. We could introduce a new as-safe "query" function, e.g. pthread_exiting(), which could be called by the application to conditionally decide not to invoke pthread_setspecific() from dtors and signal handlers nested over dtors. In applications like a tracer, or liburcu-bp trying to register a thread, doing that registration while in the thread dtors is pretty much useless, so it could be simply skipped. Thoughts ? Thanks, Mathieu
On 10/19/2017 12:42 PM, Mathieu Desnoyers wrote: > Good points! This brings me to a third approach that would > fit my use-case, and perhaps be more generic than the > pthread_setspecific_np approach. We could introduce a > new as-safe "query" function, e.g. pthread_exiting(), > which could be called by the application to conditionally > decide not to invoke pthread_setspecific() from dtors and > signal handlers nested over dtors. In applications like a > tracer, or liburcu-bp trying to register a thread, doing > that registration while in the thread dtors is pretty much > useless, so it could be simply skipped. APIs that check status usually have lifetime decidability issues with the objects they check. This is a deep design topic. I'm not going to delve too deep here. To be honest what I'd like to see from you before we go down the deep topic of API design is this: A detailed writeup of your use case and the behaviour you're trying to support. I still don't have a clear grasp on what appears to be two rather specific requirements for the library and threads it creates.
diff --git a/manual/threads.texi b/manual/threads.texi index 769d974d50..62504ecfa9 100644 --- a/manual/threads.texi +++ b/manual/threads.texi @@ -53,20 +53,22 @@ is it called during thread exit. @c pthread_getspecific ok Return the thread-specific data associated with @var{key} in the calling thread. +As a non-POSIX extension, @code{pthread_getspecific} is async-signal safe. @end deftypefun @deftypefun int pthread_setspecific (pthread_key_t @var{key}, const void *@var{value}) @standards{POSIX, pthread.h} -@safety{@prelim{}@mtsafe{}@asunsafe{@asucorrupt{} @ascuheap{}}@acunsafe{@acucorrupt{} @acsmem{}}} -@c pthread_setspecific @asucorrupt @ascuheap @acucorrupt @acsmem -@c a level2 block may be allocated by a signal handler after -@c another call already made a decision to allocate it, thus losing -@c the allocated value. the seq number is updated before the -@c value, which might cause an earlier-generation value to seem -@c current if setspecific is cancelled or interrupted by a signal +@safety{@prelim{}@mtsafe{}@asunsafe{@asucorrupt{}}@acunsafe{@acucorrupt{} @acsmem{}}} +@c pthread_setspecific @asucorrupt @acucorrupt @acsmem +@c the seq number is updated before the value, which might cause +@c an earlier-generation value to seem current if setspecific is +@c cancelled or interrupted by a signal @c KEY_UNUSED ok -@c calloc dup @ascuheap @acsmem +@c dup @acsmem Associate the thread-specific @var{value} with @var{key} in the calling thread. +As a non-POSIX extension, @code{pthread_setspecific} is async-signal safe for +concurrent invocations against different @var{key}, but not async-signal +safe for concurrent invocations against the same @var{key}. @end deftypefun diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c index 992331e280..78abc07f91 100644 --- a/nptl/pthread_create.c +++ b/nptl/pthread_create.c @@ -33,6 +33,7 @@ #include <default-sched.h> #include <futex-internal.h> #include "libioP.h" +#include <sys/mman.h> #include <shlib-compat.h> @@ -327,7 +328,7 @@ __nptl_deallocate_tsd (void) { /* The first block is allocated as part of the thread descriptor. */ - free (level2); + (void) munmap (level2, PTHREAD_KEY_2NDLEVEL_SIZE * sizeof (*level2)); THREAD_SETMEM_NC (self, specific, cnt, NULL); } } diff --git a/nptl/pthread_setspecific.c b/nptl/pthread_setspecific.c index 214af3b661..2beb7f97fc 100644 --- a/nptl/pthread_setspecific.c +++ b/nptl/pthread_setspecific.c @@ -18,6 +18,7 @@ #include <errno.h> #include <stdlib.h> +#include <sys/mman.h> #include "pthreadP.h" @@ -61,18 +62,35 @@ __pthread_setspecific (pthread_key_t key, const void *value) level2 = THREAD_GETMEM_NC (self, specific, idx1st); if (level2 == NULL) { + sigset_t ss, old_ss; + if (value == NULL) /* We don't have to do anything. The value would in any case be NULL. We can save the memory allocation. */ return 0; + /* Ensure that pthread_setspecific and pthread_getspecific are + signal-safe when used on different keys. */ + sigfillset (&ss); + pthread_sigmask (SIG_BLOCK, &ss, &old_ss); + /* Read level2 again with signals disabled. */ + level2 = THREAD_GETMEM_NC (self, specific, idx1st); + if (level2 != NULL) + goto skip_resize; + level2 - = (struct pthread_key_data *) calloc (PTHREAD_KEY_2NDLEVEL_SIZE, - sizeof (*level2)); - if (level2 == NULL) + = (struct pthread_key_data *) mmap (NULL, + PTHREAD_KEY_2NDLEVEL_SIZE * sizeof (*level2), + PROT_READ | PROT_WRITE, MAP_ANONYMOUS | MAP_PRIVATE, + -1, 0); + if (level2 == MAP_FAILED) { + pthread_sigmask (SIG_SETMASK, &old_ss, NULL); return ENOMEM; + } THREAD_SETMEM_NC (self, specific, idx1st, level2); +skip_resize: + pthread_sigmask (SIG_SETMASK, &old_ss, NULL); } /* Pointer to the right array element. */
The intent here is to provide signal-safety against callers to pthread_{get/set}specific that work on different pthread keys, without hurting performance of the normal use-cases that do not care about signal-safety. One thing to keep in mind is that callers of pthread_{get/set}specific on a given key can disable signals if they require signal-safety against operations on their key. The real problem in my situation (lttng-ust tracer) is having pthread_setspecific (invoked by the application) do memory allocation and update pointers without disabling signals when it needs to allocate "level2". This only happens the first time a key >= PTHREAD_KEY_2NDLEVEL_SIZE is encountered by a thread. If lttng-ust tracing inserted into a signal handler nests over this allocation, the application pthread key specific may be lost, and memory leaked. Also, use directly mmap rather than calloc to ensure signal-safety. Make just the memory allocation part of pthread_setspecific signal-safe. Given this is not required very often, it should not cause a significant overhead. Document those new non-POSIX async-signal-safety guarantees in pthread_getspecific and pthread_setspecific user documentation. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> CC: "Carlos O'Donell" <carlos@redhat.com> CC: Florian Weimer <fw@deneb.enyo.de> CC: "Ben Maurer" <bmaurer@fb.com> CC: libc-alpha@sourceware.org --- manual/threads.texi | 18 ++++++++++-------- nptl/pthread_create.c | 3 ++- nptl/pthread_setspecific.c | 24 +++++++++++++++++++++--- 3 files changed, 33 insertions(+), 12 deletions(-)