Message ID | 20200513202630.2123238-1-adhemerval.zanella@linaro.org |
---|---|
Headers | show |
Series | Make strsignal and strerror async-signal-safe | expand |
* Adhemerval Zanella via Libc-alpha: > 2. Avoid to try translate the returned message. No more translation is a significant change. Is this really appropriate? Using strerror output in translated strings is very common: <https://codesearch.debian.net/search?q=_%5C%28.*strerror&literal=0> I still prefer my approach with new functions which return a string (with the name of the constant), but only for known error numbers. I think the constant names would also be more useful to us in diagnostics. > The translation avoidance is more tricky because it is a semantic > change and that's why the patchset also add a new strsignal_l > symbol. Some systems (bionic) does provide async-signal-safe > version of the symbols, other system (FreeBSD) it is configurable. > Also, some system also does not translate the messags (bionic > and AIX7). Isn't bionic basically a loader for Dalvik, with few end-user visible use cases? So it might not be the right reference. Thanks, Florian
On 14/05/2020 09:05, Florian Weimer wrote: > * Adhemerval Zanella via Libc-alpha: > >> 2. Avoid to try translate the returned message. > > No more translation is a significant change. Is this really > appropriate? I take that once we have proper symbols that provides the translation functionality, making them async-signal-safe is an welcome improvement. The glibc itself uses the __sys_* access on libSegFault due this shortcoming. And I am not if other programs are really aware of the shortcoming of the always translatable error strings. > > Using strerror output in translated strings is very common: > > <https://codesearch.debian.net/search?q=_%5C%28.*strerror&literal=0> > > I still prefer my approach with new functions which return a string > (with the name of the constant), but only for known error numbers. I > think the constant names would also be more useful to us in diagnostics. New non-standard symbols takes time and code effort to be used, and it would required either to be implemented in other libc to fully portable or even additional efforts from the projects to use it (and there is recent trend to avoid use new glibc symbol that might break the usage on older glibc, although it is not really supported from out side). So I do prefer to improve current interfaces than add newer ones (unless there are functionalities that current approach does not support). The translation from str{signal,error} is an implementation detail and I was done mainly glibc also exports sys_* (which has its own issues and shortcomings). And I am not if changing really breaks thing here, in fact it might provide more robust log message if they are being generated in signal handler (which seems to be common in some code). > >> The translation avoidance is more tricky because it is a semantic >> change and that's why the patchset also add a new strsignal_l >> symbol. Some systems (bionic) does provide async-signal-safe >> version of the symbols, other system (FreeBSD) it is configurable. >> Also, some system also does not translate the messags (bionic >> and AIX7). > > Isn't bionic basically a loader for Dalvik, with few end-user visible > use cases? So it might not be the right reference. Not sure, although android does use multiple services in native code. > > Thanks, > Florian >
On Thu, May 14, 2020 at 8:47 AM Adhemerval Zanella via Libc-alpha <libc-alpha@sourceware.org> wrote: > On 14/05/2020 09:05, Florian Weimer wrote: > > * Adhemerval Zanella via Libc-alpha: > > > >> 2. Avoid to try translate the returned message. > > > > No more translation is a significant change. Is this really > > appropriate? > > > > Using strerror output in translated strings is very common: > > > > <https://codesearch.debian.net/search?q=_%5C%28.*strerror&literal=0> > > I take that once we have proper symbols that provides the translation > functionality, making them async-signal-safe is an welcome improvement. > > The glibc itself uses the __sys_* access on libSegFault due this > shortcoming. And I am not if other programs are really aware of > the shortcoming of the always translatable error strings. I think I'm with Florian here. There is lots of existing code that expects strerror() to translate, and _most_ of the time it's not being called from an async signal handler, making everyone change their code to use strerror_l (and get a locale_t to use it with) just to get back the behavior they're used to seems like too much churn. It would be nice if there was a way for strerror to detect that it wasn't safe for it to call dcgettext or allocate memory. In principle we ought to be able to try-lock the "I might need to initialize the global locale now" lock but, looking at the guts of dcgettext, there's several different locks involved and it's not clear to me that we could make this work without a major overhaul of libintl. (Giant-hammer idea: have sigaction wrap all signal handlers with a routine that sets a thread-local flag while calling the signal handler. siglongjmp and swapcontext would need to restore this flag.) zw
* Adhemerval Zanella: > On 14/05/2020 09:05, Florian Weimer wrote: >> * Adhemerval Zanella via Libc-alpha: >> >>> 2. Avoid to try translate the returned message. >> >> No more translation is a significant change. Is this really >> appropriate? > > I take that once we have proper symbols that provides the translation > functionality, making them async-signal-safe is an welcome > improvement. But why would non-portable software switch to those interfaces? It seems more conservative to me to introduce new interfaces that are async-signal-safe (and thread-safe). This change also continues the pattern of removal of widely-used interfaces without a deprecation warning (although these interfaces have been informally deprecated for a long time, so maybe that's okay). > The glibc itself uses the __sys_* access on libSegFault due this > shortcoming. And I am not if other programs are really aware of > the shortcoming of the always translatable error strings. Thread safety for strerror can be a problem. We can fix that independently. Also the use of legacy TLS is problematic and we should get rid of that. >> Using strerror output in translated strings is very common: >> >> <https://codesearch.debian.net/search?q=_%5C%28.*strerror&literal=0> >> >> I still prefer my approach with new functions which return a string >> (with the name of the constant), but only for known error numbers. I >> think the constant names would also be more useful to us in diagnostics. > > New non-standard symbols takes time and code effort to be used, and > it would required either to be implemented in other libc to fully > portable or even additional efforts from the projects to use it > (and there is recent trend to avoid use new glibc symbol that might > break the usage on older glibc, although it is not really supported > from out side). Yes, but as I showed you, your proposed change leads to partially translated strings. I don't see a compelling reason to do that. Yes, intl/ is not async-signal-safe, and it would be some work to make it so, but that isn't justification enough to drop the translation. Thanks, Florian
On 14/05/2020 12:16, Florian Weimer wrote: > * Adhemerval Zanella: > >> On 14/05/2020 09:05, Florian Weimer wrote: >>> * Adhemerval Zanella via Libc-alpha: >>> >>>> 2. Avoid to try translate the returned message. >>> >>> No more translation is a significant change. Is this really >>> appropriate? >> >> I take that once we have proper symbols that provides the translation >> functionality, making them async-signal-safe is an welcome >> improvement. > > But why would non-portable software switch to those interfaces? > > It seems more conservative to me to introduce new interfaces that are > async-signal-safe (and thread-safe). The downside is new interface usually takes time to be adopted, they add even more code bloat/complexity to support old interfaces, and they does help current usage. > > This change also continues the pattern of removal of widely-used > interfaces without a deprecation warning (although these interfaces have > been informally deprecated for a long time, so maybe that's okay). > >> The glibc itself uses the __sys_* access on libSegFault due this >> shortcoming. And I am not if other programs are really aware of >> the shortcoming of the always translatable error strings. > > Thread safety for strerror can be a problem. We can fix that > independently. Also the use of legacy TLS is problematic and we should > get rid of that. By legacy TLS do you mean __thread usage? > >>> Using strerror output in translated strings is very common: >>> >>> <https://codesearch.debian.net/search?q=_%5C%28.*strerror&literal=0> >>> >>> I still prefer my approach with new functions which return a string >>> (with the name of the constant), but only for known error numbers. I >>> think the constant names would also be more useful to us in diagnostics. >> >> New non-standard symbols takes time and code effort to be used, and >> it would required either to be implemented in other libc to fully >> portable or even additional efforts from the projects to use it >> (and there is recent trend to avoid use new glibc symbol that might >> break the usage on older glibc, although it is not really supported >> from out side). > > Yes, but as I showed you, your proposed change leads to partially > translated strings. I don't see a compelling reason to do that. Yes, > intl/ is not async-signal-safe, and it would be some work to make it so, > but that isn't justification enough to drop the translation. What do you mean by *partially* translated?
* Adhemerval Zanella: > On 14/05/2020 12:16, Florian Weimer wrote: >> * Adhemerval Zanella: >> >>> On 14/05/2020 09:05, Florian Weimer wrote: >>>> * Adhemerval Zanella via Libc-alpha: >>>> >>>>> 2. Avoid to try translate the returned message. >>>> >>>> No more translation is a significant change. Is this really >>>> appropriate? >>> >>> I take that once we have proper symbols that provides the translation >>> functionality, making them async-signal-safe is an welcome >>> improvement. >> >> But why would non-portable software switch to those interfaces? >> >> It seems more conservative to me to introduce new interfaces that are >> async-signal-safe (and thread-safe). > > The downside is new interface usually takes time to be adopted, they add > even more code bloat/complexity to support old interfaces, and they > does help current usage. But dropping features from existing interfaces does not seem a good alternative in this case. >> This change also continues the pattern of removal of widely-used >> interfaces without a deprecation warning (although these interfaces have >> been informally deprecated for a long time, so maybe that's okay). >> >>> The glibc itself uses the __sys_* access on libSegFault due this >>> shortcoming. And I am not if other programs are really aware of >>> the shortcoming of the always translatable error strings. >> >> Thread safety for strerror can be a problem. We can fix that >> independently. Also the use of legacy TLS is problematic and we should >> get rid of that. > > By legacy TLS do you mean __thread usage? No, the libc-internal equivalent of pthread_key_create (__libc_key_create). >> Yes, but as I showed you, your proposed change leads to partially >> translated strings. I don't see a compelling reason to do that. Yes, >> intl/ is not async-signal-safe, and it would be some work to make it so, >> but that isn't justification enough to drop the translation. > > What do you mean by *partially* translated? First hit from the Debian Code Search query I posted: die(_("cannot open %s: %s"), device_name, strerror(errno)); Currently, this translates to: DEVICE_NAME kann nicht geöffnet werden: Datei oder Verzeichnis nicht gefunden After removing translation from strerror, this turns into: DEVICE_NAME kann nicht geöffnet werden: No such file or directory That's a clear regression. Thanks, Florian
On 14/05/2020 12:11, Zack Weinberg wrote: > On Thu, May 14, 2020 at 8:47 AM Adhemerval Zanella via Libc-alpha > <libc-alpha@sourceware.org> wrote: >> On 14/05/2020 09:05, Florian Weimer wrote: >>> * Adhemerval Zanella via Libc-alpha: >>> >>>> 2. Avoid to try translate the returned message. >>> >>> No more translation is a significant change. Is this really >>> appropriate? >>> >>> Using strerror output in translated strings is very common: >>> >>> <https://codesearch.debian.net/search?q=_%5C%28.*strerror&literal=0> >> >> I take that once we have proper symbols that provides the translation >> functionality, making them async-signal-safe is an welcome improvement. >> >> The glibc itself uses the __sys_* access on libSegFault due this >> shortcoming. And I am not if other programs are really aware of >> the shortcoming of the always translatable error strings. > > I think I'm with Florian here. There is lots of existing code that > expects strerror() to translate, and _most_ of the time it's not being > called from an async signal handler, making everyone change their code > to use strerror_l (and get a locale_t to use it with) just to get back > the behavior they're used to seems like too much churn. Alright, I took this semantic change would to make them async-signal-safe allow to be more beneficial in the long term (in both code complexity and more concise API) but looks like we should go to newer symbols instead. > > It would be nice if there was a way for strerror to detect that it > wasn't safe for it to call dcgettext or allocate memory. In principle > we ought to be able to try-lock the "I might need to initialize the > global locale now" lock but, looking at the guts of dcgettext, there's > several different locks involved and it's not clear to me that we > could make this work without a major overhaul of libintl. > > (Giant-hammer idea: have sigaction wrap all signal handlers with a > routine that sets a thread-local flag while calling the signal > handler. siglongjmp and swapcontext would need to restore this flag.) The sigaction wrap was a suggestion I proposed to Florian some time ago as a way to detect a code was running in a signal handler. FreeBSB does this for the its libpthread analogous and its adds some complexity (it uses a static list and it requires some locking to handle concurrent access), but I think it should be feasible.
On 14/05/2020 13:40, Florian Weimer wrote: > * Adhemerval Zanella: > >> On 14/05/2020 12:16, Florian Weimer wrote: >>> * Adhemerval Zanella: >>> >>>> On 14/05/2020 09:05, Florian Weimer wrote: >>>>> * Adhemerval Zanella via Libc-alpha: >>>>> >>>>>> 2. Avoid to try translate the returned message. >>>>> >>>>> No more translation is a significant change. Is this really >>>>> appropriate? >>>> >>>> I take that once we have proper symbols that provides the translation >>>> functionality, making them async-signal-safe is an welcome >>>> improvement. >>> >>> But why would non-portable software switch to those interfaces? >>> >>> It seems more conservative to me to introduce new interfaces that are >>> async-signal-safe (and thread-safe). >> >> The downside is new interface usually takes time to be adopted, they add >> even more code bloat/complexity to support old interfaces, and they >> does help current usage. > > But dropping features from existing interfaces does not seem a good > alternative in this case. > >>> This change also continues the pattern of removal of widely-used >>> interfaces without a deprecation warning (although these interfaces have >>> been informally deprecated for a long time, so maybe that's okay). >>> >>>> The glibc itself uses the __sys_* access on libSegFault due this >>>> shortcoming. And I am not if other programs are really aware of >>>> the shortcoming of the always translatable error strings. >>> >>> Thread safety for strerror can be a problem. We can fix that >>> independently. Also the use of legacy TLS is problematic and we should >>> get rid of that. >> >> By legacy TLS do you mean __thread usage? > > No, the libc-internal equivalent of pthread_key_create > (__libc_key_create). Right, it seems that dlfcn is the only place that still uses it. What about moving per-thread state to __thread (generic) / pthread (Linux), do you have some reservation about it? > >>> Yes, but as I showed you, your proposed change leads to partially >>> translated strings. I don't see a compelling reason to do that. Yes, >>> intl/ is not async-signal-safe, and it would be some work to make it so, >>> but that isn't justification enough to drop the translation. >> >> What do you mean by *partially* translated? > > First hit from the Debian Code Search query I posted: > > die(_("cannot open %s: %s"), device_name, strerror(errno)); > > Currently, this translates to: > > DEVICE_NAME kann nicht geöffnet werden: Datei oder Verzeichnis nicht gefunden > > After removing translation from strerror, this turns into: > > DEVICE_NAME kann nicht geöffnet werden: No such file or directory > > That's a clear regression. Alright, as from Zack message I took this semantic change would to make them async-signal-safe allow to be more beneficial in the long term (in both code complexity and more concise API) but looks like we should go to newer symbols instead.
* Adhemerval Zanella: > What about moving per-thread state to __thread (generic) / pthread (Linux), > do you have some reservation about it? No, that makes sense. You can free allocations via malloc/thread-freeres.c. I think I have a patch somewhere that unifies strerror and strerror_l at the same time, or I can review yours. I read POSIX as saying that strerror and strerror_l can share the same buffer. Thanks, Florian
On 14/05/2020 13:51, Florian Weimer wrote: > * Adhemerval Zanella: > >> What about moving per-thread state to __thread (generic) / pthread (Linux), >> do you have some reservation about it? > > No, that makes sense. You can free allocations via > malloc/thread-freeres.c. > > I think I have a patch somewhere that unifies strerror and strerror_l at > the same time, or I can review yours. I read POSIX as saying that > strerror and strerror_l can share the same buffer. Alright, I will rework the patch to add new async-signal-safe symbols to sys_{sig,err}list.
* Adhemerval Zanella: > On 14/05/2020 13:51, Florian Weimer wrote: >> * Adhemerval Zanella: >> >>> What about moving per-thread state to __thread (generic) / pthread (Linux), >>> do you have some reservation about it? >> >> No, that makes sense. You can free allocations via >> malloc/thread-freeres.c. >> >> I think I have a patch somewhere that unifies strerror and strerror_l at >> the same time, or I can review yours. I read POSIX as saying that >> strerror and strerror_l can share the same buffer. > > Alright, I will rework the patch to add new async-signal-safe symbols to > sys_{sig,err}list. I think it would be more valuable to return the names of the *constants*, not the strings. There's currently no interface for that, and (almost) everybody rolls their own. Thanks, Florian
On 14/05/2020 14:02, Florian Weimer wrote: > * Adhemerval Zanella: > >> On 14/05/2020 13:51, Florian Weimer wrote: >>> * Adhemerval Zanella: >>> >>>> What about moving per-thread state to __thread (generic) / pthread (Linux), >>>> do you have some reservation about it? >>> >>> No, that makes sense. You can free allocations via >>> malloc/thread-freeres.c. >>> >>> I think I have a patch somewhere that unifies strerror and strerror_l at >>> the same time, or I can review yours. I read POSIX as saying that >>> strerror and strerror_l can share the same buffer. >> >> Alright, I will rework the patch to add new async-signal-safe symbols to >> sys_{sig,err}list. > > I think it would be more valuable to return the names of the > *constants*, not the strings. There's currently no interface for that, > and (almost) everybody rolls their own. For signal we have two list, sys_siglist and sys_sigabbrev and former returns the signal name. We don't an analogous for errnos though. What I have in mind would be something like: const char *strlist_abbrev (int num); Return the signal NUM name (for instance,'INT' for SIGINT) or NULL if NUM is not present in the internal list. No translation is done neither a string is created to indicate invalid NUM. const char *strlist_descr (int num); Return the signal NUM description (for instance,'Interrupt' for SIGINT) or NULL if NUM is not present in the internal list. No translation is done neither a string is created to indicate invalid NUM. And for strerror analogous: const char *errlist_abbrev (int num); Return the errno NUM name (for instance,'EINVAL' for EINVAL) or NULL if NUM is not present in the internal list. No translation is done neither a string is created to indicate invalid NUM. const char *errlist_descr (int num); Return the signal NUM description (for instance,'Invalid argument' for EINVAL) or NULL if NUM is not present in the internal list. No translation is done neither a string is created to indicate invalid NUM.
* Zack Weinberg: > It would be nice if there was a way for strerror to detect that it > wasn't safe for it to call dcgettext or allocate memory. In principle > we ought to be able to try-lock the "I might need to initialize the > global locale now" lock but, looking at the guts of dcgettext, there's > several different locks involved and it's not clear to me that we > could make this work without a major overhaul of libintl. Right. > (Giant-hammer idea: have sigaction wrap all signal handlers with a > routine that sets a thread-local flag while calling the signal > handler. siglongjmp and swapcontext would need to restore this flag.) It's hard to ensure that the sigaction flags match the registered signal handler at all times, though. Trampolines would work, but they introduce a lot of complexity. There is some discussion here: <https://sourceware.org/glibc/wiki/SignalHandlerWrapper> Thanks, Florian