Message ID | 540F501B.2080808@redhat.com |
---|---|
State | New |
Headers | show |
Florian Weimer <fweimer@redhat.com> writes: > --- a/iconv/gconv_simple.c > +++ b/iconv/gconv_simple.c > @@ -387,8 +387,7 @@ ucs4_internal_loop_single (struct __gconv_step *step, > return __GCONV_INCOMPLETE_INPUT; > } > > - if (__builtin_expect (((unsigned char *) state->__value.__wchb)[0] > 0x80, > - 0)) > + if (__glibc_unlikely (((unsigned char *) state->__value.__wchb)[0] > 0x80)) I think that is a bug in the original code. Should compare > 0x7f. Likewise in internal_ucs4_loop_unaligned. Contrast to ucs4_internal_loop, which compares inval > 0x7fffffff. > - if (__builtin_expect (flags & GCONV_AVOID_NOCONV, 0) > + if (__glibc_unlikely (flags & GCONV_AVOID_NOCONV) This looks a bit suspicious because it is not obvious that the flags & GCONV_AVOID_NOCONV expression always evaluates to 0 or 1. I'm not sure __glibc_unlikely ((flags & GCONV_AVOID_NOCONV) != 0) would be any nicer, though. My concern was that the code is not very different from __builtin_expect (x & 2, 1), from which an excessively clever compiler could figure out that the expectation cannot ever hold. Anyway, the code is different enough to avoid any such problem. > - /*if (__builtin_expect (ch, 0) == __UNKNOWN_10646_CHAR)*/ \ > + /*if (__glibc_unlikely (ch == __UNKNOWN_10646_CHAR))*/ \ > if (__glibc_unlikely (ch == __UNKNOWN_10646_CHAR)) \ I don't think that comment has any value left. I didn't finish reading the patch yet.
On 09/10/2014 01:55 AM, Kalle Olavi Niemitalo wrote: > Florian Weimer <fweimer@redhat.com> writes: > >> --- a/iconv/gconv_simple.c >> +++ b/iconv/gconv_simple.c >> @@ -387,8 +387,7 @@ ucs4_internal_loop_single (struct __gconv_step *step, >> return __GCONV_INCOMPLETE_INPUT; >> } >> >> - if (__builtin_expect (((unsigned char *) state->__value.__wchb)[0] > 0x80, >> - 0)) >> + if (__glibc_unlikely (((unsigned char *) state->__value.__wchb)[0] > 0x80)) > > I think that is a bug in the original code. Should compare > 0x7f. > Likewise in internal_ucs4_loop_unaligned. > Contrast to ucs4_internal_loop, which compares inval > 0x7fffffff. This might be the case. But it seems this code is effectively dead—iconv only converts complete multi-byte sequences, and the mbr* need a matching locale, and we haven't got a UCS-4 locale (multi-byte character sets which regularly use NUL bytes do not work with C). Please file a bug if you think this still needs fixing, especially if you can write a test case which shows the issue. >> - if (__builtin_expect (flags & GCONV_AVOID_NOCONV, 0) >> + if (__glibc_unlikely (flags & GCONV_AVOID_NOCONV) > > This looks a bit suspicious because it is not obvious that the > flags & GCONV_AVOID_NOCONV expression always evaluates to 0 or 1. > I'm not sure __glibc_unlikely ((flags & GCONV_AVOID_NOCONV) != 0) > would be any nicer, though. Has glibc a rule not to rely on implicit booleans? Than the != 0 variant would be preferred. >> - /*if (__builtin_expect (ch, 0) == __UNKNOWN_10646_CHAR)*/ \ >> + /*if (__glibc_unlikely (ch == __UNKNOWN_10646_CHAR))*/ \ >> if (__glibc_unlikely (ch == __UNKNOWN_10646_CHAR)) \ > > I don't think that comment has any value left. Okay, I'll remove it. > I didn't finish reading the patch yet. Thanks so far!
Florian Weimer <fweimer@redhat.com> writes: > Has glibc a rule not to rely on implicit booleans? Than the != 0 > variant would be preferred. I'm not sure. Roland McGrath has objected to implicit Boolean coercion before; see <https://sourceware.org/ml/libc-alpha/2012-09/msg00218.html> and <https://sourceware.org/ml/libc-alpha/2014-06/msg00119.html>. If there is a consensus, then I guess someone should edit <https://sourceware.org/glibc/wiki/Style_and_Conventions>, in which some examples of good patterns nowadays do "if (buf)" or similar. The GNU Coding Standards don't discuss the issue, AFAICS.
The preferred style in glibc has always been to eschew implicit Boolean coercions, except for the return value of strcmp/memcmp and the like (where the most common idiomatic uses treat the value as a Boolean even though nonzero values have further meaning).
On 09/10/2014 02:07 PM, Roland McGrath wrote: > The preferred style in glibc has always been to eschew implicit Boolean > coercions, except for the return value of strcmp/memcmp and the like (where > the most common idiomatic uses treat the value as a Boolean even though > nonzero values have further meaning). > Wikified. https://sourceware.org/glibc/wiki/Style_and_Conventions#Boolean_Coercions c.
> On 09/10/2014 02:07 PM, Roland McGrath wrote: > > The preferred style in glibc has always been to eschew implicit Boolean > > coercions, except for the return value of strcmp/memcmp and the like (where > > the most common idiomatic uses treat the value as a Boolean even though > > nonzero values have further meaning). > > > > Wikified. > > https://sourceware.org/glibc/wiki/Style_and_Conventions#Boolean_Coercions I forgot to mention the other exception: if (foo & BIT) or if (!(foo & BIT)) is fine, no need for if ((foo & BIT) != 0) or if ((foo & BIT) == 0).
On 09/12/2014 01:52 PM, Roland McGrath wrote: >> On 09/10/2014 02:07 PM, Roland McGrath wrote: >>> The preferred style in glibc has always been to eschew implicit Boolean >>> coercions, except for the return value of strcmp/memcmp and the like (where >>> the most common idiomatic uses treat the value as a Boolean even though >>> nonzero values have further meaning). >>> >> >> Wikified. >> >> https://sourceware.org/glibc/wiki/Style_and_Conventions#Boolean_Coercions > > I forgot to mention the other exception: if (foo & BIT) or if (!(foo & BIT)) > is fine, no need for if ((foo & BIT) != 0) or if ((foo & BIT) == 0). Wikified++ c.
On 09/12/2014 07:52 PM, Roland McGrath wrote: > I forgot to mention the other exception: if (foo & BIT) or if (!(foo & BIT)) > is fine, no need for if ((foo & BIT) != 0) or if ((foo & BIT) == 0). Thanks, so there shouldn't be a need to change the patch.
From 25282e6d2d0bb5d9314ec02845747aa9c86c5606 Mon Sep 17 00:00:00 2001 From: Florian Weimer <fweimer@redhat.com> Date: Tue, 9 Sep 2014 20:37:28 +0200 Subject: [PATCH 3/3] Manual cleanups after __builtin_expect conversion of iconv/ and iconvdata/ This commit removes the one remaining __builtin_expect expressions which can be converted, and fixes long lines introduces by the automated rewriting. 2014-09-09 Florian Weimer <fweimer@redhat.com> * iconv/gconv_dl.c (__gconv_find_shlib): Wrap long line resulting from the automated __builtin_expect conversion. * iconvdata/euc-kr.c (euckr_from_ucs4): Likewise. * iconv/gconv_cache.c (__gconv_load_cache): Replace multi-line __builtin_expect with __glibc_unlikely. diff --git a/iconv/gconv_cache.c b/iconv/gconv_cache.c index cb48eca..dab5ac6 100644 --- a/iconv/gconv_cache.c +++ b/iconv/gconv_cache.c @@ -116,9 +116,9 @@ __gconv_load_cache (void) || __glibc_unlikely (header->string_offset >= cache_size) || __glibc_unlikely (header->hash_offset >= cache_size) || __glibc_unlikely (header->hash_size == 0) - || __builtin_expect ((header->hash_offset + || __glibc_unlikely ((header->hash_offset + header->hash_size * sizeof (struct hash_entry)) - > cache_size, 0) + > cache_size) || __glibc_unlikely (header->module_offset >= cache_size) || __glibc_unlikely (header->otherconv_offset > cache_size)) { diff --git a/iconv/gconv_dl.c b/iconv/gconv_dl.c index 840793d..05448b3 100644 --- a/iconv/gconv_dl.c +++ b/iconv/gconv_dl.c @@ -93,7 +93,8 @@ __gconv_find_shlib (const char *name) found->counter = -TRIES_BEFORE_UNLOAD - 1; found->handle = NULL; - if (__glibc_unlikely (__tsearch (found, &loaded, known_compare) == NULL)) + if (__glibc_unlikely (__tsearch (found, &loaded, known_compare) + == NULL)) { /* Something went wrong while inserting the entry. */ free (found); diff --git a/iconvdata/euc-kr.c b/iconvdata/euc-kr.c index 50d1a8c..2f10453 100644 --- a/iconvdata/euc-kr.c +++ b/iconvdata/euc-kr.c @@ -38,7 +38,8 @@ euckr_from_ucs4 (uint32_t ch, unsigned char *cp) cp[0] = '\xa3'; cp[1] = '\xdc'; } - else if (__glibc_likely (ucs4_to_ksc5601 (ch, cp, 2) != __UNKNOWN_10646_CHAR)) + else if (__glibc_likely (ucs4_to_ksc5601 (ch, cp, 2) + != __UNKNOWN_10646_CHAR)) { cp[0] |= 0x80; cp[1] |= 0x80; -- 1.9.3