Message ID | 818b80a3-a679-896a-f830-9ddf96e216fa@redhat.com |
---|---|
State | New |
Series | locale: Avoid warning in --verbose mode for non-symbolic character. |
* Carlos O'Donell:

> 2017-10-13  Carlos O'Donell  <carlos@redhat.com>
>
>     * locale/programs/linereader.c (get_string): Don't warn on
>     non-symbolic character.

Looks good to me. (I support moving away from the <Uxxxx> symbols.)
On 10/13/2017 01:43 PM, Florian Weimer wrote:
> * Carlos O'Donell:
>
>> 2017-10-13  Carlos O'Donell  <carlos@redhat.com>
>>
>>     * locale/programs/linereader.c (get_string): Don't warn on
>>     non-symbolic character.
>
> Looks good to me. (I support moving away from the <Uxxxx> symbols.)

Thanks. Pushed.
On Fri, Oct 13, 2017 at 3:34 PM, Carlos O'Donell <carlos@redhat.com> wrote:
>
> I'm working through a new C.UTF-8 locale and I need this fix to avoid
> warnings in --verbose mode.

Yay officially supported C.UTF-8.

> Zack Weinberg's excellent scripts are coming along we can use these to find
> instances of human errors in the scripts:
> https://sourceware.org/ml/libc-alpha/2017-07/msg00860.html
> https://sourceware.org/ml/libc-alpha/2017-08/msg00136.html

The latest edition of the script is
https://sourceware.org/ml/libc-alpha/2017-08/msg01199.html
and I've taken it as far as I can with my complete lack of knowledge of
what actually goes in locale definitions.

I would like to think we could get rid of all of the uses of non-NFC
strings in the definition files, but I really don't know if that's
possible, or what would be a good way to express to the script that some
non-normalized strings are intentional. I'm not planning on working on it
any more without a whole bunch of guidance from Mike, Rafal, and company.

zw
diff --git a/locale/programs/linereader.c b/locale/programs/linereader.c
index 52b3409..02fb476 100644
--- a/locale/programs/linereader.c
+++ b/locale/programs/linereader.c
@@ -634,7 +634,6 @@ get_string (struct linereader *lr, const struct charmap_t *charmap,
   size_t buf2act = 0;
   size_t buf2max = 56 * sizeof (uint32_t);
   int ch;
-  int warned = 0;
 
   /* We have to provide the wide character result as well.  */
   if (return_widestr)
@@ -664,13 +663,6 @@ get_string (struct linereader *lr, const struct charmap_t *charmap,
 	      break;
 	    }
 
-	  if (verbose && !warned)
-	    {
-	      lr_error (lr, _("\
-non-symbolic character value should not be used"));
-	      warned = 1;
-	    }
-
 	  ADDC (ch);
 	  if (return_widestr)
 	    ADDWC ((uint32_t) ch);
Mike,

I'm working through a new C.UTF-8 locale and I need this fix to avoid
warnings in --verbose mode.

~~~ commit msg:
locale: Avoid warning in --verbose mode for non-symbolic character.

In "Is it OK to write ASCII strings directly into locale source files?"
https://sourceware.org/ml/libc-alpha/2017-07/msg00807.html
there is universal consensus that we do not have to keep writing <Uxxxx>
symbolic characters in locale files. Ulrich Drepper's historical comment
was that symbolic characters were used for the eventuality of converting
the source files to any encoding system. Fast forward to today and UTF-8
is the standard, so the requirement of <Uxxxx> is hard to justify.

Zack Weinberg's excellent scripts are coming along; we can use these to
find instances of human errors in the scripts:
https://sourceware.org/ml/libc-alpha/2017-07/msg00860.html
https://sourceware.org/ml/libc-alpha/2017-08/msg00136.html

It still won't be easy to distinguish i from í, but the same is true of
<Uxxxx> characters, which humans can't read either.

Since we all agreed that we should be able to use non-symbolic characters
(that is, literal characters rather than <Uxxxx> symbols) in locale files,
the following change removes the verbose warning that is raised if you use
non-symbolic characters in the locale file.

Signed-off-by: Carlos O'Donell <carlos@redhat.com>
~~~

Tested by building all locales, which showed no warnings.
Tested with my C.UTF-8 locale, which issued warnings before and no longer
does, both times using --verbose.

OK to check in?

2017-10-13  Carlos O'Donell  <carlos@redhat.com>

    * locale/programs/linereader.c (get_string): Don't warn on
    non-symbolic character.

---
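
As a rough illustration of what the <Uxxxx> notation stands for, the sketch below expands such symbols into the UTF-8 bytes that can now be written directly in a locale file without the --verbose warning. It is not code from glibc or localedef: the names expand_symbols, encode_utf8 and hex_value are hypothetical, and accepting 4 to 8 hex digits between "<U" and ">" is an assumption made for this example only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: value of one hex digit, or -1 if C is not one.  */
static int
hex_value (char c)
{
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  return -1;
}

/* Hypothetical helper: encode code point CP as UTF-8 into OUT (4 bytes
   of room).  Returns the byte count, or 0 for surrogates and values
   above U+10FFFF.  */
static size_t
encode_utf8 (uint32_t cp, char *out)
{
  if (cp < 0x80)
    {
      out[0] = (char) cp;
      return 1;
    }
  if (cp < 0x800)
    {
      out[0] = (char) (0xC0 | (cp >> 6));
      out[1] = (char) (0x80 | (cp & 0x3F));
      return 2;
    }
  if (cp >= 0xD800 && cp <= 0xDFFF)
    return 0;
  if (cp < 0x10000)
    {
      out[0] = (char) (0xE0 | (cp >> 12));
      out[1] = (char) (0x80 | ((cp >> 6) & 0x3F));
      out[2] = (char) (0x80 | (cp & 0x3F));
      return 3;
    }
  if (cp < 0x110000)
    {
      out[0] = (char) (0xF0 | (cp >> 18));
      out[1] = (char) (0x80 | ((cp >> 12) & 0x3F));
      out[2] = (char) (0x80 | ((cp >> 6) & 0x3F));
      out[3] = (char) (0x80 | (cp & 0x3F));
      return 4;
    }
  return 0;
}

/* Copy SRC to DST, replacing each well-formed <Uxxxx> symbol (assumed
   here to carry 4 to 8 hex digits) with its UTF-8 encoding.  Anything
   else is copied byte for byte.  Returns 0 on success, -1 if DST (of
   DSTLEN bytes) is too small.  */
static int
expand_symbols (const char *src, char *dst, size_t dstlen)
{
  size_t used = 0;

  assert (src != NULL && dst != NULL);
  while (*src != '\0')
    {
      char buf[4];
      size_t n = 0;

      if (src[0] == '<' && src[1] == 'U')
        {
          uint32_t cp = 0;
          size_t digits = 0;

          while (digits < 8 && hex_value (src[2 + digits]) >= 0)
            {
              cp = cp * 16 + (uint32_t) hex_value (src[2 + digits]);
              digits++;
            }
          if (digits >= 4 && src[2 + digits] == '>')
            n = encode_utf8 (cp, buf);
          if (n > 0)
            src += 2 + digits + 1;   /* Skip "<U", the digits and ">".  */
        }
      if (n == 0)
        {
          /* Not a symbol we recognise: copy one byte unchanged.  */
          buf[0] = *src++;
          n = 1;
        }
      if (used + n + 1 > dstlen)     /* Keep room for the terminator.  */
        return -1;
      memcpy (dst + used, buf, n);
      used += n;
    }
  if (used >= dstlen)
    return -1;
  dst[used] = '\0';
  return 0;
}
```

With this sketch, calling expand_symbols ("<U0063><U00ED>", buf, sizeof buf) leaves the two-character string "cí" in buf, so the symbolic and the literal spelling describe the same text.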