
locale: Avoid warning in --verbose mode for non-symbolic character.

Message ID 818b80a3-a679-896a-f830-9ddf96e216fa@redhat.com
State New
Series locale: Avoid warning in --verbose mode for non-symbolic character.

Commit Message

Carlos O'Donell Oct. 13, 2017, 7:34 p.m. UTC
Mike,

I'm working through a new C.UTF-8 locale and I need this fix to avoid
warnings in --verbose mode.

~~~ commit msg: 
locale: Avoid warning in --verbose mode for non-symbolic character.

In "Is it OK to write ASCII strings directly into locale source files?"
https://sourceware.org/ml/libc-alpha/2017-07/msg00807.html
there is universal consensus that we do not have to keep writing <Uxxxx>
symbolic characters in locale files.

Ulrich Drepper's historical comment was that symbolic characters were
used for the eventuality of converting the source files to any encoding
system. Fast forward to today and UTF-8 is the standard. So the requirement
of <Uxxxx> is hard to justify.

Zack Weinberg's excellent scripts are coming along, and we can use them to
find instances of human error in the locale files:
https://sourceware.org/ml/libc-alpha/2017-07/msg00860.html
https://sourceware.org/ml/libc-alpha/2017-08/msg00136.html

It still won't be easy to distinguish i from í, but that is equally true
of <Uxxxx> characters, which humans can't read either.

Since we all agreed that we should be able to use non-symbolic characters
(that is, literal characters rather than <Uxxxx> names) in locale files,
the following change removes the warning that --verbose mode raises when a
locale file uses them.

Signed-off-by: Carlos O'Donell <carlos@redhat.com>
~~~

Tested by building all locales with --verbose and confirming that no
warnings are shown.
Tested with my C.UTF-8 locale, which issued warnings under --verbose
before this change and no longer does.

OK to check in?

2017-10-13  Carlos O'Donell  <carlos@redhat.com>

	* locale/programs/linereader.c (get_string): Don't warn on
	non-symbolic character.

---

Comments

Florian Weimer Oct. 13, 2017, 8:43 p.m. UTC | #1
* Carlos O'Donell:

> 2017-10-13  Carlos O'Donell  <carlos@redhat.com>
>
> 	* locale/programs/linereader.c (get_string): Don't warn on
> 	non-symbolic character.

Looks good to me.  (I support moving away from the <Uxxxx> symbols.)

Carlos O'Donell Oct. 13, 2017, 9:38 p.m. UTC | #2
On 10/13/2017 01:43 PM, Florian Weimer wrote:
> * Carlos O'Donell:
> 
>> 2017-10-13  Carlos O'Donell  <carlos@redhat.com>
>>
>> 	* locale/programs/linereader.c (get_string): Don't warn on
>> 	non-symbolic character.
> 
> Looks good to me.  (I support moving away from the <Uxxxx> symbols.)
> 

Thanks. Pushed.

Zack Weinberg Oct. 17, 2017, 6:26 p.m. UTC | #3
On Fri, Oct 13, 2017 at 3:34 PM, Carlos O'Donell <carlos@redhat.com> wrote:
>
> I'm working through a new C.UTF-8 locale and I need this fix to avoid
> warnings in --verbose mode.

Yay officially supported C.UTF-8.

> Zack Weinberg's excellent scripts are coming along we can use these to find
> instances of human errors in the scripts:
> https://sourceware.org/ml/libc-alpha/2017-07/msg00860.html
> https://sourceware.org/ml/libc-alpha/2017-08/msg00136.html

The latest edition of the script is
https://sourceware.org/ml/libc-alpha/2017-08/msg01199.html and I've
taken it as far as I can with my complete lack of knowledge of what
actually goes in locale definitions.  I would like to think we could
get rid of all of the uses of non-NFC strings in the definition files,
but I really don't know if that's possible, or what would be a good
way to express to the script that some non-normalized strings are
intentional.  I'm not planning on working on it any more without a
whole bunch of guidance from Mike, Rafal, and company.

zw

Patch

diff --git a/locale/programs/linereader.c b/locale/programs/linereader.c
index 52b3409..02fb476 100644
--- a/locale/programs/linereader.c
+++ b/locale/programs/linereader.c
@@ -634,7 +634,6 @@  get_string (struct linereader *lr, const struct charmap_t *charmap,
       size_t buf2act = 0;
       size_t buf2max = 56 * sizeof (uint32_t);
       int ch;
-      int warned = 0;
 
       /* We have to provide the wide character result as well.  */
       if (return_widestr)
@@ -664,13 +663,6 @@  get_string (struct linereader *lr, const struct charmap_t *charmap,
                    break;
                }
 
-             if (verbose && !warned)
-               {
-                 lr_error (lr, _("\
-non-symbolic character value should not be used"));
-                 warned = 1;
-               }
-
              ADDC (ch);
              if (return_widestr)
                ADDWC ((uint32_t) ch);