localedata: LC_IDENTIFICATION.category: set to ISO 14652 2002 standard

Message ID 20160413163945.GK6588@vapier.lan
State New
Commit Message

Mike Frysinger April 13, 2016, 4:39 p.m. UTC
The ISO 14652 standard defines the valid values for the category
keyword as only two options:
	posix:1993
	i18n:2002

The vast majority of locales had changed the "i18n" string to the
name of its own locale (e.g. "ak_GH:2013") as well as tweaking the
date (presumably thinking it should be the date of submission).

Convert all of them to "i18n:2002" for consistency.

Compressed + attached due to size.  Example change:

Comments

Chris Leonard April 13, 2016, 6:50 p.m. UTC | #1
+1 from me FWIW

cjl

On Wed, Apr 13, 2016 at 12:39 PM, Mike Frysinger <vapier@gentoo.org> wrote:
> The ISO 14652 standard defines the valid values for the category
> keyword as only two options:
>         posix:1993
>         i18n:2002
>
> The vast majority of locales had changed the "i18n" string to the
> name of its own locale (e.g. "ak_GH:2013") as well as tweaking the
> date (presumably thinking it should be the date of submission).
>
> Convert all of them to "i18n:2002" for consistency.
>
> Compressed + attached due to size.  Example change:
> --- a/localedata/locales/ak_GH
> +++ b/localedata/locales/ak_GH
> @@ -37,19 +37,19 @@ language     "Akan"
>  territory    "Ghana"
>  revision     "1.0"
>  date         "2013-08-24"
> -%
> -category  "ak_GH:2013";LC_IDENTIFICATION
> -category  "ak_GH:2013";LC_CTYPE
> -category  "ak_GH:2013";LC_COLLATE
> -category  "ak_GH:2013";LC_TIME
> -category  "ak_GH:2013";LC_NUMERIC
> -category  "ak_GH:2013";LC_MONETARY
> -category  "ak_GH:2013";LC_PAPER
> -category  "ak_GH:2013";LC_MEASUREMENT
> -category  "ak_GH:2013";LC_MESSAGES
> -category  "ak_GH:2013";LC_NAME
> -category  "ak_GH:2013";LC_ADDRESS
> -category  "ak_GH:2013";LC_TELEPHONE
> +
> +category "i18n:2002";LC_IDENTIFICATION
> +category "i18n:2002";LC_CTYPE
> +category "i18n:2002";LC_COLLATE
> +category "i18n:2002";LC_TIME
> +category "i18n:2002";LC_NUMERIC
> +category "i18n:2002";LC_MONETARY
> +category "i18n:2002";LC_PAPER
> +category "i18n:2002";LC_MEASUREMENT
> +category "i18n:2002";LC_MESSAGES
> +category "i18n:2002";LC_NAME
> +category "i18n:2002";LC_ADDRESS
> +category "i18n:2002";LC_TELEPHONE
>  END LC_IDENTIFICATION
>
>  LC_CTYPE
Carlos O'Donell April 13, 2016, 6:57 p.m. UTC | #2
On 04/13/2016 12:39 PM, Mike Frysinger wrote:
> The ISO 14652 standard defines the valid values for the category
> keyword as only two options:
> 	posix:1993
> 	i18n:2002
> 
> The vast majority of locales had changed the "i18n" string to the
> name of its own locale (e.g. "ak_GH:2013") as well as tweaking the
> date (presumably thinking it should be the date of submission).
> 
> Convert all of them to "i18n:2002" for consistency.

Any chance you can tighten the parser to reject anything but the
two valid category keywords?

I think this change is correct, but I'd rather see a patch that
enforces policy *and* changes the locale source to match.
Mike Frysinger April 13, 2016, 8:05 p.m. UTC | #3
On 13 Apr 2016 14:57, Carlos O'Donell wrote:
> On 04/13/2016 12:39 PM, Mike Frysinger wrote:
> > The ISO 14652 standard defines the valid values for the category
> > keyword as only two options:
> > 	posix:1993
> > 	i18n:2002
> > 
> > The vast majority of locales had changed the "i18n" string to the
> > name of its own locale (e.g. "ak_GH:2013") as well as tweaking the
> > date (presumably thinking it should be the date of submission).
> > 
> > Convert all of them to "i18n:2002" for consistency.
> 
> Any chance you can tighten the parser to reject anything but the
> two valid category keywords?

i figured someone would ask for that eventually :).  it's not clear to
me how many valid values there are because the ISO 14652 standard is
difficult to obtain.  i've only been able to find 1999 and 2002 copies,
but i'm pretty sure there's other revisions as well.  maybe we start
off only accepting these two values and worry about the rest later ?

the other aspect is that, while we might validate some sanity on the
category fields in general, the code (afaict) is not structured for
actually handling the differences.  for example, if the locale says
posix:1993 or i18n:1999 (which the older ISO 14652 1999 standard
allows), we don't change the parsing behavior to reject features
that are new to i18n:2002.

i guess one thing at a time: let's update localedef to only accept
these two values and reject all others.  i'll look at that before
merging this patch in case it's easy to do.
-mike
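
The check discussed above (accept only the two category values, reject everything else) can be sketched outside of localedef as a standalone lint over locale source text. This is only an illustration, not the localedef change itself: the function name find_invalid_categories and the regular expression are hypothetical, and the set of valid values is taken from the commit message rather than from the standard's text.

```python
import re

# The two values the thread proposes accepting (taken from the commit message).
VALID_CATEGORY_VALUES = {"posix:1993", "i18n:2002"}

# Matches lines such as: category "i18n:2002";LC_TIME
# (assumed line shape, based on the ak_GH example in this thread)
CATEGORY_LINE = re.compile(r'^category[ \t]+"([^"]*)";(LC_[A-Z]+)[ \t]*$')


def find_invalid_categories(locale_source):
    """Return (line_number, value, category) for each non-conforming line."""
    problems = []
    for lineno, line in enumerate(locale_source.splitlines(), start=1):
        match = CATEGORY_LINE.match(line)
        if match and match.group(1) not in VALID_CATEGORY_VALUES:
            problems.append((lineno, match.group(1), match.group(2)))
    return problems
```

Running this over each file under localedata/locales before and after the patch would show which locales still carry a non-standard value. It does not address the second point raised above, namely that the parser does not change behavior based on which value is declared.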

Patch

--- a/localedata/locales/ak_GH
+++ b/localedata/locales/ak_GH
@@ -37,19 +37,19 @@  language     "Akan"
 territory    "Ghana"
 revision     "1.0"
 date         "2013-08-24"
-%
-category  "ak_GH:2013";LC_IDENTIFICATION
-category  "ak_GH:2013";LC_CTYPE
-category  "ak_GH:2013";LC_COLLATE
-category  "ak_GH:2013";LC_TIME
-category  "ak_GH:2013";LC_NUMERIC
-category  "ak_GH:2013";LC_MONETARY
-category  "ak_GH:2013";LC_PAPER
-category  "ak_GH:2013";LC_MEASUREMENT
-category  "ak_GH:2013";LC_MESSAGES
-category  "ak_GH:2013";LC_NAME
-category  "ak_GH:2013";LC_ADDRESS
-category  "ak_GH:2013";LC_TELEPHONE
+
+category "i18n:2002";LC_IDENTIFICATION
+category "i18n:2002";LC_CTYPE
+category "i18n:2002";LC_COLLATE
+category "i18n:2002";LC_TIME
+category "i18n:2002";LC_NUMERIC
+category "i18n:2002";LC_MONETARY
+category "i18n:2002";LC_PAPER
+category "i18n:2002";LC_MEASUREMENT
+category "i18n:2002";LC_MESSAGES
+category "i18n:2002";LC_NAME
+category "i18n:2002";LC_ADDRESS
+category "i18n:2002";LC_TELEPHONE
 END LC_IDENTIFICATION
 
 LC_CTYPE
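
A bulk rewrite of the kind shown in the patch above could be sketched as follows. This is a guess at the approach, not the script actually used for the patch: normalize_categories is a hypothetical name, the line shape is inferred from the ak_GH example, and leaving an existing "posix:1993" value untouched is an assumption. The sketch also does not reproduce the patch's replacement of the "%" comment line with a blank line.

```python
import re

# Values left untouched (assumption: valid values are not rewritten).
VALID_CATEGORY_VALUES = {"posix:1993", "i18n:2002"}

# Matches whole lines such as: category  "ak_GH:2013";LC_TIME
CATEGORY_LINE = re.compile(
    r'^category[ \t]+"([^"]*)";(LC_[A-Z]+)[ \t]*$', re.MULTILINE
)


def normalize_categories(locale_source):
    """Rewrite non-standard category values to "i18n:2002"."""

    def fix(match):
        value, name = match.groups()
        if value in VALID_CATEGORY_VALUES:
            return match.group(0)  # already conforming, keep the line as-is
        return f'category "i18n:2002";{name}'

    return CATEGORY_LINE.sub(fix, locale_source)
```

Rewritten lines also get the single space after the keyword, matching the "+" lines in the patch.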