Message ID | ce089856-d470-6699-7c72-0b1c3644b85a@redhat.com |
---|---|
State | New |
Series | Add verbose comments to 'era' in ja_JP locale. |
On 3/28/19 2:09 PM, Carlos O'Donell wrote:
> Rafal,
>
> While reviewing DJ's new test I went through all the dates, years,
> and names, and figured I'd put them into a verbose comment in ja_JP
> to make this easier to maintain in the future.
>
> What do you think of this for master?
>
> 8< --- 8< ---- 8<
> ---
>  localedata/locales/ja_JP | 23 +++++++++++++++++++++++
>  1 file changed, 23 insertions(+)
>
> diff --git a/localedata/locales/ja_JP b/localedata/locales/ja_JP
> index 9bfbb2bb9b..74ef9e39f3 100644
> --- a/localedata/locales/ja_JP
> +++ b/localedata/locales/ja_JP
> @@ -14946,6 +14946,29 @@ am_pm "<U5348><U524D>";"<U5348><U5F8C>"
>
>  t_fmt_ampm "%p%I<U6642>%M<U5206>%S<U79D2>"
>
> +# The era names are laid out in groups of 2 to account for the desire
> +# to avoid using '1' for the first era year. Instead of 1 we use '元'
> +# <U5143> or "gan" as the first era year.
> +#
> +# The following dates and their names are recorded below in descending
> +# date order (note that '年' <U5E74> or "year" follows each date).
> +#
> +# Offset:  Start date:   End date:    Era name:          Using "gan":
> +# (Y)      (YYYY-MM-DD)
> +# 2        1990-01-01    +*           平成 (Heisei)      No
> +# 1        1989-01-08    1989-12-31   平成 (Heisei)      Yes
> +# 2        1927-01-01    1989-01-07   昭和 (Shōwa)       No
> +# 1        1926-12-25    1926-12-31   昭和 (Shōwa)       Yes
> +# 2        1913-01-01    1926-12-24   大正 (Taishō)      No
> +# 1        1912-07-30    1912-12-31   大正 (Taishō)      Yes
> +# 6        1873-01-01    1912-07-29   明治 (Meiji)       No
> +# 1        0001-01-01    1872-12-31   西暦 (C.E)         No
> +# 1        -0000-12-31   -*           紀元前 (B.C.E.)    No

This should read "-0001-12-31" here. Fixed locally.

> +#
> +# Note:
> +# - The last entry 紀元前 means pre-era/B.C./B.C.E.
> +# - The second-to-last entry 西暦 means C.E.
> +#
>  era "+:2:1990//01//01:+*:<U5E73><U6210>:%EC%Ey<U5E74>";/
>      "+:1:1989//01//08:1989//12//31:<U5E73><U6210>:%EC<U5143><U5E74>";/
>      "+:2:1927//01//01:1989//01//07:<U662D><U548C>:%EC%Ey<U5E74>";/
28.03.2019 19:09 Carlos O'Donell <codonell@redhat.com> wrote:
>
> Rafal,
>
> While reviewing DJ's new test I went through all the dates, years,
> and names, and figured I'd put them into a verbose comment in ja_JP
> to make this easier to maintain in the future.
>
> What do you think of this for master?

Sadly, I don't know enough about the Japanese calendar to verify
whether your comments are correct. Fortunately I can see Tamuki Shoichi
on the CC: list, so I hope to read some feedback from him.

A few remarks below, though:

> 8< --- 8< ---- 8<
> ---
>  localedata/locales/ja_JP | 23 +++++++++++++++++++++++
>  1 file changed, 23 insertions(+)
>
> diff --git a/localedata/locales/ja_JP b/localedata/locales/ja_JP
> index 9bfbb2bb9b..74ef9e39f3 100644
> --- a/localedata/locales/ja_JP
> +++ b/localedata/locales/ja_JP
> @@ -14946,6 +14946,29 @@ am_pm "<U5348><U524D>";"<U5348><U5F8C>"
> [...]
> +# Offset:  Start date:   End date:    Era name:          Using "gan":
> +# (Y)      (YYYY-MM-DD)
> +# 2        1990-01-01    +*           平成 (Heisei)      No

What does the symbol "+*" mean? Am I the only one confused? Does it
maybe mean "infinity"? Can we use something different, like "+inf"?

> +# 1        1989-01-08    1989-12-31   平成 (Heisei)      Yes
> +# 2        1927-01-01    1989-01-07   昭和 (Shōwa)       No
> +# 1        1926-12-25    1926-12-31   昭和 (Shōwa)       Yes
> +# 2        1913-01-01    1926-12-24   大正 (Taishō)      No
> +# 1        1912-07-30    1912-12-31   大正 (Taishō)      Yes
> +# 6        1873-01-01    1912-07-29   明治 (Meiji)       No
> +# 1        0001-01-01    1872-12-31   西暦 (C.E)         No
> +# 1        -0000-12-31   -*           紀元前 (B.C.E.)    No

I was going to complain that the column with "Yes" and "No" is badly
misaligned. It looks bad in my email client, but it became aligned when
I clicked "reply". Please just make sure all columns are aligned.
Unfortunately, this time "-*" (again, is this anything like "-inf"?)
looks shifted too far to the right and pushes the following columns.

Hm... if it means "-inf", then shouldn't the columns be swapped, I mean
"Start date: -inf, End date: -0001-12-31"? (Yes, I read your other
email as well.)

> +#
> +# Note:
> +# - The last entry 紀元前 means pre-era/B.C./B.C.E.
> +# - The second-to-last entry 西暦 means C.E.

Aren't the terms "B.C.", "B.C.E.", and "C.E." reserved for the
Christian calendar?

I'm sorry about my ignorance. I think I need further explanation before
I give any opinion about this patch. Of course, I'd appreciate it if
other people gave more valuable feedback.

Regards,

Rafal
Hello Carlos-san,

From: Carlos O'Donell <codonell@redhat.com>
Subject: [PATCH] Add verbose comments to 'era' in ja_JP locale.
Date: Thu, 28 Mar 2019 14:09:36 -0400

> While reviewing DJ's new test I went through all the dates, years,
> and names, and figured I'd put them into a verbose comment in ja_JP
> to make this easier to maintain in the future.
>
> What do you think of this for master?

Thank you for the new text.

Sorry, I am not happy to put the information in ja_JP locale data.
Since it is necessary to describe similar information in other locale
data such as *_TW, and also it becomes rather troublesome to maintain,
it would be better to include the information in a documentation named
"The locale definition source file format", that is expected to be
created in Glibc.

This documentation looks something like this:

http://pubs.opengroup.org/onlinepubs/7908799/xbd/locale.html

Regards,
TAMUKI Shoichi
Hello Carlos-san,

From: TAMUKI Shoichi <tamuki@linet.gr.jp>
Subject: Re: [PATCH] Add verbose comments to 'era' in ja_JP locale.
Date: Fri, 29 Mar 2019 15:53:08 +0900

> Sorry, I am not happy to put the information in ja_JP locale data.
> Since it is necessary to describe similar information in other locale
> data such as *_TW, and also it becomes rather troublesome to maintain,
> it would be better to include the information in a documentation named
> "The locale definition source file format", that is expected to be
> created in Glibc.

If the text is added as shown below, it does not affect the maintenance
of the era data, so there may be no problem. It can be used in *_TW as
well.

diff --git a/localedata/locales/ja_JP b/localedata/locales/ja_JP
index 9bfbb2bb9b..983b866650 100644
--- a/localedata/locales/ja_JP
+++ b/localedata/locales/ja_JP
@@ -14946,6 +14946,12 @@ am_pm "<U5348><U524D>";"<U5348><U5F8C>"
 
 t_fmt_ampm "%p%I<U6642>%M<U5206>%S<U79D2>"
 
+% The era names are laid out in groups of 2 to account for the desire
+% to avoid using '1' for the first era year. Instead of '1' we use
+% <U5143> or "origin" as the first era year.
+%
+% Note that <U5E74> or "year" follows each year number.
+%
 era "+:2:1990//01//01:+*:<U5E73><U6210>:%EC%Ey<U5E74>";/
     "+:1:1989//01//08:1989//12//31:<U5E73><U6210>:%EC<U5143><U5E74>";/
     "+:2:1927//01//01:1989//01//07:<U662D><U548C>:%EC%Ey<U5E74>";/

The rest of the information is independent of the specific locale data,
so it is appropriate to include it in a separate document.

Please be aware that we will be adding an entry for the new Japanese
era to the ja_JP locale data over the next several days.

Regards,
TAMUKI Shoichi
29.03.2019 07:53 TAMUKI Shoichi <tamuki@linet.gr.jp> wrote:
> [...]
> Sorry, I am not happy to put the information in ja_JP locale data.

That was my first thought as well. But after a while I found a reason
in Carlos' patch. Yes, we should not explain in the locale data file
how the era format works, but we should explain how the rules have been
applied to implement this particular locale data file. It's like
a comment in a source code file: it should not explain how the language
works but how this particular solution has been implemented and what
this piece of code means.

In short: explaining the format of the era field - no; explaining how
and why it has been provided for ja_JP - yes.

> [...]
> it would be better to include the information in a documentation named
> "The locale definition source file format", that is expected to be
> created in Glibc.

Sadly, that document does not exist now. As far as I remember, the
previous documentation was so outdated that it was better to remove it.
The current guidelines say that in order to create a new locale file
you should take any existing (and working) one and change it to your
needs.

Which of course is not an excuse to put the general description of the
era field in a locale data file.

> This documentation looks something like this:
>
> http://pubs.opengroup.org/onlinepubs/7908799/xbd/locale.html

This is definitely good, but Glibc has many extensions, which means
that a potential Glibc version should include this information plus
many additional pieces.

Regards,

Rafal
On 3/29/19 7:04 AM, Rafal Luzynski wrote:
> 29.03.2019 07:53 TAMUKI Shoichi <tamuki@linet.gr.jp> wrote:
>> [...]
>> Sorry, I am not happy to put the information in ja_JP locale data.
>
> That was my first thought as well. But after a while I found a reason
> in Carlos' patch. Yes, we should not explain in the locale data file
> how the era format works, but we should explain how the rules have been
> applied to implement this particular locale data file. It's like
> a comment in a source code file: it should not explain how the language
> works but how this particular solution has been implemented and what
> this piece of code means.
>
> In short: explaining the format of the era field - no; explaining how
> and why it has been provided for ja_JP - yes.
>
>> [...]
>> it would be better to include the information in a documentation named
>> "The locale definition source file format", that is expected to be
>> created in Glibc.
>
> Sadly, that document does not exist now. As far as I remember, the
> previous documentation was so outdated that it was better to remove it.
> The current guidelines say that in order to create a new locale file
> you should take any existing (and working) one and change it to your
> needs.
>
> Which of course is not an excuse to put the general description of the
> era field in a locale data file.
>
>> This documentation looks something like this:
>>
>> http://pubs.opengroup.org/onlinepubs/7908799/xbd/locale.html
>
> This is definitely good, but Glibc has many extensions, which means
> that a potential Glibc version should include this information plus
> many additional pieces.

I agree with Rafal on all points.

Perfect is the enemy of the good.

We certainly need a document describing how to write, edit, and
compile locales, and what formats are available.

Such a document is a huge undertaking. The intent of my patch, as
Rafal points out, is to add source-code comments to the ja_JP
locale to make it easier for me to review.

It seems like Rafal does not object to the patch.

I'll see if I can get consensus from TAMUKI-san in the other email.
29.03.2019 15:57 Carlos O'Donell <codonell@redhat.com> wrote:
> [...]
> It seems like Rafal does not object to the patch.

True, I don't object, which means I can't see any error, but I'd like
to hear the final word from TAMUKI-san due to my poor knowledge of
the Japanese calendar.

> I'll see if I can get consensus from TAMUKI-san in the other email.

That's what I mean.

Regards,

Rafal
Hello Rafal-san,

From: Rafal Luzynski <digitalfreak@lingonborough.com>
Subject: Re: [PATCH] Add verbose comments to 'era' in ja_JP locale.
Date: Fri, 29 Mar 2019 12:04:18 +0100 (CET)

> > Sorry, I am not happy to put the information in ja_JP locale data.
>
> That was my first thought as well. But after a while I found a reason
> in Carlos' patch. Yes, we should not explain in the locale data file
> how the era format works, but we should explain how the rules have been
> applied to implement this particular locale data file. It's like
> a comment in a source code file: it should not explain how the language
> works but how this particular solution has been implemented and what
> this piece of code means.
>
> In short: explaining the format of the era field - no; explaining how
> and why it has been provided for ja_JP - yes.

OK. I got it.

> > it would be better to include the information in a documentation named
> > "The locale definition source file format", that is expected to be
> > created in Glibc.
>
> Sadly, that document does not exist now. As far as I remember, the
> previous documentation was so outdated that it was better to remove it.
> The current guidelines say that in order to create a new locale file
> you should take any existing (and working) one and change it to your
> needs.
>
> Which of course is not an excuse to put the general description of the
> era field in a locale data file.

Was the document deleted? Oh my goodness.

As you know, it was pointed out in Bugzilla that there is a bug in the
direction of BC. If there were a proper manual in Glibc, there would be
no problem.

https://sourceware.org/bugzilla/show_bug.cgi?id=24162#c6

> > This documentation looks something like this:
> >
> > http://pubs.opengroup.org/onlinepubs/7908799/xbd/locale.html
>
> This is definitely good, but Glibc has many extensions, which means
> that a potential Glibc version should include this information plus
> many additional pieces.

Certainly, it will take considerable effort and time to rebuild it.

By the way, Glibc does not support era abbreviations, so we cannot
use expressions that are common in Japan, like "H31.03.30" (today).
I want to introduce "abera" to Glibc in the future.

# H -> Heisei
# S -> Showa
# T -> Taisho
# M -> Meiji
# ROC -> Minguo (in Taiwan)

Regards,
TAMUKI Shoichi
Hello Carlos-san,

From: Carlos O'Donell <codonell@redhat.com>
Subject: Re: [PATCH] Add verbose comments to 'era' in ja_JP locale.
Date: Fri, 29 Mar 2019 10:57:14 -0400

> I agree with Rafal on all points.
>
> Perfect is the enemy of the good.
>
> We certainly need a document describing how to write, edit, and
> compile locales, and what formats are available.
>
> Such a document is a huge undertaking. The intent of my patch, as
> Rafal points out, is to add source-code comments to the ja_JP
> locale to make it easier for me to review.

Agreed. We certainly need such a document for Glibc; however, it will
take considerable effort and time to rebuild it. Since the era format
is particularly complex, it is a good idea to put descriptions in the
ja_JP locale. However, I have some suggestions.

In the current patch, I would like the comment to be briefer, as its
lines are long. In particular, since the table overlaps with the
content of the era description segments themselves, it is not necessary
to put it in the comment. Instead, how about adding an explanation of
the format of an era description segment?

Next, I would like to avoid putting kanji characters in the locale
data. The locale data of Glibc can be customized by users using
localedef. As the ja_JP locale data does not depend on an encoding,
users in either a ja_JP.eucJP or a ja_JP.SJIS environment may see
garbled text and be unable to edit it correctly. It is better to write
<GAN>, <NEN>, etc., following the other existing comment lines of the
ja_JP locale data.

Also, it is better to use "%" instead of "#" at the beginning of a
comment line in locale data.

How about the following explanation?

% The era names are laid out in groups of 2 to account for the desire
% to avoid using '1' for the first era year. Instead of '1' we use
% <U5143> or <GAN> as the first era year.
%
% The following dates and their names are recorded below in descending
% date order (note that <U5E74> or <NEN> follows each date).
% <HEISEI> -> <SHOWA> -> <TAISHO> -> <MEIJI> -> <AD> -> <BC>
%
% Each string is an era description segment with the format:
% "direction:offset:start_date:end_date:era_name:era_format"
%
% Note:
% - The '+*' entry in end_date means "forever going forward"
% - The '-*' entry in end_date means "forever going backwards" count up.
% - Negative year number in start_date is prior to AD 1 (BC) counting up.
% - The last entry <U7D00><U5143><U524D> in era_name means BC.
% - The second-to-last entry <U897F><U66A6> in era_name means AD.
%

Regards,
TAMUKI Shoichi
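As an illustration of the era description segment format quoted above ("direction:offset:start_date:end_date:era_name:era_format"), here is a minimal Python sketch that splits one segment into its six fields. It is not part of Glibc or localedef; the names `EraSegment`, `decode_ucs`, and `parse_era_segment` are hypothetical, and the handling of "//" assumes that "/" is the escape character declared by the locale file, so that "//" stands for a literal "/".

```python
import re
from typing import NamedTuple


class EraSegment(NamedTuple):
    """One era description segment (hypothetical helper type)."""
    direction: str   # "+" or "-"
    offset: int      # era year number assigned to the start date's year
    start_date: str  # e.g. "1990/01/01"
    end_date: str    # a date, or "+*" / "-*" for an open-ended range
    era_name: str
    era_format: str


def decode_ucs(text: str) -> str:
    """Replace <Uxxxx> symbolic names with the characters they denote."""
    return re.sub(r"<U([0-9A-Fa-f]{4,8})>",
                  lambda m: chr(int(m.group(1), 16)), text)


def unescape(text: str) -> str:
    # Assumption: "/" is the locale file's escape character,
    # so a doubled "//" is read as one literal "/".
    return text.replace("//", "/")


def parse_era_segment(segment: str) -> EraSegment:
    # Split into at most six fields so that a ':' inside era_format survives.
    direction, offset, start, end, name, fmt = segment.split(":", 5)
    return EraSegment(direction, int(offset), unescape(start), unescape(end),
                      decode_ucs(name), decode_ucs(fmt))


if __name__ == "__main__":
    print(parse_era_segment(
        "+:2:1990//01//01:+*:<U5E73><U6210>:%EC%Ey<U5E74>"))
```

Decoding the `<Uxxxx>` names is only done here to make the result readable; the suggestion in the thread is precisely to keep the locale source itself free of raw kanji.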
diff --git a/localedata/locales/ja_JP b/localedata/locales/ja_JP
index 9bfbb2bb9b..74ef9e39f3 100644
--- a/localedata/locales/ja_JP
+++ b/localedata/locales/ja_JP
@@ -14946,6 +14946,29 @@ am_pm "<U5348><U524D>";"<U5348><U5F8C>"
 
 t_fmt_ampm "%p%I<U6642>%M<U5206>%S<U79D2>"
 
+# The era names are laid out in groups of 2 to account for the desire
+# to avoid using '1' for the first era year. Instead of 1 we use '元'
+# <U5143> or "gan" as the first era year.
+#
+# The following dates and their names are recorded below in descending
+# date order (note that '年' <U5E74> or "year" follows each date).
+#
+# Offset:  Start date:   End date:    Era name:          Using "gan":
+# (Y)      (YYYY-MM-DD)
+# 2        1990-01-01    +*           平成 (Heisei)      No
+# 1        1989-01-08    1989-12-31   平成 (Heisei)      Yes
+# 2        1927-01-01    1989-01-07   昭和 (Shōwa)       No
+# 1        1926-12-25    1926-12-31   昭和 (Shōwa)       Yes
+# 2        1913-01-01    1926-12-24   大正 (Taishō)      No
+# 1        1912-07-30    1912-12-31   大正 (Taishō)      Yes
+# 6        1873-01-01    1912-07-29   明治 (Meiji)       No
+# 1        0001-01-01    1872-12-31   西暦 (C.E)         No
+# 1        -0000-12-31   -*           紀元前 (B.C.E.)    No
+#
+# Note:
+# - The last entry 紀元前 means pre-era/B.C./B.C.E.
+# - The second-to-last entry 西暦 means C.E.
+#
 era "+:2:1990//01//01:+*:<U5E73><U6210>:%EC%Ey<U5E74>";/
     "+:1:1989//01//08:1989//12//31:<U5E73><U6210>:%EC<U5143><U5E74>";/
     "+:2:1927//01//01:1989//01//07:<U662D><U548C>:%EC%Ey<U5E74>";/
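To show how the Offset column in the comment table pairs with the start dates, the following Python sketch looks up the era and era year for a Gregorian date. It is a reading aid under stated assumptions, not how Glibc computes era years internally: the table is copied from the patch (Heisei through Meiji only, the C.E. and B.C.E. rows are left out), the era names are romanized, and `ERA_TABLE`, `era_year`, and `format_era_year` are hypothetical names. The era year is taken as the offset plus the number of calendar years elapsed since the segment's start year, which is why each era needs a separate one-row segment for its first ("gan") year.

```python
from datetime import date

# (offset, start, end, romanized era name); end=None stands for "+*".
# Rows are in descending date order, as in the locale file.
ERA_TABLE = [
    (2, date(1990, 1, 1), None, "Heisei"),
    (1, date(1989, 1, 8), date(1989, 12, 31), "Heisei"),
    (2, date(1927, 1, 1), date(1989, 1, 7), "Showa"),
    (1, date(1926, 12, 25), date(1926, 12, 31), "Showa"),
    (2, date(1913, 1, 1), date(1926, 12, 24), "Taisho"),
    (1, date(1912, 7, 30), date(1912, 12, 31), "Taisho"),
    (6, date(1873, 1, 1), date(1912, 7, 29), "Meiji"),
]


def era_year(d):
    """Return (era name, era year) for d, or None if no row covers it."""
    for offset, start, end, name in ERA_TABLE:
        if d >= start and (end is None or d <= end):
            # The offset is the era year of the start date's calendar year.
            return name, offset + (d.year - start.year)
    return None


def format_era_year(d):
    """Render the era year, writing the first year as "gan" instead of 1."""
    found = era_year(d)
    if found is None:
        return None
    name, year = found
    return f"{name} {'gan' if year == 1 else year}"


if __name__ == "__main__":
    print(format_era_year(date(2019, 3, 30)))
    print(format_era_year(date(1989, 1, 8)))
```

With this table, 2019-03-30 falls in the open-ended Heisei row and comes out as year 31, which matches the "H31.03.30" example mentioned in the thread.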