Remove locale timezone information

Message ID 557AE725.5050104@redhat.com
State New

Commit Message

Marko Myllynen June 12, 2015, 2:05 p.m. UTC
Hi,

as discussed in the thread starting at

https://sourceware.org/ml/libc-alpha/2015-06/msg00098.html

it looks like the best option is to remove locale timezone information
from locales which currently provide it (in incomplete or incorrect
fashion) rather than to start duplicating tzdata info in glibc.

2015-06-12  Marko Myllynen  <myllynen@redhat.com>

	[BZ #18525]
	* locales/km_KH: Remove timezone definition.
	* locales/lo_LA: Likewise.
	* locales/my_MM: Likewise.
	* locales/nan_TW@latin: Likewise.
	* locales/th_TH: Likewise.
	* locales/uk_UA: Likewise.


Thanks,

Comments

Mike Frysinger Aug. 5, 2015, 9:07 a.m. UTC | #1
On 12 Jun 2015 17:05, Marko Myllynen wrote:
> as discussed in the thread starting at
> 
> https://sourceware.org/ml/libc-alpha/2015-06/msg00098.html
> 
> it looks like the best options is to remove locale timezone information
> from locales which currently provide it (in incomplete or incorrect
> fashion) rather than to start duplicating tzdata info in glibc.

thanks, pushed now!
-mike
Keld Simonsen Aug. 5, 2015, 10:01 a.m. UTC | #2
On Wed, Aug 05, 2015 at 05:07:48AM -0400, Mike Frysinger wrote:
> On 12 Jun 2015 17:05, Marko Myllynen wrote:
> > as discussed in the thread starting at
> > 
> > https://sourceware.org/ml/libc-alpha/2015-06/msg00098.html
> > 
> > it looks like the best options is to remove locale timezone information
> > from locales which currently provide it (in incomplete or incorrect
> > fashion) rather than to start duplicating tzdata info in glibc.
> 
> thanks, pushed now!
> -mike

That is the wrong direction. Please revert the change.

Best regards
Keld
Keld Simonsen Aug. 5, 2015, 10:22 a.m. UTC | #3
On Wed, Aug 05, 2015 at 12:01:26PM +0200, keld@keldix.com wrote:
> On Wed, Aug 05, 2015 at 05:07:48AM -0400, Mike Frysinger wrote:
> > On 12 Jun 2015 17:05, Marko Myllynen wrote:
> > > as discussed in the thread starting at
> > > 
> > > https://sourceware.org/ml/libc-alpha/2015-06/msg00098.html
> > > 
> > > it looks like the best options is to remove locale timezone information
> > > from locales which currently provide it (in incomplete or incorrect
> > > fashion) rather than to start duplicating tzdata info in glibc.
> > 
> > thanks, pushed now!
> > -mike
> 
> That is the wrong direction. Please revert the change.

Let me explain. We would like to make the installation process easier
for users. That is, if we can remove one more question from the installation
process of Linux, that would be a goal. If the timezone is fully determined
by the choice of locale, then there is no need to ask for the timezone.

Currently the timezone is often determined by a click on a map, which is quite
error prone, because the world is big and it is hard to hit the right place.
That process is also time consuming. Some users omit the step,
just using the default, which is often wrong.

Furthermore, the timezone information in these apps is not politically correct:
they give the time zone as a city name, which is politically quite problematic.
There is a growing political uproar against the big cities and the associated
political powers. We just had an election here in Denmark where this issue
(not Linux locale names, but country vs. the capital :-) was probably the most
defining issue. Also think of the USA with Washington DC against rural America.

For countries with more than one timezone, the locale data helps narrow down the
choices. And there are not that many countries with more than 1 timezone,
e.g. USA, Canada, Russia and Greenland. Many big countries like China and India
have only 1 timezone, and the countries in Europe, Africa, South America
and Asia almost all have only 1 timezone.

So the locale timezone info helps doing an easier job - and also a more culturally
acceptable job. I hope we are all for going in that direction.

Best regards
keld
Andreas Schwab Aug. 5, 2015, 10:37 a.m. UTC | #4
keld@keldix.com writes:

> Let me explain. We would like to make the installation process easier
> for users. That is, if we can remove one more question under the installation
> process of linux, that would be a  goal. If the timezone is fully determined
> by the choice of locale, then there is no need to ask for the timezone.

To make this useful the locale would need to use the Olson name, not the
POSIX name of the timezone.

Andreas.
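
To illustrate the distinction between the two kinds of name, here is a minimal Python sketch. A POSIX-style TZ value spells out the offsets and one DST rule in the string itself, while an Olson name is only a key into the tz database. The helper `local_time_under`, the sample instant and the rule string `CET-1CEST,M3.5.0,M10.5.0/3` are chosen for illustration and do not come from the patch; `time.tzset()` exists on Unix-like systems only.

```python
import calendar
import os
import time

def local_time_under(tz_value, utc_tuple):
    """Return (hour, is_dst) of a UTC instant as seen under the given TZ value."""
    os.environ["TZ"] = tz_value
    time.tzset()  # make the C library re-read TZ (Unix only)
    local = time.localtime(calendar.timegm(utc_tuple))
    return local.tm_hour, local.tm_isdst

# 2015-08-05 10:00:00 UTC, an arbitrary sample instant.
instant = (2015, 8, 5, 10, 0, 0)

# POSIX form: self-describing, so it works without any tzdata files.
posix_result = local_time_under("CET-1CEST,M3.5.0,M10.5.0/3", instant)

# Olson form: a database key. It is not evaluated here because the result
# depends on tzdata being installed on the machine running the sketch.
olson_name = "Europe/Copenhagen"
```

Note that the sketch changes `TZ` for the whole process, so a real program would save and restore the previous value.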
Mike Frysinger Aug. 5, 2015, 10:53 a.m. UTC | #5
On 05 Aug 2015 12:22, keld@keldix.com wrote:
> On Wed, Aug 05, 2015 at 12:01:26PM +0200, keld@keldix.com wrote:
> > On Wed, Aug 05, 2015 at 05:07:48AM -0400, Mike Frysinger wrote:
> > > On 12 Jun 2015 17:05, Marko Myllynen wrote:
> > > > as discussed in the thread starting at
> > > > 
> > > > https://sourceware.org/ml/libc-alpha/2015-06/msg00098.html
> > > > 
> > > > it looks like the best options is to remove locale timezone information
> > > > from locales which currently provide it (in incomplete or incorrect
> > > > fashion) rather than to start duplicating tzdata info in glibc.
> > > 
> > > thanks, pushed now!
> > 
> > That is the wrong direction. Please revert the change.
> 
> Let me explain. We would like to make the installation process easier
> for users. That is, if we can remove one more question under the installation
> process of linux, that would be a  goal. If the timezone is fully determined
> by the choice of locale, then there is no need to ask for the timezone.

having 6 out of 300+ locales define timezone info is not something people can 
rely on.  we also don't want to duplicate data that is *actively* maintained
elsewhere (the tz lists).  plus many locales span multiple timezones.

i don't see crappy UIs as a problem glibc either can or should solve.  at the
end of the day, trying to set the timezone based on the locale isn't going to
cover everyone (even a significant number of people), especially as you move
around.  i certainly don't want it being tied to my locale.  which means UIs
are required to get this right.

if you want to bring timezone data into the locale fields, then it should be
done consistently across the board and in a maintainable manner.  until that
happens, having data in a measly 6 locales is more of a disservice imo.
-mike
Keld Simonsen Aug. 5, 2015, 10:58 a.m. UTC | #6
On Wed, Aug 05, 2015 at 12:37:08PM +0200, Andreas Schwab wrote:
> keld@keldix.com writes:
> 
> > Let me explain. We would like to make the installation process easier
> > for users. That is, if we can remove one more question under the installation
> > process of linux, that would be a  goal. If the timezone is fully determined
> > by the choice of locale, then there is no need to ask for the timezone.
> 
> To make this useful the locale would need to use the Olson name, not the
> POSIX name of the timezone.

Olson names are problematic, as they are not culturally acceptable in many
cases, as I explained. This is a big issue in a number of places, including
my country, Denmark, and probably also the USA.

In my country this issue is so big that many people refuse to talk about
other issues. I am myself from the Copenhagen area, so I am frustrated about this
being an obstacle to discussing "real" problems.

Furthermore, Olson TZ data does not allow for DST changes over the years, while the
locale timezone data does allow this, providing correct time display
also for older data, including in the USA, where the end of DST was moved from
end of September to end of October at some point. This also
happened in many places in Europe.

Best regards
keld
Keld Simonsen Aug. 5, 2015, 11:30 a.m. UTC | #7
On Wed, Aug 05, 2015 at 06:53:11AM -0400, Mike Frysinger wrote:
> On 05 Aug 2015 12:22, keld@keldix.com wrote:
> > On Wed, Aug 05, 2015 at 12:01:26PM +0200, keld@keldix.com wrote:
> > > On Wed, Aug 05, 2015 at 05:07:48AM -0400, Mike Frysinger wrote:
> > > > On 12 Jun 2015 17:05, Marko Myllynen wrote:
> > > > > as discussed in the thread starting at
> > > > > 
> > > > > https://sourceware.org/ml/libc-alpha/2015-06/msg00098.html
> > > > > 
> > > > > it looks like the best options is to remove locale timezone information
> > > > > from locales which currently provide it (in incomplete or incorrect
> > > > > fashion) rather than to start duplicating tzdata info in glibc.
> > > > 
> > > > thanks, pushed now!
> > > 
> > > That is the wrong direction. Please revert the change.
> > 
> > Let me explain. We would like to make the installation process easier
> > for users. That is, if we can remove one more question under the installation
> > process of linux, that would be a  goal. If the timezone is fully determined
> > by the choice of locale, then there is no need to ask for the timezone.
> 
> having 6 out of 300+ locales define timezone info is not something people can 
> rely on.

Well, it takes time to build the info. Furthermore, if data is not available
they can always fall back to the TZ info. This is all spelled out in the standard.

> we also don't want to duplicate data that is *actively* maintained
> elsewhere (the tz lists). 

The relation between tz and country is not maintained anywhere else, except in some
distributions, AFAIK.

Anyway, much of the data in locales is just duplication
of data maintained elsewhere. That is the main purpose of a locale: to collect
data that is often maintained elsewhere in one big bundle, so that a user
can easily select that bundle for his/her environment.

> plus many locales span multiple timezones.

How many? I think it is not a majority (by far). Anyway, as I explained,
even if there are more timezone choices for a locale, then it is an improvement
to have the choices instead of a cumbersome clicking on a map, which selects
Washington DC or New York as a default (sic!)

> i don't see crappy UIs as a problem glibc either can or should solve.  at the
> end of the day,

Ultimately glibc implements functionality that benefits the end user.
One of the foremost uses of this data is setting timezone info.

> trying to set the timezone based on the locale isn't going to
> cover everyone (even a significant number of people),

Most of the world would be covered.  A quick calculation: countries with more than
1 timezone: USA, Canada, Mexico, Greenland, Brazil, Russia, Australia.
In total about 800 million people out of a world population of 7000 million.
And for many of these countries, the choice is only between 2 or 3 zones,
e.g. Brazil, Mexico, Greenland, Australia.
The only really problematic countries are USA, Canada and Russia.

> especially as you move
> around.  i certainly don't want it being tied to my locale.  which means UIs
> are required to get this right.

Most people don't move. But if you travel, you can easily change timezone.
I regularly do that.

> if you want to bring timezone data into the locale fields, then it should be
> done consistently across the board and in a maintainable manner.  until that
> happens, having data in a measly 6 locales is more of a disservice imo.

It is a beginning, and instead of just removing it, I think we should rather
encourage this, and even set out to have an overall collection of the data.
Furthermore, the data does no harm if it is correct.  It can be used as
explained in the standard, even if the data is only available for a few
countries, and there is a clear migration path specified.

Best regards
Keld
Andreas Schwab Aug. 5, 2015, 12:27 p.m. UTC | #8
keld@keldix.com writes:

> Furthermore, Olsen TZ data does not allow for DST changes over years, while the 
> locale timezone data do allow this, providing correct time display for
> also older data, including in the USA, where the DST change was changed from
> end of September to end of October at some point. This also
> happened many places in Europe. 

??? It's exactly the other way round.  The POSIX timezone has no
history.

Andreas.
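
The point about history can be made concrete with a small Python sketch. A POSIX TZ string such as `EST5EDT,M3.2.0,M11.1.0` carries exactly one pair of transition rules, so it produces the same rule for every year. The helper `posix_m_rule` below is hypothetical and only evaluates the `Mm.w.d` date part of such a rule. For 2006 the fixed rule gives March 12, although US daylight saving time actually began on April 2 that year; a single rule string cannot express that, whereas the tz database records both rule sets.

```python
from datetime import date, timedelta

def posix_m_rule(year, month, week, weekday):
    """Date selected by a POSIX 'Mm.w.d' rule.

    weekday: 0 = Sunday ... 6 = Saturday; week: 1..5, where 5 means "last".
    """
    first = date(year, month, 1)
    py_weekday = (weekday - 1) % 7            # Python counts Monday=0 ... Sunday=6
    offset = (py_weekday - first.weekday()) % 7
    day = first + timedelta(days=offset + 7 * (week - 1))
    while day.month != month:                 # week 5 overshot: step back to the last one
        day -= timedelta(days=7)
    return day

# "M3.2.0" is the start rule in EST5EDT,M3.2.0,M11.1.0 (second Sunday of March).
start_2015 = posix_m_rule(2015, 3, 2, 0)
start_2006 = posix_m_rule(2006, 3, 2, 0)      # the same rule, applied blindly to 2006

# "M10.5.0" means the last Sunday of October.
last_sunday_oct_2015 = posix_m_rule(2015, 10, 5, 0)
```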
Andreas Schwab Aug. 5, 2015, 12:33 p.m. UTC | #9
keld@keldix.com writes:

> Well, it takes time to build the info. Furthermore, if data is not available
> they can always fall back to the TZ info. This is all spelled out in the standard.

The TZ info isn't a fallback, the POSIX timezone is.

Andreas.
Keld Simonsen Aug. 5, 2015, 1 p.m. UTC | #10
On Wed, Aug 05, 2015 at 02:27:04PM +0200, Andreas Schwab wrote:
> keld@keldix.com writes:
> 
> > Furthermore, Olsen TZ data does not allow for DST changes over years, while the 
> > locale timezone data do allow this, providing correct time display for
> > also older data, including in the USA, where the DST change was changed from
> > end of September to end of October at some point. This also
> > happened many places in Europe. 
> 
> ??? It's exactly the other way round.  The POSIX timezone has no
> history.

True, POSIX has no history, but I was talking about ISO TR 14652 and 30112, which have history.
And I see that Olson tz data has history, but not in a format compatible with POSIX TZ.

best regards
keld
Keld Simonsen Aug. 5, 2015, 1:08 p.m. UTC | #11
On Wed, Aug 05, 2015 at 02:33:02PM +0200, Andreas Schwab wrote:
> keld@keldix.com writes:
> 
> > Well, it takes time to build the info. Furthermore, if data is not available
> > they can always fall back to the TZ info. This is all spelled out in the standard.
> 
> The TZ info isn't a fallback, the POSIX timezone is.

I was talking about ISO TR 30112 LC_TIME timezone spec. The TZ environment variable
overrides the LC_TIME timezone spec.

Best regards
keld
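
The precedence described here can be sketched in a few lines of Python. This is only an illustration of the lookup order as stated in this message (TZ first, then the locale's timezone entry, then a system default); the function name `effective_timezone` and its parameters are hypothetical, and nothing in this thread says glibc implements such a lookup.

```python
def effective_timezone(environ, locale_timezone, system_default="UTC"):
    """Pick a timezone spec using the precedence described in this message.

    environ:          mapping of environment variables (e.g. os.environ)
    locale_timezone:  timezone spec from the locale's LC_TIME data, or None
    system_default:   assumed last-resort value (hypothetical)
    """
    if "TZ" in environ:                # an explicit TZ always wins
        return environ["TZ"]
    if locale_timezone is not None:    # otherwise use the locale's entry, if any
        return locale_timezone
    return system_default              # no information at all

# A travelling user keeps the locale but overrides the zone through TZ.
travelling = effective_timezone({"TZ": "Europe/Prague"}, "CET-1CEST")
at_home = effective_timezone({}, "CET-1CEST")
no_data = effective_timezone({}, None)
```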
Mike Frysinger Aug. 5, 2015, 1:33 p.m. UTC | #12
On 05 Aug 2015 13:30, keld@keldix.com wrote:
> On Wed, Aug 05, 2015 at 06:53:11AM -0400, Mike Frysinger wrote:
> > having 6 out of 300+ locales define timezone info is not something people can 
> > rely on.
> 
> Well, it takes time to build the info.

how much time exactly do you think is reasonable ?  it's been more than 10 years 
and clearly no one thus far cares enough to drive this.  if you do, then feel 
free, but until that happens i see no reason to restore the incomplete data.

> How many? I think it is not a majority (by far).

seeing as how people literally travel around the world now, and mobility is only 
increasing, any combo is fair game now.

> Anyway, as I explained,
> even if there are more timezone choices for a locale, then it is an improvement
> to have the choices instead of a cumbersome clicking on a map, which selects
> Washington DC or New York as a default (sic!)

which is why there's been a rise to use packages like geoip to automatically 
detect an appropriate region based on the network connectivity, gps, cellular
stations, or other sources.

i've just snipped the rest because none of the responses i found convincing
and rehashing things is going nowhere.  sorry.
-mike
Keld Simonsen Aug. 5, 2015, 3:56 p.m. UTC | #13
On Wed, Aug 05, 2015 at 09:33:05AM -0400, Mike Frysinger wrote:
> On 05 Aug 2015 13:30, keld@keldix.com wrote:
> > On Wed, Aug 05, 2015 at 06:53:11AM -0400, Mike Frysinger wrote:
> > > having 6 out of 300+ locales define timezone info is not something people can 
> > > rely on.
> > 
> > Well, it takes time to build the info.
> 
> how much time exactly do you think is reasonable ?  it's been more than 10 years 
> and clearly no one thus far cares enough to drive this.  if you do, then feel 
> free, but until that happens i see no reason to restore the incomplete data.

Yes, it has taken quite a long time. Maybe because the locales that people build on
do not have timezone info, and maybe because 14652 timezone syntax was not supported by
glibc, including DST history changes.

I don't know. I think I could add timezone info since the epoch based on
the Olson data for each of the locales. Would you be positive about committing such changes,
Mike?  I would then have to write a program for that.

Do we use Olson tz data in any of the glibc functions?

> > How many? I think it is not a majority (by far).
> 
> seeing as how people literally travel around the world now, and mobility is only 
> increasing, any combo is fair game now.

Why is changing the timezone in an app not a satisfactory way of handling this?

You must acknowledge that the multiple timezone per country is a quite limited problem,
only really valid for the USA, Canada and Russia. I don't know where you are from,
but could you consider that it would be an improvement for the vast majority of the
world, even if it was not as great for you?

Anyway using geoip could solve part of this, as you note below.
But when you travel, you normally would like to retain most of your environment
such as the language, only TZ info would need to change.

> > Anyway, as I explained,
> > even if there are more timezone choices for a locale, then it is an improvement
> > to have the choices instead of a cumbersome clicking on a map, which selects
> > Washington DC or New York as a default (sic!)
> 
> which is why there's been a rise to use packages like geoip to automatically 
> detect an appropriate region based on the network connectivity, gps, cellular
> stations, or other sources.

Yes, that is a good way forward, but for initial setup of a machine, one still
would need the coupling of the geoip data with a locale, and then the coupling
of a locale to a timezone. So the timezone info coupled to the locale
is still very useful.

> i've just snipped the rest because none of the responses i found convincing
> and rehashing things is going nowhere.  sorry.

Well, also sorry that we don't agree on the purpose of glibc, and on making the life of
end users easier, and making time display correct.  It also seems like we have different
ways of counting the people in the world, or at least giving importance to them.

Best regards
Keld
Joseph Myers Aug. 5, 2015, 4:15 p.m. UTC | #14
On Wed, 5 Aug 2015, Keld Simonsen wrote:

> I don't know. I think I could add timezone info since the epoch based on
> the Olson data for each of the locales. Would you be positive about committing such changes 
> Mike?  I would then have to write up a program for that.

We should not duplicate the work done externally in tracking timezone 
changes, nor embed copies of such frequently changing data in glibc (the 
existing copies of timezone data are purely for use as test inputs for zic 
and time-related functions, not for installation).

I don't think it's useful to embed Olson timezone names in glibc locales 
either.  Rather, location-selection mechanisms maintained entirely outside 
glibc should provide user-friendly ways of choosing both (language, 
country) locale rules and timezone (the latter might change when 
travelling, the former probably not).

Olson timezone names are explicitly not intended to be presented directly 
to users for timezone selection.  See the Theory file 
<https://github.com/eggert/tz/blob/master/Theory>.  If a distribution is 
presenting such names (and thus causing issues because the named city is 
politically inappropriate), report it as a bug directly to that 
distribution.
Paul Eggert Aug. 5, 2015, 4:20 p.m. UTC | #15
On 08/05/2015 03:22 AM, keld@keldix.com wrote:
> For countries with more timezones, the locale data helps narrowing down the
> choices. And there are not that many countries with more than 1 timezone,
> eg USA, Canada, Russia and Greenland. Many big countries like China and India
> only have 1 timezone

Actually, China has two time zones: tzdata's Asia/Shanghai and 
Asia/Urumqi both reflect officially-kept time.  Even Germany has more 
than one tzdata entry, due to a difference in the post-1970 history of
timekeeping in its Swiss enclaves.  So the problem of many time zones 
for one locale is bigger than what you're suggesting, even if we ignore 
traveling users (which is a pretty big class to ignore).

As for tzdata names "not being culturally acceptable", they are intended 
for use as internal identifiers, visible to experts like us but not to 
end users, so "cultural acceptability" should not be an issue.  End 
users in China, for example, are not expected to see "Asia/Shanghai" 
even in an English locale, but instead are expected to see "China 
Standard Time" or "Beijing Time" or something like that.  Strings like 
"China Standard Time" and "北京时间" are maintained by the Unicode 
Common Locale Data Repository <http://cldr.unicode.org/> and are widely 
used in glibc-based systems.  I don't know whether CLDR supports ISO TR 
14652 and 30112, but if it doesn't then I suggest approaching the CLDR 
maintainers.
Mike Frysinger Aug. 6, 2015, 2:52 a.m. UTC | #16
On 05 Aug 2015 17:56, Keld Simonsen wrote:
> or at least giving importance to them.

please, cut the crap
-mike
Mike Frysinger Aug. 6, 2015, 2:56 a.m. UTC | #17
On 05 Aug 2015 09:20, Paul Eggert wrote:
> On 08/05/2015 03:22 AM, keld@keldix.com wrote:
> > For countries with more timezones, the locale data helps narrowing down the
> > choices. And there are not that many countries with more than 1 timezone,
> > eg USA, Canada, Russia and Greenland. Many big countries like China and India
> > only have 1 timezone
> 
> Actually, China has two time zones: tzdata's Asia/Shanghai and 
> Asia/Urumqi both reflect officially-kept time.  Even Germany has more 
> than one tzdata entry, due to the a difference in post-1970 history of 
> timekeeping in its Swiss enclaves.  So the problem of many time zones 
> for one locale is bigger than what you're suggesting, even if we ignore 
> traveling users (which is a pretty big class to ignore).

a cursory search shows many more countries as well:
Canada, USA, Mexico, Brazil, Australia, Russia, Mongolia, Kazakhstan,
Democratic Republic of Congo, Indonesia, Greenland (does that mean Denmark
too?).  that's at least 16% of the world's population (35% if you count
China).

and that's just for the current period of time.  as you highlight, if you
look back historically, there are other countries that spanned timezones.

locales also are not strictly defined by country borders which means the
timezone spans are even higher (i'm not counting people who travel).
-mike
Keld Simonsen Aug. 6, 2015, 2:30 p.m. UTC | #18
On Wed, Aug 05, 2015 at 10:56:44PM -0400, Mike Frysinger wrote:
> On 05 Aug 2015 09:20, Paul Eggert wrote:
> > On 08/05/2015 03:22 AM, keld@keldix.com wrote:
> > > For countries with more timezones, the locale data helps narrowing down the
> > > choices. And there are not that many countries with more than 1 timezone,
> > > eg USA, Canada, Russia and Greenland. Many big countries like China and India
> > > only have 1 timezone
> > 
> > Actually, China has two time zones: tzdata's Asia/Shanghai and 
> > Asia/Urumqi both reflect officially-kept time.  Even Germany has more 
> > than one tzdata entry, due to the a difference in post-1970 history of 
> > timekeeping in its Swiss enclaves.  So the problem of many time zones 
> > for one locale is bigger than what you're suggesting, even if we ignore 
> > traveling users (which is a pretty big class to ignore).
> 
> a cursory search shows many more countries as well:
> Canada, USA, Mexico, Brazil, Australia, Russia, Mongolia, Kazakhstan,
> Democratic Republic of Congo, Indonesia, Greenland (does that mean Denmark
> too?).  that's at least 16% of the world's population (35% if you count
> China).

I am glad you are now coming forward with actual facts, Mike.
Then we can hopefully find out what the facts are, and probably agree on something.
I have previously had discussions with people where we were initially
in violent disagreement, but along the road I got a little wiser and probably
my opponent also got a little wiser, and we found some workable solutions.
That is why I keep responding to almost all of your posts in a technical
tone, and I hope you could do the same, for the benefit of the glibc project.

I have, as you may know, been involved with glibc i18n for many years,
and I designed and specified many of the i18n enhancements over POSIX/C,
and also provided lots of data for that purpose to the glibc project.
If I am not corrected with my futuristic ideas, I will probably continue infinitely
on this path - till now it has led us much of the way to where we are now.

Most of these countries I have already mentioned in previous posts.
For Greenland, yes, it is part of the Kingdom of Denmark, but not part
of the State of Denmark. Greenland has its own country code, and
thus its own locales.

I also mentioned in previous posts that narrowing the choice of locales down to
two or three is a big improvement over the current state, even if you cannot 
fully determine a locale for a country. Do you think there is any merit in that
observation, Mike?

My conclusion was then that the only countries that did not benefit
hugely from the narrow range of plausible locales were the USA, Canada and Russia.
But anyway, having to choose between about a dozen different locales that can be presented
in one display is a much nicer option than choosing amongst a long list of all glibc
locales.

> and that's just for the current period of time.  as you highlight, if you
> look back historically, there are other countries that spanned timezones.

Yes, the Olson tz database probably has all these data.
Still, if you order the timezones for a country in some way, e.g. in order of population,
or alphabetically, you could probably find a solution that is useful to most
people. And then you have the option of setting a specific
TZ variable if you have something really special. This is UNIX, you know,
we can tweak it endlessly.

> locales also are not strictly defined by country borders which means the
> timezone spans are even higher (i'm not counting people who travel).

I doubt this is a big case. And anyway it can be tweaked, as noted above, right?

best regards
Keld
Paul Eggert Aug. 6, 2015, 2:47 p.m. UTC | #19
On 08/06/2015 07:30 AM, keld@keldix.com wrote:
> Still, if you order the timezones for a country in some way, eg order of population,
> or alphabetically, you could probably find a solution that is useful to most
> people.

This is already done, by applications outside glibc.  For example, the
tzselect program (part of the tzcode distribution) lets you list
tzdata names alphabetically, or in order of distance from your location, 
or geographically.  Other commonly-used time zone selectors do something 
fancier, e.g., based on <http://efele.net/maps/tz> shapefiles.  None of 
these selectors are based on the recently-removed glibc locale 
information, or would benefit from reverting the removal.
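
As a rough illustration of how a selector outside glibc can narrow the choice by country, the Python sketch below groups zone names by country code and lists them alphabetically. The tz distribution ships tab-separated country-to-zone tables (`zone.tab` and `zone1970.tab`); the rows in `SAMPLE_ZONE_TABLE` only imitate that layout and are illustrative, not copied from any tzdata release. The function name `zones_by_country` is hypothetical, and this is not how tzselect itself is implemented.

```python
# Illustrative rows in a tab-separated "codes, coordinates, TZ, comments" layout.
SAMPLE_ZONE_TABLE = (
    "# country-codes\tcoordinates\tTZ\tcomments\n"
    "CN\t+4348+08735\tAsia/Urumqi\tXinjiang Time\n"
    "CN\t+3114+12128\tAsia/Shanghai\tBeijing Time\n"
    "DK\t+5540+01235\tEurope/Copenhagen\n"
    "DE\t+5230+01322\tEurope/Berlin\tmost of Germany\n"
    "DE\t+4742+00841\tEurope/Busingen\tBusingen\n"
)

def zones_by_country(table_text):
    """Map each country code to an alphabetically sorted list of zone names."""
    zones = {}
    for line in table_text.splitlines():
        if not line or line.startswith("#"):   # skip blanks and comments
            continue
        fields = line.split("\t")
        codes, tz_name = fields[0], fields[2]
        for code in codes.split(","):          # one row may list several countries
            zones.setdefault(code, []).append(tz_name)
    return {code: sorted(names) for code, names in zones.items()}

choices = zones_by_country(SAMPLE_ZONE_TABLE)
```

With this sample, a user who picked China or Germany would be offered two entries and a user who picked Denmark only one, all without any timezone data in the glibc locale files.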
Andrew Pinski Aug. 6, 2015, 2:55 p.m. UTC | #20
> On Aug 6, 2015, at 4:30 PM, keld@keldix.com wrote:
> 
>> On Wed, Aug 05, 2015 at 10:56:44PM -0400, Mike Frysinger wrote:
>>> On 05 Aug 2015 09:20, Paul Eggert wrote:
>>>> On 08/05/2015 03:22 AM, keld@keldix.com wrote:
>>>> For countries with more timezones, the locale data helps narrowing down the
>>>> choices. And there are not that many countries with more than 1 timezone,
>>>> eg USA, Canada, Russia and Greenland. Many big countries like China and India
>>>> only have 1 timezone
>>> 
>>> Actually, China has two time zones: tzdata's Asia/Shanghai and 
>>> Asia/Urumqi both reflect officially-kept time.  Even Germany has more 
>>> than one tzdata entry, due to the a difference in post-1970 history of 
>>> timekeeping in its Swiss enclaves.  So the problem of many time zones 
>>> for one locale is bigger than what you're suggesting, even if we ignore 
>>> traveling users (which is a pretty big class to ignore).
>> 
>> a cursory search shows many more countries as well:
>> Canada, USA, Mexico, Brazil, Australia, Russia, Mongolia, Kazakhstan,
>> Democratic Republic of Congo, Indonesia, Greenland (does that mean Denmark
>> too?).  that's at least 16% of the world's population (35% if you count
>> China).
> 
> I am glad you are now coming forward with actual facts, Mike.
> Then we can hopefully find out what the facts are, and probably agree on something.
> I have previously done discussion with people were we were intially 
> in violent diagreement, but along the road I got a little wiser and probably
> my opponent also got a little wiser, and we found some workable solutions. 
> That is why I keep responding to almost all of pour posts in a technical
> tone, and I hope you could do the same, for the benefit of the glibc project.
> 
> I have as you may know been involved with glibc i18n for many years,
> and I designed and speced many of the i18n enhancements over POSIX/C,
> and also provided lots of data for that purpose to the glibc project.
> If I am not corrected with my futurisic ideas, I will probably continue infinitely
> on this path - till now it has led us much of the way to where we are now.
> 
> Most of these countries I have already mentioned in previous posts.
> For Greenland, yes, it is part of the Kingdom of Denmark, but not part
> of the State of Denmark. Greenland has its own country code, and
> thus its own locales.
> 
> I also mentioned in previous posts that narrowing the choice of locales down to
> two or three is a big improvement over the current state, even if you cannot 
> fully determine a locale for a country. Do you think there is any merit in that
> observation, Mike?
> 
> My conclusion was then that the only countries that did not benefit 
> hugely on the narrow range of plausible locales were the USA, Canada and Russia.
> But anyway, having to chose between about a dozen different locales that can be presented 
> in one display, is a much nicer option than chosing amongst a long list of all glibc
> locales.
> 
>> and that's just for the current period of time.  as you highlight, if you
>> look back historically, there are other countries that spanned timezones.
> 
> Yes, the Olson tz database probably has all these data. 
> Still, if you order the timezones for a country in some way, eg order of population,
> or alphabetically, you could probably find a solution that is useful to most
> people. And then you have the option of setting a specific
> TZ variable if you have someting really special. This is UNIX, you know,
> we can tweek it endlessly.
> 
>> locales also are not strictly defined by country borders which means the
>> timezone spans are even higher (i'm not counting people who travel).
> 
> I doubt this is a big case. And anyway it can be tweeked, as noted above, right?

It is a big case in the GNU community and most open source communities, where people go to conferences. There is one such conference starting tomorrow. I doubt I want to be forced to use one of the CET locales to get the timezone in Prague.

Thanks,
Andrew


> 
> best regards
> Keld
Keld Simonsen Aug. 6, 2015, 2:56 p.m. UTC | #21
On Wed, Aug 05, 2015 at 09:20:14AM -0700, Paul Eggert wrote:
> On 08/05/2015 03:22 AM, keld@keldix.com wrote:
> >For countries with more timezones, the locale data helps narrowing down the
> >choices. And there are not that many countries with more than 1 timezone,
> >eg USA, Canada, Russia and Greenland. Many big countries like China and 
> >India
> >only have 1 timezone
> 
> Actually, China has two time zones: tzdata's Asia/Shanghai and 
> Asia/Urumqi both reflect officially-kept time.  Even Germany has more 
> than one tzdata entry, due to the a difference in post-1970 history of 
> timekeeping in its Swiss enclaves.  So the problem of many time zones 
> for one locale is bigger than what you're suggesting, even if we ignore 
> traveling users (which is a pretty big class to ignore).

Strange stories are always interesting. Was there a difference between German
and Swiss timezones? Something to do with Switzerland not being in the EU?
As written in earlier posts, you can always override standard TZ values in specific
cases. For travelling, I see no problem; at least I travel a lot, and I have
my local time changed easily. However, all my timestamps are kept in UTC and in seconds
since the Epoch. Where do you see problems, Paul?

And my mother told me that "two" is not "many". Both your examples are for countries
with two timezones, and I guess that one of them is the one that the majority
will use. And then even for the minority, it will help them to just pick number two,
and not have to pick amongst myriads of timezones.

> As for tzdata names "not being culturally acceptable", they are intended 
> for use as internal identifiers, visible to experts like us but not to 
> end users, so "cultural acceptability" should not be an issue.  End 
> users in China, for example, are not expected to see "Asia/Shanghai" 
> even in an English locale, but instead are expected to see "China 
> Standard Time" or "Beijing Time" or something like that.  Strings like 
> "China Standard Time" and "????????????" are maintained by the Unicode 
> Common Locale Data Repository <http://cldr.unicode.org/> and are widely 
> used in glibc-based systems.  I don't know whether CLDR supports ISO TR 
> 14652 and 30112, but if it doesn't then I suggest approaching the CLDR 
> maintainers.

I take your word on timezone names, Paul, as I consider you close to the horse's mouth.
Good to know that the city names are only meant to be internal.
Still we need to convey that info to many developers on Linux install procedures, then.

CLDR was built with input from 14652. They killed it and tried to take over the info
of it and 15897. Embrace and enhance. I will have a look at the CLDR timezone names.

Best regards
Keld
Keld Simonsen Aug. 6, 2015, 5:16 p.m. UTC | #22
On Thu, Aug 06, 2015 at 07:47:01AM -0700, Paul Eggert wrote:
> On 08/06/2015 07:30 AM, keld@keldix.com wrote:
> >Still, if you order the timezones for a country in some way, eg order of 
> >population,
> >or alphabetically, you could probably find a solution that is useful to 
> >most
> >people.
> 
> This is already done, by applications outside glibc.  For example, the 
> tzselect program (part of the tzcode distribution) lets you list 
> tzdata names alphabetically, or in order of distance from your location, 
> or geographically.  Other commonly-used time zone selectors do something 
> fancier, e.g., based on <http://efele.net/maps/tz> shapefiles.  None of 
> these selectors are based on the recently-removed glibc locale 
> information, or would benefit from reverting the removal.

Given that the data is not present in current locales, I understand this.
But with good data in the locales, that would be another matter.

best regards
keld
Keld Simonsen Aug. 6, 2015, 5:24 p.m. UTC | #23
On Thu, Aug 06, 2015 at 04:55:14PM +0200, pinskia@gmail.com wrote:
> 
> > On Aug 6, 2015, at 4:30 PM, keld@keldix.com wrote:
> > 
> > 
> >> locales also are not strictly defined by country borders which means the
> >> timezone spans are even higher (i'm not counting people who travel).
> > 
> > I doubt this is a big case. And anyway it can be tweaked, as noted above, right?
> 
> It is a big case in the GNU community and most open source communities, where people go to conferences. There is one such conference starting tomorrow. I doubt I want to be forced to use one of the CET locales to get the timezone in Prague.

I think we misunderstand each other. I am in no doubt that people travel all the time,
and changing timezone is an issue here. I was not addressing that, but addressing the claim that
"locales also are not strictly defined by country borders which means the timezone spans are even higher".
I doubt that this is a big problem, but would like to hear where there are problems with this.

Travelling - I think this is a well known issue, and already solved. At least I travel a lot,
and I am happy that in my KDE interface I can just click on the time displayed
in my bottom panel, and change the time displayed to the one that I need.
Where do you foresee problems with the timezone category on this?
My take is that it does not change anything here.

Best regards
Keld
Keld Simonsen Aug. 6, 2015, 6:01 p.m. UTC | #24
On Wed, Aug 05, 2015 at 04:15:14PM +0000, Joseph Myers wrote:
> On Wed, 5 Aug 2015, Keld Simonsen wrote:
> 
> > I don't know. I think I could add timezone info since the epoch based on
> > the Olson data for each of the locales. Would you be positive about committing such changes 
> > Mike?  I would then have to write up a program for that.
> 
> We should not duplicate the work done externally in tracking timezone 
> changes, nor embed copies of such frequently changing data in glibc (the 
> existing copies of timezone data are purely for use as test inputs for zic 
> and time-related functions, not for installation).

I don't understand your attitude here. Most other data in locales are
coming from somewhere else, such as the language codes, the country codes,
the date formats, the character attributes, the collation sequence.

Locales are just a way of providing all these data, collected from
different sources, into one uniform syntax, to be utilized by
different glibc functions. And also collected so the user
can choose the whole collection in one go. That has proven a useful
concept for many years in the POSIX/C environment.

Best regards
Keld
Andreas Schwab Aug. 6, 2015, 6:44 p.m. UTC | #25
Keld Simonsen <keld@keldix.com> writes:

> Locales are just a way of providing all these data, collected from
> different sources, into one uniform syntax, to be utilized by
> different glibc functions. And also collected so the user
> can choose the whole collection in one go. That has proven a useful
> concept for many years in the POSIX/C environment.

And the timezone package is the primary source for timezone data.

Andreas.
Zack Weinberg Aug. 6, 2015, 8:12 p.m. UTC | #26
Is it possible for a locale to specify an Olson timezone which will be
used *as the default* (i.e. unless overridden by TZ=) ?  If that were
possible, then it would be straightforward to label most locales with
an appropriate timezone and there would be no issue of having to
maintain the same information in two places.
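
A minimal sketch of the lookup order this question describes, assuming a hypothetical per-locale default table. The names `LOCALE_DEFAULT_TZ` and `effective_tz` are illustrative only; they are not glibc interfaces, and nothing in this thread says glibc implements such a lookup:

```python
import os

# Hypothetical table mapping a locale name to a tz database identifier.
# Only the identifier is stored, so the rules themselves stay in tzdata.
LOCALE_DEFAULT_TZ = {
    "th_TH": "Asia/Bangkok",
    "km_KH": "Asia/Phnom_Penh",
}

def effective_tz(locale_name, environ=None):
    """Return the zone name to use: TZ wins, then the locale default.

    Returns None when neither is available, meaning "fall back to
    whatever the system default is".
    """
    env = os.environ if environ is None else environ
    tz = env.get("TZ")
    if tz:
        return tz
    # Strip an optional codeset (".UTF-8") and modifier ("@latin").
    base = locale_name.split(".")[0].split("@")[0]
    return LOCALE_DEFAULT_TZ.get(base)
```

With this shape, `effective_tz("th_TH.UTF-8", {})` yields the locale's default, while any non-empty `TZ` in the environment takes precedence.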

On Thu, Aug 6, 2015 at 2:44 PM, Andreas Schwab <schwab@linux-m68k.org> wrote:
> Keld Simonsen <keld@keldix.com> writes:
>
>> Locales are just a way of providing all these data, collected from
>> different sources, into one uniform syntax, to be utilized by
>> different glibc functions. And also collected so the user
>> can choose the whole collection in one go. That has proven a useful
>> concept for many years in the POSIX/C environment.
>
> And the timezone package is the primary source for timezone data.
>
> Andreas.
>
> --
> Andreas Schwab, schwab@linux-m68k.org
> GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
> "And now for something completely different."
Joseph Myers Aug. 6, 2015, 8:15 p.m. UTC | #27
On Thu, 6 Aug 2015, Keld Simonsen wrote:

> On Wed, Aug 05, 2015 at 04:15:14PM +0000, Joseph Myers wrote:
> > On Wed, 5 Aug 2015, Keld Simonsen wrote:
> > 
> > > I don't know. I think I could add timezone info since the epoch based on
> > > the Olson data for each of the locales. Would you be positive about committing such changes 
> > > Mike?  I would then have to write up a program for that.
> > 
> > We should not duplicate the work done externally in tracking timezone 
> > changes, nor embed copies of such frequently changing data in glibc (the 
> > existing copies of timezone data are purely for use as test inputs for zic 
> > and time-related functions, not for installation).
> 
> I don't understand your attitude here. Most other data in locales are
> coming from somewhere else, such as the language codes, the country codes,
> the date formats, the character attributes, the collation sequence.

tzdata is updated many times a year, sometimes with no more than a few 
days' notice that a country is changing its timezone; I don't think 
countries change their collation rules with a few days' notice like that.  
GNU/Linux distributions have well-established processes for getting those 
updates out to users in a timely manner.  Note that tzdata comes from 
separate sources to glibc, and is built independently of glibc; anything 
built from the glibc source tree would be liable to require other binary 
packages built from the same source tree to be updated at the same time, 
which is not helpful.

A few years ago we deliberately stopped installing timezone data from 
glibc because it was much better for distributions to get the updates 
directly from the upstream project.  The principles haven't changed.

It's possible the tzdist protocol may become relevant in future for this 
purpose, or that glibc might gain support for rereading timezone data in 
future (relevant for long-running processes when timezone rules change).  
But I don't see timezone information in locales as being relevant to glibc 
in the future.  The path that leads to systems where timezone information 
in locales is relevant is a path that diverged from glibc (and Unix-like 
systems in general) a long time ago (over 20 years ago, at least, given 
how POSIX had separate interfaces for timezones and locales over 20 years 
ago and the tz database dates back to 1986 or before).

Timezones and locales are completely orthogonal in glibc.  Moving away 
from that would be a backwards step.  If anything, we should add 
interfaces involving explicit timezone objects (see bug 17651) just like 
the interfaces involving explicit locale objects to make it even more 
convenient for applications to use arbitrary combinations of locales and 
timezones when processing data where different records may involve 
different timezones.
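
As a rough illustration of that separation (a sketch, not part of the patch): on a Unix-like system the local time of a process follows the TZ environment variable, and no locale setting is consulted. The helper name `local_hour_at_epoch` is made up for this example; the POSIX-style strings `UTC0` and `ICT-7` are used so that no tzdata files are needed.

```python
import os
import time

def local_hour_at_epoch(tz):
    """Return the local hour at Unix time 0 under the given TZ setting."""
    os.environ["TZ"] = tz
    time.tzset()  # Unix only: make the C library re-read TZ
    return time.localtime(0).tm_hour

# The locale environment (LANG, LC_*) is never touched here.
print(local_hour_at_epoch("UTC0"))   # prints 0
print(local_hour_at_epoch("ICT-7"))  # prints 7 (seven hours east of UTC)
```

In a POSIX TZ string a negative offset means the zone is east of the Prime Meridian, which is why `ICT-7` corresponds to UTC+7.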
diff mbox

Patch

diff --git a/localedata/locales/km_KH b/localedata/locales/km_KH
index 5563659..aaef20d 100644
--- a/localedata/locales/km_KH
+++ b/localedata/locales/km_KH
@@ -1838,8 +1838,6 @@  am_pm    "<U1796><U17D2><U179A><U17B9><U1780>";"<U179B><U17D2><U1784><U17B6><U17
 %date_fmt       "<U0025><U0061><U0020><U0025><U0065><U0020><U0025><U0062>/
 %<U0020><U0025><U0045><U0079><U0020><U0025><U0048><U003A><U0025><U004D>/
 %<U003A><U0025><U0053><U0020><U0025><U005A>"
-% ICT-7ICT-7
-%timezone	"<U0049><U0043><U0054><U002D><U0037><U0049><U0043><U0054><U002d><U0037>"
 
 END LC_TIME
 
diff --git a/localedata/locales/lo_LA b/localedata/locales/lo_LA
index c584877..eba90ce 100644
--- a/localedata/locales/lo_LA
+++ b/localedata/locales/lo_LA
@@ -716,8 +716,6 @@  era_d_t_fmt     "<U0EA7><U0EB1><U0E99>%A<U0E97><U0EB5><U0EC8><U0020>%e<U0020>%B<
 date_fmt       "<U0025><U0061><U0020><U0025><U0065><U0020><U0025><U0062>/
 <U0020><U0025><U0045><U0079><U0020><U0025><U0048><U003A><U0025><U004D>/
 <U003A><U0025><U0053><U0020><U0025><U005A>"
-% ICT-7ICT-7
-timezone	"<U0049><U0043><U0054><U002D><U0037><U0049><U0043><U0054><U002d><U0037>"
 END LC_TIME
 
 LC_MESSAGES
diff --git a/localedata/locales/my_MM b/localedata/locales/my_MM
index d9a2db1..165519a 100644
--- a/localedata/locales/my_MM
+++ b/localedata/locales/my_MM
@@ -157,9 +157,6 @@  t_fmt       "<U0025><U004F><U0049><U003A><U0025><U004F><U004D><U003A><U0025><U00
 % %OI:%OM:%OS %p
 t_fmt_ampm  "<U0025><U004F><U0049><U003A><U0025><U004F><U004D><U003A><U0025><U004F><U0053><U0020><U0025><U0070>"
 
-% MMT-6.5MMT-6.5
-timezone "<U004D><U004D><U0054><U002D><U0036><U002E><U0035><U004D><U004D><U0054><U002D><U0036><U002E><U0035>"
-
 alt_digits  "<U1040><U1040>";/
 		"<U1040><U1041>";/
 		"<U1040><U1042>";/
diff --git a/localedata/locales/nan_TW@latin b/localedata/locales/nan_TW@latin
index a1e7d49..eb2b292 100644
--- a/localedata/locales/nan_TW@latin
+++ b/localedata/locales/nan_TW@latin
@@ -136,7 +136,6 @@  d_fmt       "<U0025><U0046>"
 t_fmt       "<U0025><U0072>"
 am_pm       "<U0074><U00E9><U006E><U0067><U002D><U0070><U006F><U0358>";"<U0113><U002D><U0070><U006F><U0358>"
 t_fmt_ampm  "<U0025><U0049><U003A><U0025><U004D><U003A><U0025><U0053><U0020><U0025><U0070>"
-timezone    "<U0054><U0053><U0054><U002D><U0038>"
 date_fmt    "<U0025><U0059><U0020><U0025><U0062><U0020><U0025><U0064><U0020><U0028><U0025><U0061><U0029><U0020><U0025><U0048><U003A><U0025><U004D><U003A><U0025><U0053><U0020><U0025><U005A>"
 END LC_TIME
 
diff --git a/localedata/locales/th_TH b/localedata/locales/th_TH
index 88c3637..5b8c41b 100644
--- a/localedata/locales/th_TH
+++ b/localedata/locales/th_TH
@@ -911,8 +911,6 @@  era_d_t_fmt     "<U0E27><U0E31><U0E19>%A<U0E17><U0E35><U0E48><U0020>%e<U0020>%B<
 date_fmt       "<U0025><U0061><U0020><U0025><U0065><U0020><U0025><U0062>/
 <U0020><U0025><U0045><U0079><U0020><U0025><U0048><U003A><U0025><U004D>/
 <U003A><U0025><U0053><U0020><U0025><U005A>"
-% ICT-7ICT-7
-timezone	"<U0049><U0043><U0054><U002D><U0037><U0049><U0043><U0054><U002d><U0037>"
 END LC_TIME
 
 LC_MESSAGES
diff --git a/localedata/locales/uk_UA b/localedata/locales/uk_UA
index 511f004..5e58043 100644
--- a/localedata/locales/uk_UA
+++ b/localedata/locales/uk_UA
@@ -964,31 +964,6 @@  first_weekday 2
 % Define the first workday relative to the <week> keyword
 first_workday 2
 
-% Zymovyj CHas (winter time) or z.ch. (or nothing)
-% Litnij CHas (summer time) or l.ch.
-%
-% ( or EET/EEST (Easter Europe [Summer] Time) )
-% ( or Europe/Kyiv (or Kiev, in Russian) )
-%
-% Format:
-%
-% <ZoneName><Offset><ZoneName><Offset>,<rule>,<rule>[,...]
-%
-%  ZoneName - at least 3 letters, up to 10
-%  Offset - (+|-)hh[:mm[:ss]]
-%     - - time zone is east of Prime Meridian
-%     + - time zone is west of Prime Meridian
-%  rule: <date>[/time[/year]]
-%   date:
-%     J<JulianDay> , 1-365 (without 29.02)
-%     <JulianDay> , 0-364 (without 29.02)
-%     M<m>.<n>.<d> - m - month(1-12)
-%                    n - week(1-5)
-%                    d - day of week(0-7), day zero and day seven is Sunday
-%   time - the same as <offset> (but without leading +/-)
-%
-timezone "<U0437><U002E><U0447><U002E><U002D><U0030><U0032><U003A><U0030><U0030><U043B><U002E><U0447><U002E><U002D><U0030><U0033><U003A><U0030><U0030><U002C><U004D><U0033><U002E><U0035><U002E><U0030><U002F><U0030><U0033><U003A><U0030><U0030><U002C><U004D><U0031><U0030><U002E><U0035><U002E><U0030><U002F><U0030><U0034><U003A><U0030><U0030>"
-
 % Example:
 %
 %           traven`         cherven`