sort diacritics left-to-right except in fr_CA locale

Message ID orvbl33rbr.fsf@free.home
State New

Commit Message

Alexandre Oliva Dec. 23, 2014, 4:47 a.m. UTC
On Dec 17, 2014, Roland McGrath <roland@hack.frob.com> wrote:

>> Noted, thanks.  Any other comments on the patch, before I post a revised
>> version mentioning the yet-to-be-filed bug report?

> I am pretty useless in that area of the code, sorry.

Ping? (as in, anyone else? :-)

Here's a revised patch that adds a reference to the newly-filed bug
report.

for  ChangeLog

	[BZ #17750]
	* localedata/Makefile (test-input): Add fr_CA.UTF-8.
	(LOCALES): Likewise.
	* localedata/fr_CA.in: Copied and adjusted from...
	* localedata/fr_FR.in: ... this.  Adjusted too.
	* localedata/locales/de_DE (DIACRIT_FORWARD): Do not define.
	* localedata/locales/lb_LU (DIACRIT_FORWARD): Likewise.
	* localedata/locales/fr_CA (DIACRIT_BACKWARD): Define.
	* localedata/locales/iso14651_t1_common (DIACRIT_FORWARD):
	Make it the new default, overridable with DIACRIT_BACKWARD.
	* NEWS: Note behavior change.
---
 NEWS                                  |   11 +++-
 localedata/Makefile                   |    4 +
 localedata/fr_CA.in                   |   96 +++++++++++++++++++++++++++++++++
 localedata/fr_FR.in                   |   22 ++++----
 localedata/locales/de_DE              |    2 -
 localedata/locales/fr_CA              |    2 +
 localedata/locales/iso14651_t1_common |    6 +-
 localedata/locales/lb_LU              |    2 -
 8 files changed, 124 insertions(+), 21 deletions(-)
 create mode 100644 localedata/fr_CA.in
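
For readers unfamiliar with the forward/backward distinction, here is a
minimal sketch of the two tie-breaking passes described in the NEWS entry.
It is an illustration only, not how glibc does it: localedef compiles the
real weights from iso14651_t1_common, whereas this sketch assumes a
two-level key in which the secondary weight of each letter is simply the
code points of its combining marks.  The names split_levels and
collation_key are hypothetical.

```python
import unicodedata

def split_levels(word):
    """Return (base letters, per-letter diacritic marks) for a word."""
    bases, marks = [], []
    for ch in unicodedata.normalize("NFD", word):
        if unicodedata.combining(ch) and bases:
            # Attach the combining mark to the preceding base letter.
            marks[-1] += (ord(ch),)
        else:
            bases.append(ch.lower())
            marks.append(())
    return "".join(bases), tuple(marks)

def collation_key(word, backward=False):
    """Primary level: base letters.  Secondary level: diacritics,
    scanned left-to-right (forward) or right-to-left (backward)."""
    base, marks = split_levels(word)
    if backward:
        marks = marks[::-1]
    return (base, marks)

words = ["côté", "cote", "coté", "côte"]

# Forward pass: the new default for all locales except fr_CA.
forward_order = sorted(words, key=collation_key)

# Backward pass: what fr_CA keeps via DIACRIT_BACKWARD.
backward_order = sorted(words, key=lambda w: collation_key(w, backward=True))

print(" < ".join(forward_order))
print(" < ".join(backward_order))
```

With the forward pass the first differing diacritic from the left decides
the tie; with the backward pass the last one does, which is why only the
relative order of «côte» and «coté» changes between the two.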

Comments

Alexandre Oliva Jan. 6, 2015, 12:11 a.m. UTC | #1
On Dec 23, 2014, Alexandre Oliva <aoliva@redhat.com> wrote:

> On Dec 17, 2014, Roland McGrath <roland@hack.frob.com> wrote:
>>> Noted, thanks.  Any other comments on the patch, before I post a revised
>>> version mentioning the yet-to-be-filed bug report?

>> I am pretty useless in that area of the code, sorry.

> Ping? (as in, anyone else? :-)

> Here's a revised patch that adds a reference to the newly-filed bug
> report.

> for  ChangeLog

> 	[BZ #17750]
> 	* localedata/Makefile (test-input): Add fr_CA.UTF-8.
> 	(LOCALES): Likewise.
> 	* localedata/fr_CA.in: Copied and adjusted from...
> 	* localedata/fr_FR.in: ... this.  Adjusted too.
> 	* localedata/locales/de_DE (DIACRIT_FORWARD): Do not define.
> 	* localedata/locales/lb_LU (DIACRIT_FORWARD): Likewise.
> 	* localedata/locales/fr_CA (DIACRIT_BACKWARD): Define.
> 	* localedata/locales/iso14651_t1_common (DIACRIT_FORWARD):
> 	Make it the new default, overridable with DIACRIT_BACKWARD.
> 	* NEWS: Note behavior change.

Ping?
Mike Frysinger March 5, 2015, 6:05 p.m. UTC | #2
On 23 Dec 2014 02:47, Alexandre Oliva wrote:
> On Dec 17, 2014, Roland McGrath <roland@hack.frob.com> wrote:
> 
> >> Noted, thanks.  Any other comments on the patch, before I post a revised
> >> version mentioning the yet-to-be-filed bug report?
> 
> > I am pretty useless in that area of the code, sorry.
> 
> Ping? (as in, anyone else? :-)
> 
> Here's a revised patch that adds a reference to the newly-filed bug
> report.

ok
-mike
Mike Frysinger April 12, 2016, 7:49 a.m. UTC | #3
On 05 Mar 2015 13:05, Mike Frysinger wrote:
> On 23 Dec 2014 02:47, Alexandre Oliva wrote:
> > On Dec 17, 2014, Roland McGrath <roland@hack.frob.com> wrote:
> > 
> > >> Noted, thanks.  Any other comments on the patch, before I post a revised
> > >> version mentioning the yet-to-be-filed bug report?
> > 
> > > I am pretty useless in that area of the code, sorry.
> > 
> > Ping? (as in, anyone else? :-)
> > 
> > Here's a revised patch that adds a reference to the newly-filed bug
> > report.
> 
> ok

were you going to merge this ?
-mike
Keld Simonsen April 12, 2016, 9 a.m. UTC | #4
On Tue, Apr 12, 2016 at 03:49:03AM -0400, Mike Frysinger wrote:
> On 05 Mar 2015 13:05, Mike Frysinger wrote:
> > On 23 Dec 2014 02:47, Alexandre Oliva wrote:
> > > On Dec 17, 2014, Roland McGrath <roland@hack.frob.com> wrote:
> > > 
> > > >> Noted, thanks.  Any other comments on the patch, before I post a revised
> > > >> version mentioning the yet-to-be-filed bug report?
> > > 
> > > > I am pretty useless in that area of the code, sorry.
> > > 
> > > Ping? (as in, anyone else? :-)
> > > 
> > > Here's a revised patch that adds a reference to the newly-filed bug
> > > report.
> > 
> > ok
> 
> were you going to merge this ?

Well, a number of locales where French is influential should still have
the reversed accent sorting. This should include fr_BE, fr_CH, da_DK,
fr_CA and a number of locales in Africa that use French as a business language.
For da_DK, we say that where this matters, it is most likely because the words or names
originate from French. The same reasoning could also apply to nb_NO, nn_NO and sv_SE.

Best regards
keld
Patch

diff --git a/NEWS b/NEWS
index 0d481c2..0e267eb 100644
--- a/NEWS
+++ b/NEWS
@@ -15,7 +15,7 @@  Version 2.21
   17522, 17555, 17570, 17571, 17572, 17573, 17574, 17581, 17582, 17583,
   17584, 17585, 17589, 17594, 17601, 17608, 17616, 17625, 17630, 17633,
   17634, 17647, 17653, 17657, 17664, 17665, 17668, 17682, 17717, 17719,
-  17722, 17724, 17725, 17733, 17744, 17745, 17746, 17747.
+  17722, 17724, 17725, 17733, 17744, 17745, 17746, 17747, 17750.
 
 * CVE-2104-7817 The wordexp function could ignore the WRDE_NOCMD flag
   under certain input conditions resulting in the execution of a shell for
@@ -46,6 +46,15 @@  Version 2.21
 
 * Merged gettext 0.19.3 into the intl subdirectory.  This fixes building
   with newer versions of bison.
+
+* Collation (sorting) general rules regarding diacritics have been fixed to
+  match those in Unicode CLDR, namely, whether diacritic tie-breaking takes
+  place in a forward or backward pass over the strings or wstrings.  The
+  only locale that sorts diacritics with a backward pass is now fr_CA; it
+  already sorted «cote < côte < coté < côté» before.  All other locales now
+  use a forward pass, so that they sort «cote < coté < côte < côté», which
+  only de_DE and lb_LU did before.  (Bugzilla #17750)
+
 
 Version 2.20
 
diff --git a/localedata/Makefile b/localedata/Makefile
index 0826b36..4fc523e 100644
--- a/localedata/Makefile
+++ b/localedata/Makefile
@@ -37,7 +37,7 @@  test-srcs := collate-test xfrm-test tst-fmon tst-rpmatch tst-trans \
 	     tst-ctype tst-langinfo tst-langinfo-static tst-numeric
 test-input := de_DE.ISO-8859-1 en_US.ISO-8859-1 da_DK.ISO-8859-1 \
 	      hr_HR.ISO-8859-2 sv_SE.ISO-8859-1 tr_TR.UTF-8 fr_FR.UTF-8 \
-	      si_LK.UTF-8
+	      si_LK.UTF-8 fr_CA.UTF-8
 test-input-data = $(addsuffix .in, $(basename $(test-input)))
 test-output := $(foreach s, .out .xout, \
 			 $(addsuffix $s, $(basename $(test-input))))
@@ -106,7 +106,7 @@  LOCALES := de_DE.ISO-8859-1 de_DE.UTF-8 en_US.ANSI_X3.4-1968 \
 	   hr_HR.ISO-8859-2 sv_SE.ISO-8859-1 ja_JP.SJIS fr_FR.ISO-8859-1 \
 	   nb_NO.ISO-8859-1 nn_NO.ISO-8859-1 tr_TR.UTF-8 cs_CZ.UTF-8 \
 	   zh_TW.EUC-TW fa_IR.UTF-8 fr_FR.UTF-8 ja_JP.UTF-8 si_LK.UTF-8 \
-	   tr_TR.ISO-8859-9 en_GB.UTF-8
+	   tr_TR.ISO-8859-9 en_GB.UTF-8 fr_CA.UTF-8
 LOCALE_SRCS := $(shell echo "$(LOCALES)"|sed 's/\([^ .]*\)[^ ]*/\1/g')
 CHARMAPS := $(shell echo "$(LOCALES)" | \
 		    sed -e 's/[^ .]*[.]\([^ ]*\)/\1/g' -e s/SJIS/SHIFT_JIS/g)
diff --git a/localedata/fr_CA.in b/localedata/fr_CA.in
new file mode 100644
index 0000000..1c05d69
--- /dev/null
+++ b/localedata/fr_CA.in
@@ -0,0 +1,96 @@ 
+@@@@@
+0000
+9999
+Aalborg
+aide
+aïeul
+air
+@@@air
+air@@@
+Ålborg
+août
+bohème
+Bohême
+Bohémien
+caennais
+cæsium
+çà et là
+C.A.F.
+Canon
+cañon
+casanier
+cølibat
+colon
+côlon
+COOP
+CO-OP
+coop
+co-op
+Copenhagen
+COTE
+cote
+CÔTE
+côte
+COTÉ
+coté
+CÔTÉ
+côté
+du
+dû
+élève
+élevé
+gène
+gêne
+gêné
+Größe
+Grossist
+haie
+haïe
+île
+Île d'Orléans
+lame
+l'âme
+lamé
+les
+LÈS
+lèse
+lésé
+L'Haÿ-les-Roses
+MÂCON
+maçon
+McArthur
+Mc Arthur
+Mc Mahon
+MODÈLE
+modelé
+NOËL
+Noël
+notre
+nôtre
+ode
+œil
+ou
+OÙ
+ovoïde
+pèche
+pêche
+PÉCHÉ
+péché
+pêché
+pécher
+pêcher
+pechère
+péchère
+relève
+relevé
+resume
+resumé
+résumé
+révèle
+révélé
+vice-president
+vice-président
+vice-president's offices
+vice-presidents' offices
+VICE-VERSA
+vice versa
diff --git a/localedata/fr_FR.in b/localedata/fr_FR.in
index dd5c533..070eb4dc 100644
--- a/localedata/fr_FR.in
+++ b/localedata/fr_FR.in
@@ -29,16 +29,16 @@  CO-OP
 Copenhagen
 cote
 COTE
-côte
-CÔTE
 coté
 COTÉ
+côte
+CÔTE
 côté
 CÔTÉ
 du
 dû
-élève
 élevé
+élève
 gène
 gêne
 gêné
@@ -49,20 +49,20 @@  haïe
 île
 Île d'Orléans
 lame
-l'âme
 lamé
+l'âme
 les
 LÈS
-lèse
 lésé
+lèse
 L'Haÿ-les-Roses
-MÂCON
 maçon
+MÂCON
 McArthur
 Mc Arthur
 Mc Mahon
-MODÈLE
 modelé
+MODÈLE
 Noël
 NOËL
 notre
@@ -72,22 +72,22 @@  ode
 ou
 OÙ
 ovoïde
-pèche
-pêche
 péché
 PÉCHÉ
+pèche
+pêche
 pêché
 pécher
 pêcher
 pechère
 péchère
-relève
 relevé
+relève
 resume
 resumé
 résumé
-révèle
 révélé
+révèle
 vice-president
 vice-président
 vice-president's offices
diff --git a/localedata/locales/de_DE b/localedata/locales/de_DE
index e2704a7..2c3510a 100644
--- a/localedata/locales/de_DE
+++ b/localedata/locales/de_DE
@@ -76,8 +76,6 @@  END LC_CTYPE
 
 LC_COLLATE
 
-define DIACRIT_FORWARD
-
 % Copy the template from ISO/IEC 14651
 copy "iso14651_t1"
 
diff --git a/localedata/locales/fr_CA b/localedata/locales/fr_CA
index 5e2c5a1..878539b 100644
--- a/localedata/locales/fr_CA
+++ b/localedata/locales/fr_CA
@@ -51,6 +51,8 @@  copy "fr_FR"
 END LC_CTYPE
 
 LC_COLLATE
+define DIACRIT_BACKWARD
+
 copy "en_CA"
 END LC_COLLATE
 
diff --git a/localedata/locales/iso14651_t1_common b/localedata/locales/iso14651_t1_common
index e0c3eaa..1fc214f 100644
--- a/localedata/locales/iso14651_t1_common
+++ b/localedata/locales/iso14651_t1_common
@@ -5060,10 +5060,10 @@  order_start <SPECIAL>;forward;backward;forward;forward,position
 <U009E> IGNORE;IGNORE;IGNORE;<U009E>
 <U009F> IGNORE;IGNORE;IGNORE;<U009F>
 
-ifdef DIACRIT_FORWARD
-order_start <LATIN>;forward;forward;forward;forward,position
-else
+ifdef DIACRIT_BACKWARD
 order_start <LATIN>;forward;backward;forward;forward,position
+else
+order_start <LATIN>;forward;forward;forward;forward,position
 endif
 #
 <U00A0> <U0020>;<BAS>;<MIN>;IGNORE # 170<NBSP>
diff --git a/localedata/locales/lb_LU b/localedata/locales/lb_LU
index a74e162..c8616fd 100644
--- a/localedata/locales/lb_LU
+++ b/localedata/locales/lb_LU
@@ -77,8 +77,6 @@  END LC_CTYPE
 
 LC_COLLATE
 
-define DIACRIT_FORWARD
-
 % Copy the template from ISO/IEC 14651
 copy "iso14651_t1"