Message ID | s9degldez4x.fsf@ari.site |
---|---|
State | New |
Hi Mike,

I reviewed the resulting transliteration and special decompose rules and
in general everything looks very good, few minor comments below.

On 2015-06-15 19:04, Mike FABIAN wrote:
>
> Subject: [PATCH 1/4] Remove duplicate transliterations for U+0152 and U+0153
> from C-translit.h.in

this looks like an obvious fix.

> Subject: [PATCH 2/4] Addition and fixes for translit_neutral
>
> +% LATIN CAPITAL LETTER ENG
> +<U014A> <U004E>
> +% LATIN SMALL LETTER ENG
> +<U014B> <U006E>

Hmm, I presume NG/ng would be more expected than N/n here, but reading
https://en.wikipedia.org/wiki/Eng_%28letter%29 doesn't give a clear
answer either way, what do you think?

> +% EURO-CURRENCY SIGN
> +% CRUZEIRO SIGN
> +% FRENCH FRANC SIGN
> +% LIRA SIGN
> +% PESETA SIGN
> % DONG SIGN
> +% INDIAN RUPEE SIGN
> +% TURKISH LIRA SIGN

While at it, should we perhaps also add pound, ruble, drachma, won, and
hryvnia signs here?

> Subject: [PATCH 3/4] Update the translit files to Unicode 7.0.0

The generated files included in this patch look good.

> Subject: [PATCH 4/4] Add transliteration rules for da, nb, nn, and sv locales.

AFAICS these also look good.

Thanks,
Hi,

actually, one more additional note: after these patches some rules are
now duplicated, see below for few examples, is there some particular
reason for this or could those duplicates be avoided?

localhost:~> grep '^<U00C6>' translit*
translit_combining:<U00C6> "<U0041><U0045>"
translit_neutral:<U00C6> "<U0041><U0045>"
localhost:~> grep '^<U00D8>' translit*
translit_combining:<U00D8> <U004F>
translit_neutral:<U00D8> <U004F>
localhost:~>

Thanks,

On 2015-06-16 16:24, Marko Myllynen wrote:
> Hi Mike,
>
> I reviewed the resulting transliteration and special decompose rules and
> in general everything looks very good, few minor comments below.
>
> On 2015-06-15 19:04, Mike FABIAN wrote:
>>
>> Subject: [PATCH 1/4] Remove duplicate transliterations for U+0152 and U+0153
>> from C-translit.h.in
>
> this looks like an obvious fix.
>
>> Subject: [PATCH 2/4] Addition and fixes for translit_neutral
>>
>> +% LATIN CAPITAL LETTER ENG
>> +<U014A> <U004E>
>> +% LATIN SMALL LETTER ENG
>> +<U014B> <U006E>
>
> Hmm, I presume NG/ng would be more expected than N/n here, but reading
> https://en.wikipedia.org/wiki/Eng_%28letter%29 doesn't give a clear
> answer either way, what do you think?
>
>> +% EURO-CURRENCY SIGN
>> +% CRUZEIRO SIGN
>> +% FRENCH FRANC SIGN
>> +% LIRA SIGN
>> +% PESETA SIGN
>> % DONG SIGN
>> +% INDIAN RUPEE SIGN
>> +% TURKISH LIRA SIGN
>
> While at it, should we perhaps also add pound, ruble, drachma, won, and
> hryvnia signs here?
>
>> Subject: [PATCH 3/4] Update the translit files to Unicode 7.0.0
>
> The generated files included in this patch look good.
>
>> Subject: [PATCH 4/4] Add transliteration rules for da, nb, nn, and sv locales.
>
> AFAICS these also look good.
>
> Thanks,
>
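
The duplicate check done by hand with grep above can be generalised. The following is only a rough sketch, not a tool that exists in the tree: it assumes `%` is the comment character, that each rule sits on one line as `<Uxxxx> replacement`, and the function name `find_duplicate_rules` is made up for illustration. It reports every identical rule that appears in more than one file.

```python
import re
from collections import defaultdict

# One rule line: a source sequence such as <U00C6>, whitespace, then the
# replacement text (assumed layout, taken from the grep output above).
RULE_RE = re.compile(r'^((?:<U[0-9A-Fa-f]{4,8}>)+)\s+(.+?)\s*$')


def find_duplicate_rules(files):
    """Map (source, replacement) to the files defining it, if more than one.

    `files` maps a file name to the text of that file.
    """
    seen = defaultdict(list)
    for name, text in files.items():
        for line in text.splitlines():
            if line.startswith('%'):  # assumed comment character
                continue
            match = RULE_RE.match(line)
            if match:
                seen[match.groups()].append(name)
    return {rule: names for rule, names in seen.items() if len(names) > 1}


if __name__ == "__main__":
    # Sample data copied from the grep output, plus one non-duplicated rule.
    sample = {
        "translit_combining": '<U00C6> "<U0041><U0045>"\n<U00D8> <U004F>\n',
        "translit_neutral": '<U00C6> "<U0041><U0045>"\n<U00D8> <U004F>\n'
                            '<U014A> <U004E>\n',
    }
    for (source, replacement), names in find_duplicate_rules(sample).items():
        print(source, replacement, "in", ", ".join(names))
```

Rules that share a source but map to different replacements are deliberately not reported, since those may be intentional overrides rather than duplicates.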
Hi,

On 2015-06-16 17:24, Mike FABIAN wrote:
> Marko Myllynen <myllynen@redhat.com> wrote:
>
>>> Subject: [PATCH 2/4] Addition and fixes for translit_neutral
>>>
>>> +% LATIN CAPITAL LETTER ENG
>>> +<U014A> <U004E>
>>> +% LATIN SMALL LETTER ENG
>>> +<U014B> <U006E>
>>
>> Hmm, I presume NG/ng would be more expected than N/n here, but reading
>> https://en.wikipedia.org/wiki/Eng_%28letter%29 doesn't give a clear
>> answer either way, what do you think?
>
> http://unicode.org/cldr/trac/browser/trunk/common/transforms/Latin-ASCII.xml#L54
>
> has:
>
> 54 <tRule>Ŋ → N ; # 014A;LATIN CAPITAL LETTER ENG</tRule>
> 55 <tRule>ŋ → n ; # 014B;LATIN SMALL LETTER ENG</tRule>
>
> "ng" might be phonetically closer but the main spirit of the "neutral"
> transliteration to ASCII seems to be something like "drop the accents",
> not "approximate the pronunciation using ASCII".

I see, looks ok then.

Thanks,
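
To illustrate the "drop the accents" idea, here is a small sketch. It is neither the glibc nor the CLDR implementation: it uses Python's `unicodedata` module, and the `EXPLICIT` table and the name `drop_accents` are assumptions made for this example. Accented letters lose their combining marks after canonical decomposition, while ENG has no decomposition and so needs an explicit rule like the one in the patch.

```python
import unicodedata

# Letters without a canonical decomposition need an explicit rule. These two
# entries mirror the translit_neutral lines quoted above (ENG -> N / n).
EXPLICIT = {"\u014A": "N", "\u014B": "n"}


def drop_accents(text):
    """Strip combining marks, falling back to EXPLICIT for undecomposable letters."""
    out = []
    for ch in text:
        if ch in EXPLICIT:
            out.append(EXPLICIT[ch])
            continue
        # NFD splits e.g. U+00C5 into 'A' followed by U+030A COMBINING RING ABOVE.
        decomposed = unicodedata.normalize("NFD", ch)
        out.append("".join(c for c in decomposed if not unicodedata.combining(c)))
    return "".join(out)


if __name__ == "__main__":
    print(drop_accents("\u00C5ngstr\u00F6m"))
    print(drop_accents("\u014A\u014B"))
```

Note that this neutral approach turns U+00C5 into a plain "A", whereas the locale-specific rules in patch 4/4 prefer "AA" for the Scandinavian locales.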
From ef2a1022224d32989891f7a12f2170a1b3a7e7f9 Mon Sep 17 00:00:00 2001
From: Mike FABIAN <mfabian@redhat.com>
Date: Wed, 20 May 2015 11:16:30 +0200
Subject: [PATCH 4/4] Add transliteration rules for da, nb, nn, and sv locales.

for localedata/Changelog

[BZ #89]
* locales/da_DK add more transliteration rules
* locales/nb_NO add transliteration rules
* locales/sv_SE add transliteration rules
---
 localedata/locales/da_DK | 21 ++++++++++++++++++---
 localedata/locales/nb_NO | 22 ++++++++++++++++++++++
 localedata/locales/sv_SE | 22 ++++++++++++++++++++++
 3 files changed, 62 insertions(+), 3 deletions(-)

diff --git a/localedata/locales/da_DK b/localedata/locales/da_DK
index c5024a4..d1d4087 100644
--- a/localedata/locales/da_DK
+++ b/localedata/locales/da_DK
@@ -137,11 +137,26 @@ translit_start
 include "translit_combining";""
-% Danish.
-% LATIN CAPITAL LETTER A WITH RING ABOVE.
+% LATIN CAPITAL LETTER A WITH DIAERESIS -> "AE"
+<U00C4> "<U0041><U0308>";"<U0041><U0045>"
+% LATIN CAPITAL LETTER A WITH RING ABOVE -> "AA"
 <U00C5> "<U0041><U030A>";"<U0041><U0041>"
-% LATIN SMALL LETTER A WITH RING ABOVE.
+% LATIN CAPITAL LETTER AE -> "AE"
+<U00C6> "<U0041><U0045>"
+% LATIN CAPITAL LETTER O WITH DIAERESIS -> "OE"
+<U00D6> "<U004F><U0308>";"<U004F><U0045>"
+% LATIN CAPITAL LETTER O WITH STROKE -> "OE"
+<U00D8> "<U004F><U0338>";"<U004F><U0045>"
+% LATIN SMALL LETTER A WITH DIAERESIS -> "ae"
+<U00E4> "<U0061><U0308>";"<U0061><U0065>"
+% LATIN SMALL LETTER A WITH RING ABOVE -> "aa"
 <U00E5> "<U0061><U030A>";"<U0061><U0061>"
+% LATIN SMALL LETTER AE -> "ae"
+<U00E6> "<U0061><U0065>"
+% LATIN SMALL LETTER O WITH DIAERESIS -> "oe"
+<U00F6> "<U006F><U0308>";"<U006F><U0065>"
+% LATIN SMALL LETTER O WITH STROKE -> "oe"
+<U00F8> "<U006F><U0338>";"<U006F><U0065>"
 translit_end
diff --git a/localedata/locales/nb_NO b/localedata/locales/nb_NO
index 513d50c..332092a 100644
--- a/localedata/locales/nb_NO
+++ b/localedata/locales/nb_NO
@@ -127,6 +127,28 @@ copy "i18n"
 translit_start
 include "translit_combining";""
+
+% LATIN CAPITAL LETTER A WITH DIAERESIS -> "AE"
+<U00C4> "<U0041><U0308>";"<U0041><U0045>"
+% LATIN CAPITAL LETTER A WITH RING ABOVE -> "AA"
+<U00C5> "<U0041><U030A>";"<U0041><U0041>"
+% LATIN CAPITAL LETTER AE -> "AE"
+<U00C6> "<U0041><U0045>"
+% LATIN CAPITAL LETTER O WITH DIAERESIS -> "OE"
+<U00D6> "<U004F><U0308>";"<U004F><U0045>"
+% LATIN CAPITAL LETTER O WITH STROKE -> "OE"
+<U00D8> "<U004F><U0338>";"<U004F><U0045>"
+% LATIN SMALL LETTER A WITH DIAERESIS -> "ae"
+<U00E4> "<U0061><U0308>";"<U0061><U0065>"
+% LATIN SMALL LETTER A WITH RING ABOVE -> "aa"
+<U00E5> "<U0061><U030A>";"<U0061><U0061>"
+% LATIN SMALL LETTER AE -> "ae"
+<U00E6> "<U0061><U0065>"
+% LATIN SMALL LETTER O WITH DIAERESIS -> "oe"
+<U00F6> "<U006F><U0308>";"<U006F><U0065>"
+% LATIN SMALL LETTER O WITH STROKE -> "oe"
+<U00F8> "<U006F><U0338>";"<U006F><U0065>"
+
 translit_end
 END LC_CTYPE
diff --git a/localedata/locales/sv_SE b/localedata/locales/sv_SE
index ecf7858..92358b9 100644
--- a/localedata/locales/sv_SE
+++ b/localedata/locales/sv_SE
@@ -112,6 +112,28 @@ copy "i18n"
 translit_start
 include "translit_combining";""
+
+% LATIN CAPITAL LETTER A WITH DIAERESIS -> "AE"
+<U00C4> "<U0041><U0308>";"<U0041><U0045>"
+% LATIN CAPITAL LETTER A WITH RING ABOVE -> "AA"
+<U00C5> "<U0041><U030A>";"<U0041><U0041>"
+% LATIN CAPITAL LETTER AE -> "AE"
+<U00C6> "<U0041><U0045>"
+% LATIN CAPITAL LETTER O WITH DIAERESIS -> "OE"
+<U00D6> "<U004F><U0308>";"<U004F><U0045>"
+% LATIN CAPITAL LETTER O WITH STROKE -> "OE"
+<U00D8> "<U004F><U0338>";"<U004F><U0045>"
+% LATIN SMALL LETTER A WITH DIAERESIS -> "ae"
+<U00E4> "<U0061><U0308>";"<U0061><U0065>"
+% LATIN SMALL LETTER A WITH RING ABOVE -> "aa"
+<U00E5> "<U0061><U030A>";"<U0061><U0061>"
+% LATIN SMALL LETTER AE -> "ae"
+<U00E6> "<U0061><U0065>"
+% LATIN SMALL LETTER O WITH DIAERESIS -> "oe"
+<U00F6> "<U006F><U0308>";"<U006F><U0065>"
+% LATIN SMALL LETTER O WITH STROKE -> "oe"
+<U00F8> "<U006F><U0338>";"<U006F><U0065>"
+
 translit_end
 END LC_CTYPE
--
2.4.2
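
For readers unfamiliar with the rule syntax in the patch, the sketch below decodes one rule and picks a replacement. It rests on an assumption about the format rather than on the glibc sources: that alternatives separated by `;` are tried left to right and the first one representable in the target charset wins. The helper names `decode`, `parse_rule` and `pick_replacement` are hypothetical.

```python
import re

CODEPOINT_RE = re.compile(r'<U([0-9A-Fa-f]{4,8})>')


def decode(symbolic):
    """Turn symbolic notation such as '<U0041><U030A>' into the string it denotes."""
    return "".join(chr(int(h, 16)) for h in CODEPOINT_RE.findall(symbolic))


def parse_rule(line):
    """Split one rule line into (source string, list of alternatives in order)."""
    source, _, rest = line.strip().partition(" ")
    alternatives = [decode(part) for part in rest.split(";")]
    return decode(source), alternatives


def pick_replacement(alternatives, charset):
    """Return the first alternative encodable in `charset`, or None if none is."""
    for candidate in alternatives:
        try:
            candidate.encode(charset)
        except UnicodeEncodeError:
            continue
        return candidate
    return None


if __name__ == "__main__":
    # Rule taken from the da_DK hunk: A WITH RING ABOVE.
    source, alternatives = parse_rule('<U00C5> "<U0041><U030A>";"<U0041><U0041>"')
    print(repr(source), [repr(a) for a in alternatives])
    print(pick_replacement(alternatives, "ascii"))
```

Under that assumption, a target charset that can represent U+030A would get "A" plus the combining ring, while plain ASCII falls through to the second alternative, "AA".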