[v5] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]

Message ID d5582688-819b-90c2-3f4a-0d19c932d487@kobylkin.com
State New

Commit Message

Diego (Egor) Kobylkin Oct. 12, 2018, 2:05 p.m. UTC
Dear locale maintainers,

Please fix glibc bug 2872, "Transliteration Cyrillic -> ASCII fails"

https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]

by adding the Cyrillic transliteration table, the file translit_cyrillic

https://sourceware.org/bugzilla/attachment.cgi?id=11317 [7]

to localedata/locales/ and including it in all your locales going forward.

The patch is included inline below.

From this patch I have excluded locales that already mention cyrillic or
have a transliteration table for it:
az_AZ
iso14651_t1_common
ky_KG
mn_MN
sr_RS
tg_TJ
tk_TM
tt_RU
uk_UA
uz_UZ
uz_UZ@cyrillic

Their maintainers are requested to decide explicitly whether and how to
include this patch.

Current bug effect:

The glibc wiki explicitly lists this use case as the test example

https://sourceware.org/glibc/wiki/Locales#Testing_Locales :

LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt

Currently it fails on Cyrillic text in most locales, including ru_RU [1]
[8] [9]:

LC_ALL=ru_RU.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt | grep CYRILLIC

CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.

The result is a string of question marks and spaces.

This is what it should produce, and what it does produce after the patch
is applied:

CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe chayu.


The root problem and the fix:

The root problem is the missing transliteration table that I am
supplying here. In addition, the table has to be included in the active
locale when that locale is compiled; otherwise iconv cannot use it.



COMMIT MESSAGE:
This translit_cyrillic table enables conversion (e.g. with iconv) from
UTF-8 encoded text based on the Cyrillic alphabet to ASCII//TRANSLIT text.

Examples: iconv -f UTF-8 -t ASCII//TRANSLIT produces an ASCII-compatible
transcription, and iconv -f UTF-8 -t ISO-8859-15//TRANSLIT |
iconv -f ISO-8859-15 -t UTF-8 produces a Latin transliteration as per
ISO 9.1995.

While UTF-8 encoded Cyrillic text requires Cyrillic fonts, the result of
a transliteration/transcription uses only Latin/ASCII characters yet can
still be read by a native speaker. Among other things, this is useful for
processing Cyrillic texts and filenames with programs or on systems
that are not specifically prepared to work with Cyrillic, lack the
corresponding fonts, or cannot handle UTF-8.

The transliteration table itself is attached as the file translit_cyrillic
[7]. Its mapping is based on the ISO 9.1995 standard [10] and its
derivative GOST 7.79-2000, whose official source is the Federal Agency on
Technical Regulating and Metrology of the Russian Federation [2]. In
practice, an independent but mostly identical source [3] was used, and
the table was prepared in a spreadsheet [6].

The documentation suggests that a transliteration table is included by
adding the line *include "translit_cyrillic";""* to the translit_start
section of LC_CTYPE
http://man7.org/linux/man-pages/man5/locale.5.html [5]
In practice, I searched for all locales that have a translit_start/end
stanza and generated a patch for them.
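
The patch generation described above can be sketched roughly as follows.
This is not the script that produced the patch; it is a minimal
illustration, assuming a locale file is processed as a text string and
that the include goes on the line just before the first translit_end.
The function name add_cyrillic_include is hypothetical.

```python
INCLUDE_LINE = 'include "translit_cyrillic";""'


def add_cyrillic_include(locale_text):
    """Return locale_text with INCLUDE_LINE placed just before translit_end.

    Text without a translit_end line, or text that already contains the
    include, is returned unchanged.
    """
    lines = locale_text.splitlines(keepends=True)
    if any(line.strip() == INCLUDE_LINE for line in lines):
        return locale_text
    for index, line in enumerate(lines):
        if line.strip() == "translit_end":
            lines.insert(index, INCLUDE_LINE + "\n")
            return "".join(lines)
    return locale_text


# A made-up LC_CTYPE fragment shaped like the simplest locales in the patch.
sample = (
    "translit_start\n"
    'include "translit_combining";""\n'
    "translit_end\n"
    "END LC_CTYPE\n"
)
patched = add_cyrillic_include(sample)
print(patched)
```

A file without a translit_end line is left untouched by this sketch, so
such files drop out of the patch without being listed explicitly.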

The Cyrillic transliteration of e.g. Russian text may have already
worked to some extent for mn_MN, sr_RS, tk_TM, uz_UZ, uk_UA locales that
have their transliteration tables included inline.

I am excluding these locales from this proposed patch. I have written
directly to locale maintainer emails listed in the files. Volodymyr
Lisivka <vlisivka@gmail.com>, Max Kutny <mkutny@gmail.com> (uk_UA),
Данило Шеган <danilo@gnome.org>  (sr_RS) have confirmed the
exclusion.

Links:

[1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
[2] GOST 7.79-2000 official source
http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
available in low quality gif format)
[3] http://transliteration.ru/gost-7-79-2000/ and
http://www.yfermer.ru/specifications/285821.html
[4] Wikipedia article on Cyrillic transliteration with Latin alphabet
https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
[5] http://man7.org/linux/man-pages/man5/locale.5.html
[6] Spreadsheet for generating translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=11301
[7] translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=11317
[8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
[9] translit-test-input.txt
https://sourceware.org/bugzilla/attachment.cgi?id=11304
[10] https://en.wikipedia.org/wiki/ISO_9#ISO_9:1995,_or_GOST_7.79_System_A

Best regards,
Egor Kobylkin

---
2018-10-11  Egor Kobylkin  <egor@kobylkin.com>

	[BZ #2872]
	* localedata/locales/translit_cyrillic: Add ISO 9.1995, GOST 7.79
	System A transliteration, System B transcription table from Cyrillic
	to Latin/ASCII.
	* localedata/locales/C: Add include "translit_cyrillic";"" to LC_CTYPE
	translit section.
	* localedata/locales/aa_DJ: Likewise.
	* localedata/locales/af_ZA: Likewise.
	* localedata/locales/ak_GH: Likewise.
	* localedata/locales/am_ET: Likewise.
	* localedata/locales/ar_EG: Likewise.
	* localedata/locales/be_BY: Likewise.
	* localedata/locales/bem_ZM: Likewise.
	* localedata/locales/ber_DZ: Likewise.
	* localedata/locales/ber_MA: Likewise.
	* localedata/locales/bg_BG: Likewise.
	* localedata/locales/bi_VU: Likewise.
	* localedata/locales/bn_BD: Likewise.
	* localedata/locales/bo_CN: Likewise.
	* localedata/locales/ca_ES: Likewise.
	* localedata/locales/ce_RU: Likewise.
	* localedata/locales/cmn_TW: Likewise.
	* localedata/locales/cs_CZ: Likewise.
	* localedata/locales/cv_RU: Likewise.
	* localedata/locales/cy_GB: Likewise.
	* localedata/locales/da_DK: Likewise.
	* localedata/locales/de_DE: Likewise.
	* localedata/locales/dv_MV: Likewise.
	* localedata/locales/dz_BT: Likewise.
	* localedata/locales/el_GR: Likewise.
	* localedata/locales/en_GB: Likewise.
	* localedata/locales/en_NG: Likewise.
	* localedata/locales/en_ZM: Likewise.
	* localedata/locales/es_CU: Likewise.
	* localedata/locales/es_ES: Likewise.
	* localedata/locales/et_EE: Likewise.
	* localedata/locales/fa_IR: Likewise.
	* localedata/locales/ff_SN: Likewise.
	* localedata/locales/fi_FI: Likewise.
	* localedata/locales/fr_FR: Likewise.
	* localedata/locales/ga_IE: Likewise.
	* localedata/locales/gd_GB: Likewise.
	* localedata/locales/gu_IN: Likewise.
	* localedata/locales/gv_GB: Likewise.
	* localedata/locales/he_IL: Likewise.
	* localedata/locales/hi_IN: Likewise.
	* localedata/locales/hif_FJ: Likewise.
	* localedata/locales/hr_HR: Likewise.
	* localedata/locales/ht_HT: Likewise.
	* localedata/locales/hu_HU: Likewise.
	* localedata/locales/hy_AM: Likewise.
	* localedata/locales/id_ID: Likewise.
	* localedata/locales/is_IS: Likewise.
	* localedata/locales/it_IT: Likewise.
	* localedata/locales/ja_JP: Likewise.
	* localedata/locales/kab_DZ: Likewise.
	* localedata/locales/kk_KZ: Likewise.
	* localedata/locales/km_KH: Likewise.
	* localedata/locales/kn_IN: Likewise.
	* localedata/locales/ko_KR: Likewise.
	* localedata/locales/ks_IN: Likewise.
	* localedata/locales/kw_GB: Likewise.
	* localedata/locales/lb_LU: Likewise.
	* localedata/locales/lg_UG: Likewise.
	* localedata/locales/lij_IT: Likewise.
	* localedata/locales/ln_CD: Likewise.
	* localedata/locales/lo_LA: Likewise.
	* localedata/locales/lt_LT: Likewise.
	* localedata/locales/lv_LV: Likewise.
	* localedata/locales/mg_MG: Likewise.
	* localedata/locales/mhr_RU: Likewise.
	* localedata/locales/mk_MK: Likewise.
	* localedata/locales/ml_IN: Likewise.
	* localedata/locales/ms_MY: Likewise.
	* localedata/locales/mt_MT: Likewise.
	* localedata/locales/nan_TW@latin: Likewise.
	* localedata/locales/nb_NO: Likewise.
	* localedata/locales/ne_NP: Likewise.
	* localedata/locales/nhn_MX: Likewise.
	* localedata/locales/niu_NU: Likewise.
	* localedata/locales/niu_NZ: Likewise.
	* localedata/locales/nl_NL: Likewise.
	* localedata/locales/nr_ZA: Likewise.
	* localedata/locales/oc_FR: Likewise.
	* localedata/locales/om_KE: Likewise.
	* localedata/locales/or_IN: Likewise.
	* localedata/locales/os_RU: Likewise.
	* localedata/locales/pa_IN: Likewise.
	* localedata/locales/pa_PK: Likewise.
	* localedata/locales/pl_PL: Likewise.
	* localedata/locales/pt_PT: Likewise.
	* localedata/locales/quz_PE: Likewise.
	* localedata/locales/ro_RO: Likewise.
	* localedata/locales/ru_RU: Likewise.
	* localedata/locales/rw_RW: Likewise.
	* localedata/locales/sa_IN: Likewise.
	* localedata/locales/sd_IN: Likewise.
	* localedata/locales/sd_IN@devanagari: Likewise.
	* localedata/locales/sd_PK: Likewise.
	* localedata/locales/se_NO: Likewise.
	* localedata/locales/sgs_LT: Likewise.
	* localedata/locales/shn_MM: Likewise.
	* localedata/locales/si_LK: Likewise.
	* localedata/locales/sk_SK: Likewise.
	* localedata/locales/sl_SI: Likewise.
	* localedata/locales/sm_WS: Likewise.
	* localedata/locales/so_SO: Likewise.
	* localedata/locales/sq_AL: Likewise.
	* localedata/locales/ss_ZA: Likewise.
	* localedata/locales/st_ZA: Likewise.
	* localedata/locales/sv_SE: Likewise.
	* localedata/locales/sw_KE: Likewise.
	* localedata/locales/ta_IN: Likewise.
	* localedata/locales/te_IN: Likewise.
	* localedata/locales/th_TH: Likewise.
	* localedata/locales/ti_ET: Likewise.
	* localedata/locales/tn_ZA: Likewise.
	* localedata/locales/to_TO: Likewise.
	* localedata/locales/tpi_PG: Likewise.
	* localedata/locales/tr_TR: Likewise.
	* localedata/locales/ts_ZA: Likewise.
	* localedata/locales/unm_US: Likewise.
	* localedata/locales/ur_IN: Likewise.
	* localedata/locales/ur_PK: Likewise.
	* localedata/locales/ve_ZA: Likewise.
	* localedata/locales/vi_VN: Likewise.
	* localedata/locales/wa_BE: Likewise.
	* localedata/locales/wo_SN: Likewise.
	* localedata/locales/xh_ZA: Likewise.
	* localedata/locales/yi_US: Likewise.
	* localedata/locales/yuw_PG: Likewise.
	* localedata/locales/zh_CN: Likewise.
	* localedata/locales/zu_ZA: Likewise.

Comments

Rafal Luzynski Oct. 13, 2018, 12:59 a.m. UTC | #1
Egor,

Thank you for the update.  I took a closer look at your patch so this
time my review is more complete than before although not yet fully complete.

As far as I understand, ISO-9 and its GOST variants are meant to be
universal rather than Russian-specific.  Therefore it is correct to place
them in the external file, like translit_cyrillic, and then include this
file in other locales adding locale specific modifications, if required.
For example, if there are any Russian-specific rules not included in this
file, they should go to ru_RU.

The text of the ISO-9 standard is not publicly available.  Have we got
anything better than an article in Wikipedia?

Regarding the format of your commit message, I hesitate to say anything
more because there are more experienced maintainers around here.  Please
take a look at the Contribution Checklist. [1]

While we are at it, what is your legal relationship with the glibc project?
Have you signed the FSF Copyright Assignment?  It is not necessary for the
locale data but it might be necessary if you are going to contribute the
testing code.

Regarding the tests, I think there is no complete transliteration test
suite at the moment.  Probably the only test is localedata/bug-iconv-trans.c.
You can also see the collation tests placed in the same directory, they
use those multiple *.UTF-8.in files.

You can skip the tests for now.

Technical issue:  Please either attach your patch to the email message or
paste it inline, not both.  The patch as it is now is not applicable.
I had to edit it manually to apply.


12.10.2018 16:05 Egor Kobylkin <egor@kobylkin.com> wrote:
> [...]
> From this patch I have excluded locales that already mention cyrillic or
> have a transliteration table for it:
> az_AZ
> iso14651_t1_common
> ky_KG
> mn_MN
> sr_RS
> tg_TJ
> tk_TM
> tt_RU
> uk_UA
> uz_UZ
> uz_UZ@cyrillic

I confirm that these locales are excluded and there are no other missing
locales.

> [...]
>
> diff -uNr a/localedata/locales/C b/localedata/locales/C
> --- a/localedata/locales/C 2018-10-11 15:10:12.000000000 +0000
> +++ b/localedata/locales/C 2018-10-11 15:10:43.000000000 +0000

There is no such file.  Where have you got the source code from?  Are you
sure this is glibc? :-)

> [...]
> diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
> --- a/localedata/locales/am_ET 2018-10-11 15:10:11.000000000 +0000
> +++ b/localedata/locales/am_ET 2018-10-11 15:10:43.000000000 +0000
> @@ -1394,6 +1394,7 @@
> <U137A> <U0060><U0039><U0030>
> <U137B> <U0060><U0031><U0030><U0030>
> <U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
> +include "translit_cyrillic";""
> translit_end
> %
> END LC_CTYPE

Shouldn't “include "translit_cyrillic";""” be placed before the custom rules,
together with other includes?  The same in more files, I will not mention
them all.

> [...]
> diff -uNr a/localedata/locales/sd_IN@devanagari
> b/localedata/locales/sd_IN@devanagari
> --- a/localedata/locales/sd_IN@devanagari 2018-10-11 15:10:18.000000000
> +0000
> +++ b/localedata/locales/sd_IN@devanagari 2018-10-11 15:10:49.000000000
> +0000

Those 3 lines have been broken by the email agent, the patch is not applicable.

> [...]
> diff -uNr a/localedata/locales/sd_PK b/localedata/locales/sd_PK
> --- a/localedata/locales/sd_PK 2018-10-11 15:10:18.000000000 +0000
> +++ b/localedata/locales/sd_PK 2018-10-11 15:10:49.000000000 +0000

There is no such file in glibc.

> [...]
> diff -uNr a/localedata/locales/translit_cyrillic
> b/localedata/locales/translit_cyrillic
> --- a/localedata/locales/translit_cyrillic 1970-01-01 00:00:00.000000000
> +0000
> +++ b/localedata/locales/translit_cyrillic 2018-10-11 15:10:52.000000000
> +0000

Again 3 lines broken, the patch is not applicable.

> [...]
> +% Contributions welcome for the rest of Cyrillic script in Unicode
> +% https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode.

I am still tempted to add more Cyrillic characters, but I understand
that we must clearly separate the transliteration rules that come from
ISO-9 from those that are our own invention.  But that's not for now.

> [...]
> +translit_start
> +
> +% CYRILLIC CAPITAL LETTER IO
> +<U0401> <U00CB>;"<U0059><U004F>"

This says that for ASCII (GOST 7.79 System B) you would like to transliterate
"Ё" as "YO" but the table in Wikipedia says "Yo".  I understand that one or
another may be correct depending on the context but we should be consistent
and also better let's stick with the standard.
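
For readers less used to the <Uxxxx> notation, here is a small sketch that
decodes a rule line of this shape into readable characters.  It is a
simplified illustration, not glibc's parser: it assumes one rule per line,
a single run of whitespace between source and targets, and no ";" inside
a quoted target.  The names decode and parse_rule are hypothetical.

```python
import re

# Matches one <Uxxxx> code point token as used in locale source files.
UCS_TOKEN = re.compile(r"<U([0-9A-Fa-f]{4,8})>")


def decode(token):
    """Turn '<U0059><U004F>' (optionally double-quoted) into 'YO'."""
    text = token.strip().strip('"')
    return UCS_TOKEN.sub(lambda m: chr(int(m.group(1), 16)), text)


def parse_rule(line):
    """Split a rule into its source string and its ordered alternatives."""
    source, targets = line.split(None, 1)
    return decode(source), [decode(t) for t in targets.split(";")]


source, alternatives = parse_rule('<U0401> <U00CB>;"<U0059><U004F>"')
print(source, alternatives)  # prints: Ё ['Ë', 'YO']
```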

> +% CYRILLIC CAPITAL LETTER DJE
> +<U0402> <U0110>;"<U0044><U004A>"

This says "DJ" but System B does not mention it.  Where does it come from?
Also, I think it should be "Dj" rather than "DJ".

> +% CYRILLIC CAPITAL LETTER GJE
> +<U0403> <U01F4>;"<U0047><U0060>"

Correct, according to both systems.

> +% CYRILLIC CAPITAL LETTER UKRAINIAN IE
> +<U0404> <U00CA>;"<U0059><U0065>"

"Ye" - correct.

> +% CYRILLIC CAPITAL LETTER DZE
> +<U0405> <U1E90>;"<U005A><U0060>"

Correct.

> +% CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
> +<U0406> <U00CC>;<U0049>

Correct.  The table mentions an alternative transliteration "I`" but
says that it is "only before vowels for Old Russian and Old Bulgarian".
I think we can skip this other variant.

> +% CYRILLIC CAPITAL LETTER YI
> +<U0407> <U00CF>;"<U0059><U0069>"

"Yi" - correct.

> +% CYRILLIC CAPITAL LETTER JE
> +<U0408> "<U004A><U030C>";<U004A>

Correct.

> +% CYRILLIC CAPITAL LETTER LJE
> +<U0409> "<U004C><U0302>";"<U004C><U0060>"

Correct, according to the standard.  If Serbian language requires "Lj"
then overrides should go to sr_RS file.

> +% CYRILLIC CAPITAL LETTER NJE
> +<U040A> "<U004E><U0302>";"<U004E><U0060>"

Correct, the same comment.

> +% CYRILLIC CAPITAL LETTER TSHE
> +<U040B> <U0106>;"<U0054><U0053><U0048>"

Where does "TSH" come from?  It is not mentioned by the System B table.
Also I am afraid this is not correct.

> +% CYRILLIC CAPITAL LETTER KJE
> +<U040C> <U1E30>;"<U004B><U0060>"

Correct.

> +% CYRILLIC CAPITAL LETTER SHORT U
> +<U040E> <U016C>;"<U0055><U0060>"

"U`" - correct.

> +% CYRILLIC CAPITAL LETTER DZHE
> +<U040F> "<U0044><U0302>";"<U0044><U0068>"

"Dh" - correct.

> [...]
> +% CYRILLIC CAPITAL LETTER ZHE
> +<U0416> <U017D>;"<U005A><U0048>"

"ZH" - shouldn't be "Zh"?

> [...]
> +% CYRILLIC UNDEFINED
> +<U0423><U0301> <U00DA>;"<U0055><U0060>"

1. I think it should be named "CYRILLIC CAPITAL LETTER U WITH ACUTE".
2. OK, the System A table mentions this letter but System B does not.
   Somehow we should handle it.  I think that "U`" is the best we can
   do for now.
3. It must be tested whether this actually works.

> [...]
> +% CYRILLIC CAPITAL LETTER HA
> +<U0425> <U0048>;<U0058>

I don't think that "H" is unavailable in any encoding therefore it will
always be transliterated as "H" and never as "X".  We can't help it and
I don't think it is bad.
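
The point can be illustrated with a small model of the fallback: take the
alternatives in the order written and use the first one the target
character set can represent.  This is only a sketch under that assumption,
using Python codecs as a stand-in for iconv; first_representable is a
hypothetical name, not a glibc function.

```python
def first_representable(alternatives, target_encoding):
    """Return the first alternative encodable in target_encoding, or None."""
    for candidate in alternatives:
        try:
            candidate.encode(target_encoding)
        except UnicodeEncodeError:
            continue
        return candidate
    return None


HA = ["H", "X"]        # <U0425> <U0048>;<U0058>
IO = ["\u00cb", "YO"]  # <U0401> <U00CB>;"<U0059><U004F>"

print(first_representable(HA, "ascii"))        # H
print(first_representable(HA, "iso-8859-15"))  # H
print(first_representable(IO, "ascii"))        # YO
print(first_representable(IO, "iso-8859-15"))  # Ë
```

In this model "X" is never reached, because "H" is representable in every
target, while the second alternative for IO is reached whenever the target
lacks "Ë".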

> +% CYRILLIC CAPITAL LETTER TSE
> +<U0426> <U0043>;"<U0043><U005A>"

1. "CZ" - maybe should be "Cz"?
2. Are we able to implement the rule: "c before i, e, y, j"?

> +% CYRILLIC CAPITAL LETTER CHE
> +<U0427> <U010C>;"<U0043><U0048>"

"CH" -> "Ch"?

> +% CYRILLIC CAPITAL LETTER SHA
> +<U0428> <U0160>;"<U0053><U0048>"

"SH" -> "Sh"?

> +% CYRILLIC CAPITAL LETTER SHCHA
> +<U0429> <U015C>;"<U0053><U0048><U0048>"

"SHH" -> "Shh"?

> +% CYRILLIC CAPITAL LETTER HARD SIGN
> +<U042A> <U02BA>;"<U0041><U0060>"

"A`" is only for Bulgarian and should go to bg_BG.  How should
we transliterate an upper case hard sign to plain ASCII?  I think
that just "``", same as lower case.

> +% CYRILLIC CAPITAL LETTER YERU
> +<U042B> <U0059>;"<U0059><U0060>"

Again, as "Y" is always available it will never be transliterated
as "Y`".

> +% CYRILLIC CAPITAL LETTER SOFT SIGN
> +<U042C> <U02B9>;<U0060>

OK, I like it to be transliterated to plain ASCII as "`".

> +% CYRILLIC CAPITAL LETTER E
> +<U042D> <U00C8>;"<U0045><U0060>"

OK

> +% CYRILLIC CAPITAL LETTER YU
> +<U042E> <U00DB>;"<U0059><U0055>"

"YU" -> "Yu"?

> +% CYRILLIC CAPITAL LETTER YA
> +<U042F> <U00C2>;"<U0059><U0041>"

"YA" -> "Ya"?

> [...]

I am sorry, this is of course incomplete but that's enough for tonight.

Regards,

Rafal


[1] https://sourceware.org/glibc/wiki/Contribution%20checklist

Diego (Egor) Kobylkin Oct. 13, 2018, 4:58 p.m. UTC | #2
Hi Rafal,

Thanks for the thorough checking, it really helps.

On 13.10.2018 02:59, Rafal Luzynski wrote:
> Technical issue:  Please either attach your patch to the email 
> message or paste it inline, not both.  The patch as it is now is not 
> applicable. I had to edit it manually to apply.
>> diff -uNr a/localedata/locales/C b/localedata/locales/C
>> --- a/localedata/locales/C 2018-10-11 15:10:12.000000000 +0000
>> +++ b/localedata/locales/C 2018-10-11 15:10:43.000000000 +0000
> 
> There is no such file.  Where have you got the source code from?
> Are you sure this is glibc? :-)

I was running my patch process against the Ubuntu 18.04 version of
localedata/locales. Now I have checked out the GitHub glibc source v2.28
and done the same. Please find the new patch attached. I am not
submitting it as a patch request because we have not yet addressed the
rest of your comments below. But at least this should be working as a
patch for you. Please let me know if there is any problem there still.

>> [...]
>> From this patch I have excluded locales that already mention cyrillic or
>> have a transliteration table for it:
>> az_AZ
>> iso14651_t1_common
>> ky_KG
>> mn_MN
>> sr_RS
>> tg_TJ
>> tk_TM
>> tt_RU
>> uk_UA
>> uz_UZ
>> uz_UZ@cyrillic
> 
> I confirm that these locales are excluded and there are no other 
> missing locales.

Because the list of locales differs surprisingly between Ubuntu and
glibc, the list of excluded locales is now different as well:

mn_MN
sr_RS
tg_TJ
tk_TM
tt_RU
uk_UA
uz_UZ
uz_UZ@cyrillic

az_AZ, ky_KG are now included because they don't have cyrillic translit
in glibc. iso14651_t1_common is still implicitly excluded, because it
doesn't have 'translit_end' string.

For some reason az_AZ and tr_TR from glibc fail to transliterate Cyrillic
even after the patch is applied (az_AZ explicitly includes tr_TR). I do
not see the reason; maybe you could check?


> Regarding the tests, I think there is no complete transliteration 
> test suite at the moment.  Probably the only test is 
> localedata/bug-iconv-trans.c. You can also see the collation tests 
> placed in the same directory, they use those multiple *.UTF-8.in 
> files.
> 
> You can skip the tests for now.

In the copy of localedata/bug-iconv-trans.c lines 10-11 we could just
change the list of the symbols we are now transliterating

  const char str[] = "ÄäÖöÜüß";
  const char expected[] = "AEaeOEoeUEuess";

like this

  const char str[] =
    "ЁЂЃЄЅІЇЈЉЊЋЌЎЏАБВГДЕЖЗИЙКЛМНОПРСТУУ́ФХЦЧШЩъЫьЭЮЯабвгдежзийклмнопрстуу́фхцчшщЪыЬэюяёђѓєѕіїјљњћќўџѪѫѲѳѴѵҌҍ"
    "ҐґҒғҔҕҖҗҚқҞҟҢңҤҥҦҧҨҩҪҫҬҭҮүҲҳҴҵҺһҼҽҾҿӀӁӂӋӌӐӑӒӓӖӗӘәӜӝӞӟӠӡӤӥӦӧӨөӰӱӲӳӴӵӸӹ’";
  /* Adjacent string literals are concatenated by the compiler.  */
  const char expected[] =
    "YODJG`YEZ`IYIJL`N`TSHK`U`DHABVGDEZHZIJKLMNOPRSTUU`FXCZCHSHSHHA`Y``E`YUYAabvgdezhzijklmnoprstuu`fxczchsh"
    "shh``y``e`yuyayodjg`yez`iyijl`n`tshk`u`dhO`o`FHfhYHyhE`e`G`g`GHghGHghZH`zh`K`k`K`k`N`n`NGngP`p`O`o`C`C`"
    "T`t`UuH`h`TCZtczSH`SH`CH`ch`CH`ch`iZH`zh`CH`ch`A`a`A`a`E`e`A`a`ZH`zh`Z`z`Z`z`I`i`O`o`O`o`U`u`U`u`CH`ch`"
    "Y`y`'";

At first I thought they could just be added, but not all locales
transliterate umlauts, so simply extending the current test won't do: it
would fail for those locales.


>> [...]
>> diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
>> --- a/localedata/locales/am_ET 2018-10-11 15:10:11.000000000 +0000
>> +++ b/localedata/locales/am_ET 2018-10-11 15:10:43.000000000 +0000
>> @@ -1394,6 +1394,7 @@
>> <U137A> <U0060><U0039><U0030>
>> <U137B> <U0060><U0031><U0030><U0030>
>> <U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
>> +include "translit_cyrillic";""
>> translit_end
>> %
>> END LC_CTYPE
> 
> Shouldn't “include "translit_cyrillic";""” be placed before the 
> custom rules, together with other includes?  The same in more files, 
> I will not mention them all.

If I recall correctly it is because of the
"translit_end
END LC_CTYPE"
part at the end of the translit_cyrillic. This way it works for any
locale, regardless whether it has translit itself or not. And being at
the end it does not supersede any previous transliteration that may be
there for a reason.

As with some other comments, I am not very familiar with the formats of
glibc files. So if you have a definitive suggestion, please phrase it
as an instruction, not a question.


>> [...]
>> +translit_start
>> +
>> +% CYRILLIC CAPITAL LETTER IO
>> +<U0401> <U00CB>;"<U0059><U004F>"
> 
> This says that for ASCII (GOST 7.79 System B) you would like to 
> transliterate "Ё" as "YO" but the table in Wikipedia says "Yo".  I 
> understand that one or another may be correct depending on the 
> context but we should be consistent and also better let's stick with 
> the standard.

The choice for YO, SH, YA, ZH etc. is to avoid naming collisions for
example for "Сх" and "Ш" that would both transliterate to Sh:
With SH:"Схема"->"Shema" but "Шема"->"SHema"
With Sh:"Схема"->"Shema" and "Шема"->"Shema". Collision!
This is important e.g. for renaming files, grouping as in using uniq etc.
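
The collision can be reproduced with a toy mapping.  This is only a sketch
covering the letters of the two example words, not the translit_cyrillic
table itself, and the helper name transliterate is hypothetical.

```python
# Letters needed for the two example words, given as code points so that
# Cyrillic and look-alike Latin letters cannot be confused.
COMMON = {
    "\u0421": "S",  # CYRILLIC CAPITAL LETTER ES
    "\u0445": "h",  # CYRILLIC SMALL LETTER HA
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043c": "m",  # CYRILLIC SMALL LETTER EM
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
}
SHA = "\u0428"      # CYRILLIC CAPITAL LETTER SHA


def transliterate(text, table):
    """Map each character through table; unknown characters become '?'."""
    return "".join(table.get(ch, "?") for ch in text)


words = ["\u0421\u0445\u0435\u043c\u0430", SHA + "\u0435\u043c\u0430"]

with_upper = [transliterate(w, {**COMMON, SHA: "SH"}) for w in words]
with_title = [transliterate(w, {**COMMON, SHA: "Sh"}) for w in words]

print(with_upper)  # ['Shema', 'SHema']
print(with_title)  # ['Shema', 'Shema']
```

With "SH" the two inputs stay distinct; with "Sh" they collapse into one
output, which is what breaks renaming and uniq-style grouping.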

> 
>> +% CYRILLIC CAPITAL LETTER DJE
>> +<U0402> <U0110>;"<U0044><U004A>"
> 
> This says "DJ" but System B does not mention it.  Where does it come 
> from? Also, I think it should be "Dj" rather than "DJ".
I took the first two letters from its name.


>> [...]
>> +% CYRILLIC UNDEFINED
>> +<U0423><U0301> <U00DA>;"<U0055><U0060>"
> 
> 1. I think it should be named "CYRILLIC CAPITAL LETTER U WITH ACUTE".
> 2. OK, the System A table mentions this letter but System B does not.
>    Somehow we should handle it.  I think that "U`" is the best we can
>    do for now.
> 3. It must be tested whether this actually works.
1. Let's do it just before you are ready to commit the patch, because it
breaks formulas in my worksheet and I will have to do it manually?
3. I have tested and it doesn't work/gets ignored. But if you were to
handle COMBINING it would work, wouldn't it?


>> [...]
>> +% CYRILLIC CAPITAL LETTER HA
>> +<U0425> <U0048>;<U0058>
> 
> I don't think that "H" is unavailable in any encoding therefore it 
> will always be transliterated as "H" and never as "X".  We can't
> help it and I don't think it is bad.
> 
But we can keep this for when/if there is a way to explicitly request
transcription instead of transliteration.

>> +% CYRILLIC CAPITAL LETTER TSE
>> +<U0426> <U0043>;"<U0043><U005A>"
> 
> 1. "CZ" - maybe should be "Cz"?
> 2. Are we able to implement the rule: "c before i, e, y, j"?
> 
1. See my answer for CYRILLIC CAPITAL LETTER IO.
2. I am not sure what you mean here, but I believe it's not possible,
as per Marko's email.


>> +% CYRILLIC CAPITAL LETTER HARD SIGN
>> +<U042A> <U02BA>;"<U0041><U0060>"
> 
> "A`" is only for Bulgarian and should go to bg_BG.  How should we 
> transliterate an upper case hard sign to plain ASCII?  I think that 
> just "``", same as lower case.
This is to avoid collision. Besides AFAIK e.g. in Russian there is no
capital hard sign because there are no words starting with it.

> 
>> +% CYRILLIC CAPITAL LETTER YERU
>> +<U042B> <U0059>;"<U0059><U0060>"
> 
> Again, as "Y" is always available it will never be transliterated as 
> "Y`".
> 
But we can keep this for when/if there is a way to explicitly request
transcription instead of transliteration.


Bests,
Diego
diff -uNr a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
--- a/localedata/locales/aa_DJ	2018-10-13 16:52:32.218372934 +0000
+++ b/localedata/locales/aa_DJ	2018-10-13 16:52:32.666374687 +0000
@@ -68,6 +68,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/af_ZA b/localedata/locales/af_ZA
--- a/localedata/locales/af_ZA	2018-10-13 16:52:32.218372934 +0000
+++ b/localedata/locales/af_ZA	2018-10-13 16:52:32.442373810 +0000
@@ -70,6 +70,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ak_GH b/localedata/locales/ak_GH
--- a/localedata/locales/ak_GH	2018-10-13 16:52:32.218372934 +0000
+++ b/localedata/locales/ak_GH	2018-10-13 16:52:32.774375109 +0000
@@ -54,6 +54,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
--- a/localedata/locales/am_ET	2018-10-13 16:52:32.218372934 +0000
+++ b/localedata/locales/am_ET	2018-10-13 16:52:32.466373904 +0000
@@ -893,6 +893,7 @@
 <U137A>    <U0060><U0039><U0030>
 <U137B>    <U0060><U0031><U0030><U0030>
 <U137C>    <U0060><U0031><U0030><U0030><U0030><U0030>
+include "translit_cyrillic";""
 translit_end
 %
 END LC_CTYPE
diff -uNr a/localedata/locales/ar_EG b/localedata/locales/ar_EG
--- a/localedata/locales/ar_EG	2018-10-13 16:52:32.218372934 +0000
+++ b/localedata/locales/ar_EG	2018-10-13 16:52:32.806375234 +0000
@@ -44,6 +44,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/az_AZ b/localedata/locales/az_AZ
--- a/localedata/locales/az_AZ	2018-10-13 16:52:32.222372949 +0000
+++ b/localedata/locales/az_AZ	2018-10-13 16:52:32.494374014 +0000
@@ -136,6 +136,7 @@
 <U0259> "<U00E4>"
 <U018F> "<U00C4>"
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/be_BY b/localedata/locales/be_BY
--- a/localedata/locales/be_BY	2018-10-13 16:52:32.222372949 +0000
+++ b/localedata/locales/be_BY	2018-10-13 16:52:32.518374107 +0000
@@ -91,6 +91,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
--- a/localedata/locales/bem_ZM	2018-10-13 16:52:32.222372949 +0000
+++ b/localedata/locales/bem_ZM	2018-10-13 16:52:32.674374718 +0000
@@ -41,6 +41,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
--- a/localedata/locales/ber_DZ	2018-10-13 16:52:32.222372949 +0000
+++ b/localedata/locales/ber_DZ	2018-10-13 16:52:32.878375516 +0000
@@ -136,6 +136,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ber_MA b/localedata/locales/ber_MA
--- a/localedata/locales/ber_MA	2018-10-13 16:52:32.222372949 +0000
+++ b/localedata/locales/ber_MA	2018-10-13 16:52:32.858375438 +0000
@@ -83,6 +83,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/bg_BG b/localedata/locales/bg_BG
--- a/localedata/locales/bg_BG	2018-10-13 16:52:32.222372949 +0000
+++ b/localedata/locales/bg_BG	2018-10-13 16:52:32.446373826 +0000
@@ -49,6 +49,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/bi_VU b/localedata/locales/bi_VU
--- a/localedata/locales/bi_VU	2018-10-13 16:52:32.226372965 +0000
+++ b/localedata/locales/bi_VU	2018-10-13 16:52:32.786375156 +0000
@@ -39,6 +39,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/bn_BD b/localedata/locales/bn_BD
--- a/localedata/locales/bn_BD	2018-10-13 16:52:32.226372965 +0000
+++ b/localedata/locales/bn_BD	2018-10-13 16:52:32.766375078 +0000
@@ -61,6 +61,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/bo_CN b/localedata/locales/bo_CN
--- a/localedata/locales/bo_CN	2018-10-13 16:52:32.226372965 +0000
+++ b/localedata/locales/bo_CN	2018-10-13 16:52:32.930375719 +0000
@@ -43,6 +43,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ca_ES b/localedata/locales/ca_ES
--- a/localedata/locales/ca_ES	2018-10-13 16:52:32.226372965 +0000
+++ b/localedata/locales/ca_ES	2018-10-13 16:52:32.930375719 +0000
@@ -57,6 +57,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ce_RU b/localedata/locales/ce_RU
--- a/localedata/locales/ce_RU	2018-10-13 16:52:32.226372965 +0000
+++ b/localedata/locales/ce_RU	2018-10-13 16:52:32.490373998 +0000
@@ -38,6 +38,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/cmn_TW b/localedata/locales/cmn_TW
--- a/localedata/locales/cmn_TW	2018-10-13 16:52:32.226372965 +0000
+++ b/localedata/locales/cmn_TW	2018-10-13 16:52:32.670374702 +0000
@@ -49,6 +49,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 
 class	"hanzi"; /
diff -uNr a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
--- a/localedata/locales/cs_CZ	2018-10-13 16:52:32.238373012 +0000
+++ b/localedata/locales/cs_CZ	2018-10-13 16:52:32.874375500 +0000
@@ -215,6 +215,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/cv_RU b/localedata/locales/cv_RU
--- a/localedata/locales/cv_RU	2018-10-13 16:52:32.238373012 +0000
+++ b/localedata/locales/cv_RU	2018-10-13 16:52:32.610374468 +0000
@@ -103,6 +103,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/cy_GB b/localedata/locales/cy_GB
--- a/localedata/locales/cy_GB	2018-10-13 16:52:32.242373028 +0000
+++ b/localedata/locales/cy_GB	2018-10-13 16:52:32.434373779 +0000
@@ -65,6 +65,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/da_DK b/localedata/locales/da_DK
--- a/localedata/locales/da_DK	2018-10-13 16:52:32.242373028 +0000
+++ b/localedata/locales/da_DK	2018-10-13 16:52:32.894375579 +0000
@@ -169,6 +169,7 @@
 % LATIN SMALL LETTER O WITH STROKE -> "oe"
 <U00F8> "<U006F><U0338>";"<U006F><U0065>"
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/de_DE b/localedata/locales/de_DE
--- a/localedata/locales/de_DE	2018-10-13 16:52:32.242373028 +0000
+++ b/localedata/locales/de_DE	2018-10-13 16:52:32.898375594 +0000
@@ -78,6 +78,7 @@
 % DOUBLE HIGH-REVERSED-9 QUOTATION MARK
 <U201F> <U00AB>;<U0022>
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/dv_MV b/localedata/locales/dv_MV
--- a/localedata/locales/dv_MV	2018-10-13 16:52:32.242373028 +0000
+++ b/localedata/locales/dv_MV	2018-10-13 16:52:32.842375375 +0000
@@ -51,6 +51,7 @@
 include "translit_combining";""
 
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/dz_BT b/localedata/locales/dz_BT
--- a/localedata/locales/dz_BT	2018-10-13 16:52:32.242373028 +0000
+++ b/localedata/locales/dz_BT	2018-10-13 16:52:32.838375360 +0000
@@ -59,6 +59,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/el_GR b/localedata/locales/el_GR
--- a/localedata/locales/el_GR	2018-10-13 16:52:32.242373028 +0000
+++ b/localedata/locales/el_GR	2018-10-13 16:52:32.862375454 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/en_GB b/localedata/locales/en_GB
--- a/localedata/locales/en_GB	2018-10-13 16:52:32.242373028 +0000
+++ b/localedata/locales/en_GB	2018-10-13 16:52:32.794375187 +0000
@@ -54,6 +54,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/en_NG b/localedata/locales/en_NG
--- a/localedata/locales/en_NG	2018-10-13 16:52:32.242373028 +0000
+++ b/localedata/locales/en_NG	2018-10-13 16:52:32.626374530 +0000
@@ -49,6 +49,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/en_ZM b/localedata/locales/en_ZM
--- a/localedata/locales/en_ZM	2018-10-13 16:52:32.242373028 +0000
+++ b/localedata/locales/en_ZM	2018-10-13 16:52:32.454373857 +0000
@@ -41,6 +41,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/es_CU b/localedata/locales/es_CU
--- a/localedata/locales/es_CU	2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/es_CU	2018-10-13 16:52:32.886375547 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/es_ES b/localedata/locales/es_ES
--- a/localedata/locales/es_ES	2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/es_ES	2018-10-13 16:52:32.426373748 +0000
@@ -107,6 +107,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/et_EE b/localedata/locales/et_EE
--- a/localedata/locales/et_EE	2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/et_EE	2018-10-13 16:52:32.758375046 +0000
@@ -113,6 +113,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/fa_IR b/localedata/locales/fa_IR
--- a/localedata/locales/fa_IR	2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/fa_IR	2018-10-13 16:52:32.446373826 +0000
@@ -78,6 +78,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ff_SN b/localedata/locales/ff_SN
--- a/localedata/locales/ff_SN	2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/ff_SN	2018-10-13 16:52:32.466373904 +0000
@@ -41,6 +41,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/fi_FI b/localedata/locales/fi_FI
--- a/localedata/locales/fi_FI	2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/fi_FI	2018-10-13 16:52:32.846375391 +0000
@@ -177,6 +177,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/fr_FR b/localedata/locales/fr_FR
--- a/localedata/locales/fr_FR	2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/fr_FR	2018-10-13 16:52:32.522374123 +0000
@@ -58,6 +58,7 @@
 % In France, accents are simply omitted if they cannot be represented.
 include "translit_combining";""
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/ga_IE b/localedata/locales/ga_IE
--- a/localedata/locales/ga_IE	2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/ga_IE	2018-10-13 16:52:32.906375626 +0000
@@ -53,6 +53,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/gd_GB b/localedata/locales/gd_GB
--- a/localedata/locales/gd_GB	2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/gd_GB	2018-10-13 16:52:32.894375579 +0000
@@ -45,6 +45,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/gu_IN b/localedata/locales/gu_IN
--- a/localedata/locales/gu_IN	2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/gu_IN	2018-10-13 16:52:32.802375218 +0000
@@ -62,6 +62,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/gv_GB b/localedata/locales/gv_GB
--- a/localedata/locales/gv_GB	2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/gv_GB	2018-10-13 16:52:32.626374530 +0000
@@ -56,6 +56,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/he_IL b/localedata/locales/he_IL
--- a/localedata/locales/he_IL	2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/he_IL	2018-10-13 16:52:32.926375704 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/hi_IN b/localedata/locales/hi_IN
--- a/localedata/locales/hi_IN	2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/hi_IN	2018-10-13 16:52:32.634374561 +0000
@@ -61,6 +61,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
--- a/localedata/locales/hif_FJ	2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/hif_FJ	2018-10-13 16:52:32.642374593 +0000
@@ -37,6 +37,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/hr_HR b/localedata/locales/hr_HR
--- a/localedata/locales/hr_HR	2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/hr_HR	2018-10-13 16:52:32.870375485 +0000
@@ -61,6 +61,7 @@
 % transliterate <U0111> {đ} into d + j
 <U0111> "<U0064><U006A>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ht_HT b/localedata/locales/ht_HT
--- a/localedata/locales/ht_HT	2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/ht_HT	2018-10-13 16:52:32.798375203 +0000
@@ -57,6 +57,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/hu_HU b/localedata/locales/hu_HU
--- a/localedata/locales/hu_HU	2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/hu_HU	2018-10-13 16:52:32.518374107 +0000
@@ -476,6 +476,7 @@
 <U00FC> "<U0075><U0308>";"<U0075><U00A8>";"<U0075><U003A>"
 <U0171> "<U0075><U030B>";"<U0075><U02DD>";"<U0075><U0022>"
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/hy_AM b/localedata/locales/hy_AM
--- a/localedata/locales/hy_AM	2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/hy_AM	2018-10-13 16:52:32.766375078 +0000
@@ -75,6 +75,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/id_ID b/localedata/locales/id_ID
--- a/localedata/locales/id_ID	2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/id_ID	2018-10-13 16:52:32.522374123 +0000
@@ -54,6 +54,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/is_IS b/localedata/locales/is_IS
--- a/localedata/locales/is_IS	2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/is_IS	2018-10-13 16:52:32.606374452 +0000
@@ -149,6 +149,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/it_IT b/localedata/locales/it_IT
--- a/localedata/locales/it_IT	2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/it_IT	2018-10-13 16:52:32.770375093 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ja_JP b/localedata/locales/ja_JP
--- a/localedata/locales/ja_JP	2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/ja_JP	2018-10-13 16:52:32.754375031 +0000
@@ -1681,6 +1681,7 @@
 include "translit_combining";""
 include "translit_cjk_variants";""
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/kab_DZ b/localedata/locales/kab_DZ
--- a/localedata/locales/kab_DZ	2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/kab_DZ	2018-10-13 16:52:32.922375688 +0000
@@ -41,6 +41,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
--- a/localedata/locales/kk_KZ	2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/kk_KZ	2018-10-13 16:52:32.866375469 +0000
@@ -99,6 +99,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/km_KH b/localedata/locales/km_KH
--- a/localedata/locales/km_KH	2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/km_KH	2018-10-13 16:52:32.598374421 +0000
@@ -42,6 +42,7 @@
 copy "i18n"
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/kn_IN b/localedata/locales/kn_IN
--- a/localedata/locales/kn_IN	2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/kn_IN	2018-10-13 16:52:32.762375062 +0000
@@ -63,6 +63,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ko_KR b/localedata/locales/ko_KR
--- a/localedata/locales/ko_KR	2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/ko_KR	2018-10-13 16:52:32.582374358 +0000
@@ -6099,6 +6099,7 @@
 include "translit_combining";""
 include "translit_hangul";""
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/ks_IN b/localedata/locales/ks_IN
--- a/localedata/locales/ks_IN	2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/ks_IN	2018-10-13 16:52:32.510374076 +0000
@@ -46,6 +46,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/kw_GB b/localedata/locales/kw_GB
--- a/localedata/locales/kw_GB	2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/kw_GB	2018-10-13 16:52:32.790375171 +0000
@@ -57,6 +57,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ky_KG b/localedata/locales/ky_KG
--- a/localedata/locales/ky_KG	2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/ky_KG	2018-10-13 16:52:32.410373685 +0000
@@ -82,6 +82,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/lb_LU b/localedata/locales/lb_LU
--- a/localedata/locales/lb_LU	2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/lb_LU	2018-10-13 16:52:32.874375500 +0000
@@ -77,6 +77,7 @@
 % LATIN SMALL LETTER E WITH CIRCUMFLEX
 <U00EA> "e^"
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/lg_UG b/localedata/locales/lg_UG
--- a/localedata/locales/lg_UG	2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/lg_UG	2018-10-13 16:52:32.430373763 +0000
@@ -56,6 +56,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/lij_IT b/localedata/locales/lij_IT
--- a/localedata/locales/lij_IT	2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/lij_IT	2018-10-13 16:52:32.782375140 +0000
@@ -47,6 +47,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ln_CD b/localedata/locales/ln_CD
--- a/localedata/locales/ln_CD	2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/ln_CD	2018-10-13 16:52:32.438373795 +0000
@@ -39,6 +39,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/lo_LA b/localedata/locales/lo_LA
--- a/localedata/locales/lo_LA	2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/lo_LA	2018-10-13 16:52:32.530374154 +0000
@@ -50,6 +50,7 @@
 copy "i18n"
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/lt_LT b/localedata/locales/lt_LT
--- a/localedata/locales/lt_LT	2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/lt_LT	2018-10-13 16:52:32.602374436 +0000
@@ -163,6 +163,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/lv_LV b/localedata/locales/lv_LV
--- a/localedata/locales/lv_LV	2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/lv_LV	2018-10-13 16:52:32.794375187 +0000
@@ -125,6 +125,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/mg_MG b/localedata/locales/mg_MG
--- a/localedata/locales/mg_MG	2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/mg_MG	2018-10-13 16:52:32.486373982 +0000
@@ -54,6 +54,7 @@
 % Accents are simply omitted if they cannot be represented.
 include "translit_combining";""
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
--- a/localedata/locales/mhr_RU	2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/mhr_RU	2018-10-13 16:52:32.866375469 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/mk_MK b/localedata/locales/mk_MK
--- a/localedata/locales/mk_MK	2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/mk_MK	2018-10-13 16:52:32.598374421 +0000
@@ -48,6 +48,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ml_IN b/localedata/locales/ml_IN
--- a/localedata/locales/ml_IN	2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/ml_IN	2018-10-13 16:52:32.610374468 +0000
@@ -60,6 +60,7 @@
 
 translit_start
 include     "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 %
diff -uNr a/localedata/locales/ms_MY b/localedata/locales/ms_MY
--- a/localedata/locales/ms_MY	2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/ms_MY	2018-10-13 16:52:32.638374577 +0000
@@ -45,6 +45,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/mt_MT b/localedata/locales/mt_MT
--- a/localedata/locales/mt_MT	2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/mt_MT	2018-10-13 16:52:32.890375563 +0000
@@ -47,6 +47,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/nan_TW@latin b/localedata/locales/nan_TW@latin
--- a/localedata/locales/nan_TW@latin	2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/nan_TW@latin	2018-10-13 16:52:32.530374154 +0000
@@ -52,6 +52,7 @@
 % accents are simply omitted if they cannot be represented.
 include "translit_combining";""
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/nb_NO b/localedata/locales/nb_NO
--- a/localedata/locales/nb_NO	2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/nb_NO	2018-10-13 16:52:32.778375125 +0000
@@ -166,6 +166,7 @@
 % LATIN SMALL LETTER O WITH STROKE -> "oe"
 <U00F8> "<U006F><U0338>";"<U006F><U0065>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ne_NP b/localedata/locales/ne_NP
--- a/localedata/locales/ne_NP	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/ne_NP	2018-10-13 16:52:32.842375375 +0000
@@ -43,6 +43,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
--- a/localedata/locales/nhn_MX	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/nhn_MX	2018-10-13 16:52:32.766375078 +0000
@@ -59,6 +59,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/niu_NU b/localedata/locales/niu_NU
--- a/localedata/locales/niu_NU	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/niu_NU	2018-10-13 16:52:32.802375218 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
--- a/localedata/locales/niu_NZ	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/niu_NZ	2018-10-13 16:52:32.850375407 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/nl_NL b/localedata/locales/nl_NL
--- a/localedata/locales/nl_NL	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/nl_NL	2018-10-13 16:52:32.602374436 +0000
@@ -56,6 +56,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
--- a/localedata/locales/nr_ZA	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/nr_ZA	2018-10-13 16:52:32.918375673 +0000
@@ -64,6 +64,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/oc_FR b/localedata/locales/oc_FR
--- a/localedata/locales/oc_FR	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/oc_FR	2018-10-13 16:52:32.818375281 +0000
@@ -54,6 +54,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/om_KE b/localedata/locales/om_KE
--- a/localedata/locales/om_KE	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/om_KE	2018-10-13 16:52:32.918375673 +0000
@@ -156,6 +156,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/or_IN b/localedata/locales/or_IN
--- a/localedata/locales/or_IN	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/or_IN	2018-10-13 16:52:32.926375704 +0000
@@ -62,6 +62,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/os_RU b/localedata/locales/os_RU
--- a/localedata/locales/os_RU	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/os_RU	2018-10-13 16:52:32.910375641 +0000
@@ -71,6 +71,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/pa_IN b/localedata/locales/pa_IN
--- a/localedata/locales/pa_IN	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/pa_IN	2018-10-13 16:52:32.638374577 +0000
@@ -60,6 +60,7 @@
 
 translit_start
 include     "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/pa_PK b/localedata/locales/pa_PK
--- a/localedata/locales/pa_PK	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/pa_PK	2018-10-13 16:52:32.422373732 +0000
@@ -57,6 +57,7 @@
 % Farsi yeh -> yeh
 <U06CC> "<U064A>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/pl_PL b/localedata/locales/pl_PL
--- a/localedata/locales/pl_PL	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/pl_PL	2018-10-13 16:52:32.502374045 +0000
@@ -130,6 +130,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/pt_PT b/localedata/locales/pt_PT
--- a/localedata/locales/pt_PT	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/pt_PT	2018-10-13 16:52:32.910375641 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/quz_PE b/localedata/locales/quz_PE
--- a/localedata/locales/quz_PE	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/quz_PE	2018-10-13 16:52:32.470373920 +0000
@@ -55,6 +55,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ro_RO b/localedata/locales/ro_RO
--- a/localedata/locales/ro_RO	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/ro_RO	2018-10-13 16:52:32.646374608 +0000
@@ -142,6 +142,7 @@
 <U0162> "<U021A>";"<U0054>"
 <U0163> "<U021B>";"<U0074>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ru_RU b/localedata/locales/ru_RU
--- a/localedata/locales/ru_RU	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/ru_RU	2018-10-13 16:52:32.534374170 +0000
@@ -69,6 +69,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/rw_RW b/localedata/locales/rw_RW
--- a/localedata/locales/rw_RW	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/rw_RW	2018-10-13 16:52:32.814375265 +0000
@@ -45,6 +45,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sa_IN b/localedata/locales/sa_IN
--- a/localedata/locales/sa_IN	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/sa_IN	2018-10-13 16:52:32.790375171 +0000
@@ -44,6 +44,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sd_IN b/localedata/locales/sd_IN
--- a/localedata/locales/sd_IN	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/sd_IN	2018-10-13 16:52:32.770375093 +0000
@@ -46,6 +46,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sd_IN@devanagari b/localedata/locales/sd_IN@devanagari
--- a/localedata/locales/sd_IN@devanagari	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/sd_IN@devanagari	2018-10-13 16:52:32.818375281 +0000
@@ -44,6 +44,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/se_NO b/localedata/locales/se_NO
--- a/localedata/locales/se_NO	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/se_NO	2018-10-13 16:52:32.634374561 +0000
@@ -221,6 +221,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
--- a/localedata/locales/sgs_LT	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/sgs_LT	2018-10-13 16:52:32.810375250 +0000
@@ -58,6 +58,7 @@
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/shn_MM b/localedata/locales/shn_MM
--- a/localedata/locales/shn_MM	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/shn_MM	2018-10-13 16:52:32.506374060 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/si_LK b/localedata/locales/si_LK
--- a/localedata/locales/si_LK	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/si_LK	2018-10-13 16:52:32.814375265 +0000
@@ -44,6 +44,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sk_SK b/localedata/locales/sk_SK
--- a/localedata/locales/sk_SK	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/sk_SK	2018-10-13 16:52:32.418373716 +0000
@@ -67,6 +67,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sl_SI b/localedata/locales/sl_SI
--- a/localedata/locales/sl_SI	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/sl_SI	2018-10-13 16:52:32.486373982 +0000
@@ -2120,6 +2120,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sm_WS b/localedata/locales/sm_WS
--- a/localedata/locales/sm_WS	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/sm_WS	2018-10-13 16:52:32.498374029 +0000
@@ -37,6 +37,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/so_SO b/localedata/locales/so_SO
--- a/localedata/locales/so_SO	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/so_SO	2018-10-13 16:52:32.414373701 +0000
@@ -68,6 +68,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sq_AL b/localedata/locales/sq_AL
--- a/localedata/locales/sq_AL	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/sq_AL	2018-10-13 16:52:32.798375203 +0000
@@ -45,6 +45,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
--- a/localedata/locales/ss_ZA	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/ss_ZA	2018-10-13 16:52:32.846375391 +0000
@@ -66,6 +66,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/st_ZA b/localedata/locales/st_ZA
--- a/localedata/locales/st_ZA	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/st_ZA	2018-10-13 16:52:32.906375626 +0000
@@ -62,6 +62,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sv_SE b/localedata/locales/sv_SE
--- a/localedata/locales/sv_SE	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/sv_SE	2018-10-13 16:52:32.630374546 +0000
@@ -173,6 +173,7 @@
 % LATIN SMALL LETTER O WITH STROKE -> "oe"
 <U00F8> "<U006F><U0338>";"<U006F><U0065>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sw_KE b/localedata/locales/sw_KE
--- a/localedata/locales/sw_KE	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/sw_KE	2018-10-13 16:52:32.590374389 +0000
@@ -43,6 +43,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ta_IN b/localedata/locales/ta_IN
--- a/localedata/locales/ta_IN	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/ta_IN	2018-10-13 16:52:32.586374374 +0000
@@ -63,6 +63,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/te_IN b/localedata/locales/te_IN
--- a/localedata/locales/te_IN	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/te_IN	2018-10-13 16:52:32.642374593 +0000
@@ -63,6 +63,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/th_TH b/localedata/locales/th_TH
--- a/localedata/locales/th_TH	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/th_TH	2018-10-13 16:52:32.902375610 +0000
@@ -57,6 +57,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ti_ET b/localedata/locales/ti_ET
--- a/localedata/locales/ti_ET	2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/ti_ET	2018-10-13 16:52:32.618374499 +0000
@@ -864,6 +864,7 @@
 <U137C>    <U0060><U0031><U0030><U0030><U0030><U0030>
 
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 %
 END LC_CTYPE
diff -uNr a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
--- a/localedata/locales/tn_ZA	2018-10-13 16:52:32.270373137 +0000
+++ b/localedata/locales/tn_ZA	2018-10-13 16:52:32.882375532 +0000
@@ -67,6 +67,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/to_TO b/localedata/locales/to_TO
--- a/localedata/locales/to_TO	2018-10-13 16:52:32.270373137 +0000
+++ b/localedata/locales/to_TO	2018-10-13 16:52:32.822375297 +0000
@@ -36,6 +36,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
--- a/localedata/locales/tpi_PG	2018-10-13 16:52:32.270373137 +0000
+++ b/localedata/locales/tpi_PG	2018-10-13 16:52:32.454373857 +0000
@@ -44,6 +44,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/tr_TR b/localedata/locales/tr_TR
--- a/localedata/locales/tr_TR	2018-10-13 16:52:32.274373153 +0000
+++ b/localedata/locales/tr_TR	2018-10-13 16:52:32.662374671 +0000
@@ -2538,6 +2538,7 @@
 
 % TURKISH LIRA SIGN
 <U20BA> "<U0054><U004C>"
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/translit_cyrillic b/localedata/locales/translit_cyrillic
--- a/localedata/locales/translit_cyrillic	1970-01-01 00:00:00.000000000 +0000
+++ b/localedata/locales/translit_cyrillic	2018-10-13 16:52:32.942375766 +0000
@@ -0,0 +1,383 @@
+escape_char /
+comment_char %
+
+% This file is part of the GNU C Library and contains locale data.
+% The Free Software Foundation does not claim any copyright interest
+% in the locale data contained in this file.  The foregoing does not
+% affect the license of the GNU C Library as a whole.  It does not
+% exempt you from the conditions of the license if your use would
+% otherwise be governed by that license.
+
+% Transliterations of Cyrillic letters to Latin and/or ASCII symbols.
+% Inspired by ISO 9.1995 / GOST 7.79-2000.
+% Covers Unicode Range https://www.unicode.org/charts/PDF/U0400.pdf
+% i.e. [U0401-U04F9, U2019] but only the letters covered by ISO 9.1995 
+% It implements the GOST_7.79 System A (Latin Script) as a first 
+% option and System B Cyrillic (ASCII) as a second option. Check
+% https://en.wikipedia.org/wiki/ISO_9 for reference. 
+% The System B is extended from GOST_7.79-Russian using open sources 
+% of the transliteration mappings and the "h/`" diacritics logic.
+
+% Usage examples:
+% iconv -f UTF-8 -t ISO-8859-15//TRANSLIT \
+%   | iconv -f ISO-8859-15 -t UTF-8 # System A
+% iconv -f UTF-8 -t ASCII//TRANSLIT # System B.
+
+% Contributions welcome for the rest of Cyrillic script in Unicode
+% https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode.
+% Bugfix for https://sourceware.org/bugzilla/show_bug.cgi?id=2872.
+% Generated from UnicodeData.txt with a spreadsheet referenced 
+% in that bug's doclet
+
+LC_CTYPE
+
+translit_start
+
+% CYRILLIC CAPITAL LETTER IO
+<U0401> <U00CB>;"<U0059><U004F>"
+% CYRILLIC CAPITAL LETTER DJE
+<U0402> <U0110>;"<U0044><U004A>"
+% CYRILLIC CAPITAL LETTER GJE
+<U0403> <U01F4>;"<U0047><U0060>"
+% CYRILLIC CAPITAL LETTER UKRAINIAN IE
+<U0404> <U00CA>;"<U0059><U0045>"
+% CYRILLIC CAPITAL LETTER DZE
+<U0405> <U1E90>;"<U005A><U0060>"
+% CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0406> <U00CC>;<U0049>
+% CYRILLIC CAPITAL LETTER YI
+<U0407> <U00CF>;"<U0059><U0049>"
+% CYRILLIC CAPITAL LETTER JE
+<U0408> "<U004A><U030C>";<U004A>
+% CYRILLIC CAPITAL LETTER LJE
+<U0409> "<U004C><U0302>";"<U004C><U0060>"
+% CYRILLIC CAPITAL LETTER NJE
+<U040A> "<U004E><U0302>";"<U004E><U0060>"
+% CYRILLIC CAPITAL LETTER TSHE
+<U040B> <U0106>;"<U0054><U0053><U0048>"
+% CYRILLIC CAPITAL LETTER KJE
+<U040C> <U1E30>;"<U004B><U0060>"
+% CYRILLIC CAPITAL LETTER SHORT U
+<U040E> <U016C>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER DZHE
+<U040F> "<U0044><U0302>";"<U0044><U0048>"
+% CYRILLIC CAPITAL LETTER A
+<U0410> <U0041>
+% CYRILLIC CAPITAL LETTER BE
+<U0411> <U0042>
+% CYRILLIC CAPITAL LETTER VE
+<U0412> <U0056>
+% CYRILLIC CAPITAL LETTER GHE
+<U0413> <U0047>
+% CYRILLIC CAPITAL LETTER DE
+<U0414> <U0044>
+% CYRILLIC CAPITAL LETTER IE
+<U0415> <U0045>
+% CYRILLIC CAPITAL LETTER ZHE
+<U0416> <U017D>;"<U005A><U0048>"
+% CYRILLIC CAPITAL LETTER ZE
+<U0417> <U005A>
+% CYRILLIC CAPITAL LETTER I
+<U0418> <U0049>
+% CYRILLIC CAPITAL LETTER SHORT I
+<U0419> <U004A>
+% CYRILLIC CAPITAL LETTER KA
+<U041A> <U004B>
+% CYRILLIC CAPITAL LETTER EL
+<U041B> <U004C>
+% CYRILLIC CAPITAL LETTER EM
+<U041C> <U004D>
+% CYRILLIC CAPITAL LETTER EN
+<U041D> <U004E>
+% CYRILLIC CAPITAL LETTER O
+<U041E> <U004F>
+% CYRILLIC CAPITAL LETTER PE
+<U041F> <U0050>
+% CYRILLIC CAPITAL LETTER ER
+<U0420> <U0052>
+% CYRILLIC CAPITAL LETTER ES
+<U0421> <U0053>
+% CYRILLIC CAPITAL LETTER TE
+<U0422> <U0054>
+% CYRILLIC CAPITAL LETTER U
+<U0423> <U0055>
+% CYRILLIC UNDEFINED
+<U0423><U0301> <U00DA>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER EF
+<U0424> <U0046>
+% CYRILLIC CAPITAL LETTER HA
+<U0425> <U0048>;<U0058>
+% CYRILLIC CAPITAL LETTER TSE
+<U0426> <U0043>;"<U0043><U005A>"
+% CYRILLIC CAPITAL LETTER CHE
+<U0427> <U010C>;"<U0043><U0048>"
+% CYRILLIC CAPITAL LETTER SHA
+<U0428> <U0160>;"<U0053><U0048>"
+% CYRILLIC CAPITAL LETTER SHCHA
+<U0429> <U015C>;"<U0053><U0048><U0048>"
+% CYRILLIC CAPITAL LETTER HARD SIGN
+<U042A> <U02BA>;"<U0041><U0060>"
+% CYRILLIC CAPITAL LETTER YERU
+<U042B> <U0059>;"<U0059><U0060>"
+% CYRILLIC CAPITAL LETTER SOFT SIGN
+<U042C> <U02B9>;<U0060>
+% CYRILLIC CAPITAL LETTER E
+<U042D> <U00C8>;"<U0045><U0060>"
+% CYRILLIC CAPITAL LETTER YU
+<U042E> <U00DB>;"<U0059><U0055>"
+% CYRILLIC CAPITAL LETTER YA
+<U042F> <U00C2>;"<U0059><U0041>"
+% CYRILLIC SMALL LETTER A
+<U0430> <U0061>
+% CYRILLIC SMALL LETTER BE
+<U0431> <U0062>
+% CYRILLIC SMALL LETTER VE
+<U0432> <U0076>
+% CYRILLIC SMALL LETTER GHE
+<U0433> <U0067>
+% CYRILLIC SMALL LETTER DE
+<U0434> <U0064>
+% CYRILLIC SMALL LETTER IE
+<U0435> <U0065>
+% CYRILLIC SMALL LETTER ZHE
+<U0436> <U017E>;"<U007A><U0068>"
+% CYRILLIC SMALL LETTER ZE
+<U0437> <U007A>
+% CYRILLIC SMALL LETTER I
+<U0438> <U0069>
+% CYRILLIC SMALL LETTER SHORT I
+<U0439> <U006A>
+% CYRILLIC SMALL LETTER KA
+<U043A> <U006B>
+% CYRILLIC SMALL LETTER EL
+<U043B> <U006C>
+% CYRILLIC SMALL LETTER EM
+<U043C> <U006D>
+% CYRILLIC SMALL LETTER EN
+<U043D> <U006E>
+% CYRILLIC SMALL LETTER O
+<U043E> <U006F>
+% CYRILLIC SMALL LETTER PE
+<U043F> <U0070>
+% CYRILLIC SMALL LETTER ER
+<U0440> <U0072>
+% CYRILLIC SMALL LETTER ES
+<U0441> <U0073>
+% CYRILLIC SMALL LETTER TE
+<U0442> <U0074>
+% CYRILLIC SMALL LETTER U
+<U0443> <U0075>
+% CYRILLIC UNDEFINED
+<U0443><U0301> <U00FA>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER EF
+<U0444> <U0066>
+% CYRILLIC SMALL LETTER HA
+<U0445> <U0068>;<U0078>
+% CYRILLIC SMALL LETTER TSE
+<U0446> <U0063>;"<U0063><U007A>"
+% CYRILLIC SMALL LETTER CHE
+<U0447> <U010D>;"<U0063><U0068>"
+% CYRILLIC SMALL LETTER SHA
+<U0448> <U0161>;"<U0073><U0068>"
+% CYRILLIC SMALL LETTER SHCHA
+<U0449> <U015D>;"<U0073><U0068><U0068>"
+% CYRILLIC SMALL LETTER HARD SIGN
+<U044A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC SMALL LETTER YERU
+<U044B> <U0079>;"<U0079><U0060>"
+% CYRILLIC SMALL LETTER SOFT SIGN
+<U044C> <U02B9>;<U0060>
+% CYRILLIC SMALL LETTER E
+<U044D> <U00E8>;"<U0065><U0060>"
+% CYRILLIC SMALL LETTER YU
+<U044E> <U00FB>;"<U0079><U0075>"
+% CYRILLIC SMALL LETTER YA
+<U044F> <U00E2>;"<U0079><U0061>"
+% CYRILLIC SMALL LETTER IO
+<U0451> <U00EB>;"<U0079><U006F>"
+% CYRILLIC SMALL LETTER DJE
+<U0452> <U0111>;"<U0064><U006A>"
+% CYRILLIC SMALL LETTER GJE
+<U0453> <U01F5>;"<U0067><U0060>"
+% CYRILLIC SMALL LETTER UKRAINIAN IE
+<U0454> <U00EA>;"<U0079><U0065>"
+% CYRILLIC SMALL LETTER DZE
+<U0455> <U1E91>;"<U007A><U0060>"
+% CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0456> <U00EC>;<U0069>
+% CYRILLIC SMALL LETTER YI
+<U0457> <U00EF>;"<U0079><U0069>"
+% CYRILLIC SMALL LETTER JE
+<U0458> <U01F0>;<U006A>
+% CYRILLIC SMALL LETTER LJE
+<U0459> "<U006C><U0302>";"<U006C><U0060>"
+% CYRILLIC SMALL LETTER NJE
+<U045A> "<U006E><U0302>";"<U006E><U0060>"
+% CYRILLIC SMALL LETTER TSHE
+<U045B> <U0107>;"<U0074><U0073><U0068>"
+% CYRILLIC SMALL LETTER KJE
+<U045C> <U1E31>;"<U006B><U0060>"
+% CYRILLIC SMALL LETTER SHORT U
+<U045E> <U016D>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER DZHE
+<U045F> "<U0064><U0302>";"<U0064><U0068>"
+% CYRILLIC CAPITAL LETTER BIG YUS
+<U046A> <U01CD>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BIG YUS
+<U046B> <U01CE>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER FITA
+<U0472> "<U0046><U0300>";"<U0046><U0048>"
+% CYRILLIC SMALL LETTER FITA
+<U0473> "<U0066><U0300>";"<U0066><U0068>"
+% CYRILLIC CAPITAL LETTER IZHITSA
+<U0474> <U1EF2>;"<U0059><U0048>"
+% CYRILLIC SMALL LETTER IZHITSA
+<U0475> <U1EF3>;"<U0079><U0068>"
+% CYRILLIC CAPITAL LETTER SEMISOFT SIGN
+<U048C> <U011A>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER SEMISOFT SIGN
+<U048D> <U011B>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH UPTURN
+<U0490> "<U0047><U0300>";"<U0047><U0060>"
+% CYRILLIC SMALL LETTER GHE WITH UPTURN
+<U0491> "<U0067><U0300>";"<U0067><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH STROKE
+<U0492> <U0120>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH STROKE
+<U0493> <U0121>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK
+<U0494> <U011E>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK
+<U0495> <U011F>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER
+<U0496> "<U017D><U0327>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DESCENDER
+<U0497> "<U017E><U0327>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH DESCENDER
+<U049A> <U0136>;"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH DESCENDER
+<U049B> <U0137>;"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH STROKE
+<U049E> "<U004B><U0304>";"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH STROKE
+<U049F> "<U006B><U0304>";"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER EN WITH DESCENDER
+<U04A2> <U1E46>;"<U004E><U0060>"
+% CYRILLIC SMALL LETTER EN WITH DESCENDER
+<U04A3> <U1E47>;"<U006E><U0060>"
+% CYRILLIC CAPITAL LIGATURE EN GHE
+<U04A4> <U1E44>;"<U004E><U0047>"
+% CYRILLIC SMALL LIGATURE EN GHE
+<U04A5> <U1E45>;"<U006E><U0067>"
+% CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK
+<U04A6> <U1E54>;"<U0050><U0060>"
+% CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK
+<U04A7> <U1E55>;"<U0070><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN HA
+<U04A8> <U00D2>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN HA
+<U04A9> <U00F2>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER ES WITH DESCENDER
+<U04AA> <U00C7>;"<U0043><U0060>"
+% CYRILLIC SMALL LETTER ES WITH DESCENDER
+<U04AB> <U00E7>;"<U0043><U0060>"
+% CYRILLIC CAPITAL LETTER TE WITH DESCENDER
+<U04AC> <U0162>;"<U0054><U0060>"
+% CYRILLIC SMALL LETTER TE WITH DESCENDER
+<U04AD> <U0163>;"<U0074><U0060>"
+% CYRILLIC CAPITAL LETTER STRAIGHT U
+<U04AE> <U00D9>;<U0055>
+% CYRILLIC SMALL LETTER STRAIGHT U
+<U04AF> <U00F9>;<U0075>
+% CYRILLIC CAPITAL LETTER HA WITH DESCENDER
+<U04B2> <U1E28>;"<U0048><U0060>"
+% CYRILLIC SMALL LETTER HA WITH DESCENDER
+<U04B3> <U1E29>;"<U0068><U0060>"
+% CYRILLIC CAPITAL LIGATURE TE TSE
+<U04B4> "<U0043><U0304>";"<U0054><U0043><U005A>"
+% CYRILLIC SMALL LIGATURE TE TSE
+<U04B5> "<U0063><U0304>";"<U0074><U0063><U007A>"
+% CYRILLIC CAPITAL LETTER SHHA
+<U04BA> <U1E24>;"<U0053><U0048><U0060>"
+% CYRILLIC SMALL LETTER SHHA
+<U04BB> <U1E25>;"<U0053><U0048><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE
+<U04BC> "<U0043><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE
+<U04BD> "<U0063><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BE> "<U00C7><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BF> "<U00E7><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC LETTER PALOCHKA
+<U04C0> <U2021>;<U0069>
+% CYRILLIC CAPITAL LETTER ZHE WITH BREVE
+<U04C1> "<U005A><U0306>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH BREVE
+<U04C2> "<U007A><U0306>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KHAKASSIAN CHE
+<U04CB> <U00C7>;"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER KHAKASSIAN CHE
+<U04CC> <U00E7>;"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH BREVE
+<U04D0> <U0102>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH BREVE
+<U04D1> <U0103>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH DIAERESIS
+<U04D2> <U00C4>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH DIAERESIS
+<U04D3> <U00E4>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER IE WITH BREVE
+<U04D6> <U0114>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER IE WITH BREVE
+<U04D7> <U0115>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER SCHWA
+<U04D8> "<U0041><U030B>";"<U0041><U0060>"
+% CYRILLIC SMALL LETTER SCHWA
+<U04D9> "<U0061><U030B>";"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS
+<U04DC> "<U005A><U0304>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DIAERESIS
+<U04DD> "<U007A><U0304>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS
+<U04DE> "<U005A><U0308>";"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ZE WITH DIAERESIS
+<U04DF> "<U007A><U0308>";"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN DZE
+<U04E0> <U0179>;"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN DZE
+<U04E1> <U017A>;"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER I WITH DIAERESIS
+<U04E4> <U00CE>;"<U0049><U0060>"
+% CYRILLIC SMALL LETTER I WITH DIAERESIS
+<U04E5> <U00EE>;"<U0069><U0060>"
+% CYRILLIC CAPITAL LETTER O WITH DIAERESIS
+<U04E6> <U00D6>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER O WITH DIAERESIS
+<U04E7> <U00F6>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER BARRED O
+<U04E8> <U00D4>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BARRED O
+<U04E9> <U00F4>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DIAERESIS
+<U04F0> <U00DC>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DIAERESIS
+<U04F1> <U00FC>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE
+<U04F2> <U0170>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE
+<U04F3> <U0171>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS
+<U04F4> "<U0043><U0308>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER CHE WITH DIAERESIS
+<U04F5> "<U0063><U0308>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS
+<U04F8> <U0178>;"<U0059><U0060>"
+% CYRILLIC SMALL LETTER YERU WITH DIAERESIS
+<U04F9> <U00FF>;"<U0079><U0060>"
+% RIGHT SINGLE QUOTATION MARK
+<U2019> <U2035>;<U0027>
+
+translit_end
+
+END LC_CTYPE
diff -uNr a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
--- a/localedata/locales/ts_ZA	2018-10-13 16:52:32.274373153 +0000
+++ b/localedata/locales/ts_ZA	2018-10-13 16:52:32.806375234 +0000
@@ -62,6 +62,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/unm_US b/localedata/locales/unm_US
--- a/localedata/locales/unm_US	2018-10-13 16:52:32.274373153 +0000
+++ b/localedata/locales/unm_US	2018-10-13 16:52:32.782375140 +0000
@@ -48,6 +48,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ur_IN b/localedata/locales/ur_IN
--- a/localedata/locales/ur_IN	2018-10-13 16:52:32.274373153 +0000
+++ b/localedata/locales/ur_IN	2018-10-13 16:52:32.762375062 +0000
@@ -46,6 +46,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ur_PK b/localedata/locales/ur_PK
--- a/localedata/locales/ur_PK	2018-10-13 16:52:32.274373153 +0000
+++ b/localedata/locales/ur_PK	2018-10-13 16:52:32.510374076 +0000
@@ -57,6 +57,7 @@
 % Farsi yeh -> yeh
 <U06CC> "<U064A>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
--- a/localedata/locales/ve_ZA	2018-10-13 16:52:32.274373153 +0000
+++ b/localedata/locales/ve_ZA	2018-10-13 16:52:32.854375422 +0000
@@ -65,6 +65,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/vi_VN b/localedata/locales/vi_VN
--- a/localedata/locales/vi_VN	2018-10-13 16:52:32.274373153 +0000
+++ b/localedata/locales/vi_VN	2018-10-13 16:52:32.826375313 +0000
@@ -57,6 +57,7 @@
 % dong sign -> d// -> dd
 <U20AB> "<U0111>";"<U0064><U0064>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/wa_BE b/localedata/locales/wa_BE
--- a/localedata/locales/wa_BE	2018-10-13 16:52:32.274373153 +0000
+++ b/localedata/locales/wa_BE	2018-10-13 16:52:32.850375407 +0000
@@ -59,6 +59,7 @@
 <U00C5> "A<U030A>";"A";"AU"
 <U00E5> "a<U030A>";"a";"au"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/wo_SN b/localedata/locales/wo_SN
--- a/localedata/locales/wo_SN	2018-10-13 16:52:32.274373153 +0000
+++ b/localedata/locales/wo_SN	2018-10-13 16:52:32.886375547 +0000
@@ -54,6 +54,7 @@
 % Accents are simply omitted if they cannot be represented.
 include "translit_combining";""
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
--- a/localedata/locales/xh_ZA	2018-10-13 16:52:32.274373153 +0000
+++ b/localedata/locales/xh_ZA	2018-10-13 16:52:32.858375438 +0000
@@ -64,6 +64,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/yi_US b/localedata/locales/yi_US
--- a/localedata/locales/yi_US	2018-10-13 16:52:32.274373153 +0000
+++ b/localedata/locales/yi_US	2018-10-13 16:52:32.506374060 +0000
@@ -66,6 +66,7 @@
 <U05F0> "<U05D5><U05D5>";"ww"
 <U05F1> "<U05D5><U05D9>";"wj"
 <U05F2> "<U05D9><U05D9>";"jj"
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/yuw_PG b/localedata/locales/yuw_PG
--- a/localedata/locales/yuw_PG	2018-10-13 16:52:32.274373153 +0000
+++ b/localedata/locales/yuw_PG	2018-10-13 16:52:32.494374014 +0000
@@ -40,6 +40,7 @@
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/zh_CN b/localedata/locales/zh_CN
--- a/localedata/locales/zh_CN	2018-10-13 16:52:32.278373168 +0000
+++ b/localedata/locales/zh_CN	2018-10-13 16:52:32.862375454 +0000
@@ -58,6 +58,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 
 class	"hanzi"; /
diff -uNr a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
--- a/localedata/locales/zu_ZA	2018-10-13 16:52:32.278373168 +0000
+++ b/localedata/locales/zu_ZA	2018-10-13 16:52:32.886375547 +0000
@@ -68,6 +68,7 @@
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
Marko Myllynen Oct. 15, 2018, 11:04 a.m. UTC | #3
Hi,

On 2018-10-13 19:58, Egor Kobylkin wrote:
> On 13.10.2018 02:59, Rafal Luzynski wrote:
> 
>> Regarding the tests, I think there is no complete transliteration 
>> test suite at the moment.  Probably the only test is 
>> localedata/bug-iconv-trans.c. You can also see the collation tests 
>> placed in the same directory, they use those multiple *.UTF-8.in 
>> files.
>>
>> You can skip the tests for now.
> 
> First I thought they could just be added but not all locales
> transliterate Umlauts so just extending the current test won't do as it
> will fail for those locales.

I still think a one-time check against uconv(1) (part of Unicode's ICU
project) for discrepancies would be worthwhile.

>>> [...] diff -uNr a/localedata/locales/am_ET 
>>> b/localedata/locales/am_ET --- a/localedata/locales/am_ET 
>>> 2018-10-11 15:10:11.000000000 +0000 +++ b/localedata/locales/am_ET 
>>> 2018-10-11 15:10:43.000000000 +0000 @@ -1394,6 +1394,7 @@ <U137A> 
>>> <U0060><U0039><U0030> <U137B> <U0060><U0031><U0030><U0030> <U137C> 
>>> <U0060><U0031><U0030><U0030><U0030><U0030> +include 
>>> "translit_cyrillic";"" translit_end % END LC_CTYPE
>>
>> Shouldn't “include "translit_cyrillic";""” be placed before the 
>> custom rules, together with other includes?  The same in more files, 
>> I will not mention them all.
> 
> If I recall correctly it is because of the
> "translit_end
> END LC_CTYPE"
> part at the end of the translit_cyrillic. This way it works for any
> locale, regardless whether it has translit itself or not. And being at
> the end it does not supersede any previous transliteration that may be
> there for a reason.

I suspect one problem would be that the later rule wins, so if there
are locale-specific rules, then any translit_* inclusions would
override them unless they are included before the locale-specific rules.

Cheers,
Diego (Egor) Kobylkin Oct. 15, 2018, 11:54 a.m. UTC | #4
On 15.10.2018 13:04, Marko Myllynen wrote:
> Hi,
> 
> On 2018-10-13 19:58, Egor Kobylkin wrote:
>> On 13.10.2018 02:59, Rafal Luzynski wrote:
>>
>>> Regarding the tests, I think there is no complete transliteration 
>>> test suite at the moment.  Probably the only test is 
>>> localedata/bug-iconv-trans.c. You can also see the collation tests 
>>> placed in the same directory, they use those multiple *.UTF-8.in 
>>> files.
>>>
>>> You can skip the tests for now.
>>
>> First I thought they could just be added but not all locales
>> transliterate Umlauts so just extending the current test won't do as it
>> will fail for those locales.
> 
> I still think a one-time check against uconv(1) (part of Unicode's ICU
> project) for discrepancies would be worthwhile.

Just an addition. I have changed a few constants to see whether
localedata/bug-iconv-trans.c could be made to test Cyrillic. Attached is
bug-iconv-trans-cyr.c, which passes in this form. I had to save it as
UTF-8, whereas localedata/bug-iconv-trans.c is saved as ISO-8859-15.

>>>> [...] diff -uNr a/localedata/locales/am_ET 
>>>> b/localedata/locales/am_ET --- a/localedata/locales/am_ET 
>>>> 2018-10-11 15:10:11.000000000 +0000 +++ b/localedata/locales/am_ET 
>>>> 2018-10-11 15:10:43.000000000 +0000 @@ -1394,6 +1394,7 @@ <U137A> 
>>>> <U0060><U0039><U0030> <U137B> <U0060><U0031><U0030><U0030> <U137C> 
>>>> <U0060><U0031><U0030><U0030><U0030><U0030> +include 
>>>> "translit_cyrillic";"" translit_end % END LC_CTYPE
>>>
>>> Shouldn't “include "translit_cyrillic";""” be placed before the 
>>> custom rules, together with other includes?  The same in more files, 
>>> I will not mention them all.
>>
>> If I recall correctly it is because of the
>> "translit_end
>> END LC_CTYPE"
>> part at the end of the translit_cyrillic. This way it works for any
>> locale, regardless whether it has translit itself or not. And being at
>> the end it does not supersede any previous transliteration that may be
>> there for a reason.
> 
> I suspect one problem would be that the later rule wins, so if there
> are locale-specific rules, then any translit_* inclusions would
> override them unless they are included before the locale-specific rules.

What is the best way forward here? Can somebody make an explicit
suggestion on how to change the current approach if needed?

Bests,
Egor
#include <iconv.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>

int
main (void)
{
  iconv_t cd;
  const char str[] = "CyrillicLetters_ЁЂЃЄЅІЇЈЉЊЋЌЎЏАБВГДЕЖЗИЙКЛМНОПРСТУУ́ФХЦЧШЩЪЫЬЭЮЯабвгдежзийклмнопрстуу́фхцчшщъыьэюяёђѓєѕіїјљњћќўџѪѫѲѳѴѵҌҍҐґҒғҔҕҖҗҚқҞҟҢңҤҥҦҧҨҩҪҫҬҭҮүҲҳҴҵҺһҼҽҾҿӀӁӂӋӌӐӑӒӓӖӗӘәӜӝӞӟӠӡӤӥӦӧӨөӰӱӲӳӴӵӸӹ’";
  const char expected[] = "CyrillicLetters_YODJG`YEZ`IYIJL`N`TSHK`U`DHABVGDEZHZIJKLMNOPRSTUUFHCCHSHSHHA`Y`E`YUYAabvgdezhzijklmnoprstuufhcchshshh``y`e`yuyayodjg`yez`iyijl`n`tshk`u`dhO`o`FHfhYHyhE`e`G`g`GHghGHghZH`zh`K`k`K`k`N`n`NGngP`p`O`o`C`C`T`t`UuH`h`TCZtczSH`SH`CH`ch`CH`ch`iZH`zh`CH`ch`A`a`A`a`E`e`A`a`ZH`zh`Z`z`Z`z`I`i`O`o`O`o`U`u`U`u`CH`ch`Y`y`'";
  char *inptr = (char *) str;
  size_t inlen = strlen (str) + 1;
  char outbuf[500];
  char *outptr = outbuf;
  size_t outlen = sizeof (outbuf);
  int result = 0;
  size_t n;

  if (setlocale (LC_ALL, "de_DE.UTF-8") == NULL)
    {
      puts ("setlocale failed");
      return 1;
    }

  cd = iconv_open ("ANSI_X3.4-1968//TRANSLIT", "UTF-8");
  if (cd == (iconv_t) -1)
    {
      puts ("iconv_open failed");
      return 1;
    }

  n = iconv (cd, &inptr, &inlen, &outptr, &outlen);
  if (n != 174)
    {
      if (n == (size_t) -1)
	printf ("iconv() returned error: %m\n");
      else
	printf ("iconv() returned %zu, expected 174\n", n);
      result = 1;
    }
  if (inlen != 0)
    {
      puts ("not all input consumed");
      result = 1;
    }
  else if (inptr - str != strlen (str) + 1)
    {
      printf ("inptr wrong, advanced by %td\n", inptr - str);
      result = 1;
    }
  if (memcmp (outbuf, expected, sizeof (expected)) != 0)
    {
      printf ("result wrong: \"%.*s\", expected: \"%s\"\n",
	      (int) (sizeof (outbuf) - outlen), outbuf, expected);
      result = 1;
    }
  else if (outlen != sizeof (outbuf) - sizeof (expected))
    {
      printf ("outlen wrong: %zu, expected %zu\n", outlen,
	      sizeof (outbuf) - sizeof (expected));
      result = 1;
    }
  else
    printf ("output is \"%s\" which is OK\n", outbuf);

  return result;
}
Rafal Luzynski Oct. 23, 2018, 11:08 p.m. UTC | #5
Hi Egor,

Thank you for your updates and again I'm sorry for my delayed response.
A general remark about this: if you are in a hurry and you need the
corrected transliteration rules for yourself or for your users then
you don't have to wait for the patch to be reviewed and accepted here.
You can make your own locale and use it; you don't need to rebuild glibc,
and you don't even need root privileges to do it.  The locale data subsystem
is designed to allow users to create and use their own locales.

I have seen your newer patch [1] and tested it locally, but I will reply
in this thread because I think it is easier to reply in context.

I would like to summarize the differences between v5 [2] and v6 to make
sure that I noticed them all and that you have not introduced any changes
inadvertently.  (Yes, that means I have skipped another patch which you
sent between those two.)

* Locales removed from the patch: C and sd_PK.
* Added locales: az_AZ and ky_KG.
* You consistently transliterate single uppercase Cyrillic letters
  to sequences of all uppercase Latin letters in all languages (whenever
  a Cyrillic letter is transliterated to more than one Latin letter),
  for example "Ї" is now transliterated as "YI" rather than "Yi".

Again I must say that I experienced lots of technical difficulties applying
the patch and I had to rework it manually because it is not applicable as
it is now.  Below I explain how to make a technically correct patch:

13.10.2018 18:58 Egor Kobylkin <egor@kobylkin.com> wrote:
> 
> 
> Hi Rafal,
> 
> Thanks for the thorough checking, it really helps.
> 
> On 13.10.2018 02:59, Rafal Luzynski wrote:
> > Technical issue:  Please either attach your patch to the email 
> > message or paste it inline, not both.  The patch as it is now is not 
> > applicable. I had to edit it manually to apply.
> >> diff -uNr a/localedata/locales/C b/localedata/locales/C --- 
> >> a/localedata/locales/C 2018-10-11 15:10:12.000000000 +0000 +++ 
> >> b/localedata/locales/C 2018-10-11 15:10:43.000000000 +0000
> > 
> > There is no such file.  Where have you got the source code from?
> > Are you sure this is glibc? :-)
> 
> I was running my patch process against the Ubuntu 18.04 version of
> localedata/locales. Now I have checked out the GitHub glibc source v2.28
> and done the same. [...]

Remarks:

* Please use the repository at https://sourceware.org/git/?p=glibc.git
  rather than a copy at GitHub.
* Please use the master branch rather than 2.28.
* Commit your work locally.
* Use "git format-patch" (e.g., "git format-patch HEAD^..HEAD") to generate
  the patch, then you can email it to this list.
* You can email it inline or, if your email client breaks the lines and
  inserts other unnecessary characters, send it as an attachment.
* Use "git pull --rebase" to keep your work up to date.
* Read the Contribution Checklist [3] for more details.

> 
> >> [...] From this patch I have excluded locales that already mention 
> >> cyrillic or have a transliteration table for it: az_AZ 
> >> iso14651_t1_common ky_KG mn_MN sr_RS tg_TJ tk_TM tt_RU uk_UA uz_UZ 
> >> uz_UZ@cyrillic
> > 
> > I confirm that these locales are excluded and there are no other 
> > missing locales.
> 
> Because of the surprisingly different list of locales between Ubuntu and
> glibc there is now a different list of excluded ones as well.
> 
> mn_MN
> sr_RS
> tg_TJ
> tk_TM
> tt_RU
> uk_UA
> uz_UZ
> uz_UZ@cyrillic
> uk_UA
> 
> az_AZ, ky_KG are now included

As far as I can see, there are no other differences between those two
patches.

> because they don't have cyrillic translit
> in glibc. iso14651_t1_common is still implicitly excluded, because it
> doesn't have 'translit_end' string.
> 
> Somehow az_AZ and tr_TR from glibc fail to transliterate Cyrillic even
> after the patch applied (az_AZ is explicitly including tr_TR). I do not
> see a reason, maybe you could check?

I noticed that az_AZ does not build at all: the localedef program reports
a "circular dependency" (if I recall correctly).  I think that since az_AZ
contains “copy "tr_TR"” and tr_TR already contains (in your patch)
“include "translit_cyrillic";""” you should just remove
“include "translit_cyrillic";""” from az_AZ which effectively means that
there are no changes in az_AZ.  Optionally, you can add a comment to az_AZ
to explain why it does not contain “include "translit_cyrillic";""” and to
make sure that if anyone removes “copy "tr_TR"” ever in the future, the
“include "translit_cyrillic";""” will be added at the same time.  I have
verified that removing that line makes the locale data build without an
error but I have not yet verified that they work as expected.

> > Regarding the tests, I think there is no complete transliteration 
> > test suite at the moment.  Probably the only test is 
> > localedata/bug-iconv-trans.c. You can also see the collation tests 
> > placed in the same directory, they use those multiple *.UTF-8.in 
> > files.
> > 
> > You can skip the tests for now.
> 
> In the copy of localedata/bug-iconv-trans.c lines 10-11 we could just
> change the list of the symbols we are now transliterating
> 
>   const char str[] = "ÄäÖöÜüß";
>   const char expected[] = "AEaeOEoeUEuess";
> 
> like this
> 
>   const char str[] =
> "ЁЂЃЄЅІЇЈЉЊЋЌЎЏАБВГДЕЖЗИЙКЛМНОПРСТУУ́ФХЦЧШЩъЫьЭЮЯабвгдежзийклмнопрстуу́фхцчшщЪыЬэюяёђѓєѕіїјљњћќўџѪѫѲѳѴѵҌҍ
> ҐґҒғҔҕҖҗҚқҞҟҢңҤҥҦҧҨҩҪҫҬҭҮүҲҳҴҵҺһҼҽҾҿӀӁӂӋӌӐӑӒӓӖӗӘәӜӝӞӟӠӡӤӥӦӧӨөӰӱӲӳӴӵӸӹ’"
>   const char expected[] =
> "YODJG`YEZ`IYIJL`N`TSHK`U`DHABVGDEZHZIJKLMNOPRSTUU`FXCZCHSHSHHA`Y``E`YUYAabvgdezhzijklmnoprstuu`fxczchsh
> shh``y``e`yuyayodjg`yez`iyijl`n`tshk`u`dhO`o`FHfhYHyhE`e`G`g`GHghGHghZH`zh`K`k`K`k`N`n`NGngP`p`O`o`C`C`
> T`t`UuH`h`TCZtczSH`SH`CH`ch`CH`ch`iZH`zh`CH`ch`A`a`A`a`E`e`A`a`ZH`zh`Z`z`Z`z`I`i`O`o`O`o`U`u`U`u`CH`ch`
> Y`y`'";
> 
> First I thought they could just be added but not all locales
> transliterate Umlauts so just extending the current test won't do as it
> will fail for those locales.

I noticed that you pasted a patch in a Bugzilla comment. [4] If I understand
correctly, you suggest reworking the existing test case to test Cyrillic
transliteration instead of German.  Please don't do it: the existing test
cases may be extended but must not be removed.  I think we should rework
this test case to handle multiple locales and multiple transliteration
pairs; optionally we can add a new case instead.  Currently I lean toward
reworking the existing test case.
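
A test that handles multiple locales and multiple transliteration pairs
could look roughly like the sketch below.  It is only an illustration, not
the existing glibc test: the names translit_case, check_translit and
run_translit_cases are hypothetical, and the sketch assumes glibc's iconv
with the "ASCII//TRANSLIT" target already used in this thread.  Locales
that are not installed are skipped rather than counted as failures.

```c
#include <iconv.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>

/* One test case: a locale name, UTF-8 input, and the expected ASCII
   output.  (Hypothetical structure, for illustration only.)  */
struct translit_case
{
  const char *locale;
  const char *input;
  const char *expected;
};

/* Returns 0 if the transliteration matches, 1 on a mismatch or a
   conversion error, and -1 if the locale cannot be selected.  */
static int
check_translit (const struct translit_case *tc)
{
  char outbuf[512];
  char *inptr = (char *) tc->input;
  char *outptr = outbuf;
  size_t inlen = strlen (tc->input);
  size_t outlen = sizeof (outbuf) - 1;	/* keep room for the NUL */
  iconv_t cd;
  size_t n;

  if (setlocale (LC_ALL, tc->locale) == NULL)
    return -1;

  cd = iconv_open ("ASCII//TRANSLIT", "UTF-8");
  if (cd == (iconv_t) -1)
    return 1;

  n = iconv (cd, &inptr, &inlen, &outptr, &outlen);
  iconv_close (cd);
  if (n == (size_t) -1 || inlen != 0)
    return 1;

  *outptr = '\0';
  return strcmp (outbuf, tc->expected) != 0;
}

/* Runs every case and returns the number of failures.  Cases whose
   locale is unavailable are reported but not counted.  */
static int
run_translit_cases (const struct translit_case *cases, size_t ncases)
{
  int failures = 0;

  for (size_t i = 0; i < ncases; ++i)
    {
      int r = check_translit (&cases[i]);
      if (r < 0)
	printf ("SKIP %s (locale not available)\n", cases[i].locale);
      else if (r > 0)
	{
	  printf ("FAIL %s: \"%s\"\n", cases[i].locale, cases[i].input);
	  ++failures;
	}
    }
  return failures;
}
```

With such a table, the existing German pair ("ÄäÖöÜüß" -> "AEaeOEoeUEuess"
under de_DE.UTF-8) stays as one entry and the Cyrillic pairs become
additional entries, so nothing has to be removed from the current test.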

> >> [...] diff -uNr a/localedata/locales/am_ET 
> >> b/localedata/locales/am_ET --- a/localedata/locales/am_ET 
> >> 2018-10-11 15:10:11.000000000 +0000 +++ b/localedata/locales/am_ET 
> >> 2018-10-11 15:10:43.000000000 +0000 @@ -1394,6 +1394,7 @@ <U137A> 
> >> <U0060><U0039><U0030> <U137B> <U0060><U0031><U0030><U0030> <U137C> 
> >> <U0060><U0031><U0030><U0030><U0030><U0030> +include 
> >> "translit_cyrillic";"" translit_end % END LC_CTYPE
> > 
> > Shouldn't “include "translit_cyrillic";""” be placed before the 
> > custom rules, together with other includes?  The same in more files, 
> > I will not mention them all.
> 
> If I recall correctly it is because of the
> "translit_end
> END LC_CTYPE"
> part at the end of the translit_cyrillic. This way it works for any
> locale, regardless whether it has translit itself or not. And being at
> the end it does not supersede any previous transliteration that may be
> there for a reason.
> 
> As with some other comments, I am not super familiar with the formats of
> glibc files. So if you have a definitive suggestion - pls. formulate it
> as an imperative, not a question.

I feel like a newcomer here, so it was meant as a question to other, more
experienced maintainers, but it's probably time to change this attitude.
So, also taking into account what Marko wrote, [5] please put the include
directive after all other include directives, or after the "translit_start"
directive if there are no other includes, rather than putting it just before
"translit_end", even if putting it at the end works sometimes or even
always.  It is the same as putting #include's near the top of the file when
writing a C program, even if you could sometimes put them anywhere and it
would still work.  If you use a script to insert your include directives
then please rework it; if you insert them manually then just move them
manually.

> >> [...]
> >> +translit_start
> >> +
> >> +% CYRILLIC CAPITAL LETTER IO
> >> +<U0401> <U00CB>;"<U0059><U004F>"
> > 
> > This says that for ASCII (GOST 7.79 System B) you would like to 
> > transliterate "Ё" as "YO" but the table in Wikipedia says "Yo".  I 
> > understand that one or another may be correct depending on the 
> > context but we should be consistent and also better let's stick with 
> > the standard.
> 
> The choice for YO, SH, YA, ZH etc. is to avoid naming collisions for
> example for "Сх" and "Ш" that would both transliterate to Sh:
> With SH:"Схема"->"Shema" but "Шема"->"SHema"
> With Sh:"Схема"->"Shema" and "Шема"->"Shema". Collision!
> This is important e.g. for renaming files, grouping as in using uniq etc.

I understand this idea.  Is this part of any existing standard?  I can't
see it regulated by GOST 7.79.

I'd rather not include transliteration rules which seem reasonable to
us (the developers) but are not known to, and therefore not accepted by,
the outside world.
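
For reference, the collision described in the quoted text can be reproduced
with a minimal Python sketch.  The two tables are hypothetical excerpts built
only from the mappings quoted above (with "х" rendered as "h"); they are not
the contents of translit_cyrillic, and the helper name is made up.

```python
# Hypothetical excerpts: only the letters needed for the quoted example.
BASE = {
    "\u0421": "S",  # CYRILLIC CAPITAL LETTER ES
    "\u0445": "h",  # CYRILLIC SMALL LETTER HA
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043c": "m",  # CYRILLIC SMALL LETTER EM
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
}
UPPER_DIGRAPH = {**BASE, "\u0428": "SH"}  # CYRILLIC CAPITAL LETTER SHA as "SH"
TITLE_DIGRAPH = {**BASE, "\u0428": "Sh"}  # CYRILLIC CAPITAL LETTER SHA as "Sh"

def transliterate(text, table):
    """Replace each character using `table`; pass unknown characters through."""
    return "".join(table.get(ch, ch) for ch in text)

# The two words from the quoted example.
words = ["\u0421\u0445\u0435\u043c\u0430", "\u0428\u0435\u043c\u0430"]
upper = [transliterate(w, UPPER_DIGRAPH) for w in words]
title = [transliterate(w, TITLE_DIGRAPH) for w in words]
print(upper)  # two distinct results
print(title)  # both words map to the same string
```

This only demonstrates that the collision is real; it says nothing about
whether the all-caps digraphs are backed by a standard.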

> 
> > 
> >> +% CYRILLIC CAPITAL LETTER DJE
> >> +<U0402> <U0110>;"<U0044><U004A>"
> > 
> > This says "DJ" but System B does not mention it.  Where does it come 
> > from? Also, I think it should be "Dj" rather than "DJ".
> I took the first two letters from its name.

As I said previously, I would like to add more Cyrillic letters even if
they are not regulated by any standard.  But let's separate them and make
it clear which rules are based on GOST 7.79 and which are our own
invention (or come from another standard, etc.).  I think that all these
rules may even be in the same file but in different parts of it.

> >> [...]
> >> +% CYRILLIC UNDEFINED
> >> +<U0423><U0301> <U00DA>;"<U0055><U0060>"
> > 
> > 1. I think it should be named "CYRILLIC CAPITAL LETTER U WITH ACUTE".
> > 2. OK, the System A table mentions this letter but System B does not.
> > Somehow we should handle it.  I think that "U`" is the best we can do
> > for now.
> > 3. It must be tested whether this actually works.
> 1. Let's do it just before you are ready to commit the patch, because it
> breaks formulas in my worksheet and I will have to do it manually?
> 3. I have tested and it doesn't work/gets ignored. But if you were to
> handle COMBINING it would work, wouldn't it?

My guess is that since translit_combining just removes all those combining
diacritic characters and translit_combining is usually included before
translit_cyrillic, <U0301> is removed even before <U0423> is taken
into account.  My other guess is that it might work well if you
just removed this rule: <U0423> would be transliterated to "U", <U0301>
would remain unchanged, and eventually those two characters would produce
"Ú".  But, again, that's just a guess; I have not tested it.

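The last step of that guess relies on plain Unicode composition, which can
be checked independently of glibc.  The following Python sketch uses only the
standard unicodedata module; it shows what the code points are and does not
test iconv or the locale files in any way.

```python
import unicodedata

cyr = "\u0423\u0301"  # CYRILLIC CAPITAL LETTER U + COMBINING ACUTE ACCENT
lat = "U\u0301"       # LATIN CAPITAL LETTER U + COMBINING ACUTE ACCENT

# There is no precomposed Cyrillic "U with acute", so NFC keeps two code points.
cyr_nfc = unicodedata.normalize("NFC", cyr)
# The Latin pair composes to the single code point U+00DA.
lat_nfc = unicodedata.normalize("NFC", lat)

print(len(cyr_nfc), [unicodedata.name(c) for c in cyr_nfc])
print(len(lat_nfc), unicodedata.name(lat_nfc))
```

So the input really is a two-character sequence, and "U" followed by
<U0301> is canonically equivalent to <U00DA>.
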
> >> [...]
> >> +% CYRILLIC CAPITAL LETTER HA
> >> +<U0425> <U0048>;<U0058>
> > 
> > I don't think that "H" is unavailable in any encoding, so it
> > will always be transliterated as "H" and never as "X".  We can't
> > help it and I don't think it is bad.
> > 
> But we can keep this for when/if there is a way to explicitly request
> transcription instead of transliteration.

Note that either it will make the test cases fail or we will have to
write the test cases to deliberately skip the transliteration of <U0425>
into "X", because "H" will always work.  We can't force iconv
to choose the second transliteration rule if the first one works.

That means we will have a problem constructing the test cases.
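
To illustrate why the second alternative is unreachable, here is a small
Python model of "take the first alternative that the target charset can
represent".  It is only a sketch of the behaviour described above, not
glibc's implementation; the function name and the use of Python codecs as
stand-ins for target charsets are assumptions.

```python
def pick_alternative(alternatives, encoding):
    """Return the first alternative representable in `encoding`, or None."""
    for candidate in alternatives:
        try:
            candidate.encode(encoding)
        except UnicodeEncodeError:
            continue
        return candidate
    return None

HA = ["H", "X"]        # <U0425> <U0048>;<U0058>
IO = ["\u00cb", "YO"]  # <U0401> <U00CB>;"<U0059><U004F>"

for name, alts in (("HA", HA), ("IO", IO)):
    for enc in ("ascii", "latin-1"):
        print(name, enc, pick_alternative(alts, enc))
```

In this model the rule for IO yields different results for ASCII and
Latin-1, while the rule for HA yields "H" for both, so no test input can
ever exercise "X".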

> >> +% CYRILLIC CAPITAL LETTER TSE
> >> +<U0426> <U0043>;"<U0043><U005A>"
> > 
> > 1. "CZ" - maybe should be "Cz"?
> > 2. Are we able to implement the rule: "c before i, e, y, j"?
> > 
> 1. see for CYRILLIC CAPITAL LETTER IO
> 2. not sure what you are talking about in 2. but I believe it's not
> possible as per Marko's email.

Hm... I can't find a good example now.  Maybe I was misled by the rules
of Cyrillic transliteration which I learned at school and which are not
necessarily universal and not necessarily useful for English readers.

> >> +% CYRILLIC CAPITAL LETTER HARD SIGN
> >> +<U042A> <U02BA>;"<U0041><U0060>"
> > 
> > "A`" is only for Bulgarian and should go to bg_BG.  How should we 
> > transliterate an upper case hard sign to plain ASCII?  I think that 
> > just "``", same as lower case.
> This is to avoid collision.

What collision?

> Besides AFAIK e.g. in Russian there is no
> capital hard sign because there are no words starting with it.

True but it can be used in ALL UPPERCASE text.  Therefore we need a clear
and correct transliteration rule for it.

> 
> > 
> >> +% CYRILLIC CAPITAL LETTER YERU
> >> +<U042B> <U0059>;"<U0059><U0060>"
> > 
> > Again, as "Y" is always available it will never be transliterated as 
> > "Y`".
> > 
> But we can keep this for when/if there is a way to explicitly request
> transcription instead of transliteration.

Again, it will be difficult or impossible to construct a correct test case
and we must be aware of this.

Regards,

Rafal


[1] https://sourceware.org/ml/libc-alpha/2018-10/msg00300.html
[2] https://sourceware.org/ml/libc-alpha/2018-10/msg00213.html
[3] https://sourceware.org/glibc/wiki/Contribution%20checklist
[4] https://sourceware.org/bugzilla/show_bug.cgi?id=2872#c47
[5] https://sourceware.org/ml/libc-alpha/2018-10/msg00232.html

Patch

diff -uNr a/localedata/locales/C b/localedata/locales/C
--- a/localedata/locales/C	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/C	2018-10-11 15:10:43.000000000 +0000
@@ -2293,6 +2293,7 @@ 
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
--- a/localedata/locales/aa_DJ	2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/aa_DJ	2018-10-11 15:10:43.000000000 +0000
@@ -68,6 +68,7 @@ 
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/af_ZA b/localedata/locales/af_ZA
--- a/localedata/locales/af_ZA	2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/af_ZA	2018-10-11 15:10:43.000000000 +0000
@@ -70,6 +70,7 @@ 
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ak_GH b/localedata/locales/ak_GH
--- a/localedata/locales/ak_GH	2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/ak_GH	2018-10-11 15:10:43.000000000 +0000
@@ -54,6 +54,7 @@ 
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
--- a/localedata/locales/am_ET	2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/am_ET	2018-10-11 15:10:43.000000000 +0000
@@ -1394,6 +1394,7 @@ 
 <U137A>    <U0060><U0039><U0030>
 <U137B>    <U0060><U0031><U0030><U0030>
 <U137C>    <U0060><U0031><U0030><U0030><U0030><U0030>
+include "translit_cyrillic";""
 translit_end
 %
 END LC_CTYPE
diff -uNr a/localedata/locales/ar_EG b/localedata/locales/ar_EG
--- a/localedata/locales/ar_EG	2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/ar_EG	2018-10-11 15:10:43.000000000 +0000
@@ -44,6 +44,7 @@ 
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/be_BY b/localedata/locales/be_BY
--- a/localedata/locales/be_BY	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/be_BY	2018-10-11 15:10:43.000000000 +0000
@@ -68,6 +68,7 @@ 
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
--- a/localedata/locales/bem_ZM	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bem_ZM	2018-10-11 15:10:43.000000000 +0000
@@ -41,6 +41,7 @@ 
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
--- a/localedata/locales/ber_DZ	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/ber_DZ	2018-10-11 15:10:43.000000000 +0000
@@ -165,6 +165,7 @@ 
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ber_MA b/localedata/locales/ber_MA
--- a/localedata/locales/ber_MA	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/ber_MA	2018-10-11 15:10:44.000000000 +0000
@@ -85,6 +85,7 @@ 
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/bg_BG b/localedata/locales/bg_BG
--- a/localedata/locales/bg_BG	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bg_BG	2018-10-11 15:10:44.000000000 +0000
@@ -49,6 +49,7 @@ 
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/bi_VU b/localedata/locales/bi_VU
--- a/localedata/locales/bi_VU	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bi_VU	2018-10-11 15:10:44.000000000 +0000
@@ -39,6 +39,7 @@ 
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/bn_BD b/localedata/locales/bn_BD
--- a/localedata/locales/bn_BD	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bn_BD	2018-10-11 15:10:44.000000000 +0000
@@ -61,6 +61,7 @@ 
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/bo_CN b/localedata/locales/bo_CN
--- a/localedata/locales/bo_CN	2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bo_CN	2018-10-11 15:10:44.000000000 +0000
@@ -43,6 +43,7 @@ 
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ca_ES b/localedata/locales/ca_ES
--- a/localedata/locales/ca_ES	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/ca_ES	2018-10-11 15:10:44.000000000 +0000
@@ -71,6 +71,7 @@ 
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ce_RU b/localedata/locales/ce_RU
--- a/localedata/locales/ce_RU	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/ce_RU	2018-10-11 15:10:44.000000000 +0000
@@ -38,6 +38,7 @@ 
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/cmn_TW b/localedata/locales/cmn_TW
--- a/localedata/locales/cmn_TW	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cmn_TW	2018-10-11 15:10:44.000000000 +0000
@@ -49,6 +49,7 @@ 
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 
 class	"hanzi"; /
diff -uNr a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
--- a/localedata/locales/cs_CZ	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cs_CZ	2018-10-11 15:10:44.000000000 +0000
@@ -204,6 +204,7 @@ 
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/cv_RU b/localedata/locales/cv_RU
--- a/localedata/locales/cv_RU	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cv_RU	2018-10-11 15:10:44.000000000 +0000
@@ -108,6 +108,7 @@ 
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/cy_GB b/localedata/locales/cy_GB
--- a/localedata/locales/cy_GB	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cy_GB	2018-10-11 15:10:44.000000000 +0000
@@ -65,6 +65,7 @@ 
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/da_DK b/localedata/locales/da_DK
--- a/localedata/locales/da_DK	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/da_DK	2018-10-11 15:10:44.000000000 +0000
@@ -166,6 +166,7 @@ 
 % LATIN SMALL LETTER O WITH STROKE -> "oe"
 <U00F8> "<U006F><U0338>";"<U006F><U0065>"
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/de_DE b/localedata/locales/de_DE
--- a/localedata/locales/de_DE	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/de_DE	2018-10-11 15:10:44.000000000 +0000
@@ -78,6 +78,7 @@ 
 % DOUBLE HIGH-REVERSED-9 QUOTATION MARK
 <U201F> <U00AB>;<U0022>
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/dv_MV b/localedata/locales/dv_MV
--- a/localedata/locales/dv_MV	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/dv_MV	2018-10-11 15:10:44.000000000 +0000
@@ -51,6 +51,7 @@ 
 include "translit_combining";""
 
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/dz_BT b/localedata/locales/dz_BT
--- a/localedata/locales/dz_BT	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/dz_BT	2018-10-11 15:10:44.000000000 +0000
@@ -59,6 +59,7 @@ 
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/el_GR b/localedata/locales/el_GR
--- a/localedata/locales/el_GR	2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/el_GR	2018-10-11 15:10:44.000000000 +0000
@@ -58,6 +58,7 @@ 
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/en_GB b/localedata/locales/en_GB
--- a/localedata/locales/en_GB	2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/en_GB	2018-10-11 15:10:44.000000000 +0000
@@ -54,6 +54,7 @@ 
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/en_NG b/localedata/locales/en_NG
--- a/localedata/locales/en_NG	2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/en_NG	2018-10-11 15:10:45.000000000 +0000
@@ -49,6 +49,7 @@ 
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/en_ZM b/localedata/locales/en_ZM
--- a/localedata/locales/en_ZM	2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/en_ZM	2018-10-11 15:10:45.000000000 +0000
@@ -41,6 +41,7 @@ 
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/es_CU b/localedata/locales/es_CU
--- a/localedata/locales/es_CU	2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/es_CU	2018-10-11 15:10:45.000000000 +0000
@@ -59,6 +59,7 @@ 
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/es_ES b/localedata/locales/es_ES
--- a/localedata/locales/es_ES	2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/es_ES	2018-10-11 15:10:45.000000000 +0000
@@ -72,6 +72,7 @@ 
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/et_EE b/localedata/locales/et_EE
--- a/localedata/locales/et_EE	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/et_EE	2018-10-11 15:10:45.000000000 +0000
@@ -112,6 +112,7 @@ 
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/fa_IR b/localedata/locales/fa_IR
--- a/localedata/locales/fa_IR	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/fa_IR	2018-10-11 15:10:45.000000000 +0000
@@ -78,6 +78,7 @@ 
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ff_SN b/localedata/locales/ff_SN
--- a/localedata/locales/ff_SN	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/ff_SN	2018-10-11 15:10:45.000000000 +0000
@@ -41,6 +41,7 @@ 
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/fi_FI b/localedata/locales/fi_FI
--- a/localedata/locales/fi_FI	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/fi_FI	2018-10-11 15:10:45.000000000 +0000
@@ -136,6 +136,7 @@ 
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/fr_FR b/localedata/locales/fr_FR
--- a/localedata/locales/fr_FR	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/fr_FR	2018-10-11 15:10:45.000000000 +0000
@@ -58,6 +58,7 @@ 
 % In France, accents are simply omitted if they cannot be represented.
 include "translit_combining";""
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/ga_IE b/localedata/locales/ga_IE
--- a/localedata/locales/ga_IE	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/ga_IE	2018-10-11 15:10:45.000000000 +0000
@@ -53,6 +53,7 @@ 
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/gd_GB b/localedata/locales/gd_GB
--- a/localedata/locales/gd_GB	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/gd_GB	2018-10-11 15:10:45.000000000 +0000
@@ -45,6 +45,7 @@ 
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/gu_IN b/localedata/locales/gu_IN
--- a/localedata/locales/gu_IN	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/gu_IN	2018-10-11 15:10:45.000000000 +0000
@@ -62,6 +62,7 @@ 
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/gv_GB b/localedata/locales/gv_GB
--- a/localedata/locales/gv_GB	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/gv_GB	2018-10-11 15:10:45.000000000 +0000
@@ -56,6 +56,7 @@ 
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/he_IL b/localedata/locales/he_IL
--- a/localedata/locales/he_IL	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/he_IL	2018-10-11 15:10:45.000000000 +0000
@@ -58,6 +58,7 @@ 
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/hi_IN b/localedata/locales/hi_IN
--- a/localedata/locales/hi_IN	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/hi_IN	2018-10-11 15:10:45.000000000 +0000
@@ -61,6 +61,7 @@ 
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
--- a/localedata/locales/hif_FJ	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/hif_FJ	2018-10-11 15:10:45.000000000 +0000
@@ -37,6 +37,7 @@ 
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/hr_HR b/localedata/locales/hr_HR
--- a/localedata/locales/hr_HR	2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/hr_HR	2018-10-11 15:10:45.000000000 +0000
@@ -61,6 +61,7 @@ 
 % transliterate <U0111> {đ} into d + j
 <U0111> "<U0064><U006A>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ht_HT b/localedata/locales/ht_HT
--- a/localedata/locales/ht_HT	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ht_HT	2018-10-11 15:10:45.000000000 +0000
@@ -57,6 +57,7 @@ 
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/hu_HU b/localedata/locales/hu_HU
--- a/localedata/locales/hu_HU	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/hu_HU	2018-10-11 15:10:46.000000000 +0000
@@ -476,6 +476,7 @@ 
 <U00FC> "<U0075><U0308>";"<U0075><U00A8>";"<U0075><U003A>"
 <U0171> "<U0075><U030B>";"<U0075><U02DD>";"<U0075><U0022>"
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/hy_AM b/localedata/locales/hy_AM
--- a/localedata/locales/hy_AM	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/hy_AM	2018-10-11 15:10:46.000000000 +0000
@@ -75,6 +75,7 @@ 
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/id_ID b/localedata/locales/id_ID
--- a/localedata/locales/id_ID	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/id_ID	2018-10-11 15:10:46.000000000 +0000
@@ -54,6 +54,7 @@ 
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/is_IS b/localedata/locales/is_IS
--- a/localedata/locales/is_IS	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/is_IS	2018-10-11 15:10:46.000000000 +0000
@@ -149,6 +149,7 @@ 
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/it_IT b/localedata/locales/it_IT
--- a/localedata/locales/it_IT	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/it_IT	2018-10-11 15:10:46.000000000 +0000
@@ -58,6 +58,7 @@ 
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ja_JP b/localedata/locales/ja_JP
--- a/localedata/locales/ja_JP	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ja_JP	2018-10-11 15:10:46.000000000 +0000
@@ -1681,6 +1681,7 @@ 
 include "translit_combining";""
 include "translit_cjk_variants";""
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/kab_DZ b/localedata/locales/kab_DZ
--- a/localedata/locales/kab_DZ	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kab_DZ	2018-10-11 15:10:46.000000000 +0000
@@ -41,6 +41,7 @@ 
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
--- a/localedata/locales/kk_KZ	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kk_KZ	2018-10-11 15:10:46.000000000 +0000
@@ -157,6 +157,7 @@ 
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/km_KH b/localedata/locales/km_KH
--- a/localedata/locales/km_KH	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/km_KH	2018-10-11 15:10:46.000000000 +0000
@@ -42,6 +42,7 @@ 
 copy "i18n"
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/kn_IN b/localedata/locales/kn_IN
--- a/localedata/locales/kn_IN	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kn_IN	2018-10-11 15:10:46.000000000 +0000
@@ -63,6 +63,7 @@ 
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ko_KR b/localedata/locales/ko_KR
--- a/localedata/locales/ko_KR	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ko_KR	2018-10-11 15:10:47.000000000 +0000
@@ -6099,6 +6099,7 @@ 
 include "translit_combining";""
 include "translit_hangul";""
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/ks_IN b/localedata/locales/ks_IN
--- a/localedata/locales/ks_IN	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ks_IN	2018-10-11 15:10:47.000000000 +0000
@@ -46,6 +46,7 @@ 
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/kw_GB b/localedata/locales/kw_GB
--- a/localedata/locales/kw_GB	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kw_GB	2018-10-11 15:10:47.000000000 +0000
@@ -57,6 +57,7 @@ 
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/lb_LU b/localedata/locales/lb_LU
--- a/localedata/locales/lb_LU	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/lb_LU	2018-10-11 15:10:47.000000000 +0000
@@ -77,6 +77,7 @@ 
 % LATIN SMALL LETTER E WITH CIRCUMFLEX
 <U00EA> "e^"
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/lg_UG b/localedata/locales/lg_UG
--- a/localedata/locales/lg_UG	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/lg_UG	2018-10-11 15:10:47.000000000 +0000
@@ -56,6 +56,7 @@ 
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/lij_IT b/localedata/locales/lij_IT
--- a/localedata/locales/lij_IT	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/lij_IT	2018-10-11 15:10:47.000000000 +0000
@@ -47,6 +47,7 @@ 
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ln_CD b/localedata/locales/ln_CD
--- a/localedata/locales/ln_CD	2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ln_CD	2018-10-11 15:10:47.000000000 +0000
@@ -39,6 +39,7 @@ 
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/lo_LA b/localedata/locales/lo_LA
--- a/localedata/locales/lo_LA	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/lo_LA	2018-10-11 15:10:47.000000000 +0000
@@ -50,6 +50,7 @@ 
 copy "i18n"
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/lt_LT b/localedata/locales/lt_LT
--- a/localedata/locales/lt_LT	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/lt_LT	2018-10-11 15:10:47.000000000 +0000
@@ -163,6 +163,7 @@ 
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/lv_LV b/localedata/locales/lv_LV
--- a/localedata/locales/lv_LV	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/lv_LV	2018-10-11 15:10:47.000000000 +0000
@@ -110,6 +110,7 @@ 
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/mg_MG b/localedata/locales/mg_MG
--- a/localedata/locales/mg_MG	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mg_MG	2018-10-11 15:10:47.000000000 +0000
@@ -54,6 +54,7 @@ 
 % Accents are simply omitted if they cannot be represented.
 include "translit_combining";""
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
--- a/localedata/locales/mhr_RU	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mhr_RU	2018-10-11 15:10:47.000000000 +0000
@@ -58,6 +58,7 @@ 
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/mk_MK b/localedata/locales/mk_MK
--- a/localedata/locales/mk_MK	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mk_MK	2018-10-11 15:10:47.000000000 +0000
@@ -48,6 +48,7 @@ 
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ml_IN b/localedata/locales/ml_IN
--- a/localedata/locales/ml_IN	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/ml_IN	2018-10-11 15:10:47.000000000 +0000
@@ -60,6 +60,7 @@ 
 
 translit_start
 include     "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 %
diff -uNr a/localedata/locales/ms_MY b/localedata/locales/ms_MY
--- a/localedata/locales/ms_MY	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/ms_MY	2018-10-11 15:10:48.000000000 +0000
@@ -45,6 +45,7 @@ 
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/mt_MT b/localedata/locales/mt_MT
--- a/localedata/locales/mt_MT	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mt_MT	2018-10-11 15:10:48.000000000 +0000
@@ -47,6 +47,7 @@ 
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/nan_TW@latin b/localedata/locales/nan_TW@latin
--- a/localedata/locales/nan_TW@latin	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nan_TW@latin	2018-10-11 15:10:48.000000000 +0000
@@ -52,6 +52,7 @@ 
 % accents are simply omitted if they cannot be represented.
 include "translit_combining";""
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/nb_NO b/localedata/locales/nb_NO
--- a/localedata/locales/nb_NO	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nb_NO	2018-10-11 15:10:48.000000000 +0000
@@ -154,6 +154,7 @@ 
 % LATIN SMALL LETTER O WITH STROKE -> "oe"
 <U00F8> "<U006F><U0338>";"<U006F><U0065>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ne_NP b/localedata/locales/ne_NP
--- a/localedata/locales/ne_NP	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/ne_NP	2018-10-11 15:10:48.000000000 +0000
@@ -43,6 +43,7 @@ 
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
--- a/localedata/locales/nhn_MX	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nhn_MX	2018-10-11 15:10:48.000000000 +0000
@@ -59,6 +59,7 @@ 
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/niu_NU b/localedata/locales/niu_NU
--- a/localedata/locales/niu_NU	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/niu_NU	2018-10-11 15:10:48.000000000 +0000
@@ -58,6 +58,7 @@ 
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
--- a/localedata/locales/niu_NZ	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/niu_NZ	2018-10-11 15:10:48.000000000 +0000
@@ -58,6 +58,7 @@ 
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/nl_NL b/localedata/locales/nl_NL
--- a/localedata/locales/nl_NL	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nl_NL	2018-10-11 15:10:48.000000000 +0000
@@ -56,6 +56,7 @@ 
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
--- a/localedata/locales/nr_ZA	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nr_ZA	2018-10-11 15:10:48.000000000 +0000
@@ -64,6 +64,7 @@ 
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/oc_FR b/localedata/locales/oc_FR
--- a/localedata/locales/oc_FR	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/oc_FR	2018-10-11 15:10:48.000000000 +0000
@@ -54,6 +54,7 @@ 
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/om_KE b/localedata/locales/om_KE
--- a/localedata/locales/om_KE	2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/om_KE	2018-10-11 15:10:48.000000000 +0000
@@ -138,6 +138,7 @@ 
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/or_IN b/localedata/locales/or_IN
--- a/localedata/locales/or_IN	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/or_IN	2018-10-11 15:10:48.000000000 +0000
@@ -62,6 +62,7 @@ 
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/os_RU b/localedata/locales/os_RU
--- a/localedata/locales/os_RU	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/os_RU	2018-10-11 15:10:48.000000000 +0000
@@ -69,6 +69,7 @@ 
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/pa_IN b/localedata/locales/pa_IN
--- a/localedata/locales/pa_IN	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pa_IN	2018-10-11 15:10:48.000000000 +0000
@@ -60,6 +60,7 @@ 
 
 translit_start
 include     "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/pa_PK b/localedata/locales/pa_PK
--- a/localedata/locales/pa_PK	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pa_PK	2018-10-11 15:10:48.000000000 +0000
@@ -57,6 +57,7 @@ 
 % Farsi yeh -> yeh
 <U06CC> "<U064A>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/pl_PL b/localedata/locales/pl_PL
--- a/localedata/locales/pl_PL	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pl_PL	2018-10-11 15:10:48.000000000 +0000
@@ -116,6 +116,7 @@ 
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/pt_PT b/localedata/locales/pt_PT
--- a/localedata/locales/pt_PT	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pt_PT	2018-10-11 15:10:48.000000000 +0000
@@ -58,6 +58,7 @@ 
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/quz_PE b/localedata/locales/quz_PE
--- a/localedata/locales/quz_PE	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/quz_PE	2018-10-11 15:10:48.000000000 +0000
@@ -55,6 +55,7 @@ 
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ro_RO b/localedata/locales/ro_RO
--- a/localedata/locales/ro_RO	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/ro_RO	2018-10-11 15:10:49.000000000 +0000
@@ -143,6 +143,7 @@ 
 <U0162> "<U021A>";"<U0054>"
 <U0163> "<U021B>";"<U0074>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ru_RU b/localedata/locales/ru_RU
--- a/localedata/locales/ru_RU	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/ru_RU	2018-10-11 15:10:49.000000000 +0000
@@ -73,6 +73,7 @@ 
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/rw_RW b/localedata/locales/rw_RW
--- a/localedata/locales/rw_RW	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/rw_RW	2018-10-11 15:10:49.000000000 +0000
@@ -45,6 +45,7 @@ 
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sa_IN b/localedata/locales/sa_IN
--- a/localedata/locales/sa_IN	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sa_IN	2018-10-11 15:10:49.000000000 +0000
@@ -44,6 +44,7 @@ 
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sd_IN b/localedata/locales/sd_IN
--- a/localedata/locales/sd_IN	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sd_IN	2018-10-11 15:10:49.000000000 +0000
@@ -46,6 +46,7 @@ 
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sd_IN@devanagari b/localedata/locales/sd_IN@devanagari
--- a/localedata/locales/sd_IN@devanagari	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sd_IN@devanagari	2018-10-11 15:10:49.000000000 +0000
@@ -44,6 +44,7 @@ 
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sd_PK b/localedata/locales/sd_PK
--- a/localedata/locales/sd_PK	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sd_PK	2018-10-11 15:10:49.000000000 +0000
@@ -39,6 +39,7 @@ 
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/se_NO b/localedata/locales/se_NO
--- a/localedata/locales/se_NO	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/se_NO	2018-10-11 15:10:49.000000000 +0000
@@ -204,6 +204,7 @@ 
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
--- a/localedata/locales/sgs_LT	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sgs_LT	2018-10-11 15:10:49.000000000 +0000
@@ -58,6 +58,7 @@ 
 copy "i18n"
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/shn_MM b/localedata/locales/shn_MM
--- a/localedata/locales/shn_MM	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/shn_MM	2018-10-11 15:10:49.000000000 +0000
@@ -58,6 +58,7 @@ 
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/si_LK b/localedata/locales/si_LK
--- a/localedata/locales/si_LK	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/si_LK	2018-10-11 15:10:49.000000000 +0000
@@ -44,6 +44,7 @@ 
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sk_SK b/localedata/locales/sk_SK
--- a/localedata/locales/sk_SK	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sk_SK	2018-10-11 15:10:49.000000000 +0000
@@ -67,6 +67,7 @@ 
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sl_SI b/localedata/locales/sl_SI
--- a/localedata/locales/sl_SI	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sl_SI	2018-10-11 15:10:49.000000000 +0000
@@ -90,6 +90,7 @@ 
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sm_WS b/localedata/locales/sm_WS
--- a/localedata/locales/sm_WS	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sm_WS	2018-10-11 15:10:49.000000000 +0000
@@ -37,6 +37,7 @@ 
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/so_SO b/localedata/locales/so_SO
--- a/localedata/locales/so_SO	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/so_SO	2018-10-11 15:10:49.000000000 +0000
@@ -68,6 +68,7 @@ 
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sq_AL b/localedata/locales/sq_AL
--- a/localedata/locales/sq_AL	2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sq_AL	2018-10-11 15:10:49.000000000 +0000
@@ -45,6 +45,7 @@ 
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
--- a/localedata/locales/ss_ZA	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/ss_ZA	2018-10-11 15:10:49.000000000 +0000
@@ -66,6 +66,7 @@ 
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/st_ZA b/localedata/locales/st_ZA
--- a/localedata/locales/st_ZA	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/st_ZA	2018-10-11 15:10:50.000000000 +0000
@@ -62,6 +62,7 @@ 
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sv_SE b/localedata/locales/sv_SE
--- a/localedata/locales/sv_SE	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/sv_SE	2018-10-11 15:10:50.000000000 +0000
@@ -138,6 +138,7 @@ 
 % LATIN SMALL LETTER O WITH STROKE -> "oe"
 <U00F8> "<U006F><U0338>";"<U006F><U0065>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/sw_KE b/localedata/locales/sw_KE
--- a/localedata/locales/sw_KE	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/sw_KE	2018-10-11 15:10:50.000000000 +0000
@@ -43,6 +43,7 @@ 
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ta_IN b/localedata/locales/ta_IN
--- a/localedata/locales/ta_IN	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/ta_IN	2018-10-11 15:10:50.000000000 +0000
@@ -63,6 +63,7 @@ 
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/te_IN b/localedata/locales/te_IN
--- a/localedata/locales/te_IN	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/te_IN	2018-10-11 15:10:50.000000000 +0000
@@ -63,6 +63,7 @@ 
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/th_TH b/localedata/locales/th_TH
--- a/localedata/locales/th_TH	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/th_TH	2018-10-11 15:10:50.000000000 +0000
@@ -57,6 +57,7 @@ 
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ti_ET b/localedata/locales/ti_ET
--- a/localedata/locales/ti_ET	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/ti_ET	2018-10-11 15:10:50.000000000 +0000
@@ -864,6 +864,7 @@ 
 <U137C>    <U0060><U0031><U0030><U0030><U0030><U0030>
 
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 %
 END LC_CTYPE
diff -uNr a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
--- a/localedata/locales/tn_ZA	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/tn_ZA	2018-10-11 15:10:50.000000000 +0000
@@ -67,6 +67,7 @@ 
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/to_TO b/localedata/locales/to_TO
--- a/localedata/locales/to_TO	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/to_TO	2018-10-11 15:10:50.000000000 +0000
@@ -36,6 +36,7 @@ 
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
--- a/localedata/locales/tpi_PG	2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/tpi_PG	2018-10-11 15:10:50.000000000 +0000
@@ -44,6 +44,7 @@ 
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/tr_TR b/localedata/locales/tr_TR
--- a/localedata/locales/tr_TR	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/tr_TR	2018-10-11 15:10:50.000000000 +0000
@@ -2423,6 +2423,7 @@ 
 
 % TURKISH LIRA SIGN
 <U20BA> "<U0054><U004C>"
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/translit_cyrillic b/localedata/locales/translit_cyrillic
--- a/localedata/locales/translit_cyrillic	1970-01-01 00:00:00.000000000 +0000
+++ b/localedata/locales/translit_cyrillic	2018-10-11 15:10:52.000000000 +0000
@@ -0,0 +1,383 @@ 
+escape_char /
+comment_char %
+
+% This file is part of the GNU C Library and contains locale data.
+% The Free Software Foundation does not claim any copyright interest
+% in the locale data contained in this file.  The foregoing does not
+% affect the license of the GNU C Library as a whole.  It does not
+% exempt you from the conditions of the license if your use would
+% otherwise be governed by that license.
+
+% Transliterations of Cyrillic letters to Latin and/or ASCII symbols.
+% Inspired by ISO 9:1995 / GOST 7.79-2000.
+% Covers the Unicode range https://www.unicode.org/charts/PDF/U0400.pdf,
+% i.e. [U0401-U04F9, U2019], but only the letters covered by ISO 9:1995.
+% The first option implements GOST 7.79 System A (Latin script with
+% diacritics); the second implements System B (ASCII only). See
+% https://en.wikipedia.org/wiki/ISO_9 for reference.
+% System B is extended beyond GOST 7.79 for Russian using openly
+% available transliteration mappings and the "h/`" diacritics logic.
+
+% Usage examples:
+% iconv -f UTF-8 -t ISO-8859-15//TRANSLIT \
+%   | iconv -f ISO-8859-15 -t UTF-8 # System A
+% iconv -f UTF-8 -t ASCII//TRANSLIT # System B.
+
+% Contributions welcome for the rest of Cyrillic script in Unicode
+% https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode.
+% Bugfix for https://sourceware.org/bugzilla/show_bug.cgi?id=2872.
+% Generated from UnicodeData.txt with 
+% https://sourceware.org/bugzilla/attachment.cgi?id=11301.
+
+LC_CTYPE
+
+translit_start
+
+% CYRILLIC CAPITAL LETTER IO
+<U0401> <U00CB>;"<U0059><U004F>"
+% CYRILLIC CAPITAL LETTER DJE
+<U0402> <U0110>;"<U0044><U004A>"
+% CYRILLIC CAPITAL LETTER GJE
+<U0403> <U01F4>;"<U0047><U0060>"
+% CYRILLIC CAPITAL LETTER UKRAINIAN IE
+<U0404> <U00CA>;"<U0059><U0065>"
+% CYRILLIC CAPITAL LETTER DZE
+<U0405> <U1E90>;"<U005A><U0060>"
+% CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0406> <U00CC>;<U0049>
+% CYRILLIC CAPITAL LETTER YI
+<U0407> <U00CF>;"<U0059><U0069>"
+% CYRILLIC CAPITAL LETTER JE
+<U0408> "<U004A><U030C>";<U004A>
+% CYRILLIC CAPITAL LETTER LJE
+<U0409> "<U004C><U0302>";"<U004C><U0060>"
+% CYRILLIC CAPITAL LETTER NJE
+<U040A> "<U004E><U0302>";"<U004E><U0060>"
+% CYRILLIC CAPITAL LETTER TSHE
+<U040B> <U0106>;"<U0054><U0053><U0048>"
+% CYRILLIC CAPITAL LETTER KJE
+<U040C> <U1E30>;"<U004B><U0060>"
+% CYRILLIC CAPITAL LETTER SHORT U
+<U040E> <U016C>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER DZHE
+<U040F> "<U0044><U0302>";"<U0044><U0068>"
+% CYRILLIC CAPITAL LETTER A
+<U0410> <U0041>
+% CYRILLIC CAPITAL LETTER BE
+<U0411> <U0042>
+% CYRILLIC CAPITAL LETTER VE
+<U0412> <U0056>
+% CYRILLIC CAPITAL LETTER GHE
+<U0413> <U0047>
+% CYRILLIC CAPITAL LETTER DE
+<U0414> <U0044>
+% CYRILLIC CAPITAL LETTER IE
+<U0415> <U0045>
+% CYRILLIC CAPITAL LETTER ZHE
+<U0416> <U017D>;"<U005A><U0048>"
+% CYRILLIC CAPITAL LETTER ZE
+<U0417> <U005A>
+% CYRILLIC CAPITAL LETTER I
+<U0418> <U0049>
+% CYRILLIC CAPITAL LETTER SHORT I
+<U0419> <U004A>
+% CYRILLIC CAPITAL LETTER KA
+<U041A> <U004B>
+% CYRILLIC CAPITAL LETTER EL
+<U041B> <U004C>
+% CYRILLIC CAPITAL LETTER EM
+<U041C> <U004D>
+% CYRILLIC CAPITAL LETTER EN
+<U041D> <U004E>
+% CYRILLIC CAPITAL LETTER O
+<U041E> <U004F>
+% CYRILLIC CAPITAL LETTER PE
+<U041F> <U0050>
+% CYRILLIC CAPITAL LETTER ER
+<U0420> <U0052>
+% CYRILLIC CAPITAL LETTER ES
+<U0421> <U0053>
+% CYRILLIC CAPITAL LETTER TE
+<U0422> <U0054>
+% CYRILLIC CAPITAL LETTER U
+<U0423> <U0055>
+% CYRILLIC CAPITAL LETTER U WITH COMBINING ACUTE ACCENT
+<U0423><U0301> <U00DA>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER EF
+<U0424> <U0046>
+% CYRILLIC CAPITAL LETTER HA
+<U0425> <U0048>;<U0058>
+% CYRILLIC CAPITAL LETTER TSE
+<U0426> <U0043>;"<U0043><U005A>"
+% CYRILLIC CAPITAL LETTER CHE
+<U0427> <U010C>;"<U0043><U0048>"
+% CYRILLIC CAPITAL LETTER SHA
+<U0428> <U0160>;"<U0053><U0048>"
+% CYRILLIC CAPITAL LETTER SHCHA
+<U0429> <U015C>;"<U0053><U0048><U0048>"
+% CYRILLIC CAPITAL LETTER HARD SIGN
+<U042A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC CAPITAL LETTER YERU
+<U042B> <U0059>;"<U0059><U0060>"
+% CYRILLIC CAPITAL LETTER SOFT SIGN
+<U042C> <U02B9>;<U0060>
+% CYRILLIC CAPITAL LETTER E
+<U042D> <U00C8>;"<U0045><U0060>"
+% CYRILLIC CAPITAL LETTER YU
+<U042E> <U00DB>;"<U0059><U0055>"
+% CYRILLIC CAPITAL LETTER YA
+<U042F> <U00C2>;"<U0059><U0041>"
+% CYRILLIC SMALL LETTER A
+<U0430> <U0061>
+% CYRILLIC SMALL LETTER BE
+<U0431> <U0062>
+% CYRILLIC SMALL LETTER VE
+<U0432> <U0076>
+% CYRILLIC SMALL LETTER GHE
+<U0433> <U0067>
+% CYRILLIC SMALL LETTER DE
+<U0434> <U0064>
+% CYRILLIC SMALL LETTER IE
+<U0435> <U0065>
+% CYRILLIC SMALL LETTER ZHE
+<U0436> <U017E>;"<U007A><U0068>"
+% CYRILLIC SMALL LETTER ZE
+<U0437> <U007A>
+% CYRILLIC SMALL LETTER I
+<U0438> <U0069>
+% CYRILLIC SMALL LETTER SHORT I
+<U0439> <U006A>
+% CYRILLIC SMALL LETTER KA
+<U043A> <U006B>
+% CYRILLIC SMALL LETTER EL
+<U043B> <U006C>
+% CYRILLIC SMALL LETTER EM
+<U043C> <U006D>
+% CYRILLIC SMALL LETTER EN
+<U043D> <U006E>
+% CYRILLIC SMALL LETTER O
+<U043E> <U006F>
+% CYRILLIC SMALL LETTER PE
+<U043F> <U0070>
+% CYRILLIC SMALL LETTER ER
+<U0440> <U0072>
+% CYRILLIC SMALL LETTER ES
+<U0441> <U0073>
+% CYRILLIC SMALL LETTER TE
+<U0442> <U0074>
+% CYRILLIC SMALL LETTER U
+<U0443> <U0075>
+% CYRILLIC SMALL LETTER U WITH COMBINING ACUTE ACCENT
+<U0443><U0301> <U00FA>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER EF
+<U0444> <U0066>
+% CYRILLIC SMALL LETTER HA
+<U0445> <U0068>;<U0078>
+% CYRILLIC SMALL LETTER TSE
+<U0446> <U0063>;"<U0063><U007A>"
+% CYRILLIC SMALL LETTER CHE
+<U0447> <U010D>;"<U0063><U0068>"
+% CYRILLIC SMALL LETTER SHA
+<U0448> <U0161>;"<U0073><U0068>"
+% CYRILLIC SMALL LETTER SHCHA
+<U0449> <U015D>;"<U0073><U0068><U0068>"
+% CYRILLIC SMALL LETTER HARD SIGN
+<U044A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC SMALL LETTER YERU
+<U044B> <U0079>;"<U0079><U0060>"
+% CYRILLIC SMALL LETTER SOFT SIGN
+<U044C> <U02B9>;<U0060>
+% CYRILLIC SMALL LETTER E
+<U044D> <U00E8>;"<U0065><U0060>"
+% CYRILLIC SMALL LETTER YU
+<U044E> <U00FB>;"<U0079><U0075>"
+% CYRILLIC SMALL LETTER YA
+<U044F> <U00E2>;"<U0079><U0061>"
+% CYRILLIC SMALL LETTER IO
+<U0451> <U00EB>;"<U0079><U006F>"
+% CYRILLIC SMALL LETTER DJE
+<U0452> <U0111>;"<U0064><U006A>"
+% CYRILLIC SMALL LETTER GJE
+<U0453> <U01F5>;"<U0067><U0060>"
+% CYRILLIC SMALL LETTER UKRAINIAN IE
+<U0454> <U00EA>;"<U0079><U0065>"
+% CYRILLIC SMALL LETTER DZE
+<U0455> <U1E91>;"<U007A><U0060>"
+% CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0456> <U00EC>;<U0069>
+% CYRILLIC SMALL LETTER YI
+<U0457> <U00EF>;"<U0079><U0069>"
+% CYRILLIC SMALL LETTER JE
+<U0458> <U01F0>;<U006A>
+% CYRILLIC SMALL LETTER LJE
+<U0459> "<U006C><U0302>";"<U006C><U0060>"
+% CYRILLIC SMALL LETTER NJE
+<U045A> "<U006E><U0302>";"<U006E><U0060>"
+% CYRILLIC SMALL LETTER TSHE
+<U045B> <U0107>;"<U0074><U0073><U0068>"
+% CYRILLIC SMALL LETTER KJE
+<U045C> <U1E31>;"<U006B><U0060>"
+% CYRILLIC SMALL LETTER SHORT U
+<U045E> <U016D>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER DZHE
+<U045F> "<U0064><U0302>";"<U0064><U0068>"
+% CYRILLIC CAPITAL LETTER BIG YUS
+<U046A> <U01CD>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BIG YUS
+<U046B> <U01CE>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER FITA
+<U0472> "<U0046><U0300>";"<U0046><U0068>"
+% CYRILLIC SMALL LETTER FITA
+<U0473> "<U0066><U0300>";"<U0066><U0068>"
+% CYRILLIC CAPITAL LETTER IZHITSA
+<U0474> <U1EF2>;"<U0059><U0068>"
+% CYRILLIC SMALL LETTER IZHITSA
+<U0475> <U1EF3>;"<U0079><U0068>"
+% CYRILLIC CAPITAL LETTER SEMISOFT SIGN
+<U048C> <U011A>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER SEMISOFT SIGN
+<U048D> <U011B>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH UPTURN
+<U0490> "<U0047><U0300>";"<U0047><U0060>"
+% CYRILLIC SMALL LETTER GHE WITH UPTURN
+<U0491> "<U0067><U0300>";"<U0067><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH STROKE
+<U0492> <U0120>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH STROKE
+<U0493> <U0121>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK
+<U0494> <U011E>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK
+<U0495> <U011F>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER
+<U0496> "<U017D><U0327>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DESCENDER
+<U0497> "<U017E><U0327>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH DESCENDER
+<U049A> <U0136>;"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH DESCENDER
+<U049B> <U0137>;"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH STROKE
+<U049E> "<U004B><U0304>";"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH STROKE
+<U049F> "<U006B><U0304>";"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER EN WITH DESCENDER
+<U04A2> <U1E46>;"<U004E><U0060>"
+% CYRILLIC SMALL LETTER EN WITH DESCENDER
+<U04A3> <U1E47>;"<U006E><U0060>"
+% CYRILLIC CAPITAL LIGATURE EN GHE
+<U04A4> <U1E44>;"<U004E><U0047>"
+% CYRILLIC SMALL LIGATURE EN GHE
+<U04A5> <U1E45>;"<U006E><U0067>"
+% CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK
+<U04A6> <U1E54>;"<U0050><U0060>"
+% CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK
+<U04A7> <U1E55>;"<U0070><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN HA
+<U04A8> <U00D2>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN HA
+<U04A9> <U00F2>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER ES WITH DESCENDER
+<U04AA> <U00C7>;"<U0043><U0060>"
+% CYRILLIC SMALL LETTER ES WITH DESCENDER
+<U04AB> <U00E7>;"<U0063><U0060>"
+% CYRILLIC CAPITAL LETTER TE WITH DESCENDER
+<U04AC> <U0162>;"<U0054><U0060>"
+% CYRILLIC SMALL LETTER TE WITH DESCENDER
+<U04AD> <U0163>;"<U0074><U0060>"
+% CYRILLIC CAPITAL LETTER STRAIGHT U
+<U04AE> <U00D9>;<U0055>
+% CYRILLIC SMALL LETTER STRAIGHT U
+<U04AF> <U00F9>;<U0075>
+% CYRILLIC CAPITAL LETTER HA WITH DESCENDER
+<U04B2> <U1E28>;"<U0048><U0060>"
+% CYRILLIC SMALL LETTER HA WITH DESCENDER
+<U04B3> <U1E29>;"<U0068><U0060>"
+% CYRILLIC CAPITAL LIGATURE TE TSE
+<U04B4> "<U0043><U0304>";"<U0054><U0043><U005A>"
+% CYRILLIC SMALL LIGATURE TE TSE
+<U04B5> "<U0063><U0304>";"<U0074><U0063><U007A>"
+% CYRILLIC CAPITAL LETTER SHHA
+<U04BA> <U1E24>;"<U0053><U0048><U0060>"
+% CYRILLIC SMALL LETTER SHHA
+<U04BB> <U1E25>;"<U0073><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE
+<U04BC> "<U0043><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE
+<U04BD> "<U0063><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BE> "<U00C7><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BF> "<U00E7><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC LETTER PALOCHKA
+<U04C0> <U2021>;<U0069>
+% CYRILLIC CAPITAL LETTER ZHE WITH BREVE
+<U04C1> "<U005A><U0306>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH BREVE
+<U04C2> "<U007A><U0306>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KHAKASSIAN CHE
+<U04CB> <U00C7>;"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER KHAKASSIAN CHE
+<U04CC> <U00E7>;"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH BREVE
+<U04D0> <U0102>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH BREVE
+<U04D1> <U0103>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH DIAERESIS
+<U04D2> <U00C4>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH DIAERESIS
+<U04D3> <U00E4>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER IE WITH BREVE
+<U04D6> <U0114>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER IE WITH BREVE
+<U04D7> <U0115>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER SCHWA
+<U04D8> "<U0041><U030B>";"<U0041><U0060>"
+% CYRILLIC SMALL LETTER SCHWA
+<U04D9> "<U0061><U030B>";"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS
+<U04DC> "<U005A><U0304>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DIAERESIS
+<U04DD> "<U007A><U0304>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS
+<U04DE> "<U005A><U0308>";"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ZE WITH DIAERESIS
+<U04DF> "<U007A><U0308>";"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN DZE
+<U04E0> <U0179>;"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN DZE
+<U04E1> <U017A>;"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER I WITH DIAERESIS
+<U04E4> <U00CE>;"<U0049><U0060>"
+% CYRILLIC SMALL LETTER I WITH DIAERESIS
+<U04E5> <U00EE>;"<U0069><U0060>"
+% CYRILLIC CAPITAL LETTER O WITH DIAERESIS
+<U04E6> <U00D6>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER O WITH DIAERESIS
+<U04E7> <U00F6>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER BARRED O
+<U04E8> <U00D4>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BARRED O
+<U04E9> <U00F4>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DIAERESIS
+<U04F0> <U00DC>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DIAERESIS
+<U04F1> <U00FC>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE
+<U04F2> <U0170>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE
+<U04F3> <U0171>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS
+<U04F4> "<U0043><U0308>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER CHE WITH DIAERESIS
+<U04F5> "<U0063><U0308>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS
+<U04F8> <U0178>;"<U0059><U0060>"
+% CYRILLIC SMALL LETTER YERU WITH DIAERESIS
+<U04F9> <U00FF>;"<U0079><U0060>"
+% RIGHT SINGLE QUOTATION MARK
+<U2019> <U2035>;<U0027>
+
+translit_end
+
+END LC_CTYPE
diff -uNr a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
--- a/localedata/locales/ts_ZA	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ts_ZA	2018-10-11 15:10:50.000000000 +0000
@@ -62,6 +62,7 @@ 
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/unm_US b/localedata/locales/unm_US
--- a/localedata/locales/unm_US	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/unm_US	2018-10-11 15:10:51.000000000 +0000
@@ -48,6 +48,7 @@ 
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ur_IN b/localedata/locales/ur_IN
--- a/localedata/locales/ur_IN	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ur_IN	2018-10-11 15:10:51.000000000 +0000
@@ -46,6 +46,7 @@ 
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ur_PK b/localedata/locales/ur_PK
--- a/localedata/locales/ur_PK	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ur_PK	2018-10-11 15:10:51.000000000 +0000
@@ -57,6 +57,7 @@ 
 % Farsi yeh -> yeh
 <U06CC> "<U064A>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
--- a/localedata/locales/ve_ZA	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ve_ZA	2018-10-11 15:10:51.000000000 +0000
@@ -65,6 +65,7 @@ 
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/vi_VN b/localedata/locales/vi_VN
--- a/localedata/locales/vi_VN	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/vi_VN	2018-10-11 15:10:51.000000000 +0000
@@ -57,6 +57,7 @@ 
 % dong sign -> d// -> dd
 <U20AB> "<U0111>";"<U0064><U0064>"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/wa_BE b/localedata/locales/wa_BE
--- a/localedata/locales/wa_BE	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/wa_BE	2018-10-11 15:10:51.000000000 +0000
@@ -59,6 +59,7 @@ 
 <U00C5> "A<U030A>";"A";"AU"
 <U00E5> "a<U030A>";"a";"au"
 
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/wo_SN b/localedata/locales/wo_SN
--- a/localedata/locales/wo_SN	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/wo_SN	2018-10-11 15:10:51.000000000 +0000
@@ -54,6 +54,7 @@ 
 % Accents are simply omitted if they cannot be represented.
 include "translit_combining";""
 
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
--- a/localedata/locales/xh_ZA	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/xh_ZA	2018-10-11 15:10:51.000000000 +0000
@@ -64,6 +64,7 @@ 
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE
 
diff -uNr a/localedata/locales/yi_US b/localedata/locales/yi_US
--- a/localedata/locales/yi_US	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/yi_US	2018-10-11 15:10:51.000000000 +0000
@@ -66,6 +66,7 @@ 
 <U05F0> "<U05D5><U05D5>";"ww"
 <U05F1> "<U05D5><U05D9>";"wj"
 <U05F2> "<U05D9><U05D9>";"jj"
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/yuw_PG b/localedata/locales/yuw_PG
--- a/localedata/locales/yuw_PG	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/yuw_PG	2018-10-11 15:10:51.000000000 +0000
@@ -40,6 +40,7 @@ 
 
 translit_start
 include "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 
 END LC_CTYPE
diff -uNr a/localedata/locales/zh_CN b/localedata/locales/zh_CN
--- a/localedata/locales/zh_CN	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/zh_CN	2018-10-11 15:10:51.000000000 +0000
@@ -58,6 +58,7 @@ 
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 
 class	"hanzi"; /
diff -uNr a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
--- a/localedata/locales/zu_ZA	2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/zu_ZA	2018-10-11 15:10:51.000000000 +0000
@@ -68,6 +68,7 @@ 
 
 translit_start
 include  "translit_combining";""
+include "translit_cyrillic";""
 translit_end
 END LC_CTYPE