The Locales of Windows 7, all divvied up

by Michael S. Kaplan, published on 2011/05/24 07:01 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2011/05/24/10167604.aspx


Someone asked me yesterday if there were lists for:

  1. all the locales for which we did full localizations;
  2. all the locales for which we did partial (Language Interface Pack) localizations;
  3. all the locales supported by Windows that fall in neither of these categories.

Well, they knew that #1 and #2 were probably somewhere, but they did not know about #3.

I was a little bored, so I assembled the lists for Windows 7.

A few caveats:

So here are a bunch of lists:

Table #1: The locales representing languages into which Windows 7 is fully localized:

LCID Name Display Name Native Name
0401 ar-SA Arabic (Saudi Arabia) العربية (المملكة العربية السعودية)
0402 bg-BG Bulgarian (Bulgaria) български (България)
0804 zh-CN Chinese (People's Republic of China) 中文(中华人民共和国)
0404 zh-TW Chinese (Taiwan) 中文(台灣)
041a hr-HR Croatian (Croatia) hrvatski (Hrvatska)
0405 cs-CZ Czech (Czech Republic) čeština (Česká republika)
0406 da-DK Danish (Denmark) dansk (Danmark)
0413 nl-NL Dutch (Netherlands) Nederlands (Nederland)
0409 en-US English (United States) English (United States)
0425 et-EE Estonian (Estonia) eesti (Eesti)
040b fi-FI Finnish (Finland) suomi (Suomi)
040c fr-FR French (France) français (France)
0407 de-DE German (Germany) Deutsch (Deutschland)
0408 el-GR Greek (Greece) ελληνικά (Ελλάδα)
040d he-IL Hebrew (Israel) עברית (ישראל)
040e hu-HU Hungarian (Hungary) magyar (Magyarország)
0410 it-IT Italian (Italy) italiano (Italia)
0411 ja-JP Japanese (Japan) 日本語 (日本)
0412 ko-KR Korean (Korea) 한국어 (대한민국)
0426 lv-LV Latvian (Latvia) latviešu (Latvija)
0427 lt-LT Lithuanian (Lithuania) lietuvių (Lietuva)
0414 nb-NO Norwegian, Bokmål (Norway) norsk, bokmål (Norge)
0415 pl-PL Polish (Poland) polski (Polska)
0416 pt-BR Portuguese (Brazil) Português (Brasil)
0816 pt-PT Portuguese (Portugal) português (Portugal)
0418 ro-RO Romanian (Romania) română (România)
0419 ru-RU Russian (Russia) русский (Россия)
081a sr-Latn-CS Serbian (Latin, Serbia and Montenegro (Former)) srpski (Srbija i Crna Gora (Prethodno))
041b sk-SK Slovak (Slovakia) slovenčina (Slovenská republika)
0424 sl-SI Slovenian (Slovenia) slovenski (Slovenija)
0c0a es-ES Spanish (Spain) español (España)
041d sv-SE Swedish (Sweden) svenska (Sverige)
041e th-TH Thai (Thailand) ไทย (ไทย)
041f tr-TR Turkish (Turkey) Türkçe (Türkiye)
0422 uk-UA Ukrainian (Ukraine) україньска (Україна)

 

Table #2: The locales representing languages for which Windows creates Language Interface Packs, aka LIPs:

LCID Name Display Name Native Name
0436 af-ZA Afrikaans (South Africa) Afrikaans (Suid Afrika)
041c sq-AL Albanian (Albania) shqipe (Shqipëria)
045e am-ET Amharic (Ethiopia) አማርኛ (ኢትዮጵያ)
042b hy-AM Armenian (Armenia) Հայերեն (Հայաստան)
044d as-IN Assamese (India) অসমীয়া (ভাৰত)
042c az-Latn-AZ Azeri (Latin, Azerbaijan) Azərbaycan­ılı (Azərbaycanca)
042d eu-ES Basque (Basque) euskara (euskara)
0845 bn-BD Bengali (Bangladesh) বাংলা (বাংলাদেশ)
0445 bn-IN Bengali (India) বাংলা (ভারত)
201a bs-Cyrl-BA Bosnian (Cyrillic, Bosnia and Herzegovina) босански (Босна и Херцеговина)
141a bs-Latn-BA Bosnian (Latin, Bosnia and Herzegovina) bosanski (Bosna i Hercegovina)
0403 ca-ES Catalan (Catalan) català (català)
048c prs-AF Dari (Afghanistan) درى (افغانستان)
0464 fil-PH Filipino (Philippines) Filipino (Pilipinas)
0456 gl-ES Galician (Galician) galego (galego)
0437 ka-GE Georgian (Georgia) ქართული (საქართველო)
0447 gu-IN Gujarati (India) ગુજરાતી (ભારત)
0468 ha-Latn-NG Hausa (Latin, Nigeria) Hausa (Nigeria)
0439 hi-IN Hindi (India) हिंदी (भारत)
040f is-IS Icelandic (Iceland) íslenska (Ísland)
0470 ig-NG Igbo (Nigeria) Igbo (Nigeria)
0421 id-ID Indonesian (Indonesia) Bahasa Indonesia (Indonesia)
085d iu-Latn-CA Inuktitut (Latin, Canada) Inuktitut (kanata)
083c ga-IE Irish (Ireland) Gaeilge (Éire)
0434 xh-ZA isiXhosa (South Africa) isiXhosa (uMzantsi Afrika)
0435 zu-ZA isiZulu (South Africa) isiZulu (iNingizimu Afrika)
044b kn-IN Kannada (India) ಕನ್ನಡ (ಭಾರತ)
043f kk-KZ Kazakh (Kazakhstan) Қазақ (Қазақстан)
0453 km-KH Khmer (Cambodia) ខ្មែរ (កម្ពុជា)
0441 sw-KE Kiswahili (Kenya) Kiswahili (Kenya)
0457 kok-IN Konkani (India) कोंकणी (भारत)
0440 ky-KG Kyrgyz (Kyrgyzstan) Кыргыз (Кыргызстан)
046e lb-LU Luxembourgish (Luxembourg) Lëtzebuergesch (Luxembourg)
042f mk-MK Macedonian (Former Yugoslav Republic of Macedonia) македонски јазик (Македонија)
083e ms-BN Malay (Brunei Darussalam) Bahasa Malaysia (Brunei Darussalam)
043e ms-MY Malay (Malaysia) Bahasa Malaysia (Malaysia)
044c ml-IN Malayalam (India) മലയാളം (ഭാരതം)
043a mt-MT Maltese (Malta) Malti (Malta)
0481 mi-NZ Maori (New Zealand) Reo Māori (Aotearoa)
044e mr-IN Marathi (India) मराठी (भारत)
0450 mn-MN Mongolian (Cyrillic, Mongolia) Монгол хэл (Монгол улс)
0461 ne-NP Nepali (Nepal) नेपाली (नेपाल)
0814 nn-NO Norwegian, Nynorsk (Norway) norsk, nynorsk (Noreg)
0448 or-IN Oriya (India) ଓଡ଼ିଆ (ଭାରତ)
0429 fa-IR Persian (Iran) فارسى (ايران)
0446 pa-IN Punjabi (India) ਪੰਜਾਬੀ (ਭਾਰਤ)
046b quz-BO Quechua (Bolivia) runasimi (Bolivia Suyu)
0c1a sr-Cyrl-CS Serbian (Cyrillic, Serbia and Montenegro (Former)) српски (Србија и Црна Гора (Претходно))
046c nso-ZA Sesotho sa Leboa (South Africa) Sesotho sa Leboa (Afrika Borwa)
0432 tn-ZA Setswana (South Africa) Setswana (Aforika Borwa)
045b si-LK Sinhala (Sri Lanka) සිංහ (ශ්‍රී ලංකා)
0449 ta-IN Tamil (India) தமிழ் (இந்தியா)
0444 tt-RU Tatar (Russia) Татар (Россия)
044a te-IN Telugu (India) తెలుగు (భారత దేశం)
0442 tk-TM Turkmen (Turkmenistan) türkmençe (Türkmenistan)
0420 ur-PK Urdu (Islamic Republic of Pakistan) اُردو (پاکستان)
0443 uz-Latn-UZ Uzbek (Latin, Uzbekistan) U'zbek (U'zbekiston Respublikasi)
042a vi-VN Vietnamese (Vietnam) Tiếng Việt (Việt Nam)
046a yo-NG Yoruba (Nigeria) Yoruba (Nigeria)

 

Table #3: Locales whose identifiers are not directly associated with any localization of Windows, even if a related identifier might represent a suitable localization:

LCID Name Display Name Native Name
0484 gsw-FR Alsatian (France) Elsässisch (Frànkrisch)
1401 ar-DZ Arabic (Algeria) العربية (الجزائر)
3c01 ar-BH Arabic (Bahrain) العربية (البحرين)
0c01 ar-EG Arabic (Egypt) العربية (مصر)
0801 ar-IQ Arabic (Iraq) العربية (العراق)
2c01 ar-JO Arabic (Jordan) العربية (الأردن)
3401 ar-KW Arabic (Kuwait) العربية (الكويت)
3001 ar-LB Arabic (Lebanon) العربية (لبنان)
1001 ar-LY Arabic (Libya) العربية (ليبيا)
1801 ar-MA Arabic (Morocco) العربية (المملكة المغربية)
2001 ar-OM Arabic (Oman) العربية (عمان)
4001 ar-QA Arabic (Qatar) العربية (قطر)
2801 ar-SY Arabic (Syria) العربية (سوريا)
1c01 ar-TN Arabic (Tunisia) العربية (تونس)
3801 ar-AE Arabic (U.A.E.) العربية (الإمارات العربية المتحدة)
2401 ar-YE Arabic (Yemen) العربية (اليمن)
082c az-Cyrl-AZ Azeri (Cyrillic, Azerbaijan) Азәрбајҹан (Азәрбајҹан)
046d ba-RU Bashkir (Russia) Башҡорт (Россия)
0423 be-BY Belarusian (Belarus) Беларускі (Беларусь)
047e br-FR Breton (France) brezhoneg (Frañs)
0c04 zh-HK Chinese (Hong Kong S.A.R.) 中文(香港特别行政區)
1404 zh-MO Chinese (Macao S.A.R.) 中文(澳門特别行政區)
1004 zh-SG Chinese (Singapore) 中文(新加坡)
0483 co-FR Corsican (France) Corsu (France)
101a hr-BA Croatian (Latin, Bosnia and Herzegovina) hrvatski (Bosna i Hercegovina)
0465 dv-MV Divehi (Maldives) ދިވެހިބަސް (ދިވެހި ރާއްޖެ)
0813 nl-BE Dutch (Belgium) Nederlands (België)
0c09 en-AU English (Australia) English (Australia)
2809 en-BZ English (Belize) English (Belize)
1009 en-CA English (Canada) English (Canada)
2409 en-029 English (Caribbean) English (Caribbean)
4009 en-IN English (India) English (India)
1809 en-IE English (Ireland) English (Eire)
2009 en-JM English (Jamaica) English (Jamaica)
4409 en-MY English (Malaysia) English (Malaysia)
1409 en-NZ English (New Zealand) English (New Zealand)
3409 en-PH English (Republic of the Philippines) English (Philippines)
4809 en-SG English (Singapore) English (Singapore)
1c09 en-ZA English (South Africa) English (South Africa)
2c09 en-TT English (Trinidad and Tobago) English (Trinidad y Tobago)
0809 en-GB English (United Kingdom) English (United Kingdom)
3009 en-ZW English (Zimbabwe) English (Zimbabwe)
0438 fo-FO Faroese (Faroe Islands) føroyskt (Føroyar)
080c fr-BE French (Belgium) français (Belgique)
0c0c fr-CA French (Canada) français (Canada)
140c fr-LU French (Luxembourg) français (Luxembourg)
180c fr-MC French (Principality of Monaco) français (Principauté de Monaco)
100c fr-CH French (Switzerland) français (Suisse)
0462 fy-NL Frisian (Netherlands) Frysk (Nederlân)
0c07 de-AT German (Austria) Deutsch (Österreich)
1407 de-LI German (Liechtenstein) Deutsch (Liechtenstein)
1007 de-LU German (Luxembourg) Deutsch (Luxemburg)
0807 de-CH German (Switzerland) Deutsch (Schweiz)
046f kl-GL Greenlandic (Greenland) kalaallisut (Kalaallit Nunaat)
045d iu-Cans-CA Inuktitut (Syllabics, Canada) ᐃᓄᒃᑎᑐᑦ (ᑲᓇᑕ)
0810 it-CH Italian (Switzerland) italiano (Svizzera)
0486 qut-GT K'iche (Guatemala) K'iche (Guatemala)
0487 rw-RW Kinyarwanda (Rwanda) Kinyarwanda (Rwanda)
0454 lo-LA Lao (Lao P.D.R.) ລາວ (ສ.ປ.ປ. ລາວ)
082e dsb-DE Lower Sorbian (Germany) dolnoserbšćina (Nimska)
047a arn-CL Mapudungun (Chile) Mapudungun (Chile)
047c moh-CA Mohawk (Mohawk) Kanien'kéha (Canada)
0850 mn-Mong-CN Mongolian (Traditional Mongolian, PRC) ᠮᠤᠨᠭᠭᠤᠯ ᠬᠡᠯᠡ (ᠪᠦᠭᠦᠳᠡ ᠨᠠᠢᠷᠠᠮᠳᠠᠬᠤ ᠳᠤᠮᠳᠠᠳᠤ ᠠᠷᠠᠳ ᠣᠯᠣᠰ)
0482 oc-FR Occitan (France) Occitan (França)
0463 ps-AF Pashto (Afghanistan) پښتو (افغانستان)
086b quz-EC Quechua (Ecuador) runasimi (Ecuador Suyu)
0c6b quz-PE Quechua (Peru) runasimi (Peru Suyu)
0417 rm-CH Romansh (Switzerland) Rumantsch (Svizra)
243b smn-FI Sami, Inari (Finland) sämikielâ (Suomâ)
103b smj-NO Sami, Lule (Norway) julevusámegiella (Vuodna)
143b smj-SE Sami, Lule (Sweden) julevusámegiella (Svierik)
0c3b se-FI Sami, Northern (Finland) davvisámegiella (Suopma)
043b se-NO Sami, Northern (Norway) davvisámegiella (Norga)
083b se-SE Sami, Northern (Sweden) davvisámegiella (Ruoŧŧa)
203b sms-FI Sami, Skolt (Finland) sääm´ǩiõll (Lää´ddjânnam)
183b sma-NO Sami, Southern (Norway) åarjelsaemiengiele (Nöörje)
1c3b sma-SE Sami, Southern (Sweden) åarjelsaemiengiele (Sveerje)
044f sa-IN Sanskrit (India) संस्कृत (भारतम्)
0491 gd-GB Scottish Gaelic (United Kingdom) Gàidhlig (an Rìoghachd Aonaichte)
1c1a sr-Cyrl-BA Serbian (Cyrillic, Bosnia and Herzegovina) српски (Босна и Херцеговина)
301a sr-Cyrl-ME Serbian (Cyrillic, Montenegro) српски (Црна Гора)
281a sr-Cyrl-RS Serbian (Cyrillic, Serbia) српски (Србија)
181a sr-Latn-BA Serbian (Latin, Bosnia and Herzegovina) srpski (Bosna i Hercegovina)
2c1a sr-Latn-ME Serbian (Latin, Montenegro) srpski (Crna Gora)
241a sr-Latn-RS Serbian (Latin, Serbia) srpski (Srbija)
2c0a es-AR Spanish (Argentina) Español (Argentina)
400a es-BO Spanish (Bolivia) Español (Bolivia)
340a es-CL Spanish (Chile) Español (Chile)
240a es-CO Spanish (Colombia) Español (Colombia)
140a es-CR Spanish (Costa Rica) Español (Costa Rica)
1c0a es-DO Spanish (Dominican Republic) Español (República Dominicana)
300a es-EC Spanish (Ecuador) Español (Ecuador)
440a es-SV Spanish (El Salvador) Español (El Salvador)
100a es-GT Spanish (Guatemala) Español (Guatemala)
480a es-HN Spanish (Honduras) Español (Honduras)
080a es-MX Spanish (Mexico) Español (México)
4c0a es-NI Spanish (Nicaragua) Español (Nicaragua)
180a es-PA Spanish (Panama) Español (Panamá)
3c0a es-PY Spanish (Paraguay) Español (Paraguay)
280a es-PE Spanish (Peru) Español (Perú)
500a es-PR Spanish (Puerto Rico) Español (Puerto Rico)
540a es-US Spanish (United States) Español (Estados Unidos)
380a es-UY Spanish (Uruguay) Español (Uruguay)
200a es-VE Spanish (Venezuela) Español (Republica Bolivariana de Venezuela)
081d sv-FI Swedish (Finland) svenska (Finland)
045a syr-SY Syriac (Syria) ܣܘܪܝܝܐ (سوريا)
0428 tg-Cyrl-TJ Tajik (Cyrillic, Tajikistan) Тоҷикӣ (Тоҷикистон)
085f tzm-Latn-DZ Tamazight (Latin, Algeria) Tamazight (Djazaïr)
0451 bo-CN Tibetan (PRC) བོད་ཡིག (ཀྲུང་ཧྭ་མི་དམངས་སྤྱི་མཐུན་རྒྱལ་ཁབ།)
042e hsb-DE Upper Sorbian (Germany) hornjoserbšćina (Němska)
0480 ug-CN Uyghur (PRC) ئۇيغۇرچە (جۇڭخۇا خەلق جۇمھۇرىيىتى)
0843 uz-Cyrl-UZ Uzbek (Cyrillic, Uzbekistan) Ўзбек (Ўзбекистон)
0452 cy-GB Welsh (United Kingdom) Cymraeg (y Deyrnas Unedig)
0488 wo-SN Wolof (Senegal) Wolof (Sénégal)
0485 sah-RU Yakut (Russia) саха (Россия)
0478 ii-CN Yi (PRC) ꆈꌠꁱꂷ (ꍏꉸꏓꂱꇭꉼꇩ)
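
The hex values in the LCID column follow the Windows language identifier layout: the low 10 bits hold the primary language and the high 6 bits hold the sublanguage. That is why every Arabic locale above ends in 01 and every English one in 09. Here is a minimal sketch in Python of that split. The helper name `split_lcid` is made up for illustration, and the sample values are copied from the tables; it assumes the sort ID (the bits above the low 16) is zero, as it is for every entry listed here.

```python
def split_lcid(lcid_hex):
    """Split a hex LCID string into (primary language, sublanguage).

    Only the low 16 bits (the language identifier) are examined;
    any sort ID in the higher bits is ignored.
    """
    langid = int(lcid_hex, 16) & 0xFFFF
    primary = langid & 0x3FF   # low 10 bits: primary language
    sub = langid >> 10         # high 6 bits: sublanguage
    return primary, sub


# A few entries taken from the tables in this post.
samples = {
    "ar-SA": "0401",
    "ar-DZ": "1401",
    "ar-EG": "0c01",
    "en-US": "0409",
    "en-GB": "0809",
}

for name, lcid in samples.items():
    primary, sub = split_lcid(lcid)
    print(f"{name}: primary=0x{primary:02x} sublanguage=0x{sub:02x}")
```

Grouping the rows of all three tables by the primary value would put each language's regional variants together, regardless of which table they landed in.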

 

Enjoy!


John Cowan on 24 May 2011 8:05 AM:

I'm a little confused by table 3.  In what sense does cy-GB and es-AR "exist" and cy-AR and nv-DK (my evergreen example) "not exist"?  Only in the sense that the first has a numeric LCID and the second does not?  Or also in the sense that some cultural information is available for the former but not the latter?

Michael S. Kaplan on 24 May 2011 9:21 AM:

They exist in the sense that they are defined on every version of Windows 7 out of the box, with no need to define or install anything.

And also the LCID thing, though I care less about that these days....

Ian Jones on 24 May 2011 11:49 PM:

I suppose that since English (United Kingdom) is in Table 3, the control panel must talk about color (instead of colour). XP certainly does. This goes to show how small the differences are in OS terms.

It's always blindingly obvious in Office products as soon as the spell checker starts complaining that I don't type like an American.

Andrew West on 25 May 2011 4:19 AM:

I'm wondering where Windows got the traditional Mongolian text from?  I only ask because I think that the word for "Mongolian" (ᠮᠤᠨᠭᠭᠤᠯ = munggul) is misspelt.  Not that it surprises me, as the deeply flawed encoding model for Mongolian makes homographic misspellings inevitable (Mongolian "o" and "u" are essentially the same letter, but have been encoded separately on phonetic grounds rather than unified on the basis of their glyph shape, as is the case with every other Unicode script -- akin to encoding two Latin letter C's, one "hard c" and one "soft c").  Although the Windows spelling gets the greatest number of Google hits (precisely because it is used in Windows?), I think the "u"s should be "o"s, and the "ngg" should be "ŋg".  There seem to be four different spellings current on the internet:

blogs.msdn.com/.../10167604.aspx = ᠮᠤᠨᠭᠭᠤᠯ munggul <182E 1824 1828 182D 182D 1824 182F>

en.wikipedia.org/.../Mongolian_language = ᠮᠣᠨᠭᠭᠣᠯ monggol <182E 1823 1828 182D 182D 1823 182F>

www.geonames.de/coumn.html = ᠮᠣᠩᠭ᠋ᠣᠯ moŋg{VS1}ol <182E 1823 1829 182D 180B 1823 182F>

en.wikipedia.org/.../Mongolian_script = ᠮᠣᠩᠭᠣᠯ moŋgol <182E 1823 1829 182D 1823 182F>

The above examples of the same word are all represented using slightly different characters ("o" vs "u", "ng" vs "ŋ", and "g with a variation selector" vs "g without a variation selector"), but on systems that support traditional Mongolian the first three all look almost identical (if you look closely you'll see that with Mongolian Baiti the shape of the first "g" in spellings with "ngg" is not quite the same as the shape of the "ŋ" in spellings with "ŋg" -- the character "ŋ" is in origin a ligature of "n" and "g" so you would expect "ŋ" and "ng" to look very similar).  The last spelling (ᠮᠣᠩᠭᠣᠯ moŋgol), without the VS, lacks the two dots to the side of the "g".

As the distinction between U+1823 and U+1824 is artificial I suspect that most Mongolian writers just use whatever letters appear to give the correct result, and so eye spellings proliferate; and the Windows spelling "munggul" *looks* correct, so who's going to complain?  I believe (but am no longer sure) that the correct spelling should be ᠮᠣᠩᠭ᠋ᠣᠯ moŋg{VS1}ol with a variation selector to add two dots to the preceding "g", but this is the spelling that gets the fewest Google hits (well, Google ignores the VS, which is correct from a Unicode perspective, but I actually think it is wrong from a user perspective as Mongolian variation selectors are not really "ignorable" in the same way that ordinary variation selectors are).  This may simply be because variation selectors are awkward to use, and so people prefer spellings without them -- typing "n" + "g" gets the same glyph as "ŋ", but automatically gets the dots under the following "g", whereas with "ŋ" you need to add a VS after the following "g", with the result that (the incorrect?) "n" + "g" wins over (the correct?) "ŋ".

Quite why there are spellings with "u" instead of "o", I have absolutely no idea, as all scripts used for writing Mongolian that visually distinguish "o" and "u" spell the word with "o" (e.g. modern Cyrillic Монгол and 14th century Phags-pa ꡏꡡꡃ ꡣꡡꡙ).
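
For readers who want to inspect the four spellings without relying on font rendering, here is a minimal Python sketch that rebuilds them from the code point sequences quoted in the comment above and prints each character's Unicode name. The dictionary labels and the helper names `to_text` and `strip_fvs` are made up for illustration. Stripping the Mongolian free variation selectors (U+180B to U+180D) is only an assumption about what "ignoring the VS" amounts to, not a description of how any search engine actually behaves.

```python
import unicodedata

# Code point sequences exactly as quoted in the comment above.
spellings = {
    "munggul": "182E 1824 1828 182D 182D 1824 182F",
    "monggol": "182E 1823 1828 182D 182D 1823 182F",
    "moŋg{VS1}ol": "182E 1823 1829 182D 180B 1823 182F",
    "moŋgol": "182E 1823 1829 182D 1823 182F",
}


def to_text(hex_seq):
    """Turn a space-separated list of hex code points into a string."""
    return "".join(chr(int(cp, 16)) for cp in hex_seq.split())


def strip_fvs(text):
    """Drop Mongolian free variation selectors (U+180B..U+180D)."""
    return "".join(ch for ch in text if not 0x180B <= ord(ch) <= 0x180D)


words = {label: to_text(seq) for label, seq in spellings.items()}

for label, word in words.items():
    print(label)
    for ch in word:
        print(f"  U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}")

# With the variation selector removed, the third spelling becomes
# code-point-identical to the fourth; the first two stay distinct.
print(strip_fvs(words["moŋg{VS1}ol"]) == words["moŋgol"])
print(words["munggul"] == words["monggol"])
```

A comparison like this makes the homograph problem concrete: strings that render nearly identically can still differ at the code point level, so an exact-match search treats them as different words.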

Michael S. Kaplan on 26 May 2011 7:15 AM:

Hey Andrew! I think there's blame to go around, since better input methods in Windows and Office spellcheckers could go a long way to improve the situation, even with trouble in Unicode's encoding....but either way I forwarded the issues you mentioned on to see what we can do about them.


referenced by

2012/08/21 The Locales of Windows 8, not yet divvied up...

2012/02/21 The evolving Story of Locale Support, part 19: In honor of International Mother Language Day...

2012/02/02 The evolving Story of Locale Support, part 17: Today I feel like translating you more than before

2011/12/21 The evolving Story of Locale Support, part 13: Divvying up locales, yet again!

2011/07/07 The Locales of Windows 7, divvied up further
