by Michael S. Kaplan, published on 2006/09/27 03:01 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2006/09/27/773341.aspx
It was over a year ago that I posted about how LOCALE_SABBREVLANGNAME is so not an ISO-639 code.
But perhaps the title of this post covers the situation a little more accurately, or at least a little more clearly....
The rules are simple enough (though perhaps more complex than I laid out originally), so I will just lay them out, here and now. These are actually descriptive, not prescriptive, which is to say that I am describing how a bunch of decisions ended up being made. I am not describing some mystical set of rules in a data handbook or anything. :-)
RULE #1: If you take every single one of these three letter codes, then across the full set of locales the first two letters of the code must uniquely identify the language. Thus EN must be English, AR must be Arabic, and so on. This is true for every locale that uses one of these individual languages, so that the Language Bar can have a two letter code to use.
RULE #2: If a language has multiple locales (e.g. the aforementioned EN and AR), then usually the ISO-639 TWO LETTER CODE will be used for the first two letters of LOCALE_SABBREVLANGNAME, with the third letter chosen to uniquely identify the locale. The exception to this is when the uniqueness rule in #1 is not met, in which case a change will be made to make them unique.
RULE #3: If a language has only one locale, such as Japanese or Korean, then usually the ISO-639 THREE LETTER CODE will be used for LOCALE_SABBREVLANGNAME. The exception to this is once again when the unique two letter rule in #1 is not met, in which case a change will be made to make it unique.
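As a rough illustration only, the three rules can be sketched in a few lines of Python. Nothing here is how Windows actually produces its data: `pick_abbrev_base`, its parameters, and the `prefixes_in_use` set are hypothetical names made up for this sketch, and because the rules only say "a change will be made" when uniqueness fails, the sketch just raises an error at that point rather than guessing what the change would be.

```python
def pick_abbrev_base(iso2, iso3, locale_count, prefixes_in_use):
    """Pick the starting point for a LOCALE_SABBREVLANGNAME-style code.

    Hypothetical helper, not a Windows API.
    Rule 2: a language with several locales starts from its two letter code
            (a third letter is then chosen per locale).
    Rule 3: a language with one locale uses its three letter code.
    Rule 1: either way, the first two letters must not already belong
            to a different language.
    """
    base = (iso2 if locale_count > 1 else iso3).upper()
    prefix = base[:2]
    if prefix in prefixes_in_use:
        # The rules do not say what the fix is, so this sketch stops here.
        raise ValueError(f"prefix {prefix} already belongs to another language")
    return base


# locale_count=2 just stands for "more than one locale".
print(pick_abbrev_base("en", "eng", locale_count=2, prefixes_in_use=set()))   # EN
print(pick_abbrev_base("ja", "jpn", locale_count=1, prefixes_in_use={"EN"}))  # JPN
```

The multi-locale case returns only two letters on purpose: the third letter is a per-locale choice that no formula in this post pins down.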
Thus, to give the example of a new locale in Vista, Uighur (PRC).... Uighur's two letter ISO 639 code is ug, and its three letter ISO 639 code is uig. Since the Uighur language is not used for any other locales, and further since UIG and its first two letters UI do not conflict with any other language or locale, the three letter ISO 639 code is used here.
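The "does not conflict" part of that reasoning is easy to check mechanically. Here is a minimal sketch, assuming a small hand-made table of codes (the `sample` dictionary holds only a handful of values and is nowhere near the full locale list, and `find_prefix_conflicts` is a hypothetical helper name):

```python
from collections import defaultdict


def find_prefix_conflicts(abbrev_to_language):
    """Return {prefix: [languages]} for each two letter prefix that is
    shared by more than one language (i.e. a violation of rule #1)."""
    languages_by_prefix = defaultdict(set)
    for abbrev, language in abbrev_to_language.items():
        languages_by_prefix[abbrev[:2].upper()].add(language)
    return {
        prefix: sorted(languages)
        for prefix, languages in languages_by_prefix.items()
        if len(languages) > 1
    }


# Illustrative subset only; several locales of one language may share a prefix.
sample = {
    "ENU": "English",
    "ENG": "English",
    "JPN": "Japanese",
    "KOR": "Korean",
    "UIG": "Uighur",
}
print(find_prefix_conflicts(sample))  # {}

# A made-up code for a made-up language, to show what a conflict looks like.
print(find_prefix_conflicts({**sample, "UIX": "Hypothetical"}))
# {'UI': ['Hypothetical', 'Uighur']}
```

Two English locales sharing EN is fine, since the rule is about languages rather than locales; only the invented UIX entry trips the check.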
Now of course this approach is going to upset anyone who prefers Uyghur when the whole Uighur or Uyghur question is raised, but as a choice it is not designed to choose sides, it is simply using the ISO 639 three letter code, which happens in this case to not have its first two letters match the ISO 639 two letter code.
The end result is a code that is uniquely qualified to upset people who feel that their language or locale is being misrepresented in the Language Bar....
This post brought to you by ئ (U+0626, a.k.a. ARABIC LETTER YEH WITH HAMZA ABOVE)
# Anonymous on 27 Sep 2006 9:23 AM:
# Michael S. Kaplan on 27 Sep 2006 10:41 AM:
# oyghan on 27 Sep 2006 1:30 PM:
# Michael S. Kaplan on 27 Sep 2006 2:00 PM:
# teklimakan on 27 Oct 2006 1:46 AM:
Recommendation for the English transcription of the word
“ ئۇيغۇر ” /《维吾尔》
The constant advances of the reform and opening policy and the growing trade and telecommunications exchanges between people in all sectors and the outside world, as well as increased activities in publication, news reporting and international affairs have for some time renewed the need for a uniform and normalized ethnonym for the major nationality of Xinjiang, known in its own language as [ujγur].
At present, there is utter confusion on how to render and use in English the name of that nationality, with no fewer than seven different spellings attested: Uyghur, Uygur, Uighur, Uighuir, Uiguir, Uigur and Weiwuer. This situation causes a number of problems in our work and daily lives. Therefore, the Terminology Normalization Committee for Ethnic Languages of the Xinjiang Uyghur Autonomous Region based on research and consultations with relevant experts on this issue recommends that the spelling Uyghur, corresponding to the pronunciation [ujγur], be used as the English transcription of the word.
Government organizations and individuals are invited to conform to the present notice.
The Terminology Normalization Committee for Ethnic Languages of the Xinjiang Uyghur Autonomous Region
October 11, 2006
http://www.xjyw.gov.cn/han/IMAGES/ui_eng.jpg
referenced by
2013/05/02 Out of the ashes of the old...
2013/04/16 Because, unlike Andrew West, *I* have a sense of history... 😄
2010/11/08 Header files are the wrong place to be less than helpful
2010/02/26 There is no "I" in "Uyghur". Oh. Um. Well, except in the Windows Language Bar....
2008/01/28 Behold the Table Driven Text Service, Part 5 (All about the language, baby!)
2006/11/08 What the hell is MFCENU.DLL?
2006/11/06 Who owns English, exactly?
2006/11/02 MFC + Spanish version of Windows = Oops!