by Michael S. Kaplan, published on 2004/12/26 03:41 -05:00, original URI: http://blogs.msdn.com/b/michkap/archive/2004/12/26/332392.aspx
People looking at wingdi.h often notice the following constant definitions, somewhere around line 1292:
#define HANGEUL_CHARSET 129
#define HANGUL_CHARSET 129
and they wonder -- which is the right one? People usually assume that one is the older and the other is preferred.
Well, it's a funny question. The short answer is that it does not matter. They are just simple #defines and they end up being the same value anyway.
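To make the short answer concrete, here is a minimal sketch. It assumes the two values quoted above and defines them locally only so that it compiles away from Windows; in real code they come from wingdi.h. The helper name is_korean_charset is made up for illustration and is not part of any Windows API.

```c
#include <assert.h>  /* static_assert (C11) */

/* Values copied from the definitions quoted above. On Windows these come
   from wingdi.h; the guards keep the sketch compilable elsewhere. */
#ifndef HANGEUL_CHARSET
#define HANGEUL_CHARSET 129
#endif
#ifndef HANGUL_CHARSET
#define HANGUL_CHARSET 129
#endif

/* Both spellings expand to the same number, so this holds at compile time. */
static_assert(HANGEUL_CHARSET == HANGUL_CHARSET,
              "both spellings name the same charset value");

/* Hypothetical helper (not a Windows API): returns 1 when a charset byte,
   such as the one stored in a LOGFONT's lfCharSet field, is the Korean one. */
static int is_korean_charset(unsigned char charset)
{
    return charset == HANGEUL_CHARSET;
}
```

Since the preprocessor substitutes 129 either way, a comparison written with one name will match a value that was set with the other.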
The longer answer may be of interest to some, so I'll give that too. :-)
Back in the late 1930s, George McCune and Edwin Reischauer put together a system to represent 한글 (Han'gŭl) in a romanized form using the Latin script. This system (after many years of being used around the world) became the official romanized form used by South Korea from 1984 until 2000 (a very good summary of the system can be found here). In that form the first syllable 한 (Han') is combined with 글 (gŭl) to produce Han'gŭl. People would often drop all of the accents/diacritics, and thus Hangul is the spelling most people saw (especially in identifiers like constants, which cannot contain diacritics, but also in general usage). The loss of information when the diacritics are dropped is a real problem, however, and for many years people struggled with an imperfect system.
Then, starting in the mid 1990s, work began on a romanization system that would not need all of those diacritics. Although much of this new standard was communicated earlier, it was officially published as the official system of South Korea in the year 2000. In that standard the 'ŭ' is actually represented by 'eu', and thus the official romanization of 한글 becomes Hangeul. Given that change, there was really no good reason not to add a HANGEUL_CHARSET constant.
Now there have been some criticisms of the "Revised Romanization of Korean" both inside and outside of Korea (summarized on the government site here with all of the changes), in particular of its ability to represent Korean in a completely reversible form, and thus there are people outside of Korea who continue to work with the original McCune-Reischauer romanization. Of course, the existing constant (HANGUL_CHARSET) could not be removed anyway without breaking existing code. Also, the constant is not really mentioned explicitly in documentation much, since the *_CHARSET constants are not used much in the world of modern font linking and fallback. In the end, it was just easier to leave it in as is, but add the new constant so that people could use the "new" name if they wanted it.
Korean as a language is best represented by using actual Hangeul syllables rather than the romanized form anyway, so neither form should really affect much beyond situations where you need to describe the language in English. Given the poor reversibility, the best way to store Korean text is to not romanize it at all if one can avoid it. Other solutions to this problem have been proposed, such as the Korean Romanization for Data Applications (KORDA), but there has not been a high demand for this solution in Windows or the .NET Framework, since there are no APIs that would make good use of transliterated forms and collation is not really set up to support it either.
A final piece of the puzzle is what happens in North Korea. Essentially, the original McCune-Reischauer form is used for romanization, but the name 조선글 (Chosŏn'gŭl) is preferred. However, the preferred ordering for Jamos in North Korea (and thus by extension for the full syllables that are made up of Jamos) is different than that of South Korea. Therefore, the expected sort for North Korea is not directly available since there is no North Korean locale support in Windows or the .NET Framework, although proper rendering will be achievable if one has appropriate fonts.
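To illustrate how the full syllables are made up of Jamos, here is a minimal sketch of the arithmetic the Unicode Standard uses to split a precomposed syllable into its leading consonant, vowel, and trailing consonant indices. The function name decompose_hangul is made up for illustration. The indices follow the order Unicode uses for the Jamos, so a North Korean ordering would have to remap them before comparing; that remapping is not shown here.

```c
#include <assert.h>  /* static_assert (C11) */
#include <stdint.h>

/* Constants from the Unicode Standard's Hangul syllable arithmetic. */
enum {
    S_BASE  = 0xAC00, /* first precomposed syllable, U+AC00 */
    L_COUNT = 19,     /* leading consonants */
    V_COUNT = 21,     /* vowels */
    T_COUNT = 28,     /* trailing consonants, where index 0 means "none" */
    S_COUNT = 11172   /* total precomposed syllables */
};

static_assert(L_COUNT * V_COUNT * T_COUNT == S_COUNT,
              "syllable count is the product of the Jamo counts");

/* Hypothetical helper: splits a precomposed syllable into Jamo indices.
   Returns 1 on success, 0 if cp is not a precomposed Hangul syllable. */
static int decompose_hangul(uint32_t cp, int *lead, int *vowel, int *trail)
{
    if (cp < S_BASE || cp >= (uint32_t)S_BASE + S_COUNT)
        return 0;
    uint32_t s = cp - S_BASE;
    *lead  = (int)(s / (V_COUNT * T_COUNT));
    *vowel = (int)((s % (V_COUNT * T_COUNT)) / T_COUNT);
    *trail = (int)(s % T_COUNT);
    return 1;
}
```

For example, 한 (U+D55C) splits into lead 18, vowel 0, trail 4, and 글 (U+AE00) into lead 0, vowel 18, trail 8. Because the code points are laid out in this Jamo order, a plain code point comparison of syllables inherits that order, which is why a different Jamo ordering changes the expected sort of the syllables too.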
This post brought to you by "ᅅ" (U+1145, a.k.a. HANGUL CHOSEONG IEUNG-SIOS)