Is it Hangul? or Hangeul? or Han'gŭl? or what?

by Michael S. Kaplan, published on 2004/12/26 03:41 -05:00, original URI: http://blogs.msdn.com/b/michkap/archive/2004/12/26/332392.aspx

People looking at wingdi.h often notice two constant definitions for the Korean character set, HANGEUL_CHARSET and HANGUL_CHARSET, somewhere around line 1292, and they wonder -- which is the right one? People usually assume that one is the older and the other is preferred.

Well, it's a funny question. The short answer is that it does not matter. They are just simple #defines and they end up being the same value anyway.
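As a minimal sketch of why the choice cannot matter: both names expand to the same integer, so the preprocessor erases the difference before the compiler ever sees it. The values below are the ones wingdi.h defines (it spells the names HANGUL_CHARSET and HANGEUL_CHARSET). They are repeated under #ifndef guards only so the snippet builds without the Windows SDK. The helper is_korean_charset is hypothetical and not part of any Windows API.

```c
#include <assert.h>   /* static_assert (C11) */

/* Same values as in wingdi.h; guarded so the real header wins if present. */
#ifndef HANGUL_CHARSET
#define HANGUL_CHARSET  129
#endif
#ifndef HANGEUL_CHARSET
#define HANGEUL_CHARSET 129
#endif

/* Fails at compile time if the two spellings ever diverge. */
static_assert(HANGUL_CHARSET == HANGEUL_CHARSET,
              "both spellings must name the same charset value");

/* Hypothetical helper: testing against either name is equivalent. */
static int is_korean_charset(int charset)
{
    return charset == HANGEUL_CHARSET;
}
```

Code written against the older spelling and code written against the newer one therefore compile to identical comparisons.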

Back in the late 1930s, George McCune and Edwin Reischauer put together a system to represent 한글 (Han'gŭl) in a romanized form using the Latin script. This system (after many years of being used around the world) became the official romanized form used by South Korea from 1984 until 2000 (a very good summary of the system can be found here). In that form the first syllable 한 (Han') is combined with 글 (gŭl) to produce Han'gŭl. People would often skip all accents/diacritics and thus Hangul is the most common way people saw it (especially in identifiers like constants which cannot contain diacritics, but also in general usage). The problem with the information that is lost in the names is a real one, however, and for many years people struggled with an imperfect system.

Then, starting in the mid 1990s, work began on a romanization system that would not have all of those diacritics, and although much of this new standard was communicated earlier, in the year 2000 it was officially published as the official system by South Korea. In that standard the 'ŭ' is actually represented by 'eu', and thus the official romanization of 한글 becomes Hangeul. Given that change, there was really no good reason not to add a HANGEUL_CHARSET constant.

Now there have been some criticisms of the "Revised Romanization of Korean" both inside and outside of Korea (summarized on the government site here with all of the changes) and of its ability to properly represent Korean in a completely reversible form, and thus there are people outside of Korea who continue to work with the original McCune-Reischauer romanization. Of course, the existing constant (HANGUL_CHARSET) could not be removed anyway without breaking existing code. Also, the constant is not really mentioned explicitly in documentation much, since the *_CHARSET constants are not used much in the world of modern font linking and fallback. In the end, it was just easier to leave it in as is, but add the new constant so that people could use the "new" name if they wanted it.

Korean as a language is best represented by using actual Hangeul syllables rather than the romanized form anyway, so neither form really should affect much other than trying to use the term in situations where you need to describe the language in English. Given the poor reversibility, the best way to store Korean text is to not try to romanize it at all if one can avoid it. Other solutions to this problem have been proposed, such as the Korean Romanization for Data Applications (KORDA), but there has not been a high demand for this solution in Windows or the .NET Framework, since there are no APIs that would make good use of transliterated forms and collation is not really set up to support it either.

A final piece of the puzzle is what happens in North Korea. Essentially, the original McCune-Reischauer form is used for romanization, but the name 조선글 (Chosŏn'gŭl) is preferred. However, the preferred ordering for Jamos in North Korea (and thus by extension for the full syllables that are made up of Jamos) is different than that of South Korea. Therefore, the expected sort for North Korea is not directly available since there is no North Korean locale support in Windows or the .NET Framework, although proper rendering will be achievable if one has appropriate fonts.
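Any jamo-based ordering has to start from the jamos inside each precomposed syllable, so here is a minimal sketch of the arithmetic decomposition described in the Unicode Standard. The names jamo_indices and decompose_syllable are hypothetical. The sketch deliberately does not attempt either the North or the South Korean ordering; it only recovers the indices that a custom comparison would need.

```c
#include <assert.h>   /* static_assert (C11) */

/* Constants from the Unicode Hangul syllable composition scheme. */
enum {
    S_BASE  = 0xAC00, /* first precomposed syllable, U+AC00 */
    L_COUNT = 19,     /* leading consonants */
    V_COUNT = 21,     /* vowels */
    T_COUNT = 28,     /* trailing consonants, index 0 meaning "none" */
    S_COUNT = 11172   /* total precomposed syllables */
};

static_assert(S_COUNT == L_COUNT * V_COUNT * T_COUNT,
              "syllable count must equal the product of jamo counts");

/* Hypothetical result type: zero-based indices into each jamo series. */
struct jamo_indices {
    int lead;
    int vowel;
    int tail;
};

/* Returns 1 and fills *out if cp is a precomposed Hangul syllable,
   otherwise returns 0 and leaves *out untouched. */
static int decompose_syllable(unsigned long cp, struct jamo_indices *out)
{
    unsigned long s;

    if (cp < S_BASE || cp >= S_BASE + S_COUNT)
        return 0;

    s = cp - S_BASE;
    out->lead  = (int)(s / (V_COUNT * T_COUNT));
    out->vowel = (int)((s % (V_COUNT * T_COUNT)) / T_COUNT);
    out->tail  = (int)(s % T_COUNT);
    return 1;
}
```

For 한 (U+D55C) this yields lead 18, vowel 0, tail 4, and for 글 (U+AE00) it yields lead 0, vowel 18, tail 8. A North Korean style sort would then need its own ranking of those indices, which is exactly the piece that no Windows locale provides.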

Yea, it's sad that romanization is used so much. When buying study books, a lot of them don't even bother teaching Hangeul, which is a shame, since it's quite easy to learn, and much easier to use (for Korean anyways) than roman.

For the sake of completeness, why isn't there any DPRK locale support?

I agree, though I hope as Unicode becomes more popular the need for a cross-codepage form to handle it lessens.

Obviously in the days before Unicode was so widespread it may have been the only way to support Hangeul. It may take the books some time to catch up, though I see more and more on sites like Amazon all the time (though of course you have to search under Hangul, not Hangeul <insert eye roll here>).

The DPRK support issue is bound up with the Wassenaar treaty, in ways that I will not even pretend to understand. But I wouldn't want to get too deep into discussing *that* issue, if you know what I mean....
