by Michael S. Kaplan, published on 2006/06/07 13:14 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2006/06/07/620955.aspx
(This is also not the font post; just hang in there, it will be here soon!)
Yi is one of the minority languages of China. The Liangshan Yi script was devised in the mid-1970s and the standard was pushed out to the world in 1980 (a fuller description of the script in Unicode can be seen at this Babelstone article).
It is one of the scripts that is supported in Vista with a locale, an input method, and a font named Microsoft Yi Baiti. Which is very cool. :-)
So anyway, in Unicode the script has these two blocks:
Yi Syllables (U+a000 -- U+a48f)
Yi Radicals (U+a490 -- U+a4cf)
The first block is the one that is actually used for the language; the second block really has no specific defined use outside of dictionary-type headers or index entries.
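To make the split concrete, here is a minimal sketch that reports which of the two blocks a character falls in. The ranges are the ones given above; the function name `yi_block` is just something made up for this illustration.

```python
from typing import Optional

# Block ranges as listed in the post (inclusive on both ends).
YI_SYLLABLES = range(0xA000, 0xA48F + 1)
YI_RADICALS = range(0xA490, 0xA4CF + 1)


def yi_block(ch: str) -> Optional[str]:
    """Return the Yi block name for a single character, or None if it is not Yi."""
    cp = ord(ch)
    if cp in YI_SYLLABLES:
        return "Yi Syllables"
    if cp in YI_RADICALS:
        return "Yi Radicals"
    return None


print(yi_block("\ua015"))  # a character from the syllables block
print(yi_block("\ua49c"))  # a character from the radicals block
```

Note that this only checks ranges; it does not know whether a given code point inside a block is actually assigned.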
Because of this, the two are generally not collated together (with radicals interleaved with syllables) -- similar to the way Latin is not interleaved with Han in Simplified Chinese sorts based on Pinyin, and Bopomofo is not interleaved with Han in Traditional Chinese pronunciation sorts used in Taiwan.
Of course there is still room for confusion: if you look across both ranges, there are a few characters that look the same (on the left is the Yi Radical, on the right is the Yi Syllable):
U+a49c ꒜ U+a0c0 ꃀ
U+a4a8 ꒨ U+a132 ꄲ
U+a49a ꒚ U+a1d9 ꇙ
U+a4bf ꒿ U+a259 ꉙ
U+a494 ꒔ U+a2cd ꋍ
U+a4c2 ꓂ U+a3b5 ꎵ
U+a4b0 ꒰ U+a3c2 ꏂ
U+a4a7 ꒧ U+a458 ꑘ
It is easy to imagine someone grabbing the wrong one (i.e. the radical rather than the syllable) whenever the tools at hand make that easy enough to do.
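The eight pairs listed above fit naturally into a small lookup table. The sketch below is only an illustration: the table holds just the pairs from this post, and the names `LOOKALIKE_SYLLABLE` and `lookalike_syllable` are invented here rather than taken from any library.

```python
from typing import Optional

# Yi Radical -> look-alike Yi Syllable, copied from the pairs listed in the post.
LOOKALIKE_SYLLABLE = {
    "\ua49c": "\ua0c0",
    "\ua4a8": "\ua132",
    "\ua49a": "\ua1d9",
    "\ua4bf": "\ua259",
    "\ua494": "\ua2cd",
    "\ua4c2": "\ua3b5",
    "\ua4b0": "\ua3c2",
    "\ua4a7": "\ua458",
}


def lookalike_syllable(ch: str) -> Optional[str]:
    """Return the syllable that looks like the given radical, if the post lists one."""
    return LOOKALIKE_SYLLABLE.get(ch)


for radical, syllable in LOOKALIKE_SYLLABLE.items():
    print(f"U+{ord(radical):04x} {radical} looks like U+{ord(syllable):04x} {syllable}")
```

Radicals that are not in the table simply return `None`, since the post only identifies these eight as look-alikes.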
Now this makes no difference for simply looking at text, but when trying to search within it or sort it, you could run into a real problem -- since in collation (e.g. in the Unicode Collation Algorithm) all of the radicals are put together in a weight space separate from the one the syllables are in.
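A quick way to see the search half of the problem is with plain string operations. This is a sketch, not a collation demo: Python's built-in comparison works on code points rather than Unicode Collation Algorithm weights, so the sort at the end only illustrates that a mistyped radical does not land next to its look-alike syllable. The sample text is made up for the example.

```python
RADICAL = "\ua49c"   # from the Yi Radicals block
SYLLABLE = "\ua0c0"  # the look-alike from the Yi Syllables block

# Hypothetical text in which the radical was typed where the syllable was meant.
text = "\ua015" + RADICAL + "\ua015"

# The two characters may look identical on screen, but they are different code points,
# so a search for the syllable does not find the radical.
found = SYLLABLE in text
print(found)  # False

# Code point order (NOT UCA order): the radical is separated from its look-alike.
ordered = sorted([RADICAL, "\ua132", SYLLABLE])
print([f"U+{ord(ch):04x}" for ch in ordered])  # ['U+a0c0', 'U+a132', 'U+a49c']
```

A real collation test would need an implementation of the Unicode Collation Algorithm, which is outside what the standard library provides.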
Of course one could:
but each of these solutions would come at the price of making some other behavior seem incorrect.
In the end, the key would be to just not use the Yi Radicals when one should be using the Yi Syllables (a solution probably best handled within the input method rather than within the font or the collation).
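As a rough idea of what such a check could look like at input time, here is a minimal sketch that flags any Yi Radicals in a piece of text so that whatever sits in front of the user can ask whether a syllable was meant. It is purely hypothetical: `find_yi_radicals` is a name invented for this example and has nothing to do with how the actual Vista input method works.

```python
from typing import List, Tuple

# Yi Radicals block range as given in the post (inclusive).
YI_RADICALS_FIRST = 0xA490
YI_RADICALS_LAST = 0xA4CF


def find_yi_radicals(text: str) -> List[Tuple[int, str]]:
    """Return (index, character) for every character in the Yi Radicals block."""
    return [
        (i, ch)
        for i, ch in enumerate(text)
        if YI_RADICALS_FIRST <= ord(ch) <= YI_RADICALS_LAST
    ]


# Hypothetical input: a syllable, a radical typed by mistake, another syllable.
sample = "\ua015\ua49c\ua0c0"
for index, ch in find_yi_radicals(sample):
    print(f"position {index}: U+{ord(ch):04x} is a Yi Radical; was a syllable intended?")
```

The check deliberately only reports; deciding whether to replace anything is left to the person typing, since radicals are legitimate in dictionary-type headers and index entries.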
This post brought to you by ꀕ (U+a015, a.k.a. YI SYLLABLE WU)
(star in an upcoming Unicode character story!)