by Michael S. Kaplan, published on 2013/11/18, original URI: http://blogs.msdn.com/b/michkap/archive/2013/11/18/10468856.aspx
Every rule has an exception that proves the rule.
Even in The Unicode Standard!!!
You might be wondering what rule in particular I have in mind.
It is the somewhat famous "Unicode cannot re-encode scripts once they are already encoded in the Standard" rule.
You know, the rule that is cited every single time a re-encode proposal is sent to the Unicode Technical Committee.
Even this rule has its very own exception.
I can give you that exception in one word.
Korean.
From a completely technical standpoint, Korean has been encoded in Unicode FIVE TIMES THAT STILL EXIST.
How's THAT for an exception? 😏😏;-)
I think it qualifies!
Allow me to briefly go over the five encodings and their consequences for Unicode, Microsoft, Korea, North Korea, and many others.
Now encoding #4 is tied up with Unicode Normalization, and with the implementation that Microsoft formerly supported in Uniscribe and OpenType (until it was intentionally removed!). I think it might still be supported in collation, and it was never supported in the IME.
And encoding #5 is what Microsoft currently supports everywhere in its latest version, and it is the one that Unicode Normalization does not support.
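To make the normalization angle a little more concrete, here is a minimal sketch using Python's standard `unicodedata` module. It does not try to map anything onto encoding #4 or #5 specifically. The sample characters are my own picks for illustration: one precomposed modern syllable, one compatibility jamo, and one conjoining sequence that starts with an old jamo. It only shows that the same Korean text can sit in several different code point sequences, and that normalization ties some of them together and leaves others alone.

```python
import unicodedata

# A precomposed modern Hangul syllable (U+D55C, "han").
syllable = "\uD55C"

# NFD breaks it into conjoining jamo: leading consonant, vowel, trailing consonant.
decomposed = unicodedata.normalize("NFD", syllable)
decomposed_points = [f"U+{ord(c):04X}" for c in decomposed]
print(decomposed_points)

# NFC puts the modern conjoining jamo back together into the same syllable.
recomposed = unicodedata.normalize("NFC", decomposed)
print(recomposed == syllable)

# A compatibility jamo (U+3131) is untouched by NFC,
# and only maps to a conjoining jamo under NFKC.
compat_jamo = "\u3131"
compat_nfc = unicodedata.normalize("NFC", compat_jamo)
compat_nfkc = unicodedata.normalize("NFKC", compat_jamo)
print(compat_nfc == compat_jamo, f"U+{ord(compat_nfkc):04X}")

# A sequence starting with an old jamo (U+1140) followed by a modern vowel (U+1161):
# there is no precomposed syllable for it, so NFC leaves the two code points alone.
old_sequence = "\u1140\u1161"
old_nfc = unicodedata.normalize("NFC", old_sequence)
print(len(old_nfc), old_nfc == old_sequence)
```

The point of the sketch is only that "the same" Korean text has more than one representation, and that normalization connects some of those representations and not others.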
Exactly what does it ultimately mean if we are left with this strange state? Is anyone hurt by it?
Is anyone specifically hurt by the missing pieces here?
I refuse to link to earlier blogs that tried to support all of these things without contradictions or problems, since no one wanted to go that way....