Unicode: Good enough for governmentS work!

by Michael S. Kaplan, published on 2011/07/22 07:01 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2011/07/22/10188817.aspx


The problem with Unicode is that it is quite complicated.

Well, that in and of itself isn't a problem; the problem is that as a standard it is driven by many competing forces.

And one of those forces made the XKCD world the other day!

The strip is easy enough:

And then past this problem, there is another one....

The problem here is simply stated:

Characters are chaotic.

In fact, they are chaos itself, at times.

We try to categorize and bucketize and prioritize everything, but inevitably we find that the tightly interwoven standard has yet another issue that the neatly defined bucketizations and standardizations missed.

And software, depending on those bucketizations and standardizations, has to pick up the updates to fix the problems that come with treating so much of the information in the Unicode Character Database as "Ultimate Truth" when really it is just, to borrow from Unicode's ISO 10646 brethren, "Good enough for governments work."
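To make the "pick up the updates" point a bit more concrete, here is a minimal sketch, assuming Python and its standard unicodedata module (neither of which this post is about). It asks two snapshots of the Unicode Character Database for the General_Category of the same code point. The helper name category_by_version and the choice of U+20B9 INDIAN RUPEE SIGN (added in Unicode 6.0) are just illustrative picks.

```python
import unicodedata


def category_by_version(ch):
    """Return the General_Category of ch under two UCD snapshots.

    Hypothetical helper for illustration only. It compares the frozen
    Unicode 3.2.0 data that Python ships (unicodedata.ucd_3_2_0) with
    whatever UCD version the running interpreter was built against.
    """
    return {
        "3.2.0": unicodedata.ucd_3_2_0.category(ch),
        unicodedata.unidata_version: unicodedata.category(ch),
    }


if __name__ == "__main__":
    # U+20B9 INDIAN RUPEE SIGN did not exist in Unicode 3.2.0, so the old
    # snapshot reports it as unassigned while the current one does not.
    for version, category in category_by_version("\u20b9").items():
        print(f"UCD {version}: U+20B9 is {category}")
```

Code that baked in the older answer would treat the character as unassigned until it picked up newer data, which is the kind of drift described above.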

And so, there (and here) we are.

Actually, now that I think about it, there probably are 14 different reasons that Unicode is complicated. Reasons like: Michael Everson, Mark Davis, Ken Whistler, and so on. Every character may have a story, but it is also true that every one of those characters has a bunch of stories, too....


John Cowan on 22 Jul 2011 11:22 AM:

Curiously, however, Unicode actually has become the exceptional universal standard that actually is winning over the rest.  The time-scale involved is decades, to be sure, but Unicode should be with us for centuries.

Michael S. Kaplan on 22 Jul 2011 1:22 PM:

Or some slightly cleaner successor to Unicode without the backcompat issues.

Or some less clean successor to Unicode out of China....

