What version of Unicode should we support?

by Michael S. Kaplan, published on 2008/09/06 16:16 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2008/09/06/8928632.aspx

The other day, the question came in again.

And when I say the question, I mean THE question.

You know, the "What Unicode version do you support?" question.

Well, technically it was a slight variation, more of a "What version of Unicode should we support?", but clearly the same question is being asked. Someone running a program connected to (well, running on) Windows wants to know what version of Unicode good programs connected to (well, running on) Windows support.

Unfortunately, not every question that is reasonable to ask is necessarily one that has a reasonable answer.

Sure, to start with there is everything I pointed out in the "What Unicode version do you support?" blog back at the end of 2005.

But there is a bigger issue here.

The issue is the fundamental difference between:

and to make matters worse, the question is implicitly trying to take specific formal versions of the latter and understand how they fit into the former.

Wanna know how it fits?

I'll tell you.


Unicode adds things for two reasons: proposals for new scripts, which are generally brought in as the proposals mature, and algorithms and descriptive processes, which are put in for a whole host of possible reasons.

Now obviously market forces can enter into the equation, since strategic scripts can be fast-tracked and important algorithms can be pushed by companies that need solutions to the market pressures they deal with. But these amount to nudges, to pushes that triage how quickly some things are looked at. These are the tactics of Unicode, not the strategy, which is to get things in when they are mature and ready to be added to the standard in a formal version.

Now comparing that to the planning process by which Microsoft or any company chooses what languages to add locale or rendering or font or collation or formatting or parsing or word breaking support for isn't like comparing apples and oranges.

It's like comparing apples and earmuffs, or other similarly different things.

The question is unfair in another way, too.

You see, a Unicode version is a complex cluster of characters and properties and algorithms that are released on a specific date.

Kind of like how a Windows or an OS X version is a complex cluster of applications and features and, yes, languages and fonts and so on, released on a specific date.

Why is it reasonable to expect that the summary name of one (e.g. Unicode 4.1.0) would ever map to the other (e.g. Windows 5.1, aka Whistler, aka Windows XP) exactly? When in addition to all of the above differences you can't even look at dates/schedules and see connections?

Perhaps the answer is to provide the data for what each version of the product supports, so that the program connected to (well, running on) Windows knows what is available and therefore what has a potential of actually working.

The program connected to (well, running on) Windows can then do its own planning to decide what subset or complete set (or occasionally superset) it wishes to support itself.
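To make the idea of "ask the platform what it actually has" a little more concrete, here is a minimal sketch. It is only an analogy and not anything Windows provides: it assumes a Python runtime and uses the standard `unicodedata` module, which reports the version of the Unicode Character Database bundled with that runtime. The helper names (`runtime_unicode_version`, `is_assigned`, `unsupported_characters`) are made up for illustration.

```python
import unicodedata


def runtime_unicode_version():
    """Return the bundled Unicode data version as a tuple of ints."""
    return tuple(int(part) for part in unicodedata.unidata_version.split("."))


def is_assigned(ch):
    """True if this runtime's data gives the character a real category.

    'Cn' is the general category for code points with no assigned
    character in the bundled data (noncharacters report it as well).
    """
    return unicodedata.category(ch) != "Cn"


def unsupported_characters(text):
    """List the distinct characters in text that this runtime knows nothing about."""
    return sorted({ch for ch in text if not is_assigned(ch)})


if __name__ == "__main__":
    print("Unicode data version:", unicodedata.unidata_version)
    print("Unknown characters:", [hex(ord(c)) for c in unsupported_characters("Why?\uffff")])
```

Note that this only answers the character-and-property part of the question. Whether there is a font, a collation or a word breaker for a given script is a separate matter, which is rather the point of the post.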

Though in the end, each program connected to (well, running on) Windows will always start with the very same question.

What version of Unicode should we support?

This blog brought to you by ? (U+003f, aka QUESTION MARK)

