The modern solution to the problem of Traditional Spanish in Vista

by Michael S. Kaplan, published on 2006/09/23 14:01 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2006/09/23/768178.aspx


A while back (well, in March of this year) I was talking about Traditional versus modern sorts, and I mentioned there that I wasn't going to talk about Traditional Spanish then, but that you should stay tuned for a future blog post.

Hopefully you were not holding your breath or anything, but let me be the first to welcome you to the future blog post!

The Traditional Spanish sort (0x0000040a) has a long and interesting history, including but not limited to the fact that Mexican Spanish (0x0000080a) actually went right along with it even after the rest of the world, including Spain, had moved on. In fact, this lasted until Windows 2000, when the sorting for Mexican Spanish was silently moved over to match the rest of the locales in the Spanish-speaking world, a fact that most people never seemed to notice (people in Mexico might have!).

When the actual change took place is more complicated to figure out than one might think, due to the fascinatingly complex machinations of the Windows development process. And it was made even more complicated by the fact that the Windows 2000 beta data was at one point snapshotted by Jet 4.0, though the way they and later SQL Server flattened non-unique collations hides the fact that they inherited a Traditional Spanish Mexican sort.

Lucky them, lucky us. :-)

Now the original change to add an LCID for a new collation, made before LCIDs contained the notion of a SORTID, is one that was able to embarrass me in the past, and has left Windows with an interesting wart where 0x0000040a and 0x00000c0a are both enumerated as regular LCIDs even though all of their data other than the name and the sort are basically identical. A whole generation of applications trying to show lists of locales has had to know to strip that 'Traditional' one out.
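To make the wart concrete, here is a minimal sketch in Python of how an LCID breaks apart, mirroring the arithmetic of the Windows macros LANGIDFROMLCID, SORTIDFROMLCID, PRIMARYLANGID and SUBLANGID. The helper name split_lcid is made up for illustration and is not a Windows API. Note that both Spanish (Spain) LCIDs carry a sort ID of 0 and differ only in the sublanguage bits, while a later alternate sort such as Classical Lithuanian carries a nonzero sort ID.

```python
def split_lcid(lcid: int) -> dict:
    """Split an LCID into its parts (hypothetical helper, not a Windows API).

    Layout assumed here: bits 0-15 are the LANGID, bits 16-19 are the sort ID.
    Within the LANGID, bits 0-9 are the primary language and bits 10-15 are
    the sublanguage.
    """
    lang_id = lcid & 0xFFFF
    return {
        "lang_id": lang_id,
        "sort_id": (lcid >> 16) & 0xF,
        "primary_lang": lang_id & 0x3FF,
        "sub_lang": lang_id >> 10,
    }


if __name__ == "__main__":
    # Traditional Spanish, Modern Spanish, Classical Lithuanian
    for lcid in (0x0000040A, 0x00000C0A, 0x00010427):
        parts = split_lcid(lcid)
        fields = ", ".join(f"{name}=0x{value:x}" for name, value in parts.items())
        print(f"0x{lcid:08x}: {fields}")
```

With a sort ID of 0 on both, nothing in the bits of 0x0000040a marks it as an alternate sort, which is why applications had to special-case it by value.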

And when you consider how we silently slid Mexico over, we likely should have done the same with Spain, too (did anyone notice how we palmed the Mexican card?).

In Vista, the addition of EnumSystemLocalesEx with its LOCALE_ALTERNATE_SORTS that returns locale names, alongside the venerable EnumSystemLocales with its LCID_ALTERNATE_SORTS, led to an interesting question. I mean, as Julie pointed out to me, the original concept of alternate sorts came out of the postmortem of the Traditional/Modern Spanish issue.

Perhaps Vista, with its new major version for collation, was the best time to move 0x0000040a over into the alternate sorts column, since that is what it actually is, what alternate sorts were really designed for?

And this is the way we went.

The documentation for EnumSystemLocalesEx points it out rather directly:

If dwFlags specifies LOCALE_ALTERNATE_SORTS, the callback function will be called for every locale that represents an alternate sort order. For example, Spanish (Spain) defaults to international sort order, but traditional sort order is available as an alternate sort. German (Germany) defaults to dictionary sort order, but there is an alternate phone book sort order available.

The question of what to do with EnumSystemLocales was hotly debated between those who thought that having the functions return consistent results was most important, those who thought that backcompat was most important, and those who thought both of these points were equally important, requiring both functions to return the old result.

One could argue that backcompat has never been too terribly important here for sorting updates and alt. sort updates. Whether one remembers (0x00010427) Classical Lithuanian, or (0x0000080a) Mexican Spanish that sorted like Traditional Spanish, or the (0x00010412) Korean and (0x00010411) Japanese Unicode sort orders in Windows that I talked about here and here, it is obvious that we have had no problem just making the change in a new version when it's the right thing to do.

Why we stood on ceremony for Modern Spanish and even gave it another LCID is beyond me, though in the end the plan for Vista is to have both enumeration functions in Vista treat Traditional Spanish as an alternate sort and not include it in the regular locales.
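As a rough illustration of the difference, the following Python sketch partitions a list of LCIDs the way the Vista plan treats them: anything with a nonzero sort ID is an alternate sort, and 0x0000040a is treated as one too despite its sort ID of 0. The function names here are hypothetical and this is not how Windows implements the enumeration; it only models the outcome described above.

```python
TRADITIONAL_SPANISH = 0x0000040A  # sort ID is 0, so it needs a special case


def is_alternate_sort(lcid: int) -> bool:
    """Return True if the LCID should be listed as an alternate sort.

    Hypothetical helper: a nonzero sort ID (bits 16-19) marks an alternate
    sort, and Traditional Spanish is added by value since its sort ID is 0.
    """
    sort_id = (lcid >> 16) & 0xF
    return sort_id != 0 or lcid == TRADITIONAL_SPANISH


def partition_locales(lcids):
    """Split LCIDs into (regular, alternate) lists, preserving input order."""
    regular = [lcid for lcid in lcids if not is_alternate_sort(lcid)]
    alternate = [lcid for lcid in lcids if is_alternate_sort(lcid)]
    return regular, alternate


if __name__ == "__main__":
    sample = [0x0000040A, 0x00000C0A, 0x0000080A, 0x00010427]
    regular, alternate = partition_locales(sample)
    print("regular:  ", [f"0x{lcid:08x}" for lcid in regular])
    print("alternate:", [f"0x{lcid:08x}" for lcid in alternate])
```

An application that still runs on older versions of Windows could apply the same kind of filter itself to get a consistent locale list everywhere.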

Which (by the way) is also the reason it took so long to do this post. You see, I put this decision on the list of decisions where, whether right or wrong, I could see some major bug being put in that would require the plan to change, so I essentially waited to see if the decision would withstand the big bug (which it did, yesterday).

In total fairness to the bug, I coded the 'fix' to revert the old function's behavior and it was tested, while the whole conversation about what to do was discussed and triaged and rediscussed and retriaged. I exerted minimal influence on any of the conversations, though I agreed with every correct and intelligent point that was brought up (there were many) and said I would go with whatever decision people wanted to go with (as an ultimate test of the "sanity will prevail" theory).

And you want to know what? I think that in the end, it did.

So, it is now official -- as of Vista, Traditional Spanish is seen as an alternate sort in Windows, and next week when I am talking to Julie before Vienna Teng opens for Madeleine Peyroux at the Moore I can tell her that we finally cleaned that little issue up. :-)

 

This post brought to you by ñ (U+00f1, a.k.a. LATIN SMALL LETTER N WITH TILDE)


referenced by

2009/09/02 How ConvertDefaultLocale sorta broke backward compatibility in Windows 7, and why

2006/10/03 They say it happens to everyone, at some point...
