Why is the separator blue?

by Michael S. Kaplan, published on 2008/02/21 10:01 -05:00, original URI: http://blogs.msdn.com/b/michkap/archive/2008/02/21/7836829.aspx


People never ask the easy questions, now do they?

It starts from an early age -- why is the sky blue? Why do I have to go to school on days when nothing interesting is being taught? Why do I like that girl? And so on....

And it never gets any easier. Like yesterday, when the questions went something like this:

I work on an application and we use the digit grouping symbol set by the user. We are currently able to support digit grouping symbol one character long, whereas intl.cpl allows the user to set up to 3 character string as the digit grouping symbol.

Are there any languages which have 3 character digit grouping symbol and why does windows provide this support?

Like I said, people never ask the easy questions, now do they?

Now if you look at every shipping version of Windows and every built-in locale that ever has been or is, there is only one locale with a LOCALE_STHOUSAND longer than one WCHAR when returned from GetLocaleInfoW, and that is the test qps-ploc pseudo locale (LANGID value 0x0501) I mentioned in "Walking off the end of the eighth bit" and "The name of the song is not 'ps-PS I Love You'" (its value is ,, presumably just to help people test the kind of code assumptions that this application was making).

So the direct answer is that no locale on the system that is really there will have this problem.

But in the words of The Wolf in Pulp Fiction, Let's not go sucking -- er, never mind, you can watch the movie. Let's just not congratulate ourselves just yet....

This particular upper limit has been around Windows since at least NT4 (possibly earlier?) and I believe has been settable programmatically and in the UI all this time.
 
It can be hard to track down why something was done a decade ago in Windows (people have either moved up, moved on, or moved out -- of the fewer than 10 people who have "owned" the data and 30 or so who have made changes to it, probably half of them are still at Microsoft, but among their cadre are enough General Manager and Director types that a meeting of the "locale data changes checked in" club seems unlikely!).

But with that said, most of the locale upper limits were based on research into future locales that might need the higher limit (for some other properties the limits had to be raised when future locales proved that the extreme choices were themselves not extreme enough -- that way lies "When the roof got raised, and why").

And of course it almost goes without saying that in a custom locale/custom culture, all bets are off, and it is quite likely that one of those "not [yet] added to Windows" locales can suddenly be on a machine.

Given that, the "why" question is probably not as interesting as it seems. If any user can break your application in some way, that sounds serious and it should definitely be considered a bug in the application, though it likely does not happen much or people would be reporting it more -- so it can be triaged accordingly (the fact that no built-in locales other than a test one designed to ferret out such applications would hit the problem would also be a good factor in the triage).

But given how easy it is for someone to injure anyone who makes the assumption, the bug should be triaged on the seriousness of the behavior if it does happen -- no one wants an application that can be brought to its knees with a single SetLocaleInfo call.

In part we already have that -- the knowledge base is full of articles like Q251005 which point to the strange things that happen to applications when these features are changed, and the problems are not going away (even some of .NET parsing gives up if you have the decimal and thousands separators the same, to this day!).
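As a rough illustration of why identical separators are such a problem, here is a Python sketch. It is not how .NET or Windows actually parses numbers; parse_grouped is a hypothetical helper, and the separators are passed in as arguments rather than read from the locale. The idea is simply to refuse overlapping separators up front instead of guessing:

```python
def parse_grouped(text: str, thousands: str, decimal: str) -> float:
    """Parse a non-negative number written with the given separators.

    Hypothetical helper for illustration only. Both separators may be
    longer than one character.
    """
    if not decimal:
        raise ValueError("decimal separator must not be empty")
    # If one separator contains the other, there is no reliable way to
    # tell which one a given occurrence in the text was meant to be.
    if thousands and (thousands in decimal or decimal in thousands):
        raise ValueError("thousands and decimal separators overlap")

    # Split at the first decimal separator, then drop the grouping.
    whole, _, fraction = text.partition(decimal)
    if thousands:
        whole = whole.replace(thousands, "")

    if not whole.isdecimal() or (fraction and not fraction.isdecimal()):
        raise ValueError(f"not a number: {text!r}")
    return float(f"{whole}.{fraction or '0'}")
```

With this sketch, parse_grouped("1,,234,,567.5", ",,", ".") yields 1234567.5, while passing "," for both separators raises ValueError rather than returning a guess.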

But if your application can't work within the documented limits that any user can take advantage of, you are going to find yourself with big problems down the road....
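For what it is worth, working within the limit does not take much code. Here is a minimal Python sketch, assuming the separator string has already been read from the user's settings (on Windows that would be the LOCALE_STHOUSAND value) and assuming uniform groups of three digits, which not every locale uses. The name group_digits is hypothetical:

```python
def group_digits(digits: str, separator: str, group_size: int = 3) -> str:
    """Insert `separator` between groups of digits, counting from the right.

    Hypothetical helper for illustration only. The separator is treated as
    an opaque string, so one, two, or three characters all work the same.
    """
    if group_size <= 0 or not separator:
        return digits  # nothing to group with

    groups = []
    end = len(digits)
    while end > 0:
        start = max(0, end - group_size)
        groups.append(digits[start:end])
        end = start

    # Groups were collected right to left, so reverse before joining.
    return separator.join(reversed(groups))
```

Calling group_digits("1234567", ",") gives "1,234,567", and group_digits("1234567", ",,") gives "1,,234,,567". Nothing in the function depends on how long the separator is.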

 

This post brought to you by ≠ (U+2260, a.k.a. NOT EQUAL TO)


