by Michael S. Kaplan, published on 2010/04/11 07:01 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2010/04/11/9990383.aspx
Do you know that old saying, "Don't dish it out if you can't take it," by any chance?
Microsoft doesn't always follow this, from a technical perspective.
Perhaps I should explain....
There are times that Microsoft has issues or bugs in its code, which it then has to fix.
You probably already knew that, of course.
If you think back to what you were doing a few days after Christmas in 2005, you may have been reading "When the roof got raised, and why", where you will see an example of such a bug.
In that case, the bug happened because strings stored in a locale were not subject to the same safeguards and checks when retrieved by GetLocaleInfoEx (or used in other functions) as the strings you might set with SetLocaleInfo or pass to one of the other functions that take customized information, like the ones that use a NUMBERFMT structure.
Interestingly, the importance of the limitations for each field is in most cases an illusion.
The serious problems that come up, like the ones I mentioned in "Seeing double? You're not drunk; you're just running pseudo! (aka Announcement: Pseudo Day!)", happen within the legal limits for fields, so the fact that there are safeguards doesn't specifically help.
The few times that it was literally problems with the length that came up, like in the case of "When the roof got raised, and why", there were no security or other issues with the code that came up due to locales having longer strings than were documented. The only "bug" was in the documentation, and the only unintuitive piece was the fact that the Get* and Set* functions have two different rules.
Since Microsoft creates the data and it goes through all kinds of checks and screenings, this doesn't come up very much. But like I said, the code handles the functionality case pretty well anyway.
Which is why I had come to realize, since that blog at the end of 2005, that the importance of the limitation is probably overblown.
But the other day I had this proven to me in a way that I never even thought of before.
Let me take a step back here and mention that my machines have a lot more locales on them than yours probably do.
I won't give an exact number, mind you.
But I take serious advantage of the custom locale support in Vista and Windows 7 and have added all kinds of different locales. Locales that there is:
But of course since my machines are never shipped to customers, it is safe for me to do this on my machine, just as it would be safe for you to do the same for the purposes of adding one locale or twelve or twelve hundred or however many you could come up with data for....
These custom locales are "safe" in the same way that Microsoft's built-in ones are, because the methods and objects and tools used to create them construct them in a way that validates the data.
You may know where I am going with this blog now, by the way!
As one of the few people who recognized the plot twist in The Sixth Sense way before the end of the movie, I have a certain amount of respect for those whose brains can think ahead like that. :-)
Now of course you cannot set the AM/PM strings to exceed the length, because those are checked.
But in Vista, a change was made to stop storing certain fields that were pretty much only there for POSIX locale support and which were actually not used for locale formatting/parsing, since that data was stored by other means anyway. Fields like LOCALE_STIME and LOCALE_SDATE -- look at the docs for them. They are deprecated, and if you try to retrieve them they will be derived from one of the respective formats that contain them.
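To make the "derived from the format" idea concrete, here is a minimal sketch in Python. It is only an illustration under assumptions of my own: the helper name derive_time_separator is hypothetical, and the rule it uses (take whatever literal text sits between the hour field and the minute field, honoring single-quoted literals) is a guess at the general shape of such a derivation, not a description of what Windows actually does.

```python
def derive_time_separator(time_format: str) -> str:
    """Return the literal text between the hour and minute fields.

    Hypothetical helper. Assumes h/H mark hours, m marks minutes, and
    single quotes wrap literal text in the format picture.
    """
    separator = []
    seen_hour = False
    in_quotes = False
    for ch in time_format:
        if ch == "'":
            # A single quote opens or closes a run of literal text.
            in_quotes = not in_quotes
        elif in_quotes:
            # Quoted text after the hour field is part of the separator.
            if seen_hour:
                separator.append(ch)
        elif ch in "hH":
            # An hour specifier starts (or restarts) the separator.
            seen_hour = True
            separator.clear()
        elif ch == "m" and seen_hour:
            # The minute field ends the separator.
            return "".join(separator)
        elif seen_hour:
            separator.append(ch)
    return ""


print(repr(derive_time_separator("h:mm:ss tt")))
print(repr(derive_time_separator("HH' hrs 'mm")))
```

Notice that nothing in a derivation like this cares how long the result is; it simply returns whatever the format contains.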
Because of this, you might create a locale with date and/or time formats that contain separators that are longer than three characters, just based on simple data you got from the CLDR or some other source.
And now suddenly you are in the same boat as the person at Microsoft who caused those problems that led to "When the roof got raised, and why". And your "gets" can now do more than your "sets" could ever do!
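The asymmetry is easy to model. The sketch below is plain Python, not a call into any Windows API: SketchLocale, its methods, and the regular expression are all hypothetical names and simplifications of mine. The only thing taken from the discussion above is the three-character limit on the "set" side and the fact that the "get" side is derived from a stored format.

```python
import re

MAX_SEPARATOR_CHARS = 3  # the limit described above for the "set" side


class SketchLocale:
    """Hypothetical stand-in for a locale; not a Windows API."""

    def __init__(self, time_format: str):
        # The format is stored as given, with no separator length check.
        self.time_format = time_format

    def set_time_separator(self, separator: str) -> None:
        # "Set" path: the length is validated before anything is stored.
        if len(separator) > MAX_SEPARATOR_CHARS:
            raise ValueError("separator longer than %d characters"
                             % MAX_SEPARATOR_CHARS)
        self.time_format = "H%smm%sss" % (separator, separator)

    def get_time_separator(self) -> str:
        # "Get" path: derived from the stored format, no length check.
        # Simplification: quoted literals are not handled here.
        match = re.search(r"[hH]+(.*?)m", self.time_format)
        return match.group(1) if match else ""


locale = SketchLocale("H----mm----ss")
print(repr(locale.get_time_separator()))   # the "get" hands back 4 characters

try:
    locale.set_time_separator("----")       # the same value via "set"
except ValueError as error:
    print("rejected:", error)
```

The same four-character string that the "get" happily returns is one the "set" refuses to accept.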
Who would have thought that custom locales would give you the opportunity to create the same kinds of bugs yourself? :-)
Imagine if they fix this and break a locale that someone has deployed -- a locale based on data hosted by the CLDR or other source. Wow, this is setting up for an interesting future problem!
As it turns out, the story here isn't that simple even right now; the way I found the bug shows that.
I created a locale with Microsoft Locale Builder, saved it out as an LDML file, and created a setup package which I then installed.
But the bug I ran into? Later that same day, MSLB could not load the LDML file that it had created, nor could it load from the existing custom locale I had installed (which, again, it had created).
Loading from LDML in MSLB would pop up this error:
while trying to load from the existing, installed locale would pop up a slightly different error:
I'd be more forgiving of the first error if there actually was a "TimeSeparator" field in the LDML so I had some sense of what to change/fix, but no dice.
Both errors are kind of confusing, so hopefully people can find this blog to get hints on how to work around the problem. :-)
Just to help people searching for the error before there is a KB article, I'll put it in text so search engines can find it:
Parameter name: The property 'GregorianDateTimeFormat.TimeSeparator' must be less than or equal to 4 characters in length.
By the way, shouldn't that be three characters in length?
Until they get around to fixing this issue (if they do, I mean - there is a backcompat risk to fixing by adding more validation), you can enjoy this interesting opportunity to raise the roof yourself!