Seeing double? You're not drunk; you're just running pseudo! (aka Announcement: Pseudo Day!)

by Michael S. Kaplan, published on 2010/03/02 07:01 -05:00, original URI: http://blogs.msdn.com/b/michkap/archive/2010/03/02/9971296.aspx


Some of you regular readers1 might recall LOCALE_SDECIMAL? Quite a character! Or two. Or three... from the middle of last month.

I (in a somewhat offhand manner2) mentioned

Now, you may remember how I have talked about the pseudo locales like qps-ploc that test out extreme cases harder to readily test for in prior blogs like Walking off the end of the eighth bit. Well it just so happens that qps-ploc has two characters in its decimal separator to make this feature easier to test out and make sure it does not break you.

So you can look at this as your chance to find the bug that even folks at Microsoft missed, and fix it in your own code.

And just remember that this feature has now been around for over a decade and a half, so if it breaks your application then it is your application that is broken. And has been for some time.

while I was there.

Maybe I should explore that one a bit!

There are other such "doubles" in there.

Like LOCALE_SDATE, which is the two-character \\ -- once again the maximum number of characters of the data? Four, including the NULL.

And LOCALE_STIME, which is the two-character :: -- and once again the maximum number of characters of the data? Four, including the NULL.

And LOCALE_SLIST, which is the two-character ,, -- and once again the maximum number of characters of the data? Four, including the NULL.

And LOCALE_SCURRENCY, which is the two-character $$ -- and the maximum number of characters of the data? Thirteen, including the NULL.

And LOCALE_SPERCENT, which is the two-character %% -- and the documentation on this one has to be different: "The value is a single string of 0 to 3 characters." I suspect this is actually four, including the NULL, to be perfectly honest.

And of course LOCALE_STHOUSAND, which is the two-character ,, -- and once again the maximum number of characters of the data? Four, including the NULL.

UPDATE 5 March 2010:

And then there is LOCALE_SMONDECIMALSEP, which is the two-character .. -- and once again the maximum number of characters of the data? Four, including the NULL. You must still use Persian if you want to test what happens when it is different from LOCALE_SDECIMAL!

And also there is LOCALE_SMONTHOUSANDSEP, which is the two-character ,, -- and once again the maximum number of characters of the data? Four, including the NULL.

And of course LOCALE_SNEGATIVESIGN and LOCALE_SPOSITIVESIGN are the two-character -- and ++ (respectively) -- and the maximum number of characters of the data? Five, including the NULL.

I already talked about the LOCALE_SDECIMAL....

Every one of these things has been legal for as long as the settings have been available in Regional and Language Options, going all the way back to NT 3.1. If your application breaks with this locale/culture, then (given the small number of applications around back that far) you have been broken since BIRTH!

Think about that for a second.
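
To make the failure mode concrete, here is a minimal sketch in Python (not anything from NLS or the .NET Framework) of parsing a number without assuming that any separator or sign is a single character. The function name parse_number and its parameters are made up for illustration; the default values are simply the doubled strings listed above for qps-ploc. It does not attempt to handle separators that overlap one another.

```python
from decimal import Decimal

ASCII_DIGITS = "0123456789"


def _is_digits(s):
    # True only for a non-empty run of ASCII digits.
    return s != "" and all(c in ASCII_DIGITS for c in s)


def parse_number(text, decimal_sep="..", thousand_sep=",,",
                 negative_sign="--", positive_sign="++"):
    """Parse text whose separators and signs may each be several characters.

    Hypothetical helper for illustration; the defaults mirror the doubled
    values described for the qps-ploc pseudo locale.
    """
    s = text.strip()
    sign = ""
    # Compare against the whole sign string, never just its first character.
    if negative_sign and s.startswith(negative_sign):
        sign = "-"
        s = s[len(negative_sign):]
    elif positive_sign and s.startswith(positive_sign):
        s = s[len(positive_sign):]

    # Split on the whole decimal separator string.
    integer_part, found, fraction_part = s.partition(decimal_sep)
    if thousand_sep:
        integer_part = integer_part.replace(thousand_sep, "")

    if not _is_digits(integer_part) or (found and not _is_digits(fraction_part)):
        raise ValueError(f"not a number under these settings: {text!r}")

    return Decimal(sign + integer_part + ("." + fraction_part if found else ""))
```

Under these assumptions, parse_number("--1,,234..56") gives Decimal("-1234.56"), while a string written with a single-character decimal point such as "1.5" is rejected rather than silently misread.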

And reports come in all the time; even the .NET Framework's BCL has some problems with this in its date/time code.

Yes, these same items are changeable in the .NET Framework.

The latest problem I have seen reported? In the ManagementDateTimeConverter.ToDateTime method, which as the documentation says:

Converts a given DMTF datetime to DateTime. The returned DateTime will be in the current time zone of the system.

Well, if you are using that cool pseudo locale, its results are apparently a less impressive System.ArgumentOutOfRangeException (Specified argument was out of the range of valid values.).

No word in the report on whether any of the other ManagementDateTimeConverter methods fail, though I am not terribly hopeful for them not doing so under the circumstances.

Slightly less impressive. Maybe just a scosh (unless your application hits this after you've deployed it, I mean).
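
The report does not say what goes wrong inside that method, so the following is only a sketch of one way to stay out of trouble: a DMTF datetime has a fixed layout (yyyymmddHHMMSS.mmmmmmsUUU, where s is + or - and UUU is the UTC offset in minutes), so it can be read by position without consulting any of the user's separators. This is Python rather than .NET, the name parse_dmtf_datetime is made up for illustration, and it is not how ManagementDateTimeConverter is implemented. It also keeps the offset found in the string instead of converting to the system's current time zone, and it does not handle the asterisk wildcards the format allows.

```python
from datetime import datetime, timedelta, timezone

ASCII_DIGITS = "0123456789"


def parse_dmtf_datetime(value):
    """Parse a DMTF datetime such as '20100302070100.000000-300'.

    Hypothetical helper: every field is read by fixed position, so no
    locale-specific date, time, or decimal separator is ever involved.
    """
    # 14 digits, '.', 6 digits, sign, 3 digits = 25 characters.
    if len(value) != 25 or value[14] != "." or value[21] not in "+-":
        raise ValueError(f"not a DMTF datetime: {value!r}")

    digits = value[0:14] + value[15:21] + value[22:25]
    if not all(c in ASCII_DIGITS for c in digits):
        # Wildcard fields ('*') are deliberately not supported in this sketch.
        raise ValueError(f"unsupported DMTF datetime: {value!r}")

    offset_minutes = int(value[22:25])
    if value[21] == "-":
        offset_minutes = -offset_minutes

    return datetime(
        int(value[0:4]),    # year
        int(value[4:6]),    # month
        int(value[6:8]),    # day
        int(value[8:10]),   # hour
        int(value[10:12]),  # minute
        int(value[12:14]),  # second
        int(value[15:21]),  # microseconds
        tzinfo=timezone(timedelta(minutes=offset_minutes)),
    )
```

The point of the design is that the literal '.' in the DMTF string is part of a machine format, not the user's decimal separator, so nothing here changes when the locale does.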

So.

I have a request if you are a developer of anything that runs on Windows.

If you own code that runs on Windows, please check out this locale -- you may just find and fix your bugs before your customers cause them!

And if you are a user, try switching to this locale/culture -- see what you can break here.

Both groups -- the break can be anything from BSOD to appcrash to wrong results to parse errors to missing functionality to whatever (including failure to respect the settings).

Maybe everyone should do it on March 15th; it would be forever known not just as the Ides of March but as "Pseudo Day", the day Pseudo broke all the applications3!

Well, I doubt this will reach everyone as I don't have that much of a readership. Most of them just listen rather than comment or contact me anyway. But it might be fun to see how many bugs can be posted to products all over the world (including the Microsoft ones) on that day. I've been told that the truth is a virus - can we infect some of these misbehaving applications on that day? :-)

Since Gitlow v. New York defined the clear and present danger test4, I don't think my request above can be taken as a serious attempt to overthrow or take down any governments, even if one or two are broken a bit by my humble request in this blog to see what can be broken by switching to this locale5. There aren't any elections that I know of on that day....

 

1 - There do seem to be a few more of you now that I am a regular writer again!
2 - I have a tendency to usually bury the lead, mainly because I go for cute titles, much moreso than /. or TechCrunch bait in the hopes of snagging more "hits". Geez....
3 - Well, not every app - remember flash.globalization (ref: here) didn't screw it up. So the web and its videos will be able to go on. ;-)
4 - And I know my Gitlows. I've even dated one6,7. I'll admit I've lived an unusual life8,9.
5 - If you work for the FBI, the CIA, the NSA, the DIA, the DOD, the NVTC, Homeland Security, or any other such agency, please do not try this out on March 15th.
6 - Ok, two.
7 - Though old Benjamin Gitlow the communist was on the other side of the family10,11.
8 - Even being a nanny of some ex-Sullivanians a mere year or two later. 
9 - It might have made a somewhat readable autobiography if it could be made of Twitter-sized revelations like these. :-)
10 - Plus they shared none of his political beliefs.
11 - They also did not share his notoriety of having a lawyer "win" a Supreme Court case but then still having to stay in jail. Geez, talk about bad luck!
13 - Will anyone notice that this footnote has nothing pointing to it14,15?
14 - Or that the numbering kind of skipped a tad16?
15 - I blame it on the fact that Schoolhouse Rock never played that Twelve cartoon enough when I was growing up. Just so you know.
16 - What is up with tad/scosh crap in this blog? How can one tell how big a tad or a scosh even is17?
17 - Perhaps it is the fractional value diddly/squat18.
18 - Bonus points for anyone who knows the source of the allusion for note 17 (though minus points for reading this far).


parkrrrr on 2 Mar 2010 11:43 AM:

For my net zero points, I will merely mention that he was certainly funniest in the first incarnation. The two spinoffs were a bit more tired.

prl on 2 Mar 2010 12:10 PM:

If squat=0, wouldn't diddly/squat be undefinable in a real number system? (http://en.wikipedia.org/wiki/Division_by_zero).

By the way, I saw a young girl on TV the other day that would definitely be someone you could relate to. She was spelling her name and she put 'Sa8m'; when asked about it, she said 'it's a silent 8'.

Random832 on 4 Mar 2010 7:55 AM:

What about LOCALE_SMONDECIMALSEP (etc)? What if they're different? Does the pseudo locale test this? What about the negative sign symbol? which I can't find a constant for, but which lets me type four characters in the box, so presumably five including null.

What is meant to happen if one of these strings contains another as a substring, or shares a common initial substring? This locale does not test this case*. If both the thousands separator and decimal point are a dot or comma, the windows 7 calculator, which allows both in pasted data, will treat all instances of the offending character as a thousands separator. But what's a program supposed to do if LOCALE_SDECIMAL is "." and LOCALE_STHOUSAND is ".,"?** The obvious answer seems like it would need a state machine specially built to handle this case - does any such thing exist as a library function? [It seems an awful lot of work to do without any concrete example*** of an actual cultural need**** for it]

* - apart from the trivial case of LOCALE_STHOUSAND and LOCALE_SLIST being identical, which is also the case in some perfectly ordinary locales [e.g. 1033], so they're never expected to appear together anyway.

** - Its support for multiple characters in these things is as bad as you mentioned. Interesting fact: With thousands ".," and decimal ",", it interprets the string "1.,234,56" as 1[dec]23456.

*** - ref: at least you didn't name any in http://blogs.msdn.com/michkap/archive/2008/02/21/7836829.aspx [which I had a hard time finding again since there was no link back from "Quite a character"]

**** - as opposed to what's likely to be dismissed as "users messing with things they shouldn't"

***** - Bloom County. I don't recall the precise context, but I believe it involved chopping up a watermelon which represented someone's pre-tax income.

Michael S. Kaplan on 5 Mar 2010 8:42 PM:

I updated the ones I saw... :-)

Michael S. Kaplan on 5 Mar 2010 8:50 PM:

For the other big case involving strings and substrings, no one thought of it. Some NLS tester may think of it now and test it (and put in the bug!) now that it was mentioned here.

They may even credit the comments in this very blog!


referenced by

2012/06/20 Not giving someone their space can lead to SA (Separator Anxiety)

2011/01/28 Punishing a locale for its language's letter choices is just unseemly

2010/04/11 When one dishes it out but can't take it (aka Raising the roof, 2010 edition)
