Punishing a locale for its language's letter choices is just unseemly

by Michael S. Kaplan, published on 2011/01/28 07:01 -05:00, original URI: http://blogs.msdn.com/b/michkap/archive/2011/01/28/10121371.aspx


So, every now and again we have a bug in a function that is part of Windows.

You may have run across this sort of thing in the past.

Now there are times that one can't convince people that the scenario that causes the bug is reasonable, so that even if they acknowledge that the problem exists, they don't feel it makes sense to fix.

The bugs in Excel and .Net described in Seeing double? You're not drunk; you're just running pseudo! (aka Announcement: Pseudo Day!) are an example of this kind of problem. The fact that a particular locale (pseudo, in fact!) hits problems that any user can repro by customizing some of the Regional and Language Options settings doesn't bother the appropriate owners of Excel or .Net.

That bug is not the subject of today's blog.

Today I'm going to talk about another bug, one found in GetTimeFormat and GetTimeFormatEx. One which has existed in Windows for over a decade. And one that you can see with an existing locale, in a very visible part of the user interface.

To see the bug, just switch to any one of the German locales, and change your time formats to one of the optional formats that includes a ' Uhr' literal in it.

The approved syntax for a string literal within a format string is to enclose the string in single quotes -- which is how these formats are designed, for German. My German is pretty rusty but I think Uhr means "time" or maybe "clock" or something like that -- the way in English we might say "O'Clock". A German speaker whose knowledge is not based on Yiddish (which is really just 16th century German with Hebrew letters) should feel free to correct/clarify that point!

Then make any call to GetTimeFormat or GetTimeFormatEx with the TIME_NOSECONDS or TIME_NOMINUTESORSECONDS flag.

While the straight time format will be something like

11:04:27 Uhr

the format with TIME_NOMINUTESORSECONDS will be

11 11r

and the format with TIME_NOSECONDS will be

11:04 11r

Now it is pretty easy to see what is going on here -- the fact that ' Uhr' is a literal is ignored, and because an 'h' that is not treated as part of a literal has a specific meaning in the format string, the hour gets inserted. Of course the time in the system notification area (aka the "system tray") goes through this code path, as do other fun spots like the time in the CMD "dir" command.
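To make that class of mistake concrete, here is a minimal Python sketch of a time-picture renderer. It is not the Windows implementation; the names `render` and `expand` are made up for illustration, and escaped quotes inside literals are not handled. With `quote_aware=False` it scans the literal text for picture letters, which is the kind of error described above. It does not reproduce the exact Windows output (which also loses the "U"), only the way an 'h' inside a literal turns into the hour.

```python
import re
from datetime import time

# A token is a quoted literal, a run of one picture letter, or any other
# single character (copied through unchanged).
TOKEN = re.compile(
    r"'(?P<literal>[^']*)'|(?P<field>h+|H+|m+|s+)|(?P<other>.)", re.DOTALL
)

def expand(field, t):
    """Expand one picture field such as 'HH' or 'm' for the time t."""
    letter, width = field[0], min(len(field), 2)
    value = {
        "H": t.hour,             # 24-hour clock
        "h": t.hour % 12 or 12,  # 12-hour clock
        "m": t.minute,
        "s": t.second,
    }[letter]
    return str(value).zfill(width)

def render(picture, t, quote_aware=True):
    """Render t using picture; literals are honored only if quote_aware."""
    if not quote_aware:
        # The mistake: discard the quote marks and scan everything.
        picture = picture.replace("'", "")
    out = []
    for m in TOKEN.finditer(picture):
        if m.lastgroup == "literal":
            out.append(m.group("literal"))
        elif m.lastgroup == "field":
            out.append(expand(m.group("field"), t))
        else:
            out.append(m.group("other"))
    return "".join(out)

now = time(11, 4, 27)
print(render("HH:mm:ss' Uhr'", now))                     # 11:04:27 Uhr
print(render("HH:mm:ss' Uhr'", now, quote_aware=False))  # 11:04:27 U11r
```

The only difference between the two calls is whether the text between the single quotes is skipped by the scanner, which is all it takes for a locale's letter choices to matter.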

Related bugs can be found with custom formats involving literals containing m or s as well as h, but given the issues raised in Seeing double? You're not drunk; you're just running pseudo! (aka Announcement: Pseudo Day!) I'm going to focus on the time formats in actual shipping locales....

And they can all show this terrible, weirdly formatted time string that has no excuse for the poor parsing job that it does, and has been doing for over ten years now.

Even after the refactoring work they did in the most recent version, described in We do seem to be short on time... (Windows 7 edition).

Okay, it is technically not a regression.

I would hardly recommend fixing it in a hotfix, or a GDR, or a service pack.

But I would never leave it broken this way in any function I owned for a major version....

Would you?

If not, then at a minimum the documentation should clearly warn against TIME_NOSECONDS and TIME_NOMINUTESORSECONDS since they do not work properly with string literals that contain format picture inserts. Punishing a locale for its letter choices is just unseemly....
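Until something like that happens, one possible workaround is to avoid the two flags and instead edit the format picture yourself, leaving quoted literals alone. The Python sketch below is an illustration under assumptions, not a documented recipe: `drop_fields` is a hypothetical helper, it assumes the separator immediately before a dropped field should go with it, and it does not handle escaped quotes or a dropped field at the very start of the picture.

```python
import re

# Split a picture into quoted literals, runs of one picture letter,
# separator text, or a stray unmatched quote.
PIECE = re.compile(r"'[^']*'|h+|H+|m+|s+|t+|[^'hHmst]+|'")

def is_separator(piece):
    """True for plain separator text (not a literal, not a field)."""
    return not piece.startswith("'") and piece[0] not in "hHmst"

def drop_fields(picture, letters="s"):
    """Remove the given field letters from picture, keeping literals intact."""
    kept = []
    for piece in PIECE.findall(picture):
        if piece[0] in letters:
            # Assumption: the separator just before a dropped field goes too.
            if kept and is_separator(kept[-1]):
                kept.pop()
            continue
        kept.append(piece)
    return "".join(kept)

print(drop_fields("HH:mm:ss' Uhr'"))        # HH:mm' Uhr'
print(drop_fields("HH:mm:ss' Uhr'", "ms"))  # HH' Uhr'
print(drop_fields("h:mm:ss tt"))            # h:mm tt
```

The resulting picture could then be passed as the lpFormat argument of GetTimeFormatEx with no TIME_NOSECONDS or TIME_NOMINUTESORSECONDS flag, so the literal is never rescanned. Whether the trailing literal should survive when fields are dropped is a design decision; this sketch keeps it.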


John Cowan on 28 Jan 2011 7:43 AM:

But can you be SUUUUUURE that no customers are depending on this exact bug?

After all, the Billster himself said that Microsoft only issues new versions for new features (right back to MS-DOS itself), not to fix bugs.

Michael S. Kaplan on 28 Jan 2011 8:08 AM:

Thus my minimal suggested fallback position -- owning up to the bad behavior.

Or perhaps a new flag that does not subvert the right of the German format string to be properly expressed? :-)

Jeremy Drake on 28 Jan 2011 10:31 AM:

My recollection from high school German class is that you are correct.  "Uhr" literally means clock, but when used in this context you can think of it more as "O'Clock".

However, my recollection also tells me that the proper way to express a time is hours(0-23) 'Uhr' minutes, not hours minutes 'Uhr'.  For example, "zehn Uhr neunundzwanzig".  This page seems to confirm that: german.about.com/.../blht_telltime.htm

"8.For precise times, you say Uhr between the hour and the minutes: "zehn Uhr zwölf" = 10:12."

Michael S. Kaplan on 28 Jan 2011 11:50 AM:

Yea, this is not the default format for them (the default is the 24-hour clock!). But the non-default shouldn't fail....

Aaron.E on 28 Jan 2011 11:51 AM:

Do you get a steady stream of bug reports from German users about this issue?  I don't know how something like this could have gone ten years without getting any attention, unless no one is using that time format.  Has there ever been an investigation into whether that format is used enough to justify the expense of addressing the bug?  If it's not, maybe the easiest solution is hiding that format from the UI.  

Of course, if this is a common format or if there are many time formats with a similar problem, then adding a new flag in a major version is probably the only safe 'fix'.  Though I would think the documentation should be amended to include this caveat rather sooner than that.

Michael S. Kaplan on 28 Jan 2011 12:46 PM:

AFAIK it has only been reported one time so far, which seems unlikely to me but I lack proof to the contrary.

Random832 on 31 Jan 2011 9:16 AM:

The bigger question is why " 'U" is getting removed at all - it's like it thinks that text that follows the seconds bit is supposed to be removed, and it's merely failing to remove _all_ of it because it thinks it sees another hour bit. So, in other words, if this bug were fixed, it would turn it into "11" or "11:04" respectively when it [to my uninformed eye] ought to be "11 Uhr" and "11:04 Uhr", since the 'Uhr' text isn't associated with the minutes and seconds that are being removed.

Why was this done rather than to define a LOCALE_SSHORTESTTIME for the hours-only case, with its own text box in the control panel, and make all the locale data owners define it? And what exactly does TIME_NOSECONDS do that LOCALE_SSHORTTIME [which ought not to contain seconds] doesn't?

Also, where do the alternate choices in the dropdowns come from? I can't see the 'Uhr' in CultureInfo("de-DE").DateTimeFormat - are these not exposed in .NET? are they exposed anywhere?

"(the default is the 24-hour clock!)" You say that as if to imply "HH:mm:ss 'Uhr'" isn't a 24-hour clock format.

Michael S. Kaplan on 31 Jan 2011 11:16 AM:

Obviously if they fix the parse bug they have to look at all of that....

TIME_NOSECONDS and TIME_NOMINUTESORSECONDS have existed in Windows for over 10 years, and short time is new in the latest version, for .Net compatibility. Editing every locale to add new alternate formats is probably not the most effective use of resources. :-)

