Find the [globalization] bug(s)?

by Michael S. Kaplan, published on 2006/04/07 21:50 -04:00, original URI:

This may be fun. Extra credit to those who can find problems without copying and pasting the code somewhere and compiling it (and you are all on the honor system if you want to claim the extra credit!)....

It is technically a C# question, though mainly it is just a .NET Framework question.

First, there is a step to set up the machine (for the extra credit just do this step mentally and keep in mind that you did it!):

In Regional and Language Options->Regional Options->Customize->Date, set the Date Separator to ":" -- the idea is to make sure that both the Date and Time Separators are identical.  

Ok, now just review the following bit of code:

DataTime dt1 = System.DataTime.Now();
String st = dt1.ToUniversalTime().ToString();
DateTime dt2 = System.Convert.ToDateTime(st);

All comments will be moderated until some time over the weekend.

I suppose there could even be additional extra credit for any non-programmer who has a reasonable theory....


# Gabriel Lozano-Morán on 8 Apr 2006 2:32 AM:

My guess is that this would fail because the date is not an ISO 8601 date format and therefore the internal parsing will fail.

I think that the real question here is "why does the above code fail?"

# Michael S. Kaplan on 8 Apr 2006 10:26 PM:

Hmmm... not too many people wanting to make any guesses (or to try compiling the code and running it to see what happens, either). I'll wait a bit longer before the answers get posted, and give some people a chance to look good. :-)

# on 9 Apr 2006 9:58 PM:

The convert piece needs a locale.

# Jeff Parker on 10 Apr 2006 9:50 AM:

Sorry just getting in and reading the blogs from over the weekend.

Anyway, my answer would be very similar to this one I gave on webservices.

Anyway, this is not really a bug, but the string produced here by dt1.ToUniversalTime().ToString() will not contain any UTC offset, since by default ToString does not include one. So when you convert it back on the next line, the result will be the time in the string but will have the UTC offset of the computer you are running this code on.

# Mark on 10 Apr 2006 10:19 AM:

Without running the code, my guess would be that because the date and time separators are the same, the convert will have problems differentiating the date portion from the time portion.

# Brandon Haase on 10 Apr 2006 1:04 PM:

I'll take a guess, trying for partial extra-credit since I referenced msdn ;)

First, the first line has DataTime instead of DateTime... so it wouldn't compile unless you've defined System.DataTime. Also, Now is a static property of DateTime, so you'll get another error for the parentheses.

Assuming you meant DateTime dt1 = DateTime.Now;, dt1 would be fine, because internal representation of the data does not care about the delimiters.

The string representation of the date assigned to the st variable would have :'s as a date separator. You would get something like "4:10:2006 12:52:31 PM". Note the universal time conversion subtracts the UtcOffset.

System.Convert.ToDateTime doesn't blow up because the DateTime.Parse method (what ToDateTime invokes) disregards unrecognized elements and substitutes missing elements from the current date to avoid throwing an exception.

I wouldn't be too surprised if the date value is parsed as date *and* time or the operation swaps or otherwise munges the date/time info.

Now I am going to compile it and see what happens...

# Brandon Haase on 10 Apr 2006 1:10 PM:

Ooh... just barely missed it.

The parse operation ignores the real date value and parses the time as though it were a date. No time value is parsed.

Nice one. Where was this found?

# Michael S. Kaplan on 10 Apr 2006 4:23 PM:

There are indeed many problems here, some of which were found by people (see above comments).

The principal problem is the mismatch -- formatting a date into a string in a manner independent of culture and then parsing it back to a date in a culture-sensitive manner.

As Brandon indicated, the change in Regional Options makes the problem more noticeable, since instead of sometimes causing invalid date values to be returned it will actually throw an exception due to the confusion that the change causes.

I won't argue that this makes the bug worse though; I would argue it makes the bug better! Because bugs that corrupt data without being noticed can be much much worse for an application, in the long run.

There are additional problems with the time value itself -- the format call moves it to universal time, while the Parse call will assume local time -- so you will watch the date move around even if the other potential mismatches were not present....
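One way to sidestep both mismatches, offered only as a minimal sketch and not as the fix the post had in mind, is to make the formatting and parsing directions agree explicitly: the same culture-independent pattern both ways, with the UTC marker preserved. The class and method names below (RoundTripSketch, Format, Parse) are hypothetical; the "o" round-trip format, CultureInfo.InvariantCulture and DateTimeStyles.RoundtripKind are the framework pieces assumed to be available.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, for illustration only.
public static class RoundTripSketch
{
    // Format as UTC with the round-trip ("o") pattern and the invariant culture,
    // so the user's date and time separators never enter the string.
    public static string Format(DateTime value)
    {
        return value.ToUniversalTime().ToString("o", CultureInfo.InvariantCulture);
    }

    // Parse with the same pattern and culture. RoundtripKind keeps the trailing
    // "Z" meaningful, so the result stays UTC instead of being read as local time.
    public static DateTime Parse(string text)
    {
        return DateTime.ParseExact(
            text, "o", CultureInfo.InvariantCulture, DateTimeStyles.RoundtripKind);
    }
}
```

With this sketch, RoundTripSketch.Parse(RoundTripSketch.Format(DateTime.Now)) should give back the original instant as a UTC value, whatever separators are set in Regional Options.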

# Maurits [MSFT] on 10 Apr 2006 5:52 PM:

> the format call moves it to universal time, while the Parse call will assume local time

SMTP solves this by adding an RFC2822-style "zone" offset.

Pacific Time: -0800 or -0700 depending on whether DST is on
Eastern Time: -0500 or -0400 depending on DST
Chatham Islands: +1245 or +1345 depending on DST

This keeps local time readable but without letting ambiguity into the picture.

# Michael S. Kaplan on 10 Apr 2006 6:56 PM:

Of course here the ambiguity has been multiplied by intentionally using mismatched methods, right? :-)

Plus, if you always added that info to every string, then most users would likely be very confused since they do not expect that sort of detail in all cases....

# Maurits [MSFT] on 10 Apr 2006 7:29 PM:

> most users would likely be very confused

I suppose it would depend on how used they are to dealing with times and dates from different time zones.

Most USAnians can handle "Lost! Season Finale at 8/7 Central"...
And most USAnians can handle "Super Bowl plays at 9PM Eastern"...

And most email server sysadmins can handle the RFC2822 syntax...

... but I don't know how (say) Europeans deal with international conference calls.  Do corporate netizens use Central Office Time, even if no-one on the call happens to be at the Central Office?

I think this would mandate some new structs :( since DateTimes don't keep track of their own time zone.

struct DateTimeWithTimeZone {
    DateTime dateTime; // stored in UTC, natch
    TimeZoneOffset timeZoneOffset; // +1245 or -0800
}

struct TimeZoneOffset {
    bool Positive; // + or -
    int Hours; // 12
    int Minutes; // 45
}

and some static constants

static TimeZoneOffset EST = { false, 5, 0 };
static TimeZoneOffset PST = { false, 8, 0 };
static TimeZoneOffset UTC = { true, 0, 0 }; // true by convention
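
For comparison, here is a hedged sketch of the same idea using DateTimeOffset, a type that, as far as I know, was added to the .NET Framework after this thread was written. It pairs a point in time with its offset from UTC, which is roughly what the structs above describe. The ZoneOffsetSketch class, its Format method and the output layout are hypothetical, chosen only to show an RFC2822-style zone such as -0800 or +1245.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, for illustration only.
public static class ZoneOffsetSketch
{
    // Renders a UTC instant as wall-clock time at the given offset,
    // followed by an RFC2822-style zone such as "-0800" or "+1245".
    public static string Format(DateTime utc, TimeSpan offset)
    {
        // Treat the input as UTC, then shift the clock reading to the offset.
        DateTimeOffset local =
            new DateTimeOffset(DateTime.SpecifyKind(utc, DateTimeKind.Utc)).ToOffset(offset);

        // Build the zone by hand: sign, two-digit hours, two-digit minutes.
        TimeSpan abs = offset.Duration();
        string zone = (offset < TimeSpan.Zero ? "-" : "+")
            + abs.Hours.ToString("00", CultureInfo.InvariantCulture)
            + abs.Minutes.ToString("00", CultureInfo.InvariantCulture);

        return local.ToString("yyyy-MM-dd HH:mm:ss", CultureInfo.InvariantCulture) + " " + zone;
    }
}
```

For example, 20:00 UTC on 10 April 2006 formatted with an offset of -8 hours reads as 12:00:00 -0800, and with an offset of +12:45 it reads as 08:45:00 +1245 on the following day. Either string still identifies the same instant.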

# Michael S. Kaplan on 10 Apr 2006 8:00 PM:

Hmmm.... okay. But of course this has nothing to do with the original question (in pointing out bugs in the code we don't try to design one or more of the types in the underlying programming framework!).


# Maurits [MSFT] on 10 Apr 2006 8:08 PM:

"mandate" was the wrong word... I meant something more like "entail."

AFAICS the underlying problem[1] is not solvable without major surgery[2], which cure is worse than the disease... so the bug is firmly located in the code, as you describe, and not in the framework.

All this addresses only the time-zone-offset bug, and not the "regional date format" bug.

[1] time zones are hard
[2] make time zones part and parcel of datetimes, a-la RFC2822
