Are you an unmarked literal or a placeholder? (man or mouse?)

by Michael S. Kaplan, published on 2012/05/31 07:01 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2012/05/31/10312206.aspx


The question brings out one of the less intuitive behaviors of .NET Globalization:

Guys, I know I have already bugged you about this, but this thing keeps puzzling me.

The most harmless .NET code on Earth

            Console.WriteLine(DateTime.Now.ToString("H:mm:ss", new CultureInfo("pa-Arab-PK")));
prints
21.28.15

The pattern "H:mm:ss" was obtained from the list of all long time 24-hour patterns that .NET returns and used unmodified, as an opaque string. The MSDN library says .NET takes .TimeSeparator from .ShortTimePattern; so it is a dot and all colons in the long time pattern get resolved to dots.

It feels like a bug. Apparently .NET cannot properly handle NLS data with different time separators within the same locale. The question is: is it a good idea to introduce such data that .NET cannot handle, even if we are told it is what Punjabi people really want?

Yes, and the DateTimeFormatInfo.TimeSeparator Property docs do say this.

But they say more than that. They point out, at length:

Remarks

If the custom pattern includes the format pattern ":", DateTime.ToString displays the value of TimeSeparator in place of the ":" in the format pattern.

The time separator is derived from the ShortTimePattern property. Your applications are recommended to set the short or long time patterns to the exact values of interest, instead of attempting to have the time separator replaced. For example, to obtain the pattern h-mm-ss, the application should set "h-mm-ss" specifically. This also permits the setting of patterns such as "h'h 'mm'm 'ss's'" (3h 36m 12s) that don't contain a traditional separator between all parts of the format.
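To see both halves of that remark in action without depending on any particular locale's data, here is a minimal sketch. It assumes a cloned invariant DateTimeFormatInfo whose TimeSeparator has been set to a dot by hand (a stand-in for a locale like the one in the question, not the actual pa-Arab-PK data), and a fixed time picked to match the output above. The class and variable names are made up for illustration.

```csharp
using System;
using System.Globalization;

class SeparatorDemo
{
    static void Main()
    {
        // Assumption: a writable copy of the invariant format info with the
        // time separator forced to ".", standing in for a dot-separated locale.
        var dtfi = (DateTimeFormatInfo)CultureInfo.InvariantCulture.DateTimeFormat.Clone();
        dtfi.TimeSeparator = ".";

        var t = new DateTime(2012, 5, 31, 21, 28, 15);

        // Unmarked colon: treated as a placeholder for TimeSeparator.
        Console.WriteLine(t.ToString("H:mm:ss", dtfi));      // 21.28.15

        // Quoted colon: treated as a literal and emitted as is.
        Console.WriteLine(t.ToString("H':'mm':'ss", dtfi));  // 21:28:15

        // Backslash-escaped colon: also a literal.
        Console.WriteLine(t.ToString(@"H\:mm\:ss", dtfi));   // 21:28:15
    }
}
```

So a caller who really means "a colon, no matter what" can say so by quoting or escaping it; the pattern only becomes ambiguous when the colon is left unmarked.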

Perhaps this very issue is the reason for all that advice: calling the DateTimeFormatInfo.GetAllDateTimePatterns Method (Char) with "T" (long time pattern) can return patterns that do not go along harmoniously with the established separator, given the fact that there is no way for .NET or Windows or anyone else to discern the true intent of the COLON: whether it is intended to be a literal that is inserted as is, or a placeholder for the TimeSeparator.

One theoretical "fix" to make sure the intent is never unclear would be to remove the whole notion of de facto (i.e. unmarked) literals and always require everything that is not an insert to be an explicitly marked (quoted) literal.

Though the cost (virtually every format string being more complicated, and less intuitive) is pretty high -- perhaps unexpectedly high.
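To get a feel for that cost, here is a rough sketch of the same pattern written both ways. The pattern itself is just an example chosen for illustration, and the invariant culture is used so that the separator happens to be a colon either way.

```csharp
using System;
using System.Globalization;

class QuotedLiteralsDemo
{
    static void Main()
    {
        var t = new DateTime(2012, 5, 31, 21, 28, 15);
        var inv = CultureInfo.InvariantCulture;

        // Today: '-' and ' ' are unmarked literals, ':' is a placeholder.
        Console.WriteLine(t.ToString("yyyy-MM-dd HH:mm:ss", inv));
        // 2012-05-31 21:28:15

        // Hypothetical "everything marked" style: every literal is quoted.
        Console.WriteLine(t.ToString("yyyy'-'MM'-'dd' 'HH':'mm':'ss", inv));
        // 2012-05-31 21:28:15
    }
}
```

Same output here, but the second pattern is much harder to read, and that is a short one.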

It's probably easier to just confuse a few people, rather than to confuse many of them!


Joshua on 31 May 2012 9:31 AM:

Ewwww. Just for that, I'm leaning to using HL7's time format internally.

You know the format: yyyy[MM[dd[HH[mm[ss[.f[f[f[f]]]]]]]]][±ZZzz].

It has the nice advantage of recording its precision in the time format.

