by Michael S. Kaplan, published on 2006/05/04 03:01 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2006/05/04/589656.aspx
Last night a colleague asked me via email:
Hi Michael,
I have the following line
Console.WriteLine("A > B ? = {0}", CultureInfo.CurrentCulture.CompareInfo.Compare("kok-in", "ko-kr", CompareOptions.None));
And current culture is en-US
This line printing
A > B ? = -1
Which means "kok-in" < "ko-kr"
I think this is wrong. Right?
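I have actually talked about string sorts vs. word sorts in the past in posts like this one, though this case raises an interesting question in terms of usability. The short version: a word sort gives the hyphen almost no weight, so the comparison above behaves much like "kokin" vs. "kokr", and "i" sorts before "r".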
The easy argument? Well, in that CompareInfo.Compare call above, just use CompareOptions.StringSort rather than CompareOptions.None.
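To make the difference concrete, here is a minimal sketch of the same comparison run both ways. The class and method names (SortDemo, CompareWith) are made up for illustration, and it pins en-US explicitly rather than relying on the current culture. The email above reports -1 for the word sort; with the string sort the hyphen is treated as a symbol that sorts ahead of the letter "k", so I would expect a positive result there.

```csharp
using System;
using System.Globalization;

// Hypothetical demo class; not part of the original email.
static class SortDemo
{
    // Compares two strings using en-US rules and the given options.
    public static int CompareWith(string a, string b, CompareOptions options)
    {
        CompareInfo compareInfo = CultureInfo.GetCultureInfo("en-US").CompareInfo;
        return compareInfo.Compare(a, b, options);
    }

    static void Main()
    {
        // Word sort (the default): the hyphen carries very little weight.
        Console.WriteLine("word sort:   {0}",
            CompareWith("kok-in", "ko-kr", CompareOptions.None));

        // String sort: the hyphen is compared as an ordinary symbol.
        Console.WriteLine("string sort: {0}",
            CompareWith("kok-in", "ko-kr", CompareOptions.StringSort));
    }
}
```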
However, in this particular case, the behavior was causing a problem when using SortedList with .NET Framework culture names as keys, because hyphenated tokens of 2-3 characters each are not among the scenarios for which the word sort behavior is intuitive. And although the string sort behavior works better here (treating the hyphen as a symbol gives the order most would expect in this scenario), there is a problem with that approach: the SortedList class does not give an easy way to apply the flag. To use it, you have to do something like implement an IComparer with a Compare call that uses the flag, which is the most common operation in the world....
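For what it is worth, the IComparer workaround might look something like the sketch below. The names StringSortComparer and CultureNameListDemo are hypothetical, the generic SortedList<TKey, TValue> is my choice rather than anything stated in the original question, and en-US is assumed as the comparison culture.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical comparer: forwards to CompareInfo.Compare with the StringSort flag.
sealed class StringSortComparer : IComparer<string>
{
    private readonly CompareInfo compareInfo;

    public StringSortComparer(CultureInfo culture)
    {
        compareInfo = culture.CompareInfo;
    }

    public int Compare(string x, string y)
    {
        return compareInfo.Compare(x, y, CompareOptions.StringSort);
    }
}

// Hypothetical demo class showing the comparer handed to a SortedList.
static class CultureNameListDemo
{
    public static SortedList<string, string> BuildList()
    {
        // The comparer is supplied once, at construction time.
        SortedList<string, string> list = new SortedList<string, string>(
            new StringSortComparer(CultureInfo.GetCultureInfo("en-US")));
        list.Add("kok-in", "Konkani (India)");
        list.Add("ko-kr", "Korean (Korea)");
        return list;
    }

    static void Main()
    {
        foreach (KeyValuePair<string, string> entry in BuildList())
        {
            Console.WriteLine("{0} = {1}", entry.Key, entry.Value);
        }
    }
}
```

The comparer has to be supplied when the list is constructed, so the sort behavior is fixed for the lifetime of that SortedList.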
This is made more complicated by the problems I pointed out previously in the post On approaching international programming....
Taking a step back, the names of the two sorts (string sort vs. word sort) do not give good clues as to what each one does, anyway. So without docs, it would be hard to figure out which one to use in the first place!
The fact that the SortedList does not contain easy ways to get at these settings makes it (in my opinion) just a little less sorted, if you know what I mean. :-)
This post brought to you by "A" (U+0041, LATIN CAPITAL LETTER A)
referenced by
2010/09/07 Refusing to ignore some particular character's width isn't [always] an act of discrimination…
2007/09/20 A&P of Sort Keys, part 9 (aka Not always transitive, but punctual and punctuating)
2007/05/06 One product's feature is another product's bug -- just ask 'em!
2006/11/16 The problem of string comparisons, WORD sorts, and the minus that is treated like the hyphen