by Michael S. Kaplan, published on 2004/12/30 01:58 -05:00, original URI: http://blogs.msdn.com/b/michkap/archive/2004/12/29/344136.aspx
There is a great deal of confusion surrounding the meaning of Invariant and Ordinal in the .NET Framework, and when to use each. If you have suffered, are suffering, or think you may suffer in the future from such confusion, then read on!
(Otherwise, I guess you can go away and come back another time.)
The invariant culture's direct ancestor is the invariant locale. Officially added to the Windows source tree at 10:23am on May 12, 2001, its intention was not to be used as an actual locale (which would explain why no locale data was added until a month later; until then no one was using it in GetLocaleInfo!).
Originally, LOCALE_INVARIANT had just one noble purpose -- to allow one to use CompareString (and LCMapString with the LCMAP_SORTKEY flag) in a way that would only use the "Default" Windows sorting table as mentioned a little bit here and especially here. The results, as that second article mentioned, would not vary when the user or system locale settings did; they would be invariant within that installation of Windows.
The data was added for this locale a month later, as I said, for obvious reasons -- if you have an LCID that one function considers to be valid, you must have a very good reason if another will not. And it cannot duplicate any other locale, either. Much weird data was added so that no one would be tempted to try to act like they spoke a language called "Invariant" and then all was good.
Note that these string comparisons still had much linguistic value -- half of the locales in Windows use that default table, so an invariant sort would not only avoid varying, it would also look right to a lot of the world.
The .NET Framework had similar requirements (with the additional need for invariant parsing/formatting support) and thus CultureInfo.InvariantCulture was created. As with the locale, any string comparisons made with InvariantCulture's CompareInfo object would have linguistic validity in a lot of places, and would not vary within that installation of the .NET Framework.
So everyone had what they needed, right?
A bunch of people wanted a method of doing a more binary type of comparison, instead of one that would be based on the "linguistically appropriate" approach given a particular culture [1].
The difference between what we had and what they wanted was akin to the difference between the C Runtime's strcoll/wcscoll versus strcmp/wcscmp (in the CRT documentation they refer to the difference as being locale based versus lexicographic).
The other advantage of such a "lexicographic" comparison is that it would be faster, since it is nothing more than a simple binary comparison of the code point values.
To meet this need, the notion of an Ordinal sort was added and an Ordinal member was added to the CompareOptions enumeration. Selecting it would ignore all of those cultural collation features and give you a binary sort that would also, incidentally, not vary.
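To make the idea concrete, here is a minimal sketch in Python rather than .NET. Python's built-in string comparison already works code point by code point, so it behaves much like an ordinal comparison. The helper name `ordinal_compare` is made up for illustration. One caveat: .NET compares UTF-16 code units, so results can differ from this sketch for characters outside the Basic Multilingual Plane.

```python
def ordinal_compare(a: str, b: str) -> int:
    """Return -1, 0, or 1 by comparing code point values only.

    No culture, no collation tables: just the numeric value of each
    character, in order. (Hypothetical helper, not a .NET API.)
    """
    return (a > b) - (a < b)


words = ["apple", "Zebra", "\u03c9mega", "\u03a9mega", "banana"]

# sorted() on str uses the same code point ordering as ordinal_compare.
ordinal_sorted = sorted(words)

# Every uppercase ASCII letter sorts before every lowercase one, and
# U+03A9 (capital omega) sorts before U+03C9 (small omega).
print(ordinal_sorted)
# ['Zebra', 'apple', 'banana', 'Ωmega', 'ωmega']
```

Note how "Zebra" lands ahead of "apple": that is exactly the kind of result a linguistic sort would avoid and a binary sort does not care about.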
The only remaining problem at this point was that there were now two useful ways to do these different "niche" types of comparison, but neither name really jumped out at the developers who were looking for such solutions.
That problem remains to this day, though every single time I speak at a conference or answer a question in a newsgroup or get someone to look at posts like this one, then there is at least one less developer who has this problem. Maybe this time it is you? :-)
Now the story does not end here; many people have wanted to do things in a case-insensitive way. Of course if you wanted a case-insensitive invariant comparison then you could have done that all along -- just use the InvariantCulture's CompareInfo methods with the CompareOptions.IgnoreCase flag passed in. Easy!
But some people wanted a case-insensitive ordinal comparison?!?
Now the closet linguist in me shudders at this concept since a casing operation is essentially a linguistic one while an ordinal one is specifically not -- it's lexicographic.
So people are asking for linguistic non-linguistic support, a request that for me brings to mind the comedian Steven Wright's dog [2].
However, the technical half of me understands the need and so I got over my linguistic fetish as one of my colleagues on the BCL team worked in Whidbey to add a new OrdinalIgnoreCase member to the CompareOptions enumeration.
The behavior is basically to do the casing operation using the default casing tables prior to doing the binary comparison. This feature has been in the "Whidbey" version of the .NET Framework for some time (first checked into the source code tree on February 7, 2003), so you can try it out today if you have just about any build of Whidbey underfoot.
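The "case first, then compare the bits" behavior can be sketched roughly as follows, again in Python rather than .NET. Several things here are assumptions made for illustration: the helper names are invented, folding to upper case (rather than lower) is a guess, and Python's Unicode case mappings stand in for the default Windows casing tables, which they do not match exactly. Only one-to-one mappings are applied, so a character such as ß is left alone instead of expanding to "SS".

```python
def simple_upper(ch: str) -> str:
    """Upper-case one character, but only if the result is one character.

    Stand-in for a simple casing table lookup (an assumption, not the
    actual Windows or .NET tables).
    """
    up = ch.upper()
    return up if len(up) == 1 else ch


def ordinal_ignore_case_compare(a: str, b: str) -> int:
    """Return -1, 0, or 1: case both strings, then compare code points."""
    ka = "".join(simple_upper(c) for c in a)
    kb = "".join(simple_upper(c) for c in b)
    return (ka > kb) - (ka < kb)


# Differences in case disappear...
print(ordinal_ignore_case_compare("FILE.TXT", "file.txt"))  # 0
# ...but the rest of the comparison is still purely binary.
print(ordinal_ignore_case_compare("a", "B"))                # -1
```

The design point is that nothing linguistic happens after the casing step: no ignoring of accents, no expansions, no locale-specific rules.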
Hopefully this post will help clear up some of the confusion about these two interesting comparison types.
1 - What can I say? Some people are Некультурные (uncultured) though not in the culturally offensive sense.
2 - Steven Wright claimed to have named his dog Stay so that he could call out "Come here, Stay! Come here, Stay!" and watch the dog walk toward him in a stuttery fashion.
This post brought to you by "Ω" (U+03a9, GREEK CAPITAL LETTER OMEGA)
I talked to Omega just before this post went live. She said that as the last letter in the Greek alphabet (who was pretty much always therefore last in the queue), she understood the cost of keeping letters in order. Any performance benefit is a good one, to her mind. Especially since a binary sort would let her come before her little sister (U+03c9, GREEK SMALL LETTER OMEGA) for once.
# Phil Hackett on 2 Sep 2008 12:00 PM:
Thanks Michael - that article was -really- useful. It's a shame the official Microsoft help/documentation doesn't explain this, really, as it's such an obvious question that people are bound to ask.
2006/08/27 It has not always been so invariant
2006/05/24 Invariant vs. Ordinal, the third
2005/04/26 Intelligent unmanaged string comparison
2005/04/13 Invariant and Ordinal Redux
2005/04/03 TechEd Bloggers does not work for this site?
2004/12/30 How do sort keys work?