by Michael S. Kaplan, published on 2007/10/12 10:16 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2007/10/12/5396685.aspx
Warning: this post discusses an undocumented side effect of the shipping implementation of a few functions in the NLS API. The advice given is sound and is unlikely to ever become unsound, but relying on the side effect itself as a permanent state of affairs is probably not in your best interest, in case it ever is deemed a bug to fix by the team that owns it....
Yesterday, I was asked about some performance results that were a bit puzzling.
It indirectly relates to what I was talking about in Invariant vs. Ordinal, the third. I did, in my list of reasons to call CompareStringOrdinal, point out that it was faster than CompareString/CompareStringEx (and in this case, the lstrcmpi function that is essentially just a wrapper around CompareString).
Now interestingly, there is a case in Vista and Server 2008 where that is not always true....
It is an interesting side effect of the way both functions are written.
As I pointed out in A&P of Sort Keys, part 13, CompareString/CompareStringEx both try to exit as soon as they can. Even though they have made mistakes at times (the main point of that post), it is often a good thing that they are as fast as possible. Everyone wants good performance, after all.
In fact, if your default user locale is not one with compressions in it and the strings are not at all similar, a call to lstrcmpi on two null-terminated strings will return after comparing that very first character in each string!
Now when you look at CompareStringOrdinal, it is no slouch in the performance department -- and it has no changing behavior based on locale settings like lstrcmpi. It too will return with results of comparing those two dissimilar strings after comparing a single character. It does not even have to look up a weight, it just compares the code points directly.
However, now we get to the interesting side case -- as a wrapper around a function like RtlCompareUnicodeString, the CompareStringOrdinal function has to know the length of the strings (structures like UNICODE_STRING have the length in them, after all!).
So if you call CompareStringOrdinal and pass -1 to indicate the strings are null-terminated, then CompareStringOrdinal has to walk each string an additional time to get the length. Not by any means a lengthy operation -- and if your strings are alike or close to it even a small amount of the time, not a big enough detail to make it look bad compared to lstrcmpi. But if you have a ton of strings that are nothing like each other most of the time, you'll see the difference that extra string walk adds to the elapsed time....
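To make the cost of that extra walk concrete, here is a small portable C sketch. It is a toy model and not the Windows implementation: the names cmp_stats, compare_streaming, compare_counted and compare_measure_first are all made up for illustration, both strategies compare raw code units (the real CompareString looks up weights), and the "reads" counter simply tallies how many characters each approach touches.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical model only: result is <0, 0 or >0; reads counts the
   characters examined, so the two strategies can be compared. */
typedef struct {
    int result;
    size_t reads;
} cmp_stats;

/* Strategy 1: walk both null-terminated strings together and stop at
   the first difference (the early-exit shape described above). */
cmp_stats compare_streaming(const wchar_t *a, const wchar_t *b)
{
    cmp_stats s = {0, 0};
    assert(a != NULL && b != NULL);
    for (;; ++a, ++b) {
        s.reads += 2;                 /* one character from each string */
        if (*a != *b) {
            s.result = (*a < *b) ? -1 : 1;
            return s;
        }
        if (*a == L'\0')              /* both ended together: equal */
            return s;
    }
}

/* A counted comparison: it needs both lengths before it can start. */
cmp_stats compare_counted(const wchar_t *a, size_t alen,
                          const wchar_t *b, size_t blen)
{
    cmp_stats s = {0, 0};
    size_t n = alen < blen ? alen : blen;
    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < n; ++i) {
        s.reads += 2;
        if (a[i] != b[i]) {
            s.result = (a[i] < b[i]) ? -1 : 1;
            return s;
        }
    }
    /* Common prefix matched: the shorter string sorts first. */
    s.result = (alen > blen) - (alen < blen);
    return s;
}

/* Strategy 2: the caller passed "-1", so measure both strings first,
   then hand off to the counted comparison. */
cmp_stats compare_measure_first(const wchar_t *a, const wchar_t *b)
{
    size_t alen = wcslen(a);
    size_t blen = wcslen(b);
    cmp_stats s = compare_counted(a, alen, b, blen);
    /* Add the cost of the two length walks, terminators included. */
    s.reads += (alen + 1) + (blen + 1);
    return s;
}
```

In this model, comparing L"zebra crossing" with L"apple orchard" touches 2 characters with compare_streaming but 31 with compare_measure_first, even though both answer after the first character. Calling compare_counted with lengths the caller already knows brings it back to 2, which is the model's version of passing real lengths rather than -1.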
The actual performance difference is not huge, and as Gene Apperson once pointed out, "It doesn’t matter how fast your code is if it doesn’t work." So if it is the right function to call, then it is the right function to call, even if it were always slower.
In the future, the implementation could even change in all sorts of ways, as long as the underlying behavior never did.
One of the "strengths" in the argument to add this function was that it was just wrapping a function that has been around almost since the beginning of 32-bit Windows (RtlCompareUnicodeString). How very zen that one's strength turns out to be one's weakness? :-)
It is also interesting because, by intentional design, passing -1 for both lengths to CompareString/CompareStringEx (which is what lstrcmpi does) is more likely to be the most optimized case than passing the length when you know it. Yet in CompareStringOrdinal (due to this side effect of the implementation) it is the other way around in Vista and Server 2008!
This post brought to you by ⼢ (U+2f22, a.k.a. KANGXI RADICAL GO SLOWLY)