CompareString ignores case by lowercasing....

by Michael S. Kaplan, published on 2006/03/02 15:59 -05:00, original URI: http://blogs.msdn.com/b/michkap/archive/2006/03/02/542395.aspx


I was having a conversation with Shawn earlier and had something of an epiphany that I thought I'd share....

I have mentioned that Windows handles its "ignore case" functionality (as in the way we handle it in RtlCompareUnicodeString and CompareStringOrdinal) for the filesystem and the registry and environment variables and such by UPPERCASING prior to comparison.
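As a rough sketch of what "uppercase, then compare the code values" means, here is a small Python model. It is not the Windows implementation: `simple_upper` and `compare_ordinal_ignore_case` are hypothetical names, Python's `str.upper()` stands in for the Windows uppercasing table, and the comparison works on code points rather than UTF-16 code units.

```python
def simple_upper(ch: str) -> str:
    """One-to-one uppercase mapping (an approximation of a simple case table).

    Characters whose uppercase form expands to more than one character
    (for example the German sharp s) are left unchanged.
    """
    up = ch.upper()
    return up if len(up) == 1 else ch


def compare_ordinal_ignore_case(a: str, b: str) -> int:
    """Return -1, 0, or 1 after uppercasing both strings character by character.

    Hypothetical helper for illustration only; it compares numeric code
    values and applies no linguistic rules.
    """
    ua = [ord(simple_upper(c)) for c in a]
    ub = [ord(simple_upper(c)) for c in b]
    return (ua > ub) - (ua < ub)


if __name__ == "__main__":
    print(compare_ordinal_ignore_case("limes", "LIMES"))
    print(compare_ordinal_ignore_case("a_", "ab"))
```

The direction of the folding is visible in the second call: "_" (U+005F) sits between the uppercase and lowercase ASCII letters, so "a_" compares greater than "ab" when both are uppercased, and would compare less if both were lowercased instead.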

But when I have talked here in the past about linguistic comparisons via CompareString, and especially when you look at sort key examples like the ones in this post:

Case Sensitive

limes 0E 48 0E 32 0E 51 0E 21 0E 91 01 01 01 01 00
lIMES 0E 48 0E 32 0E 51 0E 21 0E 91 01 01 02 12 12 12 12 01 01 00
LIMES 0E 48 0E 32 0E 51 0E 21 0E 91 01 01 12 12 12 12 12 01 01 00

Case Insensitive

limes 0E 48 0E 32 0E 51 0E 21 0E 91 01 01 01 01 00
lIMES 0E 48 0E 32 0E 51 0E 21 0E 91 01 01 01 01 00
LIMES 0E 48 0E 32 0E 51 0E 21 0E 91 01 01 01 01 00

it is clear that in our tables it is the uppercase and unusual casing that get the extra weight; thus ignoring case (basically ignoring those case weights) is a way of doing a very firm, de facto LOWERCASING prior to comparison.
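The sort keys quoted above can be used to check this directly. The following is a minimal sketch in Python, assuming that the key is a sequence of weight fields separated by 01 bytes and terminated by 00, and that the case weights are the third field (index 2). That layout is inferred from the keys shown here; `drop_case_weights` is a hypothetical helper and not part of any Windows API.

```python
SEP = b"\x01"   # assumed field separator
END = b"\x00"   # assumed key terminator
CASE_FIELD = 2  # assumed position of the case weights

# Case-sensitive sort keys, copied from the table in the post.
KEYS = {
    "limes": bytes.fromhex("0E 48 0E 32 0E 51 0E 21 0E 91 01 01 01 01 00"),
    "lIMES": bytes.fromhex(
        "0E 48 0E 32 0E 51 0E 21 0E 91 01 01 02 12 12 12 12 01 01 00"),
    "LIMES": bytes.fromhex(
        "0E 48 0E 32 0E 51 0E 21 0E 91 01 01 12 12 12 12 12 01 01 00"),
}


def drop_case_weights(key: bytes) -> bytes:
    """Return the key with the (assumed) case-weight field emptied."""
    fields = key[:-1].split(SEP)   # strip the terminator, split into fields
    fields[CASE_FIELD] = b""       # ignoring case == ignoring these weights
    return SEP.join(fields) + END


if __name__ == "__main__":
    # With case weights present, the all-lowercase key sorts first.
    print(sorted(KEYS, key=KEYS.get))
    # With case weights dropped, every key collapses to the lowercase one.
    for word, key in KEYS.items():
        print(word, drop_case_weights(key) == KEYS["limes"])
```

Under these assumptions, dropping the case weights from any of the three keys leaves exactly the bytes of the "limes" key, which matches the case-insensitive keys listed above.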

I had never really thought of it that way until just a moment ago....

Not that it is a 100% rule -- take the Hungarian Technical Sort, for example, where the weights for the exemplar characters in Hungarian are reversed in this respect (while those unusual case situations are left untouched). But it is usually true. :-)

 

This post brought to you by "Œ" (U+0152, a.k.a. LATIN CAPITAL LIGATURE OE)



