Real developers use CompareInfo's Compare (Part 1)

by Michael S. Kaplan, published on 2005/07/08 03:31 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2005/07/08/436669.aspx


In the not too distant past, I pointed out that String.Compare is for sissies (not for people who want SQLCLR consistency). Well, today I am going to start to wrap the functionality I described, so that it can be used without a ton of work. Because real developers don't eat quiche, er, use CompareInfo's Compare. :-)

I will likely take my time with this and talk about other related topics. Never fear, when I am done I will put all of the code together in one place....

Part 1 of our saga has us getting the flag values right. I will add functions for both directions. Since my eventual intent is to call the NLS API functions (which define the flags as DWORD), I will use System.UInt32 and mark them private (the right balance between CLS compliance and header file accuracy, in my opinion!). They are internal functions, anyway. :-)

As a side note, I once tried to find out why the values were changed, and no one could really remember. I guess I could start an Every Constant Has a Story category of posts, but I honestly don't think I know enough to sustain a series. I'll bet Raymond Chen or Larry Osterman could do a number of these, though!
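To make the difference concrete, here is a minimal sketch that prints the numeric value behind a managed CompareOptions member. The class name CompareOptionValues and its Describe method are hypothetical, invented for this illustration only.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, for illustration only.
public static class CompareOptionValues
{
    // Formats a managed flag as "Name = 0xXXXXXXXX".
    public static string Describe(CompareOptions option)
    {
        return string.Format("{0} = 0x{1:X8}", option, (uint)option);
    }
}
```

Calling Describe(CompareOptions.IgnoreKanaType) gives "IgnoreKanaType = 0x00000008" and Describe(CompareOptions.IgnoreWidth) gives "IgnoreWidth = 0x00000010", while the NLS flags for the same two concepts are 0x00010000 and 0x00020000. IgnoreCase, IgnoreNonSpace and IgnoreSymbols have the same value on both sides.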


using System.Globalization;

private const System.UInt32 NORM_IGNORECASE = 0x00000001; // ignore case
private const System.UInt32 NORM_IGNORENONSPACE = 0x00000002; // ignore nonspacing chars
private const System.UInt32 NORM_IGNORESYMBOLS = 0x00000004; // ignore symbols
private const System.UInt32 NORM_IGNOREKANATYPE = 0x00010000; // ignore kanatype
private const System.UInt32 NORM_IGNOREWIDTH = 0x00020000; // ignore width
//private const System.UInt32 SORT_STRINGSORT = 0x00001000; // use string sort method

private System.UInt32 NativeCompareFlagsFromManagedCompareOptions(CompareOptions options) {
    System.UInt32 flags = 0;

    // Optional guard against the ordinal options (left disabled for now).
    /*
    if((options & (CompareOptions.Ordinal | CompareOptions.OrdinalIgnoreCase)) != 0) {
        throw new ArgumentOutOfRangeException("options");
    }
    */

    if ((options & CompareOptions.IgnoreCase) != 0) { flags |= NORM_IGNORECASE; }
    if ((options & CompareOptions.IgnoreNonSpace) != 0) { flags |= NORM_IGNORENONSPACE; }
    if ((options & CompareOptions.IgnoreSymbols) != 0) { flags |= NORM_IGNORESYMBOLS; }
    if ((options & CompareOptions.IgnoreKanaType) != 0) { flags |= NORM_IGNOREKANATYPE; }
    if ((options & CompareOptions.IgnoreWidth) != 0) { flags |= NORM_IGNOREWIDTH; }
    //if ((options & CompareOptions.StringSort) != 0) { flags |= SORT_STRINGSORT; }

    return flags;
}

private CompareOptions ManagedCompareOptionsFromNativeCompareFlags(System.UInt32 flags) {
    CompareOptions options = CompareOptions.None;

    if ((flags & NORM_IGNORECASE) != 0) { options |= CompareOptions.IgnoreCase; }
    if ((flags & NORM_IGNORENONSPACE) != 0) { options |= CompareOptions.IgnoreNonSpace; }
    if ((flags & NORM_IGNORESYMBOLS) != 0) { options |= CompareOptions.IgnoreSymbols; }
    if ((flags & NORM_IGNOREKANATYPE) != 0) { options |= CompareOptions.IgnoreKanaType; }
    if ((flags & NORM_IGNOREWIDTH) != 0) { options |= CompareOptions.IgnoreWidth; }
    //if ((flags & SORT_STRINGSORT) != 0) { options |= CompareOptions.StringSort; }

    return options;
}

I thought about possibly making NativeCompareFlagsFromManagedCompareOptions throw if CompareOptions.Ordinal or CompareOptions.OrdinalIgnoreCase were passed, but for now I decided not to bother and commented that check out. Also, SQL Server never seems to pass SORT_STRINGSORT in its ComparisonStyle data, so that code is commented out, too.
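As an alternative way to keep the two directions in sync, here is a sketch (not the code from this post) in which a single table drives both conversions and the ordinal check is switched on. The class name CompareFlagMap and the method names ToNative and ToManaged are hypothetical. It assumes a compiler with tuple support (C# 7 or later). SORT_STRINGSORT is left out, matching the commented-out lines above.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, not part of the original post: one table drives both
// conversion directions, so the two mappings cannot drift apart.
public static class CompareFlagMap
{
    // (managed CompareOptions value, native NLS flag value) pairs.
    private static readonly (CompareOptions Managed, uint Native)[] Map =
    {
        (CompareOptions.IgnoreCase,     0x00000001u), // NORM_IGNORECASE
        (CompareOptions.IgnoreNonSpace, 0x00000002u), // NORM_IGNORENONSPACE
        (CompareOptions.IgnoreSymbols,  0x00000004u), // NORM_IGNORESYMBOLS
        (CompareOptions.IgnoreKanaType, 0x00010000u), // NORM_IGNOREKANATYPE
        (CompareOptions.IgnoreWidth,    0x00020000u), // NORM_IGNOREWIDTH
    };

    public static uint ToNative(CompareOptions options)
    {
        // The check the post considered and left disabled: reject the
        // ordinal options, which are not among the flags mapped here.
        if ((options & (CompareOptions.Ordinal | CompareOptions.OrdinalIgnoreCase)) != 0)
        {
            throw new ArgumentOutOfRangeException(nameof(options));
        }

        uint flags = 0;
        foreach (var entry in Map)
        {
            if ((options & entry.Managed) != 0) { flags |= entry.Native; }
        }
        return flags;
    }

    public static CompareOptions ToManaged(uint flags)
    {
        CompareOptions options = CompareOptions.None;
        foreach (var entry in Map)
        {
            if ((flags & entry.Native) != 0) { options |= entry.Managed; }
        }
        return options;
    }
}
```

With this sketch, CompareFlagMap.ToNative(CompareOptions.IgnoreCase | CompareOptions.IgnoreWidth) yields 0x00020001, and feeding that value back through ToManaged returns the original options. Whether throwing on the ordinal options is the right call depends on the caller, which is presumably why the check stayed commented out in the post.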

Now, about 100 years ago, I wrote an article for Smart Access that reverse engineered mswstr10.dll and sqlsort.dll, the Jet and SQL Server DLLs that let these database products do comparisons independently of the operating system. However:

So I decided not to bother with the Jet/SQL Server functions. Like I said, they are horribly out of date, so we are better off with the OS functions here.

I'll post some more tomorrow, as we slowly build up the code for our wrapper objects that give SQL Server comparison semantics....

This post brought to you by "Ʃ" (U+01a9, a.k.a. LATIN CAPITAL LETTER ESH)
A letter that is sometimes confused with its Greek cousin, U+03a3 (GREEK CAPITAL LETTER SIGMA), though I cannot imagine why. :-)


# A. Skrobov on 8 Jul 2005 3:08 PM:

LATIN CAPITAL LETTER ESH is not U+0a19, it's in fact U+01a9... Also why have you stopped linking the red characters to http://www.fileformat.info/info/unicode/char/ ?

# Dmitriy Zaslavskiy on 8 Jul 2005 3:22 PM:

In .NET <v2 CompareInfo seems to be the only way to call IndexOf and get Ordinal behaviour

# Michael S. Kaplan on 8 Jul 2005 6:30 PM:

A. Skrobov - good catch, a little dyslexic typing on my part. Link also added now...

Dmitriy -- actually, there are string methods with both ordinal and ordinalignorecase (the latter was mostly added post beta 2, the former has been in for a while now).

# Richard Mitchell on 11 Jul 2005 3:51 AM:

I've already written some code to change the SQL Server flags to CompareOptions, is there any need to set all of the flags manually as only two actually change location...

// IgnoreWidth
if ((compareOptions & 0x00020000) != 0)
    compareOptions ^= 0x00020010;

// IgnoreKana
if ((compareOptions & 0x00010000) != 0)
    compareOptions ^= 0x00010008;

// StringSort (for completeness)
if ((compareOptions & 0x00001000) != 0)
    compareOptions ^= 0x20001000;

this.m_compareOptions = (CompareOptions)compareOptions;

Plus it gave me an excuse to use ^= in anger.
