by Michael S. Kaplan, published on 2005/04/14 11:00 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2005/04/14/408116.aspx
Yesterday, someone named Ben posted the following comment to my post Invariant and Ordinal Redux:
I appreciate your enthusiasm for picking out common programming errors like this, but as a professional programmer, I find a lot of these internationalization parameters confusing.
How do I know if I need to pass the NORM_IGNOREKANATYPE flag to CompareString? How do I know if I want LOCALE_USER_DEFAULT or LOCALE_SYSTEM_DEFAULT, or some other locale?
I simply don't know. Unless I learn Japanese, or know someone who knows Japanese, I'll never know the answer. The trouble is that the APIs feel like they were written by linguists.
Me? I just want to compare filenames, or compare entries in a hash table, or compare usernames, etc. I don't want to even have the choice of ignoring kana types. I just want the CompareStrings API do the *right thing* out of the box. If that is too hard for a single function, then let's write some API sets that are easy to use for common cases. I think this would be a more useful endeavor than to write articles about the nuances between CT_CTYPE3 and CT_CTYPE2.
Sometimes less choice is better. Please please finish that list of do's and don'ts. Please please make a list of "If you want to sort like a dictionary, do this... If you want to put filenames into a hash table, do this..."
My initial reaction was to point out that the APIs were not written by linguists -- but the developers had expert advice from linguists when the functionality was exposed.
My second reaction was a technical one, thinking of which ones I had already covered (like What is my locale? Well, which locale do you mean?, which answers part of that locale question) and which ones might make good future posts (like the care and feeding of NORM_IGNOREKANATYPE).
My third reaction was to slow down this "developer" in me trying to solve the technical problem and look to what was really being suggested. Unfortunately, Ben's supposition is correct -- the APIs are complicated, and there is too much functionality to try to distill into simple usage without having detailed articles about the nuances. Articles that could be read by the kind of devs who try to solve the problem you indicated.
In a very real and almost biblical sense, one can talk about "CompareString which begat lstrcmp and lstrcmpi in the USER kingdom, and was fruitful and multiplied in the SHELL kingdom and begat StrCmp, StrCmpI, IntlStrEqN, IntlStrEqNI, StrCmpN, StrCmpNI, StrIsIntlEqual, some of whom later begat StrCmpLogicalW. And in that kingdom functions which were not begat from CompareString also flourished, like those that used the C rules -- StrCmpC, StrCmpIC, StrCmpNC, and StrCmpNIC. And in the kingdom of .NET the managed brother CompareInfo was also fruitful and begat the five overloads of String.Compare and in Whidbey begat the StringComparer class and the StringComparison enumeration. And CompareInfo.IsPrefix and its overrides begat String.StartsWith. And CompareInfo.IsSuffix and its overrides begat String.EndsWith. And..."
Of course, what the SHELL folks and the BCL folks did shows that in attempting to simplify individual pieces of functionality into single APIs, you cause an explosion of simple APIs, and it becomes very tough to unravel which one to use.
Topically modifying what Hal Holbrook said on The West Wing (playing the cantankerous Albie Duncan) in the episode Game On:
It's not simple. It's incredibly complicated. I've been doing NLS work for over 10 years and there is no right answer to these questions and software development needs all the words it can get its hands on...
I could tell you when it is ok to use lstrcmp and lstrcmpi and StrCmpLogicalW. I could not even try to tell you how to navigate the rest of that stuff in the Shell or a lot of the stuff in .NET, even though a lot of it calls right into us. Because to me it is just a decision of whether one wants one's complexities to be horizontal or vertical, with the bonus of the vertical complexity (the NLS kind) being that all of the functionality is there, versus the individual McNugget that the developer was trying to surface in the simplified method, which will always be missing one or more of the functionalities that are possible, despite seeming to me to be a lot more complex....
So while I will give practical advice from time to time (like "use the new OrdinalIgnoreCase type comparisons when trying to imitate the OS, because the OS does not know CompareString from Cholesterol"), the bulk of what I say will be exploring that vertical space of the NLS managed and unmanaged APIs and how best to use them to get the results you want.
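To make the ordinal-versus-linguistic distinction a bit more concrete, here is a minimal sketch in Python (picked only for brevity; this is not the .NET or Windows implementation). The helper names are hypothetical, and the per-character upper-casing is just an approximation of what an OrdinalIgnoreCase style comparison does: compare code points after a simple one-to-one case mapping, with no linguistic expansion. The second helper stands in, very roughly, for a comparison that knows more about language.

```python
def ordinal_fold(s: str) -> str:
    """Upper-case one code point at a time, keeping only 1:1 mappings.

    Assumption: this approximates an ordinal, case-insensitive comparison.
    The OS uses its own upper-casing table, which is not reproduced here.
    """
    out = []
    for ch in s:
        up = ch.upper()
        # Skip expansions such as 'ß' -> 'SS'; an ordinal compare never expands.
        out.append(up if len(up) == 1 else ch)
    return "".join(out)


def ordinal_ignore_case_equal(a: str, b: str) -> bool:
    """Hypothetical helper: code-point equality after simple upper-casing."""
    return ordinal_fold(a) == ordinal_fold(b)


def linguistic_ish_equal(a: str, b: str) -> bool:
    """Hypothetical helper: full Unicode case folding, which may expand characters.

    This is only a stand-in for a culture-aware comparison, not an equivalent.
    """
    return a.casefold() == b.casefold()
```

Under these assumptions, "README.TXT" and "readme.txt" are equal either way, while "straße" and "STRASSE" are equal only to the second helper. That gap is the kind of thing that matters when the goal is to match what the OS does with a name rather than what a reader would consider the same word.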
Because the problem I have personally with the horizontal space is that when you have to change behavior because the call did not do what you thought it did, the change is more than just passing a new flag; it is often calling a whole new function in a whole new way. Just take the String.StartsWith method as an example: if you want to do some operations you have to move to CompareInfo.IsPrefix, which has entirely different calling semantics (one is a method on a CompareInfo object that takes two strings, the other is an instance method on a string). Or if I want to change the STRINGSORT/WORDSORT behavior of StrCmp, I have to go figure out all the parameters of CompareString now, which if I had done in the first place I would not have been trapped in the Sargasso of SHLWAPI.
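For readers who have not run into the STRINGSORT/WORDSORT difference, the following is a toy model of it in Python, not a reproduction of the NLS weight tables. It assumes lowercase ASCII input and assumes that only the hyphen and apostrophe get special treatment; the function names and the word list are made up for illustration. In the "string sort" model punctuation is an ordinary symbol that sorts ahead of letters; in the "word sort" model it is ignored unless it is needed to break a tie.

```python
# Assumption: only these two characters are down-weighted in this toy model.
IGNORED = {"-", "'"}


def string_sort_key(s: str) -> str:
    """Toy 'string sort': punctuation is just another symbol.

    For lowercase ASCII, plain code-point order already puts '-' and "'"
    ahead of the letters.
    """
    return s


def word_sort_key(s: str) -> tuple:
    """Toy 'word sort': ignore hyphen/apostrophe, then use them as a tiebreak.

    The tuple compares the stripped word first; when two stripped words are
    equal, the one without punctuation sorts first.
    """
    stripped = "".join(ch for ch in s if ch not in IGNORED)
    return (stripped, s != stripped, s)


WORDS = ["coop", "co-op", "con", "cant", "can't",
         "cannot", "bills", "bill's", "billet"]

by_string = sorted(WORDS, key=string_sort_key)
by_word = sorted(WORDS, key=word_sort_key)

print(by_string)
print(by_word)
```

In this sketch the word-sort list keeps "coop" and "co-op" next to each other, while the string-sort list lets "con" land between them. The only point is that the two behaviors are one key function apart here, much as they are one flag apart when calling CompareString directly.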
Hopefully this fits with the model people are expecting here. If not then maybe the Shell or BCL folks will step up and work to provide the uber-conversion charts to know when to call which of the 30 methods that are all designed to simplify the five methods that NLS provides (or in the unmanaged world the 30 functions designed to simplify the one function).
Simplification is just too complex for me. :-)
This post brought to you by "A" (U+0041, LATIN CAPITAL LETTER A)
After Happy Days went off the air and everybody realized the Fonz was short, the letter behind "Aaaaay" had its reputation injured a bit and is looking to expand into new markets, like this blog!