What's wrong with what FxCop does for globalization, Part 0

by Michael S. Kaplan, published on 2006/12/05 02:01 -05:00, original URI: http://blogs.msdn.com/b/michkap/archive/2006/12/05/1210407.aspx


Some people really like the visibility that globalization gets with FxCop in the managed world.

I have a net neutral feeling about it, myself. I mean, I like the visibility, I am just not sure that it is improving the behavior much. And I am sure that it is exposing a flaw in the way many of the methods are put together since it makes some code feel less readable to people at times.

Like the other day, when Navid asked:

What are the benefits or possible disadvantages of doing the following (assume b is any non-string object):

    string a = string.Concat(b);

Instead of:

    string a = b.ToString();

In actuality, to adhere to FxCop guidelines, you’ll need to specify the IFormatProvider when using the ToString() method. Therefore, the correct statement should be:

    string a = b.ToString(CultureInfo.InvariantCulture);

The first thing to notice is that if b is null, the ToString() method will throw an exception whereas the first statement will work without problems. This may save you a null-check in certain circumstances. Furthermore, since no IFormatProvider is necessary, you don’t have to import System.Globalization into your class either.

However, I suppose some disadvantages may be that using Concat() instead of ToString() may not be as clear for some readers. Also, if you need to use a specific CultureInfo, then ToString() may be the obvious choice. With respect to performance, I have no idea on the implications but would guess that ToString() is probably slightly faster. It may be worth looking at the IL to see what the differences actually are.

In any regard, I find the first statement more pleasing but I am unsure if it’s actually The Right Thing To Do. I am more or less using the first statement as a shorthand for:

    string a = string.Empty;
    if (!string.IsNullOrEmpty(b))
    {
        a = b.ToString(CultureInfo.InvariantCulture);
    }

Now it is unfair to blame FxCop here -- it is making a very valid point, stating (in SDET David Kean's words) "Make sure you pass an IFormatProvider to b.ToString() to make it clear and explicit what IFormatProvider is being used. If you do not specify one, the default in most cases is CultureInfo.CurrentCulture, which will cause the output of ToString to vary depending on the user’s currently selected locale."

Using System.String.Concat, which is clearly the wrong method from an intent standpoint, rather than System.String.ToString, just to avoid the other two evils (an FxCop warning or an unhanded FxCop error) is as clear indication that in this case, FxCop has no choice but to either encourage bad behavior or encourage developers to use unclear code constructs.

Has the actual issue in question been addressed? Is the code more properly globalized?

In my opinion the only thing that happens to globalization in this case is that developers who wouldn't know Unicode from UNICEF have every reason to feel like globalization is a four letter word.

And it hurts the reputation of the tool too -- I mean, how did everyone feel about the kid in high school that made them do some silly thing like not run in the halls? Did they feel more educated? Or did they just decide that the person handing out citations is just really annoying?

(apologies if you were one of those ho used to hand out the warnings, but many people did find you to be kind of annoying!)

Now some people suggest that the problem is that the wrong defaults were chosen in the beginnings of the .NET Framework (Anthony Moore is one of the more prominent holders of this opinion). But if the problem is that people don't like adding a StringComparison.OrdinalIgnoreCase to their String.Compare calls, or alternately if the problem is that they do not find it intuitive to use String.CompareOrdinal, then the bug is not that OrdinalIgnoreCase isn't the default comparison type. Such a default would just mean that we'd be complaining about a whole different set of bugs when people blindly used a default that did not match their scenarios.

The bug, to the extent that there is a bug, is in the naming scheme being so unintuitive that there is no way on earth that any human would have expected the "preferred" terms to be used unless they either read documentation or blog posts like this one that suggested it to them. And as much as I may think of my little fruit stand here, I am really not quite ready to assume that it is thought of as "intuitive" to read this blog before one writes code! :-)

Similarly, I do not think the developers who think there ought to be an OrdinalCulture of some type are retarded; I simply think that they are trying to work within a system that simultaneously suggests the importance of always specifying a culture and always using Ordinal comparisons. The developer who is wondering where he can find the OrdinalCulture is arguably the only person involved who doesn't seem to be developmentally disabled, if you know what I mean.

Now with that said, I believe that there are things that can be done in all of the following to lead to a place where code could be better and globalization might not be thought of along with various four letter words:

and I'll try blogging about some of my thoughts about what could be done in each area.

You an think of this post as the introduction to the topic.... :-)

This post brought to you by  (U+0DDD, a.k.a. SINHALA VOWEL SIGN KOMBUVA DIGA AELA-PILLA)


# Mike Dimmick on 5 Dec 2006 8:20 AM:

It's an area that I think few programmers are familiar with. VB6 gave you no control over locale, using only the current locale, while C programmers are far more familiar with strcmp (which is of course ordinal) than with strcoll (which uses locale). Presumably .NET used the VB6 model to ease migration from that tool, but it surprises C and C++ programmers. It will probably also surprise Java programmers - the documentation for String.compareTo seems to indicate ordinal comparison.

# Michael S. Kaplan on 5 Dec 2006 9:42 AM:

And if few people understand it, then how happy will they be thinking they are covered but finding out that their semi-indiscriminate sprinkling of CultureInfo.InvariantCulture might have been incorrect, even if it shuts up FxCop?

# Chuck Steele on 5 Dec 2006 11:17 AM:

Of course, FxCop is just a tool.  On the whole, I welcome FxCop to my toolkit.

Software Engineers are paid for judgment -- not just typing.  FxCop just brings the issue up and part of one's job is to give the issue due consideration (and to decide just what consideration is actually due).  It’s perfectly legit to inform FxCop (my means of a suppression attribute) to sit down and shut up about an issue -- but only after having considered it.

As far as I'm concerned, nice-looking code is secondary to correctness.  Something is "elegant" only if it is correct and "correct" means that you can prove that it will ALWAYS work.  Most textbook examples fail this test (and properly so -- they are usually there to help one understand a point, not as production-quality code), so real code should not usually be quite so "simple" -- it should simply be no more complex than the real-world situations that it is expected to handle or model.  I'll take ugly and correct over simple but buggy any day!

If FxCop brings up an issue like the one cited (what culture to use for formatting) and you realize you don't really know which one you SHOULD use -- you have some design work to do.  Figure out what the code should do and then impose a good design that addresses the requirements.  That's what Software Engineering is really all about...

# Michael S. Kaplan on 5 Dec 2006 11:28 AM:

No argument, Chuck -- but given that most devs are not interested in working to understand the issue, the easy solutions to make FxCop shut up are often all the attention span will allow. A more intuitive and automatic way to get the right answer would be a good idea (IMO).

# Dean Harding on 5 Dec 2006 5:26 PM:

It's a tough one. I mean, at some point you still need at least two different methods: string.CompareOrdinal and string.CompareCulture (and perhaps more, throwing case, accent, etc insensitivity in there). I suppose you could have an overload for every situation: string.CompareFilename, string.CompareIdentifier, string.CompareForDisplay, etc but then what happens if you run into a situation that doesn't exactly fit?

If only would could have a string.CompareDoTheRightThing overload :-)


referenced by

2007/10/01 What's wrong with what FxCop does for globalization, Part 1

2006/12/06 What's wrong with what FxCop does for globalization, Part 0.5 (a segue)

go to newer or older post, or back to index or month or day