The roadꂸ to the solution starts with identifying the actual problem

by Michael S. Kaplan, published on 2007/10/08 10:01 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2007/10/08/5320425.aspx


JJ's question was simple enough:

Hello,

This is probably not the right alias for this, but...

I have a Globalization bug involving “Chinese Minority Characters Yi”.  The root cause boils down to the fact that the String.EndsWith method is ignoring these characters at the end of the string.  For example, "http://localhost:54031/ꂸꐯꉘ".EndsWith("/") returns true.  As an alternative measure, I can use the indexer on the string to get the last character just fine, but EndsWith consistently fails.  The machine in question is a JPN machine running Vista Ultimate.

Is there a good reason for this, or is this potentially a .Net bug?  If so, where could I send the bug?  This is blocking our exit criteria.

I am using String.EndsWith(String) exactly as shown in the e-mail (the lone argument is “/”).  This consistently reproduces with any of the characters found in the bug:

 ꌒꆏꉘꃅꌒꉘꃅꌒꂸꐯꉘꃅꎵꑳꀕꉘꃅꑌꐥꌒꌒꆏꅔꐎꌒꅔꐎꌒꂸꐯꅔꐎꁤꇅꀕꅔꐎꏰꐥꌒ

Let me know if you need anything else.

The answer to this one might feel like review to regular readers. :-)

The problem is really twofold here --

The first is the problem described in that most cited post, "The jury will give this string no weight". Yi definitely falls in that category of characters with no weight, which a linguistic comparison will ignore....

The other problem is the fact that we are looking at a URL, which, like a file system path, is about as non-linguistic as a string can get.

This falls in with "The nature of OrdinalIgnoreCase vs. intuitive expectations", and suggests that the overload to use here is String.EndsWith(String, StringComparison), specifying StringComparison.OrdinalIgnoreCase (or potentially StringComparison.Ordinal if one is running on a platform that is case sensitive, which could be true of many a web server).
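A minimal sketch of the difference, assuming a small console program; the class name UrlSuffixCheck and the helper EndsWithSlash are made up for illustration and are not from JJ's code. What the culture-sensitive call returns depends on the collation data of the runtime it runs on, so no result is claimed for it here. The ordinal calls compare UTF-16 code units and do not depend on any of that.

```csharp
using System;

public static class UrlSuffixCheck
{
    // Hypothetical helper (not from the original post): checks for a trailing
    // slash by comparing code units, with no collation data involved.
    public static bool EndsWithSlash(string url)
    {
        return url.EndsWith("/", StringComparison.Ordinal);
    }

    public static void Main()
    {
        string url = "http://localhost:54031/ꂸꐯꉘ";

        // Culture-sensitive overload, the one used in the question. The result
        // depends on the runtime's collation data, so none is asserted here.
        Console.WriteLine(url.EndsWith("/"));

        // Ordinal: the last character is a Yi syllable, not '/', so this prints False.
        Console.WriteLine(EndsWithSlash(url));

        // OrdinalIgnoreCase: still a code unit comparison, only with case
        // differences ignored; also False for this string.
        Console.WriteLine(url.EndsWith("/", StringComparison.OrdinalIgnoreCase));
    }
}
```

The helper uses Ordinal since '/' has no case to ignore; OrdinalIgnoreCase becomes the relevant choice once the suffix contains letters, such as a file extension.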

The problem is a common one, and there are more misuses like this one than almost any other problem I run across.

Now it may be convenient to blame the developers who write the incorrect code (and many people do), but that is really too convenient -- the framework points them in this direction when they simply use its methods intuitively.

It may also be convenient to blame the default overload (and several others do; after all, the problem wouldn't happen with a different overload!) but once again that is too convenient -- because all a different default would do is shift the problem to the other cases, when one actually needs linguistic comparisons.

It is easy to blame FxCop (and I do, a little bit) for its contributions in this space, but in reality it is just slow in helping people get to the right methods; with a nudge here and a prod there, it will improve, in time.

But the underlying problem will still exist.

The real problem is in the fact that the whole framework, really the whole programming model, does not lead developers to use the correct method.

Remember the frustrations of Scott Hanselman when dasBlog had a Turkic I bug, and how that original jury / no weight post suffered from a .Text bug when I first tried to post about it. Smart developers everywhere are hitting these problems, and it is a frustrating truth that, no matter how good FxCop becomes at helping here in future versions, it is little more than King Canute trying to sweep back the tide.

Because smart developers are going to keep writing the wrong code.

And let's be honest for a moment -- if that is what the smart ones do then you know what happens with the not-so-smart ones....

What is needed is a whole new paradigm, a whole new way of looking at the problem.

It is not that FxCop sucks; to the extent that it does, that can be blamed on the areas where the .NET Framework itself sucks. And those areas have their suckage built in, because they needed to be looking at the problem differently.

I have talked to and bounced ideas off of a bunch of really smart people about this (I'll get into my criteria for calling someone a "smart" person in this context in another post), and I think I am approaching if not a solution then something much closer to it.

Since my full scope of influence is really limited in large part to the blog these days, I'll post about it here and see if anyone is interested, rather than trying to sweep the problem under the rug (the current solution)....

This post brought to you by ꂸ (U+a0b8, a.k.a. YI SYLLABLE MAP)

