Did software developers ever learn their ABC's?

by Michael S. Kaplan, published on 2006/08/15 03:01 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2006/08/15/700676.aspx


Developers always think they are right.

Especially when they are not.

I mean, if you asked any random English-speaking person whether, in an ordered list of characters, they would expect all of the following letters to come before B:

aÀàÁáÂâÃãÄäÅåĀāĂ㥹ǺǻḀḁẠạẢảẤấẦầẨẩẪẫẬậẮắẰằẲẳẴẵẶặＡａ

They would say of course. Even if they don't use any of these letters themselves, they all seem kind of A-ish, if you know what I mean. They would be shocked if you told them that any of these letters came after B, let alone if you told them that all of them come way after Z.
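To see where the "way after Z" claim comes from, here is a minimal Python sketch that compares characters purely by code point, which is what a binary comparison does. The `A_ISH` sample is a hand-picked subset of the list above, written as escapes so it does not depend on how this page is encoded, and the variable names are just for illustration.

```python
# A subset of the "A-ish" characters from the list above:
# a, À, à, Å, å, Ā, ā, ẵ, fullwidth A, fullwidth a
A_ISH = "a\u00c0\u00e0\u00c5\u00e5\u0100\u0101\u1eb5\uff21\uff41"

# A binary (ordinal) comparison only looks at the numeric code point.
after_B = all(ord(ch) > ord("B") for ch in A_ISH)
after_Z = all(ord(ch) > ord("Z") for ch in A_ISH)

# Python's default string ordering is by code point, so sorted() shows
# the same thing: U+1EB5 lands after both "Z" and "z".
binary_sorted = sorted(["B", "Z", "a", "\u1eb5", "z"])

print(after_B, after_Z)
print(binary_sorted)
```

Both flags come out true because even the plain lowercase a (U+0061) has a higher code point than the uppercase Z (U+005A).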

But if you are a software developer, and you are expecting a binary comparison, or worse yet that "not found in nature" case-insensitive binary comparison, you not only might have different expectations, but you will be annoyed as all get-out that the default did not match your expectations. And that it was slower than the comparison that you wanted. And that there is no global way to change all comparisons to act the way you want them to. And that you would actually have to (gasp!) call a different function/method, or perhaps use a different flag, to make things behave the way you want them to.
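The gap between these kinds of comparison can be sketched in Python. This is only an approximation: `rough_linguistic_key` is a hypothetical helper that strips accents and case, which is far cruder than what a real collation library does, and the uppercase-then-compare key only mimics the spirit of a case-insensitive binary comparison.

```python
import unicodedata

def ordinal_ignore_case_key(s):
    # Uppercase first, then compare by code point.
    return s.upper()

def rough_linguistic_key(s):
    # Decompose, drop combining marks, and fold case so that accented
    # letters sort next to their base letter. The original string is
    # kept as a tie-breaker. This is an assumption-laden stand-in for
    # real linguistic collation, not an implementation of it.
    decomposed = unicodedata.normalize("NFKD", s)
    base = "".join(c for c in decomposed if not unicodedata.combining(c))
    return (base.casefold(), s)

words = ["Zebra", "apple", "\u00c4rger", "banana", "\u1eb5"]

ordinal = sorted(words)  # plain code point order
ordinal_ignore_case = sorted(words, key=ordinal_ignore_case_key)
linguistic_ish = sorted(words, key=rough_linguistic_key)

print(ordinal)
print(ordinal_ignore_case)
print(linguistic_ish)
```

Only the last ordering puts Ärger and ẵ anywhere near the other A words; both binary variants push them to the end, after everything in the ASCII range.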

(Their heads might explode if they ever got wind of those weightless strings, or if you explained to them What is up with number sorting!)

And of course, once you (the developer) convince yourself that you are right here, you take it off the deep end -- there should be an OrdinalCulture, and the documentation should make it perfectly clear that the perf hit here is like 2x or worse for the inefficient default. And that their scenario is actually more common.

I have to wonder if their mothers would agree with them. Assuming their moms are not also engineers. :-)

These developers can just fix their code, in my opinion. And next time they can just RTFM and do their homework.

I think it is another reason that I am better as a Technical Lead than as a developer. Because I am foolish enough to think that the linguistic behavior is more intuitive for more people....

Now of course this is all a bit overstated. And for every CS geek who thinks that binary comparisons make more sense, there was a Geoff or a Bryan or a Ben who was a developer who was rolling on the floor laughing as the thread that inspired this post unfolded throughout the day. Which is actually proof that there are developers who understand the intuitive behavior. :-)

 

This post brought to you by ẵ (U+1eb5, a.k.a. LATIN SMALL LETTER A WITH BREVE AND TILDE)
(A letter that either comes before B or way way after z, depending on how socialized you are!)


# Jeroen Vos on 15 Aug 2006 5:29 AM:

The second A from the right side sure looks like a normal capital A. I take it you were thorough enough not to include it (since that would be one coming before B), so what character is it?

# Michael S. Kaplan on 15 Aug 2006 6:33 AM:

The second A from the right is a fullwidth capital A. :-)

# Andrew West on 15 Aug 2006 8:47 AM:

But surely the first character on the left really is just U+0061 :-)

BTW if you asked any random Swedish-speaking person (or even any English-speaking Swede) I guess you might get a different answer to your question ;)

# Michael S. Kaplan on 15 Aug 2006 9:26 AM:

It is --- and it comes after Z in a binary order -- as does every single 'A'-ish thing there. :-)

Yes, I know many Swedes who would disagree about a few of these (Ä, ä, Å, and å in particular). But even those letters would not show up in the order *they* would expect either, by a long shot. :-)

# RubenP on 15 Aug 2006 5:23 PM:

Pity .NET doesn't support Java's notion of a RuleBasedCollator. That way, if you don't agree with the built-in collation rules, you can actually roll your own.
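As a rough illustration of what rolling your own rules could look like, here is a toy rule-based sort key in Python. It is not Java's `RuleBasedCollator` API and it handles only a single level of ordering; `make_collation_key` and the alphabet string are made-up names for this sketch. The alphabet follows the Swedish convention of placing å, ä, and ö after z.

```python
def make_collation_key(alphabet):
    # Rank each letter by its position in the caller-supplied alphabet.
    rank = {ch: i for i, ch in enumerate(alphabet)}
    unknown = len(alphabet)

    def key(s):
        # Letters missing from the alphabet sort after all known letters,
        # ordered among themselves by code point.
        return [(rank.get(ch, unknown), ord(ch)) for ch in s.casefold()]

    return key

# Hypothetical rule set: a-z followed by å, ä, ö.
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyz\u00e5\u00e4\u00f6"
swedish_key = make_collation_key(SWEDISH_ALPHABET)

names = ["\u00d6rebro", "Zorn", "\u00c5re", "\u00c4ngelholm", "Arvika"]
swedish_sorted = sorted(names, key=swedish_key)

print(swedish_sorted)
```

With these rules Åre, Ängelholm, and Örebro follow Zorn, in that order, which is neither the binary order nor what an English speaker would expect.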

referenced by

2006/12/05 What's wrong with what FxCop does for globalization, Part 0
