by Michael S. Kaplan, published on 2005/11/18 04:01 -05:00, original URI: http://blogs.msdn.com/b/michkap/archive/2005/11/18/492339.aspx
Last weekend, Dean Harding commented when I was talking about preferring uppercase before lowercase or vice-versa:
The whole idea of sorting (at least for latin-based scripts) is just convention anyway... I mean, you may well ask "why should 'A' come before 'a'?" but then why not ask "why should 'a' come before 'z'?" there doesn't seem to be any actual reasoning behind the 'sort order' of our alphabet at all!
Unless, I'm missing something here...
Dean wasn't missing anything here, no. But I thought that might be worth a little discussion. :-)
I remember Cathy and I doing a presentation a while back for the group about collation (similar to the ones we did at IUC22 and IUC23), and we talked about how, to someone speaking English or German, Å comes after A, while in Swedish it comes after Z. After the presentation, colleague (and native Swede) Anna came up to gently correct us: although people speaking English might make that 'incorrect' choice, people speaking German would not. We stuck to our guns, and it was only after a bit of investigation that she realized that Germans 'got it wrong', too!
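To make the Å example concrete, here is a toy sketch in Python. It is not how any real collation implementation works (real collation is multi-level and much more involved); the function names, the word list, and the single-string "alphabet" are all hypothetical and exist only to show the same four words landing in two different orders.

```python
import unicodedata

# Toy illustration only. The alphabet string and both key functions are
# assumptions made up for this sketch, not a real collation API.

# Swedish treats å, ä, ö as separate letters that follow z.
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyz\u00e5\u00e4\u00f6"  # ..., å, ä, ö


def _prepare(word):
    # Normalize to NFC so Å is one code point, then lowercase.
    return unicodedata.normalize("NFC", word).lower()


def english_like_key(word):
    # English/German-style expectation: Å sorts with A, just after plain A.
    lowered = _prepare(word)
    folded = lowered.replace("\u00e5", "a")
    # The second element breaks ties so plain "a" comes before "å".
    return (folded, lowered)


def swedish_like_key(word):
    # Swedish-style expectation: rank each letter by its alphabet position.
    # Raises ValueError for characters outside the toy alphabet.
    return [SWEDISH_ALPHABET.index(ch) for ch in _prepare(word)]


words = ["Zorn", "Åke", "Adam", "Ake"]

english_order = sorted(words, key=english_like_key)
swedish_order = sorted(words, key=swedish_like_key)

print(english_order)  # Å-word lands right after its A-twin, before Z
print(swedish_order)  # Å-word lands after Z
```

The only difference between the two results is where the Å word ends up relative to "Zorn", which is exactly the disagreement Anna and I were having.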
Then last year, Kieran suggested that I check out The Language Instinct: How the Mind Creates Language by Steven Pinker. It is one of those great tomes that can speak to people who are not necessarily linguists, and help anyone understand more about language than they ever did previously. I knew I was learning something when over half of the linguistic factoids that were being shared with me each day were things that I had also read in the book. :-)
Anyway, given how fascinated I was by collation, I admit I was at first disappointed that so little was said about the subject in the book, which was really blowing me away otherwise. There is only a brief mention, in Chapter 8:
...for the same reason that alphabetical order is similar across the Hebrew, Greek, Roman, and Cyrillic alphabets. There is nothing special about alphabetical order; it was just the order that the Canaanites invented, and all Western alphabets came from theirs.
I admit that I felt a little odd having such a passion about something that apparently had very little meaning in a 'linguistic' sense. It was probably about that time that I coined the term 'delusions of linguistic aptitude' to describe myself. :-)
Now I got over it pretty quickly, because I realized that this one sentence did not invalidate my interest, and it certainly didn't invalidate the importance of it given all of the places that collation is used. It was shortly after that time that I posted Putting Your Ducks in a Row about different 'alphabetical orders' -- the large degree of variation between them and some of the many different principles behind them. And this does not even get into the fact that people are used to them and are confused any time things are not in the order that they expect (which is usually alphabetical order).
I am still fascinated by collation, probably at least in some part now because, as cool as language as an instinct is, collation is so ingrained in people that they do not even realize it's not an instinct (hell, in most cases they do not even realize it varies between languages in a single script!)....
This post brought to you by "Ա" (U+0531, a.k.a. ARMENIAN CAPITAL LETTER AYB)
# orcmid on 18 Nov 2005 12:55 PM:
# Petr Kadlec on 19 Nov 2005 8:50 AM:
# Michael S. Kaplan on 19 Nov 2005 9:12 AM:
referenced by
2008/08/28 Collation backstory?
2008/02/23 Despite progression, the bug calls out to me quite LAOdly
2006/02/12 Collation can actually be linguistic
2006/01/01 Sorting multilingual data