Hindsight is 20/20 and DAGGER is U+2020, coincidence? (aka I miss emoji like the deserts miss the rain)

by Michael S. Kaplan, published on 2007/09/16 01:31 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2007/09/15/4936526.aspx

It was a [somewhat] innocent time.

Just after the bulk of the Emoji-ficationary posts had done their worst, and had their way with the minds and hearts of so many who don't know quite what to do with a symbols proposal, since no symbolic gestures can be made that wouldn't pale next to the actual symbols under discussion.

And before the latest round of Tamil script fundamentalism had reared its...err... head, with at last tally over 150 messages in a week and at least one new disciple from among the former authors of character proposals in support of the encoding model everyone so eagerly wished to plan the overthrow of. I am left wondering whether the counsel should be to take the effort seriously (in which case development for Tamil would likely slow down or stop until the issue is decided) or ignore them as cranks (in which case work continues and customers are happy, but in-country experts are quite unhappy to be considered cranks and in-country cranks are quite unhappy to not be held out as experts).

The posts have wound down again for now, the energy that fueled them all but exhausted for the moment.

I'm sure I will say a word or two eventually; a post for another day, to be sure.

I look forward to it much as I would getting a root canal done. Or perhaps getting my gums scraped.

But as I said, I am thinking of prior to this latest travesty....

After some particularly misleading statements had been made about the whole Emoji issue, Ken Whistler (the cool uncle of Unicode) made a post clarifying many of the issues surrounding Emoji (Doug Ewell also added one clarifying point which I have included in Ken's original text, much as one would create a slipstream release of a product that included the latest hot fix!).

The text follows....

At the risk of being banned for necroposting, I thought it might be useful to take a little discursus into etymology here. ;-)

Emoji is actually the Japanese word for "pictograph", and long predates the existence and use of emoticons in email and bulletin boards (and later websites and forums, and now in text messaging).

Emoji: 絵文字 (U+7D75 U+6587 U+5B57), literally: "picture-writing" or "picture-script", hence "picto-graph".

The Chinese for the same concept is 象形文字 (xiang4xing2wen2zi4, a.k.a. U+8C61 U+5F62 U+6587 U+5B57), literally: "image-writing" and denoting the same concept, "pictograph". It also is used specifically for "hieroglyph".

"Emoticon" is a cutesy English language neologism, of course, from "emote" + "icon". Lots more on the wiki.

Or as Doug Ewell pointed out, from "emotion" + "icon."

(This is perhaps akin to the debate over the true origin of U+00DF)

The Japanese translation of "emoticon" is kaomoji or kaofugou, literally "face mark". And "face mark" even occasionally gets calqued back into English to refer to them.

The whole area then gets murkified by the happy (or unhappy) coincidence that "emoji" is very similar to how the first 3 syllables of "emoticon" would be pronounced as borrowed into the Japanese language. So they look and sound as if they *should* be etymologically related. And of course, because they seem to be, people treat them as such, and the distinction has become blurred.

Now, the symbols in question used by the Japanese wireless operators are all lumped together as "emoji", and referred to as that both inside and outside the industry, despite the fact that they constitute a mixture of pictographs proper, non-pictographic symbols, and emoticons (which themselves are a strange offbranch of pictographs, developing with their own idiosyncratic logic different from non-emoticon pictographs).

So I guess the moral of *this* story, at least today, is be careful about terminology here. Emoji is not the same as emoticon, and the encoding issues involved are diverse, depending on exactly which sets of which kinds of things we are talking about at any point.
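For anyone who wants to check the code points quoted above, here is a minimal Python sketch. It assumes nothing beyond the standard library; the helper name `decode` is made up for illustration and is not part of any Unicode tooling.

```python
import unicodedata


def decode(code_points):
    """Turn a string such as 'U+7D75 U+6587' into the characters it names."""
    # Strip the 'U+' prefix from each token and parse the rest as hexadecimal.
    return "".join(chr(int(cp[2:], 16)) for cp in code_points.split())


# Code point sequences exactly as they are written in the post.
emoji_word = decode("U+7D75 U+6587 U+5B57")
pictograph_word = decode("U+8C61 U+5F62 U+6587 U+5B57")
dagger = decode("U+2020")

print(emoji_word, pictograph_word, dagger)

# The character database that ships with Python supplies the formal name.
print(unicodedata.name(dagger))
```

The same `decode` call works for any other `U+XXXX` sequence mentioned here, such as U+00DF.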

The thing I find most unbelievable is that just a week ago I was so tired of hearing about symbols in general and emoji in general that I was looking for that dagger symbol to stab my laptop long enough to be unable to receive email.

Now that I think about it, putting † (DAGGER) at U+2020 is quite ironic -- it's like Unicode is saying "in hindsight, maybe you should have just stabbed someone."


Anyway, I was thinking the whole time "THEY ARE JUST SYMBOLS, PEOPLE -- we should talk about languages, and scripts. You know, important stuff."

But now, 150 messages into the latest pitch to deny the Tamil encoding as an abugida and re-encode it as an alphasyllabary, include all of the "pure" consonants and of course include an "x" letter which is basically a "ks" letter but in a brave pitch to synthesize spontaneous unsupported-by-the-community spelling reform needs to have these items encoded differently....

And suddenly, I am missing conversations about emoji like the deserts miss the rain.

Emoji may not be an ideal problem to solve, but they at least seem like a more tractable one, at the moment... :-)


This post brought to you by † (U+2020, a.k.a. DAGGER)

# John Hudson on 16 Sep 2007 2:31 AM:

"The posts have wound down again for now, the energy that fueled them all but exhausted for the moment."

What is this exhaustion you speak of? Do I look exhausted? Ha! The bit is between my teeth, and I'm just waiting for India to wake up and respond to Jonathan Kew and Peter Constable before I go back in swinging.

# Michael S. Kaplan on 16 Sep 2007 2:32 AM:

Maybe it was just wishful thinking on my part. :-)

