by Michael S. Kaplan, published on 2007/09/26 03:16 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2007/09/26/5134774.aspx
New reader William Overington asked over in the Suggestion Box:
In various recent articles in your blog you have mentioned lots of postings and discussions about the encoding of emoji and emoticons in Unicode.
Yet those postings are not in the Unicode public mailing list.
Could you please consider writing a blog article about what is in fact happening in this field as I, and maybe some other readers who are not representing an organization which is a member of the Unicode Consortium, would like to become aware of what is the present situation.
William Overington
20 September 2007
Some of the conversations in question are happening on the "core" Unicode list for members, though I usually do not quote from there, out of a general "trying to respect some kind of implied confidentiality of stuff there" thing, which I only feel comfortable violating when people multipost and/or crosspost with a bunch of other lists.
Luckily people do that a lot, especially on the silly mails of the sort that a person such as myself might quote. :-)
On the other hand I am not really a list secretary or ambassador and only talk about pieces that I think might be interesting to cover. My interests, though wide and varied, are neither consistent nor predictable, so it is hard to really know what one might get on a given day.
Also sometimes I say things like "tomorrow I'll talk about _________" when the blank is filled in with underwater basket-weaving or international happy fun ball tryouts or a woman I went out with years ago or whatever. When I actually mean that I will write about it tomorrow -- and the actual post could be tomorrow or in a few days or next week or even never. And I am only mildly ashamed to admit that the biggest factors gating this are
Back to William's question, if you look on the List of UTC Subcommittees, you may see the Symbols subcommittee. I do not know the precise rules by which membership happens, though I do know that each subcommittee has its own mailing list. Thus far the Symbols list's principal defining characteristic has apparently been that every single message has been posted to the core Unicode members list as well, meaning twice the mail for me since the list server does not consolidate those duplicates.
You know the expression "If you take out the curse words, he didn't say anything at all." ? No? Well, it exists.
I have been posting periodically about the work there when issues seemed interesting, but it has been a while since they have (in my opinion) so if I filter out all of the uninteresting stuff then I have nothing to report. :-)
If that changes I'm sure I'll say a few words....
I am probably being much less respectful about symbols than I could be, probably out of fear for the really tortuous things that at least the conversations are attempting to do to (if they have their way, formerly) ironclad encoding rules. I feel that the non-letters need to be kept in their place, which includes sitting in the back of the bus and entering via the back door to the kitchen if they want food. The front of the bus and the tables in the diner are reserved for real letters and numbers!
Or what did the barkeep at the cantina in Star Wars say? Something about how "We don't serve their kind here. Your symbols? They'll have to wait outside" or somesuch.
You get the point, I'm sure. You probably did many metaphors ago and just kept reading to humor me on this one.
I do apologize if the words offend some symbols, but I have to assure you that it was almost certainly intentional. :-)
But that's just my opinion, I could be right....
This post brought to you by ◌ (U+25CC, a.k.a. DOTTED CIRCLE)
# Andrew West on 26 Sep 2007 6:11 AM:
It's good to see that William is still lurking on the Unicode list. Whilst most of the emoji kerfuffle has been on the Unicore list, most of the relevant documents are in the public domain, and William will be able to get a good idea of what is under discussion from these documents:
Markus Scherer's original emoji mapping table:
<http://www.unicode.org/~scherer/emoji/emoji_mapping_utc_pub.html>
A zip of all the files in the above:
<http://www.unicode.org/~scherer/emoji/07257-emoji-wd-table.zip>
A pdf version of the above:
<http://www.evertype.com/standards/iso10646/pdf/emoji_mapping_utc_pub.pdf>
Michel Suignard's proposal for Japanese TV Symbols:
# William Overington on 27 Sep 2007 12:50 PM:
Thank you for writing an article in response to my suggestion for a topic and thanks too to Andrew for the list of links.
May I cross-reference this blog article to an earlier one please?
http://blogs.msdn.com/michkap/archive/2007/08/18/4455146.aspx
Also to a thread about emoji in the High-Logic forum please?
http://forum.high-logic.com/viewtopic.php?t=1868
And also to a thread about emoji in the Sun Java forum on Java TV please?
http://forum.java.sun.com/thread.jspa?threadID=5213927
Michael's link to information about the "core" Unicode list for members provides information that the situation is different from how things were the last time that I had looked: individual members were at that time not allowed to receive the unicore list emails; so that is a welcome change for me to have learned about.
Hopefully there will be some more comments in this present blog thread and hopefully more information about the situation will emerge.
I wonder if the Unicode Consortium will put the matter out to public consultation.
There are some interesting posts in the Unicode mailing list today about suggestions for new symbols.
As some readers may know I feel that items such as commands for setting colours and for setting point size and for some other typographic features could usefully be encoded as regular Unicode characters. They could all be encoded in one of the presently unused higher planes with a rule that they are default ignorable by displaying processes yet can be used by a displaying process if an option to use them is chosen in the options of the displaying process.
There are lots of unused planes in Unicode. It seems to me that some of them could be used to encode various sorts of interesting things using characters, including such things as a manufacturer-independent portable object code for expressing interactive illustrations.
It seems to me that if the possibilities are considered even-handedly, without the present situation being regarded as unchangeable, great progress could be made.
William Overington
27 September 2007