Inspiration, and a code chart

by Michael S. Kaplan, published on 2008/11/03 10:01 -05:00, original URI: http://blogs.msdn.com/b/michkap/archive/2008/11/03/9029808.aspx


Way back in September, after I gave the presentation at the Internationalization and Unicode Conference (the one I mentioned, and provided the slides for, in Behind the Proposed Change to Tamil in Unicode (five different ways)), Scott sent me the following via the contact list:

Michael,

After your talk today, I was inspired to put up a Unicode syllabary chart for Tamil on the Tamil Script page in the English Wikipedia, complete with  the new Tamil named sequences from Unicode 5.1, in the hopes of building support for the current Unicode encoding model.  Anyway, you can check it out if you're curious:

http://en.wikipedia.org/wiki/Tamil_script#Tamil_in_Unicode

If you find anything horribly wrong, I'd be happy to fix if you let me know about it, so you won't have to violate your policy of Wikipedia non-editorship.

I just hope this doesn't earn me death threats!  ;)

-Scott

I think that what Scott did here was excellent, and I did not note anything horribly wrong at all....

And it humbles me to think that I helped inspire it.

Because even though that is the secret hope I have for some of my talks (especially including this one), it is really awesome to see it spelled out in such a way.

The chart he provided was similar to but not the same as the ones I provided in Learn Tamil in 30 Days (or something like that), and it helps people look at Tamil in Unicode the way that they might learn Tamil, something the simple code allocation chart would never be able to do -- in its own way, something Unicode cannot do without providing this same crucial bit of information in a familiar form.

Thanks, Scott -- both for this and for supporting my non-interference policies WRT Wikipedia! :-)

Which reminds me that I promised to talk more about some of the issues I didn't have time to cover in the talk. I'll be sure to get on that....


This blog brought to you by ஹ (U+0BB9, aka TAMIL LETTER HA)


Scott on 11 Nov 2008 6:50 PM:

Michael,  thanks again for the inspiration, and for your kind words.

I have recently revised the Tamil script Wikipedia page to add rows to the Unicode syllabary for syllables starting with J-, SS-, S-, H-, and KSS-, following a similar addition to the Tamil compound table earlier in the article.

I must admit, my knowledge of Tamil script is quite limited, and I have been drawing heavily on the work of others in my additions to the Tamil script Wikipedia article. I have limited the scope of the Unicode syllabary chart to those Tamil compounds listed earlier in the article, because I don't truly understand what ought to be listed in this page.

In particular, I have omitted the Tamil syllables containing U+0BB6, which are included in the Unicode 5.1 named sequences, because I don't know why these compounds aren't mentioned earlier in the page. Are they used for transcribing words in other languages such as Hindi or Sanskrit? Are they used for writing minority languages that share the Tamil script? Are they historical characters no longer in modern use? Any of these reasons could make these characters politically sensitive. Does anyone know more about these characters?

I also omitted the named sequence for SHRII because it doesn't fit in the table format, and I don't understand its usage or significance either.

Jan Kučera on 12 Nov 2008 5:25 AM:

Hi Scott,

we were told that SHRII is only being used in Sri Lanka (which is not what they call the island, either), so its significance is probably not the highest, though we learned it as a standard grantha letter. It took me a while to figure out how to write it if you do not have a Tamil keyboard, so some might find it helpful to know that it is SA+Virama+RA+II. This also suggests that you cannot put it into the syllabary table, unless you include it in the II (U+0B88) column only, leaving the others empty (which seems like quite a good idea to me).

I'm not sure about the U+0BB6 either, but I will ask today at school if Michael does not cover it sooner...
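For anyone without a Tamil keyboard, the SA+Virama+RA+II recipe above can be tried out with a few lines of Python. This is only a minimal sketch: it assumes that "II" after a consonant means the combining vowel sign (TAMIL VOWEL SIGN II) rather than the independent letter, it looks the characters up by their Unicode names instead of hard-coding numbers, and the helper names `build_sequence` and `describe` are made up for illustration. It has not been checked against the official named sequences file, and whether the result displays as a single SHRII ligature depends on the font and shaping engine in use.

```python
import unicodedata

# Sequence as described in the comment: SA + VIRAMA + RA + II.
# Assumption: "II" is the combining vowel sign, since it follows a consonant.
SHRII_NAMES = [
    "TAMIL LETTER SA",
    "TAMIL SIGN VIRAMA",
    "TAMIL LETTER RA",
    "TAMIL VOWEL SIGN II",
]


def build_sequence(names):
    """Join the characters with the given Unicode names into one string."""
    return "".join(unicodedata.lookup(name) for name in names)


def describe(text):
    """Return (code point, character name) pairs for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


shrii = build_sequence(SHRII_NAMES)

# Show the rendered string, then each code point that makes it up.
print(shrii)
for code_point, name in describe(shrii):
    print(code_point, name)
```

Pasting the printed string into a document is then a workaround for typing it directly; the same `describe` helper can be pointed at any other syllable in the chart to see which code points it is built from.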

Jan Kučera on 12 Nov 2008 10:43 AM:

Hello, I'm back. You can find pretty good information about SHA (U+0BB6) and how it relates to SHRII in the SHA proposal, which can be found at

http://std.dkuug.dk/jtc1/sc2/WG2/docs/n2617.pdf

From what I've found, it is used very rarely, and for transcribing words (from Sanskrit etc.), and it belongs to the grantha characters. I don't think it is much more politically sensitive than other grantha characters.

Hope this helps :)

Scott on 12 Nov 2008 5:50 PM:

Thanks Jan! That reference to the SHA proposal was very useful. I've updated the Tamil script Wikipedia page based on this info.


