by Michael S. Kaplan, published on 2008/03/11 09:01 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2008/03/11/8118537.aspx
Please read the disclaimer; content of Michael Kaplan's blog not approved by Microsoft!
I think it was several UTC (Unicode Technical Committee) meetings back.
Some of the representatives from the government of India were there, and they talked about language CDs being created for all of the languages of India -- with fonts, keyboards, and so on. Some of it even supporting Unicode.
They asked if I was interested.
I said of course I was interested!
Anyway, they came in the mail one day, a nice little stack of CDs. I could spread them out on the desk:
or I could bring them in closer so they'd be a little easier to read in the picture:
On each CD there is a nice quote from Thiru Dayanidhi Maran (Hon'ble Minister of Communications & Information Technology):
I strongly believe that if benefit of computers are to reach the masses, it is best done when used in one's own mother tongue. For this, Fonts and Language Software Tools need to be available freely to the public. I hope this rich collection of ____________ Software Tools and Fonts made available freely for the first time will open up a new horizon in the use of computers in ____________ by the common man.
(Blanks filled in with the various languages on the CDs in each case)
I looked through them, especially for the languages I was doing the most with (Hindi, Tamil, Bengali, Kannada, Malayalam).
Very weird, Bengali wasn't there!
They have Assamese, which uses the same script, though the government in India goes to some lengths to identify it clearly as a separate language, for political reasons that I tend to agree with rather wholeheartedly when I am talking about Language rather than Script.
But it wasn't there. I checked the envelope again to see if it was just stuck in there.
No dice.
There is a URL on the back of the CD, so I'll look there: it's http://www.ildc.gov.in/ if you want to look too....
The Speech Corpora link on that page is broken at the moment, by the way!
Hey, you can even (maybe?) ask for your own copies if you want -- like the site says via a banner:
CD Dispatch has been started. In case you did not recieve [sic] your reqested [sic] copy send us feedback at info@ildc.in.
There is even the nice list of languages covered by the Software Tool & Fonts:
But yes, there is no Bengali!
And Assamese can't really be considered "good enough" here, in a list that clearly includes both Hindi and Marathi (both of which use the Devanagari script).
We aren't talking scripts here, clearly....
And even if we were, then as I pointed out in A script, by any other name, the Government of India and TDIL (Technology Development for Indian Languages) aren't talking scripts -- because they generally can't.
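For anyone who wants to see the script-versus-language split for themselves, here is a small Python sketch using the standard unicodedata module. The sample letters and the script_prefix helper are my own picks for illustration (nothing from the CDs or the ILDC site), and taking the first word of a character name is only a rough stand-in for a real script property lookup:

```python
import unicodedata

def script_prefix(ch: str) -> str:
    """Return the first word of a character's Unicode name, e.g. 'BENGALI'.

    Illustrative heuristic only: it is not a real Script property lookup.
    """
    return unicodedata.name(ch).split()[0]

# Hypothetical sample: one letter commonly associated with each language.
samples = {
    "Bengali": "\u098A",   # BENGALI LETTER UU
    "Assamese": "\u09F0",  # BENGALI LETTER RA WITH MIDDLE DIAGONAL
    "Hindi": "\u0939",     # DEVANAGARI LETTER HA
    "Marathi": "\u0933",   # DEVANAGARI LETTER LLA
}

for language, ch in samples.items():
    print(f"{language}: U+{ord(ch):04X} {unicodedata.name(ch)}")
```

The Assamese letter carries BENGALI in its character name, and the Marathi one carries DEVANAGARI, which is the script side of the story; the language list on the site is a different thing entirely.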
I admit I'm fixated on this at least in part because I don't think I'll be learning it in 30 days from my current resources, as I discussed in ঘেমন কর্ম তেমন ফল, aka Learn Bengali in a month (or not).
It just seems kind of odd to me. Looking at that Indian census data I mentioned in A script, by any other name, with Bengali and Gujarati clocking in at 8.6% and 5.1% of the country respectively, I would have expected a CD full of language resources for them too....
Maybe they just couldn't find enough yet for a CD.
I gather that I may well jump into the actual CD contents in some upcoming blog or blogs, some day. For now though, I'm just covering their existence....
This blog brought to you by ঊ (U+098a, aka BENGALI LETTER UU)