I can't actually see Chennai (or any other city in Tamil Nadu) from *my* house....so let's talk LIPs!

by Michael S. Kaplan, published on 2010/05/09 07:01 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2010/05/09/10009652.aspx


The title is accurate, I cannot see any part of Tamil Nadu from where I live.  But then I'm not running for vice-president of the United States so the "informal" nature of such a bizarre metric on domestic and foreign policy experience leading to me being a poor choice for such a role is unlikely to hamper me too much. Instead I've just gone to the places, which is clearly less impressive....

Which is good -- because I can forget about all that, and focus on some actual exciting news!

You know, forget about the above, it sounded much cleverer in my head. A place you wouldn't want to live....

So now presenting the thing that everyone (including me) had been waiting for (on, by providence or design, this 150th anniversary of Tagore's birthday) -- the Tamil Language Interface Pack for Windows 7!

It is only being released for 32-bit¹, and you must have English resources installed.

And yes, I wish they were releasing a 64-bit version too. As I have mentioned.

You can download that 32-bit version from right here....

This LIP is produced as a part of the Local Language Program! w00t w00t!

And now, a little background information about Tamil:

NUMBER OF SPEAKERS:

~70 million, worldwide

NAME IN THE LANGUAGE ITSELF:

தமிழ் 

Tamil is the official language in the southeastern Indian state of Tamil Nadu and also in Sri Lanka and Singapore. It is constitutionally recognized in India and South Africa. Most of its 70 million speakers live in Tamil Nadu (around 50 million) and the neighboring states and in northern and northeastern Sri Lanka (4 million), but there are also significant communities of speakers in Singapore, Malaysia, Mauritius, and South Africa.

A purist movement in the 19th and 20th centuries cleaned the Tamil vocabulary from many Sanskrit loan words which can still be found in many other related languages.  Tamil, like other Dravidian languages, is an agglutinative language.

FUN FACTS:

CLASSIFICATION:

Tamil is a member of the group of Dravidian languages, which are not related to the Indo-Aryan languages spoken in northern and central India (those belong to the family of Indo-European languages). Its closest relative is Malayalam⁵.

SCRIPT:

Tamil has its own abugida script (meaning that it is a syllabic, not an alphabetic script). It has fewer characters than most other Indic scripts due to the lower number of consonants in the Tamil language (there are neither aspirated nor voiced stops like the English g or f⁴).  It is said that the script developed from the Brahmi script to its own form due to the fact that Tamil was mostly written on palm leaves. This required rounder characters and fewer dots (which would have pierced the leaves).
[note from me - some have suggested that the palm leaves point is an apocryphal bit added to Daniels & Bright years ago, but I have not had a chance to get the true story from anyone yet]
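
To make the "syllabic, not alphabetic" point a little more concrete, here is a small Python sketch (not part of the original post) that breaks the name of the language into its Unicode code points. The helper name describe is made up for illustration; the only library used is Python's standard unicodedata module. It shows base consonant letters carrying dependent signs: a vowel sign that replaces the inherent vowel, and a virama (the pulli) that removes it.

```python
import unicodedata

# The Tamil name of the language, written with escapes so the code points
# are explicit: TA, MA, VOWEL SIGN I, LLLA, VIRAMA (pulli).
word = "\u0BA4\u0BAE\u0BBF\u0BB4\u0BCD"


def describe(text):
    """Return (code point, Unicode name, general category) per character.

    Hypothetical helper for illustration only.
    """
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))
        for ch in text
    ]


for code_point, name, category in describe(word):
    print(code_point, category, name)

# Building syllables from one consonant letter:
ka = "\u0B95"            # consonant letter with its inherent vowel "a"
ki = ka + "\u0BBF"       # the vowel sign I replaces the inherent vowel
k = ka + "\u0BCD"        # the virama (pulli) suppresses the vowel entirely
print(ka, ki, k)
```

Five code points make up what a reader sees as three written units, which is the abugida behavior described above: letters (general category starting with "L") are the consonant bases, and marks (category starting with "M") modify them.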

You can read more about Tamil here. The sections about old/middle/modern Tamil are particularly interesting, in my opinion....

As I have mentioned in From Seattle (USA) to Coimbatore (India) in June? You betcha! , I will be at Tamil Internet 2010 in Kovai next month, and various other places throughout India. I now have my letters and my visa plus everything else I need. And I am very excited about that. I remembered the Next time, just bring one laptop, Mr. Michael! lesson and am somehow bringing just one laptop this time. Oh, and one iBot, which is neither PC nor Mac⁶.

Also, for the record, speaking specifically to the points raised in Tamil language support in Windows? You can't SHRII-k yet, but it's getting better, they still did not add SHA to the keyboard, either to fix the SHRII or to just add the letter. Thank goodness for the typography team adding the support, though I occasionally hear rumors of opinions some native speakers have on Latha itself that I'd love to get someone to describe on the record. :-)

 

1 - Though I am going to convert one of my self-hosting machines into a build machine so I can make a 64-bit version for me at least²
2 - No, I cannot send it to anyone, sorry. Being a full-time employee in the Windows org does have some advantages³
3 - I also have my own version of Notepad that makes writing a UTF-8 BOM optional!
4 - Though as I have discussed previously, Tamil does have a more or less world-wide convention of using the Aaytham as an "F" in loan words that require the sound.
5 - whose LIP was also recently released, as I mentioned in Are MALAYALAM KA & TAMIL KA confusable? Only if you think "all those Dravidians look alike"!
6 - Actually it is more of a custom Linux distro on custom hardware, so probably closer in lineage to a PC.


Gwyn on 9 May 2010 2:12 PM:

Is "etric" a real word, or a typo? It looks like it really should be some kind of cool obscurity, but really I think you may have meant "metric". Pity though.

Michael S. Kaplan on 9 May 2010 5:52 PM:

That was a typo. :-)

Fixed now....

Mihai on 10 May 2010 7:37 PM:

"Though I am going to convert one of my self-hosting machines into a build machine so I can make a 64-bit version for me at least."

I keep asking myself for a while: what would it take for someone to create his own MUI/LIP? Creating it is not the hard part, but being recognized by Windows...

Then 64 bit from 32 bit is trivial, because the files are resource-only DLLs, and copying resources from 32 bit DLLs to 64 bit DLLs should be easy.

But being kind of afraid of all the legal traps around "reverse engineering" makes me refrain from trying any of this, no matter how interesting/tempting :-)

Pavanaja U B on 10 May 2010 9:31 PM:

@Mihai

About 6 years ago, there was a meeting with Craig Mundie along with officials of the Govt of Karnataka, regarding the status of Kannada in MS products. I was part of the Govt team. My suggestions were-

Make all products open architecture wherein the end-user can create his own LIP, collation table, etc. and use them. This concept is prevalent in Linux

Regards,

Pavanaja

Jan Kučera on 13 May 2010 1:43 PM:

...I think the solution to make this open would cause people eventually creating buggy and corrupted content, but the end users would always blame Microsoft anyway, especially when LIP/collation tables etc. are used through the system, and cannot be identified as a 3rd party product. I much more like the idea of Local Language Program (up to the point where the government must initiate the process, though).

Anyway, what I wanted to ask... I thought you did once tell that users are expecting to have sorted their things together, and other after that. But with this Tamil LIP, I see the files and folders in Latin before the ones in Tamil. Do they have it the other way round?

Also I wonder why Latha was chosen as the system font. In English Windows, all Tamil named folders and other text fell back from Segoe UI to Latha and was displayed correctly. So I actually don't see why it should not be kept this way in Tamil LIP too, do you? Choosing Latha as the default one brings these cool tiny latin letters (I really feel tons of them would fit onto my display!), but not only does it obviously lack extended latin characters (which in turn fall back again), quite a lot of UI text is in a font I have not seen for a very long time now, likely MS Sans Serif (e.g. tabs in IP protocol properties of network adapter, or Visual Studio 2010 save-before-build dialog), which eventually just displays squares instead of falling back. Why is this one used sometimes? Anyway... wouldn't keeping the system font set to Segoe UI do the job?

Michael S. Kaplan on 13 May 2010 1:56 PM:

It is an attempt to solve the problems discussed in this blog related to inappropriate font scaling....

Jan Kučera on 14 May 2010 8:40 AM:

Can I somehow set the UAC prompt (on protected screen) to use the Tamil keyboard layout by default? It does not seem to remember the keyboard setting as other windows do...

V.C.Vijayaraghavan வி.சி.விஜயராகவன் on 18 Mar 2011 3:55 AM:

Dear Mike

You mention "A purist movement in the 19th and 20th centuries cleaned the Tamil vocabulary from many Sanskrit loan words". This is true only in some parts. Yes, there was and is a purist movement. Yes, they have tried to "clean" (what else is a job of a Purist movement apart from laundry ?) Tamil of Sanskrit and also English words. Their success is far from complete and like King Tantalus, they are eternally trying to roll up the Pure Tamil up the mountain of daily usage .. and you know the story. Cleaning is not a done deal. For example, this Laundry Movement tried to clean Tamil of what is called Grantha characters, yet both the Tamil keyboards and daily usage (writing and speaking) use these characters ஷ, ஹ, ஸ, க்ஷ. For years Tamils have been using the word பஸ் to denote 'bus', which has the same sound and has a grantha letter. A new Tamil word was coined 'pErunthu' (பேருந்து in Tamil) - which literally means a Big Push. If you google for பஸ் and பேருந்து, you will see the former is used by all and sundry, including daily newspapers. So much for the success of the laundry. In fact, the Tamils have been saddled with two or more words, one which is a popular usage and another, and sometimes more than one, "Pure Tamil" word. "Intellectual progress" in the Tamil speaking world has not been production of new knowledge, but coining new "Pure Tamil" words for what the people already know. I would ask you to take with a pinch of salt (if not tons) what you hear about Tamil from Tamils, especially from official quarters.

வி.சி.விஜயராகவன்

V.C.Vijayaraghavan on 18 Mar 2011 4:09 AM:

Hi Mike

In my last email, I mentioned ""....and like King Tantalus, they are......

That should have been King Sisyphus, I quickly realised.

Vijayaraghavan

