by Michael S. Kaplan, published on 2012/06/04 07:01 -04:00, original URI:

The other day in the blog, Jan Kučera (friend/colleague/regular reader/aspiring Tamil expert) commented on "Have you started looking at the Windows 8 Release Preview yet?", saying in part:

But one of the bigger surprises is that the new default font for Tamil uses an old variant of the script (AI/AA ligatures), is this intentional?

Now let me take a moment to say that Jan is someone I owe a beer and a meal and a copy of Windows 7 and 8 and whatever else I can think of, for not being able to spend as much time with him as I wanted to when I was in India a few years ago, which I feel really bad about.

I think he's forgiven me (I hope so!), but let me take a moment to apologize -- sorry, Jan!

Ok, now with that said, he has a point.

The original Latha font does not have the more traditional orthography sometimes referred to as the "Elephant Trunk" orthography, while the new Nirmala UI does in the Release Preview, as this text in WordPad indicates:

The pseudo-localized title bars in these screen shots are because I'm running Cherokee Windows 8, which hasn't yet handed off those strings!

Latha vs. Nirmala UI

And this new Nirmala UI font is not the default font in the RichEdit used by WordPad in the Windows 8 Release Preview.

Because as I typed the text above, WordPad kept using Latha for the Tamil text.

I had to explicitly change the font to Nirmala UI, in fact!

So I copied and pasted the text into Notepad, where you can see the default font behavior when neither font is picked explicitly:

Nirmala UI wins!

Wow, it has been changed to Nirmala UI.

And now we have Elephant Trunks!

Now, Unicode long ago rectified the limitations and problems in its Tamil block description.

With the help of INFITT (including me!), Unicode has updated the text in Chapter 6 to make it clear this is the older traditional orthography:

What Unicode 6.0 says about Elephant trunks

I guess Nirmala UI decided to move away from the "modern Tamil orthography" here!

Kind of ironic given how often people talk about the new Metro user interface in Windows 8 as Modern, at least! :-)
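For anyone who wants to try this on their own machine, here is a minimal sketch in Python (my choice of language, nothing Windows-specific) that builds a few test syllables to paste into WordPad or Notepad. The consonant lists are an assumption based on the letters usually cited as taking the traditional AA and AI forms; they are not taken from the screen shots above, and the helper names `test_syllables` and `describe` are made up for illustration.

```python
import unicodedata

# Vowel signs involved in the AI/AA ligature question.
AA = "\u0BBE"  # TAMIL VOWEL SIGN AA
AI = "\u0BC8"  # TAMIL VOWEL SIGN AI

# Assumption: consonants commonly cited as having traditional
# (pre-reform) forms with these vowel signs. Illustrative only.
AA_BASES = ["\u0BA3", "\u0BB1", "\u0BA9"]            # NNA, RRA, NNNA
AI_BASES = ["\u0BA3", "\u0BB2", "\u0BB3", "\u0BA9"]  # NNA, LA, LLA, NNNA


def test_syllables():
    """Return consonant + vowel sign strings to try in each font."""
    return [b + AA for b in AA_BASES] + [b + AI for b in AI_BASES]


def describe(s):
    """Spell out the code points of a string, one name per character."""
    return " + ".join(f"U+{ord(c):04X} {unicodedata.name(c)}" for c in s)


if __name__ == "__main__":
    for s in test_syllables():
        print(s, "=", describe(s))
```

The underlying text is the same in both orthographies; only the glyphs the font chooses differ, so pasting the output once and switching between Latha and Nirmala UI is enough to see the change.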

At this point, I've pinged the owners for a comment.

Personally I find it potentially a little troubling.

Although the truncation risks to UI from the horizontal differences are minimal, given the taller vertical aspect of some of these letters, the risks of reflow changes are high, given the narrower horizontal aspect of these letters.

So the next version of Word might have to make a different decision in the way it links fonts for Tamil text when the selected font does not cover that text.

But it is definitely a noticeable change in the Windows 8 Release Preview, as Jan noticed!

Kannan on 6 Jun 2012 11:55 AM:

Wow. I was using a Twitter app in Windows 8 and saw this old Tamil script. The first thing I did after that was check your blog, and I see this.

It is interesting to see the changes. It will be very confusing for Tamil users as the changes to the script took place many decades ago and the last few generations grew up learning the new script.

Michael S. Kaplan on 6 Jun 2012 4:59 PM:

I've just been told they are going to be fixing this for RTM....

Jan Kučera on 12 Jun 2012 4:11 PM:

Dear Michael! There is nothing to apologize for, I enjoyed every moment I had the opportunity to share with you and I hope we will meet again. By the way, do you have any internships available in your team? :-)

I am not sure how RichEdit chooses its default font, since mine uses Arial Unicode; I have noticed the elephant trunks in message boxes and other Windows UI. I actually find Nirmala UI more readable than Latha (likely due to larger and thinner characters), but the choice of pre-reform elephants is unfortunate. I am glad this is going to be fixed in RTM, hopefully just moved under the old ligatures feature!

Also note that the increased readability comes at the price of truncated glyphs at the top and bottom; try typing e.g. a couple of lines of ரி in Notepad. Or கு. Or ஜீ or whatever :)

Speaking of Chapter 6, this also reminds me of an issue we discussed at INFITT in Coimbatore, mentioned in the first paragraph of page 300 (Tamil ligatures with RA). Note that Arial Unicode uses the depicted ligatures, while Latha (and Nirmala) leaves the form unchanged. This is what people at INFITT were complaining about. I am aware that there is no official body for language affairs in this area, but Microsoft still has to make decisions based on something, doesn’t it?


