Why my IUC31 talks were presented on Vista (even though running on a MacBook Pro)

by Michael S. Kaplan, published on 2007/12/16 10:01 -05:00, original URI: http://blogs.msdn.com/b/michkap/archive/2007/12/16/6781186.aspx


Back at the end of September, after I posted What's Mac 2008 got, you wonder?, I found out that I may have a few more readers than I had realized when several people sent me information on how I could get the beta.

Very heartwarming, actually. It is always nice to have readers who enjoy helping a fellow accomplish some things!

I had the Beta 7 version of MacOffice 2008 on my machine in plenty of time for IUC31, and I had that machine with me at the conference; it was great!

Unfortunately, even though all three presentations were done on the MacBook Pro, two of the three were actually done while booted into Vista. I thought I would explain why....

First we'll look at two slides on my XP SP2 machine, from the talk I did on collation.

One showing the collation issues I discussed in Why don't all the half forms sort right? related to how the various forms of some letters that should sort together were in fact not doing so:

And the other showing the collation issues I discussed in And then there is the virama.... related to how consonant+virama sorts before or after consonant alone, depending on language:

I had the slides on my Mac partition and was very willing to try them out.

But unfortunately things did not work out as well as I would have liked. :-(

That first slide managed to show the wrong form for two of the three forms there:

Now the messed-up Latin and Cyrillic typography examples are ones I would have fixed up by finding fonts that showed the differences, so I don't really consider it a bug that the various font mappings between platforms do not have 100% identical results.

That would just be the cost of doing the presentation!

But having two of the three forms whose differences I was trying to show not work right? That was a deal breaker....

The problem with the second slide was mainly that fonts for some of the scripts were not on the machine:

The Tamil and Bengali fonts I downloaded from the web, and were it all going to work out I would have tried to fix up the weird color issue.

But I was unable to find fonts for Malayalam at all.

While it might have been fun to show off the whole "last resort" font on the Mac under other circumstances, in this case I was really going for a slightly different effect. :-(

It's funny, we were just talking the other day about the whole "last resort" font stuff in SiaO as The Red Carpet (aka Characters just want to be seen), where regular reader Andrew Cook commented:

A new feature coming out in Firefox 3 early next year will hopefully make this slightly less confusing.

At present, when Firefox 2 encounters a character it cannot find a glyph for, it will display either a question mark or the notdef glyph.

Firefox 3 renders a notdef glyph box, but embeds within it the hexadecimal representation of the Unicode codepoint that the unrenderable character lives at. If the codepoint is on the BMP, it's rendered as four hex digits in a square, going L->R T->B. If the codepoint is on another plane, it's rendered as six hex digits in a box three digits wide, two digits tall, L->R T->B.

I don't quite know how this works with base and combining characters...
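The digit layout Andrew describes can be sketched in a few lines of Python. This is only an illustration of the description in his comment (four hex digits in a 2x2 square for BMP code points, six hex digits in a box three wide and two tall otherwise); it is not Firefox's or Pango's actual code, and the function name hex_box_rows is made up for the example.

```python
from typing import List


def hex_box_rows(ch: str) -> List[str]:
    """Return the rows of hex digits for a 'code point in a box' fallback.

    Hypothetical helper: it follows the layout described in the comment
    above, not any real renderer's implementation.
    """
    cp = ord(ch)
    if cp <= 0xFFFF:
        # BMP: four hex digits, laid out two per row (a 2x2 square).
        digits, width = f"{cp:04X}", 2
    else:
        # Other planes: six hex digits, three per row (3 wide, 2 tall).
        digits, width = f"{cp:06X}", 3
    # Fill left to right, then top to bottom.
    return [digits[i:i + width] for i in range(0, len(digits), width)]


if __name__ == "__main__":
    # MALAYALAM LETTER KA (BMP) and MUSICAL SYMBOL G CLEF (plane 1).
    for ch in ("\u0D15", "\U0001D11E"):
        print("\n".join(hex_box_rows(ch)))
        print()
```

Note that a sketch like this works one code point at a time, so a base character followed by a combining mark would simply come out as two separate boxes.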

Well, assuming that the Firefox 3 solution is the Apple last resort font or that it is at least built on the same principles, we have an answer there now -- what with our great example of Malayalam consonant+chandrakkala and all. :-)

Anyway, that is why not all of the presentations were done while I was booted into Mac OS X (Tiger at the time, upgraded to Leopard as soon as I got home, as I discussed in They say a leopard can't change its spots, but I upgraded anyway!): support for some of the Indic stuff is not quite there yet, including for Devanagari (which was one of Tom's original questions)....

I also did verify that these limitations I mentioned for Indic are by design for this release and that the results were the same in Beta 8 and beyond.

 

This post brought to you by त्‌न (U+0924 U+094d U+200c U+0928, aka DEVANAGARI LETTER TA, DEVANAGARI SIGN VIRAMA, ZERO WIDTH NON-JOINER, DEVANAGARI LETTER NA)
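For anyone who wants to check a sequence like that one, here is a small sketch using Python's standard unicodedata module to list each code point with its character name. The helper name describe and the SIGNOFF constant are just illustrative names for this example.

```python
import unicodedata


def describe(text):
    """Return (U+XXXX, character name) pairs for each code point in text."""
    return [
        (f"U+{ord(c):04X}", unicodedata.name(c, "<unnamed>"))
        for c in text
    ]


# The four code points from the sign-off line above.
SIGNOFF = "\u0924\u094d\u200c\u0928"

if __name__ == "__main__":
    for code, name in describe(SIGNOFF):
        print(code, name)
```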


# Dean Harding on 16 Dec 2007 5:01 PM:

I believe that box-with-codepoint thing might be a Pango feature, because I've seen it in a number of GNU/Linux programs before. From what I understand, the next version of Firefox will be built on top of Cairo, which I think will be interesting, because it'll allow us to compare the differences between Pango and Uniscribe in all kinds of situations without having to have a Windows and a Linux box together :-)

# Tom Gewecke on 17 Dec 2007 11:32 AM:

Thanks for the info about the unfortunate lack of support for Indic in MacOffice 08.  How about Arabic?  That would still be a major plus.

# Tom Gewecke on 17 Dec 2007 11:35 AM:

PS  I forgot to ask, did you try the font Devanagari MT that comes with Leopard?  That is the only font that can do this script correctly in OS X in any case.

# Michael S. Kaplan on 17 Dec 2007 12:24 PM:

I do not see Devanagari MT in the font dropdown in PowerPoint (though I do see a zillion other fonts).

Arabic text copied into Word either formatted or unformatted looks backwards. But maybe it would do a better job with a better font choice?

# Andrew Cook on 17 Dec 2007 12:37 PM:

Yes, that codepoint-in-box thing is a Pango feature. I run XP so I don't normally encounter such things.

On the Unicode website, the Unicode 4.0 Implementation Guidelines <http://www.unicode.org/versions/Unicode4.0.0/ch05.pdf> suggests "Options for rendering such unknown code points include printing the code point as four to six hexadecimal digits [(which Pango does)], printing a black or white box [(which Uniscribe does, using the notdef glyph)], using appropriate glyphs such as [shaded box] for reserved and [pencil in a box] for private use, or simply displaying nothing."

It goes on to mention doing what Mac OS X does:

"An implementation can distinguish between unrenderable (but assigned) code points and unassigned code points by printing the former with distinctive glyphs that give some general indication of their type such as [sequence of script-representative characters in boxes] and so on."

Interesting, comparing specifications with implementations.

# Michael S. Kaplan on 18 Dec 2007 4:27 AM:

Not on the Mac though, right? I thought the ones that ended up on my screenshots were from Apple's last resort font....

# Tom Gewecke on 18 Dec 2007 4:55 PM:

It sounds like Arabic does not work.  Are the glyphs connected at least? This page shows different ways Arabic can be botched.

http://homepage.mac.com/thgewecke/arabprobs.jpg

Your missing glyph boxes are from the Last Resort font.  If you make them very large, you can read the range, etc. around the edge.

http://developer.apple.com/textfonts/LastResortFont/

# Michael S. Kaplan on 18 Dec 2007 5:12 PM:

It seemed to most resemble the backwards one.

# Rosyna on 18 Dec 2007 6:08 PM:

Office for Mac does not support many complex scripts (like Devanagari) or contextual rendering of glyphs (like Arabic). Microsoft's MacBU has stated (on some blog post) they currently have no plans to change this, even though the support itself exists in Mac OS X.

Your slides may have shown the correct glyphs had you opened the PowerPoint presentation in Keynote (assuming Keynote imports it 1:1).


referenced by

2010/07/16 Which form to use if the form keeps changing?

2010/07/09 The script can make the language more complicated [to use]

2008/01/27 How would Harry Potter have pronounced शहिवाख़्‍ का दर्पण, anyway?
