Are ligatures supposed to be thought of as 'single characters' ?

by Michael S. Kaplan, published on 2006/08/11 03:08 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2006/08/11/695147.aspx


Just yesterday, I got an email from Paul Dempsey (the guy who did all of the groundbreaking work to Uniscribe-enable Visual Studio) about a particular (unrelated) bug, to see if it was in my area (it was).

He mentioned after we worked out the bug assignment that he thought I had a great blog. Always nice to hear....

Then, a few hours later I am looking at some of the blogs I read and I run across Dean Harding's post "Visual Studio and Ligatures", in which he points out a bug he submitted for people to vote on, entitled "Visual Studio editor considers ligatures to be a single character".

Of course, the report is for a bug involving the Uniscribe support in Visual Studio....

I need to work on my timing, truly. :-)

As a separate issue, I might want to report Dean's preference for Calibri as his code font to the typographical authorities; I am sure he is violating some kind of natural law there. :-)

For the behavior of backspace vs. delete, you can see this post, which explains the behavior that has (for the most part) been implemented in Windows and other MS products. If VS is not doing it, that is something I'd probably consider a bug....

The additional point raised here is whether letters for which a font defines a ligature are supposed to be treated as a single text element for the purpose of cursor movement and selection. Offhand I might really expect NO in this case (since there really is a difference between a base character with its attached combining characters and two base characters that happen to have a ligature defined in a particular font). But I can understand why this might happen.

I just checked in Notepad, though, and although it too showed the fi ligature for the letters, it did not have the same issue with selection and deletion -- each letter was treated separately.

(Wordpad of course did not even show the ligature, but we already know plenty about Wordpad/RichEdit's limitations when it comes to properly supporting Uniscribe (cf: this post), so I am not surprised!)
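The distinction drawn above, between a base character with its combining characters and two base characters that merely share a ligature glyph in some font, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `text_elements` is a hypothetical helper, not anything Uniscribe or Visual Studio actually uses, and it approximates text elements by attaching combining marks to the preceding base character rather than implementing full grapheme cluster segmentation.

```python
import unicodedata


def text_elements(s):
    """Group each base character with the combining marks that follow it.

    Hypothetical helper: a simplified stand-in for real text element
    (grapheme cluster) segmentation, which has many more rules.
    """
    elements = []
    for ch in s:
        # General categories Mn, Mc and Me are the combining marks.
        if elements and unicodedata.category(ch).startswith("M"):
            elements[-1] += ch
        else:
            elements.append(ch)
    return elements


accented = "e\u0301"  # LATIN SMALL LETTER E + COMBINING ACUTE ACCENT
ligature_pair = "fi"  # two base characters; any ligature lives in the font

print(len(text_elements(accented)))       # one text element
print(len(text_elements(ligature_pair)))  # two text elements
```

Nothing in the character data ties the f and the i together; the ligature exists only as a glyph substitution in the font, which is why treating each letter separately for caret movement, selection and deletion seems like the reasonable expectation.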

Nevertheless, the support for complex scripts in Visual Studio is amazing and has been since Paul first did the work here to enable Uniscribe support. He has moved on to other projects now, but I am sure the current owners can deal with a bug or two. :-)

 

This post brought to you by ྅ (U+0f85, a.k.a. TIBETAN MARK PALUTA)


# Dean Harding on 11 Aug 2006 3:58 AM:

Wow, that was quick! :)

Call me crazy, but I'm really liking Calibri... well, except for that one issue.

# Dean Harding on 13 Aug 2006 3:34 AM:

Something else I noticed about Calibri: Word (this is beta 2 of Word 2007) doesn't use the ligatures at all, it seems!

# Michael S. Kaplan on 13 Aug 2006 4:27 AM:

That is the same problem as the one RichEdit has with not doing any of the dynamic OpenType stuff....

# RubenP on 13 Aug 2006 1:53 PM:

It must be said that all the ClearType fonts with automatic fi ligatures look exceptionally bad for the sequence 'fij'; if you remember, the ij is quite frequent in Dutch, so that's a little troublesome. (To me at least ;-)

But then again, the few fonts that contain a combining acute accent hardly ever actually combine it with the j, and if they do, the accent is markedly different from the accent on the (precomposed) i. Adding acutes to ij is actually something you'd want in Dutch (the acute is an emphasis mark and ij is a vowel; well, a diphthong actually). But because of the very poor support for this kind of thing, even the official rule has become i acute + j, rather than i acute + j acute.

Oh, and how does one stop these ligatures from happening? For example, in Turkish? IIRC the fi ligature is a big no-no in Turkish typography, because you cannot distinguish it from f + dotless i.

With such silly things, I guess non-American digital typography still has a long way to go...

# Michael S. Kaplan on 14 Aug 2006 12:32 AM:

Hi RubenP,

Ah, the issue you are talking about actually deserves to be looked at on its own, as it is a very important one related to language-specific needs within fonts in general. I think I should write something about that.... :-)

referenced by

2007/01/24 Sometimes a WCHAR really *is* just a character....

2006/08/14 Optimized for English (oh, and also Japanese, and maybe a few others)
