Font substitution and linking #2

by Michael S. Kaplan, published on 2005/05/16 03:01 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2005/05/16/417711.aspx

Back in the end of March, when I posted Font substitution and linking #1, that ordinal at the end certainly implied that there were going to be a few more posts in a series.

Sorry it took so long to get back to the topic; I have been busy. :-)

Now I have mentioned MLang, the MultiLanguage object used by Internet Explorer to do many different operations. From conversion to encoding detection to font linking, it does a lot of things, some of which are unavailable anywhere else in the managed or unmanaged worlds of code that runs on Microsoft platforms.

I'll talk about some of those other "exclusive to MLang" features in the future, but this time I will be talking about font linking.

According to the documentation for the IMLangFontLink interface:

Font linking is a term used to describe the process of automatic font switching based on the character code values of the text stream to be rendered. For example, characters for Hebrew and Japanese are not likely included in a single font. The services provided by this interface would allow a client to switch between a Hebrew font and a Japanese font to output a string containing characters from both languages. IMLangFontLink does this by creating custom fonts and providing an underlying font cache in the implementation.

Now there is no other interface that provides this support, anywhere. Even Uniscribe and GDI+ are simply consumers of font linking technology, and neither one provides the underlying mechanism to build synthetic fonts that have support for different subranges of Unicode.

With that said, not every application needs quite so extensive a level of support. Often the "retail" support provided by the consumers of font linking technology obviates the need for a technology that allows for the "wholesale" support of such virtual fonts. That does not stop some people from wanting that support, though. And it certainly does not make people less curious about how it works!

I'll start by pointing to a really good post that talks about using font linking in MLang, which gives the actual steps you would use to implement custom font linking. It has a sample that does the detection work to find out whether you need to build a synthetic font:

// IMLangFontLink* pMLangFontLink;
DWORD dwFontCodePages;
DWORD dwCharCodePages;

pMLangFontLink->GetFontCodePages(hDC, hFont, &dwFontCodePages);
pMLangFontLink->GetCharCodePages(ch, &dwCharCodePages);

if(dwCharCodePages & dwFontCodePages)
{
    // Character ch can be output with hFont on hDC.
}
else
{
    // Create a custom font to output the characters from
    // dwCharCodePages.
}

(It might be a misnomer to call these items "code pages" since they cover things outside the range of code pages, like various Unicode-only subranges. But I am not going to quibble.)

I had actually started putting together a sample that showed how you would use the interface to do the actual font linking for that "else" case.

But then I found that somebody else had already done it!

So instead I will point to How to display a string without those ugly boxes, a post that Raymond Chen did last year that gives a real world example implementing those steps. I doubt I could have done it better, so I'll just refer to that post....
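For readers who want the shape of that loop without the Win32 plumbing, here is a minimal, portable sketch of the run-splitting idea: walk the string, compare each character's coverage bits against the font's, and group characters into runs that either can be drawn with the current font or need a linked one. Everything in it is an assumption made for illustration. The bit values, ToyCharCodePages, Run, and SplitIntoRuns are hypothetical names, not part of MLang, and the rule that a linked run must share one identical mask is a simplification chosen for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical coverage bits, invented for this sketch (not MLang's real values).
constexpr std::uint32_t kLatin    = 1u << 0;
constexpr std::uint32_t kHebrew   = 1u << 1;
constexpr std::uint32_t kJapanese = 1u << 2;

// Toy stand-in for GetCharCodePages: only three ranges are modeled.
std::uint32_t ToyCharCodePages(char32_t ch) {
    if (ch < 0x0080) return kLatin;
    if (ch >= 0x0590 && ch <= 0x05FF) return kHebrew;
    if (ch >= 0x3040 && ch <= 0x30FF) return kJapanese;  // kana only
    return 0;  // outside every modeled "code page"
}

struct Run {
    std::size_t start;        // index of the first character in the run
    std::size_t length;       // number of characters in the run
    bool needsLinkedFont;     // true if the current font cannot draw the run
    std::uint32_t codePages;  // coverage bits of the run's first character
};

// Splits text into runs. A run that the current font can draw keeps growing
// while the font still covers each character; a run that needs a linked font
// keeps growing only while the characters share the same coverage bits.
std::vector<Run> SplitIntoRuns(const std::u32string& text,
                               std::uint32_t fontCodePages) {
    std::vector<Run> runs;
    std::size_t i = 0;
    while (i < text.size()) {
        const std::uint32_t pages = ToyCharCodePages(text[i]);
        const bool linked = (pages & fontCodePages) == 0;
        std::size_t j = i + 1;
        while (j < text.size()) {
            const std::uint32_t next = ToyCharCodePages(text[j]);
            const bool nextLinked = (next & fontCodePages) == 0;
            if (nextLinked != linked || (linked && next != pages)) break;
            ++j;
        }
        assert(j > i);  // every run consumes at least one character
        runs.push_back(Run{i, j - i, linked, pages});
        i = j;
    }
    return runs;
}
```

In a real program each run flagged needsLinkedFont is where the MapFont, select, draw, and ReleaseFont sequence from Raymond's post would go; the other runs are drawn with the font already selected.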

Now one important point here that is not talked about too much in either the documentation or Raymond's post is that these methods do take up resources, more than you want to tie up for any longer than you have to.

So, when you are completely done, the all-important IMLangFontLink2::ResetFontMapping method is crucial: it certainly does more than the IMLangFontLink2::ReleaseFont method, which does not free up the resources associated with the synthetic font.

I have had people ask me about it in the past, but I do not tend to think of the IMLangFontLink2::ReleaseFont behavior as a bug, since you may have duplicated that HFONT handle through other means outside of MLang (say in GDI). There is a strict reference counting method employed in MLang, but using that method to determine when to free resources could cause many application bugs.

None of which means you should ignore when the method fails -- it is a good indication that you have a "double free" bug in your code for you to look into fixing!

It does mean that you will definitely want to consider expanding Raymond's advice:

One refinement I did not do was to avoid creating the IMlangFontLink2 pointer each time we want to draw text. In a "real program" you would probably create the multilanguage object once per drawing context (per window, perhaps) and re-use it to avoid going through the whole object creation codepath each time you want to draw a string.

and applying it to the HFONT that you get back from your IMLangFontLink::MapFont call. Otherwise, you could find yourself building up a much bigger MLang font cache than you intended....
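The resource behavior described above can be pictured with a toy model. This is only a sketch of the behavior as this post describes it, not MLang's actual implementation: ToyFontCache and its members are hypothetical names, fonts are identified by a plain string, and handles are plain integers rather than HFONTs. The point it illustrates is that releasing a font drops a reference but leaves the cached entry in place, that releasing it one time too many fails, and that only a reset empties the cache.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Toy model of a reference-counted font cache (hypothetical, not MLang code).
class ToyFontCache {
public:
    // Returns a handle for the described font, reusing a cached entry if any.
    int MapFont(const std::string& description) {
        assert(!description.empty());  // a font needs some description
        for (auto& kv : entries_) {
            if (kv.second.description == description) {
                ++kv.second.refs;
                return kv.first;
            }
        }
        const int handle = nextHandle_++;
        entries_.emplace(handle, Entry{description, 1});
        return handle;
    }

    // Drops one reference. The entry stays cached even at zero references.
    // Returns false for an unknown handle or one that is already fully
    // released, which is the "double free" signal.
    bool ReleaseFont(int handle) {
        auto it = entries_.find(handle);
        if (it == entries_.end() || it->second.refs == 0) return false;
        --it->second.refs;
        return true;
    }

    // Frees every cached entry, whatever its reference count.
    void ResetFontMapping() { entries_.clear(); }

    std::size_t CachedFonts() const { return entries_.size(); }

private:
    struct Entry {
        std::string description;
        int refs;
    };
    std::map<int, Entry> entries_;
    int nextHandle_ = 1;
};
```

Under this model, a cache that lives as long as a window keeps growing with every distinct font that gets mapped, which is why reusing mapped fonts and calling the reset when you are completely done both matter.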

Now it is not covered in too many places, but the advantages of IMLangFontLink2::MapFont over IMLangFontLink::MapFont are clear for our purposes. The former supports font linking for characters that are not in any Microsoft Windows code page. To locate a font for a Unicode character not found in a code page, just

set dwCodePages to zero and
make sure the character passed in (the chSrc parameter) is the Unicode character you want to support

When you do this, all fonts on the system are then searched to locate a font which supports this character. You can obviously make this part of the process take up fewer resources by making sure the font selected into the HDC you pass to IMLangFontLink2::MapFont is a good "first guess" as to what might be the right font. You will want to make sure to not pass a huge font like Arial Unicode MS unless you want MLang to find the glyphs there (such a font would also pass the initial test to see if font linking was even needed).
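To make the "first guess" point concrete, here is a small portable sketch of a fallback search: try the guessed font first, and only walk the full list of installed fonts if the guess does not cover the character. It is an illustration under stated assumptions, not a description of how MLang searches. ToyFont, Covers, SearchResult, and FindFontFor are hypothetical names, and a font's coverage is modeled as a list of inclusive code point ranges.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy font: a name plus inclusive code point ranges it can draw (hypothetical).
struct ToyFont {
    std::string name;
    std::vector<std::pair<char32_t, char32_t>> ranges;
};

// True if any of the font's ranges contains ch.
bool Covers(const ToyFont& font, char32_t ch) {
    for (const auto& r : font.ranges) {
        assert(r.first <= r.second);  // ranges must be well formed
        if (ch >= r.first && ch <= r.second) return true;
    }
    return false;
}

struct SearchResult {
    const ToyFont* font;   // nullptr if no font covers the character
    std::size_t examined;  // how many fonts were checked along the way
};

// Checks the first guess, then falls back to scanning every installed font.
SearchResult FindFontFor(char32_t ch, const ToyFont& firstGuess,
                         const std::vector<ToyFont>& installed) {
    SearchResult result{nullptr, 1};  // the first guess is always examined
    if (Covers(firstGuess, ch)) {
        result.font = &firstGuess;
        return result;
    }
    for (const ToyFont& candidate : installed) {
        ++result.examined;
        if (Covers(candidate, ch)) {
            result.font = &candidate;
            return result;
        }
    }
    return result;
}
```

With a good guess the search stops after one font; with a poor one it has to walk the installed list, and the cost grows with the number of fonts on the system.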

In future posts, I will talk more about consumers of font linking functionality, like GDI+ and Uniscribe.

This post brought to you by "ῴ" (U+1ff4, a.k.a. GREEK SMALL LETTER OMEGA WITH OXIA AND YPOGEGRAMMENI)

# Rick Cameron on 16 May 2005 4:52 PM:

Very timely - as it happens we're working on font linking in code that uses Uniscribe for layout. I think the approach needs to be a bit different when using Uniscribe, compared to Raymond Chen's sample working with TextOut.
BTW, when I say Uniscribe, I mean the low-level API (ScriptShape, ScriptPlace, etc.) not the high-level ScriptString* API.
My feeling is that one would do ScriptShape, then check for missing glyphs, and use MLang to help choose a substitution font, then re-run ScriptShape using the replacement font for the code points that failed the first time.
Any chance you could update the old Uniscribe sample from MSDN Mag (or was it MSJ at the time?) to use MLang for font linking?
Cheers - rick

# Michael S. Kaplan on 16 May 2005 5:35 PM:

Hi Rick, LTNS!

Well, as a sample that's pretty ambitious, whether for me or for MSDN. But I may be able to see something done (here and/or elsewhere) for the various component pieces...

referenced by

2012/01/26 If font linking doesn't fit the text to a T (or ț!), a Romanian letter may be right but not quite look it

2007/02/27 They do it with MLang

2006/06/06 Is this the Über-font post? No, but it is the teaser for it!

2006/03/18 What about logical fonts?

2006/02/15 Mixing MLang and Uniscribe

2006/01/22 Questions about font linking, etc.

2006/01/06 When not to use MLang

2005/06/18 Font substitution and linking #3
