WYSIWYG font dropdowns

by Michael S. Kaplan, published on 2006/06/28 12:18 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2006/06/28/649791.aspx


So, the question that Bill asked was clear:

I have a WYSIWYG font dropdown in my toolbar. Some international fonts (like Estrangelo Edessa) show up as squares, apparently because they don’t support all Unicode ranges. I need to detect these fonts and display them in the WYSIWYG list using Tahoma (like Office does). What’s the best solution?

My app is a managed app but I’m assuming I’ll need to use native interop to solve this problem.

Thanks,

Bill

The list he is referring to is like the one you see in Word's font dropdown:

Or even in Outlook's mail message dropdown:

So, the real question here is even simpler: "How can one determine if a font supports the characters in a string?"

The easiest way is via the GetGlyphIndices function, whose documentation opens with this description:

The GetGlyphIndices function translates a string into an array of glyph indices. The function can be used to determine whether a glyph exists in a font.

So you can pass the name you were going to use to the function (making sure to pass the GGI_MARK_NONEXISTING_GLYPHS flag), and then look at the glyph indices for the 0xffff marker.

If it shows up for any of the characters, then you know you should use some other font.
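A minimal sketch of that check, in Python. The helper names (`utf16_length`, `has_missing_glyph`, `pick_display_font`, `query_glyph_indices`) are invented for this illustration, and Tahoma as the fallback simply follows Bill's question. The only real API involved is `GetGlyphIndicesW` from gdi32, reached through `ctypes`; that part runs on Windows only and has minimal error handling, so treat it as a starting point rather than a finished implementation.

```python
import ctypes
import sys

GGI_MARK_NONEXISTING_GLYPHS = 0x0001  # flag value from wingdi.h
GDI_ERROR = 0xFFFFFFFF                # failure return of GetGlyphIndicesW
MISSING_GLYPH = 0xFFFF                # marker for characters the font lacks
DEFAULT_CHARSET = 1


def utf16_length(text):
    """Number of UTF-16 code units, which is how GDI counts characters."""
    return len(text.encode("utf-16-le")) // 2


def has_missing_glyph(glyph_indices):
    """True if any entry is the 0xFFFF 'no such glyph' marker."""
    return any(index == MISSING_GLYPH for index in glyph_indices)


def pick_display_font(face_name, glyph_indices, fallback="Tahoma"):
    """Use the font itself unless it cannot draw part of the text."""
    return fallback if has_missing_glyph(glyph_indices) else face_name


def query_glyph_indices(face_name, text):
    """Windows only: glyph indices of `text` in the font named `face_name`."""
    if sys.platform != "win32":
        raise OSError("GetGlyphIndicesW is only available on Windows")
    from ctypes import wintypes  # imported here so the module loads anywhere

    gdi32 = ctypes.WinDLL("gdi32")
    gdi32.CreateCompatibleDC.restype = wintypes.HDC
    gdi32.CreateCompatibleDC.argtypes = [wintypes.HDC]
    gdi32.CreateFontW.restype = wintypes.HFONT
    gdi32.CreateFontW.argtypes = (
        [ctypes.c_int] * 5 + [wintypes.DWORD] * 8 + [wintypes.LPCWSTR])
    gdi32.SelectObject.restype = wintypes.HGDIOBJ
    gdi32.SelectObject.argtypes = [wintypes.HDC, wintypes.HGDIOBJ]
    gdi32.GetGlyphIndicesW.restype = wintypes.DWORD
    gdi32.GetGlyphIndicesW.argtypes = [
        wintypes.HDC, wintypes.LPCWSTR, ctypes.c_int,
        ctypes.POINTER(wintypes.WORD), wintypes.DWORD]
    gdi32.DeleteObject.argtypes = [wintypes.HGDIOBJ]
    gdi32.DeleteDC.argtypes = [wintypes.HDC]

    count = utf16_length(text)
    glyphs = (wintypes.WORD * count)()
    hdc = gdi32.CreateCompatibleDC(None)
    font = gdi32.CreateFontW(0, 0, 0, 0, 0, 0, 0, 0, DEFAULT_CHARSET,
                             0, 0, 0, 0, face_name)
    old = gdi32.SelectObject(hdc, font)
    try:
        result = gdi32.GetGlyphIndicesW(hdc, text, count, glyphs,
                                        GGI_MARK_NONEXISTING_GLYPHS)
        if result == GDI_ERROR:
            raise OSError("GetGlyphIndicesW failed")
        return list(glyphs)
    finally:
        gdi32.SelectObject(hdc, old)
        gdi32.DeleteObject(font)
        gdi32.DeleteDC(hdc)
```

On Windows, something like `pick_display_font(name, query_glyph_indices(name, name))` would then choose the font used to draw each entry. The sketch assumes `name` comes from font enumeration; for a face that is not installed, the GDI font mapper quietly substitutes another font and the answer describes that substitute instead.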

Now the Office dropdown has a few more tricks up its sleeve that I scrolled to avoid, so I should probably mention them now....

If you look, some of the fonts have some additional stuff next to the names:

All of the Hebrew fonts have the first eight letters of the Hebrew alphabet (U+05d0 to U+05d7), and all of the symbol fonts have a sampling of the symbols. And this extra feature is done even when there is a fallback done for the font name itself (due to the letters in the name not being supported).
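To make the layout concrete, here is a small Python sketch of how one entry might be split into runs. `dropdown_runs` and its parameters are hypothetical names for this example, not anything Office is known to use; the point is only that the name may be drawn in the fallback font while the sample is still drawn in the font being listed.

```python
# The first eight Hebrew letters, U+05D0 through U+05D7.
HEBREW_SAMPLE = "".join(chr(cp) for cp in range(0x05D0, 0x05D8))


def dropdown_runs(font_name, name_is_renderable, sample="", fallback="Tahoma"):
    """Return (text, font) pairs to draw for one dropdown entry.

    Hypothetical helper: the name uses the fallback font when the listed
    font cannot draw it, and the sample always uses the listed font.
    """
    name_font = font_name if name_is_renderable else fallback
    runs = [(font_name, name_font)]
    if sample:
        runs.append((sample, font_name))
    return runs
```

The caller would decide `name_is_renderable` with whatever glyph check it already uses for the font name.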

Incidentally, Arabic fonts do the same sort of thing that the Hebrew fonts do....

Now it is easy enough to detect a symbol font, but figuring out that a font is specifically a Hebrew or an Arabic font (e.g. Miriam Fixed or Arabic Transparent) rather than just a font that contains Hebrew or Arabic (e.g. Arial Unicode MS) is a bit more complicated.

I have no idea how they actually do it -- perhaps they are calling the GetFontUnicodeRanges function, perhaps they are grabbing the FONTSIGNATURE-like info that is inside the font and using it to find out. Though of course this is the kind of feature that one could quickly go wild with in showing the characters of many different language-specific fonts, from Tibetan to Sinhalese.
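Purely as a guess at what the FONTSIGNATURE route could look like, here is a Python sketch that works on the four 32-bit values of `fsUsb`. The bit numbers for Hebrew (11) and Arabic (13) follow the Unicode subset bitfield numbering; everything else, including the function names and the "few claimed ranges" threshold, is an assumption made up for this example and not a description of what Office does.

```python
# Bit positions in FONTSIGNATURE.fsUsb (same numbering as the OpenType
# OS/2 ulUnicodeRange fields). Only the bits needed here are named.
USB_BASIC_LATIN = 0
USB_HEBREW = 11
USB_ARABIC = 13


def usb_bit_set(fs_usb, bit):
    """fs_usb is a sequence of four 32-bit integers, lowest bits first."""
    return bool((fs_usb[bit // 32] >> (bit % 32)) & 1)


def count_claimed_ranges(fs_usb):
    """Count subrange bits, ignoring the two reserved top bits (126, 127)."""
    words = list(fs_usb[:3]) + [fs_usb[3] & 0x3FFFFFFF]
    return sum(bin(word & 0xFFFFFFFF).count("1") for word in words)


def looks_script_specific(fs_usb, script_bit, max_ranges=6):
    """Hypothetical heuristic: the font claims the script and little else.

    max_ranges is an arbitrary threshold chosen for illustration only.
    """
    return (usb_bit_set(fs_usb, script_bit)
            and count_claimed_ranges(fs_usb) <= max_ranges)
```

With synthetic signatures (not values read from real fonts), `[(1 << 0) | (1 << 11), 0, 0, 0x80000000]` would count as Hebrew-specific, while a signature with dozens of bits set would not, even though its Hebrew bit is on. The bits are whatever the font creator declared, so a check like this is only as good as the font's own metadata.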

Paradoxically, the only "boring" fonts on the list would be ones that support multiple scripts!

At some point, there are also performance concerns, plus there are people who find the list somewhat visually distracting (not to mention the list as it exists in Word has an obvious bias for the myriad of ways one might display the Latin script!)....

 

This post brought to you by ܨ (U+0728, a.k.a. SYRIAC LETTER SADHE)
(A character that is well-supported by Estrangelo Edessa!)


# Mihai on 28 Jun 2006 1:43 PM:

But you should be aware of the limitations

- GetFontUnicodeRanges and FONTSIGNATURE give generic blocks that are supported, but that does not mean all the characters in a block are supported ("granularity"). So you might still get rectangles.
Plus this is info set by the font creator, and it might or might not match reality.

- GetGlyphIndices does not handle chars outside BMP (true in XP SP2 and an older Vista build)

Depending on what is needed, one of the solutions might be good enough. If not, one can always directly parse the cmap table of the font.

# Michael S. Kaplan on 28 Jun 2006 2:00 PM:

Ah, but you still have to do the other work, I was just answering the "how can you tell a ____ font from a font that happens to support ____" question. :-)

# Sebastian on 28 Jun 2006 5:30 PM:

The callback function EnumFontFamExProc from EnumFontFamiliesEx tells you the charset of a font. So you can easily find out if a font is Hebrew or Arabic.

function EnumFontFamExProc(
  const AEnumLogFontEx: TEnumLogFontEx;        // logical-font data
  const ATextMetricEx : TTextMetric;           // physical-font data
  FontType: DWORD;                             // type of font
  FontCombo: TFontComboBox): Integer; stdcall; // application-defined data
begin
  //...
  if (AEnumLogFontEx.elfLogFont.lfCharSet = HEBREW_CHARSET) or
     (AEnumLogFontEx.elfLogFont.lfCharSet = ARABIC_CHARSET) then
  begin
    // Hebrew or Arabic font
  end;
  Result := 1; // a nonzero return value continues the enumeration
end;

# Michael S. Kaplan on 28 Jun 2006 5:58 PM:

Hi Sebastian,

Yes, but doesn't a font that claims to support multiple scripts show up for more than one of these? I mean, if you enumerate all of the Arabic fonts, won't Tahoma, Microsoft Sans Serif, and Arial Unicode MS all be there, since they all claim to support Arabic?

# Daniel Yacob on 28 Jun 2006 8:10 PM:

Perhaps the in-script example text should be put into locale data?

While speaking of font menus... I've always wished that in the "Windows\Fonts" folders we would add sub-folders to organize our fonts, which would then map to step-out sub-menus in the font dropdowns. This should be really useful when you have hundreds of fonts, saves scrolling...

The "Start" menu has a similar relationship with the file system. I'm certain this isn't a new idea; has this been tried at MS before and been deemed not viable? Why?

# Michael S. Kaplan on 28 Jun 2006 10:37 PM:

Hey Daniel!

That is an interesting idea... even better if it had wider usage than just fonts. People have talked a lot lately about exemplar characters; is there another usage for sample exemplar characters?

The submenu idea is tougher since the mechanisms for enumeration are not as clear, and probably would not be backward compatible. And for the average user it would be too complicated to even use, even ignoring the difficulty on the developer side....

The older start menu has had lots of usability problems (the Vista version has gotten some better usability feedback). The key would be deciding how to make that apply to fonts, perhaps?

# Aaron Ballman on 2 Jul 2006 5:56 PM:

For determining whether the font is a symbol font or not, would it make sense to look at the PANOSE numbers using GetOutlineTextMetrics? If the bFamilyType is PAN_FAMILY_PICTORIAL, that would mean it's a symbol font, correct?

# Michael S. Kaplan on 2 Jul 2006 8:25 PM:

Hi Aaron,

I think that a PANOSE family type of PAN_FAMILY_PICTORIAL is the easiest way, but I am not 100% sure that all fonts follow the convention. It may also be worthwhile to look at other secondary indications to verify, like the exclusive use of the particular piece of the PUA that symbol fonts use, just to be sure?

# Aaron Ballman on 2 Jul 2006 8:57 PM:

I'm not certain I'm familiar with how to get that secondary information. By the looks of things, you're right that not all fonts follow the convention. For instance, the Symbol font is correct, but Marlett returns PAN_ANY instead of PAN_FAMILY_PICTORIAL.

What would be a good alternative to double-check?

# Michael S. Kaplan on 2 Jul 2006 11:24 PM:

Well, there is the GetGlyphIndices function that the post talks about that you can use to see if the font supports a given character? In this case you are looking at that symbol range....

# Aaron Ballman on 3 Jul 2006 2:09 AM:

True -- that's what I'm using to tell whether the text can be displayed with the given font.  Now I'm just trying to mimic the symbol font behavior you described later in your post.

Another way which may work is to use the GetTextCharset API and check the return value to see if it's SYMBOL_CHARSET. Or do you think that falls prey to the same problem as the PANOSE attempt?

Thanks for the information!

