Is that character in the font or isn't it?

by Michael S. Kaplan, published on 2008/07/01 10:01 -04:00, original URI:

Regular reader Yaytay asks over in the Suggestion Box:

How can I find out, from code, (reliably and completely:-) ) which fonts support a given character?

I've tried using GetGlyphIndices, but there are still some fonts that return non-zero values for glyphs they don't have.

My comment from your blog here:

The utility converts a regex into a list of unicode characters (based on their name) and then when you select one of those characters it displays that character in all installed fonts that support the character.

It is based on the GetGlyphIndices function.

Unfortunately some fonts return a non-zero value for a given character, but don't actually support it (displaying the default rectangle when used).

Is there a more reliable way to determine support for a code-point in a font?

When I've got this utility working more reliably I'll make it available if anyone wants it.

Rather than using GetGlyphIndices, I always found myself using GetGlyphOutline, instead.

The bias probably comes from the work I did in MSLU to support GetGlyphOutlineW on Win9x platforms, which I mentioned a few years back in Getting all of the localized names of a font, to date the post that still gets the most U+fffd-filled spam comments....

Though if GetGlyphIndices is mapping code points to the notdef glyph, then GetGlyphOutline might be as well. So this may or may not be a solution.

Perhaps you could take the hint from Getting all of the localized names of a font and grab the CMAP directly.
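To make the "grab the CMAP directly" idea concrete, here is a minimal sketch of a lookup in a cmap format 4 subtable (the common BMP format). It assumes you already have the raw subtable bytes in hand; on Windows the 'cmap' table could be read with GetFontData, but locating the right subtable inside it is not shown. The helper build_format4 is hypothetical and exists only to fabricate a tiny table for the demo.

```python
import struct

def glyph_index_format4(subtable: bytes, code_point: int) -> int:
    """Return the glyph index for a BMP code point, or 0 (.notdef) if unmapped."""
    fmt, _length, _language, seg_count_x2 = struct.unpack_from(">4H", subtable, 0)
    if fmt != 4:
        raise ValueError("this sketch only handles cmap subtable format 4")
    if not 0 <= code_point <= 0xFFFF:
        return 0  # format 4 covers the BMP only
    seg_count = seg_count_x2 // 2
    end_off = 14                              # endCode[] follows the 14-byte header
    start_off = end_off + seg_count_x2 + 2    # skip reservedPad
    delta_off = start_off + seg_count_x2
    range_off = delta_off + seg_count_x2
    for i in range(seg_count):
        (end_code,) = struct.unpack_from(">H", subtable, end_off + 2 * i)
        if code_point > end_code:
            continue
        (start_code,) = struct.unpack_from(">H", subtable, start_off + 2 * i)
        if code_point < start_code:
            return 0  # falls in a gap between segments
        (id_delta,) = struct.unpack_from(">H", subtable, delta_off + 2 * i)
        pos = range_off + 2 * i
        (id_range_offset,) = struct.unpack_from(">H", subtable, pos)
        if id_range_offset == 0:
            return (code_point + id_delta) & 0xFFFF
        # Otherwise the offset is relative to the idRangeOffset entry itself.
        glyph_pos = pos + id_range_offset + 2 * (code_point - start_code)
        (glyph,) = struct.unpack_from(">H", subtable, glyph_pos)
        return (glyph + id_delta) & 0xFFFF if glyph else 0
    return 0

def build_format4(segments):
    """Hypothetical demo helper: pack (start, end, id_delta) segments.

    The binary-search header fields are left as 0 because the lookup
    above does a linear scan; a real font would fill them in.
    """
    segments = list(segments) + [(0xFFFF, 0xFFFF, 1)]  # required final segment
    n = len(segments)
    header = struct.pack(">7H", 4, 16 + 8 * n, 0, 2 * n, 0, 0, 0)
    ends = struct.pack(f">{n}H", *(e for _, e, _ in segments))
    starts = struct.pack(f">{n}H", *(s for s, _, _ in segments))
    deltas = struct.pack(f">{n}H", *(d & 0xFFFF for _, _, d in segments))
    offsets = struct.pack(f">{n}H", *([0] * n))
    return header + ends + b"\x00\x00" + starts + deltas + offsets

# A toy table mapping A..Z to glyphs 1..26.
table = build_format4([(0x41, 0x5A, 1 - 0x41)])
print(glyph_index_format4(table, ord("A")))  # 1
print(glyph_index_format4(table, 0x00C9))    # 0
```

Keep in mind that this only tells you what the font claims to map. A font that maps a code point to its own box-shaped glyph will still come back non-zero here.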

Or even better try by a different route -- via a ScriptGetCMap call? Though this could get expensive across all of Unicode, across all fonts.

But the PSDK topic does describe how to deal with the default glyph (aka the NOTDEF glyph), so at least whoever designed the function has given things some thought:

This function can be used to determine the characters in a run that are supported by the selected font. The application can scan the retrieved glyph buffer, looking for the default glyph to determine characters that are not available. The application should determine the default glyph index for the selected font by calling ScriptGetFontProperties.

The return value for this function indicates the presence of any missing glyphs.

Some code points can be rendered by a combination of glyphs, as well as by a single glyph, for example, 00C9; LATIN CAPITAL LETTER E WITH ACUTE. In this case, if the font supports the capital E glyph and the acute glyph, but not a single glyph for 00C9, ScriptGetCMap shows that 00C9 is unsupported. To determine the font support for a string that contains these kinds of code points, the application can call ScriptShape. If the function returns S_OK, the application should check the output for missing glyphs.
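The logic in that description can be modeled without touching Uniscribe at all. The sketch below is only a stand-in: cmap is a hypothetical dict from code point to glyph index (playing the role of the glyph buffer that ScriptGetCMap fills in), default_glyph plays the role of the value ScriptGetFontProperties would report, and canonical decomposition via unicodedata is a rough approximation of the question ScriptShape actually answers.

```python
import unicodedata

def missing_characters(text, cmap, default_glyph=0):
    """Scan the per-character glyphs and collect characters that hit the default glyph."""
    glyphs = [cmap.get(ord(ch), default_glyph) for ch in text]
    return [ch for ch, g in zip(text, glyphs) if g == default_glyph]

def coverable_by_decomposition(ch, cmap, default_glyph=0):
    """True if every code point in ch's canonical decomposition has a real glyph.

    This is an assumption-laden approximation: a real shaping engine,
    not this check, decides whether the pieces can be combined.
    """
    parts = unicodedata.normalize("NFD", ch)
    return all(cmap.get(ord(p), default_glyph) != default_glyph for p in parts)

# Toy font: has E and COMBINING ACUTE ACCENT, but no precomposed U+00C9.
toy_cmap = {0x0045: 40, 0x0301: 90}
print(missing_characters("\u00c9E", toy_cmap) == ["\u00c9"])  # True
print(coverable_by_decomposition("\u00c9", toy_cmap))         # True
```

So a character-by-character scan reports U+00C9 as missing, while the second check suggests it is worth handing the string to the shaping engine before giving up on the font.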

Kind of gives a roadmap to how to think about the problem, and inspires some confidence that they are on the right track. :-)


This post brought to you by (U+fffd, a.k.a. REPLACEMENT CHARACTER)

Andrew West on 1 Jul 2008 11:27 AM:

If GetGlyphIndices is returning a non-zero glyph index then the font must have a glyph for the character, and grabbing the CMAP directly will not help.

I think the problem is that some fonts define individual square box glyphs for some characters that are in the CMAP but that aren't supported -- although the resultant glyph looks like the NOTDEF glyph it isn't. The only way round this is to get the glyph outline for the character you are interested in, and compare it with the glyph outline for the NOTDEF glyph -- and if they are the same then you can exclude that character for that font ... assuming that there are no subtle differences between the real NOTDEF glyph and the fake ones.
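The comparison described in this comment could be sketched roughly as follows. Everything here is illustrative: outlines is a hypothetical dict from glyph index to raw outline bytes (on Windows these might come from GetGlyphOutlineW with GGO_NATIVE | GGO_GLYPH_INDEX), cmap is a hypothetical code point to glyph index mapping, and glyph 0 is assumed to be NOTDEF.

```python
def fake_notdef_glyphs(outlines, notdef_index=0):
    """Glyph indices whose outline bytes are identical to the NOTDEF outline."""
    reference = outlines[notdef_index]
    return {g for g, data in outlines.items()
            if g != notdef_index and data == reference}

def really_supported(code_point, cmap, outlines, notdef_index=0):
    """True if the code point maps to a glyph that is not a NOTDEF look-alike."""
    glyph = cmap.get(code_point, notdef_index)
    if glyph == notdef_index:
        return False  # not in the cmap at all
    # Byte-exact comparison: a box that differs even slightly will slip through.
    return outlines[glyph] != outlines[notdef_index]

# Toy data: glyph 2 is a copy of the NOTDEF box under a different index.
toy_outlines = {0: b"box", 1: b"letter-A", 2: b"box"}
toy_cmap = {0x41: 1, 0x4E00: 2}
print(really_supported(0x41, toy_cmap, toy_outlines))    # True
print(really_supported(0x4E00, toy_cmap, toy_outlines))  # False
```

As the comment warns, an exact match is the weak point; catching boxes with subtle differences would need some fuzzier comparison of the outline data.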

Tihiy on 1 Jul 2008 2:30 PM:

GetGlyphIndicesW does exactly what KB241020 does. Are you the author of KB241020?

Michael S. Kaplan on 1 Jul 2008 3:51 PM:

I am not the author of 241020, no. Though I implemented a cleaned up version of the code there for MSLU's GetGlyphOutlineW wrapper to do its work....

Yaytay on 14 Nov 2009 4:28 AM:

I said I'd make it available, it's just taken me a little while to get around to it.

If anyone is interested in a tool for finding which fonts contain a non-default glyph for a given code point have a look:


referenced by

2009/09/24 Can I show the whole string or can't I?
