Character Map Plus?

by Michael S. Kaplan, published on 2007/07/09 14:49 -04:00, original URI:

The other day when I wrote We've got a style of glyphs, yes we do; we've got a style of glyphs, how 'bout you?, regular reader Mihai commented:

...Character Map is not consistent.

Select "Angsana New" (or Arial, or "Lucida Sans" or whatever) and using "Character Set: Unicode" you will notice that only the glyphs that exist in the font are shown.

So I guess the expectation is that Character Map does not do any font fallback/linking/substitution.

This is an excellent point; Character Map is really a tool built for displaying fonts, not for displaying other font technologies.

Though as I think about it, wouldn't a Character Map Plus tool that did all of the fallback/linking/substitution and showed you what you could expect to actually GET if you asked for a given font be a really cool idea?

Or maybe it could be an additional checkbox that would expand the view to this much wider one.

I imagine that I would use that one more often than I use the one that is there, to tell the truth.

Anyone else think this would be worth thinking about? :-)


This post brought to you by ઢ (U+0aa2, a.k.a. GUJARATI LETTER DDHA)

# Mike on 9 Jul 2007 5:15 PM:

I'm just stunned that after about 20 years of Windows releases, there isn't a standard system combo box for font selection that shows a WYSIWYG preview of the font appearance ... or if there is, why it isn't exposed anywhere useful in Vista.

And to be quite clear I'm not talking about anything that's only available to those with programmatic access to MSOx.DLL.

# Michael S. Kaplan on 9 Jul 2007 6:16 PM:

Hi Mike,

Yes, I have talked about this before in WYSIWYG font dropdowns; it is a decidedly non-trivial and somewhat resource-intensive problem....

# Marc Durdin on 9 Jul 2007 7:32 PM:

Keyman Desktop and Keyman Developer include a 'character map plus' that shows basically all the Unicode 5.0 characters - although they are not really focused on showing what a font will produce.  Keyman Desktop also avoids the annoying copy/paste that you have to do with Character Map.

Okay, I admit it, that's a product plug...  But I am quite pleased with how the Character Map tool in Keyman Desktop turned out and I use it frequently.

# Mihai on 9 Jul 2007 8:08 PM:

I have a toy CharMapEx application that started as a research in cmap parsing, but now is big enough to be useful. And I am thinking about making it available, maybe even open-source.

But my initial comment was not a hook for me to place a shameless plug, really :-)

# J.J. on 9 Jul 2007 8:08 PM:


I've been lurking on your blog for months now.  As far as I'm concerned, this is the single greatest idea you've come up with since I've been reading.

CharMap does not help my "Unicode lessons" to other devs; in fact, sometimes it hurts more than it helps.  After I explain the "Unicode stack" of UTF-xx, fonts, string normalization, etc., a dev will inevitably point to some result of the fallback/linking/substitution and say "all that makes sense, but (IE|Word|whatever) doesn't behave like you just explained!"  and I have no choice but to wave my hands, say it's a hard problem and that Microsoft has a lot of "black magic" under the hood to "make it work like you expect."  Not reassuring or satisfying.

I mean, come on, what good is "a tool for the display of fonts" if it doesn't accurately represent how every application actually behaves?

So, think you can get it into SP1?    :-)


# Michael S. Kaplan on 9 Jul 2007 9:06 PM:

Given how consistently KeyMan provides what I would want to provide myself if I were allowed to expand the scope of my work, I generally grant Marc carte blanche to point out such things any time they are on-topic.

And any time you want to point out code you are working on, the same principle applies.

I am less forgiving of the CAPS LOCK haters who are unable to avoid a commercial for their "superior layout" and how it revolutionizes society, whether it is on topic or not -- them, I tend to filter these days.

But J.J., this project is not even in the queue to be approved, let alone approved, let alone scheduled. So SP1 is pretty unlikely at this point. :-)

# Lowell Stewart on 10 Jul 2007 1:37 PM:

You might want to check out Linotype's FontExplorer X.  It started out as a Mac program, but their Windows port is in beta.  Both release and beta versions are freely downloadable.

You can pick any font, click the Information button, and (at the Characters tab) see all the characters in the font, split up according to Unicode range.  Plus there's a "Show missing characters" checkbox that (it would seem to me) does something like what you're talking about -- it shows all the characters in the selected Unicode range(s).  But I don't know if it correctly does all the fallback/linking/substitution stuff, or if it's just using some default font for all its "missing" glyphs.

At any rate, it seems like a pretty cool (and free) tool to me.

# John Bates on 10 Jul 2007 10:25 PM:

BabelStone have BabelMap, which is Unicode v5.0 compatible (free for personal use, donation-ware for commercial). They also have BabelPad, which allows you to choose a font for each Unicode character range. Recommended.

# Michael S. Kaplan on 10 Jul 2007 11:46 PM:

These tools, while all being useful, all fail to provide the one thing I was suggesting would be a useful feature.... :-)

# Andrew West on 11 Jul 2007 7:45 AM:

There are a couple of reasons why I personally think that an option to show characters with fallback/substitution/linking applied would not be such a cool idea.

Firstly, I'm sure that I am not the only person who loathes the way Microsoft's font substitution works. When I select a specific font to render my text with then what I really want is the application to respect my choice of font and not act as if it knows better than me and use some different font. For example, in Notepad or Word if I select Code2000 to display some Syriac text it will be displayed with Estrangelo Edessa and there is no way I can get it to use Code2000 even though Code2000 supports Syriac just as well as Estrangelo Edessa does (and even worse in Word the font dropdown box shows Code2000 as the selected font when you click on the text, deceiving users into thinking that they have successfully applied the font to the text when they haven't). If there was a registry setting where the user could define the font to use as a fallback for each script (or if there was an option to disable font substitution at the application level) then it wouldn't be so bad, but as it is, from a user's perspective the font substitution behaviour is unpredictable and often unwanted.

Secondly, font substitution behaviour is application-specific. Microsoft apps such as Word and Notepad do apply font substitution and linking, but many non-Microsoft apps such as and BabelPad do not. Thus if charmap showed characters with fallback/substitution/linking applied this would not necessarily be what you will get if you select the same font in some other application.

Incidentally, when you told us in September that charmap on Vista would still not display anything beyond the BMP and would still only display the very limited list of Unicode blocks (nothing beyond Unicode 2.0 for @%#'$ sake !!!) I was stunned. How hard could it possibly be to update charmap to show everything up to and including Unicode 5.0?

So I think that a far more useful feature for charmap would be the ability to display all currently defined Unicode characters.

# Michiel on 11 Jul 2007 8:13 AM:

Since we're still in the wishlist phase: what about a tool that showed not just WHAT you get, but also WHY?

# Johannes Roessel on 3 Aug 2007 9:58 PM:

A better character map might be something really helpful. I noticed joyfully that in Vista (possibly also in earlier versions of Windows but I kinda skipped 5.1) there are a few options added in the dropdown box »Group by«. But since Word started adding support for formulas I noticed that many characters are encoded in the astral planes, while charmap only displays the BMP. Also, being able to find characters from description or code point (possibly including encodings like UTF-8) may be of value to some people.
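As a rough illustration of the kind of lookup described above (finding a character by name or code point, seeing its UTF-8 bytes, and telling BMP characters from astral ones), here is a minimal Python sketch using only the standard `unicodedata` module. The helper names `find_by_name` and `describe` are made up for this example, and none of this reflects how charmap itself is implemented.

```python
import unicodedata


def find_by_name(name: str) -> str:
    """Return the character with the given Unicode name (raises KeyError if unknown)."""
    return unicodedata.lookup(name)


def describe(ch: str) -> dict:
    """Summarize one character: code point, name, plane, and encoded forms."""
    cp = ord(ch)
    return {
        "code_point": f"U+{cp:04X}",
        "name": unicodedata.name(ch, "<unnamed>"),
        # Plane 0 is the BMP; planes 1 and up are the "astral" planes.
        "plane": cp >> 16,
        "utf8": " ".join(f"{b:02X}" for b in ch.encode("utf-8")),
        # Astral characters need a surrogate pair, i.e. two UTF-16 code units.
        "utf16_units": len(ch.encode("utf-16-le")) // 2,
    }


if __name__ == "__main__":
    for name in ("GUJARATI LETTER DDHA", "MATHEMATICAL SCRIPT CAPITAL A"):
        print(describe(find_by_name(name)))
```

The second name in the demo is one of the mathematical alphanumeric symbols, which live outside the BMP and so would not show up in a BMP-only grid.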

# John on 31 Aug 2009 3:08 PM:

I'm looking for programmatic support for exerting direct control over font fallback, linking, and substitution at the application or, better yet, O/S level.  Do system hooks exist that allow programmers to be heavy-handed in controlling character to glyph index mappings?


