by Michael S. Kaplan, published on 2007/12/12 10:01 -05:00, original URI: http://blogs.msdn.com/b/michkap/archive/2007/12/12/6742301.aspx
Over in the Suggestion Box, Doug asks:
Your posts include a "This post brought to you by" character at the bottom of the post.
Often the characters are displayed, but sometimes they just show as a rectangle. For example U+267f on a recent post did not display.
Do you have (one or more of)
- a method to diagnose the cause
- links to the fine manual pages on this topic
- the appropriate search engine keywords to find the answer (it would probably return your previous posts)
No need to diagnose too much, as the cause is very well known, always:
It means that an appropriate font to display the character was not being used.
Of course it does not exactly end there, because there are several possible reasons that such a thing could happen:
I mean, it is possible that no font on your machine supports the character in question.
Or it is possible that the underlying system(s) that look for a fallback font when the selected font lacks a character could not find one that contains it.
The fact that I often choose characters off the beaten path does not help either possibility!
In any case, the end result is a notdef glyph (discussed previously here).
One of the main reasons that I tend to provide links from code point values to http://www.fileformat.info/ is that you have a decent chance of being able to view the characters that way! :-)
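For anyone who wants to look up a character that shows as a rectangle, here is a minimal sketch in Python of how to go from the character to its code point, name, and a lookup link. The function name `describe` is made up for this example, and the URL pattern is an assumption based on how fileformat.info character pages are usually addressed; it is not something the site guarantees.

```python
import unicodedata


def describe(char):
    """Return basic facts about a single character (hypothetical helper)."""
    cp = ord(char)  # raises TypeError unless char is exactly one character
    return {
        "code_point": f"U+{cp:04X}",
        # The name comes from Python's bundled Unicode database, which may
        # be older than the character; fall back to a placeholder if so.
        "name": unicodedata.name(char, "<no name in this Unicode database>"),
        # Plane 0 is the BMP; anything higher is a supplementary plane.
        "plane": cp >> 16,
        # Assumed URL pattern, see the note above.
        "url": f"http://www.fileformat.info/info/unicode/char/{cp:04x}/index.htm",
    }


if __name__ == "__main__":
    for key, value in describe("\u267f").items():
        print(f"{key}: {value}")
```

Pasting the rectangle from the page into `describe` tells you which character you are missing, which is usually the most useful search keyword of all. It does not tell you which font would display it.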
I had previously toyed with doing font embedding to allow the viewing, but the fact that it would be limited to specific platforms and browsers kind of soured me on the idea. I did write up the code to do the building automatically, and was pretty excited about the idea at the time. The code got re-purposed into a sample and became that whole "Rhymes with Amharic" series I wrote earlier this year.
The other idea I toyed with, and even wrote code for, was to create images containing the characters. It is a perpetual project I keep putting off: take that code and combine it with a WLW add-in that would let me automatically generate the image for any chosen character and embed the image in the page (while also putting the character itself on the page to make sure clipboard operations were still possible!), to keep people from doing too much hot-linking.
The code moved from GDI+ to GDI/Uniscribe, then briefly to WPF, before heading over to the managed TextRenderer. You know the type -- code that never stands completely still. :-)
Again.... most of the code is written, I just have to take the time to finish it (and there always seems to be something else to do!)....
When I finish, it might be fun to put the whole thing up as a sample and show how I did it. There are lots of cool technologies in there that others might find useful in their web applications....
This post brought to you by 𪘀 (U+2fa1d, aka CJK COMPATIBILITY IDEOGRAPH-2FA1D, stoa U+2a600)
# John Cowan on 12 Dec 2007 4:50 PM:
Sometimes, however, what shows is U+xxxx. I suppose this is deliberate on your part for inherently invisible characters?
# Michael S. Kaplan on 12 Dec 2007 4:58 PM:
Yes, I used to include them, usually between quotes, but it seemed kind of silly to let people see un-seeable things! :-)
# Andrew Cook on 14 Dec 2007 1:38 PM:
A new feature coming out in Firefox 3 early next year will hopefully make this slightly less confusing.
At present, when Firefox 2 encounters a character it cannot find a glyph for, it will display either a question mark or the notdef glyph.
Firefox 3 renders a notdef glyph box, but embeds within it the hexadecimal representation of the Unicode codepoint that the unrenderable character lives at. If the codepoint is on the BMP, it's rendered as four hex digits in a square, going L->R T->B. If the codepoint is on another plane, it's rendered as six hex digits in a box three digits wide, two digits tall, L->R T->B.
I don't quite know how this works with base and combining characters...
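The layout rule described in this comment can be sketched in a few lines of Python. This only illustrates the rule as stated above (four hex digits in a 2 x 2 square for the BMP, six hex digits in a box three wide and two tall otherwise); it is not Firefox's code, and the function name `hexbox_rows` is made up for the example.

```python
def hexbox_rows(code_point):
    """Return the rows of hex digits, top to bottom, for the box described above."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("not a Unicode code point")
    if code_point <= 0xFFFF:
        # BMP: four hex digits, two per row.
        digits, width = f"{code_point:04X}", 2
    else:
        # Other planes: six hex digits, three per row.
        digits, width = f"{code_point:06X}", 3
    # Fill left to right, then top to bottom.
    return [digits[i:i + width] for i in range(0, len(digits), width)]


if __name__ == "__main__":
    for cp in (0x267F, 0x2FA1D):
        print("\n".join(hexbox_rows(cp)))
        print()
```

For U+267F the rows are `26` over `7F`; for U+2FA1D they are `02F` over `A1D`.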
# Michael S. Kaplan on 14 Dec 2007 5:40 PM:
I shudder to think about the number of people who have never even heard of a code point seeing this exciting feature in random web pages (most of whom will probably call Microsoft support with their questions about this new behavior!)....
# Jan Kučera on 18 Dec 2007 3:58 AM:
Okay Michael, so do _you_ see all the characters which sponsor your posts? If so, what font do you use? :)
# Michael S. Kaplan on 18 Dec 2007 4:24 AM:
Me? Almost always, though I am using all kinds of different fonts to do so....
# Jan Kučera on 18 Dec 2007 6:24 AM:
Oh that's the trick :-) If you can change the sponsor character font on the web, you could adjust it to the font in which you can see the character. If the reader does not have that font, it would then fall back to the current one.
Though, this won't improve the poor experience in Pocket Internet Explorer I guess... your sponsors really should talk to the WM team. ;-)
# Michael S. Kaplan on 18 Dec 2007 11:02 AM:
Consider it a test of modern browsers, at least until I finish up that image code. :-)
# Tanveer Badar on 24 Dec 2007 10:41 AM:
(most of whom will probably call Microsoft support with their questions about this new behavior!)