More than you ever wanted to know about CP_SYMBOL

by Michael S. Kaplan, published on 2005/11/08 15:23 -05:00, original URI: http://blogs.msdn.com/b/michkap/archive/2005/11/08/490495.aspx


If you look at the Platform SDK topics for both WideCharToMultiByte and MultiByteToWideChar, you will notice one very interesting code page entry:

CP_SYMBOL Windows 2000/XP: Symbol code page (42)

Of course if you go up to the reference information on the various Windows code pages you will not see this code page defined, so how is one to know what it is converting to and from?

Well, I'll tell you....

Basically, everything from 0x00 - 0x1f maps to U+0000 - U+001f.

Then, everything from 0x20 - 0xff maps to U+f020 - U+f0ff (in the Unicode private use area).

And if you pass anything outside of those ranges to WideCharToMultiByte, the conversion fails and GetLastError will return ERROR_NO_UNICODE_TRANSLATION.
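The mapping described above is simple enough to sketch in a few lines. This is only an illustration of the ranges given in this post, not the actual Windows implementation; the function names are made up for the example, and a Python `ValueError` stands in for the failure that the real API reports through GetLastError.

```python
# A sketch of the CP_SYMBOL mapping as described in this post.
# Function names are hypothetical; this does not call any Windows API.

def symbol_byte_to_code_point(b: int) -> int:
    """Model MultiByteToWideChar(CP_SYMBOL, ...) for a single byte."""
    if not 0x00 <= b <= 0xFF:
        raise ValueError("not a single byte: %r" % (b,))
    # 0x00 - 0x1f map straight through; 0x20 - 0xff land in the private use area.
    return b if b < 0x20 else 0xF000 + b


def code_point_to_symbol_byte(cp: int) -> int:
    """Model WideCharToMultiByte(CP_SYMBOL, ...) for a single code point."""
    if 0x0000 <= cp <= 0x001F:
        return cp
    if 0xF020 <= cp <= 0xF0FF:
        return cp - 0xF000
    # Assumption: ValueError stands in for ERROR_NO_UNICODE_TRANSLATION here.
    raise ValueError("U+%04X has no CP_SYMBOL translation" % cp)
```

Note that under this model an ordinary U+0041 does not convert at all; only U+f041 maps back to the byte 0x41.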

Now the original goal of this pseudo-codepage was to work with GDI's ExtTextOut and symbol fonts, but unfortunately GDI maps by a different scheme and will accept U+0020 - U+00ff. Which means in practice that the two technologies simply do not work well together. There was even a bug or two related to processes that innocently assumed that symbol fonts and symbol code pages worked together properly and would end up with bad records in EMF/WMF files.
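To make the mismatch concrete: text converted through the symbol code page ends up at U+f020 - U+f0ff, while the GDI scheme described above works with U+0020 - U+00ff. A caller who wanted to bridge the two would have to shift the code points itself. The helper below is a hypothetical sketch of that shift (its name is invented for illustration and it is not part of any Windows API).

```python
# Hypothetical helper: fold the private use range produced by the symbol
# code page (U+F020 - U+F0FF) down to the U+0020 - U+00FF range.
# Characters outside that private use range are left untouched.

def fold_symbol_pua_to_low_range(text: str) -> str:
    folded = []
    for ch in text:
        cp = ord(ch)
        if 0xF020 <= cp <= 0xF0FF:
            cp -= 0xF000
        folded.append(chr(cp))
    return "".join(folded)
```

For example, the string U+f041 U+f0ff would become U+0041 U+00ff, while control characters such as U+0009 pass through unchanged.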

It is too late to change the code page, and might be too late to change the core GDI behavior, so this CP_SYMBOL code page remains as a monument to features that no one can use, but it is not worth taking out just in case someone is using it....

 

This post brought to you by "" (U+f0ff, a.k.a. one of the many <Private Use> characters in Unicode)


# Robert on 8 Nov 2005 7:28 PM:

In Windows XP it seems that GDI accepts character codes from both ranges 0x20-0xff and 0xf020-0xf0ff for symbol fonts.

# Michael S. Kaplan on 8 Nov 2005 7:41 PM:

You are correct, Robert (I am sorry I implied otherwise). The problem is that we could not do that in code pages; we have to choose one or the other.

# Vorn on 8 Nov 2005 11:13 PM:

Linked image shows the two symbols available in 0xf0ff on my Mac. The first is from Webdings, the second is from Wingdings. Note the little TM in the bottom right of the latter. :)

Vorn

# Michael S. Kaplan on 9 Nov 2005 8:00 AM:

Of course, Vorn! Those are symbol fonts. Not everything on the font side behaves differently (and it's not like NLS just made up the idea of this pseudo code page)....

# Vorn on 12 Nov 2005 7:43 AM:

Yeah, I know. But I showed it because I found it funny that my Mac has a font with the Windows symbol in it. :)

Vorn

# Michael S. Kaplan on 12 Nov 2005 9:27 AM:

Heh, that is ironic.

Or maybe not. MS and Apple were once very involved on even more fronts than today in regard to fonts and TrueType....

# Phil Harvey on 21 Jan 2010 5:07 AM:

Thank you very much for posting this!!!

I searched everywhere and couldn't find a table anywhere else to convert symbol characters to Unicode.

# Yuhong Bao on 9 Feb 2010 12:43 AM:

"Or maybe not. MS and Apple were once very involved on even more fronts than today in regard to fonts and TrueType.... "

Yep, they for example developed TrueType together.


referenced by

2008/09/14 Johab to be kidding me!

2007/08/30 The main criteria in determing whether a code page sucks? Suckage, of course!

2007/03/21 Warning: when private is used in public, it can really suck

2006/12/03 Strangely Symbolic font issues

2006/04/27 'Doctor, it hurts when I do this.' Well, don't do that!
