Can I get your [font]signature on this, please?

by Michael S. Kaplan, published on 2008/10/05 12:31 -04:00, original URI:

Now regular readers know I have talked about the FONTSIGNATURE structure and the LOCALESIGNATURE structure in the past, starting way back in 2005 in blogs like Parameter confusion #2a and It isn't a FONTSIGNATURE, darn it!, both of which refer to the confusion between the two structures and some of the consequences thereof.

At the same time many offhand observations were made, though without too much follow-up.

Time to dig in a little deeper!

To start, the FONTSIGNATURE structure looks like this:

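typedef struct tagFONTSIGNATURE {
    DWORD fsUsb[4];   // 128-bit Unicode subset bitfield
    DWORD fsCsb[2];   // 64-bit code page bitfield
} FONTSIGNATURE, *PFONTSIGNATURE;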

And you can compare this to some bits of the OS/2 & Windows Metrics table in a TrueType font, which contains (among other things) a UnicodeRange and a CodePageRange:


ULONG ulUnicodeRange1 Bits 0-31
ULONG ulUnicodeRange2 Bits 32-63
ULONG ulUnicodeRange3 Bits 64-95
ULONG ulUnicodeRange4 Bits 96-127


ULONG ulCodePageRange1 Bits 0-31
ULONG ulCodePageRange2 Bits 32-63


Now first of all, though both contain the same amount of information, the layout clearly isn't the same, exactly.

There are also a few differences in the bits, as I pointed out in Running a bit short on space?.
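
Whichever side the bits come from, testing one of them is the same arithmetic: pick the DWORD, then pick the bit inside it. Here is a minimal sketch of that, assuming fsUsb[0] carries bits 0-31 in the same way ulUnicodeRange1 does. The helper names are made up for illustration, and uint32_t stands in for DWORD so that it compiles without windows.h:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if bit `bit` (0-127) is set in a 128-bit
   Unicode subset bitfield stored as four 32-bit values, lowest bits first
   (the fsUsb / lsUsb layout, with uint32_t standing in for DWORD). */
int usb_bit_is_set(const uint32_t usb[4], unsigned bit)
{
    assert(bit < 128);
    return (int)((usb[bit / 32] >> (bit % 32)) & 1u);
}

/* Hypothetical helper: the same test for the 64-bit code page bitfield
   (the fsCsb / lsCsbDefault / lsCsbSupported layout). */
int csb_bit_is_set(const uint32_t csb[2], unsigned bit)
{
    assert(bit < 64);
    return (int)((csb[bit / 32] >> (bit % 32)) & 1u);
}
```

So bit 35 of the Unicode subset bitfield lives in the second DWORD at position 3, and bit 62 of the code page bitfield lives in the second DWORD at position 30.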

The fact that the layout isn't identical helps put differences between the FONTSIGNATURE structure and the LOCALESIGNATURE structure in an interesting light. The LOCALESIGNATURE structure looks like this:

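typedef struct tagLOCALESIGNATURE {
    DWORD lsUsb[4];            // Unicode subset bitfield, same shape as fsUsb
    DWORD lsCsbDefault[2];     // default code pages for the locale
    DWORD lsCsbSupported[2];   // all code pages the locale supports
} LOCALESIGNATURE, *PLOCALESIGNATURE;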

I mean, if I were building one structure that was a subset of another, I'd have probably ordered them differently -- I would have put lsUsb followed by lsCsbSupported, since those are the two members that match fsUsb and fsCsb.

But remembering that the data is coming from the Typography team, where these two members are not even next to each other at all, it makes the connection between the two structures perhaps a bit less structural and a bit more conceptual.
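
To make the conceptual connection concrete, here is a rough sketch of the kind of comparison the matching members invite. Everything in it is an assumption for illustration: the struct and function names are hypothetical stand-ins (with uint32_t in place of DWORD so it compiles without windows.h), and masking off bits 123-127 as "not a subrange" is my reading of the layout-progress note rather than anything the structures enforce:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins that mirror the two Windows structures. */
typedef struct {
    uint32_t usb[4];            /* fsUsb */
    uint32_t csb[2];            /* fsCsb */
} FontSigSketch;

typedef struct {
    uint32_t usb[4];            /* lsUsb */
    uint32_t csb_default[2];    /* lsCsbDefault */
    uint32_t csb_supported[2];  /* lsCsbSupported */
} LocaleSigSketch;

/* Nonzero if the font claims every code page the locale lists as a default.
   Note that a locale with no default bits set passes trivially. */
int font_covers_locale_defaults(const FontSigSketch *font,
                                const LocaleSigSketch *loc)
{
    assert(font && loc);
    for (int i = 0; i < 2; i++) {
        if ((font->csb[i] & loc->csb_default[i]) != loc->csb_default[i])
            return 0;
    }
    return 1;
}

/* Counts the Unicode subranges claimed by both the font and the locale.
   Assumption: bits 123-127 are not subranges, so they are masked off. */
int shared_unicode_subranges(const FontSigSketch *font,
                             const LocaleSigSketch *loc)
{
    assert(font && loc);
    int count = 0;
    for (int i = 0; i < 4; i++) {
        uint32_t both = font->usb[i] & loc->usb[i];
        if (i == 3)
            both &= 0x07FFFFFFu;   /* keep bits 96-122 only */
        while (both) {
            both &= both - 1;      /* clear the lowest set bit */
            count++;
        }
    }
    return count;
}
```

A font fallback scheme could rank candidate fonts by something like the second function after filtering with the first, though the result is only as good as the data in both signatures.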

But that is a side issue.

The fact that the actual bits have fallen out of sync between the list in the OpenType spec and the Unicode Subset Bitfields in the PSDK is also kind of a side issue. I wish the latter would be removed and replaced with a pointer to the former, though leaving in the note about bits 123, 124, and 125:

123 Windows 2000 or later: Layout progress, horizontal from right to left
124 Windows 2000 or later: Layout progress, vertical before horizontal
125 Windows 2000 or later: Layout progress, vertical bottom to top

Those details can be fixed up in the documentation, at some point....

More important is of course the rich source of limitations and bugs I first hinted at a long time ago in It isn't a FONTSIGNATURE, darn it! and then later talked about much more explicitly in the recent What do you get when you put a Hebrew on top of a Russian? (aka What lies beneath can bite you on the ass).

Which proves that this is a feature with much farther-reaching consequences than fonts, as keyboards get sucked into the mix as well!

While one could hypothesize some very sophisticated locale-specific automatic font fallback methods that compare the data in the fonts on the machine with the data in the locale, the limitations of the data when it is correct, and the incidents where it is not, make the reality fall short here. Even so, products have done exactly this, and many third parties have used the support as well, as MVP Mihai mentioned:

I have a small application using the FONTSIGNATURE and doing automatic font fallback and font selection.
Nothing special, you will say.
Only that it was written in 1997 on Windows 95 (and still running ok on XP).

Now I am thinking about filing a suit against MS :-)

Now I wouldn't necessarily advocate a lawsuit (given the real limitations that have always existed here even ignoring random bugs over the years), but now that (according to the OpenType spec's list) there are no more bits, and given how difficult it would be to add more bits across both the typography and NLS sides of the data, the serious question about whether to keep using this method does not have an easy answer for people in Mihai's situation.

Now to make the situation a bit worse, if you take the code from What the hell is wrong with TranslateCharsetInfo, anyway? and use it to look at locales like es-ES, you sometimes see some quite overeager LOCALESIGNATURE structure data in there, even just looking at plain old XP:

          <msLocale:range type="0" />
          <msLocale:range type="1" />
          <msLocale:range type="2" />
          <msLocale:range type="3" />
          <msLocale:range type="5" />
          <msLocale:range type="31" />
          <msLocale:range type="35" />
          <msLocale:range type="43" />
          <msLocale:range type="44" />
          <msLocale:range type="45" />
            <msLocale:codePage type="1252" />
          <msLocale:ansiOemCodePage />
            <msLocale:codePage type="850" />
            <msLocale:codePage type="1252" />
            <msLocale:codePage type="1250" />
            <msLocale:codePage type="1254" />
          <msLocale:ansiOemCodePage />
            <msLocale:codePage type="865" />
            <msLocale:codePage type="863" />
            <msLocale:codePage type="861" />
            <msLocale:codePage type="860" />
            <msLocale:codePage type="857" />
            <msLocale:codePage type="852" />
            <msLocale:codePage type="850" />
            <msLocale:codePage type="437" />

Now all you have to do is look up some of the code pages listed there to see how little they have to do with Spanish (plus how many of them cannot represent Spanish on their own). And this is of course just one example, recently pointed out to me, of something that has been around for much of the last decade.
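
For anyone who wants to do that lookup from the raw bits, here is a hedged sketch that turns a 64-bit code page bitfield back into code page numbers. The bit assignments in the table are the ones I understand the OpenType spec to give for ulCodePageRange1 and ulCodePageRange2; the table is deliberately partial, and the names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Partial bit-to-code-page table, as listed for ulCodePageRange1/2 in the
   OpenType spec (assumption: the fsCsb/lsCsb* bitfields use the same bits). */
struct cp_bit { unsigned bit; unsigned code_page; };

static const struct cp_bit kCodePageBits[] = {
    {  0, 1252 }, {  1, 1250 }, {  2, 1251 }, {  3, 1253 }, {  4, 1254 },
    {  5, 1255 }, {  6, 1256 }, {  7, 1257 }, {  8, 1258 },
    { 16,  874 }, { 17,  932 }, { 18,  936 }, { 19,  949 }, { 20,  950 },
    { 48,  869 }, { 49,  866 }, { 50,  865 }, { 51,  864 }, { 52,  863 },
    { 53,  862 }, { 54,  861 }, { 55,  860 }, { 56,  857 }, { 57,  855 },
    { 58,  852 }, { 59,  775 }, { 60,  737 }, { 61,  708 }, { 62,  850 },
    { 63,  437 },
};

/* Writes the code pages whose bits are set in csb into out (up to out_cap
   entries, in ascending bit order) and returns how many were written. */
size_t decode_code_pages(const uint32_t csb[2], unsigned *out, size_t out_cap)
{
    assert(csb && out);
    size_t n = 0;
    for (size_t i = 0; i < sizeof kCodePageBits / sizeof kCodePageBits[0]; i++) {
        unsigned bit = kCodePageBits[i].bit;
        if (((csb[bit / 32] >> (bit % 32)) & 1u) && n < out_cap)
            out[n++] = kCodePageBits[i].code_page;
    }
    return n;
}
```

Feeding it a bitfield with bits 0, 1, 4, 50, 52, 54, 55, 56, 58, 62 and 63 set (that is, 0x00000013 and 0xC5D40000, which I reconstructed from the dump above rather than read off a machine) gives back 1252, 1250, 1254, 865, 863, 861, 860, 857, 852, 850 and 437, the same supported list shown for es-ES.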

I have my own theories as to why/how this happened, but they are not very relevant here!

It is taking the easy way out to blame it on the typography folks, since all they are doing is providing data to the NLS team that is in large part based on information that the NLS team provided to them.

I think the fact that this data is not more widely used is the main reason that complaints haven't been seen very much. I think there is a natural law of sorts about this:

People complain when the stuff they use is broken.

This might help people triage the importance of fixing the problem, even.

Certainly it suggests that Mihai's lawsuit might be in trouble (unless he extends it to non-performance over the course of the last decade!). :-)


This blog brought to you by all of the roadmapped characters in Unicode which have no room in the existing bits of the FONTSIGNATURE

# Mihai on 6 Oct 2008 7:22 PM:

Really, that was just a joke :-)

So, here is a public and official promise: no lawsuit whatsoever!


But I would still like to see things fixed ;-)
