Yet another time that UTF-8 can't be the ACP

by Michael S. Kaplan, published on 2008/08/15 10:01 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2008/08/15/8869343.aspx


Glenn asks:

Regarding MKLC Codepages: Microsoft Keyboard Layout Creator - In the Properties, I set the Language to "English (United States)" which sets the code page to (1252). The "Validate Layout" tool generates warnings that some of the characters are not in the codepage (1252). It states that "This may cause compatibility problems in non-Unicode applications." (65001) is the codepage I need to use for non-Unicode applications. How can I set the codepage for a Microsoft Keyboard Layout Creator to codepage (65001)?

This particular feature really has very little to do with directly assigning a code page to a keyboard.

In the MSKLC user interface, it can be found right where the cursor is in this screen shot:

http://www.trigeminal.com/images/validation01.png

It has to do with the feature that, as I described, we broke in Vista RTM (as mentioned here) and then fixed in a hotfix (as mentioned here) and in SP1. Of course even before we broke this, there were examples where bogus information in a locale's LOCALESIGNATURE could lead to bugs, e.g. in Hong Kong as described in What the hell is wrong with TranslateCharsetInfo, anyway?.

Basically, here are the steps:

This is the basis of the support for ANSI-plus applications.

MSKLC's "feature" here takes the locale that the keyboard author has chosen (the one that it will later cause to become the LANGID of the KLID of the installed keyboard) and points out the characters that will not be supported by such ANSI-plus applications.
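To make the idea behind that warning concrete, here is a minimal sketch in Python of the kind of check involved: given the characters a layout produces, list the ones a particular code page cannot represent. This is not MSKLC's actual implementation. The function name `chars_outside_code_page` and the sample character list are made up for illustration, and Python's `cp1252` codec is only an approximation of what Windows itself does with that code page.

```python
def chars_outside_code_page(chars, code_page="cp1252"):
    """Return the characters in `chars` that `code_page` cannot encode.

    Hypothetical helper for illustration; it relies on Python's codec
    tables, not on the Windows conversion tables MSKLC would consult.
    """
    missing = []
    for ch in chars:
        try:
            ch.encode(code_page)
        except UnicodeEncodeError:
            # No mapping for this character in the chosen code page.
            missing.append(ch)
    return missing


# Made-up sample: 'a', e-acute, the euro sign, and BENGALI LETTER II.
layout_chars = "a\u00e9\u20ac\u0988"
print(chars_outside_code_page(layout_chars))
```

With code page 1252, the first three characters are fine and only U+0988 gets reported, which is the sort of character the "Validate Layout" warning is talking about.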

However, the problem with both ANSI and ANSI-plus applications is that neither of them can handle Unicode-only locales, where the code page would have to be 65001 to support the characters used by the locale.

But as I pointed out here and further expounded on here, the ACP of a locale cannot be UTF-8, so there is no way to specify 65001 to be used....

Now getting the actual code page being used by an ANSI-plus application once a keyboard is installed and gets used turns out to be much harder than it likely ought to be; I'll discuss how to get that information in an upcoming blog....


This blog brought to you by ঈ (U+0988, aka BENGALI LETTER II)

