Whither WM_UNICHAR in Windows 7 (and 8!)

by Michael S. Kaplan, published on 2012/05/21 07:01 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2012/05/21/10308135.aspx


The WM_UNICHAR message has had an interesting history.

Over time it has come up in this Blog occasionally, in posts like Will the real Unicode character message please stand up?, which tended to be more generous in suggesting that apps on Windows should just take it if they want to. But by 2008, questions like Why is my WM_UNICHAR handler never called? on Stack Overflow pointed out that the documentation for the message itself had crystallized its intent/purpose pretty clearly:

The WM_UNICHAR message is equivalent to WM_CHAR, but it uses Unicode Transformation Format (UTF)-32, whereas WM_CHAR uses UTF-16. It is designed to send or post Unicode characters to ANSI windows and it can handle Unicode Supplementary Plane characters.
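To make the UTF-32 versus UTF-16 difference concrete, here is a minimal sketch in portable C (no Windows headers). The helper name utf32_to_utf16 is hypothetical and is not part of any Windows API; it just shows how one supplementary-plane code point, which would fit in a single WM_UNICHAR, becomes the two UTF-16 code units that would arrive as a pair of WM_CHAR messages.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one UTF-32 code point (the form a
   WM_UNICHAR wParam carries) as UTF-16 code units (the form WM_CHAR
   carries). Returns the number of units written (1 or 2), or 0 if
   the value is not a valid Unicode scalar value. */
size_t utf32_to_utf16(uint32_t cp, uint16_t out[2])
{
    assert(out != NULL);

    /* Reject values past U+10FFFF and lone surrogate code points. */
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;

    /* Basic Multilingual Plane: one unit, same numeric value. */
    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;
        return 1;
    }

    /* Supplementary planes: split the 20 remaining bits into a
       high (lead) and low (trail) surrogate. */
    cp -= 0x10000;
    out[0] = (uint16_t)(0xD800 + (cp >> 10));
    out[1] = (uint16_t)(0xDC00 + (cp & 0x3FF));
    return 2;
}
```

For example, passing U+1F600 yields the two units 0xD83D and 0xDE00, while U+0041 yields the single unit 0x0041.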

But the other day, Marc asked (on behalf of a customer):

When is WM_UNICHAR used in Windows 7? I thought that I might be able to create WM_UNICHAR from an IME by entering a surrogate pair through Unicode input but I only saw WM_CHAR messages. Under what conditions will the WM_UNICHAR message be used?

There are reportedly still some 3rd party IMEs that use it even for Unicode windows, but IMEs from Microsoft have largely moved away from that.

I suspect it is largely a philosophical issue: how strongly you feel that a single discrete character should always be returned. It is a question on which some 3rd party IMEs do cast their vote.

But Microsoft IMEs have another good reason to move away from the view of "one character at a time" in such messages -- Unicode Variation Selectors, discussed in blogs like UCS-2 to UTF-16, Part 10: Variation[ Selector] on a theme....

Now not every Microsoft IME is putting out variation selectors in its output, even in Windows 8.

But some of them do, and that's enough to convince the IME team at Microsoft to not bother with the "incomplete" WM_UNICHAR.
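A small sketch of why variation selectors make "one code point per message" feel incomplete, again in portable C. The function names are hypothetical, and the selector test covers only the Variation Selectors (U+FE00..U+FE0F) and Variation Selectors Supplement (U+E0100..U+E01EF) blocks; that is a simplifying assumption for illustration, not a full segmentation algorithm.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical check limited to the two variation selector blocks. */
int is_variation_selector(uint32_t cp)
{
    return (cp >= 0xFE00 && cp <= 0xFE0F) ||
           (cp >= 0xE0100 && cp <= 0xE01EF);
}

/* Count base-plus-selector units in a sequence of code points.
   A selector that directly follows a non-selector is folded into
   that base; any other code point starts a new unit. */
size_t count_units(const uint32_t *cps, size_t n)
{
    assert(cps != NULL || n == 0);

    size_t units = 0;
    for (size_t i = 0; i < n; i++) {
        if (i > 0 && is_variation_selector(cps[i]) &&
            !is_variation_selector(cps[i - 1]))
            continue; /* belongs to the preceding base character */
        units++;
    }
    return units;
}
```

An ideographic variation sequence such as U+845B followed by U+E0100 is two code points but only one unit by this count, so delivering it one code point at a time would split something the user thinks of as a single character.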

I still believe that if you get the message, whether it is a Unicode window or not, you should accept it.

For the sake of those 3rd party IMEs, if nothing else.
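For anyone who does want to accept the message, here is a rough sketch of the handler logic, written as portable C so it compiles without the Windows headers. TextBuffer, append_code_point, on_unichar and NOCHAR_PROBE are all hypothetical names; the value 0xFFFF mirrors UNICODE_NOCHAR as I understand the documented convention, where a window answers TRUE to that probe to say it handles WM_UNICHAR and returns FALSE after consuming a real character. In a real window procedure this logic would sit under a case for WM_UNICHAR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the documented UNICODE_NOCHAR value under a hypothetical
   name, so no Windows header is required for this sketch. */
enum { NOCHAR_PROBE = 0xFFFF, BUFFER_UNITS = 64 };

/* Hypothetical stand-in for whatever text the window is editing. */
typedef struct {
    uint16_t units[BUFFER_UNITS];
    size_t   len;
} TextBuffer;

/* Append one code point as UTF-16. Invalid values, and input that
   would overflow the buffer, are silently ignored. */
static void append_code_point(TextBuffer *buf, uint32_t cp)
{
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return;
    if (cp < 0x10000) {
        if (buf->len + 1 > BUFFER_UNITS)
            return;
        buf->units[buf->len++] = (uint16_t)cp;
    } else {
        if (buf->len + 2 > BUFFER_UNITS)
            return;
        cp -= 0x10000;
        buf->units[buf->len++] = (uint16_t)(0xD800 + (cp >> 10));
        buf->units[buf->len++] = (uint16_t)(0xDC00 + (cp & 0x3FF));
    }
}

/* Returns what the window procedure would return for WM_UNICHAR:
   1 (TRUE) for the "do you support this?" probe, 0 (FALSE) after
   consuming an actual character. */
int on_unichar(TextBuffer *buf, uint32_t wparam)
{
    assert(buf != NULL);
    if (wparam == NOCHAR_PROBE)
        return 1;
    append_code_point(buf, wparam);
    return 0;
}
```

The same handler works whether the window is Unicode or ANSI internally, since the incoming value is always a full code point; only what you do with the buffer afterwards differs.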

But there are people who don't bother now, since the docs now make it clear that this is "off brand" usage. So perhaps at this point I am tilting at windmills?

