Premature keystroke processing? Don't worry, it happens to everyone....

by Michael S. Kaplan, published on 2008/04/22 03:01 -04:00, original URI:

Content of Michael Kaplan's personal blog not approved by Microsoft (see disclaimer)!
Regular readers should keep in mind that all I said in The End? still applies; the allusion to the X-Files continues for people who understand such references....

Ben Supnik asked over in the Suggestion Box:


Your post on dead key states "Sometimes you *want* to interfere with the keyboard's state buffer" was very helpful in understanding what's going on with ToUnicodeEx.  But I have the opposite problem of Sebastien:

I want to be able to calculate the unicode key that is generated by one or more virtual key presses.  But I am supporting a legacy API; I need to dispatch this unicode key with the second key press (e.g. in the second WM_KEYDOWN) rather than wait for the WM_CHAR message.  Similarly, I need to know when the first WM_KEYDOWN comes in that it is a dead key without waiting for WM_DEADCHAR.

Here's the part that surprises me: when I call ToUnicode with the virtual key code of the dead key (in response to the very first WM_KEYDOWN), I get two unicode chars back, typically a repeat of the spacing version of the diacritic.  For example: on the first key press of the ^ key on a French keyboard, I get a return of 2 and a buffer with ^^ (a pair of non-spacing circumflexes) when I would have expected to get a return of -1 and a single ^, telling me the circumflex is getting set up.

What am I missing here?  It looks like something is calling ToUnicode with my vkey code once before I get the WM_KEYDOWN message, and thus pre-loading the unicode buffer.  I don't think I can drain the buffer because when the second (real) key is hit, I'll have drained out my precious dead key state.

Could you please revisit this topic?  In particular, what part of the system is looking up my vkey so early that I'm losing my dead key state?  Is there any programmatic access to accumulated dead key state?

The post that Ben refers to is Sometimes you *want* to interfere with the keyboard's state buffer.

The first answer is to the last question asked, and the answer is that there is no way to access that buffer directly.

But let's look a little deeper at the bigger questions here.

One of the changes that happened in Windows is hinted at in the help for WM_KEYDOWN and WM_KEYUP:

Windows 2000/XP: Applications must pass wParam to TranslateMessage without altering it at all.

Thinking aloud for a moment...

Why would it be necessary to leave the wParam (which contains the virtual key) unchanged? Avoiding such a change shouldn't matter if all of the processing comes after the TranslateMessage call.

Unless, perhaps, the system is doing that lookup or that caching early, without waiting for TranslateMessage?

To be honest I am guessing aloud here, and will be forwarding this speculative blog to someone who should know the actual answer and who can explain the difference between them. :-)

But my advice here would be to do the same thing that MSKLC does: it loads up the information contained in the entire keyboard, not only to display all of that information but also to simulate what the keystroke processing code does, without being negatively impacted by multiple sources of ToUnicode and ToUnicodeEx calls mixing up the system's state....

Although this is a bigger investment upfront, in the long run it is much more maintainable than trying to make queries that can (and apparently, in this case, do) negatively impact the state of the buffer.
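
To make the idea a little more concrete, here is a minimal sketch in C of what simulating the keystroke processing from an already-loaded table might look like. Everything in it is an assumption for illustration: the DeadKeyRow and KeySim types, the keysim_feed function, and the three-row table are hypothetical, hand-written stand-ins (this is not MSKLC's code), and chained dead keys are ignored. The return values deliberately mirror the ToUnicode convention, so a dead key pressed twice in a row yields 2 and a doubled character, much like the ^^ described in the question.

```c
#include <stddef.h>

/* One row of a dead-key table: dead key + base character -> composed character. */
typedef struct {
    unsigned dead;      /* spacing form of the dead key, e.g. U+005E '^' */
    unsigned base;      /* character typed after the dead key */
    unsigned composed;  /* character produced by the pair */
} DeadKeyRow;

/* Hypothetical excerpt; a real tool would load every row from the layout. */
static const DeadKeyRow kDeadKeyTable[] = {
    { 0x005E, 0x0061, 0x00E2 },  /* ^ + a     -> a with circumflex */
    { 0x005E, 0x0065, 0x00EA },  /* ^ + e     -> e with circumflex */
    { 0x005E, 0x0020, 0x005E },  /* ^ + space -> ^ */
};

/* Private dead-key state, kept by the application instead of the system. */
typedef struct {
    unsigned pending_dead;  /* 0 when no dead key is waiting */
} KeySim;

/*
 * Feed one keystroke that the layout table has already mapped to a character.
 * The return value follows the ToUnicode convention:
 *   -1  dead key stored; out[0] holds its spacing form
 *    1  one character in out[0]
 *    2  the pending dead key could not combine; out[0] = dead key, out[1] = ch
 */
int keysim_feed(KeySim *sim, unsigned ch, int is_dead, unsigned out[2])
{
    unsigned dead = sim->pending_dead;
    size_t i;

    if (dead == 0) {
        out[0] = ch;
        if (is_dead) {
            sim->pending_dead = ch;  /* remember it for the next keystroke */
            return -1;
        }
        return 1;
    }

    sim->pending_dead = 0;  /* a pending dead key is always consumed */
    if (!is_dead) {
        for (i = 0; i < sizeof kDeadKeyTable / sizeof kDeadKeyTable[0]; ++i) {
            if (kDeadKeyTable[i].dead == dead && kDeadKeyTable[i].base == ch) {
                out[0] = kDeadKeyTable[i].composed;
                return 1;
            }
        }
    }
    out[0] = dead;  /* no combination: emit both characters */
    out[1] = ch;
    return 2;
}
```

Because the pending dead key lives in the caller's own KeySim, looking a key up this way never touches the system's buffer, which is the whole point of the approach.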

If one really wants to go down the harder road, that solution is not too awful:

  1. Clear the buffer as the previous articles talked about;
  2. Call ToUnicode/ToUnicodeEx with the same key and shift state info again to fill the buffer;
  3. Don't touch the buffer again until the next time a keystroke message comes in.
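
The three steps above might be sketched in C roughly as follows. This is not code from the earlier articles: the names TranslateFn, clear_dead_key_state, requery_key, and fake_translate are hypothetical, and using the space key (VK_SPACE, scan code 0x39) as the "neutral" key for clearing is an assumption. The translation call is passed in as a function pointer with the same shape as ToUnicode, so the logic can be tried against a small stand-in buffer; on Windows the wrapper around the real ToUnicode would be passed instead.

```c
#include <wchar.h>

/* Same parameters and return convention as Win32 ToUnicode:
 * < 0 = dead key stored, 0 = no translation, > 0 = characters written. */
typedef int (*TranslateFn)(unsigned vk, unsigned scan,
                           const unsigned char *key_state,
                           wchar_t *buf, int cch, unsigned flags);

#ifdef _WIN32
#include <windows.h>
/* Thin wrapper so the real API can be passed as a TranslateFn. */
int win32_translate(unsigned vk, unsigned scan, const unsigned char *key_state,
                    wchar_t *buf, int cch, unsigned flags)
{
    return ToUnicode(vk, scan, key_state, buf, cch, flags);
}
#endif

/* Hypothetical stand-in for the system's buffer, for trying the logic out:
 * VK 0xDD acts as a circumflex dead key, every other key yields its own code,
 * and a pending dead key never combines (it is simply emitted first). */
wchar_t g_pending = 0;

int fake_translate(unsigned vk, unsigned scan, const unsigned char *key_state,
                   wchar_t *buf, int cch, unsigned flags)
{
    int is_dead = (vk == 0xDDu);
    wchar_t ch = is_dead ? L'^' : (wchar_t)vk;
    (void)scan; (void)key_state; (void)flags;

    if (cch < 2)
        return 0;
    if (g_pending != 0) {
        buf[0] = g_pending;
        buf[1] = ch;
        g_pending = 0;
        return 2;
    }
    buf[0] = ch;
    if (is_dead) {
        g_pending = ch;
        return -1;
    }
    return 1;
}

#define NEUTRAL_VK   0x20u  /* VK_SPACE (assumed choice of neutral key) */
#define NEUTRAL_SCAN 0x39u  /* space bar scan code on a standard PC keyboard */

/* Step 1: feed a neutral key with no modifiers until no dead key is pending. */
void clear_dead_key_state(TranslateFn translate)
{
    static const unsigned char all_up[256] = {0};
    wchar_t scratch[8];
    int tries = 0;

    while (translate(NEUTRAL_VK, NEUTRAL_SCAN, all_up, scratch, 8, 0) < 0
           && ++tries < 4) {
        /* keep draining; the bound guards against a translator that never settles */
    }
}

/* Steps 1 and 2: start from a clean buffer, then replay the key from the
 * current keystroke message with its real shift state to refill the buffer.
 * Step 3 is up to the caller: no more translate calls until the next message. */
int requery_key(TranslateFn translate, unsigned vk, unsigned scan,
                const unsigned char *key_state, wchar_t *out, int cch)
{
    clear_dead_key_state(translate);
    return translate(vk, scan, key_state, out, cch, 0);
}
```

With the stand-in buffer pre-loaded with a circumflex (the situation described in the question), requery_key reports the dead key as a single pending character rather than a doubled one, and leaves that dead key in the buffer for the next keystroke.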

Now, as I said, I am going to try to get answers on the question of what is going on here, since it appears to violate my (admittedly older) understanding of what happens during keystroke processing. It was only after testing both of the methods I suggest above that I realized that even if the underlying behavior did change, neither method is broken.

So perhaps the only real bug at this point is in the documentation, which does seem to be a little off now (though I am trying to get my head entirely around how!).


This blog brought to you by අ (U+0d85, aka SINHALA LETTER AYANNA)

Marc Durdin on 22 Apr 2008 6:52 PM:

The AbiWord team did this in an early version - it turned out to be a whole lot of code and never worked 100%.  They ended up using TranslateMessage/WM_CHAR for a later version and that was much, much simpler and far more robust.

You could consider testing the return value of TranslateMessage instead:

- Call TranslateMessage to do the WM_*CHAR generation

- If it returns non-zero, peek the message queue for 1 or more WM_*CHAR messages and then handle the set of WM_KEYDOWN + WM_*CHAR together (remembering that a single key can generate more than 1 WM_*CHAR message)
