What's wrong with this code?

by Michael S. Kaplan, published on 2007/02/12 03:01 -05:00, original URI: http://blogs.msdn.com/b/michkap/archive/2007/02/12/1658006.aspx

Ok, another one of those 'what's wrong with this code?' posts, for your coding pleasure....

I won't even say what the code is supposed to do, but it should be fairly obvious what is being attempted (unsuccessfully) here, on several levels:

private static string VirtualCodeToString(int virtualCode, uint scanCode) {
    string resultingString = string.Empty;
    byte[] keyboardState = new byte[256];


    IntPtr foregroundWindow = NativeMethods.GetForegroundWindow();
    uint processId;
    uint threadId = NativeMethods.GetWindowThreadProcessId(foregroundWindow, out processId);
    StringBuilder actualKeyBuffer = new StringBuilder(2);
    int retVal = NativeMethods.ToUnicodeEx(

    if (retVal == 1) {
        actualKeyBuffer.Length = 1;
        resultingString = actualKeyBuffer.ToString();

    return resultingString;

internal static extern bool GetKeyboardState(byte[] lpKeyState);

internal static extern IntPtr GetForegroundWindow();

internal static extern uint GetWindowThreadProcessId(IntPtr hWnd, out uint lpdwProcessId);

internal static extern int ToUnicodeEx(uint wVirtKey,
                                       uint wScanCode,
                                       byte[] lpKeyState,
                                       [MarshalAs(UnmanagedType.LPWStr)] StringBuilder pwszBuff,
                                       int cchBuff,
                                       uint wFlags,
                                       IntPtr dwhkl);

The one hint I will give is that the actual code one would need to accomplish what this code is trying to do is pretty complex, if you want to cover all supported scenarios. So I am not asking anyone to try to write the full. correct code here. I am mainly looking for people to point out the bad practices and flaws as if one was reviewing this code either on any version of Windows or on any non-Win9x based version (your choice!).

I might do the full correct version of the code in a future post, if there is interest and if I can get over my fear of supporting a budding industry of keyloggers. :-)

Okay? Ready, set, go!


This post brought to you by é (U+00e9, a.k.a. LATIN SMALL LETTER E WITH ACUTE)

Michael S. Kaplan on 12 Feb 2007 12:06 PM:

Several people guessing via the contact link, but no one feels comfortable leaving their guesses in a comment yet.


Mihai on 12 Feb 2007 12:22 PM:

Trying to get a character from the info available normally in a WM_KEYDOWN message (virtual-key code, and scan code).

There are many ways for this to fail, the main ones being lack of "historical info" (was a dead-key pressed before?) and IMEs.

Michael S. Kaplan on 12 Feb 2007 1:02 PM:

Ok, good start... now what else is going to be a problem here? :-)

Mihai on 12 Feb 2007 1:26 PM:

There are so many problems, I don't even know which one do you want to single out :-)

Here is round two:

1. The keyboard state is a "per task" thing.

So the code calls GetKeyboardState to retrieve the keyboard state of the current thread, then uses it using the GetKeyboardLayout of another thread.

2. ToUnicodeEx can return more than 2 characters, so the buffer is too small.

3. The buffer filled by ToUnicodeEx is might be non-null terminated. Doc: "However, this buffer may be returned without being null-terminated even though the variable name suggests that it is null-terminated."

4. The return -1 from ToUnicodeEx is not treated. From the doc "If possible, even with Unicode keyboard layouts, the function has written a spacing version of the dead-key character to the buffer specified by pwszBuff."

Michael S. Kaplan on 12 Feb 2007 2:54 PM:

These are also good, though the exact consequences of #1 are also worthy of some discussion, as well as what one might want to think of doing with the issue in #4.

Jon on 16 Feb 2007 2:28 PM:

The worst problem: if you call this in a subclass procedure when processing WM_KEYDOWN (which is only reasonable), with an international key which is a dead char and is obtained with AltGr (Ctrl+Alt), such as the Swedish '~' key, this will screw up the dead char state.

For this not to happen, you need to call ToUnicode() twice (!!).

Yes, I know, ToUnicode() is not suppose to know about the deadchar state - well, not only it knows it, but it changes it. I've been just debugging my keyboard handling code (I sell a vi/vim emulator for Visual Studio, SQL Server, Word and Outlook) because a poor fellow Swedish programmer couldn't write any more destructors, and tracked it down to this, on XP SP2.

Ah, how I hate the windows input model :) I wrote a blog post about this a while ago, titled "What's broken in the WM_KEYDOWN/WM_CHAR input model", which is #2 on Google both for WM_CHAR and WM_KEYDOWN (don't ask me why):


Exercise for the reader: write code to determine, when receiving a WM_KEYDOWN or WM_SYSKEYDOWN, whether this will be accompanied by a WM_CHAR afterwards or not. It seems simple, doesn't it? This is necessary if you want to provide reliable user-configurable remapping of all keypresses that doesn't depend on timers.

Believe me, this is the start of the road to hell. On US keyboards, it's only "very complex". If you add intl keyboards into the mix, you will suffer without limits, and end up discovering the dirty secrets of ToUnicode() and much more...

I'm preparing another article called "The Ultimate WM_CHAR/WM_KEYDOWN Chart from Hell", I'm sure it will be very popular. And maybe I can get that spot #1 before MSDN on both keywords.

Michael, congratulations on a great blog and a lot of useful info. Great that someone at Microsoft cares about the stuff, but I must say it certainly seems you are one of a very few.

Please consider a donation to keep this archive running, maintained and free of advertising.
Donate €20 or more to receive an offline copy of the whole archive including all images.

go to newer or older post, or back to index or month or day