If you don't commit, you may not get what you want

by Michael S. Kaplan, published on 2006/03/11 16:36 -05:00, original URI: http://blogs.msdn.com/b/michkap/archive/2006/03/11/549624.aspx


(No, this is not a post about relationship advice!)

The other day, Karthik asked:

Hi Michael,

The issue is: GetWindowTextLength returns 0 for any edit control that has Chinese Pinyin IME or Japanese IME characters in it.... GetWindowText also returns nothing. Hiding the field does not make these characters disappear....

...This is an issue with many dialog boxes hosted in property sheets that I looked around at. It also malfunctions in certain ways with other dialogs.

The issue here is that when you first type text using an IME, it is not yet in the underlying control that has the focus.

Let's say you are in Notepad and with the Japanese IME in full width katakana mode you type the letters kazuka. What will appear is something like the following:

See that dotted line under the text? It is an important one.

If you hit ENTER now, the text will be committed and then it will be in the underlying control.

If you hit the space bar twice, the candidate window will come up, giving you whatever options are available:

(In this case all that is available are the Katakana and Hiragana versions of ノチツナノチ, a.k.a. U+30CE U+30C1 U+30C4 U+30CA U+30CE U+30C1.)
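To make the link between those six code points and the characters on screen concrete, here is a minimal sketch in portable C that encodes them as UTF-8. The names `bmp_to_utf8` and `kana` are made up for this illustration, and the encoder assumes plain BMP code points with no surrogates.

```c
#include <stddef.h>

/* Encode BMP code points (assumed to contain no surrogates) as UTF-8.
   Returns the number of bytes written, not counting the terminating NUL.
   out must have room for 3 * n + 1 bytes. */
size_t bmp_to_utf8(const unsigned short *cp, size_t n, char *out)
{
    size_t len = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        unsigned c = cp[i];
        if (c < 0x80) {                 /* 1 byte: ASCII */
            out[len++] = (char)c;
        } else if (c < 0x800) {         /* 2 bytes */
            out[len++] = (char)(0xC0 | (c >> 6));
            out[len++] = (char)(0x80 | (c & 0x3F));
        } else {                        /* 3 bytes: the kana land here */
            out[len++] = (char)(0xE0 | (c >> 12));
            out[len++] = (char)(0x80 | ((c >> 6) & 0x3F));
            out[len++] = (char)(0x80 | (c & 0x3F));
        }
    }
    out[len] = '\0';
    return len;
}

/* The six code points quoted above. */
const unsigned short kana[] = {
    0x30CE, 0x30C1, 0x30C4, 0x30CA, 0x30CE, 0x30C1
};
```

Calling `bmp_to_utf8(kana, 6, buf)` with a 19-byte `buf` writes 18 bytes, three per character, which a UTF-8 terminal displays as ノチツナノチ.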

Once you either hit that initial ENTER key or choose from the candidate window, you will have text in the underlying control.

But until then, calls to functions like GetWindowText will not return this text, and functions like GetWindowTextLength will return 0. Other operations, such as trying to hide the control's text, will likewise behave in ways consistent with the fact that the control itself has not yet been given the text.

Because when you are using the IME, just as in life, you often have to commit before you can get what you are looking for....

(Some other day I will talk more about what to do if you want the interim text in the IME)
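For readers who want to see what "not yet committed" looks like in code, the following is a minimal sketch using the IMM API, not a definitive implementation. It assumes it runs inside the process (and on the thread) that owns the window, since an input context is generally not available for another process's window. The function names `CompositionCharsFromResult`, `PendingCompositionChars`, and `CommitPendingComposition` are hypothetical; `ImmGetContext`, `ImmGetCompositionStringW`, `ImmNotifyIME`, and `ImmReleaseContext` are the real IMM calls. How quickly the control reflects the committed text may vary by IME.

```c
/* ImmGetCompositionStringW reports sizes in BYTES (even in the W
   version) and reports errors as negative values. Convert such a
   result into a count of UTF-16 code units. */
long CompositionCharsFromResult(long cb)
{
    return cb > 0 ? cb / 2 : 0;     /* 2 == sizeof(WCHAR) on Windows */
}

#ifdef _WIN32   /* the rest needs the Win32 headers; link with imm32.lib */
#include <windows.h>
#include <imm.h>

/* Number of UTF-16 code units still held by the IME as uncommitted
   composition text for hwnd, or 0 if there is none. */
LONG PendingCompositionChars(HWND hwnd)
{
    HIMC himc = ImmGetContext(hwnd);
    LONG cb;

    if (himc == NULL)
        return 0;                   /* no input context for this window */

    /* A NULL buffer asks only for the required size. */
    cb = ImmGetCompositionStringW(himc, GCS_COMPSTR, NULL, 0);
    ImmReleaseContext(hwnd, himc);
    return CompositionCharsFromResult(cb);
}

/* Ask the IME to turn the pending composition into the result string,
   which is what pressing ENTER does in the walkthrough above.
   Returns nonzero if the request was accepted. */
BOOL CommitPendingComposition(HWND hwnd)
{
    HIMC himc = ImmGetContext(hwnd);
    BOOL ok;

    if (himc == NULL)
        return FALSE;

    ok = ImmNotifyIME(himc, NI_COMPOSITIONSTR, CPS_COMPLETE, 0);
    ImmReleaseContext(hwnd, himc);
    return ok;
}
#endif
```

A caller that gets 0 back from GetWindowTextLength could check `PendingCompositionChars(hwndEdit)` to tell an empty control apart from one whose text is still sitting in the IME, and call `CommitPendingComposition(hwndEdit)` before reading the text.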

 

This post brought to you by "ツ" (U+30C4, a.k.a. KATAKANA LETTER TU)


# Michael F on 9 Apr 2006 3:03 AM:

The next question is:

How do you commit programmatically after using SendInput to send Unicode characters to another process' window?

Not even sure why IME is getting involved when I am sending exactly what I want to see.

# Michael S. Kaplan on 9 Apr 2006 3:09 AM:

Bypassing the IME is not something easily done; a commit would have to be done via the IMM API or the TSF interfaces.

I usually find it easier to switch keyboard layouts and then switch back, all things considered (since every IME is different).

# Michael F on 9 Apr 2006 4:47 AM:

I've tried it all (except TSF, I need to support Win2K) but nothing seems to solve the problem.

Does your approach work when input is sent to another process?

# Michael S. Kaplan on 9 Apr 2006 10:00 AM:

Hi Michael,

Well, you would have to inject yourself into that other thread (say via a SetWindowsHookEx WH_CALLWNDPROC hook), change the layout, send yourself a message with the input, send the input, change the layout back, and unhook yourself....

Though the more I think about it, the more concerned I get about what you are trying to do here -- it seems a bit dangerous/insecure to be inserting random text into some other thread of some other process?

# Michael F on 9 Apr 2006 2:51 PM:

It's not random.  We're developing an SSO application that inserts user-defined input into applications on demand.

Up until multi-byte languages everything was working as expected.  I was hoping to avoid any kind of hooking for the purpose of inserting characters but it looks like it's unavoidable.

I am still puzzled by why the IME would get involved in this whole process since I am sending Unicode characters exactly as I want to see them in the other application.

Is there a way to determine if whatever I am sending is ending up in the uncommitted state?  If I could do that I'd be able to send a VK_RETURN to accept the changes.

Thanks for your replies,
Michael.

# Michael S. Kaplan on 9 Apr 2006 3:08 PM:

"I am still puzzled by why IME would get involved in this whole process"

When you are involved at a very low level like the IMM/TSF components, what is given to you does not look any different than any of the myriad of bizarre ways there are to get text into an application (IMM works by being integrated in with USER, and TSF works via those very same sorts of hooks).

The whole KEYEVENTF_UNICODE thing, I'll be talking more about it another day....

# Michael F on 10 Apr 2006 12:05 AM:

So, I've tried loading a U.S. English keyboard layout using a hook.

The problem turned out to be that either SendInput does not post the necessary messages synchronously, or the keyboard messages have lower priority than the registered message I am posting to restore the layout.

This means I would have to leave the new layout active.  Not so great.

Another small problem is that when I unload the temporary layout, the IMM window does not remove the layout from its list.  So, it's possible to see and select the temporary layout after it has been unloaded.  Although it has no effect.  Bug?

# Michael S. Kaplan on 10 Apr 2006 12:14 AM:

Well, the WM_CHAR is a notification, not a message -- so it is not blocking. I suspect the unload issue you are hitting might be related to the keyboard quite literally being in use at the moment you try to do the unload....

Another choice could be that you document the limitation in the app? You can blame MS for it (it's our model for IMEs, after all!).

It may be worth trying it out in a few different IMEs and modes; I suspect some may work just fine while others will have the uncommitted text.
