by Michael S. Kaplan, published on 2005/05/31 03:46 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2005/05/31/423343.aspx
Raymond Chen did a post yesterday entitled "You can't simulate keyboard input with PostMessage".
He did touch on the complicated language issues, saying:
First of all, keyboard input is a more complicated matter than those who imprinted on the English keyboard realize. Languages with accent marks have dead keys, Far East languages have a variety of Input Method Editors, and I have no idea how complex script languages handle input. There's more to typing a character than just pressing a key.
This is a subject I have covered a little bit before. The fact is that these APIs in the USER subsystem (from ToUnicode to SendInput and so on) all keep a certain amount of state: not just the state of the shift keys that Raymond mentioned, but also state recording that you have typed a dead key (so that when you type the next key, the system knows whether the dead key table contains the combination you have just typed). There is actually even an MSDN topic, entitled "About Keyboard Input", that helps describe some of this complex process.
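To make the dead key state a little more concrete, here is a minimal toy model in Python. It is only a sketch of the idea and not how the USER subsystem is actually implemented: the `ToyKeyboard` class, the `type_keys` helper, and the tiny dead key table are all made up for illustration, and the fallback behavior (emitting both characters when the combination is not in the table) is an assumption of the sketch.

```python
# Toy model of dead-key state. This is NOT the real USER subsystem logic;
# the table below is a hypothetical fragment, not an actual keyboard layout.
DEAD_KEY_TABLE = {
    ("^", "e"): "\u00ea",  # circumflex + e -> e with circumflex
    ("^", "a"): "\u00e2",  # circumflex + a -> a with circumflex
    ("`", "e"): "\u00e8",  # grave + e -> e with grave
}
DEAD_KEYS = {"^", "`"}


class ToyKeyboard:
    """Remembers a pending dead key between key presses."""

    def __init__(self):
        self.pending_dead_key = None

    def press(self, key):
        """Return the text produced by this key press (possibly empty)."""
        if self.pending_dead_key is None:
            if key in DEAD_KEYS:
                # Nothing is produced yet; the key is only remembered.
                self.pending_dead_key = key
                return ""
            return key

        # A dead key is pending: consume it and consult the table.
        dead, self.pending_dead_key = self.pending_dead_key, None
        combined = DEAD_KEY_TABLE.get((dead, key))
        if combined is not None:
            return combined
        # Assumed fallback: combination not in the table, emit both.
        return dead + key


def type_keys(keys):
    """Feed a sequence of key presses through a fresh ToyKeyboard."""
    keyboard = ToyKeyboard()
    return "".join(keyboard.press(k) for k in keys)
```

The thing to notice is that pressing `e` gives a different result depending on what was typed before it, which is exactly the kind of state that a lone posted message knows nothing about.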
And once you get into IMEs, the complex rules related to state that the IME must keep really boggle the imagination. It is probably easier to use the Input Method Manager (IMM) APIs to try to get input through an IME than to try and fake keystrokes....
Luckily, the one issue that is not really all that complex is complex scripts, because the actual issues that make them complex (bidirectional text, contextual shaping, line breaking, and illegal sequence checking) are all related to what happens to the text after you have typed the keystrokes -- the font linking and the rendering.
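As a small illustration of the point that the complexity lives in rendering rather than in typing, the Python sketch below lists the code points behind the Hindi syllable "ki". The `describe` helper is just a name invented for this example. The text is stored in the order it is typed (consonant, then vowel sign), even though the vowel sign is drawn to the left of the consonant when rendered.

```python
import unicodedata


def describe(text):
    """List each code point in stored (typed) order with its Unicode name."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


# The Hindi syllable "ki": typed and stored as KA followed by VOWEL SIGN I.
# A renderer draws the vowel sign to the LEFT of KA; that reordering is a
# shaping step applied afterwards, not something the keystrokes encode.
typed = "\u0915\u093f"

for code_point, name in describe(typed):
    print(code_point, name)
```

The same helper works on any string, so `describe("\u0b82")` reports the name of the character that sponsors this post.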
Well, that and knowing what to type, of course! Being able to have words even look like they belong together in languages like Thai and Hindi and Tamil really requires either knowing the language or memorizing keystrokes. Which is the same as when dealing with IMEs (in my book I had a chapter that talked about keystroke combinations you could use to test IMEs; it was a lot of fun and I still get positive feedback, enough that I may start posting examples of stuff like that soon).
(Let me know if that sort of thing seems like it might be interesting -- I never know what people will find engaging here!)
This post brought to you by "ஂ" (U+0b82, a.k.a. TAMIL SIGN ANUSVARA)
(which is annotated in Unicode as "not used in Tamil" though several people who deal with Sanskrit in Tamil would beg to differ!)