It is not easy to chain dead keys on Windows

by Michael S. Kaplan, published on 2005/11/15 04:01 -05:00, original URI: http://blogs.msdn.com/b/michkap/archive/2005/11/15/492200.aspx


The title says it all -- and not just for the most obvious reason -- because MSKLC does not support it.

In many different posts on this blog I have said a little or a lot about chaining dead keys.

It is in the post Dead keys are not intuitive that I probably said the most about the topic:

Now one thing that is possible in the Windows architecture is chaining dead keys together, so that a dead key plus a base character will then treat the combination as another dead key waiting for yet a third base character. One could then chain that as well, and so on -- adding more and more keystrokes to produce in the end a single code point. This feature is not currently supported by MSKLC since the demand for combinations of three or more keystrokes always involves multiple characters being produced -- one is simply not enough here for anyone who has ever asked....

Sounds easy, and even makes sense in a case such as wanting to be able to type the sequence:

and have it spit out U+0170 (Ű, LATIN CAPITAL LETTER U WITH DOUBLE ACUTE).

But the reason it makes sense here is that there is a character one could assign to the second keystroke in this three-key sequence -- namely U+02ba (ʺ, MODIFIER LETTER DOUBLE PRIME).

That way if they type something else that is not in the dead key table for that third keystroke, the fallback will be something that is not entirely impossible for the user of the keyboard to understand.
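To make the chaining and the fallback concrete, here is a minimal sketch of a dead key table in which a result can itself be a dead key. It is a toy model, not how Windows or MSKLC implement anything: the choice of an acute accent dead key pressed twice as the first two keystrokes, the table layout, and the name type_keys are all assumptions made purely for illustration.

```python
# Toy model of chained dead keys. Every key and table entry is an
# illustrative assumption, not an actual Windows keyboard layout.
ACUTE = "\u00b4"         # ACUTE ACCENT, the character behind a hypothetical dead key
DOUBLE_PRIME = "\u02ba"  # MODIFIER LETTER DOUBLE PRIME, the intermediate character

# (pending dead character, typed character) -> (result, is the result still dead?)
DEAD_KEY_TABLE = {
    (ACUTE, "U"): ("\u00da", False),         # U WITH ACUTE
    (ACUTE, ACUTE): (DOUBLE_PRIME, True),    # chains: the result is a dead key too
    (DOUBLE_PRIME, "U"): ("\u0170", False),  # U WITH DOUBLE ACUTE
}


def type_keys(keys):
    """Return the text produced by typing the given sequence of keys."""
    out = []
    pending = None  # character of the dead key waiting for its base, if any
    for key in keys:
        if pending is None:
            if key == ACUTE:
                pending = key  # a dead key shows nothing yet
            else:
                out.append(key)
            continue
        match = DEAD_KEY_TABLE.get((pending, key))
        if match is None:
            # Fallback: not in the table, so emit the pending character
            # followed by whatever was typed.
            out.append(pending + key)
            pending = None
            continue
        char, still_dead = match
        if still_dead:
            pending = char  # keep waiting for another keystroke
        else:
            out.append(char)
            pending = None
    return "".join(out)
```

With this table, type_keys([ACUTE, ACUTE, "U"]) yields the single code point U+0170, while type_keys([ACUTE, ACUTE, "x"]) falls back to U+02ba followed by "x", which is the case the intermediate character exists for.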

But this is not so easy when you consider the large number of double diacritics used in a language such as Vietnamese, because in most (perhaps all?) cases the double diacritics themselves are not encoded as single code points anywhere in Unicode. Even in the U+02ba example above, it is really just a workaround (in most cases double prime is probably not what the user would expect).

So if I want to type the following:

(or maybe with the diacritics in the other order?) to have it spit out U+1eb1 (ằ, LATIN SMALL LETTER A WITH BREVE AND GRAVE)? There is no BREVE GRAVE modifier letter to stick in there, so you would essentially need a series of dummy characters that would be inserted in the case of typos where the user did not type the exact sequence.
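A quick way to see the gap is to ask the Unicode character database directly. The sketch below uses Python's standard unicodedata module; it assumes the combining marks U+0306 and U+0300 as stand-ins for the two accents, and the variable names are only illustrative.

```python
import unicodedata

target = "\u1eb1"  # LATIN SMALL LETTER A WITH BREVE AND GRAVE

# Fully decompose the target to see which pieces it is built from.
decomposed = unicodedata.normalize("NFD", target)
names = [unicodedata.name(ch) for ch in decomposed]

# Try to compose the two accents on their own, with no base letter.
# If a single "breve and grave" code point existed and composed
# canonically, NFC would return one character here.
accents_only = unicodedata.normalize("NFC", "\u0306\u0300")
has_single_code_point = len(accents_only) == 1
```

Here names lists the base letter, COMBINING BREVE, and COMBINING GRAVE ACCENT, and has_single_code_point comes out False: the pair of accents stays as two code points, so there is nothing obvious to hang on the intermediate dead key state.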

If a future version of MSKLC were to support this feature, each of these mappings would have to be spelled out explicitly -- in each case with a base character and a combining character -- even though in most of these cases there is no clear choice for what the combining character should actually be. And the user interface in MSKLC would probably need to make all of these associations clear, which may well be the most difficult challenge of all.

The workaround would be to change the order in which the keystrokes were typed so that there is a valid character to place there (i.e. BASE ACCENT ACCENT), but this would require base characters to act like dead keys (where typing a letter would not show any visible character) which is likely to be just as confusing.

A less jarring change would be to require the input to be ACCENT BASE ACCENT so that the second keystroke would have a character behind it -- one that you could then get to by hitting the spacebar if you did not want an additional accent.
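The ACCENT BASE ACCENT order works because the intermediate state is a real, precomposed letter. As a rough check, again using Python's unicodedata with combining marks standing in for the accent keys (the helper name compose is hypothetical), one can see which intermediate and final characters exist, and that the order of the accents matters:

```python
import unicodedata

BREVE, GRAVE = "\u0306", "\u0300"  # combining marks standing in for the accent keys


def compose(pending, combining):
    """Return the single precomposed character for pending + combining, or None."""
    composed = unicodedata.normalize("NFC", pending + combining)
    return composed if len(composed) == 1 else None


# Breve first: the second keystroke has a character behind it (a with breve),
# and the third keystroke turns it into a with breve and grave.
after_second_key = compose("a", BREVE)
after_third_key = compose(after_second_key, GRAVE)

# Grave first: a with grave exists, but there is no single code point for
# a with grave followed by a breve, so the chain has nowhere to go.
other_first = compose("a", GRAVE)
other_order = compose(other_first, BREVE)
```

In this sketch after_second_key is U+0103, the character the spacebar would produce, and after_third_key is U+1eb1. With the accents in the other order, other_order is None, so a layout would have to pick one order or spell out both by hand.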

And then, to take a huge step back, there is no way to know if users would even find either method to be intuitive. The only way to help that would be some type of visual indicator of the character(s) in the pending keystroke buffer. Something similar to what I have heard is on the Mac, or what IMEs on Windows show, maybe?

Oh well, I never said it would be easy....

 

This post brought to you by "ộ" (U+1ed9, a.k.a. LATIN SMALL LETTER O WITH CIRCUMFLEX AND DOT BELOW)


# Andrew Cunningham on 16 Nov 2005 4:53 PM:

I was wondering if it would be feasible to have some flag that would indicate that output should be NFC or NFD

and just type base_character, diacritic, diacritic and when another character of combining class zero was type, then the preceding characters would be converted to NFC or NFD.

Umm, on second thoughts, that sounds somewhat impractical from the point of view of having a visual clue for the user about what's being typed.

# Michael S. Kaplan on 17 Nov 2005 8:37 AM:

It is likely not feasible to do such a thing in user/userk for keyboards, but it should be possible in Text Services Framework TIPs (used for IMEs or more intelligent keyboards).

# Mark Steward on 18 Nov 2005 8:23 AM:

Thank you very much for the TIPS tip - it seems much easier than all the Imm* I was trying to use! I was frustrated by wchComposed, as it means an intuitive Linear B keyboard layout is impossible without IME. I may well pick up that project now.

Perhaps I should write a polytonic Greek IME, too, to satisfy all those BetaCode, etc. lovers who keep contacting you...

Great blog,

Mark

referenced by

2011/04/16 Chain Chain Chain, Chain of Dead Keys

2010/09/28 Like one of those standards that can't/won't be fully implemented

2007/05/04 The limitations of keyboard layouts (again)

2007/03/05 It's not just returning 1; it's returning -1

2007/01/31 A year later, and the Vietnamese keyboard isn't any better

2006/04/22 Getting all you can out of a keyboard layout, Part #10a

2006/03/28 Only ONE WCHAR per dead key
