by Michael S. Kaplan, published on 2011/11/04 07:01 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2011/11/04/10233999.aspx
Previous blogs from this series:
Now the net effect of Part 3 was a whole lot of nothing -- I mean, I pointed out a bug, I claimed to be the person who essentially introduced it, and I apologized for it. I didn't explain what was going on or even give hints as to what the next step would be.
Sorry about that....
Anyway, we'll start over.
That mail Frank Grießhammer of Adobe sent me?
Michael, Maybe you remember – I asked you some questions once before – most of which I could actually deal with myself.
I have been creating a bunch of keyboard layouts for Windows and Mac, to be shipped with the Adobe Pi fonts. The motivation behind this project is providing the user with a method to ‘key’ their symbol glyphs. The layouts were created for the Mac first, the XML was converted to a .klc file using a Python script, the final layout being compiled with MS Keyboard Layout Creator. In the process, I made the following observation, which might be interesting material for your blog:
As of Unicode 5, Unicode values with 5 digits exist. Both Mac and Windows have (at least) four possible shift states for keyboards, I kept them unified across the platforms: ‘alt’ on the Mac would become ‘altGr’ on Windows (e.g. the right ‘alt’-key).
The observation I made has to do with the altGr/altGr+Shift states: If there is a 5-digit unicode value mapped to a key in either of those states, the respective key just won’t return anything on Windows. I filed a bug with your colleagues, and in a long email conversation we came down to the point that unicode support in the OS is – to say the least – leaving a lot to desire. It’s broken. We could rule out MSKLC being the culprit, it really came down to the OS. By the way: I tested my example layout on a developer version of Windows 8 as well, and the same problem exists.
Now here’s my question: Do you maybe have an explanation for that? I see your latest post is about keyboards, so that might fit in just nicely.
Best greetings,
Frank Grießhammer
Type Design & Font Production
Adobe Systems
I could make some minor corrections -- like supplementary characters were first introduced in Unicode 3.1, not 5.0. But that's not the point.
I could point out that as bad as Microsoft may be for not properly supporting a Unicode 3.1 feature, Adobe was not properly supporting a Unicode 2.0 feature (ref: Beauty isn't only glyph deep), but that's not the point either. :-)
The point is that one of the main feature additions in KBDUTOOL.EXE is the ability to define supplementary characters in the LAYOUT table of the .KLC file, which the tool then converts to surrogate pairs and adds to the LIGATURE table. That is how supplementary characters work: as surrogate pairs defined in the LIGATURE table.
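For reference, the conversion from a supplementary character to a surrogate pair is the standard UTF-16 arithmetic. The following is a minimal sketch of that calculation, not the code KBDUTOOL.EXE uses; the function name `to_surrogate_pair` is made up for illustration.

```python
def to_surrogate_pair(code_point):
    """Split a supplementary code point (U+10000..U+10FFFF) into UTF-16 surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError(f"U+{code_point:04X} is not a supplementary character")
    offset = code_point - 0x10000          # 20-bit value
    high = 0xD800 + (offset >> 10)         # top 10 bits -> high (lead) surrogate
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits -> low (trail) surrogate
    return high, low


# U+1D11E MUSICAL SYMBOL G CLEF
high, low = to_surrogate_pair(0x1D11E)
print(f"{high:04x} {low:04x}")  # d834 dd1e
```

The two values printed are the code units that would have to appear in the LIGATURE table for that character.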
Here is where the bug comes in.
When the .KLC file defines supplementary characters in the LAYOUT table in any state other than the BASE or SHIFT states, the supplementary characters are not properly converted to surrogate pairs. So those keys won't work properly.
This is the bug that Frank was reporting.
Now MSKLC.EXE does not have UI for a ligature table -- it just lets you define up to four UTF-16 code units on a key and any time 2, 3, or 4 are defined on a key, it adds them to the LIGATURE table.
Now before you decide to just convert the supplementary characters to surrogate pairs yourself, there is an MSKLC feature that automatically converts surrogate pairs to supplementary characters.
So you can't ever save the file in MSKLC.EXE, because it is designed to do something that exacerbates the problem in KBDUTOOL.EXE.
The only way to define these characters is to calculate the surrogate pairs and add them to the LIGATURE table directly, and then run KBDUTOOL.EXE from the command line to generate the various layout DLLs for the different platform architectures.
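For anyone generating .KLC files from a script, the workaround could be sketched roughly as follows. This is an assumption-laden illustration, not a tested generator: the function name `split_supplementary` and the input shape are hypothetical, and the `%%` and `-1` cell values are taken from the MSKLC-generated sample later in this post. For simplicity it moves every supplementary character to the LIGATURE table, including the BASE and SHIFT states where KBDUTOOL.EXE would have handled them anyway.

```python
LIGATURE_MARKER = "%%"  # LAYOUT cell value seen on ligature keys in the sample
NO_CHARACTER = "-1"     # LAYOUT cell value seen on undefined keys in the sample


def split_supplementary(cells):
    """Prepare one key's LAYOUT cells and collect its LIGATURE entries.

    cells: code points (or None) in SHIFTSTATE column order.
    Returns (layout_cells, ligatures), where ligatures is a list of
    (column_index, high_surrogate, low_surrogate) tuples.
    """
    layout_cells, ligatures = [], []
    for column, cp in enumerate(cells):
        if cp is None:
            layout_cells.append(NO_CHARACTER)
        elif cp > 0xFFFF:
            # Pre-compute the surrogate pair so the tool never has to.
            offset = cp - 0x10000
            high = 0xD800 + (offset >> 10)
            low = 0xDC00 + (offset & 0x3FF)
            ligatures.append((column, high, low))
            layout_cells.append(LIGATURE_MARKER)
        else:
            layout_cells.append(f"{cp:04x}")
    return layout_cells, ligatures


# Columns: base, Shft, Ctrl, Ctrl Alt, Shft Ctrl Alt
layout_cells, ligatures = split_supplementary(
    [0x0061, 0x0041, None, 0x1D11E, 0x1F600]
)
print(layout_cells)
print([(col, f"{hi:04x}", f"{lo:04x}") for col, hi, lo in ligatures])
```

The resulting file would then go straight to KBDUTOOL.EXE on the command line, without ever being opened and saved in MSKLC.EXE.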
This is pretty awful, and I'm a little embarrassed no one found this bug.
I'm even more embarrassed that I never found this bug.
For Frank (who is generating his .KLC files from a Python script), the problem is slightly less onerous to deal with, though it is still ugly.
For a look at the syntax, you can create some actual LIGATURE entries and see the syntax MSKLC.EXE uses for the table:
SHIFTSTATE
0 //Column 4
1 //Column 5 : Shft
2 //Column 6 : Ctrl
6 //Column 7 : Ctrl Alt
7 //Column 8 : Shft Ctrl Alt
LAYOUT ;an extra @ at the end is a dead key
//SC VK_ Cap 0 1 2 6 7
//-- ---- ---- ---- ---- ---- ---- ----
29 OEM_3 0 %% %% -1 %% %% // <null>, <null>, <none>, <null>, <null>
39 SPACE 0 0020 0020 -1 -1 -1 // SPACE, SPACE, <none>, <none>, <none>
53 DECIMAL 0 002e 002e -1 -1 -1 // FULL STOP, FULL STOP, , ,
LIGATURE
//VK_ Mod# Char0 Char1 Char2 Char3
//---- ---- ---- ---- ---- ----
OEM_3 0 0041 0041 // LATIN CAPITAL LETTER A + LATIN CAPITAL LETTER A
OEM_3 1 0042 0042 // LATIN CAPITAL LETTER B + LATIN CAPITAL LETTER B
OEM_3 3 0043 0043 // LATIN CAPITAL LETTER C + LATIN CAPITAL LETTER C
OEM_3 4 0044 0044 // LATIN CAPITAL LETTER D + LATIN CAPITAL LETTER D
Obviously only the ALTGR and SHIFT+ALTGR states will need this done, but I defined a single ligature in all four states so you can see the syntax.
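One detail worth noticing in the sample above: the Mod# column appears to hold the position of the state within the SHIFTSTATE list, not the shift state value itself (Ctrl Alt is shift state 6 but shows up as Mod# 3). Taking that reading as an assumption, a script could format LIGATURE rows along these lines. The helper names are hypothetical, and the tab separator is a guess at the whitespace, which does not survive in the sample as shown here.

```python
SHIFTSTATES = [0, 1, 2, 6, 7]  # values from the SHIFTSTATE section of the sample


def mod_index(shift_state, shiftstates=SHIFTSTATES):
    """Return the Mod# for a shift state: its position in the SHIFTSTATE list."""
    return shiftstates.index(shift_state)


def ligature_row(vk, shift_state, units):
    """Format one LIGATURE row from a VK name, a shift state, and UTF-16 code units."""
    if not 2 <= len(units) <= 4:
        raise ValueError("a LIGATURE entry holds 2 to 4 UTF-16 code units")
    fields = [vk, str(mod_index(shift_state))] + [f"{u:04x}" for u in units]
    return "\t".join(fields)  # tab-separated: an assumption


# U+1D11E as a surrogate pair on OEM_3 in the Ctrl Alt (AltGr) state
print(ligature_row("OEM_3", 6, [0xD834, 0xDD1E]))
```

Run on the Ctrl Alt state, this yields a row with Mod# 3, matching the third LIGATURE line in the sample.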
For the syntax of KBDUTOOL.EXE, you can see other blogs I've done before like In case you have a yen to extend your keyboard (or at least want a yen?).
For the record, I consider this a must fix bug in MSKLC, or more specifically the KBDUTOOL.EXE binary that is within the MSKLC package.
In fact, I think between this bug and the need to remove enough of the IA64 support to shrink the download and created setup package size (discussed in Still with the Itanium?), and the bug discussed in A picture that *still* can't be easily described with words, someone really ought to do a little dev work and release a tiny update -- MSKLC 1.5!
Now in the interests of turning this blog post into a mini "IStartedSomething.com" type issue, does anyone agree with me that these three bugs are worth fixing? Yes, they can all be worked around, but they all suck IMNSHO.
Okay, and now with that said this series is officially off track (assuming I am not given the job to fix MSKLC!), so in the next part, I'll return to topic....
Doug Ewell on 4 Nov 2011 7:22 AM:
So since you've told us 538 times now that you are no longer responsible for updating MSKLC, how do we get hold of the people who are?
Michael S. Kaplan on 4 Nov 2011 8:27 AM:
It actually isn't clear (to me, at least!) who the owners are, at this point. Which makes pointing out the need all the more important (it's how I got our GM to approve 1.4, overruling the decision of the leads)....
Michael S. Kaplan on 4 Nov 2011 8:57 AM:
And of course the question remains -- do you believe it's worth doing the work?
Van on 5 Nov 2011 1:55 AM:
Yes, I believe it is worth updating MSKLC to handle surrogates in AltGr shift states (and removing IA64/fixing the blank window problems). I also believe it is worth updating MSKLC to at least not remove, and if possible support chained dead keys.
Michael S. Kaplan on 5 Nov 2011 12:45 PM:
Hey Van -- Supporting chained dead keys is a whole different kettle of fish, and the UI changes that would be required are decidedly non-trivial....
Van on 5 Nov 2011 10:48 PM:
That's why I wrote that it should at least not remove chained dead keys. The ideal would be that you could generate them within MSKLC, but the fact that the program actively strips previously defined chained dead key definitions from files that it opens is decidedly non-user-friendly. That should be fixed, even if we aren't going to get a little dead key checkbox and ... button in the dead key dialogue.
Michael S. Kaplan on 6 Nov 2011 12:45 AM:
The trouble is that it only can save what it can read; and making it able to read them is complicated....
Andrew_Cunningham on 6 Nov 2011 3:41 PM:
I'd like to see an update of MSKLC.
Andrew West on 7 Nov 2011 7:06 AM:
We'd all like to see an update to MSKLC. If MS doesn't feel it is worthwhile continuing to support this tool maybe they should just release the source code and let the user community do the dirty work (yeah, I know this is never going to happen).
Michael S. Kaplan on 7 Nov 2011 7:21 AM:
It isn't a matter of "worth it" per se -- since the tool is central to how we support keyboards ourselves. It's just the resources to update it that are the tricky part. The open source route seems highly unlikely, though....
Matthew Slyman on 16 Apr 2013 11:09 PM:
@MichaelKaplan:
> "The trouble is that it only can save what it can read; and making it able to read them is complicated...."
So, read them as a plain string, and write them as a plain string. Beginners won't notice the dummy "functionality" is there. And advanced users (who will have put those annotations there on purpose) won't curse MSKLC for stripping them out!
At least, one might have an option in a MSKLC v1.4b, to NOT strip out the extra chained dead key information?
Matthew Slyman on 16 Apr 2013 11:28 PM:
@MichaelKaplan: Will you put me in touch with Frank Griesshammer please? I've been looking for ways to bridge the gap between .KLC and Apple's XML-based format (that I haven't studied much yet). I'm developing a range of new keyboard layouts that I hope to propose/promote as a new international standard. (Perhaps naive of me, but what would the world be like without crazy people like me?)