by Michael S. Kaplan, published on 2006/08/18 11:11 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2006/08/18/706063.aspx
Not too long ago, Tharaka asked in the Suggestion Box:
Hi,
I’m working on a Transliterating Input Method for the Sinhala language. One that would allow Sinhala to be entered phonetically. I.e., you would enter ‘ka’ to get KAYANNA (“\u0D9A”), ‘kaa’ to get (“\u0D9A\u0DCF”), ‘kae’ to get (“\u0D9A\u0DD0”) and ‘k’ to get (“\u0D9A\u0DCA”), and so on. And it should work with any (or at least most) existing applications.
The need for this is that the existing layout for Sinhala (Wijesekara) is very hard to use with a non-Sinhala keyboard. I.e., it would require an actual Sinhala keyboard with Sinhala letters printed on the keys. It is very hard to enter Sinhala with, say, a US keyboard. For the relative lack of Sinhala keyboards on the market and to avoid the hassle of having to buy a Sinhala keyboard just to type a few sentences in Sinhala, it is useful to have such a phonetic mechanism. Since this is how we type Sinhala informally (e.g. while chatting), most Sri Lankans are used to such ‘phonetic’ typing.
After some poking around I came to the conclusion that IMM is old hat and the new way is to use the Text Services Framework (TSF) to build input methods. Then, I started looking for a .NET binding for TSF (since it’ll be much easier) but found there was none. Therefore, I started with VC++ 8 (with CLR support) to build my input method, hoping to use .NET facilities for common tasks such as reading/writing XML configs files and the composition window and some GUI elements.
However, working with TSF, I came across many problems. First of all, there seems to be very little documentation about TSF, even on the Internet. The TSF reference cannot even be reached from the Visual Studio 2005 MSDN index. The API seems to be so complex, so obfuscated that it led me to suspect that TSF is a phased-out API.
TSF and .NET does not seem to mix properly as well. I got access violations while trying to load a mixed-code input method DLL in some applications (Notepad.exe) while working fine in others.
The questions I have are these:
*) Is it possible to build a transliterating input method (as I plan to do) with TSF?
*) Is TSF “alive”? Has it been phased-out/deprecated in favor of something else?
*) Is mixing .NET with TSF bad? Do I have to work in pure C++ (*pain!)
I would be very glad if you could shed some light on these questions, so that I can be sure I’m not on a wild goose chase with TSF.
Thanks!
Tharaka
The Text Services Framework is definitely alive and well -- in fact, in Vista virtually all of the Input Method Editors (IMEs) have been converted to use it, and the input methods for Yi and Amharic both use it as well.
Unfortunately, I do not know of any specific way to allow for a managed (.NET) TSF Text Input Processor. I will inquire further but I suspect that this is not possible given how it has to be integrated into essentially any thread using it for input, whether managed or not.
But the good news is that such a transliterating input method is quite possible with the Text Services Framework. And companies like Murasu have actually created such input methods for Tamil and other languages already, when simple keyboards are simply inadequate. This is the model for the input methods used for Amharic in Vista, for example.
It is even quite easy in Vista using the same techniques I used to create the Cantonese and Unicode IME samples I have been working on. If you wanted to send me the table containing all of the equivalences you are using, i.e.
"ka" = "\u0D9A"
"kaa" = "\u0D9A\u0DCF"
"kae" = "\u0D9A\u0DD0"
"k" = "\u0D9A\u0DCA"
and so on, I'll see if I can add another sample to the list....
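To make the idea of a table of equivalences concrete, here is a minimal sketch in Python of how such a table could drive a transliteration. It is not TSF code and not the sample mentioned above; it only illustrates one plausible lookup strategy (greedy longest match) over a complete string. The names `TABLE` and `transliterate` are hypothetical, and the table holds only the four entries quoted in this post. A real input method would also have to handle keystrokes incrementally, which this sketch does not attempt.

```python
# Hypothetical mapping table: only the four entries quoted in the post.
TABLE = {
    "ka": "\u0D9A",
    "kaa": "\u0D9A\u0DCF",
    "kae": "\u0D9A\u0DD0",
    "k": "\u0D9A\u0DCA",
}


def transliterate(text, table=TABLE):
    """Replace Latin key sequences using greedy longest-match lookup.

    Characters that start no known sequence are passed through unchanged.
    This is an illustrative assumption about how the table might be applied,
    not a description of any shipping input method.
    """
    max_len = max((len(key) for key in table), default=0)
    out = []
    i = 0
    while i < len(text):
        # Try the longest candidate first so "kaa" wins over "ka" and "k".
        for length in range(min(max_len, len(text) - i), 0, -1):
            chunk = text[i:i + length]
            if chunk in table:
                out.append(table[chunk])
                i += length
                break
        else:
            # No table entry starts here; keep the character as typed.
            out.append(text[i])
            i += 1
    return "".join(out)
```

Trying the longest sequence first matters because the keys share prefixes: without it, "kaa" would be consumed as "ka" followed by a stray "a".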
This method does not currently work in versions of Windows prior to Vista, although to be honest the font and shaping support for Sinhala is not widely available on those versions either (other than the earlier version that was released as described here; the font and shaping engine have been significantly enhanced since then).
Some form of an input method like this, if it gains wider acceptance in the community and by language experts, could eventually find itself considered for inclusion in a future version of Windows!
So Tharaka, if you can just send me your email contact info via that Contacting Michael... link, we can talk further about how to get the info transferred and get the sample put together!
This post brought to you by ඐ (U+0D90, a.k.a. SINHALA LETTER ILUUYANNA)
referenced by
2008/07/25 Behold the Table Driven Text Service, Part 13 (Sinhalification proclamation!)
2008/06/21 Back to Sri Lanka (conceptually)
2008/01/21 Behold the Table Driven Text Service, Part 0 (You have to start somewhere!)
2006/09/17 And we are the knights who say நீ (NII)
2006/08/26 I must admit that an example would be nice