The candidates might have two dimensions (at least!)

by Michael S. Kaplan, published on 2006/12/30 03:10 -05:00, original URI: http://blogs.msdn.com/b/michkap/archive/2006/12/30/1383684.aspx


Back in early October, Yao Ziyuan (a.k.a. 'Booted Cat') posted a suggestion for Microsoft in the microsoft.public.word.international.features newsgroup. Although I believe the suggestion has indeed been forwarded on appropriately, the message is about to scroll off the group and I thought it would be better to get it somewhere a bit more visible that doesn't have quite the same 'scrolling' characteristic....

Plus it inspired a few things I wanted to say something about. :-)

The post was titled: A Feature Suggestion for Microsoft Chinese PinYin IME. And here is the content of the post:

I wish there can be a mode in which the homophone candidates can be displayed in multiple rows. On the first row are the most frequently used word candidates and homophone character candidates. The subsequent rows divide the other homophone character candidates according to a characteristic. The characteristic can be:

  1. By type of tone. Chinese characters have 5 types of tones: type-1 through type-5.
  2. By common radical. Homophone characters usually can be grouped according to radicals commonly shared. That is, some of them can have a common radical X, some others Y, yet some others Z, and so on. This is like that in a set of integers, some of them have a common divisor, some others have another common divisor, and so on. And the rest which can't be classified into any prior group are put on a last row.
  3. By semantic category. This is tricky and may only stay in theoretical speculation. Top semantic categories are like "concrete objects", "abstract concepts", "verbs", "adjectives", "grammatical auxiliary characters", and more specific categories can be derived from an existing category. Thus character selection would look like exploring a tree.

In case there are too many rows to display, a vertical scroll bar can come to help.

On each row, candidate characters can be sorted by computed probability of occurring in the current context.

This idea could improve the efficiency for the user to select a desired character candidate significantly.

Regards,
Yao Ziyuan

This is an interesting idea, though one different enough that, were Microsoft ever to implement it, I would expect it to be a new PinYin IME rather than a feature enhancement for the one currently available.

For the most part, though, people actually type the tone number, so that first idea of having different rows for different tones would quickly lead to just two rows whose principal "feature" might be blocking the screen in a new way, which could hurt productivity since people may be used to what is being blocked now. Just something to consider; obviously some thought about the actual results of using such an input method would need to happen before anything is considered or prototyped or whatever.
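To make that concrete, here is a minimal sketch in Python of a tone-grouped candidate grid. Everything in it is an assumption made for illustration: the syllable "ma", the five sample characters, the hand-picked "frequent" row, and the helper name `build_rows` are hypothetical and do not come from any actual IME. It only shows how the grid shrinks to two rows once a tone number is typed.

```python
from collections import defaultdict

# Illustrative data only: (character, tone) pairs for the syllable "ma".
# Tone 5 stands for the neutral tone.
MA_CANDIDATES = [("妈", 1), ("麻", 2), ("马", 3), ("骂", 4), ("吗", 5)]

# Hypothetical "most frequent" row; a real IME would rank by context.
MA_FREQUENT = ["吗", "马"]


def build_rows(candidates, frequent, typed_tone=None):
    """Return (label, characters) rows: a frequent row, then one row per tone.

    If typed_tone is given, only characters with that tone are kept.
    """
    tone_of = dict(candidates)

    def keep(char):
        return typed_tone is None or tone_of[char] == typed_tone

    rows = [("frequent", [c for c in frequent if keep(c)])]

    by_tone = defaultdict(list)
    for char, tone in candidates:
        if keep(char):
            by_tone[tone].append(char)

    for tone in sorted(by_tone):
        rows.append((f"tone {tone}", by_tone[tone]))
    return rows


if __name__ == "__main__":
    # No tone typed: one frequent row plus five tone rows.
    for label, chars in build_rows(MA_CANDIDATES, MA_FREQUENT):
        print(label, ":", " ".join(chars))

    # Tone 3 typed: only the frequent row and a single tone row remain.
    for label, chars in build_rows(MA_CANDIDATES, MA_FREQUENT, typed_tone=3):
        print(label, ":", " ".join(chars))
```

With no tone typed the sketch produces six rows; with a tone typed it produces two, which is the collapse described above.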

Clearly as one is typing one is using language, which makes it hard to try to quantify specific phonemic or orthographic or semantic or other dimensions of language and simply make them the second dimension, ignoring the others. This gets us into the area of how the mind works, and how language works -- how do we think about language as we write, and with that how does a person using ideographs find the right ideograph in their mind?

An input method that could capture that would have an edge over just about anything else out there, couldn't it? :-)

I don't know how much study has been done in this area in academic circles, or whether there is work that could be captured in an input method that would feel appropriate to users. Does anyone know?

On the other hand, when I look at my keyboard I don't see nearly that much of a connection between language as my brain processes it and the way it is laid out. So perhaps thinking this would be a great idea is kind of flawed reasoning even in the case of an IME? I doubt that the way these ideographs are grouped really relates to how people think about language while they are typing....

People tend to get quite attached to their methods of input, as they represent a particular stability in their computing life that can be quite traumatic to upset. And while many people could probably see the obvious benefit to both speed and accuracy that the above could bring, there will always be people who really don't want the extra complication that more information can bring.

I know a bit about some of this after having switched to do a lot more of my writing via Dragon Dictate, saving my typing time for code (and thus avoiding exhausting my ability to do so too early!). Dealing with the display of the various candidate lists to maximize productivity and minimize annoyance is a complicated affair, whether it is built in IMEs or expensive bits of custom software.

In their own way, my comments are a slightly more constructive extension of the ideas I pointed out in the provocative post Your layout (in all likelihood) bores me.

That post was of course aimed at a different audience -- the people who were basically hoping to sell some unique keyboard layout to Microsoft since they were sure it would be a great thing for everyone (including them, given Microsoft's cash reserves?). And Yao Ziyuan's post, which was presented more constructively, is in my mind more deserving of a constructive response?

Terribly judgmental of me, I will admit. Though given the subjective doctrines under which this blog is run (i.e. stuff that interests me), I guess my judgment is the arbiter of what will show up here, and the idea of my judgment being judgmental hardly seems all that unreasonable? :-)

Now input method editors have a complicated task (no matter how easy I might think they have it in other contexts!). When one considers the problems of making available thousands or even tens of thousands of possible characters, I think that the free flow of ideas on ways to improve the experience cannot ever do harm whether the ideas lead to solutions or not.

I even have a naive hope that it will help add a little perspective to those who feel a need to muck with keyboard layouts. In actuality it won't help, but that's why it is a naive hope. :-)

In its own way, the suggestion might even really be pointing to separate IMEs rather than trying to bundle them into the same IME. The proposed groupings are all trying to find ways to "break the ties" between characters that are identical according to what the user types, so that it is easier for the user to get the character they want, and it is unclear how often all of the methods would be used by the same person such that glomming them into one IME would itself help productivity.

The principal suggestion that I think is somewhat unique among the IMEs that Microsoft ships is the idea of expanding the candidate list in a new dimension, adding the X axis to the usual Y axis as a way to represent the information. Now I may be actually wrong in thinking this is unique (I don't actually know about all of the IMEs) and perhaps there are even users of IMEs who feel the same way about such an innovation as I feel about people who muck with the CAPS LOCK key. But it seems like an interesting idea, from the outside, at least....

My point? I may not actually have one (or perhaps I just have too many so they are all milling about aimlessly on your screen). But I thought that the idea of a [visible] two-dimensional candidate list could probably use a few more eyes.

Maybe it already exists? It seems like one of those ideas that is so obvious to everyone after someone comes up with it, doesn't it?

Or maybe some other company will try to get a patent on this idea after reading about it in the newsgroups or in this little blog, and then if Microsoft actually implements it a few years later they can go sue Microsoft for violating the patent. In which case this blog post can perhaps help them to feel foolish in court. I think that if there were more opportunity for people trying to patent things to feel foolish, the world could only get better.... :-)

 

This post brought to you by ё (U+0451, a.k.a. CYRILLIC SMALL LETTER IO)


# Michael S. Kaplan on 31 Dec 2006 9:05 PM:

Due to the random attempts (four so far) of random keyboard creators to spam this post with commercials and other descriptions of their wares, I am forced to moderate all comments on it.

Sorry to those who might be impacted negatively by this, and hopefully the good people who have thoughts about the IME suggestion that was raised or the related issues I raised will still be willing to comment to this post in a normal ON TOPIC manner.

# Yao Ziyuan on 31 Jan 2007 9:05 AM:

Believe it or not, I am the Yao Ziyuan in this context :) I came across this post by searching the blogosphere for my name.

Grouping by common tone is practical because most Chinese (including me) actually don't type tone numbers (1-4).

Actually, the most promising approach is by "common radical", for which I didn't give an example in my original suggestion. Here's one: (All Chinese homophones pronounced as "gang")

常用 : 1 刚 2 港

冈  : 1 钢 2 岗 3 纲 4 冈

工  : 1 缸 2 杠 3 肛 4 扛 5 冮 6 矼 7 釭

罡罔 : 1 罡 2 堽 3 掆 4 棡

~  : 1 筻 2 戆

The first row displays the most commonly used characters; Row 2 through Row 4 display characters of a common radical; Row 5 displays the unclassified characters.

Other sets of Chinese homophones also can be efficiently classified in this way.

Generally, Chinese inputting is done on a word basis, where individual character selection doesn't play a role. However, when it comes to inputting a person's name or a company's name (which are composed of individual characters rather than dictionary words), individual character selection methods such as the above-mentioned idea can speed up the user's effort in homing in on the desired character.

# Michael S. Kaplan on 31 Jan 2007 9:39 AM:

I believe you. :-)

Of course, typing in names is not an incredibly common (or even the most common) task one might do with an IME -- which makes it a hard sell to work as the central method for how an IME should work....

# Yao Ziyuan on 31 Jan 2007 10:42 PM:

I'm sorry I made a mistake in that example. Here's the corrected one:

(All "gang"-pronounced homophones grouped by common radical)

常用 : 1 刚 2 港

工  : 1 缸 2 杠 3 肛 4 扛 5 冮 6 矼 7 釭

冈岡 : 1 钢 2 岗 3 纲 4 冈 5 掆 6 棡

罡  : 1 罡 2 堽

~  : 1 筻 2 戆
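As a rough illustration of the corrected grouping above, here is a minimal Python sketch that builds the rows from a flat character-to-component table and then picks a candidate by row and column. The table is copied from the corrected example; the names `COMPONENT_OF`, `group_by_component`, `render`, and `select` are hypothetical, and a real IME would derive both the components and the ordering from its own dictionary data rather than a hand-written table.

```python
# Character -> shared component, copied from the corrected "gang" example.
# None marks characters that fit no group (the "~" row).
COMPONENT_OF = {
    "缸": "工", "杠": "工", "肛": "工", "扛": "工",
    "冮": "工", "矼": "工", "釭": "工",
    "钢": "冈岡", "岗": "冈岡", "纲": "冈岡", "冈": "冈岡",
    "掆": "冈岡", "棡": "冈岡",
    "罡": "罡", "堽": "罡",
    "筻": None, "戆": None,
}

# The commonly used row from the example.
FREQUENT = ["刚", "港"]


def group_by_component(component_of, frequent):
    """Return (label, characters) rows: frequent first, leftovers last."""
    rows = {"常用": list(frequent)}
    leftovers = []
    for char, component in component_of.items():
        if component is None:
            leftovers.append(char)
        else:
            # Dicts keep insertion order, so rows appear in first-seen order.
            rows.setdefault(component, []).append(char)
    rows["~"] = leftovers
    return list(rows.items())


def render(rows):
    """Format each row as 'label : 1 X 2 Y ...'."""
    lines = []
    for label, chars in rows:
        numbered = " ".join(f"{i} {c}" for i, c in enumerate(chars, start=1))
        lines.append(f"{label} : {numbered}")
    return lines


def select(rows, row, col):
    """Pick a candidate by 1-based row and column, the two dimensions."""
    return rows[row - 1][1][col - 1]


if __name__ == "__main__":
    rows = group_by_component(COMPONENT_OF, FREQUENT)
    print("\n".join(render(rows)))
    print(select(rows, 3, 1))
```

Here `select(rows, 3, 1)` picks 钢, the first candidate on the third row, so a selection takes two short keystrokes instead of paging through one long list.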

