I Kana understand you, could you repeater that? (Part 2)

by Michael S. Kaplan, published on 2007/04/01 05:55 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2007/04/01/2004424.aspx


Moving past the issues I discussed in I Kana understand you, could you repeater that? (Part 1), there are some additional ones of interest.

There are several characters here that can be added to the discussion beyond U+3005 and U+303b:

U+3031 (〱) -- VERTICAL KANA REPEAT MARK

U+3032 (〲) -- VERTICAL KANA REPEAT WITH VOICED SOUND MARK

U+3033 (〳) -- VERTICAL KANA REPEAT MARK UPPER HALF

U+3034 (〴) -- VERTICAL KANA REPEAT WITH VOICED SOUND MARK UPPER HALF

U+3035 (〵) -- VERTICAL KANA REPEAT MARK LOWER HALF

Now those last three characters, though usually expected in isolation, have some conceptual equivalences with the preceding two. Thus:

<U+3033, U+3035> ≈ U+3031 and <U+3034, U+3035> ≈ U+3032

Now as I pointed out in Knock knock! Who's there? Kana! Kana Who?, they are not going to be exactly equal; the idea is for them to sort near to each other while still being distinguishable from each other.
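One way to picture "near each other but still distinguishable" is a two-level sort key: a folded form that treats the split spelling like the single mark, plus the original text as a tie-breaker. The sketch below is only an illustration under that assumption, not a description of how any real collation table is built; the names SPLIT_TO_WHOLE, fold_split_marks and near_equal_key are made up for the example.

```python
# Hypothetical folding table: the two-part (upper half + lower half) spellings
# are mapped to the single-character marks they conceptually match.
SPLIT_TO_WHOLE = {
    "\u3033\u3035": "\u3031",  # upper half + lower half -> VERTICAL KANA REPEAT MARK
    "\u3034\u3035": "\u3032",  # voiced upper half + lower half -> voiced repeat mark
}


def fold_split_marks(text: str) -> str:
    """Replace each split spelling with its single-character counterpart."""
    for split, whole in SPLIT_TO_WHOLE.items():
        text = text.replace(split, whole)
    return text


def near_equal_key(text: str) -> tuple:
    """Sort key: folded form first, original spelling as the tie-breaker."""
    return (fold_split_marks(text), text)


whole = "ぽた\u3031"
split = "ぽた\u3033\u3035"
voiced = "ぽた\u3032"

# With the key, the two spellings of the unvoiced mark sort side by side;
# a plain code point sort would put the voiced mark between them.
print(sorted([split, voiced, whole], key=near_equal_key))
```

The two spellings share the first element of the key, so they land next to each other, while the second element keeps them from comparing as equal.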

Note that both of them are generally expected to repeat the previous word rather than the previous character. That may seem a bit unlike U+3005 and U+303b, though since the latter two are usually used with Kanji, they too are being used with "words" rather than "letters".

This is an area where Microsoft's implementation falls a little short of expectations. It gets a little better in Vista (some of the characters that were not previously identified as repeaters were "hooked up"), but it still falls a bit short of the target, given that it does not repeat enough of the preceding characters.

Some examples of what ideally ought to be very similar (special thanks to Anan-san for the examples!):

涙をぽた<U+3033, U+3035> 落とした。  ≈  涙をぽたぽた落とした。
(repeating one hiragana word with two hiragana letters)

なんだかビク<U+3033, U+3035> と絲にあたる  ≈  なんだかビクビクと絲にあたる
(repeating one katakana word with two katakana letters)

彼はさん<U+3034, U+3035> な目に遭った。  ≈  彼はさんざんな目に遭った。
(repeating one hiragana word with two hiragana letters but changed with voiced SA in the iteration)

うんどうかた<U+3034, U+3035> でかける。  ≈  うんどうかたがたでかける。
(another case of repeating one hiragana word with two hiragana letters, but changed with voiced KA in the iteration)

出来ん<U+3033, U+3035> という声が聞こえる。  ≈  出来ん出来んという声が聞こえる。
(one kana/kanji mixed word with three kana/kanji letters)

代わる<U+3034, U+3035> これをつとめる。  ≈  代わる代わるこれをつとめる。
(an example of one kana/kanji mixed word with three kana/kanji letters but changed with voiced KA when pronounced in the iteration)
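The expansions above can be sketched in a few lines of Python. This is a minimal illustration, not what Microsoft's implementation does: the mark itself does not say how long the repeated word is, so the hypothetical expand_repeat function takes that length as a unit_length argument supplied by the caller. Voicing is approximated by composing the first repeated character with U+3099 (COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK); when that does not produce a single character, as with a Kanji, the text is repeated unchanged, matching the 代わる example where the voicing only shows up in pronunciation.

```python
import unicodedata

UNVOICED_MARKS = ("\u3031", "\u3033\u3035")  # single mark, or upper + lower half
VOICED_MARKS = ("\u3032", "\u3034\u3035")    # voiced mark, or voiced upper + lower half


def voice_first(unit: str) -> str:
    """Voice the first character of unit if a precomposed voiced form exists."""
    composed = unicodedata.normalize("NFC", unit[0] + "\u3099")
    if len(composed) == 1:
        return composed + unit[1:]
    return unit  # e.g. Kanji: the voicing is not visible in the spelling


def expand_repeat(text: str, unit_length: int) -> str:
    """Replace each vertical kana repeat mark with the preceding unit_length characters.

    unit_length is an assumption made by the caller; it is not stored in the text.
    """
    for marks, voiced in ((VOICED_MARKS, True), (UNVOICED_MARKS, False)):
        for mark in marks:
            pos = text.find(mark)
            while pos != -1:
                unit = text[max(0, pos - unit_length):pos]
                repeat = voice_first(unit) if voiced and unit else unit
                text = text[:pos] + repeat + text[pos + len(mark):]
                pos = text.find(mark)
    return text


print(expand_repeat("涙をぽた\u3033\u3035落とした。", 2))
print(expand_repeat("うんどうかた\u3034\u3035でかける。", 2))
```

The example strings are written without the space that follows the mark sequence in the examples above, so the output lines up directly with the right-hand sides.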

This area has in it the kind of change that would be worth considering for a future version....

(In the next part I'll get into more of the specifics of what Microsoft's implementation does here, both good and bad!)

 

This post brought to you by 々 (U+3005, a.k.a. IDEOGRAPHIC ITERATION MARK)

