The time has come to fix a bug that has been in Windows more than twice as long as I have...

by Michael S. Kaplan, published on 2013/02/13 07:01 -05:00, original URI:

The other day, I was reminded about two old blogs of mine:

Here's the art:

Anyway, I was reminded when a colleague asked me whatever happened to the bug (it was too late in the cycle to fix at that time):

Hello Michael,

May I ask what was the decision on bug described in your WideCharToMultiBytes vs. DrawTextW blog post? Was a bug logged for this issue?

Why I'm asking: From time to time we receive questions about spacy Cyrillic in Visual Studio on EA OS's. Though we're telling them that there's nothing wrong with their code and that there's a workaround in VS (set font to Segoe UI instead of Automatic which forces font to be inherited from OS in VS settings, more on that here), it would be great to provide more details to those who ask. Reference to your blog post is good but bug # and your (Windows) final decision in addition would be even better.


I thought maybe I should go follow up on that bug in DrawText, since we have time.

The bug is easy to fix, as I pointed out in that second blog:

The bug is in using WideCharToMultiByte to detect "wide" characters by converting them using a code page.
That test is unreliable because every CJK code page has some characters that are not double width but are situated in one of the two-byte ranges inside the code page (other characters that can hit this problem include random symbols and such).
Clearly to fix this in any version would require a different, more reliable test for full width characters!
For example, a simple call to GetStringTypeW(CT_CTYPE3, ...) checking for the C3_FULLWIDTH character type flag -- the non locale specific code that should have been there all along....
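
To make the failure mode concrete, here is a minimal sketch of the kind of byte-count test described above. It is not the actual DrawText code: the helper name looks_wide_by_code_page is hypothetical, and Python's cp932 codec is used as a stand-in for calling WideCharToMultiByte with code page 932, on the assumption that the two agree for these characters.

```python
def looks_wide_by_code_page(ch: str, code_page: str = "cp932") -> bool:
    """Hypothetical stand-in for the flawed heuristic: call a character
    "wide" if it converts to two bytes in the given CJK code page."""
    try:
        return len(ch.encode(code_page)) == 2
    except UnicodeEncodeError:
        # Character is not representable in this code page at all.
        return False


if __name__ == "__main__":
    samples = {
        "Latin a": "\u0061",
        "Cyrillic A": "\u0410",
        "Ideograph": "\u4e9c",
        "Katakana HW": "\uff71",
    }
    for name, ch in samples.items():
        print(f"U+{ord(ch):04X} {name}: wide={looks_wide_by_code_page(ch)}")
```

The Cyrillic letter comes back as "wide" only because it happens to live in a two-byte range of the code page, which is the kind of false positive described above.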

I think I'll go have a chat with some DrawText dev owners.

I'll even offer to do the work if they want.

Or maybe I'll code up the fix and ask them to do the CR.

The fact that a partner team noticed the bug and saw my blog explaining how easy the fix is won't hurt my case.

Bad code is just a bad idea, doncha think?

The fact that I'm offering to do all this here is fun, too.

In case there was any doubt, I'm back! :-)

Random832 on 13 Feb 2013 9:10 AM:

!!!WARNING!!!: Ideographs (and Hiragana), which are probably what the original test is _meant_ to detect (since it's deciding what to treat as "a word"), _do not_ set C3_FULLWIDTH. I detected this in writing a test program to look at my question from last time about what it returns for Cyrillic.

Also - the assertion "The various non-full-width CJK fonts don't give the Cyrillic characters full width visually" is NOT true of MS PGothic or PMingLiU. And I don't know where you could possibly get the idea that "you'd just get fat Cyrillic", those fonts give the characters the extra width by making more white space, not by stretching the glyphs. Did you only check Meiryo, or did you only look at the glyphs in isolation without putting them next to each other?

And why would it be adding spaces just because it passes this [however broken or not] test? It doesn't do so for the Ideographs, which also presumably set cChars = 2.

Incidentally, here's the output of a test program I wrote to see what flags are set by what characters:

Char    Val  F H ----
U+0061: 8040 0 1 0000 (Latin)
U+03b1: 8000 0 0 0000 (Greek)
U+0410: 8000 0 0 0000 (Cyrillic)
U+ff41: 8080 1 0 0000 (Latin FW)
U+4e9c: 8100 0 0 0100 (Ideograph)
U+3042: 8020 0 0 0020 (Hiragana)
U+30a2: 8090 1 0 0010 (Katakana)
U+ff71: 8050 0 1 0010 (Katakana HW)
U+0000: 0000 0 0 0000

(The last column is the value with C3_ALPHA|C3_FULLWIDTH|C3_HALFWIDTH masked out - nothing unexpected there, just ideograph, hiragana, and katakana)
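
For readers without a Windows machine handy, the columns above can be reproduced from the raw values alone. The sketch below does not call GetStringTypeW; it only decodes the CT_CTYPE3 values listed in the table, using the C3_* constants as defined in winnls.h. The helper name decode_ctype3 is made up for illustration.

```python
# C3_* flag values as defined in the Windows SDK header winnls.h.
C3_KATAKANA = 0x0010
C3_HIRAGANA = 0x0020
C3_HALFWIDTH = 0x0040
C3_FULLWIDTH = 0x0080
C3_IDEOGRAPH = 0x0100
C3_ALPHA = 0x8000

# CT_CTYPE3 values copied from the table above (code point -> value).
SAMPLES = {
    0x0061: 0x8040,  # Latin
    0x03B1: 0x8000,  # Greek
    0x0410: 0x8000,  # Cyrillic
    0xFF41: 0x8080,  # Latin FW
    0x4E9C: 0x8100,  # Ideograph
    0x3042: 0x8020,  # Hiragana
    0x30A2: 0x8090,  # Katakana
    0xFF71: 0x8050,  # Katakana HW
}


def decode_ctype3(value: int) -> tuple[int, int, int]:
    """Return (fullwidth bit, halfwidth bit, value with
    C3_ALPHA | C3_FULLWIDTH | C3_HALFWIDTH masked out)."""
    full = 1 if value & C3_FULLWIDTH else 0
    half = 1 if value & C3_HALFWIDTH else 0
    rest = value & ~(C3_ALPHA | C3_FULLWIDTH | C3_HALFWIDTH) & 0xFFFF
    return full, half, rest


if __name__ == "__main__":
    for code_point, value in SAMPLES.items():
        full, half, rest = decode_ctype3(value)
        print(f"U+{code_point:04x}: {value:04x} {full} {half} {rest:04x}")
```

Decoded this way, the ideograph and hiragana rows carry C3_IDEOGRAPH and C3_HIRAGANA but not C3_FULLWIDTH, which is the point of the warning above.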

Azarien on 18 Feb 2013 5:55 AM:

There is also a problem with Arabic (Egypt) on this screenshot. The parentheses look wrong.
