Fight the Future? (#8 of ??), aka The Bug(s) Spotted, aka Design flaws are worse than bugs

by Michael S. Kaplan, published on 2008/04/07 07:01 -07:00, original URI: http://blogs.msdn.com/michkap/archive/2008/04/07/8364480.aspx


Content of Michael Kaplan's personal blog not approved by Microsoft (see disclaimer)!
Regular readers should keep in mind that all I said in The End? still applies; the allusion to the X-Files continues for people who understand such references....

It seems like it was just days ago when I blogged Fight the Future? (#2 of ??), aka Spot the Bug(s)!, which provided the following cose asnd aked (dared?) people to find the problem(s) therein:

if( _plv->_ci.dwExStyle & WS_EX_RTLREADING)
{
    if (item.pszText)
    {
        //
        // temp hack for the find.files to see if LtoR/RtoL mixing
        // works. if ok, we'll take this out and make that lv ownerdraw
        //
        if ((item.pszText[0] != '\xfd') && (item.pszText[lstrlen(item.pszText)-1] != '\xfd'))
        {
            textflags |= SHDT_RTLREADING;
        }
    }
}

Now many people pointed out that this code can't handle when the text in item.pszText is a zero-length string (leading to at best a not-so-nice one charcter buffer underrun), and several also went further in guessing that item was meant to be an LVITEM Structure in a list-view control. One person even pointed put what Ben referred to as the principal error being targetted when he asked the question initially:

Short answer:
The comparison is always false because WCHAR is zero-extended but '\xfd' is sign-extended.

Long answer:
Among the solvers, there was confusion about which half of the comparison was signed.  In our project, WCHAR is unsigned, and char is signed.  So '\xfd' is sign extended and pszText[0] will not be, so they always compare as unequal.

This is most evident from the assembly code.  Did you know one dev even closed this bug since they couldn't believe the compiler would do such a thing?


75ae1e1f 0fb708   movzx   ecx,word ptr [eax]   // zero extend a WORD (pszText[0])
75ae1e22 83f9fd   cmp     ecx,0FFFFFFFDh       // compare to a DWORD literal

One person noted that there should be a compiler flag to keep this sign mismatch issue from happening. He is right, it is /J, described in the topic /J (Default char Type is unsigned).

But although this bug does keep the code from ever working, it still is just a bug -- and a quick fix once it is identified.

And then several people noted the more serious issue -- the fact that the code is using a non-Unicode chracter ('\xfd') in the comparison against Unicode characters in the item.pszText -- in an attempt to look for U+200e, the LEFT-TO-RIGHT MARK (which is only '\xfd' in Windows code pages 1255 and 1256, and not in Unicode ever).

Now we get to where the wheels fall off the wagon a bit. :-)

There was a bit of suggestion with this described problem what the actual fix would be, and the "internal" answer for all of this was pretty direct:

The deeper bug requires some context.  One person was kind enough to provide some detailed history:

"It's worse than you think.

The code was originally written for the Mideast version of Windows 95. That version of Windows uses ANSI not Unicode, and the code pages are 1255 (Hebrew) and 1256 (Arabic). In both of those code pages, character 0xFD is Unicode character U+200E (LEFT-TO-RIGHT MARK). The code was protected under #ifdef WINDOWS_ME so it would be active only on Arabic and Hebrew systems.

This code was ported to Unicode without paying attention to the code page assumption hiding behind the #ifdef. Lucky for us, the code was ported incorrectly and the test never succeeds.  A naive "fix" would corrupt Czech strings: The comparison would think that character U+00FD (LATIN SMALL Y WITH ACUTE) is the LTR marker and any string that begins and ends with that character gets treated as Arabic/Hebrew text.

The correct fix is to delete the test entirely. We are all-Unicode now. We don't need an old hack for Hebrew/Arabic Windows 95."

But in some ways I find this answer a little bit wrong and also way less than complete, to be honest.

The hint for my issue here can be found if you look at the comment:

        //
        // temp hack for the find.files to see if LtoR/RtoL mixing
        // works. if ok, we'll take this out and make that lv ownerdraw
        //

What kind of temp hack designed to test a specific feature lives on long enough to make it to a Unicode conversion, intact?

Now the problem with LtoR/RtoL mixing does not go away when you convert to Unicode -- it just gets harder. And the initial hack was indeed a hack because it was never really such a great solution being given.

You can see the underlying real problem in action with the user interface language list -- shown here in Vista on that machine with all of the home-built locales, with an English UI language:

though not in the smaller "official" list with an English user interface language since there are no RTL languages with parentheses listed:

though the bug comes backi to haunt us with a right-to-left user interface language with many examples:

Now this is yet another case of the problem I talked about in Mixing it up with bidirectional text, where any time you "islands of text" within other text that:

  1. does not have the same directionality as the overall user interface, and
  2. either the first character or the last one has a neutral Bidi class

then one needs to put in a non-neutral character such as U+200e (LEFT-TO-RIGHT MARK) or U+200f (RIGHT-TO-LEFT MARK) -- depending on the desired directionslity of the island.

The "old" temp hack fix -- presumably only running on Hebrew or Arabic Win9x -- was

So the person who suggested the code could just be removed was right -- if you are willing to live with strings that have serious potential to look wrong in any of those directiolaity-spanning scenarios.

What should be there? Well, an algorithm that:

  1. Looked at the directionality of the first and last characters/pieces in the string, and
  2. Looked at the directionality of either the surrounding text or the user interface (whichever was appropriate), and
  3. Whenever a difference between either/both sides of #1 and what was found in #2 was seen, added the appropriate RLM or LRM marker to cause the text to look right. kind of like I suggested in Mixing it up with bidirectional text.

Now there are two ways this bug can manifest -- if the character in question is neutral and mirrored, it can show up on the wrong side of the string, reversed. And if it is just neutral but is not mirrored, then it will look right but still be on the wrong side of the string.

Both problems are fixed by the above description of the algorithm (the code for which is left as an exercise for the reader).

In fairness to that initial code, it was a temp hack presumably to test whether using strong LTR characters would help with directionality of listview items on RTL platforms, but obviously if the code is never removed and then the actual issue is never fixed then there is clearly a bug here -- a bug caused by the lower quality bar implicit in a temp hack which, by never being revisited, proved the underlying problem in ever lowering one's quality bars in code one checks in.

Which in my mind is the most serious bug here -- the conceptual design flaw caused by never finishing the work to solve a genuine issue.

It would be great if this code were written up in a function, which could then be used in all of the places in the UI where such strings can or do show up, from Listviews to Listboxes and beyond....

 

This blog brought to you by ‏)‏ (U+0029, aka RIGHT PARENTHESIS, mirrored of course due to a surrounding U+200f entourage...)


# John Cowan on Thursday, April 17, 2008 9:43 AM:

It should, of course, have been "This blog brought to you by ( (U+0029, aka RIGHT PARENTHESIS)".

# Michael S. Kaplan on Thursday, April 17, 2008 9:47 AM:

Tempting,but I'd hate to freak people out quite that much.

On second thought.... :-)


Please consider a donation to keep this archive running, maintained and free of advertising.
Donate €20 or more to receive an offline copy of the whole archive including all images.

referenced by

2010/07/23 It used to be Windows doing it right, and Office following. But now...

2010/06/18 Bugs hidden in plain sight, and commented that way too

2008/08/25 The Bidi Algorithm's own SEP Field

2008/04/19 Even if the text is right underneath, it may look wrong close up....

2008/04/18 The mythical nature of bidirectional support, and where the wheels come off the wagon

go to newer or older post, or back to index or month or day