Bugs hidden in plain sight, and commented that way too ANSWERS

by Michael S. Kaplan, published on 2010/06/27 07:01 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2010/06/27/10028282.aspx


In the spirit of BUG SPOTTING answers, today's blog will talk about the problems in Bugs hidden in plain sight, and commented that way too. :-)

There are several problems here, in fact.

Here is the email and comment inside a code snippet for your perusal if you don't want to click to look:

The current console output behavior for <redacted> and <redacted> was painfully established through years of international testing.
That’s not to say there are no issues – just that we need to be cautious about making changes.


// if redirected to a pipe or a file, don't use WriteConsole;
// it drops redirected output on the floor
// if going to a file, we should not use console codepage, we
// should use ANSI codepage OR write unicode & a unicode
// filemarker at beginning of file.
// Since we don't know if this is the first thing in the file,
// let's go the ANSI route.

WideCharToMultiByte(GetACP())
WriteFile(hStdOut)

The reminder here is that this is from code that actually has shipped before in Windows.

Yes that part needs to be fixed too. :-)

Now it is hard to know what kind of "international testing" was done here, though one thing is for certain: the tester either never strayed outside of ASCII, never tested with chcp (which would try to change the console's code page and have limited success given the conversion this code will do), never tried both in the console/redirection given the unusual differences the code would cause, or perhaps didn't do anything once it was working for ASCII and the "redirection does nothing" bug was reported and the fix was verified.

But let's give then credit and assume, like I did in Conventional wisdom is retarded, aka What the @#%&* is _O_U16TEXT? and The real problem(s) with all of these console "fallback" discussions and Anyone who says the console can't do Unicode isn't as smart as they think they are and Cunningly conquering communicated console caveats. Comprende, mon Capitán?, that the console vs. redirection case was being detected and this block is inside code that detects that we are talking about redirection.

After all, once I didn't know how to do it right, either...

That makes the first two lines of the comment piece correct!

The next three lines don't give their justification; that change just makes it harder for test because now they have two different code paths to test. But a code comment does not a spec make so the lack of info may be okay here.

That last part is the problem.

That is, if whack has gone back to meaning something bad again.

Because I mean this bit:

// Since we don't know if this is the first thing in the file,
// let's go the ANSI route.

It is just plain wrong.

First of all, what's all this about not knowing if this is the first thing in the file? We can know that exactly!

A simple call to GetFileSizeEx will tell you that, immediately! Just pass in that stdout pointer that you have already determined is a redirected file, and then you will know by the size of the file if they redirected with a > or a >>.

Second of all, you can find out the encoding, too. A quick call to GetFinalPathNameByHandle to get the path and you can look at the contents and see what is in there and make the appropriate decision (if you need it to be pre-Vista you cab use code like this).

What's so hard about that?

Remember that the file is guaranteed to be opened to you or else all of your write operations would fail. So you are the one person with access (that's how you were able to get the size in the aforementioned First of all.

And third of all, and this also goes to algorithm flaws too, what makes a person think that if choosing between

that either one of those options does not completely suck?!?

Either way a bunch of your file will be unreadable.

So if you assume both of those options suck, and you are trying to ship a quality tool, then you have two choices. Either

In the end, the comment, and the subsequent code that implements it, is dead wrong and should be fixed to not make such misstatements since people reading such a comment and thinking they learned something new about Wndows are probably the reason why this code was wrong in the first place. Inacurate information perpetuates!

With all of these console blogs, jncluding this one, I have shown you the way so that if you want to produce a tool that will do its best to fully support all languages, that you can. That can really be worth the effort, after a decade of beng just plain wrong and broken (which pretty much everything has been up until now)....


Yuhong Bao on 9 Jul 2010 5:04 PM:

"A quick call to GetFinalPathNameByHandle to get the path and you can look at the contents and see what is in there and make the appropriate decision (if you need it to be pre-Vista you cab use code like this)."

You can use SetFilePointerEx directly on the handle if it points to a file as determined by GetFileType.


referenced by

2010/10/07 Myth busting in the console

go to newer or older post, or back to index or month or day