The error is in the premise (aka garbage in, undefined out)

by Michael S. Kaplan, published on 2009/10/19 07:01 -04:00, original URI:

Yesterday, Eran asked (via the Contact link):


I apologize in advance if this is the wrong way to report this issue. I've waisted two hours trying to figure out how to use Connect for this, to no avail. Being an occasional reader, contacting you directly was the only option I could think of.

So, without further ado:
GetKeyboardLayoutList( 0, (HKL*)"garbage" ); works correctly on all 32 bit versions of Windows I've tried (XP, 2003, Vista), and when run from a 64 bit executable it also works on 64 bit versions of Windows (2003, 2008). However, when run from 32 bit executables on 64 bit windows (2003, 2008) it fails with a "Invalid access to memory location" error. If a NULL is passed instead of the garbage, the call succeeds on all bitnesses and OSes.

In the MSDN docs, nothing is said about the second parameter when the first is 0. This is the first time I come across incompatibility between WOW64 and the equivalent 32 version of Windows. And it breaks a feature of our product.

I will highly appreciate any response - be it a direction to the appropriate address, an acknowledgment of the bug, or any other advise.

Thank you for your time,

Sometimes, like the title says, the problem is the premise.

In particular I will yield to Raymond's Basic ground rules for programming - function parameters and how they are used, which is in my view the definitive way to look at the contract here.

And while there are nuances when differences exist between 32bit and WOW64, these nuances are caused by the same kind of problem that comes from people who don't use CallWindowProc but instead call window procs directly -- people who then run afoul of the ANSI/Unicode layer requirements.

So, is it a bug if the WOW64 layer has always behaved this way? Saying NO does mean resigning to a certain kind of backward compatibility break in the whole "run 32-bit apps on a 64-bit OS" scenario, though the defense of passing bad parameters is one that in my view is fairly indefensible just to allow one to avoid initializing a buffer (or more likely to avoid causing one to change one's existing code to initialize a buffer).

Though of course your mileage may vary....

Mihai on 19 Oct 2009 2:27 PM:

Actually, in this case it sounds like a bug.

The doc says "Return Value ... , if nBuff is zero, the return value is the size, in array elements, of the buffer needed to receive all current input locale identifiers."

That is the contract.

No matter what the value of the second parameter is, it will return the needed size.

Otherwise the doc should say "if nBuff is zero and lpList is NULL, ..."

It also does not break any "Raymond rule."

A rule is "All buffers are valid to the size declared or implied"; and in this case the declared size is zero, so the function should not try to touch anything in the buffer.

Michael S. Kaplan on 19 Oct 2009 5:53 PM:

Unless there is some reason why this head check is inconvenient in the case in question... garbage in, garbage out. There is no version where this has worked.

Random832 on 20 Oct 2009 10:09 AM:

The linked article also says "Pointers are not NULL unless explicitly permitted otherwise." - if you're supposed to be passing in NULL, then the documentation should mention _that_, and it's still a bug that it doesn't.

As the documentation stands there is no reason to think a null pointer is a better choice than an uninitialized pointer. After all, if it's dereferencing an uninitialized pointer today, who says it won't dereference a null pointer tomorrow? Which leaves the only option to be _allocating memory twice_.

Michael S. Kaplan on 20 Oct 2009 10:44 AM:

The problem here is not a NULL or non-NULL pointer -- the problem is a bogus pointer.

Either a valid pointer or NULL would be okay here. GARBAGE is not.

I do not mind casts, except when they are outright lies, like in this case....

Dave Bacher on 23 Oct 2009 11:13 AM:

If you think about how this works, you can see why Microsoft needs the pointer to be null if the count is zero.

This is a 30+ year old C convention, Microsoft doesn't need to document it in each individual function call.  Either a pointer points towards what it is supposed to, or the pointer is null.  Casts are there because there are times where someone who knows what they are doing needs to treat one type as another -- they are not there to make the compiler "shut up."

Now, for 16 to 32 and for 32 to 64, Microsoft has a set of wrapper libraries.  These either invoke the kernel, or they invoke the newer API.  I believe the latter is much more common than the former.

Microsoft has a tool, it reads a header file, and for each structure it writes a marshaller.  For each function, it invokes the marshaller (MarshalTo64 or w/e), then invokes the native function, then invokes the marshaller again (MarshalTo32).

Because there are millions of functions, Microsoft isn't going to convert each one by hand.  They have a tool, it writes all these stubs.  The tool, out of necessity, doesn't know anything about which parameters are used in which cases -- it assumes that the pointer is either valid or null.

It has to, otherwise you'd have to be able to express all the pre and post conditions for every combination of variables using #defines, etc., and the header would expand exponentially.

This is, however, why newer APIs are being factored to use cursor/iterator against OS-owned memory, or COM.

Michael S. Kaplan on 23 Oct 2009 1:15 PM:

Thanks dbacher, that is what I was hinting at when I mentioned ANSI/Unicode issues. :-)

go to newer or older post, or back to index or month or day