Ask a simple question, and then duck!

by Michael S. Kaplan, published on 2008/05/29 20:14 -04:00, original URI:

The whole thread started innocently enough, with a simple question:

Check out this shell command I typed & the result it got.  Note the stuff in red.

    C:\>dir d:\enlistment\chef*.* /s /b

Is this correct behaviour?

Interestingly, though “chef*.*” finds things it shouldn’t, two variations (“chez*.*” and “cher*.*”) didn’t.

I’ve attached a screen shot of a cmd window where I saw it.

I’m using 64-bit Vista.  I’ve attached a screen shot of my Computer/Properties window so you can see a more detailed description of the version.

I’ve attached a *.cpp file a coworker wrote which demonstrates the behaviour by calling FindFirstFile directly.  He sees the same behaviour on 32-bit Vista.

You can probably guess what's going on here -- the miracle of short file names. Both of the "found" files have a short name of CHEFB4~1.CMD, which of course fits into the wildcard quite nicely....
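To make the mechanics concrete, here is a minimal Python sketch of that behaviour. It is only a model, not how the file system actually implements matching: it assumes each directory entry carries a long name and a short name, and that an entry is returned when the wildcard fits either one. The helper names and the long file name are hypothetical (the real long names were only in the screen shot); CHEFB4~1.CMD is the short name from the thread. Python's fnmatch also treats "*.*" as requiring a literal dot, which is close enough for these names but is not identical to the Windows rules.

```python
from fnmatch import fnmatchcase


def wildcard_match(pattern: str, name: str) -> bool:
    """Case-insensitive glob match; a rough stand-in for the Windows matcher."""
    return fnmatchcase(name.lower(), pattern.lower())


def enumeration_match(pattern: str, long_name: str, short_name: str) -> bool:
    """Model of the observed behaviour: an entry is a hit if the pattern
    fits EITHER its long name or its 8.3 short name."""
    return wildcard_match(pattern, long_name) or wildcard_match(pattern, short_name)


# Hypothetical long name; only the short name comes from the thread.
LONG_NAME = "checkout_all_branches.cmd"
SHORT_NAME = "CHEFB4~1.CMD"

# The long name alone would not match "chef*.*" ...
long_only = wildcard_match("chef*.*", LONG_NAME)
# ... but the entry is still found, because the short name does.
found = enumeration_match("chef*.*", LONG_NAME, SHORT_NAME)
```

In this model "chez*.*" and "cher*.*" match neither name, which is consistent with those two variations not turning up the unexpected files.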

Now if you stop right here, it's a nice little story about the assumptions we all make about wildcards from time to time.

And then of course someone pointed out:

This is really unfortunate for people doing deletes with wild card characters.

Then someone else responded:

Unless they do a dir /x foo* and then want to turn around and do a del foo*.  It’s all a matter of perspective.

The original person asking then had a follow-up question:

Is there a way to make FindFirstFile ignore the 8.3 names?  Maybe a flag we can set?

Of course someone pointed out KB 121007 (How to Disable the 8.3 Name Creation on NTFS Partitions).

Uh oh!

Luckily someone else was there to start listing some of the downsides in that plan:

Sorry, but that’s not great advice.

1.       Just because they’re not auto-created doesn’t mean that they don’t exist
2.       Doesn’t help for existing volumes or network access etc.
3.       Not very well tested; use at your own risk.  E.g. as far as I know nobody ever runs any automated test passes with this feature disabled.

The only real answer is that you need to look at the set of files that you get back from FF as candidates and then apply your own filter.
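A rough sketch of that advice, in Python for brevity: enumerate everything, treat the results as candidates, and then match the pattern yourself against the long name only. The function names here are made up for illustration, and fnmatch is just a stand-in for whatever matching rules you actually want (for one thing, it requires a literal dot for "*.*"). In a Win32 program the same idea would apply to the cFileName field of the WIN32_FIND_DATA that FindFirstFile/FindNextFile hands back.

```python
import os
from fnmatch import fnmatchcase


def filter_long_names(candidates, pattern):
    """Keep only the candidates whose long name fits the pattern.

    Matching is case-insensitive and never looks at 8.3 short names.
    """
    pat = pattern.lower()
    return [name for name in candidates if fnmatchcase(name.lower(), pat)]


def find_by_long_name(directory, pattern):
    """Enumerate a directory, then apply our own filter to the results.

    os.listdir returns long names, so short-name-only hits are never kept.
    """
    return filter_long_names(os.listdir(directory), pattern)
```

With candidates such as "checkout_all_branches.cmd", "chef_menu.txt" and "Chef.cmd" (all hypothetical names), filtering on "chef*.*" keeps only the last two, whatever short name the first one happens to have.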

These downsides are of course in addition to the ones given in the KB article itself. Still, someone who perhaps had not read #1 or #2 or the ones in the KB article asked:

Just curious, can’t we stop generating them automatically? How many 8.3 applications do we still support?

There is of course the huge appcompat burden here, plus #1 and #2 above, plus a sample of an attempt to do something like this that I mentioned in "Our non-Unicode heritage". On top of that, such a thing would break the auto-path shrinking work the Shell folks did to deal with the MAX_PATH issues, which I mentioned in "Sometimes what a person really wants is a LACK of size".

So in summary, in order to go to an 8.3-less world, one would have to solve all of the following issues:

Now maybe if the entire Windows team did nothing else beyond solving all of the problems associated with these three issues, for real, we might be able to ship the next version of Windows within 5-6 years after that.

And the most compelling feature to describe would be that we solved three problems that most people don't really understand exist until after you explain them, and even then they would not be very impressed or find it a terribly compelling release.

Of course one could go the Apple route, and start over completely, changing the new one to have none of the above problems. Screw the backcompat crowd and start with no limitations at all.

I suspect we'd end up with an Apple-sized install base, with most people wanting the version of Windows that didn't tell people who merely wanted stuff to keep working to just get bent.

Anyway, the moral of the story? Beware the simple questions!

This blog brought to you by ঝ (U+099d, aka BENGALI LETTER JHA)

# Mike Dimmick on 30 May 2008 5:57 AM:

Well, the 16-bit app problem is basically solved when going to the 64-bit OS, because the 16-bit compatibility subsystems are gone. (This is a processor limitation - it doesn't support the Virtual 8086 submode, required for DOS/BIOS interrupt compatibility, in 64-bit long mode.)

# John Cowan on 30 May 2008 11:23 PM:

I was trying to think about why it was so un-traumatic for Unix to move from 14-character filenames to (essentially) unlimited length ones, and why nobody ever imagined the notion of keeping around two names (potentially) for every file, one short and one long.  The only programs that changed were those that had hard-wired knowledge of the layout of directory blocks, and they were changed to use a library that could handle either old or new format file systems.  Why didn't that happen for Windows?

I think the source of the problem long predates the introduction of long file names to Windows 95 in 1994, and in fact actually goes back to 1982, when Microsoft was working on MS-DOS version 2.  That was the first version of DOS to have directories and pathnames, and more importantly, to have a system interface which accepted a simple null-terminated string as a filename.  In DOS version 1, system calls used "filename blocks", fixed-length fixed-format objects in userland that had to be set up correctly for the call to work.  That style in turn went back through CP/M through a chain of DEC operating systems to TOPS/10 in 1964 and perhaps further; read the Wikipedia article on RADIX-50 for details.

In Unix, however, filenames have *always* been simple strings, even back in 1969 before Unix pathnames existed, so applications didn't have to have hard-wired knowledge about the formats of names.  The basic filename-oriented system calls (open, link, unlink, etc.) haven't changed at all since then.

If Microsoft had had the [organs] to break backward compatibility between DOS 1.0 and DOS 2.0, none of this matching nightmare would have had to exist.
