Converting a Project to Unicode: Part 7 (What does it mean to fit things to a 'T', anyway?)

by Michael S. Kaplan, published on 2007/01/03 03:01 -08:00, original URI: /web/20080229133725/http://blogs.msdn.com/michkap/archive/2007/01/03/1395788.aspx

(If you are just tuning in and want to start now you can grab the current source from here -- no changes since it was posted the day before yesterday)

Like I said yesterday, if you have read Parts 2-5 then you know how we went from a purely ANSI application to a purely Unicode one.

The binary itself has been tested with the MSKLC update and it resolves the bug I talked about back in Part 0. And the Unicode Bootstrap EXE works for the scenarios in which it will be used.

Now for a moment I wanted to talk about the myth of applications compiled as both Unicode and ANSI. We say TCHAR but the truth is that most of the time the dev has just one in mind. For me it is Unicode (which leads to problems like the one Mihai pointed out here) and to be honest most developers think of it as ANSI, even when they talk about Unicode, which is why you get problems like those in the DrawThemeText function. Ignore the weird text for a moment:

You know, text handled by people who were not aware that DrawText has a Unicode version. And just look at the code sample:

Clearly, there are times where even the people who are moving forward and only providing Unicode versions to their functions are not necessarily thinking of a TCHAR as a type that could be either a CHAR or a WCHAR.
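To make the "either a CHAR or a WCHAR" point concrete, here is a minimal portable sketch of code that compiles both ways. It is an illustration only, not the code from this series: XCHAR, XT, xstrlen, XCOUNT and copy_text are hypothetical stand-ins I made up so the sketch builds with any C compiler. In real Win32 code the corresponding names would be TCHAR, _T and _tcslen from <tchar.h>.

```c
/* Portable sketch of the TCHAR idea. XCHAR, XT, xstrlen, XCOUNT and
   copy_text are hypothetical names for illustration; they are not Win32
   or CRT identifiers. */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

#ifdef UNICODE
typedef wchar_t XCHAR;          /* wide build */
#define XT(s) L##s
#define xstrlen wcslen
#else
typedef char XCHAR;             /* narrow ("ANSI") build */
#define XT(s) s
#define xstrlen strlen
#endif

/* Number of characters (not bytes) a fixed-size array can hold. */
#define XCOUNT(a) (sizeof(a) / sizeof((a)[0]))

/* Copies src into dst, which holds dst_count characters, truncating if
   needed and always terminating. Returns the number of characters copied,
   not counting the terminator. */
size_t copy_text(XCHAR *dst, size_t dst_count, const XCHAR *src)
{
    size_t n;

    assert(dst != NULL && src != NULL && dst_count > 0);
    n = xstrlen(src);
    if (n >= dst_count)
        n = dst_count - 1;
    /* The byte count scales with sizeof(XCHAR), so this line is correct
       in both builds. */
    memcpy(dst, src, n * sizeof(XCHAR));
    dst[n] = XT('\0');
    return n;
}
```

Compiling the same file once as is and once with -DUNICODE (or the equivalent switch for your compiler) exercises both paths. The habits that matter are the ones visible here: every literal goes through the text macro, and every buffer size is a character count rather than a sizeof in bytes.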

And I am not casting stones here or anything. After all, I made the same kind of mistake in the other direction, one I might never have noticed since I was probably only ever going to compile and run the code with UNICODE/_UNICODE defined (just as I suppose people are anticipating those samples will be written by people who don't).

So I think we should go ahead and make sure it will compile both ways, and do the work in the makefile to make sure it happens. Let's break the myth, at least for this particular sample at this particular moment....

One way that some Platform SDK samples do this (like the StrOut sample, for example) is in addition to the makefile, having a makefile.uni that looks something like this (this is the StrOut one):

Well, no COM and we don't use CommandLineToArgvW, so we don't need exactly this. But it gives one example of how samples are doing this. We'll just go with it. :-)

The cynical side of me believes that if this does end up in the Platform SDK, it will work up until the next time it is updated for some other particular feature, since the whole "dual compiling system" doesn't exactly fit us to a 'T'.....

(other techniques here might include different config settings in the same makefile or environment variable dependencies, but I am aiming for the Platform SDK so doing it the way they seem to may work in my favor!)

But in any case, the next source code drop will include an updated makefile and a new makefile.uni and instructions about using them.

Bart: that depends on whether you still need the resulting binary to work on Windows 9x. If you do, you have two options: you can either build for ANSI and use that build on both Win9x and NT-based systems, or you can build for Unicode and link to MSLU (unicows.lib). If you support both, you can change this decision relatively quickly.

You should be aware that running ANSI programs on NT-based systems causes all strings passed to ANSI Win32 APIs to be converted to Unicode at runtime, then those strings are passed to the equivalent Unicode API. On return, any modified strings have to be converted back. These conversions are pretty fast but still cost a little CPU time and memory. There are some scripts which do not have an ANSI encoding, and therefore ANSI programs will not work very well on Windows XP if one of these scripts is selected.
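The shape of that conversion can be sketched in portable C. This is only an assumption-laden illustration, not how Windows implements it: count_chars_a and count_chars_w are hypothetical names standing in for an "A"/"W" API pair, the fixed 256-character buffer is an arbitrary choice, and the standard mbstowcs call stands in for the system's conversion from the ANSI code page.

```c
/* Sketch of an "A" entry point that converts and forwards to a "W" worker.
   count_chars_a and count_chars_w are hypothetical names, not Win32 APIs. */
#include <stddef.h>
#include <stdlib.h>
#include <wchar.h>

/* The wide-character worker, where the real work happens. */
size_t count_chars_w(const wchar_t *text)
{
    return wcslen(text);
}

/* The narrow wrapper: widen the argument, then call the worker.
   Returns (size_t)-1 if the input cannot be converted. */
size_t count_chars_a(const char *text)
{
    wchar_t wide[256];
    /* Convert at most 255 characters so there is room for a terminator. */
    size_t n = mbstowcs(wide, text, 255);

    if (n == (size_t)-1)
        return (size_t)-1;      /* invalid multibyte sequence */
    wide[n] = L'\0';            /* mbstowcs does not terminate when full */
    return count_chars_w(wide);
}
```

Every call through the narrow wrapper pays for a buffer and a conversion pass before the worker runs, and anything the source encoding cannot represent never reaches the worker intact. That is the cost and the limitation described above, in miniature.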

You could take the approach of building both an ANSI version and a Unicode version. This has been used by Windows Installer, which provides InstMsiA.exe and InstMsiW.exe installers (at least in version 2.0), while ATL.DLL (for ATL 3.0) comes in both versions, the installer installing the appropriate version depending on which OS it is installed on.

For this setup.exe, it's still possible that users of the sample will want to compile it to run on Windows 9x (even if only to provide a nice error message, rather than the unfriendly error that results from a missing export). However, in general I would expect to see Windows 9x support die out quite soon - aggregated web server statistics (e.g. at http://marketshare.hitslink.com/report.aspx?qprid=5) show Windows 98 with less than 2% usage share and declining. Obviously you should consider your own market to decide whether to support Windows 9x.