Suits you to a _T()

by Michael S. Kaplan, published on 2005/08/10 08:51 -04:00, original URI:

The other day, Jeremy asked me:

I thought that with your wealth of Unicode knowledge, you may be able to answer a few questions for me.

In a C/C++ program, is it necessary to wrap single character conversions in a _T( ) macro?

For instance...

TCHAR tch = _T('S');

The MSVC compiler happily converts the literal 'S' to a double byte '\0''S' for UNICODE builds, so that the following appears to compile fine...

TCHAR tch = 'S';

I'm unclear if the compiler is actually substituting L'S' for 'S', or simply promoting the value of 'S' to a double byte.

Is there any case in which a single-byte character has a Unicode representation that is not a simple double byte promotion?

If you are not dealing with both Unicode and non-Unicode builds of a program, then all of the _T()/TEXT() macro stuff as well as all of the TCHAR stuff is fairly superfluous. As I mentioned a few days ago, the new NLS functions added in Vista are not going to have non-Unicode versions (a trend that started in Server 2003).

To answer the specific question about whether the macro is required (and keeping the last paragraph in mind), I would always suggest using the L prefix on Unicode characters and strings, even though the compiler does not seem to need it for characters. It is definitely still needed any time you specify a string literal, and the consistency seems like a good thing, doesn't it?

For the ASCII range, you will not find a difference between that "double byte promotion" and a Unicode representation. However, for anything single byte that is outside of ASCII but inside of the default system code page, I would go so far as to say that the "promotion" would usually be wrong, and possibly also subject to different interpretations depending on what the default system code page happens to be. If you are going to write UNICODE/_UNICODE applications, then it seems best to keep them using Unicode everywhere....


This post brought to you by "S" (U+0053, a.k.a. LATIN CAPITAL LETTER S)


# Richard on 10 Aug 2005 10:01 AM:

In C++ 's' is a char, in C it is an int. L's' in C++ is a wchar_t (not sure about C).

However, in C wchar_t is a typedef, while in C++ it is a distinct type. In Visual C++, though (unless overridden with a command line/IDE option), it is also a typedef; and char->int is an allowed conversion.

I wonder if
wchar_t x = 'S'
works OK with wchar_t as a distinct type?

# Dean Harding on 10 Aug 2005 7:20 PM:

I don't have my copy of the beta here, but I'm pretty sure wchar_t is a distinct type by default in VS2005. It was a typedef by default in VS2003 (though it's easy enough to change - usually the first thing I do when starting a new C++ project is set all those 'Treat wchar_t as a Built-In Type', 'Force Conformance In For Loop Scope', etc options to True).

And yeah, I agree with just doing away with all that TCHAR, _T() stuff. It's much simpler to just use wchar_t and L'...' directly, these days. The macros were good when you were doing separate builds for Windows 9X and NT, and didn't want/need MSLU :)

# Scott on 11 Aug 2005 12:29 AM:

I figure that the _T for strings and characters is useful for whenever Unicode 12.7 comes out and we're all using 5 byte characters. Less code to change - assuming the macros are updated.

# Alexey Logachyov on 12 Aug 2005 11:24 AM:

There's one more important thing. If you use literals from the upper half of the ASCII table or multibyte characters, be sure to use #pragma setlocale to make the compiler convert strings to Unicode using the correct code page.

Just today a guy from the office opposite mine had a problem. His SQL query did not work correctly. The query contained Russian letters and it did not fetch any results.

It turned out that Visual Studio was set up to use Russian fonts but the system locale was set to English. The compiler converted strings using the incorrect code page and the Russian string became garbage.

# Michael S. Kaplan on 12 Aug 2005 5:50 PM:

Considering how often people copy/paste code snippets, it might be safer to just use code points (well, code units) in those cases....

# Alexey Logachyov on 13 Aug 2005 4:57 AM:

You mean writing it like this?

CHAR StrA[] = "\xF1\xEB\xEE\xE2\xEE";
WCHAR StrW[] = L"\x0441\x043B\x043E\x0432\x043E";

This is a pain for Russian speaking people (like I am). The following snippet looks so much more natural to me.

#pragma setlocale("rus")
TCHAR Str[] = _T("слово");

# Mike Dunn on 15 Aug 2005 11:15 PM:

WCHAR c = 'S';

works because char->wchar_t is a widening conversion akin to BYTE->WORD or short->long. If you did this:

wcout << 'S';

it won't promote anything.
