Whither SHAnsiToUnicode?

by Michael S. Kaplan, published on 2005/06/21 14:01 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2005/06/21/431062.aspx


In the Suggestion Box, our good friend Serge Wautier of appTranslator fame (I'll blog about that tool another day!) asked:

What's the purpose of SHAnsiToUnicode() ? I mean... what advantage does it have over plain old MultiByteToWideChar() ?
Less parameters obviously. But is that it ?

TIA,

Very good question, I have often wondered myself!

Just kidding. :-)

The fact is that in NLS we try to provide the full functionality that every function in the API is meant to support, via the various parameters and flags.

But even if you are inside Microsoft, you are usually not trying to look at the whole big complicated functionality; you are trying to do a single task many times. So whether you are in the USER group or the SHELL group or the BCL or Office or somewhere else entirely, you write wrapper macros and functions. And many of those functions get published and become officially documented APIs. This is especially true in the Shell group, where SHLWAPI (the Shell Lightweight API) exists and grows in every version with new wrappers and functions.

Which is not to say that SHUnicodeToAnsi and SHAnsiToUnicode are perfect functions for all occasions -- they assume that the source buffer is null terminated, and they do not allow you to select the code page (it is always the default system code page). You also cannot change the default character or choose what flags to pass in order to determine how to handle precomposed/composite characters. There is even a security warning on both functions, despite the fact that they guarantee NULL termination of the target string (if the destination length you pass does not match the actual size of the buffer, bad things can happen).

But if you need to call the NLS APIs to do these conversions and do not want to have to remember all of the calling semantics, and if the way you want the conversion to happen is the same as the way the wrapper does it, then one of these functions may be perfect.
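To make the trade-off concrete, here is a minimal sketch of the "full call versus convenience wrapper" shape. It is written in Python so it runs anywhere, and it is an analogy, not the Shell's implementation: `convert_full` and `ansi_to_unicode` are hypothetical names, `ASSUMED_SYSTEM_CODEPAGE` is a hard-coded stand-in for the default system code page, and the replace-on-error policy and the truncation behavior are assumptions based on my reading of the documentation.

```python
# Hypothetical stand-in for "the default system code page"; a real wrapper
# would ask the OS rather than hard-code a value.
ASSUMED_SYSTEM_CODEPAGE = "cp1252"


def convert_full(src: bytes, codepage: str, errors: str = "strict") -> str:
    """Full-featured shape: the caller chooses the code page and error policy."""
    return src.decode(codepage, errors)


def ansi_to_unicode(src: bytes, dst_capacity: int) -> str:
    """Convenience shape: fewer parameters, every other choice made for you.

    - The source is treated as null terminated (anything after the first
      NUL byte is ignored).
    - The code page is fixed; the caller cannot select one.
    - The result is truncated so that it, plus a terminator, fits in
      dst_capacity characters.
    """
    end = src.find(b"\x00")
    if end != -1:
        src = src[:end]
    text = src.decode(ASSUMED_SYSTEM_CODEPAGE, errors="replace")
    # Reserve one slot for the terminating null the real wrapper would write.
    return text[: max(dst_capacity - 1, 0)]
```

The wrapper is pleasant when its built-in choices are the ones you wanted. The moment you need a different code page or a different error policy, you are back to the full call. Note also that `dst_capacity` is taken on trust: in C, passing a count larger than the real buffer is exactly the case the security warning is about.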

Sometimes, I wish that all of these wrappers explicitly documented how they called the NLS APIs; it would be much easier to choose the right function to use. But it is really not too hard to figure out what they are with a little experimentation and reading of the documentation. And if you are the sort of developer who worries about that sort of detail, then you are probably either calling the NLS functions directly or writing your own wrappers. So I have never pushed too hard for a huge doc. update. Plus, that would make it harder to make changes if there were bugs in the way it was implemented.

One disadvantage to these wrappers is that the parameters and calling semantics vary quite a bit. If you need to change the way you call the function or the flags you want to pass, your options are limited. I mean, if you are happily using String.StartsWith and String.EndsWith and suddenly need to start looking for a prefix and suffix with a width-insensitive comparison, the migration to CompareInfo.IsPrefix and CompareInfo.IsSuffix can be difficult, especially as you move from an instance method to a static one.
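For readers who have not run into it, here is a rough sketch of what "width-insensitive" means for a prefix or suffix test. It is in Python rather than .NET and is only an illustration: `fold_width`, `starts_with_ignore_width`, and `ends_with_ignore_width` are hypothetical helpers, and folding characters through their Unicode `<wide>` and `<narrow>` compatibility mappings is my assumption about a reasonable approximation, not a description of what CompareInfo does internally.

```python
import unicodedata


def fold_width(text: str) -> str:
    """Map fullwidth and halfwidth characters to their ordinary counterparts."""
    out = []
    for ch in text:
        decomp = unicodedata.decomposition(ch)
        # Width variants carry a tagged mapping such as "<wide> 0041".
        if decomp.startswith(("<wide>", "<narrow>")):
            ch = chr(int(decomp.split()[1], 16))
        out.append(ch)
    return "".join(out)


def starts_with_ignore_width(text: str, prefix: str) -> bool:
    """Prefix test that treats width variants of a character as equal."""
    return fold_width(text).startswith(fold_width(prefix))


def ends_with_ignore_width(text: str, suffix: str) -> bool:
    """Suffix test that treats width variants of a character as equal."""
    return fold_width(text).endswith(fold_width(suffix))
```

A plain `startswith` check treats fullwidth "ＡＢＣ" and ordinary "ABC" as different strings; the folded version treats them as the same prefix. The extra step is the kind of thing the simple method never asked you to think about.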

It is hard enough to make some people start using the "real" APIs all the time....

 

This post brought to you by "￥" (U+ffe5, a.k.a. FULLWIDTH YEN SIGN)


# Rosyna on 21 Jun 2005 1:07 PM:

Pedantic, but why is it SHAnsiToUnicode and not SHANSIToUnicode? Too many uppercase letters in a row?

Anything that takes a C String should take an encoding. Evil otherwise should be deprecated.

# Serge Wautier on 21 Jun 2005 2:22 PM:

Do you remember I mentioned Michael Kaplan's blog some time ago.

Today, I learned I'm his good friend! :-)

# Michael S. Kaplan on 21 Jun 2005 5:29 PM:

Hi Rosyna!

Well, the Shlwapi APIs seem to stay away from capitalizing most acronyms (though there are exceptions). Mainly it boils down to some person's preference....

# Michael S. Kaplan on 21 Jun 2005 5:29 PM:

Of course Serge is a good friend of this blog -- he posts comments here!

# Dean Harding on 21 Jun 2005 7:34 PM:

I guess the other thing is that it probably only makes sense to use the SHAnsiToUnicode/SHUnicodeToAnsi wrappers if you're writing a shell extension so that you know you're going to get consistent behaviour with the rest of the shell.

# Rosyna on 21 Jun 2005 8:06 PM:

This actually brings up an interesting point. If someone working in one group at MS sees a problem in another group's software that affects them, can they just fix the other group's code?

And when you're in a group at MS, how much code is your (individually, one person's) responsibility? An entire DLL? A few functions in a DLL? A lot of DLLs under the same "genre"?

# Michael S. Kaplan on 21 Jun 2005 11:09 PM:

Hi Dean --

To some extent I guess that is true, but like with sorting there are other Shell functions to accomplish things here. So there is no way to be "consistent with the Shell" since the Shell has so many different choices?

Or looked at another way, it is easy to be consistent with the Shell since things are done so many different ways that you are consistent with *somebody* no matter what you do....

# Michael S. Kaplan on 21 Jun 2005 11:29 PM:

Hi Rosyna --

I'll post about this in a bit. :-)

# Stuart Dootson on 22 Jun 2005 3:23 AM:

Am I being incredibly dense... or does SHAnsiToUnicode et al. exist only in the MSDN documents (like several of the SHLWAPI functions, it seems to me)? I've included shlwapi.h in a C++ file (using VS2005, BTW) and it says it can't find a SHAnsiToUnicode. And also, using depends.exe, there doesn't seem to be one in shlwapi.dll (using XP SP2, so it's reasonably up to date :-)

Whimper - I'm confused :-(

# Michael S. Kaplan on 22 Jun 2005 4:13 AM:

Hi Stuart!

See http://members.ozemail.com.au/~geoffch/samples/win32/shell/shlwapi/functions/ or to be more specific, see the following links:

SHAnsiToUnicode:
http://members.ozemail.com.au/~geoffch/samples/win32/shell/shlwapi/functions/string/atow.htm

SHUnicodeToAnsi:
http://members.ozemail.com.au/~geoffch/samples/win32/shell/shlwapi/functions/string/wtoa.htm

# Stuart Dootson on 24 Jun 2005 10:04 AM:

Thanks for that!

So...the Platform SDK documentation is ...less than accurate for these functions, as it points you at shlwapi.h to pull in a prototype. Oh well.

# Michael S. Kaplan on 24 Jun 2005 11:26 AM:

The great thing about the Platform SDK is that new ones always come out....

# Stuart Dootson on 28 Jun 2005 10:04 AM:

"The great thing about the Platform SDK is that new ones always come out...."

I know - I'm using the Windows XP SP2 Platform SDK...I try to keep up to date, honest, guv :-)

