by Michael S. Kaplan, published on 2005/01/06 13:09 -05:00, original URI: http://blogs.msdn.com/b/michkap/archive/2005/01/06/347834.aspx
A few days ago, PEK asked in the newsgroups about WideCharToMultiByte and MultiByteToWideChar:
I'm a bit confused about the first parameter in MultiByteToWideChar. It
is telling which code page to use. You could use the value CP_ACP
("ANSI Code Page"), but what does it mean? Is it the value returned by
GetACP() or is it code page 1252?
What I want to know is whether you get the same result if you call:
MultiByteToWideChar(CP_ACP, ...)
as when you call:
MultiByteToWideChar(GetACP(), ...)
The answer is indeed that they will both give the same results. One is a constant that says "I do not care what the system default codepage is -- just use it" while the other asks "what is the default system codepage?".
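To make the equivalence concrete, here is a minimal sketch of my own (not code from the post); the sample string and buffer size are just placeholders:

#include <windows.h>
#include <cassert>
#include <cwchar>

int main()
{
    const char ansi[] = "Hello, world";
    wchar_t viaConstant[64] = {};
    wchar_t viaCachedAcp[64] = {};

    // CP_ACP tells the API "use whatever the system default ANSI code page is"...
    int n1 = MultiByteToWideChar(CP_ACP, 0, ansi, -1, viaConstant, 64);

    // ...which is exactly the code page value that GetACP() reports.
    int n2 = MultiByteToWideChar(GetACP(), 0, ansi, -1, viaCachedAcp, 64);

    // Both conversions produce the same UTF-16 text.
    assert(n1 == n2 && wcscmp(viaConstant, viaCachedAcp) == 0);
    return 0;
}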
The natural question to ask any time there are two or more ways to do the same thing is which would be faster -- caching the return of GetACP() and using that, or using CP_ACP?
Well, on NT-based platforms, CP_ACP is slightly faster, and by slightly I mean it would take 1,000,000 iterations or more to see even the slightest difference. On 9x-based platforms, it does indeed look like passing the code page value itself may be ever-so-slightly faster, but again not enough to matter. I prefer CP_ACP as the constant because it just seems easier to keep track of, to me. It should likely never matter to your code.
ADDENDUM 1:00pm -- Ok, I had a chance to look at the Win9x code, and I have changed my mind. I could easily see it making a difference; CP_ACP is going to be slower. Not by a lot, but it's just not the smartest code in the world and I remain committed to supporting smart code paths. I also understand why the same code is not used in the NT codebase. For what it's worth, MSLU was actually written using the cached code page method, based on someone's comment that this was a little faster on Win9x. Well, the comment is right and I'm glad MSLU does it this way.
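If you want to measure the difference yourself, a rough harness along these lines (my own sketch, with a placeholder string and iteration count) is enough to reproduce the "1,000,000 iterations" scale mentioned above:

#include <windows.h>
#include <cstdio>

static double TimeConversions(UINT codePage, int iterations)
{
    const char ansi[] = "timing sample";
    wchar_t buffer[64];

    LARGE_INTEGER freq, start, stop;
    QueryPerformanceFrequency(&freq);
    QueryPerformanceCounter(&start);
    for (int i = 0; i < iterations; ++i)
        MultiByteToWideChar(codePage, 0, ansi, -1, buffer, 64);
    QueryPerformanceCounter(&stop);

    return (stop.QuadPart - start.QuadPart) / (double)freq.QuadPart;
}

int main()
{
    const int iterations = 1000000;
    UINT cachedAcp = GetACP();   // cache the code page value once, outside the loop

    printf("CP_ACP:          %f seconds\n", TimeConversions(CP_ACP, iterations));
    printf("cached GetACP(): %f seconds\n", TimeConversions(cachedAcp, iterations));
    return 0;
}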
Though I can probably say for a fact that the slowest approach of all would be calling GetLocaleInfo with LOCALE_IDEFAULTANSICODEPAGE and the LCID returned from GetSystemDefaultLCID (getting the code page back as a string and converting it to a number), making those calls every single time. But if you have to be told that explicitly then your code is likely too slow anyway. :-)
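For completeness, this is roughly what that slowest approach looks like -- my own illustration of the calls described above, not code from the post:

#include <windows.h>
#include <cwchar>

// Look up the ANSI code page as a string, convert it to a number, and only
// then do the conversion -- all of it on every single call.
static int ConvertTheSlowWay(const char* ansi, wchar_t* wide, int wideChars)
{
    wchar_t cpText[8] = {};
    GetLocaleInfoW(GetSystemDefaultLCID(), LOCALE_IDEFAULTANSICODEPAGE, cpText, 8);
    UINT codePage = (UINT)std::wcstoul(cpText, nullptr, 10);   // e.g. L"1252" -> 1252

    return MultiByteToWideChar(codePage, 0, ansi, -1, wide, wideChars);
}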
Now if you are using the .NET Framework, you are probably wondering which is faster -- using the built-in Encoding.Default, or calling the Encoding.GetEncoding() method to create the Encoding object yourself. The quick answer is to use Encoding.Default, since it is always faster.
But the better answer is the one Shawn Steele (the owner of encodings and code pages) gives -- use Unicode and avoid all of the extra work here. This is so much easier in the Framework, since all of the string data is Unicode already....
This post brought to you by "﹌" (U+FE4C, a.k.a. DOUBLE WAVY OVERLINE)
referenced by
2005/01/08 How does it detect invalid characters?