by Michael S. Kaplan, published on 2006/08/12 03:11 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2006/08/12/696161.aspx
The story I am telling here is completely true. I have only omitted project and people names to protect the guilty, and perhaps also the [rightfully] embarrassed....
The original mail that came to me was from someone who was getting some unexpected results from CompareString. The mail eventually boiled down to a simple question:
...CHT Vista build, I am passed an lcid of 0x10804 with the query. Is that correct? I can’t find that value on MSDN.
I admit I felt like the teacher in How I Got Into College who looked at Marlon Browne's ungraded SAT (the bunch of dots on the page) and immediately expressed concern at his score, but the problem here was obvious -- calling the LCID value "for Traditional Chinese" a huge crap bag would have been an insult to large bags filled with crap!
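For anyone who wants to check the arithmetic on 0x10804, here is a minimal sketch that takes an LCID apart. It assumes the documented bit layout used by the winnt.h macros (LANGIDFROMLCID, SORTIDFROMLCID, PRIMARYLANGID, SUBLANGID); the names lcid_parts and lcid_split are hypothetical stand-ins so the code compiles without windows.h.

```c
#include <stdint.h>

/* Hypothetical container for the pieces of an LCID. */
typedef struct {
    uint16_t lang_id;  /* low 16 bits: the LANGID                  */
    uint16_t sort_id;  /* bits 16-19: the sort ID                  */
    uint16_t primary;  /* low 10 bits of the LANGID: primary lang  */
    uint16_t sub;      /* high 6 bits of the LANGID: sublanguage   */
} lcid_parts;

/* Splits an LCID using the same shifts and masks as the winnt.h macros. */
lcid_parts lcid_split(uint32_t lcid)
{
    lcid_parts p;
    p.lang_id = (uint16_t)(lcid & 0xFFFF);
    p.sort_id = (uint16_t)((lcid >> 16) & 0xF);
    p.primary = (uint16_t)(p.lang_id & 0x3FF);
    p.sub     = (uint16_t)(p.lang_id >> 10);
    return p;
}
```

Calling lcid_split(0x10804) gives a LANGID of 0x0804, which is primary language 0x04 (LANG_CHINESE) with sublanguage 0x02 (SUBLANG_CHINESE_SIMPLIFIED), plus a sort ID of 0x1.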
You think that's bad? Just wait, it gets worse...
I immediately explained that this LCID value was completely bogus in two senses:
They forwarded on to me the way that they were constructing the various Chinese LCIDs:
MAKELCID( MAKELANGID(LANG_CHINESE, // Chinese
SUBLANG_CHINESE_SIMPLIFIED),
SORT_CHINESE_UNICODE )
MAKELCID( MAKELANGID(LANG_CHINESE, // Chinese/china
SUBLANG_CHINESE_SIMPLIFIED),
SORT_CHINESE_UNICODE )
MAKELCID( MAKELANGID(LANG_CHINESE, // Chinese/taiwan
SUBLANG_CHINESE_TRADITIONAL),
SORT_CHINESE_UNICODE )
Yikes! This was getting worse and worse.
(The values are all struck out so that no one tries to use them!)
Now that first one was the source of the problem -- they were assuming there was a generic "Chinese" that was neither Simplified nor Traditional, and that by passing it they would get some nice generic results -- especially when they plunked in that Unicode support.
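A rough sketch of how that "generic Chinese" construction ends up as 0x10804, next to what the same macros produce with SORT_DEFAULT. The numeric values are the ones in winnt.h; the X_ prefixed names and the three helper functions are hypothetical, used only so this compiles without windows.h. Using SORT_DEFAULT as the replacement is an assumption about what the caller wanted, not something the original mail thread spells out.

```c
#include <stdint.h>

/* Values as defined in winnt.h, renamed with a hypothetical X_ prefix. */
#define X_LANG_CHINESE                0x04
#define X_SUBLANG_CHINESE_TRADITIONAL 0x01
#define X_SUBLANG_CHINESE_SIMPLIFIED  0x02
#define X_SORT_DEFAULT                0x0
#define X_SORT_CHINESE_UNICODE        0x1  /* no longer supported */

/* Same bit layout as MAKELANGID and MAKELCID. */
#define X_MAKELANGID(p, s) \
    ((uint16_t)((((uint16_t)(s)) << 10) | (uint16_t)(p)))
#define X_MAKELCID(lg, srt) \
    ((uint32_t)((((uint32_t)(uint16_t)(srt)) << 16) | (uint32_t)(uint16_t)(lg)))

/* The "generic Chinese" construction from the mail. */
uint32_t bogus_chinese_lcid(void)
{
    return X_MAKELCID(X_MAKELANGID(X_LANG_CHINESE, X_SUBLANG_CHINESE_SIMPLIFIED),
                      X_SORT_CHINESE_UNICODE);
}

/* Simplified Chinese with the default sort (assumed intent). */
uint32_t simplified_chinese_lcid(void)
{
    return X_MAKELCID(X_MAKELANGID(X_LANG_CHINESE, X_SUBLANG_CHINESE_SIMPLIFIED),
                      X_SORT_DEFAULT);
}

/* Traditional Chinese with the default sort (assumed intent). */
uint32_t traditional_chinese_lcid(void)
{
    return X_MAKELCID(X_MAKELANGID(X_LANG_CHINESE, X_SUBLANG_CHINESE_TRADITIONAL),
                      X_SORT_DEFAULT);
}
```

The first helper returns 0x10804, the value from the original question; the other two return 0x0804 and 0x0404.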
You think that's bad? Just wait, it gets worse...
They were also constructing their Japanese and Korean LCIDs in a similar way:
MAKELCID( MAKELANGID(LANG_JAPANESE, // Japanese
SUBLANG_DEFAULT),
SORT_JAPANESE_UNICODE )
MAKELCID( MAKELANGID(LANG_KOREAN, // Korean
SUBLANG_DEFAULT),
SORT_KOREAN_UNICODE )
Again, the idea was a nice generic, friendly-sounding flag involving the word Unicode, and all would be good. Nice generic results, right?
Well, I can answer that question with a question. Would failure of the function with GetLastError() returning ERROR_INVALID_PARAMETER be generic enough of a result? :-(
Perhaps looking at the definition of these flags with the latest winnt.h might shed some light here:
#define SORT_JAPANESE_UNICODE 0x1 // Japanese Unicode order (no longer supported)
#define SORT_CHINESE_UNICODE 0x1 // Chinese Unicode order (no longer supported)
#define SORT_KOREAN_UNICODE 0x1 // Korean Unicode order (no longer supported)
The Japanese and Korean Unicode sorts are those awful abominations I discussed earlier (here and here); they were removed back in Windows XP. The "Chinese Unicode sort" didn't exist even then (I think it was removed before NT 4.0 shipped, if not sooner?) -- and lacking the whole yen/won thing, I imagine it had even less reason for being.
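Since all three flags share the value 0x1, one way to catch this kind of LCID before it reaches an NLS function is to look at the sort ID together with the primary language. The sketch below is only an illustration: the function name is hypothetical, and the check has to be language-specific because sort ID 0x1 is still meaningful for some other languages (SORT_GERMAN_PHONE_BOOK is also 0x1, for example).

```c
#include <stdbool.h>
#include <stdint.h>

/* Primary language IDs as defined in winnt.h (LANG_CHINESE, LANG_JAPANESE,
   LANG_KOREAN), under hypothetical local names. */
enum {
    PRIMARY_CHINESE  = 0x04,
    PRIMARY_JAPANESE = 0x11,
    PRIMARY_KOREAN   = 0x12
};

/* Returns true when the LCID pairs a Chinese, Japanese, or Korean language
   with sort ID 0x1, i.e. one of the retired SORT_*_UNICODE flags. */
bool uses_retired_cjk_unicode_sort(uint32_t lcid)
{
    uint16_t sort_id = (uint16_t)((lcid >> 16) & 0xF);  /* as SORTIDFROMLCID */
    uint16_t primary = (uint16_t)(lcid & 0x3FF);        /* as PRIMARYLANGID  */

    if (sort_id != 0x1) {
        return false;
    }
    return primary == PRIMARY_CHINESE ||
           primary == PRIMARY_JAPANESE ||
           primary == PRIMARY_KOREAN;
}
```

With this check, 0x10411 and 0x10412 (the Japanese and Korean constructions above) and 0x10804 are all flagged, while 0x0411 and the German phone book LCID 0x10407 are not.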
You think that's bad? Just wait, it gets worse...
After I straightened out their LCID story for these came the scariest part of all:
I’m wondering though why I didn’t get any complaints for these languages. In fact Korean was tested and works correctly.
It of course scared me for two reasons:
As I look to the future I think it is important to spend a lot more time evangelizing how to not only call the NLS functions, but to call them correctly!
This post brought to you by ₩ (U+20a9, a.k.a. WON SIGN)