It is true that your LCID sucks, but your LANGID sucks more

by Michael S. Kaplan, published on 2007/08/20 03:16 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2007/08/20/4472890.aspx


Sometimes when working on software projects we spend so much time thinking a certain way that thinking any other way just does not occur to us.

Like the other day when someone sent me some email asking:

Hi Michael,

I saw your name on MSDN regarding GetLocaleInfo() and I hope you could help me with my question.

...I need to convert an ISO 639 code (extracted from a stream) to a LANGID in Windows. I know GetLocaleInfoEx could provide me with LANGID or ISO 639 code with LOCALE_ILANGUAGE and LOCALE_SISO639LANGNAME2 if I give it a locale name. Since I need to convert an ISO 639 code to LANGID, do you have any suggestion as to how to get a locale name from an ISO 639 code or is there any other way to do it?

Thanks,

On MSDN? Moi?

Oh yeah, it's that "Search and ye shall find, SIAO style!" thing. Kind of raises a ruckus with that whole "blog as an official source of info" thing that was mentioned before, huh? :-)

Okay, let's move past that issue for now. I may have some thoughts to share another day....

Onward.

Let's take a moment and think about the request here (getting LANGID values from ISO-639 codes), and you may know where I am going with this.....

It starts with the whole "Your LCID sucks" issue. You don't need a mathematical proof to cover the fact that if LCIDs suck, then the subsets of LCIDs known as LANGIDs suck worse. :-)

One could obviously think of all kinds of clever ways to take ISO 639 names and convert them to locale names which are in part made up of ISO 639 names anyway.

But is that the right thing to do?

Office built a huge architecture off of locales for its proofing tools, and Windows did for its input methods. They did it with LCIDs, sure. And in true "Your LCID sucks" style they have to take steps to fix that (like input methods have started to do in Vista).

But is the answer to just always move to locale names, still staying with locales in every case?

Locales have their own data to perform certain tasks, sure. Does this mean that any project taking an ISO-639 language name (which as of the latest update can be over 6000 different items!) should try to shoehorn it into a list of locales that is limited to a mere 208 choices?

The 300 pound man who jumps off a trapeze pole into a cup of water can be a fun way to draw a cartoon, but is it a way to write software? :-)

Now in situations where you need data from a locale then sure, limiting yourself to that list can be important. But if you don't specifically need the data, then why let locales hold you back?

I can't speak for locales, but I am pretty sure that if they could, those little LCID values would be proud to see that you have grown past them....

The key thing to ask yourself is whether you need the locale. When you think about it, this whole issue is really just a generalization of the problems with CurrentUICulture I talked about in two parts. It is simply bad engineering to require the support of the whole huge model of the locale support in Windows or the .NET Framework when you really need nothing more than an identifier or at most a name string. It's time to start thinking about how to lighten the load, people!
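
A minimal sketch of what that lighter load could look like, in Python for brevity. Everything named here is hypothetical and made up for illustration: the `Language` type, its `langid()` helper, and the four-entry table. The only facts borrowed from Windows are the four LANGID values themselves. The point is just that the ISO 639 code stays usable as an identifier whether or not a locale exists for it.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical, deliberately tiny table: two-letter ISO 639 code -> the LANGID
# of one commonly used Windows locale for that language. A real table would be
# larger and would still cover only a small fraction of ISO 639.
_DEFAULT_LANGID = {
    "en": 0x0409,  # English (United States)
    "fr": 0x040C,  # French (France)
    "de": 0x0407,  # German (Germany)
    "ja": 0x0411,  # Japanese (Japan)
}


@dataclass(frozen=True)
class Language:
    """A language known only by its ISO 639 code; no locale data attached."""

    code: str

    def __post_init__(self) -> None:
        # Normalize to lowercase and accept only two- or three-letter ASCII codes.
        normalized = self.code.strip().lower()
        if not (2 <= len(normalized) <= 3
                and normalized.isascii()
                and normalized.isalpha()):
            raise ValueError(f"not an ISO 639 style code: {self.code!r}")
        object.__setattr__(self, "code", normalized)

    def langid(self) -> Optional[int]:
        """Return a LANGID if the table has one, otherwise None.

        None is an expected answer, not an error: most ISO 639 codes have no
        Windows locale behind them.
        """
        return _DEFAULT_LANGID.get(self.code)


if __name__ == "__main__":
    english = Language("EN")
    klingon = Language("tlh")            # a valid ISO 639-2 code
    print(english.code, english.langid())  # en 1033
    print(klingon.code, klingon.langid())  # tlh None
```

Code that only needs to tag, compare, or display the language never calls `langid()` at all; only the part that truly needs locale data has to deal with the `None` case.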

 

This post brought to you by ၏ (U+104f, a.k.a. MYANMAR SYMBOL GENITIVE)


Amit on 4 Dec 2009 8:35 AM:

Yeah, well. Even when they do try to follow standards, they (don|won|can)'t. Windows' locale name for PRC is zh-CN. I couldn't find this in the IETF registry - zh-cmn, however, is the tag for Mandarin Chinese.

Michael S. Kaplan on 4 Dec 2009 8:03 PM:

Actually, zh-CN is legal because the RFC accepts names that use ISO-639 and ISO-3166 -- no individual registration required!

Doug Ewell on 30 Sep 2010 11:05 AM:

@Amit: zh-CN means "Chinese as used in Mainland China" (whatever that means).  It is in the IETF registry; 'zh' means Chinese and 'CN' means China.  Read RFC 5646.



