Windows isn't Office (and vice versa)

by Michael S. Kaplan, published on 2011/08/05 07:01 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2011/08/05/10193046.aspx


Some people view Microsoft as one big company.

Others view it as several different companies, rubber banded together.

I suppose at some level either view is reasonable, but the latter view has one advantage in particular: it helps make differences between teams and groups and divisions make more sense!

Like the other day, when in a late and somewhat off-topic response to "If you find that GetLocaleInfo is driving you crazy, it may not be the right function to use", Jacob Schäffer commented:

I'm using the GetLocaleInfoW function to return the <language>-<Script>-<REGION> pattern for a specific LCID using the LOCALE_SNAME constant (in VB6 and Office VBA). While doing so I found that looking up LCIDs 2072 (&H818) for Moldovan Romanian and 2073 (&H819) for Moldovan Russian fails, while any other LCID I try returns the expected pattern.

The same thing happens when I use the LOCALE_SISO639LANGNAME and LOCALE_SISO3166CTRYNAME instead of LOCALE_SNAME.

I cannot find the LCIDs 2072 (&H818) and 2073 (&H819) in the Language Identifier Constants and Strings table, so I suppose Windows (tested on Windows 7) cannot properly deal with these Moldovan language variants. Can this really be true?

Surprisingly, MS Office 2010 declares the constants wdRomanianMoldova (2072) and wdRussianMoldova (2073) in the WdLanguageID enum, and similarly in the MsoLanguageID enum.

Is it possible that MS Office supports languages for both document editing and the Office GUI that don't exist in Windows, or do I need to use another function to get access to the complete locale info database (I need support back to Windows XP)?

I'm using the <language>-<Script>-<REGION> pattern to identify which language file (not standard resource files, but external UTF-16LE encoded XML documents) to load strings from. Although I might never run into the need for a Moldovan Romanian setup I'd prefer to have a generic and reliable function that returns meaningful <language>-<Script>-<REGION> patterns for all MS Office declared LCIDs.

Finally, I also need the reverse functionality. I've been looking for an API function or a constant for GetLocaleInfoW that returns the LCID based on the input string pattern <language>-<Script>-<REGION>. I may have become blind and overlooked something during my searches, but I didn't find any API method for this. Does such a beast exist or do I have to build my own table (which is a highly undesirable thing to do)?

All the best / Jacob

Jacob had it right here: Windows has no support for a Moldova-based locale, for either Romanian or Russian, in XP or in later versions.

Though of course both Romanian and Russian are supported. The script is there, and all the letters.

People can get things done, at least.
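To see why the two Office values look so close to locales Windows does know, here is a minimal sketch that takes an LCID apart using the bit layout behind the Windows SDK macros LANGIDFROMLCID, PRIMARYLANGID and SUBLANGID. The helper names split_lcid and default_sublanguage_lcid are hypothetical, and falling back to the default sublanguage is only an assumption about what an application might accept, since it gives up any Moldova-specific data.

```python
# Bit layout of an LCID: the low 16 bits are the language ID, bits 16-19
# are the sort ID. Inside the language ID, the low 10 bits are the primary
# language and the high 6 bits are the sublanguage.

def split_lcid(lcid: int) -> tuple[int, int, int]:
    """Return (primary_language, sublanguage, sort_id) for an LCID."""
    lang_id = lcid & 0xFFFF
    sort_id = (lcid >> 16) & 0xF
    return lang_id & 0x3FF, lang_id >> 10, sort_id


def default_sublanguage_lcid(lcid: int) -> int:
    """Hypothetical fallback: same primary language, sublanguage 1 (SUBLANG_DEFAULT)."""
    primary, _, _ = split_lcid(lcid)
    return (0x01 << 10) | primary


# The two Office constants from the question above.
for lcid in (2072, 2073):
    primary, sub, _ = split_lcid(lcid)
    fallback = default_sublanguage_lcid(lcid)
    print(f"{lcid:#06x}: primary={primary:#04x} sublanguage={sub} fallback={fallback:#06x}")
```

Both values carry sublanguage 2 on top of the Romanian (&H18) and Russian (&H19) primary languages; the fallback values &H418 and &H419 are the plain Romanian and Russian locales.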

If one truly needs an Office-based function then it's unfortunate that Office doesn't document one - especially for the sake of these edge cases that bring one so close!

But there it is.

Of course I do have to correct the notion that a <language>-<Script>-<REGION> pattern always makes sense; after all, most locales don't include a script.

And locales themselves don't have their names constructed from their constituent parts anyway -- their names are defined whole in each locale.
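For code that still wants to look inside a name, a minimal sketch along these lines treats the script as optional and accepts either a two-letter or a three-digit region. The pattern is a deliberate simplification rather than a full grammar for locale names, and parse_locale_name and LocaleNameParts are hypothetical names used only for this illustration.

```python
import re
from typing import NamedTuple, Optional

# Simplified pattern: language, optional script, optional region
# (two letters or three digits), optional "_sort" suffix.
_NAME_RE = re.compile(
    r"^(?P<language>[A-Za-z]{2,3})"
    r"(?:-(?P<script>[A-Za-z]{4}))?"
    r"(?:-(?P<region>[A-Za-z]{2}|[0-9]{3}))?"
    r"(?:_(?P<sort>[A-Za-z0-9]+))?$"
)


class LocaleNameParts(NamedTuple):
    language: str
    script: Optional[str]
    region: Optional[str]
    sort: Optional[str]


def parse_locale_name(name: str) -> LocaleNameParts:
    """Split a locale name into parts; parts that are absent come back as None."""
    match = _NAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a recognized locale name: {name!r}")
    return LocaleNameParts(
        match["language"], match["script"], match["region"], match["sort"]
    )


for name in ("ro-RO", "sr-Cyrl-CS", "en-029", "de-DE_phoneb"):
    print(name, parse_locale_name(name))
```

Treating the whole name as an opaque key, and only parsing it when a part is really needed, stays closer to how the names are defined.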

In the end, platforms like Windows 2000, XP, and Server 2003 were built before Microsoft jumped into names at all, so trying to construct names for them after the fact will never work as well as the names built into later versions do now....


Jacob Schäffer on 8 Aug 2011 4:34 PM:

No, most locales don't need a Script, but when you are in Eastern Europe it may be quite interesting to differentiate between "Cyrl" and "Latn" scripts (at least in my case).

What exactly does the Downlevel NLS library back-port to XP and 2003 that can't be achieved with EnumSystemLocales() in combination with GetLocaleInfo() with appropriate flags? Does it implement "names" from newer versions or is XP/SP2 or 2003/SP1 updated so such "names" are returned correctly by GetLocaleInfo() with the LOCALE_SNAME flag set?

/Jacob

Michael S. Kaplan on 8 Aug 2011 8:45 PM:

It does not return any names of locales that aren't on the machine....

Jacob Schäffer on 12 Aug 2011 6:58 AM:

Unfortunately not :-)

The LOCALE_SNAME constant seems not to exist on XP, but the constants LOCALE_SISO639LANGNAME and LOCALE_SISO3166CTRYNAME do. If one does not need the <Script> or <_sort> part of the Locale Name, it looks like a reasonable option on XP to compose the Locale Name as <Val(LOCALE_SISO639LANGNAME)>-<Val(LOCALE_SISO3166CTRYNAME)>. This actually appears to work for the vast majority of locales, though one has to build a private "exceptions" table to attain a 100% match and to support the <Script> part for certain locales.
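A minimal sketch of that composition, written in Python for illustration rather than in VBA: it assumes the two ISO values have already been fetched for the LCID (for example through GetLocaleInfo) and are passed in as plain strings. The function compose_locale_name and the table NAME_EXCEPTIONS are hypothetical, and the table holds only a few illustrative entries, not a complete list.

```python
# Hypothetical exceptions table keyed by LCID. These entries are examples of
# locales whose name carries a script, or whose region differs from what the
# ISO 3166 value on XP would produce; a real table would need more entries.
NAME_EXCEPTIONS = {
    0x081A: "sr-Latn-CS",
    0x0C1A: "sr-Cyrl-CS",
    0x0443: "uz-Latn-UZ",
    0x0843: "uz-Cyrl-UZ",
    0x2409: "en-029",
}


def compose_locale_name(lcid, iso639_language, iso3166_country,
                        exceptions=NAME_EXCEPTIONS):
    """Simulate LOCALE_SNAME from the two ISO values, with per-LCID overrides."""
    if lcid in exceptions:
        return exceptions[lcid]
    return f"{iso639_language}-{iso3166_country}"


# Plain case: the two ISO values are simply joined.
print(compose_locale_name(0x0418, "ro", "RO"))
# Exception case: the table supplies the script that the ISO values lack.
print(compose_locale_name(0x0C1A, "sr", "CS"))
```

Keying the table by LCID rather than by the composed string keeps the override independent of whatever the ISO values happen to be on a given Windows version.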

XP seems - with the LCID_SUPPORTED flag set - to enumerate 167 locales, of which only a very small part seems to require special attention in the above-mentioned LOCALE_SNAME simulation. However, Windows 7 seems to enumerate 211 locales. Is it those extra locales you refer to when you say "It does not return any names of locales that aren't on the machine..."?

During my research I stumbled over a heterogeneity regarding return values on XP vs Windows 7 for the same LCIDs. For example, Val(LOCALE_SISO3166CTRYNAME) for LCID 9225 returns "CB" on XP and "029" on Windows 7. I can find neither "CB" nor "029" mentioned as valid country codes in online versions of the ISO 3166 standards. One place I actually could find something about this change was in ".NET Compact Framework 3.5 Run-Time Breaking Changes" (msdn.microsoft.com/.../bb986636), which clearly shows that quite a few Locale Names have changed for a given locale.

As I read the NLS support API documentation at MSDN, Microsoft urges developers to use Locale Names over Locale Identifiers. This seems to be a game of chance if the naming conventions change or the ISO standards are broken or insufficient, while LCIDs are much more static.

What are your best argument(s) for using Locale Names over numeric LCID's - except for human readability?

All the best /Jacob

Michael S. Kaplan on 12 Aug 2011 7:10 AM:

Following standards?

Changes made were to *be* conformant, which LCIDs are not....

Michael S. Kaplan on 12 Aug 2011 10:54 AM:

Anyway, there are a host of reasons that we are now ten years smarter than we were then; those reasons should be embraced, not shunned.

Jacob Schäffer on 12 Aug 2011 12:12 PM:

I *certainly* embrace the NAMING initiative. It's fantastic to use in most cases, since it's quite readable by the human eye. And ... FWIW ... I'm certainly not backward-looking (though some of my customers are)!!!

The fact is that I'm hunting a *stable* way to use Locale Names for lookup *AND* need access to locale data on XP as well as on newer OS versions - using unique Locale Names for input. However, since the Locale Names for some locales can't be directly built on XP - even with good will - this appears to be a problem for me, since the environment I work in is VBA7 in Office 2010 (which - by the way - doesn't implement ANY way to look up locale information by Locale Name, but only by LCID).

What I see is that the Windows environment can NOT deliver a *stable* mapping from Locale Names to Locale Identifiers unless I implement all sorts of workarounds. That's perfectly fine with me, since the world goes on and *if* I need backward compatibility I'm asking for trouble. Period.

Anyway, please point to an ISO standard that defines "029" as a proper country identifier. The MSDN documentation for LOCALE_SISO3166CTRYNAME says that it should return the ISO 3166 name for a country. I'd like to know which country "029" represents and where to find the ISO standard that says so ???

Now, let's assume that such a standard doesn't exist. Then, can developers rely on LOCALE_SISO3166CTRYNAME after all, or is the ISO standard insufficient in this regard ???

I have to point out, that I do NOT need support with this, but find the academic discussion highly interesting.

All the best /Jacob

Michael S. Kaplan on 14 Aug 2011 9:38 AM:

029 is the *numeric* country code for ISO 3166, and the only way that MS could rehabilitate the "en-cb" locale....



referenced by

2011/08/16 The road to standards compat is paved with app back-INcompat
