Sometimes it *does* pay to be neutral

by Michael S. Kaplan, published on 2005/08/31 03:03 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2005/08/31/458266.aspx


I can hardly believe it has been a year since I asked and answered the questions What is a neutral culture? What is a neutral locale?

I spent a bunch of time talking about how lame neutral locales were on Win32, given that they are not supported by any of the NLS functions.

They are very useful for resources though, something Stuart reminded me of the other day when he asked the following question in the newsgroups:

In XP my user is set to German (Switzerland) and in my COM dll (built in Visual Studio 6) I have resources for German (Germany) and English (U.S.)

However the German (Germany) string table is ignored and the LoadString win32 call loads the English (U.S.) resource.

Why is the German (Switzerland) user loading the English (U.S.) resource?

Well, once upon a time this is what may have happened -- if you look at the section of Chapter 4 of the first edition of Developing International Software for Windows 95 and Windows NT entitled Multiple Language Resources:

On Windows NT, FindResource searches for resources tagged with the language ID of the calling thread. On Windows 95, it searches for resources tagged with the default system language ID. On Windows NT, you can search for a resource in a specific language by calling FindResourceEx, which takes a language ID as a parameter. Both FindResource and FindResourceEx first attempt to find a resource tagged with a language ID, as described above. If they don't find anything, they then search for a resource tagged with the same primary language as that of the specified language ID. (If several resources with the same primary language but different sublanguages exist, the functions will return whatever they encounter first.) If, for example, the program requests resources in Standard German that aren't available, the program can retrieve Austrian German or Swiss German resources and still provide a user interface that the user can understand.

If the FindResource and FindResourceEx functions do not find any resources that match the language ID's primary language, they search for resources tagged as "language-neutral." This language ID is useful for resource elements such as icons or cursors that are identical for all languages. If a bitmap or an icon will differ for some languages, you can define one language-neutral bitmap as the default and specify language IDs for as many other customized bitmaps as required. For example, bidirectional applications might require bitmaps with right-to-left directionality. Because the FindResource and FindResourceEx functions always search for specific language IDs first, they will always find a bitmap tagged with that language ID before they find one tagged as language-neutral. The search algorithm they follow is summarized in the following list:

  1. Primary language/sublanguage
  2. Primary language
  3. Language-neutral
  4. English (skipped if primary language is English)
  5. Any
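The order the book describes can be sketched in portable C. This is only an illustration of the quoted list, not the actual loader code: the LangId type, the helper names and find_resource_book_order are hypothetical, the macro is a local stand-in for PRIMARYLANGID from winnt.h, and treating step 4 as "any resource whose primary language is English" is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins for the winnt.h LANGID helpers, so this builds anywhere. */
typedef unsigned short LangId;
#define PRIMARY_LANGID(id) ((LangId)((id) & 0x3ff))  /* low 10 bits */

enum {
    LANGID_NEUTRAL    = 0x0000,  /* language-neutral */
    LANGID_GERMAN_DE  = 0x0407,  /* German (Germany) */
    LANGID_GERMAN_CH  = 0x0807,  /* German (Switzerland) */
    LANGID_ENGLISH_US = 0x0409,  /* English (U.S.) */
    PRIMARY_ENGLISH   = 0x09
};

/* Index of the first resource tagged exactly `wanted`, or -1. */
static int find_exact(const LangId *tags, size_t count, LangId wanted)
{
    for (size_t i = 0; i < count; ++i)
        if (tags[i] == wanted)
            return (int)i;
    return -1;
}

/* Index of the first resource sharing `primary`, whatever its sublanguage. */
static int find_primary(const LangId *tags, size_t count, LangId primary)
{
    for (size_t i = 0; i < count; ++i)
        if (PRIMARY_LANGID(tags[i]) == primary)
            return (int)i;
    return -1;
}

/* Hypothetical model of the five steps as the book lists them. */
int find_resource_book_order(const LangId *tags, size_t count, LangId wanted)
{
    assert(tags != NULL || count == 0);
    LangId primary = PRIMARY_LANGID(wanted);
    int hit;

    if ((hit = find_exact(tags, count, wanted)) >= 0)          /* 1 */
        return hit;
    if ((hit = find_primary(tags, count, primary)) >= 0)       /* 2 */
        return hit;
    if ((hit = find_exact(tags, count, LANGID_NEUTRAL)) >= 0)  /* 3 */
        return hit;
    if (primary != PRIMARY_ENGLISH &&                          /* 4 */
        (hit = find_primary(tags, count, PRIMARY_ENGLISH)) >= 0)
        return hit;
    return count > 0 ? 0 : -1;                                 /* 5: any */
}
```

Given Stuart's two resources (German (Germany) and English (U.S.)) and a request for German (Switzerland), this model stops at step 2 and returns the German (Germany) entry, which is the behavior he was expecting.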

But a lot has changed since this book was published (hell, just after the section I quoted is the one that talks about how you can use SetThreadLocale to choose the language, and if you are a regular reader here then you know how I feel about SetThreadLocale).

But there have been other changes in resource loading since NT 3.1 and Windows 95 that invalidate some of the priorities in resource loading. To understand why, it might be a good idea to think back to my post in May when I talked about The weird, weird world of the SUBLANGID. The truth is that each of these locales is different, and there are times that the behavior Stuart wanted (if not the full language I wanted, give me something close) may actually cause more problems than it could solve. Possibly not for German, but you never know. And there are times that you want to define an entirely different fallback logic, such as Basque or Catalan falling back to Spanish, and times that you do not want Bosnian to fall back to Croatian (or Simplified Chinese to fall back to Traditional Chinese).

So things had to change, they really did. And they have been changing over the last few versions.

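The specific change prior to Windows Vista is that resource loading follows the same numbered steps as above but drops the behavior described in the text above the list, i.e. "(If several resources with the same primary language but different sublanguages exist, the functions will return whatever they encounter first.)" That is, step 2 now only matches a resource tagged with the primary language and a neutral sublanguage, which is why Stuart's German (Switzerland) user skipped right past the German (Germany) strings and landed on English (U.S.).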
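Here is a minimal sketch of that changed order in portable C, again as an illustration rather than the real loader. The LangId type and the function names are hypothetical, the macro is a local stand-in for PRIMARYLANGID from winnt.h, and reading step 4 as "exactly English (U.S.)" is an assumption on my part.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins for the winnt.h LANGID helpers, so this builds anywhere. */
typedef unsigned short LangId;
#define PRIMARY_LANGID(id) ((LangId)((id) & 0x3ff))  /* low 10 bits */

enum {
    LANGID_NEUTRAL        = 0x0000,  /* language-neutral */
    LANGID_GERMAN_NEUTRAL = 0x0007,  /* German, neutral sublanguage */
    LANGID_GERMAN_DE      = 0x0407,  /* German (Germany) */
    LANGID_GERMAN_CH      = 0x0807,  /* German (Switzerland) */
    LANGID_ENGLISH_US     = 0x0409,  /* English (U.S.) */
    PRIMARY_ENGLISH       = 0x09
};

/* Index of the first resource tagged exactly `wanted`, or -1. */
static int find_exact(const LangId *tags, size_t count, LangId wanted)
{
    for (size_t i = 0; i < count; ++i)
        if (tags[i] == wanted)
            return (int)i;
    return -1;
}

/* Hypothetical model: same five steps, but no "first sublanguage found". */
int find_resource_neutral_only(const LangId *tags, size_t count, LangId wanted)
{
    assert(tags != NULL || count == 0);
    LangId primary = PRIMARY_LANGID(wanted);
    int hit;

    if ((hit = find_exact(tags, count, wanted)) >= 0)             /* 1 */
        return hit;
    /* 2: a LANGID with a neutral sublanguage is just the primary value */
    if ((hit = find_exact(tags, count, primary)) >= 0)
        return hit;
    if ((hit = find_exact(tags, count, LANGID_NEUTRAL)) >= 0)     /* 3 */
        return hit;
    if (primary != PRIMARY_ENGLISH &&                             /* 4 */
        (hit = find_exact(tags, count, LANGID_ENGLISH_US)) >= 0)
        return hit;
    return count > 0 ? 0 : -1;                                    /* 5: any */
}
```

With resources tagged German (Germany) and English (U.S.), a request for German (Switzerland) falls all the way to step 4 in this model. Retag the German table as German with a neutral sublanguage and the same request is satisfied at step 2.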

(At this point, people pretty much universally recommend that "best practice" in this case involves separate language resource DLLs, one per language. I will get into why this is the case at a later time.)

But specifically, if you do not want to follow the multiple DLL advice, it is important that you tag with the neutral language any resources that you want all of the various specific locales to fall back upon. As I mentioned back in June, Using full locales rather than the neutral ones is often not a terribly good idea.
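To make the tagging concrete: in a resource script this would be a LANGUAGE LANG_GERMAN, SUBLANG_NEUTRAL statement ahead of the string table, and the LANGID that results is simply the primary language with a zero sublanguage. The short C sketch below shows that every specific German locale maps to the same neutral tag. The LangId type and neutral_tag_for are hypothetical names, and the macros are local stand-ins for MAKELANGID and PRIMARYLANGID from winnt.h.

```c
#include <assert.h>

/* Local stand-ins for the winnt.h LANGID helpers, so this builds anywhere. */
typedef unsigned short LangId;
#define MAKE_LANGID(primary, sub) ((LangId)(((sub) << 10) | (primary)))
#define PRIMARY_LANGID(id)        ((LangId)((id) & 0x3ff))

enum {
    SUBLANG_NEUTRAL_VALUE = 0x00,
    LANGID_GERMAN_DE      = 0x0407,  /* German (Germany) */
    LANGID_GERMAN_CH      = 0x0807,  /* German (Switzerland) */
    LANGID_GERMAN_AT      = 0x0C07   /* German (Austria) */
};

/* Hypothetical helper: the neutral tag a specific locale falls back to. */
LangId neutral_tag_for(LangId specific)
{
    /* A language-neutral LANGID (0x0000) has no primary language to keep. */
    assert(PRIMARY_LANGID(specific) != 0);
    return MAKE_LANGID(PRIMARY_LANGID(specific), SUBLANG_NEUTRAL_VALUE);
}
```

German (Germany), German (Switzerland) and German (Austria) all come out as 0x0007 here, so a single string table tagged that way can serve every one of them.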

Because sometimes, it pays to be neutral!

(Editorial request of readers who wish to comment -- I would prefer that comments, if any, on this post not try to get into the religious issue of separate language resource DLLs and instead limit themselves to the religious issue of neutral vs. specific resources. I promise I will be covering the other issue soon!)

 

This post brought to you by "₦" (U+20a6, NAIRA SIGN)


# Vorn on 31 Aug 2005 4:41 AM:

Off topic for this post, but not exactly for this blog, the linked URL describes what happens when MacOS X comes across a character it has no glyph for, in particular in the private-use area. And, yeah, that's me.

Vorn

# Michael S. Kaplan on 31 Aug 2005 9:26 AM:

Yes, the Apple 'ultimate fallback' font, if no font can be found with the information....

# Michael S. Kaplan on 31 Aug 2005 9:31 AM:

I will actually be talking a bit about a few features I like among our Macintoshian brethren some day soon....

# Michael S. Kaplan on 31 Aug 2005 10:55 PM:

I can't believe there were no dissenting opinions on the question of neutral versus specific resource tagging.... amazing!

# Martin Bohring on 1 Sep 2005 3:34 AM:

Hey this is interesting on more than one level, because Switzerland is one of the countries claiming to be neutral.

# Stuart Barlow on 1 Sep 2005 6:19 AM:

Thanks for the blog Michael. Your previous posts on SUBLANG_NEUTRAL were the best links that Google gave me when I first hit the issue. :-) I guess that the old visual studio 6 circa 1998 (only 7 years old) is pushing down a specific locale route?

# Michael S. Kaplan on 1 Sep 2005 9:27 AM:

Hey Martin -- yep, that was kind of a baiting example, wasn't it? :-)

# Michael S. Kaplan on 1 Sep 2005 9:33 AM:

Good point, Stuart. Though I think there were neutral entries in my VS6 resource file creations, and I always wondered how they got them listed?

It seems like the option is gone now, so I guess they went to a model that uses the NLS API 100% of the time, so since we do not support them, neither do they. Though you could still get it by editing the .RC file directly....

# Tialaramex on 3 Sep 2005 8:26 AM:

Pango (under Unix) renders apparently valid Unicode characters for which no real glyph is available (in any chosen typeface or fallback) as a box containing the hexadecimal code point.

So on most freshly installed Linux boxes, the Linear B character 'Wheel' appears as a box with the digits 0100CF

I think this offers a unique combination of minimum ambiguity - they're obviously not "real" characters unlike the Mac OS examples, and maximum diagnostic value - a technical support person or 12 year old child can look up 0100CF in the Unicode table, see that it's a Linear B ideogram, do a Google search and download a font file. Drag font into fonts folder, reload, glyph appears. No special tools needed.

The SUBLANGID hilarity reminds me of a bug reported recently to the Gnumeric user list, the user complained that despite living in the UK, their copy of Gnumeric insisted on using dollars etc. It turned out that they had manually specified a locale string of en_UK.UTF-8 instead of the correct en_GB.UTF-8. After the fix was explained they quite reasonably asked what locale in this case should be used by people from Northern Ireland (which is surely in the United Kingdom, but not part of Great Britain). The answer is that ISO country codes are opaque strings, no matter how much it may appear otherwise.
