IsComplexEnoughForYou?

by Michael S. Kaplan, published on 2006/12/11 18:53 -05:00, original URI: http://blogs.msdn.com/b/michkap/archive/2006/12/11/1262690.aspx


There are a whole bunch of people both inside and outside of Microsoft who have gotten good use out of that GetDateFormat Tarek put together that I mentioned before in Cue the smarter version of GetDateFormat... ok, it's a wrap!.

The other day, one of those folks inside of Microsoft had a good question, though:

We have some dates shown in our UI dialogs, displayed variously via static text controls and RichEdit. To support the usual loc and bidi scenarios, we call GetDateFormat with the DATE_LTRREADING and DATE_RTLREADING flags so that it generates LRM and RLM directional marks for layout. (The actual flags used in any given instance are determined by a wrapper function that implements logic similar to what Michael Kaplan has talked about in his blogs at http://blogs.msdn.com/michkap/archive/2006/09/01/735817.aspx).

The problem we’re encountering is that sometimes these characters are interpreted correctly as directional hints, whereas other times controls show them as visible characters.

For example, on a clean XP SP2 English machine the static control will show extra visible characters, which are actually the LRM reading codes. After installing a MUI pack, the control correctly interprets the marks even if no additional languages are actually installed by the MUI-pack install and even if everything is still set to English (which technically makes the marks redundant).

I’m wondering if it is control or font functionality added by the MUI pack (and if so, what’s the best way to detect when the necessary support exists), or if perhaps the wrapper function for GetDateFormat is overly aggressive in passing the DATE_*READING flags to GetDateFormat? If it is the latter, what is the logic modification – not passing a reading flag unless the flag determined by the default locale is either different from flag implied by the date format locale\calendar or the locale is natively a bidi locale? I’m hoping one of the language gurus can point out where our mistake is, because I’m not really an expert in this area.

Thanks,
Chris

The confusion Chris is seeing is caused by the fact that the MUI pack installs complex script support (described further here), which is really what decides whether U+200e and U+200f are going to be LEFT-TO-RIGHT MARK and RIGHT-TO-LEFT MARK or two funny square boxes that are not in the font.

(I was about to say that this is not the sort of thing that could be solved at the font level when I realized I would actually be full of crap if I made such a claim; this problem could probably be solved in the font by putting the characters in there and making them invisible by mapping them to a space or something. Thank goodness I think before I post, these days!)

How about figuring out whether you have complex script support installed, so you know whether or not to pass the DATE_LTRREADING or DATE_RTLREADING flags?

Although one might intuitively expect a function to tell you whether or not complex scripts are enabled, there isn't one. The way to find out is to use the IsValidLocale function with the LCID_INSTALLED flag on any locale that requires complex script support, such as:

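// IsValidLocale takes an LCID, so build the Hebrew one (0x040d) rather than passing the bare LANG_HEBREW primary language ID.
BOOL fComplexScripts = IsValidLocale(MAKELCID(MAKELANGID(LANG_HEBREW, SUBLANG_DEFAULT), SORT_DEFAULT), LCID_INSTALLED);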
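
To make the idea a bit more concrete, here is a rough sketch of how a wrapper like the one Chris describes might use that check. FilterReadingFlags and HasComplexScriptSupport are hypothetical names of my own, not anything in NLS, and the non-Windows stand-in definitions exist only so the flag logic can be compiled elsewhere (the two values mirror the DATE_*READING definitions in winnls.h). Treat it as an illustration under those assumptions, not as the wrapper anyone actually shipped.

```c
#ifdef _WIN32
#include <windows.h>
#else
/* Stand-ins for non-Windows builds; values mirror winnls.h. */
typedef unsigned long DWORD;
#define DATE_LTRREADING 0x00000010
#define DATE_RTLREADING 0x00000020
#endif

/* Hypothetical helper: strip the reading-order flags when complex script
   support is absent, so that GetDateFormat is not asked to insert
   U+200E/U+200F marks that would show up as visible boxes. */
static DWORD FilterReadingFlags(DWORD dwFlags, int fComplexScripts)
{
    if (!fComplexScripts)
        dwFlags &= ~(DWORD)(DATE_LTRREADING | DATE_RTLREADING);
    return dwFlags;
}

#ifdef _WIN32
/* Hypothetical helper: probe one locale that needs complex script support
   (Hebrew here; any other complex script locale should do as well). */
static int HasComplexScriptSupport(void)
{
    LCID lcidHebrew = MAKELCID(MAKELANGID(LANG_HEBREW, SUBLANG_DEFAULT),
                               SORT_DEFAULT);
    return IsValidLocale(lcidHebrew, LCID_INSTALLED) ? 1 : 0;
}
#endif
```

A caller on Windows would then pass FilterReadingFlags(dwFlags, HasComplexScriptSupport()) as the flags argument to GetDateFormat, leaving the rest of the wrapper's logic for picking LTR versus RTL untouched.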

Now why this is the way to do it actually makes an interesting historical side note, I think...

You see, in Windows 2000 the language support was chunked into language groups, as I first mentioned in Language groups -- the vestigial tail of NLS.

Now of those 17 groups, only some of them require complex script support to work properly, but if any one of those specific language groups was installed, complex script support would be included along with the keyboards and fonts and such. And there is also an IsValidLanguageGroup function that could be used for similar purposes, for just that reason.

But since functions like GetDateFormat actually work based on locales, no one felt any pressing need to add a separate IsComplexScriptSupportOn function, especially since "complex script support" was not yet present as a concrete notion, and since you really needed to ask the question about support of the locale anyway to make sure that the font and keyboard and such would also be there. So the IsValidLocale call seemed like the best bet.

And then in XP all of the complex script language groups were added to the one checkbox for enabling complex script support. But no one seemed all that busy or eager to create a new function to support this UI notion in the updated Regional and Language Options since people would still have to use locales any time they wanted to do something with language, and just asking for any one you needed to support would do the trick. The fact that having one of them installed meant having all of them installed was just incidental.

Now of course in Vista this support is always present so the whole question is moot; it would be ridiculous to create a function that would always return TRUE. So there is even less reason for a new function.

But it is still a good question, in any case. And if one wants to avoid seeing NULL glyphs in one's date format strings then it is a very sensible check to be doing....

 

This post brought to you by U+200e (a.k.a. LEFT-TO-RIGHT MARK)

