New in Vista Beta 1: more use of the word 'linguistic'

by Michael S. Kaplan, published on 2005/08/02 05:10 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2005/08/02/446522.aspx

Now I answer a lot of questions in this blog: some that people ask directly, many that people have asked in the past, and a lot of others about things that no one has asked but that I personally find interesting.

This was back in the early days when I was a bit more popular, so that even though none of the people commenting on the post had personally witnessed the issue, they found it kind of interesting (I have since mostly slipped into something a bit more obscure except when I have art that has women falling head first off of mechanical bulls!).

I am not complaining, mind you. But it is hard to not notice the fact that my posts about any topic other than internationalization seem to draw about 2-20 times the interest!

On top of which, earlier today I looked at an advance copy of a book on internationalization coming out soon. It is probably going to be well over 500 pages, only 5-6 of which are about my favorite topic (collation). It managed to split collation and string comparison into two different topics, spent two pages talking about alternate sorts, and covered almost none of the topics I go on about here as real concerns for internationalization in Windows and the .NET Framework. And I think the author might read this blog!

And that was just the technical stuff; the non-technical stuff was just as helpful (more on this in another post I'll do later).

Nevertheless, I carry on. Someone likes what I am doing here, I'm sure of it. And I get to say it all my way, too. Sometimes people point out bugs, and other times I find bugs myself while posting. Which is undeniably cool. Maybe a year from now a Google search will dig up an answer to a question that helps someone save the day or whatever. And all of that is really good enough for me.

But every once in a while (and here is where I pop the stack a bit to the original purpose of the post) I get to post about something obscure but fun that no one out there in the world knows about yet.

There are a whole bunch of people who read about the odd use of the word linguistic when I answered that other question (What does "linguistic casing" mean?). At the time I proposed that we could have called the LCMAP_LINGUISTIC_CASING flag LCMAP_UNICODE_SIMPLE_CASING and been just as close to what was actually going on with the flag. But this is selling the flag a little short; after all, it handles Turkic casing, and it takes many of the lookalike symbols in Unicode that are identically shaped to Greek and other letters, and converts them to those letters. All operations that are sensible linguistically even if not a good idea in filesystems and other less linguistic operations.
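For anyone who has not run into the Turkic casing issue before, here is a minimal sketch of the idea in Python. To be clear, this is not how LCMapString does it and it calls no Windows API at all; the function names turkic_upper and turkic_lower are made up for illustration. All it shows is the four-letter mapping (i/İ and ı/I) that makes Turkish and Azeri casing differ from the default casing.

```python
# Toy illustration of Turkic casing; NOT the Windows implementation.
# Function names are hypothetical and exist only for this sketch.

# In Turkic casing, dotted and dotless i are two separate letter pairs:
#   i (U+0069) <-> İ (U+0130)
#   ı (U+0131) <-> I (U+0049)
_TURKIC_TO_UPPER = {ord("i"): "\u0130", ord("\u0131"): "I"}
_TURKIC_TO_LOWER = {ord("I"): "\u0131", ord("\u0130"): "i"}


def turkic_upper(text):
    """Uppercase text, keeping the dot on i the way Turkish expects."""
    # Handle the two special letters first, then let the default
    # casing rules take care of everything else.
    return text.translate(_TURKIC_TO_UPPER).upper()


def turkic_lower(text):
    """Lowercase text, mapping I to dotless ı and İ to plain i."""
    return text.translate(_TURKIC_TO_LOWER).lower()
```

With the default rules, "istanbul".upper() gives "ISTANBUL", while turkic_upper("istanbul") gives "İSTANBUL". That difference is exactly why you would not want this behavior silently applied to something like filenames.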

And I guess that is kind of linguistic.... well, more "language-like" rather than "computer-like".

And in Windows Vista, this use of the word "linguistic" will continue, with several new flags. All of them can be used alongside the other flags in CompareString and LCMapString, all of them are available in Vista Beta 1, and all of them will be documented in the Longhorn SDK as soon as the part that covers Win32 is available:

LINGUISTIC_IGNORECASE -- Could have been called NORM_THEREALANDACTUALIGNORECASE, as it does what NORM_IGNORECASE ought to do and only masks the case information in scripts that actually have a notion of case.

LINGUISTIC_IGNOREDIACRITIC -- Could have been called NORM_WHATWEREALLYMEANTFORIGNORENONSPACE, as it does what NORM_IGNORENONSPACE does and only masks the diacritic weight for that small range of scripts where an actual, European-style notion of diacritics is used (which is a great way to address that issue I mentioned earlier that affects Korean and other languages!).

NORM_LINGUISTIC_CASING -- The flag that will allow comparisons on Win32 to handle Turkic case properly, whether or not either NORM_IGNORECASE or the new LINGUISTIC_IGNORECASE is specified (something that could not be done with the existing flags due to the breaking effect that would have on existing code involving the filesystem and other situations).
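To give a rough feel for what "only in scripts that have the notion" means for diacritics, here is a toy model in Python. It is a sketch under loud assumptions: it does not touch CompareString or any sort weights, the helper name strip_european_diacritics is hypothetical, and the choice of Latin, Greek and Cyrillic as the "European style" scripts is my guess for illustration, not the actual list Windows uses.

```python
import unicodedata

# ASSUMPTION: which scripts count as having "European style" diacritics.
# This tuple is illustrative only, not the list Windows actually uses.
EUROPEAN_SCRIPTS = ("LATIN", "GREEK", "CYRILLIC")


def strip_european_diacritics(text):
    """Drop combining marks, but only when they follow a base letter
    from one of the assumed European scripts; other scripts keep theirs."""
    kept = []
    strip_marks = False
    for ch in unicodedata.normalize("NFD", text):
        if unicodedata.combining(ch):
            # A combining mark: skip it only if its base letter qualified.
            if strip_marks:
                continue
        else:
            # A base character: decide from its Unicode name whether the
            # marks that follow it should be ignored.
            name = unicodedata.name(ch, "")
            strip_marks = name.startswith(EUROPEAN_SCRIPTS)
        kept.append(ch)
    return unicodedata.normalize("NFC", "".join(kept))
```

So "résumé" comes back as "resume", while a Hebrew shin with its shin dot, or a Hangul syllable, passes through unchanged. The real flag works on collation weights rather than by rewriting strings, so treat this purely as a picture of the intent.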

Now none of these three operations are what a linguist would actually call linguistic. But all three of these operations certainly provide collation support on Win32 with behavior that is more linguistically appropriate than the existing support does on Windows. Which is pretty awesome....

And it is pretty darn linguistic if you accept the tap-dancing act above. Which is a lot better of a dance than the one done about the naming of the ANSI code page, or the Visual InterDev product! :-)

This post brought to you by "İ" (U+0130, a.k.a. LATIN CAPITAL LETTER I WITH DOT ABOVE)

Hi Michael,

I just wanted to say that I read your blog every day, but I don't comment because I don't think I have the necessary knowledge around internationalization to comment sensibly. I suspect there may be quite a few other people that are doing the same.

I have one request though. You have mentioned font linking in previous posts, but haven't described how it actually works. Could you do a post on this?

Thanks,
Rob

Hi Rob -- Well, there are the posts http://blogs.msdn.com/michkap/archive/2005/03/20/399322.aspx and http://blogs.msdn.com/michkap/archive/2005/05/16/417711.aspx which do talk about different aspects of font linking. That second post especially talks about the difference in how Uniscribe and MLang each handle font linking.... I'll be doing more to cover how Uniscribe and GDI+ do it, and eventually I'll even talk about how Avalon does it, too. :-)

Don't you get a hit count report? I follow your blog religiously, and I doubt I am alone. Much enjoyed and appreciated.

Hey, don’t give up!

If you get no comments, it might mean you are doing a good job explaining.

I know I read you daily, but I do not comment until I have something to say.

So, if you want to see how many people actually read you, just post something technically incorrect :-)

Hi Mihai --

They also might be asleep. :-)

No worries, I have no plans to give up doing this.

Which is undeniably cool. Maybe a year from now a Google search will dig up an answer to a question that helps someone save the day or whatever.

<Evil Conspiracy>What would one be without Google?<g/></Evil Conspiracy>
