I know I'll Never say Never... again, at least

by Michael S. Kaplan, published on 2009/07/30 10:01 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2009/07/30/9853370.aspx


So, thinking about consequences of the CASING piece of Every character has a story #33: U+1e9e (CAPITAL SHARP S, Microsoft edition - Part 2).

And ignoring that we didn't follow my recommendation, since I've covered that, for now.

Let's look at what we did do.

We essentially took a letter whose primary and, for all practical purposes, only need is to handle the German word with a Sharp S that is fully capitalized, and made sure that no process will ever capitalize it.

Ever.

Have a cup of irony, will ya? :-)

Yesterday I did talk about another issue I noticed.

Regular readers know I've discussed U+03c2 (GREEK SMALL LETTER FINAL SIGMA) before.

This letter, which has had an intentional one-way mapping to the regular uppercase function since NT 3.1, had this mapping moved to the linguistic tables activated by LCMAP_LINGUISTIC_CASING (for all locales).

This changes the fundamental behavior of a character in the filesystem that has been in existence since the early 90's if you count the beta versions.

As discussed in The last word on the FINAL SIGMA:

If you are a native speaker of Greek, then you know that both ς (U+03c2, a.k.a. GREEK SMALL LETTER FINAL SIGMA) and σ (U+03c3, a.k.a. GREEK SMALL LETTER SIGMA) do indeed uppercase to Σ (U+03a3, a.k.a. GREEK CAPITAL LETTER SIGMA). But if we added this character to the linguistic table, then suddenly ς would never work in the CharUpper/CharUpperBuff functions and would not work in the default call to the LCMapString function with the LCMAP_UPPERCASE flag.

Obviously that would not be a good thing.

Try to imagine how you would feel if attempting to uppercase the string hello would come out as HELLo. Wouldn't you consider it a bug? Especially if it used to come out with the HELLO you were expecting? You might be thinking about telling the platform GooDBYE, if you know what I mean.

Now you don't have to imagine it -- you get to see it. If you deal with Greek at least.
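For anyone who wants to poke at the mapping without a Windows 7 machine handy, here is a rough sketch in Python. It uses Python's own Unicode case data, not the Windows casing tables, so it only illustrates the idea: the first result is the uppercasing a Greek reader expects, and the second simulates a table that has no entry for ς. The helper upper_without_final_sigma is hypothetical and made up purely for this illustration; it is not how LCMapString is implemented.

```python
FINAL_SIGMA = "\u03c2"    # ς GREEK SMALL LETTER FINAL SIGMA
SMALL_SIGMA = "\u03c3"    # σ GREEK SMALL LETTER SIGMA
CAPITAL_SIGMA = "\u03a3"  # Σ GREEK CAPITAL LETTER SIGMA


def upper_without_final_sigma(text: str) -> str:
    """Hypothetical stand-in for an uppercase table that lacks the ς entry.

    Every character is uppercased with Python's Unicode data except
    U+03C2, which is passed through unchanged.
    """
    return "".join(ch if ch == FINAL_SIGMA else ch.upper() for ch in text)


word = "\u03c3\u03b1\u03c2"  # σας: starts with σ and ends with ς

full = word.upper()                          # both sigmas become Σ
simulated = upper_without_final_sigma(word)  # the Greek cousin of HELLo

print(full)       # ΣΑΣ
print(simulated)  # ΣΑς

# The mapping is one-way: a lone Σ lowercases to σ, never to ς.
print(CAPITAL_SIGMA.lower() == SMALL_SIGMA)  # True
```

The last line is the reason the uppercase mapping for ς has to be one-way: two different lowercase letters share a single capital, so the capital can only point back at one of them by default.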

Pages that rely on all-caps styling in places like IE will look different, too -- at least when they are viewed in the latest version.

The easiest way to see it is to use Word's "Change Case" support -- prior versions will convert U+03c2 to U+03a3; Windows 7 will not...


# Carl on 31 Jul 2009 7:39 AM:

Surely this begs for the simple solution: uppercase final sigma. ;-)

# Michael S. Kaplan on 31 Jul 2009 9:37 AM:

Ah you don't think we'd mess that one up, too? :-)

# Stephen A. on 30 Aug 2009 4:44 PM:

Uppercase final sigma and have comparisons to uppercase sigma evaluate to false? Ouch!

Out of curiosity, I tested this on OpenOffice/Linux (which, I assume, relies on the Pango library) and 'ς' turns correctly into 'Σ' and back to 'ς'.

Can we expect to see a fix in a future service pack, or is this behavior here to stay? The "convert to uppercase" functionality is essentially broken as it stands now (the HELLo example is apt!) :-)


referenced by

2010/06/22 It would save a lot of time if I just stopped being right, aka About scuba diving, aka HELLo!
