Collation != case, still

Richard asked in the Suggestion Box, and I decided to dispatch it quickly:

Why is it that English (en-US, because there is no en-GB) Windows and .NET don't know how to upper case a Latin Small Letter Sharp S even with the de-DE locale specified:


does not return "SS", but "ß"?

The Unicode casing file CaseFolding.txt has


Is this a Windows limitation? (Which would not help, given I'm trying to put together a demo of doing the right thing to build I18n into an application update.)

This is a question I have talked about many times in the past, as a simple search for U+00df indicates. And most importantly, since Casing and IgnoreCase are still not the same thing and Collation != Case (a.k.a. Collation <> Case), for now this is how casing will work on Microsoft platforms -- what Unicode refers to as simple casing....
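To make the simple versus full casing distinction concrete, here is a minimal sketch in Python (chosen purely for illustration; it is not how Windows or .NET implement anything). Python's built-in str.upper() applies Unicode's full case mappings, so it can turn one character into two. The helper simple_upper is a hypothetical name of my own, and it only approximates simple casing by rejecting any mapping that changes the length; a handful of characters have a one-to-one simple mapping that differs from their full mapping, and this sketch does not reproduce those.

```python
def full_upper(text: str) -> str:
    """Uppercase using Python's built-in full case mappings (may lengthen the string)."""
    return text.upper()


def simple_upper(text: str) -> str:
    """Rough approximation of simple casing: one character in, one character out.

    Assumption: any character whose full uppercase mapping is longer than one
    character is left unchanged. This is a sketch, not the real simple-casing table.
    """
    out = []
    for ch in text:
        mapped = ch.upper()
        # Keep the original character when the mapping would expand it.
        out.append(mapped if len(mapped) == 1 else ch)
    return "".join(out)


if __name__ == "__main__":
    print(full_upper("Straße"))    # STRASSE
    print(simple_upper("Straße"))  # STRAßE
    print("ß".casefold())          # ss (case folding, a separate operation from uppercasing)
```

The second result is the shape of what the question is seeing: with one-to-one casing, ß has nowhere to go and comes back unchanged.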


This post sponsored by "ß" (U+00df, LATIN SMALL LETTER SHARP S)

# J. Daniel Smith on 8 Aug 2006 5:15 PM:

I know there are "good reasons" for things working the way they do...but anybody who has had even a semester of high-school German knows that ß upper-cases to SS; that is, "Straße" (street) becomes "STRASSE" (although I seem to recall that perhaps the rules are different in Austria and/or Switzerland?)

Since in .NET, ToUpper() returns a new string, it "should" be easier to "fix" this problem in that environment.

# Michael S. Kaplan on 8 Aug 2006 5:26 PM:

Well, "should be" is a relative term -- it is still using the same casing tables to do the work. We are more flexible in collation so we give the support....

# Richard on 9 Aug 2006 5:44 AM:


(Sharp S search failed to find anything... didn't try just the code point.)

# Richard on 10 Aug 2006 5:19 AM:

Or, rather I should say,

Search for "Sharp S" failed to find anything about case folding (quite a few hits around collation/equality.)

