Over in the Suggestion Box, mpz asked:

Suggestion: Write about the new Latin capital letter sharp S introduced in Unicode 5.1.0.

Fair enough....

Though to be honest, by the time I get through:

I think my thoughts on the matter have been pretty much covered.

It is hard to say how things will go on that last point, as my opinions are fairly controversial and it is just as likely that they will not go in that direction....

But otherwise, the invention of letters that do not actually exist is quite powerful, as is the decision to ignore intuitive casing behavior or make unrealistic case mappings. Unicode has been doing it for some time and they seem pretty popular.
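For anyone who wants to see the case mappings in question, here is a minimal Python sketch. It is an illustration rather than anything from the original discussion, and it assumes a Python 3 interpreter whose Unicode data is 5.1 or later; the helper name `describe` is made up for this example.

```python
import unicodedata

SMALL_SHARP_S = "\u00df"    # LATIN SMALL LETTER SHARP S
CAPITAL_SHARP_S = "\u1e9e"  # LATIN CAPITAL LETTER SHARP S


def describe(ch: str) -> dict:
    """Collect the character name and the default case mappings Python reports."""
    return {
        "name": unicodedata.name(ch),
        "upper": ch.upper(),
        "lower": ch.lower(),
        "casefold": ch.casefold(),
    }


small = describe(SMALL_SHARP_S)
capital = describe(CAPITAL_SHARP_S)

# The small letter uppercases to two letters, not to the new capital.
print(small["name"], "-> upper:", small["upper"])
# The capital letter lowercases to the small one, so the mapping is one-way.
print(capital["name"], "-> lower:", capital["lower"])
# Uppercasing and then lowercasing the small letter does not round-trip.
print("round trip:", small["upper"].lower())
```

The point of the sketch is the asymmetry: the default uppercase of U+00DF is still "SS", while U+1E9E lowercases to U+00DF, so nothing in the default mappings ever produces the capital form.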

The whole issue makes me wonder about how Germany really feels about capital punishment, given all of the capital letter punishment they seem comfortable with. :-)


This post brought to you by ß and ẞ (U+00df and U+1e9e, LATIN SMALL LETTER SHARP S and LATIN CAPITAL LETTER SHARP S)

Andrew West on 15 Apr 2008 10:40 AM:

Hmm, this post seems to have been brought to us by ß and ß (U+00df and U+00df, LATIN SMALL LETTER SHARP S and LATIN SMALL LETTER SHARP S), with the latter letter using HTML to masquerade as ẞ (U+1e9e, LATIN CAPITAL LETTER SHARP S). You may not like the little fella, but at least let us see him.

Michael S. Kaplan on 15 Apr 2008 11:08 AM:

Most fonts don't have him yet, so I thought some smoke and mirrors would be appropriate. :-)

Andrew West on 15 Apr 2008 11:17 AM:

I can see him just fine with IE7 configured to use Code2000 for Latin based text.

Michael S. Kaplan on 15 Apr 2008 11:23 AM:

Okay, I changed it. Enjoy your capital[ letter] punishment. :-)

Andrew West on 15 Apr 2008 11:25 AM:

Much better!

John Cowan on 15 Apr 2008 4:12 PM:

It looks like capital S with comma below to me, actually.

Michael S. Kaplan on 15 Apr 2008 6:42 PM:

That must be your font; the character is right....

ReallyEvilCanine on 16 Apr 2008 11:49 AM:

Stupid, stupid, stupid. So when are we getting capital versions of other lower case-only ligatures like ff, fi, fl and ffi? Modern Icelandic has a capital eth (Ð) even though no word begins with it because there used to be words that did. We have some of 'em, or at least their descendants. No word in German can or ever could begin with a double-ess (a ligature which has an odd history in and of itself). A capital "ß" is a pinnacle of linguistic ignorance and bureaucratic idiocy.

Andrew West on 17 Apr 2008 7:23 AM:

Read the proposal from DIN -- rightly or wrongly, a capital form of the letter is sometimes used in the middle of words written in all capitals (even on the cover of the 1960 and 1964 Leipzig editions of "Der Große Duden"). Unicode is descriptive not prescriptive, and if a character exists then it is a candidate for encoding -- it is not up to Unicode to refuse to encode a character just because its usage is questionable.

As to capital eth (Ð), it is used initially in Old English, and even in modern Icelandic it is required when a word with small eth (ð) medially is uppercased.

John Cowan on 17 Apr 2008 9:28 AM:

It looks correct now (that is, I get a boxed-1E9E last-resort glyph).  Since I switched to Linux, it's amazing how many of those I see now in the C0 range when surfing: people just seem to like throwing random U+000B's into their so-professional Web content.

ReallyEvilCanine on 18 Apr 2008 8:51 AM:


Perhaps I didn't make clear the source of my annoyance. DIN decided to invent a glyph for a character which is NEVER used as a capital because it has no initial form. It is only used in all caps to indicate the usage of the "ß" in normal upper & lower case when there's a possibility of confusion; otherwise two esses are used.

It's as absurd as ISO decreeing a capital long-ess form.

mpz on 21 Apr 2008 9:13 AM:

Thanks! I didn't realize you'd already written about it. I guess that means I should read your blog more often ;-)

