Every character has a story #34: LATIN LETTER T WITH CEDILLA (U+0162/U+0163)

by Michael S. Kaplan, published on 2011/08/26 07:01 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2011/08/26/10200787.aspx


It's possible to go a long way when you don't even exist.

Look how it worked out for the Capital Sharp S? :-)

Some of them get baked into ISO 8859 and Unicode much earlier though.

Like Ţ and ţ, for example.

Yes, there was a proposal for it to be used in French once upon a time, as per Wikipedia:

In 1868, Ambroise Firmin-Didot suggested in his book Observations sur l'orthographe, ou ortografie, française (Observations on French Spelling) that French phonetics could be better regularized by adding a cedilla beneath the letter "t" in some words. For example, it is well-known that in the suffix -tion this letter is usually not pronounced as (or close to) /t/ in either French or English. It has to be distinctly learned that in words such as French diplomatie (but not diplomatique) and English action it is pronounced /s/ and /ʃ/, respectively (but not in active in either language). A similar effect occurs with other prefixes or within words also in French and English, such as partial where t is pronounced /s/ and /ʃ/ respectively. Firmin-Didot surmised that a new character could be added to French orthography. A similar letter, the t-comma, does exist in Romanian, but it has a comma accent, not a cedilla one.

But since it never happened, it doesn't count so much.

Let's face it, this one was added for the sake of Romanian.

And we know how that worked out.

(ref: The history of messing up Romanian on computers)

Oh well, it will work out some day....
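
For anyone who wants to see the distinction concretely, here is a minimal Python sketch (not part of the original post) that compares T with cedilla against T with comma below. It uses only the standard `unicodedata` module and the built-in ISO 8859 codecs. The `describe` helper and the variable names are made up for illustration, and the ISO 8859-16 comparison is an addition here, since the post itself only mentions the earlier parts of ISO 8859.

```python
import unicodedata

# The two look-alike pairs: cedilla (the subject of this post) vs. comma below.
T_CEDILLA = "\u0162\u0163"      # T with cedilla, capital and small
T_COMMA_BELOW = "\u021A\u021B"  # T with comma below, capital and small


def describe(ch):
    """Return (code point, Unicode name, canonical decomposition) for one character."""
    return (
        f"U+{ord(ch):04X}",
        unicodedata.name(ch),
        unicodedata.decomposition(ch),
    )


# Print one line per character so the differing combining marks are visible:
# U+0327 COMBINING CEDILLA vs. U+0326 COMBINING COMMA BELOW.
for ch in T_CEDILLA + T_COMMA_BELOW:
    print(*describe(ch), sep="  ")

# The same two bytes decode to different letters depending on the ISO 8859 part.
legacy = bytes([0xDE, 0xFE])
as_latin2 = legacy.decode("iso8859_2")    # ISO 8859-2: the cedilla forms
as_latin10 = legacy.decode("iso8859_16")  # ISO 8859-16: the comma-below forms
print([f"U+{ord(c):04X}" for c in as_latin2])
print([f"U+{ord(c):04X}" for c in as_latin10])
```

Since the two pairs are separate code points with different decompositions, no normalization form will turn one into the other; any mapping between them has to be done explicitly.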


John Cowan on 26 Aug 2011 9:49 AM:

Mny typos.  Try speling chequer.

Kemp on 26 Aug 2011 10:59 AM:

So, wait. I don't think I realised the extent of the fail here before. Latin t with cedilla isn't used anywhere? It was only added (wrongly) in place of Latin t with comma below? I really don't understand these decision processes...

Michael S. Kaplan on 26 Aug 2011 6:14 PM:

???? John?

rjcox on 27 Aug 2011 2:18 AM:

@Kemp the implication here that Latin t with comma below is not used led me to look it up in "The Elements of Typographic Style". Bringhurst says (in Appendix B) that the undercomma, as he names this diacritic, is actually an artifact of mechanical typewriters: having no proper cedilla, a comma was over-typed.

This would make sense, as with the use of two spaces after a period and underscores for emphasis. But unlike those, it seems to have become the typographic preference.

Of course this could be an authorial opinion expressed as fact (Bringhurst certainly isn't short on expressing preference as correctness elsewhere).

However he does state that both T and S undercomma are used in Romanian, and T undercomma also used in Livonian.

And T with cedilla isn't listed as a used combination.

Michael S. Kaplan on 27 Aug 2011 7:24 AM:

I'm a huge Bringhurst fan, but I'm not quite ready to write off Romanian stated preference as typographic blunder....

John Cowan on 27 Aug 2011 9:57 AM:

Blunders do come to be preferred sometimes.  Consider the word "syllabus", whose OED etymology is sheer comedy:

From modern Latin syllabus, usually referred to an alleged Greek σύλλαβος. Syllabus appears to be founded on a corrupt reading syllabos in some early printed editions — the Medicean MS. has sillabos — of Cicero Epp. ad Atticum iv. iv, where the reading indicated as correct by comparison with the MS. readings in iv. v. and viii. is sittybas or Greek σιττύβας, accusative plural of sittyba, σιττύβα 'parchment label' or 'title-slip on a book'. (Compare Tyrrell and Purser Correspondence of Cicero nos. 107, 108, 112, Comm. and Adnot. Crit.) Syllabos was græcized by later editors as συλλάβους, from which a spurious σύλλαβος was deduced and treated as a derivative of συλλαμβάνειν to put together, collect (compare syllable n.)]

Kemp:  Not so much a decision process as a lack of recognition of the distinction.  It was a rule when 8859-1 through -4 were designed that if the same character appeared in more than one encoding, it always appeared at the same code point.  S-comma (for Romanian) and s-cedilla (for Turkish) were merged in ISO 8859-2 and 8859-3 respectively, because it was not realized that they were different, and the Romanian t-comma appeared in the standard as a spurious t-cedilla.  For Turkish use, 8859-3 has now been replaced by 8859-9, which does not follow this rule (it is a variant of 8859-1 with Icelandic-specific characters replaced with Turkish ones).

(Sorry about the typos business: that belonged on a different post, but I can't figure out which.)

