There's a ™ joke in here somewhere, I just don't know what it is (aka And if a 't' turned out to be 'm'…)

by Michael S. Kaplan, published on 2011/06/20 07:01 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2011/06/20/10176769.aspx


It isn't always true, but it often is -- the issue, when it came up, was brought up by a tester.

This makes sense, really.

No matter how much self-hosting people like me might do, a tester, by the very nature of the job, spends most of the day on a mix of random ad-hoc, directed ad-hoc, and other testing....

Now do you remember when I talked about pseudo-localization, back in One of my colleagues is the "Pseudo Man" (a rich source of puns in conversation!)?

Well, a tester noticed that in the pseudo build, a bunch of strings that were supposed to have "t" in them seemed to have "m" in them instead.

A shaping problem? A font problem? A pseudo problem? No one was sure.

Well, I did know there was no shaping involved. A font problem seemed unlikely, and a pseudo issue seemed far-fetched.

The mail thread had some perplexed folks on it, let me tell you.

Now the way pseudo works, each letter has a set of "replacement" characters that look like it. So take, for example:

t      U+0074    LATIN SMALL LETTER T

For this letter, there are several potential "alternates":

Letter Codepoint Name
ţ      U+0163    LATIN SMALL LETTER T WITH CEDILLA
ť      U+0165    LATIN SMALL LETTER T WITH CARON
ŧ      U+0167    LATIN SMALL LETTER T WITH STROKE
τ      U+03C4    GREEK SMALL LETTER TAU
т      U+0442    CYRILLIC SMALL LETTER TE
ｔ     U+FF54    FULLWIDTH LATIN SMALL LETTER T
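
To make the replacement scheme concrete, here is a minimal Python sketch of it, using only the alternates for "t" listed above. The names ALTERNATES, pseudo_localize, and describe are hypothetical, and picking an alternate at random is an assumption; the actual pseudo tooling may choose its replacements quite differently.

```python
import random
import unicodedata

# Alternates for "t", taken from the table above. A real pseudo-localization
# tool would carry a table like this for every letter; this one is illustrative.
ALTERNATES = {
    "t": ["\u0163", "\u0165", "\u0167", "\u03c4", "\u0442", "\uff54"],
}


def pseudo_localize(text, rng):
    """Replace each character that has alternates with one picked by rng.

    Random selection is an assumption made for this sketch.
    """
    out = []
    for ch in text:
        choices = ALTERNATES.get(ch)
        out.append(rng.choice(choices) if choices else ch)
    return "".join(out)


def describe(text):
    """Return (character, code point, Unicode name) for each character."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs pick the same alternates
    sample = pseudo_localize("attempt", rng)
    # Print code points and names only, so the output does not depend on
    # how the console renders the characters themselves.
    for _, code_point, name in describe(sample):
        print(code_point, name)
```

Dumping the code points this way is a quick check on what a string really contains: a "t" that displays oddly may turn out to be U+0442 or one of the other alternates rather than U+0074.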

There was a part of me that was sad that the Tenge was not on the "Capital T alternatives", especially after I wrote It is with a tenge of sorrow that I say this. I mean, I'm over it now. But I was sad for a little while....

Anyway, back to the t that becomes an m. Do you know what's going on?

Just think of it as yet another flavor of the issue I talked about previously, in blogs like Small case is not just tinier capitals; italics are not merely slanted letters and You say ĭtalics, I say ītalics. It is much more complicated in Cyrillic....

Judy pointed out some of the form issues, with art:


This fact interested me for many different reasons.

So I went off and looked at what various fonts did.

I am pretty sure something in here might be weird. Probably not bugs though:

Sometimes your t is gonna look like an m; deal with it!


andreas on 20 Jun 2011 9:55 PM:

back in 2000 I stayed a week in Sofia (BG). was invited one evening for dinner by local friends. tried to decipher the (cyrillic) menu. asked if "salama" was sausage. big laughter: no, that's salad! :-))

M W Lees-Grossmann on 21 Jun 2011 5:48 AM:

That last graphic with the various fonts is almost entirely 8859P15/MSWin1252 fonts. What happens with Arial Unicode MS and Mincho? This looks like a straightforward substitution problem similar to something weird I ran into some years ago with the Cyrillic "page" (Unicode vs. cp1251 vs. 8859-5).

Michael S. Kaplan on 21 Jun 2011 6:51 AM:

Not sure what you mean, they are all *Unicode* fonts.

