by Michael S. Kaplan, published on 2006/02/17 04:01 -05:00, original URI: http://blogs.msdn.com/b/michkap/archive/2006/02/17/533929.aspx
The other day I was looking at a particular bug repro (it was actually that BACKSPACE vs. DELETE bug I have mentioned before, if you were curious).
Anyway, I decided to take the letter a and put as many different diacritics on it as I could. Here it is:
à̴̵̶̷̸̡̢̧̨̛̖̗̘̙̜̝̞̟̠̣̤̥̦̩̪̫̬̭̮̯̯̰̱̲̳̹̺̻̼͇͈͉͍͎́̂̃̄̅̆̇̈̉̊̋̌̍̎̏̐̑̒̓̔̽̾̿̀́͂̓̈́͆͊͋͌̕̚͠͡ͅ
And here are the code points:
0061 0300 0301 0302 0303 0304 0305 0306 0307 0308 0309 030a 030b 030c 030d 030e 030f 0310 0311 0312 0313 0314 0315 0316 0317 0318 0319 031a 031b 031c 031d 031e 031f 0320 0321 0322 0323 0324 0325 0326 0327 0328 0329 032a 032b 032c 032d 032e 032f 032f 0330 0331 0332 0333 0334 0335 0336 0337 0338 0339 033a 033b 033c 033d 033e 033f 0340 0341 0342 0343 0344 0345 0346 0347 0348 0349 034a 034b 034c 034d 034e 0360 0361
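If you would rather rebuild the string than copy and paste it, here is a minimal sketch in Python. It assumes nothing beyond the list above. The names `CODE_POINTS` and `build_stack` are made up for illustration, and the only library used is the standard `unicodedata` module.

```python
import unicodedata

# The code points exactly as listed above (hex); note that 032f appears twice.
CODE_POINTS = (
    "0061 0300 0301 0302 0303 0304 0305 0306 0307 0308 0309 030a 030b 030c 030d 030e 030f "
    "0310 0311 0312 0313 0314 0315 0316 0317 0318 0319 031a 031b 031c 031d 031e 031f "
    "0320 0321 0322 0323 0324 0325 0326 0327 0328 0329 032a 032b 032c 032d 032e 032f 032f "
    "0330 0331 0332 0333 0334 0335 0336 0337 0338 0339 033a 033b 033c 033d 033e 033f "
    "0340 0341 0342 0343 0344 0345 0346 0347 0348 0349 034a 034b 034c 034d 034e "
    "0360 0361"
)


def build_stack(hex_list: str) -> str:
    """Turn a space-separated list of hex code points into a string."""
    return "".join(chr(int(cp, 16)) for cp in hex_list.split())


stack = build_stack(CODE_POINTS)
base, marks = stack[0], stack[1:]

# One base letter followed by nothing but nonspacing marks (category Mn).
print(unicodedata.name(base))
print(len(stack), "code points,", len(marks), "of them combining marks")
print(all(unicodedata.category(ch) == "Mn" for ch in marks))
```

Pasting `stack` into an editor is then a quick way to see how a given font copes with it.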
Here is how it looked in Notepad:
Scary, huh? Somewhere under that mess is a letter a. Tahoma does have a specific way of dealing with those diacritics; it just does not seem to do so well in outrageous situations. :-)
Then, on the advice of someone on the typography team, I tried Segoe UI, one of the new Vista fonts. The results were a little different:
Doesn't it seem like some of them are missing?
So I talked to Judy (Safran-Aasen) and Simon (Daniels), and they suggested I remember that just because I can't see something doesn't mean that it isn't there. They suggested I try looking at it in Word while increasing the before and after point size per line. So I did, and suddenly saw what they were expecting (here it is with both fonts, side by side):
It is very difficult to explain why I think this is so cool. You may just have to accept that there are two kinds of people -- people who think it is cool and people who think those other folks need counseling.
Or maybe it is that there are 10 kinds of people (those who understand binary and those who don't).
Obviously there is no perfect solution to this sort of unreal situation, but I think Segoe UI's stacking behavior may be much cooler for a whole bunch of normal cases.
Now what these weird cases do for sorting is a different story, one that I will talk about another time....
This post brought to you by "a" (U+0061, LATIN SMALL LETTER A)