When keeping things on a level Plane[ 1] doesn't work anymore

by Michael S. Kaplan, published on 2009/06/09 10:01 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2009/06/09/9712595.aspx

It has been over three years since I mentioned, in an offhand manner (in Every character has a story #4: U+feff (alternate title: UTF-8 is the BOM, dude!)), that when it comes to Notepad, "Application is perhaps an overstatement; it's just an uber-wrapper for a Win32 EDIT control."

It never occurred to me that Chris Walker, the development owner of Notepad for over a decade, was a reader of my blog. He was. And he noticed. And commented.


Now what I said was true only in the comparative sense of which technology is doing the drawing. But in addition to being wrong in the sense that Chris noticed (and has since forgiven me for!), I was also wrong in another sense, which I'll get to in a moment.

By the way (I mention in a not entirely offhand manner), I'm actually on the Windows team.

I checked in and reviewed code that will ship with Windows 7, and I officially signed off on a few features in Vista (I suppose I could have used the development equivalent of jury nullification and refused to sign off, but it seemed kind of silly given the fact that the things I owned kind of worked?). For Windows 7 I find myself in triage meetings pretty regularly, and in shiprooms as infrequently as I can get away with.

Now you may wonder why I pointed this out.

Well it had something to do with the mail I got the other day from someone who we will call Bob. Why? Because that is his name:

Hi Michael

More a topic for the Windows 7 guys rather than your blog. Using the RC Build.

As an example, consider U+10901 PHOENICIAN LETTER BET introduced in Unicode 5.0. Displays in WordPad and Paint but not in Notepad. I used the Aegean font http://users.teilar.gr/~g1951d/download.html.

Indeed, someone has broken Notepad! Guess this means the Edit control is down the tubes too.

Furthermore I've been looking at Unicode 5.2 Egyptian Hieroglyphs and same problem in Notepad using the font I built with Fontlab when I wrote the proposal with Michael Everson. In this case Paint and WordPad appear broken too.

Any idea how to report this problem? Pretty major incompatibility with Vista apart from being bad news for users of recent and upcoming versions of Unicode.


I work on Windows 7!

Since this is actually about the ScriptString* functions in Uniscribe, which do the simpler, higher-level Uniscribe work, and about how Uniscribe deals with Plane 1 of Unicode, I am probably a good person to forward the note to the people in Uniscribe who are (I trust they will pardon the expression I will use here) most culpable for the "pretty major incompatibility with Vista," as Bob put it.

One of them can fill in the details here, but....

Applications that rely on Uniscribe, which has quite a bit of knowledge about Unicode, are somewhat at the mercy of the fact that Uniscribe has some knowledge about Plane 1 but a lot of targeted ignorance about the parts it does not cover. Once it starts paying attention to the plane, things that used to work by accident, but never by specific design, will suddenly stop working.

An occupational hazard for anything that "works by accident" I suppose....

The workaround is simple enough at the micro level: ExtTextOutW with the ETO_IGNORELANGUAGE flag I have mentioned before. I even "renamed" the flag to ETO_STOPTREATINGMELIKEIAMSOCOMPLEXYOUMISERABLECONTROLFREAK for entertainment purposes, though in this case it would sound more reasonable, I think. :-)

At the macro level, for applications like Notepad, Paint, and WordPad, this is not of much help, of course.
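For anyone who does own the drawing code, here is a minimal sketch of what that micro-level workaround might look like in C. It is not anything Notepad or Uniscribe actually does. The helper names unicode_plane, encode_utf16, and draw_code_point_raw are made up for illustration; only ExtTextOutW and ETO_IGNORELANGUAGE are real Win32 names. The sketch assumes a font that covers the character is already selected into the device context, and whether the glyph actually shows up will still depend on the font and the Windows version.

```c
#include <stddef.h>
#include <stdint.h>

#ifdef _WIN32
#include <windows.h>
#endif

/* Unicode plane of a code point: 0 is the BMP, 1 is where U+10901 lives. */
static unsigned unicode_plane(uint32_t cp)
{
    return (unsigned)(cp >> 16);
}

/* Encode one code point as UTF-16.
 * Returns the number of code units written (1 or 2),
 * or 0 if cp is a lone surrogate or beyond U+10FFFF. */
static size_t encode_utf16(uint32_t cp, uint16_t out[2])
{
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;
    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;
        return 1;
    }
    cp -= 0x10000;                              /* 20 bits remain */
    out[0] = (uint16_t)(0xD800 + (cp >> 10));   /* high surrogate */
    out[1] = (uint16_t)(0xDC00 + (cp & 0x3FF)); /* low surrogate  */
    return 2;
}

#ifdef _WIN32
/* Hypothetical helper: draw a single code point at (x, y), asking GDI
 * to skip language-specific processing via ETO_IGNORELANGUAGE. */
static BOOL draw_code_point_raw(HDC hdc, int x, int y, uint32_t cp)
{
    uint16_t units[2];
    WCHAR text[2];
    size_t i;
    size_t n = encode_utf16(cp, units);

    if (n == 0)
        return FALSE;
    for (i = 0; i < n; ++i)
        text[i] = (WCHAR)units[i];

    return ExtTextOutW(hdc, x, y, ETO_IGNORELANGUAGE, NULL,
                       text, (UINT)n, NULL);
}
#endif
```

In a WM_PAINT handler this would be called as something like draw_code_point_raw(hdc, 10, 10, 0x10901) for Bob's PHOENICIAN LETTER BET. The encoding helper is kept separate from the drawing call so that the Plane 1 arithmetic can be checked without a window.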

Now sometimes (and I am not saying this time, but you never know?) there are specific things you can do in the font that will make things work better without turning things off -- once ranges get around to being covered in the OpenType docs....

I almost forgot to explain the other place I was wrong about Notepad -- it was that "uber-wrapper for a Win32 EDIT control" bit. It is actually a wrapper around the Shell EDIT control, which is not exactly the same thing. Though it tries to be and most usually is. Perhaps I'll explain why another day....


This post was specifically not sponsored by any Unicode character, for hopefully obvious reasons

John Cowan on 9 Jun 2009 9:07 PM:

Another day? Another day?!! Better be, like the very next tech posting!

Michael S. Kaplan on 10 Jun 2009 12:20 AM:

Better be or what, Mr. Cowan? Are you in a position to control content here? :-)

I'll get to it eventually. I have a backlog I have to deal with, too!
