When Notepad's Find doesn't

by Michael S. Kaplan, published on 2005/04/06 09:45 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2005/04/06/405847.aspx

I met Chris Walker the other day.

It is amazing how you can "know" someone at Microsoft through email without ever actually meeting them. Occasionally, with a name like Leslie or Sam or Chris, you might not even know their gender, so you just have these faceless relationships through your inbox....

Anyway, we were at a talk about compiler optimizations and he asked me "are you Michael Kaplan?"

I guess there are benefits to having the picture on top of the blog. Though it does kill the faceless aspect a bit. :-)

Anyway, we were talking about the "Find" and "Replace" features in Notepad. He was pretty sure that they did not do the right thing in situations where the length of the search string and the target string were different (something which can indeed happen, as we talked about before).

He's right, it cannot yet handle that situation. So it cannot find the case of å (U+0061 U+030a) when å (U+00e5) is what you are looking for. And it has an optimization that looks for the first character in the search string prior to checking for the whole string, which gets in the way when two equivalent strings start with different characters. Though he is on the right track, since he is calling the right API (CompareString) at least half the time. :-)
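To make the failure concrete, here is a minimal sketch in Python. It is not Notepad's code: `naive_find` and `normalized_find` are hypothetical names, and Unicode normalization via `unicodedata` is swapped in as a rough stand-in for the linguistic comparison that CompareString performs. It only shows why a code-point search with a first-character pre-check misses an equivalent string of a different length.

```python
import unicodedata

DECOMPOSED = "a\u030a"   # U+0061 U+030A: "a" followed by COMBINING RING ABOVE
PRECOMPOSED = "\u00e5"   # U+00E5: LATIN SMALL LETTER A WITH RING ABOVE

# Sample text that stores the letter in its decomposed form.
text = "bl" + DECOMPOSED + "b\u00e6r"


def naive_find(haystack, needle):
    """Hypothetical code-point search with a first-character pre-check.

    It only tests the whole needle at positions where the first
    character already matches exactly, so it cannot see matches
    whose spelling (and length) differs from the needle's.
    """
    first = needle[0]
    for i, ch in enumerate(haystack):
        if ch == first and haystack.startswith(needle, i):
            return i
    return -1


def normalized_find(haystack, needle):
    """Assumption: normalize both sides to NFC before searching.

    The returned index refers to the normalized haystack, not the
    original one, so a real editor would still need to map it back.
    """
    nfc_haystack = unicodedata.normalize("NFC", haystack)
    nfc_needle = unicodedata.normalize("NFC", needle)
    return nfc_haystack.find(nfc_needle)


print(naive_find(text, PRECOMPOSED))       # -1: the match is missed
print(normalized_find(text, PRECOMPOSED))  # 2: found after normalization
```

The search string here is one code point long and the matching text is two, which is exactly the length mismatch in question. Normalizing first is only one way around it, and it does not cover everything a linguistic comparison would treat as equal.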

Of course features for a product like Notepad are always hard to balance. It has come a long way since NT 3.1 but like everyone I have my wish list of features to add to it. I am sure that this is how an innocent bunch of RichEdit control developers thought up a product like Word, but I probably would not lose that much control in what I would ask for....

Notepad is probably the single most-used HTML, XML, and text editor in the known universe. But its dependence on the Win32 EDIT control and its implicit use (through the EDIT control) of Uniscribe's ScriptString* APIs when Uniscribe is turned on make it a great testbed for international features in plain text. I promised him I would put in a few bugs for him to take a look at, but I have to admit it is a little daunting to think that anything I say or do would affect a tool that is used as much as Notepad!

I'll see if I can pick on Notepad a few days this week. If I am lucky Chris will fix the issues and then he can pick on CompareString a bit for bugs he finds while he does it!


This post brought to you by "å" (U+00e5, a.k.a. LATIN SMALL LETTER A WITH RING ABOVE)
A letter that has been really happy to be seen as the default example for normalization functionality. :-)

# Dean Harding on 6 Apr 2005 6:43 PM:

Heh, I loved Rory's post on Notepad in Longhorn: http://neopoleon.com/blog/posts/12904.aspx

# Michael S. Kaplan on 6 Apr 2005 8:26 PM:

That is pretty funny. Probably no way to get my features in if I have to compete with exciting ones like that!

referenced by

2011/10/13 Every rose has it's Þ....

2006/07/09 The fallacy of comparing out of context
