Don't use lstrcmp if you don't need it

by Michael S. Kaplan, published on 2006/05/12 03:01 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2006/05/12/595861.aspx


Over a year ago I posted Performance gains at the cost of your own components, where I talked about the shell game that people sometimes play with performance gains. You know, where one component speeds up but it is done in such a way that someone else will seem slower. Kind of like skipping out on the check so that the person you had lunch with can pay it, but perhaps a bit more polite? :-)

But all jokes aside, that whole "lightning fast init phase" is still very important, and probably always will be. No user wants to pay a penalty before they feel they have accomplished anything....

Anyway, we ran across one of those cases in Vista just recently. As a part of the team's performance work, the code was set up to delay some of the initialization until/unless it was needed.

Which is a good thing, because even though it probably will be needed at some point, it is usually not needed when a process is initialized.

Of course as soon as someone actually needs a locale's information to be loaded there will be that perf hit. Suddenly it is the fault of the function that is called later (in this case it was CompareString, but it could have been anything, really).
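To make the delayed-initialization pattern concrete, here is a minimal single-threaded sketch in portable C. It is not the actual Windows code: the names `locale_data`, `get_locale_data`, `linguistic_compare` and `g_load_count` are hypothetical, and the "locale data" is just an identity weight table standing in for the real thing.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a locale's sorting data (not a real Windows type). */
typedef struct {
    bool loaded;
    int  sort_table[256];
} locale_data;

locale_data g_locale;     /* zero-initialized, so loaded == false at startup */
int g_load_count = 0;     /* how many times the expensive load has run */

/* Simulates the expensive load that process init no longer performs. */
void load_locale_data(locale_data *d)
{
    for (int i = 0; i < 256; i++)
        d->sort_table[i] = i;   /* identity weights, for illustration only */
    d->loaded = true;
    g_load_count++;
}

/* The first caller pays for the load; later callers get the cached data.
   Single-threaded sketch: there is no synchronization here. */
const locale_data *get_locale_data(void)
{
    if (!g_locale.loaded)
        load_locale_data(&g_locale);
    return &g_locale;
}

/* Stand-in for a linguistic comparison: it needs the locale data,
   so whoever calls it first is the one who triggers the load. */
int linguistic_compare(const char *a, const char *b)
{
    const locale_data *d = get_locale_data();
    for (;; a++, b++) {
        int wa = d->sort_table[(unsigned char)*a];
        int wb = d->sort_table[(unsigned char)*b];
        if (wa != wb)
            return wa < wb ? -1 : 1;
        if (*a == '\0')
            return 0;           /* equal weights at the terminator: equal strings */
    }
}
```

Nothing is loaded at startup. The first call to `linguistic_compare` bumps `g_load_count` to 1 and absorbs the cost, which is exactly why that caller ends up looking like the slow one.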

The story had a very happy ending, though. It turns out that they were not calling CompareString directly at all; they were calling lstrcmp, which calls CompareString internally. And the fact is that they did not need a linguistically appropriate result back. CompareStringOrdinal would not only be faster in general and more appropriate in terms of results; in this particular scenario, for this specific application, it would also delay loading the locale information a bit longer.

I am a lot less cynical about this than I used to be (don't worry, I have plenty of other things to be cynical about).

I mean, it is a shell game, sure. But the end result is an improvement in performance that can potentially affect every process on the system, whether it is noticed or not. And that can't ever be a bad thing. Especially in cases like this one when the results are actually more likely to be correct and the difference is noticed! :-)

So, the moral of the story? Don't use the functions that provide linguistically appropriate results if you don't need an answer with such considerations kept in mind. There is that cool new CompareStringOrdinal function that is just waiting for the chance to do the non-linguistic thing when it is needed....
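As a rough illustration of that moral, here is a small sketch of an equality check that avoids the linguistic path. The helper name `ordinal_equal` is hypothetical. On Vista and later it calls CompareStringOrdinal; the `wcscmp` fallback for other targets is my own assumption about a reasonable non-linguistic substitute, not something the original scenario used.

```c
#include <stdbool.h>
#include <wchar.h>

#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper: true when the two null-terminated strings are
   identical code unit by code unit, with no locale rules applied. */
bool ordinal_equal(const wchar_t *a, const wchar_t *b)
{
#if defined(_WIN32) && _WIN32_WINNT >= 0x0600
    /* Vista and later: -1 means "null-terminated", FALSE means case-sensitive.
       CompareStringOrdinal returns CSTR_EQUAL when the strings match. */
    return CompareStringOrdinal(a, -1, b, -1, FALSE) == CSTR_EQUAL;
#else
    /* Assumed fallback: wcscmp also compares raw wchar_t values. */
    return wcscmp(a, b) == 0;
#endif
}
```

A call such as `ordinal_equal(L"file.txt", L"FILE.TXT")` returns false, since an ordinal comparison with case sensitivity treats different code units as different. That is usually what you want for identifiers, keys and file-format tokens, where "sorts the same for a human reader" is not the question being asked.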

 

This post brought to you by "ኟ" (U+129f, a.k.a. ETHIOPIC SYLLABLE NYWA)


# Adam on 12 May 2006 4:42 AM:

So, you'll need a resource at some point later in the execution, don't want to pay the price up front for loading it, but also don't want the perf hit when you finally do have to load it?

How about a function LoadLocaleInBackground()? It creates a new thread, lowers its priority a bit, starts it on the path of loading this resource and returns almost instantly to get on with important stuff.

If LoadLocaleInBackground() is never called, the first call that actually needs that data does what it does at the moment and you get the perf hit. No big deal, it's just existing behaviour.

If the function is called, and the background thread has loaded all the data into the app before the first function that actually uses that data is called, great. No delay. Everyone wins.

If the function is called and the background thread is in the middle of loading the data when another function that needs it is called, the foreground thread can just wait for the background thread to finish. It shouldn't spend as much time waiting as it would have if it needed to load all the data from scratch itself, so you should still win.

As more apps become i18nised/l10nised[0], this could help more of them. Apart from the complexity[1], what have I missed here?

[0] Yes, I know that what that expands to is wrong, but I don't care!
[1] And, yes, I know I shouldn't dismiss complexity that glibly.
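The scheme described in this comment can be sketched roughly as follows. This is only an illustration using POSIX threads and `pthread_once`, not an existing Windows API. Every name in it (`load_locale_in_background`, `ensure_locale_loaded`, `g_bg_load_count`) is hypothetical, the "load" is a placeholder, and lowering the thread's priority is left out.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

static pthread_once_t g_bg_once = PTHREAD_ONCE_INIT;
int g_bg_load_count = 0;        /* how many times the load actually ran */

/* Placeholder for the expensive locale load. */
static void bg_load_locale_data(void)
{
    g_bg_load_count++;
}

/* Called by anything that needs the data. Whichever thread gets here first
   runs the load; a thread that arrives while the load is in progress
   blocks until it has finished. */
void ensure_locale_loaded(void)
{
    pthread_once(&g_bg_once, bg_load_locale_data);
}

static void *bg_loader_thread(void *arg)
{
    (void)arg;
    ensure_locale_loaded();
    return NULL;
}

/* Optional hint: start the load on another thread and return right away.
   Returns false if the thread could not be created, in which case the
   first real caller simply pays for the load as it does today. */
bool load_locale_in_background(void)
{
    pthread_t t;
    if (pthread_create(&t, NULL, bg_loader_thread, NULL) != 0)
        return false;
    pthread_detach(t);
    return true;
}
```

All three cases from the comment fall out of the once-only guard: never hinting gives today's behaviour, a finished background load makes the later call free, and a load still in flight makes the foreground caller wait only for the remainder. The sketch does nothing to reduce the cost of creating the extra thread itself.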

# Michael S. Kaplan on 12 May 2006 8:55 AM:

I don't think you missed anything, but there are additional considerations here, such as the hit of creating a new thread in terms of soft page faults (the new 1 MB of stack, etc.) and of course the kernel call for the thread creation itself. For a non-zero expense, it is better to put it off until you can't anymore....

# Gabe on 12 May 2006 2:08 PM:

Well, presumably this background thread would be initializing everything that you don't need yet but which will be a perf hit when needed (like OLE). So the hit of the extra thread would be spread across everything that's being initialized.

# Michael S. Kaplan on 12 May 2006 4:58 PM:

Yes, but it means additional soft page faults on app boot for the new thread, which will be a perf concern.

# Gabe on 12 May 2006 7:23 PM:

As long as this initialization is going on in the background, there's really no need for it to happen on app boot. If I were implementing this sort of thing, I might wait 10 seconds, then start the thread, which might wait 10 seconds between initializing each different subsystem. Of course if anything was needed before it was initialized, the init would happen on the foreground thread instead.

BTW, the thread would only fault in the first couple pages of the stack. The 1MB would be allocated, but I don't expect that more than a couple pages would likely be used in most cases.

# Michael S. Kaplan on 12 May 2006 7:46 PM:

Hi Gabe --

I think the point is really being missed here.

No one is complaining about the general performance of lstrcmp in a way that would require such a scheme -- the specific complaint was a perf. scenario tied to a delay in app boot that was narrowed down to our code (and we pointed out that the best fix was to not call us there!).

Perf. fixes always have to have a basis -- something that is slow that is linked to a particular area of the code....

# Ben Cooke on 12 May 2006 9:14 PM:

Is there a cool *old* function that I should use when I'm targeting versions of Windows that have actually been released as well as Vista?

# Michael S. Kaplan on 12 May 2006 11:44 PM:

If you need to know whether two strings are equal (which is usually what is happening here; seldom do people need to know what comes first in these cases!) then RtlCompareUnicodeString or RtlEqualUnicodeString can both make excellent choices.... :-)
