Why UnicoWS.dll forwards to the OS on Unicode platforms

by Michael S. Kaplan, published on 2005/05/11 19:01 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2005/05/11/416624.aspx


The original version of the Microsoft Layer for Unicode's DLL only ran on Win9x. Ever.

Now it had the loader, mind you, which would properly call the operating system or MSLU depending on where you were running, but there was no code in place to handle what would happen if you stuck the DLL on your Windows 2000 box and took the trouble to call LoadLibrary/GetProcAddress on the various APIs.

Two things changed that, though:

1) From my past heritage, I understood that the MSLU loader, while a very cool and elegant solution (cf Challenges behind MSLU: the loader! (Part 1) and Exploiting the linker's rules in unicows.lib), really did not solve the problem for the legions of developers using C#, VB.Net, Classic VB, Visual FoxPro, VBA, and so on. They were still kind of left out of the mix, and they really did constitute a majority of potential callers.

2) Many assumptions that MSLU makes about the platform simply cannot be sustained for very long -- trying to use MSLU on an NT platform where the DLL code is actually called would probably crash very early in the run if we did not aggressively work to disallow it. This was especially true in relation to window management.

After some intense discussion internally, the decision to address the problem was made. I got to add the small bit of code to each API to make sure that, if we were not running on Win9x, the call would be forwarded to the operating system. The actual size/perf hit of this code was pretty small even on Win9x, but the ability to make sure MSLU would simply work (rather than providing a list of things that would not work) was pretty stellar.

I also ran into an interesting optimization issue.

Let's say that this is one of the functions (it is not, but it is close enough):

int __stdcall
GodotAddFontResourceW(LPCWSTR lpszFilename)
{
    if(IS_ON_NT())
    {
        return AddFontResourceW(lpszFilename);
    }
    else
    {
        // Insert the wrapper here to convert the
        // string and call AddFontResourceA.
    }
}
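To make the elided else branch a little more concrete, here is a minimal portable sketch of the same dispatch pattern. Everything in it is an assumption made for illustration: g_on_nt stands in for IS_ON_NT(), FakeAddFontResourceW and FakeAddFontResourceA stand in for the real APIs, and wcstombs stands in for whatever conversion the real wrapper performs. It is not the MSLU source.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical stand-ins so the sketch builds anywhere; not MSLU or Win32 code. */
static int g_on_nt = 0;            /* stand-in for the IS_ON_NT() check */
static char g_last_call[64];       /* records which stand-in was reached */

static int FakeAddFontResourceW(const wchar_t *name)
{
    (void)name;
    strcpy(g_last_call, "W");
    return 1;
}

static int FakeAddFontResourceA(const char *name)
{
    strcpy(g_last_call, "A:");
    strncat(g_last_call, name, sizeof g_last_call - 3);
    return 1;
}

int SketchAddFontResourceW(const wchar_t *name)
{
    assert(name != NULL);

    if (g_on_nt)
    {
        /* Unicode platform: the arguments are already what the OS wants. */
        return FakeAddFontResourceW(name);
    }
    else
    {
        /* Win9x-style path: convert the string, then call the "A" version. */
        char buf[260];
        size_t n = wcstombs(buf, name, sizeof buf - 1);
        if (n == (size_t)-1)
            return 0;              /* the string could not be converted */
        buf[n] = '\0';
        return FakeAddFontResourceA(buf);
    }
}
```

The point of the shape is that the caller always uses the wide signature, and the platform check decides per call whether any conversion work happens at all.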

Now I and others looked at that code and the fact that we were getting parameters that were identical to what the OS wanted -- so couldn't we just jump right to it in assembly language, changing the

return AddFontResourceW(lpszFilename);

to a

__asm jmp AddFontResourceW

instead?

I mean, thinking about functions that take a lot of parameters -- the time you would save pushing parameters onto the stack twice alone would be a win, right?

Unfortunately, this assumption is actually wrong. Using a jmp here actually makes the DLL 50% bigger and a little slower, too!

It turns out that the compiler is smart enough all on its own to handle these kinds of situations. But when you plunk down inline assembly, it stops doing many of the basic optimizations. So I can actually do better here by not trying to be quite so clever....
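For anyone who wants to see this for themselves, a small experiment along the following lines may help. It is only a sketch with made-up names (OsMeasureW, SketchMeasureW, and g_on_nt are all hypothetical stand-ins, not MSLU or Win32 functions): a wrapper whose forwarding call sits in tail position with the same signature as its target. Compiling something like this with optimizations on and reading the generated assembly (for example with gcc -O2 -S) is one way to check whether your compiler turns the call into a jump without any inline assembly.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical stand-ins; not MSLU or Win32 code. */
static int g_on_nt = 1;            /* stand-in for the IS_ON_NT() check */
static size_t g_os_calls = 0;      /* counts how often the "OS" was reached */

/* Pretend OS entry point taking the same parameters as the wrapper. */
size_t OsMeasureW(const wchar_t *text, size_t limit)
{
    size_t n = wcslen(text);
    ++g_os_calls;
    return n < limit ? n : limit;
}

size_t SketchMeasureW(const wchar_t *text, size_t limit)
{
    assert(text != NULL);

    if (g_on_nt)
    {
        /* Plain C call in tail position with identical arguments. An
           optimizing compiler is free to reuse the incoming arguments
           and jump to the target instead of pushing them again. */
        return OsMeasureW(text, limit);
    }

    /* The non-NT path would do the real conversion work; omitted here. */
    return 0;
}
```

Whether a given compiler actually emits the jump depends on the calling convention, the optimization level, and what else is in the function, so treat the listing as something to inspect rather than a guarantee.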

I have to admit that it was a little sad since I had not been able to use inline assembler in a long time. It was exciting to do it again, though it would have been cooler if I had been able to actually use it. :-)

Anyway, that's why and how MSLU forwards calls that belong to the OS any time they are received.

 

This post brought to you by "લ" (U+0ab2, a.k.a. GUJARATI LETTER LA)


Eusebio Rufian-Zilbermann on 11 May 2005 6:39 PM:

Your adding code to each API function reminds me of a conversation I had with a Microsoft developer about a case of enabling CJK display in non-CJK systems. He asked me if I knew how it could be done. I thought it was one of those "trick" questions with an elegant solution, and after thinking for a minute and not figuring it out I ended up saying I didn't know. He just said: "We created DLLs with all the APIs patched". Sometimes brute force is the solution, and because of the many rules and exceptions we have to deal with, those of us working in internationalization often get an extra dose...

Yuhong Bao on 12 Apr 2009 1:48 AM:

Unfortunately, disassembling with IDA Pro, there seem to be many functions in the latest version of UNICOWS.DLL where this optimization was not performed by the compiler.

Michael S. Kaplan on 13 Apr 2009 2:39 AM:

In another post on this blog, I pointed out that over 250 of the functions have additional code in them -- that is what you are seeing here, I believe.

Yuhong Bao on 14 Mar 2010 5:36 PM:

And I realize it now after looking at it another time.


referenced by

2006/09/02 Every character has a story #23: U+00ad (SOFT HYPHEN)
