Exploiting the linker's rules in unicows.lib

by Michael S. Kaplan, published on 2005/05/01 12:43 -07:00, original URI: http://blogs.msdn.com/michkap/archive/2005/05/01/413835.aspx


I wanted to clarify the way that unicows.lib works.

This .LIB file is not the usual sort. It contains a set of symbols, one for each API that unicows.dll supports. Let's look at a bit of what that looks like -- say, by running the following on unicows.lib and kernel32.lib and looking at a particular API (I'll use my favorite API -- CompareString):

link -dump -all unicows.lib >link_unicows.txt
link -dump -all kernel32.lib >link_kernel32.txt

The output is huge (over 11 MB for unicows.lib), and most of it is obviously for other functions anyway, which is why I dumped both to files. Here is what kernel32.lib has to say about CompareStringW:

    19400 _CompareStringW@24  
    19400 __imp__CompareStringW@24
       2D _CompareStringW@24
       2D __imp__CompareStringW@24
  Version      : 0
  Machine      : 14C (x86)
  TimeDateStamp: 41C232BB Thu Dec 16 17:13:31 2004
  SizeOfData   : 00000020
  DLL name     : KERNEL32.dll
  Symbol name  :
_CompareStringW@24
  Type         : code
  Name type    : undecorate
  Hint         : 59
  Name         : CompareStringW
Archive member name at 19470: KERNEL32.dll/  
41C232BB time/date Thu Dec 16 17:13:31 2004
         uid
         gid
       0 mode
      35 size
correct header end
                 
_CompareStringW@24

Now look at the bigger unicows.lib file and its references to CompareString. Those who are curious can run the command line above -- the output is huge.

The most important part of all this is that the symbol names are the same in both files. By putting unicows.lib before kernel32.lib, any time anyone tries to reference CompareStringW in any of its forms, the linker will find MSLU's symbol rather than the operating system's. Simple misdirection, really. The rules the linker follows when looking for symbols are simple: look in all of the .OBJ files, then look in all of the .LIB files, in order from left to right as they are explicitly listed.

So the trick is just to put unicows.lib first -- it will win!
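The "first library listed wins" rule can be modeled in a few lines of C. This is only a toy sketch of the search order described above, not how the linker is implemented: the `toy_lib` structure, the `toy_resolve` function, and the symbol lists are all made up for illustration, and the assumption that unicows.lib defines both decorated forms of CompareStringW is taken from the "names that are the same" observation rather than from an actual dump.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: a "library" is a name plus the symbols it defines. */
struct toy_lib {
    const char *name;
    const char *symbols[4];   /* NULL-terminated list (hypothetical contents) */
};

/* Return the name of the first library, in left-to-right order, that
   defines `symbol`, or NULL if no library in the list defines it. */
static const char *toy_resolve(const struct toy_lib *libs, size_t nlibs,
                               const char *symbol)
{
    assert(symbol != NULL);
    for (size_t i = 0; i < nlibs; i++) {
        for (size_t j = 0; libs[i].symbols[j] != NULL; j++) {
            if (strcmp(libs[i].symbols[j], symbol) == 0)
                return libs[i].name;   /* first match wins; later libs are never consulted */
        }
    }
    return NULL;
}

/* A link line with unicows.lib listed before kernel32.lib. */
static const struct toy_lib toy_link_line[] = {
    { "unicows.lib",  { "_CompareStringW@24", "__imp__CompareStringW@24", NULL } },
    { "kernel32.lib", { "_CompareStringW@24", "__imp__CompareStringW@24",
                        "_Sleep@4", NULL } },
};
```

With this list, `toy_resolve(toy_link_line, 2, "_CompareStringW@24")` returns `"unicows.lib"`, while a symbol that only kernel32.lib defines, such as `_Sleep@4`, still resolves to `"kernel32.lib"`. Swapping the two entries would send CompareStringW to kernel32.lib instead, which is why the order matters.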

The rest is a little bit trickier -- there is a pointer called _kernel32_CompareStringW_Thunk which will sit in the user's image, and it points to a function also inside of the user's image called ResolveThunk(). This ResolveThunk() function is the one that does the work to figure out what platform you are running on and then call either the function in unicows.dll or the one in the operating system (calling something like the same GetProcAddress that you would call if you wanted a function pointer). It then fixes up that pointer in _kernel32_CompareStringW_Thunk to call the function it selected, so that ResolveThunk() never needs to be called again for the function that was just resolved.
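The self-patching pointer described above can be sketched in portable C. This is a minimal illustration of the pattern under stated assumptions, not the actual unicows.lib code: the names `compare_thunk`, `resolve_thunk`, `my_compare`, and `pretend_win9x` are hypothetical, the two "implementations" are plain `strcmp` stand-ins rather than CompareStringW, and the real ResolveThunk() is presumably shared across APIs and finds its target with something like GetProcAddress instead of a flag.

```c
#include <assert.h>
#include <string.h>

typedef int (*compare_fn)(const char *a, const char *b);

static int resolve_calls = 0;        /* counts how often the resolver runs */
static int pretend_win9x = 1;        /* hypothetical stand-in for the platform check */
static const char *last_impl = "";   /* records which implementation ran last */

/* Stand-in for the operating system's own implementation. */
static int os_compare(const char *a, const char *b)
{
    last_impl = "os";
    return strcmp(a, b);
}

/* Stand-in for the wrapper that would live in unicows.dll. */
static int unicows_compare(const char *a, const char *b)
{
    last_impl = "unicows";
    return strcmp(a, b);
}

static int resolve_thunk(const char *a, const char *b);

/* The pointer that sits in the user's image; it starts out aimed at the resolver. */
static compare_fn compare_thunk = resolve_thunk;

/* Runs on the first call only: pick a target, patch the pointer, forward the call. */
static int resolve_thunk(const char *a, const char *b)
{
    resolve_calls++;
    compare_thunk = pretend_win9x ? unicows_compare : os_compare;
    return compare_thunk(a, b);
}

/* What callers link against: always calls through the pointer. */
int my_compare(const char *a, const char *b)
{
    return compare_thunk(a, b);
}
```

The first call to `my_compare` goes through `resolve_thunk`, which overwrites `compare_thunk`; every later call lands directly in the selected implementation, so `resolve_calls` stays at 1 no matter how many times the function is called afterwards.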

Think of it as a static version of the information that is dynamically generated when you use DelayLoad.

Now I do not want to take credit for this work; the real genius here comes from two people. The first is Bryan Tuttle (most likely person to fix and least likely person to break the Windows build on any given day of the week, and the person who originally pointed out to me the real problems with Godot using DelayLoad itself and thereby blocking others from using it). The second is the recently retired Dan Spalding (also known to many as the father of the linker or the linker god at Microsoft, and the original author of the DelayLoad technology on which much of this approach is based).

And of course I am oversimplifying here, as there are many interesting gotchas that unicows.lib hits (from failure cases to functions that are not there to functions that are hard to call to avoiding assembly language and more), which I'll talk about another day....

 

This post brought to you by "U" (U+0055, a.k.a. LATIN CAPITAL LETTER U)
(Because "U" is for Unicows, Unicode, and UBetcha!)


# James Todd on Monday, May 02, 2005 10:44 AM:

Nifty. Thanks for the info! I'd like to ask a few more questions about this to make sure I'm understanding it correctly.

So let's say I'm calling CompareStringW, and I properly link with unicows.lib early in my link line. Then at runtime, I make my first call to CompareStringW. What happens at that stage? I'm assuming it goes into the unicows.lib function, which in turn just calls _kernel32_CompareStringW_Thunk. Now since this is the first time I've called CompareStringW, it thunks to the ResolveThunk function, which is also in unicows.lib. That function decides I'm on win9x, and points _kernel32_CompareStringW_Thunk to another function in unicows.dll that actually does the conversion of the wide string arguments so that the OS's non-unicode CompareString can be called. Then later in my runtime when I make additional calls to CompareStringW, I still end up in a unicows.lib function which again calls _kernel32_CompareStringW_Thunk, but this time it goes straight to the actual implementation function in unicows.dll, thus skipping the call to ResolveThunk.

Is that more or less correct? In my scenario above, each function is only resolved on the first time it's called, rather than all functions being resolved at once during program initialization, and I'm not sure if that's what's happening or not.

Last question: When exactly does unicows.dll get loaded?

Thanks!
James

# Michael S. Kaplan on Monday, May 02, 2005 11:01 AM:

Hi James -- Your understanding is exactly correct. ;-)

unicows.dll is loaded the first time you call an API that has to load it. On NT-based systems that means never....

# James Todd on Monday, May 02, 2005 12:20 PM:

Thanks Michael. :-)

James


