It's the End[UpdateResource] of the world as we know it

by Michael S. Kaplan, published on 2008/08/21 10:01 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2008/08/21/8883552.aspx


It was late last week when Maksim asked a very interesting question via email to one of those large aliases at Microsoft:

SUBJECT: EndUpdateResource failing after adding certain number of items with UpdateResource

Hi,

It appears that there is a bug (or undocumented behavior anyway) with BeginUpdate/Update/EndUpdateResource functions.

When I add more than a certain number of resources this way, EndUpdateResource fails with error ERROR_INVALID_DATA. The exact count of items is not always the same and varies depending on the length of the resource names and resource types that I have.

After running several experiments I have discovered that the problem occurs according to the following formula:

(Cumulative Resource Names Length) + (Resources Count) * 25 + (Cumulative Resource Types Length) + (Resource Types Count) * 13 > 2040

Can someone please say if this is a bug and if my assumed formula is correct? Or maybe there is some other workaround apart from doing EndUpdateResource after adding each resource.

My source code is below, the dll where I updated resources is a simple dll without any code:

#include "stdafx.h"
#include <string>
#include <iostream>

using namespace std;

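// Builds a name of exactly 'length' characters: 'X' padding followed by a random number.
wstring MakeLongName(size_t length) {
      int randomNumber = rand();
      wchar_t buffer[65];
      // Zero the whole buffer; sizeof counts bytes, not characters.
      ZeroMemory(buffer, sizeof(buffer));
      _itow_s(randomNumber, buffer, 65, 10);
      wstring randomPart = buffer;
      length -= randomPart.length();
      wstring result;
      result.append(length, L'X');
      result.append(randomPart);
      return result;
}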

int _tmain(int argc, _TCHAR* argv[]) {
      CopyFile(L".\\testdll.dll", L".\\testdll1.dll", FALSE);

      HANDLE hLibrary = BeginUpdateResource(L".\\testdll1.dll", TRUE);
      if(hLibrary==NULL) {
            cout << "Failed to BeginUpdateResource. Error: " << GetLastError() << endl;
            return 1;
      }

      for(long i = 0; i < 10; i++) {
            BYTE data[100];
            ZeroMemory(data, 100);
            wstring longName = MakeLongName(230);

            if(! UpdateResource(hLibrary, L"Y", longName.c_str(), MAKELANGID(LANG_NEUTRAL, SUBLANG_NEUTRAL), data, 100)) {
                  cout << "Failed to UpdateResource. Error: " << GetLastError() << endl;
                  EndUpdateResource(hLibrary, TRUE);
                  return 1;
            }
      }

      if(! EndUpdateResource(hLibrary, FALSE) ) {
            cout << "Failed to EndUpdateResource. Error: " << GetLastError() << endl;
            return 1;
      }
      return 0;
}
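
Plugging the sample's own numbers into the formula from the mail shows why it fails. The sketch below is not part of the original thread: the names ResourceId, EstimatedDirectoryCost and ExceedsObservedLimit are hypothetical, and it assumes that "Cumulative Resource Types Length" counts each distinct type once. The constants 25, 13 and 2040 are Maksim's empirical values, not documented limits.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical pair of identifiers for one resource (lengths are in characters).
struct ResourceId {
    std::wstring type;
    std::wstring name;
};

// Evaluates the left-hand side of the empirical formula from the mail.
// Assumption: each distinct resource type is counted once.
std::size_t EstimatedDirectoryCost(const std::vector<ResourceId>& resources) {
    std::set<std::wstring> distinctTypes;
    std::size_t namesLength = 0;
    for (const ResourceId& r : resources) {
        namesLength += r.name.length();
        distinctTypes.insert(r.type);
    }
    std::size_t typesLength = 0;
    for (const std::wstring& t : distinctTypes) {
        typesLength += t.length();
    }
    return namesLength + resources.size() * 25
         + typesLength + distinctTypes.size() * 13;
}

// True when the estimate is past the threshold observed in the experiments.
bool ExceedsObservedLimit(const std::vector<ResourceId>& resources) {
    return EstimatedDirectoryCost(resources) > 2040;
}
```

For the sample above (ten 230-character names under the single type "Y") this gives 2300 + 250 + 1 + 13 = 2564, well past 2040. Under the same assumptions seven such resources come to 1799 and eight to 2054, so the eighth would be the one that tips it over.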

I had not seen this come up before, but this is a function I have found interesting since all the way back when we worked on the resource updating functions in MSLU (described here).

The answer to this particular riddle came from developer Paul:

EndUpdateResource fails if it cannot extend the .rsrc section of your DLL. I’ve seen this happen if the .rsrc section isn’t the last section in the image – and that’s frequently the case (a few experiments show that .reloc usually follows .rsrc using the Microsoft linker). Annoyingly, LINK.EXE always seems to insert a .reloc section, even if you have a resource-only DLL. (The formula you discovered is an approximation for “the .rsrc section cannot be extended”.)
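
One way to check whether a given DLL is in the situation Paul describes is to read its section table and see which section sits last. The following is only a sketch: the function names LastSectionOnDisk and RsrcIsLast are hypothetical, it works on a file image already loaded into memory, and it assumes that the on-disk position (PointerToRawData) is what matters. The header offsets used are those of the standard PE/COFF layout.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Reads a little-endian value of 'bytes' bytes; callers check bounds first.
static std::uint32_t ReadLE(const std::vector<std::uint8_t>& image,
                            std::size_t offset, std::size_t bytes) {
    std::uint32_t value = 0;
    for (std::size_t i = 0; i < bytes; ++i) {
        value |= static_cast<std::uint32_t>(image[offset + i]) << (8 * i);
    }
    return value;
}

// Returns the name of the section whose raw data is placed last in the file,
// or an empty string if the buffer does not look like a PE image.
std::string LastSectionOnDisk(const std::vector<std::uint8_t>& image) {
    if (image.size() < 0x40 || image[0] != 'M' || image[1] != 'Z') return "";
    const std::size_t pe = ReadLE(image, 0x3C, 4);                 // e_lfanew
    if (pe + 24 > image.size()) return "";
    if (std::memcmp(&image[pe], "PE\0\0", 4) != 0) return "";
    const std::size_t sectionCount = ReadLE(image, pe + 6, 2);     // NumberOfSections
    const std::size_t optionalSize = ReadLE(image, pe + 20, 2);    // SizeOfOptionalHeader
    const std::size_t table = pe + 24 + optionalSize;              // first section header
    if (table + sectionCount * 40 > image.size()) return "";

    std::string lastName;
    std::uint32_t lastOffset = 0;
    for (std::size_t i = 0; i < sectionCount; ++i) {
        const std::size_t header = table + i * 40;                 // 40 bytes per header
        const std::uint32_t rawSize = ReadLE(image, header + 16, 4);   // SizeOfRawData
        const std::uint32_t rawOffset = ReadLE(image, header + 20, 4); // PointerToRawData
        if (rawSize != 0 && rawOffset >= lastOffset) {
            lastOffset = rawOffset;
            lastName.clear();
            // Name is 8 bytes, NUL-padded but not necessarily NUL-terminated.
            for (std::size_t k = 0; k < 8 && image[header + k] != 0; ++k) {
                lastName.push_back(static_cast<char>(image[header + k]));
            }
        }
    }
    return lastName;
}

// True when .rsrc is the last section on disk.
bool RsrcIsLast(const std::vector<std::uint8_t>& image) {
    return LastSectionOnDisk(image) == ".rsrc";
}
```

If RsrcIsLast returns false for the file passed to BeginUpdateResource, that would be consistent with the failure described above, though the sketch says nothing about how EndUpdateResource itself makes the decision.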

Now as to whether this is a bug or by design....

It really is by design.

Twice.

Now I am not going to dig into the format of PE files, since for that you can look at:

to get the lowdown here.

So for the first by design we'll look to the linker.

When the Microsoft Linker (LINK.EXE) does its work, it makes a lot of sense that it puts the .reloc section last rather than the .rsrc section. The latter is more or less gunk that is already compiled by the Microsoft Resource Compiler (RC.EXE) and which the linker does not really need to modify -- it just has to align it. The former is the section in which it arguably has to do some of its hardest work, in order to produce all of the relocation entries.

Matt also has a less cynical reason he mentions in that second article:

Working backwards from the end of the executable, if there is a .debug section in the OBJs, it's placed last in the executable. In the absence of a .debug section, the linker tries to put the .reloc section last because, in most cases, the Win32 loader won't need to read the relocation information. Cutting down the amount of the executable that needs to be read decreases the load time.

Then for the second by design we'll look to the EndUpdateResource function and its cousins (BeginUpdateResource and UpdateResource), though really that first function I mentioned is the real bad boy here.

While it does a bunch of work inside the .rsrc section, it doesn't start mucking around a whole bunch with the rest of the PE file. Reordering sections just falls a bit outside of its current beat, if you know what I mean.

Paul had some thoughts about workarounds:

If you have control over how “testdll1.dll” is created, you might be able to figure out how to manipulate the PE sections so that .rsrc always goes last. In my code, I was able to start with a hand-crafted resource-only PE file which had only a .rsrc section.

Matt's first article gives some info on removing the .reloc section:

If you do decide to remove relocations, there are three ways to do it. The easiest is to specify the /FIXED switch on the linker command line. Alternatively, you can run the REBASE program with the -f option on your executable. REBASE comes with the Win32 SDK. The third way to remove relocations is the new RemoveRelocations function in the Windows NT 4.0 IMAGEHLP.DLL. My sample code below shows how to use RemoveRelocations.

Though to be honest this is something I try to avoid, especially with /FIXED, because I have seen multiple sources that suggest this to be a bad idea for two reasons:

Though your mileage may vary.

And of course someone could write a tool that simply reorders these two sections in the binary. The principal thing to worry about (and the easiest bit to mess up) is failing to align things properly, but that isn't too hard to get right. So it might be worth just grabbing the source from Matt's PEDUMP (used in the last two articles on the list above) and the code to remove the .reloc section from the second one to use as a start, and then working to write the whole file out with these two sections reordered.
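
To give a feel for the alignment part, here is a minimal sketch of the file-offset arithmetic such a tool would need. It is not taken from PEDUMP or from any existing tool: AlignUp, SwappedLayout and PlaceRelocBeforeRsrc are hypothetical names, and the sketch covers only the on-disk offsets. Virtual addresses, the data directory entries and the RVAs stored inside the resource data would also have to be dealt with and are left out here.

```cpp
#include <cstdint>

// Rounds value up to the next multiple of alignment. The alignment must be a
// power of two, which is what PE FileAlignment values are.
std::uint32_t AlignUp(std::uint32_t value, std::uint32_t alignment) {
    return (value + alignment - 1) & ~(alignment - 1);
}

// Hypothetical result type: where each section's raw data would start on disk.
struct SwappedLayout {
    std::uint32_t relocRawOffset;
    std::uint32_t rsrcRawOffset;
    std::uint32_t endOfFile;
};

// Lays out .reloc first and .rsrc second, starting at pairStart, so that
// .rsrc ends up last and can grow without running into another section.
SwappedLayout PlaceRelocBeforeRsrc(std::uint32_t pairStart,
                                   std::uint32_t relocBytes,
                                   std::uint32_t rsrcBytes,
                                   std::uint32_t fileAlignment) {
    SwappedLayout layout;
    layout.relocRawOffset = AlignUp(pairStart, fileAlignment);
    layout.rsrcRawOffset = layout.relocRawOffset + AlignUp(relocBytes, fileAlignment);
    layout.endOfFile = layout.rsrcRawOffset + AlignUp(rsrcBytes, fileAlignment);
    return layout;
}
```

With a file alignment of 0x200, a pair starting at 0x400, 0x10 bytes of relocations and 0x250 bytes of resources, this puts .reloc at 0x400, .rsrc at 0x600 and the end of the file at 0xA00.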

Now if someone were to decide to fix it -- to unmark the by design flag on it -- whose job would it be?

On the whole I'd say the fix should be in the EndUpdateResource function, for several reasons:

Of course now we get to the really unfortunate aspect of all of this.

In Windows, there are some components with specific owners, and others that are really considered to be very shared, with no specific owner who would be responsible for doing major updates.

Many times that "no owner" status comes in code that has not required changes in a long time.

Code of that sort often finds new owners via the "Chess move" theory of development -- i.e. "you touched it, you own it" -- but the resource updating functions (BeginUpdateResource, UpdateResource, and EndUpdateResource) have proven quite resilient to this, with people who modify them managing to avoid becoming owners except within the scope of their own changes.

So finding someone to volunteer to own this particular change could prove to be a challenge (especially since one can fall back on the whole by design thing!).

 

This blog brought to you by ㊮ (U+32ae, aka CIRCLED IDEOGRAPH RESOURCE)


# Mihai on 25 Aug 2008 2:00 PM:

A related problem seems to be the (apparently) random failure when adding a manifest to the executable. I guess mt.exe does the same kind of resource update (although it does not seem to call the official API).

# Michael S. Kaplan on 26 Aug 2008 12:42 AM:

I don't know much about how the manifests are embedded -- could they be the same essential bug/limitation?

# KJK::Hyperion on 1 Sep 2008 5:19 AM:

Relocation data is fully position-independent: it doesn't contain relocations (naturally) nor RVAs to itself. It can be easily stripped from the executable and re-appended later

# Michael S. Kaplan on 1 Sep 2008 5:31 AM:

Yes, of course. :-)

But it is the part of the binary that the linker does the most writing work in -- so it is easier to stick at the end. Others can of course move sections around later, but none would do so gratuitously, so you need someone to decide it is worth their while....

