Challenges behind MSLU: The hardest API to wrap

by Michael S. Kaplan, published on 2005/02/19 10:20 -05:00, original URI:

MSLU covers a lot of territory, considering how small it is -- less than 250k yet it covers over 500 APIs!

One of the questions I was often asked was what I thought was the most difficult API to wrap.

My answer, without hesitation, was always the same: FormatMessage.

Let's look at how the Platform SDK describes it:

The FormatMessage function formats a message string. The function requires a message definition as input. The message definition can come from a buffer passed into the function. It can come from a message table resource in an already-loaded module. Or the caller can ask the function to search the system's message table resource(s) for the message definition. The function finds the message definition in a message table resource based on a message identifier and a language identifier. The function copies the formatted message text to an output buffer, processing any embedded insert sequences if requested.

Ok, so it sounds complicated. When I started out, I figured it would be easy. Of the 32 different Unicode layers inside of Microsoft almost half of them had a FormatMessage wrapper already. With so much code to work from, we should be covered, right? The reality was far stranger, unfortunately! It seems that even with 14 samples to steal from, no one had ever implemented the whole function. After all, FormatMessage is pretty complicated:

You can load from system messages, from your own binary, or format a string you pass in

Turns out no one had ever done it all -- in fact, none of them used arguments at all, except in the simplest of ways to avoid problems (i.e. a version that only accepted strings, a version that never accepted them, and so on). I won't name names or anything, but I really don't have to -- from my point of view everyone stunk at this point, and I did not care how many interesting bugs for ExtTextOutW on Win9x had been fixed....

So, the MSLU Triage Committee met (which sounds more impressive than it was -- it was just my lead and a PM and me talking). We decided that ideally MSLU ought to just do it right, for all of the cases. After all, you never know what a user might be depending on!

But trying to completely support all possible aspects was tricky. After all, you have to accept an arbitrary number of parameters, only some of which might be strings (and of course all strings have to be converted!). They can be used in any order, and although it is not documented, it is entirely legal to skip inserts in the actual string being used (thank God for beta testers, that's all I can say!).

In the end, I kept a very simple strategy to handle all the weirdnesses of inserts:

  1. Preload the string, passing FORMAT_MESSAGE_IGNORE_INSERTS in dwFlags so that the raw string can be obtained (and the API allocates this time)
  2. Parse the string for insert tags (which are always in the %n format, up to %99), being sure to get the highest id number and also to record, for each insert, whether the raw string formats it as a string
  3. Free the allocated string; we do not need it anymore
  4. Build a new array of items based on the old array or va_list (converting all strings out of Unicode as needed).
  5. Call the OS API with the newly created inserts, using the allocation preference of the one who called us.
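
Step 2 is the part that is easiest to show in isolation. What follows is a minimal sketch in portable C, not the MSLU code: the `InsertScan` type and `scan_inserts` function are hypothetical names made up for illustration. It assumes the documented insert syntax (`%n` or `%n!spec!`, with a bare `%n` defaulting to a string), treats any spec ending in `s` or `S` as a string, and ignores the `*` width and precision forms that consume extra arguments.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_INSERTS 99

/* Hypothetical result type for this sketch. */
typedef struct {
    int highest;                              /* highest insert number seen, 0 if none */
    unsigned char used[MAX_INSERTS + 1];      /* used[n] != 0 if %n appears */
    unsigned char is_string[MAX_INSERTS + 1]; /* is_string[n] != 0 if %n is formatted as a string */
} InsertScan;

/* Scan a raw (unformatted) message string for %1..%99 inserts. */
static void scan_inserts(const char *msg, InsertScan *out)
{
    assert(msg != NULL && out != NULL);
    memset(out, 0, sizeof *out);

    const char *p = msg;
    while (*p != '\0') {
        if (*p != '%') { p++; continue; }
        p++;                                  /* skip the '%' */
        if (*p == '\0') break;
        if (*p < '1' || *p > '9') {           /* escapes such as %%, %0, %n, %! */
            p++;
            continue;
        }

        int n = *p++ - '0';                   /* first digit, 1..9 */
        if (*p >= '0' && *p <= '9')
            n = n * 10 + (*p++ - '0');        /* optional second digit */

        int is_str = 1;                       /* no !spec! means a string insert */
        if (*p == '!') {
            const char *end = strchr(p + 1, '!');
            if (end == NULL) break;           /* malformed spec: stop scanning */
            /* Assumption: the final character of the spec is the conversion. */
            is_str = (end > p + 1) && (end[-1] == 's' || end[-1] == 'S');
            p = end + 1;
        }

        out->used[n] = 1;
        if (is_str) out->is_string[n] = 1;
        if (n > out->highest) out->highest = n;
    }
}
```

For the raw string `"Copy %1 to %3!d!"`, the scan reports a highest insert of 3, marks %1 as a string and %3 as a non-string, and leaves %2 unused, which is the skipped-insert case mentioned above. A wrapper would then size its new argument array from `highest` and convert only the entries flagged in `is_string`.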

Lots of minor flourishes about allocations and language info, etc., but in the end this was a pretty hellaciously complicated API to try to totally support!


This post brought to you by "ฑ" (U+0e11, a.k.a. THAI CHARACTER THO NANGMONTHO)

# AC on 19 Feb 2005 1:01 PM:

Why wouldn't you copy the Unicode FormatMessage code from NT and use that?

# CN on 19 Feb 2005 4:42 PM:

Another question/suggestion: Were you ever, in this method or others in MSLU able/allowed/helped by looking at how SomeWeirdFunctionA wrapped SomeWeirdFunctionW in NT? I would imagine that some issues should be quite similar, even if the devil in the details would be proper assumptions of total buffer sizes and so on.

# Michael Kaplan on 19 Feb 2005 7:52 PM:

Hello AC -- The design goal of MSLU was *not* to support Unicode on Win9x. It was to provide a Unicode programming interface so that the same application running on NT would be able to support everything the platform could.

CN -- sometimes it would help, I did look at most of them at some point....
