Marshaling your resistance

by Michael S. Kaplan, published on 2007/06/23 03:01 -04:00, original URI:

I was recently involved in one of those interesting email discussions where I had the opportunity to both help and hinder my reputation.

And at the same time the conversation managed to make something in the design of a programming language look like both a bug and an inconsistent (or maybe I should say impure) design.

(It also hurt, at least a little bit, the reputation of a developer who gave me advice about the same issue months before, but their advice was not attributed, so in this case it only hurt their reputation with me!)

Anyway, let me explain what I am talking about.

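There are some functions that accept or return buffers that are arrays of strings, with each string terminated by a NULL and the whole list terminated by one extra NULL, laid out like the following:

        String1<NULL>String2<NULL>String3<NULL><NULL>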

This is the pattern you see in functions like GetProfileSection and WritePrivateProfileSection. It is not the most common kind of design pattern, but if you are using C/C++ it might be mildly convenient (though from C or C++ a va_list might be a bit easier to use).
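
To make the layout concrete, here is a minimal sketch in C of walking such a buffer. It assumes the usual double-NULL-terminated convention (each string ends in a NULL and an empty string ends the list); the helper names multi_sz_count and multi_sz_get are made up for illustration and are not part of any Windows API.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: count the strings in a double-NULL-terminated buffer.
   The list ends at the first empty string, i.e. the second of two adjacent NULLs. */
static size_t multi_sz_count(const wchar_t *buffer)
{
    size_t count = 0;
    for (const wchar_t *p = buffer; *p != L'\0'; p += wcslen(p) + 1) {
        count++;
    }
    return count;
}

/* Hypothetical helper: return the string at `index`,
   or NULL if the list has fewer than index + 1 entries. */
static const wchar_t *multi_sz_get(const wchar_t *buffer, size_t index)
{
    const wchar_t *p = buffer;
    while (*p != L'\0') {
        if (index-- == 0) {
            return p;
        }
        p += wcslen(p) + 1; /* skip this string and its terminator */
    }
    return NULL;
}
```

A wide literal such as L"en-US\0fr-FR\0de-DE\0" is a convenient test buffer, since the compiler appends the final NULL that closes the list.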

Now one problem with this pattern is that it is much harder to use from languages like VB or C# (though truth be told a va_list is much harder to use from those languages too, so it is probably not so bad that this harder-to-use but still in theory available method exists!).

Anyway, some other functions that use this pattern are SetThreadPreferredUILanguages and GetThreadPreferredUILanguages. I covered a way to call them from C# in the post Thinking about MUI is making me bipolar, and one of the people who had an earlier look at the post suggested that instead of using the following pinvoke declaration:

        [DllImport("kernel32.dll", CharSet=CharSet.Unicode, ExactSpelling=true, CallingConvention=CallingConvention.StdCall, SetLastError=true)]
        static extern bool GetThreadPreferredUILanguages(
          uint dwFlags, ref int pulNumLanguages, [MarshalAs(UnmanagedType.LPWStr)] string pwszLanguagesBuffer, ref int pcchLanguagesBuffer);

that I might want to use StringBuilder instead (then converting the final results to a string using the StringBuilder.ToString(int,int) overload). But I thought about it and realized that this would perhaps allocate the string twice, and the code I had already worked (and by that point I was much more worried about the actual bugs, and about convincing the MUI folks that those bugs existed), so I let it go.

Anyway, a few months later someone else was calling a similar function and having trouble. Suggestions ranged from using byte arrays and converting to strings via the Encoding class to using Marshal.PtrToStringUni. I figured I'd trot out that blog entry with the [MarshalAs(UnmanagedType.LPWStr)] code that works, and then also suggest the StringBuilder.ToString(int,int) solution for people bothered by the voodoo of treating a System.String as a buffer for a function, violating the whole immutability thing.

After not too long someone turned around and proved that the StringBuilder method did not work -- that the marshaller for the StringBuilder appeared to be truncating after the first NULL long before the ToString() method was even involved. Insane!
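
One plausible reading of that truncation (an assumption, since the discussion did not establish how the marshaller computes the length) is that somewhere on the way back the buffer is measured as a single NULL-terminated string. The sketch below, in C with made-up helper names, shows how far apart the two measurements are for the same buffer.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: the length seen by code that treats the buffer
   as one ordinary C string. wcslen stops at the first NULL. */
static size_t length_as_single_string(const wchar_t *buffer)
{
    return wcslen(buffer);
}

/* Hypothetical helper: the length of the whole double-NULL-terminated list,
   in characters, counting every separator and the final extra NULL. */
static size_t length_as_multi_string(const wchar_t *buffer)
{
    const wchar_t *p = buffer;
    while (*p != L'\0') {
        p += wcslen(p) + 1; /* step over one string and its terminator */
    }
    return (size_t)(p - buffer) + 1; /* +1 for the NULL that ends the list */
}
```

For a buffer holding en-US and fr-FR, the first function reports 5 characters and the second reports 13, so anything that copies by the first measure keeps only the first language.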

Okay, so I was wrong about that. It isn't like I actually tried that code; I just had a random developer wonder why I chose the method I did when there was a more conventional one available.

At least I had that answer that worked to fall back on. :-)

Not to mention the fact that the actual StringBuilder behavior here is hard to argue as anything other than a big fat bug that probably deserves consideration of a fix on its merits....

Even if the method is not "pure" and would not exist in a perfect NGWS/.Net world, I could make a strong case that the truncation-at-NULL bug in StringBuilder wouldn't exist in a perfect world, either (and both the Encoding class and Marshal.PtrToStringUni solutions involve extra allocations and potential conversion bugs, which would also be pretty impure)....


This post brought to you by Ǥ (U+01e4, a.k.a. LATIN CAPITAL LETTER G WITH STROKE)

# Wilhelm Svenselius on 25 Jun 2007 2:01 AM:

I'd probably go for the byte[]/System.Encoding solution if I was faced with this, unless there is some magical way to do it in less code. Using byte[] and Encoding makes it obvious to whoever reads the code that we are parsing a binary blob that contains some strings, rather than something which itself is a string, as using StringBuilder would imply.

# Michael S. Kaplan on 25 Jun 2007 8:13 AM:

Doesn't [MarshalAs(UnmanagedType.LPWStr)] make for less code here? :-)
