From Win32 to .NET (and vice versa)

by Michael S. Kaplan, published on 2005/04/15 05:45 -04:00, original URI:

Charts like the one in "Some suggested updates to the Win32-->.NET mapping for NLS functions" are kind of amusing when you consider the context in which the System.Globalization classes were conceived.

Internationalization in Windows did not really exist way back in the days of 16-bit Windows, and when all of that code was written back in the early days, all of the expert advice that came from linguists and standards experts went into the pot. But mistakes were still made and not everything was done in the way we might do it today if we had the choice to do it all over again. If you know what I mean.

Back before the 1.0 version of the .NET Framework shipped, it all started as an interesting question: "What if there were no legacy? How would we support internationalization knowing everything we do now?"

Julie Bennett (based on meetings with John McConnell and others) put together an interesting spec that laid out the way such a framework would work. This is the very spec I had found that made me want to meet with her and talk about Globalization support in .NET, as I mentioned in "Why/how MSLU came to be, and more", and that led to me getting my first contract with this group.

It describes a lot of what actually became the System.Globalization classes in the .NET Framework. That big "What if?" discussion document made many interesting changes that do make a lot more sense.

Yesterday I was talking to a developer on the Longhorn project who was working on the way Longhorn will consume the results of Whidbey's CultureAndRegionInfoBuilder class (which, as discussed earlier, will pave the way for locales to be customized in Longhorn). He was baffled at why a certain bit of code would not return the right results, until he found the documentation of NumberFormatInfo.NumberGroupSizes and looked at the remarks:

Every element in the one-dimensional array must be an integer from 1 through 9. The last element can be 0.

The first element of the array defines the number of elements in the least significant group of digits immediately to the left of the NumberDecimalSeparator. Each subsequent element refers to the next significant group of digits to the left of the previous group. If the last element of the array is not 0, the remaining digits are grouped based on the last element of the array. If the last element is 0, the remaining digits are not grouped.

For example, if the array contains { 3, 4, 5 }, the digits will be grouped similar to "55,55555,55555,55555,4444,333.00". If the array contains { 3, 4, 0 }, the digits will be grouped similar to "55555555555555555,4444,333.00".
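
To make those remarks concrete, here is a minimal Python sketch of the grouping rule as the quoted documentation describes it. It is only a model of the documented behavior, not the .NET Framework's implementation; the function name group_digits and its parameters are made up for illustration.

```python
def group_digits(digits, sizes, sep=","):
    """Group an integer digit string following the quoted NumberGroupSizes rules.

    `sizes` is read right to left from the decimal separator. A non-zero last
    element keeps repeating; a last element of 0 leaves the rest ungrouped.
    (Hypothetical helper, not a .NET API.)
    """
    groups = []
    rest = digits
    i = 0
    while rest:
        # Past the end of the array, keep reusing the last element.
        size = sizes[min(i, len(sizes) - 1)] if sizes else 0
        if size == 0:
            # 0 means: stop grouping, emit everything that remains as one run.
            groups.append(rest)
            break
        groups.append(rest[-size:])
        rest = rest[:-size]
        i += 1
    return sep.join(reversed(groups))


# The two examples from the remarks (integer part only).
sample = "5" * 17 + "4444" + "333"
print(group_digits(sample, [3, 4, 5]))
print(group_digits(sample, [3, 4, 0]))
```

With the integer part of the documentation's sample number, the two calls reproduce the groupings shown in the remarks.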

And compared that to the documentation he already knew about from the GetLocaleInfo topic in Windows and its link to Locale Information that discusses the LCTypes, specifically the one on LOCALE_SGROUPING:

Sizes for each group of digits to the left of the decimal. An explicit size is needed for each group, and sizes are separated by semicolons. If the last value is zero, the preceding value is repeated. For example, to group thousands, specify 3;0. Indic locales group the first thousand and then group by hundreds, for example 12,34,56,789, which is represented by 3;2;0.
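
The Windows rule can be sketched the same way. This is a rough Python model of the LOCALE_SGROUPING description quoted above, not the actual NLS code; the name group_digits_sgrouping is hypothetical. One assumption goes beyond the quoted text: when the string does not end in zero, the sketch leaves any digits past the explicit groups ungrouped.

```python
def group_digits_sgrouping(digits, sgrouping, sep=","):
    """Group an integer digit string using a LOCALE_SGROUPING-style string.

    A trailing 0 means "repeat the preceding size". Without it, only the
    explicitly listed groups are split off (an assumption of this sketch).
    (Hypothetical helper, not a Win32 API.)
    """
    sizes = [int(part) for part in sgrouping.split(";")]
    repeat = len(sizes) > 1 and sizes[-1] == 0
    if sizes[-1] == 0:
        sizes = sizes[:-1]  # the trailing zero is a flag, not a group size

    groups = []
    rest = digits
    i = 0
    while rest and sizes:
        if i < len(sizes):
            size = sizes[i]
        elif repeat:
            size = sizes[-1]
        else:
            break  # explicit sizes used up and no repeat requested
        groups.append(rest[-size:])
        rest = rest[:-size]
        i += 1
    if rest:
        groups.append(rest)  # whatever is left stays in one run
    return sep.join(reversed(groups))


print(group_digits_sgrouping("123456789", "3;0"))    # thousands
print(group_digits_sgrouping("123456789", "3;2;0"))  # Indic-style
```

The second call gives the 12,34,56,789 pattern that the quoted text attributes to 3;2;0.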

Note the change -- the .NET Framework says "If the last element is 0, the remaining digits are not grouped" while in Windows it says "If the last value is zero, the preceding value is repeated". Aha! or maybe Eureka! The way to make the code work becomes clear....
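
The fix itself is not spelled out here, but one plausible reading of the two quotes is that the trailing zero has to be flipped when moving between the formats. The Python sketch below shows that conversion under that assumption. Both function names are invented for illustration, and the handling of a lone "0" (no grouping on either side) is likewise an assumption.

```python
def sgrouping_to_group_sizes(sgrouping):
    """Convert a LOCALE_SGROUPING-style string to a NumberGroupSizes-style list.

    Win32: trailing 0 = repeat the preceding size.
    .NET:  trailing 0 = do not group the remaining digits.
    (Hypothetical helper; the mapping is inferred from the quoted docs.)
    """
    sizes = [int(part) for part in sgrouping.split(";")]
    if sizes == [0]:
        return [0]            # assumed: no grouping at all on both sides
    if sizes[-1] == 0:
        return sizes[:-1]     # "repeat" is the .NET default, so drop the 0
    return sizes + [0]        # no repeat in Win32, so tell .NET to stop


def group_sizes_to_sgrouping(sizes):
    """Convert a NumberGroupSizes-style list back to a LOCALE_SGROUPING string."""
    if sizes == [0]:
        return "0"
    if sizes[-1] == 0:
        win32 = sizes[:-1]    # .NET "stop grouping" = Win32 with no trailing 0
    else:
        win32 = sizes + [0]   # .NET "repeat last" = Win32 trailing 0
    return ";".join(str(n) for n in win32)


print(sgrouping_to_group_sizes("3;2;0"))
print(group_sizes_to_sgrouping([3, 4, 0]))
```

Under these assumptions the two functions are inverses of each other, so a value can round-trip between the Win32 and .NET representations without losing the "repeat" versus "stop" distinction.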

Now by any objective standard, this is a good change, made in the sober spirit of "what if we could do it all over again?" but it is still a bit jarring when you have to come back and put them side by side. Either because you are building a big table that tells you how to take a Win32 API and find its .NET Framework equivalent, or because you are working on code to consume the .NET Framework's format in Win32. Either way you understand why it changed, the terrible freedom you have when you do not have a legacy support issue, and how embarrassing the old ways were, even though they worked quite well for over a decade....

Yet somehow it does not feel all that embarrassing to me to put them both up, side by side. It feels quite liberating to admit you are wrong, especially when

  1. The "wrong" solution still works just fine, thank you;
  2. The "right" solution exists now, or will exist soon;
  3. We are the team responsible for both;
  4. There are sensible reasons for both to exist as they do;
  5. It comes up in the context of both opening it all up and getting out of the way.

That is, quite simply, awesome. Both as a legacy and as a future direction.


This post brought to you by "Ă" (U+0102, LATIN CAPITAL LETTER A WITH BREVE)
Who, after seeing yesterday's sponsorship by "A", wanted to point out that it is more than just an "A" -- it is like an "A+", as it were!

# Mihai on 15 Apr 2005 11:46 AM:

Two proposed changes/additions:
1. If the .NET 2.0 API is not 100% frozen and I am not too late: I think there is a need for a function to tell whether a culture uses a right-to-left script or not.
It is quite common to change the CurrentUICulture, but if there are special things you need to do (like deciding if you need to mirror images or not), you need to know if the culture is RTL or not. Sure, you can have a value in your resource file, but why?
2. Use full Unicode name when describing the sponsoring character. Otherwise they may get upset:
So "CAPITAL LETTER A WITH BREVE" is not quite good enough :-)

# Michael S. Kaplan on 15 Apr 2005 1:10 PM:

For #1, a new TextInfo.IsRightToLeft property was added. See,vs.80).aspx for details.

For #2, a slip of the typing, fixed now. :-)

