When the roof got raised, and why

by Michael S. Kaplan, published on 2005/12/27 16:01 -05:00, original URI: http://blogs.msdn.com/b/michkap/archive/2005/12/27/507599.aspx


Documentation for API functions can sometimes fall behind the way the functions actually work.

I know, that is a huge newsflash for everyone.... :-)

Eventually we fix it, though.

If you look at the Locale Information topic that contains the LCTYPE values used by GetLocaleInfo/SetLocaleInfo, you will notice that for all of the string fields that can be altered by SetLocaleInfo we list the maximum legal size (including the NULL character). For example:

LOCALE_S1159
String for the AM designator (first 12 hours of the day). The maximum number of characters allowed for this string is different for different releases of Windows:

LOCALE_S2359
String for the PM designator (second 12 hours of the day). The maximum number of characters allowed for this string is different for different releases of Windows:

Now obviously this was not always what was there. In fact, if you look at the Platform SDK documentation in the October 2005 MSDN release, you will find a different story:

LOCALE_S1159
String for the AM designator. The maximum number of characters allowed for this string is nine.

LOCALE_S2359
String for the PM designator. The maximum number of characters allowed for this string is nine.

(that number includes the NULL character)

So the new documentation certainly does a better job of describing a problem that existed in the documentation for years after Windows XP and Server 2003 shipped.

So how is it that the GetLocaleInfo and SetLocaleInfo limits are different?

Well, I guess it was a combination of how the functions work and what they are trying to do in Windows.

Prior to Vista, the locale data is always what we ship, which we trust. So GetLocaleInfo does not have to verify the length; it can just pick up its null-terminated string from its cache and return it. This is different from SetLocaleInfo, which has to do things with that string like put it in the registry and the cache. So we have to care about the maximum length a bit more.

Now some time before XP shipped, the official limit was raised to 13 (12 plus the NULL) because some locale had strings longer than 8 characters.

Unfortunately, no one noticed that Gujarati had strings that were longer than this:

They are both 14 characters each. So you can retrieve them, but you can't set them!
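
To make the asymmetry concrete, here is a toy model in Python. It is only a sketch and not the actual NLS code: the names `get_locale_info`, `set_locale_info`, and `MAX_SET_LENGTH` are hypothetical, and the 14-character placeholder merely stands in for the real Gujarati designators.

```python
# Toy model only: not the real GetLocaleInfo/SetLocaleInfo implementation.
MAX_SET_LENGTH = 13  # limit described above: 12 characters plus the NULL

# Shipped (trusted) data. The value is a placeholder of the right length,
# not the real Gujarati AM designator.
_shipped = {"LOCALE_S1159": "x" * 14}
_overrides = {}


def get_locale_info(lctype):
    """Return the user override if present, else shipped data; no length check."""
    return _overrides.get(lctype, _shipped[lctype])


def set_locale_info(lctype, value):
    """Store a user override, but only if it fits the maximum length."""
    if len(value) + 1 > MAX_SET_LENGTH:  # +1 for the terminating NULL
        return False
    _overrides[lctype] = value
    return True


# Reading the shipped string works, but writing the same string back fails.
current = get_locale_info("LOCALE_S1159")
round_trip_ok = set_locale_info("LOCALE_S1159", current)
```

In this model the getter trusts whatever is stored, while the setter is the only place a limit is enforced, which is why a value can be readable and still not settable.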

Sometimes we might look less strange/mysterious when we explain what is happening, at the cost of seeming a little more foolish. It is definitely a trade-off!

 

This post brought to you by "મ" (U+0aae, GUJARATI LETTER MA)


# Centaur on 28 Dec 2005 11:00 AM:

Now, this raises two questions. First, why was there a roof in the first place? And second, when it was found that the roof is inconvenient for some tall people, why wasn’t it raised all the way to the sky? Another example of “ought to be enough for everyone”?

# Michael S. Kaplan on 28 Dec 2005 11:07 AM:

Well, the issue of memory allocation comes up (people do not want to call it twice to get the size and then get the value), so the max. size per item is allocated. Unfortunately, it then ended up needing to get bigger....

# Centaur on 29 Dec 2005 11:59 PM:

Speaking of calling API functions twice… It seems to me that each occurrence of this pattern is an invitation for a race condition.

Suppose we want to read a string value from the registry. But we don’t know how long it is, so we call RegQueryValue once with 0 size, and it tells us how much space we will need. We allocate that much and call it again with the new size and the buffer pointer.

But while we were allocating, another process or thread was scheduled, and it rewrote the value to be longer. So, when we regain consciousness, we find that our buffer is insufficient.

The correct thing would be to deallocate the obsolete-sized buffer, allocate a new one that seems to be sufficient, and retry; but a maliciously inclined concurrent process or thread still might make the value longer, potentially catching us in an infinite loop, until we hit the registry size limit, memory limit or address space limit.

Anyway, how many programs will just fail when the second call to RegQueryValue returns “buffer is too small”?
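
The retry approach described in this comment, with a cap on attempts so that a hostile writer cannot cause an infinite loop, might look roughly like the Python sketch below. Everything here is hypothetical: `FakeStore` simulates a value that a concurrent writer lengthens between the two calls, and `query_size`/`query_value` only stand in for the size query and the fetch; none of this is the real registry API.

```python
class BufferTooSmall(Exception):
    """Raised by the hypothetical fetch when the caller's buffer is too short."""


class FakeStore:
    """Stand-in for a stored value that another thread may lengthen."""

    def __init__(self, value, growth_steps=0):
        self.value = value
        self.growth_steps = growth_steps  # times the simulated writer grows it

    def query_size(self):
        # First call of the pattern: ask how much space is needed.
        return len(self.value)

    def query_value(self, buffer_size):
        # Simulate a concurrent writer lengthening the value between calls.
        if self.growth_steps > 0:
            self.growth_steps -= 1
            self.value += "!"
        if len(self.value) > buffer_size:
            raise BufferTooSmall(len(self.value))
        return self.value


def read_with_retry(store, max_attempts=3):
    """Size-then-fetch, retried a bounded number of times."""
    for _ in range(max_attempts):
        size = store.query_size()
        try:
            return store.query_value(size)
        except BufferTooSmall:
            continue  # the value grew in between; ask for the size again
    raise RuntimeError("value kept growing; giving up")
```

With `FakeStore("abc", growth_steps=2)` the first two attempts fail and the third succeeds; with more growth steps than attempts, the function gives up with an error instead of looping forever.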

# Michael S. Kaplan on 30 Dec 2005 10:11 AM:

It is perhaps useful to consider that (1) updates are rare, and that (2) you can avoid the issue you are worried about by using the LOCALE_NOUSEROVERRIDE flag in which case you get only Microsoft-provided data?

# silverpie on 30 Dec 2005 12:06 PM:

http://www.catb.org/~esr/jargon/html/Z/Zero-One-Infinity-Rule.html

# Michael S. Kaplan on 30 Dec 2005 12:39 PM:

The reality of software denies the premise though, sorry. The possibilities are not unlimited, and forcing developers to prepare for unknown limits is not really something that software is interested in doing..... Nice try though! :-)


referenced by

2010/04/11 When one dishes it out but can't take it (aka Raising the roof, 2010 edition)

2009/08/20 When you're in the red, you don't have a lot of choices

2008/02/21 Why is the separator blue?
