Grouping in the managed and unmanaged worlds

by Michael S. Kaplan, published on 2006/04/18 12:03 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2006/04/18/578328.aspx


Yesterday, when I talked about that post from Raymond about numeric grouping the locale-sensitive way, I mentioned that I'd be back after he came back. Here goes....

Now as messy as GetLocaleInfo's LOCALE_SGROUPING LCTYPE may seem:

Sizes for each group of digits to the left of the decimal. An explicit size is needed for each group, and sizes are separated by semicolons. If the last value is zero, the preceding value is repeated. For example, to group thousands, specify 3;0. Indic locales group the first thousand and then group by hundreds: for example 12,34,56,789, which is represented by 3;2;0. The maximum number of characters allowed for this string is ten.

Further examples:

Specification   Resulting string
3;0             3,000,000,000,000
3;2;0           30,00,00,00,00,000
3               3000000000,000
3;2             30000000,00,000
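
To make the quoted rules concrete, here is a minimal Python sketch of how a LOCALE_SGROUPING-style string could be applied to a run of digits. It is only an illustration of the description above, not what the Win32 formatting functions actually do internally. The name group_win32 is made up, and the sketch assumes a well-formed specification (semicolon-separated sizes, with 0 appearing only in the last position).

```python
def group_win32(digits, sgrouping, sep=","):
    """Group a digit string per the LOCALE_SGROUPING description quoted above.

    Hypothetical helper: assumes `digits` holds only the integer part and
    `sgrouping` is well formed (e.g. "3;0", "3;2;0", "3", "3;2", or "0").
    """
    sizes = [int(s) for s in sgrouping.split(";")]

    # In this syntax a trailing 0 means "repeat the preceding size".
    repeat = sizes[-1] == 0
    if repeat:
        sizes.pop()

    groups = []
    rest = digits
    i = 0
    while sizes:
        if i < len(sizes):
            size = sizes[i]      # an explicitly listed group size
        elif repeat:
            size = sizes[-1]     # keep reusing the last explicit size
        else:
            break                # no trailing 0: leave the rest ungrouped
        if len(rest) <= size:
            break                # nothing left to split off
        groups.append(rest[-size:])
        rest = rest[:-size]
        i += 1

    groups.append(rest)
    return sep.join(reversed(groups))


if __name__ == "__main__":
    for spec in ("3;0", "3;2;0", "3", "3;2"):
        print(spec, group_win32("3000000000000", spec))
```

Under those assumptions, the four specifications in the table give back the four strings shown there.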

It looks even worse after you compare it to its managed cousin, NumberFormatInfo.NumberGroupSizes:

Every element in the one-dimensional array must be an integer from 1 through 9. The last element can be 0.

The first element of the array defines the number of elements in the least significant group of digits immediately to the left of the NumberDecimalSeparator. Each subsequent element refers to the next significant group of digits to the left of the previous group. If the last element of the array is not 0, the remaining digits are grouped based on the last element of the array. If the last element is 0, the remaining digits are not grouped.

For example, if the array contains { 3, 4, 5 }, the digits will be grouped similar to "55,55555,55555,55555,4444,333.00". If the array contains { 3, 4, 0 }, the digits will be grouped similar to "55555555555555555,4444,333.00".
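
A minimal Python sketch of the managed rules as quoted, for anyone who wants to poke at them. This is not the .NET implementation; group_managed is a hypothetical name, and the sketch assumes a non-empty array that follows the stated constraints (elements 1 through 9, with 0 allowed only as the last element).

```python
def group_managed(digits, group_sizes, sep=","):
    """Group a digit string per the NumberGroupSizes description quoted above.

    Hypothetical helper: `digits` is the integer part only, and
    `group_sizes` is a non-empty sequence such as [3], [3, 4, 5] or [3, 4, 0].
    """
    sizes = list(group_sizes)

    # In this syntax a nonzero last element repeats; a trailing 0 means
    # the remaining digits are not grouped.
    repeat = sizes[-1] != 0
    if not repeat:
        sizes.pop()

    groups = []
    rest = digits
    i = 0
    while sizes:
        if i < len(sizes):
            size = sizes[i]      # an explicitly listed group size
        elif repeat:
            size = sizes[-1]     # keep reusing the last element
        else:
            break                # trailing 0: leave the rest ungrouped
        if len(rest) <= size:
            break                # nothing left to split off
        groups.append(rest[-size:])
        rest = rest[:-size]
        i += 1

    groups.append(rest)
    return sep.join(reversed(groups))


if __name__ == "__main__":
    digits = "5" * 17 + "4444" + "333"
    print(group_managed(digits, [3, 4, 5]))
    print(group_managed(digits, [3, 4, 0]))
```

Fed the integer part of the two examples above, the sketch produces the same groupings (minus the ".00", which it does not handle).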

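Note the different syntaxes? Not just the string versus array thing (the array makes parsing easier), but the difference in repeat syntax? In LOCALE_SGROUPING a trailing 0 means "keep repeating the preceding size" and leaving it off means "stop grouping", while in NumberGroupSizes a trailing 0 means "stop grouping" and leaving it off means "keep repeating the last size". That added an interesting wrinkle to supporting custom locales (which are also custom cultures), let me tell you! :-)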
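
Going only by the two descriptions quoted above, translating between the syntaxes comes down to flipping whether the trailing 0 is present. Here is a rough Python sketch of that idea. The function names are invented for illustration, the handling of a lone 0 as "no grouping" on both sides is an assumption, and none of this is claimed to be how custom locale support actually does the conversion.

```python
def sgrouping_to_group_sizes(sgrouping):
    """Convert a LOCALE_SGROUPING-style string to a NumberGroupSizes-style list.

    Hypothetical helper, assuming well-formed input such as "3;0" or "3;2".
    """
    sizes = [int(s) for s in sgrouping.split(";")]
    if sizes == [0]:
        return [0]            # assumed: no grouping at all in either syntax
    if sizes[-1] == 0:
        return sizes[:-1]     # string's trailing 0 (repeat) -> drop it
    return sizes + [0]        # string stops grouping -> array needs a trailing 0


def group_sizes_to_sgrouping(group_sizes):
    """Convert a NumberGroupSizes-style list to a LOCALE_SGROUPING-style string."""
    sizes = list(group_sizes)
    if sizes == [0]:
        return "0"            # assumed: no grouping at all in either syntax
    if sizes[-1] == 0:
        out = sizes[:-1]      # array's trailing 0 (stop) -> drop it
    else:
        out = sizes + [0]     # array repeats last element -> string needs ";0"
    return ";".join(str(n) for n in out)


if __name__ == "__main__":
    for spec in ("3;0", "3;2;0", "3", "3;2"):
        as_list = sgrouping_to_group_sizes(spec)
        print(spec, as_list, group_sizes_to_sgrouping(as_list))
```

So "3;0" and { 3 } describe the same grouping, as do "3;2;0" and { 3, 2 }, while the non-repeating "3" corresponds to { 3, 0 }.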

Luckily, as Raymond mentioned, over 99% of people using either the managed or the unmanaged version of grouping will do it via the various formatting and parsing functions, so the varied syntaxes can be knowledge that never needs to be used directly. When it is understood, however, it can help explain the vast world of possible number formats out there....

 

This post brought to you by "𐄎" (U+1010e, a.k.a. AEGEAN NUMBER EIGHT)


# Maurits [MSFT] on 18 Apr 2006 12:52 PM:

Interesting.

Are there any real-world cultures that use different separators between different groups?  Like Japanese dates, but with numbers?
