by Michael S. Kaplan, published on 2012/07/02 07:01 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2012/07/02/10325344.aspx
So we have these values in locales, the LOCALE_ICOUNTRY values.
They [mostly] correspond to some constants in winnls.h, the CTRY_* constants.
Now values are largely the same between them, when they both exist.
But some of the values are wrong.
Some of them have been wrong for many years, in fact.
Now they [mostly] represent International Dialing codes, though in most cases to be used as such they have to be zero-padded....
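For anyone who wants to see what a given locale reports, here is a minimal sketch of reading LOCALE_ICOUNTRY through GetLocaleInfoEx using Python's ctypes. The helper name `get_icountry` is hypothetical, the 16-character buffer is simply an assumed comfortable size, and off Windows the function just returns None since the NLS API is not there.

```python
import ctypes
import sys

LOCALE_ICOUNTRY = 0x00000005  # LCTYPE constant as defined in winnls.h


def get_icountry(locale_name):
    """Return the LOCALE_ICOUNTRY string for a locale name, or None.

    Hypothetical helper for illustration only; returns None when the
    Windows NLS API is unavailable or the call fails.
    """
    if sys.platform != "win32":
        return None  # GetLocaleInfoEx only exists on Windows

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    get_locale_info = kernel32.GetLocaleInfoEx
    # int GetLocaleInfoEx(LPCWSTR, LCTYPE, LPWSTR, int)
    get_locale_info.argtypes = [
        ctypes.c_wchar_p,
        ctypes.c_uint32,
        ctypes.c_wchar_p,
        ctypes.c_int,
    ]
    get_locale_info.restype = ctypes.c_int

    buf = ctypes.create_unicode_buffer(16)  # assumed to be large enough
    written = get_locale_info(locale_name, LOCALE_ICOUNTRY, buf, len(buf))
    return buf.value if written > 0 else None


if __name__ == "__main__":
    for name in ("en-CA", "fa-IR", "de-LI"):
        print(name, get_icountry(name))
```

Going by the table in this post, one would expect 2 for en-CA on the Windows versions discussed here, though the sketch does not assume any particular value.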
Anyway, let's look at the full list of broken LOCALE_ICOUNTRY values, with their corresponding CTRY_* constants (there are other locales for countries lacking those constants; filling those out would require a bit more work!).
In the table, if it is RED and BOLD then it is wrong -- missing constant, wrong value represented by the constant, wrong value in the locale:
BCP47 | Country | CTRY_* constant | iCountryCode | CTRY_* value | Country it actually belongs to | Correct iCountryCode |
be-BY | Belarus | CTRY_BELARUS | 7 | 375 | Russia | 375 |
tn-BW | Botswana | n/a | 27 | n/a | South Africa | 267 |
en-CA | Canada | CTRY_CANADA | 2 | 2 | n/a | 1 |
fr-CA | Canada | CTRY_CANADA | 2 | 2 | n/a | 1 |
moh-CA | Canada | CTRY_CANADA | 2 | 2 | n/a | 1 |
iu-Latn-CA | Canada | CTRY_CANADA | 2 | 2 | n/a | 1 |
iu-Cans-CA | Canada | CTRY_CANADA | 2 | 2 | n/a | 1 |
fa-IR | Iran | CTRY_IRAN | 981 | 981 | nothing - looks like a typo | 98 |
lo-LA | Laos | n/a | 457 | n/a | this is the mobile dialing code, but not the regular dialing code | 856 |
de-LI | Liechtenstein | CTRY_LIECHTENSTEIN | 41 | 41 | Switzerland | 423 |
sr-Latn-ME | Montenegro | n/a | 381 | n/a | Serbia | 382 |
sr-Cyrl-ME | Montenegro | n/a | 381 | n/a | Serbia | 382 |
wo-SN | Senegal | n/a | 608 | n/a | the Senegal country code, but not the dialing code | 221 |
ff-Latn-SN | Senegal | n/a | 608 | n/a | the Senegal country code, but not the dialing code | 221 |
ta-LK | Sri Lanka | n/a | 91 | n/a | India | 94 |
uz-Latn-UZ | Uzbekistan | CTRY_UZBEKISTAN | 7 | n/a | Russia | 998 |
uz-Cyrl-UZ | Uzbekistan | CTRY_UZBEKISTAN | 7 | n/a | Russia | 998 |
Especially the longstanding issue for Canada, which I believe has been around since Windows NT 3.1/Windows 95. There was a reason for this that used to be documented -- I think it might have been the old IBM country/region codes or something, but they've moved on and no longer use them. Cathy thinks so too, though it's been years for her as well. Anyone have any antiquated IBM standards info handy? :-)
I introduced three of these in Windows 8, I believe -- tn-BW and ta-LK and ff-Latn-SN (all taken from the locales they were based on, with the Tamil - Sri Lanka case being a bit sillier since Sinhala - Sri Lanka is in there, and correct).
As for the Montenegro ones, it is obvious how that happened.
But the Senegal ones grabbing the wrong kind of country code? Weird!
Laos has me stunned -- maybe the mobile dialing code would be a helluva lot more useful in some scenarios -- these can all be on Windows Phone at some point, right? Forcing use of the non-mobile code seems less and less relevant over time, right?
The Iran thing? An oops from long ago.
And the Liechtenstein one? It was originally largely based on Switzerland's data, so I guess that makes sense.
Russia used to mean a lot more to Belarus and Uzbekistan, and they've been that way a while....
[At least] 4 new constants in a global header file, and 12-17 changes to locale data.
Sounds like a fascinating bug to convince people to approve of a fix for, huh? :-)
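The locale-data side of that tally can be checked against the table with a short sketch. The rows below are transcribed from the table in this post; the helper names are hypothetical, and reading the low end of the "12-17" range as "leave the five Canada locales alone" is an assumption, since the post does not spell out where the 12 comes from.

```python
# Rows transcribed from the table in this post:
# (BCP 47 tag, iCountryCode currently in the locale, correct dialing code)
ROWS = [
    ("be-BY", 7, 375),
    ("tn-BW", 27, 267),
    ("en-CA", 2, 1),
    ("fr-CA", 2, 1),
    ("moh-CA", 2, 1),
    ("iu-Latn-CA", 2, 1),
    ("iu-Cans-CA", 2, 1),
    ("fa-IR", 981, 98),
    ("lo-LA", 457, 856),
    ("de-LI", 41, 423),
    ("sr-Latn-ME", 381, 382),
    ("sr-Cyrl-ME", 381, 382),
    ("wo-SN", 608, 221),
    ("ff-Latn-SN", 608, 221),
    ("ta-LK", 91, 94),
    ("uz-Latn-UZ", 7, 998),
    ("uz-Cyrl-UZ", 7, 998),
]


def region(tag):
    """Return the region subtag, taken here as the last hyphen-separated part."""
    return tag.rsplit("-", 1)[-1]


def locales_to_fix(rows, keep_regions=()):
    """List locales whose current value differs from the correct one,
    skipping any regions deliberately left unchanged."""
    return [
        tag
        for tag, current, correct in rows
        if current != correct and region(tag) not in keep_regions
    ]


all_fixes = locales_to_fix(ROWS)
# Assumption: the low end of the range means leaving Canada's legacy 2 as-is.
without_canada = locales_to_fix(ROWS, keep_regions={"CA"})

print(len(all_fixes), len(without_canada))
```

Every row in the table has a locale value that differs from the correct one, so the first count is simply the number of rows; the second drops the five Canada locales.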
On the one hand, I care a lot -- this data is wrong and should be fixed.
On the other hand, these have all been wrong for 1-18 years, without any customer complaints. That is a strike against them, I'll admit....
Kind of proves an issue we've long known -- data only available programmatically is generally of lower quality, and lacks the basic checks and balances that items in visible UI have.
This particular item, which isn't even in the Locale Builder or in .NET at all? We're lucky to have ~20-30 bugs only!
Doug Ewell on 2 Jul 2012 8:48 AM:
2 for Canada (or "Canadian-French") goes all the way back to the COUNTRY setting in MS-DOS CONFIG.SYS. See www.vfrazee.com/.../country.htm or numerous other online references.
That setting was my first experience with country codes, well before I learned about ISO 3166. I adapted it for the commercial software I was working on at the time, as well as for customer records. Hooray, the country code also helps us dial the number! Of course, the MS-DOS codes aren't exactly ITU-T E.164 country codes—besides the errors you show here, there were all the non-US, NANP countries that needed ad-hoc codes. MS-DOS gave us 2 for Canada as an example, but I was on my own for the rest. There were other conflicts; 7 was the dialing code for many ex-Soviet countries for years after the breakup, and is still used for Kazakhstan as well as Russia. Liechtenstein used to be dialed as if it were part of Switzerland.
No country coding system can be perfect, because of the imprecise definition of "country." ISO 3166 is clearly (IMHO) the best choice today, but even then some people are outraged because, for example, Bouvet Island has a 3166 code element but Scotland doesn't.
Ken Sadahiro (MSFT) on 2 Jul 2012 10:42 AM:
Thank you Michael, for blogging about this!