Some notes about The Locales of Windows 8, not yet divvied up...

by Michael S. Kaplan, published on 2012/08/23 07:01 -04:00, original URI:

The other day my blog post "The Locales of Windows 8, not yet divvied up..." went up.

Some people made observations.

Some people asked questions.

A few of them even did all that in the comments....

Like Andrew West, who commented:

I've said it before, and I'll say it one last time, I think the MS spelling of ᠮᠤᠨᠭᠭᠤᠯ (m-u-n-g-g-u-l) for the more usual ᠮᠣᠩᠭᠣᠯ (m-o-ng-g-o-l) is bizarre.
On a different matter, can you explain why Basque, Catalan and Cherokee repeat the language name in parentheses instead of giving the name of the country in which the language is spoken, as is the case with all the other locales on the list?  And why is "Iran" present in the native name for Persian but omitted from the display name?

I forwarded on that Mongolian comment previously, but I will do it again. :-)

For Persian I'm not sure - maybe that whole "for the ex-pats, with no contact with the country" thing came into play.

I think the fact that we don't say it's Farsi anymore was the biggest issue for them.

Those three cases where the BCP-47 tag has a country/region but the name does not -- and the fourth case involving moh-CA aka Mohawk (Mohawk) -- are, I think, just cases where the locale covers a dependent nation (not a sovereign nation), and repeating the language name instead of the country name makes people more comfortable. Just trying to be a little respectful....

And then Azarien pointed out some of the same things and some new things:

English (Caribbean) having weird code of en-029.
English (Republic of the Philippines) having native name of English (Philippines).
Cherokee (Cherokee) and Catalan (Catalan) with a non-country country name.
Persian with no country name at all.
Norwegian, Bokmål (Norway) being the only Display Name with a non-ASCII character.
And no less than 6 Sami locales.

Now the English (Philippines) vs. English (Republic of the Philippines) case is just a side effect of the English display name coming from us and the native name coming from the native reviewer.

It is a good candidate to clean up at some point -- we don't have a good reason to use the long name if the language expert doesn't think it's important! :-)

The Norwegian, Bokmål (Norway) case is a longstanding issue we've just always had, in part because no Norwegian we've ever worked with or talked to found it acceptable to use Bokmal instead.

As for that code, Doug Ewell explained it:

@Azarien: '029' is the UN M.49 code element for the Caribbean.

It's a little embarrassing: we had this English (Caribbean) locale but no ISO 3166 code for it. So someone found that 029 and we went with it.

There's probably something smarter we could have done later via BCP-47 but no one wanted to change the name again, so it stayed that way.
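
For anyone wondering how a tag like en-029 fits next to something like ko-KR, here is a rough Python sketch. It assumes a deliberately simplified version of the BCP 47 grammar (language, optional script, optional region, optional variants, and nothing else), and `split_tag` is a made-up helper name for illustration, not anything Windows actually uses. The relevant point is that the region slot accepts either two letters (ISO 3166-1) or three digits (UN M.49):

```python
import re

# Simplified BCP 47 shape: language[-script][-region][-variant...]
# Extensions, private use, extlang and grandfathered tags are NOT handled.
_TAG = re.compile(
    r"^(?P<language>[A-Za-z]{2,3})"
    r"(?:-(?P<script>[A-Za-z]{4}))?"
    # A region is 2 letters (ISO 3166-1) or 3 digits (UN M.49).
    r"(?:-(?P<region>[A-Za-z]{2}|[0-9]{3}))?"
    # A variant is 5-8 alphanumerics, or 4 starting with a digit.
    r"(?P<variants>(?:-(?:[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3}))*)$"
)


def split_tag(tag):
    """Split a simple language tag into its subtags (hypothetical helper)."""
    m = _TAG.match(tag)
    if m is None:
        raise ValueError("not a tag this simplified sketch understands: " + tag)

    region = m.group("region")
    if region is None:
        region_kind = None
    elif region.isdigit():
        region_kind = "UN M.49"
    else:
        region_kind = "ISO 3166-1"

    return {
        "language": m.group("language").lower(),
        "script": m.group("script"),
        "region": region.upper() if region else None,
        "region_kind": region_kind,
        # "-valencia".split("-") gives ["", "valencia"], so drop the first item.
        "variants": m.group("variants").split("-")[1:],
    }


if __name__ == "__main__":
    for tag in ("en-029", "ko-KR", "moh-CA", "ca-ES-valencia"):
        print(tag, split_tag(tag))
```

Under those assumptions, en-029 comes back with a region of 029 flagged as a UN M.49 code, while ko-KR and moh-CA come back with ordinary ISO 3166-1 regions.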

As for the "no less than 6 Sami locales" -- we have nine Sami locales!

Doug Ewell's other comment:

'ko-KR' should not expand to "Korean (Korea)." The name of the country is Republic of Korea, or South Korea, or whatever. Just because North Korea is isolationist and there is no localized version of Windows for it, that doesn't mean it doesn't exist. This is particularly noticeable given the indulgent "Bolivarian Republic of Venezuela."

Since we don't have North Korea or DPRK on the list, the usage of South Korea is not necessary -- it's the only Korea we ship to. :-)

So it is our "short name"....

I wouldn't consider the Venezuela name to be indulgent -- but someone asked us to update the English display name so we did. It's the kind of thing we could probably do for English (Philippines) at some point.

The last comment there was from Paul B.:

Great. Next, tell us the delta from Win7 to Win8 - what's new and what's changed? :-)

Now that would have to be a whole new blog, some other day!

I was disappointed no one asked about the difference between the display name and native name of Macedonia, but maybe people knew I already covered it....

John Cowan on 23 Aug 2012 8:53 AM:

en-029 is the totally correct BCP 47 language tag for Caribbean English.

Michael S. Kaplan on 23 Aug 2012 3:41 PM:

Good, less work for me. :-)

Joan Montané on 23 Aug 2012 3:42 PM:

Just one comment,

why is the display name of "ca-ES-valencia" "Valencian"? It's a Catalan language variant, so "Catalan (Valencian)" or "Catalan (Valencian, Spain)" would be a better choice. In the Linux world "Catalan (Valencian)" is used for this locale, :)

So, it's strange we have

Catalan (Catalan)

Basque (Basque)

Galician (Galician)


Valencian (Spain)?!!!!

It's really strange, :)))

Michael S. Kaplan on 23 Aug 2012 3:47 PM:

A lot of conversations led to what we ended up with. Language politics in Spain can be rather complicated at times....
