About that Portuguese localization question...

by Michael S. Kaplan, published on 2006/03/01 01:01 -08:00, original URI: http://blogs.msdn.com/michkap/archive/2006/03/01/541074.aspx


I have mentioned Portuguese before in this very blog, along with some of the many issues that come up from the fact that Microsoft does not usually localize into two different dialects of the same language (whether that language is English, Spanish, Arabic, or Italian).

Anyway, I was installing the Server 2003 MUI language packs the other day, and I noticed in the list of languages two varieties of Portuguese were present:

Ok, no worries -- many people feel that the difference between the two is much greater than the difference between US English and UK English, or between Spanish in Spain and Mexico. In other words, it reportedly goes above the level of being annoying/distracting and into the realm where mutual intelligibility is in question.

Now one could argue the same issue exists in Arabic as spoken in Morocco and Saudi Arabia. And one would even be correct, except for the fact that this ignores a more 'formal' mode of Arabic that is apparently accepted across many Arabic locales, one that does not delve too deeply into the dialectal differences.

Certainly there are standardization bodies in both Canada and France that each have strong (sometimes conflicting) opinions about the way French should be spoken. I am fairly convinced (after having had translations from both locations mistaken for their source by some of the 'experts' in the respective dialects) that the differences in French have a lot less to do with the language than those who are passionate on the topic will readily admit.

But Portuguese really is different -- historically, it is obvious that Brazilian Portuguese is a child of Iberian (European) Portuguese, but just as obviously it has gone through many subtle and not-so-subtle changes. And that Brazilian market is much bigger, too (a factor that a software company cannot really ignore!).

Anyway, when I went to install those language packs, I was somewhat bemused to see the dialog come up when the Brazilian Portuguese language pack was being installed:

compared to when the Iberian Portuguese one was being installed:

Now these two names are definitely not coming from the NLS data -- it would have been interesting to look at the localized strings for this piece of setup, had this UI been localized (unfortunately, it is not). This is all in the ProductName field anyway -- someone just decided to put them out there this way, even though other places in Windows don't seem to do it this way.

And I wonder in any case whether this was someone being clever on the English version or if it was with feedback from the language people?

Anyway, once it is installed let's look at how they are listed in the MUI language list:

Wow, at least they are using NLS again. They are picking up the LOCALE_SNATIVELANGNAME, with Brazilian Portuguese (0x0416) calling it Português and Iberian Portuguese (0x0816) calling it português.
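
For anyone curious how those two identifiers relate to each other, here is a minimal sketch in Python that splits each LCID into its primary language and sublanguage parts. It follows the bit layout that the PRIMARYLANGID and SUBLANGID macros in the Windows headers use; the helper name `split_lcid` is made up for this illustration and is not a Windows API.

```python
def split_lcid(lcid: int) -> tuple[int, int]:
    """Split an LCID into (primary language ID, sublanguage ID).

    The low 16 bits of an LCID hold the LANGID. Within the LANGID, the
    low 10 bits are the primary language and the top 6 bits are the
    sublanguage.
    """
    langid = lcid & 0xFFFF      # drop the sort ID bits above bit 15
    primary = langid & 0x3FF    # same masking as PRIMARYLANGID
    sublang = langid >> 10      # same shift as SUBLANGID
    return primary, sublang


# The two LCIDs mentioned above; split_lcid is a hypothetical helper.
for label, lcid in [("Brazilian Portuguese", 0x0416),
                    ("Iberian Portuguese", 0x0816)]:
    primary, sublang = split_lcid(lcid)
    print(f"{label}: LCID 0x{lcid:04X} -> "
          f"primary 0x{primary:02X}, sublanguage 0x{sublang:02X}")
```

Both values share the primary language 0x16 and differ only in the sublanguage (1 versus 2), which is why the native language name is the only other thing left to tell them apart in that list.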

Of course, this seems like a poor distinguishing feature to me, even if it does truly represent the preference in each locale. If you are a native speaker, doesn't this seem like a little bit of a mind game to guess which one to pick? And if that preference changes, what happens to the items in this list? Could you end up with two items that are identical? And could that lead to a bug where the first one is always selected and the Continental variety gets the shaft?
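
To make that last worry concrete, here is a hypothetical sketch in Python of a list that resolves the selected entry by its display string. Nothing here is taken from the actual MUI code; `find_by_name` and the sample data are assumptions made purely to show how a first-match lookup behaves once two entries share a name.

```python
# Hypothetical language list: (display name, LCID) pairs.
# The two names differ only in the case of the first letter.
languages = [("Português", 0x0416), ("português", 0x0816)]


def find_by_name(entries, display_name):
    """Return the LCID of the first entry whose display name matches."""
    for name, lcid in entries:
        if name == display_name:
            return lcid
    return None


# While the names differ, each one resolves to its own locale.
print(hex(find_by_name(languages, "Português")))
print(hex(find_by_name(languages, "português")))

# If both locales ever reported the same native name, a name-based
# lookup could only ever reach the first of the two entries.
identical = [("português", 0x0416), ("português", 0x0816)]
print(hex(find_by_name(identical, "português")))
```

In the last call the lookup returns 0x0416, and there is no string the user could pick that would lead to 0x0816. A list that tracked entries by LCID or by index rather than by name would not have this problem, which is the kind of thing an experiment would need to confirm.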

Maybe I should try this out in Vista at some point with a replacement locale that makes one's SNATIVELANGNAME look like the other's and see what happens....

(don't worry, some day soon I'll be posting a whole bunch about replacement locales and some of the more compelling scenarios!)

Anyway, in speaking to people from both Mexico and Spain (as well as conversations with people from other Spanish-speaking countries), the differences between the various Spanish dialects can sometimes be quite marked. It has been a source of some bitterness among the more pure Castellano speakers that so much of the Spanish product appears to be produced somewhere other than Spain.

So what is the actual difference, the one that makes it compelling to ship two versions of Portuguese but only one version of some of these other languages?

From those conversations I will hazard a semi-educated guess. :-)

I think it has more to do with what markets will tolerate -- the differences in English or Spanish or Italian are certainly noticeable, but the dialects are mutually intelligible. And in the case of Arabic the attempt to try to stay as neutral of dialect as possible may make a product feel less "local" to the inhabitants of any one country but more likely to be understood across a wider area.

So a real study of this would have to be focused on the actual differences between the various languages. According to Wikipedia, phrase- and sentence-level stress and tone patterns differ significantly between dialects: European Portuguese is often described as a stress-timed language (consistent with its loss of pre-stress vowels), while Brazilian Portuguese is syllable-timed.

If true, then this is obviously a much more significant type of difference than those in, for example, English (which seem to be more focused on word choices and word spellings), or in French (where, as I said, I have been able to unintentionally fool people in Quebec about translations done in France and vice versa). And in situations like those with the Arabic language (where the choice seems to be more focused on literacy vs. illiteracy than on capturing the dialectal variations between populations), it seems much more natural to simply go with what one might consider to be the Arabic language equivalent of BBC Standard Received (you know, pure, cultured, etc.).

Or perhaps we have a new standard of degree of language difference that spans linguistic, geopolitical, and other concerns. And we can alter Max Weinreich's famous quote (A shprakh iz a dialekt mit an armey un a flot) and say that a language is a dialect with its own localized version of Windows.

Boy, if ever the first paragraph of my disclaimer were to apply, it would be just about now, huh?

But in my opinion, if Microsoft as a company really is trying to give people local experiences with Windows, we need to work harder to target localized versions to a finer level than we do currently.

An Aussie and a Yank and a Brit may be able to understand each other, but that does not mean it is a comfortable local experience to force that difference on them on a full-time basis....

 

This post brought to you by "ڦ" (U+06a6, a.k.a. ARABIC LETTER PEHEH)


# Eric Duran on Wednesday, March 01, 2006 8:03 AM:

I didn't know Italian had many different locales.
Where else in the world is Italian spoken (and I mean which country with more than a million people)?
What are the differences between these "Italians"?

# Michael S. Kaplan on Wednesday, March 01, 2006 9:02 AM:

Italian is also spoken in Switzerland, and it is definitely a dialect compared to the mother language in Italy (as are French and German in Switzerland). The languages are recognizable but are definitely different....

# Heath Stewart on Wednesday, March 01, 2006 12:26 PM:

Windows Installer isn't 100% MUI friendly. Yes, the packages likely were not localized, but not all strings come from the package either; some come from Windows Installer itself in certain scenarios.

What should be the practice for Windows Installer is that the UI is in whatever language the user prefers, the idea being that if I'm an IT admin installing products for users or making administrative installations (basically extracting files for later installation on client machines via GPO or some other tools) and get prompted with a question in, say, JPN because I'm installing VS 2005 JPN, I'm not going to be able to answer that question (more than likely).

This is a problem that DevDiv wrestles with and has created custom solutions for, such as external UI handlers and "MUIzed" patches with custom actions. It is a pain.

# Michael S. Kaplan on Wednesday, March 01, 2006 12:38 PM:

Hi Heath,

I realized after writing this that most of the UI language strings (if they did exist) would be wasted for most customers since the UI language could not be changed until the UI language was actually installed -- which is the point of this particular set of packages!

# OrthoFR on Wednesday, March 01, 2006 2:43 PM:

Hi Michael,

Very interesting post.

As a matter of fact, there are two very distinct spellers for European Portuguese and Brazilian Portuguese (idem for word-breakers used for instance in SharePoint). There are many differences between the two languages as far as accents are concerned, but also for clitic pronouns and for instance for the formation of diminutives.

Compare the European Portuguese (EP) and Brazilian Portuguese (BP) versions:

EP: acção
BP: ação

EP: diminutives in -ito are found next to the diminutives in -inho
BP: diminutives in -inho only

EP: doar (give, donate): 1st person sing: doo
BP: 1st person sing: dôo

EP: abdómen
BP: abdômen

So it's not just a question of market size, they are really considered as two different language varieties (there are words which you would really expect to accept in one version and to red-squiggle in another version if you are using a spell-checker). You are right when you say that differences between European French and Canadian French are less visible. From a phonological point of view, the two varieties of French may be very different, but reading a text in a Canadian newspaper and in a French (or Belgian) newspaper does not reveal lots of syntactic differences; the morphology is the same; of course, there will be words that are used exclusively in Canada (and vice-versa), but that would in general not justify the creation of two distinct spell-checkers, for instance (there may be some aspects that are dealt with through the use of settings, such as enforcing accented capitals for the Canadian market (École), when the European French version accepts both accented and unaccented capitals (École or Ecole)).

I hope this helps,

Thierry

Thierry Fontenelle
Microsoft Speech & Natural Language Group

# Michael S. Kaplan on Wednesday, March 01, 2006 2:51 PM:

Of course, there are separate modules used for UK vs. US spellchecking,  for much the same reasons. Yet it does not manifest as a new SKU with a separate localization (or should I say localisation in this context?).

# Alun Jones on Wednesday, March 01, 2006 3:40 PM:

There are similar differences between American and Commonwealth English.  [Your comment about Received Pronunciation suggests that Commonwealth English should have been the default used by Microsoft in the beginning :-)]

Endings: -ise vs. -ize
Spellings: colour vs. color
Different words: bonnet vs. hood
                 boot vs. trunk

This morning I went into the hardware store, and while my command of American is usually fairly good, it's here (and the car repair shop) that I find the language gap is huge and occasionally insurmountable.

Instead of asking for an adjustable spanner, I have to ask for a monkey wrench.  I've given up, now, and simply ask for things with a combination of small words and hand gestures.  If that doesn't make it close to a foreign language, I don't know what does!

# Heath Stewart on Thursday, March 02, 2006 2:29 AM:

Actually, the UI strings are in the MSI itself and independent of the content being installed. As you may remember from your past experience, Windows Installer doesn't officially support Unicode (support is incomplete, actually) but it does support code pages so displaying different languages is not a problem. Whoever created these language packs didn't localize the Error table in the MSI, which is where the strings in question come from.

Sounds like it's time for me to tackle a few posts on localization in MSI, and what DevDiv does for language-neutral products like .NET to add MUI support ourselves through custom actions.

# Michael S. Kaplan on Thursday, March 02, 2006 7:08 AM:

I get that part. But the UI language of Windows controls which lang is shown, and these particular setups *install* the lang packs.

So the only time you would ever see Portuguese strings is on uninstall! On install, it is not yet a UI lang so it is not an option....

# Eric Duran on Friday, March 03, 2006 8:59 AM:

If it is true that there is only one Spanish localization for Windows, then I'm glad to know that the Spaniards have to deal with the words I used to see in my Spanish copy of Windows (95).
Although I really can't imagine how they stand having to read "computadora" instead of "ordenador" and "archivo" instead of "fichero"... I, for sure, would have cried and screamed (and possibly refused to work with) if forced to work with a Windows version displaying the Spaniard-equivalent words...

referenced by

2011/01/30 Even in India, the language is actually known as Bangla (not Bengali)

2010/08/07 A way to say "this is who I am, where I am, what I think is wrong, and why" ?

2010/06/09 About that Portuguese localization question, redux...

2008/07/11 What kind of English were you looking for? We only seem to have one in stock....

2006/10/13 Local experiences in Norway: the Nynorsk LIP!

2006/09/12 They speak English in other places, too

2006/09/06 IsSortable() == false? Well, sometimes it may be lying....

2006/07/12 Device fonts are people too
