About that Portuguese localization question, redux...

by Michael S. Kaplan, published on 2010/06/09 07:01 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2010/06/09/10019823.aspx


It has been over four years since I wrote the About that Portuguese localization question... blog, and in that time I have gotten several emails from people who work for Microsoft and people who used to work for Microsoft, who helped me fill in additional details beyond the ones that Thierry provided in a comment to that blog:

Hi Michael,

Very interesting post.

As a matter of fact, there are two very distinct spellers for European Portuguese and Brazilian Portuguese (idem for word-breakers used for instance in SharePoint). There are many differences between the two languages as far as accents are concerned, but also for clitic pronouns and for instance for the formation of diminutives.

Compare the European Portuguese (EP) and Brazilian Portuguese (BP) versions:

EP: acção
BP: ação

EP: diminutives in -ito are found next to the diminutives in -inho
BP: diminutives in -inho only

EP: doar (give, donate): 1st person sing: doo
BP: 1st person sing: dôo

EP: abdómen
BP: abdômen

So it's not just a question of market size, they are really considered as two different language varieties (there are words which you would really expect to accept in one version and to red-squiggle in another version if you are using a spell-checker). You are right when you say that differences between European French and Canadian French are less visible. From a phonological point of view, the two varieties of French may be very different, but reading a text in a Canadian newspaper and in a French (or Belgian) newspaper does not reveal lots of syntactic differences; the morphology is the same; of course, there will be words that are used exclusively in Canada (and vice-versa), but that would in general not justify the creation of two distinct spell-checkers, for instance (there may be some aspects that are dealt with through the use of settings, such as enforcing accented capitals for the Canadian market (École), when the European French version accepts both accented and unaccented capitals (École or Ecole)).

I hope this helps,

Thierry

Thierry Fontenelle
Microsoft Speech & Natural Language Group

Since I have notes on so many of these conversations, I thought it would make sense to write them up as a follow-up blog, along with some additional information I picked up by looking into the issue on my own.

I'll start by saying the same thing I noted originally: the separate way Brazilian Portuguese and European Portuguese are handled (each getting its own localized version for most products), as opposed to other multi-country languages like English, Arabic, Spanish, Italian, and German, is the issue that I have always found interesting. Or perhaps I should say concerning. :-)

Several of the comments people had related to this.

What I found out was that originally only one Portuguese localization was done. And since we are talking about over 150 million speakers in Brazil versus under ten million speakers in Portugal, it made obvious sense (if one were choosing a single localized version) to choose Brazilian Portuguese.

However, due to the many differences in morphology, spelling, language syntax, accent usage, clitic pronouns, formation of diminutives, and terminology (particularly technology terms commonly required in software -- between 40% and 80% different!), this was creating huge problems in Portugal:

all of which was reportedly leading to between 65% and 75% of the market in Portugal preferring the English (or less frequently even the Spanish!) version to the Portuguese one.

Of course there is an implicit recognition of the fact that people doing this might still not be entirely satisfied with their [now incorrect language] product, and may have problems understanding the English product in some cases anyway. Overall this was obviously a less than delighted group of up to ten million potential customers of Microsoft products....

Starting with Windows 95, most products began including both the Brazilian and European localized versions in order to try to reduce that negative trend and assist the subsidiary in Portugal. And by report that trend has indeed been reversed, with up to 70-75% of users in Portugal preferring the European Portuguese version (and almost none choosing the Brazilian Portuguese version). When one considers how long it can take to get people to go back to a product that has "disappointed" them in this way previously (Russian is a striking example of this "burned child fears the fire" phenomenon), the reversal is remarkable!

With this kind of information to draw on, the seemingly arbitrary nature of this being done for Portuguese while not being done for some other languages seems a lot less arbitrary -- because generally for the other languages the trend of not using a particular localization is considered to be not nearly as severe.

The work to study the issue and generate the numbers was (I would guess) most likely done by the subsidiary in Portugal in response to the trends they were seeing. And by extension the same kind of work would probably (I would again guess) have to happen in any other location seeing analogous problems.

Another interesting point that was raised was on the source of differences between the European and Brazilian Portuguese. Although there were many references, the one that seemed most compelling to me was this one (the wider site has more information on Portuguese in general, but this particular article focuses on Brazilian Portuguese):

3.3.a. The History of the language in Brazil

When Portugal first colonized Brazil, a process that began with discovery in the year 1500, Tupi, or more precisely the Tupinambá, one of the languages of the Tupi-Guarani family spoken by Indians who lived on the Brazilian seacoast, was used along with Portuguese as the general language of the colony. This was primarily because the Jesuit priests studied and taught the Tupi language. In 1757, Tupi was banned by royal decree, although the language had already been overwhelmed by Portuguese spoken by the large number of immigrants from the mother country. When the Jesuits were expelled in 1759, Portuguese became the language of the country. However, Portuguese inherited words associated with flora and fauna from indigenous languages. Among these words were abacaxi (pineapple), mandioca (manioc flour), caju (cashew), tatu (armadillo), and piranha, the voracious fish, as well as proper and geographic names.

The Portuguese language in Brazil received a new source of contributions with the influx of African slaves. The African influence came primarily from the Iorubá spoken by slaves from Nigeria. Some of these words also found their way to Europe. Iorubá contributions derived from words connected with religion and the Afro-Brazilian cuisine. From the Angolan Quimbundo language came words such as caçula, meaning the youngest child, moleque (a street child), and the dance samba.

During the 18th century, other differences between the American and European Portuguese developed. At that time Brazilian Portuguese failed to adopt linguistic changes taking place in Portugal produced by French influence. The Brazilian Portuguese remained loyal to the pronunciation used at the time of its discovery. However, when Don João, the Portuguese king, and the royal entourage took refuge in Brazil in 1808 (when Napoleon Bonaparte invaded Portugal), his presence helped to reapproximate the Portuguese spoken in the cities to the Portuguese of Portugal.

After Brazilian independence in 1822, Brazilian Portuguese became influenced by Europeans who had migrated to the central and southern parts of the country. This is the reason one finds in those areas variations in pronunciation and a few superficial lexical changes. These changes reflect the nationalities settling in each area.

In the 20th century, the split between the Portuguese and Brazilian variants of Portuguese heightened as the result of new words for technological innovations. This happened because Portuguese lacked a uniform procedure for adopting such words. Certain words took different forms in different countries. For example: in Portugal one hears comboio, and in Brazil one hears trem, both meaning train. Autocarro in Portugal is the same thing as ônibus in Brazil, both meaning bus. At the beginning of this century, the nationalism and the individualism of the Romantic movement began promoting the creation of a language norm based on the Brazilian version of the Portuguese language. In 1922, the Modernists reintroduced this argument, promoting a need to break with traditional Portuguese models and to adopt the Brazilian speech pattern. This opening by the Modernists led to the successful adoption of the Brazilian norm in literature.

Very interesting! Although there are almost certainly similar ways one could tell the tale for other languages, this summary of various influences fascinated me, and I haven't seen such a concise description for other languages as of yet.

Now this does leave the question of Portuguese in other countries, like those in Africa and Asia, a subject on which there was a lot less information as to which they would prefer between European and Brazilian Portuguese (or if they would prefer another language entirely). But the population of Portuguese speakers in these countries who would find either variety unusable is, at least reportedly, not nearly as large (the number of speakers in Angola and Mozambique combined is 2-3 times greater than the number in Portugal, though apparently there is not much of a clamor to add additional localizations, or even locales for that matter).

Though in thinking about this, it did remind me of a fun scene from Season 5 of The West Wing, where Mr. Minaldi, the interpreter from the State Department, explains that there's no such language as Indonesian, since Indonesians speak 583 different languages. Mr. Minaldi speaks English, Portuguese, and Javanese, while the person they want to talk to (Bambang Sumahidjo) speaks Batak. Thankfully, Donna found a cook in the kitchen who knew Batak and Portuguese, so through multiple links they would be able to get the message across. This was obviously going to be a painful game of "telephone", so when they found out that Bambang Sumahidjo spoke English there was a sigh of relief!

Now the scene became less amusing as I learned more about it from friends and people I talked to later, and then I got bogged down in the inconsistencies:

Clearly the more you learn, the less funny some jokes become.

Back to the topic: there were many other examples of spelling and grammatical issues like Thierry's, particularly related to technology terms, which I won't quote here because they are rather lengthy.

The final point related to names that are really not preferred:

The word "Iberian" kind of sits somewhere akin to "FAR EAST" for East Asia, and its presence just becomes less and less common internally as time goes on. Everyone's life is easier if they do not have to remember both internal and external rules, after all!

Now that I am done I'll admit that I should have written this blog earlier; each time another contact happened I got all energized and started writing, then something else pulled me away. Finally getting it taken care of feels like such a relief!

And I have a much greater respect for and understanding of the situation for Portuguese (as well as a curiosity about how much it might apply in other situations and languages if it isn't as widely reported or surfaced!)....


Vitor on 9 Jun 2010 10:00 AM:

It's worth noting that Brazil and Portugal (alongside other countries from the Community of Portuguese Language Countries - CPLP) agreed on an orthographic agreement in 1990 (en.wikipedia.org/.../Portuguese_Language_Orthographic_Agreement_of_1990), with the agreement coming into effect this or next year.

Although it is intelligible, as a Brazilian Portuguese speaker I find European Portuguese harder to read than English.

John Cowan on 9 Jun 2010 11:54 AM:

Note that Azorean Portuguese definitely belongs to the right-hand side of the split, and while it's reasonable to say the Azores are in Europe, it's not reasonable to say they are in Iberia.

Michael S. Kaplan on 9 Jun 2010 2:24 PM:

Indeed John -- yet another reason for that terminology preference -- sufficiency!

davis on 11 Jun 2010 4:34 PM:

Not very important but that West Wing episode is Season 1, Episode 7


referenced by

2011/07/07 The Locales of Windows 7, divvied up further

2011/01/30 Even in India, the language is actually known as Bangla (not Bengali)

2010/08/18 The Portguese version. No, the other Portuguese version...

2010/08/07 A way to say "this is who I am, where I am, what I think is wrong, and why" ?
