What's in a name?

by Michael S. Kaplan, published on 2008/08/08 10:01 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2008/08/08/8842654.aspx

One of the core tenets of globalization and localizability of software is that making assumptions in formatting information will lead to bugs and limitations that will keep people in other cultures from properly using the software.

There are two sides to this.

On the globalization side, there is (for example) the formatting of numbers, dates, and times. There is the sorting of lists, and so on.

On the localizability side, there are (for example) assumptions about word order in inserts that would violate the grammar of the target language (leading in many cases to grammatically poor sentences in the target language in order to accommodate the badly placed inserts).

Then there are examples that actually span both globalization and localizability, like the names of people.

I can't imagine what people do when they have to enter their name in an online form that insists on a name that is made of a single word first name, possibly a middle initial, and a single word last name -- none containing any punctuation.

Right in Windows International we have many examples of names that violate such simplistic rules (rules which, though easing the complexity of software development and database storage, blithely ignore the reality of names throughout the world).

Take for example Group Manager Jan Roelof Falkena.

His last name is Falkena.

Now in Jan Roelof's own words, "The use of double names (without hyphens) is fairly common back home."

Thus his first name is not merely Jan, any more than Captain Jean-Luc Picard's first name is Jean. Putting Jan R. Falkena in such a form would be ridiculous, and not at all how his parents or he would have wanted his name expressed.

Or take Test Lead Gerardo Villarreal Guzman.

His first name is Gerardo.

His last name is derived half from his father's name (Villarreal) and half from his mother's (Guzman). The hyphen is not used between these two halves, and the name itself becomes an interesting symbol of what singer/songwriter Gavin DeGraw referred to as "the birth of two souls in one". Which in my opinion is actually kind of a nice thing, culturally speaking.

Now coming to the USA and knowing how inflexible so many processes are about names, he might easily have been willing to simply go by Gerardo Villarreal and save himself the grief (that is, for example, his name on Facebook), though the fact that Gerardo Villarreal Guzman is the name on his passport made that much more problematic for the company address book and other such places.

To extend this a little bit, Gerardo Villarreal Guzman is married to Hortensia Ortiz Roffe.

Their children are David Villarreal Ortiz and Paola Villarreal Ortiz.

Now the dropping of the maternally derived surnames from both parents' names is common and, if you think about it, is one of the only ways to really scale names across many generations, as I am sure neither David Villarreal Guzman Ortiz Roffe nor Paola Villarreal Guzman Ortiz Roffe would be terribly happy having to fill out forms with their names in them! :-)
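The surname rule described above can be sketched in a few lines of Python. This is only an illustration of the convention as stated in this post, not a general model of Spanish-language naming; the `Surnames` type and `child_surnames` function are hypothetical names introduced here.

```python
from typing import NamedTuple


class Surnames(NamedTuple):
    """A two-part surname: the half from the father, then the half from the mother."""
    paternal: str
    maternal: str


def child_surnames(father: Surnames, mother: Surnames) -> Surnames:
    # Each parent passes on only the paternally derived half of their own
    # surname; the maternally derived halves are dropped.
    return Surnames(paternal=father.paternal, maternal=mother.paternal)


gerardo = Surnames("Villarreal", "Guzman")
hortensia = Surnames("Ortiz", "Roffe")

children = child_surnames(gerardo, hortensia)
print(" ".join(children))  # Villarreal Ortiz
```

Note that the surname is stored as two separate parts rather than split out of one string on whitespace, since a form that guesses the split is making exactly the kind of assumption this post complains about.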

Though interestingly, when the names are more well-known due to political or economic or cultural influences, the full name is sometimes retained, and in that case hyphenated -- thus if Gerardo were famous his children might be David and Paola Villarreal-Guzman Ortiz, or alternatively, if Hortensia were famous, their names might be David and Paola Villarreal Ortiz-Roffe.

Though one could take such a practice with a cynical eye and look at it as a form of snobbery, I'd rather give such a practice a more culturally kind eye and look at it as just remembering identities that could have unique significance to others in the future.

Even the other names mentioned above run into trouble, from the fictional hyphenated French name Jean-Luc Picard (who would have to deal with the indignity of the Risa planetary computer system not allowing the hyphen) to the singer/songwriter Gavin DeGraw (who might sometimes be forced to go by Degraw due to a system not remembering the case of letters in the name -- which sucks -- or worse, titlecasing -- which also sucks).
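A minimal Python sketch of the titlecasing problem, assuming a form that "normalizes" whatever the user typed with the built-in `str.title()`. The helper name `naive_normalize` is hypothetical and stands in for any such cleanup step.

```python
def naive_normalize(name: str) -> str:
    """Hypothetical 'cleanup' step: titlecase whatever the user typed."""
    return name.title()


names = ["Gavin DeGraw", "Gé van Gasteren", "Jean-Luc Picard"]

# Collect the names that do not survive the round trip unchanged.
mangled = [n for n in names if naive_normalize(n) != n]

for n in mangled:
    print(f"{n!r} became {naive_normalize(n)!r}")
```

Here "Gavin DeGraw" comes back as "Gavin Degraw" and "Gé van Gasteren" as "Gé Van Gasteren". The hyphenated name happens to survive titlecasing, but it would still be rejected by a form that forbids punctuation.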

And then there are readers of this blog like Gé van Gasteren and Jeroen Ruigrok van der Werven, both having names that would confound these systems.

Or the way Japanese names are usually given in the form <family name> <given name>, well, other than the imperial family.

Or the different practices used in North and South India (the latter often not including a surname).

The list could go on for hours -- I could have even included more specific examples like I did with Gerardo and Jan Roelof if I had more time to ask people for permission to "use" their names for more extended analysis.

The fact is, the simplified structure of names "used in the United States" is kind of a lie anyway since many of these people live in the US.

And thus this issue, while falling under the theoretical heading of a localizability issue, is probably better thought of as one that is important independent of the need to prepare for localization, since this flexibility is required even in products that are not being localized, or in non-localized versions of products.

Though it is also important in localization, so that localizers can reposition controls to meet the most common expectations for a target language.

Which I guess gets back to answering the question What's in a name?

Respect, or the lack thereof....

This blog brought to you by ㍻ (U+337b, aka SQUARE ERA NAME HEISEI)

# Licia on 8 Aug 2008 12:32 PM:

Another common American assumption is that phone systems in all countries have area codes or city codes (e.g. Control Panel > Phone and Modem Options).

In most European countries there is no longer any distinction between local calls and long distance calls, and younger generations might not even know what an area code was!

So insisting on having a separate box for area code is definitely not a good globalization practice (at least it should be greyed out according to country settings).

# Michael S. Kaplan on 8 Aug 2008 5:32 PM:

Phone numbers are actually a whole 'nuther problem for a whole host of reasons, worthy of their own blog! :-)

# Carl on 9 Aug 2008 8:55 PM:

It seems like the solution is to have two boxes like so:

Full name (ex. John Quincy Public): _____________

Respectful address (ex. Dr. Public): _____________

In which the Japanese school teacher TANAKA Ichirō could write:



(Ordinary people are 様.)

And German professors could write:

Hans-Georg Gadamer

Herr Professor Doktor Gadamer


Transgender people could make up something between Mr. and Ms. to suit themselves, etc. There are lots of benefits to such a system.

The real information we want is full name and title of respect. The rest is just a fancy way of guessing how those two are linked. After the person writes their full name, use that to auto-guess what their title of respect is (last word plus Mr. or Ms.), but let them correct it if we guess wrong. Makes sense to me.
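The auto-guess proposed in this comment could look roughly like the Python sketch below. The function names and the `user_override` parameter are hypothetical; the guess itself is the "last word plus honorific" rule from the comment, which is only a default for the person to correct.

```python
from typing import Optional


def guess_respectful_address(full_name: str, honorific: str = "Mr.") -> str:
    """Guess a respectful address as honorific + last word of the full name."""
    words = full_name.split()
    return f"{honorific} {words[-1]}" if words else ""


def respectful_address(full_name: str,
                       user_override: Optional[str] = None,
                       honorific: str = "Mr.") -> str:
    # Whatever the person typed always wins; the guess is only a suggestion.
    return user_override or guess_respectful_address(full_name, honorific)


print(guess_respectful_address("John Quincy Public", "Dr."))  # Dr. Public
print(respectful_address("Hans-Georg Gadamer"))               # Mr. Gadamer
print(respectful_address("Hans-Georg Gadamer",
                         user_override="Herr Professor Doktor Gadamer"))
```

The guess is wrong for any name whose last word is not the whole family name (a two-part surname, or a family name written first), which is why the second box has to stay editable.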

# Michael S. Kaplan on 9 Aug 2008 9:16 PM:

Sorting is often done by last name, which would require the separation point to be known, so I think you would need to specify given name from family name....

# K on 10 Aug 2008 8:29 AM:

Another thing that US forms assume is that my address requires both a city and a state...

In my country the zip number is unique, so the state is not required at all.

# Remco Gerlich on 10 Aug 2008 2:34 PM:

In the Netherlands, family names like "van Nistelrooy", or "van der Sterren" are very common. Although they start with "van", they sort as "Nistelrooy, van" and "Sterren, van der". So in Dutch web forms, you usually have to enter your first name, "inbetween", and family name.

Then you suddenly have to translate the form strings to English... We went with "infix" in the end. Does that make sense?
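A rough Python sketch of the three-field approach described in this comment, sorting on the family name and ignoring the infix until it is needed as a tie-breaker. The `DutchName` class and its method names are hypothetical, and `casefold()` is a simplification: a real system would use locale-aware collation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DutchName:
    given: str   # may be empty when only the family name is known
    infix: str   # "van", "van der", or "" when there is none
    family: str

    def sort_key(self):
        # Sort on the family name first; the infix only breaks ties.
        return (self.family.casefold(), self.infix.casefold(), self.given.casefold())

    def index_form(self) -> str:
        # The form used in sorted lists: "Nistelrooy, van"
        return f"{self.family}, {self.infix}" if self.infix else self.family


people = [
    DutchName("", "van der", "Sterren"),
    DutchName("", "van", "Nistelrooy"),
    DutchName("Joran", "van der", "Sloot"),
    DutchName("Gé", "van", "Gasteren"),
]

for person in sorted(people, key=DutchName.sort_key):
    print(person.index_form())
```

With the infix stored separately, the list sorts under G, N, S, S rather than putting every entry under "v".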

# Logan Capaldo on 10 Aug 2008 3:27 PM:

K, zipcodes are unique in the US as well. So likewise, from an amount-of-information standpoint, you don't need the city or the state if you have a zipcode. On the other hand, it's a lot easier to make a mistake transcribing, typing, or writing an N-digit number than it is to make mistakes on city, state and zipcode such that the address remains valid.

10 Main St.



10 Main St.

Happyville, XY


Which one is more likely to end up at the correct address, despite the fact that my finger slipped and I hit 3 instead of 4?
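The redundancy argument can be illustrated with a small Python sketch. Everything in it is made up for illustration: the lookup table, the zip codes, and the second town are hypothetical, and real postal data would come from the postal service.

```python
# Hypothetical lookup table mapping zip codes to (city, state).
ZIP_TO_PLACE = {
    "12345": ("Happyville", "XY"),
    "12335": ("Otherville", "XY"),
}


def check_address(city: str, state: str, zipcode: str) -> str:
    """Cross-check the redundant city/state fields against the zip code."""
    place = ZIP_TO_PLACE.get(zipcode)
    if place is None:
        return "unknown zip"
    if place != (city, state):
        return "mismatch"
    return "ok"


# Intended zip is 12345; the typist hit 3 instead of 4.
print(check_address("Happyville", "XY", "12335"))  # mismatch
print(check_address("Happyville", "XY", "12345"))  # ok
```

With only the zip code on the envelope, the slipped digit is a perfectly valid address somewhere else; with city and state included, the disagreement is at least detectable.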

# Handyman on 10 Aug 2008 3:54 PM:

There's another interesting phenomenon with names like "Gé van Gasteren": the capitalization is different depending on whether the first name (or initial) is present or not. Generally speaking, prepositions and other "small words" (such as "van" = of, "van der" = of the, "in het" = in the) are not capitalized, unless they are the first part of the name that is mentioned. So it would be Gé van Gasteren, but Mr. Van Gasteren. And it would be Joran van der Sloot, but Mr. Van der Sloot. Let's see these systems handle that! Anyway, not even Dutch systems handle this correctly AFAIK. At least not systems that I worked on -- they send letters to Mr. G. Van Gasteren, while it should be Mr. G. van Gasteren.

It becomes even more interesting with married names BTW. In The Netherlands, it is the tradition (although you see this less and less) that a married woman takes on her husband's name and optionally keeps her own name as well, separated by a hyphen, e.g., Maria van Husbandsname-van Maidenname. I'm sure that the "van" in "van Husbandsname" should not be capitalized, but the other one I'm not sure about. I'm especially uncertain if it's going to be Mrs. Van Husbandsname-van Maidenname or Mrs. Van Husbandsname-Van Maidenname (although I expect it's the former). Couldn't find anything about this in online sources either...
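The basic capitalization rule from the first paragraph of this comment (not the married-name case, which the comment itself leaves open) could be sketched in Python as follows. The function names and parameters are hypothetical.

```python
def capitalize_first(text: str) -> str:
    """Uppercase only the first character, leaving the rest untouched."""
    return text[:1].upper() + text[1:]


def full_name(given: str, infix: str, family: str) -> str:
    # A given name precedes the infix, so the infix stays lowercase.
    return " ".join(part for part in (given, infix, family) if part)


def formal_name(title: str, infix: str, family: str, initials: str = "") -> str:
    if initials:
        # An initial precedes the infix, so the infix stays lowercase.
        return " ".join(part for part in (title, initials, infix, family) if part)
    # The infix is the first part of the name mentioned: capitalize its first word.
    surname = capitalize_first(f"{infix} {family}".strip())
    return f"{title} {surname}"


print(full_name("Gé", "van", "Gasteren"))                    # Gé van Gasteren
print(formal_name("Mr.", "van", "Gasteren"))                 # Mr. Van Gasteren
print(formal_name("Mr.", "van der", "Sloot"))                # Mr. Van der Sloot
print(formal_name("Mr.", "van", "Gasteren", initials="G."))  # Mr. G. van Gasteren
```

The rule only works because the infix is kept as its own lowercase field; once a system has stored "Van Gasteren" as a single string, it can no longer produce the correct "G. van Gasteren".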

# Mike Dimmick on 10 Aug 2008 4:33 PM:

Icelanders still use the patronymic system, where the 'surname' is a combination of the father's given name and +son or +dóttir (for women). However, some families decide to use the mother's name.

This means, of course, that different members of a family will normally have different 'surnames'. As far as they're concerned it's a part of their full name and you can't call them by that name alone.

(Reading FC have four Icelandic footballers: Ívar Ingimarsson, Brynjar Gunnarsson, Gylfi Sigurðsson, and Viktor Unnar Illugasson - the convention is to put a 'surname' on the back of the shirt, so their patronymic is what goes on there, but I'd be interested to find out what the Icelandic football league convention is!)

Speaking of footballers, there are some Icelanders who don't follow the convention, and Eiður Guðjohnsen is one - his father is Arnór Guðjohnsen.

Far more information than you probably ever wanted to know at http://en.wikipedia.org/wiki/Icelandic_name.

# Alex on 10 Aug 2008 10:40 PM:

"Last Name" is confusing enough, because many Asians place the family name before the given names.

To make it doubly confusing, although Japanese names are written (in Japanese) with family name before given name, many Japanese follow western habits to spell their name in English with given name before family name. If you don't understand the language, you have a 50-50 chance of telling which part of a Japanese name (spelt in English) is the family name.

Although a Japanese family name usually has multiple characters when written in kanji, it is often written as a single run of letters to form a single English word. In contrast, not that many Chinese family names have multiple characters, but there are a number of two-character family names, and theoretically 4-character ones (when a wife adds the family name of the husband to her name). More importantly, Chinese write their family names as discrete words, so you can have two, three or four words for the family name.

Oh, zip codes. Not every country has zip codes. Mine doesn't. I ran into so many online order forms that require a zip code. Some are nice enough to accept 99999. At least one merchant lost my purchase because nothing could be done (because they were crazy enough to maintain a database of valid zip codes for each country, and failed to notice the discrepancy with the list of valid countries they maintained).

Phone numbers of course. So often I ran into forms asking for a 7 or 10 digit phone number, so I need to drop digits or pad with zeroes. Various software, including the telephone set I am currently using at home, still wants to add hyphens to the wrong places.

# int19h on 11 Aug 2008 3:50 AM:

There are a lot of cultures out there that still use patronymics, though of those, Icelanders are, IIRC, the only ones who do not bother with a family name at all.

# Robin Kay on 9 Oct 2008 11:59 AM:

I joined this blog to see how names beginning with "van" should be sorted and I have the answer, thank you.  In this case: Boven, van.

In passing I saw Mike's reference to Eiður Guðjohnsen and that he has not followed the Icelandic tradition of using 'son' after his name.

Although I speak without authority, is it not possible that the name Guðjohnsen already contains the 'son' element and that to make it Guðjohnsenson would be a bit like an English person being called Johnsonson?

Although, with their present financial travails, I guess that Icelanders have got far more pressing matters on their minds.

