Lost in [no ]translation

by Michael S. Kaplan, published on 2008/06/02 10:01 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2008/06/02/8568892.aspx

Just the other day, some localizers asked a question about web content on globalization that discussed Arabic script characters:

Question: In order to transliterate properly into the target languages, we'd like to confirm the letters in question. Could these be:

Beh -> Beth
Alef -> Alif
Ein -> Ayin

It turns out the letters they were looking at were:

ب    U+0628    ARABIC LETTER BEH

ا    U+0627    ARABIC LETTER ALEF

ع    U+0639    ARABIC LETTER AIN

Now it is true that the Unicode character names do not always match what a target language might prefer -- so one could imagine that a particular target language's localization might indeed have a different preference here.

Kind of like I mentioned here, with the real difference being that the A-Z letters tend to keep the letters themselves in their names, while other characters tend to have longer names -- names that follow one particular transliteration, even if there are multiple transliterations out there for one language or for others.

So on the surface this looks like the ideal place for the help of a localizer. :-)

However, whether to change the name during localization is something that must be carefully considered when the typical reader of the information will be a developer looking at a document about Unicode characters.

And of course when it is a big company (e.g. Microsoft), there is an extra risk if the name is localized -- someone could perceive the effort as Microsoft trying to muck with standards, since the Unicode character names are part of the standard itself. It is also an interesting source of problems if localizers choose to translate the names and their choices differ from an official translation provided by a government source in the target market!

In the end, the decision is best left in the hands of an experienced localizer if the localizer has insight into local developer expectations.

Short of that, it is probably best to either find an informant with that knowledge, or stick with the official Unicode character names -- one can always use those all-caps official names and then just blame Unicode if anyone complains about the name. :-)
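
For anyone who wants to pull those official names programmatically rather than retype them, here is a minimal sketch in Python. It assumes the standard library's unicodedata module is an acceptable source for the names; the official_name helper is a hypothetical name made up for this illustration, not anything from the original content.

```python
import unicodedata

# The three Arabic script letters discussed above, written as escapes
# so the source stays readable regardless of editor bidi handling.
letters = ["\u0628", "\u0627", "\u0639"]


def official_name(ch: str) -> str:
    """Return the official Unicode character name for one character.

    Hypothetical helper for illustration; it simply wraps
    unicodedata.name, which raises ValueError for unnamed code points.
    """
    return unicodedata.name(ch)


names = [official_name(ch) for ch in letters]

for ch, name in zip(letters, names):
    # Print the code point in the usual U+XXXX form next to its name.
    print(f"U+{ord(ch):04X}  {name}")
```

The names that come back are the all-caps ones from the standard (BEH, ALEF, AIN), so any localized transliteration would have to be layered on top by whoever owns that decision.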

This blog brought to you by the three Arabic script characters mentioned above
