by Michael S. Kaplan, published on 2006/08/22 03:01 -07:00, original URI: http://blogs.msdn.com/michkap/archive/2006/08/22/708503.aspx
(Note: the title of this blog post has incorrect Greek text in it, to help highlight a bug that will be explained later in the post!)
I had no idea when I posted Sometimes, uppercasing sucks that I'd find so many people who were unaware of how much of this sort of 'natural language processing' wasn't happening in Windows or the .NET Framework.
There is actually a more generalized problem here though.
It is the same problem that happens when a developer makes sure a font size is a hard-coded 8pt and it then has to show Chinese Han text.
Or when a developer italicizes text and makes Arabic look like crap and Japanese look really ugly.
Or when a developer hard codes the location of tokens in a string so that the localizer trying to translate it to German cannot change the word order and is forced to write German text that looks ridiculous.
Or when a developer bolds all text and makes a Tibetan string that was on the thin edge of readability look like smudges on the monitor.
Or when a developer does not allow a dialog to be mirrored and messes up the Hebrew UI version.
And yes, it happens when a developer decides to uppercase a string because they believe it is required for some type of emphasis.
The bug here is NOT in Windows or the .NET Framework in these cases. It is not a platform problem at all. It is an application problem.
By taking important decisions like font size/weight/style, case, and the position and token order of strings out of the hands of the localizers, the people most qualified to understand the requirements for acceptance in the market, developers are actually doing a crappy job on the localizability front, that is, on preparing an application so that localization can go smoothly.
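To make the token order point concrete, here is a minimal Python sketch. The function names, the message strings and the tiny `MESSAGES` catalog are all made up for illustration; a real application would load its strings from whatever resource format it actually uses. The first function freezes the word order in code, while the second leaves it to whoever writes the translated string.

```python
# Hypothetical example strings; not taken from any real product.

def hard_coded(user: str, count: int) -> str:
    # Word order is frozen in code: a translator can only swap the
    # fragments "Copied ", " files for ", and cannot move the tokens.
    return "Copied " + str(count) + " files for " + user


# A tiny stand-in for a resource catalog. Named placeholders let each
# translation put the tokens wherever the language needs them.
MESSAGES = {
    "en": "Copied {count} files for {user}",
    "de": "Für {user} wurden {count} Dateien kopiert",
}


def localized(lang: str, user: str, count: int) -> str:
    # The code supplies the values; the localizer owns the sentence.
    return MESSAGES[lang].format(user=user, count=count)


print(hard_coded("Anna", 3))
print(localized("en", "Anna", 3))
print(localized("de", "Anna", 3))
```

The German string puts the user first and the count in the middle, which the concatenating version cannot express without a code change.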
Now I am sure one day having integrated linguistic services that can automatically apply all of the specific language rules for operations like case folding without manual per-language intervention would be a wonderful internationalization feature, one that will enrich the platform tremendously.
Just as I think it would be cool if all the rules about how best to handle text rendering in all languages with regard to font attributes/styles could be captured in some internationalized "give me the font for _____" type function that looks up all the rules and applies them per market. That, once again, would enrich the platform markedly.
And just as I think it would be truly awesome if localization could happen automatically on the fly, with no need for anyone to do the actual translation work, and still have the results look correct to native speakers of the language.
But frankly, we are nowhere near any of these goals, and until we are, localizers are the key ambassadors who present the application that people working in/with another language market can use and enjoy.
So if you are not writing localizable applications (and all of these problems, including Greek or French or Dutch casing, count here), then for now it is your bug.
So please fix it.
(Note that fixing may be as simple as not making a string all caps or bolding it or italicizing it; in other words, don't potentially destroy properly crafted text for a market in an effort to emphasize it.)
As a side note, I'll point out that the Community Server skin that my blog uses, which forces titles to be ALL CAPS (and which I was only able to modify enough to make SMALL CAPS happen), is a great example of such a localizability bug, this one on Telligent Systems and Community Server, which would impact any Greek titles with diacritics in them.
The title in the editor is Ρύθμιση σήματος, and as you can see it capitalizes incorrectly according to recommended/preferred practice in Greek, coming out as ΡΎΘΜΙΣΗ ΣΉΜΑΤΟΣ rather than ΡΥΘΜΙΣΗ ΣΗΜΑΤΟΣ (with the diacritics gone).
This bug only shows up when titles are forced to be completely capitalized, and it is just the type of application bug that this post is talking about (in the end it would be a browser bug, of course, but since the particular skin I am using gives no way to turn off the ALL CAPS behavior, it becomes a Community Server bug too, given the limitations in CSS!).
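The same effect is easy to reproduce outside a browser. The sketch below uses Python only as a convenient way to show it: a default, language-neutral uppercase mapping keeps the tonos, and getting the preferred Greek form takes an extra, language-specific step. The helper `greek_upper_without_tonos` is a hypothetical name, and it is a deliberate simplification: it drops every acute accent and ignores the exceptions real Greek conventions have, so treat it as an illustration of the problem rather than a casing routine to ship.

```python
import unicodedata

# NFD decomposes accented Greek letters into base letter + U+0301.
COMBINING_ACUTE = "\u0301"


def greek_upper_without_tonos(text: str) -> str:
    """Uppercase, then drop the tonos (simplified; ignores exceptions)."""
    decomposed = unicodedata.normalize("NFD", text.upper())
    stripped = "".join(ch for ch in decomposed if ch != COMBINING_ACUTE)
    return unicodedata.normalize("NFC", stripped)


title = "Ρύθμιση σήματος"

# Default Unicode mapping: the accents survive the uppercasing.
print(title.upper())                     # ΡΎΘΜΙΣΗ ΣΉΜΑΤΟΣ

# Language-specific step: the form Greek readers expect in all caps.
print(greek_upper_without_tonos(title))  # ΡΥΘΜΙΣΗ ΣΗΜΑΤΟΣ
```

Of course, the simpler fix is the one this post argues for: do not force the title into all caps in the first place, and let the text show up the way it was written.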
Otherwise, to be honest, you should not bother localizing your application. It is way too difficult and expensive of a process to get it wrong.
In short, do it right, or don't freaking bother. :-)
This post brought to you by ಬ (U+0cac, KANNADA LETTER BA)
# Cutreda on Monday, May 05, 2008 12:37 AM:
But you are saying that this idea is bad?
# Michael S. Kaplan on Monday, May 05, 2008 1:18 AM:
Yes, it is indeed bad!