A localizability problem is an application bug, or alternately: Ρύθμιση σήματος

by Michael S. Kaplan, published on 2006/08/22 03:01 -07:00, original URI: http://blogs.msdn.com/michkap/archive/2006/08/22/708503.aspx


(Note: the title of this blog post has incorrect Greek text in it, to help highlight a bug that will be explained later in the post!)

I had no idea when I posted "Sometimes, uppercasing sucks" that I'd find so many people who were unaware of how much of this sort of 'natural language processing' wasn't happening in Windows or the .NET Framework.

There is actually a more generalized problem here though.

It is the same problem that happens when a developer hard-codes a font size of 8pt and the text then has to show Chinese Han characters.

Or when a developer italicizes text and makes Arabic look like crap and Japanese look really ugly.

Or when a developer hard-codes the location of tokens in a string so that the localizer trying to translate it to German cannot change the word order and is forced to write German text that looks ridiculous.

Or when a developer bolds all text and makes a Tibetan string that was on the thin edge of readability look like smudges on the monitor.

Or when a developer does not allow a dialog to be mirrored and messes up the Hebrew UI version.

And yes, it happens when a developer decides to uppercase a string because they believe it is required for some type of emphasis.

The bug here is NOT in Windows or the .NET Framework in these cases. It is not a platform problem at all. It is an application problem.

When developers take important decisions like font size/weight/style, case, and the position and token order of strings out of the hands of the localizers, the people most qualified to understand what a market requires for acceptance, they are doing a crappy job on the localizability front -- that is, on preparing an application so that localization can go smoothly.
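To make the token order point concrete, here is a minimal sketch, assuming a Python application with a tiny in-memory string table. The `TEMPLATES` table, the `render` helper, and the sample strings are all hypothetical and only for illustration. The idea is that named placeholders live inside the translatable string, so the localizer, not the developer, decides where each token goes.

```python
# Hypothetical string table: each locale owns the full sentence,
# including where the {user} and {file} tokens appear.
TEMPLATES = {
    "en": "{user} deleted {file}.",
    "de": "{file} wurde von {user} gelöscht.",
}


def render(locale: str, **values: str) -> str:
    """Fill the named tokens into the sentence chosen for this locale."""
    return TEMPLATES[locale].format(**values)


# The calling code is identical for every language; only the template differs.
english = render("en", user="Anna", file="report.txt")
german = render("de", user="Anna", file="report.txt")
```

Contrast this with building the sentence as `user + " deleted " + file`, which freezes the English word order into the code and leaves the German localizer nothing to work with.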

Now I am sure one day having integrated linguistic services that can automatically apply all of the specific language rules for operations like case folding without manual per-language intervention would be a wonderful internationalization feature, one that will enrich the platform tremendously.

Just as I think it would be cool if all the rules about how best to handle text rendering in every language, in regard to font attributes/styles, could be captured in some internationalized "give me the font for _____" type function that looks up the rules and applies them per market. That, once again, would enrich the platform markedly.

And just as I think it would be truly awesome if localization could happen automatically on the fly, with no need for anyone to do the actual translation work, and still have the results look correct to native speakers of the language.

But frankly, we are nowhere near any of these goals, and until we are, localizers are the key ambassadors who present the application that people working in/with another language market can use and enjoy.

So if you are not writing localizable applications (and every one of these problems, including Greek or French or Dutch casing, counts here), then for now it is your bug.

So please fix it.

(Note that fixing may be as simple as not making a string all caps or bolding it or italicizing it -- in other words, don't potentially destroy properly crafted text for a market in an effort to emphasize it.)

As a side note, I'll point out that the Community Server skin that my blog uses, which forces titles to be ALL CAPS (and which I was only able to modify enough to make SMALL CAPS happen), is a great example of such a localizability bug, this one on Telligent Systems and Community Server, and it would impact any Greek titles with diacritics in them.

The title in the editor is Ρύθμιση σήματος, and as you can see it capitalizes incorrectly according to recommended/preferred practice in Greek, coming out as ΡΎΘΜΙΣΗ ΣΉΜΑΤΟΣ rather than ΡΥΘΜΙΣΗ ΣΗΜΑΤΟΣ (with the diacritics gone).

This bug only shows up when titles are completely capitalized, and it is just the type of application bug that this post is talking about (though in the end it would be a browser bug, of course, but since the particular skin I am using gives no way to turn off the ALL CAPS behavior, it becomes a Community Server bug, too, given the limitations in CSS!)
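For anyone who wants to see the difference between the two uppercase forms in code, here is a minimal sketch in Python using only the standard `unicodedata` module. The function name `greek_upper_no_tonos` is hypothetical, and the logic is deliberately simplified: it removes only the tonos (which decomposes to U+0301) and does not attempt the finer rules a real Greek-aware casing operation would need, which is exactly why this decision is best left to the localizer.

```python
import unicodedata

# After canonical decomposition (NFD), the Greek tonos is U+0301.
COMBINING_ACUTE = "\u0301"


def greek_upper_no_tonos(text: str) -> str:
    """Uppercase text and drop the tonos, as a simplified illustration."""
    # Plain uppercasing keeps the accent: ύ becomes Ύ, ή becomes Ή.
    upper = text.upper()
    # Decompose so each accented capital becomes base letter + U+0301.
    decomposed = unicodedata.normalize("NFD", upper)
    # Remove only the acute accent; other marks are left alone.
    stripped = "".join(ch for ch in decomposed if ch != COMBINING_ACUTE)
    # Recompose whatever remains into its canonical composed form.
    return unicodedata.normalize("NFC", stripped)


title = "Ρύθμιση σήματος"
naive = title.upper()                    # accents kept on the capitals
preferred = greek_upper_no_tonos(title)  # accents removed
```

Here `naive` is the form the skin produces and `preferred` is the form without diacritics described above.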

Otherwise, to be honest, you should not bother localizing your application. It is way too difficult and expensive of a process to get it wrong.

In short, do it right, or don't freaking bother. :-)

 

This post brought to you by ಬ (U+0cac, KANNADA LETTER BA)


# Mihai on Tuesday, August 22, 2006 12:37 PM:

And this happens not once, but twice :-)

style.css
.posthead h2 {
...
text-transform: uppercase;
...
}

customcss.ashx
h2 { font-Variant:small-caps; }

# Mihai on Tuesday, August 22, 2006 12:45 PM:

Now, thinking of it another way: if the style sheets go out for localization, then this is not a localizability bug.
This is something I always say: styles are localizable! Same as the colors, images, coordinates, and so on.

True, some use all kind of languages in posts :-)
But then, what can you do? No bold, italic, underscore, no font family or font size setting, no colors, no images, no alignment, no changing case.
Ok, plain text!

So, for the software of this blog, I would say: forget styles for now, fix the search, which s*cks!

# Michael S. Kaplan on Tuesday, August 22, 2006 12:57 PM:

Hi Mihai -- the uppercase thing is what the main style for the blog does. The smallcaps was the only override I was able to make. :-(

But my blog is not localized, so opening up the style sheets to the localizer does not help -- only opening things up to the developer/author! :-)

I am happy with any setting I can override for the sake of the languages I need/want to support....

# Charbal on Tuesday, August 22, 2006 1:02 PM:

Yep, it should be possible to remove the upper-casing quite easily in your stylesheet.  Remove your small-caps inducing line in your custom CSS and add this in its place:

.posthead h2 {text-transform: none;}

This should make it so the titles are not uppercased anymore.

# Michael S. Kaplan on Tuesday, August 22, 2006 8:52 PM:

Hello Charbal,

Will that work across all browsers? I swear this was one of the things that was tried back when this issue first came up and there were some problems with it?

I could be misremembering things, of course....

# Charbal on Tuesday, August 22, 2006 11:41 PM:

It should work.  I just tried on current versions of IE, Firefox, Konqueror & Galeon as well as on IE 4.0.  This chart seems to indicate that "text-transform: none" has been fully supported in mainstream browsers for a long time: http://www.blooberry.com/indexdot/css/properties/text/texttrans.htm

Unfortunately, I don't have access to a Mac at the moment so I can't test on browsers there but I'd be genuinely surprised if it didn't work since this is a fairly simple part of CSS 1 which has been around for a while.  In any case, Safari's rendering is usually quite similar to Konqueror's as Apple built it on KHTML.

One thing that wouldn't work would be putting the text-transform: none inside of a bare h2 selector (i.e., no .posthead) since the .posthead h2 selector in the skin has higher specificity than an h2 selector alone in the custom CSS and so the skin style would take precedence.  This might explain the issues you encountered when trying to fix this.

# Michael S. Kaplan on Wednesday, August 23, 2006 12:18 AM:

I know that the "none" is supported, I was talking about the dueling style sheets.

Which of course makes me also wonder whether this could be fixed per page with inline style info? :-)

I'll remember I am not an HTML author any moment now....

# Charbal on Wednesday, August 23, 2006 4:31 AM:

The test case I used in my browser tests used multiple stylesheets and worked as expected in all of those.  The order to apply them in is well-defined in the CSS recommendation: since the two styles have the same importance, origin and specificity, they are resolved according to the order in which they appear which is dictated by the HTML source:

<link rel="stylesheet" href="/Themes/Blogs/luxinterior/style/style.css" type="text/css" media="screen" />
<link rel="stylesheet" href="/Themes/Blogs/luxinterior/style/print.css" type="text/css" media="print" />
<link rel="stylesheet" href="/Themes/Blogs/luxinterior/style/light_wide.css" type="text/css" media="screen" />
<link rel="stylesheet" href="/michkap/customcss.ashx" type="text/css" media="screen" />

So any fully compliant user agent will deal with this properly.  Of course, there's a remarkable shortage of fully compliant user agents... :P  Offhand I can't think of any that would get this wrong, however.  The cascading order is a pretty important part of CSS (enough to make it into the acronym!).  Styles within the HTML document (using either a style element or style attributes) would also work so it could be done per-page if desired (if the blog software allows that degree of customization, of course).

# Michael S. Kaplan on Wednesday, August 23, 2006 6:07 AM:

Well, Community Server does not give me a place in the head to put per-page stuff.... the best I could do is put it in the body. Which may be too late in this case, I think, to affect the title in the window and the title at the head of the post.

So I think I might be forced to change them all rather than doing per-post styles here. Unless there is some way to make it back up and fix the two instances of the title that it may have already rendered? :-)

# Richard on Wednesday, August 23, 2006 8:11 AM:

> Community Server skin that my blog uses, which forces titles to be ALL CAPS

Which now makes far more sense on a browser than in my aggregator (SharpReader) which does not force this...

So one could easily start also talking about the all too frequent assumptions by site authors about comfortable text sizes, browser, window size, ...

# Michael S. Kaplan on Friday, August 25, 2006 10:03 AM:

Hey Charbal,

Thanks for the tip -- it works!

# Cutreda on Monday, May 05, 2008 12:37 AM:

But you are saying that this idea is bad?

# Michael S. Kaplan on Monday, May 05, 2008 1:18 AM:

Yes, it is indeed bad!


referenced by

2010/08/17 It would be like spelling it Anerica or something.

2010/02/28 Silly money equivalency games work both ways (aka Making your localizer's life easier, Part 3)

2008/12/11 On why it's a bad thing to choose font information by name only

2008/08/08 What's in a name?

2006/09/14 Not just uppercasing or italicizing; bolding can cause problems too!

2006/08/30 If you wanted to get it done with the font...
