The lazy yet foxy jackdaw I love jumped over my quick brown sphinx dog of quartz

by Michael S. Kaplan, published on 2007/03/20 10:01 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2007/03/20/1917602.aspx


A couple of days ago when I wrote about how In Vista, jackdaws appear to be somewhat endangered, I mentioned:

...both strings are actually in Message Compiler resources which means they could actually be localized (though note that the above algorithm means that localization might make the situation worse here, not better). On top of that, what do you do when you have a font with no latins in it? By this algorithm, they will just get another Latin script string which will still have to use font linking to find the glyphs to display.

With the help of Claus Juhl (you may recall him from the Channel 9 video I posted about last August), I was able to see what the localizers for the various Windows language releases did to both strings:

Here are some of the highlights....

First of all, Jackdaws love my big sphinx of quartz was not localized for any language. You can contemplate what this means for the algorithm I posted. :-)

Second of all, this is clearly a problem like the one from 'Cette phrase en français est difficile à traduire en anglais', since it is clearly not intended that an actual translation of 'The quick brown fox jumps over the lazy dog' be done. What is desired is a pangram covering the letters in the target language.

Let's see how it worked out with a bunch of those languages:

Arabic: ‏‏من طلب العلا سهر الليالي. 

Bulgarian: Вкъщи не яж сьомга с фиде без ракийка и хапка люта чушчица!

Chinese (PRC): The quick brown fox jumps over the lazy dog [1].

Chinese (Taiwan): 微風迎客,軟語伴茶

Czech: Příliš žluťoučký kůň úpěl ďábelské ódy!

Danish: Quizdeltagerne spiste jordbær med fløde, mens cirkusklovnen Walther spillede på xylofon.

Dutch: Pa's wijze lynx bezag vroom het fikse aquaduct.

Finnish: Tämä on malliteksti.

French: Voix ambiguë d'un cœur qui au zéphyr préfère les jattes de kiwis.

German: Franz jagt im komplett verwahrlosten Taxi quer durch Bayern.

Greek: Θέλει αρετή και τόλμη η ελευθερία (Ανδρέας Κάλβος).

Hebrew: ‏‏דג סקרן שט לו בים זך אך לפתע פגש חבורה נחמדה שצצה כך. 

Hindi: सारे जहाँ से अच्छा हिंदोस्तां हमारा. 

Hungarian: Árvíztűrő tükörfúrógép ÁRVÍZTŰRŐ TÜKÖRFÚRÓGÉP

Italian: Cantami o Diva del pelide Achille l'ira funesta.

Japanese: Windows でコンピュータの世界が広がります。

Korean: 다람쥐 헌 쳇바퀴에 타고파.

Norwegian: En god stil må først og fremst være klar. Den må være passende. Aristoteles.

Polish: Zażółć gęślą jaźń.

Portuguese (Brazilian): abcdefghijklmnopqrstuvwxyz.

Portuguese (Iberian): A rápida raposa castanha salta em cima do cão lento.

Russian: Съешь еще этих мягких французских булок, да выпей чаю.

Slovak: Kŕdeľ ďatľov učí koňa žrať kôru.

Slovenian: V kožuščku hudobnega fanta stopiclja mizar in kliče

Spanish: El veloz murciélago hindú comía feliz cardillo y kiwi. La cigüeña tocaba el saxofón detrás del palenque de paja.

Swedish: Flygande bäckasiner söka hwila på mjuka tuvor.

Turkish: abcçdefgğhıijklmnoöpqrsştuüvwxyz.
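Whether a given sample really covers an alphabet can be checked mechanically. What follows is only a minimal sketch: the helper name `missing_letters` and the alphabet strings are illustrative assumptions, not anything Windows itself uses, and treating ä, ö, ü, and ß as separate German letters is a choice made here for the example.

```python
import unicodedata

def missing_letters(sentence: str, alphabet: str) -> set[str]:
    """Return the letters of `alphabet` that never occur in `sentence`."""
    # lower() rather than casefold(), because casefold() turns ß into "ss".
    # NFC makes precomposed and decomposed accented letters compare equal.
    seen = set(unicodedata.normalize("NFC", sentence.lower()))
    wanted = set(unicodedata.normalize("NFC", alphabet.lower()))
    return wanted - seen

# Hypothetical alphabet definitions, chosen for this illustration only.
ENGLISH = "abcdefghijklmnopqrstuvwxyz"
GERMAN = ENGLISH + "äöüß"

# Both English strings discussed in the post are true pangrams.
print(missing_letters("The quick brown fox jumps over the lazy dog", ENGLISH))
print(missing_letters("Jackdaws love my big sphinx of quartz", ENGLISH))

# The German sample covers a-z but none of the four extra letters.
print(sorted(missing_letters(
    "Franz jagt im komplett verwahrlosten Taxi quer durch Bayern.", GERMAN)))
```

The first two calls return an empty set; the German call reports ä, ö, ü, and ß as missing. Lowercasing is adequate for these examples but is not a general answer for every language (Turkish dotted and dotless i, for instance, would need more care).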

My favorites are Japanese and Iberian Portuguese.

How about yours? :-)

You may be wondering why I tagged this post with 'Unicode Lame List' -- just keep in mind how poor all of these sentences are at dealing with the issue of showing what makes a font unique to any user who might be curious. Just remember, it is not really the localizers who are lame here -- it is the implementation....


1 - When in doubt, don't translate? :-)

 

This post brought to you by ဆ (U+1006, a.k.a. MYANMAR LETTER CHA)


# Name required on 20 Mar 2007 10:48 AM:

Judging by those sentences, I guess I can claim to understand... Chinese (PRC), Portuguese (Brazilian), and Turkish.

What is it about the Japanese that you like? It just says that the computer world widens with Windows. Of course that has nothing to do with the original two sentences, like most of the other "translations".

# Blake Handler on 20 Mar 2007 10:52 AM:

"Pack my box with five dozen liquor jugs"

# Michael S. Kaplan on 20 Mar 2007 11:01 AM:

The Japanese one just seemed kind of silly to me, meeting very little of the requirements here.

But many of the translations are actually good pangrams for the given language, which makes them good translations.

The Iberian Portuguese one is the silliest since it is a literal translation of the original source sentence.

(And of course recognition of one piece of the UI of a localized version does not indicate language fluency!)

# Szajd on 20 Mar 2007 11:17 AM:

Our little Hungarian pangram ("árvíztűrő tükörfúrógép") means "flood-proof mirror-drill". :)

Its pros include that it's short (it's easy to type this into a new mobile phone when I want to test its Hungarian-specific capabilities).

The con is that it really isn't a sentence. That makes it less... typographigillishistically looking... If you know what I mean. :)

# Andrew West on 20 Mar 2007 1:27 PM:

I thought the Chinese (Taiwan) sentence was rather clever, combining an elegant couplet with a play on the name of Microsoft.

微風迎客,軟語伴茶

A gentle breeze welcomes the guest,

Soft talk accompanies the tea.

If you take the first character of each line and put them together you get the Chinese for Microsoft : 微軟.

# Mike Dimmick on 20 Mar 2007 1:29 PM:

Also on the Iberian Portuguese - not only is it a literal translation but it actually uses the fewest letters of all of them! Fewer than 20, if I'm counting correctly (and depending on whether you count the letters with diacritics).

The result obviously depends on whether the translator recognised that the original phrase is an English pangram and translated the intent, rather than the result. The Finnish version looks suspiciously short to be a literal translation (not that I know any Finnish) but doesn't really serve the purpose either.
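The letter count for the Iberian Portuguese string is easy to verify. Here is a small sketch, assuming a hypothetical helper `distinct_letters` that counts distinct letters either with accented letters kept separate or with diacritics folded into their base letters:

```python
import unicodedata

def distinct_letters(text: str, fold_diacritics: bool = False) -> set[str]:
    """Return the set of distinct lowercase letters used in `text`."""
    text = unicodedata.normalize("NFC", text.lower())
    if fold_diacritics:
        # Decompose (á -> a + combining acute) and drop the combining marks.
        decomposed = unicodedata.normalize("NFD", text)
        text = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Keep letters only; spaces and punctuation are ignored.
    return {ch for ch in text if ch.isalpha()}

SAMPLE = "A rápida raposa castanha salta em cima do cão lento."

print(len(distinct_letters(SAMPLE)))                        # 16
print(len(distinct_letters(SAMPLE, fold_diacritics=True)))  # 14
```

Either way of counting stays under 20: 16 distinct letters when á and ã count separately, 14 when they are folded into a.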

# admix on 20 Mar 2007 2:35 PM:

i've started to translate my native one(Russian):

and found this: http://en.wikipedia.org/wiki/The_quick_brown_fox_jumps_over_the_lazy_dog

# admix on 20 Mar 2007 2:39 PM:

PS and even more here: http://en.wikipedia.org/wiki/Pangram

=)

# Szajd on 20 Mar 2007 2:51 PM:

Oh, I forgot to note another little, small, minor con with the Hungarian pangram: it's not actually a pangram.

So it doesn't display all the Hungarian characters; it only pays attention to the diacritic letters of Hungarian, which might be more problematic, especially őŐűŰ.

# Pavanaja U B on 21 Mar 2007 1:08 AM:

The sentence shown for Hindi does not use all the possible letters and their conjuncts. The display forms for the conjuncts for Indic are different from just putting side by side the letters from which they have been derived. That is the domain of OpenType fonts and complex scripts. I don't have to explain these to you. Hence if the purpose is to show all the possible display forms for Indic then the total number of such combinations may be anywhere near 15,000 per script.

-Pavanaja

# Michael S. Kaplan on 21 Mar 2007 1:24 AM:

Well, I am not suggesting that all forms need to be displayed, any more than I am suggesting that Chinese needs 70,000+ ideographs displayed!

The idea would be a sentence that has every letter used....

# Bernhard on 21 Mar 2007 3:08 AM:

The German version doesn't show any of the special characters (ä,ö,ü,ß) either. But at least it shows all "basic" letters from the alphabet.

Knowing what the Hungarian pangrams mean - I really like that

# Do I need a name? on 21 Mar 2007 7:27 AM:

Michael,

Even with that +, Chinese needs well more than 70,000. There are still a significant number of historical characters that I can not type. Hopefully Extension D will begin to solve that.

# Peter Karlsson on 21 Mar 2007 8:19 AM:

The Swedish one is a well-known typographic test sentence, our version of “The quick brown fox…”, so it is a good localization.

The Finnish version, if I read it correctly, just says “this is a sample text”.

# andreas on 22 Mar 2007 6:57 AM:

two nice ones in French with (all?) possible accents:

Où qu'il réside, à Nîmes ou même Capharnaüm, tout Français inscrit au rôle payera son dû dès avant Noël, qu'il soit naïf ou râleur.

Le cœur déçu, mais l'âme plutôt naïve, Louÿs rêva de crapaüter en canoë au delà des îles, près du mälström où brûlent les novæ.

and a kind in English (kind of :-)

¿But aren't Kafka's Schloß and Æsop's Œuvres often naïve vis-à-vis the dæmonic phœnix's official rôle in fluffy soufflés?

and finally a German one *with* umlauts:

Falsches Üben von Xylophonmusik quält jeden größeren Zwerg.

;-)

# Pavanaja U B on 27 Mar 2007 12:01 AM:

Here is a Kannada poem which has all the alphabet of Kannada (consonants and vowel signs).

ಗಾಢಾಂಧಕಾರದೊಳು ನಟನ ಪಟು ಮೇಘ ಸಖಿ
ಸಞ್ಚಲಿಸೆ, ಬೆಳಕು ಝರಿಯಂತೆ ಹರಿದಾಡಿ ಮೂಡಿ
ಜೀವ ಭಙ್ಗಿಯ ಜಾಣ್ಮೆ ಕಥಿಸೆ ಕೌಶಲ ಕೇಳಿ
ಫಲವೊ ಛಲರಹಿತ ರಸನಿಷ್ಠೈಕ ಮೋದಕೃತಿಯೇ

Read my intro to this at http://vishvakannada.com/node/359. This poem is generated by Shatavadhani (http://en.wikipedia.org/wiki/Avadhanam) Dr R Ganesh (http://en.wikipedia.org/wiki/Dr.R.Ganesh).

Regards,

Pavanaja

# Daniel Neto on 10 Jun 2008 12:28 PM:

Portuguese (brazilian): A raposa ligeira e marrom pula sobre o cachorro preguiçoso.

