Why bother putting the size in the name if it's not always right?

by Michael S. Kaplan, published on 2006/09/06 03:41 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2006/09/06/742245.aspx


Yet another fun issue in the '.NET Framework isn't always telling the truth' files.... I was reminded of it by Aldo in the aftermath of the "IsSortable() == false? Well, sometimes it may be lying...." post I just did.

Try the following code in the .NET Framework, running on both XP >= SP2 and Vista:

using System;
using System.Globalization;

public class Test {
    public static void Main() {
        Console.WriteLine("\nFRAMEWORK:\n");
        GetSomeCultures(CultureTypes.FrameworkCultures);
        Console.WriteLine("\nWINDOWS ONLY:\n");
        GetSomeCultures(CultureTypes.WindowsOnlyCultures);
    }

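    // Lists every culture of the given type whose ISO language name
    // has a length that does not match what the property name promises.
    public static void GetSomeCultures(CultureTypes ct) {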
        foreach(CultureInfo ci in CultureInfo.GetCultures(ct)) {
            if(ci.ThreeLetterISOLanguageName.Length != 3) {
                Console.WriteLine("    {0} has '{1}', a ThreeLetterISOLanguageName that is not three characters long.",
                                  ci.Name, ci.ThreeLetterISOLanguageName);
            }
            if(ci.TwoLetterISOLanguageName.Length != 2) {
                Console.WriteLine("    {0} has '{1}', a TwoLetterISOLanguageName that is not two characters long.",
                                  ci.Name, ci.TwoLetterISOLanguageName);
            }
        }
    }
}

There's a real 'drive on a parkway, but park on a driveway' kind of thing going on, huh? :-)

Ok, the results I get with XP SP2:

FRAMEWORK:

    kok has 'kok', a TwoLetterISOLanguageName that is not two characters long.
    syr has 'syr', a TwoLetterISOLanguageName that is not two characters long.
    div has 'div', a TwoLetterISOLanguageName that is not two characters long.
    kok-IN has 'kok', a TwoLetterISOLanguageName that is not two characters long.
    syr-SY has 'syr', a TwoLetterISOLanguageName that is not two characters long.
    div-MV has 'div', a TwoLetterISOLanguageName that is not two characters long.

WINDOWS ONLY:

    sma-NO has 'sma', a TwoLetterISOLanguageName that is not two characters long.
    sr-BA-Cyrl has 'sr', a ThreeLetterISOLanguageName that is not three characters long.
    zu-ZA has 'zu', a ThreeLetterISOLanguageName that is not three characters long.
    xh-ZA has 'xh', a ThreeLetterISOLanguageName that is not three characters long.
    tn-ZA has 'tn', a ThreeLetterISOLanguageName that is not three characters long.
    se-SE has 'se', a ThreeLetterISOLanguageName that is not three characters long.
    sma-SE has 'sma', a TwoLetterISOLanguageName that is not two characters long.
    hr-BA has 'hr', a ThreeLetterISOLanguageName that is not three characters long.
    smn-FI has 'smn', a TwoLetterISOLanguageName that is not two characters long.
    quz-PE has 'quz', a TwoLetterISOLanguageName that is not two characters long.
    se-FI has 'se', a ThreeLetterISOLanguageName that is not three characters long.
    sms-FI has 'sms', a TwoLetterISOLanguageName that is not two characters long.
    cy-GB has 'cy', a ThreeLetterISOLanguageName that is not three characters long.
    bs-BA-Latn has 'bs', a ThreeLetterISOLanguageName that is not three characters long.
    smj-NO has 'smj', a TwoLetterISOLanguageName that is not two characters long.
    mi-NZ has 'mi', a ThreeLetterISOLanguageName that is not three characters long.
    quz-EC has 'quz', a TwoLetterISOLanguageName that is not two characters long.
    sr-BA-Latn has 'sr', a ThreeLetterISOLanguageName that is not three characters long.
    bn-IN has 'bn', a ThreeLetterISOLanguageName that is not three characters long.
    smj-SE has 'smj', a TwoLetterISOLanguageName that is not two characters long.
    ml-IN has 'ml', a ThreeLetterISOLanguageName that is not three characters long.
    ns-ZA has 'ns', a ThreeLetterISOLanguageName that is not three characters long.
    quz-BO has 'quz', a TwoLetterISOLanguageName that is not two characters long.
    se-NO has 'se', a ThreeLetterISOLanguageName that is not three characters long.
    mt-MT has 'mt', a ThreeLetterISOLanguageName that is not three characters long.

And with Vista RC1:

FRAMEWORK:

    kok has 'kok', a TwoLetterISOLanguageName that is not two characters long.
    syr has 'syr', a TwoLetterISOLanguageName that is not two characters long.
    kok-IN has 'kok', a TwoLetterISOLanguageName that is not two characters long.
    syr-SY has 'syr', a TwoLetterISOLanguageName that is not two characters long.

WINDOWS ONLY:

    sma-NO has 'sma', a TwoLetterISOLanguageName that is not two characters long.
    prs-AF has 'prs', a TwoLetterISOLanguageName that is not two characters long.
    qut-GT has 'qut', a TwoLetterISOLanguageName that is not two characters long.
    sah-RU has 'sah', a TwoLetterISOLanguageName that is not two characters long.
    gsw-FR has 'gsw', a TwoLetterISOLanguageName that is not two characters long.
    moh-CA has 'moh', a TwoLetterISOLanguageName that is not two characters long.
    arn-CL has 'arn', a TwoLetterISOLanguageName that is not two characters long.
    sma-SE has 'sma', a TwoLetterISOLanguageName that is not two characters long.
    quz-PE has 'quz', a TwoLetterISOLanguageName that is not two characters long.
    nso-ZA has 'nso', a TwoLetterISOLanguageName that is not two characters long.
    quz-BO has 'quz', a TwoLetterISOLanguageName that is not two characters long.
    sms-FI has 'sms', a TwoLetterISOLanguageName that is not two characters long.
    fil-PH has 'fil', a TwoLetterISOLanguageName that is not two characters long.
    quz-EC has 'quz', a TwoLetterISOLanguageName that is not two characters long.
    smj-SE has 'smj', a TwoLetterISOLanguageName that is not two characters long.
    smj-NO has 'smj', a TwoLetterISOLanguageName that is not two characters long.
    dsb-DE has 'dsb', a TwoLetterISOLanguageName that is not two characters long.
    smn-FI has 'smn', a TwoLetterISOLanguageName that is not two characters long.
    tzm-Latn-DZ has 'tzm', a TwoLetterISOLanguageName that is not two characters long.
    hsb-DE has 'hsb', a TwoLetterISOLanguageName that is not two characters long.

Now, the differences between the two cases underscore the fact that there is more than one cause behind the times the .NET Framework appears to be "lying" by claiming a two-letter property has three letters or vice versa.

So all in all, a nice combination of unintentional mistakes, unavailable data, and data that does not exist. :-)
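
For callers that genuinely need a code of the advertised length, one option is to check the length themselves rather than trust the property name. The following is only a minimal sketch of that idea: IsoNameCheck and its methods are hypothetical names invented for illustration, not part of the .NET Framework, and returning null for a mismatch is an assumption about what a caller might want.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, not part of the .NET Framework.
public static class IsoNameCheck {
    // True only when the string has exactly the expected number of characters.
    public static bool HasLength(string code, int expected) {
        return code != null && code.Length == expected;
    }

    // Returns TwoLetterISOLanguageName only if it really is two characters long;
    // otherwise null (an assumed convention for "no such code available").
    public static string TwoLetterOrNull(CultureInfo ci) {
        string code = ci.TwoLetterISOLanguageName;
        return HasLength(code, 2) ? code : null;
    }

    // Same check for ThreeLetterISOLanguageName and a length of three.
    public static string ThreeLetterOrNull(CultureInfo ci) {
        string code = ci.ThreeLetterISOLanguageName;
        return HasLength(code, 3) ? code : null;
    }
}
```

With this, a value such as 'kok' fails the two-character check and a value such as 'sr' fails the three-character check, so the caller decides what to do instead of silently passing along a code of the wrong length.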

 

This post brought to you by ᠹ (U+1839, a.k.a. MONGOLIAN LETTER FA)


Adam on 8 Sep 2006 6:03 AM:

If a suitably specific tag of the appropriate length does not exist, why doesn't it just return null or throw an exception? If the data being asked for does not exist (or is not known) why return incorrect data? "kok" is _not_ a two letter code.

Michael S. Kaplan on 8 Sep 2006 7:10 AM:

Hi Adam,

The choice to always return something is actually based on the usage of these fields in identifiers -- and the consequences of either throwing or returning null for a string property (or even returning a ZLS) are dire for some of those cases....

Adam on 11 Sep 2006 8:52 AM:

What do you mean - their usage in identifiers? As in, in variables and methods named after them? Where is this an issue?

And surely it's only an issue if the code that uses them assumes that all languages have 2- and 3-letter ISO language codes. Which is clearly wrong, as ISO does /not/ guarantee that all languages have 2- and 3-letter codes.

Looks like the code really wants a PreferredISOLanguageName property instead.

Michael S. Kaplan on 11 Sep 2006 10:46 AM:

Hi Adam,

And that is what the two letter code is being treated as. Whether people like it or not, it is really too late to get hung up on the name. :-)

referenced by

2011/07/25 Why are the short names longer than than long names?

2007/03/22 When language codes are reported that don't actually exist
