by Michael S. Kaplan, published on 2008/08/06 10:01 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2008/08/06/8836106.aspx
There are not very many times that a feature within NLS can make a person psychotic.
Though of course by making such a claim one implies that there are in fact such cases, no matter how rare they may be.
This post will be about one of them....
It is about the TransliteratedFrench and TransliteratedEnglish calendars in Windows.
In order to properly tear them apart, first we'll write some code to enumerate the information in them.
Note that the tortuous method of getting to the data is not my idea. :-)
Here is the code:
namespace PsychoticCalendars {
    using System;
    using System.Globalization;
    class PsychoticCalendars {
        [STAThread]
        static void Main(string[] args) {
            CultureInfo[] rgci = {new CultureInfo("en-US"), new CultureInfo("fr-FR"), new CultureInfo("ar-IQ")};
            foreach(CultureInfo ci in rgci) {
                foreach(Calendar cal in ci.OptionalCalendars) {
                    // Only the Gregorian variants are of interest here.
                    if(cal is GregorianCalendar) {
                        Console.WriteLine("{0}\t{1} ({2})", ci.Name, cal, ((GregorianCalendar)cal).CalendarType);
                        // The names are only reachable through DateTimeFormat,
                        // so switch its calendar before reading them.
                        ci.DateTimeFormat.Calendar = cal;
                        // Month names
                        Console.Write('\t');
                        for(int i = 1; i <= 12; i++) {
                            Console.Write(ci.DateTimeFormat.GetMonthName(i) + " ");
                        }
                        Console.WriteLine();
                        // Abbreviated month names
                        Console.Write('\t');
                        for(int i = 1; i <= 12; i++) {
                            Console.Write(ci.DateTimeFormat.GetAbbreviatedMonthName(i) + " ");
                        }
                        Console.WriteLine();
                        // Day names
                        Console.Write('\t');
                        for(DayOfWeek d = DayOfWeek.Sunday; d <= DayOfWeek.Saturday; d++) {
                            Console.Write(ci.DateTimeFormat.GetDayName(d) + " ");
                        }
                        Console.WriteLine();
                        // Abbreviated day names
                        Console.Write('\t');
                        for(DayOfWeek d = DayOfWeek.Sunday; d <= DayOfWeek.Saturday; d++) {
                            Console.Write(ci.DateTimeFormat.GetAbbreviatedDayName(d) + " ");
                        }
                        Console.WriteLine();
                        // Shortest day names
                        Console.Write('\t');
                        for(DayOfWeek d = DayOfWeek.Sunday; d <= DayOfWeek.Saturday; d++) {
                            Console.Write(ci.DateTimeFormat.GetShortestDayName(d) + " ");
                        }
                        Console.WriteLine("\r\n");
                    }
                }
            }
        }
    }
}
First, before you run this code, you will want to run chcp 1256 or chcp 65001 in the console, since those are two of the few code pages that can represent both the French and the Arabic letters that will be needed here.
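As an alternative to running chcp by hand, a program can ask for UTF-8 console output itself. The following is only a minimal sketch: the class name ConsoleEncodingSketch is made up for illustration, and whether the Arabic letters actually render still depends on the console font in use.

```csharp
using System;
using System.Globalization;
using System.Text;

class ConsoleEncodingSketch {
    static void Main() {
        // Request UTF-8 output from the console, similar in effect to chcp 65001.
        // Assumption: the console font has glyphs for the Arabic letters.
        Console.OutputEncoding = Encoding.UTF8;

        // One French and one Arabic month name as a quick smoke test.
        Console.WriteLine(new CultureInfo("fr-FR").DateTimeFormat.GetMonthName(2));
        Console.WriteLine(new CultureInfo("ar-IQ").DateTimeFormat.GetMonthName(2));
    }
}
```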
Okay, now here is the output....
en-US System.Globalization.GregorianCalendar (Localized)
January February March April May June July August September October November December
Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
Sunday Monday Tuesday Wednesday Thursday Friday Saturday
Sun Mon Tue Wed Thu Fri Sat
Su Mo Tu We Th Fr Sa
en-US System.Globalization.GregorianCalendar (USEnglish)
January February March April May June July August September October November December
Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
Sunday Monday Tuesday Wednesday Thursday Friday Saturday
Sun Mon Tue Wed Thu Fri Sat
Su Mo Tu We Th Fr Sa
fr-FR System.Globalization.GregorianCalendar (Localized)
janvier février mars avril mai juin juillet août septembre octobre novembre décembre
janv. févr. mars avr. mai juin juil. août sept. oct. nov. déc.
dimanche lundi mardi mercredi jeudi vendredi samedi
dim. lun. mar. mer. jeu. ven. sam.
di lu ma me je ve sa
ar-IQ System.Globalization.GregorianCalendar (Localized)
كانون الثاني شباط آذار نيسان أيار حزيران تموز آب أيلول تشرين الأول تشرين الثاني كانون الأول
كانون الثاني شباط آذار نيسان أيار حزيران تموز آب أيلول تشرين الأول تشرين الثاني كانون الأول
الاحد الاثنين الثلاثاء الاربعاء الخميس الجمعة السبت
الاحد الاثنين الثلاثاء الاربعاء الخميس الجمعة السبت
أ ا ث أ خ ج س
ar-IQ System.Globalization.GregorianCalendar (USEnglish)
January February March April May June July August September October November December
Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
Sunday Monday Tuesday Wednesday Thursday Friday Saturday
Sun Mon Tue Wed Thu Fri Sat
Su Mo Tu We Th Fr Sa
ar-IQ System.Globalization.GregorianCalendar (MiddleEastFrench)
janvier février mars avril mai juin juillet août septembre octobre novembre décembre
janv. févr. mars avr. mai juin juil. août sept. oct. nov. déc.
dimanche lundi mardi mercredi jeudi vendredi samedi
dim. lun. mar. mer. jeu. ven. sam.
أ ا ث أ خ ج س
ar-IQ System.Globalization.GregorianCalendar (TransliteratedEnglish)
يناير فبراير مارس ابريل مايو يونيو يوليو اغسطس سبتمبر اكتوبر نوفمبر ديسمبر
يناير فبراير مارس ابريل مايو يونيو يوليو اغسطس سبتمبر اكتوبر نوفمبر ديسمبر
الاحد الاثنين الثلاثاء الاربعاء الخميس الجمعة السبت
الاحد الاثنين الثلاثاء الاربعاء الخميس الجمعة السبت
أ ا ث أ خ ج س
ar-IQ System.Globalization.GregorianCalendar (TransliteratedFrench)
جانفييه فيفرييه مارس أفريل مي جوان جوييه أوت سبتمبر اكتوبر نوفمبر ديسمبر
جانفييه فيفرييه مارس أفريل مي جوان جوييه أوت سبتمبر اكتوبر نوفمبر ديسمبر
الاحد الاثنين الثلاثاء الاربعاء الخميس الجمعة السبت
الاحد الاثنين الثلاثاء الاربعاء الخميس الجمعة السبت
أ ا ث أ خ ج س
Okay, so here we go.
Let's pick them apart, after getting some advice from the NLS "Calendar Girl" Shelby who first pointed out one of the problems here (and though we both turned out to be mistaken as to the cause, that is only because both of us attributed more smarts to the actual process!).
First of all, there is the fact that the shortest day names for the MiddleEastFrench calendar, rather than matching the French Gregorian localized calendar like all of the rest of its data does, match the Arabic Gregorian localized calendar.
Thus instead of
di lu ma me je ve sa
we have
أ ا ث أ خ ج س
which, for those who don't know Arabic, is
ALEF WITH HAMZA ABOVE, ALEF, THEH, ALEF WITH HAMZA ABOVE, KHAH, JEEM, SEEN
Okay.
Now there is also the fact that the shortest day names for the TransliteratedEnglish and TransliteratedFrench calendars are also identical to these. Note from the above that they are in no way transliterations for either the English or French Gregorian calendars.
That seems like kind of a problem too.
But don't worry too much -- it turns out that the day names and abbreviated day names for the TransliteratedEnglish and TransliteratedFrench calendars are also identical to the Arabic Gregorian localized calendar.
And are also in no way transliterations.
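The "identical to the Arabic Gregorian localized calendar" observations can be checked mechanically rather than by eye. This is a sketch under assumptions: the class name CompareCalendarNames and the helper FormatFor are made up for illustration, and it presumes that ar-IQ exposes all of these Gregorian calendar types as optional calendars, as in the output above. On a platform whose culture data differs, the Calendar setter may throw instead.

```csharp
using System;
using System.Globalization;
using System.Linq;

class CompareCalendarNames {
    // Build a DateTimeFormatInfo for ar-IQ that uses the given Gregorian calendar type.
    // Assumption: the type is one of ar-IQ's optional calendars; otherwise the setter throws.
    static DateTimeFormatInfo FormatFor(GregorianCalendarTypes type) {
        CultureInfo ci = new CultureInfo("ar-IQ");   // fresh, writable instance
        ci.DateTimeFormat.Calendar = new GregorianCalendar(type);
        return ci.DateTimeFormat;
    }

    static void Main() {
        DateTimeFormatInfo localized = FormatFor(GregorianCalendarTypes.Localized);
        GregorianCalendarTypes[] others = {
            GregorianCalendarTypes.MiddleEastFrench,
            GregorianCalendarTypes.TransliteratedEnglish,
            GregorianCalendarTypes.TransliteratedFrench
        };

        foreach (GregorianCalendarTypes type in others) {
            DateTimeFormatInfo dtfi = FormatFor(type);
            Console.WriteLine(type);
            // True means the list is byte-for-byte the same as the Arabic localized one.
            Console.WriteLine("\tday names same as Localized:             {0}",
                dtfi.DayNames.SequenceEqual(localized.DayNames));
            Console.WriteLine("\tabbreviated day names same as Localized: {0}",
                dtfi.AbbreviatedDayNames.SequenceEqual(localized.AbbreviatedDayNames));
            Console.WriteLine("\tshortest day names same as Localized:    {0}",
                dtfi.ShortestDayNames.SequenceEqual(localized.ShortestDayNames));
            // True means the abbreviated month names are just the full month names again.
            Console.WriteLine("\tabbreviated months same as full months:  {0}",
                dtfi.AbbreviatedMonthNames.SequenceEqual(dtfi.MonthNames));
        }
    }
}
```

The results depend entirely on the culture data installed on the machine, so no expected output is given here.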
In case you don't believe me I'll take one and prove it. Wednesday is:
الاربعاء
which is
ALEF, LAM, ALEF, REH, BEH, AIN, ALEF, HAMZA
which is obviously not a transliteration for either Wednesday or mercredi.
Month names fare a bit better, though -- they do look like transliterations. Thus
سبتمبر
is
SEEN BEH TEH MEEM BEH REH
which is a fair transliteration for September, just as
فيفرييه
is
FEH YEH FEH REH YEH YEH HEH
which is kind of a transliteration for février.
Though of course in both the TransliteratedEnglish and TransliteratedFrench calendars, the abbreviated month names, rather than being transliterations of the English and French calendars, are identical to their non-abbreviated cousins.
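The letter-by-letter spellings above can be double-checked by dumping the code points of a string and looking them up in the Unicode charts. A small sketch, assuming nothing beyond the base class library; the class name CodePointDump and the helper ToCodePoints are made up for illustration, and the two strings are written with \u escapes so the source file does not depend on its own encoding.

```csharp
using System;
using System.Text;

class CodePointDump {
    // Format each UTF-16 code unit as U+XXXX. That is enough here because
    // every Arabic letter involved is in the Basic Multilingual Plane.
    static string ToCodePoints(string s) {
        StringBuilder sb = new StringBuilder();
        foreach (char c in s) {
            sb.AppendFormat("U+{0:X4} ", (int)c);
        }
        return sb.ToString().Trim();
    }

    static void Main() {
        // The transliterated "September": SEEN BEH TEH MEEM BEH REH
        Console.WriteLine(ToCodePoints("\u0633\u0628\u062A\u0645\u0628\u0631"));
        // prints: U+0633 U+0628 U+062A U+0645 U+0628 U+0631

        // The day name for Wednesday: ALEF LAM ALEF REH BEH AIN ALEF HAMZA
        Console.WriteLine(ToCodePoints("\u0627\u0644\u0627\u0631\u0628\u0639\u0627\u0621"));
        // prints: U+0627 U+0644 U+0627 U+0631 U+0628 U+0639 U+0627 U+0621
    }
}
```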
At this point, it is fair to say that of the data in these three Gregorian calendars (MiddleEastFrench, TransliteratedEnglish, and TransliteratedFrench):
60% of it is just wrong, wrong, wrong in any conventional sense of how a reasonable person would expect them to work.
If you ignore the shortest day name stuff (which was added fairly recently) then only 50% of it is wrong.
But this data is not newly wrong -- it has been wrong for as long as these calendars have existed -- in Windows 95, I think?
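For anyone who wants to see where percentages like these come from, here is one way to tally them. It is a sketch based on one reading of the post, not on any official count: each calendar is treated as five name lists (month, abbreviated month, day, abbreviated day, shortest day), and a list is marked wrong when the text above describes it as not matching what the calendar's name promises. The class name WrongDataTally and the helper Report are made up for illustration.

```csharp
using System;

class WrongDataTally {
    // Count the lists marked wrong in rows firstRow..end, looking at the first `columns` columns.
    static void Report(string label, bool[][] wrong, int firstRow, int columns) {
        int bad = 0, total = 0;
        for (int r = firstRow; r < wrong.Length; r++) {
            for (int c = 0; c < columns; c++) {
                total++;
                if (wrong[r][c]) bad++;
            }
        }
        Console.WriteLine("{0}: {1} of {2} lists wrong ({3}%)", label, bad, total, 100 * bad / total);
    }

    static void Main() {
        // Columns: month, abbreviated month, day, abbreviated day, shortest day.
        // Assumption: true = described as wrong in the discussion above.
        bool[][] wrong = {
            new[] { false, false, false, false, true },  // MiddleEastFrench
            new[] { false, true,  true,  true,  true },  // TransliteratedEnglish
            new[] { false, true,  true,  true,  true }   // TransliteratedFrench
        };

        Report("All three calendars", wrong, 0, 5);
        // prints: All three calendars: 9 of 15 lists wrong (60%)
        Report("All three, ignoring shortest day names", wrong, 0, 4);
        // prints: All three, ignoring shortest day names: 6 of 12 lists wrong (50%)
        Report("Transliterated calendars only", wrong, 1, 5);
        // prints: Transliterated calendars only: 8 of 10 lists wrong (80%)
        Report("Transliterated only, ignoring shortest day names", wrong, 1, 4);
        // prints: Transliterated only, ignoring shortest day names: 6 of 8 lists wrong (75%)
    }
}
```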
It would be easy to claim that this is really a fallback system kind of thing -- you know, data was not there so it is falling back to data elsewhere.
I could make such a claim right now credibly based on the situation.
There would be one problem with this claim, though.
The fact that I would be full of crap if I made it. :-)
This data is stored as is in the data and has been for as long as the data has been there.
These calendars are just wrong and weird and odd and strange and they are mostly not transliterations in any sense.
From a quality of data standpoint, in fact, I would tentatively suggest that we are currently hip deep in the low-point of NLS right now.
Transliterationally speaking, that is.
So I'll put forward the two questions again:
How bad does data have to be before it is wrong?
Will 50-60% do it? How about the 75-80% of the transliterated calendars only?
And how long does it have to be wrong before it is right?
Is over a decade long enough that this is not just okay? Or does fixing the worst of it make sense at some point in the future?
This blog brought to you by ج (U+062c, aka ARABIC LETTER JEEM)
John Cowan on 6 Aug 2008 12:44 PM:
But you can't *change* it, because Someone Out There is undoubtedly *depending* on that very trash!
Michael S. Kaplan on 6 Aug 2008 3:38 PM:
That might be true for some of it, but other parts are likely fixable. There is not always a requirement to stay broken. :-)