Does Bengali sorting work?

by Michael S. Kaplan, published on 2005/06/05 02:01 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2005/06/05/425307.aspx


I had a colleague of mine ask me this very question the other day.

It was not an entirely offhand question. After all, as I mentioned in Lions and tigers and bearsELKs, Oh my!, NLS support for Bengali was added to Windows XP in Service Pack 2. The post even mentions sorting as one of the many potential issues that one of these ELKs has to address. So it is not unreasonable to wonder if that support is in there. There are even easy things to check for, like the Hasant and its expected effect on Bengali collation.

Unfortunately, the joy of finding something so easy to verify is quickly dashed by the results of calls to CompareString and/or LCMapString with U+0995 U+09cd versus plain old U+0995:

That extra secondary weight (marked in RED above) is clearly what is causing the first string to sort after the second. Which is not what the native speaker would expect (something the other article I wrote explains).
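
For anyone who wants to repeat the check, here is a minimal sketch in Python that calls CompareStringW and LCMapStringW through ctypes. It is not the code behind the results described above. The LCID 0x0445 (Bengali, India) is an assumption, since the post does not say which locale was used, and the helper names describe, compare and sort_key are made up for illustration. The Win32 calls only run on Windows; elsewhere the helpers return None.

```python
import ctypes
import sys
import unicodedata

KA = "\u0995"               # BENGALI LETTER KA
KA_HASANT = "\u0995\u09cd"  # KA followed by BENGALI SIGN VIRAMA (the hasant)

# Assumption: 0x0445 is the LCID for Bengali (India); the post names no locale.
BENGALI_INDIA_LCID = 0x0445
LCMAP_SORTKEY = 0x00000400  # ask LCMapStringW for a sort key


def describe(text):
    """List each code point in text with its Unicode character name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


def _kernel32():
    """Load kernel32 on Windows; return None on any other platform."""
    if sys.platform != "win32":
        return None
    return ctypes.WinDLL("kernel32", use_last_error=True)


def compare(a, b, lcid=BENGALI_INDIA_LCID):
    """Return -1, 0, or 1 from CompareStringW, or None when not on Windows."""
    k32 = _kernel32()
    if k32 is None:
        return None
    fn = k32.CompareStringW
    fn.argtypes = [ctypes.c_uint32, ctypes.c_uint32,
                   ctypes.c_wchar_p, ctypes.c_int,
                   ctypes.c_wchar_p, ctypes.c_int]
    fn.restype = ctypes.c_int
    # len() equals the UTF-16 length here because both strings are in the BMP.
    result = fn(lcid, 0, a, len(a), b, len(b))
    if result == 0:
        raise ctypes.WinError(ctypes.get_last_error())
    return result - 2  # CSTR_LESS_THAN=1, CSTR_EQUAL=2, CSTR_GREATER_THAN=3


def sort_key(text, lcid=BENGALI_INDIA_LCID):
    """Return the sort key bytes from LCMapStringW, or None when not on Windows."""
    k32 = _kernel32()
    if k32 is None:
        return None
    fn = k32.LCMapStringW
    fn.argtypes = [ctypes.c_uint32, ctypes.c_uint32,
                   ctypes.c_wchar_p, ctypes.c_int,
                   ctypes.c_void_p, ctypes.c_int]
    fn.restype = ctypes.c_int
    # First call with a zero-sized buffer returns the required size in bytes.
    size = fn(lcid, LCMAP_SORTKEY, text, len(text), None, 0)
    if size == 0:
        raise ctypes.WinError(ctypes.get_last_error())
    buf = ctypes.create_string_buffer(size)
    if fn(lcid, LCMAP_SORTKEY, text, len(text), buf, size) == 0:
        raise ctypes.WinError(ctypes.get_last_error())
    return buf.raw[:size]


if __name__ == "__main__":
    print(describe(KA_HASANT))
    print("compare(KA + hasant, KA):", compare(KA_HASANT, KA))
    for label, text in (("KA + hasant", KA_HASANT), ("KA", KA)):
        key = sort_key(text)
        print(label, "sort key:", None if key is None else key.hex(" "))
```

A result of 1 from compare(KA_HASANT, KA) would correspond to the behavior described above, with the hasant form sorting after the bare letter. What a given Windows version actually returns depends on its collation tables, so treat the output as something to inspect rather than a fixed answer.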

Hoist by mine own petard. Darn.

There was actually some early work done on the collation support for Bengali and Malayalam, but the initial investigation seemed to suggest that the collation would require features in the code that were not yet available in Windows XP. So we reluctantly shipped with only some of the mapping defined. Probably enough for any native speaker to think that a developmentally disabled Bengali youth had tried to define the ordering (hints of truth followed by blatant problems).

Now I can say that a lot of work has gone into all of these "ELK" locales, which have been carried forward to Longhorn. And that many of these known UNDONEs in the language support are being taken as important issues to address. Not really startling news since language support always gets better in each version of Windows, but important to keep in mind if you are trying to look for "the next new thing."

And then on the other side of the issue, later investigation proved that the earlier investigation was more pessimistic than it had to be. Which is okay -- it is the reason that we always follow up our earlier investigation! But it makes it even easier to improve in the future.

So hang in there, things seem destined to get better here.... :-)

 

This post brought to you by "্" (U+09cd, a.k.a. BENGALI SIGN VIRAMA, a.a.k.a. BENGALI HASANT)


