by Michael S. Kaplan, published on 2005/12/02 10:01 -05:00, original URI: http://blogs.msdn.com/b/michkap/archive/2005/12/02/498889.aspx
The other day, someone in another group at Microsoft asked me:
As part of my implementation, I need to map from FExx or FBxx range of Arabic characters back to the base in 0x06xx range. There is a Unicode database text file that I can use (http://www.unicode.org/Public/UNIDATA/UnicodeData.txt), but I was wondering whether any libraries that we ship with Windows would have API’s to do this task, or would you recommend some other way. Thanks for your time.
Indeed, the mapping from the compatibility range into the regular Arabic block is one that is defined in Unicode.
(Regular readers may recall that I talked about this back when I discussed how It Does Not Always Pay to be Compatible).
But you do not have to download the file from unicode.org; that definition is now also available on Microsoft platforms, in the normalization functionality found in Whidbey, in Vista, and in the downlevel package for IDN.
If you normalize to Unicode Normalization Form KC (NFKC), the text is mapped directly out of the compatibility forms and back to the base characters in the regular Arabic block.
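To make that concrete, here is a minimal sketch of the same mapping. It uses Python's standard `unicodedata` module as a stand-in, not the Windows normalization functionality mentioned above, and the helper name `to_base_arabic` is made up for illustration.

```python
import unicodedata


def to_base_arabic(text: str) -> str:
    """Fold compatibility characters, including the Arabic presentation
    forms in the FBxx and FExx ranges, to their base characters by
    normalizing to Form KC. (Hypothetical helper name.)"""
    return unicodedata.normalize("NFKC", text)


# U+FEB8 ARABIC LETTER SHEEN MEDIAL FORM, from the compatibility range
sheen_medial = "\uFEB8"
base = to_base_arabic(sheen_medial)

# Show the code point and character name of the result
print(f"U+{ord(base):04X}", unicodedata.name(base))
# prints: U+0634 ARABIC LETTER SHEEN
```

Note that NFKC is not limited to Arabic: it folds every compatibility character in the string (fullwidth Latin letters, ligatures, and so on), and some presentation forms expand to more than one character. For example, U+FEFB (the isolated LAM WITH ALEF ligature) becomes U+0644 followed by U+0627, so the result can be longer than the input.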
Easy, right? :-)
That IDN download is really cool, by the way; there is a lot of functionality in there!
This post brought to you by "ﺸ" (U+FEB8, a.k.a. ARABIC LETTER SHEEN MEDIAL FORM).
2006/01/14 Getting out of the compatibility zone, redux