Getting out of Dodge (or at least out of the compatibility range!)

by Michael S. Kaplan, published on 2005/12/02 10:01 -05:00, original URI: http://blogs.msdn.com/b/michkap/archive/2005/12/02/498889.aspx


The other day, someone in another group in Microsoft asked me:

As part of my implementation, I need to map from the FExx or FBxx range of Arabic characters back to the base characters in the 0x06xx range. There is a Unicode database text file that I can use (http://www.unicode.org/Public/UNIDATA/UnicodeData.txt), but I was wondering whether any libraries that we ship with Windows have APIs to do this task, or whether you would recommend some other way. Thanks for your time.

Indeed, the mapping from the compatibility range into the regular Arabic block is one that is defined in Unicode.

(Regular readers may recall that I talked about this back when I discussed how It Does Not Always Pay to be Compatible.)
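
To see what that definition looks like, here is a minimal Python sketch that reads the decomposition field (the same data that UnicodeData.txt carries) through the standard `unicodedata` module. The helper name `compat_mapping` is made up for illustration, and the result depends on the Unicode version bundled with your Python build.

```python
import unicodedata

def compat_mapping(ch):
    """Return (tag, base_string) for a compatibility decomposition, else None.

    Hypothetical helper: it only inspects the character's own decomposition
    field and does not apply full normalization.
    """
    field = unicodedata.decomposition(ch)  # e.g. "<medial> 0634"
    if not field.startswith("<"):
        # Empty field or a canonical decomposition: not a compatibility mapping.
        return None
    tag, *codepoints = field.split()
    return tag, "".join(chr(int(cp, 16)) for cp in codepoints)

# U+FEB8 ARABIC LETTER SHEEN MEDIAL FORM maps back to U+0634 ARABIC LETTER SHEEN.
print(compat_mapping("\uFEB8"))
```

The tag (`<initial>`, `<medial>`, `<final>`, `<isolated>`) records which presentation form the compatibility character stood for, which is information you lose once you map back to the base letter.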

But you do not have to pull the file down from unicode.org; that mapping is now also available on Microsoft platforms -- in the normalization functionality found in Whidbey, in Vista, and in the downlevel package for IDN.

If you normalize to Unicode Normalization Form KC (NFKC), you will move text directly out of the compatibility forms and back into the regular Arabic block.
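
As a rough illustration of the idea (not of the Windows APIs themselves), here is the same operation sketched in Python with the standard `unicodedata` module. The function name `out_of_compat` and the sample string are assumptions made up for this example.

```python
import unicodedata

def out_of_compat(text):
    """Normalize text to Form KC, folding compatibility characters
    (including Arabic presentation forms) to their base characters."""
    return unicodedata.normalize("NFKC", text)

# Initial, medial, and final presentation forms of SHEEN (U+FEB7, U+FEB8, U+FEB6).
sample = "\uFEB7\uFEB8\uFEB6"
normalized = out_of_compat(sample)

# Show the code points that come out the other side.
print([f"U+{ord(c):04X}" for c in normalized])
```

Keep in mind that Form KC does not limit itself to Arabic: it also folds other compatibility characters (full-width forms, ligatures, and so on), so it is only the right tool if you want all of those gone too. On the Microsoft side, the corresponding calls should be `NormalizeString` with `NormalizationKC` in Win32 and `String.Normalize(NormalizationForm.FormKC)` in .NET.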

Easy, right? :-)

That IDN download is really cool, by the way -- a lot of functionality in there!

 

This post brought to you by "ﺸ" (U+feb8, a.k.a. ARABIC LETTER SHEEN MEDIAL FORM)



