There's no "I" in IDN, part 2: Try not to use the wrong functions!

by Michael S. Kaplan, published on 2011/04/02 07:03 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2011/04/02/10149149.aspx


It started a while back in There's no "I" in IDN, part 1: If you're not Unicode, you're just wrong!

And in that blog I mentioned some functions that can't handle Unicode domain names, and thus should never be used in situations where Unicode domain names might be needed or expected.

But there is another class of usage to look out for -- functions that are just fine on their own in the situations they were designed for, but which you have to be careful about using once you move into the world of IDN and EAI and such.

Like the question that came up the other day:

Does anyone know a way to convert full-width characters to half width (or vice versa)? We have a need where when we generate hash of a string that needs to be equal with width insensitivity.

Here is the example of equivalent strings when width is ignored.

test53496aaa@ae2.dion.ne.jp
test53496aaa@ae2.dion.ne.jp

Of course the simple answer would be to use LCMapStringW/LCMapStringEx to convert between full and half width.
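To make "converting between full and half width" concrete, here is a minimal sketch in Python (chosen only so it runs anywhere; it is not the Win32 API). The helper name to_halfwidth is hypothetical, and it deliberately covers only the fullwidth ASCII block (U+FF01 through U+FF5E) plus the ideographic space, so treat it as an illustration of the idea rather than a replacement for LCMapStringEx, which handles more than this.

```python
# Fullwidth ASCII variants U+FF01..U+FF5E sit exactly this far
# above their ordinary counterparts U+0021..U+007E.
FULLWIDTH_OFFSET = 0xFEE0


def to_halfwidth(text: str) -> str:
    """Fold fullwidth ASCII variants (and U+3000) to plain ASCII.

    Hypothetical helper for illustration only; everything outside
    the fullwidth ASCII block is passed through unchanged.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if 0xFF01 <= cp <= 0xFF5E:
            out.append(chr(cp - FULLWIDTH_OFFSET))
        elif cp == 0x3000:  # IDEOGRAPHIC SPACE
            out.append(" ")
        else:
            out.append(ch)
    return "".join(out)
```

Folding both strings this way before hashing would give width-insensitive hashes, but it knows nothing about domain name rules, which is where the trouble starts.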

But if these entries are domain names (they look like email addresses, which contain domain names), then following the IDN rules and using the IDN functions -- in this case IdnToNameprepUnicode -- to do the various conversions (at least for the "domain" piece after the "@" sign) can make a lot more sense.
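A rough sketch of that approach, again in Python rather than Win32: the standard library's encodings.idna.nameprep plays the role that IdnToNameprepUnicode plays on Windows (the two are assumed to be comparable here, not verified to agree in every case). The helper names normalize_domain and address_key, the choice of SHA-256, and the decision to split on a plain ASCII "@" are all assumptions made for illustration; the text before the "@" is left untouched.

```python
import hashlib
import re
from encodings.idna import nameprep

# Label separators recognized by IDNA 2003: full stop, ideographic
# full stop, fullwidth full stop, halfwidth ideographic full stop.
IDNA_DOTS = re.compile("[\u002e\u3002\uff0e\uff61]")


def normalize_domain(domain: str) -> str:
    """Apply Nameprep to each label and rejoin with ASCII dots."""
    labels = IDNA_DOTS.split(domain)
    return ".".join(nameprep(label) for label in labels)


def address_key(address: str) -> str:
    """Hash an address after normalizing only the domain part.

    Hypothetical helper: assumes an ASCII "@" separates the two parts
    and leaves the local part exactly as given.
    """
    local, sep, domain = address.rpartition("@")
    if not sep:
        raise ValueError("no '@' found in address")
    canonical = local + "@" + normalize_domain(domain)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

With this, an address whose domain is written in fullwidth characters hashes the same as its ordinary counterpart. Note that Nameprep also case-folds, so it does more than width folding, which is rather the point: it applies the rules that domain names actually follow.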

The issue of the text before the "@" sign is slightly more complicated, so we'll leave that for another time...

But for now, the rule is simple: don't do IDN things with non-IDN functions!


