The internationalization of spam

by Michael S. Kaplan, published on 2011/04/15 07:01 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2011/04/15/10154335.aspx


In the world of spam and malicious code (I am treating it as one world for current purposes, though it is really two separate, somewhat overlapping worlds), whether it is the code in spam/malicious e-mails or the code in spam/malicious web sites, the bulk of the initial efforts did not do much to support international text.

Now in my heart of hearts I like to see code done correctly.

But I am hardly going to go into the spam business.

Insert gratuitous Monty Python Spam video here.

I'm sure they might pay well, mainly because I think I was almost offered such a job once. It's just that while the person was feeling out my interest, my disdain for the efforts was not as veiled as it might have been.

Wes Miller mentioned the other day in his Spamsplosion blog:

So the irony here was that I had to actually switch to using Unicode everywhere because spammers (in addition to doing some pretty conniving stuff to get into your inbox and get you to click on things), are actually sending messages in Unicode, most likely not to enable localized text, but to evade spam-sniffing tools that (ahem, like mine) blow up or skim over when Unicode text comes along.

That is probably part of it, but there is more to it than that.

Far more tools support Unicode now than did before, and UTF-8 is very often the default in those tools.

On top of that, many spam-sniffing tools also look for messages that use code pages the reader would not expect.
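To make the code page check concrete, here is a minimal sketch in Python using the standard library's email package. It is not how any particular filter works; the allow-list EXPECTED_CHARSETS and the function name unexpected_charsets are hypothetical, chosen only for illustration. It simply reports the declared charsets in a message that a given reader would not normally expect to see.

```python
from email import message_from_string

# Hypothetical allow-list: the charsets this particular reader expects.
EXPECTED_CHARSETS = {"us-ascii", "utf-8", "iso-8859-1", "windows-1252"}


def unexpected_charsets(raw_message, expected=EXPECTED_CHARSETS):
    """Return the declared charsets in a message that are not in `expected`."""
    msg = message_from_string(raw_message)
    found = set()
    # walk() visits the message itself and every MIME part inside it.
    for part in msg.walk():
        # get_content_charset() returns the charset parameter of the
        # Content-Type header in lower case, or None if none is declared.
        charset = part.get_content_charset()
        if charset is not None and charset not in expected:
            found.add(charset)
    return found


if __name__ == "__main__":
    sample = 'Content-Type: text/plain; charset="koi8-r"\nSubject: hi\n\nbody\n'
    print(unexpected_charsets(sample))
```

A message that declares UTF-8 passes a check like this without comment, which is one reason a spammer might prefer it to a legacy code page.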

Plus, spam is starting to take advantage of lookalike characters, something much easier in Unicode since there are more of them (and no matter how smart Internet Explorer gets in this area for IDN, email programs are not all doing as good a job with the arbitrary text found in email, so filters looking for the names of erectile dysfunction drugs may or may not know to look for the math alphanumerics and so forth).
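As a rough illustration of the math alphanumerics problem, the sketch below (Python, standard library only) folds text with NFKC compatibility normalization before doing a keyword match. The function names fold_lookalikes and mentions are hypothetical, and this is only one possible approach, not a description of what any real filter does.

```python
import unicodedata


def fold_lookalikes(text):
    """Fold compatibility variants (math alphanumerics, fullwidth forms,
    and so on) to their plain equivalents, then case-fold."""
    return unicodedata.normalize("NFKC", text).casefold()


def mentions(text, keyword):
    """Return True if `keyword` appears in `text` after folding both."""
    return fold_lookalikes(keyword) in fold_lookalikes(text)


if __name__ == "__main__":
    # "pills" spelled with MATHEMATICAL BOLD SMALL letters (U+1D429, ...).
    bold = "\U0001D429\U0001D422\U0001D425\U0001D425\U0001D42C"
    print("pills" in bold)           # a naive substring test misses it
    print(mentions(bold, "pills"))   # the folded comparison finds it
```

NFKC only covers characters that have a compatibility mapping. A Cyrillic or Greek letter that merely looks like a Latin one is left alone, so a filter that wants to catch those would need something more, such as the confusables data that Unicode publishes.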

There is a part of me that really wishes for more appropriate uses of ISO-10646 than support of spam. Maybe I should form PE-UU (People for the Ethical Use of Unicode). Or maybe I should work on the name some.

