Anti-harvesting measures

how to prevent your address from being harvested by spam bots

In mid-November 2006, I began an experiment in protecting email addresses from spam-harvesting robots. My testbed was this website, which is admittedly not the highest-traffic site in the world; in that respect, it is probably representative of most other websites.

I planted addresses on my elsewhere page, in a hidden <div>. (They have since been removed.)

results

Interestingly, the spambots don't include a dash or a plus as part of an address: s-mailto became just mailto, and s-prot+REMOVEME became removeme. Note the change to lowercase. Posting an address in uppercase and filtering out all mail sent to its lowercased form therefore seems likely to be effective. It's also permitted by RFC 822: email programs are allowed to treat the local part of an address as case-sensitive, except for the special address POSTMASTER, which must be accepted regardless of case.
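The behaviour described above can be sketched roughly as follows. This is only a guess at what a harvester might be doing, not the actual code of any spambot: the pattern `NAIVE_ADDRESS`, the helpers `harvest` and `accept`, and the `example.com` addresses are all made up for illustration. The filter compares the local part case-sensitively, lets any spelling of postmaster through, and treats the host part as case-insensitive.

```python
import re

# Hypothetical harvester pattern (an assumption, not taken from a real bot):
# the local part allows letters, digits, dots and underscores, but not '-' or '+'.
NAIVE_ADDRESS = re.compile(r"[A-Za-z0-9._]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def harvest(page_text):
    """Return addresses the way a sloppy bot might: truncated and lowercased."""
    return [match.lower() for match in NAIVE_ADDRESS.findall(page_text)]


def accept(recipient, published):
    """Accept mail only if the local part matches the published spelling exactly."""
    local, _, host = recipient.rpartition("@")
    pub_local, _, pub_host = published.rpartition("@")
    if host.lower() != pub_host.lower():
        return False  # host names are compared without regard to case
    if local.lower() == "postmaster":
        return True  # postmaster must be accepted regardless of case
    return local == pub_local  # case-sensitive comparison of the local part


page = "Write to s-mailto@example.com or s-prot+REMOVEME@example.com"
print(harvest(page))  # ['mailto@example.com', 'removeme@example.com']

published = "JSmith@example.com"  # hypothetical mixed-case published address
print(accept("JSmith@example.com", published))  # True
print(accept("jsmith@example.com", published))  # False: lowercased by a harvester
```

Everything before the dash or plus is lost because the pattern cannot start a match until it reaches a run of allowed characters that ends directly at the @ sign.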

address                              spams, days to harvest (after first spam)
s-mailto (became mailto)             1070
s-prot+REMOVEME (became removeme)    223
s-inline (became inline)             299
s-span                               none!, none!

From this, it appears that the simple strategy of enclosing the recipient username and the host parts of the address in separate HTML <span> tags defeats just about all spam harvesters. Apparently they run badly tuned regexes against the raw text of the page, rather than doing the really hard work of parsing HTML.
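A minimal sketch of why this might work, under some assumptions: the exact markup used on the test page is not shown here, so `span_wrap` below is one plausible layout (username and host each in its own span, with the @ sign between them), and `NAIVE_ADDRESS` is a made-up stand-in for a harvester's regex. The `TextOnly` class uses Python's standard `html.parser` to show what a harvester would see if it did bother to parse the HTML.

```python
import re
from html.parser import HTMLParser

# Hypothetical harvester pattern, assumed for illustration only.
NAIVE_ADDRESS = re.compile(r"[A-Za-z0-9._]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def span_wrap(user, host):
    """Wrap the username and host in separate <span> tags (assumed layout)."""
    return f"<span>{user}</span>@<span>{host}</span>"


class TextOnly(HTMLParser):
    """Collect only the text content of a page, dropping all tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def visible_text(markup):
    """Return the text a browser would display for the given markup."""
    parser = TextOnly()
    parser.feed(markup)
    parser.close()
    return "".join(parser.chunks)


markup = span_wrap("someone", "example.com")
print(markup)                                        # <span>someone</span>@<span>example.com</span>
print(NAIVE_ADDRESS.findall(markup))                 # []
print(NAIVE_ADDRESS.findall(visible_text(markup)))   # ['someone@example.com']
```

The regex fails on the raw markup because the character just before the @ sign is the closing `>` of a tag, which is not a legal address character. A browser still renders the address normally, and a harvester that stripped tags first would find it, so the protection holds only as long as bots keep skipping that step.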