
Obscuring email addresses from spammers
July 5, 2008
Today (and yesterday and the day before) I was working to upgrade a website’s defenses against spam-bots (web spiders, bots, etc.) which crawl the internets looking for email addresses to scoop up and add to their mailing lists. Apparently people still don’t know they can buy viagra online (???).
Anyway, I ended up doing quite a bit of research into protecting my site, and those who entrust their emails to me, from spammers. The site also suddenly began having a problem with bots filling out the ‘contact’ form and submitting it… repeatedly (like once every 2 minutes). Needless to say, this was very annoying. I came up with some solutions (most of these are modified from elsewhere) and found some good websites with information about protecting your site.
Protecting Email addresses from bots:
In my research I found a couple of very useful sites which you may want to read and which provide much more information than I will here:
http://nadeausoftware.com/[...]protect_email_addresses_spammers
http://inews.berkeley.edu/bcc/Winter2003/feat.spamharvest.html
http://reliableanswers.com/js/mailme.asp
All of these pages present methods to prevent bots from harvesting email addresses from your website. The first article (from nadeausoftware.com) seems to be the most in-depth, featuring a test with several methods put up against multiple (readily available) bots. The general recommendation is separating the email address and putting it on multiple lines with no at-sign (@).
For example:
name:notreal
domain:site.com
This seems like an effective method, especially if you only have a few email addresses on your site. However, it is kind of annoying as far as copying and pasting goes. (I think that most people would agree that ‘mailto’ links are not particularly desirable or useful in this age of webmail clients.)
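A rough sketch of producing that two-line form from a normal address, in JavaScript. The function name splitAddress is made up for illustration (it is not from any of the sites above), and it assumes a plain name@domain address with nothing fancy in it.

```javascript
// Hypothetical helper: turns "notreal@site.com" into the two-line,
// at-sign-free form shown above.
function splitAddress(address) {
  // Use the last "@" so the split lands just before the domain.
  const at = address.lastIndexOf("@");
  if (at < 1 || at === address.length - 1) {
    throw new Error("not a simple name@domain address: " + address);
  }
  const name = address.slice(0, at);
  const domain = address.slice(at + 1);
  return "name:" + name + "\n" + "domain:" + domain;
}
```

Calling splitAddress("notreal@site.com") gives back the two lines from the example above.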
My favorite method (from all of these sites) however is obscuring an email address using JavaScript. This method counts on a bot not being JavaScript enabled (which for now is a pretty good bet but could be a poor assumption in the future: no method is going to protect you forever).
There were many different methods for doing this (I’m sure all of you can read the sites, so I’ll just show you how I did it; if you don’t like my method, check theirs), but I chose to write the at-sign using JavaScript in between the name and the domain name.
For example:
example<script type="text/javascript">document.write("@");</script>example.com
prints: example@example.com
I feel like this is a good method as it’s difficult to pattern match, having no visible @ and containing spaces. It should be noted that (as all the sites I linked to above state) not ALL browsers/people have JavaScript enabled (most do though). This is still not a horrible option as they will see exampleexample.com as the email address… which, for a person, isn’t too difficult to figure out (hopefully).
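If you have more than a couple of addresses, the markup can be generated instead of typed by hand. This is only a sketch: buildObscuredEmail is a made-up name, and the character checks are my own assumption about what a "simple" address looks like, there so nothing odd ends up in the page.

```javascript
// Hypothetical helper: builds the markup for the document.write("@") trick
// described above. It returns a string; it does not touch the page itself.
function buildObscuredEmail(name, domain) {
  // Assumption: only plain address characters are allowed, so the pieces
  // can be dropped into HTML without escaping.
  if (!/^[A-Za-z0-9._+-]+$/.test(name)) {
    throw new Error("unexpected characters in name: " + name);
  }
  if (!/^[A-Za-z0-9.-]+$/.test(domain)) {
    throw new Error("unexpected characters in domain: " + domain);
  }
  // "<\/script>" is written with a backslash so this code can itself sit
  // inside a script tag without ending it early; the resulting string
  // still contains a normal closing tag.
  return name +
    '<script type="text/javascript">document.write("@");<\/script>' +
    domain;
}
```

The returned string for ("example", "example.com") is the same markup as in the example above, and the complete address never appears in it as one piece.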
As for stopping bots from using forms – I ended up using a combination of things to (try to) prevent this. First, I added a timer to the form – on one end I record the starting time
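(in PHP)
$time = time(); // sent along with the form, e.g. in a hidden field named "time"
On the other end, I check that at least 5 seconds have passed. You may wish to set the minimum time lower; bots fill out the forms in less than a second, so I am just providing a little padding.
if (time() - (int) $_POST['time'] < 5) { /* submitted too quickly: do whatever (e.g. reject it) */ }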
This seems relatively effective (it stopped whatever bot was attacking my site), but just to be on the safe side, I implemented a couple of other safeguards.
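The same timer check as a slightly fuller sketch, written in JavaScript rather than PHP (for example for a Node back end). The names submittedTooFast and MIN_SECONDS are made up, and nowSeconds stands in for PHP's time(). It assumes the start time comes back as a form field, which means a determined bot could forge it; keeping the start time on the server (in a session, say) would be sturdier.

```javascript
// Assumed minimum time a human needs to fill out the form, in seconds.
const MIN_SECONDS = 5;

// Hypothetical helper. startedAtField is the raw value posted back with the
// form; nowSeconds is the current Unix time in seconds.
function submittedTooFast(startedAtField, nowSeconds, minSeconds = MIN_SECONDS) {
  const startedAt = Number.parseInt(startedAtField, 10);
  // A missing or non-numeric value is treated like a too-fast submission.
  if (Number.isNaN(startedAt)) {
    return true;
  }
  // Elapsed time is "now minus start"; a start time in the future gives a
  // negative number and is also rejected.
  return nowSeconds - startedAt < minSeconds;
}
```

In use, something like submittedTooFast(form.time, Math.floor(Date.now() / 1000)) would run before the rest of the form is processed, where form.time is whatever your framework calls the posted field.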
1. On all the forms which you can submit without logging in, I placed a simple math question like "# plus/minus #". Most bots won’t be able to answer this correctly (I am considering sending the failing tries to myself or logging them just to see the responses). (This sort of check is technically a CAPTCHA; another version would be images containing transformed text.)
2. I added some fields which are hidden by CSS formatting ( display:none; ) that have set values… if the field gets filled out (the value changed) I know it’s a bot.
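The two checks above might look something like this on the server side. It is a sketch in JavaScript, not the code running on my site: makeQuestion, answerIsCorrect and honeypotTouched are made-up names, the 1 to 9 number range is an arbitrary choice, and the question object is assumed to be kept on the server between showing the form and checking it.

```javascript
// Builds a "# plus/minus #" question. The random source is a parameter so
// the function can be tested; by default it uses Math.random.
function makeQuestion(random = Math.random) {
  const a = 1 + Math.floor(random() * 9); // 1..9
  const b = 1 + Math.floor(random() * 9); // 1..9
  const op = random() < 0.5 ? "plus" : "minus";
  return {
    text: a + " " + op + " " + b,
    answer: op === "plus" ? a + b : a - b,
  };
}

// True only if the submitted text is a whole number equal to the answer.
function answerIsCorrect(question, submitted) {
  const s = String(submitted).trim();
  return /^-?\d+$/.test(s) && Number.parseInt(s, 10) === question.answer;
}

// fields: the posted form values. expected: the hidden fields and the values
// they were given. A changed or missing hidden field counts as touched.
function honeypotTouched(fields, expected) {
  return Object.keys(expected).some((name) => fields[name] !== expected[name]);
}
```

A submission would then be dropped if honeypotTouched(...) is true or answerIsCorrect(...) is false.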
Between the timer and these two checks, the spam has stopped altogether.
Hopefully some of these ideas can be useful to you.
(These are my opinions and not a technical assessment, use these ideas and code snippets at your own risk, I am providing this information as an opinion and without any guarantee. Use this information as a starting point, not as an answer.)