
Spam prevention - An idea to consider?

Anteaus
Grafter
Posts: 64
Thanks: 1
Registered: 02-08-2007

Spam prevention - An idea to consider?

None of us need telling that ISPs have massive problems with spam.
Most protective actions rely on trying to stop the spam getting into the Plusnet network. How about looking at the other side of the equation: stopping spammers from harvesting email addresses from Plusnet-hosted websites?
In principle this could be of great benefit to the user, and save a lot of ISP bandwidth.
The main issue where spam is concerned is users, or their web designers, posting unprotected 'mailtos' on webpages. These mailtos are harvested by spidering bots and used to build spamvertising lists. Uni research indicates that 80-90% of all spam results from mailtos being harvested.
If the Plusnet webservers had a mechanism to warn owners of pages with harvestable email addresses on them, or perhaps even to return an error code for such pages instead of publishing them, who knows, this might just nail the spam problem at source, once and for all. Even if that's unrealistic, a 50% reduction would be more than worthwhile.
As for implementation, the sophisticated route would be an Apache module which scans any requested page it hasn't seen before, or which has been modified since the last scan. The scan would use regexes or DOM parsing to find mailtos. Preferably regexes, as DOM parsing is prone to missing constructs in bad HTML. Optionally it could also find anything else which matches a Plusnet email address. (Scanning for any email address at all would give too many false positives on other structures containing @-signs, so is probably not advised.)
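To make the regex idea concrete, here is a minimal Python sketch of just the detection step. It is not an Apache module, and both patterns are my own assumptions rather than anything Plusnet uses: `MAILTO_RE` looks for `href="mailto:..."` attributes, and `HOSTED_ADDR_RE` assumes hosted addresses end in `plus.net`. The function name `find_harvestable` is made up for illustration.

```python
import re

# Assumed pattern: an href attribute whose value starts with "mailto:".
# The address is captured up to a quote, whitespace, ">" or "?" (query string).
MAILTO_RE = re.compile(r'href\s*=\s*["\']?\s*mailto:([^"\'\s>?]+)', re.IGNORECASE)

# Assumed pattern: any address at plus.net or a subdomain of it.
# Restricting the domain avoids flagging every stray "@" on a page.
HOSTED_ADDR_RE = re.compile(
    r'\b[A-Za-z0-9._%+-]+@(?:[A-Za-z0-9-]+\.)*plus\.net\b', re.IGNORECASE
)


def find_harvestable(html: str) -> set[str]:
    """Return the lower-cased addresses a simple harvester could lift from html."""
    found = {m.group(1).lower() for m in MAILTO_RE.finditer(html)}
    found |= {m.group(0).lower() for m in HOSTED_ADDR_RE.finditer(html)}
    return found


if __name__ == "__main__":
    page = '<a href="mailto:Bob@example.com?subject=hi">Mail me</a> or alice@plus.net'
    print(sorted(find_harvestable(page)))
```

A real module would also need the "seen before / modified since" cache described above; that part is left out here.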
The low-tech way would be to use a script to scan the server filesystem periodically and notify users of harvesting vulnerabilities found in html/php files. (It's not hard to know whom to notify either: if a page is vulnerable, then the contact info IS the vulnerability!) Optionally the script could also change the permissions or the file extension to prevent publication and consequent harvesting. Or is that too invasive?
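A rough sketch of what such a periodic scan could look like in Python, assuming the sites live under a single directory tree. The names `scan_tree` and `SCANNED_EXTENSIONS` and the mailto pattern are hypothetical, and the sketch only builds a report; it does not send notifications or touch permissions.

```python
import os
import re

# Assumed pattern: anything following "mailto:" up to a quote, space, ">" or "?".
MAILTO_RE = re.compile(r'mailto:([^"\'\s>?]+)', re.IGNORECASE)

# Assumed set of file types worth scanning.
SCANNED_EXTENSIONS = (".html", ".htm", ".php")


def scan_tree(root):
    """Return {file path: sorted list of mailto addresses found in that file}."""
    report = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.lower().endswith(SCANNED_EXTENSIONS):
                continue
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="replace") as fh:
                    text = fh.read()
            except OSError:
                continue  # unreadable file: skip it rather than abort the scan
            addresses = sorted({m.group(1).lower() for m in MAILTO_RE.finditer(text)})
            if addresses:
                report[path] = addresses
    return report
```

Since each reported address is itself the contact for the page owner, the report doubles as the notification list. Note that this only sees files on disk, so pages generated from a database would be missed.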
There might need to be a mechanism to exclude sites with an (apparent) infinite number of mailtos, such as are created by honeypots. 
Admittedly this would only be fully effective if implemented by the majority of ISPs and webhosts. Even so, any ISP or webhost implementing webserver harvesting protection should see a drastic reduction in inbound spam, since a high proportion of the harvested addresses will be at same-site accounts. Thus it would be of immediate benefit to that same ISP.
The other advantage would be that lazy commercial web-design firms (the main problem!) would be forced to implement harvesting protection, if not doing so meant their client being notified of their negligence.
On a wider scale, this is maybe something whose merits an ISP working group could consider.
-Thoughts?
5 REPLIES
phil4
Grafter
Posts: 244
Registered: 13-12-2007

Re: Spam prevention - An idea to consider?

I'm a little unsure who you're suggesting does the scanning/runs the Apache module. If it's PN themselves, then it'll only have any effect on sites they host, which could be a tiny proportion of mailtos, compared to, say, people using their PN address out on forums, shopping sites etc.
Another aspect to consider, which I see a lot of, is spambots taking the domain part of the address and then guessing at the first bit, so @plus.net and then guesses at common names, character combos etc. Nothing much can stop this.
My personal fave is the hashcash method, which I use on mailservers I look after. Unfortunately, as with most solutions, it needs widespread adoption to be effective. For the uninitiated, sending an email with a hashcash stamp requires the sender to do some intensive computational work to generate the hash before sending the email. This should then discourage spammers of the semi-legitimate kind (i.e. it'll not stop botnets).
Unfortunately, regardless of the ideas, spam is something that needs monumental international cooperation, something I doubt I'll even see in my offspring's lifetime, let alone mine.
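For anyone who wants to see the proof-of-work idea in code, here is a simplified Python sketch. It is not the real hashcash stamp format (which carries a version, date and random field); the stamp here is just a hypothetical `recipient:counter` string, and the function names `mint` and `verify` are made up. The principle is the same: the sender searches for a counter whose SHA-1 hash starts with a given number of zero bits, and the receiver checks it with a single hash.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count how many zero bits the digest starts with."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def mint(recipient: str, bits: int = 12) -> str:
    """Sender side: brute-force a counter until the hash has enough zero bits."""
    for counter in count():
        stamp = f"{recipient}:{counter}"
        if leading_zero_bits(hashlib.sha1(stamp.encode()).digest()) >= bits:
            return stamp


def verify(stamp: str, recipient: str, bits: int = 12) -> bool:
    """Receiver side: one hash checks that the work was done for this recipient."""
    if not stamp.startswith(recipient + ":"):
        return False
    return leading_zero_bits(hashlib.sha1(stamp.encode()).digest()) >= bits
```

Each extra bit roughly doubles the sender's average work while the receiver's cost stays at one hash, which is where the asymmetry comes from. A production scheme would also have to reject reused stamps, which this sketch does not do.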
Anteaus
Grafter
Posts: 64
Thanks: 1
Registered: 02-08-2007

Re: Spam prevention - An idea to consider?

Quote from: phil4
I'm a little unsure who you're suggesting does the scanning/runs the Apache module. If it's PN themselves, then it'll only have any effect on sites they host, which could be a tiny proportion of mailtos, compared to, say, people using their PN address out on forums, shopping sites etc.

Obviously, Plusnet (and other participating webhosts) would do the scanning.
I covered the fact that to be fully effective, the measure would need fairly wide uptake by hosting companies. But, as I also pointed out, locally-hosted websites with synonymous email addresses would account for a fair proportion of the spam destined for Plusnet servers. Thus, an immediate benefit is apparent.
Looking at the global picture, it is exceedingly rare for shopping sites to leak email details. Neither do phpBB forums, though there is one well-known forum product which leaks addresses. Meanwhile, business directories used to be a major contributor, but in recent years most have tightened up their security. Facebook and Twitter don't expose your email address, unless, that is, you are careless enough to deliberately flash it around.
And, I daresay you can't stop a fool from posting his email address in a newsgroup, but then if that same fool had a webpage taken offline because of its having harvesting issues, he might be a bit wiser next time he visits a newsgroup... and might not repeat the mistake.
That, and while backscatter-attacks on domains do take place, I don't think they represent a statistically significant  part of spam volumes. In any event it is relatively easy to protect a mailserver from backscatter, by temporarily banning any IP which sends more than (say) ten invalid messages.
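The temporary-ban rule can be sketched in a few lines of Python. This is only an illustration of the counting logic, not a plug-in for any particular mailserver: the class name `InvalidRecipientLimiter`, the ten-minute counting window and the one-hour ban are all assumptions of mine. Only the "more than ten invalid messages" threshold comes from the suggestion above.

```python
import time


class InvalidRecipientLimiter:
    """Temporarily ban an IP that sends too many messages to invalid addresses."""

    def __init__(self, max_invalid=10, window_seconds=600, ban_seconds=3600,
                 clock=time.monotonic):
        self.max_invalid = max_invalid        # allowed invalid messages per window
        self.window_seconds = window_seconds  # assumed counting window
        self.ban_seconds = ban_seconds        # assumed ban length
        self._clock = clock
        self._failures = {}      # ip -> timestamps of recent invalid messages
        self._banned_until = {}  # ip -> time at which the ban expires

    def is_banned(self, ip):
        until = self._banned_until.get(ip)
        if until is None:
            return False
        if self._clock() >= until:
            del self._banned_until[ip]  # ban has expired
            return False
        return True

    def record_invalid(self, ip):
        """Call once for each message from ip addressed to a non-existent mailbox."""
        now = self._clock()
        recent = [t for t in self._failures.get(ip, [])
                  if now - t < self.window_seconds]
        recent.append(now)
        if len(recent) > self.max_invalid:
            self._banned_until[ip] = now + self.ban_seconds
            recent = []
        self._failures[ip] = recent
```

The mailserver would call `is_banned(ip)` when a connection arrives and `record_invalid(ip)` whenever a recipient lookup fails.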
My feeling is that the Apache module and an error code would be the way to go, mainly because this covers CMS/SQL pages, which filesystem scanning misses. Then, if people demand their 'human right' to have their email address harvested (rolls eyes), there could be a .htaccess command to turn the module off.
phil4
Grafter
Posts: 244
Registered: 13-12-2007

Re: Spam prevention - An idea to consider?

It certainly works, but as I mentioned requires so many different people and products to cooperate that I doubt it'll ever happen.
That's not to say it's not a good idea, I've just been in the game so long I'm highly sceptical that it'll ever roll.
What the world needs is one of two things:
1) Some funky social media/web 2.0 thing that everyone uses in order to be popular, which either replaces email and/or is the solution to spam. The social/web 2.0 aspect is purely there to increase success.
2) Something so staggeringly simple, free and easy, and 100% effective, that everyone can do it, and so does.
So while hashcash and Apache modules are nice ideas, the fact that neither fits the above criteria suggests to me that they'll never work.
Please, however, feel free to start persuading companies to adopt your suggestion, though I'd suggest writing the Apache module for them might give you better success.
Call me bitter and cynical if you like, but it's years of experience that have taught me to be this way.
Anteaus
Grafter
Posts: 64
Thanks: 1
Registered: 02-08-2007

Re: Spam prevention - An idea to consider?

Quote from: phil4
... though I'd suggest writing the Apache module for them might give you better success.

At this stage I just wanted to test whether there is any interest in the idea. As I'm not an Apache coder, it might in any case be simpler and quicker to find someone who is than to learn the specialist skills. For that matter an IIS version would be needed too, and that needs yet another skillset. If the idea has promise then I'm sure we can get some coders on board.


phil4
Grafter
Posts: 244
Registered: 13-12-2007

Re: Spam prevention - An idea to consider?

Unfortunately I don't work in any circles that would make use of the module (IIS or Apache); it's a different bit of IT I'm in.
I guess all I can really give you, and it's not of much use, is my opinion as above.
Wish you the best, as while my view is doom and gloom, this certainly is as good as any idea.