
Post #1 on Why Spam Filters Suck “trickle blog” series

By Desmond Liao | 3 minute read

A Short History of Spam Protection

While methods have changed, spam continues to be the misuse of an open communication network for financial gain. What was once a harmless annoyance has become a serious problem: high spam traffic can clog email servers to the detriment of legitimate mail.

How did we get here? And what can we change to solve the problem?

The first spam email ever was sent in 1978 to promote a seminar from Digital Equipment Corporation (DEC). I’d call it spam because it was a mass emailing, with addresses harvested from a printed directory of ARPAnet, sent to recipients who had not requested any contact.

Spam didn’t become a huge problem until around 2002, when there were enough active email users worldwide to make spamming profitable. In response, the first commercial and open source spam filters arrived: Brightmail, PureMessage, and SpamAssassin, to name a few. This first generation of filters applied sets of rules to each message received, identifying features within the message that might indicate it was spam.
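To make the rule-based idea concrete, here is a minimal sketch in Python. The rule names, patterns, scores, and threshold are all made up for illustration; they are not taken from Brightmail, PureMessage, or SpamAssassin, and real filters use far larger rule sets.

```python
import re

# Hypothetical rules: (name, pattern, score). Invented for this sketch only.
RULES = [
    ("mentions_viagra", re.compile(r"\bviagra\b", re.IGNORECASE), 2.5),
    ("shouting_subject", re.compile(r"^Subject: [^a-z\n]+$", re.MULTILINE), 1.5),
    ("pushy_phrase", re.compile(r"\b(act now|free money)\b", re.IGNORECASE), 2.0),
]

# Assumed cutoff: a message scoring at or above this is treated as spam.
THRESHOLD = 4.0


def spam_score(message: str) -> float:
    """Add up the scores of every rule whose pattern appears in the message."""
    return sum(score for _name, pattern, score in RULES if pattern.search(message))


def is_spam(message: str) -> bool:
    """Apply the threshold to the combined rule score."""
    return spam_score(message) >= THRESHOLD


spam_example = "Subject: BUY NOW\n\nCheap Viagra, act now!"
ham_example = "Subject: Lunch tomorrow\n\nSee you at noon."
```

Each rule looks at one feature of the message, and the filter only decides after combining them. That is also the weakness: every rule is a fixed pattern a spammer can study and route around.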

Spammers countered rule-based filters by obfuscating the content of their messages. Rather than sending a plain text message advertising Viagra, for example, the spammer might chop the message into small HTML pieces which, while unrecognizable to the spam filter, would still render into legible text for the message recipient. The rule-based filters added more rules to catch these obfuscations, pushing the spammers to innovate further. This pattern of content obfuscation continues to the present day; the most recent example is probably MP3 spam (i.e., a spam message contained in an audio file).
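The HTML trick is easy to see in a small Python sketch. The message, the keyword rule, and the `TextExtractor` helper are hypothetical; the point is only that a pattern run against the raw source misses a word that reappears once the markup is stripped, which is roughly what the recipient sees.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Hypothetical helper: collects only the text a reader would see."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def rendered_text(html: str) -> str:
    """Drop all tags and join the remaining text fragments."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


# A naive keyword rule, as a first-generation filter might have used.
KEYWORD = re.compile(r"viagra", re.IGNORECASE)

# The word is chopped up by empty tags that render as nothing.
raw_message = "<p>Buy Vi<span></span>ag<b></b>ra today</p>"

missed_on_raw = KEYWORD.search(raw_message) is None
caught_on_rendered = KEYWORD.search(rendered_text(raw_message)) is not None
```

Stripping tags is itself one more rule added to the filter, and it costs extra processing on every message, which is where the arms race starts to show up as server load.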

Anti-spam is one of those areas of IT where you’re “damned if you don’t.” If email is flowing free of spam, you hear nothing. But when spam is getting through or emails are backlogged on the server, there’s hell to pay.

Why is spam causing backlogs? Why is all mail treated equally? And do we need to keep adding what are effectively junk processing servers?

As the sophistication of spam has increased, so has the processing power needed to analyze those messages. Today, with email servers under high traffic loads, the ever-increasing computational cost of analyzing the content of every email often results in service disruptions for legitimate email. This has to change. IT infrastructure costs should be a function of legitimate activity, not spammer-driven loads.

The loading problem comes from the current method of spam filtering, where all incoming email messages are accepted by the server and buffered in a common queue on a first-come, first-served basis. To solve it, there needs to be a shift away from a single queue of email traffic towards a prioritized system that expedites legitimate mail.
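As a rough illustration of the difference between a single queue and a prioritized one, here is a Python sketch. How mail gets classified as likely legitimate is not covered in this post, so the `TRUSTED_SENDERS` set below is purely a stand-in assumption, as are the class and method names.

```python
import heapq
from itertools import count

# Stand-in assumption: some earlier step has identified likely-legitimate senders.
TRUSTED_SENDERS = {"alice@example.com"}


class PrioritizedMailQueue:
    """Hypothetical queue that hands out likely-legitimate mail first."""

    def __init__(self):
        self._heap = []
        self._arrival = count()  # tie-breaker that preserves arrival order

    def enqueue(self, sender: str, body: str) -> None:
        # Lower number means higher priority: 0 for trusted, 1 for everyone else.
        priority = 0 if sender in TRUSTED_SENDERS else 1
        heapq.heappush(self._heap, (priority, next(self._arrival), sender, body))

    def dequeue(self):
        _priority, _order, sender, body = heapq.heappop(self._heap)
        return sender, body

    def __len__(self):
        return len(self._heap)


queue = PrioritizedMailQueue()
queue.enqueue("bulk1@example.net", "Limited offer")
queue.enqueue("bulk2@example.net", "Act now")
queue.enqueue("alice@example.com", "Meeting notes")

processing_order = [queue.dequeue()[0] for _ in range(len(queue))]
```

In a first-come, first-served queue the message from alice@example.com would wait behind both bulk messages. Here it is processed first, and mail within the same priority level still goes out in arrival order.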

But there’s more that needs to be considered…

UPDATE: On the subject of the history of spam, Christopher Nickson writes that the word “spam” to describe unsolicited commercial email recently celebrated its 15th anniversary.

NEXT: Post #2 Prohibition Induces “Botlegging”
