Die, Spam, Die! SpamAssassin “Whacks” Those Shady Emails

Linode
Linode Cube
Published in
4 min readApr 20, 2017

By: Steven J. Vaughan-Nichols

I hate spam. I’ve hated it for decades. Back, when many of you were still writing letters on paper, I was battling spam as an email administrator at NASA’s Goddard Space Flight Center. Later, I joined CAUCE, one of the first anti-spam organizations. Years later I’m still combating spam, but since 2001 I’ve had a powerful ally: SpamAssassin.

SpamAssassin 3.4.1 is an open-source Apache Software Foundation program. It’s written in Perl. Don’t sneer! While its native language is no longer fashionable, it still excels at slaying spam.

The program works, as then-Apache-SpamAssassin VP Kevin A. McGrail put it in 2014 with its last major update, 3.4, “The №. 1 feature of SpamAssassin is the proven classification-scoring framework that allows system administrators to use new ideas to improve email classification, thereby making SpamAssassin timeless.”

SpamAssassin uses a multiple “tests” for classifying e-mail. These include text-based patterns, Bayesian filtering, checksum filters, sender authentication and automated rule channel updates. Since SpamAssassin 3.4 arrived it has also included native IPv6 support, improved DNS Blacklist technology and support for massively-scalable Bayesian filtering using the Redis backend. With this multi-tiered approach, the program does an excellent job of snaring spam while avoiding labeling legitimate emails as spam.

You can optionally white-list senders to make sure that their messages are never filtered. You can blacklist senders or servers that you want permanently blocked.

This classification system uses an adjustable set of regular expression rules to “test” if a message is spam. These tests give each message a score, the higher the score, the more likely it is that an email is spam. You get to decide what score is high enough for SpamAssassin to qualify an incoming email as spam.

Of course, we’ve had spam filtering systems ever since the first Nigerian prince offered someone 10 percent of $40 million for setting up a bank account they could use to transfer the cash out of the country. Filtering systems based on rules set by hand, however, are brittle. They don’t work that well. That’s where Bayesian filtering comes in.

Ideally, spam filtering should determine what’s spam based on known probabilities of past messages. As luck would have it, there’s a 250-year-old mathematics edifice that can do just that: Bayesian analysis.

Without burying you in statistical analysis, Bayes’ Theorem provides a means to combine probabilities. For example, if there’s a .99 probability that an email with the word “adult” in the title is spam, and a .97 probability for the word “Viagra,” then the likelihood that a message with both words is spam is 99.97%.

But … that only gets you so far. SpamAssassin also looks at the reverse probabilities as well: what are the words that never occur in spam? By applying Bayesian analysis to those words, its filter’s accuracy is greatly improved.

Bayesian filtering also looks at other factors besides text. For example, messages that use all-caps words in the subject line or HTML to mark words as red are also more likely to be spam.

So, by constantly analyzing your email stream, SpamAssassin may miss at first that “Insane NAVY SEAL Flashlight Now Available to the Public” is spam. But it doesn’t take long for SpamAssassin to figure out with your help and the constantly updated SpamAssassin rules that such is a spam message.

Sure, you can spot that too. Two all-caps words in the subject line is a huge spam red-flag. But SpamAssassin can catch this kind of marker in microseconds, instead of the handful of seconds it would take you. Multiply that over scores of daily email and you get the idea.

SpamAssassin doesn’t rely on just Bayesian filtering. It also can be used with DNS-based email blacklists, Sender Policy Frameworks, and DomainKeys Identified Mail to spot spam.

Installing SpamAssassin is comparatively simple. If you’re using cPanel to control your Linux server, you can install directly from this easy-to-use server manager. Installing SpamAssassin directly on your server isn’t complicated either.

SpamAssassin works with practically all email servers. I currently use it with Exim, qmail, and Dovecot.

Is it worth using? Heck yes. I’ve been using SpamAssassin for over a decade now. I run multiple mail servers and lists. On an average day, I see thousands of messages pass through my servers, and SpamAssassin eliminates almost all the bad messages while securing safe delivery for essentially all the good ones. I consider it a necessary part of any mail server setup.

Try SpamAssassin. I think you’ll come to find there’s no better way to free yourself from the annoying racket that is spam. The spam gets whacked, while you, like me, get happy.

Please feel free to share below any comments or insights about your experience with handling spam or using SpamAssassin. And if you found this blog useful, please consider sharing it through social media.

About the blogger: Steven J. Vaughan-Nichols is a veteran IT journalist whose estimable work can be found on a host of channels, including ZDNet.com, PC Magazine, InfoWorld, ComputerWorld, Linux Today and eWEEK. Steven’s IT expertise comes without parallel — he has even been a Jeopardy! clue. And while his views and cloud situations are solely his and don’t necessarily reflect those of Linode, we are grateful for his contributions. He can be followed on Twitter (@sjvn).

--

--

Linode
Linode Cube

Cloud Hosting for You. Sign up today and take control of your own server! Contact us via ticket or email for all support inquiries: https://www.linode.com/contact