Tuesday, April 8, 2008

Anti-Spam Technology Adoption


In his comments on Post #3 of our trickle blog, TZink notes that Bill Gates would have been right about the 2-year time frame for stopping spam if the computational challenge and sender authentication measures had been widely implemented.

They were probably right. These steps would have worked, but they were never widely deployed because the techniques interfere with legitimate email delivery. Let's take a closer look at the challenges to implementing these and other anti-spam technologies by comparing the adoption of Sender Authentication and Reputation filtering.

To date, Sender Authentication has been limited in its deployment and usefulness by factors related to how legitimate organizations use email.

The first barrier to adoption is inconvenience. If we look at Sender Authentication today, most organizations implement soft-fail authentication. Soft-fail authentication basically says "Our email comes from these specific servers, OR it could come from anywhere else." This is not very different from the unauthenticated model of "Our email comes from anywhere" because, as a recipient, I can't reject mail even if it comes from somewhere other than the authenticated servers.

To enable rejection of messages, organizations should implement hard-fail authentication and state "All email from our domain originates at these servers." This is a good clear statement enabling the rejection of mail coming from elsewhere. Why don't more organizations do this?
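To make the soft-fail versus hard-fail distinction concrete, here is a minimal sketch. It assumes the SPF record syntax, in which a trailing "~all" means soft fail and "-all" means hard fail. The function names, the example records, and the 192.0.2.0/24 address range are hypothetical and chosen only for illustration.

```python
def all_policy(spf_record: str) -> str:
    """Return how an SPF record treats senders it does not list."""
    qualifiers = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}
    for term in spf_record.split()[1:]:  # skip the "v=spf1" version tag
        term = term.lower()
        if term == "all":
            return "pass"  # a bare "all" defaults to the "+" qualifier
        if len(term) == 4 and term[1:] == "all" and term[0] in qualifiers:
            return qualifiers[term[0]]
    return "neutral"  # no "all" term: unlisted senders get a neutral result


def recipient_may_reject(spf_record: str) -> bool:
    """Only a hard fail gives the recipient a clear basis for rejection."""
    return all_policy(spf_record) == "fail"


# Hypothetical records for the same domain (assumed values).
soft = "v=spf1 ip4:192.0.2.0/24 ~all"  # "these servers, OR anywhere else"
hard = "v=spf1 ip4:192.0.2.0/24 -all"  # "all email originates at these servers"
```

With the soft-fail record, recipient_may_reject(soft) is False, so mail from an unlisted server still has to be accepted or filtered some other way. Only the hard-fail record lets the recipient turn such mail away.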

One reason appears to be that it's inconvenient for their users if they enforce the use of their servers. Many end users dislike authentication because they must now set up their email client to send only via those servers, rather than change the "from" address and send from anywhere. Why this process is perceived as a burden on senders is unclear to me, but it remains one of the barriers to implementation. [Disclosure: Mailchannels sets our records to soft-fail authentication and it's unclear to me why]

A second barrier to adoption is the lack of incentive. Unless I'm worried my email will get blocked when I send it, there is little value in configuring authentication for my servers. The value of authentication is mainly derived by the recipient - better and clearer information helps them decide whether or not to receive my email. However, people want to get their mail, so recipients configure their checks on the (pragmatic) assumption that most servers don't have authentication set up. In this case, without authentication data, the recipient will normally receive the mail anyway. Since I can still deliver my messages without setting up authentication, why bother doing so? If measures like this are to be effective, the default action needs to be enforcement. This would likely penalize most legitimate senders - hence adoption is slow. That said, as Yahoo and others have become more aggressive in requiring authentication, adoption has improved.

Another barrier to the adoption of authentication is that the value in taking the time to authenticate is perceived as low. Knowing that a person is who they claim to be is not in itself helpful unless there is some measure of whether or not that person is worth talking to. A driver's license is more useful to a police officer if they can also run your ID through a records search. It's not much good for me to know that yes, you are "Bob." If I want to do something with the information, it's better for me to know you are "Bob the known spammer."

But the number one reason for poor adoption is simple ... authentication on its own is useless for stopping spam.

Sender Authentication only solves one aspect of email abuse: address spoofing. With SMTP, any email can be sent from anywhere claiming to be from anybody. Sender Authentication enables the recipient to check whether a message was sent from a server belonging to the organization it claims to be from. The technique has proven effective against phishing attacks, but spammers aren't impersonating anyone, so sender authentication doesn't really help. What we get is mail from authenticated spammers.

I hate to sound like Ironport, but what is needed is reputation.
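As a rough sketch of why the two pieces belong together, the function below treats authentication as answering "is this really Bob?" and reputation as answering "is Bob worth talking to?" The function name, the three reputation labels, and the returned actions are assumptions made for this example, not a description of any real filter.

```python
def decide(auth_passed: bool, reputation: str) -> str:
    """Pick an action for a message.

    auth_passed: whether the sending server is authorized for the claimed domain.
    reputation:  "good", "bad", or "unknown" for that domain (assumed labels).
    """
    if not auth_passed:
        # The claimed identity is unverified, so its reputation tells us nothing.
        return "suspect"
    if reputation == "bad":
        return "reject"  # "Bob the known spammer"
    if reputation == "good":
        return "accept"
    return "filter"  # really Bob, but we know nothing about Bob yet
```

Note that decide(True, "bad") returns "reject": a passing authentication check on its own never earns delivery, which is the "authenticated spammer" case.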

Sender Authentication could have stopped spam if everyone (or a large subset of everyone) agreed to register their servers or addresses with some central authority that could clearly identify the legitimate registered senders and be used to allow that mail through and block the rest. But who is going to be that authority? How will it be policed? Where will it operate? Can I trust it? What if there is more than one authority? Can I trust all of them? The internet was designed to avoid this sort of centralized control. It is pretty hard to get that cat back in the bag.

Instead of an agreed authority, what has arisen are third-party reputation systems that came along as an evolution of blacklists. These systems track the history of the senders they see in their traffic. They have been effective against spam because they identify the known bad addresses and block those. They also identify known good senders and allow those messages through. Each of these systems tries to be a central authority for email reputation. However, they don't work well with unknown senders: because senders don't have to register first, the systems don't have enough reputation information to stop the message. Every day, botnets exploit the fact that it takes time to see a new address and then give it a reputation score.
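The history tracking described above can be sketched roughly as follows. This is only an illustration, not how any particular reputation service works: the class name, the thresholds, and the minimum-history cutoff are all assumptions made for the example.

```python
class ReputationTracker:
    """Toy sender-reputation store keyed by IP address (hypothetical design)."""

    def __init__(self, min_messages=20, block_ratio=0.8, allow_ratio=0.1):
        self.history = {}  # ip -> [messages seen, messages judged spam]
        self.min_messages = min_messages  # assumed cutoff before scoring
        self.block_ratio = block_ratio    # assumed spam share that earns a block
        self.allow_ratio = allow_ratio    # assumed spam share that earns a pass

    def record(self, ip, is_spam):
        """Add one observed message to the sender's history."""
        seen = self.history.setdefault(ip, [0, 0])
        seen[0] += 1
        if is_spam:
            seen[1] += 1

    def verdict(self, ip):
        """Return "block", "allow", or "unknown" for a connecting address."""
        total, spam = self.history.get(ip, (0, 0))
        if total < self.min_messages:
            return "unknown"  # not enough history to score this sender
        ratio = spam / total
        if ratio >= self.block_ratio:
            return "block"
        if ratio <= self.allow_ratio:
            return "allow"
        return "unknown"


tracker = ReputationTracker()
for _ in range(25):
    tracker.record("203.0.113.5", is_spam=True)    # known bad sender
for _ in range(30):
    tracker.record("198.51.100.7", is_spam=False)  # known good sender
```

An address the tracker has never seen, such as a freshly infected botnet machine, gets "unknown" until enough of its traffic has been observed. That delay is the window spammers exploit.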

Reputation has been widely adopted where Authentication has not. The differences between them in terms of adoption are clear. Reputation does not inconvenience end users. There is incentive to implement reputation because it reduces load on servers. The value is high because it can be used to make real decisions. Most importantly, it works to reduce a real pain.

2 comments:

J.D. said...

Reputation, today, is almost always based on IP addresses -- because the last-hop connecting IP address is (very close to) unforgeable.

With domain authentication, we could base reputation on domain names too. This would allow good actors to change IP addresses without penalty, and it would allow multiple senders to share the same IP without affecting each other's reputation -- the list of benefits goes on and on, and we probably don't even know the half of 'em yet.

This isn't authentication OR reputation, it's reputation kludges that have to last until we've got reputation built ON effective authentication.

(Also, the "soft fail" you describe is only valid for SPF or SenderID. DKIM is different.)

David Whitehead said...

Thanks J.D.

You are right, "soft fail" is unique to SenderID, and is one of the strikes against it to my mind. Still, it's better than nothing.

With regard to domain authentication, if I am not mistaken, this is what Sender Authentication provides. Sender Authentication enables an organization to state which mail server IPs are used by their domains, to prevent spoofing of addresses.

Whether an IP is tied to a known good sender (Domain) via SenderID or DomainKeys is typically used as part of the reputation score.

Reputation built on effective authentication is definitely the preferred solution.

But as you suggest, we can only trust the IP of the last hop, so mail that takes multiple hops along its way creates a challenge: the IP used in the reputation score (the last hop) may not itself have an authentication record.

Which of course is why the FUSSP is so hard to achieve: there are too many degrees of freedom. Of course, perhaps we could implement some of the more effective ideas if we could find a mechanism to overcome the FUSSP barriers rather than design solutions that continue to endure them.