Tackling Greylisting In MailChannels Cloud

By Ken Simpson

MailChannels Cloud sends enormous volumes of email through a large and complex infrastructure that spans the globe. Recently, however, we began encountering a serious challenge from a relatively old yet still commonly used anti-spam technology called “greylisting”.

What is Greylisting?

According to Wikipedia, greylisting is an anti-spam technique in which the “mail transfer agent (MTA) … will ‘temporarily reject’ any email from a sender it does not recognize. If the mail is legitimate the originating server will try again after a delay, and if sufficient time has elapsed the email will be accepted.”

Greylisting can be an effective defense against spam originating from simplistic botnet SMTP clients, which have no internal message delivery queue and cannot retry delivery if the first attempt fails (as it does when the receiving server implements greylisting). Legitimate email senders always have a message queue available, and so presumably should have no problem dealing with greylisting.

One popular implementation of greylisting is the postgrey policy server for the Postfix MTA. A chart on the postgrey site shows that greylisting can achieve a significant reduction in spam within a very short period of time (source: http://postgrey.schweikert.ch/, fetched December 10, 2015).

Problems with Greylisting

The key problem with greylisting is that some senders send email from a large number of IP addresses. Greylisting assumes that each message is tied to a specific sending IP address, as would be the case if the sender operated one or more individual email servers, each with its own message queue. Large installations, however, often use a distributed message queue that is spread across many servers.
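To make the mechanism concrete, here is a minimal sketch of a greylisting receiver and of how it reacts to retries from one IP address versus many. It is a toy model, not postgrey's actual logic: it assumes the receiver keys on the (client IP, sender, recipient) triplet, and the `min_delay` and `retry_interval` values, the class and function names, and the example addresses are all illustrative assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class GreylistingReceiver:
    """Toy receiver that greylists on (client IP, sender, recipient)."""

    min_delay: float = 300.0  # assumed: seconds before a retry is accepted
    first_seen: dict = field(default_factory=dict)

    def deliver(self, client_ip, sender, recipient, now):
        """Return an SMTP-style code: 250 accepted, 450 temporarily rejected."""
        key = (client_ip, sender, recipient)
        # Remember when this triplet was first seen; keep the earliest time.
        seen_at = self.first_seen.setdefault(key, now)
        if now - seen_at >= self.min_delay:
            return 250
        return 450


def attempts_until_accepted(receiver, source_ips, retry_interval=600.0):
    """Retry one message, using source_ips[i] for the i-th attempt.

    Returns the 1-based attempt number that was accepted, or None if
    every attempt was temporarily rejected.
    """
    for attempt, ip in enumerate(source_ips, start=1):
        now = (attempt - 1) * retry_interval  # simulated clock
        code = receiver.deliver(ip, "a@example.com", "b@example.net", now)
        if code == 250:
            return attempt
    return None


# A sender with one queue and one IP gets through on the first retry.
single_ip = attempts_until_accepted(GreylistingReceiver(), ["192.0.2.1"] * 5)

# A sender whose retries each leave from a new IP never gets through.
rotating = attempts_until_accepted(
    GreylistingReceiver(), [f"192.0.2.{n}" for n in range(1, 6)]
)
```

In this model the single-IP sender is accepted on its second attempt, while the rotating sender looks like a brand-new stranger every time and is rejected on all five.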
When a large system like this sends the same message multiple times in order to deal with greylisting, it’s quite likely that each delivery attempt will be made from a different IP address. A large-scale email sending infrastructure is therefore at risk of poor email deliverability when the receiver uses greylisting, because the receiver will temporarily reject each new IP address every time the same message is retried from the large sender’s distributed queue. In some cases, if the infrastructure is large enough, greylisted messages can be requeued hundreds of times, eventually reaching the point at which the message is bounced from the sender’s queue for non-delivery. Delays and eventual non-deliveries relating to greylisting are a major problem for large senders.

How MailChannels Solves Greylisting

The MailChannels Cloud sends email from thousands of IP addresses spread across multiple providers in many different countries. Our infrastructure involves hundreds of servers and a very large distributed queue. Until recently, greylisting was causing significant message delivery delays for our customers when they attempted to deliver to the many receivers with naive greylisting implementations.

When MailChannels Cloud ordinarily delivers a message, the server from which the message will be sent is chosen at random from within various pools of servers, each of which is used for sending a specific type of email. For instance, there is a pool of servers for sending email that is highly reputable and unlikely to be spam. Another pool is dedicated to sending bounce messages. And still another pool sends only messages that we suspect might be spam, but for which we don’t yet have enough evidence to be certain.

To solve greylisting, we had to come up with a method of detecting when greylisting has occurred, and then ensuring that re-delivery of a queued message would occur from the same IP address on subsequent delivery attempts.
For reasons we can’t get into in a single blog post, it would not have been easy for us to simply move messages to a specific “location” for re-delivery, because our internal systems are ultra-distributed for reasons of scale and reliability.

Our first step was to detect when greylisting has occurred. To accomplish this, we added some rules to our Response Analytics system, which analyzes SMTP responses from receiving mail servers and sorts them into categories. A common greylisting SMTP response could be:

    450 Greylisted, see http://postgrey.schweikert.ch/help/example.com.html

When Response Analytics detects that a message or connection has been greylisted by the receiver, we record the IP address from which the delivery was attempted, along with the destination IP address of the receiver. The next time the same message is delivered, assuming it’s going to the same downstream IP, our system automatically selects the same outgoing IP address rather than picking an address completely at random from within a given pool. The fact that a receiving IP uses greylisting is remembered for a long period of time, ensuring that future deliveries will also be sent from the same consistent IP address.

Results

This approach has been remarkably effective at improving our delivery rates to servers using greylisting. The chart below shows just the first day of delivery performance since deploying the feature: the area in green represents the proportion of greylisted connections, the area in yellow indicates the initial temporary rejections by greylisting receivers, and the area in blue represents successful deliveries to those receivers. In less than a day, our system’s “immediate delivery rate” to greylisting servers improved from close to zero to 71%.

To learn more, feel free to contact our technical sales team, who would be happy to explain how MailChannels Cloud can improve your email delivery experience.
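For readers who want to experiment with the idea from “How MailChannels Solves Greylisting”, the following is a minimal sketch of the two steps: classifying an SMTP response as greylisting, and then pinning the outgoing IP for that destination. It is not the MailChannels implementation. The regular expression, the `StickySourceSelector` class, the 30-day retention period, and the fallback when the remembered address is not in the current pool are all assumptions made for illustration; real greylisting responses vary widely between receivers.

```python
import random
import re
import time

# Assumed pattern: a 4xx reply whose text mentions "greylist" or "graylist".
GREYLIST_PATTERN = re.compile(r"^4\d\d[ -].*gr[ae]ylist", re.IGNORECASE)


def is_greylisted(smtp_response):
    """Return True if the SMTP response looks like a greylisting deferral."""
    return bool(GREYLIST_PATTERN.search(smtp_response))


class StickySourceSelector:
    """Remember which source IP was greylisted by which destination IP."""

    def __init__(self, ttl_seconds=30 * 24 * 3600, rng=None, clock=time.time):
        self._ttl = ttl_seconds          # assumed retention period
        self._rng = rng or random.Random()
        self._clock = clock
        self._sticky = {}                # dest IP -> (source IP, recorded at)

    def _lookup(self, dest_ip):
        """Return the remembered source IP, or None if absent or expired."""
        entry = self._sticky.get(dest_ip)
        if entry is None:
            return None
        source_ip, recorded_at = entry
        if self._clock() - recorded_at > self._ttl:
            del self._sticky[dest_ip]
            return None
        return source_ip

    def record_response(self, source_ip, dest_ip, smtp_response):
        """Store the (destination, source) pair when greylisting is detected."""
        if is_greylisted(smtp_response) and self._lookup(dest_ip) is None:
            self._sticky[dest_ip] = (source_ip, self._clock())

    def select_source(self, dest_ip, pool):
        """Reuse the remembered source IP if possible, else pick at random."""
        remembered = self._lookup(dest_ip)
        if remembered is not None and remembered in pool:
            return remembered
        return self._rng.choice(pool)
```

A short usage example, with documentation-range addresses standing in for real ones:

```python
pool = ["198.51.100.7", "198.51.100.8", "198.51.100.9"]
selector = StickySourceSelector()
selector.record_response(
    "198.51.100.7",
    "203.0.113.5",
    "450 Greylisted, see http://postgrey.schweikert.ch/help/example.com.html",
)
# Every later delivery to 203.0.113.5 now leaves from 198.51.100.7.
source = selector.select_source("203.0.113.5", pool)
```

Keying the table on the destination IP alone follows the description in the post; a production system would presumably also need to share this state across its distributed queue, which this sketch does not attempt.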