
How Spamhaus will protect the internet from AI-driven spam and phishing

By MailChannels | 46 minute read


In this engaging dialogue with Andrew Barrett of Spamhaus, we trace the organization’s journey from its inception, discussing the initial vision, early challenges, and the adaptations required as the internet and cyber threats evolved. Highlighting key milestones, the conversation shifts to the present day where Spamhaus is recognized as an authority on IP and domain reputation data. Detailed insights are provided on the importance and management of IP and domain reputation, the impacts of a bad reputation, and recent trends observed in cybersecurity, including the role of artificial intelligence. The conversation concludes with a forward-looking perspective, exploring future challenges in the cybersecurity industry and how Spamhaus plans to address them. The evolution of public blocklists, the company’s future vision, and the potential of AI in augmenting both cybersecurity measures and hackers’ attacks are also discussed, offering a comprehensive and informative insight into the world of Spamhaus.


Read the transcript:

Andrew: We have, in fact, seen some very enterprising folks submitting AI generated delisting requests. 

Ken: AI generated delisting requests? 

Andrew: We can tell when it’s AI, even if it’s not particularly AI-looking, because we can see what they’ve said to us in the past. Yes. And what we see in front of us today, purporting to come from them, is so very different, without a semicolon out of place.

Andrew: Just, you know, perfect. 

Ken: The dead giveaway is that the user agent string says AutoGPT. I mean, that’s, that’s the dead giveaway. Today, I’m Spamhaus.

Ken: Started in the late 1990s, Spamhaus is perhaps the most iconic name in the fight against bad email practices on the internet. For over 20 years, Spamhaus has been fighting against bad guys, with a team of researchers around the world collecting information to identify sources of spam and phishing and making this information widely available to email receivers, who can then do a better job of filtering out the bad stuff to keep our inboxes safe and clean.

Ken: In this discussion with Andrew Barrett, we talk about the history of Spamhaus and what they’re currently doing to fight spam and phishing online. We then veer into a discussion about the future, including some interesting discussions about how artificial intelligence is starting to show up on their radar.

Ken: I really hope you enjoy this discussion with Andrew Barrett. I certainly found it extremely interesting and something I’ve wanted to do for a very long time.

Ken: So, Andrew, Andrew from Spamhaus. Um, can you tell us about the early days of Spamhaus? How did this organization, which is really a household name in the email industry, get started? What’s the history? What was the initial vision?

Andrew: Well, it got started, um, and almost nobody noticed at least not to begin with, right?

Andrew: It started in 1998, which, as luck would have it, is about the same year that Google got started, uh, to about as much fanfare, right? And, uh, for the longest time, it was just a small group of researchers in closed forums, comparing notes and working on, uh, problems that were becoming evident early on. We wouldn’t see these problems scale yet, because the Internet itself had not scaled either.

Andrew: But it was very clear that Steve Linford, the fellow who started Spamhaus, saw early on the promise of an open Internet and, at the same time, recognized some of the problems that came with an open internet and threats to that promise. And over time, word of their work, uh, spread further and further afield, until we are about where we are now.

Ken: Yeah, it’s fascinating, you know, when I think back to 1998. Uh, I don’t remember that spam was even a thing in 1998. I mean, you might have occasionally got a message, but it wasn’t the sort of industry that it is now. Uh, to my recollection, it wasn’t really until the early 2000s that I really started to get a lot of spam and that people actually started thinking about fighting spam more broadly.

Andrew: I think that’s right. That certainly is my recollection as well. And there’s a couple of reasons for that. First of all, market penetration of the internet was still, you know, lagging far, far behind. I think it wasn’t until, uh, 2001 or 2002 that 50% of American households at least, uh, finally had, uh, some kind of access to internet and maybe also email.

Andrew: Also, access to those communication channels was metered. It wasn’t the all-you-can-eat buffet. So there were some significant economic barriers to abusive messaging, right? In some very early cases, like AOL, um, you literally paid by the message or you paid by the minute for connection time.

Andrew: Um, it wasn’t quite the attractive nuisance, um, that it would later become when the economics started heading in the other direction.

Ken: Right. So in the early days of Spamhaus, it was, uh, you know, a band of enthusiastic researchers, uh, you know, working in kind of private channels or discussion groups with others, to sort of look ahead and fight some of these early battles with the bad guys.

Andrew: That’s right. This is, you know, before internet or computer security was even a viable career track, right? Much less a phrase you might hear in everyday conversation. But their work was prescient. It was prescient. Yeah. Uh, and, um, you know, necessarily defensive in nature. Right. Once the horse was out of the barn, you really couldn’t bring it back.

Andrew: There’s no way, in the way that the internet is architected on a protocol level, to stop abusive traffic from going out. Now, that means all that’s left is to stop the receiving of abusive traffic. Right. And again, economics would play a huge role in where the points of defense might lie. For many years, it remained a problem that even if you were successful in keeping abusive traffic or abusive messaging out of the inboxes of intended recipients, you still incurred the cost of transit.

Andrew: And so one of the transformative moments, I think, in the entire war around spam and abusive messaging was Paul Vixie’s advent of an, uh, IP-based, DNS-distributed blocklist. And, you know, that was a game changer. And it’s still how everyone, generally speaking, is doing things today. You block the traffic at the distal edges of your networks so that you’re not only blocking the traffic, but you’re also not incurring the costs of processing the traffic to find out if it’s abusive, to find out whether it should land in front of intended recipients.
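
The DNS-distributed blocklist Andrew describes can be sketched in a few lines. In the common convention for such lists, the receiving server reverses the octets of the connecting IPv4 address, appends the blocklist's zone, and performs an ordinary DNS lookup: an answer means the address is listed, and a failed lookup means it is not. The sketch below is a minimal illustration under stated assumptions, not Spamhaus's implementation. The zone `dnsbl.example` and the function names are hypothetical, and a real deployment would follow the list operator's own usage documentation.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Reverse the IPv4 octets and append the blocklist zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")  # validates the address
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip: str, zone: str, resolve=socket.gethostbyname) -> bool:
    """Return True if the zone answers for this IP, False if the lookup fails.

    Simplification: any lookup failure (including a timeout) is treated
    as "not listed". The resolver is injectable so it can be stubbed out.
    """
    try:
        resolve(dnsbl_query_name(ip, zone))
        return True
    except OSError:  # socket.gaierror (e.g. name not found) subclasses OSError
        return False


# "dnsbl.example" is a hypothetical zone used purely for illustration.
print(dnsbl_query_name("192.0.2.10", "dnsbl.example"))
```

Because the check is a single cheap DNS query made before any message content is accepted, the receiver avoids the processing cost Andrew mentions.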

Ken: So all these years later, uh, after all these iterations of technology, you know, thousands and thousands of times scaling in network capacity and thousands and thousands of times reduction in computation costs, we’re still, uh, primarily IP blocking.

Andrew: Well, it’s quite a bit more sophisticated than that, because just as you say, over time, uh, and with scale, attacks become more sophisticated. We see new vectors of attack. Uh, I think the second big milestone in defensive measures against abusive messaging is the broad adoption of robust authentication protocols. You have that IP that you mentioned, that all-important IP, which is what blockers were looking at in the very earliest of days.

Andrew: Um, but with the advent of these protocols, you can now make verifiable, robust assertions about the entity responsible for mail from that IP address or, more importantly, traffic from the host associated with the IP. And so what emerges is an opportunity to make very, very granular assessments about the reputation of an infrastructure, or of a very tiny slice of an infrastructure, so that you can decide: well, I don’t want traffic from that slice, but, uh, I’ll accept that from this slice, and maybe I’ll accept a little bit from that third slice, uh, and make a decision about what I want to do with, uh, more traffic from that third slice later.

Andrew: So along with those, uh, additional tools in the toolbox, you have today, uh, what feels like a pretty wide-ranging arsenal of tools to levy against traffic, but we’re still losing the battle. Yeah.

Ken: So you say you’re, you think we’re still losing the battle? 

Andrew: Well, yeah. The reason I say that we are losing is because, you know, every couple of months or years we see some new attack vector that no one was able to predict. And there is a time lag between day zero and the time to, uh, some kind of a response to that attack. Um, those are all days during which machines, people, lives are being compromised.

Ken: So, you know, can you speak more specifically to some of those inflection points in the history, you know, going back, uh, I guess over the last 25 years of Spamhaus history? Uh, what were some of those specific inflection points where, uh, the approach of malicious actors changed and Spamhaus had to adapt?

Andrew: Well, I think the big year in that timeline is 2003. That’s the year when we first saw, you know, significant expansion of open proxy hijacking. And then, following very quickly on that, you see the rise of botnet-based malware, where hardware gets surreptitiously controlled by third parties and used in very tiny, tiny amounts over great, great swaths of infrastructure to amount to extremely large volumes of abusive traffic.

Ken: So you’re saying that before 2003, it was all about renting a server, pumping out spam out of some IPs that you had bought or leased, uh, and then in 2003, we had this sudden emergence of botnets.

Ken: And so now, instead of getting a ton of traffic from one IP, you get a little tiny bit of traffic from 100,000 IPs spread all over the place.

Andrew: That’s right. 2003 is also the year when, you know, broadband suddenly became, not universally available, but, uh, affordable for a much broader range of users.

Andrew: And so a Windows machine attached 24/7 to a nice fat pipe could be compromised. A very small mail server might be installed surreptitiously. It might be listening on an IRC channel to a command and control server, and being woken up by that server, by that command, fed a little slice of text to send as email, and you’re off to the races.

Andrew: It’s easy, as a bad guy, to rent an IP as yourself, but the minute somebody figures out who you are and what you’re doing, that’s it, game over, you have to do it all over again. But if you’re able to assemble a botnet, just like the one you described, uh, it becomes much harder to shut down, because now they’re able to use human shields.

Andrew: Grandma in upper Poughkeepsie has her connectivity disrupted because some guy in Ukraine or Romania is using her broadband connection to send spam. Botnets in 2003 create this, you know, huge problem of, uh, traffic being dispersed over essentially consumer broadband links, like coming from grandma’s PC in the basement or whatever.

Ken: Yeah. How does Spamhaus respond to that problem? What did they do?

Andrew: Actually, the response was already there. It just had never been more important until that day. That was a solution that was in place, that was a good match for a problem that would follow. In 2003, suddenly we see these botnets, uh, and, uh, these open proxies being abused everywhere.

Andrew: There is already a solution in the form of IP-based, uh, and domain-reputation-based blocks. All it needed was a little bit of scale.

Ken: I see. Got it. So it wasn’t something where Spamhaus had to do something new. It was more like Spamhaus was there with the service, uh, and the distribution of blocklist data, and then the problem came.

Ken: Right. There was a perfect match for it.

Andrew: It was. Right. The problem always existed. The solution always existed. Uh, the problem got really big, really fast all of a sudden. And along with that sudden visibility of the problem came the sudden visibility of that solution. So, in a sense, uh, you know, that, that might be one of those few occasions where the good guys were actually just a little bit ahead of the game, right?

Ken: Now, uh, you know, many people, uh, watching this might not be familiar with the economic model behind Spamhaus, you know, so back in the early days, it was a bunch of researchers, you know, kind of doing, uh, doing God’s work in their spare time. Um, at some point, uh, Spamhaus had to start making money for itself, uh, and it had this, uh, blocklist data.

Ken: So what were some of the achievements along the way that allowed Spamhaus to grow into, uh, the entity that it is today, which we’ll definitely get into later?

Andrew: Sure. Sure. Spamhaus and its partner entities have access to an enormous amount of data. Um, truly, truly stunning. Uh, if you’re listening to this podcast and you’re a sender of email, you work for an ESP, your worldview of Spamhaus is kind of constrained by commercial mail.

Andrew: There is so much more badness out there that you don’t see and that we are detecting and have some interesting data on. Uh, a lot of that can be packaged in ways that can be consumed by different audiences. And so some of that data is free, has always been free, will always be free, because, uh, it needs to be in the hands of people who can use it in order to stop the bad guys from owning this very precious public resource.

Andrew: But there are ways to use the data more efficiently and, uh, occasionally preemptively, and, um, uh, that is of use to commercial entities. And so we can charge those entities for access to the data, regardless of how they choose to deploy it, uh, in their own networks.

Ken: Got it. Okay, cool. So, yeah, how it got from a band of good guys, uh, working behind the scenes to save the internet to a force to be reckoned with is that, you know, along the way, a bunch of data got collected, and different slices of that data are interesting, uh, for a variety of commercial purposes.

Ken: And so Spamhaus has been able to build a business servicing those customers.

Andrew: That’s right. Yeah. So you could be, uh, a threat researcher. Uh, you’re going to want the broadest possible, uh, look at the universe of data so you can act preemptively to stop exploits. You could be, uh, an ESP, uh, sending mail on behalf of other brands and products, and maybe all you need to do is to be able to see what’s happening on your network, or how your brand or product is performing. Are you doing the right things in terms of best practices for your IP and domain reputation? Is the mail getting to where it needs to go?

Ken: Interesting. So in that way, it’s a, it’s a little bit like, uh, a little bit like a credit rating agency.

Ken: So, you know, consumers will get an account with the credit rating agency and then they’ll look up their own account to see how they’re doing, um, because they’re interested in doing the right thing so they can maybe get a mortgage one day. In a similar way, if you’re an ESP sending out a commercial email, you want to know that your best practices are actually being applied, that your customers aren’t doing the wrong thing.

Ken: You might buy data from Spamhaus to understand, uh, how well you’re doing and how you could improve.

Andrew: That’s right. And those bracket, I think, the range of possible or typical uses of the data. There’s a whole bunch of slices of things in between those two pieces of bread to make an interesting data sandwich.

Ken: Huh. Fascinating. Um, so, so, uh, you know, thinking about the present day, today we know Spamhaus, uh, I mean, I think most people on the internet, uh, know Spamhaus as the trusted authority on IP and domain reputation data. Uh, can you explain the importance of IPs and domains in particular? Uh, and why reputation data is your sole focus.

Andrew: I can certainly try. A combination of IPs and domains and their associated reputation can be concatenated into a unique or a discrete reputational entity that is used to move some traffic on the internet of some variety. Let’s say email, because it’s an easily accessible example. An IP address is good and interesting, and that’s how folks went in the very earliest of days.

Andrew: But a single IP address can be used for a whole bunch of different things other than email. And you may not want to block those other things. You may just be interested in blocking some spam that’s coming out of that IP address. But that IP address is also tied to a domain, which enriches that data a little bit, gives you a little bit more information, allows you to be a little bit more granular, uh, if you want to be, uh, with the data that you see from Spamhaus.

Andrew: The important thing to Spamhaus, I think, is that we’re not making any judgment or value decisions. We merely present data as accurately as we possibly can, right? This IP and domain combination is responsible for sending this kind of traffic. If you, as a consumer of that data, don’t want to accept that traffic, then you can use our data to block it.

Andrew: If you’re a more permissive network owner, you may want to accept that traffic because you think that falls in kind of a gray area. It’s up to you. We don’t say whether you are good or bad as a unique or discrete reputational entity. We merely say, we observe you doing X, Y, and Z. If you as a consumer of Spamhaus data want to block X, Y, and Z, then please use our data to do that.

Andrew: Maybe you want to only block X and Z. You can use the data to do that. It’s up to you. The important thing is that the data is available, that it’s actionable, and that it is as close to real time as possible.
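
The separation Andrew draws between observation and policy can be illustrated with a small sketch. The data provider reports which kinds of behavior were observed for a source, and each consumer decides which of those kinds it is unwilling to accept. The category labels `X`, `Y`, and `Z` are taken from his example and are placeholders, not real Spamhaus categories. The function name is likewise hypothetical.

```python
from typing import Iterable, Set


def should_block(observed: Iterable[str], unwanted: Set[str]) -> bool:
    """Block only if at least one observed behavior is on this consumer's unwanted list."""
    return any(category in unwanted for category in observed)


# Two consumers reading the same observations with different local policies.
strict_policy = {"X", "Y", "Z"}   # blocks everything that was observed
lenient_policy = {"X", "Z"}       # treats Y as an acceptable gray area

observed_for_source = ["Y"]       # placeholder observation for one IP/domain pair
print(should_block(observed_for_source, strict_policy))
print(should_block(observed_for_source, lenient_policy))
```

The same published observation leads to different outcomes on different networks, which is the point of leaving the blocking decision with the consumer.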

Ken: And I know that one of the reasons why Spamhaus has been so successful is because Spamhaus takes great pains to make sure that the data is high quality.

Ken: Uh, you know, there are other, other block lists from, from past years that have been accused of being extremely, uh, poorly operated that, you know, they listed all kinds of IPs that really weren’t doing anything bad. Uh, and, uh, whereas, you know, I think the consensus is that Spamhaus is quite selective and quite careful about the listings that it does make.

Andrew: I like to use the word conservative, but I’m afraid to use it sometimes because of some of the pejorative connotations that come with the word. But really, there’s nothing more accurate than that. Spamhaus has always been, and actually deliberately attempts to be, very conservative in what it publishes to those zones that we know are going to be used broadly to block traffic.

Andrew: Right, right. Um, we have to be very careful. We have to be absolutely certain that if a source of traffic is likely to be blocked as a result of our publication, there is a very good reason for it to appear on that list, that publication, that zone, uh, to be blocked. So, uh, we are quite deliberately conservative.

Andrew: If we must err, we will, um, err on the side of, uh, permissiveness.

Ken: Yeah, and I’ve certainly, uh, you know, seen, uh, discussions, uh, you know, between Spamhaus and someone whose IP has been listed. Um, and, uh, I’ve seen a genuine effort to try to figure out what’s going on where, you know, maybe the person with the IP has no idea why they got listed.

Ken: There’s a degree of education, uh, that happens. And my personal belief is that’s one of the reasons why the Spamhaus blocklist is so good. Uh, the advisory is so useful, uh, because of that interactive process, uh, with the people hosting the IP or the domain to help them try to do the right thing.

Andrew: I think that’s right. I think that’s also a deliberate choice. We really can’t be the folks that are providing, uh, a consultation to a good brand that just happens to be doing bad things because they don’t know any better. But what we can do is put the tools in the hands of the experts, uh, the deliverability experts out there, for example, uh, to point to this third party with expertise that is otherwise completely uninterested in the outcome.

Andrew: No skin in the game. Look at that. And that tells you a lot of what you need to know about your current practices. Right. But you’re right. The streets are… littered with the bloody corpses of block lists that erred too far either in one direction or the other. They were too punitive, or they did not list based on what they said their policy for listing was going to be.

Andrew: And there has to be a certain amount of trust for that data to be useful. Because Spamhaus is deliberately conservative in what we list, we know that folks are going to trust that. If a network object appears on any of our zones, it appears there for a very good reason. And not every block list has been able to say that.

Ken: And that, that causes a kind of positive feedback loop, uh, where, you know, the quality, because you take that conservative approach, people trust the data more and then they’re more likely to take a policy action based on the data. And therefore, the data has more meaning and more relevance, right? 

Andrew: More meaning, more relevance, and if it’s trusted, it can be applied in an automatic way.

Andrew: Right, right. So badness can be stopped sooner. Yeah. There doesn’t necessarily need to be. Um, human review for each and every blocking or unblocking decision that has to be made. And what that means is going back to, uh, the top of the show here, we were talking about economic barriers. That is a removal of a further economic barrier to internet security, right?

Ken: Which, uh, in AI, we certainly can’t be spending a lot of time on these things with humans in the loop. Um, so tell me, what is the impact of having a bad IP reputation or a bad domain reputation? Um, is it, you know, still something that’s just really an issue around email, like having your email messages blocked, or is it wider than that now?

Andrew: Well, it’s definitely wider than that. Email is just one flavor of the traffic that we see flowing across networks everywhere. And, uh, as time goes on, it becomes a smaller and smaller fraction of that total traffic or activity that we observe and detect. Uh, but the impact of listings is still essentially the same for those other types of traffic as it is for email.

Andrew: If you have poor domain and IP reputation, uh, regardless of the kind of traffic you’re emitting, regardless of what protocol you’re talking on, you see diminished, uh, transit, uh, as a result.

Ken: Interesting. Interesting. So, you know, let’s say I’m, uh, offering, uh, you know, a large-scale API hosting service for, you know, thousands and thousands of customers.

Ken: It matters. Like, my IP reputation matters, uh, in being able to access services like that. It’s possible that, uh, data from an outfit like Spamhaus could be used to help screen out abusive traffic, uh, that’s trying to hit APIs with invalid requests or conduct, you know, authentication attacks or, or whatever.

Ken: It’s not just, uh, over SMTP. 

Andrew: That’s right. No matter what you’re doing online, any of the things that you just described, uh, pretty much everything ties to an IP or a domain, particularly after 2011 and that broad adoption of those authentication protocols, when you no longer had to take anybody’s word for who they said they were.

Andrew: You could pretty much verify that on your own.

Ken: In an automatic and real-time way. And you’re really speaking about SPF, DMARC.

Andrew: Exactly. Yeah. Right. And sometimes ARC.

Ken: Sometimes ARC. We’re hoping ARC will gain broader, uh, adoption. I know there’s a lot of discussion, uh, these days about, uh, getting ARC a little bit more adopted.

Ken: Um, so, uh, you know, if I’m any kind of online actor and I want to maintain a good domain reputation, what are some of the steps that I should take, from Spamhaus’s perspective, to be the best possible, uh, domain that will, uh, be viewed favorably?

Andrew: Just don’t associate yourself with any kind of malicious traffic is the easiest way we can put it.

Andrew: We have, uh, employed a lot of machine learning to understand more quickly, and with more depth and context, the kinds of things, uh, the kinds of activities, the kind of signal we can pick up from that kind of malicious traffic, to try and stay ahead of new threats as they arise. And so, we’re capturing some fraction of that total traffic.

Andrew: And where we see malicious outcomes associated with traffic sources, those sources, be it an IP or a domain or what have you, then get published to a zone. And then that zone is consumed by whomever wants access to it. But typically what they’re doing with that zone is restricting transit, right? So just don’t do bad things.

Andrew: Don’t hang out in a bad neighborhood. Don’t hang out in a bad neighborhood. Yeah. If you find yourself in the middle of an ASN that’s associated with bad guys, bad things, you may be painted with an overly broad brush. Um, now certainly the granularity that we described earlier in the show around IP and domain reputation might be applied here, but people are still people.

Andrew: Network engineers are really smart people, but they’re still just people, and they don’t have time to sift through the dustbin to see if you’re doing the right thing, if you’re the one exception among all the other, uh, bad sources of traffic in that neighborhood. So it’s always a good idea to do a little bit of homework first, when you’re looking for a place to home your network assets, so that you’re not associated with some of the, well, less savory characters in your network neighborhood.

Ken: Uh, yeah. So I guess in a perfect world, if everybody had all the time in the world, you know, they would look at your domain and they would look up the kind of email you’re sending or the kind of website you’re hosting or whatever, and they’d say, yeah, you know, your ASN, you’re hosting your website in this terrible quarter of the internet, but we’ll give you a pass because you seem to be a good guy.

Ken: But nobody has time for that. 

Andrew: It’s because the internet is so large. Yeah. Yeah. Nobody has time for that. As it turns out, Spamhaus, uh, has time for that.

Ken: Right. Spamhaus is kind of like the outsourced time for that.

Andrew: Exactly. Yeah. And that’s where the business opportunity for an organization like Spamhaus exists.

Andrew: And I think that we’ve done a reasonably good job, right, taking advantage of that opportunity. Uh, we’ve deployed over the years a vast, vast, uh, network that passively intercepts all kinds of traffic, including passive DNS, um, which actually turns out, at least in my view, to be a fairly key component of the data set we offer.

Andrew: DNS, it’s not built to remember. That’s the way I like to put it. DNS has no memory. It’s like the Pacific Ocean. Right? It only knows what it has to know right now to make sure that these packets today get where they’re supposed to go right when they’re supposed to get there. Anything else it needs to know after that, it pretty much forgets because it’s useless.

Andrew: It takes a lot of, uh, resources to retain memory of routes that used to exist in the distant past, where the distant past is 10 seconds ago. Right. Right. Yeah. And just think about retaining all that information for the course of a day, a week, a year. It’s very, very, uh, resource intensive to do that.

Andrew: But we have partners who do exactly that. So we’re able to tell when bad guys start moving around the internet or using fresh resources to continue doing bad things. And that lets us stay ahead, in a lot of cases, of a moving threat and get that information to consumers of Spamhaus data.

Ken: In a way, it’s sort of like you have, uh, so much data on current and historical domain usage, uh, that it’s like a panopticon, right? You see all the moving parts. So if someone tries to hide over here, uh, you know, that collection of domains seems to be related because of previous behavior they had over on this other side of the internet.

Ken: You can see that. 

Andrew: And we can use some pretty straightforward machine learning to fingerprint those behaviors in a fuzzy sort of way, so that small variations will still allow us to accurately identify that traffic as correctly associated with that previously detected actor. And that’s, that’s the key.

Andrew: Uh, being able to do that at scale, uh, is, uh, not easy. It takes a lot of resources and you can’t expect everybody who has a computer on the internet to be able to do that for themselves. So we try to do that for everybody. Yeah. And distribute the data as broadly as we can to try and sanitize the internet.
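
The idea of a fuzzy fingerprint, where small variations still match a previously seen actor, can be shown with a deliberately simple stand-in. The sketch below compares sets of observed traits using Jaccard similarity and a threshold. This is an assumption chosen for illustration, not the machine learning Spamhaus uses, and the trait strings, function names, and the 0.5 threshold are all hypothetical.

```python
from typing import Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Share of traits the two sets have in common (0.0 to 1.0)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def matches_known_actor(observed: Set[str], fingerprint: Set[str],
                        threshold: float = 0.5) -> bool:
    """Fuzzy match: tolerate some changed traits as long as enough overlap remains."""
    return jaccard(observed, fingerprint) >= threshold


# Hypothetical traits recorded for a previously detected actor.
known = {"registrar:r1", "ns:ns1.example", "tld:xyz", "name-pattern:digits"}
# The same actor after switching registrar: three of four traits unchanged.
moved = {"registrar:r2", "ns:ns1.example", "tld:xyz", "name-pattern:digits"}

print(matches_known_actor(moved, known))
```

An exact-match comparison would miss the actor as soon as one trait changed, while the thresholded comparison still links the two observations.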

Ken: Yeah. Because when we think about a domain name, I mean, at the most fundamental level, a domain name is just a series of characters separated by dots, you know, uh, registered at some, but then there’s all this meta information associated with the domain. There’s the registrar where the domain was registered at.

Ken: There’s the create date. There’s whatever they lied about on the registrant record, you know. Um, and, and if you retain that information over a long period of time, you can sort of see, uh, for example, if a domain got re registered at a different registrar, uh, maybe they’re, you know, maybe they lied about their registrant information, but they lied in the same way in 16 other places.

Ken: And you know, if you’re watching all of it, you can see those patterns. 

Andrew: They lied about it in the same way, uh, at this other place, and two years ago. Right. Right. So there’s great separation, not only in terms of the network topology, but also in time. And by accreting that data and analyzing it in a highly automated way and a highly reliable way, we’re able to contribute the data that’s most useful in protecting the internet.

Ken: Amazing. I mean, you sort of think about this kind of data collection and analysis, and you imagine an organization like Google that has a hundred thousand engineers and a hundred billion dollars’ worth of data centers, but Spamhaus is not a vast organization. You must have had to come up with some very efficient ways of managing a ton of data and infrastructure over the years.

Andrew: Well, we did it, but that’s not to say that the infrastructure is small. I mean, right. When I was first hired at Spamhaus, I was invited to go tour the data center in Ashburn, Virginia, uh, which just happens to be across the river from where I live. So I was lucky in that regard. And uh, it was, uh, a staggering, staggering visit.

Andrew: We pulled up to a warehouse that had absolutely no signage. Uh, no windows, no people, nothing. We walk in. It’s a sally port where only one door can be opened at a time and both doors could be closed simultaneously to hold you there for law enforcement if someone decides you’re a bad guy. Um, security and bulletproof glass right away.

Andrew: And there are palm print readers on every door. I had never known that palm print readers were actually a thing outside of Hollywood.

Ken: Yeah, I was thinking of The Bourne Identity, when he goes to the bank and, you know, he puts his hand on the glass. Oh, they get access to his stuff.

Andrew: They’re there. They exist. I was surprised. And then when we moved into the server room proper, just, you know, towers upon towers upon towers of screaming machines and howling air conditioning. Just an incredible sight if, uh, you hadn’t ever had an opportunity to see anything like that before, which I hadn’t.

Andrew: I mean, certainly, I’ve seen pictures and had them described to me in the past. Seeing it in action was definitely something, uh, that was, uh, awe-inspiring to me.

Ken: Wow. Yeah. Interesting. Um, so, uh, you know, speaking about artificial intelligence, um, you know, does Spamhaus use artificial intelligence these days?

Ken: I mean, there’s a lot of talk about AI, certainly. And I think a lot of, uh, AI that isn’t really AI, or it’s just using someone else’s AI API, right? But you did mention machine learning previously. How does Spamhaus make use of AI?

Andrew: Well, machine learning is a subset of AI, which is very different from generative AI, which is the hot new sexy thing right now.

Andrew: We’re not using generative AI. Perhaps we can in the future if we find a useful application for it, but right now we use the machine language to, machine learning, excuse me, to generate the kind of pattern recognition that allows us to do logical fingerprinting of content, of activity, of entities, so that we can attribute them correctly, accurately, to persistent behaviors and then use the identity information to allow our consumers of the data to block traffic.

Andrew: Someday in the future, maybe we could use generative AI to do something fun and cool, but we haven’t thought of a way to do that just yet. I did want to mention that we have, in fact, seen some very enterprising folks, uh, submitting AI-generated delisting requests.

Ken: AI-generated delisting requests?

Andrew: That’s right.

Andrew: So if you’ve never been on the pointy end of a Spamhaus blocklist: one of the golden rules of Spamhaus is there must always be a way out. We always will accept a request to review, uh, any blocking, or any listing, I should say, to see if it was a mistake. And, uh, very often, ESPs, for example, or other network owners will write in to us about a network asset of theirs that has been blocked and say, “We feel this is incorrect because of X, Y, Z.”

Andrew: We will absolutely look at it and we’ll respond, “You’re right, this was our fault, the listing has been removed.” More often than not, however, the listing is righteous and we have to explain why without giving away too much of our methodology, so that a bad actor couldn’t get enough information to avoid it in the future.

Ken: Have you ever gotten a response to one of those tickets, uh, that starts with “As an AI language model”? You know, like, “As an AI language model, I’m unable to discuss phishing because it’s against my ethics,” or something like that.

Andrew: No, we haven’t seen that quite yet. Or we may have, and I just didn’t come across it myself yet. But for a couple of days in advance of our podcast today, I actually was asking our frontline guys to keep their eyes peeled for anything that was very clearly AI generated.

Andrew: And we got a bunch of samples from that effort, but none of them had that disclaimer at the top.

Ken: Yeah. I mean, speculating wildly here, you know, I speculate that generative AI will be used by the bad guys, um, at some point. You know, a lot of their activities, like signing up for domains and having to put something into the registrant box, you know, having to put in some kind of name and organization, all of that can be generated now.

Ken: Uh, and I wonder, you know, you don’t have to answer this, but I wonder, um, how much more difficult that’ll make the job of tracking, uh, these entities as they move around the internet, because they can sort of, um, leverage machine amounts of creativity in those exercises: the behind-the-scenes stuff, the meta information, never mind the content of spam messages and stuff, but the other things that go into running a cybercrime organization.

Andrew: We’re putting some significant thought around those and similar issues. Uh, I think that we’ll need to be able to see a more specific threat before solutions arise. Yeah. For better or for worse, we remain in a defensive posture. We can only respond to new vectors as they arise and it’s frustrating. Yeah.

Andrew: Um, but for now, the response seems to be working at least well enough to keep the internet from crumpling into the sea. So we’ll take that as a win.

Ken: Yeah, I would do that. Yeah, I mean, you know, given the amount of time that I’ve personally spent, uh, just trying to understand this whole new space. I mean, it takes time for the bad guys to understand it as well.

Ken: So at least we’ve got that going for us, right?

Andrew: Yeah. We do see them playing around with the tool, with different submission styles and forms and things like that. But we are also using that same machine learning to identify the sources of those delisting requests and to determine whether in fact they are associated with the object that is being requested to be delisted, right?

Andrew: Are you authorized to speak on behalf of the owner of that domain or of that IP? If not, we’re not going to talk to you. So we can tell when it’s AI, even if it’s not particularly AI looking, because we can see what they’ve said to us in the past. And what we see in front of us today, purporting to come from them, is so very different, without a semicolon out of place, just, you know, perfect.
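
Illustration (not from the conversation): comparing a new message against what the same sender has written before can be sketched with a few crude writing-style features. This is a minimal sketch under stated assumptions, not Spamhaus’s system: the feature set, the `style_features` and `style_distance` helpers, and the sample messages are all hypothetical, and a production system would use far richer signals.

```python
import re


def style_features(text):
    """Return a small vector of crude writing-style features for one message."""
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_words = max(len(words), 1)
    return [
        sum(len(w) for w in words) / n_words,         # average word length
        len(words) / max(len(sentences), 1),          # words per sentence
        sum(text.count(c) for c in ",;:") / n_words,  # commas, semicolons, colons per word
        sum(w.islower() for w in words) / n_words,    # share of all-lowercase words
    ]


def style_distance(a, b):
    """Mean relative difference between two feature vectors (0 means identical)."""
    diffs = [abs(x - y) / max(abs(x), abs(y), 1e-9) for x, y in zip(a, b)]
    return sum(diffs) / len(diffs)


# Made-up messages: two earlier tickets in a casual style, one polished new one.
past = "hi pls remove my ip its not spam we fixed it thx"
similar = "hey can u delist my server it was hacked but ok now"
polished = (
    "Dear Spamhaus Team, I am writing to respectfully request the removal of "
    "our IP address from your blocklist; we have conducted a thorough review, "
    "and the underlying issue has been fully remediated."
)

baseline = style_features(past)
print("same style:     ", round(style_distance(baseline, style_features(similar)), 3))
print("different style:", round(style_distance(baseline, style_features(polished)), 3))
```

The polished message scores much further from the sender’s baseline than the casual one does. Any cutoff for "too different" would have to be tuned on real data, and a large style shift is only a hint, not proof, that text was machine generated.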

Ken: The dead giveaway is that the user agent string says AutoGPT. I mean, that’s the dead giveaway.

Andrew: Um, I’ll update our rules. That’s a great tip. Thank you.

Ken: So we seem to be talking about the future now, so we might as well transition. Uh, what do you see as the biggest challenges facing the cybersecurity industry in the years ahead?

Ken: And honestly, considering how fast things are moving right now, how about the days ahead? 

Andrew: I think that ransomware attacks, uh, phishing attacks, mobile banking malware. Um, these past three years have been a very interesting period in, uh, the internet and in internet security generally because of the pandemic.

Andrew: And if we are to see more events like the pandemic occur in the future, then we’ll need some kind of a standing response to those issues. What we saw very specifically is that folks were suddenly working from home. Most did not already have a company-deployed device in their hands when they were told to stay home, uh, and stay away from everybody else.

Andrew: So they wound up in many cases using personal devices, and the operational security for those devices is never going to be quite as good as it is for, well, one hopes, uh, a company-owned and remotely managed device. So what we saw was a new take on some of the old classics, some of the golden oldies, right?

Andrew: Right. We talked about, uh, use of malicious proxies. Uh, usually, or at least in the past, folks have actually stolen access to those proxies to do bad things on the internet. Now, what we see is that folks are being proffered an opportunity to, uh, download and install apps on their mobile devices, uh, the same mobile device they’re using for, uh, company communications and connecting to company network assets with.

Andrew: And these apps, in the terms and conditions, say, “By installing this app, you are agreeing to let us use you as a proxy for whatever malicious traffic.” Wow. They’re actually getting permission. They’re doing what we’ve always asked them to do. Get permission before you start sending traffic. So they’re getting permission.

Andrew: Of course, it’s not informed consent, but they’re actually opening the door and letting them in, inviting them onto those devices and creating new paths, new holes in network security for that traffic to pass right through. So we are just now coming down off of an enormous bolus of those and related issues.

Ken: Wow. So, uh, we should absolutely expect more of that in the future if there are other pandemics in the future. Uh, you hope not. We should hope that this is a once-in-a-hundred-years sort of thing. Um, but, uh, obviously, who knows?

Andrew: Um, I certainly hope so. Yeah. And maybe we have the experience now to know what needs to be done in order to keep it from being quite so explosive in its effect.

Andrew: And maybe those same measures will keep the same or similar internet issues at bay.

Ken: I somehow think that every time, uh, you know, every time, uh, cybersecurity learns of a new threat and neutralizes it, then there’s just another iteration and another attack vector and it gets exploited. Uh, and so I don’t think it’ll ever be job done.

Ken: Um, so what is, what’s on the horizon for Spamhaus in terms of improving things for the internet community? 

Andrew: One idea that we’ve been putting some resources behind is creating a community very much like the one, um, where it all started back in 1998, except open to all. There are lots and lots of smart people out there who know lots about the internet and other things.

Andrew: Um, they may not be internet professionals, but they still know, uh, where the bad guys live or how to find them. I mean, we’re talking about Steve Linfords in backyards all over the world. So let’s create a space for them to gather, the same kind of space that Linford created back in ’98, and use that community not only as a path for us to distribute what we’re detecting, what we’re learning with our resources, but for those folks to gather and collect signal to pass back up to us.

Andrew: What are we missing? What haven’t we seen yet? What can we learn from the folks who are consuming the data that we provide? 

Ken: So really kind of going into the open source intelligence world. 

Andrew: If you want to use that term, I think that’s totally fair, right? And there is a system of personal credentialing that happens inside.

Andrew: In other words, you can become more and more of a trusted source and badged that way or labeled that way. So the question is, can we get some kind of geodesic trust model inside that community so that other participants in the community can gauge the value of the information that they see from other participants?

Andrew: Interesting. Interesting. It should be interesting. Uh, it’s been a success so far. The more folks that participate, I think, the better chance it has to actually make facilitation.

Ken: And, you know, uh, public blocklists, uh, like the various Spamhaus advisories, um, have been obviously instrumental in helping to secure the internet, or at least to segment the internet into known safe and unsafe parts.

Ken: I mean, let’s face it, like, that’s probably the most we can do. We can’t totally secure everything with blocklists, but at least we can, you know, use them to know, uh, what are the relatively safer parts. Um, how do you see the role of public blocklists evolving in cybersecurity generally?

Andrew: Well, I think that they’re here to stay. That seems clear. They’re critical, at least for the email world to function, and most other types of traffic as well. But that said, we know that the data can do a lot more. And the data that we have available is much larger than the data that actually gets used typically.

Andrew: And it’s very contextualizing data. It enriches that binary block/don’t-block kind of decision that is usually used by most other platforms and ISPs. So we’re working with threat intelligence providers in our orbit. Some of them are already partners. There’s some outreach going to new folks that we think would be valuable partners in the space so that the data can be used for vulnerability management.

Andrew: Uh, and ESPs also, uh, where the data is being used for customer vetting processes, right? It’s a hard thing to learn that this very valuable customer turned out to be a spammer and then you have to cancel their contract. That’s something that nobody wants to have happen.

Ken: Painful experience for the salesperson.

Andrew: Well, deliverability side either. Yeah. Uh, and so there’s a whole bunch of possible ways to be utilizing the data beyond just the typical vanilla binary decision of whether to pass or block traffic associated with those sources. 

Ken: I see. So like if I’m, uh, trying to interpret what you’re saying, you, you, one of the visions that Spamhaus has for the future is to try and surface more of this vast amount of data that you have.

Ken: Uh, and make it useful. 

Andrew: We don’t imagine for a moment that we have thought of all of the possible ways in which this big bolus of data could be used. We think we have some pretty good ideas because we spend all day with it and breathe it and sleep with it and, uh, you know, um, try and grow it and make it more and more useful and more and more accurate all the time.

Andrew: But somebody else, we think, is going to think of something that we didn’t think of yet ourselves. We’re not quite so proud to think that we’ve considered all the possibilities there. And that’s why, for example, we’re opening up that community that I described earlier and why we’re partnering with ESPs and with other threat researchers and intelligence organizations to see how they might better make use of the data.

Andrew: Um, we really want to effect change with the big ISPs, the big hosting providers, the registries, the registrars, and so on, to really step up. And we hope that by giving them access to all the various flavors of the data and helping them organize it so that it’s easily consumable, they’ll be able to finally shift the fight from that defensive posture that it’s always been in since 1998 into a more preemptive posture, where we can stop badness before it actually escapes out onto the internet.

Andrew: That would be a huge sea change in the entire threat environment. And we think that we’re uniquely positioned to effect that change, but we can’t do it ourselves. And so that’s why you’re seeing, in the last few months, these outreach efforts to all the other stakeholders within the environment.

Ken: So it’s kind of, you know, it’s like Spamhaus comes out of the closet. Spamhaus, you know, uh, it comes out of the back room. You’re trying to get out there, uh, and work with more partners to make use of this valuable intelligence that you’ve always had because, uh, that’s what’s really needed to make a significant difference on the internet for cybersecurity.

Andrew: That’s right.

Andrew: The further up the funnel we can push the defense against whatever type of malicious traffic you happen to be thinking of at the moment, the less cost is accrued to the network owners, the less damage is inflicted on the users of the internet, and the safer and more trusted place the internet can be.

Ken: Now, uh, this is an out there question. Uh, but, you know, think like 50 years into the future. Do you think Spamhaus will still be a force 50 years from now? 

Andrew: We will be, but I can’t predict exactly how. I wouldn’t have predicted in 1998, when I was working in an office admiring Spamhaus from afar via Usenet, uh, that they would be doing the things that they’re doing today, passively collecting and condensing historic DNS data, for example, and packaging that for consumption by third parties. Never would have guessed.

Andrew: So I don’t know what’s coming. Maybe it has to do with AI that we discussed earlier. But we think that if we’re going to persist in the space, if we’re going to keep doing what we’re doing, we’re not going to be doing it alone.

Andrew: I think that any significant progress has to involve some of these third parties that we described earlier. We’re really looking at the big ISPs, the big names that start with Ss, that start with G, right? Um, they’re also making resources available and sometimes those resources get abused. I don’t think they’re doing that on purpose.

Andrew: I don’t think they’re bad guys. I don’t think they want to be bad guys. I think that most of the time they just don’t know what’s going on in their own basement. And if we can get the data in their hands, maybe they can stop that stuff that’s happening in their basement, and with a little bit more effort beyond that, they can stop it from beginning in the first place, preempting that activity before it becomes an issue.

Andrew: That’s harder to measure because you can never prove a negative. I can never say, look at what didn’t happen because we did our job. But I don’t think we need that to feel good about ourselves. It has to happen whether we can take credit for it or not. Um, we want to succeed as a business, but, to a person, everyone who works at Spamhaus wants to do what they can to save the internet.

Andrew: And that sounds dumb and hopeless and naive, but it’s kind of true. 

Ken: You do kind of wonder what the internet would look like if Spamhaus never existed in the first place, or, you know, if it had failed early on for any number of reasons. You know, where would we be now? Would someone else have taken up the cause?

Ken: Would it just, you know, would we not have seen nearly as much economic progress? I mean, it’s possible, right? I think it’s possible. 

Andrew: Yeah. I think we’d see, at a minimum, a lot less business transacted automatically and electronically.

Ken: Right. Well, it’s been a real pleasure talking to you, Andrew. Um, thank you very much for talking to me about the past, present, and future of Spamhaus.

Andrew: Thanks for having me.

Ken: Of course, of course. Uh, we’ll really look forward to, you know, chatting again in a year or so, when we’ve seen what the impact of generative AI has been on the, uh, cybersecurity space. I think we’re both in for some real surprises around the corner, and we can’t really predict what that’ll look like.

Ken: We’ll see you then. 

Andrew: Yeah. See you then.

