LinuxDevCenter.com
oreilly.comSafari Books Online.Conferences.

advertisement


Defending Your Site Against Spam, Part 2

by Dru Nelson
07/24/2003

This article is the second and final installment describing my efforts to defend my systems from spam. The first article explains some necessary concepts and terminology. This article will dig more into the details of an actual implementation with my mail system. One thing to note is that I used qmail for my mail system (hence the title), but the information in here could apply to just about any email server in production today.

RBL Providers Today

In the previous article, I covered the history and the protocol used by network level spam defenses but not the existing landscape of RBL Providers that supply blocklists. There are quite a few of them out there, and I needed to select one for my system. At first, I polled a few friends to see what they were doing. Most had tried a blocklist at some point. I found that some people mix various blocklists and usually don't trust them enough for their corporate machines. Some switch providers periodically since the quality isn't stable. One friend had to get new blocks of IP addresses for his service since he was in the same network block as a spammer. This form of collateral damage caused him to be very negative about the whole subject.

The biggest problem with a network level defense has been the providers. Although they all conform to the protocol, they vary wildly on what goes into their blocklist database. This is a direct result of how they came to be. Almost all of them are grassroots efforts by individuals or small groups, each with different opinions and policies about how they operate. Some groups are very private and do not disclose much about their members or their policies. Some of them have very sensitive trigger fingers and others do not. Some groups are very aggressive about adding to their lists while others are extremely lax.

Without a good policy and consistent operation, I was leery of letting anybody control my mail server. It is easy to understand why it's important to be cautious about this. No one wants to disrupt their incoming mail. If an RBL provider consistently blocks the wrong hosts, it might take a while to fix. I might have to build my own special list of exceptions to their blocklist. This seemed like the kind of work I didn't want to add to my already long list of duties.

Towards the end of my research, I had one conversation that turned out to be really interesting. I used to work with a guy named Mark Fletcher at eGroups. He had been working on the spam problem for a little while and we'd come to the same conclusions. However, he had decided that he had a new twist and he started a service called Trustic to provide it.

Trustic

Fletcher decided to build a trust network for email servers on the Internet. Trust networks are becoming a popular technique for getting good results out of systems that could potentially include untrustworthy individuals (such as spammers). The best and most impressive examples of trust networks today are eBay, Google, and Advogato.

Like other trust systems, Trustic takes recommendations from registered users about IP addresses of other systems. Each user has a level of trust. Users build their trust by making accurate recommendations over time. In order for a host to become untrusted, the cumulative trust level of the recommendations has to be above a certain threshold. For example, if two well-trusted users mark a server as untrustworthy, that server would become untrusted. If a bunch of new users tried to mark a server as untrusted, it would take a large number of them. If someone abuses their trust rating, their trust level is reduced and becomes harder to raise. There are more descriptions on Trustic's web site, but that is the gist of the technique.

The trust system provides a blocklist that allows the many good users of the Internet to mark things as trusted or un-trusted very quickly. Unlike a lot of other RBL providers, Trustic also added a good web interface, email reports, and other features that make this system easy to use. As a result of these policies and the web interface, it was easy to trust blocking email to Mark's system. As an added bonus for me, he incorporated a lot of my feedback into the design before he fully launched in January of 2003.

To use his service, I logged in and got an account. The registration gave me a number to use when making an IP4R query of the system, when forwarding spam to the Trustic system, or when sending and receiving recommendations. The number allows Trustic to tailor its responses to the particular user. Each user can have his own blocklist rules. Originally, Trustic relied on the registrant's IP address for submissions or queries. This didn't work with the dynamic IP addresses assigned by most ISPs.

The Wonderful World of qmail

Now that I have a provider for blocklist information, how do I use it with my mail server? Well, before we get to that, we must go over qmail a bit in case there are any readers who aren't yet comfortable with qmail configuration. Once that little step is out of the way, it is a lot easier to understand the actual steps involved to integrate with Trustic or some other IP4R provider. In fact, it took me much more time to write this article than to install and test this setup. (If you don't run qmail, you can skip this section, but if you are curious, it shouldn't be a hard read.)

qmail, written by Dan Bernstein (DJB), follows a few simple principles in its design. Once you understand these, understanding what I'm about to describe becomes much easier. In order of priority, DJB wanted qmail to be secure, extensible, reliable, and fast. One of the first design decisions was to divide the email system into several smaller programs. This is the same successful technique that the original Unix architects used when building a text processing system from commands like grep, cut, or cat. Each small program that makes up qmail does just one particular task and usually runs as a less-privileged user. Each tries to do the absolute minimum required for that particular task. This keeps the programs from "becoming too many things to too many tasks", which usually causes programs to look like "spaghetti code".

Because the programs and code are smaller and simpler, the code is easier to both debug or audit. Also, when the programs do interact with each other, they use a simple pipeline API. Such a system allows you to insert or remove different pieces from the pipeline for your specific situation. Finally, the programs are designed to be efficient with the operating system's resources (CPU and memory). This has a nice side effect of making the system fast even on older systems.

To summarize,

  1. The programs do the minimum necessary for their roles, so they remain small and secure
  2. The programs conform to a pipeline dataflow model for extensibility and security containment
  3. Efficient resource usage keeps them small and fast

As a result of qmail's design, it was an easy choice for many of the Internet services I worked at in the past. It was easy to scale the package to handle enormous mail loads. It was free and it worked on our favorite operating systems. It also won the security contest hands down compared to any other mail package. Naturally, I would end up running the system for my own personal mail system. I have been running the same code for about 5 years without recompiling. I can't remember how many times I've seen other mail systems get listed on CERT or Bugtraq. I've never had any issues with qmail, and I like that a lot. It just runs and runs.

Pages: 1, 2

Next Pagearrow




Linux Online Certification

Linux/Unix System Administration Certificate Series
Linux/Unix System Administration Certificate Series — This course series targets both beginning and intermediate Linux/Unix users who want to acquire advanced system administration skills, and to back those skills up with a Certificate from the University of Illinois Office of Continuing Education.

Enroll today!


Linux Resources
  • Linux Online
  • The Linux FAQ
  • linux.java.net
  • Linux Kernel Archives
  • Kernel Traffic
  • DistroWatch.com


  • Sponsored by: