Revisiting Greylisting

Some time ago, in my battle against spam, I started using greylisting. Greylisting, for those not familiar with it, takes advantage of a standard fallback in email delivery. If a server is running but temporarily unable to accept incoming messages, it can reply with a temporary error code (an SMTP 4xx reply) saying so. RFC-compliant sending servers are supposed to queue the message and retry after a timeout period. Why is this useful? Because most spammers don't use compliant servers or don't take the time to retry.

A greylist program tracks these connections, sends a temporary error for new connections, and creates a 'triplet' record: the sending server's IP address plus the envelope sender and recipient addresses. After a specified timeout, a new connection matching these three fields will be accepted. The first message from a client will be delayed, but subsequent messages will not be, as long as the triplet matches.
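
The triplet check described above can be sketched in a few lines of Python. This is only an illustration, not the code of any particular greylisting package: the class name `TripletGreylist`, the 300-second delay, and the in-memory dictionary are all assumptions, and a real implementation would also persist its records and expire old ones.

```python
import time


class TripletGreylist:
    """Hypothetical in-memory greylist keyed on (client IP, sender, recipient)."""

    def __init__(self, min_delay=300, clock=time.time):
        self.min_delay = min_delay  # seconds a new triplet must wait (assumed value)
        self.clock = clock          # injectable so the delay can be simulated
        self.first_seen = {}        # triplet -> time of the first attempt

    def check(self, client_ip, sender, recipient):
        """Return an SMTP-style reply code: 250 to accept, 451 to defer."""
        triplet = (client_ip, sender, recipient)
        now = self.clock()
        # Record the first attempt; later attempts reuse the stored time.
        first = self.first_seen.setdefault(triplet, now)
        if now - first >= self.min_delay:
            return 250  # the client retried after the timeout
        return 451      # new triplet, or a retry that came too soon
```

A first call to `check()` for a given triplet returns 451; the same triplet presented again after `min_delay` seconds returns 250, while a different sender from the same IP starts over with its own delay.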

I tried greylisting with a variety of implementations, with varied success. The major impediment was my use of a backup server, which severely reduces the effectiveness of many antispam strategies. In this case, since the backup server (outside my control) wasn't greylisting, the deferred email was simply accepted by the backup server and funneled into my mail from that trusted relay. No good!

A second problem I have with most greylist implementations is that they overlook the fundamental concept - that we are looking at the behavior of a server (or client), not of the email's sender (author). Once we know that Server X will retry after receiving the temporary error, the email addresses on the envelope are irrelevant. So why do we keep checking triplets? I am currently experimenting with a newer approach: a triplet is stored and tested for unknown clients, but once a given client has successfully passed the greylist, we allow all future connections from that IP without a delay. Since it has proven to us that it will retry in the future, there is no point in delaying it.
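
Here is a rough sketch of that idea in Python, under the same caveats as any toy example: the class name `ClientAwareGreylist`, the `proven_clients` set, and the 300-second delay are made up for illustration, and nothing here is persisted or expired. Unknown clients are held to the triplet test; once any triplet from an IP passes, that IP skips the delay from then on.

```python
import time


class ClientAwareGreylist:
    """Hypothetical greylist that trusts an IP once one of its triplets has passed."""

    def __init__(self, min_delay=300, clock=time.time):
        self.min_delay = min_delay    # seconds a new triplet must wait (assumed value)
        self.clock = clock            # injectable so the delay can be simulated
        self.first_seen = {}          # triplet -> time of the first attempt
        self.proven_clients = set()   # IPs that have shown they retry

    def check(self, client_ip, sender, recipient):
        """Return an SMTP-style reply code: 250 to accept, 451 to defer."""
        # A client that has already retried correctly is never delayed again.
        if client_ip in self.proven_clients:
            return 250
        triplet = (client_ip, sender, recipient)
        now = self.clock()
        first = self.first_seen.setdefault(triplet, now)
        if now - first >= self.min_delay:
            # The retry proves the server's behavior, so remember the IP itself.
            self.proven_clients.add(client_ip)
            return 250
        return 451
```

With this version, a second sender using an already-proven IP gets 250 on its very first attempt, which is exactly the delay the plain triplet check would have imposed for no benefit.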

I can think of a few potential problems with this strategy:

  • Dynamic IPs: the IP we saw before might be assigned to a different computer now. True, that's a possible problem, but most dynamic IP ranges are blacklisted by DNSRBLs and never reach the point of the greylist. I think this is manageable.
  • Innocent but non-compliant servers: some legit servers don't retry, for whatever reason. (Why, if you are running a legit server, can't you comply with the accepted rules?) I have a whitelist for these servers.