Raptor Email Anti-Spam Compendium & FAQ

Explanation of Threats


Malware is a general term used to describe malicious software which causes unwanted, intrusive operation of a computer, normally unknown to the user.

Malicious software includes but is not limited to viruses, adware, trojans, worms, and spyware. Common infection vectors for malware include email attachments, intentional or drive-by downloads, and removable media such as thumb drives.


Macros are executable extensions of specific programs which are designed to automate long and tedious tasks. While some macro languages are limited from a programming perspective, many try to extend their usefulness by calling outside programs.

Since some programs, such as document processors and office suites, allow the embedding of macros, it is possible to construct a malicious document that will download and run malware as soon as it is opened.

We work to score Office documents with Macros so that they are considered spam due to the risk in receiving them.


A computer virus is one type of malware which can spread by itself. As Wikipedia puts it, “the term “virus” is also commonly used, albeit erroneously, to refer to many different types of malware and adware programs.”


Adware is a type of software which displays or injects internet advertisements in an attempt to gain the author ad-revenue. Some adware may come from legitimate companies to support a business model, and may come bundled with your computer. Other sources may present unwanted pop-up ads, and generally are classified as having malicious intent.


A trojan, or “trojan horse”, is malware that masquerades as or is bundled with legitimate software. Sophisticated trojans, coupled with the implicit trust that a computer user unknowingly grants to the malware, are capable of crippling or disabling anti-virus software entirely, while concealing the problem from the user to evade detection.


A worm is a type of malware designed to replicate itself to spread to other computers or servers, usually relying on security flaws in physical networks to spread to as many computers as possible [1].


Spyware is malware designed to silently steal information about an infected computer’s user by logging keystrokes, accessing local files, and collecting stored application data to be sent back to the spyware author.

Some spyware is the direct payload of a trojan, although some has been known to spread as a virus.


Ransomware, sometimes known as cypherware, is a malicious program which encrypts personal documents stored on computers or otherwise restricts access to the computer, holding the computer “hostage” and demanding money in exchange for the decryption or access key. While some ransomware is trivial to defeat, the best defense against ransomware is to keep recent backups of all your personal documents.

Buffer Overflow

A buffer overflow is an unintentional flaw in software which can be exploited to run malware with the same privileges as the exploited program. An old or outdated browser may contain known buffer overflow exploits, which can be exploited to run malware through a specially crafted website. Keeping all installed software up to date is important to preventing security issues from buffer overflows.

Raptor Email Anti-Spam Compendium

This knowledgebase is intended to be read by administrators as well as SpamAssassin users who want to learn more about e-mail and anti-spam. It is a living document and we welcome your comments and suggestions, which can be submitted by using the Report a Problem page.

What is E-mail?

E-mail is the amalgamation of many plain-text protocols, and other specifications, defining the structure of the body of an e-mail, associated header information, and how to send and receive e-mail. E-mail predates the internet and has a complicated history. E-mail as it is defined today is specified by a number of RFCs. Occasionally a new RFC will be published that includes minor refinements.

A basic example of sending an e-mail is below:

helo intel1.pccc.com - THIS IS THE "HELO"(aka "Hello") GREETING,

mail from:
rcpt to:
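
The exchange above can be sketched as the list of commands a client sends for one simple message. This is only an illustration: the function name and the example.com / example.org addresses are placeholders that do not come from the original example, and nothing is sent over a network.

```python
def smtp_commands(helo_name, sender, recipient, message_lines):
    """Return the commands a client would send to deliver one message."""
    commands = [
        f"HELO {helo_name}",
        f"MAIL FROM:<{sender}>",
        f"RCPT TO:<{recipient}>",
        "DATA",
    ]
    for line in message_lines:
        # A message line that starts with "." gets an extra "." so it is
        # not mistaken for the end-of-message marker.
        commands.append("." + line if line.startswith(".") else line)
    commands.append(".")     # a lone dot ends the message
    commands.append("QUIT")
    return commands


# Placeholder addresses, for illustration only.
for command in smtp_commands("intel1.pccc.com", "alice@example.com",
                             "bob@example.org",
                             ["Subject: test", "", "Hello."]):
    print(command)
```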





After you send an e-mail any number of things can happen, depending on how your mail server is configured, who you are sending to, and even how many other people are sending e-mail as well.

In the very basic hypothetical example, you are on a Unix-like operating system sending a message from one user to another on the same machine. Because you are an old fogey, you open up mutt and send a message from yourself@localhost to root@localhost. It contains the message “[sudo] password for root:“. You submit the message to localhost, which then “sends” the message to itself. When localhost receives the message from itself, it places it in the root mailbox, ready for reading. The same process repeats when the root user replies with the message “hunter2“.

What is Spam?

Many people try to use various legal definitions, such as CAN-SPAM in the US. This definition clearly has problems for the global community of the internet. Spammers routinely fail to understand that the Internet is much larger than one country’s laws.

We find the definition of CONSENT to be best. It leaves the content of any e-mail in the eye of the beholder. In short, if you consent to receive e-mails about XYZ, then those e-mails are not spam by definition.

The name “spam” refers to a skit by Monty Python’s Flying Circus which involves an overload of unwanted items offered at a café all including SPAM. A patron who doesn’t like SPAM certainly won’t like “SPAM SPAM SPAM SPAM SPAM SPAM SPAM baked beans SPAM SPAM SPAM and SPAM”. Hence, an overload of unwanted items became known as spam (but not SPAM).

Hormel, naturally, defines SPAM as their spiced ham product, and the use of their trademark to refer to junk e-mail has required their lawyers to come up with an entire position on the matter. To avoid confusion in this document, the Hormel trademark will always be written as “SPAM” in all-caps.

What isn’t Spam?

In the anti-Spam community, Spam and Ham are opposites. Spam refers to Junk E-mails and Ham refers to Good E-mails.

Since Ham is “Treif”, or non-kosher for those who follow Jewish dietary laws, some people have proposed using the word Yam, instead of Ham. We are firmly anti-vegetable and will not stand for such nonsense.

Who Sends Spam?

Spam is sent from all over the world. More and more seems to be sent by organized crime. Additionally, Bot networks of infected computers often send massive amounts of spam.

Spammers often obtain e-mail addresses by using dictionary attacks. Basically, they send e-mails to lists of names. Unsubscribe links help verify that an e-mail address exists.

Spammers also use e-mail addresses posted publicly on websites. They also often employ malware to obtain address books off computers.

Spammers send spam to make money by selling something, to make money by stealing from you, or to trick you into installing malware that lets spammers send even more spam, repeating the cycle.

Email Hygiene Recommendations

Raptor Email Security will accept and score messages from senders with poor email hygiene but it’s more likely that they will be marked as spam. This is made even more likely if there are many very similar messages coming from a common source. To avoid this, senders are encouraged to follow postmaster best practices including:

1. Have a non-generic valid reverse pointer (DNS ‘PTR’ record) for your mail relay’s IP address: a hostname which does not include the IP address in any manner and which has an A record that resolves back to the IP address used for outbound SMTP. Ideally this name is also the one used by the mail relay to identify itself in EHLO/HELO commands and Received headers.



Using a name that is clearly intentional and has simple proper resolution sets your sending machine apart from the botnets populated by devices like PCs, mobile phones, and cloud VMs which are frequently compromised and used to send spam.
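
The two conditions in this recommendation can be sketched as below. This is a rough heuristic, not how any particular filter works: the function names are made up for illustration, only IPv4 is handled, and `resolve_a` is a hypothetical callable you would supply to look up the A records for a name.

```python
import ipaddress


def looks_generic(ptr_name, ip):
    """Heuristic: does the PTR name embed the IPv4 address in a common form?"""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    name = ptr_name.lower()
    # Try the octets in forward and reverse order, joined in common ways.
    for parts in (octets, octets[::-1]):
        for separator in (".", "-", ""):
            if separator.join(parts) in name:
                return True
    return False


def forward_confirmed(ptr_name, ip, resolve_a):
    """True if one of the A records for ptr_name is the original IP."""
    return ip in resolve_a(ptr_name)


print(looks_generic("192-0-2-25.dyn.example.net", "192.0.2.25"))
print(looks_generic("mail.example.com", "192.0.2.25"))
```

Passing the resolver in as an argument keeps the sketch independent of any specific DNS library.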


2. If you send ‘bulk’ mail (i.e. mail that isn’t from one human to a small number of other humans) you should use a unique domain name (such as a subdomain of your primary domain) for each class of mail you send. Person-to-person, transactional B2C, discussion lists, and marketing email should each have their own domain with their own user namespace and DNS records (SPF, DKIM, DMARC, etc.)



Different classes of mail carry different reputational risks, different importance, and different handling requirements. Using distinct domains aids in this. It also provides flexibility so that you can handle different classes on different infrastructure or selectively outsource mail functions (e.g. mailboxes to a mailbox provider, marketing to a marketing ESP, etc.).


3. Use SPF records to publish the legitimate sources of your email, ideally with a -all ending but definitely not with ?all. If your users might use their individual addresses with online systems that send mail on their behalf, you may need to use “~all” to let them work.



SPF is a policy that you set per domain that lets other servers on the internet know what IPs are permitted to send email for a domain. SPF records are DNS TXT records made up of a series of rules matching legitimate client IPs and can end in -all, ~all, or ?all, indicating how non-matching IPs should be viewed: definitely bad, probably bad, or neutral. You should publish records ending in “-all” for any domains used for any sort of bulk mail, as you should be able to define all of the legitimate sources for those precisely. If you have user addresses used by individual humans in a domain, you may want to use ~all to accommodate the fact that people will use their email addresses in ways you can’t predict. Using ?all may seem even safer but in practice that is considered by some receiving sites to be a sign of a domain run irresponsibly or incompetently.
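
As a small illustration of the three endings, the sketch below reads the "all" term from an SPF TXT record and reports how non-matching senders should be viewed. The function name and the sample records are hypothetical, and the sketch deliberately ignores everything else in the record, including any redirect= modifier.

```python
def spf_all_policy(txt_record):
    """Return how an SPF record says non-matching senders should be viewed."""
    terms = txt_record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return None  # not an SPF record
    meanings = {"-": "definitely bad", "~": "probably bad",
                "?": "neutral", "+": "allowed"}
    for term in terms[1:]:
        term = term.lower()
        if term == "all":
            return meanings["+"]  # a bare "all" means "+all"
        if len(term) == 4 and term[1:] == "all" and term[0] in meanings:
            return meanings[term[0]]
    return "neutral"  # no "all" term (redirect= is not handled here)


print(spf_all_policy("v=spf1 ip4:192.0.2.0/24 -all"))
print(spf_all_policy("v=spf1 mx ~all"))
```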


4. Add DKIM signatures to all mail that you originate OR relay (such as discussion mailing lists.) Use the “relaxed/relaxed” canonicalization, which normalizes aspects of the header and body of signed messages.



DKIM (DomainKeys Identified Mail) is a mechanism for adding a cryptographically secure signature to an email message as a header line. It signs a hash of the canonicalized message body and an arbitrary set of canonicalized headers, and is verified with a public key published in the DNS. The “relaxed” mechanisms for canonicalizing the body and headers help protect the signature from trivial breakage due to immaterial formatting changes that can happen in transit.
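
To show why “relaxed” canonicalization survives immaterial formatting changes, here is a simplified sketch of the header half of it: the field name is lowercased, folded lines are joined, and runs of whitespace are collapsed. The function name is made up for illustration, and this is not a full DKIM implementation (no body canonicalization, hashing, or signing).

```python
import re


def relaxed_header(name, value):
    """Canonicalize one header field roughly as the "relaxed" rule does."""
    value = re.sub(r"\r?\n", "", value)      # unfold continuation lines
    value = re.sub(r"[ \t]+", " ", value)    # collapse runs of whitespace
    value = value.strip(" ")                 # drop whitespace at both ends
    return f"{name.strip().lower()}:{value}\r\n"


# Two differently formatted copies of the same header...
original = relaxed_header("Subject", " Hello World")
refolded = relaxed_header("SUBJECT ", "  Hello\r\n\tWorld  ")
# ...canonicalize to the same bytes, so a signature over them still verifies.
print(original == refolded)
```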


5. Publish a DMARC record. Watch the reports of errors (or hire someone else to do so, e.g. Dmarcian) and address any reports of legitimate mail failing DMARC by fixing SPF or DKIM signing errors. Use a “p=none” policy unless you have a strong reason to do otherwise.



DMARC (Domain-based Message Authentication, Reporting, and Conformance) is a framework which uses SPF and/or DKIM to authenticate the From header and uses a formatted TXT record in DNS to publish reporting parameters and domain policy for how receivers should treat non-authenticated messages using the domain in their From header. The safest DMARC policy attribute is “p=none” which means you don’t recommend any particular action by recipients who encounter a DMARC failure in mail claiming to be from the covered domain. It is also possible to use “p=quarantine” or “p=reject” but you should understand the risks. A ‘quarantine’ policy means that you are telling receiving sites that DMARC failure is a serious enough problem that they should not deliver it normally, which often means mail silently disappears, giving no indication to the intended recipient that it ever existed while indicating to the sender that it was accepted for delivery. A ‘reject’ policy means that they should not accept the mail at all but rather simply reject it at the SMTP “end of data” stage. Because DMARC can be broken by mail handling practices that have traditionally been considered perfectly normal (e.g. transparent forwarding, mailing lists that add topic tags, some types of canonicalization of addresses in headers, etc.) you should always use p=none for domains that have human users who might try to post to a mailing list or send to addresses that are forwarded.
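
A DMARC record is a semicolon-separated list of tag=value pairs, published as a TXT record under the _dmarc label of the domain. The sketch below pulls out the policy tag; the function name and the sample record, including its reporting address, are hypothetical.

```python
def dmarc_policy(txt_record):
    """Return the p= value of a DMARC record, or None if it is not one."""
    tags = {}
    for part in txt_record.split(";"):
        if "=" in part:
            key, _, value = part.partition("=")
            tags[key.strip().lower()] = value.strip()
    if tags.get("v") != "DMARC1":
        return None  # not a DMARC record
    return tags.get("p", "").lower() or None


# Hypothetical record for illustration.
print(dmarc_policy("v=DMARC1; p=none; rua=mailto:reports@example.com"))
```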


6. Deploy DNSSEC for any domain used in email.


DKIM, DMARC, and SPF all publish data that is critical to authentication via the DNS, which by default is accessed via unencrypted UDP, subject to spoofing. DNSSEC assures that the records a client gets from DNS were in fact published by the domain owner (or at least, by someone able to insert records in both the parent and subject domain.) Without DNSSEC, some mail systems will treat the records needed for DKIM, DMARC, and SPF as non-existent.


7. If you have end users who generate email, train and encourage them to:

  • Send plain text email when possible, never send HTML-only email.
  • Use OpenPGP or S/MIME to sign messages routinely.
  • Use OpenPGP or S/MIME to encrypt messages as needed.


HTML in email weakly correlates to a message being spam, so using it (even as one alternative in a multipart/alternative message) increases the risk of causing “false positive” results from some spam filters. Plain text email is much less prone to false positives and is easier for recipients to identify as a false positive. Some people use MUAs which can’t render HTML, so HTML-only mail is inconsiderate in addition to being more likely to be scored as spam. OpenPGP and S/MIME are standard mechanisms for cryptographically signing and/or encrypting email, usually implemented in end user MUAs. As with HTML, but reversed, there is a correlation between mail being signed and/or encrypted in a standard fashion and being NOT spam. Some spam filters will use the fact that a message is signed and/or encrypted as a positive indicator, and some will use a cryptographically authenticated sender as a key for exempting mail from some or all filtering.

No Guarantee!

All of the best practices above are considered “best practices” on a purely empirical heuristic basis. You can do all the right things technically and still have your mail caught in a spam filter, even if it is not spam.

Servers that Handle E-mail

The typical handoff of e-mail between servers is described in detail on Wikipedia. There are a few major entities involved in the process of sending e-mail from one place to another.

MUA – Mail User Agent or e-mail reader or e-mail client

Any agent acting as a client toward an e-mail server, regardless of it being a mail user agent, a relaying server, or a human typing on a terminal. In addition, a web application providing message management, composition, and reception functionality is sometimes considered an e-mail client.

An MUA is only active when a user runs it. Messages arrive on the Mail Transfer Agent (MTA) server. Unless the MUA has access to the server’s disk, messages are stored on a remote server and the MUA has to request them on behalf of the user.

As a basic function, an MUA is able to introduce new messages into the transport system. Typically, it does so by connecting to either an MSA or an MTA, which speak two variations of the SMTP protocol. The client needs to hand off a message quickly without worrying about where the message will eventually be delivered: that’s why a transport system exists. Thus it always connects to the same preferred server. But how does that server know that it should accept and relay submissions from that client?

There are a number of ways. One way is to recognize the client’s IP address, e.g. because the client is on the same machine and uses an internal address, or because the client’s IP address is controlled by the same internet service provider that provides both internet access and mail services.

Another way is to authenticate since the SMTP protocol has an authentication extension. The latter method eases modularity and nomadic computing but can cause e-mail passwords to be sent using a clear-text protocol.

Finally, there is a technique called POP Before SMTP, which requires a user to check mail before sending e-mail. The server records the IP address used to check mail and allows relay from it for a temporary period of time.

Client settings require the name or IP address of the preferred outgoing mail server, the port number (25 for MTA, 587 for MSA), and the user name and password for the authentication, if any. It is also typically possible to use port 465 for SSL encrypted SMTP sessions (SMTPS) that many clients and servers support.

Transport Layer Security (TLS) encryption can be configured for the standard ports, if both the client and the server support it.


MSA – Mail Submission Agent

MSA is a software agent that receives electronic mail messages from a mail user agent (MUA) and cooperates with a mail transfer agent (MTA) for delivery of the mail. It uses a variant of the Simple Mail Transfer Protocol (SMTP), as specified in RFC 4409.

RFC 4409 requires that clients are authorized and authenticated to use the mail submission service, e.g., as described in SMTP-AUTH, by other means such as RADIUS, or POP before SMTP. The MSA must check that the submitted mail is syntactically valid and conforms to the relevant site policies.


RFC 4409 contains some optional features:

Enforce submission rights – Guarantees that the envelope sender address is valid and authorized with the used authentication. This is in essence the SPF model specified in RFC 4408.

May add sender – Permits adding a Sender address header field if the envelope sender address does not match any author address in the “From” header field. This is roughly the Sender ID model specified in RFC 4406, ignoring the tricky case of Resent-From header fields not covered in RFC 4409.


Many Internet service providers and enterprise or institutional networks restrict the ability to connect to remote MTAs on port 25 (the usual SMTP port). Availability of a Mail Submission Agent on port 587 enables nomadic users to continue to send mail via their preferred submission servers even from within others’ networks. Using a specific submission server is a requirement when sender policies are enforced.

Most of the benefits mentioned above may also apply to authenticated MTA, that is port 25 after the user logs in. In fact, the relevant server software is often the same for both services. However, the very fact that the service can be made available on a different port can be considered a further benefit, as it allows for circumvention of restrictions on port 25.



MTA – Mail Transfer Agent or Mail Relay

A software agent that transfers electronic mail messages from one computer to another. An MTA implements both the client (sending) and server (receiving) portions of the Simple Mail Transfer Protocol (SMTP). There are many ways to configure an MTA. A few common ones are listed below.


External Relays – An external relay allows two organizations to share the same domain name while keeping their e-mail separate. When the server of one organization receives an e-mail addressed to the other organization, it redirects the e-mail to the other organization’s server. For example, say organization A and organization B are in an external relay domain. If organization A gets an e-mail for organization B, A’s server gives it to B’s server.


Internal Relays – In an internal relay domain, most recipients of e-mails don’t have mailboxes in a certain server organization. An organization may have to share SMTP address space with two or more e-mail systems. All users have the same domain suffix in their e-mail addresses. So, when someone is trying to send an e-mail to someone who isn’t in one e-mail system, instead of giving a non-delivery report, the server then tries to find the e-mail address that matches the address specified in the e-mail.


Last External Relay – In an external relay domain, the last external relay is the last server outside your own network that relayed the e-mail to your own server. For example, say organization A and organization B are in an external relay domain. If organization A gets an e-mail for organization B, A’s server gives it to B’s server, so A’s server is the last external relay. Look at the headers of any e-mail and you should be able to find the last external relay.

SMTP Relay – A server is called an open relay, or SMTP relay, if it accepts messages on behalf of other domains and doesn’t require authentication. A person in China could send a message through a server in South Africa to a person in California. This is easily abused by spammers, who can send massive amounts of e-mail through an open relay without revealing who they are.


MDA – Mail Delivery Agent

Mail Delivery Agent is computer software that is responsible for delivering e-mail to a recipient’s mailbox.

Within the Internet mail architecture, the message delivery agents consist of two components, the message handling service side that accepts messages from the message transfer agent (MTA), and a component in the recipient’s environment that affects message storage in a mailbox or other customized mechanisms.

Usually, the mail delivery agent is not started independently but with the message transfer agent.

An example of an MDA is procmail on Unix systems. Procmail’s handling of e-mails can be heavily influenced through the use of procmailrc scripts.

Some message delivery agent software for Unix-like platforms

  • binmail (The MDA portion of Sendmail)
  • Delivermail
  • fdm
  • maildirproc
  • maildrop
  • postdrop
  • procmail
  • Courier-pop
  • Courier-imap
  • dovecot


MX Records – Mail Exchanger Records

An MX Record is a record in the DNS that specifies a mail server responsible for accepting e-mail messages on behalf of a recipient’s domain and prioritizes mail delivery. It specifies how e-mail should be routed with SMTP.
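
Each MX record carries a preference number, and a sending server tries the lowest number first. The sketch below orders a hypothetical set of records that way; the host names are placeholders, and real senders typically also spread load across records that share the same preference.

```python
def delivery_order(mx_records):
    """Order (preference, hostname) pairs: the lowest preference is tried first."""
    return [host for _preference, host in sorted(mx_records)]


# Hypothetical records for example.org.
records = [(20, "backup.example.org"), (10, "mail.example.org")]
print(delivery_order(records))
```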

Though the practice of pointing MX records to CNAME (alias) records is not that uncommon, it certainly isn’t in keeping with internet standards.

The domain name used as the value of an NS resource record, or part of the value of an MX resource record must not be an alias. Not only is the specification clear on this point, but using an alias in either of these positions neither works as well as might be hoped, nor well fulfills the ambition that may have led to this approach. This domain name must have as its value one or more address records. Currently those will be A records, however in the future other record types giving addressing information may be acceptable. It can also have other RRs, but never a CNAME RR.

Additional section processing does not include CNAME records, let alone the address records that may be associated with the canonical name derived from the alias. Thus, if an alias is used as the value of an NS or MX record, no address will be returned with the NS or MX value. This can cause extra queries, and extra network burden, on every query.



IMAP (Internet Message Access Protocol) is an internet standard used to retrieve emails from a remote mail server over a TCP/IP connection.

When a user checks for new emails, the client will connect to the mail server and begin downloading messages on the user’s local system while ALSO keeping the server’s stored copies.

When an email is DELETED from the user’s local system, the client sends a request to delete the server’s stored copy.


Post Office Protocol (POP3) is an internet standard used to retrieve emails from a remote mail server over a TCP/IP connection.

When a user checks for new emails, the client will connect to the mail server and begin downloading messages on the user’s local system as new emails while deleting the server copies.

Advantages of IMAP over POP3:

First and foremost, the email is stored on the mail server and not on the workstation. Since servers are typically backed up more often than your workstation, this helps protect your critical emails from being lost and allows for very easy migrations if you buy a new computer or need to borrow someone else’s computer.

Second, IMAP inherently supports the ability to check your mail from multiple computers and still know which ones have been read, replied to or deleted. You can check your email client at work, use webmail at the library, and use your email client at home and hardly miss a beat.

Disadvantages of IMAP:

First, some programs aren’t 100% IMAP compliant and can print strange and annoying yet totally ignorable error messages. All of Microsoft’s email clients fall into this category. 

Second, searching and accessing emails can be slower due to the remote storage of the actual emails.

Don’t let these problems fool you though! IMAP is the solution we recommend and if you have ever bought a new desktop, had a laptop stolen or wanted to use more than one computer to check email, it can save you hours and hours of grief. Use it for at least a week and set it up on two computers and use webmail and you’ll agree!

SMTP Ports

The standard ports used by SMTP servers over the years have been Port 25, Port 465, Port 587, and Port 2525.

Port 25

Port 25 is the oldest of all the standard ports as it was in the original SMTP proposal of 1982.

It was commonly used for sending email via SMTP but can be used for other purposes.

ISPs and many email providers have stopped utilizing this port as it is less secure leading to a large number of spammers and bad actors exploiting it.

Port 465

Port 465 was utilized for SMTPS (Simple Mail Transfer Protocol Secure) which offered more secure communication by encrypting the contents of emails.

This port stopped being widely used with the rise of STARTTLS as it offers greater security.

It is still often kept open because a number of legacy systems are unable to use STARTTLS properly.

Port 587

Port 587 is the current standard port for mail submission per RFC 2476.

Per that RFC, relaying mail between servers is entrusted to port 25 but all submissions were to be directed to port 587.

Port 2525

Port 2525 has never been officially recognized as a port for mail; however, it is often seen as a reliable alternative to port 587.

It supports STARTTLS and SSL/TLS like port 587; however, it has never been designated as a port for sending mail by any RFC.

SMTP Culpability

As e-mail moves from one server to the next, SMTP adds a header line marking the e-mail as having been passed along successfully, failed, or permanently failed. These headers help determine “SMTP Culpability” by showing which server along the way to the destination was responsible for a temporary or permanent failure.

From RFC2821 3.8.2 Received Lines in Gatewaying

When forwarding a message into or out of the Internet environment, a gateway MUST prepend a Received: line, but it MUST NOT alter in any way a Received: line that is already in the header.

“Received:” fields of messages originating from other environments may not conform exactly to this specification. However, the most important use of Received: lines is for debugging mail faults, and this debugging can be severely hampered by well-meaning gateways that try to “fix” a Received: line. As another consequence of trace fields arising in non-SMTP environments, receiving systems MUST NOT reject mail based on the format of a trace field and SHOULD be extremely robust in the light of unexpected information or formats in those fields.

The gateway SHOULD indicate the environment and protocol in the “via” clauses of Received field(s) that it supplies.
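
Because each server prepends its Received: line, the newest hop sits at the top of the headers. A minimal sketch using Python's standard email package to list the hops oldest first is below; the function name and the sample message are made up for illustration.

```python
from email import message_from_string


def hops(raw_message):
    """Return the Received header values, oldest hop first."""
    message = message_from_string(raw_message)
    # Headers are stored in the order they appear, so reverse them.
    return list(reversed(message.get_all("Received", [])))


# Hypothetical message that passed through two servers.
SAMPLE = (
    "Received: from relay.example.net by mx.example.org\n"
    "Received: from laptop.example.net by relay.example.net\n"
    "Subject: test\n"
    "\n"
    "Hello.\n"
)
for hop in hops(SAMPLE):
    print(hop)
```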



Note that attachments sent via e-mail are larger than they are on a hard drive. On a hard drive, files are usually stored in binary (8-bit) format, while e-mail is transported in a 7-bit format. The conversion, typically Base64, turns every 3 bytes into 4 characters and so increases the size of attachments by roughly a third. Keep in mind that it is the size of the whole e-mail that matters, not just the attachment.

The largest e-mail that can be sent depends on the size limit of both the sender’s e-mail server and the recipient’s server. Most size limits are around 10 MB. Keep in mind that the size of an attachment is limited to the lowest limit in the chain of servers. Even if you have a 1 GB limit, if the other server has a 1 MB limit, the maximum size of the attachment is 1 MB. Also, note that attachments are larger than the original file due to conversion from 8 bit to 7 bit.
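
The size arithmetic can be checked with a short sketch, assuming the attachment is Base64-encoded (the usual case). The function names and the sizes in the example are illustrative only; real messages add headers and other overhead on top of this.

```python
import base64


def encoded_size(raw_bytes):
    """Size in bytes of the data after Base64 encoding with line breaks."""
    return len(base64.encodebytes(raw_bytes))


def fits(raw_size_bytes, server_limits_bytes):
    """Rough check: Base64 grows data by about 4/3, and the lowest limit wins."""
    estimated = raw_size_bytes * 4 / 3
    return estimated <= min(server_limits_bytes)


print(encoded_size(b"\0" * 57000))
# A 900 KB file against a 1 GB limit and a 1 MB limit:
print(fits(900_000, [1_000_000_000, 1_000_000]))
```

Line breaks in the encoded output push the real overhead slightly above a third, which is why the check is only an estimate.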



What is SpamAssassin?

Apache SpamAssassin is a mail filter and programming interface that identifies junk e-mail. SpamAssassin at its heart is a Scoring Framework. This framework allows virtually any anti-Spam technology to be added and used.

The engine for SpamAssassin is sometimes called a “Rules-Based Heuristics Engine” because the filter typically looks for patterns such as common phrases or known senders. Heuristics is the application of experience-based techniques for problem solving. Virus scanners use the same technique and are also heuristics engines. As an e-mail goes through the engine, each of the rules for SpamAssassin is run and generates a score. These individual scores are then totaled to provide an overall score for the e-mail. Some rules are positive and add to the score. Some rules are negative and take away from the score.

The lower the overall score is, the more likely the e-mail is Ham. The higher the score, the more likely the e-mail is Spam.
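
The scoring idea can be sketched in a few lines. This is not SpamAssassin's code or its rule format: the rule names, patterns, scores, and the cut-off value below are all invented for illustration.

```python
import re

# Hypothetical rules: (name, pattern, score). Negative scores favor ham.
RULES = [
    ("FREE_MONEY", re.compile(r"free money", re.I), 2.5),
    ("ACT_NOW", re.compile(r"act now", re.I), 1.5),
    ("URGENT_SUBJECT", re.compile(r"^Subject:.*urgent", re.I | re.M), 1.2),
    ("BUDGET_REVIEW", re.compile(r"quarterly budget review", re.I), -2.0),
]
THRESHOLD = 5.0  # assumed cut-off for this sketch


def score_message(text):
    """Run every rule, total the scores of those that hit, compare to the cut-off."""
    hits = [(name, score) for name, pattern, score in RULES
            if pattern.search(text)]
    total = sum(score for _name, score in hits)
    return total, hits, total >= THRESHOLD


total, hits, is_spam = score_message("Subject: URGENT\n\nFree money! Act now.")
print(total, [name for name, _score in hits], is_spam)
```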

Not every rule has the same weight, however. The weight each rule is given is determined first by initial “guesses” and later refined through optimization. We use a Genetic Algorithm to optimize the scores. Given a list of rules and a set of existing spam/ham classifications, SpamAssassin can automatically determine how best to weigh each rule. You can read more about Genetic Algorithms on Wikipedia. Discussing the finer details of Genetic Algorithms is outside the scope of this document.

Not everyone will want to use SpamAssassin to do the actual e-mail processing. With this design, you don’t have to use SpamAssassin to filter your mail.

Instead, you can also use SpamAssassin from other programs. In this way, SpamAssassin can be used to return a “this is spam/this is ham,” a score, a list of the rules that led to the score, and a detailed report. It’s then the job of the parent program that calls SpamAssassin to decide what to do with this information. Most e-mail clients using SpamAssassin have the ability to dispose of e-mails that SpamAssassin thinks are spam. Some clients, such as Thunderbird, have the ability to directly read the SpamAssassin scores and filter based on that.

If you are unsure why an e-mail was or was not tagged, make sure to first check the headers of the e-mail. Simple problems such as whitelisting of a Spammer are often at the root of the problem.

Also, check blacklists. If a blacklist lists a site and you want to delist it, follow the instructions! And NEVER pay any money for delisting.

Making Your Own Rules

Make sure to go to http://wiki.apache.org/spamassassin/WritingRules.

Remember these commands are for Unix-like environments.

Lint is a program that looks for bugs in source code. Make sure every rule you write passes lint without any problems by checking with the command “spamassassin -D --lint”. Make sure to do this!

First, make sure the rules you write are not already implemented. If you want to change the rule, just change the default score.

Then, write your own rules! It is a good idea to score your rules low to prevent false positives. Make sure to write rules for your circumstances! For example, when writing a negative rule, pick tokens that commonly appear in ham e-mails addressed to you! Use phrases instead of single words.

Finally, test it! Make sure it passes Lint. Then, test it by running spamassassin in test mode against a text file containing the contents of an e-mail with:

spamassassin -t -D < [message]

Or without network tests:

spamassassin -t -L -D < [message]

‘-D’ prints out debug information.

When to Catch Spam

Spam can really be detected at any time. However, the later spam is identified, the more resources the server has used, and therefore the expenses rise. The rule of thumb is to reject early on. The various stages:
    • Connection – The server decides if it wants to accept and forward the client’s connection attempt to the SMTP server.
    • SMTP session – The SMTP server examines data sent in the SMTP session (e.g. client IP address, envelope sender, envelope recipient) and decides if it wants to accept or reject the client, sender, recipient or message.
    • Reputation – The server consults an external (local and/or remote) system to query the reputation of the client or sender domain.
    • Content – A content filter examines SMTP session data and/or message content and decides if the message is spam or ham.

How to Catch Spam

Author Domain Signing Practices – ADSP is an optional extension to DKIM where a domain administrator can specify the DKIM signing practice. ADSP is an additional DNS record that is used alongside DKIM. With ADSP, an organization can specify one of three DKIM signing practices. Unknown is the same as not having an ADSP record at all. The organization does not make any guarantees about how consistently they use DKIM to sign their e-mails. All states that all e-mail from an organization will be DKIM signed. Discardable states that in addition to All, the organization wants any non-signed e-mail to be silently discarded by the receiving party.

Away messages – If these sick/vacation messages are sent as replies to Junk Mail, you are simply going to respond to spammers. Again, because of forgeries, such messages can be sent as spam to innocent addresses, or sent to non-existent addresses and bounced back yet again as a DSN. Never reply to spam!

Backscatter – Some administrators might think they are being neighborly and helping out a fellow netizen by bouncing rejected mail, but most spam and viruses are delivered using forged From addresses. The owners of these forged addresses sometimes receive a lot of Delivery Status Notifications or DSNs. This is called backscatter.

So, please, don’t bounce spam, as this causes backscatter! This can also get you blacklisted.

Bayesian Classifier – The Bayesian Classifier in SpamAssassin learns tokens – words or short sequences – that are commonly found in spam or ham. It makes use of Bayes’ Theorem, with strong assumptions about the statistically independent nature of e-mail. Bayesian classifiers make an effective means of adapting to new spam techniques, by adding uncaught spam to the spam/ham datasets to improve the classifier’s accuracy.
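
To make the idea concrete, here is a toy naive Bayes classifier in Python. It is only a sketch of the general technique, not SpamAssassin's implementation: the class name TinyBayes, the whitespace tokenizer, and the add-one smoothing are all assumptions chosen for brevity.

```python
import math
from collections import Counter

class TinyBayes:
    """Toy naive Bayes spam classifier (illustrative only)."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    def learn(self, text, label):
        # Count each token once per message.
        self.counts[label].update(set(text.lower().split()))
        self.messages[label] += 1

    def spam_probability(self, text):
        # Start from the prior odds, then add one log-likelihood ratio per
        # token, treating tokens as statistically independent.
        log_odds = math.log((self.messages["spam"] + 1) / (self.messages["ham"] + 1))
        for token in set(text.lower().split()):
            p_spam = (self.counts["spam"][token] + 1) / (self.messages["spam"] + 2)
            p_ham = (self.counts["ham"][token] + 1) / (self.messages["ham"] + 2)
            log_odds += math.log(p_spam / p_ham)
        return 1 / (1 + math.exp(-log_odds))

classifier = TinyBayes()
classifier.learn("cheap pills online now", "spam")
classifier.learn("buy cheap pills", "spam")
classifier.learn("meeting agenda attached", "ham")
classifier.learn("lunch meeting tomorrow", "ham")
```

Feeding uncaught spam back in through learn() is what lets a classifier like this adapt to new spam techniques over time.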

Blacklist / Whitelist – These are lists of domains, IP addresses, or other means of identifying the sender of a message. A blacklist contains entries intended to identify spammers, while a whitelist identifies known-good senders. Some common blacklists are below. You should research the rules and limits of each list before using them.

Other blacklist checkers:

Challenge and Response – Another similar technique that causes DSNs is called Challenge and Response. This is a system where a new sender is sent an e-mail by an automated system asking them to verify that they sent the original e-mail. Unfortunately, because of forgeries, C+R systems are often considered spam. Use it at your own risk!

Checksum blacklist – A particular type of blacklist containing hashes or checksums of known spam messages. This can be a quick way of detecting some of the most common spam messages. For information on using a checksum blacklist with SpamAssassin, see perldoc Mail::SpamAssassin::Plugin::DCC, perldoc Mail::SpamAssassin::Plugin::Pyzor, and perldoc Mail::SpamAssassin::Plugin::Razor2. Cloudmark is a commercial derivative of the Razor checksum filter.
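
The basic idea can be sketched in a few lines of Python. This is a deliberately simplified assumption: it uses an exact SHA-256 hash of a whitespace- and case-normalized body and a local set, whereas the services named above use their own checksum schemes and shared databases. The names body_checksum, known_spam and is_known_spam are hypothetical.

```python
import hashlib

def body_checksum(body):
    """Hash a message body after normalizing case and whitespace."""
    normalized = " ".join(body.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

# Hypothetical local blacklist of checksums of previously reported spam.
known_spam = {body_checksum("Buy CHEAP pills   now!")}

def is_known_spam(body):
    return body_checksum(body) in known_spam
```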

DMARC – “Domain-based Message Authentication, Reporting & Conformance” is a standardized way for an e-mail sender to inform the recipient that the e-mails are protected by SPF and/or DKIM, and what to do in the event that these procedures fail. DMARC removes any guesswork about what to do with an e-mail if it fails, limiting or eliminating the user’s exposure to potentially fraudulent and harmful messages. DMARC doesn’t directly address whether or not an e-mail is spam or otherwise fraudulent. DMARC also provides a way for the e-mail receiver to report back to the sender about messages that pass and/or fail DMARC evaluation.

False positives and negatives – Much like testing for an illness, an e-mail tests positive for being spam and tests negative for being legitimate. A false positive is when a legitimate e-mail is mistakenly marked as spam. A false negative is when a spam e-mail is mistakenly accepted as legitimate.

One of the core principles of anti-spam is Do No Harm. It is better to let a thousand spam e-mails through than to drop even a single legitimate one. This can sometimes be very difficult. A common scam e-mail tactic is to forge an e-mail from a friend, saying they are on vacation and need money for a taxi, if you could just send it to their account. Obviously, an accounting firm would be very unhappy if their spam filter reacted to such tactics with a uniform block on any banking-related keywords.

The vast majority of Spam uses forged sender addresses as we talked about with the Reduction of DSNs. If you whitelist your own domain, you are just going to whitelist a whole bunch of Spam! Deploying SPF (as well as DKIM & ADSP) will help combat these forgeries and there are rules for these technologies that can decrease AND increase the overall score for an e-mail.

Greylist – A greylist contains senders about which you are undecided. Your server will return a temporary error to every greylisted sender, requiring them to connect twice to send an e-mail. This technique can block most spam sent by poorly-written software that cannot handle temporary 4XX errors correctly. Greylisting is not recommended though, because it also blocks e-mail from poorly-written software that sends legitimate e-mails. It purposefully introduces a delay in the sending of legitimate e-mail, and reports spurious errors that can make it difficult to determine when your server is having a real problem.
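
For readers who want to see the mechanism, here is a minimal Python sketch of greylisting. It makes assumptions the text above does not spell out: senders are tracked by the (client IP, envelope sender, envelope recipient) triplet, and a retry is only accepted after a minimum delay of 300 seconds. The class name and reply strings are illustrative.

```python
import time

class Greylist:
    """Toy greylist keyed on (client IP, sender, recipient)."""

    def __init__(self, min_delay=300, now=time.time):
        self.min_delay = min_delay   # seconds a sender must wait before retrying
        self.now = now               # injectable clock, handy for testing
        self.first_seen = {}

    def check(self, client_ip, sender, recipient):
        key = (client_ip, sender.lower(), recipient.lower())
        current = self.now()
        first = self.first_seen.setdefault(key, current)
        if current - first >= self.min_delay:
            return "250 OK"
        # Temporary 4xx error: a well-behaved MTA will queue and retry.
        return "451 4.7.1 Greylisted, please try again later"
```

A legitimate MTA retries after the 451 and gets through on a later attempt; the delay this introduces is exactly the drawback described above.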

rPTR, Smart Hosts, and SPF – On the internet, computers speak in numbers called IP addresses. Humans use names, which DNS translates into the numbers. However, you can also translate a number into a name. This is called a reverse lookup.

Having a valid answer for a reverse lookup, called a reverse PTR or rPTR, makes your mail server more legitimate. Some ISPs, like AOL, will not even accept e-mail from a server that doesn’t have one. A reverse lookup should match exactly. For example, mail.macsysadmin.se instead of static-71-163-15-129.washdc.fios.verizon.net.

If you don’t have a static IP, you really MUST be using a Smart Host. This is the feature in the mail preferences to Relay E-mail through your ISP. Mark Martinec, another member of the SA Project Management Committee, also points out that you should really, really use the Smart Host or MSA of a domain used in your From address. Submitting mail through anything but the MSA for the From address will likely contribute points towards the threshold to tag an e-mail as Spam.

So, your users should be submitting e-mail from their Mail User Agent to your domain’s Mail Submission Agent, preferably on the standard submission port 587 and authenticated through AUTH or POP-Before-SMTP or similar. In the same vein, when using @gmail.com in the From address, the e-mail should be submitted through that domain’s MSA, smtp.gmail.com:587. The Gmail web interface, technically just another MUA, would automatically be set up to use a proper submission method.

The reason for moving away from an ISP’s Smart Host to an MSA for the domain is the widespread e-mail address forgery in Junk Mail. This fraud has necessitated increasingly strict methods to distinguish valid addresses from forged ones. One of these stricter methods is SPF.

Adding an SPF (Sender Policy Framework) record for your domain will help control forged e-mail. SPF allows other servers that receive your e-mails to check your domain’s DNS. There you can set policies that tell the MTA on the receiving end which MTAs are allowed to transmit e-mail for your domain. The website www.openspf.org can assist you in configuring this record.
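
As a simplified illustration of how a receiving MTA evaluates such a policy, the Python sketch below checks a client IP against the ip4/ip6 and all mechanisms of an SPF record. Everything that needs DNS (a, mx, include, redirect) is deliberately skipped, so this is a teaching aid rather than a compliant SPF checker; the function name and the sample record are assumptions.

```python
import ipaddress

RESULTS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}

def spf_allows(record, client_ip):
    """Evaluate only the ip4:, ip6: and all mechanisms of an SPF record."""
    ip = ipaddress.ip_address(client_ip)
    for term in record.split()[1:]:          # skip the leading "v=spf1"
        qualifier = "+"                       # "+" (pass) is the default
        if term[0] in RESULTS:
            qualifier, term = term[0], term[1:]
        if term == "all":
            return RESULTS[qualifier]
        if term.startswith(("ip4:", "ip6:")):
            network = ipaddress.ip_network(term[4:], strict=False)
            if ip in network:
                return RESULTS[qualifier]
        # Mechanisms that require DNS lookups are ignored in this sketch.
    return "neutral"

# Hypothetical record: only 192.0.2.0/24 may send mail for the domain.
record = "v=spf1 ip4:192.0.2.0/24 -all"
```

With this record, a message arriving from 192.0.2.10 evaluates to "pass", while one from any other address evaluates to "fail".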


AMaViS (A Mail Virus Scanner) – AMaViS scans e-mail attachments for viruses using third-party applications available for UNIX operating systems. It is written in Perl, and is the application which calls SpamAssassin in OS X.


Backscatter (outscatter and misdirected bounces, blowback, or collateral spam, et al.) – A side effect of e-mail spam, viruses, and worms where e-mail servers receiving spam and other mail send bounce messages to innocent users. These usually come in the form of “Your mail could not be delivered..” or “Your mail contained a virus..” messages. These messages are classified as spam because they aren’t solicited by the recipient and are delivered in bulk quantity. The vast majority of Spam comes from forged e-mail addresses.

Blacklist (See DNSBLs) – A basic access control mechanism that allows every access except for the members of the blacklist (i.e. the list of denied accesses). The opposite is a whitelist, which allows nobody except members of the whitelist. As a sort of middle ground, a greylist contains accesses that are temporarily blocked.

Deep Header Parsing aka Deep Header Inspection – Some anti-spam tools will actually look at the reputation of IP addresses in each of the received headers of an e-mail. Reputation mechanisms usually involve multiple sources, e.g., DNSBLs and Honeypot-driven reputation services.

Here’s a sample header:

Delivered-To: address@gmail.com
Received: by [[]] with SMTP id e5cs33412ibd; Fri, 16 Apr 2010 08:38:08 -0700 (PDT)
Received: by [[]] with SMTP id e9mr1978437rvi.51.1271432287560; Fri, 16 Apr 2010 08:38:07 -0700 (PDT)
Return-Path: me@mydomain.com
Received: from SERVER.somedomain.com (mail.somedomain.com [[]]) by mx.google.com with ESMTP id 11si4715430qyk.0.2010.; Fri, 16 Apr 2010 08:38:07 -0700 (PDT)
Received: from myserver.mydomain.com (unverified [[]]) by SERVER.somedomain.com (XYZ MTA) with ESMTP id <B0001262286@SERVER.somedomain.com> for <address@gmail.com>; Fri, 16 Apr 2010 11:38:08 -0400

[BRACKETED]: IP addresses that will be verified.
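
A rough Python sketch of the first step, collecting the bracketed IPv4 addresses from every Received header, might look like this. The sample message and its addresses (taken from the documentation ranges) are invented for the example, and a real tool would also have to decide which Received headers it trusts.

```python
import re
from email import message_from_string

# An IPv4 address in square brackets, as MTAs usually record it.
IP_IN_BRACKETS = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")

def received_ips(raw_message):
    """Return the bracketed IPv4 addresses from all Received headers, top first."""
    message = message_from_string(raw_message)
    ips = []
    for header in message.get_all("Received", []):
        ips.extend(IP_IN_BRACKETS.findall(header))
    return ips

# Hypothetical message with two hops.
raw = (
    "Received: from a.example.net (a.example.net [192.0.2.1])\n"
    "\tby mx.example.org; Fri, 16 Apr 2010 08:38:07 -0700\n"
    "Received: from b.example.com (unverified [198.51.100.2])\n"
    "\tby a.example.net; Fri, 16 Apr 2010 11:38:08 -0400\n"
    "Subject: test\n"
    "\n"
    "body\n"
)
```

Each address returned could then be checked against a reputation source such as a DNSBL.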


DomainKeys Identified Mail (DKIM) – An e-mail authentication system which verifies the domain of a sender. It resulted from the combination of DomainKeys and Identified Internet Mail. A server can therefore identify e-mails from forged e-mail addresses and treat them as spam.

A domain owner generates private/public key pairs that will be used to sign messages originating from that domain. The public key is placed in DNS as a TXT record. The private key is kept on the mail server which sends e-mail for the domain. When a user sends an e-mail, the server adds a digital signature based on the private key before sending it. The receiving server uses the domain name and selector named in the signature to perform a DNS lookup for the TXT record holding the public key. If the signature verifies against that public key, the e-mail really was signed by that domain.
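
To show where the receiving server looks for the public key, this small Python sketch reads the d= (domain) and s= (selector) tags from a DKIM-Signature header and builds the DNS name to query, which takes the form selector._domainkey.domain. The header value is a shortened, made-up example, and the sketch does not perform the lookup or verify the signature.

```python
def dkim_dns_name(signature_header):
    """Build the DNS name that holds the public key for a DKIM signature."""
    tags = {}
    for part in signature_header.split(";"):
        if "=" in part:
            name, value = part.split("=", 1)
            # Tag values may contain folding whitespace; remove it.
            tags[name.strip()] = "".join(value.split())
    return f'{tags["s"]}._domainkey.{tags["d"]}'

# Hypothetical, shortened DKIM-Signature value.
signature = ("v=1; a=rsa-sha256; c=relaxed/relaxed; d=example.com; "
             "s=mail2024; h=from:to:subject; bh=AAAA; b=BBBB")
```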

ADSP is an optional extension to DKIM.


DNFTEC – This acronym goes all the way back to 1996. It’s the original “don’t feed the trolls.” DNFTEC stands for ‘don’t feed the energy creatures’ and a great description of what/who these are can be found here:


“There is a certain type of being that’s all too common in the online world. I call them “Energy Creatures,” a term I first heard on one of the commercial services. Energy Creatures are a bizarre lifeform which grow and feed off of the negative energy generated by others.

Energy Creatures’ favorite feeding tactic is to try to hurt people’s feelings or get them angry. Then they can feed off the pain and anger they’ve generated. Their second favorite tactic is to hurt one person or group’s feelings while gathering the sympathy of others. That way, when the injured party lashes back, others will jump to the Energy Creature’s defense. Then the Energy Creature need do nothing except feed off the attention and the negative energy generated by the people fighting.”

While energy creatures are normally thought to frequent the various forums around the internet, the same term can be applied to spam mailers. It’s more common today to find that spammers are actually members of organized crime around the world or are hackers with large robotic networks of hijacked computers called botnets. These fall under the DNFTEC acronym because they are trying to steal your money, personal information, contact information, or all of the above. PCCC highly recommends that users just mark the spam mail as such and move on. Do not hit the ‘click here to be removed from our list’ link, since most of the time those just confirm to the spammers which e-mail addresses are actually live.

DNS – The Domain Name System (DNS) is an internet directory service. DNS’s most basic service is to translate hostnames into IP addresses, and DNS also controls e-mail delivery. If your computer cannot access DNS, your web browser will not be able to find web sites, and you will not be able to receive or send e-mail. The DNS system consists of three components: DNS data (called resource records), servers (called name servers), and Internet protocols for fetching data from the servers. The billions of resource records in the DNS are split into millions of files called zones. Zones are kept on authoritative servers distributed all over the Internet, which answer queries based on the resource records stored in the zones they have copies of. Caching servers ask other servers for information and cache any replies. Most name servers are authoritative for some zones and perform a caching function for all other DNS information. Large name servers are often authoritative for tens of thousands of zones, but most name servers are authoritative for just a few zones.

DNS Blacklist (DNSBL) – A published list of IP addresses that can be queried through the Internet. DNSBLs are used to publish IP addresses associated with e-mail spam and spamming. Most mail servers can be configured to reject messages from addresses on a DNSBL. An address found in a DNSBL may be directly associated with spam, or may have made the list due to Web server vulnerabilities that can be used by spammers. There are many DNSBLs available, each published and maintained by different individuals and organizations.
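
A DNSBL query is an ordinary DNS lookup: the octets of the IPv4 address are reversed and the list's zone is appended. The Python sketch below builds that query name and, optionally, performs the lookup. The zone dnsbl.example.org is a placeholder, not a real list, and is_listed is shown only to illustrate the convention that a listed address resolves while an unlisted one does not.

```python
import ipaddress
import socket

def dnsbl_query_name(ip, zone):
    """Build the DNS name used to ask a DNSBL zone about an IPv4 address."""
    address = ipaddress.IPv4Address(ip)       # raises ValueError on bad input
    reversed_octets = ".".join(reversed(str(address).split(".")))
    return f"{reversed_octets}.{zone}"

def is_listed(ip, zone):
    """Return True if the DNSBL has an A record for this address."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False                          # no record: not listed
```

For example, asking the placeholder zone about 192.0.2.99 means looking up 99.2.0.192.dnsbl.example.org. Check each list's usage rules and query limits before pointing a busy server at it.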

DSNs – Delivery Status Notification – DSNs are automated e-mail messages from a mail system informing the sender of the status of his/her e-mail.

DSNs are classified into these categories:

  • 2xx/3xx – successful
  • 4xx – failure, asking sender to try later
  • 5xx – permanent failure

2xx/3xx class – Success Messages

  • 200 Non standard success response
  • 211 System status, or system help reply
  • 214 Help message
  • 220 Service ready
  • 221 Service closing transmission channel
  • 250 Requested mail action taken and completed – Your ISP mail server has successfully executed a command and the DSN is reporting a positive delivery.
  • 251 User not local: will forward to: – Your message to a specified e-mail address is not local to the mail server, but it will accept and forward the message to a different recipient e-mail address.
  • 252 Recipient cannot be verified – Recipient cannot be verified but mail server accepts the message and attempts delivery
  • 354 Start mail input; end with <CRLF>.<CRLF> – Indicates the mail server is ready to accept the message, or instructs your mail client to send the message body after the mail server has received the message headers.

4xx class – Temporary Errors – These codes are temporary error messages. They are used to tell the sender that an error occurred and to try again later.

  • 421 Service not available, closing transmission channel – This may be a reply to any command if the service knows it must shut down.
  • 450 Requested mail action not taken: mailbox busy or access denied – Your ISP mail server indicates that an e-mail address does not exist or the mailbox is busy. It could be the network connection went down while sending, or it could also happen if the remote mail server does not want to accept mail from you for some reason i.e. (IP address, From address, Recipient, etc.)
  • 451 Requested mail action aborted: error in processing – Your ISP mail server indicates that the mailing has been interrupted, usually due to overloading from too many messages or a transient failure. A transient failure is one in which the message sent is valid, but some temporary event prevents the successful sending of the message. Sending in the future may be successful.
  • 452 Requested mail action not taken: insufficient system storage – Your ISP mail server indicates probable overloading from too many messages. Sending in the future may be successful.
  • 453 Too many messages – Some mail servers have the option to reduce the number of concurrent connection and also the number of messages sent per connection. If you have a lot of messages queued up it could go over the max number of messages per connection. To see if this is the case you can try submitting only a few messages to that domain at a time and then keep increasing the number until you find the maximum number accepted by the server.

5xx class – Permanent Errors – These are permanent error codes. The mail transfer has definitely failed. No further attempt will be made.

  • 500 Syntax error, command unrecognized or command line too long
  • 501 Syntax error in parameters or arguments
  • 502 Command not implemented
  • 503 Server encountered bad sequence of commands
  • 504 Command parameter not implemented
  • 521 <domain> does not accept mail or closing transmission channel – You must be pop-authenticated before you can use this SMTP server and you must use your mail address for the Sender/From field.
  • 530 Access denied – A sendmailism?
  • 550 Requested mail action not taken (Relaying not allowed, Unknown recipient user, …) – Sending e-mail to recipients outside of your domain is not allowed, or your mail server does not know that you have access to use it for relaying messages and authentication is required. Alternatively, to prevent the sending of spam, some mail servers will not relay mail to any address that uses another company’s network and computer resources.
  • 551 User not local: please try <forward-path> or Invalid Address: Relay request denied
  • 552 Requested mail action aborted: exceeded storage allocation – ISP mail server indicates, probable overloading from too many messages.
  • 553 Requested mail action not taken: mailbox name not allowed – Some mail servers have the option to reduce the number of concurrent connection and also the number of messages sent per connection. If you have a lot of messages queued up (being sent) for a domain, it could go over the maximum number of messages per connection and/or some change to the message and/or destination must be made for successful delivery.
  • 554 Requested mail action rejected: access denied
  • 557 Too many duplicate messages – Resource temporarily unavailable. This probably indicates that there is some kind of anti-spam system on the mail server.
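
The three classes above boil down to looking at the first digit of the code. Here is a minimal Python sketch of how a sending program might act on that; the function names are made up for the example, and the grouping of 2xx and 3xx together simply follows the categories used in this section.

```python
def classify_reply(code):
    """Map an SMTP reply code to the categories used in this section."""
    first_digit = str(code)[0]
    if first_digit in "23":
        return "success"
    if first_digit == "4":
        return "temporary failure"
    if first_digit == "5":
        return "permanent failure"
    raise ValueError(f"not an SMTP reply code: {code!r}")

def should_retry(code):
    # Only temporary (4xx) failures are worth queueing for another attempt.
    return classify_reply(code) == "temporary failure"
```

So a 451 is kept in the queue and retried later, while a 550 is reported back to the sender immediately.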


False Negatives & False Positives – the terms in statistical analysis when a binary system classifies something incorrectly.  So when a Spam Message is NOT marked as Spam, that’s a False Negative.  A message that isn’t spam that gets marked as Spam is a False Positive.  Also known as FPs and FNs, making sure these occur very rarely is the primary goal of Raptor Email Security.

Glue – The method in which one program is interfaced with another. There are many ways of doing this, including using a program’s API, calling the other program directly, etc. SpamAssassin can be called in multiple ways and ‘glued’ into various filtering methods.


GNU-Unix-like Operating System – A Unix-like operating system that aims to be a complete Unix-compatible operating system consisting of completely free software.


GPG – GNU Privacy Guard – A protection system that encrypts data to be sent to another computer. It was created in the GNU Project, but is now compatible with most operating systems. You can install a GPG plugin on OS X Mail called GPGMail.


HELO greeting (pertaining to AntiSpam) – Spam can be greatly reduced by a number of checks confirming compliance with standard addressing and MTA operation. In many situations, simply requiring a valid FQDN (Fully Qualified Domain Name) in the SMTP HELO statement is enough to block 25% of incoming spam.

  • Refusing connections from hosts that begin transmission prior to presentation of the receiving host’s HELO banner.
  • Refusing connections from hosts that give an invalid HELO – for example, a HELO that is not an FQDN or is an IP address not surrounded by square brackets

    Invalid: HELO localhost
    Invalid: HELO
    Valid: HELO domain.tld
    Valid: HELO []

  • Refusing connections from hosts that give an obviously fraudulent HELO – for example, issuing a HELO using the FQDN or an IP address that doesn’t match the IP address of the connecting host

    Fraudulent: HELO friend
    Fraudulent: HELO -232975332

  • Refusing to accept e-mail claiming to be from a hosted domain when the sending host has not authenticated
  • Refusing to accept e-mail whose HELO/EHLO argument does not resolve in DNS. Unfortunately, some e-mail system administrators ignore section 3.6 of RFC2821 and administer the MTA to use a nonresolvable argument to the HELO/EHLO command.

All of the examples above are fairly simple checks, all conform to existing standards and RFCs, and all are missing from most commercial MTA implementations available today.
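
The syntax-only part of these checks is easy to sketch. The Python function below accepts a HELO argument only if it looks like an FQDN or an IPv4 address in square brackets. It is an illustration under stated assumptions: it does not consult DNS, does not compare the name with the connecting IP address, and ignores IPv6 literals. The function name is hypothetical.

```python
import ipaddress
import re

# One DNS label: letters, digits and hyphens, not starting or ending with a hyphen.
LABEL = re.compile(r"^(?!-)[A-Za-z0-9-]{1,63}(?<!-)$")

def helo_is_valid(argument):
    """Syntax-only check of a HELO/EHLO argument (no DNS involved)."""
    if argument.startswith("[") and argument.endswith("]"):
        try:
            ipaddress.IPv4Address(argument[1:-1])
            return True                     # address literal in brackets
        except ValueError:
            return False
    try:
        ipaddress.ip_address(argument)
        return False                        # bare IP address without brackets
    except ValueError:
        pass
    labels = argument.rstrip(".").split(".")
    # Require at least two labels so that "localhost" or "friend" is rejected.
    return len(labels) >= 2 and all(LABEL.match(label) for label in labels)
```

Checks that depend on DNS, such as whether the name resolves or matches the client's address, would come after this cheap first pass.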

Internet Message Access Protocol (IMAP) – a method of accessing electronic mail or bulletin board messages that are kept on a (possibly shared) mail server. In other words, it permits a “client” e-mail program to access remote message stores as if they were local. For example, e-mail stored on an IMAP server can be manipulated from a desktop computer at home, a workstation at the office, and a notebook computer while traveling, without the need to transfer messages or files back and forth between these computers. IMAP’s ability to access messages (both new and saved) from more than one computer has become important as reliance on electronic messaging and use of multiple computers increase, but this functionality cannot be taken for granted: the widely used Post Office Protocol (POP) works best when one has only a single computer, since it was designed to support “offline” message access, wherein messages are downloaded and then deleted from the mail server.

IMAP4 Testing – IMAP can be tested using the following list of commands:

abc1 logout - closes the connection to dovecotIMAP on :143

Lightweight Directory Access Protocol (LDAP) – LDAP is an Internet protocol that e-mail and other programs use to look up information from a server. LDAP is used to look up encryption certificates, pointers to printers and other services on a network, and provide “single sign-on” where one password for a user is shared between many services. LDAP is appropriate for any kind of directory-like information, where fast lookups and less-frequent updates are the norm. As a protocol, LDAP does not define how programs work on either the client or server side. It defines the “language” used for client programs to talk to servers (and servers to servers, too). On the client side, a client may be an e-mail program, a printer browser, or an address book. The server may speak only LDAP, or have other methods of sending and receiving data—LDAP may just be an add-on method.

LDAP also defines:
Permissions: set by the administrator to allow only certain people to access the LDAP database, and optionally keep certain data private.
Schema: a way to describe the format and attributes of data in the server.

Milters – Midstream Filters – A milter enables third-party programs to access mail messages as they are being processed by the MTA. A milter allows them to examine and modify content and the meta-information.

Previously, the MTA would pass the e-mail to an e-mail filter for filtering after the e-mail was completely downloaded. With a milter-capable MTA, it instead does all this work while the e-mail is being downloaded. This allows rejection of massive files very early on to prevent wasted downloading.

At each phase of the SMTP session, the filter is given data about the arriving message and then has an opportunity to make decisions concerning the message. For very large messages, this can have an enormous impact such as when a decision to reject can be made as early as possible. Moreover, unlike the former model, a milter-capable MTA can connect to multiple filters in parallel that serve specific purposes such as anti-virus, anti-spam, message authentication, flow regulation, etc. Finally, such filters can take special action on the message: add or remove recipients in the envelope; alter the body prior to delivery; add, change or remove header fields in the message, etc.

For example, a very cursory look at our Raptor system shows that:

  1. During the SMTP conversation, the mail is filtered by MIMEDefang. If the mail is being forwarded, MD calls SpamAssassin and the mail is scanned for Spam.
  2. AFTER the SMTP Conversation concludes (by replying to the final “.” with a 2xx SMTP reply code), the Milter has one more pass at the e-mail and if the mail is delivered locally, the mail gets handed over to the delivery agent. In our case, that’s procmail.
  3. For local users, spamassassin is called by the global /etc/procmailrc script as part of the MDA’s work. When the spamc command returns (possibly having altered the mail), procmail drops it into your INBOX.
  4. For non-local users, MD can be configured to spam test the e-mail prior to delivery.

See also MIMEDefang for an example of “a milter”.


Phishing – A social engineering technique used by parties posing as a trustworthy source to steal information (e.g. usernames/passwords, bank/PayPal account information, or any information that can be used to assist in data and/or identity theft). These attacks are typically carried out via e-mail or instant messaging and in some cases through bogus accounts on social networking services such as MySpace, Facebook, etc.

E-mail deception usually takes the form of a link in a bogus e-mail or IM which leads to a website where the user being scammed will input sensitive information, thus completing the theft. The e-mail can be VERY convincing, down to being an almost exact copy of a legitimate e-mail from the company being “spoofed”, using company logos and appearing very official. Requests to “verify your account” are often used in these fraudulent messages, with links that appear to lead to the legitimate website but actually lead to the scam website.

Phishers employ various other tactics in attempts to trick you into following their links and submitting confidential information. These techniques include Link Manipulation, such as Misspelled URLs (www.paypa1.com rather than www.paypal.com) and Masked URLs, where the anchor text of a link appears valid but the tooltip that appears on mouseover shows that the link actually points to the scam site. Filter Evasion is also a common technique: using images with text on them instead of actual text in an e-mail avoids anti-phishing filters searching for text commonly used in phishing e-mails.
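
A masked URL can be spotted mechanically by comparing the host shown in the link text with the host in the link target. The Python sketch below does this for HTML e-mail bodies using only the standard library. It is a simplified illustration: the class and helper names are invented, link text that does not look like a bare URL is ignored, and lookalike domains in both places would not be caught.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

def host_of(text):
    """Return the host if the text looks like a bare URL, else None."""
    text = text.strip()
    if not text or any(ch.isspace() for ch in text):
        return None
    try:
        return urlparse(text if "://" in text else "//" + text).hostname
    except ValueError:
        return None

class MaskedLinkFinder(HTMLParser):
    """Collect links whose visible text names a different host than the href."""

    def __init__(self):
        super().__init__()
        self.href = None
        self.text = []
        self.suspicious = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.href = dict(attrs).get("href")
            self.text = []

    def handle_data(self, data):
        if self.href is not None:
            self.text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self.href is not None:
            shown = "".join(self.text).strip()
            shown_host, real_host = host_of(shown), host_of(self.href)
            if shown_host and "." in shown_host and real_host and shown_host != real_host:
                self.suspicious.append((shown, self.href))
            self.href = None

finder = MaskedLinkFinder()
finder.feed('<a href="http://www.paypa1.com/login">www.paypal.com</a> '
            '<a href="http://example.org/">click here</a>')
```

After feeding the sample above, finder.suspicious holds the first link only: its text names www.paypal.com while the target is www.paypa1.com.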

Not all phishing attacks require a fake website. Scammers can employ VOIP numbers to call users claiming to need Account and PIN numbers for different services. The caller ID can be spoofed to show a legitimate company or organization name. This technique of Voice Phishing is called “Vishing”.

Damage caused by phishing ranges from denial of access to e-mail to substantial financial loss.

Protocol (e.g. POP, IMAP, etc.) before SMTP – Protocol before SMTP (POP will be used in this text) is a method of authorization used by mail server software which lets users send e-mail from any location, as long as they can demonstrably also fetch their mail from the same place. Users are allowed to use SMTP from an IP address as long as they have previously made a successful login into the POP service at the same mail hosting provider, from the same IP address, within a predefined timeout period. The main advantage of this process is that it’s generally transparent to the average user who will be connecting with an e-mail client, which will almost always make a connection to fetch new mail before sending new mail. The disadvantages include a potentially complex setup for the mail hosting provider (requiring some sort of communication channel between the POP service and the SMTP service) and uncertainty as to how much time users will take to connect via SMTP (to send mail) after connecting to POP. Those users not handled by this method need to resort to other authorization methods. Also, in cases where users come from externally controlled dial-up addresses (more specifically, all dynamically assigned IP addresses), the SMTP server must be careful about not giving too much leeway when allowing unauthorized connections, because of a possibility of race conditions leaving an open mail relay unintentionally exposed.
POP-before-SMTP has been widely superseded by SMTP AUTH.
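
The bookkeeping behind POP-before-SMTP can be sketched in Python as a table of recent POP logins keyed by IP address. The class name, the 30-minute timeout and the in-memory dictionary are assumptions for illustration; a real deployment needs the POP and SMTP services to share this state somehow.

```python
import time

class PopBeforeSmtp:
    """Toy POP-before-SMTP table: relay only for IPs with a recent POP login."""

    def __init__(self, timeout=1800, now=time.time):
        self.timeout = timeout      # seconds a POP login stays valid
        self.now = now              # injectable clock, handy for testing
        self.logins = {}

    def record_pop_login(self, client_ip):
        # Called by the POP service after a successful login.
        self.logins[client_ip] = self.now()

    def may_relay(self, client_ip):
        # Called by the SMTP service before accepting mail for relay.
        logged_in_at = self.logins.get(client_ip)
        return logged_in_at is not None and self.now() - logged_in_at <= self.timeout
```

The timeout is where the risk described above lives: if a dynamic IP address is reassigned to someone else before the entry expires, that stranger can relay mail too.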

Post Office Protocol (POP3) – an application-layer Internet standard protocol, to retrieve e-mail from a remote server over a TCP/IP connection. The design of POP3 and its procedures supports end-users with intermittent connections (such as dial-up connections), allowing these users to retrieve e-mail when connected and then to view and manipulate the retrieved messages without needing to stay connected. Although most clients have an option to leave mail on server, e-mail clients using POP3 generally connect, retrieve all messages, store them on the user’s PC as new messages, delete them from the server, and then disconnect. This standard protocol is built into most popular e-mail products, such as Eudora and Outlook Express. It’s also built into the Netscape and Microsoft Internet Explorer browsers. POP3 is designed to delete mail on the server as soon as the user has downloaded it. However, some implementations allow users or an administrator to specify that mail be saved for some period of time. POP can be thought of as a “store-and-forward” service. An alternative protocol is Internet Message Access Protocol (IMAP). IMAP provides the user more capabilities for retaining e-mail on the server and for organizing it in folders on the server. IMAP can be thought of as a remote file server.

POP3 Testing – Test POP3 by running the following commands:

You will receive notification that "+OK mailbox open, XXX messages" (where XXX is the number of messages in your inbox).

POP vs. IMAP – POP and IMAP both have various Pros and Cons associated with each protocol.

Advantages of POP3 include:

Message storage is limited only by the capacity of your computer.
Minimum use of connect time.
Minimum use of server resources.
It is less likely to exhaust disk space on the server.

Disadvantages to POP3 include:

Reading your e-mail from multiple computers or e-mail programs results in messages scattered about.
Messages are stored on your computer. If your computer fails you may lose all your e-mail.
You are not able to preview new messages before downloading, nor do you have control over which messages can be downloaded.
Once delivered, e-mail messages are stored on your local computer and deleted from the mail server.

Advantages of IMAP include:

Messages are stored on the server and are not affected if your computer fails.
Easily use multiple computers or e-mail programs to read mail.
Faster start-up time, as only message headers are transferred initially.
Optimization for low-speed connections.

Disadvantages to IMAP include:

Mail is not usually available if you are offline.
Sensitive to mailbox size; requires periodic archival of e-mail messages.
Subject to storage quotas.
Not all mail providers offer IMAP, as it’s more complex for them to support due to the increased storage it requires.

Why Port 25 is Blocked and Alternate Ports that Are available

Port 25 is often blocked by ISPs to help inhibit the distribution of spam. “But I’m not a spammer!” Many people say that, and most of them aren’t spammers, but times are changing. As noted above in the DNFTEC section, more and more spammers are members of organized crime or hackers in general. Many pieces of malware in the wild do nothing except make your computer a member of a larger network of hijacked computers called a botnet. Botnets can be very useful to people with bad intentions. Spammers use botnets to send out millions of spam messages across the internet and, as a result, eat bandwidth. ISPs block port 25 so that, if your computer is hijacked and becomes part of a botnet, it can’t automatically send out e-mail.


Ports like 2025 and 2525 are non-standard ports that some ISPs provide. There is no standard for these port numbers.

587 is the standard submission port. It requires authentication, and more ISPs are supporting (or even requiring) its use.

465 is a standard SMTPS (SMTP over SSL) port.
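
The practical difference between these ports is how encryption starts. The following is a minimal sketch using Python’s standard smtplib; the helper names are made up, and treating every port other than 465 as a STARTTLS port is an assumption that may not hold for a given provider.

```python
import smtplib
import ssl


def security_mode(port):
    """Guess how TLS is negotiated on a submission port (assumption-based)."""
    if port == 465:
        return "implicit-tls"  # TLS starts as soon as the connection opens
    return "starttls"          # plain connection upgraded with STARTTLS


def open_submission(host, port=587, timeout=10):
    """Open an encrypted SMTP connection; call login() on it before sending."""
    context = ssl.create_default_context()
    if security_mode(port) == "implicit-tls":
        return smtplib.SMTP_SSL(host, port, timeout=timeout, context=context)
    conn = smtplib.SMTP(host, port, timeout=timeout)
    conn.starttls(context=context)
    return conn
```

If port 25 is blocked, trying open_submission with port 587 and then 465 against your provider’s mail host is usually the first thing to test.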

postmaster.aol.com – presents a set of standards, guidelines, and best practices regarding e-mail policy.

Real-time Blacklist (RBL) – The first DNSBL (DNS Blacklist) was the Real-time Blackhole List (RBL). Initially, the RBL was not a DNSBL but a list of commands that network operators could use to program routers to blackhole (a routing term for sending packets into nothingness) all TCP/IP traffic for machines used to send spam or host spam-supporting services, such as a website. The purpose of the RBL was not simply to block spam; it was also to educate Internet service providers and other Internet sites about spam and related problems, such as open SMTP relays and spamvertising. The RBL was also released in DNSBL form, and the authors of Sendmail and other mail software were urged to implement RBL clients. This allowed mail software to query the RBL and reject mail from listed sites on a per-mail-server basis instead of “blackholing” all traffic.
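
A DNSBL client works by reversing the octets of the IPv4 address, appending the list’s zone, and looking up the resulting name; an answer means the address is listed. The following is a minimal Python sketch. The zone dnsbl.example.org is a placeholder, not a real list.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name a DNSBL client looks up for an IPv4 address."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on bad input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip, zone):
    """Return True if the DNSBL zone has an entry for ip.

    A failed lookup is treated as "not listed", so a resolver outage is
    indistinguishable from a clean result in this sketch.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return False
    return True
```

For example, dnsbl_query_name("192.0.2.1", "dnsbl.example.org") yields 1.2.0.192.dnsbl.example.org. Check each list’s usage policy and query limits before pointing a busy server at it.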

Reverse DNS (rDNS) – rDNS is the process of determining the hostname associated with a given IP address. Reverse DNS is set up by configuring PTR records (Pointer Records) in your DNS server. The Domain Name System is normally used to determine what IP address is associated with a given domain name; a reverse DNS lookup of an IP address instead looks up what host and domain name belong to that IP address. Many reverse DNS lookup tools are available for free on the internet.
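
A reverse lookup can also be done from a script. The following is a minimal Python sketch using only the standard library; the function names are made up for this example.

```python
import ipaddress
import socket


def ptr_name(ip):
    """Return the DNS name under which the PTR record for ip is stored."""
    return ipaddress.ip_address(ip).reverse_pointer


def reverse_lookup(ip):
    """Return the hostname from the PTR record, or None if there is none."""
    try:
        hostname, _aliases, _addresses = socket.gethostbyaddr(ip)
    except (socket.herror, socket.gaierror):
        return None
    return hostname
```

For example, ptr_name("192.0.2.1") gives 1.2.0.192.in-addr.arpa, which is the name a PTR record for that address would be published under.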

Request for Comments (RFC) – Documents published as a series of memos covering new research, innovations, and methodologies applicable to internet technologies, either for peer review or to convey new technologies and protocols. The Internet Engineering Task Force (IETF) adopts some RFCs as Internet Standards.

RFC-Ese (RFC 2119) – Understanding this RFC is important to understanding RFCs in general, as it removes some of the vagueness of the English language that might otherwise creep in when interpreting RFCs and similar technical documentation.

Taken from http://www.ietf.org/rfc/rfc2119.txt

  1. MUST – This word, or the terms “REQUIRED” or “SHALL”, mean that the definition is an absolute requirement of the specification.
  2. MUST NOT – This phrase, or the phrase “SHALL NOT”, mean that the definition is an absolute prohibition of the specification.
  3. SHOULD – This word, or the adjective “RECOMMENDED”, mean that there may exist valid reasons in particular circumstances to ignore a particular item, but the full implications must be understood and carefully weighed before choosing a different course.
  4. SHOULD NOT – This phrase, or the phrase “NOT RECOMMENDED” mean that there may exist valid reasons in particular circumstances when the particular behavior is acceptable or even useful, but the full implications should be understood and the case carefully weighed before implementing any behavior described with this label.
  5. MAY – This word, or the adjective “OPTIONAL”, mean that an item is truly optional. One vendor may choose to include the item because a particular marketplace requires it or because the vendor feels that it enhances the product while another vendor may omit the same item. An implementation which does not include a particular option MUST be prepared to interoperate with another implementation which does include the option, though perhaps with reduced functionality. In the same vein an implementation which does include a particular option MUST be prepared to interoperate with another implementation which does not include the option (except, of course, for the feature the option provides.)
  6. Guidance in the use of these Imperatives
    Imperatives of the type defined in this memo must be used with care and sparingly. In particular, they MUST only be used where it is actually required for interoperation or to limit behavior which has potential for causing harm (e.g., limiting retransmisssions) For example, they must not be used to try to impose a particular method on implementors where the method is not required for interoperability.
  7. Security Considerations
    These terms are frequently used to specify behavior with security implications. The effects on security of not implementing a MUST or SHOULD, or doing something the specification says MUST NOT or SHOULD NOT be done may be very subtle. Document authors should take the time to elaborate the security implications of not following recommendations or requirements as most implementors will not have had the benefit of the experience and discussion that produced the specification.


rPTR (see rDNS) – Reverse DNS is a way of associating an IP address with its domain name. The reverse DNS identifier is contained in the PTR portion of the IP Zone File. The IP Zone File contains all the different ways that your IP and domain name can be associated; each association serves a different need.

SMTP – Simple Mail Transfer Protocol (SMTP) is the standard for e-mail transmission across the internet. It is a relatively simple text-based protocol. Because SMTP is limited in its ability to queue messages at the receiving end, it is usually used with one of two other protocols, POP3 or IMAP, that let the user save messages in a server mailbox and download them periodically from the server. Extended SMTP (ESMTP) is the protocol used today; together with MIME, it allows multimedia files to be delivered as e-mail.

SMTP Testing – SMTP can be tested using the following list of commands:

mail from:
rcpt to:
data
subject:
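
The same dialogue can be driven from a script. The following is a minimal sketch using Python’s standard smtplib; the host and the addresses are placeholders to be supplied by the caller, and the function names are made up for this example.

```python
import smtplib
from email.message import EmailMessage


def build_test_message(sender, recipient, subject="SMTP test"):
    """Create a small plain-text message for testing delivery."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content("This is a test message.")
    return msg


def send_test(host, sender, recipient, port=25, timeout=10):
    """Send the test message and print the SMTP dialogue to stderr."""
    with smtplib.SMTP(host, port, timeout=timeout) as conn:
        conn.set_debuglevel(1)  # shows MAIL FROM, RCPT TO, DATA and replies
        conn.send_message(build_test_message(sender, recipient))
```

With the debug level set, the output shows each command and the server’s numeric reply, which makes it easy to see at which step a delivery is refused.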


SMTP AUTH – SMTP-AUTH is an extension of the Simple Mail Transfer Protocol (SMTP) to include an authentication step through which the client effectively logs in to the mail server during the process of sending mail. Servers which support SMTP-AUTH can usually be configured to require clients to use this extension, ensuring the true identity of the sender is known. SMTP-AUTH is defined in RFC 4954.
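
To make the login step concrete, the sketch below builds the credential string for the PLAIN mechanism, one of several mechanisms a server may offer. The function name is made up; in practice, smtplib’s login() method does this for you.

```python
import base64


def auth_plain_token(username, password, authzid=""):
    """Encode credentials for "AUTH PLAIN".

    The three fields are joined with NUL bytes and base64-encoded. Base64
    is an encoding, not encryption, so this should only be sent over TLS.
    """
    raw = f"{authzid}\x00{username}\x00{password}".encode("utf-8")
    return base64.b64encode(raw).decode("ascii")
```

A client would then send the line “AUTH PLAIN” followed by this token, and the server answers with a success or failure code.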

Sender Policy Framework (SPF) – SPF is an anti-forgery system in which the Internet domain of an e-mail sender can be authenticated for that sender, thereby discouraging spam mailers, who routinely disguise the origin of their e-mail, a practice known as e-mail spoofing. SPF allows the owner of an Internet domain to use a special format of DNS TXT records to specify which machines are authorized to transmit e-mail for that domain. For example, the owner of the example.org domain can designate which machines are authorized to send e-mail whose sender e-mail address ends with “@example.org”. Receivers checking SPF can reject messages from unauthorized machines before receiving the body of the message. The principles of operation are quite similar to those of a DNSBL, except that SPF uses the authority delegation scheme of the real Domain Name System. SPF and other authentication-based measures are designed to redress a vulnerability in Simple Mail Transfer Protocol (SMTP), the main protocol used in sending e-mail, which does not itself include a mechanism for authenticating the sender.
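
The sketch below shows how a receiver might evaluate a record once it has been fetched from DNS. It is deliberately incomplete: it understands only the ip4, ip6, and all mechanisms and silently skips everything else (a, mx, include, redirect, and so on), so it is an illustration rather than a usable SPF checker. The record and addresses are made-up examples.

```python
import ipaddress

# Qualifier prefix -> result name; "+" is the default when none is given.
RESULTS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def check_ip_mechanisms(record, client_ip):
    """Evaluate the ip4/ip6/all mechanisms of an SPF record, left to right."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return "none"  # not an SPF record
    ip = ipaddress.ip_address(client_ip)
    for term in terms[1:]:
        qualifier = "+"
        if term[0] in RESULTS:
            qualifier, term = term[0], term[1:]
        name, _, value = term.partition(":")
        name = name.lower()
        if name == "all":
            return RESULTS[qualifier]
        if name in ("ip4", "ip6"):
            network = ipaddress.ip_network(value, strict=False)
            if ip.version == network.version and ip in network:
                return RESULTS[qualifier]
    return "neutral"  # nothing matched
```

With the record “v=spf1 ip4:192.0.2.0/24 -all”, a connection from 192.0.2.55 evaluates to pass, while one from 198.51.100.7 falls through to “-all” and evaluates to fail.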

TTL (Time to Live) – TTL values occur in the Domain Name System (DNS), where they are set by an authoritative nameserver for a particular resource record. When a caching nameserver queries the authoritative nameserver for a resource record, it will cache that record for the time (in seconds) specified by the TTL. Shorter TTLs can cause heavier loads on an authoritative nameserver, but they are useful when changing the address of critical services, such as web servers or the hosts named in MX records. DNS administrators therefore often lower the TTL before a service is moved in order to minimize disruption.
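
The caching behavior can be pictured with a small Python sketch. It is a toy model of what a caching nameserver does, not code from any DNS server; the class and method names are made up.

```python
import time


class TtlCache:
    """Toy cache that forgets each record once its TTL has elapsed."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.entries = {}  # record name -> (value, expiry time)

    def put(self, name, value, ttl_seconds):
        """Store a record for ttl_seconds, as set by the authoritative server."""
        self.entries[name] = (value, self.clock() + ttl_seconds)

    def get(self, name):
        """Return the cached value, or None if absent or expired."""
        entry = self.entries.get(name)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self.entries[name]  # expired: the next query goes upstream
            return None
        return value
```

A record stored with a TTL of 86400 keeps being served from the cache for a day after the authoritative data changes, which is why the TTL is lowered ahead of a planned move.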

How to send email that will be stopped as spam

By Dianne F. Skoll with minor edits


How many times have you been required to send an e-mail to someone, but really didn’t want to? You know, those pesky class assignments that you have not bothered to work on, or the sample resume your dad wants to check out. Well, now I’ll let you know how you can send those e-mails and have them not reach their destination, so that you can tell your dad/professor: “I did send it, but the stupid Spam scanners rejected it as Spam.”

The Methods

  1. Send e-mail that has an attachment but no message in the body
    Just attach whatever spreadsheet/document to your e-mail and put no text in the body. Spam scanners score this very high. Many times this is enough in itself to get your e-mail rejected.
  2. Send e-mail with profanity in the Subject
    Though this does not score as high as the first method it does score quite high. This works because most Spam scanners really want to stop that porn Spam. This works even better if you try to obfuscate the profanity with something like spaces, periods, asterisks between the letters.
  3. Send e-mail with profanity in the body of the mail
    Sort of like number 2 but does not score quite as high. Trying to obfuscate the profanity will raise the score more though. Don’t zip up the body and send it as an attachment (unless you do it with no text in the body (#1)) because the Spam scanners don’t check for profanity in a zip file.
  4. Send e-mail with a Word document as an attachment
    Word documents as attachments really raise the score nicely.
    Remember though, don’t zip them up or they won’t raise the score. Besides, zipping them would make your e-mail smaller and easier to deliver.
  5. Send e-mail with no Subject
    Not putting a subject on your e-mail scores fairly high as well.
  6. Send e-mail with a Subject that is all CAPS
    This is a surprisingly simple method that will raise the Spam score.
  7. Send e-mail with the body in all CAPS
    Combined with #6, this can be fairly effective in upping your Spam score.
  8. Send an email with special text called “gtube” in the body.
    The GTUBE (“Generic Test for Unsolicited Bulk Email”) is a 68-byte test string used to test anti-spam systems, in particular those based on SpamAssassin.

If you combine methods, you greatly improve the chances of getting your mail stopped as Spam. For instance, if you attach a Word document (#4), with no text in the body (#1), and either no subject (#5) or a subject that is all caps (#6), you can pretty much guarantee that your e-mail will be rejected as Spam. Hey, check back once in a while, I’ll probably be adding new methods for you to employ.
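
To show how such rules add up, here is a toy scorer in Python covering a few of the methods above. The weights are invented for illustration and are not SpamAssassin’s actual rules or scores.

```python
from email.message import EmailMessage


def is_all_caps(text):
    """True if the text has letters and every letter is upper case."""
    letters = [c for c in text if c.isalpha()]
    return bool(letters) and all(c.isupper() for c in letters)


def toy_spam_score(msg):
    """Add up hypothetical penalties for methods #1, #5, #6 and #7."""
    score = 0.0
    subject = (msg["Subject"] or "").strip()
    body_part = msg.get_body(preferencelist=("plain",))
    body = body_part.get_content().strip() if body_part else ""
    has_attachment = any(True for _ in msg.iter_attachments())

    if has_attachment and not body:
        score += 3.0  # method 1: attachment but no body text
    if not subject:
        score += 1.5  # method 5: no Subject
    elif is_all_caps(subject):
        score += 1.0  # method 6: Subject in all caps
    if body and is_all_caps(body):
        score += 1.0  # method 7: body in all caps
    return score
```

A message with an attachment, an empty body, and no subject collects two penalties at once, which is the combining effect described above.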

Source PDF

For Apple OS X Users: Not Much Interface (You should still read this though)

As of Snow Leopard, the Apple OS X GUI provides very little interface to, or customization of, the inner workings of SpamAssassin. The server preferences in Snow Leopard allow only two options relating to Junk Mail:

  1. Enabling the filter
  2. Setting the threshold that is used to mark an e-mail as spam.

Even the mail service settings page shows only a few tweaks. The Accepted languages and locales option doesn’t even work for the included version of SpamAssassin!

Snow Leopard uses an extremely outdated version of SpamAssassin. 3.2.1, the version it uses, was released June 11th, 2007! 3.2.5 was released June 12th, 2008!

What You Need:

  1. Terminal: try to get familiar with it
  2. Vi: A must! It’s not easy, but it is a very powerful and very fast text editor. It should exist on Unix-based boxes. Run the command vimtutor and follow the tutorial.
  3. Xcode: As an administrator, you should have Apple’s Development Environment. It provides the GNU Compiler Collection and other tools.


First, consider your threshold for tagging. A threshold score of 6.0 to 7.0 is recommended; however, many use thresholds from 5.0 to 20.0. Remember, thresholds can be non-integers, such as 7.5.

Using network tests for SpamAssassin can significantly improve your Junk Mail filter. Some blacklists you can use (be careful about rules and limits):

To enable these, you typically add a few configuration lines or a *.cf file in the directory ‘/etc/mail/spamassassin’. Some are available in the default SpamAssassin rules, depending on the version of SpamAssassin. Make sure you don’t duplicate a blacklist by checking the headers in a few filtered e-mails!

Check that network tests are enabled in AMaViS, short for A Mail Virus Scanner, by editing /etc/amavisd.conf and checking that the line for local tests reads ‘$sa_local_tests_only=0;’.

Checksum Filters are also available, albeit with more difficulty. Try these commands for more information: ‘perldoc Mail::SpamAssassin::Plugin::DCC’, ‘perldoc Mail::SpamAssassin::Plugin::Pyzor’ and ‘perldoc Mail::SpamAssassin::Plugin::Razor2’.

To make such network tests run faster, install a local caching nameserver. In OS X, turn on DNS in Server Admin and in System Preferences->Network, change your DNS resolution to the DNS server on your local host as the first DNS entry.

In SpamAssassin, the Bayesian Classifier learns tokens, words or short sequences that are commonly found in spam or ham. The command ‘sa-learn’ (also available in Unix) teaches the Bayesian Classifier new words or short sequences. In mail services that use SpamAssassin, a Junk Button is often included. This button interfaces with the command ‘sa-learn’.
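
The idea of learning tokens can be illustrated with a toy classifier in Python. This is a plain naive Bayes sketch with add-one smoothing and equal priors; it is not SpamAssassin’s actual algorithm or storage format, and all names are made up.

```python
import math
import re
from collections import Counter


class ToyBayes:
    """Count in how many spam and ham messages each token has appeared."""

    def __init__(self):
        self.tokens = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    @staticmethod
    def tokenize(text):
        return re.findall(r"[a-z0-9']+", text.lower())

    def learn(self, text, label):
        """label is "spam" or "ham"; each token counts once per message."""
        self.tokens[label].update(set(self.tokenize(text)))
        self.messages[label] += 1

    def spam_probability(self, text):
        """Combine per-token evidence into a probability between 0 and 1."""
        log_odds = 0.0
        for token in set(self.tokenize(text)):
            p_spam = (self.tokens["spam"][token] + 1) / (self.messages["spam"] + 2)
            p_ham = (self.tokens["ham"][token] + 1) / (self.messages["ham"] + 2)
            log_odds += math.log(p_spam / p_ham)
        return 1 / (1 + math.exp(-log_odds))
```

Text made up of tokens the classifier has never seen scores exactly 0.5, which is why a classifier needs a reasonable amount of both spam and ham before its verdicts mean anything.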

The script that runs ‘sa-learn’ on the ham folder (notjunkmail) and the spam folder (junkmail) is located in /etc/mail/learn_junk_mail. To use this script, first create the e-mail accounts junkmail and notjunkmail. Then redirect spam that the filter missed (false negatives) to the junkmail address and ham that was wrongly flagged as spam (false positives) to the notjunkmail address. Using the Bayesian Classifier, SpamAssassin will learn from these e-mails. It is also possible to use SpamTrainer from http://osx.topicdesk.com/spamtrainer. Since the Bayesian Classifier’s results are not used until enough mail has been learned (by default, at least 200 spam and 200 ham messages), SpamTrainer allows you to manually make SpamAssassin learn from folders of Spam and Ham. Once that minimum is reached, you should see BAYES_* rules in e-mail headers and reports.

Look at the *.pre files and the *.cf files in /etc/mail/spamassassin/. The local.cf file is your configuration file for SpamAssassin. It contains many more customizations than the GUI shows. Before making changes to any of these files, back them up in case the GUI accidentally changes something. You can manually blacklist and whitelist senders with certain entries, such as:

  • whitelist_from + e-mail address: Whitelist the e-mail address (not recommended, because it also whitelists forged e-mail addresses).
  • whitelist_auth + e-mail address: First verify that the message was sent by an authorized sender for the address, then whitelist it.
  • whitelist_to + e-mail address: Disable SpamAssassin for a certain local user.
  • Use the command ‘man Mail::SpamAssassin::Conf’ for more configuration options.

Plugins are loaded in the *.pre files. Most are not enabled by default. For example, to use the TextCat plugin, edit the file /etc/mail/spamassassin/v310.pre and remove the # that is commenting out ‘loadplugin Mail::SpamAssassin::Plugin::TextCat’.

For Windows Users: Installing

SpamAssassin development for Windows has stalled over the past few years. However, you can still download a precompiled native Windows version with similar capabilities at http://www.jam-software.com/spamassassin/index.shtml. Please keep in mind that this version is extremely unstable and leaks memory. You must have a Mail Transfer Agent (MTA), such as hMailServer or Exchange, that can interface with SpamAssassin. If you want, you can also install the actual SpamAssassin using the following steps:


Installing (per Daniel Lemke):

  1. Install or upgrade ActivePerl
    1. Download from www.activestate.com
    2. Download nmake from Microsoft at http://download.microsoft.com/download/vc15/Patch/1.52/W95/EN-US/Nmake15.exe
    3. Extract nmake.exe and nmake.err to your bin folder in perl.
    4. Use the command ppm to install the required modules (the first two) and the optional modules:
      1. ppm install Net-DNS
      2. ppm install NetAddr-IP
      3. ppm repo add tcool
      4. ppm install DB_File
      5. ppm install Mail-SPF
      6. ppm install IP-Country
      7. ppm install Net-Ident
      8. ppm install IO-Socket-INET6
      9. ppm install IO-Socket-SSL
      10. ppm install Encode-Detect
      11. ppm repo add bribes
      12. ppm install Mail-DKIM
  2. Install Microsoft Visual C++ 6.0 or later or another C compiler
  3. Download the SpamAssassin zip file from the official website
  4. Set the following system environment variables:
    2. LANG=en_US
  5. Extract the zip file anywhere.
  6. Open up command prompt and switch the directory to the SpamAssassin directory
    1. Type perl Makefile.PL
    2. nmake
    3. nmake test (optional)
    4. nmake install
  7. Make a test folder and create a sample-spam.txt file and a sample-nospam.txt in the site configuration folder ( C:\perl\site\etc\mail\spamassassin).
  8. Switch the directory to the site configuration folder
    1. Type sa-update --nogpg
  9. Test the files!

Procmail Info

Procmail is a mail delivery agent (MDA) available on Unix-like environments that sorts incoming mail into specified directories and can use SpamAssassin to filter out spam. Procmail is usually set up to run all e-mails through SpamAssassin automatically; however, it doesn’t automatically put the flagged e-mails into a ‘spam’ directory. This requires a small tweak:

Add this using any editor:  

* ^X-Spam-Status: yes
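
The condition above matches messages whose X-Spam-Status header starts with “yes” (procmail matches case-insensitively by default). The same test can be sketched in Python; the folder names spam and inbox are placeholders, and the function names are made up.

```python
from email import message_from_string


def is_flagged_as_spam(raw_message):
    """True if the X-Spam-Status header begins with "Yes"."""
    msg = message_from_string(raw_message)
    status = msg.get("X-Spam-Status", "")
    return status.strip().lower().startswith("yes")


def choose_folder(raw_message, spam_folder="spam", inbox="inbox"):
    """Pick a destination folder the way the filing rule would."""
    return spam_folder if is_flagged_as_spam(raw_message) else inbox
```

A message without the header is treated as not spam, so mail that never passed through SpamAssassin still lands in the inbox.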

It would also be good to manually train the Bayesian Classifier (read more about this in the OS X section): see ‘man sa-learn’ for usage, and set ‘use_bayes 1’ in the ‘~/.spamassassin/user_prefs’ file.



This compendium was compiled and edited by Kevin A. McGrail. This page consists of information assembled from various sources with contributions from:
  • Dorian Chan for his work as part of the Google Code-In
  • Dianne F. Skoll and MIMEDefang
  • Robert Schetterer
  • The Perl foundation, Google Code-In
  • PCCC
  • The Apache Software Foundation SpamAssassin Project Management Committee
Originally based on this work: http://spamassassin.apache.org/presentations/macsysadmin_2009_presentation_final_post.pdf

Copyright © 1993 – 2024 Peregrine Hardware, Inc.
All trademarks and registered servicemarks are the property of their respective companies.