An update to the email standards

By Ken Simpson

The standards that govern Internet email have just been replaced. RFCs 2822 and 2821 are now officially obsoleted by RFCs 5322 and 5321. What’s changed? I downloaded the old and new versions of each standard, ran the raw text through a filter to strip extraneous material such as page headers, and then compared the documents using Microsoft Word. Most of the changes appear intended to resolve ambiguities in the old standards. Here’s what I found:

Changes from RFC 2822 to 5322 (Internet Message Format)

1. Slight changes to the rules for message headers in RFC 2822/5322 (text in [-like this-] is from 2822; text in {+like this+} is from 5322). The standard seems to be locking down the definition of header “folding” somewhat, which has previously been an area of some ambiguity:

   2.2. Header Fields

   Header fields are lines [-composed of-] {+beginning with+} a field name, followed by a colon (“:”), followed by a field body, and terminated by CRLF. A field name MUST be composed of printable US-ASCII characters (i.e., characters that have values between 33 and 126, inclusive), except colon. A field body may be composed of [-any-] {+printable+} US-ASCII characters[-, except for CR and LF. However, a field body may contain CRLF-] {+as well as the space (SP, ASCII value 32) and horizontal tab (HTAB, ASCII value 9) characters (together known as the white space characters, WSP). A field body MUST NOT include CR and LF except+} when used in [-header-] “folding” and “unfolding”{+,+} as described in section 2.2.3.

2. Unfolded header lines can be arbitrarily long. I worry when I see the words “arbitrarily long,” because an implementation that scans headers for security reasons must now allocate storage for headers dynamically during processing rather than relying on a fixed buffer:

   The process of moving from this folded multiple-line representation of a header field to its single line representation is called “unfolding”.
   Unfolding is accomplished by simply removing any CRLF that is immediately followed by WSP. Each header field should be treated in its unfolded form for further syntactic and semantic evaluation. An unfolded header field has no length restriction and therefore may be indeterminately long.

3. Tightening up the email address format. There is now a note recommending (but not requiring; one step at a time, folks) that the domain portion of an email address actually be a legitimate domain in the context in which the message is being used:

   Note: A liberal syntax for the domain portion of addr-spec is given here. However, the domain portion contains addressing information specified by and used in other protocols (e.g., [RFC1034], [RFC1035], [RFC1123], [RFC5321]). It is therefore incumbent upon implementations to conform to the syntax of addresses for the context in which they are used.

4. Quoted text is no longer allowed in the Message-Id header field. Message-Ids are incredibly important in email systems: they let a receiver determine whether two messages are actually duplicates of the same message. The old standard permitted quoted strings in Message-Ids, but these are now prohibited.

5. The definition of the Received: header has now mostly moved into RFC 5321 (the SMTP standard). This is helpful, because the Received: header records the hosts through which an email message passes, which really has more to do with SMTP than with the Internet Message Format defined in RFC 5322.

Changes from RFC 2821 to RFC 5321 (Simple Mail Transfer Protocol)

1. The most significant thing to note is that RFC 5321 no longer describes itself as a self-contained specification for SMTP.
It merely “consolidates, updates, and clarifies several previous documents, making all or parts of most of them obsolete,” covering “SMTP extension mechanisms and best practices for the contemporary Internet…” (in the excerpt, [-removed-] marks text dropped from RFC 2821 and {+added+} marks text new in RFC 5321):

   This document is a [-self-contained-] specification of the basic protocol for [-the-] Internet electronic mail transport. It consolidates, updates, and clarifies[-, but doesn’t add new or change existing functionality-] {+several previous documents, making all or parts of most of them obsolete. It covers+} the [-following: – the original-] SMTP [-(Simple Mail Transfer Protocol) specification of RFC 821 [30], – domain name system requirements-] extension mechanisms and [-implications for mail-] best practices [-transport from RFC 1035 [22] and RFC 974 [27], – the clarifications and applicability statements in RFC 1123 [2], and – material drawn from the SMTP Extension mechanisms [19].-]

   [-It obsoletes RFC 821, RFC 974, and updates RFC 1123 (replaces the mail transport materials of RFC 1123). However, RFC 821 specifies some features that were not in significant use in the Internet by the mid-1990s and (in appendices) some additional transport models. Those sections are omitted here in the interest of clarity and brevity; readers needing them should refer to RFC 821.-]

   [-It also includes some additional material from RFC 1123 that required amplification. This material has been identified in multiple ways, mostly by tracking flaming on various lists and newsgroups and problems of unusual readings or interpretations that have appeared as the SMTP extensions have been deployed. Where this specification moves beyond consolidation and actually differs from earlier documents, it supersedes them technically as well as textually.-]

   [-Although SMTP was designed as a mail transport and delivery protocol, this specification also contains information that is important to its use as a ‘mail submission’ protocol, as recommended for POP [3, 26] and IMAP [6]. Additional submission issues are discussed in RFC 2476 [15].-]

   [-Section 2.3 provides definitions of terms specific to this document. Except when the historical terminology is necessary for clarity, this document uses the current ‘client’ and ‘server’ terminology to identify the sending and receiving SMTP processes, respectively.-]

   [-A companion document [32] discusses message headers, message bodies and formats and structures for them, and their relationship.-]

   {+for the contemporary Internet, but does not provide details about particular extensions. Although SMTP was designed as a mail transport and delivery protocol, this specification also contains information that is important to its use as a “mail submission” protocol for “split-UA” (User Agent) mail reading systems and mobile environments.+}

2. Using port 587 is (finally) recommended. The message submission protocol basically lets you use a subset of SMTP to send messages via a trusted gateway. Most ISPs now provide port 587 access for sending email, which allows them to shut off port 25 from their subscriber networks:

   In many situations and configurations, the less-capable clients discussed above SHOULD be using the message submission protocol (RFC 4409 [18]) rather than SMTP.

3. No spaces around the colon in MAIL FROM. This is one I hadn’t noticed in the previous standard, but which is now clarified. The FROM in the SMTP MAIL command (and the TO in the RCPT command) cannot have a space on either side of the colon that follows it:

   Since it has been a common source of errors, it is worth noting that spaces are not permitted on either side of the colon following FROM in the MAIL command or TO in the RCPT command. The syntax is exactly as given above.

4. Explicit recognition of SPF and DKIM. The SMTP standard completely lacks a method for verifying whether the purported sender of a message is who they say they are.
The new standard points to external mechanisms like SPF and DKIM as ways to help identify the actual sender of a message, although it stops short of recommending any particular one:

   This specification does not deal with the verification of return paths for use in delivery notifications. Recent work, such as that on SPF [29] and DKIM [30] [31], has been done to provide ways to ascertain that an address is valid or belongs to the person who actually sent the message. A server MAY attempt to verify the return path before using its address for delivery notifications, but methods of doing so are not defined here nor is any particular method recommended at this time.

5. It’s now quite legal to disconnect an SMTP session after detecting a timeout. Everyone did this anyhow, so it’s nice to see it finally recognized (in the excerpts for this item and the next, [-removed-] marks text dropped from RFC 2821 and {+added+} marks text new in RFC 5321):

   An SMTP server MUST NOT intentionally close the connection [-except:-] {+under normal operational circumstances (see Section 7.8) except:+}

   o After receiving a QUIT command and responding with a 221 reply.

   o After detecting the need to shut down the SMTP service and returning a 421 response code. This response code can be issued after the server receives any command or, if necessary, asynchronously from command receipt (on the assumption that the client will receive it after the next command is issued).

   {+o After a timeout, as specified in Section 4.5.3.2, occurs waiting for the client to send a command or data.+}

6. 100-series reply codes have now been removed. These codes were never really used anyhow:

   [-1yz Positive Preliminary reply-]

   [-The command has been accepted, but the requested action is being held in abeyance, pending confirmation of the information in this reply. The SMTP client should send another command specifying whether to continue or abort the action. Note: unextended SMTP does not have any commands that allow this type of reply, and so does not have continue or abort commands.-]

7. You can now send back 550 responses after DATA, when the message could not be queued because of policy violations.
This is a great step forward, finally recognizing the concept of inline spam and virus filtering.

8. IPv6 support is now explicitly mentioned, although not required:

   5.2. IPv6 and MX Records

   In the contemporary Internet, SMTP clients and servers may be hosted on IPv4 systems, IPv6 systems, or dual-stack systems that are compatible with either version of the Internet Protocol. The host domains to which MX records point may, consequently, contain “A RR”s (IPv4), “AAAA RR”s (IPv6), or any combination of them. While RFC 3974 [39] discusses some operational experience in mixed environments, it was not comprehensive enough to justify standardization, and some of its recommendations appear to be inconsistent with this specification. The appropriate actions to be taken either will depend on local circumstances, such as performance of the relevant networks and any conversions that might be necessary, or will be obvious (e.g., an IPv6-only client need not attempt to look up A RRs or attempt to reach IPv4-only servers). Designers of SMTP implementations that might run in IPv6 or dual-stack environments should study the procedures above, especially the comments about multihomed hosts, and, preferably, provide mechanisms to facilitate operational tuning and mail interoperability between IPv4 and IPv6 systems while considering local circumstances.

9. A new section dealing specifically with abusive or attack messages. Section 6.2 argues that messages should be delivered to recipients unless the receiving system is absolutely sure that they are bad, and that when a message is bad, a bounce should be sent to the sender if possible. Silently discarding messages is not prohibited, but it is strongly discouraged. Bounces should only be sent if the receiver knows they’ll be “usefully delivered.” This is code for: Don’t send bounces when your system rejects spam or virus traffic, or you’ll become an Internet pariah.
   Conversely, if a message is rejected because it is found to contain hostile content (a decision that is outside the scope of an SMTP server as defined in this document), rejection (“bounce”) messages SHOULD NOT be sent unless the receiving site is confident that those messages will be usefully delivered.

10. Directory Harvest Attack (DHA) prevention and other kinds of SMTP server protection are now legal. This is a great thing to see in the standard, because it allows receivers to claim innocence if they refuse to service connections from hostile senders.

   7.8. Resistance to Attacks

   In recent years, there has been an increase of attacks on SMTP servers, either in conjunction with attempts to discover addresses for sending unsolicited messages or simply to make the servers inaccessible to others (i.e., as an application-level denial of service attack). While the means of doing so are beyond the scope of this Standard, rational operational behavior requires that servers be permitted to detect such attacks and take action to defend themselves. For example, if a server determines that a large number of RCPT TO commands are being sent, most or all with invalid addresses, as part of such an attack, it would be reasonable for the server to close the connection after generating an appropriate number of 5yz (normally 550) replies.

That’s all for now. I eagerly await your comments, by RFC 5321 email or otherwise.
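
P.S. For anyone implementing the RFC 5322 unfolding rule quoted earlier (remove any CRLF that is immediately followed by WSP), here is a minimal Python sketch, not a definitive implementation. It assumes the header field is already available as one string with CRLF line endings; the function name `unfold` and the sample Subject line are made up for illustration.

```python
import re

# A fold is a CRLF immediately followed by WSP (space or horizontal tab).
# The lookahead keeps the WSP itself in place; only the CRLF is removed.
_FOLD = re.compile(r"\r\n(?=[ \t])")


def unfold(raw_field: str) -> str:
    """Return a header field with its folds removed.

    Hypothetical helper: the name and signature are assumptions,
    not something defined by the standard.
    """
    return _FOLD.sub("", raw_field)


if __name__ == "__main__":
    folded = "Subject: This is a\r\n test of\r\n\tfolding\r\n"
    # The terminating CRLF is not followed by WSP, so it is left alone.
    print(repr(unfold(folded)))
```

Python strings grow as needed, so the “arbitrarily long” result is not a buffer problem here. An implementation in a lower-level language would probably still want to cap the total size it is willing to accept.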