Internet Engineering Task Force (IETF)                      M. Kucherawy
Request for Comments: 7103                                    G. Shapiro
Category: Informational                                         N. Freed
ISSN: 2070-1721                                             January 2014
Advice for Safe Handling of Malformed Messages
Abstract
Although Internet message formats have been precisely defined since the 1970s, authoring and handling software often shows only mild conformance to the specifications. The malformed messages that result are non-standard. Nonetheless, decades of experience have shown that using some tolerance in the handling of the malformations that result is often an acceptable approach and is better than rejecting the messages outright as nonconformant. This document includes a collection of the best advice available regarding a variety of common malformed mail situations; it is to be used as implementation guidance.
Status of This Memo
This document is not an Internet Standards Track specification; it is published for informational purposes.
This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Not all documents approved by the IESG are a candidate for any level of Internet Standard; see Section 2 of RFC 5741.
Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at http://www.rfc-editor.org/info/rfc7103.
Copyright Notice
Copyright (c) 2014 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
Table of Contents
   1.  Introduction
       1.1.  The Purpose of This Work
       1.2.  Not the Purpose of This Work
       1.3.  General Considerations
   2.  Document Conventions
       2.1.  Examples
   3.  Background
   4.  Invariant Content
   5.  Mail Submission Agents
   6.  Line Termination
   7.  Header Anomalies
       7.1.  Converting Obsolete and Invalid Syntaxes
             7.1.1.  Host-Address Syntax
             7.1.2.  Excessive Angle Brackets
             7.1.3.  Unbalanced Angle Brackets
             7.1.4.  Unbalanced Parentheses
             7.1.5.  Commas in Address Lists
             7.1.6.  Unbalanced Quotes
             7.1.7.  Naked Local-Parts
       7.2.  Non-Header Lines
       7.3.  Unusual Spacing
       7.4.  Header Malformations
       7.5.  Header Field Counts
             7.5.1.  Repeated Header Fields
             7.5.2.  Missing Header Fields
             7.5.3.  Return-Path
       7.6.  Missing or Incorrect Charset Information
       7.7.  Eight-Bit Data
   8.  MIME Anomalies
       8.1.  Missing MIME-Version Field
       8.2.  Faulty Encodings
   9.  Body Anomalies
       9.1.  Oversized Lines
   10. Security Considerations
   11. References
       11.1.  Normative References
       11.2.  Informative References
   Appendix A.  Acknowledgements
The history of email standards, going back to [RFC733] and beyond, contains a fairly rigid evolution of specifications. However, implementations within that culture have also long had an undercurrent known formally as "the robustness principle", also known informally as "Postel's Law": "Be liberal in what you accept, and conservative in what you send" [RFC1122].
Jon Postel's directive is often interpreted to mean that any deviance from a specification is acceptable. However, we believe it was intended only to account for legitimate variations in interpretation within specifications, as well as basic transit errors, like bit errors. Taken to its unintended extreme, excessive tolerance would imply that there are no limits to the liberties that a sender might take, while presuming a burden on a receiver to guess "correctly" at the meaning of any such variation. These matters are further compounded by receiver software -- the end users' mail readers -- which are also sometimes flawed, leaving senders to craft messages (sometimes bending the rules) to overcome those flaws.
In general, this served the email ecosystem well by allowing a few errors in implementations without obstructing participation in the game. The proverbial bar was set low. However, as we have evolved into the current era, some of these lenient stances have begun to expose opportunities that can be exploited by malefactors. Various email-based applications rely on the strong application of these standards for simple security checks, while the very basic building blocks of that infrastructure, intending to be robust, fail utterly to assert those standards.
The distributed and non-interactive nature of email has often prompted adjustments to receiving software, to handle these variations, rather than trying to gain better conformance by senders, since the receiving operator is primarily driven by complaints from recipient users and has no authority over the sending side of the system. Processing with such flexibility comes at some cost, since mail software is faced with decisions about whether to permit non-conforming messages to continue toward their destinations unaltered, adjust them to conform (possibly at the cost of losing some of the original message), or reject them outright.
This document includes a collection of the best advice available regarding a variety of common malformed mail situations; it is to be used as implementation guidance. These malformations are typically based around loose interpretations or implementations of specifications such as the Internet Message Format [MAIL] and Multipurpose Internet Mail Extensions [MIME].
It is important to understand that this work is not an effort to endorse or standardize certain common malformations. The code and culture that introduces such messages into the mail stream needs to be repaired, as the security penalty now being paid for this lax processing arguably outweighs the reduction in support costs to end users who are not expected to understand the standards. However, the reality is that this will not be fixed quickly.
Given this, it is beneficial to provide implementers with guidance about the safest or most effective way to handle malformed messages when they arrive, taking into consideration the trade-offs of the choices available especially with respect to how various actors in the email ecosystem respond to such messages in terms of handling, parsing, or rendering to end users.
Many deviations from message format standards are considered by some receivers to be strong indications that the message is undesirable, such as spam or something containing malware. These receivers quickly decide that the best handling choice is simply to reject or discard the message. This means malformations caused by innocent misunderstandings or ignorance of proper syntax can cause messages with no ill intent also to fail to be delivered.
Senders that want to ensure message delivery are best advised to adhere strictly to the relevant standards (including, but not limited to, [MAIL], [MIME], and [DKIM]), as well as observe other industry best practices such as may be published from time to time by either the IETF or independently.
Receivers that haven't the luxury of strict enforcement of the standards on inbound messages are usually best served by observing the following guidelines for handling of malformed messages:
1. Whenever possible, mitigation of syntactic malformations should be guided by an assessment of the most likely semantic intent. For example, it is reasonable to conclude that multiple sets of angle brackets around an address are simply superfluous and can be dropped.
2. When the intent is unclear, or when it is clear but also impractical to change the content to reflect that intent, mitigation should be limited to cases where not taking any corrective action would clearly lead to a worse outcome.
3. Security issues, when present, need to be addressed and may force mitigation strategies that are otherwise suboptimal.
Examples of message content include a number within braces at the end of each line. These are line numbers for use in subsequent discussion, and they are not actually part of the message content presented in the example.
Blank lines are not numbered in the examples.
The reader would benefit from reading [EMAIL-ARCH] for some general background about the overall email architecture. Of particular interest is the Internet Message Format, detailed in [MAIL]. Throughout this document, the use of the term "message" should be assumed to mean a block of text conforming to the Internet Message Format.
An agent handling a message could use several distinct representations of the message. One is an internal representation, such as separate blocks of storage for the header and body, some header or body alterations, or tables indexed by header name, set up to make particular kinds of processing easier. The other is the representation passed along to the next agent in the handling chain. This might be identical to the message input to the module, or it might have some changes such as added or reordered header fields or body elisions to remove malicious content.
Message handling is usually most effective when each in a sequence of handling modules receives the same content for analysis. A module that "fixes" or otherwise alters the content passed to later modules can prevent the later modules from identifying malicious or other content that exposes the end user to harm. It is important that all processing modules can make consistent assertions about the content. Modules that operate sequentially sometimes add private header fields to relay information downstream for later filters to use (and possibly remove), or they may have out-of-band ways of doing so. However, even the presence of private header fields can impact a downstream handling agent unaware of its local semantics, so an out-of-band method is always preferable.
The above is less of a concern when multiple analysis modules are operated in parallel, independent of one another.
Often, abuse reporting systems can act effectively only when a complaint or report contains the original message exactly as it was generated. Messages that have been altered by handling modules might render a complaint not actionable as the system receiving the report may be unable to identify the original message as one of its own.
Some message changes alter syntax without changing semantics. For example, Section 7.4 describes a situation where an agent removes additional header whitespace. This is a syntax change without a change in semantics, though some systems (such as DKIM) are sensitive to such changes. Message system developers need to be aware of the downstream impact of making either kind of change.
Where a change to content between modules is unavoidable, it is a good idea to add standard trace data to indicate a "visible" handoff between modules has occurred. The only advisable way to do this is to prepend Received fields with the appropriate information, as described in Section 3.6.7 of [MAIL].
There will always be local handling exceptions, but these guidelines should be useful for developing integrated message processing environments.
In most cases, this document only discusses techniques used on internal representations. It is occasionally necessary to make changes between the input and output versions; such cases will be called out explicitly.
Within the email context, the single most influential component that can reduce the presence of malformed items in the email system is the Mail Handling Service (MHS; see [EMAIL-ARCH]), which includes the Mail Submission Agent (MSA). This is the component that is essentially the interface between end users that create content and the mail stream.
MHSs need to become more strict about enforcement of all relevant email standards, especially [MAIL] and the [MIME] family of documents.
More strict conformance by relaying Mail Transfer Agents (MTAs) will also be helpful. Although preventing the dissemination of malformed messages is desirable, the rejection of such mail already in transit also has a support cost -- namely, the creation of a [DSN] that many end users might not understand.
For interoperable Internet Mail messages, the only valid line separation sequence during a typical SMTP session is ASCII 0x0D ("carriage return", or CR) followed by ASCII 0x0A ("line feed", or LF), commonly referred to as "CRLF". This is not the case for binary mode SMTP (see [BINARYSMTP]).
Common UNIX user tools, however, typically only use LF for internal line termination. This means that a protocol engine that converts between UNIX and Internet message formats has to convert between these two end-of-line representations before transmitting a message or after receiving it.
Non-compliant implementations can create messages with a mix of line terminations, such as LF everywhere except CRLF only at the end of the message. According to [SMTP] and [MAIL], this means the entire message actually exists on a single line.
Within modern Internet Mail, it is highly unlikely that an isolated CR or LF is valid in common ASCII text. Furthermore, when content actually does need to contain such an unusual character sequence, [MIME] provides mechanisms for encoding that content in an SMTP-safe manner.
Thus, it will typically be safe and helpful to treat an isolated CR or LF as equivalent to a CRLF when parsing a message.
Note that this advice pertains only to the raw SMTP data and not to decoded MIME entities. As noted above, when MIME encoding mechanisms are used, the unusual character sequences are not visible in the raw SMTP stream.
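As an illustration of the advice above, the following is a minimal Python sketch that treats every isolated CR or LF in a raw message as a CRLF. The function name is hypothetical, and the sketch assumes its input is the raw bytes of a message received over ordinary (non-binary) SMTP, not a decoded MIME entity.

```python
import re

# CRLF is listed first so that a well-formed pair is matched as a unit
# and never split; a lone CR or a lone LF is matched otherwise.
_EOL = re.compile(rb"\r\n|\r|\n")


def normalize_line_endings(raw: bytes) -> bytes:
    """Return raw with every isolated CR or LF replaced by CRLF."""
    return _EOL.sub(b"\r\n", raw)
```

Because existing CRLF pairs are matched first, applying the function to an already conformant message leaves it unchanged.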
This section covers common syntactic and semantic anomalies found in a message header and presents suggested methods of mitigation.
A message using an obsolete header syntax (see Section 4 of [MAIL]) might confound an agent that is attempting to be robust in its handling of syntax variations. A bad actor could exploit such a weakness in order to get abusive or malicious content through a filter. This section presents some examples of such variations. Messages including these variations ought to be rejected; where this is not possible, recommended internal interpretations are provided.
The following obsolete syntax attempts to specify source routing:
To: <@example.net:fran@example.com>
This means "send to fran@example.com via the mail service at example.net". It can safely be interpreted as:
To: <fran@example.com>
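The internal interpretation above could be sketched in Python as follows. The function name and the regular expression are illustrative assumptions; the sketch handles only routes written inside angle brackets and does not attempt to cover domain literals that contain colons.

```python
import re

# "<", an obsolete route beginning with "@" and running up to the first ":",
# then the real address, then ">".
_ROUTE = re.compile(r"<\s*@[^:<>]*:\s*([^<>]+)>")


def strip_source_route(value: str) -> str:
    """Drop an obsolete source route, keeping only the final address."""
    return _ROUTE.sub(r"<\1>", value)
```

An address without a route does not match the pattern and is returned unchanged.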
The following overuse of angle brackets:
To: <<<user2@example.org>>>
can safely be interpreted as:
To: <user2@example.org>
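A minimal Python sketch of this interpretation follows. The function name is hypothetical, and the sketch makes the simplifying assumption that the value contains no quoted display name in which repeated angle brackets are meaningful.

```python
import re


def collapse_angle_brackets(value: str) -> str:
    """Reduce each run of repeated "<" or ">" to a single bracket."""
    value = re.sub(r"<{2,}", "<", value)
    return re.sub(r">{2,}", ">", value)
```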
The following use of unbalanced angle brackets:
To: <another@example.net
can usually be treated as:
To: <another@example.net>
The following:
To: second@example.org>
can usually be treated as:
To: second@example.org
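Both unbalanced cases above could be handled along the following lines. This is a sketch under stated assumptions: the function name is hypothetical, and the input is taken to be a single address with no display name or comment.

```python
def balance_angle_brackets(addr: str) -> str:
    """Complete a missing ">" or drop a stray ">" in a single address."""
    addr = addr.strip()
    has_open = "<" in addr
    has_close = ">" in addr
    if has_open and not has_close:
        # "<another@example.net" is treated as "<another@example.net>".
        return addr + ">"
    if has_close and not has_open:
        # "second@example.org>" is treated as "second@example.org".
        return addr.replace(">", "")
    return addr
```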
The following use of unbalanced parentheses:
To: (Testing <fran@example.com>
can safely be interpreted as:
To: (Testing) <fran@example.com>
Likewise, this case:
To: Testing) <sam@example.com>
can safely be interpreted as:
To: "Testing)" <sam@example.com>
In both cases, it is obvious where the active email address in the string can be found. The former case retains the active email address in the string by completing what appears to be intended as a comment; the intent in the latter case is less obvious, so the leading string is interpreted as a display name.
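The two interpretations above might be sketched in Python as follows. The function name is hypothetical; the sketch assumes the value ends with one angle-bracketed address and that the leading text contains no quotation marks of its own.

```python
import re


def repair_parentheses(value: str) -> str:
    """Balance parentheses in the text that precedes an angle-bracketed address."""
    match = re.search(r"<[^<>]+>\s*$", value)
    if match is None:
        return value  # no obvious active address; leave untouched
    lead = value[:match.start()].strip()
    addr = match.group(0).strip()
    opens, closes = lead.count("("), lead.count(")")
    if opens > closes:
        # An unfinished comment: complete it.
        lead += ")" * (opens - closes)
    elif closes > opens:
        # Intent is less obvious: treat the leading text as a display name.
        lead = '"' + lead + '"'
    return f"{lead} {addr}" if lead else addr
```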
This use of an errant comma:
To: <third@example.net, fourth@example.net>
can usually be interpreted as ending an address, so the above is usually best interpreted as:
To: third@example.net, fourth@example.net
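A possible Python sketch of this interpretation is shown below. The function name is hypothetical, and the sketch assumes any obsolete source routes (which legitimately contain commas inside angle brackets) have already been removed.

```python
import re


def split_bracketed_list(value: str) -> str:
    """Treat a comma inside angle brackets as ending an address."""

    def fix(match: re.Match) -> str:
        inner = match.group(1)
        if "," not in inner:
            return match.group(0)  # an ordinary <address>; keep as-is
        parts = [p.strip() for p in inner.split(",")]
        return ", ".join(p for p in parts if p)

    return re.sub(r"<([^<>]*)>", fix, value)
```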
The following use of unbalanced quotation marks:
To: "Joe <joe@example.com>
leaves software with no unambiguous interpretation. One possible interpretation is:
To: "Joe <joe@example.com>"@example.net
where "example.net" is the domain name or host name of the handling agent making the interpretation. However, the more obvious and likely best interpretation is simply:
To: "Joe" <joe@example.com>
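The more likely interpretation could be sketched as follows. The function name is hypothetical, and the sketch makes two simplifying assumptions: the value holds a single address, and escaped quotation marks inside quoted strings are not present.

```python
import re

# An opening quote, the intended display name, then an angle-bracketed address.
_OPEN_QUOTE = re.compile(r'\s*"([^"<]*?)\s*(<[^<>]+>)\s*$')


def close_display_name_quote(value: str) -> str:
    """Close an unterminated quoted display name before the address."""
    if value.count('"') % 2 == 0:
        return value  # quotation marks are already balanced
    match = _OPEN_QUOTE.match(value)
    if match is None:
        return value  # no safe guess; leave for other handling
    return f'"{match.group(1)}" {match.group(2)}'
```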
[MAIL] defines a local-part as the user portion of an email address, and the display-name as the "user-friendly" label that accompanies the address specification.
Some broken submission agents might introduce messages with only a local-part or only a display-name and no properly formed address. For example:
To: Joe
A submission agent ought to reject this or, at a minimum, append "@" followed by its own host name or some other valid name likely to enable a reply to be delivered to the correct mailbox. Where this is not done, an agent receiving such a message will probably be successful by synthesizing a valid header field for evaluation using the techniques described in Section 7.5.2.
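For a submission agent that chooses to repair rather than reject, the minimum action described above might look like the following Python sketch. The function name and the default_domain parameter are hypothetical; the sketch qualifies only a value that is a single bare token.

```python
def qualify_naked_local_part(value: str, default_domain: str) -> str:
    """Append "@" and a valid domain to a bare local-part such as "Joe"."""
    token = value.strip()
    looks_structured = any(c in token for c in ' \t<>,;:"()')
    if not token or "@" in token or looks_structured:
        return value  # not a single naked token; leave untouched
    return f"{token}@{default_domain}"
```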
Some messages contain a line of text in the header that is not a valid message header field of any kind. For example:
    From: user@example.com                                       {1}
    To: userpal@example.net                                      {2}
    Subject: This is your reminder                               {3}
    about the football game tonight                              {4}
    Date: Wed, 20 Oct 2010 20:53:35 -0400                        {5}

    Don't forget to meet us for the tailgate party!              {7}
The cause of this is typically a bug in a message generator of some kind. Line {4} was intended to be a continuation of line {3}; it should have been indented by whitespace as set out in Section 2.2.3 of [MAIL].
This anomaly has varying impacts on processing software, depending on the implementation:
1. Some agents choose to separate the header of the message from the body only at the first empty line (that is, a CRLF immediately followed by another CRLF).
2. Some agents assume this anomaly should be interpreted to mean the body starts at line {4}, as the end of the header is assumed by encountering something that is not a valid header field or folded portion thereof.
3. Some agents assume this should be interpreted as an intended header folding as described above and thus simply append a single space character (ASCII 0x20) and the content of line {4} to that of line {3}.
4. Some agents reject this outright as line {4} is neither a valid header field nor a folded continuation of a header field prior to an empty line.
This can be exploited if it is known that one message handling agent will take one action, while the next agent in the handling chain will take another. Consider, for example, a message filter attached to a Mail Transfer Agent (MTA) that implements option 2 above, where the filter searches message headers for properties indicative of abusive or malicious content. An attacker could craft a message that includes this malformation at a position above the property of interest, knowing the MTA will not consider that content part of the header. Consequently, the MTA will not feed that content to the filter, and the property escapes detection. Meanwhile, the Mail User Agent (MUA), which presents the content to an end user, implements option 1 or 3 and so treats the same content as part of the header, which has some undesirable effect.
It should be noted that a few implementations choose option 4 above since any reputable message generation program will get header folding right, and thus anything so blatant as this malformation is likely an error caused by a malefactor.
The preferred implementation if option 4 above is not employed is to apply the following heuristic when this malformation is detected:
1. Search forward for an empty line. If one is found, then apply option 3 above to the anomalous line, and continue.
2. Search forward for another line that appears to be a new header field (a name followed by a colon). If one is found, then apply option 3 above to the anomalous line, and continue.
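The heuristic above can be sketched in Python as follows. The function name is hypothetical, and the input is assumed to be a list of message lines with their line terminators already removed. The text does not say what to do when neither search succeeds; treating the anomalous line as the start of the body in that case is an assumption of this sketch.

```python
import re

# A header field start: a name of printable ASCII (no colon), optional
# whitespace, then a colon.
_FIELD = re.compile(r"^[!-9;-~]+[ \t]*:")


def repair_header(lines):
    """Split lines into (header, body), folding anomalous header lines."""
    header = []
    for i, line in enumerate(lines):
        if line == "":
            return header, lines[i + 1:]  # normal header/body separator
        if line[0] in " \t" or _FIELD.match(line):
            header.append(line)  # valid field start or folded continuation
            continue
        # Anomalous line: search forward for an empty line or a new field.
        rest = lines[i + 1:]
        found = any(l == "" or _FIELD.match(l) for l in rest)
        if header and found:
            header[-1] += " " + line  # option 3: join with a single space
        else:
            return header, lines[i:]  # assumption: the body starts here
    return header, []
```

Applied to the example message, line {4} is joined to line {3} and the Date field remains part of the header.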
The following message is valid per [MAIL]:
    From: user@example.com                                       {1}
    To: userpal@example.net                                      {2}
    Subject: This is your reminder                               {3}
                                                                 {4}
     about the football game tonight                             {5}
    Date: Wed, 20 Oct 2010 20:53:35 -0400                        {6}

    Don't forget to meet us for the tailgate party!              {8}
Line {4} contains a single whitespace. The intended result is that lines {3}, {4}, and {5} comprise a single continued header field. However, some agents are aggressive at stripping trailing whitespace, which will cause line {4} to be treated as an empty line, and thus the separator line between header and body. This can affect header-specific processing algorithms as described in the previous section.
This example was legal in earlier versions of the Internet message format standard but was rendered obsolete as of [RFC2822] as line {4} could be interpreted as the separator between the header and body.
The best handling of this example is for a message parsing engine to behave as if line {4} were not present in the message and for a message creation engine to emit the message with line {4} removed.
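A parsing engine could behave as if such a line were absent along the lines of the following Python sketch. The function name is hypothetical; the input is assumed to be a list of message lines without terminators, examined before any trailing-whitespace stripping takes place.

```python
def drop_whitespace_only_lines(lines):
    """Remove header lines that contain only spaces or tabs."""
    out = []
    in_header = True
    for line in lines:
        if in_header and line == "":
            in_header = False  # a truly empty line ends the header
        if in_header and line.strip(" \t") == "":
            continue  # whitespace-only header line: treat as not present
        out.append(line)
    return out
```

Only the header is altered; whitespace-only lines in the body are kept as they are.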
Among the many possible malformations, a common one is insertion of whitespace at unusual locations, such as:
    From: user@example.com                                       {1}
    To: userpal@example.net                                      {2}
    Subject: This is your reminder                               {3}
    MIME-Version : 1.0                                           {4}
    Content-Type: text/plain                                     {5}
    Date: Wed, 20 Oct 2010 20:53:35 -0400                        {6}

    Don't forget to meet us for the tailgate party!              {8}
Note the addition of whitespace in line {4} after the header field name but before the colon that separates the name from the value.
注意{4}行在头字段名称之后,但在冒号之前添加了空格,将名称与值分隔开。
The obsolete grammar of Section 4 of [MAIL] permits that extra whitespace, so it cannot be considered invalid. However, a consensus of implementations prefers to remove that whitespace. There is no perceived change to the semantics of the header field being altered as the whitespace is itself semantically meaningless. Therefore, it is best to remove all whitespace after the field name but before the colon and to emit the field in this modified form.
[MAIL]第4节的过时语法允许使用额外的空格,因此不能将其视为无效。然而,实现的共识倾向于删除空白。由于空格本身在语义上是无意义的,因此对正在更改的头字段的语义没有任何感知到的更改。因此,最好删除字段名之后冒号之前的所有空格,并以这种修改后的形式发出字段。
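One way to remove whitespace between the field name and the colon is a single anchored substitution. The sketch below is illustrative only; the helper name `normalize_field_name` is an assumption, and it operates on one unfolded header line at a time.

```python
import re

# Field name (printable US-ASCII except space and colon), followed by one
# or more spaces or tabs, followed by the colon.
_NAME_WSP_COLON = re.compile(r"^([!-9;-~]+)[ \t]+:")

def normalize_field_name(line):
    """Remove whitespace between a header field name and its colon."""
    # Continuation lines begin with whitespace, so they never match.
    return _NAME_WSP_COLON.sub(r"\1:", line, count=1)
```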
Section 3.6 of [MAIL] prescribes specific header field counts for a valid message. Few agents actually enforce these, in the sense that a message whose header contents exceed one or more of the limits set there is generally allowed to pass; they typically add any required fields that are missing, however.
[MAIL]第3.6节规定了有效邮件的特定标题字段计数。实际上,很少有代理执行这些命令,因为头内容超过一个或多个限制的消息通常被允许通过;但是,它们通常会添加缺少的任何必填字段。
Also, few agents that use messages as input, including MUAs that actually display messages to users, verify that the input is valid before proceeding. Some popular open-source filtering programs and some popular Mailing List Management (MLM) packages select either the first or last instance of a particular field name, such as From, to decide who sent a message. Absent strict enforcement of [MAIL], an attacker can craft a message with multiple instances of the same fields if that attacker knows the filter will make a decision based on one, but the user will be shown the others.
此外,很少有使用消息作为输入的代理(包括向用户实际显示消息的MUA)在继续之前验证输入是否有效。一些流行的开源过滤程序和一些流行的邮件列表管理(MLM)软件包选择特定字段名的第一个或最后一个实例(如发件人),以决定谁发送了消息。在没有严格执行[MAIL]的情况下,如果攻击者知道过滤器将根据一个实例做出决定,但用户将看到其他实例,则攻击者可以使用相同字段的多个实例来制作消息。
This situation is exacerbated when message validity is assessed, such as through enhanced authentication methods like DomainKeys Identified Mail [DKIM]. Such methods might cover one instance of a constrained field but not another, taking the wrong one as "good" or "safe". An MUA, for example, could show the first of two From fields to an end user as "good" or "safe", while an authentication method actually only verified the second.
当评估消息有效性时,这种情况会加剧,例如通过增强的身份验证方法,如DomainKeys Identified Mail[DKIM]。此类方法可能涵盖受约束字段的一个实例,但不涵盖另一个实例,将错误的实例视为“良好”或“安全”。例如,MUA可以向最终用户显示两个From字段中的第一个字段为“良好”或“安全”,而身份验证方法实际上只验证第二个字段。
In attempting to counter this exposure, one of the following strategies can be used:
在试图应对这种暴露时,可以使用以下策略之一:
1. reject outright or refuse to process further any input message that does not conform to Section 3.6 of [MAIL];
1. 完全拒绝或拒绝进一步处理任何不符合[邮件]第3.6节的输入信息;
2. remove or, in the case of an MUA, refuse to render any instances of a header field whose presence exceeds a limit prescribed in Section 3.6 of [MAIL] when generating its output;
2. 如果是MUA,在生成其输出时,移除或拒绝呈现其存在超过[邮件]第3.6节规定限制的标题字段的任何实例;
3. where a field can contain multiple distinct values (such as From) or is free-form text (such as Subject), combine them into a semantically identical, single header field of the same name (see Section 7.5.1);
3. 如果一个字段可以包含多个不同的值(如From)或是自由格式文本(如Subject),则将它们组合成语义相同、名称相同的单个标题字段(见第7.5.1节);
4. alter the name of any header field whose presence exceeds a limit prescribed in Section 3.6 of [MAIL] when generating its output so that later agents can produce a consistent result. Any alteration likely to cause the field to be ignored by downstream agents is acceptable. A common approach is to prefix the field names with a string such as "BAD-".
4. 在生成其输出时,更改出现超过[MAIL]第3.6节规定限制的任何标题字段的名称,以便以后的代理可以生成一致的结果。任何可能导致下游代理忽略现场的变更都是可以接受的。一种常见的方法是在字段名前面加上字符串,如“BAD-”。
When selecting a mitigation action (or some other action) from the above list, an operator must consider its needs and the nature of its user base.
当从上面列表中选择缓解动作(或其他一些动作)时,操作者必须考虑它的需求和它的用户基础的性质。
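Strategy 4 from the list above might look like the sketch below. Several things here are assumptions made for illustration: the header is modeled as a list of (name, value) pairs, the first instance of a limited field is the one kept under its own name, and the names `SINGLE_INSTANCE` and `rename_excess_fields` are hypothetical. The set lists the fields that Section 3.6 of [MAIL] limits to one occurrence.

```python
# Fields limited to at most one occurrence by Section 3.6 of [MAIL].
SINGLE_INSTANCE = {
    "date", "from", "sender", "reply-to", "to", "cc", "bcc",
    "message-id", "in-reply-to", "references", "subject",
}

def rename_excess_fields(fields, prefix="BAD-"):
    """Prefix the name of every surplus instance of a limited field."""
    seen = set()
    out = []
    for name, value in fields:
        key = name.lower()
        if key in SINGLE_INSTANCE and key in seen:
            # Assumption: the first instance stays; later ones are renamed
            # so that downstream agents ignore them consistently.
            out.append((prefix + name, value))
        else:
            seen.add(key)
            out.append((name, value))
    return out
```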
There are some occasions where repeated fields are encountered where only one is expected. Two examples are presented. First:
在某些情况下,会遇到重复字段,而预期只有一个字段。给出了两个例子。第一:
   From: reminders@example.com {1}
   To: jqpublic@example.com {2}
   Subject: Automatic Meeting Reminder {3}
   Subject: 4pm Today -- Staff Meeting {4}
   Date: Wed, 20 Oct 2010 08:00:00 -0700 {5}

   Reminder of the staff meeting today in the small {6}
   auditorium. Come early! {7}
The message above has two Subject fields, which is in violation of Section 3.6 of [MAIL]. A safe interpretation of this would be to treat it as though the two Subject field values were concatenated, so long as they are not identical, such as:
上述邮件有两个主题字段,这违反了[邮件]第3.6节的规定。对此的安全解释是,将其视为两个主题字段值是串联的,只要它们不相同,例如:
   From: reminders@example.com {1}
   To: jqpublic@example.com {2}
   Subject: Automatic Meeting Reminder {3}
    4pm Today -- Staff Meeting {4}
   Date: Wed, 20 Oct 2010 08:00:00 -0700 {5}

   Reminder of the staff meeting today in the small {6}
   auditorium. Come early! {7}
Second:
第二:
   From: president@example.com {1}
   From: vice-president@example.com {2}
   To: jqpublic@example.com {3}
   Subject: A note from the E-Team {4}
   Date: Wed, 20 Oct 2010 08:00:00 -0700 {5}

   This memo is to remind you of the corporate dress {6}
   code. Attached you will find an updated copy of {7}
   the policy. {8}
   ...
As with the first example, there is a violation in terms of the number of instances of the From field. A likely safe interpretation would be to combine these into a comma-separated address list in a single From field:
与第一个示例一样,在From字段的实例数方面存在冲突。一种可能安全的解释是,将这些内容合并到单个“发件人”字段中的逗号分隔地址列表中:
   From: president@example.com, {1}
    vice-president@example.com {2}
   To: jqpublic@example.com {3}
   Subject: A note from the E-Team {4}
   Date: Wed, 20 Oct 2010 08:00:00 -0700 {5}

   This memo is to remind you of the corporate dress {6}
   code. Attached you will find an updated copy of {7}
   the policy. {8}
   ...
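The two interpretations shown in these examples (concatenating distinct Subject values, and joining From values into a comma-separated list) can be sketched together. This is an illustration under assumptions: header fields are modeled as (name, value) pairs, the merged field stays at the position of its first instance, and the name `combine_repeats` is hypothetical.

```python
def combine_repeats(fields):
    """Merge repeated Subject and From fields into one field each."""
    joiners = {"subject": " ", "from": ", "}
    position = {}   # lowercased name -> index of its merged entry in out
    out = []        # list of (name, [values])
    for name, value in fields:
        key = name.lower()
        if key in joiners and key in position:
            values = out[position[key]][1]
            # Identical repeats add nothing, so keep only distinct values.
            if value not in values:
                values.append(value)
        else:
            if key in joiners:
                position[key] = len(out)
            out.append((name, [value]))
    return [(n, joiners.get(n.lower(), " ").join(v)) for n, v in out]
```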
Similar to the previous section, there are messages seen in the wild that lack certain required header fields. In particular, [MAIL] requires that a From and Date field be present in all messages.
与上一节类似,在野外看到的消息缺少某些必需的头字段。特别是,[MAIL]要求所有邮件中都有“发件人”和“日期”字段。
When presented with a message lacking these fields, the MTA might perform one of the following:
当显示缺少这些字段的邮件时,MTA可能会执行以下操作之一:
1. Make no changes.
1. 不要做任何改变。
2. Add an instance of the missing field(s) using synthesized content based on data provided in other parts of the protocol.
2. 使用基于协议其他部分中提供的数据的合成内容添加缺少字段的实例。
Option 2 is recommended for handling this case. Handling agents should add these for internal handling if they are missing, but should not add them to the external representation. The reason for this advice is that there are some filter modules that would consider the absence of such fields to be a condition warranting special treatment (for example, rejection), and thus the effectiveness of such modules would be stymied by an upstream filter adding them in a way visible to other components.
建议使用选项2来处理这种情况。如果缺少这些代理,则处理代理应将其添加到内部处理中,但不应将其添加到外部表示中。这一建议的原因是,有一些滤波器模块会考虑到没有这样的字段是一个条件,特别处理(例如,拒绝),因此这样的模块的有效性会被上游滤波器添加到其他组件可见的方式。
The synthesized fields should contain a best guess as to what should have been there; for From, the SMTP MAIL command's address can be used (if not null) or a placeholder address followed by an address literal (for example, unknown@[192.0.2.1]); for Date, a date extracted from a Received field is a reasonable choice.
合成字段应该包含关于应该存在什么的最佳猜测;对于发件人,可以使用SMTP邮件命令的地址(如果不为空)或占位符地址,后跟地址文字(例如,unknown@[192.0.2.1]);对于日期,从接收字段中提取的日期是一个合理的选择。
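A rough sketch of this synthesis follows. The function names, and the convention that an empty string stands for the null reverse-path, are assumptions made for the example; `parsedate_to_datetime` is from the Python standard library and is used here only to check that the date portion of a Received field parses.

```python
from email.utils import parsedate_to_datetime

def synthesize_from(mail_from, client_ip):
    """Best guess for a missing From field.

    mail_from is the address given in the SMTP MAIL command, with ""
    standing for the null reverse-path (an assumption of this sketch).
    """
    if mail_from:
        return mail_from
    # Placeholder local part followed by an address literal.
    return "unknown@[{}]".format(client_ip)

def date_from_received(received_value):
    """Return the date portion of a Received field value, or None."""
    _, sep, tail = received_value.rpartition(";")
    if not sep:
        return None
    candidate = tail.strip()
    try:
        parsedate_to_datetime(candidate)  # validation only
    except (TypeError, ValueError):
        return None
    return candidate
```

In keeping with the advice above, values produced this way would be attached to the internal representation only.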
One other important case to consider is a missing Message-ID field. An MTA that encounters a message missing this field should synthesize a valid one and add it to the external representation, since many deployed tools commonly use the content of that field as a unique message reference, so its absence inhibits correlation of message processing. Section 3.6.4 of [MAIL] describes advisable practice for synthesizing the content of this field when it is absent, and establishes a requirement that it be globally unique.
另一个要考虑的重要情况是缺少消息ID字段。遇到缺少此字段的消息的MTA应合成一个有效的字段并将其添加到外部表示,因为许多已部署的工具通常将该字段的内容用作唯一的消息引用,因此缺少该字段会抑制消息处理的关联。[MAIL]第3.6.4节描述了在该字段不存在时合成该字段内容的可取做法,并规定了该字段必须是全局唯一的要求。
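As an illustration, Python's standard library offers `email.utils.make_msgid`, which generates an identifier from the current time, the process ID, and a random component. The wrapper `ensure_message_id` and the (name, value) header model are assumptions of this sketch, and whether the generated value is unique enough for a given deployment is left to the operator.

```python
from email.utils import make_msgid

def ensure_message_id(fields, domain):
    """Append a synthesized Message-ID if the header has none."""
    if any(name.lower() == "message-id" for name, _ in fields):
        return fields
    # domain should be a name under the control of the adding MTA.
    return fields + [("Message-ID", make_msgid(domain=domain))]
```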
While legitimate messages can contain more than one Return-Path header field, such usage is often an error rather than a valid message containing multiple header field blocks as described in Section 3.6 of [MAIL]. Accordingly, when a message containing multiple Return-Path header fields is encountered, all but the topmost one are to be disregarded, as it is most likely to have been added nearest to the mailbox that received that message.
虽然合法消息可以包含多个返回路径头字段,但这种用法通常是错误的,而不是像[邮件]第3.6节所述的包含多个头字段块的有效消息。因此,当遇到包含多个返回路径头字段的邮件时,除了最上面的一个字段外,其他所有字段都将被忽略,因为它很可能是最近添加到接收该邮件的邮箱的。
MIME provides the means to include textual material employing character sets ("charsets") other than US-ASCII. Such material is required to have an identified charset. Charset identification is done using a "charset" parameter in the Content-Type header field, a charset label within the MIME entity itself, or the charset can be implicitly specified by the Content-Type (see [CHARSET]).
MIME提供了包括使用字符集(“字符集”)而非US-ASCII的文本材料的方法。要求此类材料具有识别的字符集。字符集识别是使用内容类型标头字段中的“charset”参数、MIME实体本身中的字符集标签或可由内容类型隐式指定的字符集完成(请参见[charset])。
Unfortunately, it is fairly common for required character set information to be missing or incorrect in textual MIME entities. As such, processing agents should perform basic sanity checks, such as:
不幸的是,在文本MIME实体中,所需的字符集信息丢失或不正确是相当常见的。因此,加工剂应执行基本的卫生检查,例如:
o US-ASCII contains bytes between 1 and 127 inclusive only (colloquially, "7-bit" data), so material including bytes outside of that range ("8-bit" data) is necessarily not US-ASCII. (See Section 2.1 of [MAIL].)
o US-ASCII仅包含1到127之间的字节(通俗地说是“7位”数据),因此包含超出该范围的字节(“8位”数据)的材料不一定是US-ASCII。(见[邮件]第2.1节)
o [UTF-8] has a very specific syntactic structure that other 8-bit charsets are unlikely to follow.
o [UTF-8]具有其他8位字符集不可能遵循的非常特定的语法结构。
o Null bytes (ASCII 0x00) are not allowed in either 7-bit or 8-bit data.
o 7位或8位数据中都不允许使用空字节(ASCII 0x00)。
o Not all 7-bit material is US-ASCII. The presence of the various escape sequences used for character switching can be used as an indication of the various charsets based on ISO/IEC 2022 [ISO-2022], such as those defined in [ISO-2022-CN], [ISO-2022-JP], and [ISO-2022-KR].
o 并非所有7位材料都是US-ASCII。用于字符切换的各种转义序列的存在可用作基于ISO/IEC 2022[ISO-2022]的各种字符集的指示,例如[ISO-2022-CN]、[ISO-2022-JP]和[ISO-2022-KR]中定义的字符集。
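The checks in the list above can be combined into a small classifier. This is a sketch only: the function name and the returned labels are hypothetical, the ESC byte (0x1B) is used as a coarse hint for ISO/IEC 2022 escape sequences, and a strict UTF-8 decode stands in for the structural check.

```python
def classify_text_bytes(data):
    """Return a rough label for the byte content of a textual entity."""
    if b"\x00" in data:
        return "has-null"            # not allowed in 7-bit or 8-bit data
    if all(b < 0x80 for b in data):
        # 7-bit material: an ESC byte hints at an ISO/IEC 2022 charset.
        return "iso-2022?" if b"\x1b" in data else "us-ascii"
    try:
        data.decode("utf-8")         # strict decode checks UTF-8 structure
    except UnicodeDecodeError:
        return "unknown-8bit"
    return "utf-8"
```

A label that disagrees with the declared charset is a cue for the heuristics or refusal described next, not a verdict in itself.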
When a character set error is detected, processing agents should:
当检测到字符集错误时,处理代理应:
1. apply heuristics to determine the most likely character set and, if successful, proceed using that information; or
1. 应用启发式来确定最可能的字符集,如果成功,则继续使用该信息;或
2. refuse to process the malformed MIME entity.
2. 拒绝处理格式错误的MIME实体。
A null byte inside a textual MIME entity can cause typical string processing functions to misidentify the end of a string, which can be exploited to hide malicious content from analysis processes. Accordingly, null bytes require additional special handling.
文本MIME实体中的空字节可能会导致典型的字符串处理函数错误识别字符串的结尾,这可能被利用来隐藏分析过程中的恶意内容。因此,空字节需要额外的特殊处理。
A few null bytes in isolation are likely to be the result of poor message construction practices. Such nulls should be silently dropped.
孤立的几个空字节可能是消息构造实践不佳的结果。这样的空值应该被悄悄地删除。
Large numbers of null bytes are usually the result of binary material that is improperly encoded, improperly labeled, or both. Such material is likely to be damaged beyond the hope of recovery, so the best course of action is to refuse to process it.
大量空字节通常是由于二进制材料编码不当、标记不当或两者兼而有之。这些材料很可能被损坏,无法恢复,因此最好的做法是拒绝处理。
Finally, the presence of null bytes may be used as an indication of possible malicious intent.
最后,空字节的存在可能被用作表示可能的恶意意图。
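The null-byte advice could be sketched as below. The text does not say where "a few" ends and "large numbers" begins, so the `max_isolated` threshold is purely an assumption to be tuned by the operator; the function name is likewise hypothetical.

```python
def handle_nulls(data, max_isolated=8):
    """Drop a few stray nulls; refuse material that contains many.

    Returns (cleaned_bytes, saw_nulls). The saw_nulls flag lets a caller
    treat the message with extra suspicion.
    """
    count = data.count(b"\x00")
    if count == 0:
        return data, False
    if count > max_isolated:
        # Probably mislabeled or improperly encoded binary material.
        raise ValueError("too many null bytes; refusing to process")
    return data.replace(b"\x00", b""), True
```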
Standards-compliant email messages do not contain any non-ASCII data without indicating that such content is present by means of published SMTP extensions. Absent that, MIME encodings are typically used to convert non-ASCII data to ASCII in a way that can be reversed by other handling agents or end users.
符合标准的电子邮件消息不包含任何非ASCII数据,除非通过已发布的SMTP扩展名表明存在此类内容。如果没有,MIME编码通常用于将非ASCII数据转换为ASCII,这种转换方式可以被其他处理代理或最终用户逆转。
The best way to handle non-compliant 8-bit material depends on its location.
处理不符合要求的8位材料的最佳方法取决于其位置。
Non-compliant 8-bit material in MIME entity content should simply be processed as if the necessary SMTP extensions had been used to transfer the message. Note that improperly labeled 8-bit material in textual MIME entities may require treatment as described in Section 7.6.
MIME实体内容中不兼容的8位材料应简单地进行处理,就像使用了必要的SMTP扩展来传输邮件一样。请注意,文本MIME实体中未正确标记的8位材料可能需要按照第7.6节所述进行处理。
Non-compliant 8-bit material in message or MIME entity header fields can be handled as follows:
消息或MIME实体标题字段中不符合要求的8位材料可按如下方式处理:
1. Occurrences in unstructured text fields, comments, and phrases can be converted into encoded-words (see [MIME3]) if a likely character set can be determined. Alternatively, 8-bit characters can be removed or replaced with some other character.
1. 非结构化文本字段、注释和短语中出现的内容可以转换为编码字(如果可以确定可能的字符集,请参见[MIME3])。或者,可以删除8位字符或用其他字符替换。
2. Occurrences in header fields whose syntax is unknown may be handled by dropping the field entirely or by removing/replacing the 8-bit character as described above.
2. 语法未知的标题字段中出现的情况可通过完全删除该字段或删除/替换上述8位字符来处理。
3. Occurrences in addresses are especially problematic. Agents supporting [EAI] may, if the 8-bit material conforms to 8-bit syntax, elect to treat the message as an EAI message and process it accordingly. Otherwise, in most cases, it is best to exclude the address from any sort of processing -- which may mean dropping it entirely -- since any attempt to fix it definitively is unlikely to be successful.
3. 地址中出现的问题尤其严重。如果8位材料符合8位语法,则支持[EAI]的代理可以选择将消息视为EAI消息并相应地进行处理。否则,在大多数情况下,最好从任何类型的处理中排除该地址——这可能意味着完全删除它——因为任何试图最终修复它的尝试都不可能成功。
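For the unstructured-text case in item 1, a sketch using the standard library's `email.header.Header` is shown below. The candidate charset list, the "?" replacement character, and the function name are assumptions of this example; determining a likely character set in practice usually needs more than a trial decode.

```python
from email.header import Header

def encode_unstructured(raw, candidates=("utf-8",)):
    """Return an ASCII-only form of an unstructured header value."""
    if all(b < 0x80 for b in raw):
        return raw.decode("ascii")
    for charset in candidates:
        try:
            text = raw.decode(charset)
        except UnicodeDecodeError:
            continue
        # Emit the value as one or more encoded-words.
        return Header(text, charset).encode()
    # No likely charset found: replace each 8-bit byte with "?".
    return bytes(b if b < 0x80 else 0x3F for b in raw).decode("ascii")
```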
The five-part set of MIME specifications includes a mechanism of message extensions for providing text in character sets other than ASCII, non-text attachments to messages, multipart message bodies, and similar facilities.
MIME规范由五部分组成,其中包括一种消息扩展机制,用于以字符集(ASCII除外)提供文本、消息的非文本附件、多部分消息正文和类似功能。
Some anomalies with MIME-compliant generation are also common. This section discusses some of those and presents preferred methods of mitigation.
MIME兼容生成的一些异常情况也很常见。本节讨论了其中一些问题,并介绍了首选的缓解方法。
Any message that uses [MIME] constructs is required to have a MIME-Version header field. Without it, the Content-Type and associated fields have no semantic meaning.
任何使用[MIME]结构的消息都需要有MIME版本头字段。没有它,内容类型和关联字段就没有语义意义。
It is often observed that a message has complete MIME structure, yet lacks this header field. It is prudent to disregard this absence and conduct analysis of the message as if it were present, especially by agents attempting to identify malicious material.
通常可以看到,消息具有完整的MIME结构,但缺少此头字段。谨慎的做法是,忽略此缺失,并像消息存在一样对其进行分析,特别是由试图识别恶意材料的代理进行分析。
Further, the absence of MIME-Version might be an indication of malicious intent, and extra scrutiny of the message may be warranted. Such omissions are not expected from compliant message generators.
此外,缺少MIME版本可能表明存在恶意意图,可能需要对消息进行额外检查。合规消息生成器预计不会出现此类遗漏。
There have been a few different specifications of base64 in the past. The implementation defined in [MIME] instructs decoders to discard characters that are not part of the base64 alphabet. Other implementations consider an encoded body containing such characters to be completely invalid. Very early specifications of base64 (see [PEM89], for example, which was later obsoleted by [PEM93]) allowed email-style comments within base64-encoded data.
过去base64有几种不同的规格。[MIME]中定义的实现指示解码器丢弃不属于base64字母表的字符。其他实现考虑包含这样的字符的编码体是完全无效的。base64的早期规范(例如,参见[PEM89],后来被[PEM93]淘汰)允许在base64编码数据中使用电子邮件样式的注释。
The attack vector here involves constructing a base64 body whose meaning varies given different possible decodings. If a security analysis module wishes to be thorough, it should consider scanning the possible outputs of the known decoding dialects in an attempt to anticipate how the MUA will interpret the data.
这里的攻击向量包括构造一个Base64体,其含义因不同的可能的解码而变化。如果安全分析模块希望是彻底的,它应该考虑扫描已知解码方言的可能输出,以预期MUA将如何解释数据。
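To illustrate how decoding dialects can diverge, the sketch below decodes one body two ways with Python's `base64` module: the default mode discards characters outside the base64 alphabet, while `validate=True` rejects them. The function name and the two dialect labels are assumptions; a thorough scanner might cover further dialects, such as the comment-tolerant one.

```python
import base64
import binascii

def decode_dialects(body):
    """Decode a base64 body under a lenient and a strict reading."""
    results = {}
    try:
        # Lenient: characters outside the alphabet are silently discarded.
        results["lenient"] = base64.b64decode(body)
    except binascii.Error:
        results["lenient"] = None
    try:
        # Strict: strip line breaks and whitespace, then reject anything
        # that is not part of the base64 alphabet.
        compact = b"".join(body.split())
        results["strict"] = base64.b64decode(compact, validate=True)
    except binascii.Error:
        results["strict"] = None
    return results
```

When the two results differ, each distinct output is a candidate for scanning.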
A message containing a line of content that exceeds 998 characters plus the line terminator (1000 total) violates Section 2.1.1 of [MAIL]. Some handling agents may not look at content in a single line past the first 998 bytes, providing bad actors an opportunity to hide malicious content.
包含超过998个字符的内容行加上行结束符(总共1000个)的邮件违反了[MAIL]第2.1.1节。一些处理代理可能无法查看超过前998字节的一行内容,从而为坏角色提供了隐藏恶意内容的机会。
There is no specified way to handle such messages, other than to observe that they are non-compliant and reject them or rewrite the oversized line such that the message is compliant.
没有指定的方法来处理此类消息,除了观察它们不符合要求并拒绝它们或重写过大的行以使消息符合要求之外。
To ensure long lines do not prevent analysis of potentially malicious data, handling agents are strongly encouraged to take one of the following actions:
为确保长线不会阻止对潜在恶意数据的分析,强烈建议处理代理采取以下行动之一:
1. Break such lines into multiple lines at a position that does not change the semantics of the text being thus altered. For example, break an oversized line at a position such that a [URI] does not span two lines (which could inhibit the proper identification of the URI).
1. 在不改变被修改文本语义的位置将这些行分成多行。例如,在[URI]不跨越两行的位置断开一个过大的行(这可能会抑制URI的正确标识)。
2. Rewrite the MIME part (or the entire message if not MIME) that contains the excessively long line using a content encoding that breaks the line in the transmission but would still result in the line being intact on decoding for presentation to the user. Both of the encodings declared in [MIME] can accomplish this.
2. 使用内容编码重写包含过长行的MIME部分(如果不是MIME,则重写整个消息),该内容编码会在传输过程中中断该行,但在解码时仍会导致该行完好无损地呈现给用户。[MIME]中声明的两种编码都可以实现这一点。
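Option 2 can be sketched with the standard library's `quopri` module, which inserts soft line breaks that disappear again on decoding. The function names are hypothetical, and a complete rewrite would also have to update the part's Content-Transfer-Encoding header field, which this sketch leaves out.

```python
import quopri

MAX_LINE = 998  # limit from Section 2.1.1 of [MAIL], excluding CRLF

def has_long_line(data):
    """True if any CRLF-delimited line exceeds the 998-character limit."""
    return any(len(line) > MAX_LINE for line in data.split(b"\r\n"))

def reencode_long_lines(data):
    """Return quoted-printable content if data has an oversized line."""
    if not has_long_line(data):
        return data
    return quopri.encodestring(data)
```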
The discussions of the anomalies above and their prescribed solutions are themselves security considerations. The practices enumerated in this document are generally perceived as attempts to resolve security considerations that already exist rather than introducing new ones. However, some of the attacks described here may not have appeared in previous email specifications.
上述异常及其规定解决方案的讨论本身就是安全考虑。本文件中列举的做法通常被视为试图解决已经存在的安全问题,而不是引入新的安全问题。然而,这里描述的一些攻击可能没有出现在以前的电子邮件规范中。
[EMAIL-ARCH] Crocker, D., "Internet Mail Architecture", RFC 5598, July 2009.
[EMAIL-ARCH]Crocker,D.,“互联网邮件体系结构”,RFC 55982009年7月。
[MAIL] Resnick, P., "Internet Message Format", RFC 5322, October 2008.
[邮件]Resnick,P.,“互联网信息格式”,RFC5322,2008年10月。
[MIME] Freed, N. and N. Borenstein, "Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies", RFC 2045, November 1996.
[MIME]Freed,N.和N.Borenstein,“多用途Internet邮件扩展(MIME)第一部分:Internet邮件正文格式”,RFC 20451996年11月。
[BINARYSMTP] Vaudreuil, G., "SMTP Service Extensions for Transmission of Large and Binary MIME Messages", RFC 3030, December 2000.
[BinaryStp]Vaudreuil,G.,“用于传输大型和二进制MIME消息的SMTP服务扩展”,RFC 3030,2000年12月。
[CHARSET] Melnikov, A. and J. Reschke, "Update to MIME regarding "charset" Parameter Handling in Textual Media Types", RFC 6657, July 2012.
[CHARSET]Melnikov,A.和J.Reschke,“关于文本媒体类型中“CHARSET”参数处理的MIME更新”,RFC 6657,2012年7月。
[DKIM] Crocker, D., Ed., Hansen, T., Ed., and M. Kucherawy, Ed., "DomainKeys Identified Mail (DKIM) Signatures", RFC 6376, September 2011.
[DKIM]Crocker,D.,Ed.,Hansen,T.,Ed.,和M.Kucherawy,Ed.,“域密钥识别邮件(DKIM)签名”,RFC 63762011年9月。
[DSN] Moore, K. and G. Vaudreuil, "An Extensible Message Format for Delivery Status Notifications", RFC 3464, January 2003.
[DSN]Moore,K.和G.Vaudreuil,“交付状态通知的可扩展消息格式”,RFC 3464,2003年1月。
[EAI] Yang, A., Steele, S., and N. Freed, "Internationalized Email Headers", RFC 6532, February 2012.
[EAI]Yang,A.,Steele,S.,和N.Freed,“国际化电子邮件标题”,RFC 65322012年2月。
[ISO-2022-CN] Zhu, HF., Hu, DY., Wang, ZG., Kao, TC., Chang, WCH., and M. Crispin, "Chinese Character Encoding for Internet Messages", RFC 1922, March 1996.
[ISO-2022-CN]朱晓峰,胡德勇,王志刚,高,TC.,张,WCH.,和M.Crispin,“互联网信息的汉字编码”,RFC 1922,1996年3月。
[ISO-2022-JP] Murai, J., Crispin, M., and E. van der Poel, "Japanese Character Encoding for Internet Messages", RFC 1468, June 1993.
[ISO-2022-JP]Murai,J.,Crispin,M.和E.van der Poel,“互联网信息的日语字符编码”,RFC 14681993年6月。
[ISO-2022-KR] Choi, U., Chon, K., and H. Park, "Korean Character Encoding for Internet Messages", RFC 1557, December 1993.
[ISO-2022-KR]Choi,U.,Chon,K.,和H.Park,“互联网信息的韩文字符编码”,RFC 1557,1993年12月。
[ISO-2022] ISO/IEC, "Information technology -- Character code structure and extension techniques", ISO/IEC 2022, 1994, <http://www.iso.org/iso/catalogue_detail.htm?csnumber=22747>.
[ISO-2022]ISO/IEC,“信息技术——字符代码结构和扩展技术”,ISO/IEC 2022,1994,<http://www.iso.org/iso/catalogue_detail.htm?csnumber=22747>。
[MIME3] Moore, K., "MIME (Multipurpose Internet Mail Extensions) Part Three: Message Header Extensions for Non-ASCII Text", RFC 2047, November 1996.
[MIME3]Moore,K.,“MIME(多用途互联网邮件扩展)第三部分:非ASCII文本的消息头扩展”,RFC 2047,1996年11月。
[PEM89] Linn, J., "Privacy Enhancement for Internet Electronic Mail: Part I -- Message Encipherment and Authentication Procedures", RFC 1113, August 1989.
[PEM89]Linn,J.,“互联网电子邮件的隐私增强:第一部分——信息加密和认证程序”,RFC 1113,1989年8月。
[PEM93] Linn, J., "Privacy Enhancement for Internet Electronic Mail: Part I: Message Encryption and Authentication Procedures", RFC 1421, February 1993.
[PEM93]Linn,J.,“互联网电子邮件的隐私增强:第一部分:信息加密和认证程序”,RFC 14211993年2月。
[RFC1122] Braden, R., Ed., "Requirements for Internet Hosts -- Communication Layers", RFC 1122, October 1989.
[RFC1122]Braden,R.,Ed.“互联网主机的要求——通信层”,RFC11221989年10月。
[RFC2822] Resnick, P., Ed., "Internet Message Format", RFC 2822, April 2001.
[RFC2822]Resnick,P.,Ed.,“互联网信息格式”,RFC 2822,2001年4月。
[RFC733] Crocker, D., Vittal, J., Pogran, K., and D. Henderson, Jr., "Standard for the Format of Internet Text Messages", RFC 733, November 1977.
[RFC733]Crocker,D.,Vittal,J.,Pogran,K.,和D.Henderson,Jr.,“互联网短信格式标准”,RFC 733,1977年11月。
[SMTP] Klensin, J., "Simple Mail Transfer Protocol", RFC 5321, October 2008.
[SMTP]Klensin,J.,“简单邮件传输协议”,RFC 53212008年10月。
[URI] Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform Resource Identifier (URI): Generic Syntax", RFC 3986, January 2005.
[URI]Berners Lee,T.,Fielding,R.,和L.Masinter,“统一资源标识符(URI):通用语法”,RFC 3986,2005年1月。
[UTF-8] Yergeau, F., "UTF-8, a transformation format of ISO 10646", RFC 3629, November 2003.
[UTF-8]Yergeau,F.,“UTF-8,ISO 10646的转换格式”,RFC 36292003。
The authors wish to acknowledge the following for their review and constructive criticism of this proposal: Dave Cridland, Dave Crocker, Jim Galvin, Tony Hansen, John Levine, Franck Martin, Alexey Melnikov, and Timo Sirainen.
作者希望感谢以下对本提案的审查和建设性批评:戴夫·克里德兰、戴夫·克罗克、吉姆·加尔文、托尼·汉森、约翰·莱文、弗兰克·马丁、阿列克谢·梅尔尼科夫和蒂莫·西莱宁。
Authors' Addresses
作者地址
Murray S. Kucherawy
默里·S·库切拉维
EMail: superuser@gmail.com
Gregory N. Shapiro
格雷戈里·N·夏皮罗
EMail: gshapiro@proofpoint.com
Ned Freed
内德·费利德
EMail: ned.freed@mrochek.com