Network Working Group                                    K. Ramakrishnan
Request for Comments: 2481                            AT&T Labs Research
Category: Experimental                                          S. Floyd
                                                                    LBNL
                                                            January 1999
A Proposal to add Explicit Congestion Notification (ECN) to IP
向IP添加显式拥塞通知(ECN)的建议
Status of this Memo
本备忘录的状况
This memo defines an Experimental Protocol for the Internet community. It does not specify an Internet standard of any kind. Discussion and suggestions for improvement are requested. Distribution of this memo is unlimited.
这份备忘录为互联网社区定义了一个实验性协议。它没有规定任何类型的互联网标准。要求进行讨论并提出改进建议。本备忘录的分发不受限制。
Copyright Notice
版权公告
Copyright (C) The Internet Society (1999). All Rights Reserved.
版权所有(C)互联网协会(1999年)。版权所有。
Abstract
摘要
This note describes a proposed addition of ECN (Explicit Congestion Notification) to IP. TCP is currently the dominant transport protocol used in the Internet. We begin by describing TCP's use of packet drops as an indication of congestion. Next we argue that with the addition of active queue management (e.g., RED) to the Internet infrastructure, where routers detect congestion before the queue overflows, routers are no longer limited to packet drops as an indication of congestion. Routers could instead set a Congestion Experienced (CE) bit in the packet header of packets from ECN-capable transport protocols. We describe when the CE bit would be set in the routers, and describe what modifications would be needed to TCP to make it ECN-capable. Modifications to other transport protocols (e.g., unreliable unicast or multicast, reliable multicast, other reliable unicast transport protocols) could be considered as those protocols are developed and advance through the standards process.
本说明描述了向IP添加ECN(显式拥塞通知)的建议。TCP是目前互联网上使用的主要传输协议。我们首先描述TCP使用丢包作为拥塞指示。接下来,我们认为,随着主动队列管理(如RED)添加到Internet基础设施中,路由器在队列溢出之前检测到拥塞,路由器不再局限于作为拥塞指示的数据包丢弃。路由器可以在来自支持ECN的传输协议的数据包的数据包报头中设置拥塞体验(CE)位。我们描述了何时在路由器中设置CE位,并描述了需要对TCP进行哪些修改以使其具有ECN功能。可以考虑对其他传输协议(例如,不可靠的单播或多播、可靠的多播、其他可靠的单播传输协议)的修改,因为这些协议是通过标准过程开发和推进的。
The keywords MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD, SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL, when they appear in this document, are to be interpreted as described in [B97].
本文件中出现的关键词必须、不得、必需、应、不应、应、不应、建议、可能和可选时,应按照[B97]中的说明进行解释。
TCP's congestion control and avoidance algorithms are based on the notion that the network is a black-box [Jacobson88, Jacobson90]. The network's state of congestion or otherwise is determined by end-systems probing for the network state, by gradually increasing the load on the network (by increasing the window of packets that are outstanding in the network) until the network becomes congested and a packet is lost. Treating the network as a "black-box" and treating loss as an indication of congestion in the network is appropriate for pure best-effort data carried by TCP which has little or no sensitivity to delay or loss of individual packets. In addition, TCP's congestion management algorithms have techniques built-in (such as Fast Retransmit and Fast Recovery) to minimize the impact of losses from a throughput perspective.
TCP的拥塞控制和避免算法基于网络是一个黑箱的概念[Jacobson88,Jacobson90]。网络的拥塞状态或其他状态由终端系统通过逐渐增加网络上的负载(通过增加网络中未完成的数据包的窗口)来探测网络状态来确定,直到网络变得拥塞并且数据包丢失。将网络视为“黑箱”,并将丢失视为网络拥塞的指示,适用于TCP承载的纯尽力而为数据,该数据对单个数据包的延迟或丢失几乎不敏感或不敏感。此外,TCP的拥塞管理算法具有内置技术(如快速重传和快速恢复),以从吞吐量的角度将丢失的影响降至最低。
However, these mechanisms are not intended to help applications that are in fact sensitive to the delay or loss of one or more individual packets. Interactive traffic such as telnet, web-browsing, and transfer of audio and video data can be sensitive to packet losses (using an unreliable data delivery transport such as UDP) or to the increased latency of the packet caused by the need to retransmit the packet after a loss (for reliable data delivery such as TCP).
然而,这些机制并不旨在帮助那些实际上对一个或多个单独数据包的延迟或丢失敏感的应用程序。交互式流量(如telnet、web浏览以及音频和视频数据的传输)可能对数据包丢失(使用不可靠的数据传输传输,如UDP)或由于丢失后需要重新传输数据包(用于可靠的数据传输,如TCP)而导致的数据包延迟增加敏感。
Since TCP determines the appropriate congestion window to use by gradually increasing the window size until it experiences a dropped packet, this causes the queues at the bottleneck router to build up. With most packet drop policies at the router that are not sensitive to the load placed by each individual flow, this means that some of the packets of latency-sensitive flows are going to be dropped. Active queue management mechanisms detect congestion before the queue overflows, and provide an indication of this congestion to the end nodes. The advantages of active queue management are discussed in RFC 2309 [RFC2309]. Active queue management avoids some of the bad properties of dropping on queue overflow, including the undesirable synchronization of loss across multiple flows. More importantly, active queue management means that transport protocols with congestion control (e.g., TCP) do not have to rely on buffer overflow as the only indication of congestion. This can reduce unnecessary queueing delay for all traffic sharing that queue.
由于TCP通过逐渐增加窗口大小直到遇到丢弃的数据包来确定要使用的适当拥塞窗口,这会导致瓶颈路由器上的队列增加。由于路由器上的大多数数据包丢弃策略对每个流放置的负载不敏感,这意味着延迟敏感流的一些数据包将被丢弃。主动队列管理机制在队列溢出之前检测拥塞,并向终端节点提供该拥塞的指示。RFC2309中讨论了主动队列管理的优点。主动队列管理避免了队列溢出时丢弃的一些不良属性,包括跨多个流的丢失同步。更重要的是,主动队列管理意味着具有拥塞控制(例如TCP)的传输协议不必依赖缓冲区溢出作为拥塞的唯一指示。这可以减少共享该队列的所有流量的不必要排队延迟。
Active queue management mechanisms may use one of several methods for indicating congestion to end-nodes. One is to use packet drops, as is currently done. However, active queue management allows the router to separate policies of queueing or dropping packets from the policies for indicating congestion. Thus, active queue management allows routers to use the Congestion Experienced (CE) bit in a packet header as an indication of congestion, instead of relying solely on packet drops.
主动队列管理机制可以使用几种方法中的一种来向终端节点指示拥塞。一种是使用数据包丢弃,就像目前所做的那样。然而,主动队列管理允许路由器将排队或丢弃数据包的策略与指示拥塞的策略分开。因此,主动队列管理允许路由器使用数据包报头中的拥塞经历(CE)位作为拥塞指示,而不是仅仅依赖数据包丢弃。
In this section, we describe some of the important design principles and assumptions that guided the design choices in this proposal.
在本节中,我们将介绍一些重要的设计原则和假设,这些原则和假设指导了本建议书中的设计选择。
(1) Congestion may persist over different time-scales. The time scales that we are concerned with are congestion events that may last longer than a round-trip time.
(2) The number of packets in an individual flow (e.g., TCP connection or an exchange using UDP) may range from a small number of packets to quite a large number. We are interested in managing the congestion caused by flows that send enough packets so that they are still active when network feedback reaches them.
(3) New mechanisms for congestion control and avoidance need to co-exist and cooperate with existing mechanisms for congestion control. In particular, new mechanisms have to co-exist with TCP's current methods of adapting to congestion and with routers' current practice of dropping packets in periods of congestion.
(4) Because ECN is likely to be adopted gradually, accommodating migration is essential. Some routers may still only drop packets to indicate congestion, and some end-systems may not be ECN-capable. The most viable strategy is one that accommodates incremental deployment without having to resort to "islands" of ECN-capable and non-ECN-capable environments.
(5) Asymmetric routing is likely to be a normal occurrence in the Internet. The path (sequence of links and routers) followed by data packets may be different from the path followed by the acknowledgment packets in the reverse direction.
(6) Many routers process the "regular" headers in IP packets more efficiently than they process the header information in IP options. This suggests keeping congestion experienced information in the regular headers of an IP packet.
(7) It must be recognized that not all end-systems will cooperate in mechanisms for congestion control. However, new mechanisms shouldn't make it easier for TCP applications to disable TCP congestion control. The benefit of lying about participating in new mechanisms such as ECN-capability should be small.
(1) 拥塞可能会在不同的时间尺度上持续。我们关注的时间尺度是持续时间可能超过一个往返时间的拥塞事件。
(2) 单个流中的数据包数量(例如,TCP连接或使用UDP的交换)可能从少量数据包到相当大的数量不等。我们感兴趣的是管理由发送足够多数据包的流引起的拥塞,即当网络反馈到达时这些流仍然处于活动状态。
(3) 新的拥塞控制和避免机制需要与现有的拥塞控制机制共存和合作。特别是,新机制必须与TCP当前适应拥塞的方法以及路由器当前在拥塞期间丢弃数据包的做法共存。
(4) 由于ECN可能会逐渐被采用,因此适应迁移过程至关重要。一些路由器可能仍然只通过丢弃数据包来指示拥塞,而一些终端系统可能不支持ECN。最可行的策略是适应增量部署的策略,而不必求助于支持ECN和不支持ECN的环境的“孤岛”。
(5) 不对称路由很可能是互联网中的正常现象。数据包所经过的路径(链路和路由器的序列)可能与反向确认包所经过的路径不同。
(6) 许多路由器处理IP数据包中的“常规”报头比处理IP选项中的报头信息更有效。这表明应在IP数据包的常规报头中保存拥塞经历信息。
(7) 必须认识到,并非所有终端系统都会在拥塞控制机制中进行合作。然而,新的机制不应该使TCP应用程序更容易禁用TCP拥塞控制。谎称参与新机制(如ECN能力)的好处应该很小。
Random Early Detection (RED) is a mechanism for active queue management that has been proposed to detect incipient congestion [FJ93], and is currently being deployed in the Internet backbone [RFC2309]. Although RED is meant to be a general mechanism using one of several alternatives for congestion indication, in the current environment of the Internet, RED is restricted to using packet drops as a mechanism for congestion indication. RED drops packets based on the average queue length exceeding a threshold, rather than only when the queue overflows. However, when RED drops packets before the queue actually overflows, RED is not forced by memory limitations to discard the packet.
随机早期检测(RED)是一种主动队列管理机制,已被提出用于检测初期拥塞[FJ93],目前正在互联网主干网中部署[RFC2309]。虽然RED本应是一种通用机制,可以使用几种拥塞指示备选方案中的一种,但在当前的互联网环境中,RED仅限于使用数据包丢弃作为拥塞指示机制。RED根据平均队列长度是否超过阈值来丢弃数据包,而不仅仅是在队列溢出时丢弃。然而,当RED在队列实际溢出之前丢弃数据包时,RED并不是因为内存限制而被迫丢弃该数据包。
RED could set a Congestion Experienced (CE) bit in the packet header instead of dropping the packet, if such a bit was provided in the IP header and understood by the transport protocol. The use of the CE bit would allow the receiver(s) to receive the packet, avoiding the potential for excessive delays due to retransmissions after packet losses. We use the term 'CE packet' to denote a packet that has the CE bit set.
如果IP报头中提供了拥塞体验(CE)位,并且传输协议能够理解,则RED可以在数据包报头中设置拥塞体验(CE)位,而不是丢弃数据包。CE比特的使用将允许接收机接收分组,避免由于分组丢失后的重新传输而导致过度延迟的可能性。我们使用术语“CE数据包”来表示设置了CE位的数据包。
We propose that the Internet provide a congestion indication for incipient congestion (as in RED and earlier work [RJ90]) where the notification can sometimes be through marking packets rather than dropping them. This would require an ECN field in the IP header with two bits. The ECN-Capable Transport (ECT) bit would be set by the data sender to indicate that the end-points of the transport protocol are ECN-capable. The CE bit would be set by the router to indicate congestion to the end nodes. Routers that have a packet arriving at a full queue would drop the packet, just as they do now.
我们建议互联网为初期拥塞提供拥塞指示(如RED和早期工作[RJ90]),其中通知有时可以通过标记数据包而不是丢弃数据包。这需要在IP报头中有一个带两位的ECN字段。数据发送方将设置支持ECN的传输(ECT)位,以指示传输协议的端点支持ECN。路由器将设置CE位,以指示终端节点的拥塞。有一个数据包到达一个完整队列的路由器将丢弃该数据包,就像现在一样。
Bits 6 and 7 in the IPv4 TOS octet are designated as the ECN field. Bit 6 is designated as the ECT bit, and bit 7 is designated as the CE bit. The IPv4 TOS octet corresponds to the Traffic Class octet in IPv6. The definitions for the IPv4 TOS octet [RFC791] and the IPv6 Traffic Class octet are intended to be superseded by the DS (Differentiated Services) Field [DIFFSERV]. Bits 6 and 7 are listed in [DIFFSERV] as Currently Unused. Section 19 gives a brief history of the TOS octet.
IPv4 TOS八位字节中的第6位和第7位被指定为ECN字段。位6指定为ECT位,位7指定为CE位。IPv4 TOS八位字节对应于IPv6中的流量类八位字节。IPv4 TOS八位字节[RFC791]和IPv6流量类八位字节的定义将由DS(区分服务)字段[DIFFSERV]取代。位6和7在[DIFFSERV]中列为当前未使用的。第19节给出了TOS八位字节的简要历史。
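The bit assignment above can be illustrated with a minimal Python sketch. It assumes the usual RFC convention that bit 0 is the most significant bit of the octet, so bit 6 maps to mask 0x02 and bit 7 to mask 0x01. The function names are hypothetical and chosen only for illustration.

```python
# Assumption: bits are numbered from the most significant bit (bit 0)
# to the least significant bit (bit 7), as in RFC header diagrams.
ECT_MASK = 0x02  # bit 6 of the TOS octet: ECN-Capable Transport
CE_MASK = 0x01   # bit 7 of the TOS octet: Congestion Experienced


def parse_ecn_field(tos_octet: int) -> dict:
    """Return the ECT and CE bits of an 8-bit TOS / Traffic Class octet."""
    if not 0 <= tos_octet <= 0xFF:
        raise ValueError("TOS octet must fit in 8 bits")
    return {
        "ect": bool(tos_octet & ECT_MASK),
        "ce": bool(tos_octet & CE_MASK),
    }


def set_ect(tos_octet: int) -> int:
    """Data sender: mark the packet as belonging to an ECN-capable transport."""
    return tos_octet | ECT_MASK
```

The upper six bits of the octet are left untouched, since they are intended for the DS field.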
Because of the unstable history of the TOS octet, the use of the ECN field as specified in this document cannot be guaranteed to be backwards compatible with all past uses of these two bits. The potential dangers of this lack of backwards compatibility are discussed in Section 19.
由于TOS八位字节的历史不稳定,本文件中规定的ECN字段的使用不能保证与这两个位的所有过去使用向后兼容。第19节讨论了缺乏向后兼容性的潜在危险。
Upon the receipt by an ECN-Capable transport of a single CE packet, the congestion control algorithms followed at the end-systems MUST be essentially the same as the congestion control response to a *single* dropped packet. For example, for ECN-Capable TCP the source TCP is required to halve its congestion window for any window of data containing either a packet drop or an ECN indication. However, we would like to point out some notable exceptions in the reaction of the source TCP, related to following the shorter-time-scale details of particular implementations of TCP. For TCP's response to an ECN indication, we do not recommend such behavior as the slow-start of Tahoe TCP in response to a packet drop, or Reno TCP's wait of roughly half a round-trip time during Fast Recovery.
当支持ECN的传输接收到单个CE数据包时,终端系统遵循的拥塞控制算法必须与对*单个*丢弃数据包的拥塞控制响应基本相同。例如,对于支持ECN的TCP,对于任何包含数据包丢弃或ECN指示的数据窗口,源TCP都需要将其拥塞窗口减半。然而,我们想指出源TCP反应中的一些值得注意的例外,这些例外与是否遵循TCP特定实现的较短时间尺度细节有关。对于TCP对ECN指示的响应,我们不建议采用诸如Tahoe TCP在响应数据包丢弃时的慢启动,或Reno TCP在快速恢复期间等待大约半个往返时间之类的行为。
One reason for requiring that the congestion-control response to the CE packet be essentially the same as the response to a dropped packet is to accommodate the incremental deployment of ECN in both end-systems and in routers. Some routers may drop ECN-Capable packets (e.g., using the same RED policies for congestion detection) while other routers set the CE bit, for equivalent levels of congestion. Similarly, a router might drop a non-ECN-Capable packet but set the CE bit in an ECN-Capable packet, for equivalent levels of congestion. Different congestion control responses to a CE bit indication and to a packet drop could result in unfair treatment for different flows.
要求对CE分组的拥塞控制响应基本上与对丢弃分组的响应相同的一个原因是为了适应终端系统和路由器中ECN的增量部署。一些路由器可能丢弃支持ECN的数据包(例如,使用相同的RED策略进行拥塞检测),而其他路由器则设置CE位,以实现同等级别的拥塞。类似地,路由器可能丢弃不支持ECN的数据包,但在支持ECN的数据包中设置CE位,以达到同等的拥塞水平。对CE位指示和分组丢弃的不同拥塞控制响应可能导致对不同流的不公平处理。
An additional requirement is that the end-systems should react to congestion at most once per window of data (i.e., at most once per roundtrip time), to avoid reacting multiple times to multiple indications of congestion within a roundtrip time.
另一项要求是,终端系统应在每个数据窗口最多对拥塞做出一次反应(即,在每次往返时间最多一次),以避免在往返时间内对多个拥塞指示做出多次反应。
For a router, the CE bit of an ECN-Capable packet should only be set if the router would otherwise have dropped the packet as an indication of congestion to the end nodes. When the router's buffer is not yet full and the router is prepared to drop a packet to inform end nodes of incipient congestion, the router should first check to see if the ECT bit is set in that packet's IP header. If so, then instead of dropping the packet, the router MAY set the CE bit in the IP header.
对于路由器,只有当路由器以其他方式丢弃数据包作为对终端节点的拥塞指示时,才应设置支持ECN的数据包的CE位。当路由器的缓冲区尚未满且路由器准备丢弃数据包以通知终端节点初始拥塞时,路由器应首先检查该数据包的IP报头中是否设置了ECT位。如果是这样,那么路由器可以改为在IP报头中设置CE位,而不是丢弃分组。
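The router behaviour described above could be sketched as follows. This is only an illustration under stated assumptions: the `Action` enum, the function name, and the two boolean inputs are hypothetical, and the sketch always marks when marking is permitted, whereas the text only says the router MAY do so.

```python
from enum import Enum

ECT_MASK = 0x02  # bit 6 of the TOS octet (MSB-first numbering assumed)
CE_MASK = 0x01   # bit 7 of the TOS octet


class Action(Enum):
    FORWARD = "forward"
    MARK = "mark"
    DROP = "drop"


def router_decision(tos_octet: int, queue_full: bool, aqm_signals_congestion: bool):
    """Return (action, tos_octet_to_forward) for one arriving packet."""
    if queue_full:
        # No buffer space left: the packet is dropped regardless of ECT.
        return Action.DROP, tos_octet
    if aqm_signals_congestion:
        if tos_octet & ECT_MASK:
            # ECN-capable transport: set CE instead of dropping.
            return Action.MARK, tos_octet | CE_MASK
        # Not ECN-capable: fall back to dropping as the congestion signal.
        return Action.DROP, tos_octet
    # No congestion indication needed; an existing CE bit is never cleared.
    return Action.FORWARD, tos_octet
```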
An environment where all end nodes were ECN-Capable could allow new criteria to be developed for setting the CE bit, and new congestion control mechanisms for end-node reaction to CE packets. However, this is a research issue, and as such is not addressed in this document.
在一个所有终端节点都支持ECN的环境中,可以为设置CE位制定新的标准,并为终端节点对CE数据包的反应制定新的拥塞控制机制。然而,这是一个研究问题,因此,本文件未涉及。
When a CE packet is received by a router, the CE bit is left unchanged, and the packet is transmitted as usual. When severe congestion has occurred and the router's queue is full, then the router has no choice but to drop some packet when a new packet arrives. We anticipate that such packet losses will become relatively infrequent when a majority of end-systems become ECN-Capable and participate in TCP or other compatible congestion control mechanisms. In an adequately-provisioned network in such an ECN-Capable environment, packet losses should occur primarily during transients or in the presence of non-cooperating sources.
当路由器接收到CE数据包时,CE位保持不变,数据包照常传输。当发生严重拥塞且路由器队列已满时,路由器别无选择,只能在新数据包到达时丢弃一些数据包。我们预计,当大多数终端系统具备ECN能力并参与TCP或其他兼容的拥塞控制机制时,此类数据包丢失将相对较少。在这种支持ECN的环境中,在充分配置的网络中,数据包丢失应该主要发生在瞬态期间或存在非合作源的情况下。
We expect that routers will set the CE bit in response to incipient congestion as indicated by the average queue size, using the RED algorithms suggested in [FJ93, RFC2309]. To the best of our knowledge, this is the only proposal currently under discussion in the IETF for routers to drop packets proactively, before the buffer overflows. However, this document does not attempt to specify a particular mechanism for active queue management, leaving that endeavor, if needed, to other areas of the IETF. While ECN is inextricably tied up with active queue management at the router, the reverse does not hold; active queue management mechanisms have been developed and deployed independently from ECN, using packet drops as indications of congestion in the absence of ECN in the IP architecture.
我们预计路由器将使用[FJ93,RFC2309]中建议的RED算法设置CE位,以响应平均队列大小指示的初始拥塞。据我们所知,这是IETF中目前讨论的唯一一个建议,即路由器在缓冲区溢出之前主动丢弃数据包。然而,本文档并不试图为主动队列管理指定特定的机制,如果需要,将此工作留给IETF的其他领域。虽然ECN与路由器上的主动队列管理密不可分,但反过来就不行了;主动队列管理机制已独立于ECN开发和部署,在IP体系结构中使用丢包作为没有ECN的拥塞指示。
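As a rough illustration of detecting incipient congestion from the average queue size, the sketch below keeps an exponentially weighted moving average of the instantaneous queue length and compares it to a threshold. The class name, the default weight, and the default threshold are assumptions made for this example only. The sketch also omits the probabilistic marking that a full RED implementation applies.

```python
class AverageQueueMonitor:
    """Tracks a moving average of the queue length (illustrative only)."""

    def __init__(self, weight: float = 0.002, threshold: float = 5.0) -> None:
        self.weight = weight        # assumed averaging weight
        self.threshold = threshold  # assumed threshold, in packets
        self.avg = 0.0

    def on_arrival(self, queue_len: int) -> bool:
        """Update the average; return True if it exceeds the threshold."""
        self.avg = (1.0 - self.weight) * self.avg + self.weight * queue_len
        return self.avg > self.threshold
```

Because the average moves slowly, a short burst that briefly fills the queue does not by itself trigger a congestion indication. Only a queue that stays long does.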
ECN requires support from the transport protocol, in addition to the functionality given by the ECN field in the IP packet header. First, the transport protocol might require negotiation between the endpoints during setup to determine that all of the endpoints are ECN-capable, so that the sender can set the ECT bit in transmitted packets. Second, the transport protocol must be capable of reacting appropriately to the receipt of CE packets. This reaction could be in the form of the data receiver informing the data sender of the received CE packet (e.g., TCP), of the data receiver unsubscribing to a layered multicast group (e.g., RLM [MJV96]), or of some other action that ultimately reduces the arrival rate of that flow to that receiver.
除了IP数据包头中的ECN字段提供的功能外,ECN还需要传输协议的支持。传输协议可能需要在设置期间在端点之间进行协商,以确定所有端点都支持ECN,以便发送方可以在传输的数据包中设置ECT位。第二,传输协议必须能够对CE数据包的接收做出适当的反应。该反应可以是数据接收器通知数据发送者所接收的CE分组(例如,TCP)、数据接收器取消订阅分层多播组(例如,RLM[MJV96])或最终降低流向该接收器的流的到达率的一些其他动作的形式。
This document only addresses the addition of ECN Capability to TCP, leaving issues of ECN and other transport protocols to further research. For TCP, ECN requires three new mechanisms: negotiation between the endpoints during setup to determine if they are both ECN-capable; an ECN-Echo flag in the TCP header so that the data receiver can inform the data sender when a CE packet has been received; and a Congestion Window Reduced (CWR) flag in the TCP header so that the data sender can inform the data receiver that the congestion window has been reduced. The support required from other transport protocols is likely to be different, particularly for unreliable or reliable multicast transport protocols, and will have to be determined as other transport protocols are brought to the IETF for standardization.
本文档仅讨论TCP中添加ECN功能的问题,将ECN和其他传输协议的问题留待进一步研究。对于TCP,ECN需要三种新机制:在设置过程中端点之间的协商,以确定它们是否都支持ECN;TCP报头中的ECN Echo标志,以便数据接收器可以在接收到CE数据包时通知数据发送者;以及TCP报头中的拥塞窗口缩减(CWR)标志,以便数据发送方可以通知数据接收方拥塞窗口已经缩减。其他传输协议所需的支持可能会有所不同,特别是对于不可靠或可靠的多播传输协议,并且必须在其他传输协议提交给IETF进行标准化时确定。
The following sections describe in detail the proposed use of ECN in TCP. This proposal is described in essentially the same form in [Floyd94]. We assume that the source TCP uses the standard congestion control algorithms of Slow-start, Fast Retransmit and Fast Recovery [RFC2001].
以下各节详细介绍了在TCP中使用ECN的建议。该提案在[Floyd94]中的描述形式基本相同。我们假设源TCP使用慢启动、快速重传和快速恢复的标准拥塞控制算法[RFC2001]。
This proposal specifies two new flags in the Reserved field of the TCP header. The TCP mechanism for negotiating ECN-Capability uses the ECN-Echo flag in the TCP header. (This was called the ECN Notify flag in some earlier documents.) Bit 9 in the Reserved field of the TCP header is designated as the ECN-Echo flag. The location of the 6-bit Reserved field in the TCP header is shown in Figure 3 of RFC 793 [RFC793].
此建议在TCP标头的保留字段中指定两个新标志。协商ECN能力的TCP机制使用TCP报头中的ECN Echo标志。(在以前的一些文档中,这被称为ECN Notify标志。)TCP头的保留字段中的第9位被指定为ECN Echo标志。TCP报头中6位保留字段的位置如RFC 793[RFC793]的图3所示。
To enable the TCP receiver to determine when to stop setting the ECN-Echo flag, we introduce a second new flag in the TCP header, the Congestion Window Reduced (CWR) flag. The CWR flag is assigned to Bit 8 in the Reserved field of the TCP header.
为了使TCP接收器能够确定何时停止设置ECN Echo标志,我们在TCP报头中引入了第二个新标志,即拥塞窗口缩减(CWR)标志。CWR标志分配给TCP标头保留字段中的位8。
The use of these flags is described in the sections below.
以下各节介绍了这些标志的使用。
In the TCP connection setup phase, the source and destination TCPs exchange information about their desire and/or capability to use ECN. Subsequent to the completion of this negotiation, the TCP sender sets the ECT bit in the IP header of data packets to indicate to the network that the transport is capable and willing to participate in ECN for this packet. This will indicate to the routers that they may mark this packet with the CE bit, if they would like to use that as a method of congestion notification. If the TCP connection does not wish to use ECN notification for a particular packet, the sending TCP sets the ECT bit equal to 0 (i.e., not set), and the TCP receiver ignores the CE bit in the received packet.
在TCP连接设置阶段,源和目标TCP交换有关其使用ECN的愿望和/或能力的信息。在完成该协商之后,TCP发送方在数据分组的IP报头中设置ECT位,以向网络指示传输能够并且愿意参与该分组的ECN。这将向路由器指示,如果他们想使用CE位作为拥塞通知的方法,他们可以用CE位标记此数据包。如果TCP连接不希望对特定数据包使用ECN通知,则发送TCP将ECT位设置为0(即未设置),并且TCP接收器忽略接收数据包中的CE位。
When a node sends a TCP SYN packet, it may set the ECN-Echo and CWR flags in the TCP header. For a SYN packet, the setting of both the ECN-Echo and CWR flags is defined as an indication that the sending TCP is ECN-Capable, rather than as an indication of congestion or of response to congestion. More precisely, a SYN packet with both the ECN-Echo and CWR flags set indicates that the TCP implementation transmitting the SYN packet will participate in ECN as both a sender and receiver. As a receiver, it will respond to incoming data packets that have the CE bit set in the IP header by setting the ECN-Echo flag in outgoing TCP Acknowledgement (ACK) packets. As a sender, it will respond to incoming packets that have the ECN-Echo flag set by reducing the congestion window when appropriate.
当节点发送TCP SYN数据包时,它可以在TCP报头中设置ECN Echo和CWR标志。对于SYN数据包,同时设置ECN Echo和CWR标志被定义为发送TCP具有ECN能力的指示,而不是拥塞或拥塞响应的指示。更准确地说,设置了ECN Echo和CWR标志的SYN数据包表示传输SYN数据包的TCP实现将作为发送方和接收方参与ECN。作为接收方,它将通过在传出TCP确认(ACK)数据包中设置ECN Echo标志来响应IP报头中设置了CE位的传入数据包。作为发送方,它将在适当时通过减小拥塞窗口来响应设置了ECN Echo标志的传入数据包。
When a node sends a SYN-ACK packet, it may set the ECN-Echo flag, but it does not set the CWR flag. For a SYN-ACK packet, the pattern of the ECN-Echo flag set and the CWR flag not set in the TCP header is defined as an indication that the TCP transmitting the SYN-ACK packet is ECN-Capable.
当节点发送SYN-ACK数据包时,它可以设置ECN Echo标志,但不设置CWR标志。对于SYN-ACK数据包,在TCP报头中设置的ECN回波标志和未设置的CWR标志的模式被定义为传输SYN-ACK数据包的TCP具有ECN能力的指示。
There is the question of why we chose to have the TCP sending the SYN set two ECN-related flags in the Reserved field of the TCP header for the SYN packet, while the responding TCP sending the SYN-ACK sets only one ECN-related flag in the SYN-ACK packet. This asymmetry is necessary for the robust negotiation of ECN-capability with deployed TCP implementations. There exists at least one TCP implementation in which TCP receivers set the Reserved field of the TCP header in ACK packets (and hence the SYN-ACK) simply to reflect the Reserved field of the TCP header in the received data packet. Because the TCP SYN packet sets the ECN-Echo and CWR flags to indicate ECN-capability, while the SYN-ACK packet sets only the ECN-Echo flag, the sending TCP correctly interprets a receiver's reflection of its own flags in the Reserved field as an indication that the receiver is not ECN-capable.
问题是,为什么我们选择让发送SYN的TCP在SYN数据包的TCP头的保留字段中设置两个ECN相关标志,而发送SYN-ACK的响应TCP在SYN-ACK数据包中仅设置一个ECN相关标志。这种不对称性对于ECN能力与部署的TCP实现的健壮协商是必要的。存在至少一种TCP实现,其中TCP接收器在ACK数据包中设置TCP报头的保留字段(因此SYN-ACK)只是为了反映接收数据包中TCP报头的保留字段。由于TCP SYN数据包设置ECN Echo和CWR标志以指示ECN能力,而SYN-ACK数据包仅设置ECN Echo标志,因此发送TCP正确地将接收机自身标志在保留字段中的反映解释为接收机不具备ECN能力的指示。
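The asymmetric SYN / SYN-ACK pattern can be sketched in Python as below. The masks assume MSB-first bit numbering over the 16-bit word that holds the data offset, the Reserved field, and the control bits in the TCP header. Under that assumption bit 8 (CWR) is 0x0080 and bit 9 (ECN-Echo) is 0x0040. The function names are hypothetical.

```python
# Assumption: bit i (MSB-first) of the 16-bit word maps to mask 1 << (15 - i).
CWR = 1 << (15 - 8)       # 0x0080, bit 8 of the Reserved field
ECN_ECHO = 1 << (15 - 9)  # 0x0040, bit 9 of the Reserved field


def syn_flags_for_ecn() -> int:
    """Flags an ECN-capable initiator sets on its SYN: ECN-Echo and CWR."""
    return ECN_ECHO | CWR


def synack_indicates_ecn(flags: int) -> bool:
    """A SYN-ACK signals ECN capability only with ECN-Echo set and CWR clear."""
    return bool(flags & ECN_ECHO) and not (flags & CWR)


def reflecting_peer(syn_flags: int) -> int:
    """Models a non-ECN peer that simply copies the Reserved field back."""
    return syn_flags
```

With this pattern, `synack_indicates_ecn(reflecting_peer(syn_flags_for_ecn()))` evaluates to false, so a peer that merely reflects the Reserved field is not mistaken for an ECN-capable one.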
For a TCP connection using ECN, data packets are transmitted with the ECT bit set in the IP header (set to a "1"). If the sender receives an ECN-Echo ACK packet (that is, an ACK packet with the ECN-Echo flag set in the TCP header), then the sender knows that congestion was encountered in the network on the path from the sender to the receiver. The indication of congestion should be treated just as a congestion loss in non-ECN-Capable TCP. That is, the TCP source halves the congestion window "cwnd" and reduces the slow start threshold "ssthresh". The sending TCP does NOT increase the congestion window in response to the receipt of an ECN-Echo ACK packet.
对于使用ECN的TCP连接,使用IP报头中设置的ECT位(设置为“1”)传输数据包。如果发送方收到ECN Echo ACK数据包(即,在TCP报头中设置了ECN Echo标志的ACK数据包),则发送方知道在从发送方到接收方的路径上的网络中遇到了拥塞。拥塞指示应被视为不支持ECN的TCP中的拥塞丢失。也就是说,TCP源将拥塞窗口“cwnd”减半,并降低慢启动阈值“ssthresh”。发送TCP不会增加拥塞窗口以响应接收到ECN Echo ACK数据包。
A critical condition is that TCP does not react to congestion indications more than once every window of data (or more loosely, more than once every round-trip time). That is, the TCP sender's congestion window should be reduced only once in response to a series of dropped and/or CE packets from a single window of data. In addition, the TCP source should not decrease the slow-start threshold, ssthresh, if it has been decreased within the last round trip time. However, if any retransmitted packets are dropped or have the CE bit set, then this is interpreted by the source TCP as a new instance of congestion.
一个关键条件是,TCP不会在每个数据窗口对拥塞指示做出多次响应(或者更宽松地说,在每个往返时间都不会做出多次响应)。也就是说,TCP发送方的拥塞窗口应仅减少一次,以响应来自单个数据窗口的一系列丢弃和/或CE数据包。此外,如果在上一个往返时间内降低了慢启动阈值ssthresh,则TCP源不应降低慢启动阈值ssthresh。但是,如果任何重新传输的数据包被丢弃或设置了CE位,则源TCP会将其解释为新的拥塞实例。
After the source TCP reduces its congestion window in response to a CE packet, incoming acknowledgements that continue to arrive can "clock out" outgoing packets as allowed by the reduced congestion window. If the congestion window consists of only one MSS (maximum segment size), and the sending TCP receives an ECN-Echo ACK packet, then the sending TCP should in principle still reduce its congestion window by half. However, the value of the congestion window is bounded below by a value of one MSS. If the sending TCP were to continue to send, using a congestion window of 1 MSS, this would result in the transmission of one packet per round-trip time. We believe it is desirable to reduce the sending rate of the TCP sender even further, on receipt of an ECN-Echo packet when the congestion window is one. We use the retransmit timer as a means to reduce the rate further in this circumstance. Therefore, the sending TCP should also reset the retransmit timer on receiving the ECN-Echo packet when the congestion window is one. The sending TCP will then be able to send a new packet when the retransmit timer expires.
在源TCP响应CE数据包减小其拥塞窗口后,继续到达的传入确认可以在减小后的拥塞窗口允许的范围内“驱动”传出数据包的发送。如果拥塞窗口仅由一个MSS(最大段大小)组成,并且发送TCP接收到ECN Echo ACK数据包,则发送TCP原则上仍应将其拥塞窗口减半。但是,拥塞窗口的值以一个MSS为下界。如果发送TCP继续使用1个MSS的拥塞窗口发送,这将导致每个往返时间传输一个数据包。我们认为,当拥塞窗口为1时,在接收到ECN Echo数据包时,仍然希望进一步降低TCP发送方的发送速率。在这种情况下,我们使用重传定时器作为进一步降低速率的手段。因此,当拥塞窗口为1时,发送TCP还应在接收到ECN Echo数据包时重置重传定时器。当重传定时器到期时,发送TCP将能够发送新的数据包。
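The sender-side rules above (halve the window, react at most once per window of data, and fall back to the retransmit timer at one MSS) might be sketched as follows. This is a simplified model under stated assumptions, not a TCP implementation. The class and attribute names are hypothetical, sequence numbers are plain byte counts without wraparound, the two-segment floor on `ssthresh` is borrowed from common practice, and `timer_resets` is a counter standing in for a real retransmit timer.

```python
class EcnSender:
    """Minimal model of a TCP sender's reaction to ECN-Echo ACKs."""

    def __init__(self, mss: int, cwnd: int, ssthresh: int) -> None:
        self.mss = mss
        self.cwnd = cwnd
        self.ssthresh = ssthresh
        self.snd_nxt = 0       # next sequence number to be sent
        self.recover = 0       # snd_nxt at the time of the last reduction
        self.timer_resets = 0  # stand-in for resetting the retransmit timer

    def on_ack(self, ack_seq: int, ecn_echo: bool) -> bool:
        """Process one ACK; return True if it triggered a window reduction."""
        if not ecn_echo or ack_seq <= self.recover:
            # No congestion signal, or the signal belongs to a window of
            # data for which the sender has already reduced its window.
            return False
        if self.cwnd <= self.mss:
            # The window cannot go below one MSS, so slow down further
            # by resetting the retransmit timer (modelled as a counter).
            self.timer_resets += 1
        self.ssthresh = max(self.cwnd // 2, 2 * self.mss)
        self.cwnd = max(self.cwnd // 2, self.mss)
        self.recover = self.snd_nxt
        return True
```

Recording `snd_nxt` at the moment of reduction is one way to decide when a new window of data has begun. Later ECN-Echo ACKs are ignored until they acknowledge data sent after the reduction.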
[Floyd94] discusses TCP's response to ECN in more detail. [Floyd98] discusses the validation test in the ns simulator, which illustrates a wide range of ECN scenarios. These scenarios include the following: an ECN followed by another ECN, a Fast Retransmit, or a Retransmit Timeout; a Retransmit Timeout or a Fast Retransmit followed by an ECN, and a congestion window of one packet followed by an ECN.
[Floyd94]更详细地讨论了TCP对ECN的响应。[Floyd98]讨论了ns模拟器中的验证测试,该模拟器演示了广泛的ECN场景。这些场景包括:一个ECN后跟另一个ECN、快速重传或重传超时;重传超时或快速重传后接ECN,以及一个数据包的拥塞窗口后接ECN。
TCP follows existing algorithms for sending data packets in response to incoming ACKs, multiple duplicate acknowledgements, or retransmit timeouts [RFC2001].
TCP遵循现有算法发送数据包以响应传入确认、多个重复确认或重新传输超时[RFC2001]。
When TCP receives a CE data packet at the destination end-system, the TCP data receiver sets the ECN-Echo flag in the TCP header of the subsequent ACK packet. If there is any ACK withholding implemented, as in current "delayed-ACK" TCP implementations where the TCP receiver can send an ACK for two arriving data packets, then the ECN-Echo flag in the ACK packet will be set to the OR of the CE bits of all of the data packets being acknowledged. That is, if any of the received data packets are CE packets, then the returning ACK has the ECN-Echo flag set.
当TCP在目的端系统接收到CE数据包时,TCP数据接收器在后续ACK数据包的TCP报头中设置ECN Echo标志。如果实施了任何ACK预扣,如在当前的“延迟ACK”TCP实施中,其中TCP接收器可以发送两个到达数据分组的ACK,则ACK分组中的ECN Echo标志将被设置为被确认的所有数据分组的CE位的OR。也就是说,如果任何接收到的数据分组是CE分组,则返回的ACK设置了ECN Echo标志。
To provide robustness against the possibility of a dropped ACK packet carrying an ECN-Echo flag, the TCP receiver must set the ECN-Echo flag in a series of ACK packets. The TCP receiver uses the CWR flag to determine when to stop setting the ECN-Echo flag.
为了提供对丢失的ACK数据包携带ECN回波标志的可能性的鲁棒性,TCP接收器必须在一系列ACK数据包中设置ECN回波标志。TCP接收器使用CWR标志来确定何时停止设置ECN回波标志。
When an ECN-Capable TCP reduces its congestion window for any reason (because of a retransmit timeout, a Fast Retransmit, or in response to an ECN Notification), the TCP sets the CWR flag in the TCP header of the first data packet sent after the window reduction. If that data packet is dropped in the network, then the sending TCP will have to reduce the congestion window again and retransmit the dropped packet. Thus, the Congestion Window Reduced message is reliably delivered to the data receiver.
当支持ECN的TCP出于任何原因(由于重传超时、快速重传或响应ECN通知)减少其拥塞窗口时,TCP在窗口减少后发送的第一个数据包的TCP报头中设置CWR标志。如果该数据包在网络中被丢弃,那么发送TCP将不得不再次减少拥塞窗口并重新传输丢弃的数据包。因此,拥塞窗口减少消息被可靠地传送到数据接收器。
After a TCP receiver sends an ACK packet with the ECN-Echo bit set, that TCP receiver continues to set the ECN-Echo flag in ACK packets until it receives a CWR packet (a packet with the CWR flag set). After the receipt of the CWR packet, acknowledgements for subsequent non-CE data packets do not have the ECN-Echo flag set. If another CE packet is received by the data receiver, the receiver would once again send ACK packets with the ECN-Echo flag set. While the receipt of a CWR packet does not guarantee that the data sender received the ECN-Echo message, this does indicate that the data sender reduced its congestion window at some point *after* it sent the data packet for which the CE bit was set.
TCP接收器发送设置了ECN Echo位的ACK数据包后,该TCP接收器继续在ACK数据包中设置ECN Echo标志,直到接收到CWR数据包(设置了CWR标志的数据包)。在接收到CWR数据包之后,后续非CE数据包的确认没有设置ECN Echo标志。如果数据接收器接收到另一个CE数据包,接收器将再次发送设置了ECN Echo标志的ACK数据包。虽然CWR数据包的接收不保证数据发送方收到ECN回显消息,但这确实表明数据发送方在发送设置了CE位的数据包后的某个时间点*减少了拥塞窗口。
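The receiver's use of the ECN-Echo and CWR flags amounts to a small state machine, sketched below. The class and method names are hypothetical. The sketch also assumes that when a single packet carries both CWR and CE, the CWR is processed first, so the new CE indication starts a fresh run of ECN-Echo ACKs. The text does not spell out that ordering.

```python
class EcnReceiver:
    """Minimal model of when a TCP receiver sets ECN-Echo on its ACKs."""

    def __init__(self) -> None:
        self.echo_pending = False

    def on_data(self, ce: bool, cwr: bool) -> None:
        """Record the CE bit (IP header) and CWR flag (TCP header) of a data packet."""
        if cwr:
            # The sender has reduced its window: stop echoing.
            self.echo_pending = False
        if ce:
            # Congestion was experienced: echo until the next CWR arrives.
            self.echo_pending = True

    def ecn_echo_for_next_ack(self) -> bool:
        """Return the value of the ECN-Echo flag for the next outgoing ACK."""
        return self.echo_pending
```

Because the state is sticky, an ACK that covers several data packets (as with delayed ACKs) carries the OR of their CE bits.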
We have already specified that a TCP sender reduces its congestion window at most once per window of data. This mechanism requires some care to make sure that the sender reduces its congestion window at most once per ECN indication, and that multiple ECN messages over several successive windows of data are properly reported to the ECN sender. This is discussed further in [Floyd98].
我们已经指定TCP发送方在每个数据窗口中最多减少一次拥塞窗口。此机制需要谨慎,以确保发送方在每个ECN指示中最多减少一次拥塞窗口,并且在多个连续数据窗口上的多个ECN消息正确报告给ECN发送方。这将在[Floyd98]中进一步讨论。
For the current generation of TCP congestion control algorithms, pure acknowledgement packets (i.e., packets that do not contain any accompanying data) should be sent with the ECT bit off. Current TCP receivers have no mechanisms for reducing traffic on the ACK-path in response to congestion notification. Mechanisms for responding to congestion on the ACK-path are areas for current and future research. (One simple possibility would be for the sender to reduce its congestion window when it receives a pure ACK packet with the CE bit set.) For current TCP implementations, a single dropped ACK generally has only a very small effect on the TCP's sending rate.
对于当前一代TCP拥塞控制算法,应在ECT位关闭的情况下发送纯确认数据包(例如,不包含任何伴随数据的数据包)。当前TCP接收器没有响应拥塞通知而减少ACK路径上流量的机制。应答ACK路径上拥塞的机制是当前和未来研究的领域。(一种简单的可能性是,发送方在接收到带有CE位设置的纯ACK数据包时,减少其拥塞窗口)。对于当前的TCP实现,单个丢弃的ACK通常对TCP的发送速率影响很小。
Two bits need to be specified in the IP header, the ECN-Capable Transport (ECT) bit and the Congestion Experienced (CE) bit. The ECT bit set to "0" indicates that the transport protocol will ignore the CE bit. This is the default value for the ECT bit. The ECT bit set to "1" indicates that the transport protocol is willing and able to participate in ECN.
需要在IP报头中指定两位,即支持ECN的传输(ECT)位和经历拥塞(CE)位。ECT位设置为“0”表示传输协议将忽略CE位。这是ECT位的默认值。ECT位设置为“1”表示传输协议愿意并且能够参与ECN。
The default value for the CE bit is "0". The router sets the CE bit to "1" to indicate congestion to the end nodes. The CE bit in a packet header should never be reset by a router from "1" to "0".
CE位的默认值为“0”。路由器将CE位设置为“1”,以指示终端节点的拥塞。路由器不应将数据包头中的CE位从“1”重置为“0”。
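The handling of the two IP header bits can be sketched with a few bit operations. The masks below assume the placement this RFC gives in its IP-header section (ECT is bit 6 and CE is bit 7 of the IPv4 TOS octet, counting bit 0 as the most significant bit); the function names are hypothetical.

```python
ECT = 0x02  # ECN-Capable Transport: bit 6 of the TOS octet (bit 0 = MSB)
CE = 0x01   # Congestion Experienced: bit 7 of the TOS octet


def is_ect(tos: int) -> bool:
    """True if the transport declared itself ECN-capable."""
    return bool(tos & ECT)


def is_ce(tos: int) -> bool:
    """True if some router on the path has signalled congestion."""
    return bool(tos & CE)


def router_set_ce(tos: int) -> int:
    """Signal congestion on an ECT packet.

    CE is only ever set here, never cleared, and packets from
    non-ECN-capable transports are returned unchanged (a router would
    drop such a packet instead of marking it).
    """
    if tos & ECT:
        return tos | CE
    return tos
```

Note that there is deliberately no function that clears CE, reflecting the rule that a router should never reset the bit from "1" to "0".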
TCP requires three changes: a negotiation phase during setup to determine if both end nodes are ECN-capable, and two new flags in the TCP header, taken from the "reserved" flags in the TCP flags field. The ECN-Echo flag is used by the data receiver to inform the data sender of a received CE packet. The Congestion Window Reduced (CWR) flag is used by the data sender to inform the data receiver that the congestion window has been reduced.
TCP需要三个更改:设置过程中的一个协商阶段,以确定两个终端节点是否都支持ECN;TCP报头中的两个新标志,来自TCP标志字段中的“保留”标志。数据接收器使用ECN Echo标志通知数据发送者接收到的CE数据包。数据发送方使用拥塞窗口缩减标志通知数据接收方拥塞窗口已缩减。
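The negotiation phase can be illustrated by checking the flags on the SYN and SYN-ACK packets. The sketch assumes the details given in the TCP section of this RFC: the two flags occupy bits 8 (CWR) and 9 (ECN-Echo) of the former Reserved field, an ECN-capable SYN sets both flags, and the corresponding SYN-ACK sets ECN-Echo but not CWR. The function names are hypothetical.

```python
# Flag masks within the TCP flags octet (byte 13 of the TCP header).
CWR = 0x80  # Congestion Window Reduced
ECE = 0x40  # ECN-Echo
ACK = 0x10
SYN = 0x02


def is_ecn_setup_syn(flags: int) -> bool:
    """SYN (without ACK) that sets both ECN-Echo and CWR."""
    return (
        bool(flags & SYN)
        and not (flags & ACK)
        and (flags & (ECE | CWR)) == (ECE | CWR)
    )


def is_ecn_setup_syn_ack(flags: int) -> bool:
    """SYN-ACK that sets ECN-Echo but leaves CWR clear."""
    return (
        (flags & (SYN | ACK)) == (SYN | ACK)
        and (flags & (ECE | CWR)) == ECE
    )


def ecn_negotiated(syn_flags: int, syn_ack_flags: int) -> bool:
    """Both end nodes must signal ECN capability during setup."""
    return is_ecn_setup_syn(syn_flags) and is_ecn_setup_syn_ack(syn_ack_flags)
```

If either side omits its indication, `ecn_negotiated` returns False and the connection would proceed without ECN.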
Since the ATM and Frame Relay mechanisms for congestion indication have typically been defined without any notion of average queue size as the basis for determining that an intermediate node is congested, we believe that they provide a very noisy signal. The TCP-sender reaction specified in this draft for ECN is NOT the appropriate reaction for such a noisy signal of congestion notification. It is our expectation that ATM's EFCI and Frame Relay's FECN mechanisms would be phased out over time within the ATM network. However, if the routers that interface to the ATM network have a way of maintaining the average queue at the interface, and use it to come to a reliable determination that the ATM subnet is congested, they may use the ECN notification that is defined here.
由于用于拥塞指示的ATM和帧中继机制通常被定义为没有任何平均队列大小的概念作为确定中间节点拥塞的基础,因此我们认为它们提供了非常嘈杂的信号。本草案中针对ECN指定的TCP发送方反应不是针对此类拥塞通知的嘈杂信号的适当反应。我们期望ATM的EFCI和帧中继的FECN机制将随着时间的推移在ATM网络中逐步淘汰。然而,如果与ATM网络连接的路由器有办法维持接口处的平均队列,并使用它来可靠地确定ATM子网拥塞,则它们可以使用此处定义的ECN通知。
We emphasize that a *single* packet with the CE bit set in an IP packet causes the transport layer to respond, in terms of congestion control, as it would to a packet drop. As such, the CE bit is not a good match to a transient signal such as one based on the instantaneous queue size. However, experiments in techniques at layer 2 (e.g., in ATM switches or Frame Relay switches) should be encouraged. For example, using a scheme such as RED (where packet marking is based on the average queue length exceeding a threshold), layer 2 devices could provide a reasonably reliable indication of congestion. When all the layer 2 devices in a path set that layer's own Congestion Experienced bit (e.g., the EFCI bit for ATM, the FECN bit in Frame Relay) in this reliable manner, then the interface router to the layer 2 network could copy the state of that layer 2 Congestion Experienced bit into the CE bit in the IP header. We recognize that this is not the current practice, nor is it in current standards. However, encouraging experimentation in this manner may provide the information needed to enable evolution of existing layer 2 mechanisms to provide a more reliable means of congestion indication, when they use a single bit for indicating congestion.
我们强调,在IP数据包中设置CE位的*单个*数据包会导致传输层在拥塞控制方面做出响应,就像对数据包丢弃做出响应一样。因此,CE位与瞬时信号(例如基于瞬时队列大小的信号)不是很好的匹配。但是,应鼓励在第2层(例如ATM交换机或帧中继交换机)进行技术试验。例如,使用诸如RED的方案(其中分组标记基于超过阈值的平均队列长度),第2层设备可以提供合理可靠的拥塞指示。当路径中的所有第2层设备以这种可靠方式设置该层自身的拥塞经历位(例如,ATM的EFCI位、帧中继中的FECN位)时,到第2层网络的接口路由器可以将该第2层拥塞经历位的状态复制到IP报头中的CE位。我们认识到,这不是当前的做法,也不在当前的标准之中。然而,鼓励以这种方式进行实验,可能会提供所需的信息,使现有的第2层机制在使用单个比特指示拥塞时得以演进,从而提供更可靠的拥塞指示方法。
This section discusses concerns about the vulnerability of ECN to non-compliant end-nodes (i.e., end nodes that set the ECT bit in transmitted packets but do not respond to received CE packets). We argue that the addition of ECN to the IP architecture would not significantly increase the current vulnerability of the architecture to unresponsive flows.
本节讨论了ECN对不合规终端节点(即,在传输的数据包中设置ECT位但不响应接收到的CE数据包的终端节点)的漏洞。我们认为,将ECN添加到IP体系结构不会显著增加体系结构当前对无响应流的脆弱性。
Even for non-ECN environments, there are serious concerns about the damage that can be done by non-compliant or unresponsive flows (that is, flows that do not respond to congestion control indications by reducing their arrival rate at the congested link). For example, an end-node could "turn off congestion control" by not reducing its congestion window in response to packet drops. This is a concern for the current Internet. It has been argued that routers will have to deploy mechanisms to detect and differentially treat packets from non-compliant flows. It has also been argued that techniques such as end-to-end per-flow scheduling and isolation of one flow from another, differentiated services, or end-to-end reservations could remove some of the more damaging effects of unresponsive flows.
即使对于非ECN环境,也存在严重的问题,即不符合或无响应的流(即,不通过降低其在拥塞链路的到达率来响应拥塞控制指示的流)可能造成的损害。例如,终端节点可以通过不减少其拥塞窗口来“关闭拥塞控制”,以响应数据包丢失。这是当前互联网关注的问题。有人认为,路由器必须部署机制来检测和区别处理来自不兼容流的数据包。也有人认为,诸如端到端每流调度和一个流与另一个流隔离、区分服务或端到端保留等技术可以消除无响应流的一些更具破坏性的影响。
It has been argued that dropping packets in itself may be an adequate deterrent for non-compliance, and that the use of ECN removes this deterrent. We would argue in response that (1) ECN-capable routers preserve packet-dropping behavior in times of high congestion; and (2) even in times of high congestion, dropping packets in itself is not an adequate deterrent for non-compliance.
有人认为,丢弃数据包本身可能是对违规行为的充分威慑,而使用ECN则消除了这种威慑。作为回应,我们认为(1)支持ECN的路由器在高拥塞时保持丢包行为;(2)即使在高度拥挤的情况下,丢弃数据包本身也不能充分阻止违规行为。
First, ECN-Capable routers will only mark packets (as opposed to dropping them) when the packet marking rate is reasonably low. During periods where the average queue size exceeds an upper threshold, and therefore the potential packet marking rate would be high, our recommendation is that routers drop packets rather than set the CE bit in packet headers.
首先,支持ECN的路由器只有在包标记率相当低时才会标记包(而不是丢弃它们)。在平均队列大小超过上限阈值的时期,因此潜在的数据包标记率将很高,我们建议路由器丢弃数据包,而不是在数据包头中设置CE位。
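The recommendation above amounts to a simple decision rule at the router. The sketch below is a simplified illustration under stated assumptions: `selected` stands for whatever decision the active queue management (e.g., RED) has made about this particular packet, `max_th` is the upper threshold on the average queue size, and the function name and return strings are hypothetical.

```python
def router_action(ect: bool, selected: bool,
                  avg_queue: float, max_th: float) -> str:
    """Decide what to do with one arriving packet.

    Returns "forward", "mark" (set the CE bit), or "drop".
    """
    if not selected:
        # Active queue management did not pick this packet.
        return "forward"
    if avg_queue > max_th:
        # High congestion: drop, even if the packet is ECN-capable.
        return "drop"
    # Low or moderate congestion: mark ECN-capable packets,
    # drop packets from non-ECN-capable transports.
    return "mark" if ect else "drop"
```

The point of the rule is that marking replaces dropping only while the marking rate would be low; once the average queue exceeds the upper threshold, ECN-capable and non-ECN-capable packets are treated alike.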
During the periods of low or moderate packet marking rates when ECN would be deployed, dropping rather than marking those packets would have little deterrent effect on unresponsive flows. For example, delay-insensitive flows using reliable delivery might have an incentive to increase rather than to decrease their sending rate in the presence of dropped packets. Similarly, delay-sensitive flows using unreliable delivery might increase their use of FEC in response to an increased packet drop rate, increasing rather than decreasing their sending rate. For the same reasons, we do not believe that packet dropping itself is an effective deterrent for non-compliance even in an environment of high packet drop rates.
在部署ECN的低或中等数据包标记率期间,丢弃而不是标记这些数据包,对无响应的流几乎没有威慑作用。例如,使用可靠传递的延迟不敏感流在存在丢包的情况下可能有增加而不是降低其发送速率的动机。类似地,使用不可靠传递的延迟敏感流可能会增加其FEC的使用,以响应增加的分组丢弃率,从而增加而不是降低其发送速率。出于同样的原因,我们认为即使在高丢包率的环境中,丢包本身也不能有效地阻止违规行为。
Several methods have been proposed to identify and restrict non-compliant or unresponsive flows. The addition of ECN to the network environment would not in any way increase the difficulty of designing and deploying such mechanisms. If anything, the addition of ECN to the architecture would make the job of identifying unresponsive flows slightly easier. For example, in an ECN-Capable environment routers are not limited to information about packets that are dropped or have the CE bit set at that router itself; in such an environment routers could also take note of arriving CE packets that indicate congestion encountered by that packet earlier in the path.
已经提出了几种方法来识别和限制不合规或无响应的流。将ECN添加到网络环境中不会以任何方式增加设计和部署此类机制的难度。如果说有什么区别的话,那么在体系结构中添加ECN将使识别无响应流的工作稍微容易一些。例如,在支持ECN的环境中,路由器不限于关于被丢弃的分组或在该路由器本身设置了CE位的分组的信息;在这样的环境中,路由器还可以注意到到达的CE分组,这些CE分组指示该分组在路径的早期遇到的拥塞。
The breakdown of effective congestion control could be caused not only by a non-compliant end-node, but also by the loss of the congestion indication in the network itself. This could happen through a rogue or broken router that set the ECT bit in a packet from a non-ECN-capable transport, or "erased" the CE bit in arriving packets. As one example, a rogue or broken router that "erased" the CE bit in arriving CE packets would prevent that indication of congestion from reaching downstream receivers. This could result in the failure of congestion control for that flow and a resulting increase in congestion in the network, ultimately resulting in subsequent packets dropped for this flow as the average queue size increased at the congested gateway.
有效拥塞控制的崩溃不仅可能由不符合要求的终端节点引起,还可能由网络本身中拥塞指示的丢失引起。这可能通过恶意或损坏的路由器发生,该路由器设置来自不支持ECN的传输的数据包中的ECT位,或“擦除”到达数据包中的CE位。例如,一个恶意或损坏的路由器“擦除”到达的CE数据包中的CE位将阻止拥塞指示到达下游接收器。这可能导致该流的拥塞控制失败,并导致网络中的拥塞增加,最终导致随着拥塞网关处的平均队列大小增加,该流的后续数据包被丢弃。
The actions of a rogue or broken router could also result in an unnecessary indication of congestion to the end-nodes. These actions can include a router dropping a packet or setting the CE bit in the absence of congestion. From a congestion control point of view, setting the CE bit in the absence of congestion by a non-compliant router would be no different than a router dropping a packet unnecessarily. By "erasing" the ECT bit of a packet that is later dropped in the network, a router's actions could result in an unnecessary packet drop for that packet later in the network.
流氓或坏掉的路由器的行为也可能导致终端节点出现不必要的拥塞指示。这些操作可以包括路由器丢弃数据包或在没有拥塞的情况下设置CE位。从拥塞控制的角度来看,在不兼容路由器没有拥塞的情况下设置CE位与路由器不必要地丢弃数据包没有什么不同。通过“擦除”稍后在网络中丢弃的数据包的ECT位,路由器的操作可能导致该数据包稍后在网络中不必要的数据包丢弃。
Concerns regarding the loss of congestion indications from encapsulated, dropped, or corrupted packets are discussed below.
下面将讨论关于因封装、丢弃或损坏的数据包而丢失拥塞指示的问题。
Some care is required to handle the CE and ECT bits appropriately when packets are encapsulated and de-encapsulated for tunnels.
当为隧道封装和解封数据包时,需要注意适当处理CE和ECT位。
When a packet is encapsulated, the following rules apply regarding the ECT bit. First, if the ECT bit in the encapsulated ('inside') header is a 0, then the ECT bit in the encapsulating ('outside') header MUST be a 0. If the ECT bit in the inside header is a 1, then the ECT bit in the outside header SHOULD be a 1.
封装数据包时,以下规则适用于ECT位。首先,如果封装(“内部”)标头中的ECT位为0,则封装(“外部”)标头中的ECT位必须为0。如果内部标头中的ECT位为1,则外部标头中的ECT位应为1。
When a packet is de-encapsulated, the following rules apply regarding the CE bit. If the ECT bit is a 1 in both the inside and the outside header, then the CE bit in the outside header MUST be ORed with the CE bit in the inside header. (That is, in this case a CE bit of 1 in the outside header must be copied to the inside header.) If the ECT bit in either header is a 0, then the CE bit in the outside header is ignored. This requirement for the treatment of de-encapsulated packets does not currently apply to IPsec tunnels.
当数据包被解除封装时,以下规则适用于CE位。如果ECT位在内部和外部标头中均为1,则外部标头中的CE位必须与内部标头中的CE位进行或运算。(即,在这种情况下,必须将外部标头中的CE位1复制到内部标头。)如果任一标头中的ECT位为0,则忽略外部标头中的CE位。这一处理去封装数据包的要求目前不适用于IPsec隧道。
A specific example of the use of ECN with encapsulation occurs when a flow wishes to use ECN-capability to avoid the danger of an unnecessary packet drop for the encapsulated packet as a result of congestion at an intermediate node in the tunnel. This functionality can be supported by copying the ECN field in the inner IP header to the outer IP header upon encapsulation, and using the ECN field in the outer IP header to set the ECN field in the inner IP header upon decapsulation. This effectively allows routers along the tunnel to cause the CE bit to be set in the ECN field of the unencapsulated IP header of an ECN-capable packet when such routers experience congestion.
当流希望使用ECN能力来避免由于隧道中的中间节点处的拥塞而导致被封装的分组的不必要分组丢弃的危险时,出现了将ECN与封装一起使用的特定示例。通过在封装时将内部IP标头中的ECN字段复制到外部IP标头,并在解除封装时使用外部IP标头中的ECN字段设置内部IP标头中的ECN字段,可以支持此功能。这有效地允许沿隧道的路由器在具有ECN能力的分组的未封装IP报头的ECN字段中设置CE比特,当此类路由器经历拥塞时。
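The encapsulation and de-encapsulation rules for non-IPsec tunnels can be sketched as two small functions operating on the TOS octets of the inner and outer headers. The masks assume the ECT and CE positions defined in this RFC's IP-header section (0x02 and 0x01 of the TOS octet); copying the whole ECN field on encapsulation follows the example above and satisfies both ECT rules. The function names are hypothetical.

```python
ECT = 0x02       # ECN-Capable Transport bit in the TOS octet
CE = 0x01        # Congestion Experienced bit in the TOS octet
ECN_MASK = 0x03  # both bits together


def encapsulate(inner_tos: int, outer_tos: int = 0) -> int:
    """Return the outer TOS octet with the inner ECN field copied in.

    If the inner ECT bit is 0 the outer ECT bit ends up 0 (MUST);
    if it is 1 the outer ECT bit ends up 1 (SHOULD).
    """
    return (outer_tos & ~ECN_MASK & 0xFF) | (inner_tos & ECN_MASK)


def decapsulate(inner_tos: int, outer_tos: int) -> int:
    """Return the inner TOS octet after the tunnel egress has processed it.

    The outer CE bit is ORed into the inner header only when ECT is 1
    in both headers; otherwise the outer CE bit is ignored.
    This rule does not apply to IPsec tunnels.
    """
    if (inner_tos & ECT) and (outer_tos & ECT):
        return inner_tos | (outer_tos & CE)
    return inner_tos
```

For example, an ECT packet that is marked inside the tunnel leaves the tunnel with CE set in its own header, while a mark on a packet whose inner header is not ECT is discarded at the egress.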
The IPsec protocol, as defined in [ESP, AH], does not include the IP header's ECN field in any of its cryptographic calculations (in the case of tunnel mode, the outer IP header's ECN field is not included). Hence modification of the ECN field by a network node has no effect on IPsec's end-to-end security, because it cannot cause any IPsec integrity check to fail. As a consequence, IPsec does not provide any defense against an adversary's modification of the ECN field (i.e., a man-in-the-middle attack), as the adversary's modification will also have no effect on IPsec's end-to-end security. In some environments, the ability to modify the ECN field without affecting IPsec integrity checks may constitute a covert channel; if it is necessary to eliminate such a channel or reduce its bandwidth, then the outer IP header's ECN field can be zeroed at the tunnel ingress and egress nodes.
[ESP,AH]中定义的IPsec协议在其任何加密计算中不包括IP头的ECN字段(在隧道模式下,不包括外部IP头的ECN字段)。因此,网络节点对ECN字段的修改不会影响IPsec的端到端安全性,因为它不会导致任何IPsec完整性检查失败。因此,IPsec不会针对对手修改ECN字段(即中间人攻击)提供任何防御,因为对手的修改也不会影响IPsec的端到端安全。在某些环境中,在不影响IPsec完整性检查的情况下修改ECN字段的能力可能构成隐蔽通道;如果需要消除这样的信道或减少其带宽,则外部IP报头的ECN字段可以在隧道入口和出口节点处归零。
The IPsec protocol currently requires that the inner header's ECN field not be changed by IPsec decapsulation processing at a tunnel egress node. This ensures that an adversary's modifications to the ECN field cannot be used to launch theft- or denial-of-service attacks across an IPsec tunnel endpoint, as any such modifications will be discarded at the tunnel endpoint. This document makes no change to that IPsec requirement. As a consequence of the current specification of the IPsec protocol, we suggest that experiments with ECN not be carried out for flows that will undergo IPsec tunneling at the present time.
IPsec协议当前要求隧道出口节点处的IPsec解除封装处理不会更改内部报头的ECN字段。这确保了对手对ECN字段的修改不能用于跨IPsec隧道端点发起盗窃或拒绝服务攻击,因为任何此类修改都将在隧道端点处被丢弃。本文档对IPsec要求没有任何更改。作为IPsec协议当前规范的一个结果,我们建议目前不针对将经历IPsec隧道的流执行ECN实验。
If the IPsec specifications are modified in the future to permit a tunnel egress node to modify the ECN field in an inner IP header based on the ECN field value in the outer header (e.g., copying part or all of the outer ECN field to the inner ECN field), or to permit the ECN field of the outer IP header to be zeroed during encapsulation, then experiments with ECN may be used in combination with IPsec tunneling.
如果将来修改IPsec规范以允许隧道出口节点基于外部报头中的ECN字段值修改内部IP报头中的ECN字段(例如,将部分或全部外部ECN字段复制到内部ECN字段),或者允许在封装期间将外部IP报头的ECN字段归零,然后,可以将ECN实验与IPsec隧道结合使用。
This discussion of ECN and IPsec tunnel considerations draws heavily on related discussions and documents from the Differentiated Services Working Group.
关于ECN和IPsec隧道注意事项的讨论大量借鉴了Differentiated Services工作组的相关讨论和文档。
An additional issue concerns a packet that has the CE bit set at one router and is dropped by a subsequent router. For the proposed use for ECN in this paper (that is, for a transport protocol such as TCP for which a dropped data packet is an indication of congestion), end nodes detect dropped data packets, and the congestion response of the end nodes to a dropped data packet is at least as strong as the congestion response to a received CE packet.
另一个问题涉及在一个路由器上设置CE位并被后续路由器丢弃的数据包。对于本文中提出的ECN用途(即,对于传输协议,例如TCP,丢弃的数据包表示拥塞),终端节点检测丢弃的数据包,并且终端节点对丢弃的数据包的拥塞响应至少与对接收到的CE包的拥塞响应一样强。
However, transport protocols such as TCP do not necessarily detect all packet drops, such as the drop of a "pure" ACK packet; for example, TCP does not reduce the arrival rate of subsequent ACK packets in response to an earlier dropped ACK packet. Any proposal for extending ECN-Capability to such packets would have to address concerns raised by CE packets that were later dropped in the network.
然而,诸如TCP之类的传输协议不一定检测所有分组丢弃,例如“纯”ACK分组的丢弃;例如,TCP不会降低后续ACK数据包的到达率,以响应先前丢弃的ACK数据包。任何将ECN能力扩展到此类数据包的建议都必须解决CE数据包引起的问题,这些数据包后来在网络中被丢弃。
Similarly, if a CE packet is dropped later in the network due to corruption (bit errors), the end nodes should still invoke congestion control, just as TCP would today in response to a dropped data packet. This issue of corrupted CE packets would have to be considered in any proposal for the network to distinguish between packets dropped due to corruption, and packets dropped due to congestion or buffer overflow.
类似地,如果CE数据包稍后由于损坏(位错误)而在网络中被丢弃,则终端节点仍应调用拥塞控制,就像TCP今天响应丢弃的数据包一样。在任何关于网络的提案中,都必须考虑损坏CE数据包的问题,以区分由于损坏而丢弃的数据包和由于拥塞或缓冲区溢出而丢弃的数据包。
11. A summary of related work.
11. 相关工作的总结。
[Floyd94] considers the advantages and drawbacks of adding ECN to the TCP/IP architecture. As shown in the simulation-based comparisons, one advantage of ECN is to avoid unnecessary packet drops for short or delay-sensitive TCP connections. A second advantage of ECN is in avoiding some unnecessary retransmit timeouts in TCP. This paper discusses in detail the integration of ECN into TCP's congestion control mechanisms. The possible disadvantages of ECN discussed in the paper are that a non-compliant TCP connection could falsely advertise itself as ECN-capable, and that a TCP ACK packet carrying an ECN-Echo message could itself be dropped in the network. The first of these two issues is discussed in Section 8 of this document, and the second is addressed by the proposal in Section 5.1.3 for a CWR flag in the TCP header.
[Floyd94]考虑了将ECN添加到TCP/IP体系结构的优点和缺点。如基于模拟的比较所示,ECN的一个优点是避免了对短或延迟敏感的TCP连接不必要的丢包。ECN的第二个优点是避免了TCP中一些不必要的重传超时。本文详细讨论了ECN与TCP拥塞控制机制的集成。本文中讨论的ECN可能存在的缺点是,不兼容的TCP连接可能会错误地将自身宣传为具有ECN功能,并且携带ECN回显消息的TCP ACK数据包本身可能会在网络中丢弃。本文件第8节讨论了这两个问题中的第一个问题,第二个问题由第5.1.3节中关于TCP标头中CWR标志的提案解决。
[CKLTZ97] reports on an experimental implementation of ECN in IPv6. The experiments include an implementation of ECN in an existing implementation of RED for FreeBSD. A number of experiments were run to demonstrate the control of the average queue size in the router, the performance of ECN for a single TCP connection at a congested router, and fairness with multiple competing TCP connections. One conclusion of the experiments is that dropping packets from a bulk-data transfer can degrade performance much more severely than marking packets.
[CKLTZ97]报告了在IPv6中实现ECN的实验。实验包括在现有的RED for FreeBSD实现中实现ECN。我们进行了大量实验,以证明路由器中平均队列大小的控制、单个TCP连接在拥塞路由器处的ECN性能以及多个竞争TCP连接的公平性。实验的一个结论是,从批量数据传输中丢弃数据包比标记数据包更严重地降低性能。
Because the experimental implementation in [CKLTZ97] predates some of the developments in this document, the implementation does not conform to this document in all respects. For example, in the experimental implementation the CWR flag is not used, but instead the TCP receiver sends the ECN-Echo bit on a single ACK packet.
由于[CKLTZ97]中的实验实现早于本文档中的某些进展,因此该实现并非在所有方面都符合本文档。例如,实验实现中不使用CWR标志,而是由TCP接收器在单个ACK数据包上发送ECN-Echo位。
[K98] and [CKLTZ98] build on [CKLTZ97] to further analyze the benefits of ECN for TCP. The conclusions are that ECN TCP gets moderately better throughput than non-ECN TCP; that ECN TCP flows are fair towards non-ECN TCP flows; and that ECN TCP is robust with two-way traffic, congestion in both directions, and with multiple congested gateways. Experiments with many short web transfers show that, while most of the short connections have similar transfer times with or without ECN, a small percentage of the short connections have very long transfer times for the non-ECN experiments as compared to the ECN experiments. This increased transfer time is particularly dramatic for those short connections that have their first packet dropped in the non-ECN experiments, and that therefore have to wait six seconds for the retransmit timer to expire.
[K98]和[CKLTZ98]以[CKLTZ97]为基础,进一步分析ECN对TCP的好处。结果表明,ECN-TCP的吞吐量略高于非ECN-TCP;ECN TCP流对非ECN TCP流是公平的;而且ECN TCP对双向流量、双向拥塞和多个拥塞网关都很健壮。对许多短web传输的实验表明,尽管大多数短连接在有或没有ECN的情况下具有相似的传输时间,但与ECN实验相比,非ECN实验中有一小部分短连接具有很长的传输时间。对于那些在非ECN实验中丢弃了第一个数据包的短连接来说,这种增加的传输时间尤其引人注目,因此必须等待六秒钟才能使重传计时器过期。
The ECN Web Page [ECN] has pointers to other implementations of ECN in progress.
ECN网页[ECN]有指向其他正在进行的ECN实现的指针。
Given the current effort to implement RED, we believe this is the right time for router vendors to examine how to implement congestion avoidance mechanisms that do not depend on packet drops alone. With the increased deployment of applications and transports sensitive to the delay and loss of a single packet (e.g., realtime traffic, short web transfers), depending on packet loss as a normal congestion notification mechanism appears to be insufficient (or at the very least, non-optimal).
鉴于目前实施RED的努力,我们认为现在正是路由器供应商研究如何实施拥塞避免机制的适当时机,这些机制不单单依赖于丢包。随着对单个数据包的延迟和丢失(例如,实时流量、短web传输)敏感的应用程序和传输部署的增加,依靠数据包丢失作为正常的拥塞通知机制似乎是不够的(或者至少是非最佳的)。
Many people have made contributions to this RFC. In particular, we would like to thank Kenjiro Cho for the proposal for the TCP mechanism for negotiating ECN-Capability, Kevin Fall for the proposal of the CWR bit, Steve Blake for material on IPv4 Header Checksum Recalculation, Jamal Hadi Salim for discussions of ECN issues, and Steve Bellovin, Jim Bound, Brian Carpenter, Paul Ferguson, Stephen Kent, Greg Minshall, and Vern Paxson for discussions of security issues. We also thank the Internet End-to-End Research Group for ongoing discussions of these issues.
许多人为这个RFC做出了贡献。特别是,我们要感谢Kenjiro Cho提出用于协商ECN能力的TCP机制,Kevin Fall提出CWR位,Steve Blake提供关于IPv4报头校验和重新计算的材料,Jamal Hadi Salim参与ECN问题的讨论,以及Steve Bellovin、Jim Bound、Brian Carpenter、Paul Ferguson、Stephen Kent、Greg Minshall和Vern Paxson参与安全问题的讨论。我们还感谢互联网端到端研究小组对这些问题的持续讨论。
[AH] Kent, S. and R. Atkinson, "IP Authentication Header", RFC 2402, November 1998.
[AH]Kent,S.和R.Atkinson,“IP认证头”,RFC 2402,1998年11月。
[B97] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.
[B97]Bradner,S.,“RFC中用于表示需求水平的关键词”,BCP 14,RFC 2119,1997年3月。
[CKLTZ98] Chen, C., Krishnan, H., Leung, S., Tang, N., and Zhang, L., "Implementing ECN for TCP/IPv6", presentation to the ECN BOF at the L.A. IETF, March 1998, URL "http://www.cs.ucla.edu/~hari/ecn-ietf.ps".
[CKLTZ98]Chen,C.,Krishnan,H.,Leung,S.,Tang,N.,和Zhang,L.,“为TCP/IPv6实施ECN”,在L.A. IETF上向ECN BOF所作的演示,1998年3月,URL“http://www.cs.ucla.edu/~hari/ecn-ietf.ps”。
[DIFFSERV] Nichols, K., Blake, S., Baker, F. and D. Black, "Definition of the Differentiated Services Field (DS Field) in the IPv4 and IPv6 Headers", RFC 2474, December 1998.
[DIFFSERV]Nichols,K.,Blake,S.,Baker,F.和D.Black,“IPv4和IPv6报头中区分服务字段(DS字段)的定义”,RFC 2474,1998年12月。
[ECN] "The ECN Web Page", URL "http://www-nrg.ee.lbl.gov/floyd/ecn.html".
[ECN]“ECN网页”,URL“http://www-nrg.ee.lbl.gov/floyd/ecn.html”。
[ESP] Kent, S. and R. Atkinson, "IP Encapsulating Security Payload", RFC 2406, November 1998.
[ESP]Kent,S.和R.Atkinson,“IP封装安全有效载荷”,RFC 2406,1998年11月。
[FJ93] Floyd, S., and Jacobson, V., "Random Early Detection gateways for Congestion Avoidance", IEEE/ACM Transactions on Networking, V.1 N.4, August 1993, p. 397-413. URL "ftp://ftp.ee.lbl.gov/papers/early.pdf".
[FJ93]Floyd,S.和Jacobson,V.,“用于拥塞避免的随机早期检测网关”,IEEE/ACM Transactions on Networking,第1卷第4期,1993年8月,第397-413页。URL“ftp://ftp.ee.lbl.gov/papers/early.pdf”。
[Floyd94] Floyd, S., "TCP and Explicit Congestion Notification", ACM Computer Communication Review, V. 24 N. 5, October 1994, p. 10-23. URL "ftp://ftp.ee.lbl.gov/papers/tcp_ecn.4.ps.Z".
[Floyd94]Floyd,S.,“TCP和显式拥塞通知”,ACM Computer Communication Review,第24卷第5期,1994年10月,第10-23页。URL“ftp://ftp.ee.lbl.gov/papers/tcp_ecn.4.ps.Z”。
[Floyd97] Floyd, S., and Fall, K., "Router Mechanisms to Support End-to-End Congestion Control", Technical report, February 1997. URL "http://www-nrg.ee.lbl.gov/floyd/end2end-paper.html".
[Floyd97]Floyd,S.,和Fall,K.,“支持端到端拥塞控制的路由器机制”,技术报告,1997年2月。URL“http://www-nrg.ee.lbl.gov/floyd/end2end-paper.html".
[Floyd98] Floyd, S., "The ECN Validation Test in the NS Simulator", URL "http://www-mash.cs.berkeley.edu/ns/", test tcl/test/test-all-ecn.
[Floyd98]Floyd,S.,“NS模拟器中的ECN验证测试”,URL“http://www-mash.cs.berkeley.edu/ns/”,测试tcl/test/test-all-ecn。
[K98] Krishnan, H., "Analyzing Explicit Congestion Notification (ECN) benefits for TCP", Master's thesis, UCLA, 1998, URL "http://www.cs.ucla.edu/~hari/software/ecn/ecn_report.ps.gz".
[K98]Krishnan,H.,“分析显式拥塞通知(ECN)对TCP的好处”,硕士论文,UCLA,1998年,URL“http://www.cs.ucla.edu/~hari/software/ecn/ecn_report.ps.gz”。
[FRED] Lin, D., and Morris, R., "Dynamics of Random Early Detection", SIGCOMM '97, September 1997. URL "http://www.inria.fr/rodeo/sigcomm97/program.html#ab078".
[FRED]Lin,D.和Morris,R.,“随机早期检测的动态特性”,SIGCOMM '97,1997年9月。URL“http://www.inria.fr/rodeo/sigcomm97/program.html#ab078”。
[Jacobson88] V. Jacobson, "Congestion Avoidance and Control", Proc. ACM SIGCOMM '88, pp. 314-329. URL "ftp://ftp.ee.lbl.gov/papers/congavoid.ps.Z".
[Jacobson88]V.Jacobson,“拥塞避免和控制”,Proc. ACM SIGCOMM '88,第314-329页。URL“ftp://ftp.ee.lbl.gov/papers/congavoid.ps.Z”。
[Jacobson90] V. Jacobson, "Modified TCP Congestion Avoidance Algorithm", Message to end2end-interest mailing list, April 1990. URL "ftp://ftp.ee.lbl.gov/email/vanj.90apr30.txt".
[Jacobson90]V.Jacobson,“改进的TCP拥塞避免算法”,发送至end2end-interest邮件列表的消息,1990年4月。URL“ftp://ftp.ee.lbl.gov/email/vanj.90apr30.txt”。
[MJV96] S. McCanne, V. Jacobson, and M. Vetterli, "Receiver-driven Layered Multicast", SIGCOMM '96, August 1996, pp. 117-130.
[MJV96]S.McCanne、V.Jacobson和M.Vetterli,“接收器驱动分层多播”,SIGCOMM '96,1996年8月,第117-130页。
[RFC791] Postel, J., "Internet Protocol", STD 5, RFC 791, September 1981.
[RFC791]Postel,J.,“互联网协议”,STD 5,RFC 791,1981年9月。
[RFC793] Postel, J., "Transmission Control Protocol", STD 7, RFC 793, September 1981.
[RFC793]Postel,J.,“传输控制协议”,标准7,RFC 793,1981年9月。
[RFC1141] Mallory, T. and A. Kullberg, "Incremental Updating of the Internet Checksum", RFC 1141, January 1990.
[RFC1141]Mallory,T.和A.Kullberg,“互联网校验和的增量更新”,RFC 1141,1990年1月。
[RFC1349] Almquist, P., "Type of Service in the Internet Protocol Suite", RFC 1349, July 1992.
[RFC1349]Almquist,P.,“互联网协议套件中的服务类型”,RFC1349,1992年7月。
[RFC1455] Eastlake, D., "Physical Link Security Type of Service", RFC 1455, May 1993.
[RFC1455]Eastlake,D.,“物理链路安全服务类型”,RFC 1455,1993年5月。
[RFC2001] Stevens, W., "TCP Slow Start, Congestion Avoidance, Fast Retransmit, and Fast Recovery Algorithms", RFC 2001, January 1997.
[RFC2001]Stevens,W.“TCP慢启动、拥塞避免、快速重传和快速恢复算法”,RFC 2001,1997年1月。
[RFC2309] Braden, B., Clark, D., Crowcroft, J., Davie, B., Deering, S., Estrin, D., Floyd, S., Jacobson, V., Minshall, G., Partridge, C., Peterson, L., Ramakrishnan, K., Shenker, S., Wroclawski, J. and L. Zhang, "Recommendations on Queue Management and Congestion Avoidance in the Internet", RFC 2309, April 1998.
[RFC2309]Braden,B.,Clark,D.,Crowcroft,J.,Davie,B.,Deering,S.,Estrin,D.,Floyd,S.,Jacobson,V.,Minshall,G.,Partridge,C.,Peterson,L.,Ramakrishnan,K.,Shenker,S.,Wroclawski,J.和L.Zhang,“关于互联网中队列管理和拥塞避免的建议”,RFC 2309,1998年4月。
[RJ90] K. K. Ramakrishnan and Raj Jain, "A Binary Feedback Scheme for Congestion Avoidance in Computer Networks", ACM Transactions on Computer Systems, Vol.8, No.2, pp. 158-181, May 1990.
[RJ90]K.K.Ramakrishnan和Raj Jain,“计算机网络中避免拥塞的二进制反馈方案”,ACM计算机系统交易,第8卷,第2期,第158-181页,1990年5月。
Security considerations have been discussed in Section 9.
第9节讨论了安全考虑。
IPv4 header checksum recalculation is an issue with some high-end router architectures using an output-buffered switch, since most if not all of the header manipulation is performed on the input side of the switch, while the ECN decision would need to be made local to the output buffer. This is not an issue for IPv6, since there is no IPv6 header checksum. The IPv4 TOS octet is the last byte of a 16-bit half-word.
IPv4报头校验和重新计算是一些使用输出缓冲交换机的高端路由器体系结构的一个问题,因为大部分(如果不是全部的话)报头操作都是在交换机的输入端执行的,而ECN决策需要在输出缓冲区的本地进行。这不是IPv6的问题,因为没有IPv6标头校验和。IPv4 TOS八位字节是16位半字的最后一个字节。
RFC 1141 [RFC1141] discusses the incremental updating of the IPv4 checksum after the TTL field is decremented. The incremental updating of the IPv4 checksum after the CE bit was set would work as follows: Let HC be the original header checksum, and let HC' be the new header checksum after the CE bit has been set. Then for header checksums calculated with one's complement subtraction, HC' would be recalculated as follows:
RFC 1141[RFC1141]讨论了TTL字段递减后IPv4校验和的增量更新。设置CE位后IPv4校验和的增量更新将按如下方式进行:设HC为原始报头校验和,HC'为设置CE位后的新报头校验和。然后,对于使用反码(one's complement)减法计算的报头校验和,HC'将重新计算如下:
        HC' = { HC - 1     HC > 1
              { 0x0000     HC = 1
For header checksums calculated on two's complement machines, HC' would be recalculated as follows after the CE bit was set:
对于在二进制补码(two's complement)机器上计算的报头校验和,在设置CE位后,HC'将按如下方式重新计算:
        HC' = { HC - 1     HC > 0
              { 0xFFFE     HC = 0
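The two's complement formula can be checked against a full recomputation of the header checksum. The sketch below is for illustration only: the helper names are hypothetical, the header is treated as a byte string with the checksum field at bytes 10 and 11, and CE is assumed to be the least significant bit of the TOS octet (byte 1), as defined in this RFC's IP-header section.

```python
def ones_complement_sum(data: bytes) -> int:
    """One's complement sum of the 16-bit big-endian words in data."""
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def header_checksum(header: bytes) -> int:
    """Full IPv4 header checksum, computed with the checksum field zeroed."""
    zeroed = header[:10] + b"\x00\x00" + header[12:]
    return ~ones_complement_sum(zeroed) & 0xFFFF


def set_ce(header: bytes) -> bytes:
    """Return a copy of the header with the CE bit set in the TOS octet."""
    return header[:1] + bytes([header[1] | 0x01]) + header[2:]


def checksum_after_ce_set(hc: int) -> int:
    """Incremental update for a CE bit going from 0 to 1
    (the two's complement formula above)."""
    return hc - 1 if hc > 0 else 0xFFFE
```

For any header whose CE bit is initially clear, `header_checksum(set_ce(h))` and `checksum_after_ce_set(header_checksum(h))` should agree, including the corner case where the original checksum is zero.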
17. The motivation for the ECT bit.
17. ECT位的动机。
The need for the ECT bit is motivated by the fact that ECN will be deployed incrementally in an Internet where some transport protocols and routers understand ECN and some do not. With the ECT bit, the router can drop packets from flows that are not ECN-capable, but can *instead* set the CE bit in flows that *are* ECN-capable. Because the ECT bit allows an end node to have the CE bit set in a packet *instead* of having the packet dropped, an end node might have some incentive to deploy ECN.
对ECT位的需求是由这样一个事实驱动的,即ECN将以增量方式部署在互联网中,其中一些传输协议和路由器理解ECN,而另一些则不理解。使用ECT位,路由器可以从不支持ECN的流中丢弃数据包,但可以*改为*在*确实*支持ECN的流中设置CE位。由于ECT位允许终端节点的数据包被设置CE位,*而不是*被丢弃,因此终端节点可能有一些动机来部署ECN。
If there were no ECT indication, then the router would have to set the CE bit for packets from both ECN-capable and non-ECN-capable flows. In this case, there would be no incentive for end-nodes to deploy ECN, and no viable path of incremental deployment from a non-ECN world to an ECN-capable world. Consider the first stages of such an incremental deployment, where a subset of the flows are ECN-capable. At the onset of congestion, when the packet dropping/marking rate would be low, routers would only set CE bits, rather than dropping packets. However, only those flows that are ECN-capable would understand and respond to CE packets. The result is that the ECN-capable flows would back off, and the non-ECN-capable flows would be unaware of the ECN signals and would continue to open their congestion windows.
如果没有ECT指示,则路由器必须为来自支持ECN和不支持ECN的流的数据包设置CE位。在这种情况下,终端节点没有动力部署ECN,也没有从非ECN世界到支持ECN世界的增量部署的可行路径。考虑这样的增量部署的第一阶段,其中流的子集是ECN能力的。在拥塞开始时,当数据包丢弃/标记率较低时,路由器将只设置CE位,而不是丢弃数据包。然而,只有那些支持ECN的流才能理解和响应CE数据包。结果是,支持ECN的流将后退,而不支持ECN的流将不知道ECN信号,并将继续打开其拥塞窗口。
In this case, there are two possible outcomes: (1) the ECN-capable flows back off, the non-ECN-capable flows get all of the bandwidth, and congestion remains mild, or (2) the ECN-capable flows back off, the non-ECN-capable flows don't, and congestion increases until the router transitions from setting the CE bit to dropping packets. While this second outcome evens out the fairness, the ECN-capable flows would still receive little benefit from being ECN-capable, because the increased congestion would drive the router to packet-dropping behavior.
在这种情况下,有两种可能的结果:(1)支持ECN的流回退,不支持ECN的流得到所有带宽,拥塞保持轻微,或者(2)支持ECN的流回退,不支持ECN的流没有,直到路由器从设置CE位过渡到丢弃数据包为止,拥塞增加。虽然这第二个结果平衡了公平性,但支持ECN的流仍然不会从支持ECN中获得什么好处,因为增加的拥塞会促使路由器出现丢包行为。
A flow that advertises itself as ECN-Capable but does not respond to CE bits is functionally equivalent to a flow that turns off congestion control, as discussed in Sections 8 and 9.
如第8节和第9节所述,宣称自己具有ECN能力但不响应CE位的流在功能上等同于关闭拥塞控制的流。
Thus, in a world where a subset of the flows are ECN-capable, but where ECN-capable flows have no mechanism for indicating that fact to the routers, there would be less effective and less fair congestion control in the Internet, resulting in a strong incentive for end nodes not to deploy ECN.
因此,在一个流的子集具有ECN能力,但具有ECN能力的流没有向路由器指示这一事实的机制的世界中,互联网中的拥塞控制将不那么有效和公平,导致终端节点不部署ECN的强烈激励。
Given the need for an ECT indication in the IP header, there still remains the question of whether the ECT (ECN-Capable Transport) and CE (Congestion Experienced) indications should be overloaded on a single bit. This overloaded-one-bit alternative, explored in [Floyd94], would involve a single bit with two values. One value, "ECT and not CE", would represent an ECN-Capable Transport, and the other value, "CE or not ECT", would represent either Congestion Experienced or a non-ECN-Capable transport.
鉴于IP报头中需要ECT指示,仍然存在ECT(支持ECN的传输)和CE(经历拥塞)指示是否应在单个位上过载的问题。[Floyd94]中探讨的这种重载一位替代方案将涉及一个具有两个值的位。一个值“ECT和非CE”表示支持ECN的传输,另一个值“CE或非ECT”表示经历的拥塞或不支持ECN的传输。
One difference between the one-bit and two-bit implementations concerns packets that traverse multiple congested routers. Consider a CE packet that arrives at a second congested router, and is selected by the active queue management at that router for either marking or dropping. In the one-bit implementation, the second congested router has no choice but to drop the CE packet, because it cannot distinguish between a CE packet and a non-ECT packet. In the two-bit implementation, the second congested router has the choice of either dropping the CE packet, or of leaving it alone with the CE bit set.
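The choice available at a second congested router can be sketched in a few lines of Python. This is an illustration only, under stated assumptions: the function names, the action strings, and the numeric encoding of the overloaded bit (1 for "ECT and not CE", 0 for "CE or not ECT") are hypothetical and are not specified by this document.

```python
def one_bit_action(bit: int) -> str:
    """Action for a packet selected by active queue management, one-bit scheme.

    Assumed (hypothetical) encoding: 1 = "ECT and not CE", 0 = "CE or not ECT".
    """
    if bit == 1:
        # Known to be ECN-capable and not yet marked: mark it.
        return "mark"
    # A CE packet and a non-ECT packet look identical here, so the
    # router has no safe option other than dropping the packet.
    return "drop"


def two_bit_actions(ect: bool, ce: bool) -> set:
    """Actions discussed in the text for a selected packet, two-bit scheme."""
    if not ect:
        return {"drop"}            # not ECN-capable: drop as before
    if ce:
        return {"drop", "leave"}   # already marked: drop it or leave CE set
    return {"mark"}                # ECN-capable, unmarked: set the CE bit


# A packet already marked by a first congested router:
print(one_bit_action(0))            # one-bit scheme
print(two_bit_actions(True, True))  # two-bit scheme
```

With the one-bit encoding the already-marked packet must be dropped, while the two-bit encoding leaves the second router free to forward it with the CE bit still set.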
Another difference between the one-bit and two-bit implementations comes from the fact that with the one-bit implementation, receivers in a single flow cannot distinguish between CE and non-ECT packets. Thus, in the one-bit implementation an ECN-capable data sender would have to unambiguously indicate to the receiver or receivers whether each packet had been sent as ECN-Capable or as non-ECN-Capable. One possibility would be for the sender to indicate in the transport header whether the packet was sent as ECN-Capable. A second possibility, which would involve a functional limitation for the one-bit implementation, would be for the sender to unambiguously indicate that it was going to send *all* of its packets as ECN-Capable or as non-ECN-Capable. For a multicast transport protocol, this unambiguous indication would have to be apparent to receivers joining an on-going multicast session.
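The receiver's ambiguity can be made concrete with a small Python sketch. The state names, the helper functions, and the numeric bit values are assumptions introduced for illustration; only the two overloaded meanings, "ECT and not CE" and "CE or not ECT", come from the text.

```python
from enum import Enum


class State(Enum):
    NOT_ECT = "not ECT"
    ECT = "ECT and not CE"
    ECT_CE = "ECT and CE"


def one_bit(state: State) -> int:
    # Hypothetical encoding: 1 = "ECT and not CE", 0 = "CE or not ECT".
    return 1 if state is State.ECT else 0


def two_bit(state: State) -> tuple:
    # (ECT bit, CE bit)
    return {
        State.NOT_ECT: (0, 0),
        State.ECT: (1, 0),
        State.ECT_CE: (1, 1),
    }[state]


def candidates(encode, observed) -> list:
    """All packet states that a receiver cannot tell apart from `observed`."""
    return [s for s in State if encode(s) == observed]


print(candidates(one_bit, 0))       # two states share one encoding
print(candidates(two_bit, (1, 1)))  # exactly one state
```

Under the one-bit encoding the value 0 maps back to two different states, which is why the sender would need to convey the missing information some other way, for example in the transport header.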
Another advantage of the two-bit approach is that it is somewhat more robust. The most critical issue, discussed in Section 8, is that the default indication should be that of a non-ECN-Capable transport. In a two-bit implementation, this requirement for the default value simply means that the ECT bit should be `OFF' by default. In the one-bit implementation, this means that the single overloaded bit should by default be in the "CE or not ECT" position. This is less clear and straightforward, and possibly more open to incorrect implementations either in the end nodes or in the routers.
In summary, while the one-bit implementation is possible, it has the following significant limitations relative to the two-bit implementation. First, the one-bit implementation has more limited functionality for the treatment of CE packets at a second congested router. Second, the one-bit implementation requires either that extra information be carried in the transport header of packets from ECN-Capable flows (to convey the functionality of the second bit elsewhere, namely in the transport header), or that senders in ECN-Capable flows accept the limitation that receivers must be able to determine a priori which packets are ECN-Capable and which are not ECN-Capable. Third, the one-bit implementation is possibly more open to errors from faulty implementations that choose the wrong default value for the ECN bit. We believe that the use of the extra bit in the IP header for the ECT-bit is extremely valuable to overcome these limitations.
RFC 791 [RFC791] defined the ToS (Type of Service) octet in the IP header. In RFC 791, bits 6 and 7 of the ToS octet are listed as "Reserved for Future Use", and are shown set to zero. The first two fields of the ToS octet were defined as the Precedence and Type of Service (TOS) fields.
         0     1     2     3     4     5     6     7
      +-----+-----+-----+-----+-----+-----+-----+-----+
      |   PRECEDENCE    |       TOS       |  0  |  0  |  RFC 791
      +-----+-----+-----+-----+-----+-----+-----+-----+
RFC 1122 included bits 6 and 7 in the TOS field, though it did not discuss any specific use for those two bits:
         0     1     2     3     4     5     6     7
      +-----+-----+-----+-----+-----+-----+-----+-----+
      |   PRECEDENCE    |              TOS            |  RFC 1122
      +-----+-----+-----+-----+-----+-----+-----+-----+
The IPv4 TOS octet was redefined in RFC 1349 [RFC1349] as follows:
         0     1     2     3     4     5     6     7
      +-----+-----+-----+-----+-----+-----+-----+-----+
      |   PRECEDENCE    |          TOS          | MBZ |  RFC 1349
      +-----+-----+-----+-----+-----+-----+-----+-----+
Bit 6 in the TOS field was defined in RFC 1349 for "Minimize Monetary Cost". In addition to the Precedence and Type of Service (TOS) fields, the last field, MBZ (for "must be zero") was defined as currently unused. RFC 1349 stated that "The originator of a datagram sets [the MBZ] field to zero (unless participating in an Internet protocol experiment which makes use of that bit)."
RFC 1455 [RFC 1455] defined an experimental standard that used all four bits in the TOS field to request a guaranteed level of link security.
RFC 1349 is obsoleted by "Definition of the Differentiated Services Field (DS Field) in the IPv4 and IPv6 Headers" [DIFFSERV], in which bits 6 and 7 of the DS field are listed as Currently Unused (CU). The first six bits of the DS field are defined as the Differentiated Services CodePoint (DSCP):
         0     1     2     3     4     5     6     7
      +-----+-----+-----+-----+-----+-----+-----+-----+
      |               DSCP                |    CU     |
      +-----+-----+-----+-----+-----+-----+-----+-----+
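The successive layouts of this octet can be compared with a short Python sketch. The function and dictionary key names are hypothetical. The masks assume that bit 0 in the diagrams is the most significant bit of the octet, and that the ECT and CE bits occupy bits 6 and 7 respectively, as the ECN field is defined in this document.

```python
ECT_MASK = 0x02  # bit 6 in the diagrams (bit 0 is the most significant bit)
CE_MASK = 0x01   # bit 7 in the diagrams


def _check(octet: int) -> None:
    if not 0 <= octet <= 0xFF:
        raise ValueError("octet must be in the range 0..255")


def split_rfc791(octet: int) -> dict:
    """Precedence (bits 0-2), TOS (bits 3-5), reserved (bits 6-7)."""
    _check(octet)
    return {"precedence": octet >> 5,
            "tos": (octet >> 2) & 0x07,
            "reserved": octet & 0x03}


def split_rfc1349(octet: int) -> dict:
    """Precedence (bits 0-2), TOS (bits 3-6), MBZ (bit 7)."""
    _check(octet)
    return {"precedence": octet >> 5,
            "tos": (octet >> 1) & 0x0F,
            "mbz": octet & 0x01}


def split_ds_field(octet: int) -> dict:
    """DSCP (bits 0-5), Currently Unused (bits 6-7)."""
    _check(octet)
    return {"dscp": octet >> 2, "cu": octet & 0x03}


def ecn_bits(octet: int) -> dict:
    """ECT and CE bits, assuming they sit in bits 6 and 7."""
    _check(octet)
    return {"ect": bool(octet & ECT_MASK), "ce": bool(octet & CE_MASK)}


# The same octet read under two of the historical definitions:
octet = 0x02
print(split_rfc1349(octet))  # bit 6 belongs to the four-bit TOS field
print(ecn_bits(octet))       # bit 6 is read as the ECT bit
```

The example shows why backwards compatibility cannot be guaranteed: an octet whose bit 6 was set under the RFC 1349 definition would be read as carrying the ECT bit under the ECN definition.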
Because of this unstable history, the definition of the ECN field in this document cannot be guaranteed to be backwards compatible with all past uses of these two bits. The damage that could be done by a non-ECN-capable router would be to "erase" the CE bit for an ECN-capable packet that arrived at the router with the CE bit set, or set the CE bit even in the absence of congestion. This has been discussed in Section 10 on "Non-compliance in the Network".
The damage that could be done in an ECN-capable environment by a non-ECN-capable end-node transmitting packets with the ECT bit set has been discussed in Section 9 on "Non-compliance by the End Nodes".
AUTHORS' ADDRESSES
K. K. Ramakrishnan
AT&T Labs. Research

Phone: +1 (973) 360-8766
EMail: kkrama@research.att.com
URL: http://www.research.att.com/info/kkrama
Sally Floyd
Lawrence Berkeley National Laboratory

Phone: +1 (510) 486-7518
EMail: floyd@ee.lbl.gov
URL: http://www-nrg.ee.lbl.gov/floyd/
Full Copyright Statement
Copyright (C) The Internet Society (1999). All Rights Reserved.
This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English.
The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assigns.
This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.