Internet Engineering Task Force (IETF)                        B. Briscoe
Request for Comments: 7141                                            BT
BCP: 41                                                        J. Manner
Updates: 2309, 2914                                     Aalto University
Category: Best Current Practice                            February 2014
ISSN: 2070-1721
Byte and Packet Congestion Notification
Abstract
This document provides recommendations of best current practice for dropping or marking packets using any active queue management (AQM) algorithm, including Random Early Detection (RED), BLUE, Pre-Congestion Notification (PCN), and newer schemes such as CoDel (Controlled Delay) and PIE (Proportional Integral controller Enhanced). We give three strong recommendations: (1) packet size should be taken into account when transports detect and respond to congestion indications, (2) packet size should not be taken into account when network equipment creates congestion signals (marking, dropping), and therefore (3) in the specific case of RED, the byte-mode packet drop variant that drops fewer small packets should not be used. This memo updates RFC 2309 to deprecate deliberate preferential treatment of small packets in AQM algorithms.
Status of This Memo
This memo documents an Internet Best Current Practice.
This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Further information on BCPs is available in Section 2 of RFC 5741.
Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at http://www.rfc-editor.org/info/rfc7141.
Copyright Notice
Copyright (c) 2014 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
Table of Contents
1. Introduction
   1.1. Terminology and Scoping
   1.2. Example Comparing Packet-Mode Drop and Byte-Mode Drop
2. Recommendations
   2.1. Recommendation on Queue Measurement
   2.2. Recommendation on Encoding Congestion Notification
   2.3. Recommendation on Responding to Congestion
   2.4. Recommendation on Handling Congestion Indications When Splitting or Merging Packets
3. Motivating Arguments
   3.1. Avoiding Perverse Incentives to (Ab)use Smaller Packets
   3.2. Small != Control
   3.3. Transport-Independent Network
   3.4. Partial Deployment of AQM
   3.5. Implementation Efficiency
4. A Survey and Critique of Past Advice
   4.1. Congestion Measurement Advice
      4.1.1. Fixed-Size Packet Buffers
      4.1.2. Congestion Measurement without a Queue
   4.2. Congestion Notification Advice
      4.2.1. Network Bias When Encoding
      4.2.2. Transport Bias When Decoding
      4.2.3. Making Transports Robust against Control Packet Losses
      4.2.4. Congestion Notification: Summary of Conflicting Advice
5. Outstanding Issues and Next Steps
   5.1. Bit-congestible Network
   5.2. Bit- and Packet-Congestible Network
6. Security Considerations
7. Conclusions
8. Acknowledgements
9. References
   9.1. Normative References
   9.2. Informative References
Appendix A. Survey of RED Implementation Status
Appendix B. Sufficiency of Packet-Mode Drop
   B.1. Packet-Size (In)Dependence in Transports
   B.2. Bit-Congestible and Packet-Congestible Indications
Appendix C. Byte-Mode Drop Complicates Policing Congestion Response
1. Introduction
This document provides recommendations of best current practice for how we should correctly scale congestion control functions with respect to packet size for the long term. It also recognises that expediency may be necessary to deal with existing widely deployed protocols that don't live up to the long-term goal.
When signalling congestion, the problem of how (and whether) to take packet sizes into account has exercised the minds of researchers and practitioners for as long as active queue management (AQM) has been discussed. Indeed, one reason AQM was originally introduced was to reduce the lock-out effects that small packets can have on large packets in tail-drop queues. This memo aims to state the principles we should be using and to outline how these principles will affect future protocol design, taking into account pre-existing deployments.
The question of whether to take into account packet size arises at three stages in the congestion notification process:
Measuring congestion: When a congested resource measures locally how congested it is, should it measure its queue length in time, bytes, or packets?
Encoding congestion notification into the wire protocol: When a congested network resource signals its level of congestion, should the probability that it drops/marks each packet depend on the size of the particular packet in question?
Decoding congestion notification from the wire protocol: When a transport interprets the notification in order to decide how much to respond to congestion, should it take into account the size of each missing or marked packet?
Consensus has emerged over the years concerning the first stage, which Section 2.1 records in the RFC Series. In summary: If possible, it is best to measure congestion by time in the queue; otherwise, the choice between bytes and packets solely depends on whether the resource is congested by bytes or packets.
The controversy is mainly around the last two stages: whether to allow for the size of the specific packet notifying congestion i) when the network encodes or ii) when the transport decodes the congestion notification.
Currently, the RFC series is silent on this matter other than a paper trail of advice referenced from [RFC2309], which conditionally recommends byte-mode (packet-size dependent) drop [pktByteEmail].
Reducing the number of small packets dropped certainly has some tempting advantages: i) it drops fewer control packets, which tend to be small and ii) it makes TCP's bit rate less dependent on packet size. However, there are ways of addressing these issues at the transport layer, rather than reverse engineering network forwarding to fix the problems.
This memo updates [RFC2309] to deprecate deliberate preferential treatment of packets in AQM algorithms solely because of their size. It recommends that (1) packet size should be taken into account when transports detect and respond to congestion indications, (2) not when network equipment creates them. This memo also adds to the congestion control principles enumerated in BCP 41 [RFC2914].
In the particular case of Random Early Detection (RED), this means that the byte-mode packet drop variant should not be used to drop fewer small packets, because that creates a perverse incentive for transports to use tiny segments, consequently also opening up a DoS vulnerability. Fortunately, none of the RED implementers who responded to our admittedly limited survey (Section 4.2.4) have followed the earlier advice to use byte-mode drop, so the position this memo argues for seems to already exist in implementations.
However, at the transport layer, TCP congestion control is a widely deployed protocol that doesn't scale with packet size (i.e., its reduction in rate does not take into account the size of a lost packet). To date, this hasn't been a significant problem because most TCP implementations have been used with similar packet sizes. But, as we design new congestion control mechanisms, this memo recommends that we build in scaling with packet size rather than assuming that we should follow TCP's example.
This memo continues as follows. First, it discusses terminology and scoping. Section 2 gives concrete formal recommendations, followed by motivating arguments in Section 3. We then critically survey the advice given previously in the RFC Series and the research literature (Section 4), referring to an assessment of whether or not this advice has been followed in production networks (Appendix A). To wrap up, outstanding issues are discussed that will need resolution both to inform future protocol designs and to handle legacy AQM deployments (Section 5). Then security issues are collected together in Section 6 before conclusions are drawn in Section 7. The interested reader can find discussion of more detailed issues on the theme of byte vs. packet in the appendices.
This memo intentionally includes a non-negligible amount of material on the subject. For the busy reader, Section 2 summarises the recommendations for the Internet community.
1.1. Terminology and Scoping
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].
This memo applies to the design of all AQM algorithms, for example, Random Early Detection (RED) [RFC2309], BLUE [BLUE02], Pre-Congestion Notification (PCN) [RFC5670], Controlled Delay (CoDel) [CoDel], and the Proportional Integral controller Enhanced (PIE) [PIE]. Throughout, RED is used as a concrete example because it is a widely known and deployed AQM algorithm. There is no intention to imply that the advice is any less applicable to the other algorithms, nor that RED is preferred.
Congestion Notification: Congestion notification is a changing signal that aims to communicate the probability that the network resource(s) will not be able to forward the level of traffic load offered (or that there is an impending risk that they will not be able to).
The 'impending risk' qualifier is added, because AQM systems set a virtual limit smaller than the actual limit to the resource, then notify the transport when this virtual limit is exceeded in order to avoid uncontrolled congestion of the actual capacity.
Congestion notification communicates a real number bounded by the range [ 0 , 1 ]. This ties in with the most well-understood measure of congestion notification: drop probability.
Explicit and Implicit Notification: The byte vs. packet dilemma concerns congestion notification irrespective of whether it is signalled implicitly by drop or explicitly using ECN [RFC3168] or PCN [RFC5670]. Throughout this document, unless clear from the context, the term 'marking' will be used to mean notifying congestion explicitly, while 'congestion notification' will be used to mean notifying congestion either implicitly by drop or explicitly by marking.
Bit-congestible vs. Packet-congestible: If the load on a resource depends on the rate at which packets arrive, it is called 'packet-congestible'. If the load depends on the rate at which bits arrive, it is called 'bit-congestible'.
Examples of packet-congestible resources are route look-up engines and firewalls, because load depends on how many packet headers they have to process. Examples of bit-congestible resources are transmission links, radio power, and most buffer memory, because the load depends on how many bits they have to transmit or store. Some machine architectures use fixed-size packet buffers, so buffer memory in these cases is packet-congestible (see Section 4.1.1).
The path through a machine will typically encounter both packet-congestible and bit-congestible resources. However, currently, a design goal of network processing equipment such as routers and firewalls is to size the packet-processing engine(s) relative to the lines in order to keep packet processing uncongested, even under worst-case packet rates with runs of minimum-size packets. Therefore, packet congestion is currently rare (see Section 3.3 of [RFC6077]), but there is no guarantee that it will not become more common in the future.
Note that information is generally processed or transmitted with a minimum granularity greater than a bit (e.g., octets). The appropriate granularity for the resource in question should be used, but for the sake of brevity we will talk in terms of bytes in this memo.
Coarser Granularity: Resources may be congestible at higher levels of granularity than bits or packets, for instance stateful firewalls are flow-congestible and call-servers are session-congestible. This memo focuses on congestion of connectionless resources, but the same principles may be applicable for congestion notification protocols controlling per-flow and per-session processing or state.
RED Terminology: In RED, whether to use packets or bytes when measuring queues is called, respectively, 'packet-mode queue measurement' or 'byte-mode queue measurement'. And whether the probability of dropping a particular packet is independent or dependent on its size is called, respectively, 'packet-mode drop' or 'byte-mode drop'. The terms 'byte-mode' and 'packet-mode' should not be used without specifying whether they apply to queue measurement or to drop.
1.2. Example Comparing Packet-Mode Drop and Byte-Mode Drop
Taking RED as a well-known example algorithm, a central question addressed by this document is whether to recommend RED's packet-mode drop variant and to deprecate byte-mode drop. Table 1 compares how packet-mode and byte-mode drop affect two flows of different size packets. For each it gives the expected number of packets and of bits dropped in one second. Each example flow runs at the same bit rate of 48 Mbps, but one is broken up into small 60 byte packets and the other into large 1,500 byte packets.
To keep up the same bit rate, in one second there are about 25 times more small packets because they are 25 times smaller. As can be seen from the table, the packet rate is 100,000 small packets versus 4,000 large packets per second (pps).
Parameter            Formula         Small packets Large packets
-------------------- --------------- ------------- -------------
Packet size          s/8                      60 B       1,500 B
Packet size          s                       480 b      12,000 b
Bit rate             x                     48 Mbps       48 Mbps
Packet rate          u = x/s              100 kpps        4 kpps

Packet-mode Drop
Pkt-loss probability p                        0.1%          0.1%
Pkt-loss rate        p*u                   100 pps         4 pps
Bit-loss rate        p*u*s                 48 kbps       48 kbps

Byte-mode Drop       MTU, M=12,000 b
Pkt-loss probability b = p*s/M              0.004%          0.1%
Pkt-loss rate        b*u                     4 pps         4 pps
Bit-loss rate        b*u*s               1.92 kbps       48 kbps
Table 1: Example Comparing Packet-Mode and Byte-Mode Drop
For packet-mode drop, we illustrate the effect of a drop probability of 0.1%, which the algorithm applies to all packets irrespective of size. Because there are 25 times more small packets in one second, it naturally drops 25 times more small packets, that is, 100 small packets but only 4 large packets. But if we count how many bits it drops, there are 48,000 bits in 100 small packets and 48,000 bits in 4 large packets -- the same number of bits of small packets as large.
The packet-mode drop algorithm drops any bit with the same probability whether the bit is in a small or a large packet.
For byte-mode drop, again we use an example drop probability of 0.1%, but only for maximum size packets (assuming the link maximum transmission unit (MTU) is 1,500 B or 12,000 b). The byte-mode algorithm reduces the drop probability of smaller packets proportional to their size, making the probability that it drops a small packet 25 times smaller at 0.004%. But there are 25 times more small packets, so dropping them with 25 times lower probability results in dropping the same number of packets: 4 drops in both cases. The 4 small dropped packets contain 25 times fewer bits than the 4 large dropped packets: 1,920 compared to 48,000.
The byte-mode drop algorithm drops any bit with a probability proportionate to the size of the packet it is in.
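The arithmetic behind Table 1 can be checked with a short sketch. It only restates the table's formulas (u = x/s, p*u, p*u*s, and b = p*s/M); the function name `drop_rates` and the constant names are illustrative and do not come from this memo.

```python
# Reproduce the arithmetic of Table 1 for two flows at the same bit rate.
# All names here are illustrative; only the formulas come from the table.
BIT_RATE = 48_000_000  # x: bit rate of each flow, in bits per second
MTU_BITS = 12_000      # M: link MTU in bits (1,500 B)
P = 0.001              # p: example drop probability of 0.1%


def drop_rates(size_bits, byte_mode):
    """Return (packet-loss probability, packets lost/s, bits lost/s)."""
    packet_rate = BIT_RATE / size_bits  # u = x/s
    if byte_mode:
        prob = P * size_bits / MTU_BITS  # b = p*s/M, scaled by packet size
    else:
        prob = P                         # p, independent of packet size
    pkt_loss = prob * packet_rate        # p*u or b*u
    bit_loss = pkt_loss * size_bits      # p*u*s or b*u*s
    return prob, pkt_loss, bit_loss


for label, size_bits in (("small", 480), ("large", 12_000)):
    for byte_mode in (False, True):
        prob, pkt_loss, bit_loss = drop_rates(size_bits, byte_mode)
        mode = "byte-mode" if byte_mode else "packet-mode"
        print(f"{label} {mode}: p={prob:.4%} "
              f"pkt-loss={pkt_loss:.0f} pps bit-loss={bit_loss:.0f} bps")
```

With packet-mode drop, both flows lose bits at the same rate; with byte-mode drop, both lose packets at the same rate, so the small-packet flow loses far fewer bits.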
2. Recommendations
This section gives recommendations related to network equipment in Sections 2.1 and 2.2, and we discuss the implications on transport protocols in Sections 2.3 and 2.4.
2.1. Recommendation on Queue Measurement
Ideally, an AQM would measure the service time of the queue to measure congestion of a resource. However, service time can only be measured as packets leave the queue, where it is not always expedient to implement a full AQM algorithm. To predict the service time as packets join the queue, an AQM algorithm needs to measure the length of the queue.
In this case, if the resource is bit-congestible, the AQM implementation SHOULD measure the length of the queue in bytes and, if the resource is packet-congestible, the implementation SHOULD measure the length of the queue in packets. Subject to the exceptions below, no other choice makes sense, because the number of packets waiting in the queue isn't relevant if the resource gets congested by bytes and vice versa. For example, the length of the queue into a transmission line would be measured in bytes, while the length of the queue into a firewall would be measured in packets.
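As a minimal sketch of this choice of unit, the hypothetical class below reports its length in bytes for a bit-congestible resource and in packets for a packet-congestible one. The class and attribute names are assumptions made for illustration. The unit is fixed at construction to reflect that it is a property of the resource, not an operator setting.

```python
from collections import deque


class MeasuredQueue:
    """FIFO queue whose length is reported in the unit that congests the resource.

    Hypothetical helper for illustration; not an API defined by this memo.
    """

    def __init__(self, bit_congestible):
        # True for a bit-congestible resource (e.g., a transmission line),
        # False for a packet-congestible one (e.g., a firewall).
        self.bit_congestible = bit_congestible
        self._sizes = deque()  # size in bytes of each queued packet
        self._bytes = 0        # running total of queued bytes

    def enqueue(self, size_bytes):
        self._sizes.append(size_bytes)
        self._bytes += size_bytes

    def dequeue(self):
        size_bytes = self._sizes.popleft()
        self._bytes -= size_bytes
        return size_bytes

    def length(self):
        """Queue length in bytes or in packets, depending on the resource."""
        return self._bytes if self.bit_congestible else len(self._sizes)


line_queue = MeasuredQueue(bit_congestible=True)
firewall_queue = MeasuredQueue(bit_congestible=False)
for q in (line_queue, firewall_queue):
    q.enqueue(60)
    q.enqueue(1500)
print(line_queue.length(), firewall_queue.length())
```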
To avoid the pathological effects of tail drop, the AQM can then transform this service time or queue length into the probability of dropping or marking a packet (e.g., RED's piecewise linear function between thresholds).
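The piecewise linear transformation mentioned for RED can be sketched as follows. This is a simplified illustration: the parameter names `min_th`, `max_th`, and `max_p` follow common descriptions of RED rather than this memo, and the sketch leaves out RED's queue averaging and its spacing of drops. The function takes a queue measurement as input but not the size of the arriving packet.

```python
def red_probability(avg_queue, min_th, max_th, max_p):
    """Map a (smoothed) queue measurement to a drop/mark probability.

    Simplified sketch of RED's piecewise linear function:
    - below min_th: never drop or mark
    - between min_th and max_th: ramp linearly from 0 up to max_p
    - at or above max_th: always drop or mark
    The queue measurement and thresholds must use the same unit
    (bytes for a bit-congestible resource, packets otherwise).
    """
    if avg_queue < min_th:
        return 0.0
    if avg_queue >= max_th:
        return 1.0
    return max_p * (avg_queue - min_th) / (max_th - min_th)


# Thresholds in bytes, as would suit a queue into a transmission line.
for queue_bytes in (5_000, 20_000, 30_000):
    print(queue_bytes, red_probability(queue_bytes, 10_000, 30_000, 0.1))
```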
What this advice means for RED as a specific example:
1. A RED implementation SHOULD use byte-mode queue measurement for measuring the congestion of bit-congestible resources and packet-mode queue measurement for packet-congestible resources.
2. An implementation SHOULD NOT make it possible to configure the way a queue measures itself, because whether a queue is bit-congestible or packet-congestible is an inherent property of the queue.
Exceptions to these recommendations might be necessary, for instance where a packet-congestible resource has to be configured as a proxy bottleneck for a bit-congestible resource in an adjacent box that does not support AQM.
The recommended approach in less straightforward scenarios, such as fixed-size packet buffers, resources without a queue, and buffers comprising a mix of packet and bit-congestible resources, is discussed in Section 4.1. For instance, Section 4.1.1 explains that the queue into a line should be measured in bytes even if the queue consists of fixed-size packet buffers, because the root cause of any congestion is bytes arriving too fast for the line -- packets filling buffers are merely a symptom of the underlying congestion of the line.
2.2. Recommendation on Encoding Congestion Notification
When encoding congestion notification (e.g., by drop, ECN, or PCN), the probability that network equipment drops or marks a particular packet to notify congestion SHOULD NOT depend on the size of the packet in question. As the example in Section 1.2 illustrates, to drop any bit with probability 0.1%, it is only necessary to drop every packet with probability 0.1% without regard to the size of each packet.
This approach ensures the network layer offers sufficient congestion information for all known and future transport protocols and also ensures no perverse incentives are created that would encourage transports to use inappropriately small packet sizes.
What this advice means for RED as a specific example:
1. The RED AQM algorithm SHOULD NOT use byte-mode drop, i.e., it ought to use packet-mode drop. Byte-mode drop is more complex, creates the perverse incentive to fragment segments into tiny pieces, and is vulnerable to floods of small packets.
2. If a vendor has implemented byte-mode drop, and an operator has turned it on, it is RECOMMENDED that the operator use packet-mode drop instead, after establishing if there are any implications on the relative performance of applications using different packet sizes. The unlikely possibility of some application-specific legacy use of byte-mode drop is the only reason that all the above recommendations on encoding congestion notification are not phrased more strongly.
RED as a whole SHOULD NOT be switched off. Without RED, a tail-drop queue biases against large packets and is vulnerable to floods of small packets.
Note well that RED's byte-mode queue drop is completely orthogonal to byte-mode queue measurement and should not be confused with it. If a RED implementation has a byte-mode but does not specify what sort of byte-mode, it is most probably byte-mode queue measurement, which is fine. However, if in doubt, the vendor should be consulted.
A survey (Appendix A) showed that there appears to be little, if any, installed base of the byte-mode drop variant of RED. This suggests that deprecating byte-mode drop will have little, if any, incremental deployment impact.
2.3. Recommendation on Responding to Congestion
When a transport detects that a packet has been lost or congestion marked, it SHOULD consider the strength of the congestion indication as proportionate to the size in octets (bytes) of the missing or marked packet.
In other words, when a packet indicates congestion (by being lost or marked), it can be considered conceptually as if there is a congestion indication on every octet of the packet, not just one indication per packet.
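One way to read "an indication on every octet" is to weight each lost or marked packet by its size when estimating the congestion level. The sketch below contrasts that with a per-packet count. The function names and the example traffic are hypothetical, and the sketch only covers how indications might be interpreted, not how a transport should react to them.

```python
def congestion_by_octets(packets):
    """Fraction of octets that carried a congestion indication.

    `packets` is a list of (size_bytes, congested) pairs, where `congested`
    is True if the packet was lost or congestion marked.
    """
    total = sum(size for size, _ in packets)
    indicated = sum(size for size, congested in packets if congested)
    return indicated / total if total else 0.0


def congestion_by_packets(packets):
    """Fraction of packets that carried a congestion indication."""
    if not packets:
        return 0.0
    return sum(1 for _, congested in packets if congested) / len(packets)


# One large marked packet among three small unmarked ones.
observed = [(1500, True), (60, False), (60, False), (60, False)]
print(congestion_by_octets(observed), congestion_by_packets(observed))
```

In this example the per-packet count gives 25%, while the per-octet interpretation gives a much higher level because the single marked packet carries most of the bytes.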
To be clear, the above recommendation solely describes how a transport should interpret the meaning of a congestion indication, as a long term goal. It makes no recommendation on whether a transport should act differently based on this interpretation. It merely aids interoperability between transports, if they choose to make their actions depend on the strength of congestion indications.
This definition will be useful as the IETF transport area continues its programme of:
o updating host-based congestion control protocols to take packet size into account, and
o making transports less sensitive to losing control packets like SYNs and pure ACKs.
What this advice means for the case of TCP:
1. If two TCP flows with different packet sizes are required to run at equal bit rates under the same path conditions, this SHOULD be done by altering TCP (Section 4.2.2), not network equipment (the latter affects other transports besides TCP).
1. 如果要求两个具有不同数据包大小的TCP流在相同路径条件下以相同的比特率运行,则应通过改变TCP(第4.2.2节)而不是网络设备(后者影响TCP以外的其他传输)来实现。
2. If it is desired to improve TCP performance by reducing the chance that a SYN or a pure ACK will be dropped, this SHOULD be done by modifying TCP (Section 4.2.3), not network equipment.
2. 如果希望通过降低SYN或纯ACK被丢弃的可能性来提高TCP性能,则应通过修改TCP(第4.2.3节)而不是网络设备来实现。
To be clear, we are not recommending at all that TCPs under equivalent conditions should aim for equal bit rates. We are merely saying that anyone trying to do such a thing should modify their TCP algorithm, not the network.
明确地说,我们根本不建议在同等条件下,TCP应以相同的比特率为目标。我们只是说,任何试图这样做的人都应该修改他们的TCP算法,而不是网络。
These recommendations are phrased as 'SHOULD' rather than 'MUST', because there may be cases where expediency dictates that compatibility with pre-existing versions of a transport protocol make the recommendations impractical.
这些建议的措辞是“应该”而不是“必须”,因为在某些情况下,为了方便起见,与传输协议的现有版本兼容可能会导致建议不切实际。
2.4. Recommendation on Handling Congestion Indications When Splitting or Merging Packets
2.4. 关于在拆分或合并数据包时处理拥塞指示的建议
Packets carrying congestion indications may be split or merged in some circumstances (e.g., at an RTP / RTP Control Protocol (RTCP) transcoder or during IP fragment reassembly). Splitting and merging only make sense in the context of ECN, not loss.
在某些情况下(例如,在RTP/RTP控制协议(RTCP)转码器处或在IP片段重组期间),携带拥塞指示的数据包可能被拆分或合并。拆分和合并仅在ECN的上下文中才有意义,而不是丢失。
The general rule to follow is that the number of octets in packets with congestion indications SHOULD be equivalent before and after merging or splitting. This is based on the principle used above; that an indication of congestion on a packet can be considered as an indication of congestion on each octet of the packet.
遵循的一般规则是,在合并或拆分之前和之后,具有拥塞指示的数据包中的八位字节数应该相等。这是基于上述原则;分组上的拥塞指示可被视为分组的每个八位组上的拥塞指示。
The above rule is not phrased with the word 'MUST' to allow the following exception. There are cases in which pre-existing protocols were not designed to conserve congestion-marked octets (e.g., IP fragment reassembly [RFC3168] or loss statistics in RTCP receiver reports [RFC3550] before ECN was added [RFC6679]). When any such protocol is updated, it SHOULD comply with the above rule to conserve marked octets. However, the rule may be relaxed if it would otherwise become too complex to interoperate with pre-existing implementations of the protocol.
上述规则未使用“必须”一词,以允许以下例外情况。在某些情况下,预先存在的协议未设计为保留拥塞标记的八位字节(例如,在添加ECN之前,IP片段重组[RFC3168]或RTCP接收器报告[RFC3550]中的丢失统计[RFC6679])。更新任何此类协议时,应遵守上述规则以保留标记的八位字节。但是,如果该规则变得过于复杂,无法与协议的现有实现进行互操作,则可以放宽该规则。
One can think of a splitting or merging process as if all the incoming congestion-marked octets increment a counter and all the outgoing marked octets decrement the same counter. In order to ensure that congestion indications remain timely, even the smallest positive remainder in the conceptual counter should trigger the next outgoing packet to be marked (causing the counter to go negative).
我们可以将拆分或合并过程想象为所有传入的标记为拥塞的八位字节递增一个计数器,所有传出的已标记八位字节递减同一个计数器。为了确保拥塞指示保持及时,即使概念计数器中最小的正余数也应触发要标记的下一个传出数据包(导致计数器变为负)。
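The conceptual counter described above might be sketched as follows. This is only an illustration under stated assumptions: the class name and its methods are hypothetical, and a real splitter or merger would also have to associate outgoing packets with the incoming packets they were derived from.

```python
class MarkedOctetCounter:
    """Conceptual counter of congestion-marked octets (hypothetical sketch)."""

    def __init__(self) -> None:
        self.balance = 0  # marked octets received minus marked octets sent

    def on_input(self, size_octets: int, marked: bool) -> None:
        # Every incoming congestion-marked octet increments the counter.
        if marked:
            self.balance += size_octets

    def on_output(self, size_octets: int) -> bool:
        """Return True if the outgoing packet should be marked."""
        if self.balance > 0:
            # Even the smallest positive remainder triggers a mark,
            # which may drive the counter negative.
            self.balance -= size_octets
            return True
        return False


# Splitting: one marked 1,500-octet packet becomes 500-octet packets.
splitter = MarkedOctetCounter()
splitter.on_input(1500, marked=True)
split_marks = [splitter.on_output(500) for _ in range(4)]
print(split_marks, splitter.balance)

# Merging: a marked 500-octet packet is merged into a 1,500-octet packet.
merger = MarkedOctetCounter()
merger.on_input(500, marked=True)
print(merger.on_output(1500), merger.balance)
```

In the merging case the counter goes negative, so later marked input first repays the deficit before it can trigger another outgoing mark.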
This section is informative. It justifies the recommendations made in the previous section.
本节为参考性内容。它说明了上一节中所提建议的理由。
Increasingly, it is being recognised that a protocol design must take care not to cause unintended consequences by giving the parties in the protocol exchange perverse incentives [Evol_cc] [RFC3426]. Given there are many good reasons why larger path maximum transmission units (PMTUs) would help solve a number of scaling issues, we do not want to create any bias against large packets that is greater than their true cost.
越来越多的人认识到,协议设计必须注意,通过给予协议交换各方不正当的激励[Evol_cc][RFC3426],避免造成意外后果。考虑到较大的路径最大传输单元(PMTU)有许多很好的理由可以帮助解决许多扩展问题,我们不希望对大于其真实成本的大数据包产生任何偏见。
Imagine a scenario where the same bit rate of packets will contribute the same to bit congestion of a link irrespective of whether it is sent as fewer larger packets or more smaller packets. A protocol design that caused larger packets to be more likely to be dropped than smaller ones would be dangerous in both of the following cases:
想象一个场景,在这个场景中,数据包的相同比特率将导致链路的相同比特拥塞,而不管它是作为更小的数据包还是更小的数据包发送的。在以下两种情况下,导致较大数据包比较小数据包更有可能被丢弃的协议设计都是危险的:
Malicious transports: A queue that gives an advantage to small packets can be used to amplify the force of a flooding attack. By sending a flood of small packets, the attacker can get the queue to discard more large-packet traffic, allowing more attack traffic to get through to cause further damage. Such a queue allows attack traffic to have a disproportionately large effect on regular traffic without the attacker having to do much work.
恶意传输:利用小包的队列可以放大洪水攻击的威力。通过发送大量的小数据包,攻击者可以让队列丢弃更多的大数据包流量,从而允许更多的攻击流量通过,从而造成进一步的破坏。这样的队列允许攻击流量对常规流量产生过大的影响,而攻击者无需做很多工作。
Non-malicious transports: Even if an application designer is not actually malicious, if over time it is noticed that small packets tend to go faster, designers will act in their own interest and use smaller packets. Queues that give advantage to small packets create an evolutionary pressure for applications or transports to send at the same bit rate but break their data stream down into tiny segments to reduce their drop rate. Encouraging a high volume of tiny packets might in turn unnecessarily overload a completely unrelated part of the system, perhaps more limited by header processing than bandwidth.
非恶意传输:即使应用程序设计人员实际上不是恶意的,但随着时间的推移,如果发现小数据包的传输速度会更快,设计人员也会根据自己的利益使用较小的数据包。为小包提供优势的队列为应用程序或传输程序创造了进化压力,使其以相同的比特率发送数据,但将其数据流分解为小段以降低其丢弃率。鼓励大量的小数据包反过来可能会不必要地使系统中完全不相关的部分过载,这可能比带宽更受报头处理的限制。
Imagine that two unresponsive flows arrive at a bit-congestible transmission link each with the same bit rate, say 1 Mbps, but one consists of 1,500 B packets and the other of 60 B packets, which are 25x smaller. Consider a scenario where gentle RED [gentle_RED] is used, along with the variant of RED we advise against, i.e., where the RED algorithm is configured to adjust the drop probability of packets in proportion to each packet's size (byte-mode packet drop). In this case, RED aims to drop 25x more of the larger packets than the smaller ones. Thus, for example, if RED drops 25% of the larger packets, it will aim to drop 1% of the smaller packets (but, in practice, it may drop more as congestion increases; see Appendix B.4 of [RFC4828]). Even though both flows arrive with the same bit rate, the bit rate the RED queue aims to pass to the line will be 750 kbps for the flow of larger packets but 990 kbps for the smaller packets (because of rate variations, it will actually be a little less than this target).
假设两个无响应流以相同的比特率(比如1 Mbps)到达一个比特拥塞的传输链路,但其中一个由1500 B的数据包组成,另一个由60 B的数据包组成,后者小25倍。考虑这样一个场景:使用gentle RED [gentle_RED],同时使用我们建议不要使用的RED变体,即RED算法配置为根据每个数据包的大小按比例调整数据包的丢弃概率(字节模式数据包丢弃)。在这种情况下,RED的目标是丢弃的较大数据包比较小数据包多25倍。因此,例如,如果RED丢弃25%的较大数据包,则其目标是丢弃1%的较小数据包(但在实践中,它可能会随着拥塞的增加而丢弃更多数据包;参见[RFC4828]的附录B.4)。即使两个流以相同的比特率到达,RED队列要传递到线路的比特率对于较大数据包的流将为750 kbps,而对于较小数据包的流则为990 kbps(由于速率变化,实际值将略低于此目标)。
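The arithmetic in this example can be checked with a short sketch. The function name is hypothetical, and the calculation assumes unresponsive flows and a drop probability strictly proportional to packet size, as in the scenario described.

```python
def passed_rate_kbps(arrival_kbps: float, drop_prob: float) -> float:
    """Bit rate the queue aims to pass on for an unresponsive flow."""
    return arrival_kbps * (1.0 - drop_prob)


LARGE_B, SMALL_B = 1500, 60   # packet sizes from the example
ARRIVAL_KBPS = 1000.0         # both flows arrive at 1 Mbps

p_large = 0.25                          # RED drops 25% of the larger packets
p_small = p_large * SMALL_B / LARGE_B   # byte-mode drop scales with size

rate_large = passed_rate_kbps(ARRIVAL_KBPS, p_large)
rate_small = passed_rate_kbps(ARRIVAL_KBPS, p_small)

print(f"size ratio: {LARGE_B // SMALL_B}x")
print(f"drop probability, small packets: {p_small:.2%}")
print(f"large-packet flow: {rate_large:.0f} kbps")
print(f"small-packet flow: {rate_small:.0f} kbps")
```

Both flows offer the same load in bits, yet the small-packet flow keeps 240 kbps more of it, which is the bias the memo advises against.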
Note that, although the byte-mode drop variant of RED amplifies small-packet attacks, tail-drop queues amplify small-packet attacks even more (see Security Considerations in Section 6). Wherever possible, neither should be used.
请注意,尽管RED的字节模式丢弃变体会放大小数据包攻击,但尾部丢弃队列会放大更多的小数据包攻击(请参阅第6节中的安全注意事项)。在可能的情况下,两者都不应使用。
Dropping fewer control packets considerably improves performance. It is tempting to drop small packets with lower probability in order to improve performance, because many control packets tend to be smaller (TCP SYNs and ACKs, DNS queries and responses, SIP messages, HTTP GETs, etc.). However, we must not give control packets preference purely by virtue of their smallness, otherwise it is too easy for any data source to get the same preferential treatment simply by sending data in smaller packets. Again, we should not create perverse incentives by favouring small packets when what we actually intend is to favour control packets.
丢弃更少的控制数据包可以显著提高性能。为了提高性能,很容易丢弃概率较低的小数据包,因为许多控制数据包往往较小(TCP SYN和ACK、DNS查询和响应、SIP消息、HTTP GET等)。然而,我们不能纯粹因为控制数据包的小而给予它们优先权,否则任何数据源都很容易通过在较小的数据包中发送数据来获得相同的优先权。再次强调,我们不应该创造偏袒小数据包的不正当激励,而不是偏袒控制数据包,这正是我们的意图。
Just because many control packets are small does not mean all small packets are control packets.
仅仅因为许多控制数据包都很小,并不意味着所有的小数据包都是控制数据包。
So, rather than fix these problems in the network, we argue that the transport should be made more robust against losses of control packets (see Section 4.2.3).
因此,我们认为,与其解决网络中的这些问题,不如让传输更健壮,以防控制数据包丢失(见第4.2.3节)。
TCP congestion control ensures that flows competing for the same resource each maintain the same number of segments in flight, irrespective of segment size. So under similar conditions, flows with different segment sizes will get different bit rates.
TCP拥塞控制确保竞争相同资源的流在飞行中保持相同的段数,而不考虑段大小。因此,在相似的条件下,具有不同段大小的流将获得不同的比特率。
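To make the effect concrete, the following sketch computes the bit rate of two flows that hold the same number of segments in flight. The window of 10 segments and the 100 ms round-trip time are hypothetical values chosen for illustration, and the calculation ignores headers and queuing delay.

```python
def flow_rate_kbps(segments_in_flight: int, segment_bytes: int, rtt_ms: int) -> float:
    """Approximate rate of a window-limited flow: one window per round trip."""
    bits_per_round_trip = segments_in_flight * segment_bytes * 8
    return bits_per_round_trip / rtt_ms  # bits per millisecond equals kbps


WINDOW = 10    # segments in flight (assumed equal for both flows)
RTT_MS = 100   # round-trip time in milliseconds (assumed)

rate_large = flow_rate_kbps(WINDOW, 1500, RTT_MS)
rate_small = flow_rate_kbps(WINDOW, 60, RTT_MS)

print(f"1500 B segments: {rate_large:.0f} kbps")
print(f"  60 B segments: {rate_small:.0f} kbps")
print(f"ratio: {rate_large / rate_small:.0f}x")
```

With equal windows, the bit rates differ by the same factor as the segment sizes.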
To counter this effect, it seems tempting not to follow our recommendation, and instead for the network to bias congestion notification by packet size in order to equalise the bit rates of flows with different packet sizes. However, in order to do this, the queuing algorithm has to make assumptions about the transport, which become embedded in the network. Specifically:
为了抵消这种影响,人们似乎很容易想到不遵循我们的建议,而是让网络根据数据包大小来偏置拥塞通知,以便均衡具有不同数据包大小的流的比特率。然而,为了做到这一点,排队算法必须对传输做出假设,而这些假设会被嵌入到网络中。具体而言:
o The queuing algorithm has to assume how aggressively the transport will respond to congestion (see Section 4.2.4). If the network assumes the transport responds as aggressively as TCP NewReno, it will be wrong for Compound TCP and differently wrong for Cubic TCP, etc. To achieve equal bit rates, each transport then has to guess what assumption the network made, and work out how to replace this assumed aggressiveness with its own aggressiveness.
o 排队算法必须假设交通对拥挤的响应程度(见第4.2.4节)。如果网络假设传输响应像TCP NewReno一样具有攻击性,那么复合TCP和立方TCP等都是错误的。为了实现相同的比特率,每个传输都必须猜测网络做出了什么样的假设,并找出如何用自己的攻击性取代这种假设的攻击性。
o Also, if the network biases congestion notification by packet size, it has to assume a baseline packet size -- all proposed algorithms use the local MTU (for example, see the byte-mode loss probability formula in Table 1). Then if the non-Reno transports mentioned above are trying to reverse engineer what the network assumed, they also have to guess the MTU of the congested link.
o 此外,如果网络根据数据包大小对拥塞通知进行偏移,则必须假设一个基线数据包大小——所有提出的算法都使用本地MTU(例如,请参见表1中的字节模式丢失概率公式)。然后,如果上面提到的非雷诺传输尝试对网络假设进行反向工程,它们还必须猜测拥塞链路的MTU。
Even though reducing the drop probability of small packets (e.g., RED's byte-mode drop) helps ensure TCP flows with different packet sizes will achieve similar bit rates, we argue that this correction should be made to any future transport protocols based on TCP, not to the network in order to fix one transport, no matter how predominant it is. Effectively, favouring small packets is reverse engineering of network equipment around one particular transport protocol (TCP), contrary to the excellent advice in [RFC3426], which asks designers to question "Why are you proposing a solution at this layer of the protocol stack, rather than at another layer?"
尽管降低小数据包的丢弃概率(例如,RED的字节模式丢弃)有助于确保具有不同数据包大小的TCP流将实现相似的比特率,但我们认为,为了修复一个传输,无论它有多重要,都应该对基于TCP的任何未来传输协议进行此项修正,而不是对网络进行修正。实际上,支持小数据包是围绕一个特定的传输协议(TCP)对网络设备进行反向工程,这与[RFC3426]中的优秀建议相反,后者要求设计者提出问题“为什么要在协议栈的这一层而不是另一层提出解决方案?”
In contrast, if the network never takes packet size into account, the transport can be certain it will never need to guess any assumptions that the network has made. And the network passes two pieces of information to the transport that are sufficient in all cases: i) congestion notification on the packet and ii) the size of the packet. Both are available for the transport to combine (by taking packet size into account when responding to congestion) or not. Appendix B checks that these two pieces of information are sufficient for all relevant scenarios.
相反,如果网络从不考虑数据包大小,则传输可以确定它将永远不需要猜测网络所做的任何假设。网络向传输传递两条信息,这两条信息在所有情况下都是足够的:i)数据包的拥塞通知和ii)数据包的大小。两者都可用于传输组合(通过在响应拥塞时考虑数据包大小)或不组合。附录B检查这两条信息是否足以满足所有相关场景。
When the network does not take packet size into account, it allows transport protocols to choose whether or not to take packet size into account. However, if the network were to bias congestion notification by packet size, transport protocols would have no choice; those that did not take into account packet size themselves would unwittingly become dependent on packet size, and those that already took packet size into account would end up taking it into account twice.
当网络不考虑数据包大小时,它允许传输协议选择是否考虑数据包大小。然而,如果网络根据数据包大小来偏向拥塞通知,则传输协议将别无选择;那些不考虑数据包大小的人会不知不觉地依赖于数据包大小,而那些已经考虑数据包大小的人最终会考虑两次。
In overview, the argument in this section runs as follows:
总而言之,本节中的参数如下所示:
o Because the network does not and cannot always drop packets in proportion to their size, it shouldn't be given the task of making drop signals depend on packet size at all.
o 因为网络不会也不可能总是按照数据包的大小来丢弃数据包,所以不应该让丢弃信号完全依赖于数据包的大小。
o Transports on the other hand don't always want to make their rate response proportional to the size of dropped packets, but if they want to, they always can.
o 另一方面,传输并不总是希望使它们的速率响应与丢弃的数据包的大小成比例,但如果它们愿意,它们总是可以的。
The argument is similar to the end-to-end argument that says "Don't do X in the network if end systems can do X by themselves, and they want to be able to choose whether to do X anyway". Actually the following argument is stronger; in addition it says "Don't give the network task X that could be done by the end systems, if X is not deployed on all network nodes, and end systems won't be able to tell whether their network is doing X, or whether they need to do X themselves." In this case, the X in question is "making the response to congestion depend on packet size".
该参数类似于端到端参数,该参数表示“如果终端系统可以自己执行X,则不要在网络中执行X,并且他们希望能够选择是否执行X”。实际上,下面的论点更为有力;此外,它还表示“如果X未部署在所有网络节点上,并且终端系统无法判断其网络是否正在执行X,或者是否需要自己执行X,则不要给终端系统可以执行的网络任务X。”在这种情况下,所讨论的X是“使拥塞响应取决于数据包大小”。
We will now re-run this argument reviewing each step in more depth. The argument applies solely to drop, not to ECN marking.
现在我们将重新运行此论证,更深入地回顾每一步。该论点仅适用于drop,而不适用于ECN标记。
A queue drops packets for either of two reasons: a) to signal to host congestion controls that they should reduce the load and b) because there is no buffer left to store the packets. Active queue management tries to use drops as a signal for hosts to slow down (case a) so that drops due to buffer exhaustion (case b) should not be necessary.
队列丢弃数据包的原因有两种:A)向主机拥塞控制发出信号,表明它们应该减少负载;b)因为没有剩余的缓冲区来存储数据包。主动队列管理尝试使用丢弃作为主机减速的信号(情况a),因此不需要由于缓冲区耗尽而丢弃(情况b)。
AQM is not universally deployed in every queue in the Internet; many cheap Ethernet bridges, software firewalls, NATs on consumer devices, etc., implement simple tail-drop buffers. Even if AQM were universal, it has to be able to cope with buffer exhaustion (by switching to a behaviour like tail drop), in order to cope with unresponsive or excessive transports. For these reasons networks will sometimes be dropping packets as a last resort (case b) rather than under AQM control (case a).
AQM并不是普遍部署在互联网上的每个队列中;许多廉价的以太网网桥、软件防火墙、消费设备上的NAT等都实现了简单的尾部丢弃缓冲区。即使AQM是通用的,它也必须能够应对缓冲区耗尽(通过切换到像甩尾这样的行为),以应对无反应或过度传输。由于这些原因,网络有时将丢弃数据包作为最后手段(情况b),而不是在AQM控制下(情况a)。
When buffers are exhausted (case b), they don't naturally drop packets in proportion to their size. The network can only reduce the probability of dropping smaller packets if it has enough space to store them somewhere while it waits for a larger packet that it can drop. If the buffer is exhausted, it does not have this choice. Admittedly tail drop does naturally drop somewhat fewer small packets, but exactly how few depends more on the mix of sizes than the size of the packet in question. Nonetheless, in general, if we wanted networks to do size-dependent drop, we would need universal deployment of (packet-size dependent) AQM code, which is currently unrealistic.
当缓冲区耗尽时(情况b),它们不会自然地按大小比例丢弃数据包。网络只有在等待可以丢弃的较大数据包时,有足够的空间将较小数据包存储在某个地方,才能降低丢弃较小数据包的概率。如果缓冲区耗尽,则没有此选项。诚然,尾部丢弃确实会自然地丢弃更少的小数据包,但具体数量多少更多地取决于大小的混合,而不是所讨论数据包的大小。尽管如此,总的来说,如果我们希望网络进行大小相关的丢弃,我们需要(数据包大小相关的)AQM代码的通用部署,这在目前是不现实的。
A host transport cannot know whether any particular drop was a deliberate signal from an AQM or a sign of a queue shedding packets due to buffer exhaustion. Therefore, because the network cannot universally do size-dependent drop, it should not do it at all.
主机传输无法知道任何特定的丢弃是来自AQM的故意信号还是由于缓冲区耗尽而导致队列丢弃数据包的信号。因此,因为网络不能普遍地进行大小相关的丢弃,所以它根本不应该这样做。
Whereas universality is desirable in the network, diversity is desirable between different transport-layer protocols -- some, like standards track TCP congestion control [RFC5681], may not choose to make their rate response proportionate to the size of each dropped packet, while others will (e.g., TCP-Friendly Rate Control for Small Packets (TFRC-SP) [RFC4828]).
虽然网络中的普遍性是可取的,但不同传输层协议之间的多样性是可取的——一些协议,如标准跟踪TCP拥塞控制[RFC5681],可能不会选择使其速率响应与每个丢弃数据包的大小成比例,而其他协议则会(例如,对小数据包的TCP友好速率控制(TFRC-SP)[RFC4828])。
Biasing against large packets typically requires an extra multiply and divide in the network (see the example byte-mode drop formula in Table 1). Taking packet size into account at the transport rather than in the network ensures that neither the network nor the transport needs to do a multiply operation -- multiplication by packet size is effectively achieved as a repeated add when the transport adds to its count of marked bytes as each congestion event is fed to it. Also, the work to do the biasing is spread over many hosts, rather than concentrated in just the congested network element. These aren't principled reasons in themselves, but they are a happy consequence of the other principled reasons.
针对大数据包的偏置通常需要网络中额外的乘法和除法(参见表1中的示例字节模式丢弃公式)。在传输中而不是在网络中考虑数据包大小可以确保网络和传输都不需要进行乘法运算——当传输将每个拥塞事件反馈给它时,将数据包大小的乘法作为重复加法有效地实现。此外,进行偏置的工作分布在许多主机上,而不仅仅集中在拥挤的网元上。这些本身并不是原则性的理由,但它们是其他原则性理由的一个令人高兴的结果。
This section is informative, not normative.
本节为参考性内容,不具规范性。
The original 1993 paper on RED [RED93] proposed two options for the RED active queue management algorithm: packet mode and byte mode. Packet mode measured the queue length in packets and dropped (or marked) individual packets with a probability independent of their size. Byte mode measured the queue length in bytes and marked an individual packet with probability in proportion to its size (relative to the maximum packet size). In the paper's outline of further work, it was stated that no recommendation had been made on whether the queue size should be measured in bytes or packets, but noted that the difference could be significant.
1993年关于RED的原始论文[RED93]提出了两种RED主动队列管理算法选项:数据包模式和字节模式。数据包模式测量数据包和丢弃(或标记)的单个数据包的队列长度,其概率与数据包的大小无关。字节模式测量队列长度(以字节为单位),并以其大小(相对于最大数据包大小)成比例的概率标记单个数据包。在该文件的进一步工作大纲中,有人指出,没有就队列大小是否应以字节或数据包为单位提出建议,但指出差异可能很大。
When RED was recommended for general deployment in 1998 [RFC2309], the two modes were mentioned implying the choice between them was a question of performance, referring to a 1997 email [pktByteEmail] for advice on tuning. A later addendum to this email introduced the insight that there are in fact two orthogonal choices:
当RED在1998年被推荐用于一般部署[RFC2309]时,提到了这两种模式,这意味着它们之间的选择是一个性能问题,参考了1997年的电子邮件[PKTBytemail],以获得关于调优的建议。这封邮件后面的附录介绍了事实上有两种正交选择:
o whether to measure queue length in bytes or packets (Section 4.1), and
o 是否以字节或数据包为单位测量队列长度(第4.1节),以及
o whether the drop probability of an individual packet should depend on its own size (Section 4.2).
o 单个数据包的丢弃概率是否应取决于其自身的大小(第4.2节)。
The rest of this section is structured accordingly.
本节其余部分的结构也相应调整。
The choice of which metric to use to measure queue length was left open in RFC 2309. It is now well understood that queues for bit-congestible resources should be measured in bytes, and queues for packet-congestible resources should be measured in packets [pktByteEmail].
在RFC2309中,用于测量队列长度的度量的选择是开放的。现在已经很好地理解,比特拥塞资源的队列应该以字节为单位进行测量,而分组拥塞资源的队列应该以分组[pktbytemail]为单位进行测量。
Congestion in some legacy bit-congestible buffers is only measured in packets not bytes. In such cases, the operator has to take into account a typical mix of packet sizes when setting the thresholds. Any AQM algorithm on such a buffer will be oversensitive to high proportions of small packets, e.g., a DoS attack, and under-sensitive to high proportions of large packets. However, there is no need to make allowances for the possibility of such a legacy in future protocol design. This is safe because any under-sensitivity during unusual traffic mixes cannot lead to congestion collapse given that the buffer will eventually revert to tail drop, which discards proportionately more large packets.
某些传统位拥塞缓冲区中的拥塞仅以数据包而不是字节来度量。在这种情况下,运营商在设置阈值时必须考虑数据包大小的典型混合。这种缓冲区上的任何AQM算法都会对高比例的小数据包(例如DoS攻击)过于敏感,而对高比例的大数据包则不太敏感。然而,在未来的协议设计中,没有必要考虑这种遗留问题的可能性。这是安全的,因为在异常流量混合期间,任何灵敏度不足都不会导致拥塞崩溃,因为缓冲区最终将恢复为尾部丢弃,从而按比例丢弃更多的大数据包。
The question of whether to measure queues in bytes or packets seems to be well understood. However, measuring congestion is confusing when the resource is bit-congestible but the queue into the resource is packet-congestible. This section outlines the approach to take.
是否以字节或数据包为单位度量队列的问题似乎已经得到了很好的理解。然而,当资源是比特拥塞的,但进入资源的队列是分组拥塞的时,测量拥塞是令人困惑的。本节概述了要采取的方法。
Some, mostly older, queuing hardware allocates fixed-size buffers in which to store each packet in the queue. This hardware forwards packets to the line in one of two ways:
一些(大部分是较旧的)队列硬件分配固定大小的缓冲区,用于存储队列中的每个数据包。此硬件通过以下两种方式之一将数据包转发到线路:
o With some hardware, any fixed-size buffers not completely filled by a packet are padded when transmitted to the wire. This case should clearly be treated as packet-congestible, because both queuing and transmission are in fixed MTU-size units. Therefore, the queue length in packets is a good model of congestion of the link.
o 对于某些硬件,任何未被数据包完全填满的固定大小缓冲区在传输到线路时都会被填充。这种情况显然应视为数据包拥塞,因为排队和传输都采用固定的MTU大小单位。因此,以数据包为单位的队列长度是链路拥塞的良好模型。
o More commonly, hardware with fixed-size packet buffers transmits packets to the line without padding. This implies a hybrid forwarding system with transmission congestion dependent on the size of packets but queue congestion dependent on the number of packets, irrespective of their size.
o 更常见的情况是,具有固定大小的数据包缓冲区的硬件在没有填充的情况下将数据包传输到线路。这意味着混合转发系统的传输拥塞取决于数据包的大小,但队列拥塞取决于数据包的数量,而与数据包的大小无关。
Nonetheless, there would be no queue at all unless the line had become congested -- the root cause of any congestion is too many bytes arriving for the line. Therefore, the AQM should measure the queue length as the sum of all the packet sizes in bytes that are queued up waiting to be serviced by the line, irrespective of whether each packet is held in a fixed-size buffer.
尽管如此,除非线路变得拥挤,否则根本不会有队列——任何拥挤的根本原因是线路到达的字节太多。因此,AQM应将队列长度测量为排队等待线路服务的所有数据包大小(以字节为单位)的总和,而不管每个数据包是否保存在固定大小的缓冲区中。
In the (unlikely) first case where use of padding means the queue should be measured in packets, further confusion is likely because the fixed buffers are rarely all one size. Typically, pools of different-sized buffers are provided (Cisco uses the term 'buffer carving' for the process of dividing up memory into these pools [IOSArch]). Usually, if the pool of small buffers is exhausted, arriving small packets can borrow space in the pool of large buffers, but not vice versa. However, there is no need to consider all this complexity, because the root cause of any congestion is still line overload -- buffer consumption is only the symptom. Therefore, the length of the queue should be measured as the sum of the bytes in the queue that will be transmitted to the line, including any padding. In the (unusual) case of transmission with padding, this means the sum of the sizes of the small buffers queued plus the sum of the sizes of the large buffers queued.
在(不太可能的)第一种情况下,使用填充意味着队列应该在数据包中进行度量,这可能会导致进一步的混淆,因为固定缓冲区很少都是一个大小。通常,会提供不同大小的缓冲池(Cisco使用术语“缓冲区分割”将内存划分为这些池[IOSearch])。通常,如果小缓冲池耗尽,到达的小数据包可以借用大缓冲池中的空间,但反之亦然。然而,没有必要考虑所有这些复杂性,因为任何拥塞的根本原因仍然是线路过载——缓冲器消耗只是症状。因此,队列长度应测量为队列中将传输到线路的字节的总和,包括任何填充。在(不常见的)带填充的传输情况下,这意味着排队的小缓冲区大小之和加上排队的大缓冲区大小之和。
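The measurement rule for fixed-size buffers can be sketched as follows. The function name, the tuple layout, and the buffer sizes of 256 and 2,048 octets are hypothetical; the sketch only illustrates counting the bytes that will actually be transmitted to the line.

```python
from typing import List, Tuple

# Each queued entry is (packet_octets, buffer_octets): the size of the packet
# and the size of the fixed buffer that holds it (an assumed representation).
QueuedEntry = Tuple[int, int]


def queue_length_bytes(queue: List[QueuedEntry], padded_on_wire: bool) -> int:
    """Bytes waiting to be serviced by the line.

    Without padding, only the packet bytes reach the line. With padding,
    the whole fixed-size buffer is transmitted, so buffer sizes are summed.
    """
    if padded_on_wire:
        return sum(buffer_octets for _packet, buffer_octets in queue)
    return sum(packet_octets for packet_octets, _buffer in queue)


queue = [(60, 256), (1500, 2048), (60, 256)]

print(queue_length_bytes(queue, padded_on_wire=False))
print(queue_length_bytes(queue, padded_on_wire=True))
```

In both cases the result is a count of bytes destined for the line, not a count of buffers consumed.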
We will return to borrowing of fixed-size buffers when we discuss biasing the drop/marking probability of a specific packet because of its size in Section 4.2.1. But here, we can repeat the simple rule for how to measure the length of queues of fixed buffers: no matter how complicated the buffering scheme is, ultimately a transmission line is nearly always bit-congestible so the number of bytes queued up waiting for the line measures how congested the line is, and it is rarely important to measure how congested the buffering system is.
当我们在第4.2.1节中讨论由于特定数据包的大小而使其丢弃/标记概率产生偏差时,我们将回到借用固定大小的缓冲区。但在这里,我们可以重复如何测量固定缓冲区队列长度的简单规则:无论缓冲方案多么复杂,最终传输线几乎总是有点拥挤,因此排队等待线路的字节数可以测量线路的拥挤程度,而且测量缓冲系统的拥挤程度也很少重要。
AQM algorithms are nearly always described assuming there is a queue for a congested resource and the algorithm can use the queue length to determine the probability that it will drop or mark each packet. But not all congested resources lead to queues. For instance, power-limited resources are usually bit-congestible if energy is primarily required for transmission rather than header processing, but it is rare for a link protocol to build a queue as it approaches maximum power.
AQM算法几乎总是在假设存在拥塞资源的队列的情况下描述的,并且该算法可以使用队列长度来确定丢弃或标记每个数据包的概率。但并非所有拥塞的资源都会导致排队。例如,如果能量主要用于传输而不是报头处理,功率受限的资源通常是比特拥塞的,但链路协议在接近最大功率时很少会建立队列。
Nonetheless, AQM algorithms do not require a queue in order to work. For instance, spectrum congestion can be modelled by signal quality using the target bit-energy-to-noise-density ratio. And, to model radio power exhaustion, transmission-power levels can be measured and compared to the maximum power available. [ECNFixedWireless] proposes a practical and theoretically sound way to combine congestion notification for different bit-congestible resources at different layers along an end-to-end path, whether wireless or wired, and whether with or without queues.
尽管如此,AQM算法并不需要队列才能工作。例如,频谱拥塞可以通过使用目标比特能量噪声密度比的信号质量来建模。而且,为了模拟无线电功率消耗,可以测量传输功率水平,并与最大可用功率进行比较。[ECNFixedWireless]提出了一种实用且理论上合理的方法,用于在端到端路径(无论是无线路径还是有线路径,无论是否有队列)的不同层上组合不同比特拥塞资源的拥塞通知。
In wireless protocols that use request to send / clear to send (RTS / CTS) control, such as some variants of IEEE802.11, it is reasonable to base an AQM on the time spent waiting for transmission opportunities (TXOPs) even though the wireless spectrum is usually regarded as congested by bits (for a given coding scheme). This is because requests for TXOPs queue up as the spectrum gets congested by all the bits being transferred. So the time that TXOPs are queued directly reflects bit congestion of the spectrum.
在使用请求发送/清除发送(RTS/CTS)控制的无线协议中,例如IEEE802.11的一些变体,将AQM基于等待传输机会(TXOP)所花费的时间是合理的,即使无线频谱通常被视为比特拥塞(对于给定的编码方案)。这是因为,由于传输的所有比特导致频谱拥塞,对TXOPs的请求会排队。所以TXOP排队的时间直接反映了频谱的比特拥塞。
The previously mentioned email [pktByteEmail] referred to by [RFC2309] advised that most scarce resources in the Internet were bit-congestible, which is still believed to be true (Section 1.1). But it went on to offer advice that is updated by this memo. It said that drop probability should depend on the size of the packet being considered for drop if the resource is bit-congestible, but not if it is packet-congestible. The argument continued that if packet drops were inflated by packet size (byte-mode dropping), "a flow's fraction of the packet drops is then a good indication of that flow's fraction of the link bandwidth in bits per second". This was consistent with a referenced policing mechanism being worked on at the time for detecting unusually high bandwidth flows, eventually published in 1999 [pBox]. However, the problem could and should have been solved by making the policing mechanism count the volume of bytes randomly dropped, not the number of packets.
[RFC2309]提到的前面提到的电子邮件[PKTByteMail]指出,互联网上的大多数稀缺资源都有点拥挤,这一点仍然被认为是正确的(第1.1节)。但它继续提供建议,并根据这份备忘录进行了更新。它说,如果资源是位拥塞的,则丢弃概率应取决于考虑丢弃的数据包的大小,而不是数据包是否拥塞。该论点继续说,如果数据包丢弃是由数据包大小(字节模式丢弃)膨胀的,“那么数据流的数据包丢弃分数就很好地表明了该数据流的链路带宽分数,单位为比特/秒”。这与1999年[pBox]最终发布的用于检测异常高带宽流量的参考监管机制是一致的。然而,这个问题本来可以也应该通过让监控机制统计随机丢弃的字节数,而不是数据包数来解决。
A few months before RFC 2309 was published, an addendum was added to the above archived email referenced from the RFC, in which the final paragraph seemed to partially retract what had previously been said. It clarified that the question of whether the probability of dropping/marking a packet should depend on its size was not related to whether the resource itself was bit-congestible, but a completely orthogonal question. However, the only example given had the queue measured in packets but packet drop depended on the size of the packet in question. No example was given the other way round.
在RFC 2309发布的几个月前,在RFC引用的上述存档电子邮件中添加了一个附录,其中最后一段似乎部分收回了先前所说的内容。它澄清了丢弃/标记数据包的概率是否应取决于其大小的问题与资源本身是否存在比特拥塞无关,而是一个完全正交的问题。然而,给出的唯一示例是以数据包为单位测量队列,但数据包丢弃取决于相关数据包的大小。没有相反的例子。
In 2000, Cnodder et al. [REDbyte] pointed out that there was an error in the part of the original 1993 RED algorithm that aimed to distribute drops uniformly, because it didn't correctly take into account the adjustment for packet size. They recommended an algorithm called RED_4 to fix this. But they also recommended a further change, RED_5, to adjust the drop rate dependent on the square of the relative packet size. This was indeed consistent with one implied motivation behind RED's byte-mode drop -- that we should reverse engineer the network to improve the performance of dominant end-to-end congestion control mechanisms. This memo makes different recommendations in Section 2.
2000年,Cnodder等人[REDbyte]指出,最初的1993年RED算法中有一个错误,该算法的目标是均匀分布数据滴,因为它没有正确地考虑数据包大小的调整。他们推荐了一种叫做RED_4的算法来解决这个问题。但他们也建议进一步修改RED_5,根据相对数据包大小的平方调整丢弃率。这确实与RED的字节模式下降背后的一个隐含动机一致——我们应该对网络进行反向工程,以改善占主导地位的端到端拥塞控制机制的性能。本备忘录在第2节中提出了不同的建议。
By 2003, a further change had been made to the adjustment for packet size, this time in the RED algorithm of the ns2 simulator. Instead of taking each packet's size relative to a 'maximum packet size', it was taken relative to a 'mean packet size', intended to be a static value representative of the 'typical' packet size on the link. We have not been able to find a justification in the literature for this change; however, Eddy and Allman conducted experiments [REDbias] that assessed how sensitive RED was to this parameter, amongst other things. This changed algorithm can often lead to drop probabilities of greater than 1 (which gives a hint that there is probably a mistake in the theory somewhere).
到2003年,对数据包大小的调整做了进一步的更改,这次是在ns2模拟器的RED算法中。它不是相对于“最大数据包大小”来获取每个数据包的大小,而是相对于“平均数据包大小”来获取,旨在作为代表链路上“典型”数据包大小的静态值。我们无法在文献中找到这种变化的理由;然而,Eddy和Allman进行了实验[REDbias],评估了红色对该参数的敏感程度,以及其他因素。这种改变的算法通常会导致丢弃概率大于1(这暗示理论中可能存在错误)。
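A simplified illustration shows why scaling by a mean packet size can yield values above 1. The formula below is an assumption based on the description in this section (a base probability scaled by packet size relative to a reference size); it omits the rest of the RED algorithm, and the numeric values are hypothetical.

```python
def byte_mode_drop_prob(base_prob: float, packet_bytes: int, reference_bytes: int) -> float:
    """Base drop probability scaled by packet size relative to a reference size.

    Simplified sketch: no clamping and no uniform spacing of drops.
    """
    return base_prob * packet_bytes / reference_bytes


BASE_PROB = 0.5       # hypothetical probability from the averaged queue length
PACKET_BYTES = 1500   # a full-sized packet
MAX_PACKET = 1500     # reference used by the original byte mode
MEAN_PACKET = 500     # hypothetical 'typical' size used as the reference instead

p_relative_to_max = byte_mode_drop_prob(BASE_PROB, PACKET_BYTES, MAX_PACKET)
p_relative_to_mean = byte_mode_drop_prob(BASE_PROB, PACKET_BYTES, MEAN_PACKET)

print(p_relative_to_max)
print(p_relative_to_mean)
```

Relative to the maximum packet size the scaling factor never exceeds 1, whereas relative to a mean size any larger-than-mean packet inflates the probability, here beyond 1.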
On 10-Nov-2004, this variant of byte-mode packet drop was made the default in the ns2 simulator. It seems unlikely that byte-mode drop has ever been implemented in production networks (Appendix A); therefore, any conclusions based on ns2 simulations that use RED without disabling byte-mode drop are likely to differ markedly from the behaviour of RED in production networks.
2004年11月10日,在ns2模拟器中,字节模式数据包丢弃的这种变体成为默认值。在生产网络中似乎不太可能实现字节模式删除(附录A);因此,基于ns2模拟的任何结论,如果使用RED而不禁用字节模式丢弃,则在生产网络中的行为可能与RED非常不同。
The byte-mode drop variant of RED (or a similar variant of other AQM algorithms) is not the only possible bias towards small packets in queuing systems. We have already mentioned that tail-drop queues naturally tend to lock out large packets once they are full.
RED的字节模式丢弃变量(或其他AQM算法的类似变量)不是排队系统中对小包的唯一可能偏差。我们已经提到,尾部丢弃队列自然倾向于在大数据包满后锁定它们。
But also, queues with fixed-size buffers reduce the probability that small packets will be dropped if (and only if) they allow small packets to borrow buffers from the pools for larger packets (see Section 4.1.1). Borrowing effectively makes the maximum queue size for small packets greater than that for large packets, because more buffers can be used by small packets while fewer will fit large packets. Incidentally, the bias towards small packets from buffer borrowing is nothing like as large as that of RED's byte-mode drop.
但是,如果(且仅当)具有固定大小缓冲区的队列允许小数据包从较大数据包的缓冲池中借用缓冲区,则这类队列也会降低小数据包被丢弃的概率(参见第4.1.1节)。借用实际上使小数据包的最大队列大小大于大数据包的最大队列大小,因为小数据包可以使用更多的缓冲区,而能容纳大数据包的缓冲区更少。顺便说一句,缓冲区借用产生的对小数据包的偏向远不如RED的字节模式丢弃那么大。
Nonetheless, fixed-buffer memory with tail drop is still prone to lock out large packets, purely because of the tail-drop aspect. So, fixed-size packet buffers should be augmented with a good AQM algorithm and packet-mode drop. If an AQM is too complicated to implement with multiple fixed buffer pools, the minimum necessary to prevent large-packet lockout is to ensure that smaller packets never use the last available buffer in any of the pools for larger packets.
尽管如此,带尾部丢弃的固定缓冲区内存仍然容易锁定大数据包,这纯粹是因为尾部丢弃方面的原因。因此,固定大小的数据包缓冲区应该增加一个好的AQM算法和数据包模式丢弃。如果AQM太复杂,无法使用多个固定缓冲池来实现,那么防止大数据包锁定的最低要求是确保较小的数据包不会使用任何池中最后一个可用的缓冲池来处理较大的数据包。
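The minimum safeguard described above can be sketched in a few lines of Python. This is an illustration only: the `Pool` class, the `allocate` function, and the pool sizes are hypothetical names and values chosen for the example, not taken from this memo or from any vendor implementation. It assumes the pools are sorted by ascending buffer size and lets a packet borrow from a larger pool only if at least one buffer would remain free there afterwards.

```python
from dataclasses import dataclass


@dataclass
class Pool:
    """A pool of fixed-size buffers (sizes and counts are illustrative)."""
    buf_size: int  # bytes that each buffer can hold
    free: int      # buffers currently unused


def allocate(pools, pkt_len):
    """Return the buffer size granted to a packet, or None if it is dropped.

    `pools` must be sorted by ascending buf_size. A packet first tries its
    home pool (the smallest pool whose buffers fit it). If that pool is
    exhausted it may borrow from a larger pool, but never the last free
    buffer there, so a large packet can still be admitted.
    """
    home = None
    for pool in pools:
        if pool.buf_size < pkt_len:
            continue                          # buffers too small for this packet
        if home is None:
            home = pool                       # smallest pool that fits
        reserve = 0 if pool is home else 1    # borrowers must leave one buffer
        if pool.free > reserve:
            pool.free -= 1
            return pool.buf_size
    return None                               # nothing suitable is free: drop


pools = [Pool(buf_size=128, free=1), Pool(buf_size=1600, free=2)]
results = [allocate(pools, 64) for _ in range(3)] + [allocate(pools, 1500)]
```

With one 128 B buffer and two 1,600 B buffers, the first small packet uses its home pool, the second borrows a large buffer, the third is refused, and a subsequent large packet is still admitted.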
The above proposals to alter the network equipment to bias towards smaller packets have largely carried on outside the IETF process. Within the IETF, by contrast, there are many different proposals to alter transport protocols to achieve the same goals, i.e., either to make the flow bit rate take into account packet size, or to protect control packets from loss. This memo argues that altering transport protocols is the more principled approach.
上述改变网络设备以偏向较小数据包的建议主要是在IETF过程之外进行的。然而,在IETF中,有许多不同的建议来改变传输协议以实现相同的目标,即,要么使流比特率考虑到数据包大小,要么保护控制数据包不丢失。这份备忘录认为,改变传输协议是更有原则的方法。
A recently approved experimental RFC adapts its transport-layer protocol to take into account packet sizes relative to typical TCP packet sizes. This proposes a new small-packet variant of TCP-friendly rate control (TFRC [RFC5348]), which is called TFRC-SP [RFC4828]. Essentially, it proposes a rate equation that inflates the flow rate by the ratio of a typical TCP segment size (1,500 B including TCP header) over the actual segment size [PktSizeEquCC]. (There are also other important differences of detail relative to TFRC, such as using virtual packets [CCvarPktSize] to avoid responding to multiple losses per round trip and using a minimum inter-packet interval.)
最近批准的实验性RFC调整其传输层协议,以考虑相对于典型TCP数据包大小的数据包大小。本文提出了一种新的TCP友好速率控制(TFRC[RFC5348])的小数据包变体,称为TFRC-SP[RFC4828]。本质上,它提出了一个速率方程,通过典型TCP段大小(1500 B,包括TCP头)与实际段大小[PktSizeEquCC]的比率来增加流量。(与TFRC相比,还存在其他重要的细节差异,例如使用虚拟数据包[CCvarPktSize]避免对每次往返的多个丢失做出响应,并使用最小数据包间隔。)
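The effect of the inflation factor can be illustrated with a simplified model. The Python sketch below is an assumption-laden illustration, not the TFRC-SP rate equation from RFC 4828: it uses the well-known simplified square-root law for a TCP-like flow, takes the 1,500 B typical segment size from the text above, and ignores the other TFRC-SP details (virtual packets, minimum inter-packet interval). The function names are hypothetical.

```python
import math

TYPICAL_SEGMENT_BYTES = 1500  # "typical TCP segment size" quoted in the text


def tcp_like_rate(s, p, rtt):
    """Simplified square-root law (an assumption of this sketch).

    Returns bytes per second for segment size s (bytes), drop
    probability p, and round-trip time rtt (seconds).
    """
    return math.sqrt(1.5) * s / (rtt * math.sqrt(p))


def small_packet_rate(s, p, rtt):
    """Inflate the TCP-like rate by the ratio typical/actual segment size."""
    return tcp_like_rate(s, p, rtt) * TYPICAL_SEGMENT_BYTES / s


# A 100 B flow is allowed the same byte rate as a 1,500 B TCP-like flow.
small = small_packet_rate(100, p=0.01, rtt=0.1)
large = tcp_like_rate(1500, p=0.01, rtt=0.1)
```

Under these assumptions the inflated rate no longer depends on the actual segment size, which is the packet-size independence the transport-layer approach aims for.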
Section 4.5.1 of the TFRC-SP specification discusses the implications of operating in an environment where queues have been configured to drop smaller packets with proportionately lower probability than larger ones. But it only discusses TCP operating in such an environment, only mentioning TFRC-SP briefly when discussing how to define fairness with TCP. And it only discusses the byte-mode dropping version of RED as it was before Cnodder et al. pointed out that it didn't sufficiently bias towards small packets to make TCP independent of packet size.
TFRC-SP规范的第4.5.1节讨论了在队列配置为丢弃较小数据包的环境中运行的影响,与较大数据包相比,丢弃较小数据包的概率相对较低。但它只讨论了在这种环境下运行的TCP,在讨论如何用TCP定义公平性时,只简单地提到了TFRC-SP。它只讨论了RED的字节模式丢弃版本,就像在Cnodder等人之前一样,指出它没有充分偏向于小数据包,从而使TCP独立于数据包大小。
So the TFRC-SP specification doesn't address the issue of whether the network or the transport _should_ handle fairness between different packet sizes. In Appendix B.4 of RFC 4828, it discusses the possibility of both TFRC-SP and some network buffers duplicating each other's attempts to deliberately bias towards small packets. But the discussion is not conclusive, instead reporting simulations of many of the possibilities in order to assess performance but not recommending any particular course of action.
因此,TFRC-SP规范没有解决网络或传输是否应该处理不同数据包大小之间的公平性问题。在RFC 4828的附录B.4中,讨论了TFRC-SP和一些网络缓冲区相互复制试图故意偏向小数据包的可能性。但讨论不是结论性的,而是报告许多可能性的模拟,以评估绩效,但不推荐任何特定的行动方案。
The paper originally proposing TFRC with virtual packets (VP-TFRC) [CCvarPktSize] proposed that there should perhaps be two variants to cater for the different variants of RED. However, as the TFRC-SP authors point out, there is no way for a transport to know whether some queues on its path have deployed RED with byte-mode packet drop (except if an exhaustive survey found that no one has deployed it! -- see Appendix A). Incidentally, VP-TFRC also proposed that byte-mode RED dropping should really square the packet-size compensation factor (like that of Cnodder's RED_5, but apparently unaware of it).
论文最初提出了具有虚拟数据包的TFRC(VP-TFRC)[CCvarPktSize]建议,可能应该有两种变体来满足RED的不同变体。然而,正如TFRC-SP作者所指出的,传输无法知道其路径上的某些队列是否部署了带有字节模式数据包丢弃的RED(除非彻底调查发现没有人部署它!-参见附录a)。顺便说一句,VP-TFRC还提出字节模式RED丢弃实际上应该使数据包大小补偿因子平方(就像Cnodder的RED_5一样,但显然没有意识到这一点)。
Pre-congestion notification [RFC5670] is an IETF technology to use a virtual queue for AQM marking for packets within one Diffserv class in order to give early warning prior to any real queuing. The PCN-marking algorithms have been designed not to take into account packet size when forwarding through queues. Instead, the general principle has been to take the sizes of marked packets into account when monitoring the fraction of marking at the edge of the network, as recommended here.
拥塞前通知[RFC5670]是一种IETF技术,它使用虚拟队列对一个Diffserv类中的数据包进行AQM标记,以便在任何实际队列之前发出预警。PCN标记算法的设计不考虑通过队列转发时的数据包大小。相反,一般原则是在监测网络边缘的标记分数时考虑标记数据包的大小,如本文所建议的。
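The principle of taking the sizes of marked packets into account at the edge can be sketched as a byte-weighted marking fraction. This Python fragment is a hypothetical illustration, not the metering behaviour specified in RFC 5670; the function name and the sample packets are assumptions made for the example.

```python
def marked_fraction(packets):
    """Byte-weighted fraction of marking.

    `packets` is an iterable of (size_bytes, is_marked) tuples observed
    at the network edge. Each packet contributes its size, so the result
    is marked bytes divided by total bytes.
    """
    total = marked = 0
    for size, is_marked in packets:
        total += size
        if is_marked:
            marked += size
    return marked / total if total else 0.0


sample = [(1500, True), (60, False), (60, False), (1500, False)]
by_bytes = marked_fraction(sample)                       # 1500 / 3120
by_packets = sum(m for _, m in sample) / len(sample)     # 1 / 4
```

The queue itself marked one packet in four without looking at sizes; it is only the edge monitor that weights the single marked packet by its 1,500 bytes.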
Recently, two RFCs have defined changes to TCP that make it more robust against losing small control packets [RFC5562] [RFC5690]. In both cases, they note that the case for these two TCP changes would be weaker if RED were biased against dropping small packets. We argue here that these two proposals are a safer and more principled way to achieve TCP performance improvements than reverse engineering RED to benefit TCP.
最近,两个RFC定义了对TCP的更改,使其对丢失小控制数据包更具鲁棒性[RFC5562][RFC5690]。在这两种情况下,他们都注意到,如果RED偏向于丢弃小数据包,那么这两种TCP更改的情况将更弱。我们在这里认为,这两个方案比反向工程RED更安全、更有原则地实现TCP性能改进,从而使TCP受益。
Although there are no known proposals, it would also be possible and perfectly valid to make control packets robust against drop by requesting a scheduling class with lower drop probability, which would be achieved by re-marking to a Diffserv code point [RFC2474] within the same behaviour aggregate.
尽管没有已知的方案,但通过请求具有较低丢弃概率的调度类,使控制数据包具有抗丢弃的鲁棒性也是可能且完全有效的,这将通过在相同行为聚合内重新标记到Diffserv代码点[RFC2474]来实现。
Although not brought to the IETF, a simple proposal from Wischik [DupTCP] suggests that the first three packets of every TCP flow should be routinely duplicated after a short delay. It shows that this would greatly improve the chances of short flows completing quickly, but it would hardly increase traffic levels on the Internet, because Internet bytes have always been concentrated in the large flows. It further shows that the performance of many typical applications depends on completion of long serial chains of short messages. It argues that, given most of the value people get from the Internet is concentrated within short flows, this simple expedient would greatly increase the value of the best-effort Internet at minimal cost. A similar but more extensive approach has been evaluated on Google servers [GentleAggro].
尽管没有提交给IETF,但Wischik[DupTCP]的一个简单提议建议,每个TCP流的前三个数据包应在短延迟后例行复制。这表明,这将大大提高短流快速完成的机会,但它几乎不会增加互联网上的流量水平,因为互联网字节一直集中在大流量中。它进一步表明,许多典型应用程序的性能取决于短消息长串行链的完成。它认为,鉴于人们从互联网获得的大部分价值都集中在短流量中,这一简单的权宜之计将以最低的成本大大增加尽力而为的互联网的价值。类似但更广泛的方法已经在谷歌服务器[GentleAggro]上进行了评估。
The proposals discussed in this sub-section are experimental approaches that are not yet in wide operational use, but they are existence proofs that transports can make themselves robust against loss of control packets. The examples are all TCP-based, but applications over non-TCP transports could mitigate loss of control packets by making similar use of Diffserv, data duplication, FEC, etc.
本小节中讨论的建议是尚未广泛使用的实验性方法,但它们是存在的证据,证明了传输可以使自己对控制数据包丢失具有鲁棒性。这些示例都是基于TCP的,但非TCP传输上的应用程序可以通过类似地使用Diffserv、数据复制、FEC等来减少控制数据包的丢失。
+-----------+-----------------+-----------------+-------------------+
| transport | RED_1 (packet-  | RED_4 (linear   | RED_5 (square     |
| cc        | mode drop)      | byte-mode drop) | byte-mode drop)   |
+-----------+-----------------+-----------------+-------------------+
| TCP or    | s/sqrt(p)       | sqrt(s/p)       | 1/sqrt(p)         |
| TFRC      |                 |                 |                   |
| TFRC-SP   | 1/sqrt(p)       | 1/sqrt(s*p)     | 1/(s*sqrt(p))     |
+-----------+-----------------+-----------------+-------------------+
Table 2: Dependence of flow bit rate per RTT on packet size, s, and drop probability, p, when there is network and/or transport bias towards small packets to varying degrees
表2:当网络和/或传输偏向不同程度的小包时,每个RTT的流量比特率与包大小s和丢包概率p的关系
Table 2 aims to summarise the potential effects of all the advice from different sources. Each column shows a different possible AQM behaviour in different queues in the network, using the terminology of Cnodder et al. outlined earlier (RED_1 is basic RED with packet-mode drop). Each row shows a different transport behaviour: TCP [RFC5681] and TFRC [RFC5348] on the top row with TFRC-SP [RFC4828] below. Each cell shows how the bits per round trip of a flow depends on packet size, s, and drop probability, p. In order to declutter the formulae to focus on packet-size dependence, they are all given per round trip, which removes any RTT term.
表2旨在总结来自不同来源的所有建议的潜在影响。使用前面概述的Cnodder等人的术语,每列显示了网络中不同队列中不同的可能AQM行为(RED_1是带数据包模式丢弃的基本红色)。每行显示不同的传输行为:第一行是TCP[RFC5681]和TFRC[RFC5348],下面是TFRC-SP[RFC4828]。每个单元显示流的每次往返的比特数如何取决于数据包大小s和丢弃概率p。为了将公式分离出来以关注数据包大小的依赖性,它们都是在每次往返中给出的,这样就消除了任何RTT项。
Let us assume that the goal is for the bit rate of a flow to be independent of packet size. Suppressing all inessential details, the table shows that this should be achievable either by not altering the TCP transport in a RED_5 network or by using the small-packet TFRC-SP transport (or similar) in a network without any byte-mode dropping RED (top right and bottom left). Top left is the 'do nothing' scenario, while bottom right is the 'do both' scenario in which the bit rate would become far too biased towards small packets. Of course, if any form of byte-mode dropping RED has been deployed on a subset of queues that congest, each path through the network will present a different hybrid scenario to its transport.
让我们假设目标是流的比特率独立于数据包大小。略去所有不重要的细节,该表显示,这应该可以通过以下两种方式之一实现:在RED_5网络中不改变TCP传输,或者在没有任何字节模式丢弃RED的网络中使用小数据包TFRC-SP传输(或类似传输)(右上角和左下角)。左上角是“什么都不做”的场景,而右下角是“两个都做”的场景,在这种场景中,比特率将变得过于偏向小数据包。当然,如果在发生拥塞的队列子集上部署了任何形式的字节模式丢弃RED,则通过网络的每条路径将为其传输呈现不同的混合场景。
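The packet-size dependence in each cell of Table 2 can be reproduced with simple exponent arithmetic. The Python sketch below rests on a simplifying assumption that is consistent with the table but is not stated in any of the cited specifications: TCP and TFRC send s/sqrt(p_eff) bits per round trip, TFRC-SP sends 1/sqrt(p_eff), and each AQM variant presents an effective drop probability p_eff = p * s^k, with k = 0, 1, 2 for RED_1, RED_4, and RED_5. All names are hypothetical.

```python
from fractions import Fraction

# Power of s by which each AQM variant scales drop probability (p_eff = p * s**k).
AQM_DROP_EXPONENT = {"RED_1": 0, "RED_4": 1, "RED_5": 2}

# Power of s in the transport's own response (bits per RTT ~ s**a / sqrt(p_eff)).
TRANSPORT_EXPONENT = {"TCP or TFRC": 1, "TFRC-SP": 0}


def size_exponent(transport, aqm):
    """Exponent of packet size s in the flow's bits per round trip."""
    return TRANSPORT_EXPONENT[transport] - Fraction(AQM_DROP_EXPONENT[aqm], 2)


# Cells where the bit rate does not depend on packet size (exponent zero).
independent = [
    (t, a)
    for t in TRANSPORT_EXPONENT
    for a in AQM_DROP_EXPONENT
    if size_exponent(t, a) == 0
]
```

Only the top-right and bottom-left cells have a zero exponent. The 'do both' cell has exponent -1, so halving the packet size would double the bit rate.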
Whatever the case, we can see that the linear byte-mode drop column in the middle would considerably complicate the Internet. Even if one believes the network should be doing the biasing, linear byte-mode drop is a half-way house that doesn't bias enough towards small packets. Section 2 recommends that _all_ bias in network equipment towards small packets should be turned off -- if indeed any equipment vendors have implemented it -- leaving packet-size bias solely as the preserve of the transport layer (solely the leftmost, packet-mode drop column).
不管是什么情况,我们可以看到中间的线性字节模式下降列会使互联网变得相当复杂。即使有人认为网络应该进行偏置,线性字节模式的丢包也是一个折衷方案,它不会对小数据包产生足够的偏置。第2节建议关闭网络设备对小数据包的“所有”偏差——如果确实有任何设备供应商实施了这种做法——只保留数据包大小偏差作为传输层的保留(只保留最左边的数据包模式丢弃列)。
In practice, it seems that no deliberate bias towards small packets has been implemented for production networks. Of the 19% of vendors who responded to a survey of 84 equipment vendors, none had implemented byte-mode drop in RED (see Appendix A for details).
在实践中,生产网络似乎没有刻意偏袒小数据包。在对84家设备供应商的调查做出回应的19%的供应商中,没有一家实施了红色字节模式下降(详见附录a)。
For a connectionless network with nearly all resources being bit-congestible, the recommended position is clear -- the network should not make allowance for packet sizes and the transport should. This leaves two outstanding issues:
对于几乎所有资源都有比特拥塞的无连接网络,建议的位置是明确的——网络不应该考虑数据包大小,传输应该考虑。这就留下了两个悬而未决的问题:
o The question of how to handle any legacy AQM deployments using byte-mode drop;
o 如何使用字节模式drop处理任何遗留AQM部署的问题;
o The need to start a programme to update transport congestion control protocol standards to take packet size into account.
o 需要启动一项计划,更新传输拥塞控制协议标准,以考虑数据包大小。
A survey of equipment vendors (Section 4.2.4) found no evidence that byte-mode packet drop had been implemented, so deployment will be sparse at best. A migration strategy is not really needed to remove an algorithm that may not even be deployed.
对设备供应商的调查(第4.2.4节)发现,没有证据表明已经实施了字节模式数据包丢弃,因此部署充其量只是稀疏的。删除甚至可能未部署的算法并不需要迁移策略。
A programme of experimental updates to take packet size into account in transport congestion control protocols has already started with TFRC-SP [RFC4828].
TFRC-SP[RFC4828]已经启动了一项实验性更新计划,以在传输拥塞控制协议中考虑数据包大小。
The position is much less clear-cut if the Internet becomes populated by a more even mix of both packet-congestible and bit-congestible resources (see Appendix B.2). This problem is not pressing, because most Internet resources are designed to be bit-congestible before packet processing starts to congest (see Section 1.1).
如果互联网上充斥着更均匀的分组拥塞和比特拥塞资源(见附录B.2),情况就不那么明朗了。这个问题并不紧迫,因为大多数互联网资源在数据包处理开始拥塞之前都被设计成比特拥塞(见第1.1节)。
The IRTF's Internet Congestion Control Research Group (ICCRG) has set itself the task of reaching consensus on generic forwarding mechanisms that are necessary and sufficient to support the Internet's future congestion control requirements (the first challenge in [RFC6077]). The research question of whether packet congestion might become common and what to do if it does may in the future be explored in the IRTF (the "Challenge 3: Packet Size" in [RFC6077]).
IRTF的互联网拥塞控制研究小组(ICCRG)为自己设定了一项任务,即就通用转发机制达成共识,该机制对于支持互联网未来的拥塞控制要求是必要且充分的(RFC6077中的第一个挑战)。IRTF(RFC6077中的“挑战3:数据包大小”)将探讨数据包拥塞是否会变得普遍以及如果出现这种情况该怎么办的研究问题。
Note that sometimes it seems that resources might be congested by neither bits nor packets, e.g., where the queue for access to a wireless medium is in units of transmission opportunities. However, the root cause of congestion of the underlying spectrum is overload of bits (see Section 4.1.2).
注意,有时资源似乎既不被比特也不被分组阻塞,例如,在用于访问无线介质的队列以传输机会为单位的情况下。然而,底层频谱拥塞的根本原因是位过载(见第4.1.2节)。
This memo recommends that queues do not bias drop probability due to packet size. For instance, dropping small packets less often than large ones creates a perverse incentive for transports to break down their flows into tiny segments. One of the intended benefits of implementing AQM was to remove this perverse incentive that tail-drop queues gave to small packets.
此备忘录建议队列不要因数据包大小而影响丢弃概率。例如,丢弃小数据包的频率比丢弃大数据包的频率低,这就产生了一种不正当的激励,促使传输将数据流分解为小数据段。实施AQM的好处之一是消除掉掉尾队列给小包带来的这种不正当的激励。
In practice, transports cannot all be trusted to respond to congestion. So another reason for recommending that queues not bias drop probability towards small packets is to avoid the vulnerability to small-packet DDoS attacks that would otherwise result. One of the intended benefits of implementing AQM was to remove tail drop's DoS vulnerability to small packets, so we shouldn't add it back again.
实际上,不能完全信任传输来响应拥塞。因此,建议队列不要将丢弃概率偏向小包的另一个原因是为了避免可能导致的小包DDoS攻击的漏洞。实施AQM的一个好处是消除了tail-drop对小数据包的DoS漏洞,因此我们不应该再次添加它。
If most queues implemented AQM with byte-mode drop, the resulting network would amplify the potency of a small-packet DDoS attack. At the first queue, the stream of packets would push aside a greater proportion of large packets, so more of the small packets would survive to attack the next queue. Thus a flood of small packets would continue on towards the destination, pushing regular traffic with large packets out of the way in one queue after the next, but suffering much less drop itself.
如果大多数队列使用字节模式丢弃实现AQM,则产生的网络将放大小数据包DDoS攻击的威力。在第一个队列中,数据包流会将较大比例的大数据包推到一边,因此更多的小数据包将存活下来,以攻击下一个队列。因此,大量的小数据包将继续流向目的地,在一个接一个的队列中,带着大数据包的常规流量将被挤出,但其自身的丢包量要小得多。
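The amplification across successive queues can be illustrated with a static model. The Python sketch below assumes, purely for illustration, that every congested queue applies the same base drop probability and that byte-mode drop scales it linearly by packet size relative to a 1,500 B reference; the function name and all numbers are hypothetical. It does not model the dynamic effect described above, in which small packets also push large ones aside.

```python
def survival(pkt_bytes, base_drop, hops, byte_mode, ref_bytes=1500):
    """Fraction of packets of one size that survive `hops` congested queues.

    With byte_mode=True the per-queue drop probability is scaled by
    pkt_bytes / ref_bytes (linear byte-mode drop). With byte_mode=False
    every packet sees base_drop regardless of its size.
    """
    drop = base_drop * pkt_bytes / ref_bytes if byte_mode else base_drop
    return (1.0 - drop) ** hops


# Five congested queues, 10% drop for full-size packets.
attack_byte_mode = survival(60, 0.1, hops=5, byte_mode=True)
victim_byte_mode = survival(1500, 0.1, hops=5, byte_mode=True)
attack_packet_mode = survival(60, 0.1, hops=5, byte_mode=False)
```

Under these assumptions, a flood of 60 B packets arrives almost intact with byte-mode drop, while full-size packets lose roughly 40% over the same path. With packet-mode drop, the flood suffers the same loss as everything else.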
Appendix C explains why the ability of networks to police the response of _any_ transport to congestion depends on bit-congestible network resources only doing packet-mode drop, not byte-mode drop. In summary, it says that making drop probability depend on the size of the packets that bits happen to be divided into simply encourages the bits to be divided into smaller packets. Byte-mode drop would therefore irreversibly complicate any attempt to fix the Internet's incentive structures.
附录C解释了为什么网络监控“任意”传输对拥塞的响应的能力取决于位拥塞网络资源只进行分组模式丢弃,而不是字节模式丢弃。总之,它说使丢弃概率取决于比特恰好被划分成的数据包的大小,这只会鼓励比特被划分成更小的数据包。因此,字节模式删除将使任何修复互联网激励结构的尝试不可逆转地复杂化。
This memo identifies the three distinct stages of the congestion notification process where implementations need to decide whether to take packet size into account. The recommendations provided in Section 2 of this memo are different in each case:
此备忘录确定了拥塞通知过程的三个不同阶段,其中实现需要决定是否考虑数据包大小。本备忘录第2节中提供的建议在每种情况下都是不同的:
o When network equipment measures the length of a queue, if it is not feasible to use time, it is recommended to count in bytes if the network resource is congested by bytes, or to count in packets if it is congested by packets.
o 当网络设备测量队列长度时,如果使用时间不可行,则建议:如果网络资源因字节而拥塞,以字节为单位计数;如果因数据包而拥塞,以数据包为单位计数。
o When network equipment decides whether to drop (or mark) a packet, it is recommended that the size of the particular packet should not be taken into account.
o 当网络设备决定是否丢弃(或标记)数据包时,建议不考虑特定数据包的大小。
o However, when a transport algorithm responds to a dropped or marked packet, the size of the rate reduction should be proportionate to the size of the packet.
o 然而,当传输算法响应丢弃或标记的数据包时,速率降低的大小应与数据包的大小成比例。
In summary, the answers are 'it depends', 'no', and 'yes', respectively.
总之,答案分别是“视情况而定”、“否”和“是”。
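The three stages listed above can be sketched side by side. The Python fragment below is a hypothetical illustration of where packet size enters and where it does not; the function names, the 1,500 B reference size, and the idea of a per-packet "congestion weight" are assumptions of this sketch rather than mechanisms defined by this memo.

```python
import random


def queue_length(pkt_sizes, bit_congestible=True):
    """Stage 1 ('it depends'): measure in bytes or in packets."""
    return sum(pkt_sizes) if bit_congestible else len(pkt_sizes)


def should_drop(drop_prob, pkt_bytes, rng=random):
    """Stage 2 ('no'): packet-mode drop; pkt_bytes is deliberately unused."""
    return rng.random() < drop_prob


def congestion_weight(pkt_bytes, typical_bytes=1500):
    """Stage 3 ('yes'): the transport weights each drop or mark by packet size."""
    return pkt_bytes / typical_bytes


# Identical random streams give identical decisions for small and large packets.
small_rng, large_rng = random.Random(1), random.Random(1)
small_decisions = [should_drop(0.3, 60, small_rng) for _ in range(100)]
large_decisions = [should_drop(0.3, 1500, large_rng) for _ in range(100)]
```

The drop decision is the only stage that ignores the size of the particular packet. A transport that loses a 150 B packet would treat it as one tenth of the congestion signalled by the loss of a 1,500 B packet.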
For the specific case of RED, this means that byte-mode queue measurement will often be appropriate, but the use of byte-mode drop is very strongly discouraged.
对于RED的特定情况,这意味着字节模式队列测量通常是合适的,但强烈建议不要使用字节模式drop。
At the transport layer, the IETF should continue updating congestion control protocols to take into account the size of each packet that indicates congestion. Also, the IETF should continue to make protocols less sensitive to losing control packets like SYNs, pure ACKs, and DNS exchanges. Although many control packets happen to be small, the alternative of network equipment favouring all small packets would be dangerous. That would create perverse incentives to split data transfers into smaller packets.
在传输层,IETF应继续更新拥塞控制协议,以考虑指示拥塞的每个数据包的大小。此外,IETF应继续降低协议对丢失控制数据包(如SYN、纯ACK和DNS交换)的敏感性。尽管许多控制数据包碰巧很小,但选择支持所有小数据包的网络设备将是危险的。这将产生不正当的动机,将数据传输拆分成更小的数据包。
The memo develops these recommendations from principled arguments concerning scaling, layering, incentives, inherent efficiency, security, and 'policeability'. It also addresses practical issues such as specific buffer architectures and incremental deployment. Indeed, a limited survey of RED implementations is discussed, which shows there appears to be little, if any, installed base of RED's byte-mode drop. Therefore, it can be deprecated with little, if any, incremental deployment complications.
备忘录从有关规模、分层、激励、内在效率、安全性和“可监管性”的原则性论点中提出了这些建议。它还涉及实际问题,例如特定的缓冲区体系结构和增量部署。事实上,本文讨论了对RED实现的有限调查,这表明RED的字节模式丢弃的安装基数似乎很少(如果有的话)。因此,可以弃用它,而几乎不会(如果有的话)带来增量部署方面的复杂性。
The recommendations have been developed on the well-founded basis that most Internet resources are bit-congestible, not packet-congestible. We need to know the likelihood that this assumption will prevail in the longer term and, if it might not, what protocol changes will be needed to cater for a mix of the two. The IRTF Internet Congestion Control Research Group (ICCRG) is currently working on these problems [RFC6077].
这些建议是建立在充分的基础上的,即大多数互联网资源都是比特拥塞的,而不是数据包拥塞的。我们需要知道这一假设在长期内占优势的可能性,如果可能不占优势,则需要对协议进行哪些更改,以满足两者的混合需求。IRTF互联网拥塞控制研究小组(ICCRG)目前正在研究这些问题[RFC6077]。
Thank you to Sally Floyd, who gave extensive and useful review comments. Also thanks for the reviews from Philip Eardley, David Black, Fred Baker, David Taht, Toby Moncaster, Arnaud Jacquet, and Mirja Kuehlewind, as well as helpful explanations of different hardware approaches from Larry Dunn and Fred Baker. We are grateful to Bruce Davie and his colleagues for providing a timely and efficient survey of RED implementation in Cisco's product range. Also, grateful thanks to Toby Moncaster, Will Dormann, John Regnault, Simon Carter, and Stefaan De Cnodder who further helped survey the current status of RED implementation and deployment, and, finally, thanks to the anonymous individuals who responded.
感谢Sally Floyd,她给出了广泛而有用的评论。还感谢Philip Eardley、David Black、Fred Baker、David Taht、Toby Moncaster、Arnaud Jacquet和Mirja Kuehlewind的评论,以及Larry Dunn和Fred Baker对不同硬件方法的有益解释。我们感谢Bruce Davie和他的同事对Cisco产品系列中的RED实施情况进行了及时有效的调查。此外,还要感谢Toby Moncaster、Will Dormann、John Regnault、Simon Carter和Stefaan De Cnodder,他们进一步帮助调查了RED实施和部署的现状,最后还要感谢回应的匿名人士。
Bob Briscoe and Jukka Manner were partly funded by Trilogy and Trilogy 2, research projects (ICT-216372, ICT-317756) supported by the European Community under its Seventh Framework Programme. The views expressed here are those of the authors only.
Bob Briscoe和Jukka Manner的部分资金来自Trilogy和Trilogy 2研究项目(ICT-216372,ICT-317756),该项目由欧洲共同体在其第七个框架方案下支持。此处所表达的观点仅为作者的观点。
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC2119]Bradner,S.,“RFC中用于表示需求水平的关键词”,BCP 14,RFC 2119,1997年3月。
[RFC2309] Braden, B., Clark, D., Crowcroft, J., Davie, B., Deering, S., Estrin, D., Floyd, S., Jacobson, V., Minshall, G., Partridge, C., Peterson, L., Ramakrishnan, K., Shenker, S., Wroclawski, J., and L. Zhang, "Recommendations on Queue Management and Congestion Avoidance in the Internet", RFC 2309, April 1998.
[RFC2309]Braden,B.,Clark,D.,Crowcroft,J.,Davie,B.,Deering,S.,Estrin,D.,Floyd,S.,Jacobson,V.,Minshall,G.,Partridge,C.,Peterson,L.,Ramakrishnan,K.,Shenker,S.,Wroclawski,J.,和L.Zhang,“关于互联网中队列管理和拥塞避免的建议”,RFC 2309,1998年4月。
[RFC2914] Floyd, S., "Congestion Control Principles", BCP 41, RFC 2914, September 2000.
[RFC2914]Floyd,S.,“拥塞控制原则”,BCP 41,RFC 2914,2000年9月。
[RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition of Explicit Congestion Notification (ECN) to IP", RFC 3168, September 2001.
[RFC3168]Ramakrishnan,K.,Floyd,S.,和D.Black,“向IP添加显式拥塞通知(ECN)”,RFC 3168,2001年9月。
[BLUE02] Feng, W-c., Shin, K., Kandlur, D., and D. Saha, "The BLUE active queue management algorithms", IEEE/ACM Transactions on Networking 10(4) 513-528, August 2002, <http://dx.doi.org/10.1109/TNET.2002.801399>.
[BLUE02]Feng,W-c.,Shin,K.,Kandlur,D.,和D.Saha,“蓝色主动队列管理算法”,IEEE/ACM网络事务10(4)513-528,2002年8月<http://dx.doi.org/10.1109/TNET.2002.801399>.
[CCvarPktSize] Widmer, J., Boutremans, C., and J-Y. Le Boudec, "End-to-end congestion control for TCP-friendly flows with variable packet size", ACM CCR 34(2) 137-151, April 2004, <http://doi.acm.org/10.1145/997150.997162>.
[CCvarPktSize]Widmer,J.,Boutremans,C.,和J-Y.Le Boudec,“具有可变数据包大小的TCP友好流的端到端拥塞控制”,ACM CCR 34(2)137-151,2004年4月<http://doi.acm.org/10.1145/997150.997162>.
[CHOKe_Var_Pkt] Psounis, K., Pan, R., and B. Prabhaker, "Approximate Fair Dropping for Variable-Length Packets", IEEE Micro 21(1):48-56, January-February 2001, <http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=903061>.
[CHOKe_Var_Pkt]Psounis,K.,Pan,R.,和B.Prabhaker,“可变长度数据包的近似公平丢弃”,IEEE Micro 21(1):48-56,2001年1月至2月<http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=903061>。
[CoDel] Nichols, K. and V. Jacobson, "Controlled Delay Active Queue Management", Work in Progress, February 2013.
[CoDel]Nichols,K.和V.Jacobson,“受控延迟主动队列管理”,正在进行的工作,2013年2月。
[DRQ] Shin, M., Chong, S., and I. Rhee, "Dual-Resource TCP/AQM for Processing-Constrained Networks", IEEE/ACM Transactions on Networking Vol 16, issue 2, April 2008, <http://dx.doi.org/10.1109/TNET.2007.900415>.
[DRQ]Shin,M.,Chong,S.和I.Rhee,“处理受限网络的双资源TCP/AQM”,IEEE/ACM网络事务卷16,第2期,2008年4月<http://dx.doi.org/10.1109/TNET.2007.900415>.
[DupTCP] Wischik, D., "Short messages", Philosophical Transactions of the Royal Society A 366(1872):1941-1953, June 2008, <http://rsta.royalsocietypublishing.org/content/366/1872/1941.full.pdf+html>.
[DupTCP]Wischik,D.,“短消息”,皇家学会哲学学报A 366(1872):1941-1953,2008年6月<http://rsta.royalsocietypublishing.org/content/366/1872/1941.full.pdf+html>。
[ECNFixedWireless] Siris, V., "Resource Control for Elastic Traffic in CDMA Networks", Proc. ACM MOBICOM'02, September 2002, <http://www.ics.forth.gr/netlab/publications/resource_control_elastic_cdma.html>.
[ECNFixedWireless]Siris,V.,“CDMA网络中弹性业务的资源控制”,Proc. ACM MOBICOM'02,2002年9月<http://www.ics.forth.gr/netlab/publications/resource_control_elastic_cdma.html>。
[Evol_cc] Gibbens, R. and F. Kelly, "Resource pricing and the evolution of congestion control", Automatica 35(12)1969-1985, December 1999, <http://www.sciencedirect.com/science/article/pii/S0005109899001351>.
[Evol_cc]Gibbens,R.和F.Kelly,“资源定价和拥塞控制的演变”,Automatica 35(12)1969-1985,1999年12月<http://www.sciencedirect.com/science/article/pii/S0005109899001351>。
[GentleAggro] Flach, T., Dukkipati, N., Terzis, A., Raghavan, B., Cardwell, N., Cheng, Y., Jain, A., Hao, S., Katz-Bassett, E., and R. Govindan, "Reducing web latency: the virtue of gentle aggression", ACM SIGCOMM CCR 43(4)159-170, August 2013, <http://doi.acm.org/10.1145/2486001.2486014>.
[GentleAggro]Flach,T.,Dukkipati,N.,Terzis,A.,Raghavan,B.,Cardwell,N.,Cheng,Y.,Jain,A.,Hao,S.,Katz-Bassett,E.,和R.Govindan,“减少网络延迟:温和攻击的美德”,ACM SIGCOMM CCR 43(4)159-170,2013年8月<http://doi.acm.org/10.1145/2486001.2486014>.
[IOSArch] Bollapragada, V., White, R., and C. Murphy, "Inside Cisco IOS Software Architecture", Cisco Press: CCIE Professional Development ISBN13: 978-1-57870-181-0, July 2000.
[IOSArch]Bollapragada,V.,White,R.,和C.Murphy,“内部思科IOS软件架构”,思科出版社:CCIE专业发展ISBN13:978-1-57870-181-0,2000年7月。
[PIE] Pan, R., Natarajan, P., Piglione, C., Prabhu, M., Subramanian, V., Baker, F., and B. Steeg, "PIE: A Lightweight Control Scheme To Address the Bufferbloat Problem", Work in Progress, February 2014.
[PIE]Pan,R.,Natarajan,P.,Piglione,C.,Prabhu,M.,Subramanian,V.,Baker,F.,和B.Steeg,“PIE:解决缓冲区膨胀问题的轻量级控制方案”,正在进行的工作,2014年2月。
[PktSizeEquCC] Vasallo, P., "Variable Packet Size Equation-Based Congestion Control", ICSI Technical Report tr-00-008, 2000, <http://http.icsi.berkeley.edu/ftp/global/pub/techreports/2000/tr-00-008.pdf>.
[PktSizeEquCC]Vasallo,P.,“基于可变数据包大小方程的拥塞控制”,ICSI技术报告tr-00-008,2000年<http://http.icsi.berkeley.edu/ftp/global/pub/techreports/2000/tr-00-008.pdf>。
[RED93] Floyd, S. and V. Jacobson, "Random Early Detection (RED) gateways for Congestion Avoidance", IEEE/ACM Transactions on Networking 1(4) 397-413, August 1993, <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=251892>.
[RED93]Floyd,S.和V.Jacobson,“避免拥塞的随机早期检测(RED)网关”,IEEE/ACM网络事务1(4)397-413,1993年8月<http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=251892>。
[REDbias] Eddy, W. and M. Allman, "A Comparison of RED's Byte and Packet Modes", Computer Networks 42(3) 261-280, June 2003, <http://www.ir.bbn.com/documents/articles/redbias.ps>.
[REDbias]Eddy,W.和M.Allman,“RED字节和数据包模式的比较”,计算机网络42(3)261-280,2003年6月<http://www.ir.bbn.com/documents/articles/redbias.ps>.
[REDbyte] De Cnodder, S., Elloumi, O., and K. Pauwels, "Effect of different packet sizes on RED performance", Proc. 5th IEEE Symposium on Computers and Communications (ISCC) 793-799, July 2000, <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=860741>.
[REDbyte]De Cnodder,S.,Elloumi,O.,和K.Pauwels,“不同数据包大小对RED性能的影响”,Proc. 第五届IEEE计算机与通信研讨会(ISCC)793-799,2000年7月<http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=860741>。
[RFC2474] Nichols, K., Blake, S., Baker, F., and D. Black, "Definition of the Differentiated Services Field (DS Field) in the IPv4 and IPv6 Headers", RFC 2474, December 1998.
[RFC2474]Nichols,K.,Blake,S.,Baker,F.,和D.Black,“IPv4和IPv6头中区分服务字段(DS字段)的定义”,RFC 2474,1998年12月。
[RFC3426] Floyd, S., "General Architectural and Policy Considerations", RFC 3426, November 2002.
[RFC3426]Floyd,S.,“一般建筑和政策考虑”,RFC 3426,2002年11月。
[RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V. Jacobson, "RTP: A Transport Protocol for Real-Time Applications", STD 64, RFC 3550, July 2003.
[RFC3550]Schulzrinne,H.,Casner,S.,Frederick,R.,和V.Jacobson,“RTP:实时应用的传输协议”,STD 64,RFC 3550,2003年7月。
[RFC3714] Floyd, S. and J. Kempf, "IAB Concerns Regarding Congestion Control for Voice Traffic in the Internet", RFC 3714, March 2004.
[RFC3714]Floyd,S.和J.Kempf,“IAB对互联网语音流量拥塞控制的关注”,RFC 3714,2004年3月。
[RFC4828] Floyd, S. and E. Kohler, "TCP Friendly Rate Control (TFRC): The Small-Packet (SP) Variant", RFC 4828, April 2007.
[RFC4828]Floyd,S.和E.Kohler,“TCP友好速率控制(TFRC):小数据包(SP)变体”,RFC 4828,2007年4月。
[RFC5348] Floyd, S., Handley, M., Padhye, J., and J. Widmer, "TCP Friendly Rate Control (TFRC): Protocol Specification", RFC 5348, September 2008.
[RFC5348]Floyd,S.,Handley,M.,Padhye,J.,和J.Widmer,“TCP友好速率控制(TFRC):协议规范”,RFC 5348,2008年9月。
[RFC5562] Kuzmanovic, A., Mondal, A., Floyd, S., and K. Ramakrishnan, "Adding Explicit Congestion Notification (ECN) Capability to TCP's SYN/ACK Packets", RFC 5562, June 2009.
[RFC5562]Kuzmanovic,A.,Mondal,A.,Floyd,S.,和K.Ramakrishnan,“向TCP的SYN/ACK数据包添加显式拥塞通知(ECN)功能”,RFC 5562,2009年6月。
[RFC5670] Eardley, P., "Metering and Marking Behaviour of PCN-Nodes", RFC 5670, November 2009.
[RFC5670]Eardley,P.,“PCN节点的计量和标记行为”,RFC 5670,2009年11月。
[RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion Control", RFC 5681, September 2009.
[RFC5681]Allman,M.,Paxson,V.和E.Blanton,“TCP拥塞控制”,RFC 5681,2009年9月。
[RFC5690] Floyd, S., Arcia, A., Ros, D., and J. Iyengar, "Adding Acknowledgement Congestion Control to TCP", RFC 5690, February 2010.
[RFC5690]Floyd,S.,Arcia,A.,Ros,D.,和J.Iyengar,“将确认拥塞控制添加到TCP”,RFC 5690,2010年2月。
[RFC6077] Papadimitriou, D., Welzl, M., Scharf, M., and B. Briscoe, "Open Research Issues in Internet Congestion Control", RFC 6077, February 2011.
[RFC6077]Papadimitriou,D.,Welzl,M.,Scharf,M.,和B.Briscoe,“互联网拥塞控制的开放研究问题”,RFC 6077,2011年2月。
[RFC6679] Westerlund, M., Johansson, I., Perkins, C., O'Hanlon, P., and K. Carlberg, "Explicit Congestion Notification (ECN) for RTP over UDP", RFC 6679, August 2012.
[RFC6679]Westerlund,M.,Johansson,I.,Perkins,C.,O'Hanlon,P.,和K.Carlberg,“UDP上RTP的显式拥塞通知(ECN)”,RFC 6679,2012年8月。
[RFC6789] Briscoe, B., Woundy, R., and A. Cooper, "Congestion Exposure (ConEx) Concepts and Use Cases", RFC 6789, December 2012.
[RFC6789]Briscoe,B.,Woundy,R.,和A.Cooper,“拥塞暴露(ConEx)概念和用例”,RFC 6789,2012年12月。
[Rate_fair_Dis] Briscoe, B., "Flow Rate Fairness: Dismantling a Religion", ACM CCR 37(2)63-74, April 2007, <http://portal.acm.org/citation.cfm?id=1232926>.
[Rate_fair_Dis]Briscoe,B.,“流量公平:摧毁宗教”,ACM CCR 37(2)63-74,2007年4月<http://portal.acm.org/citation.cfm?id=1232926>.
[gentle_RED] Floyd, S., "Recommendation on using the "gentle_" variant of RED", Web page, March 2000, <http://www.icir.org/floyd/red/gentle.html>.
[gentle_RED]Floyd,S.,“关于使用RED的“gentle_”变体的建议”,网页,2000年3月<http://www.icir.org/floyd/red/gentle.html>.
[pBox] Floyd, S. and K. Fall, "Promoting the Use of End-to-End Congestion Control", IEEE/ACM Transactions on Networking 7(4) 458-472, August 1999, <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=793002>.
[pBox]Floyd,S.和K.Fall,“促进端到端拥塞控制的使用”,IEEE/ACM网络交易7(4)458-472,1999年8月<http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=793002>。
[pktByteEmail] Floyd, S., "RED: Discussions of Byte and Packet Modes", email, March 1997, <http://ee.lbl.gov/floyd/REDaveraging.txt>.
[pktByteEmail]Floyd,S.,“RED:字节和数据包模式的讨论”,电子邮件,1997年3月<http://ee.lbl.gov/floyd/REDaveraging.txt>.
This Appendix is informative, not normative.
本附录为资料性附录,非规范性附录。
In May 2007 a survey was conducted of 84 vendors to assess how widely drop probability based on packet size has been implemented in RED (Table 3). About 19% of those surveyed replied, giving a sample size of 16. Although in most cases we do not have permission to identify the respondents, we can say that those that have responded include most of the larger equipment vendors, covering a large fraction of the market. The two who gave permission to be identified were Cisco and Alcatel-Lucent. The others range across the large network equipment vendors at L3 & L2, firewall vendors, wireless equipment vendors, as well as large software businesses with a small selection of networking products. All those who responded confirmed that they have not implemented the variant of RED with drop dependent on packet size (2 were fairly sure they had not but needed to check more thoroughly). At the time the survey was conducted, Linux did not implement RED with packet-size bias of drop, although we have not investigated a wider range of open source code.
2007年5月,对84家供应商进行了一项调查,以评估基于数据包大小的丢弃概率在RED中的实现程度(表3)。大约19%的受访者回答了这个问题,给出了16个样本。虽然在大多数情况下,我们没有权限确定被调查者,但我们可以说,被调查者包括大部分大型设备供应商,覆盖了大部分市场。两名被允许透露身份的人是思科(Cisco)和阿尔卡特朗讯(Alcatel-Lucent)。其他公司包括L3和L2的大型网络设备供应商、防火墙供应商、无线设备供应商,以及拥有少量网络产品的大型软件企业。所有回答的人都确认他们没有实现丢弃取决于数据包大小的RED变体(2人相当确定他们没有,但需要更彻底地检查)。在进行调查时,Linux没有实现带有数据包大小丢弃偏差的RED,尽管我们还没有调查更广泛的开放源代码。
+-------------------------------+----------------+--------------+
| Response                      | No. of vendors | % of vendors |
+-------------------------------+----------------+--------------+
| Not implemented               | 14             | 17%          |
| Not implemented (probably)    | 2              | 2%           |
| Implemented                   | 0              | 0%           |
| No response                   | 68             | 81%          |
| Total companies/orgs surveyed | 84             | 100%         |
+-------------------------------+----------------+--------------+
Table 3: Vendor Survey on byte-mode drop variant of RED (lower drop probability for small packets)
Where reasons were given for why the byte-mode drop variant had not been implemented, the extra complexity of packet-bias code was most prevalent, though one vendor had a more principled reason for avoiding it -- similar to the argument of this document.
Our survey was of vendor implementations, so we cannot be certain about operator deployment. But we believe many queues in the Internet are still tail drop. The company of one of the co-authors (BT) has widely deployed RED; however, many tail-drop queues are bound to still exist, particularly in access network equipment and on middleboxes like firewalls, where RED is not always available.
Routers using a memory architecture based on fixed-size buffers with borrowing may also still be prevalent in the Internet. As explained in Section 4.2.1, these also provide a marginal (but legitimate) bias towards small packets. So even though RED byte-mode drop is not prevalent, it is likely there is still some bias towards small packets in the Internet due to tail-drop and fixed-buffer borrowing.
This Appendix is informative, not normative.
Here we check that packet-mode drop (or marking) in the network gives sufficiently generic information for the transport layer to use. We check against a 2x2 matrix of four scenarios that may occur now or in the future (Table 4). Checking the two scenarios in each of the horizontal and vertical dimensions tests the extremes of sensitivity to packet size in the transport and in the network respectively.
Note that this section does not consider byte-mode drop at all. Having deprecated byte-mode drop, the goal here is to check that packet-mode drop will be sufficient in all cases.
+-------------------------------+-----------------+-----------------+
|                  Transport -> | a) Independent  | b) Dependent on |
| ----------------------------- | of packet size  | packet size of  |
| Network                       | of congestion   | congestion      |
|                               | notifications   | notifications   |
+-------------------------------+-----------------+-----------------+
| 1) Predominantly bit-         | Scenario a1)    | Scenario b1)    |
|    congestible network        |                 |                 |
| 2) Mix of bit-congestible and | Scenario a2)    | Scenario b2)    |
|    pkt-congestible network    |                 |                 |
+-------------------------------+-----------------+-----------------+
Table 4: Four Possible Congestion Scenarios
Appendix B.1 focuses on the horizontal dimension of Table 4, checking that packet-mode drop (or marking) gives sufficient information, whether or not the transport uses it -- scenarios b) and a) respectively.
Appendix B.2 focuses on the vertical dimension of Table 4, checking that packet-mode drop gives sufficient information to the transport whether resources in the network are bit-congestible or packet-congestible (these terms are defined in Section 1.1).
Notation: To be concrete, we will compare two flows with different packet sizes, s_1 and s_2. As an example, we will take s_1 = 60 B = 480 b and s_2 = 1,500 B = 12,000 b.
A flow's bit rate, x [bps], is related to its packet rate, u [pps], by
x(t) = s*u(t).
In the bit-congestible case, path congestion will be denoted by p_b, and in the packet-congestible case by p_p. When either case is implied, the letter p alone will denote path congestion.
In all cases, we consider a packet-mode drop queue that indicates congestion by dropping (or marking) packets with probability p irrespective of packet size. We use an example value of loss (marking) probability, p=0.1%.
A transport like TCP as specified in RFC 5681 treats a congestion notification on any packet whatever its size as one event. However, a network with just the packet-mode drop algorithm gives more information if the transport chooses to use it. We will use Table 5 to illustrate this.
We will set aside the last column until later. The columns labelled 'Flow 1' and 'Flow 2' compare two flows consisting of 60 B and 1,500 B packets respectively. The body of the table considers two separate cases, one where the flows have an equal bit rate and the other with equal packet rates. In both cases, the two flows fill a 96 Mbps link. Therefore, in the equal bit rate case, they each have half the bit rate (48 Mbps). With equal packet rates, by contrast, Flow 1 uses 25 times smaller packets so it gets 25 times less bit rate -- it only gets 1/(1+25) of the link capacity (96 Mbps / 26 = 4 Mbps after rounding). Flow 2 gets 25 times more bit rate (92 Mbps) in the equal packet rate case because its packets are 25 times larger. The packet rate shown for each flow can be derived from the bit rate by dividing it by the packet size, as shown in the column labelled 'Formula'.
Parameter                Formula       Flow 1    Flow 2    Combined
-----------------------  ------------  --------  --------  --------
Packet size              s/8           60 B      1,500 B   (Mix)
Packet size              s             480 b     12,000 b  (Mix)
Pkt loss probability     p             0.1%      0.1%      0.1%

EQUAL BIT RATE CASE
Bit rate                 x             48 Mbps   48 Mbps   96 Mbps
Packet rate              u = x/s       100 kpps  4 kpps    104 kpps
Absolute pkt-loss rate   p*u           100 pps   4 pps     104 pps
Absolute bit-loss rate   p*u*s         48 kbps   48 kbps   96 kbps
Ratio of lost/sent pkts  p*u/u         0.1%      0.1%      0.1%
Ratio of lost/sent bits  p*u*s/(u*s)   0.1%      0.1%      0.1%

EQUAL PACKET RATE CASE
Bit rate                 x             4 Mbps    92 Mbps   96 Mbps
Packet rate              u = x/s       8 kpps    8 kpps    15 kpps
Absolute pkt-loss rate   p*u           8 pps     8 pps     15 pps
Absolute bit-loss rate   p*u*s         4 kbps    92 kbps   96 kbps
Ratio of lost/sent pkts  p*u/u         0.1%      0.1%      0.1%
Ratio of lost/sent bits  p*u*s/(u*s)   0.1%      0.1%      0.1%
Table 5: Absolute Loss Rates and Loss Ratios for Flows of Small and Large Packets and Both Combined
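The figures in Table 5 can be reproduced from the notation above (x = s*u, loss rate p*u, bit-loss rate p*u*s). The following is a minimal sketch, not part of the original memo; the function and constant names are illustrative, and the numeric values are the example values used in this appendix.

```python
# Example values from this appendix; all names here are illustrative only.
LINK_BPS = 96e6          # both flows together fill a 96 Mbps link
S1_BITS = 480            # Flow 1: 60 B packets
S2_BITS = 12_000         # Flow 2: 1,500 B packets
P = 0.001                # size-independent loss (marking) probability, 0.1%


def flow_row(bit_rate_bps, pkt_size_bits, p=P):
    """Return the per-flow quantities of Table 5 for fixed-size packets."""
    pkt_rate = bit_rate_bps / pkt_size_bits             # u = x/s  [pps]
    return {
        "bit_rate": bit_rate_bps,                       # x        [bps]
        "pkt_rate": pkt_rate,
        "pkt_loss_rate": p * pkt_rate,                  # p*u      [pps]
        "bit_loss_rate": p * pkt_rate * pkt_size_bits,  # p*u*s    [bps]
    }


def equal_bit_rate_case():
    """Each flow gets half of the link's bit rate."""
    half = LINK_BPS / 2
    return flow_row(half, S1_BITS), flow_row(half, S2_BITS)


def equal_packet_rate_case():
    """Both flows send the same number of packets per second."""
    u = LINK_BPS / (S1_BITS + S2_BITS)  # common packet rate that fills the link
    return flow_row(u * S1_BITS, S1_BITS), flow_row(u * S2_BITS, S2_BITS)


if __name__ == "__main__":
    for name, case in (("equal bit rate", equal_bit_rate_case),
                       ("equal packet rate", equal_packet_rate_case)):
        for i, row in enumerate(case(), start=1):
            print(name, "flow", i, {k: round(v, 1) for k, v in row.items()})
```

The table rounds the equal packet rate case to whole units; the unrounded common packet rate is 96 Mbps / 12,480 b, a little under 8 kpps per flow, which is why the combined column shows 15 kpps rather than 16 kpps.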
So far, we have merely set up the scenarios. We now consider congestion notification in the scenario. Two TCP flows with the same round-trip time aim to equalise their packet-loss rates over time; that is, the number of packets lost in a second, which is the packets per second (u) multiplied by the probability that each one is dropped (p). Thus, TCP converges on the case labelled 'Equal packet rate' in the table, where both flows aim for the same absolute packet-loss rate (both 8 pps in the table).
Packet-mode drop actually gives flows sufficient information to measure their loss rate in bits per second, if they choose, not just packets per second. Each flow can count the size of a lost or marked packet and scale its rate response in proportion (as TFRC-SP does). The result is shown in the row entitled 'Absolute bit-loss rate', where the bits lost in a second is the packets per second (u) multiplied by the probability of losing a packet (p) multiplied by the packet size (s). Such an algorithm would try to remove any imbalance in the bit-loss rate such as the wide disparity in the case labelled 'Equal packet rate' (4 kbps vs. 92 kbps). Instead, a packet-size-dependent algorithm would aim for equal bit-loss rates (both 48 kbps in this example), which would drive both flows towards the case labelled 'Equal bit rate'.
The explanation so far has assumed that each flow consists of packets of only one constant size. Nonetheless, it extends naturally to flows with mixed packet sizes. In the right-most column of Table 5, a flow of mixed-size packets is created simply by considering Flow 1 and Flow 2 as a single aggregated flow. There is no need for a flow to maintain an average packet size. It is only necessary for the transport to scale its response to each congestion indication by the size of each individual lost (or marked) packet. Taking, for example, the case labelled 'Equal packet rate', in one second about 8 small packets and 8 large packets are lost (making closer to 15 than 16 losses per second due to rounding). If the transport multiplies each loss by its size, in one second it responds to about 8*480 and 8*12,000 lost bits, adding up to 96,000 lost bits in a second once the rounding is undone (each flow actually loses about 7.7 packets per second). This cross-checks correctly, being the same as 0.1% of the total bit rate of 96 Mbps. For completeness, the formula for absolute bit-loss rate is p*(u_1*s_1 + u_2*s_2).
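As a rough illustration of scaling the response by the size of each individual lost (or marked) packet, the sketch below keeps both a per-packet and a per-bit tally. The class and function names are hypothetical and are not taken from any transport specification; this is not how TCP or TFRC-SP is actually implemented.

```python
class LossAccountant:
    """Tallies congestion indications both per packet and per bit."""

    def __init__(self):
        self.lost_packets = 0
        self.lost_bits = 0

    def on_congestion_indication(self, pkt_size_bits):
        # Scale by the size of this individual packet; no running
        # average of packet size needs to be maintained.
        self.lost_packets += 1
        self.lost_bits += pkt_size_bits


def expected_bit_loss_rate(p, flows):
    """p * sum(u_i * s_i) over an iterable of (packet_rate_pps, pkt_size_bits)."""
    return p * sum(u * s for u, s in flows)


if __name__ == "__main__":
    acct = LossAccountant()
    for size_bits in [480] * 8 + [12_000] * 8:   # one second of rounded losses
        acct.on_congestion_indication(size_bits)
    print(acct.lost_packets, acct.lost_bits)

    u = 96e6 / (480 + 12_000)                    # unrounded common packet rate
    print(expected_bit_loss_rate(0.001, [(u, 480), (u, 12_000)]))
```

With the rounded figure of 8 whole losses per flow the tally is 99,840 bits, whereas the unrounded expectation from the formula is 96,000 bits per second, which matches 0.1% of 96 Mbps.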
Incidentally, a transport will always measure the loss probability the same, irrespective of whether it measures in packets or in bytes. In other words, the ratio of lost packets to sent packets will be the same as the ratio of lost bytes to sent bytes. (This is why TCP's bit rate is still proportional to packet size, even when byte counting is used, as recommended for TCP in [RFC5681], mainly for orthogonal security reasons.) This is intuitively obvious by comparing two example flows: one with 60 B packets, the other with 1,500 B packets. If both flows pass through a queue with drop probability 0.1%, each flow will lose 1 in 1,000 packets. In the stream of 60 B packets, the ratio of lost bytes to sent bytes will be 60 B in every 60,000 B; and in the stream of 1,500 B packets, the loss ratio will be 1,500 B out of 1,500,000 B. When the transport responds to the ratio of lost to sent packets, it will measure the same ratio whether it measures in packets or bytes: 0.1% in both cases. The fact that this ratio is the same whether measured in packets or bytes can be seen in Table 5, where the ratio of lost packets to sent packets and the ratio of lost bytes to sent bytes are always 0.1% in all cases (recall that the scenario was set up with p=0.1%).
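The equality of the two ratios can also be checked numerically. The following is a small simulation sketch of a size-independent (packet-mode) drop decision; the function name, the seed, and the stream lengths are arbitrary choices made for illustration. For a flow of uniform packet size the two ratios are identical; for a mixed stream they agree only in expectation, so the simulated values are close to p rather than exactly equal.

```python
import random


def simulate_packet_mode_drop(sizes_bytes, p, seed=1):
    """Drop each packet with probability p, regardless of its size.

    Returns (lost/sent ratio in packets, lost/sent ratio in bytes).
    """
    rng = random.Random(seed)
    sent_pkts = sent_bytes = lost_pkts = lost_bytes = 0
    for size in sizes_bytes:
        sent_pkts += 1
        sent_bytes += size
        if rng.random() < p:          # size plays no part in the decision
            lost_pkts += 1
            lost_bytes += size
    return lost_pkts / sent_pkts, lost_bytes / sent_bytes


if __name__ == "__main__":
    mixed = [60, 1500] * 500_000      # aggregate of small and large packets
    pkt_ratio, byte_ratio = simulate_packet_mode_drop(mixed, p=0.001)
    print(f"lost/sent packets: {pkt_ratio:.4%}, lost/sent bytes: {byte_ratio:.4%}")
```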
This discussion of how the ratio can be measured in packets or bytes is only raised here to highlight that it is irrelevant to this memo! Whether or not a transport depends on packet size depends on how this ratio is used within the congestion control algorithm.
So far, we have shown that packet-mode drop passes sufficient information to the transport layer so that the transport can take bit congestion into account, by using the sizes of the packets that indicate congestion. We have also shown that the transport can choose not to take packet size into account if it wishes. We will now consider whether the transport can know which to do.
As a thought-experiment, imagine an idealised congestion notification protocol that supports both bit-congestible and packet-congestible resources. It would require at least two ECN flags, one for each of the bit-congestible and packet-congestible resources.
1. A packet-congestible resource trying to code congestion level p_p into a packet stream should mark the idealised 'packet congestion' field in each packet with probability p_p irrespective of the packet's size. The transport should then take a packet with the packet congestion field marked to mean just one mark, irrespective of the packet size.
2. A bit-congestible resource trying to code time-varying byte-congestion level p_b into a packet stream should mark the 'byte congestion' field in each packet with probability p_b, again irrespective of the packet's size. Unlike before, the transport should take a packet with the byte congestion field marked to count as a mark on each byte in the packet.
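The two rules of this thought-experiment can be sketched from the transport's point of view as follows. The two header fields are purely hypothetical (no such fields exist in ECN as specified), and the type and function names are invented for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    size_bytes: int
    pkt_congestion_marked: bool = False    # hypothetical 'packet congestion' field
    byte_congestion_marked: bool = False   # hypothetical 'byte congestion' field


def congestion_units(pkt):
    """Return (packet_marks, byte_marks) the transport counts for one packet."""
    # Rule 1: a packet-congestion mark counts as one mark, whatever the size.
    packet_marks = 1 if pkt.pkt_congestion_marked else 0
    # Rule 2: a byte-congestion mark counts as a mark on every byte in the packet.
    byte_marks = pkt.size_bytes if pkt.byte_congestion_marked else 0
    return packet_marks, byte_marks


if __name__ == "__main__":
    print(congestion_units(Packet(1500, pkt_congestion_marked=True)))
    print(congestion_units(Packet(60, byte_congestion_marked=True)))
```

In both rules the network marks with a probability that ignores packet size; only the transport's interpretation of a mark differs between the two fields.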
This hides a fundamental problem -- much more fundamental than whether we can magically create header space for yet another ECN flag, or whether it would work while being deployed incrementally. Distinguishing drop from delivery naturally provides just one implicit bit of congestion indication information -- the packet is either dropped or not. It is hard to drop a packet in two ways that are distinguishable remotely. This is a similar problem to that of distinguishing wireless transmission losses from congestive losses.
This problem would not be solved, even if ECN were universally deployed. A congestion notification protocol must survive a transition from low levels of congestion to high. Marking two states is feasible with explicit marking, but it is much harder if packets are dropped. Also, it will not always be cost-effective to implement AQM at every low-level resource, so drop will often have to suffice.
We are not saying two ECN fields will be needed (and we are not saying that somehow a resource should be able to drop a packet in one of two different ways so that the transport can distinguish which sort of drop it was!). These two congestion notification channels are a conceptual device to illustrate a dilemma we could face in the future. Section 3 gives four good reasons why it would be a bad idea to allow for packet size by biasing drop probability in favour of small packets within the network. The impracticality of our thought experiment shows that it will be hard to give transports a practical way to know whether or not to take into account the size of congestion indication packets.
Fortunately, this dilemma is not pressing because by design most equipment becomes bit-congested before its packet processing becomes congested (as already outlined in Section 1.1). Therefore, transports can be designed on the relatively sound assumption that a congestion indication will usually imply bit congestion.
Nonetheless, although the above idealised protocol isn't intended for implementation, we do want to emphasise that research is needed to predict whether there are good reasons to believe that packet congestion might become more common, and if so, to find a way to somehow distinguish between bit and packet congestion [RFC3714].
Recently, the dual resource queue (DRQ) proposal [DRQ] has been made on the premise that, as network processors become more cost-effective, per-packet operations will become more complex (irrespective of whether more function in the network is desirable). Consequently the premise is that CPU congestion will become more common. DRQ is a proposed modification to the RED algorithm that folds both bit congestion and packet congestion into one signal (either loss or ECN).
Finally, we note one further complication. Strictly, packet-congestible resources are often cycle-congestible. For instance, for routing lookups, load depends on the complexity of each lookup and whether or not the pattern of arrivals is amenable to caching. This also reminds us that any solution must not require a forwarding engine to use excessive processor cycles in order to decide how to say it has no spare processor cycles.
This section is informative, not normative.
There are two main classes of approach to policing congestion response: (i) policing at each bottleneck link or (ii) policing at the edges of networks. Packet-mode drop in RED is compatible with either, while byte-mode drop precludes edge policing.
The simplicity of an edge policer relies on one dropped or marked packet being equivalent to another of the same size without having to know which link the drop or mark occurred at. However, the byte-mode drop algorithm has to depend on the local MTU of the line -- it needs to use some concept of a 'normal' packet size. Therefore, one dropped or marked packet from a byte-mode drop algorithm is not necessarily equivalent to another from a different link. A policing function local to the link can know the local MTU where the congestion occurred. However, a policer at the edge of the network cannot, at least not without a lot of complexity.
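To make the MTU dependence concrete, the sketch below assumes one commonly described form of byte-mode drop, in which the drop probability is scaled by the packet's size relative to a reference ('normal') packet size, taken here to be the link MTU. This form, the function names, and the example MTUs are assumptions made for illustration, not a description of any particular implementation.

```python
def byte_mode_drop_probability(p, pkt_size_bytes, reference_size_bytes):
    """Assumed byte-mode form: scale p by packet size over a 'normal' size."""
    return min(1.0, p * pkt_size_bytes / reference_size_bytes)


def packet_mode_drop_probability(p, pkt_size_bytes):
    """Packet-mode drop ignores packet size entirely."""
    return p


if __name__ == "__main__":
    p = 0.001
    # The same 600 B packet at the same congestion level p, on two links
    # whose 'normal' packet sizes (MTUs) differ.
    for mtu in (1500, 9000):
        print(mtu, byte_mode_drop_probability(p, 600, mtu))
    print(packet_mode_drop_probability(p, 600))
```

Under this assumption, a drop of a 600 B packet signals a different congestion level depending on the MTU of the link where it happened, which is information an edge policer does not have. With packet-mode drop the probability is p on every link.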
The early research proposals for type (i) policing at a bottleneck link [pBox] used byte-mode drop, then detected flows that contributed disproportionately to the number of packets dropped. However, with no extra complexity, later proposals used packet-mode drop and looked for flows that contributed a disproportionate amount of dropped bytes [CHOKe_Var_Pkt].
Work is progressing on the Congestion Exposure (ConEx) protocol [RFC6789], which enables a type (ii) edge policer located at a user's attachment point. The idea is to be able to take an integrated view of the effect of all a user's traffic on any link in the internetwork. However, byte-mode drop would effectively preclude such edge policing because of the MTU issue above.
Indeed, making drop probability depend on the size of the packets that bits happen to be divided into would simply encourage the bits to be divided into smaller packets in order to confuse policing. In contrast, as long as a dropped/marked packet is taken to mean that all the bytes in the packet are dropped/marked, a policer can remain robust against sequences of bits being re-divided into different size packets or across different size flows [Rate_fair_Dis].
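The robustness argument can be illustrated with expected values. The sketch below computes the expected number of dropped/marked bytes a byte-counting policer would see when the same volume of data is sent in different packet sizes. The byte-mode branch assumes drop probability scaled by packet size over the MTU; that form, the default MTU, and the function name are assumptions for illustration only.

```python
def expected_marked_bytes(total_bytes, pkt_size, p, byte_mode=False, mtu=1500):
    """Expected bytes in dropped/marked packets for a given packetisation."""
    n_pkts = total_bytes / pkt_size
    # Packet-mode: every packet is hit with probability p.
    # Assumed byte-mode: probability scaled by pkt_size / mtu.
    per_pkt_prob = p * pkt_size / mtu if byte_mode else p
    return n_pkts * per_pkt_prob * pkt_size


if __name__ == "__main__":
    for size in (1500, 60):
        print(size,
              expected_marked_bytes(1_500_000, size, 0.001),
              expected_marked_bytes(1_500_000, size, 0.001, byte_mode=True))
```

With packet-mode drop the expected marked volume is p times the bytes sent, however the bits are divided into packets. Under the assumed byte-mode form, sending the same bytes in 60 B packets instead of 1,500 B packets cuts the expected marked volume by a factor of 25, which is the incentive to fragment that the text warns about.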
Authors' Addresses
Bob Briscoe
BT
B54/77, Adastral Park
Martlesham Heath
Ipswich IP5 3RE
UK

Phone: +44 1473 645196
EMail: bob.briscoe@bt.com
URI:   http://bobbriscoe.net/
Jukka Manner
Aalto University
Department of Communications and Networking (Comnet)
P.O. Box 13000
FIN-00076 Aalto
Finland

Phone: +358 9 470 22481
EMail: jukka.manner@aalto.fi
URI:   http://www.netlab.tkk.fi/~jmanner/