Internet Engineering Task Force (IETF)            T. Hoeiland-Joergensen
Request for Comments: 8290                           Karlstad University
Category: Experimental                                       P. McKenney
ISSN: 2070-1721                              IBM Linux Technology Center
                                                                 D. Taht
                                                                Teklibre
                                                               J. Gettys
        
Internet Engineering Task Force (IETF)            T. Hoeiland-Joergensen
Request for Comments: 8290                           Karlstad University
Category: Experimental                                       P. McKenney
ISSN: 2070-1721                              IBM Linux Technology Center
                                                                 D. Taht
                                                                Teklibre
                                                               J. Gettys
        

E. Dumazet Google, Inc. January 2018

E.Dumazet Google,Inc.2018年1月

The Flow Queue CoDel Packet Scheduler and Active Queue Management Algorithm

流队列编码包调度器与主动队列管理算法

Abstract

摘要

This memo presents the FQ-CoDel hybrid packet scheduler and Active Queue Management (AQM) algorithm, a powerful tool for fighting bufferbloat and reducing latency.

本备忘录介绍了FQ CoDel混合数据包调度器和主动队列管理(AQM)算法,这是一种对抗缓冲区膨胀和减少延迟的强大工具。

FQ-CoDel mixes packets from multiple flows and reduces the impact of head-of-line blocking from bursty traffic. It provides isolation for low-rate traffic such as DNS, web, and videoconferencing traffic. It improves utilisation across the networking fabric, especially for bidirectional traffic, by keeping queue lengths short, and it can be implemented in a memory- and CPU-efficient fashion across a wide range of hardware.

FQ CoDel将来自多个流的数据包混合在一起,并减少突发流量造成的线端阻塞的影响。它为DNS、web和视频会议流量等低速率流量提供隔离。它通过保持队列长度较短,提高了整个网络结构的利用率,尤其是双向流量,并且可以在各种硬件上以内存和CPU效率高的方式实现。

Status of This Memo

关于下段备忘

This document is not an Internet Standards Track specification; it is published for examination, experimental implementation, and evaluation.

本文件不是互联网标准跟踪规范;它是为检查、实验实施和评估而发布的。

This document defines an Experimental Protocol for the Internet community. This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Not all documents approved by the IESG are a candidate for any level of Internet Standard; see Section 2 of RFC 7841.

本文档为互联网社区定义了一个实验协议。本文件是互联网工程任务组(IETF)的产品。它代表了IETF社区的共识。它已经接受了公众审查,并已被互联网工程指导小组(IESG)批准出版。并非IESG批准的所有文件都适用于任何级别的互联网标准;见RFC 7841第2节。

Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at https://www.rfc-editor.org/info/rfc8290.

有关本文件当前状态、任何勘误表以及如何提供反馈的信息,请访问https://www.rfc-editor.org/info/rfc8290.

Copyright Notice

版权公告

Copyright (c) 2018 IETF Trust and the persons identified as the document authors. All rights reserved.

版权所有(c)2018 IETF信托基金和确定为文件作者的人员。版权所有。

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

本文件受BCP 78和IETF信托有关IETF文件的法律规定的约束(https://trustee.ietf.org/license-info)自本文件出版之日起生效。请仔细阅读这些文件,因为它们描述了您对本文件的权利和限制。从本文件中提取的代码组件必须包括信托法律条款第4.e节中所述的简化BSD许可证文本,并提供简化BSD许可证中所述的无担保。

Table of Contents

目录

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   4
     1.1.  Conventions Used in This Document . . . . . . . . . . . .   4
     1.2.  Terminology and Concepts  . . . . . . . . . . . . . . . .   5
     1.3.  Informal Summary of FQ-CoDel  . . . . . . . . . . . . . .   5
   2.  CoDel . . . . . . . . . . . . . . . . . . . . . . . . . . . .   7
   3.  Flow Queueing . . . . . . . . . . . . . . . . . . . . . . . .   7
   4.  The FQ-CoDel Scheduler  . . . . . . . . . . . . . . . . . . .   8
     4.1.  Enqueue . . . . . . . . . . . . . . . . . . . . . . . . .   8
       4.1.1.  Alternative Classification Schemes  . . . . . . . . .   9
     4.2.  Dequeue . . . . . . . . . . . . . . . . . . . . . . . . .  10
   5.  Implementation Considerations . . . . . . . . . . . . . . . .  11
     5.1.  Data Structures . . . . . . . . . . . . . . . . . . . . .  11
     5.2.  Parameters  . . . . . . . . . . . . . . . . . . . . . . .  12
       5.2.1.  Interval  . . . . . . . . . . . . . . . . . . . . . .  12
       5.2.2.  Target  . . . . . . . . . . . . . . . . . . . . . . .  12
       5.2.3.  Packet Limit  . . . . . . . . . . . . . . . . . . . .  13
       5.2.4.  Quantum . . . . . . . . . . . . . . . . . . . . . . .  13
       5.2.5.  Flows . . . . . . . . . . . . . . . . . . . . . . . .  13
       5.2.6.  Explicit Congestion Notification (ECN)  . . . . . . .  14
       5.2.7.  CE Threshold  . . . . . . . . . . . . . . . . . . . .  14
     5.3.  Probability of Hash Collisions  . . . . . . . . . . . . .  14
     5.4.  Memory Overhead . . . . . . . . . . . . . . . . . . . . .  15
     5.5.  Per-Packet Timestamping . . . . . . . . . . . . . . . . .  16
     5.6.  Limiting Queueing in Lower Layers . . . . . . . . . . . .  16
     5.7.  Other Forms of Fair Queueing  . . . . . . . . . . . . . .  17
     5.8.  Differences between CoDel and FQ-CoDel Behaviour  . . . .  17
   6.  Limitations of Flow Queueing  . . . . . . . . . . . . . . . .  18
     6.1.  Fairness between Things Other Than Flows  . . . . . . . .  18
     6.2.  Flow Bunching by Opaque Encapsulation . . . . . . . . . .  18
     6.3.  Low-Priority Congestion Control Algorithms  . . . . . . .  19
   7.  Deployment Status and Future Work . . . . . . . . . . . . . .  19
   8.  Security Considerations . . . . . . . . . . . . . . . . . . .  20
   9.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .  21
   10. References  . . . . . . . . . . . . . . . . . . . . . . . . .  21
     10.1.  Normative References . . . . . . . . . . . . . . . . . .  21
     10.2.  Informative References . . . . . . . . . . . . . . . . .  21
   Acknowledgements  . . . . . . . . . . . . . . . . . . . . . . . .  24
   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .  25
        
   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   4
     1.1.  Conventions Used in This Document . . . . . . . . . . . .   4
     1.2.  Terminology and Concepts  . . . . . . . . . . . . . . . .   5
     1.3.  Informal Summary of FQ-CoDel  . . . . . . . . . . . . . .   5
   2.  CoDel . . . . . . . . . . . . . . . . . . . . . . . . . . . .   7
   3.  Flow Queueing . . . . . . . . . . . . . . . . . . . . . . . .   7
   4.  The FQ-CoDel Scheduler  . . . . . . . . . . . . . . . . . . .   8
     4.1.  Enqueue . . . . . . . . . . . . . . . . . . . . . . . . .   8
       4.1.1.  Alternative Classification Schemes  . . . . . . . . .   9
     4.2.  Dequeue . . . . . . . . . . . . . . . . . . . . . . . . .  10
   5.  Implementation Considerations . . . . . . . . . . . . . . . .  11
     5.1.  Data Structures . . . . . . . . . . . . . . . . . . . . .  11
     5.2.  Parameters  . . . . . . . . . . . . . . . . . . . . . . .  12
       5.2.1.  Interval  . . . . . . . . . . . . . . . . . . . . . .  12
       5.2.2.  Target  . . . . . . . . . . . . . . . . . . . . . . .  12
       5.2.3.  Packet Limit  . . . . . . . . . . . . . . . . . . . .  13
       5.2.4.  Quantum . . . . . . . . . . . . . . . . . . . . . . .  13
       5.2.5.  Flows . . . . . . . . . . . . . . . . . . . . . . . .  13
       5.2.6.  Explicit Congestion Notification (ECN)  . . . . . . .  14
       5.2.7.  CE Threshold  . . . . . . . . . . . . . . . . . . . .  14
     5.3.  Probability of Hash Collisions  . . . . . . . . . . . . .  14
     5.4.  Memory Overhead . . . . . . . . . . . . . . . . . . . . .  15
     5.5.  Per-Packet Timestamping . . . . . . . . . . . . . . . . .  16
     5.6.  Limiting Queueing in Lower Layers . . . . . . . . . . . .  16
     5.7.  Other Forms of Fair Queueing  . . . . . . . . . . . . . .  17
     5.8.  Differences between CoDel and FQ-CoDel Behaviour  . . . .  17
   6.  Limitations of Flow Queueing  . . . . . . . . . . . . . . . .  18
     6.1.  Fairness between Things Other Than Flows  . . . . . . . .  18
     6.2.  Flow Bunching by Opaque Encapsulation . . . . . . . . . .  18
     6.3.  Low-Priority Congestion Control Algorithms  . . . . . . .  19
   7.  Deployment Status and Future Work . . . . . . . . . . . . . .  19
   8.  Security Considerations . . . . . . . . . . . . . . . . . . .  20
   9.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .  21
   10. References  . . . . . . . . . . . . . . . . . . . . . . . . .  21
     10.1.  Normative References . . . . . . . . . . . . . . . . . .  21
     10.2.  Informative References . . . . . . . . . . . . . . . . .  21
   Acknowledgements  . . . . . . . . . . . . . . . . . . . . . . . .  24
   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .  25
        
1. Introduction
1. 介绍

The Flow Queue CoDel (FQ-CoDel) algorithm is a combined packet scheduler and Active Queue Management (AQM) [RFC3168] algorithm developed as part of the bufferbloat-fighting community effort [BLOATWEB]. It is based on a modified Deficit Round Robin (DRR) queue scheduler [DRR] [DRRPP] with the CoDel AQM [RFC8289] algorithm operating on each queue. This document describes the combined algorithm; reference implementations are available for the ns-2 [NS2] and ns-3 [NS3] network simulators, and the algorithm is included in the mainline Linux kernel as the fq_codel queueing discipline [LINUXSRC].

Flow Queue CoDel(FQ CoDel)算法是一种组合的数据包调度器和主动队列管理(AQM)[RFC3168]算法,它是作为bufferbloat战斗社区努力[BLOATWEB]的一部分开发的。它基于一个改进的赤字循环(DRR)队列调度器[DRR][DRRPP],在每个队列上运行CoDel AQM[RFC8289]算法。本文档描述了组合算法;参考实现可用于ns-2[NS2]和ns-3[NS3]网络模拟器,该算法作为fq_codel排队规程[LINUXSRC]包含在主线Linux内核中。

FQ-CoDel is a general, efficient, nearly parameterless queue management approach combining flow queueing with CoDel. It is a powerful tool for solving bufferbloat [BLOAT] and has already been turned on by default in a number of Linux distributions. In this document, we describe the Linux implementation in sufficient detail for others to independently implement the algorithm for deployment outside the Linux ecosystem.

FQ CoDel是一种将流排队与CoDel相结合的通用、高效、几乎无参数的队列管理方法。它是解决bufferbloat[BLOAT]问题的强大工具,在许多Linux发行版中默认情况下都已启用。在本文档中,我们对Linux实现进行了足够详细的描述,以便其他人在Linux生态系统之外独立实现部署算法。

Since the FQ-CoDel algorithm was originally developed in the Linux kernel, that implementation is still considered canonical. This document describes the algorithm in the abstract in Sections 1-4 and separates out most implementation details in subsequent sections; however, the Linux implementation is used as a reference for default behaviour in the abstract algorithm description.

由于FQ CoDel算法最初是在Linux内核中开发的,因此该实现仍然被认为是规范的。本文档在第1-4节中对算法进行了抽象描述,并在后续章节中分离出大部分实现细节;但是,Linux实现在抽象算法描述中用作默认行为的参考。

This document is structured as follows. This section gives some concepts and terminology used in the rest of the document and gives a short informal summary of the FQ-CoDel algorithm. Section 2 gives an overview of the CoDel algorithm. Section 3 covers the flow hashing and DRR portion. Section 4 then describes the working of the algorithm in detail, while Section 5 describes implementation details and considerations. Section 6 lists some of the limitations of using flow queueing. Section 7 outlines the current status of FQ-CoDel deployment and lists some possible future areas of inquiry. Finally, Section 8 reiterates some important security points that must be observed in the implementation.

本文件的结构如下。本节给出了文档其余部分中使用的一些概念和术语,并对FQ CoDel算法进行了简短的非正式总结。第2节概述了CoDel算法。第3节介绍了流散列和DRR部分。第4节详细描述了算法的工作原理,而第5节描述了实现细节和注意事项。第6节列出了使用流排队的一些限制。第7节概述了FQ CoDel部署的当前状态,并列出了一些可能的未来调查领域。最后,第8节重申了在执行过程中必须遵守的一些重要安全要点。

1.1. Conventions Used in This Document
1.1. 本文件中使用的公约

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

本文件中的关键词“必须”、“不得”、“必需”、“应”、“不应”、“建议”、“不建议”、“可”和“可选”在所有大写字母出现时(如图所示)应按照BCP 14[RFC2119][RFC8174]所述进行解释。

1.2. Terminology and Concepts
1.2. 术语和概念

Flow: A flow is typically identified by a 5-tuple of source IP address, destination IP address, source port number, destination port number, and protocol number. It can also be identified by a superset or subset of those parameters, by Media Access Control (MAC) address, or by other means. FQ-CoDel hashes flows into a configurable number of buckets to assign packets to internal queues.

流:流通常由源IP地址、目标IP地址、源端口号、目标端口号和协议号的5元组标识。它也可以通过这些参数的超集或子集、媒体访问控制(MAC)地址或其他方式来识别。FQ CoDel散列流入可配置数量的bucket,以将数据包分配给内部队列。

Queue: A queue of packets represented internally in FQ-CoDel. In most instances, each flow gets its own queue; however, because of the possibility of hash collisions, this is not always the case. In an attempt to avoid confusion, the word "queue" is used to refer to the internal data structure, and "flow" is used to refer to the actual stream of packets being delivered to the FQ-CoDel algorithm.

队列:在FQ CoDel中内部表示的数据包队列。在大多数情况下,每个流都有自己的队列;但是,由于哈希冲突的可能性,情况并非总是如此。为了避免混淆,单词“queue”用于表示内部数据结构,“flow”用于表示发送到FQ CoDel算法的实际数据包流。

Scheduler: A mechanism to select which queue a packet is dequeued from.

调度器:一种选择数据包从哪个队列退出的机制。

CoDel AQM: The Active Queue Management algorithm employed by FQ-CoDel as described in [RFC8289].

CoDel AQM:FQ CoDel采用的主动队列管理算法,如[RFC8289]所述。

DRR: Deficit Round Robin scheduling [DRR].

DRR:赤字循环调度[DRR]。

Quantum: The maximum amount of bytes to be dequeued from a queue at once.

Quantum:一次从队列中退出队列的最大字节数。

Interval: Characteristic time period used by the control loop of CoDel to detect when a persistent queue is developing (see Section 4.2 of [RFC8289]).

Interval:CoDel控制循环用于检测持续队列何时形成的特征时间段(参见[RFC8289]第4.2节)。

Target: Setpoint value of the minimum sojourn time of packets in a queue used as the target of the control loop in CoDel (see Section 4.3 of [RFC8289]).

目标:CoDel中用作控制回路目标的队列中数据包的最小驻留时间的设定值(见[RFC8289]第4.3节)。

1.3. Informal Summary of FQ-CoDel
1.3. FQ CoDel的非正式摘要

FQ-CoDel is a hybrid of DRR [DRR] and CoDel [RFC8289], with an optimisation for sparse flows similar to Shortest Queue First (SQF) [SQF] and DRR++ [DRRPP]. We call this "flow queueing" rather than "fair queueing", as flows that build a queue are treated differently from flows that do not.

FQ CoDel是DRR[DRR]和CoDel[RFC8289]的混合,对稀疏流的优化类似于最短队列优先(SQF)[SQF]和DRR++[DRRPP]。我们称之为“流排队”,而不是“公平排队”,因为构建队列的流与不构建队列的流的处理方式不同。

By default, FQ-CoDel stochastically classifies incoming packets into different queues by hashing the 5-tuple of protocol number, source and destination IP addresses, and source and destination port numbers, perturbed with a random number selected at initiation time (although other flow classification schemes can optionally be configured instead; see Section 4.1.1). Each queue is managed by the CoDel AQM algorithm [CODEL] [RFC8289]. Packet ordering within a queue is preserved, since queues have FIFO ordering.

默认情况下,FQ CoDel通过散列协议号、源和目标IP地址以及源和目标端口号的5元组,随机地将传入的数据包分类到不同的队列中,并在启动时使用随机数进行扰动(尽管可以选择配置其他流分类方案;请参见第4.1.1节)。每个队列由CoDel AQM算法[CoDel][RFC8289]管理。由于队列具有FIFO顺序,因此保留队列中的数据包顺序。

The FQ-CoDel algorithm consists of two logical parts: (1) the scheduler, which selects which queue to dequeue a packet from, and (2) the CoDel AQM, which works on each of the queues. The subtleties of FQ-CoDel are mostly in the scheduling part, whereas the interaction between the scheduler and the CoDel algorithm are fairly straightforward.

FQ CoDel算法由两个逻辑部分组成:(1)调度器,用于选择数据包从哪个队列出列;(2)CoDel AQM,用于每个队列。FQ CoDel的微妙之处主要在于调度部分,而调度器和CoDel算法之间的交互相当简单。

At initialisation, each queue is set up to have a separate set of CoDel state variables. By default, 1024 queues are created. The Linux implementation at the time of writing supports anywhere from one to 65535 separate queues, and each queue maintains the state variables throughout its lifetime, and so acts the same as the non-FQ variant of CoDel would. This means that with only one queue, FQ-CoDel behaves essentially the same as CoDel by itself.

初始化时,每个队列都设置为具有一组单独的CoDel状态变量。默认情况下,将创建1024个队列。在编写本文时,Linux实现支持从一个到65535个不同的队列,每个队列在其生命周期内维护状态变量,因此其行为与CoDel的非FQ变体相同。这意味着只有一个队列,FQ CoDel的行为本质上与CoDel本身相同。

On dequeue, FQ-CoDel selects a queue from which to dequeue by a two-tier, round-robin scheme, in which each queue is allowed to dequeue up to a configurable quantum of bytes for each iteration. Deviations from this quantum are maintained as byte credits for the queue, which serves to make the fairness scheme byte-based rather than packet-based. The two-tier, round-robin mechanism distinguishes between "new" queues (which don't build up a standing queue) and "old" queues (which have queued enough data to be active for more than one iteration of the round-robin scheduler).

在出列时,FQ CoDel选择一个队列,通过两层循环方案从中出列,在该方案中,每个队列允许为每次迭代出列一个可配置的字节量。与此量的偏差作为队列的字节信用进行维护,这使得公平方案基于字节而不是基于数据包。两层循环机制区分“新”队列(不建立固定队列)和“旧”队列(已将足够的数据排队,以便在循环调度程序的多个迭代中处于活动状态)。

This new/old queue distinction has a particular consequence for queues that don't build up more than a quantum of bytes before being visited by the scheduler: such a queue will be removed from the list after it empties and then re-added as a new queue the next time a packet arrives for it. This means it will effectively get priority over queues that do not empty out each round (a minor caveat is required here to protect against starvation, see below). Exactly how little data a flow has to send to keep its queue in this state is somewhat difficult to reason about, because it depends on both the egress link speed and the number of concurrent flows. However, in practice, many things that are beneficial to have prioritised for typical internet use (ACKs, DNS lookups, interactive Secure Shell (SSH), HTTP requests, Voice over IP (VoIP)) _tend_ to fall in this category, which is why FQ-CoDel performs so well for many practical

这种新/旧队列的区别对于在调度程序访问之前构建的队列不超过一个字节量的队列有一个特殊的后果:这样的队列在清空后将从列表中删除,然后在下次数据包到达时重新添加为新队列。这意味着它将有效地优先于不清空每一轮的队列(这里需要一个小警告以防止饥饿,见下文)。确切地说,流必须发送多少数据才能保持其队列处于这种状态有些难以解释,因为这取决于出口链接速度和并发流的数量。然而,在实践中,许多有利于优先考虑典型互联网使用的事情(ACK、DNS查找、交互式安全外壳(SSH)、HTTP请求、IP语音(VoIP))往往属于这一类,这就是FQ CoDel在许多实际应用中表现如此出色的原因

applications. However, the implicitness of the prioritisation means that for applications that require guaranteed priority (for instance, multiplexing the network control plane over the network itself), explicit classification is still needed.

应用。然而,优先级的隐含性意味着,对于需要保证优先级的应用程序(例如,通过网络本身多路传输网络控制平面),仍然需要明确的分类。

This scheduling scheme has some subtlety to it, which is explained in detail in the remainder of this document.

此调度方案有一些微妙之处,本文档的其余部分将对此进行详细解释。

2. CoDel
2. 科德尔

CoDel is described in the Communications of the ACM paper [CODEL] and the IETF document [RFC8289]. The basic idea is to control queue length, maintaining sufficient queueing to keep the outgoing link busy but avoiding building up the queue beyond that point. This is done by preferentially dropping packets that remain in the queue for "too long". Packets are dropped by head drop, which lowers the time for the drop signal to propagate back to the sender by the length of the queue and helps trigger TCP fast retransmit sooner.

ACM文件[CoDel]和IETF文件[RFC8289]的通信中描述了CoDel。其基本思想是控制队列长度,保持足够的队列以保持传出链路繁忙,但避免在该点之外建立队列。这是通过优先丢弃留在队列中“太长”的数据包来实现的。数据包通过头丢弃进行丢弃,这将使丢弃信号传回发送方的时间缩短队列长度,并有助于更快地触发TCP快速重传。

The CoDel algorithm itself will not be described here; instead, we refer the reader to the CoDel document [RFC8289].

这里不描述CoDel算法本身;相反,我们让读者参考CoDel文档[RFC8289]。

3. Flow Queueing
3. 流排队

The intention of FQ-CoDel's scheduler is to give each flow its own queue, hence the term "flow queueing". Rather than a perfect realisation of this, a hashing-based scheme is used, where flows are hashed into a number of buckets, each of which has its own queue. The number of buckets is configurable and presently defaults to 1024 in the Linux implementation. This is enough to avoid hash collisions on a moderate number of flows as seen, for instance, in a home gateway. Depending on the characteristics of the link, this can be tuned to trade off memory for a lower probability of hash collisions. See Sections 5.3 and 5.4 for a more in-depth discussion of this.

FQ CoDel调度程序的目的是为每个流提供自己的队列,因此称为“流队列”。使用了基于散列的方案,将流散列到多个bucket中,每个bucket都有自己的队列,而不是完美地实现这一点。bucket的数量是可配置的,在Linux实现中目前默认为1024个。这足以避免在中等数量的流上发生哈希冲突,例如在家庭网关中。根据链路的特性,可以对其进行调整,以牺牲内存,降低散列冲突的概率。有关这方面的更深入讨论,请参见第5.3节和第5.4节。

By default, the flow hashing is performed on the 5-tuple of source and destination IP addresses, source and destination port numbers, and protocol number. While the hashing can be customised to match on arbitrary packet bytes, care should be taken when doing so; much of the benefit of the FQ-CoDel scheduler comes from this per-flow distinction. However, the default hashing does have some limitations, as discussed in Section 6.

默认情况下,对源和目标IP地址、源和目标端口号以及协议号的5元组执行流哈希。虽然可以定制散列以匹配任意数据包字节,但在进行此操作时应小心;FQ CoDel调度器的许多好处都来自于每流的区别。但是,默认哈希确实有一些限制,如第6节所述。

FQ-CoDel's DRR scheduler is byte-based, employing a deficit round-robin mechanism between queues. This works by keeping track of the current number of "byte credits" of each queue. This number is initialised to the configurable quantum; each time a queue gets a dequeue opportunity, it gets to dequeue packets, thus decreasing the

FQ CoDel的DRR调度程序是基于字节的,在队列之间采用了缺陷循环机制。这通过跟踪每个队列的当前“字节信用”数来实现。该数字初始化为可配置的量程;每次队列获得一个出列机会时,它都会将数据包出列,从而减少

number of credits by the packet size for each packet. This continues until the value of the byte credits counter becomes zero or less, at which point the counter is increased by one quantum, and the dequeue opportunity ends.

根据每个数据包的数据包大小确定的信用数。这将一直持续到字节积分计数器的值变为零或更小,此时计数器增加一个量子,并且出列机会结束。

This means that if one queue contains packets of, for instance, size quantum/3, and another contains quantum-sized packets, the first queue will dequeue three packets each time it gets a turn, whereas the second only dequeues one. This means that flows that send small packets are not penalised by the difference in packet sizes; rather, the DRR scheme approximates a byte-based fairness queueing scheme. The size of the quantum determines the scheduling granularity, with the trade-off from too small a quantum being scheduling overhead. For small bandwidths, lowering the quantum from the default MTU size can be advantageous.

这意味着,如果一个队列包含大小为quantum/3的数据包,而另一个队列包含大小为quantum/3的数据包,则第一个队列每次轮到一个数据包时,都会让三个数据包出列,而第二个队列只让一个数据包出列。这意味着发送小数据包的流不会因数据包大小的差异而受到惩罚;相反,DRR方案近似于基于字节的公平性排队方案。量子的大小决定了调度粒度,量子太小的代价是调度开销。对于小带宽,将量子从默认MTU大小降低是有利的。

Unlike plain DRR, there are two sets of flows: a "new" list for flows that have not built a queue recently and an "old" list for queues that build a backlog. This distinction is an integral part of the FQ-CoDel scheduler and is described in more detail in Section 4.

与普通DRR不同,有两组流:一组是最近没有构建队列的流的“新”列表,另一组是构建积压的队列的“旧”列表。这种区别是FQ CoDel调度程序的一个组成部分,在第4节中有更详细的描述。

4. The FQ-CoDel Scheduler
4. FQ代码调度器

To make its scheduling decisions, FQ-CoDel maintains two ordered lists of active queues: new and old queues. When a packet is added to a queue that is not currently active, that queue becomes active by being added to the list of new queues. Later on, it is moved to the list of old queues, from which it is removed when it is no longer active. This behaviour is the source of some subtlety in the packet scheduling at dequeue time, as explained below.

为了做出调度决策,FQ CoDel维护两个活动队列的有序列表:新队列和旧队列。将数据包添加到当前不活动的队列时,该队列将通过添加到新队列列表而变为活动队列。稍后,它将移动到旧队列列表中,当它不再处于活动状态时,将从中删除它。如下文所述,这种行为是数据包在出列时调度的一些微妙之处的根源。

4.1. Enqueue
4.1. 排队

The packet enqueue mechanism consists of three stages: classifying into a queue, timestamping and bookkeeping, and optionally dropping a packet when the total number of enqueued packets goes over the maximum.

分组排队机制包括三个阶段:分类为队列、时间戳和簿记,以及在分组总数超过最大值时选择性地丢弃分组。

When a packet is enqueued, it is first classified into the appropriate queue. By default, this is done by hashing (using a Jenkins hash function [JENKINS]) on the 5-tuple of IP protocol, source and destination IP addresses, and source and destination port numbers (if they exist) and then taking the hash value modulo the number of queues. The hash is salted by modulo addition of a random value selected at initialisation time to prevent possible DoS attacks if the hash is predictable ahead of time (see Section 8). The Linux

当数据包排队时,它首先被分类到适当的队列中。默认情况下,这是通过对IP协议、源和目标IP地址以及源和目标端口号(如果存在)的5元组进行散列(使用Jenkins散列函数[Jenkins]),然后将散列值乘以队列数来完成的。如果哈希提前可预测,则通过在初始化时选择的随机值的模加来对哈希进行加密,以防止可能的DoS攻击(参见第8节)。Linux

kernel implements the Jenkins hash function by mixing three 32-bit values into a single 32-bit output value. Inputs larger than 96 bits are reduced by additional mixing steps, 96 bits at a time.

内核通过将三个32位值混合到一个32位输出值中来实现Jenkins哈希函数。大于96位的输入通过额外的混合步骤减少,每次96位。

Once the packet has been successfully classified into a queue, it is handed over to the CoDel algorithm for timestamping. It is then added to the tail of the selected queue, and the queue's byte count is updated by the packet size. Then, if the queue is not currently active (i.e., if it is not in either the list of new queues or the list of old queues), it is added to the end of the list of new queues, and its number of credits is initiated to the configured quantum. Otherwise, the queue is left in its current queue list.

一旦数据包被成功地分类到队列中,它就被交给CoDel算法进行时间戳。然后将其添加到所选队列的尾部,队列的字节计数由数据包大小更新。然后,如果队列当前未处于活动状态(即,如果它既不在新队列列表中也不在旧队列列表中),则将其添加到新队列列表的末尾,并将其信用数初始化到配置的量程。否则,队列将保留在其当前队列列表中。

Finally, to protect against overload, the total number of enqueued packets is compared with the configured limit. If the limit is exceeded (which can happen since a packet was just enqueued), the queue with the largest current byte count is selected and half the number of packets from this queue (up to a maximum of 64 packets) are dropped from the head of that queue. Dropping several packets at once helps amortise the cost of finding the longest queue, significantly lowering CPU usage in an overload situation.

最后,为了防止过载,将排队的数据包总数与配置的限制进行比较。如果超过限制(可能是因为数据包刚刚进入队列),将选择当前字节数最大的队列,并从该队列的头部丢弃该队列中数据包数的一半(最多64个数据包)。一次丢弃几个数据包有助于分摊查找最长队列的成本,从而显著降低过载情况下的CPU使用率。

4.1.1. Alternative Classification Schemes
4.1.1. 替代分类方案

As mentioned previously, it is possible to modify the classification scheme to provide a different notion of a flow. The Linux implementation provides this option in the form of the "tc filter" command. While this can add capabilities (for instance, matching on other possible parameters such as MAC address, Diffserv code point values, firewall rules, flow-specific markings, IPv6 flow label, etc.), care should be taken to preserve the notion of flow because much of the benefit of the FQ-CoDel scheduler comes from keeping flows in separate queues.

如前所述,可以修改分类方案以提供不同的流概念。Linux实现以“tc filter”命令的形式提供此选项。虽然这可以增加功能(例如,匹配其他可能的参数,如MAC地址、Diffserv代码点值、防火墙规则、流特定标记、IPv6流标签等),但应注意保留流的概念,因为FQ CoDel调度器的许多好处来自将流保持在单独的队列中。

For protocols that do not contain a port number (such as ICMP), the Linux implementation simply sets the port numbers to zero and performs the hashing as usual. In practice, this results in such protocols each getting their own queue (except in the case of hash collisions). An implementation can perform other classifications for protocols that have their own notion of a flow but SHOULD fall back to simply hashing on source and destination IP address and protocol number in the absence of other information.

对于不包含端口号的协议(如ICMP),Linux实现只需将端口号设置为零,并像往常一样执行哈希。实际上,这会导致这样的协议各自获得自己的队列(哈希冲突的情况除外)。一个实现可以对具有自己的流概念的协议执行其他分类,但应该退回到在没有其他信息的情况下对源和目标IP地址和协议号进行简单的散列。

The default classification scheme can additionally be improved by performing decapsulation of tunnelled packets prior to hashing on the 5-tuple in the encapsulated payload. The Linux implementation does this for common encapsulations known to the kernel, such as 6in4 [RFC4213], IP-in-IP [RFC2003], and Generic Routing Encapsulation

默认分类方案还可以通过在对封装负载中的5元组进行散列之前对隧道包执行去封装来改进。Linux实现对内核已知的常见封装(如6in4[RFC4213]、IP-in-IP[rfc203]和通用路由封装)执行此操作

(GRE) [RFC2890]. This helps to distinguish between flows that share the same (outer) 5-tuple but, of course, is limited to unencrypted tunnels (see Section 6.2 for a discussion of encrypted tunnels).

(GRE)[RFC2890]。这有助于区分共享相同(外部)5元组的流,当然,这些流仅限于未加密隧道(有关加密隧道的讨论,请参见第6.2节)。

4.2. Dequeue
4.2. 出列

Most of FQ-CoDel's work is done at packet dequeue time. It consists of three parts: selecting a queue from which to dequeue a packet, actually dequeueing it (employing the CoDel algorithm in the process), and some final bookkeeping.

FQ CoDel的大部分工作都是在数据包出列时完成的。它由三个部分组成:选择一个队列,从中取出一个数据包,实际将其取出(在过程中使用CoDel算法),以及一些最终的簿记。

For the first part, the scheduler first looks at the list of new queues; for the queue at the head of that list, if that queue has a negative number of credits (i.e., it has already dequeued at least a quantum of bytes), it is given an additional quantum of credits, the queue is put onto _the end of_ the list of old queues, and the routine selects the next queue and starts again.

对于第一部分,调度器首先查看新队列的列表;对于位于该列表顶部的队列,如果该队列的信用数为负数(即,它已经至少退出了一个字节的队列),则会给它一个额外的信用量,该队列将放在旧队列列表的末尾,例程选择下一个队列并再次启动。

Otherwise, that queue is selected for dequeue. If the list of new queues is empty, the scheduler proceeds down the list of old queues in the same fashion (checking the credits and either selecting the queue for dequeueing or adding credits and putting the queue back at the end of the list).

否则,将选择该队列进行出列。如果新队列列表为空,调度程序将以相同的方式向下移动旧队列列表(检查积分并选择要退出队列的队列或添加积分并将队列放回列表末尾)。

After having selected a queue from which to dequeue a packet, the CoDel algorithm is invoked on that queue. This applies the CoDel control law, which is the mechanism CoDel uses to determine when to drop packets (see [RFC8289]). As a result of this, one or more packets may be discarded from the head of the selected queue before the packet that should be dequeued is returned (or nothing is returned if the queue is or becomes empty while being handled by the CoDel algorithm).

在选择了一个队列以从中退出数据包后,将在该队列上调用CoDel算法。这应用了CoDel控制律,这是CoDel用来确定何时丢弃数据包的机制(请参见[RFC8289])。因此,在返回应该出列的数据包之前,可能会从所选队列的头部丢弃一个或多个数据包(或者,如果队列在由CoDel算法处理时为空或变为空,则不会返回任何内容)。

Finally, if the CoDel algorithm does not return a packet, then the queue must be empty, and the scheduler does one of two things. If the queue selected for dequeue came from the list of new queues, it is moved to _the end of_ the list of old queues. If instead it came from the list of old queues, that queue is removed from the list, to be added back (as a new queue) the next time a packet arrives that hashes to that queue. Then (since no packet was available for dequeue), the whole dequeue process is restarted from the beginning.

最后,如果CoDel算法不返回数据包,那么队列必须为空,调度器执行以下两种操作之一。如果为退出队列选择的队列来自新队列列表,它将移动到旧队列列表的末尾。如果它来自旧队列列表,则该队列将从列表中删除,并在下一次到达散列到该队列的数据包时添加回(作为新队列)。然后(因为没有数据包可用于出列),从一开始就重新启动整个出列过程。

If, instead, the scheduler _did_ get a packet back from the CoDel algorithm, it subtracts the size of the packet from the byte credits for the selected queue and returns the packet as the result of the dequeue operation.

相反,如果调度器确实从CoDel算法中获取了数据包,它将从所选队列的字节信用中减去数据包的大小,并将数据包作为出列操作的结果返回。

The step that moves an empty queue from the list of new queues to the end of the list of old queues before it is removed is crucial to prevent starvation. Otherwise, the queue could reappear (the next time a packet arrives for it) before the list of old queues is visited; this can go on indefinitely, even with a small number of active flows, if the flow providing packets to the queue in question transmits at just the right rate. This is prevented by first moving the queue to the end of the list of old queues, forcing the scheduler to service all old queues before the empty queue is removed and thus preventing starvation.

在删除空队列之前,将其从新队列列表移动到旧队列列表末尾的步骤对于防止饥饿至关重要。否则,在访问旧队列列表之前,队列可能会重新出现(下次数据包到达时);如果向所讨论的队列提供数据包的流以正确的速率传输,则即使有少量活动流,这种情况也可能无限期地继续下去。首先将队列移动到旧队列列表的末尾,强制调度程序在删除空队列之前为所有旧队列提供服务,从而防止饥饿。

The resulting migration of queues between the different states is summarised in the state diagram shown in Figure 1. Note that both the new and old queue states can additionally have arrival and dequeue events that do not change the state; these are omitted in the figure.

图1所示的状态图总结了不同状态之间队列的迁移结果。请注意,新队列状态和旧队列状态还可以有不改变状态的到达和退出队列事件;这些在图中被省略。

   +-----------------+                +------------------+
   |                 |     Empty      |                  |
   |     Empty       |<---------------+       Old        +----+
   |                 |                |                  |    |
   +-------+---------+                +------------------+    |
           |                             ^            ^       |Credits
           |Arrival                      |            |       |Exhausted
           v                             |            |       |
   +-----------------+                   |            |       |
   |                 |      Empty or     |            |       |
   |      New        +-------------------+            +-------+
   |                 | Credits Exhausted
   +-----------------+
        
   +-----------------+                +------------------+
   |                 |     Empty      |                  |
   |     Empty       |<---------------+       Old        +----+
   |                 |                |                  |    |
   +-------+---------+                +------------------+    |
           |                             ^            ^       |Credits
           |Arrival                      |            |       |Exhausted
           v                             |            |       |
   +-----------------+                   |            |       |
   |                 |      Empty or     |            |       |
   |      New        +-------------------+            +-------+
   |                 | Credits Exhausted
   +-----------------+
        

Figure 1: Partial State Diagram for Queues between Different States

图1:不同状态之间队列的部分状态图

5. Implementation Considerations
5. 实施考虑

This section contains implementation details for the FQ-CoDel algorithm. This includes the data structures and parameters used in the Linux implementation, as well as discussion of some required features of the target platform and other considerations.

本节包含FQ CoDel算法的实现细节。这包括Linux实现中使用的数据结构和参数,以及对目标平台的一些必需特性和其他注意事项的讨论。

5.1. Data Structures
5.1. 数据结构

The main data structure of FQ-CoDel is the array of queues, which is instantiated with the number of queues specified by the "flows" parameter at instantiation time. Each queue consists simply of an ordered list of packets with FIFO semantics, two state variables tracking the queue credits and total number of bytes enqueued, and the set of CoDel state variables. Other state variables to track

FQ CoDel的主要数据结构是队列数组,它在实例化时使用“flows”参数指定的队列数进行实例化。每个队列仅由具有FIFO语义的有序数据包列表、跟踪队列信用和排队总字节数的两个状态变量以及CoDel状态变量集组成。要跟踪的其他状态变量

queue statistics can also be included; for instance, the Linux implementation keeps a count of dropped packets.

还可以包括队列统计信息;例如,Linux实现会记录丢弃的数据包的数量。

In addition to the queue structures themselves, FQ-CoDel maintains two ordered lists containing references to the subset of queues that are currently active. These are the lists of new and old queues, as explained in Section 4 above.

除了队列结构本身,FQ CoDel还维护两个有序列表,其中包含对当前活动队列子集的引用。如上文第4节所述,这些是新队列和旧队列的列表。

In the Linux implementation, queue space is shared: there's a global limit on the number of packets the queues can hold, but not a limit for each queue.

在Linux实现中,队列空间是共享的:队列可以容纳的数据包数量有一个全局限制,但每个队列没有限制。

5.2. Parameters
5.2. 参数

The following are the user configuration parameters exposed by the Linux implementation of FQ-CoDel.

以下是FQ CoDel的Linux实现公开的用户配置参数。

5.2.1. Interval
5.2.1. 间隔

The "interval" parameter has the same semantics as CoDel and is used to ensure that the minimum sojourn time of packets in a queue used as an estimator by the CoDel control algorithm is a relatively up-to-date value. That is, CoDel only reacts to delay experienced in the last epoch of length interval. It SHOULD be set to be on the order of the worst-case RTT through the bottleneck to give end points sufficient time to react.

“interval”参数与CoDel具有相同的语义,用于确保CoDel控制算法用作估计器的队列中数据包的最小驻留时间是相对最新的值。也就是说,CoDel只对长度间隔的最后一个历元中经历的延迟作出反应。应将其设置为最坏情况下通过瓶颈的RTT顺序,以便给端点足够的时间作出反应。

The default interval value is 100 ms.

默认间隔值为100毫秒。

5.2.2. Target
5.2.2. 目标

The "target" parameter has the same semantics as CoDel. It is the acceptable minimum standing/persistent queue delay for each FQ-CoDel queue. This minimum delay is identified by tracking the local minimum queue delay that packets experience.

“target”参数与CoDel具有相同的语义。它是每个FQ CoDel队列可接受的最小持续/持续队列延迟。该最小延迟通过跟踪数据包经历的本地最小队列延迟来确定。

The default target value is 5 ms, but this value should be tuned to be at least the transmission time of a single MTU-sized packet at the prevalent egress link speed (which, for example, is ~15 ms for 1 Mbps and MTU 1500). This prevents CoDel from being too aggressive at low bandwidths. It should otherwise be set to 5-10% of the configured interval.

默认目标值为5ms,但应将该值调整为至少单个MTU大小的数据包在当前出口链路速度下的传输时间(例如,对于1Mbps和MTU 1500,该速度为~15ms)。这可以防止CoDel在低带宽时过于激进。否则,应将其设置为配置间隔的5-10%。

5.2.3. Packet Limit
5.2.3. 数据包限制

Routers do not have infinite memory, so some packet limit MUST be enforced.

路由器没有无限的内存,因此必须执行一些数据包限制。

The "limit" parameter is the hard limit on the real queue size, measured in number of packets. This limit is a global limit on the number of packets in all queues; each individual queue does not have an upper limit. When the limit is reached and a new packet arrives for enqueue, packets are dropped from the head of the largest queue (measured in bytes) to make room for the new packet.

“limit”参数是实际队列大小的硬限制,以数据包的数量度量。此限制是对所有队列中数据包数量的全局限制;每个队列都没有上限。当达到限制且新数据包到达排队时,数据包将从最大队列的队列头(以字节为单位)丢弃,以便为新数据包腾出空间。

In Linux, the default packet limit is 10240 packets, which is suitable for up to 10-Gigabit Ethernet speeds. In practice, the hard limit is rarely (if ever) hit, as drops are performed by the CoDel algorithm long before the limit is hit. For platforms that are severely memory constrained, a lower limit can be used.

在Linux中,默认的数据包限制是10240个数据包,适用于高达10千兆以太网的速度。在实践中,硬限制很少(如果有的话)被命中,因为在达到限制之前很久,CoDel算法就会执行丢弃操作。对于内存严重受限的平台,可以使用较低的限制。

5.2.4. Quantum
5.2.4. 量子

The "quantum" parameter is the number of bytes each queue gets to dequeue on each round of the scheduling algorithm. The default is set to 1514 bytes, which corresponds to the Ethernet MTU plus the hardware header length of 14 bytes.

“quantum”参数是在调度算法的每一轮中,每个队列要出列的字节数。默认设置为1514字节,对应于以太网MTU加上14字节的硬件头长度。

In systems employing TCP Segmentation Offload (TSO), where a "packet" consists of an offloaded packet train, it can presently be as large as 64 kilobytes. In systems using Generic Receive Offload (GRO), they can be up to 17 times the TCP max segment size (or 25 kilobytes). These mega-packets severely impact FQ-CoDel's ability to schedule traffic, and they hurt latency needlessly. There is ongoing work in Linux to make smarter use of offload engines.

在采用TCP分段卸载(TSO)的系统中,一个“数据包”由一个卸载的数据包序列组成,目前它可以大到64 KB。在使用通用接收卸载(GRO)的系统中,它们可以是TCP最大段大小(或25 KB)的17倍。这些巨型数据包严重影响了FQ CoDel调度流量的能力,并且不必要地损害了延迟。Linux中正在进行更智能地使用卸载引擎的工作。

5.2.5. Flows
5.2.5. 流动

The "flows" parameter sets the number of queues into which the incoming packets are classified. Due to the stochastic nature of hashing, multiple flows may end up being hashed into the same slot.

“flows”参数设置传入数据包被分类到的队列数。由于散列的随机性,多个流可能最终被散列到同一个槽中。

This parameter can be set only at initialisation time in the current implementation, since memory has to be allocated for the hash table.

在当前实现中,此参数只能在初始化时设置,因为必须为哈希表分配内存。

The default value is 1024 in the current Linux implementation.

在当前Linux实现中,默认值为1024。

5.2.6. Explicit Congestion Notification (ECN)
5.2.6. 显式拥塞通知(ECN)

ECN [RFC3168] is enabled by default. Rather than do anything special with misbehaved ECN flows, FQ-CoDel relies on the packet scheduling system to minimise their impact; thus, the number of unresponsive packets in a flow being marked with ECN can grow to the overall packet limit but will not otherwise affect the performance of the system.

默认情况下启用ECN[RFC3168]。FQ CoDel没有对行为不端的ECN流做任何特殊的处理,而是依赖于数据包调度系统来最小化其影响;因此,被标记为ECN的流中的无响应分组的数量可以增长到总分组限制,但不会影响系统的性能。

ECN can be disabled by specifying the "noecn" parameter.

可以通过指定“noecn”参数来禁用ECN。

5.2.7. CE Threshold
5.2.7. CE阈值

This parameter enables DCTCP-like processing resulting in Congestion Encountered (CE) marking on ECN-Capable Transport (ECT) packets [RFC3168] starting at a lower sojourn delay setpoint than the default CoDel target. Details of Data Center TCP (DCTCP) can be found in [RFC8257].

此参数启用类似于DCTCP的处理,从而在具有ECN能力的传输(ECT)数据包[RFC3168]上产生拥塞(CE)标记,从比默认CoDel目标更低的逗留延迟设定点开始。有关数据中心TCP(DCTCP)的详细信息,请参见[RFC8257]。

The "ce_threshold" parameter is disabled by default; it can be enabled by setting it to a number of microseconds.

默认禁用“ce_阈值”参数;可以通过将其设置为微秒数来启用。

5.3. Probability of Hash Collisions
5.3. 哈希冲突的概率

Since the Linux FQ-CoDel implementation by default uses 1024 hash buckets, the probability that (say) 100 flows will all hash to the same bucket is something like ten to the power of minus 300. Thus, at least one of the flows will almost certainly hash to some other queue.

由于Linux FQ CoDel实现默认使用1024个散列桶,因此(比如)100个流将全部散列到同一个桶的概率大约是10到负300的幂。因此,至少有一个流几乎肯定会散列到其他队列。

Expanding on this, based on analytical equations for hash collision probabilities, for 100 flows, the probability of no collision is 90.78%; the probability that no more than two of the 100 flows will be involved in any given collision is 99.57%; and the probability that no more than three of the 100 flows will be involved in any given collision is 99.99%. These probabilities assume a hypothetical perfect hashing function, so in practice, they may be a bit lower. We have not found this difference to matter in practice.

在此基础上,基于hash碰撞概率分析方程,对100个流,无碰撞概率为90.78%;100个流中不超过两个流参与任何给定碰撞的概率为99.57%;在任何给定的碰撞中,100个流中不超过3个流的概率为99.99%。这些概率假设一个假设的完美散列函数,所以在实践中,它们可能会稍微低一点。我们没有发现这种差异在实践中有什么关系。

These probabilities can be improved upon by using set-associative hashing, a technique used in the Cake algorithm currently being developed as a further refinement of the FQ-CoDel principles [CAKE]. For a 4-way associative hash with the same number of total queues, the probability of no collisions for 100 flows is 99.93%, while for an 8-way associative hash, it is ~100%.

这些概率可以通过使用集合关联散列来提高,这是目前正在开发的Cake算法中使用的一种技术,是对FQ CoDel原则的进一步完善[Cake]。对于总队列数相同的4路关联哈希,100个流的无冲突概率为99.93%,而对于8路关联哈希,无冲突概率约为100%。

5.4. Memory Overhead
5.4. 内存开销

FQ-CoDel can be implemented with a low memory footprint (less than 64 bytes per queue on 64-bit systems). These are the data structures used in the Linux implementation:

FQ CoDel可以用较低的内存占用实现(在64位系统上,每个队列少于64字节)。以下是Linux实现中使用的数据结构:

<CODE BEGINS>

<代码开始>

   struct codel_vars {
      u32             count;             /* number of dropped packets */
      u32             lastcount;     /* count entry to dropping state */
      bool            dropping;                /* currently dropping? */
      u16             rec_inv_sqrt;    /* reciprocal sqrt computation */
      codel_time_t    first_above_time;    /* when delay above target */
      codel_time_t    drop_next;                 /* next time to drop */
      codel_time_t    ldelay; /* sojourn time of last dequeued packet */
   };
        
   struct codel_vars {
      u32             count;             /* number of dropped packets */
      u32             lastcount;     /* count entry to dropping state */
      bool            dropping;                /* currently dropping? */
      u16             rec_inv_sqrt;    /* reciprocal sqrt computation */
      codel_time_t    first_above_time;    /* when delay above target */
      codel_time_t    drop_next;                 /* next time to drop */
      codel_time_t    ldelay; /* sojourn time of last dequeued packet */
   };
        
   struct fq_codel_flow {
      struct sk_buff    *head;
      struct sk_buff    *tail;
      struct list_head  flowchain;
      int               credits;   /* current number of queue credits */
      u32               dropped; /* # of drops (or ECN marks) on flow */
      struct codel_vars cvars;
   };
        
   struct fq_codel_flow {
      struct sk_buff    *head;
      struct sk_buff    *tail;
      struct list_head  flowchain;
      int               credits;   /* current number of queue credits */
      u32               dropped; /* # of drops (or ECN marks) on flow */
      struct codel_vars cvars;
   };
        

<CODE ENDS>

<代码结束>

The master table managing all queues looks like this:

管理所有队列的主表如下所示:

<CODE BEGINS>

<代码开始>

   struct fq_codel_sched_data {
      struct tcf_proto *filter_list;  /* optional external classifier */
      struct fq_codel_flow *flows;    /* Flows table [flows_cnt] */
      u32             *backlogs;      /* backlog table [flows_cnt] */
      u32             flows_cnt;      /* number of flows */
      u32             perturbation;   /* hash perturbation */
      u32             quantum;        /* psched_mtu(qdisc_dev(sch)); */
      struct codel_params cparams;
      struct codel_stats cstats;
      u32             drop_overlimit;
      u32             new_flow_count;
        
   struct fq_codel_sched_data {
      struct tcf_proto *filter_list;  /* optional external classifier */
      struct fq_codel_flow *flows;    /* Flows table [flows_cnt] */
      u32             *backlogs;      /* backlog table [flows_cnt] */
      u32             flows_cnt;      /* number of flows */
      u32             perturbation;   /* hash perturbation */
      u32             quantum;        /* psched_mtu(qdisc_dev(sch)); */
      struct codel_params cparams;
      struct codel_stats cstats;
      u32             drop_overlimit;
      u32             new_flow_count;
        
      struct list_head new_flows;     /* list of new flows */
      struct list_head old_flows;     /* list of old flows */
   };
        
      struct list_head new_flows;     /* list of new flows */
      struct list_head old_flows;     /* list of old flows */
   };
        

<CODE ENDS>

<代码结束>

5.5. Per-Packet Timestamping
5.5. 每包时间戳

The CoDel portion of the algorithm requires per-packet timestamps be stored along with the packet. While this approach works well for software-based routers, it may be impossible to retrofit devices that do most of their processing in silicon and lack the space or mechanism for timestamping.

算法的CoDel部分要求每个数据包的时间戳与数据包一起存储。虽然这种方法适用于基于软件的路由器,但可能不可能改造用硅进行大部分处理且缺乏时间戳空间或机制的设备。

Also, while perfect resolution is not needed, timestamp resolution finer than the CoDel target setting is necessary. Furthermore, timestamping functions in the core OS need to be efficient, as they are called at least once on each packet enqueue and dequeue.

此外,虽然不需要完美的分辨率,但需要比CoDel目标设置更精细的时间戳分辨率。此外,核心操作系统中的时间戳函数需要高效,因为它们在每个数据包排队和退队时至少被调用一次。

5.6. Limiting Queueing in Lower Layers
5.6. 限制下层排队

When deploying a queue management algorithm such as FQ-CoDel, it is important to ensure that the algorithm actually runs in the right place to control the queue. In particular, lower layers of the operating system networking stack can have queues of their own, as can device drivers and hardware. Thus, it is desirable that the queue management algorithm runs as close to the hardware as possible. However, scheduling such complexity at interrupt time is difficult, so a small standing queue between the algorithm and the wire is often needed at higher transmit rates.

在部署诸如FQ CoDel之类的队列管理算法时,确保该算法实际运行在控制队列的正确位置非常重要。特别是,操作系统网络堆栈的较低层可以有自己的队列,设备驱动程序和硬件也可以。因此,希望队列管理算法尽可能靠近硬件运行。然而,在中断时间调度这样的复杂性是困难的,因此在较高的传输速率下,通常需要在算法和线路之间有一个小的固定队列。

In Linux, the mechanism to ensure these different needs are balanced is called "Byte Queue Limits" [BQL]; it controls the device driver ring buffer (for physical line rates). For cases where this functionality is not available, the queue can be controlled by means of a software rate limiter such as Hierarchical Token Bucket [HTB] or Hierarchical Fair-Service Curve [HFSC]. The Cake algorithm [CAKE] integrates a software rate limiter for this purpose.

在Linux中,确保这些不同需求平衡的机制称为“字节队列限制”[BQL];它控制设备驱动程序环形缓冲区(用于物理线路速率)。对于此功能不可用的情况,可通过软件速率限制器(如分层令牌桶[HTB]或分层公平服务曲线[HFSC])来控制队列。为此,Cake算法[Cake]集成了一个软件速率限制器。

Other issues with queues at lower layers are described in [CODEL].

[CODEL]中描述了下层队列的其他问题。

5.7. Other Forms of Fair Queueing
5.7. 其他形式的公平排队

Much of the scheduling portion of FQ-CoDel is derived from DRR and is substantially similar to DRR++. Versions based on Stochastic Fair Queueing [SFQ] have also been produced and tested in ns2. Other forms of fair queueing, such as Weighted Fair Queueing [WFQ] or Quick Fair Queueing [QFQ], have not been thoroughly explored, but there's no a priori reason why the round-robin scheduling of FQ-CoDel couldn't be replaced with something else.

FQ CoDel的大部分调度部分源自DRR,与DRR++基本相似。基于随机公平排队[SFQ]的版本也已经在ns2中产生和测试。其他形式的公平排队,如加权公平排队[WFQ]或快速公平排队[QFQ],尚未得到彻底的探索,但FQ CoDel的循环调度不能被其他形式取代的原因并不存在。

For a comprehensive discussion of fairness queueing algorithms and their combination with AQM, see [RFC7806].

有关公平性排队算法及其与AQM组合的全面讨论,请参阅[RFC7806]。

5.8. Differences between CoDel and FQ-CoDel Behaviour
5.8. CoDel和FQ CoDel行为之间的差异

CoDel can be applied to a single queue system as a straight AQM, where it converges towards an "ideal" drop rate (i.e., one that minimises delay while keeping a high link utilisation) and then optimises around that control point.

CoDel可作为直接AQM应用于单队列系统,在该系统中,CoDel收敛于“理想”丢弃率(即,在保持高链路利用率的同时最小化延迟的丢弃率),然后围绕该控制点进行优化。

The scheduling of FQ-CoDel mixes packets of competing flows, which acts to pace bursty flows to better fill the pipe. Additionally, a new flow gets substantial leeway over other flows until CoDel finds an ideal drop rate for it. However, for a new flow that exceeds the configured quantum, more time passes before all of its data is delivered (as packets from it, too, are mixed across the other existing queue-building flows). Thus, FQ-CoDel takes longer (as measured in time) to converge towards an ideal drop rate for a given new flow but does so within fewer delivered _packets_ from that flow.

FQ CoDel的调度将竞争流的数据包混合在一起,从而调整突发流的速度,以更好地填充管道。此外,在CoDel找到理想的丢包率之前,新流量比其他流量有很大的回旋余地。但是,对于超过配置量的新流,在传递其所有数据之前会经过更多的时间(因为来自它的数据包也会在其他现有队列构建流中混合)。因此,FQ CoDel需要更长的时间(以时间衡量)才能收敛到给定新流的理想丢弃率,但在从该流交付的数据包更少的情况下收敛。

Finally, the flow isolation provided by FQ-CoDel means that the CoDel drop mechanism operates on the flows actually building queues; this results in packets being dropped more accurately from the largest flows than when only CoDel is used. Additionally, flow isolation radically improves the transient behaviour of the network when traffic or link characteristics change (e.g., when new flows start up or the link bandwidth changes); while CoDel itself can take a while to respond, FQ-CoDel reacts almost immediately.

最后,FQ CoDel提供的流隔离意味着CoDel丢弃机制在实际构建队列的流上运行;这导致从最大流中丢弃数据包比仅使用CoDel时更准确。此外,当流量或链路特性改变时(例如,当新流量启动或链路带宽改变时),流隔离从根本上改善了网络的瞬态行为;虽然CoDel本身可能需要一段时间才能做出响应,但FQ CoDel几乎会立即做出反应。

6. Limitations of Flow Queueing
6. 流排队的局限性

While FQ-CoDel has been shown in many scenarios to offer significant performance gains compared to alternative queue management strategies, there are some scenarios where the scheduling algorithm in particular is not a good fit. This section documents some of the known cases in which either the default behaviour may require tweaking or alternatives to flow queueing should be considered.

虽然与其他队列管理策略相比,FQ CoDel已在许多场景中显示出显著的性能提升,但在某些场景中,调度算法尤其不适合。本节记录了一些已知的情况,其中默认行为可能需要调整,或者应该考虑流排队的替代方案。

6.1. Fairness between Things Other Than Flows
6.1. 流以外的事物之间的公平性

In some parts of the network, enforcing flow-level fairness may not be desirable, or some other form of fairness may be more important. Some examples of this include an ISP that may be more interested in ensuring fairness between customers than between flows or a hosting or transit provider that wishes to ensure fairness between connecting Autonomous Systems or networks. Another issue can be that the number of simultaneous flows experienced at a particular link can be too high for flow-based fairness queueing to be effective.

在网络的某些部分,强制流级公平性可能不可取,或者其他形式的公平性可能更重要。这方面的一些例子包括ISP,其可能更关心确保客户之间的公平性,而不是流量之间的公平性,或者希望确保连接的自治系统或网络之间的公平性的托管或传输提供商。另一个问题可能是,在特定链路上同时经历的流的数量可能太高,基于流的公平性排队无法有效。

Whatever the reason, in a scenario where fairness between flows is not desirable, reconfiguring FQ-CoDel to match on a different characteristic can be a way forward. The implementation in Linux can leverage the packet matching mechanism of the "tc" subsystem to use any available packet field to partition packets into virtual queues, for instance, to match on address or subnet source/destination pairs, application-layer characteristics, etc.

不管是什么原因,在流之间的公平性不理想的场景中,重新配置FQ CoDel以匹配不同的特性可能是一种前进的方式。Linux中的实现可以利用“tc”子系统的数据包匹配机制,使用任何可用的数据包字段将数据包划分为虚拟队列,例如,匹配地址或子网源/目的地对、应用层特征等。

Furthermore, as commonly deployed today, FQ-CoDel is used with three or more tiers of service classification, based on Diffserv markings: priority, best effort, and background. Some products do more detailed classification, including deep packet inspection and destination-specific filters to achieve their desired result.

此外,正如今天通常部署的那样,FQ CoDel与三层或三层以上的服务分类一起使用,基于区分服务标记:优先级、最大努力和背景。有些产品会进行更详细的分类,包括深度包检查和特定目的地的过滤器,以达到预期的效果。

6.2. Flow Bunching by Opaque Encapsulation
6.2. 不透明封装流动聚束

Where possible, FQ-CoDel will attempt to decapsulate packets before matching on the header fields for the flow hashing. However, for some encapsulation techniques, most notably encrypted VPNs, this is not possible. If several flows are bunched into one such encapsulated tunnel, they will be seen as one flow by the FQ-CoDel algorithm. This means that they will share a queue and drop behaviour, so flows inside the encapsulation will not benefit from the implicit prioritisation of FQ-CoDel but will continue to benefit from the reduced overall queue length from the CoDel algorithm operating on the queue. In addition, when such an encapsulated bunch

在可能的情况下,FQ CoDel将在匹配流散列的头字段之前尝试解除数据包的封装。然而,对于某些封装技术,尤其是加密VPN,这是不可能的。如果多个流聚集到一个这样的封装隧道中,则FQ CoDel算法将它们视为一个流。这意味着它们将共享队列和丢弃行为,因此封装内的流不会受益于FQ CoDel的隐式优先级排序,而是将继续受益于队列上运行的CoDel算法减少的总队列长度。另外,当这样一个封装束

competes against other flows, it will count as one flow and not assigned a share of the bandwidth based on how many flows are inside the encapsulation.

与其他流竞争时,它将算作一个流,并且不会根据封装中的流数量分配带宽份额。

Depending on the application, this may or may not be desirable behaviour. In cases where it is not, changing FQ-CoDel's matching to not be flow-based (as detailed in the previous subsection above) can be a mitigation. Going forward, having some mechanism for opaque encapsulations to express to the outer layer which flow a packet belongs to could be a way to mitigate this. Naturally, care needs to be taken when designing such a mechanism to ensure no new privacy and security issues are raised by exposing information from inside the encapsulation to the outside world. Keeping the extra information out of band and dropping it before it hits the network could be one way to achieve this.

根据应用情况,这可能是也可能不是理想的行为。如果不是,则将FQ CoDel的匹配更改为不基于流(如上文前一小节所述)可能是一种缓解措施。展望未来,有一些不透明封装的机制来表示数据包所属的外层可能是缓解这种情况的一种方法。当然,在设计这种机制时需要小心,以确保将封装内部的信息暴露给外部世界不会产生新的隐私和安全问题。将额外的信息保留在带外,并在其进入网络之前将其丢弃可能是实现这一目标的一种方法。

6.3. Low-Priority Congestion Control Algorithms
6.3. 低优先级拥塞控制算法

In the presence of queue management schemes that limit latency under load, low-priority congestion control algorithms such as Low Extra Delay Background Transport (LEDBAT) [RFC6817] (or, in general, algorithms that try to voluntarily use up less than their fair share of bandwidth) experience little added latency when the link is congested. Thus, they lack the signal to back off that added latency previously afforded them. This effect is seen with FQ-CoDel as well as with any effective AQM [GONG2014].

在存在限制负载下延迟的队列管理方案的情况下,低优先级拥塞控制算法,例如低额外延迟后台传输(LEDBAT)[RFC6817](或者,通常,尝试自愿使用少于其公平带宽份额的算法)在链路拥塞时几乎不会增加延迟。因此,他们缺乏信号来回退之前提供给他们的额外延迟。FQ CoDel以及任何有效的AQM都可以看到这种效果[2014]。

As such, these delay-based algorithms tend to revert to loss-based congestion control and will consume the fair share of bandwidth afforded to them by the FQ-CoDel scheduler. However, low-priority congestion control mechanisms may be able to take steps to continue to be low priority, for instance, by taking into account the vastly reduced level of delay afforded by an AQM or by using a coupled approach to observing the behaviour of multiple flows.

因此,这些基于延迟的算法倾向于恢复到基于丢失的拥塞控制,并将消耗FQ CoDel调度器提供给它们的公平带宽份额。然而,低优先级拥塞控制机制可能能够采取步骤继续保持低优先级,例如,通过考虑AQM提供的大幅降低的延迟水平,或者通过使用耦合方法来观察多个流的行为。

7. Deployment Status and Future Work
7. 部署情况和今后的工作

The FQ-CoDel algorithm as described in this document has been shipped as part of the Linux kernel since version 3.5 (released on the 21st of July, 2012), with the ce_threshold being added in version 4.2. The algorithm has seen widespread testing in a variety of contexts and is configured as the default queueing discipline in a number of mainline Linux distributions (as of this writing, at least OpenWRT, Arch Linux, and Fedora). In addition, a BSD implementation is available. All data resulting from these trials have shown FQ-CoDel to be a massive improvement over the previous default FIFO queue, and people are encouraged to turn it on.

自版本3.5(2012年7月21日发布)以来,本文档中描述的FQ CoDel算法已作为Linux内核的一部分发布,版本4.2中添加了ce_阈值。该算法已经在各种上下文中进行了广泛的测试,并在许多主线Linux发行版中被配置为默认排队规程(在本文撰写时,至少是OpenWRT、Arch Linux和Fedora)。此外,还提供了BSD实现。从这些试验中得到的所有数据表明,FQ CoDel比以前的默认FIFO队列有了巨大的改进,鼓励人们打开它。

Of course, there is always room for improvement, and this document has listed some of the known limitations of the algorithm. As such, we encourage further research into algorithm refinements and addressing of limitations. One such effort has been undertaken by the bufferbloat community in the form of the Cake queue management scheme [CAKE]. In addition to this, we believe the following (non-exhaustive) list of issues to be worthy of further enquiry:

当然,总有改进的余地,本文档列出了该算法的一些已知限制。因此,我们鼓励进一步研究算法改进和解决限制。bufferbloat社区以Cake队列管理方案[Cake]的形式进行了这样一项工作。除此之外,我们认为以下(非详尽)问题清单值得进一步调查:

o Variations on the flow classification mechanism to fit different notions of flows. For instance, an ISP might want to deploy per-subscriber scheduling, while in other cases, several flows can share a 5-tuple, as exemplified by the RTCWEB QoS recommendations [WEBRTC-QOS].

o 流量分类机制的变化,以适应不同的流量概念。例如,ISP可能希望按订户部署调度,而在其他情况下,多个流可以共享一个5元组,如RTCWEB QoS建议[WEBRTC-QoS]所示。

o Interactions between flow queueing and delay-based congestion control algorithms and scavenger protocols.

o 流排队与基于延迟的拥塞控制算法和清道夫协议之间的相互作用。

o Other scheduling mechanisms to replace the DRR portion of the algorithm, e.g., QFQ or WFQ.

o 替换算法DRR部分的其他调度机制,例如QFQ或WFQ。

o Sensitivity of parameters, most notably, the number of queues and the CoDel parameters.

o 参数的敏感性,最显著的是队列数量和CoDel参数。

8. Security Considerations
8. 安全考虑

There are no specific security exposures associated with FQ-CoDel that are not also present in current FIFO systems. On the contrary, some vulnerabilities of FIFO systems are reduced with FQ-CoDel (e.g., simple minded packet floods). However, some care is needed in the implementation to ensure this is the case. These are included in the description above, but we reiterate them here:

当前FIFO系统中不存在与FQ CoDel相关的特定安全风险。相反,使用FQ CoDel(例如,头脑简单的数据包泛滥)可以减少FIFO系统的一些漏洞。然而,在实施过程中需要谨慎,以确保情况确实如此。上述描述中包括这些内容,但我们在此重申:

o To prevent packets in the new queues from starving old queues, it is important that when a queue on the list of new queues empties, it is moved to _the end of_ the list of old queues. This is described at the end of Section 4.2.

o 为了防止新队列中的数据包耗尽旧队列,当新队列列表中的队列清空时,必须将其移动到旧队列列表的末尾。第4.2节末尾对此进行了描述。

o To prevent an attacker targeting a specific flow for a denial-of-service attack, the hash that maps packets to queues should not be predictable. To achieve this, FQ-CoDel salts the hash, as described in the beginning of Section 4.1. The size of the salt and the strength of the hash function is obviously a trade-off between performance and security. The Linux implementation uses a 32-bit random value as the salt and a Jenkins hash function. This makes it possible to achieve high throughput, and we consider it sufficient to ward off the most obvious attacks.

o 为了防止攻击者针对特定流进行拒绝服务攻击,将数据包映射到队列的哈希不应是可预测的。为了实现这一点,FQ CoDel对散列进行盐分,如第4.1节开头所述。salt的大小和hash函数的强度显然是性能和安全性之间的权衡。Linux实现使用32位随机值作为salt和Jenkins哈希函数。这使得有可能实现高吞吐量,并且我们认为它足以避开最明显的攻击。

o Packet fragments without a Layer 4 header can be hashed into different bins than the first fragment with the header intact. This can cause reordering and/or adversely affect the performance of the flow. Keeping state to match the fragments to the beginning of the packet or simply putting all packet fragments (including the first fragment of each fragmented packet) into the same queue are two ways to alleviate this.

o 没有第4层头的数据包片段可以散列到不同的容器中,而不是头完整的第一个片段。这可能会导致重新排序和/或对流的性能产生不利影响。保持状态使片段与数据包的开头相匹配,或者简单地将所有数据包片段(包括每个片段数据包的第一个片段)放入同一队列,这是缓解这种情况的两种方法。

9. IANA Considerations
9. IANA考虑

This document does not require any IANA actions.

本文件不要求IANA采取任何行动。

10. References
10. 工具书类
10.1. Normative References
10.1. 规范性引用文件

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, <https://www.rfc-editor.org/info/rfc2119>.

[RFC2119]Bradner,S.,“RFC中用于表示需求水平的关键词”,BCP 14,RFC 2119,DOI 10.17487/RFC2119,1997年3月<https://www.rfc-editor.org/info/rfc2119>.

[RFC7806] Baker, F. and R. Pan, "On Queuing, Marking, and Dropping", RFC 7806, DOI 10.17487/RFC7806, April 2016, <https://www.rfc-editor.org/info/rfc7806>.

[RFC7806]Baker,F.和R.Pan,“关于排队、标记和丢弃”,RFC 7806,DOI 10.17487/RFC7806,2016年4月<https://www.rfc-editor.org/info/rfc7806>.

[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017, <https://www.rfc-editor.org/info/rfc8174>.

[RFC8174]Leiba,B.,“RFC 2119关键词中大写与小写的歧义”,BCP 14,RFC 8174,DOI 10.17487/RFC8174,2017年5月<https://www.rfc-editor.org/info/rfc8174>.

[RFC8289] Nichols, K., Jacobson, V., McGregor, A., Ed., and J. Iyengar, Ed., "Controlled Delay Active Queue Management", RFC 8289, DOI 10.17487/RFC8289, January 2018, <https://www.rfc-editor.org/info/rfc8289>.

[RFC8289]Nichols,K.,Jacobson,V.,McGregor,A.,Ed.,和J.Iyengar,Ed.,“受控延迟主动队列管理”,RFC 8289,DOI 10.17487/RFC8289,2018年1月<https://www.rfc-editor.org/info/rfc8289>.

10.2. Informative References
10.2. 资料性引用

[BLOAT] Gettys, J. and K. Nichols, "Bufferbloat: Dark Buffers in the Internet", Communications of the ACM, Volume 55, Issue 1, DOI 10.1145/2063176.2063196, January 2012.

[BLOAT]Gettys,J.和K.Nichols,“缓冲区膨胀:互联网中的暗缓冲区”,《ACM通讯》,第55卷,第1期,DOI 10.1145/2063176.2063196,2012年1月。

[BLOATWEB] "Bufferbloat", <https://www.bufferbloat.net>.

[BLOATWEB]“Bufferbloat”<https://www.bufferbloat.net>.

[BQL] Herbert, T., "bql: Byte Queue Limits", August 2011, <https://lwn.net/Articles/454378/>.

[BQL]Herbert,T.,“BQL:字节队列限制”,2011年8月<https://lwn.net/Articles/454378/>.

[CAKE] "Cake - Common Applications Kept Enhanced", <http://www.bufferbloat.net/projects/codel/wiki/Cake>.

[蛋糕]“蛋糕-常用应用程序不断增强”<http://www.bufferbloat.net/projects/codel/wiki/Cake>.

[CODEL] Nichols, K. and V. Jacobson, "Controlling Queue Delay", ACM Queue, Volume 10, Issue 5, DOI 10.1145/2208917.2209336, May 2012, <http://queue.acm.org/detail.cfm?id=2209336>.

[CODEL]Nichols,K.和V.Jacobson,“控制队列延迟”,ACM队列,第10卷,第5期,DOI 10.1145/2208917.2209336,2012年5月<http://queue.acm.org/detail.cfm?id=2209336>.

[DRR] Shreedhar, M. and G. Varghese, "Efficient Fair Queueing Using Deficit Round Robin", IEEE/ACM Transactions on Networking, Volume 4, Issue 3, DOI 10.1109/90.502236, June 1996.

[DRR]Shreedhar,M.和G.Varghese,“使用赤字循环的有效公平排队”,IEEE/ACM网络交易,第4卷,第3期,DOI 10.1109/90.502236,1996年6月。

[DRRPP] MacGregor, M. and W. Shi, "Deficits for Bursty Latency-Critical Flows: DRR++", Proceedings of the IEEE International Conference on Networks 2000 (ICON 2000), DOI 10.1109/ICON.2000.875803, September 2000, <http://ieeexplore.ieee.org/xpls/ abs_all.jsp?arnumber=875803>.

[DRRPP]MacGregor,M.和W.Shi,“突发延迟关键流的缺陷:DRR++”,《IEEE网络国际会议记录2000》(ICON 2000),DOI 10.1109/ICON.2000.875803,2000年9月<http://ieeexplore.ieee.org/xpls/ abs_all.jsp?arnumber=875803>。

[GONG2014] Gong, Y., Rossi, D., Testa, C., Valenti, S., and D. Taht, "Fighting the bufferbloat: On the coexistence of AQM and low priority congestion control", Elsevier Computer Networks, Volume 65, DOI 10.1016/j.bjp.2014.01.009, June 2014, <https://www.sciencedirect.com/science/article/pii/ S1389128614000188>.

[GONG2014]Gong,Y.,Rossi,D.,Testa,C.,Valenti,S.,和D.Taht,“对抗缓冲区膨胀:关于AQM和低优先级拥塞控制的共存”,爱思唯尔计算机网络,卷65,DOI 10.1016/j.bjp.2014.01.009,2014年6月<https://www.sciencedirect.com/science/article/pii/ S1389128614000188>。

[HFSC] Stoica, I., Zhang, H., and T. Eugene Ng, "A Hierarchical Fair Service Curve Algorithm for Link-Sharing, Real-Time and Priority Services", Proceedings of ACM SIGCOMM, DOI 10.1145/263105.263175, September 1997, <http://conferences.sigcomm.org/sigcomm/1997/papers/ p011.pdf>.

[HFSC]Stoica,I.,Zhang,H.,和T.Eugene Ng,“链路共享、实时和优先服务的分层公平服务曲线算法”,ACM SIGCOMM会议记录,DOI 10.1145/263105.263175,1997年9月<http://conferences.sigcomm.org/sigcomm/1997/papers/ p011.pdf>。

[HTB] Wikipedia, "Token Bucket: Variations", October 2017, <https://en.wikipedia.org/w/ index.php?title=Token_bucket&oldid=803574657>.

[HTB]维基百科,“代币桶:变化”,2017年10月<https://en.wikipedia.org/w/ index.php?title=Token\u bucket&oldid=803574657>。

[JENKINS] Jenkins, B., "A Hash Function for Hash Table Lookup", <http://www.burtleburtle.net/bob/hash/doobs.html>.

[JENKINS]JENKINS,B.,“用于哈希表查找的哈希函数”<http://www.burtleburtle.net/bob/hash/doobs.html>.

[LINUXSRC] "Linux Kernel Source Tree", <https://git.kernel.org/cgit/l inux/kernel/git/torvalds/linux.git/tree/net/sched/ sch_fq_codel.c>.

[LINUXSRC]“Linux内核源代码树”<https://git.kernel.org/cgit/l inux/kernel/git/torvalds/linux.git/tree/net/sched/sch_fq_codel.c>。

[NS2] "ns-2", December 2014, <http://nsnam.sourceforge.net/wiki/ index.php?title=Main_Page&oldid=8076>.

[NS2]“ns-2”,2014年12月<http://nsnam.sourceforge.net/wiki/ index.php?title=Main\u Page&oldid=8076>。

[NS3] "ns-3", February 2016, <https://www.nsnam.org/mediawiki/ index.php?title=Main_Page&oldid=9883>.

[NS3]“ns-3”,2016年2月<https://www.nsnam.org/mediawiki/ index.php?title=Main_Page&oldid=9883>。

[QFQ] Checconi, F., Rizzo, L., and P. Valente, "QFQ: Efficient Packet Scheduling with Tight Guarantees", IEEE/ACM Transactions on Networking (TON), Volume 21, Issue 3, pp. 802-816, DOI 10.1109/TNET.2012.2215881, June 2013, <http://dl.acm.org/citation.cfm?id=2525552>.

[QFQ]Checconi,F.,Rizzo,L.,和P.Valente,“QFQ:具有严格保证的有效数据包调度”,IEEE/ACM网络事务(TON),第21卷,第3期,第802-816页,DOI 10.1109/TNET.2012.22158812013年6月<http://dl.acm.org/citation.cfm?id=2525552>.

[RFC2003] Perkins, C., "IP Encapsulation within IP", RFC 2003, DOI 10.17487/RFC2003, October 1996, <https://www.rfc-editor.org/info/rfc2003>.

[RFC2003]Perkins,C.,“IP内的IP封装”,RFC 2003,DOI 10.17487/RFC2003,1996年10月<https://www.rfc-editor.org/info/rfc2003>.

[RFC2890] Dommety, G., "Key and Sequence Number Extensions to GRE", RFC 2890, DOI 10.17487/RFC2890, September 2000, <https://www.rfc-editor.org/info/rfc2890>.

[RFC2890]Dommety,G.,“GRE的密钥和序列号扩展”,RFC 2890,DOI 10.17487/RFC2890,2000年9月<https://www.rfc-editor.org/info/rfc2890>.

[RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition of Explicit Congestion Notification (ECN) to IP", RFC 3168, DOI 10.17487/RFC3168, September 2001, <https://www.rfc-editor.org/info/rfc3168>.

[RFC3168]Ramakrishnan,K.,Floyd,S.,和D.Black,“向IP添加显式拥塞通知(ECN)”,RFC 3168,DOI 10.17487/RFC3168,2001年9月<https://www.rfc-editor.org/info/rfc3168>.

[RFC4213] Nordmark, E. and R. Gilligan, "Basic Transition Mechanisms for IPv6 Hosts and Routers", RFC 4213, DOI 10.17487/RFC4213, October 2005, <https://www.rfc-editor.org/info/rfc4213>.

[RFC4213]Nordmark,E.和R.Gilligan,“IPv6主机和路由器的基本转换机制”,RFC 4213,DOI 10.17487/RFC4213,2005年10月<https://www.rfc-editor.org/info/rfc4213>.

[RFC6817] Shalunov, S., Hazel, G., Iyengar, J., and M. Kuehlewind, "Low Extra Delay Background Transport (LEDBAT)", RFC 6817, DOI 10.17487/RFC6817, December 2012, <https://www.rfc-editor.org/info/rfc6817>.

[RFC6817]Shalunov,S.,Hazel,G.,Iyengar,J.,和M.Kuehlewind,“低额外延迟背景传输(LEDBAT)”,RFC 6817,DOI 10.17487/RFC6817,2012年12月<https://www.rfc-editor.org/info/rfc6817>.

[RFC8257] Bensley, S., Thaler, D., Balasubramanian, P., Eggert, L., and G. Judd, "Data Center TCP (DCTCP): TCP Congestion Control for Data Centers", RFC 8257, DOI 10.17487/RFC8257, October 2017, <https://www.rfc-editor.org/info/rfc8257>.

[RFC8257]Bensley,S.,Thaler,D.,Balasubramanian,P.,Eggert,L.,和G.Judd,“数据中心TCP(DCTCP):数据中心的TCP拥塞控制”,RFC 8257,DOI 10.17487/RFC8257,2017年10月<https://www.rfc-editor.org/info/rfc8257>.

[SFQ] McKenney, P., "Stochastic Fairness Queueing", Proceedings of IEEE INFOCOM, DOI 10.1109/INFCOM.1990.91316, June 1990, <http://perso.telecom-paristech.fr/~bonald/Publications_files/BMO2011.pdf>.

[SFQ]McKenney,P.,“随机公平排队”,IEEE信息网会议录,DOI 10.1109/INFCOM.1990.91316,1990年6月<http://perso.telecom-paristech.fr/~bonald/Publications\u files/BMO2011.pdf>。

[SQF] Carofiglio, G. and L. Muscariello, "On the Impact of TCP and Per-Flow Scheduling on Internet Performance", IEEE/ACM Transactions on Networking, Volume 20, Issue 2, DOI 10.1109/TNET.2011.2164553, August 2011.

[SQF]Carofiglio,G.和L.Muscariello,“关于TCP和每流调度对互联网性能的影响”,IEEE/ACM网络交易,第20卷,第2期,DOI 10.1109/TNET.2011.2164553,2011年8月。

[WEBRTC-QOS] Jones, P., Dhesikan, S., Jennings, C., and D. Druta, "DSCP Packet Markings for WebRTC QoS", Work in Progress, draft-ietf-tsvwg-rtcweb-qos-18, August 2016.

[WEBRTC-QOS]Jones,P.,Dhesikan,S.,Jennings,C.,和D.Druta,“用于WEBRTC QOS的DSCP数据包标记”,正在进行的工作,草稿-ietf-tsvwg-rtcweb-QOS-18,2016年8月。

[WFQ] Demers, A., Keshav, S., and S. Shenker, "Analysis and Simulation of a Fair Queueing Algorithm", ACM SIGCOMM Computer Communication Review, Volume 19, Issue 4, pp. 1-12, DOI 10.1145/75247.75248, September 1989, <http://doi.acm.org/10.1145/75247.75248>.

[WFQ]Demers,A.,Keshav,S.和S.Shenker,“公平排队算法的分析和模拟”,《ACM SIGCOMM计算机通信评论》,第19卷,第4期,第1-12页,DOI 10.1145/75247.75248,1989年9月<http://doi.acm.org/10.1145/75247.75248>.

Acknowledgements

致谢

Our deepest thanks to Kathie Nichols, Van Jacobson, and all the members of the bufferbloat.net effort for all the help on developing and testing the algorithm. In addition, our thanks to Anil Agarwal for his help with getting the hash collision probabilities in this document right.

我们衷心感谢Kathie Nichols、Van Jacobson和bufferbloat.net的所有成员在开发和测试算法方面提供的帮助。此外,我们还要感谢Anil Agarwal,感谢他帮助我们正确计算了本文档中的哈希冲突概率。

Authors' Addresses

作者地址

Toke Hoeiland-Joergensen Karlstad University Dept. of Computer Science Karlstad 65188 Sweden Email: toke@toke.dk

Toke Hoeiland Joergensen Karlstad大学计算机科学系Karlstad 65188瑞典电子邮件:toke@toke.dk

   Paul McKenney
   IBM Linux Technology Center
   1385 NW Amberglen Parkway
   Hillsboro, OR  97006
   United States of America
   Email: paulmck@linux.vnet.ibm.com
   URI:   http://www2.rdrop.com/~paulmck/
        
   Paul McKenney
   IBM Linux Technology Center
   1385 NW Amberglen Parkway
   Hillsboro, OR  97006
   United States of America
   Email: paulmck@linux.vnet.ibm.com
   URI:   http://www2.rdrop.com/~paulmck/
        

Dave Taht Teklibre 2104 W First street Apt 2002 FT Myers, FL 33901 United States of America Email: dave.taht@gmail.com URI: http://www.teklibre.com/

Dave Taht Teklibre 2104 W First street Apt 2002美国佛罗里达州迈尔斯堡33901电子邮件:Dave。taht@gmail.comURI:http://www.teklibre.com/

   Jim Gettys
   21 Oak Knoll Road
   Carlisle, MA  993
   United States of America
   Email: jg@freedesktop.org
   URI:   https://en.wikipedia.org/wiki/Jim_Gettys
        
   Jim Gettys
   21 Oak Knoll Road
   Carlisle, MA  993
   United States of America
   Email: jg@freedesktop.org
   URI:   https://en.wikipedia.org/wiki/Jim_Gettys
        

Eric Dumazet Google, Inc. 1600 Amphitheatre Pkwy Mountain View, CA 94043 United States of America Email: edumazet@gmail.com

Eric Dumazet Google,Inc.1600圆形剧场Pkwy Mountain View,加利福尼亚州94043美利坚合众国电子邮件:edumazet@gmail.com