Internet Engineering Task Force (IETF)                M. Kuehlewind, Ed.
Request for Comments: 7786                                    ETH Zurich
Category: Experimental                                  R. Scheffenegger
ISSN: 2070-1721                                             NetApp, Inc.
                                                                May 2016
        
TCP Modifications for Congestion Exposure (ConEx)

Abstract

Congestion Exposure (ConEx) is a mechanism by which senders inform the network about expected congestion based on congestion feedback from previous packets in the same flow. This document describes the necessary modifications to use ConEx with the Transmission Control Protocol (TCP).

Status of This Memo

This document is not an Internet Standards Track specification; it is published for examination, experimental implementation, and evaluation.

This document defines an Experimental Protocol for the Internet community. This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Not all documents approved by the IESG are a candidate for any level of Internet Standard; see Section 2 of RFC 5741.

Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at http://www.rfc-editor.org/info/rfc7786.

Copyright Notice

Copyright (c) 2016 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Requirements Language
   2.  Sender-Side Modifications
   3.  Counting Congestion
     3.1.  Loss Detection
       3.1.1.  Without SACK Support
     3.2.  Explicit Congestion Notification (ECN)
       3.2.1.  Accurate ECN Feedback
       3.2.2.  Classic ECN Support
   4.  Setting the ConEx Flags
     4.1.  Setting the E or the L Flag
     4.2.  Setting the Credit Flag
   5.  Loss of ConEx Information
   6.  Timeliness of the ConEx Signals
   7.  Open Areas for Experimentation
   8.  Security Considerations
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Acknowledgements
   Authors' Addresses
        
1. Introduction

Congestion Exposure (ConEx) is a mechanism by which senders inform the network about expected congestion based on congestion feedback from previous packets in the same flow. ConEx concepts and use cases are further explained in [RFC6789]. The abstract ConEx mechanism is explained in [RFC7713]. This document describes the necessary modifications to use ConEx with the Transmission Control Protocol (TCP).

The markings for ConEx signaling are defined in the ConEx Destination Option (CDO) for IPv6 [RFC7837]. Specifically, the use of four flags is defined: X (ConEx-capable), L (loss experienced), E (ECN experienced), and C (credit).

ConEx signaling is based on the use of either loss or Explicit Congestion Notification (ECN) marks [RFC3168] as congestion indication. The sender collects this congestion information based on existing TCP feedback mechanisms from the receiver to the sender. No changes are needed at the receiver side to implement ConEx signaling. Therefore, no additional negotiation is needed to implement and use ConEx at the sender side. This document specifies the sender's actions that are needed to provide meaningful ConEx information to the network.

Section 2 provides an overview of the modifications needed for TCP senders to implement ConEx. First, congestion information has to be extracted from TCP's loss or ECN feedback as described in Section 3. Section 4 details how to set the CDO marking based on this congestion information. Section 5 discusses the loss of packets carrying ConEx information. Section 6 discusses the timeliness of the ConEx feedback signal, given that congestion is a temporary state.

This document describes congestion accounting for TCP with and without the Selective Acknowledgement (SACK) extension [RFC2018] (in Section 3.1). However, ConEx benefits from the more accurate information that SACK provides about the number of bytes dropped in the network, and it is therefore preferable to use the SACK extension when using TCP with ConEx. The detailed mechanism to set the L flag in response to the loss-based congestion feedback signal is given in Section 4.1.

While loss has to be minimized, ECN can provide more fine-grained feedback information. ConEx-based traffic measurement or management mechanisms could benefit from this. Unfortunately, the current ECN feedback mechanism does not reflect multiple congestion markings if they occur within the same Round-Trip Time (RTT). A more accurate feedback extension to ECN (AccECN) is proposed in a separate document [ACCURATE], as this is also useful for other mechanisms.

Congestion accounting for both classic ECN feedback and AccECN feedback is explained in detail in Section 3.2. Setting the E flag in response to ECN-based congestion feedback is again detailed in Section 4.1.

1.1. Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

2. Sender-Side Modifications

This section gives an overview of actions that need to be taken by a TCP sender modified to use ConEx signaling.

In the TCP handshake, a ConEx sender MUST negotiate for SACK and ECN, preferably with AccECN feedback. Therefore, a ConEx sender MUST also implement SACK and ECN. Depending on the capability of the receiver, the following operation modes exist:

o SACK-accECN-ConEx (SACK and accurate ECN feedback)

o SACK-ECN-ConEx (SACK and classic instead of accurate ECN)

o accECN-ConEx (no SACK but accurate ECN feedback)

o ECN-ConEx (no SACK and no accurate ECN feedback, but classic ECN)

o SACK-ConEx (SACK but no ECN at all)

o Basic-ConEx (neither SACK nor ECN)
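
The operation modes listed above follow mechanically from what the receiver agreed to in the handshake. The following Python sketch illustrates the mapping; the function name and its three boolean parameters are hypothetical and are not defined by this document, and it assumes that a successful AccECN negotiation implies usable ECN.

```python
def conex_operation_mode(sack_ok: bool, accecn_ok: bool, ecn_ok: bool) -> str:
    """Return the name of the ConEx operation mode for a connection.

    Assumption: accecn_ok takes precedence over ecn_ok, because accurate
    ECN feedback implies that ECN itself is in use.
    """
    parts = []
    if sack_ok:
        parts.append("SACK")
    if accecn_ok:
        parts.append("accECN")
    elif ecn_ok:
        parts.append("ECN")
    if not parts:
        # Neither SACK nor any form of ECN was negotiated.
        parts.append("Basic")
    parts.append("ConEx")
    return "-".join(parts)
```

For example, a receiver that supports SACK but only classic ECN would put the sender in SACK-ECN-ConEx mode.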

A ConEx sender MUST expose all congestion information to the network according to the congestion information received by ECN or based on loss information provided by the TCP feedback loop. A TCP sender SHOULD count congestion byte-wise (rather than packet-wise; see next paragraph). After any congestion notification, a sender MUST mark subsequent packets with the appropriate ConEx flag in the IP header. Furthermore, a ConEx sender must send enough credit to cover all experienced congestion for the connection so far, as well as the risk of congestion for the current transmission (see Section 4.2).

With SACK the number of lost payload bytes is known, but not the number of packets carrying these bytes. With classic ECN only an indication is given that a marking occurred, but not the exact number of payload bytes nor packets. As network congestion is usually byte-congestion [RFC7141], the byte-size of a packet marked with a CDO flag is defined to represent that number of bytes of congestion signaling [RFC7837]. Therefore, the exact number of bytes should be taken into account, if available, to make the ConEx Signal as exact as possible.

Detailed mechanisms for congestion counting in each operation mode are described in the next section.

3. Counting Congestion

A ConEx TCP sender maintains two counters: one that counts congestion based on the information retrieved by loss detection, and a second that accounts for ECN-based congestion feedback. These counters hold the number of outstanding bytes that should be ConEx-Marked with, respectively, the L flag or the E flag in subsequent packets.

The outstanding bytes for congestion indications based on loss are maintained in the Loss Exposure Gauge (LEG), as explained in Section 3.1.

The outstanding bytes counted based on ECN feedback information are maintained in the Congestion Exposure Gauge (CEG), as explained in Section 3.2.

When the sender sends a ConEx-capable packet with the E or L flag set, it reduces the respective counter by the byte-size of the packet. This is explained for both counters in Section 4.1.

Note that all bytes of an IP packet must be counted in the LEG or CEG to capture the right number of bytes that should be marked. Therefore, the sender SHOULD take the payload and headers into account, up to and including the IP header. However, in TCP the information regarding how large the headers of a lost or marked packet were is usually not available, as only payload data will be acknowledged.

If equal-sized packets, or at least equally distributed packet sizes, can be assumed, the sender MAY only add and subtract TCP payload bytes. In this case, there should be about the same number of ConEx-Marked packets as the original packets that were causing the congestion. Thus, both contain about the same number of header bytes so they will cancel out. This case is assumed for simplicity in the following sections.

Otherwise, if a sender sends different sized packets (with unequally distributed packet sizes), the sender needs to memorize or estimate the number of lost or ECN-marked packets. If the sender has sufficient memory available, the most accurate way to reconstruct the number of lost or marked packets is to remember the sequence number of all sent but not acknowledged packets. In this case, a sender is able to reconstruct the number of packets, and thus the header bytes that were sent during the last RTT. Otherwise (e.g., if not enough memory is available), the sender would need to estimate the packet size. The average packet size can be estimated if the distribution pattern of packet sizes in the last RTT is known; alternatively, the minimum packet size seen in the last RTT can be used as the most conservative estimate.

If the number of newly sent-out packets with the ConEx L or E flag set is smaller (or larger) than this estimated number of lost/ECN-marked packets, the additional header bytes should be added to (or can be subtracted from) the respective gauge.

3.1. Loss Detection

This section applies whether or not SACK support is available. The following subsection (Section 3.1.1) handles the case when SACK is not available.

A TCP sender detects losses and subsequently retransmits the lost data. Therefore, the ConEx sender can simply set the ConEx L flag on all retransmissions in order to at least cover the amount of bytes lost. If this approach is taken, no LEG is needed.

However, any retransmission may be spurious. In this case, more bytes have been marked than necessary. To compensate for this effect, a ConEx sender can maintain a local signed counter (the LEG) that indicates the number of outstanding bytes to be sent with the ConEx L flag and also can become negative.

Using the LEG, when a TCP sender decides that a data segment needs to be retransmitted, it will increase the LEG by the size of the TCP payload bytes in the retransmission (assuming equal sized segments such that the retransmitted packet will have the same number of header bytes as the original ones):

         For each retransmission:
            LEG += payload

The LEG is reduced again when the ConEx L marking is set, as described in Section 4.1.

Further, to accommodate spurious retransmissions, a ConEx sender SHOULD make use of heuristics to detect such spurious retransmissions (e.g., F-RTO [RFC5682], DSACK [RFC3708], and Eifel [RFC3522], [RFC4015]), if already available in a given implementation. If no mechanism for detecting spurious retransmissions is available, the ConEx sender MAY choose to implement one of the mechanisms stated above. However, given the inaccuracy that ConEx may have anyway and the timeliness of ConEx information, a ConEx sender MAY also choose not to compensate for spurious retransmissions. In this case, if spurious retransmissions occur, the ConEx sender has simply sent too many ConEx Signals, which would, e.g., decrease the congestion allowance in a ConEx policer unnecessarily.

If a heuristic method is used to detect spurious retransmission and has determined that a certain number of packets were retransmitted erroneously, the ConEx sender subtracts the payload size of these TCP packets from LEG.

         If a spurious retransmission is detected:
            LEG -= payload

Note that LEG can become negative if too many L markings have already been sent. This case is further discussed in Section 6.
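
The LEG bookkeeping described in this section can be sketched in a few lines of Python. The class and method names below are illustrative assumptions, not part of this specification, and headers are ignored on the assumption of equal-sized segments.

```python
class LossExposureGauge:
    """Signed byte counter of loss that still has to be signaled with L."""

    def __init__(self) -> None:
        # May become negative if more L-marked bytes were sent than needed.
        self.leg = 0

    def on_retransmission(self, payload: int) -> None:
        # Each retransmission is treated as evidence of 'payload' lost bytes.
        self.leg += payload

    def on_spurious_retransmission(self, payload: int) -> None:
        # A heuristic (e.g., DSACK) found that the retransmission was unnecessary.
        self.leg -= payload

    def on_l_marked_packet_sent(self, payload: int) -> None:
        # Sending a packet with the L flag set pays off part of the gauge.
        self.leg -= payload
```

If a retransmission of 1460 bytes is L-marked and later found to be spurious, the gauge ends at -1460, which is the negative case discussed in Section 6.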

3.1.1. Without SACK Support

If multiple losses occur within one RTT and SACK is not used, it may take several RTTs until all lost data is retransmitted. With the scheme described above, the ConEx information will be delayed considerably, but timeliness is important for ConEx. For ConEx, it is important to know how much data was lost; it is not important to know what data is lost. During the first RTT after the initial loss detection, the amount of received data, and thus also the amount of lost data, can be estimated based on the number of received ACKs.

Therefore, a ConEx sender can use the following algorithm to estimate the number of lost bytes with an additional delay of one RTT using an additional Loss Estimation Counter (LEC):

         flight_bytes:     current flight size in bytes
         retransmit_bytes: payload size of the retransmission

At the first retransmission in a congestion event, LEC is set:

         LEC = flight_bytes - 3*SMSS
        
(At this point in the transmission, in the worst case, all packets in flight minus the three that triggered the dupACKs could have been lost.)

Then, during the first RTT of the congestion event:

         For each retransmission:
            LEG += retransmit_bytes
            LEC -= retransmit_bytes

         For each ACK:
            LEC -= SMSS

After one RTT:

         LEG += LEC

(The LEC now estimates the number of outstanding bytes that should be ConEx L-marked.)

After the first RTT, for each subsequent retransmission:

         if (LEC > 0): LEC -= retransmit_bytes
         else if (LEC==0): LEG += retransmit_bytes
        
         if (LEC < 0): LEG += -LEC
        
(The LEG is not increased for those bytes that were already counted.)
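
The steps of this estimation can be put together as in the Python sketch below. It is one possible reading of the pseudocode, under stated assumptions: the class and method names are hypothetical, the caller decides when the first RTT of the congestion event has ended, and LEC is reset to zero once an overshoot has been moved into the LEG (the text does not say this explicitly, but without it the same overshoot would be counted again on the next retransmission).

```python
class LossEstimator:
    """Loss estimation without SACK, using LEG plus the helper counter LEC."""

    def __init__(self, smss: int) -> None:
        self.smss = smss
        self.leg = 0
        self.lec = 0
        self.in_first_rtt = False

    def start_congestion_event(self, flight_bytes: int) -> None:
        # Worst case: everything in flight was lost, except the three
        # segments that triggered the duplicate ACKs.
        self.lec = flight_bytes - 3 * self.smss
        self.in_first_rtt = True

    def on_ack(self) -> None:
        # During the first RTT, every ACK means one more segment arrived.
        if self.in_first_rtt:
            self.lec -= self.smss

    def end_of_first_rtt(self) -> None:
        # LEC now estimates the lost bytes that were not yet retransmitted.
        self.leg += self.lec
        self.in_first_rtt = False

    def on_retransmission(self, retransmit_bytes: int) -> None:
        if self.in_first_rtt:
            self.leg += retransmit_bytes
            self.lec -= retransmit_bytes
            return
        if self.lec > 0:
            # These bytes were already added to LEG via LEC.
            self.lec -= retransmit_bytes
        elif self.lec == 0:
            self.leg += retransmit_bytes
        if self.lec < 0:
            # More was retransmitted than estimated; count the excess.
            self.leg += -self.lec
            self.lec = 0  # assumption: avoid counting the excess twice
```

With SMSS = 1000 and 10000 bytes in flight, one retransmission and four ACKs in the first RTT leave LEC at 2000, so the LEG is raised to 3000 at the end of that RTT; the next two retransmissions only drain LEC.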

3.2. Explicit Congestion Notification (ECN)

ECN [RFC3168] is an IP/TCP mechanism that allows network nodes to mark packets with the Congestion Experienced (CE) mark instead of dropping them when congestion occurs.

A receiver might support classic ECN, the more accurate ECN feedback scheme (AccECN), or neither. In the case that ECN is not supported for a connection, of course no ECN marks will occur; thus, the sender will never set the E flag. Otherwise, a ConEx sender needs to maintain a signed counter, the Congestion Exposure Gauge (CEG), for the number of outstanding bytes that have to be ConEx-Marked with the E flag.

The CEG is increased when ECN information is received from an ECN-capable receiver supporting the classic ECN scheme or the accurate ECN feedback scheme. When the ConEx sender receives an ACK indicating one or more segments were received with a CE mark, CEG is increased by the appropriate number of bytes as described further below.

Unfortunately, in case of duplicate acknowledgements, the number of newly acknowledged bytes will be zero even though (CE-marked) data has been received. Therefore, we increase the CEG by DeliveredData, as defined below:

   DeliveredData = acked_bytes + SACK_diff + (is_dup)*1SMSS -
   (is_after_dup)*num_dup*1SMSS
        
DeliveredData covers the number of bytes that have been newly delivered to the receiver. Therefore, on each arrival of an ACK, DeliveredData will be increased by the newly acknowledged bytes (acked_bytes) as indicated by the current ACK, relative to all past ACKs. The formula depends on whether SACK is available: if SACK is not available, SACK_diff is always zero, whereas if SACK information is available, is_dup and is_after_dup are always zero.

With SACK, DeliveredData is increased by the number of bytes provided by (new) SACK information (SACK_diff). Note that if fewer unacknowledged bytes are announced in the new SACK information than in the previous ACK, SACK_diff can be negative. In this case, data is newly acknowledged (in acked_bytes) that was previously accumulated into DeliveredData, based on SACK information.

Otherwise without SACK, DeliveredData is increased by 1 Sender Maximum Segment Size (SMSS) on duplicate acknowledgements because duplicate acknowledgements do not acknowledge any new data (and acked_bytes will be zero). For the subsequent partial or full ACK, acked_bytes cover all newly acknowledged bytes including those already accounted for with the receipt of any duplicate acknowledgement. Therefore, DeliveredData is reduced by one SMSS for each preceding duplicate ACK. Consequently, is_dup is one if the current ACK is a duplicated ACK without SACK, and zero otherwise. is_after_dup is only one for the next full or partial ACK after a number of duplicated ACKs without SACK and num_dup counts the number of duplicated ACKs in a row (which usually is 3 or more).
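
As an illustration, the DeliveredData formula can be written as a small Python function. The function signature is an assumption made for this sketch; the variable names follow the formula above.

```python
def delivered_data(acked_bytes: int, sack_diff: int, is_dup: bool,
                   is_after_dup: bool, num_dup: int, smss: int) -> int:
    """Bytes newly delivered to the receiver, as inferred from one ACK.

    With SACK, is_dup and is_after_dup are always False;
    without SACK, sack_diff is always 0.
    """
    return (acked_bytes
            + sack_diff
            + int(is_dup) * smss
            - int(is_after_dup) * num_dup * smss)
```

Without SACK and with SMSS = 1460, three duplicate ACKs each yield 1460 bytes. A following full ACK that covers 5840 bytes yields 5840 - 3*1460 = 1460 bytes, so the four ACKs together account for exactly the 5840 bytes delivered.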

With classic ECN, one congestion-marked packet causes continuous congestion feedback for a whole round trip, thus hiding the arrival of any further congestion-marked packets during that round trip. A more accurate ECN feedback scheme (AccECN) is needed to ensure that feedback properly reflects the extent of congestion marking. The two cases, with and without a receiver capable of AccECN, are discussed in the following sections.

3.2.1. Accurate ECN Feedback

With a more accurate ECN feedback scheme (AccECN) that is supported by the receiver, either the number of marked packets or the number of marked bytes will be fed back from the receiver to the sender and is therefore known at the sender side. In the latter case, the CEG can be increased directly by the number of marked bytes. Otherwise, if D is assumed to be the number of marks, the gauge (CEG) will be conservatively increased by one SMSS for each marking or, at the maximum, the number of newly acknowledged bytes:

   CEG += min(SMSS*D, DeliveredData)
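
A short Python sketch of this update is given below. The function and its optional parameters are hypothetical; how an implementation actually obtains the marked-byte or marked-packet count from AccECN feedback is outside the scope of this example.

```python
from typing import Optional


def ceg_increase_accecn(delivered_data: int, smss: int,
                        marked_bytes: Optional[int] = None,
                        marked_packets: Optional[int] = None) -> int:
    """Return the number of bytes to add to the CEG for one ACK."""
    if marked_bytes is not None:
        # The feedback reports marked bytes: use them directly.
        return marked_bytes
    if marked_packets is not None:
        # Only the number of marks D is known: assume one SMSS per mark,
        # but never more than what was newly delivered.
        return min(smss * marked_packets, delivered_data)
    return 0
```

For example, two marks on an ACK that delivers 4380 bytes add 2920 bytes to the CEG when SMSS is 1460, whereas two marks on an ACK that delivers only 1000 bytes add 1000 bytes.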
        
3.2.2. Classic ECN Support

With classic ECN, as soon as a CE mark is seen at the receiver side, it will feed this information back to the sender by setting the Echo Congestion Experienced (ECE) flag in the TCP header of subsequent ACKs. Once the sender receives the first ECE of a congestion notification, it sets the Congestion Window Reduced (CWR) flag in the TCP header once. When this packet with the CWR flag in the TCP header arrives at the receiver side acknowledging its first ECE feedback, the receiver stops setting the ECE flag.

If the ConEx sender fully conforms to the semantics of ECN signaling as defined by [RFC3168], it will receive one full RTT of ACKs with the ECE flag set whenever at least one CE mark was received by the receiver. As the sender cannot estimate how many packets have actually been CE-marked during this RTT, the most conservative assumption MAY be taken, namely assuming that all packets were marked. This can be achieved by increasing the CEG by DeliveredData for each ACK with the ECE flag:

CEG += DeliveredData
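
The conservative classic-ECN rule can be sketched in Python as follows. The function name and the representation of an ACK as an (ece_flag, delivered_data) pair are assumptions made for this example only.

```python
from typing import Iterable, Tuple


def ceg_after_acks(acks: Iterable[Tuple[bool, int]], ceg: int = 0) -> int:
    """Apply 'CEG += DeliveredData' for every ACK that carries ECE.

    Each element of acks is (ece_flag, delivered_data) for one ACK.
    """
    for ece_flag, delivered in acks:
        if ece_flag:
            # Conservative assumption: all newly delivered bytes were CE-marked.
            ceg += delivered
    return ceg
```

Given one ACK without ECE followed by two ACKs with ECE that deliver 1460 and 2920 bytes, the CEG grows by 4380 bytes.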

Optionally, a ConEx sender could implement the following technique (that does not conform to [RFC3168]), called "advanced compatibility mode", to considerably improve its estimate of the number of ECN-marked packets:

To extract more than one ECE indication per RTT, a ConEx sender could set the CWR flag continuously to force the receiver to signal only one ECE per CE mark. Unfortunately, the use of delayed ACKs [RFC5681] (which is common) will prevent feedback of every CE mark; if a CWR confirmation is received before the ECE can be sent out on the next ACK, ECN feedback information could get lost (depending on the actual receiver implementation). Thus, a sender SHOULD set CWR only on those data segments that will presumably trigger a (delayed) ACK. The sender would need an additional control loop to estimate which data segments will trigger an ACK in order to extract more timely congestion notifications. Still, the CEG SHOULD be increased by DeliveredData, as one or more CE-marked packets could be acknowledged by one delayed ACK.

4. Setting the ConEx Flags

By setting the X flag, a packet is marked as ConEx-capable. All packets carrying payload MUST be marked with the X flag set, including retransmissions. Only if no congestion feedback information is (currently) available SHOULD the X flag be zero (e.g., for control packets on a connection that has not sent any user data for some time and is therefore sending only pure ACKs that carry no payload).

4.1. Setting the E or the L Flag

As described in Section 3, the sender needs to maintain a CEG counter and might also maintain a LEG counter. If no LEG is used, all retransmissions will be marked with the L flag.

Further, as long as the LEG or CEG counter is positive, the sender marks each ConEx-capable packet with L or E respectively, and decreases the LEG or CEG counter by the TCP payload bytes carried in the marked packet (assuming headers are not being counted because packet sizes are regular). No matter how small the value of LEG or CEG, if the value is positive the sender MUST NOT defer packet marking; this ensures that ConEx Signals are timely. Therefore, the value of LEG and CEG will commonly be negative.

If both the LEG and CEG are positive, the sender MUST mark each ConEx-capable packet with both L and E. If a credit signal is also pending (see the next section), the C flag can be set as well.
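
The marking rule for the L and E flags can be illustrated with the following Python sketch. The function name and return shape are hypothetical, and only TCP payload bytes are counted, on the assumption of regular packet sizes.

```python
from typing import Tuple


def mark_packet(leg: int, ceg: int, payload: int) -> Tuple[bool, bool, int, int]:
    """Decide the L and E flags for one outgoing ConEx-capable packet.

    Returns (set_l, set_e, new_leg, new_ceg).
    """
    set_l = leg > 0  # mark as soon as the gauge is positive; never defer
    set_e = ceg > 0
    if set_l:
        leg -= payload
    if set_e:
        ceg -= payload
    return set_l, set_e, leg, ceg
```

Because a whole packet is marked even when the gauge holds only a few bytes, the gauge overshoots: with LEG = 100 and a 1460-byte packet, the L flag is set and the LEG drops to -1360.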

4.2. Setting the Credit Flag

The ConEx abstract mechanism [RFC7713] requires that sufficient credit MUST be signaled in advance to cover the expected congestion during the feedback delay of one RTT.

To monitor the credit state at the audit, a ConEx sender needs to maintain a Credit State Counter (CSC) in bytes. If congestion occurs, credits will be consumed and the CSC is reduced by the number of bytes that were lost or estimated to be ECN-marked. If the risk of congestion was estimated wrongly, and thus too few credits were sent, the CSC becomes zero but cannot go negative.

To be sure that the credit state in the audit never reaches zero, the number of credits should always equal the number of bytes in flight as all packets could potentially get lost or congestion-marked. In this case, a ConEx sender also monitors the number of bytes in flight F. If F ever becomes larger than the CSC, the ConEx sender sets the C flag on each ConEx-capable packet and increases the CSC by the payload size of each marked packet until the CSC is no less than F again. However, a ConEx sender might also be less conservative and send fewer credits if it, e.g., assumes that the congestion will be low on a certain path based on previous experience.

Recall that the CSC will be decreased whenever congestion occurs; therefore the CSC will need to be replenished as soon as the CSC drops below F. Also recall that the sender can set the C flag on a ConEx-capable packet whether or not the E or L flags are also set.
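
The conservative credit rule described above (keep the CSC no smaller than the bytes in flight F) might be sketched as follows. The class CreditState and its method names are hypothetical, and the sketch assumes that the caller passes a value of F that already includes the packet being sent.

```python
class CreditState:
    """Hypothetical tracker for the Credit State Counter (CSC), in bytes."""

    def __init__(self) -> None:
        self.csc = 0

    def on_send(self, payload_bytes: int, bytes_in_flight: int) -> bool:
        """Return True if the C flag should be set on this packet.

        Assumption: bytes_in_flight (F) already counts this packet.
        """
        if bytes_in_flight > self.csc:
            # F exceeds the CSC: send credit and account for it.
            self.csc += payload_bytes
            return True
        return False

    def on_congestion(self, congested_bytes: int) -> None:
        """Consume credit for lost or (estimated) ECN-marked bytes."""
        # The CSC can reach zero but cannot go negative.
        self.csc = max(0, self.csc - congested_bytes)
```

A less conservative sender could compare the CSC against a fraction of F instead; the comparison in on_send is the only line that would change.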

In TCP Slow Start, the congestion window might grow much larger than during the rest of the transmission. A sender is therefore likely to consider sending fewer than F credits, at the risk of being penalized by an audit function. However, the credits should at least cover the increase in sending rate. Given the exponential increase as implemented in the TCP Slow Start algorithm, which means that the sending rate doubles every RTT, a ConEx sender should at least cover half the number of packets in flight by credits.

Note that the number of losses or markings within one RTT does not depend solely on the sender's actions. In general, the behavior of the cross traffic, whether Active Queue Management (AQM) is used, and how it is parameterized all influence how many packets might be dropped or marked. As long as any AQM encountered is not overly aggressive with ECN marking, sending half the flight size as credits should be sufficient whether congestion is signaled by loss or ECN.

To maintain half of the packets in flight as credits, half of the packets of the initial window must also be C-marked. In Slow Start, marking every fourth packet introduces the correct amount of credit, as can be seen in Figure 1.

                                        in_flight  credits
                RTT1  |------XC------>|     1         1
                      |------X------->|     2         1
                      |------XC------>|     3         2
                      |               |
                RTT2  |------X------->|     3         2
                      |------X------->|     4         2
                      |------X------->|     4         2
                      |------XC------>|     5         3
                      |------X------->|     5         3
                      |------X------->|     6         3
                      |               |
                RTT3  |------X------->|     6         3
                      |------XC------>|     7         4
                      |------X------->|     7         4
                      |------X------->|     8         4
                      |------X------->|     8         4
                      |------XC------>|     9         5
                      |------X------->|     9         5
                      |------X------->|    10         5
                      |------X------->|    10         5
                      |------XC------>|    11         6
                      |------X------->|    11         6
                      |------X------->|    12         6
                      |      .        |
                      |      :        |
        

Figure 1: Credits in Slow Start (with an initial window of 3)
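
The sequence in Figure 1 can be reproduced with a small simulation. The sketch below is an illustration under stated assumptions, not the algorithm of this document: it counts packets rather than bytes, assumes one ACK per packet and no losses, lets each ACK release two new packets, and uses the hypothetical rule "set C whenever twice the credits sent so far is less than the packets in flight". The function name slow_start_credits is made up for the example.

```python
def slow_start_credits(initial_window: int = 3, rtts: int = 3) -> list:
    """Return one (rtt, flags, in_flight, credits) row per packet sent."""
    rows = []
    in_flight = 0
    credits = 0

    def send(rtt: int) -> None:
        nonlocal in_flight, credits
        in_flight += 1
        flags = "X"
        # Keep the credits at no less than half the packets in flight.
        if 2 * credits < in_flight:
            credits += 1
            flags = "XC"
        rows.append((rtt, flags, in_flight, credits))

    # RTT 1: the initial window is sent back to back.
    for _ in range(initial_window):
        send(1)

    # Later RTTs: every outstanding packet is ACKed once, and each ACK
    # releases two new packets (exponential growth in Slow Start).
    for rtt in range(2, rtts + 1):
        acks = in_flight
        for _ in range(acks):
            in_flight -= 1
            send(rtt)
            send(rtt)
    return rows


if __name__ == "__main__":
    for rtt, flags, in_flight, credits in slow_start_credits():
        print(f"RTT{rtt}  {flags:<2}  in_flight={in_flight:<3} credits={credits}")
```

Under these assumptions, the first and third packets of the initial window carry the C flag, and afterwards every fourth packet does, matching the figure.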

It is possible that a TCP flow will encounter an audit function without relevant flow state due to, e.g., rerouting or memory limitations. Therefore, the sender needs to detect this case and resend credits. A ConEx sender might reset the credit counter CSC to zero if losses occur in subsequent RTTs (assuming that the sending rate was correctly reduced based on the received congestion signal and using a conservatively large RTT estimation).

This section proposes a concrete algorithm for determining how much credit to signal (with a separate approach used for Slow Start). However, experimentation in credit setting algorithms is expected and encouraged. The wider goal of ConEx is to reflect the "cost" of the risk of causing congestion on those that contribute most to it. Thus, experimentation is encouraged to improve or maintain performance while reducing the risk of causing congestion and, therefore, potentially reducing the need to signal so much credit.

5. Loss of ConEx Information

Packets carrying ConEx Signals could be discarded themselves. This will be a second order problem (e.g., if the loss probability is 0.1%, the probability of losing a ConEx L signal will be 0.1% of 0.1% = 0.0001%). Further, the penalty an audit induces should be proportional to the mismatch of expected ConEx marks and observed congestion; therefore, the audit might only slightly increase the loss level of this flow. An implementer MAY thus choose to ignore this problem, accepting instead the risk that an audit function might wrongly penalize a flow.

Nonetheless, a ConEx sender is responsible for always signaling sufficient congestion feedback, and therefore SHOULD remember which packet was marked with either the L, the E, or the C flag. If one of these packets is detected as lost, the sender SHOULD increase the respective gauge(s), LEG or CEG, by the number of lost payload bytes in addition to increasing LEG for the loss.
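
One way to remember which packets carried ConEx marks, and to re-signal them after a loss, is sketched below. The dictionary layout, the key names "LEG" and "CEG", and the function on_loss_detected are assumptions for illustration. The text names no gauge to increase for a lost C-marked packet, so the sketch does nothing for the C flag.

```python
def on_loss_detected(gauges: dict, marked: dict, seq: int,
                     payload_bytes: int) -> dict:
    """Update the gauges when the packet with sequence number seq is lost.

    gauges: {"LEG": int, "CEG": int}, in bytes (hypothetical layout).
    marked: maps a sequence number to the set of ConEx flags it carried.
    """
    # The loss itself is a congestion event to be exposed with L.
    gauges["LEG"] += payload_bytes

    # If the lost packet carried ConEx marks, those signals never reached
    # the audit, so the respective gauge is increased again.
    flags = marked.pop(seq, set())
    if "L" in flags:
        gauges["LEG"] += payload_bytes
    if "E" in flags:
        gauges["CEG"] += payload_bytes
    # A lost C mark is deliberately left unhandled in this sketch.
    return gauges
```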

6. Timeliness of the ConEx Signals

ConEx Signals will only be useful to a network node within a time delay of about one RTT after the congestion occurred. To avoid further delays, a ConEx sender SHOULD send the ConEx signaling on the next available packet.

Any or all of the ConEx flags can be used in the same packet, which allows delays to be minimized when multiple signals are pending. The need to set multiple ConEx flags at the same time can occur if, e.g., the sender receives an ACK that simultaneously indicates that at least one ECN mark was received and that one or more segments were lost. This may happen during excessive congestion, if the queues overflow even though ECN was used and currently all forwarded packets are marked, while others have to be dropped. Another case when this might happen is when ACKs are lost, so that a subsequent ACK carries summary information not previously available to the sender.

If a flow becomes application-limited, there could be insufficient bytes to send to reduce the gauges to zero or below. In such cases, the sender cannot help but delay ConEx Signals. Nonetheless, as long as the sender is marking all outgoing packets, an audit function is unlikely to penalize ConEx-Marked packets. Therefore, no matter how long a gauge has been positive, a sender MUST NOT reduce the gauge by more than the ConEx-Marked bytes it has sent.

If the CEG or LEG counter is negative, the respective counter MAY be reset to zero within one RTT after it was decreased the last time, or one RTT after recovery if no further congestion occurred.

7. Open Areas for Experimentation

All proposed mechanisms in this document are experimental, and therefore further large-scale experimentation on the Internet is required to evaluate if the signaling provided by these mechanisms is accurate and timely enough to produce value for ConEx-based (traffic management or other) mechanisms.

The current ConEx specifications assume that congestion is counted in the number of bytes (including the IP header that directly encapsulates the CDO and everything that the IP header encapsulates) [RFC7837]. This decision was taken because most network devices today experience byte-congestion where the memory is filled exactly with the number of bytes a packet carries [RFC7141]. However, there are also devices that may allocate a certain amount of memory per packet, no matter how large a packet is. These devices get congested based on the number of packets in their memory and therefore, in this case, congestion is determined by the number of packets that have been lost or marked. Furthermore, a transport-layer endpoint, such as a TCP sender or receiver, might not know the exact number of bytes that a lower layer was carrying. Therefore, a TCP endpoint may only be able to estimate the number of congested bytes (assuming that all lower-layer headers have the same length). Whether this estimate is sufficient for the ConEx Signal needs to be further evaluated in tests on the Internet together with different auditor implementations.

Further, the proposed marking schemes in this document are designed under the assumption that all TCP packets of a ConEx-capable flow are of equal size or that flows have a constant mean packet size over a rather small time frame, like one RTT or less. Most implementations might make this assumption as well, and it is probably true for most traffic flows. If this proposed scheme is used, it is necessary to evaluate how much accuracy degrades if this precondition is not met. Evaluating with real traffic from different applications is especially important in making the decision regarding whether the proposed schemes are sufficient or whether a more complex scheme is needed.

In this context, the proposed scheme to set credit markings in Slow Start runs the risk of providing an insufficient number of markings, which can cause an audit function to penalize this flow. Both the proposed credit scheme for Slow Start as well as the scheme in Congestion Avoidance must be evaluated together with one or more specific implementations of a ConEx auditor to ensure that both algorithms, in the sender and in the auditor, work properly together with a low risk of false positives (which would lead to penalization of an honest sender). However, if a sender is wrongly assumed to cheat, the penalization of the audit should be adequate and should allow an honest sender using a congestion control scheme that is commonly used today to recover quickly.

Another open issue is the accuracy of the ECN feedback signal. At the time of this document's publication, there is no AccECN mechanism specified yet, and, further, AccECN will take some time to be widely deployed. This document proposes an advanced compatibility mode for classic ECN. The proposed mechanism can provide more accurate feedback by utilizing the way classic ECN is specified but has a higher risk of losing information. To figure out how high this risk is in a real deployment scenario, further experimental evaluation is needed. The following argument, which should be further confirmed in experimentation, is intended to show that suppressing repetitions of ECE is nevertheless safe against possible congestion collapse due to lost congestion feedback:

Repetition of ECE in classic ECN is intended to ensure reliable delivery of congestion feedback. However, with advanced compatibility mode, it is possible to miss congestion notifications. This can happen in some implementations if delayed acknowledgements are used. Further, an ACK containing ECE can simply get lost. If only a few CE marks are received within one congestion event (e.g., only one), the loss of one acknowledgement due to (heavy) congestion on the reverse path can prevent any congestion notification from being received by the sender.

However, if loss of feedback exacerbates congestion on the forward path, more forward packets will be CE-marked, increasing the likelihood that feedback from at least one CE will get through per RTT. As long as one ECE reaches the sender per RTT, the sender's congestion response will be the same as if CWR were not continuous. The only way that heavy congestion on the forward path could be completely hidden would be if all ACKs on the reverse path were lost. If total ACK loss persisted, the sender would time out and do a congestion response anyway. Therefore, the problem seems confined to potential suppression of a congestion response during light congestion.

Furthermore, even if loss of all ECN feedback leads to no congestion response, the worst that could happen would be loss instead of ECN-signaled congestion on the forward path. Given that compatibility mode does not affect loss feedback, there would be no risk of congestion collapse.

8. Security Considerations

General ConEx security considerations are covered extensively in the ConEx abstract mechanism [RFC7713]. This section covers TCP-specific concerns that may occur with the addition of ConEx to TCP (while not discussing generally well-known attacks against TCP). It is assumed that any altering of ConEx information can be detected by protection mechanisms in the IP layer and is, therefore, not discussed here but in [RFC7837]. Further, [RFC7837] describes how to use ConEx to mitigate flooding attacks by using preferential drop where the use of ConEx can even increase security.

The ConEx modifications to TCP provide no mechanism for a receiver to force a sender not to use ConEx. A receiver can degrade the accuracy of ConEx by claiming that it does not support SACK, AccECN, or ECN, but the sender will never have to turn ConEx off. Further, the receiver cannot force the sender to have to mark ConEx more conservatively, in order to cover the risk of any inaccuracy. Instead, it is always the sender's choice to either mark very conservatively, which ensures that the audit always sees enough markings to not penalize the flow, or estimate the needed number of markings more tightly. This second case can lead to inaccurate marking, and therefore increases the likelihood of loss at an audit function that will only harm the receiver itself.

Assuming the sender is limited in some way by a congestion allowance or quota, a receiver could spoof more loss or ECN congestion feedback than it actually experiences, in an attempt to make the sender draw down its allowance faster than necessary. However, over-declaring congestion simply makes the sender slow down. If the receiver is interested in the content, it will not want to harm its own performance.

However, if the receiver is solely interested in making the sender draw down its allowance, the net effect will depend on the sender's congestion control algorithm, as permanently adding more and more additional congestion would cause the sender to reduce its sending rate more and more. Therefore, a receiver can only maintain a certain congestion level that corresponds to a certain sending rate. With NewReno [RFC6582], doubling congestion feedback causes the sender to reduce its sending rate such that it would only consume sqrt(2) = 1.4 times more congestion allowance. However, to improve scaling, congestion control algorithms are tending towards less responsive algorithms like Cubic or Compound TCP, and ultimately to linear algorithms like Data Center TCP (DCTCP) [DCTCP] that aim to maintain the same congestion level independent of the current sending rate and always reduce the sending window if the signaled congestion feedback is higher. In each of these cases (Cubic, Compound TCP, and DCTCP), if the receiver doubles congestion feedback, it causes the sender to consume more allowance by a factor of 1.2, 1.15, or 1, respectively, where 1 implies the attack has become completely ineffective as no further congestion allowance is consumed but the flow will decrease its sending rate to a minimum instead.
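
The factors quoted above are consistent with a simple power-law model of the sender's response, sketched here as a plausibility check. The model and its exponents are assumptions that are not stated in this document: the sending rate is taken to scale as p^(-B) for a congestion level p, so the congestion allowance consumed per unit time (p times the rate) scales as p^(1-B). The values of B used below (0.5, 0.75, 0.8, and 1) are commonly cited approximations for NewReno, Cubic, Compound TCP, and DCTCP.

```python
# Assumed response-function exponents B (rate ~ p**-B); illustrative only.
RESPONSE_EXPONENTS = {
    "NewReno": 0.5,
    "Cubic": 0.75,
    "Compound TCP": 0.8,
    "DCTCP": 1.0,
}


def allowance_factor(exponent: float, feedback_multiplier: float = 2.0) -> float:
    """Factor by which consumed allowance grows when feedback is multiplied.

    Consumed allowance ~ p * rate ~ p**(1 - exponent) under the assumed model.
    """
    return feedback_multiplier ** (1.0 - exponent)


if __name__ == "__main__":
    for name, b in RESPONSE_EXPONENTS.items():
        print(f"{name}: {allowance_factor(b):.2f}")
```

Under this model, the closer B is to 1, the less a receiver gains by over-declaring congestion, which is the trend the paragraph above describes.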

9. References
9.1. Normative References

[RFC2018] Mathis, M., Mahdavi, J., Floyd, S., and A. Romanow, "TCP Selective Acknowledgment Options", RFC 2018, DOI 10.17487/RFC2018, October 1996, <http://www.rfc-editor.org/info/rfc2018>.

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, <http://www.rfc-editor.org/info/rfc2119>.

[RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition of Explicit Congestion Notification (ECN) to IP", RFC 3168, DOI 10.17487/RFC3168, September 2001, <http://www.rfc-editor.org/info/rfc3168>.

[RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion Control", RFC 5681, DOI 10.17487/RFC5681, September 2009, <http://www.rfc-editor.org/info/rfc5681>.

[RFC7713] Mathis, M. and B. Briscoe, "Congestion Exposure (ConEx) Concepts, Abstract Mechanism, and Requirements", RFC 7713, DOI 10.17487/RFC7713, December 2015, <http://www.rfc-editor.org/info/rfc7713>.

[RFC7837] Krishnan, S., Kuehlewind, M., Briscoe, B., and C. Ralli, "IPv6 Destination Option for Congestion Exposure (ConEx)", RFC 7837, DOI 10.17487/RFC7837, May 2016, <http://www.rfc-editor.org/info/rfc7837>.

9.2. Informative References

[ACCURATE] Briscoe, B., Kuehlewind, M., and R. Scheffenegger, "More Accurate ECN Feedback in TCP", Work in Progress, draft-ietf-tcpm-accurate-ecn-00, December 2015.

[DCTCP] Alizadeh, M., Greenberg, A., Maltz, D., Padhye, J., Patel, P., Prabhakar, B., Sengupta, S., and M. Sridharan, "Data Center TCP (DCTCP)", ACM SIGCOMM Computer Communication Review, Volume 40, Issue 4, pages 63-74, DOI 10.1145/1851182.1851192, October 2010, <http://portal.acm.org/citation.cfm?id=1851192>.

[ECNTCP] Briscoe, B., Jacquet, A., Moncaster, T., and A. Smith, "Re-ECN: Adding Accountability for Causing Congestion to TCP/IP", Work in Progress, draft-briscoe-conex-re-ecn-tcp-04, July 2014.

[RFC3522] Ludwig, R. and M. Meyer, "The Eifel Detection Algorithm for TCP", RFC 3522, DOI 10.17487/RFC3522, April 2003, <http://www.rfc-editor.org/info/rfc3522>.

[RFC3708] Blanton, E. and M. Allman, "Using TCP Duplicate Selective Acknowledgement (DSACKs) and Stream Control Transmission Protocol (SCTP) Duplicate Transmission Sequence Numbers (TSNs) to Detect Spurious Retransmissions", RFC 3708, DOI 10.17487/RFC3708, February 2004, <http://www.rfc-editor.org/info/rfc3708>.

[RFC4015] Ludwig, R. and A. Gurtov, "The Eifel Response Algorithm for TCP", RFC 4015, DOI 10.17487/RFC4015, February 2005, <http://www.rfc-editor.org/info/rfc4015>.

[RFC5682] Sarolahti, P., Kojo, M., Yamamoto, K., and M. Hata, "Forward RTO-Recovery (F-RTO): An Algorithm for Detecting Spurious Retransmission Timeouts with TCP", RFC 5682, DOI 10.17487/RFC5682, September 2009, <http://www.rfc-editor.org/info/rfc5682>.

[RFC6582] Henderson, T., Floyd, S., Gurtov, A., and Y. Nishida, "The NewReno Modification to TCP's Fast Recovery Algorithm", RFC 6582, DOI 10.17487/RFC6582, April 2012, <http://www.rfc-editor.org/info/rfc6582>.

[RFC6789] Briscoe, B., Ed., Woundy, R., Ed., and A. Cooper, Ed., "Congestion Exposure (ConEx) Concepts and Use Cases", RFC 6789, DOI 10.17487/RFC6789, December 2012, <http://www.rfc-editor.org/info/rfc6789>.

[RFC7141] Briscoe, B. and J. Manner, "Byte and Packet Congestion Notification", BCP 41, RFC 7141, DOI 10.17487/RFC7141, February 2014, <http://www.rfc-editor.org/info/rfc7141>.

Acknowledgements

The authors would like to thank Bob Briscoe, who contributed the initial ideas [ECNTCP] and valuable feedback. Moreover, thanks to Jana Iyengar, who also provided valuable feedback.

Authors' Addresses

Mirja Kuehlewind (editor) ETH Zurich Switzerland

   Email: mirja.kuehlewind@tik.ee.ethz.ch
        

Richard Scheffenegger NetApp, Inc. Am Euro Platz 2 Vienna 1120 Austria

   Email: rs.ietf@gmx.at
        