Internet Engineering Task Force (IETF)                           V. Hilt
Request for Comments: 6357                      Bell Labs/Alcatel-Lucent
Category: Informational                                          E. Noel
ISSN: 2070-1721                                                AT&T Labs
                                                                 C. Shen
                                                     Columbia University
                                                              A. Abdelal
                                                          Sonus Networks
                                                             August 2011
        

Design Considerations for Session Initiation Protocol (SIP) Overload Control

Abstract

Overload occurs in Session Initiation Protocol (SIP) networks when SIP servers have insufficient resources to handle all SIP messages they receive. Even though the SIP protocol provides a limited overload control mechanism through its 503 (Service Unavailable) response code, SIP servers are still vulnerable to overload. This document discusses models and design considerations for a SIP overload control mechanism.

Status of This Memo

This document is not an Internet Standards Track specification; it is published for informational purposes.

This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Not all documents approved by the IESG are a candidate for any level of Internet Standard; see Section 2 of RFC 5741.

Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at http://www.rfc-editor.org/info/rfc6357.

Copyright Notice

Copyright (c) 2011 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  SIP Overload Problem
   3.  Explicit vs. Implicit Overload Control
   4.  System Model
   5.  Degree of Cooperation
     5.1.  Hop-by-Hop
     5.2.  End-to-End
     5.3.  Local Overload Control
   6.  Topologies
   7.  Fairness
   8.  Performance Metrics
   9.  Explicit Overload Control Feedback
     9.1.  Rate-Based Overload Control
     9.2.  Loss-Based Overload Control
     9.3.  Window-Based Overload Control
     9.4.  Overload Signal-Based Overload Control
     9.5.  On-/Off Overload Control
   10. Implicit Overload Control
   11. Overload Control Algorithms
   12. Message Prioritization
   13. Operational Considerations
   14. Security Considerations
   15. Informative References
   Appendix A.  Contributors
        
1. Introduction

As with any network element, a Session Initiation Protocol (SIP) [RFC3261] server can suffer from overload when the number of SIP messages it receives exceeds the number of messages it can process. Overload occurs if a SIP server does not have sufficient resources to process all incoming SIP messages. These resources may include CPU, memory, input/output, or disk resources.

Overload can pose a serious problem for a network of SIP servers. During periods of overload, the throughput of SIP messages in a network of SIP servers can be significantly degraded. In fact, overload in a SIP server may lead to a situation in which the overload is amplified by retransmissions of SIP messages causing the throughput to drop down to a very small fraction of the original processing capacity. This is often called congestion collapse.

An overload control mechanism enables a SIP server to process SIP messages close to its capacity limit during times of overload. Overload control is used by a SIP server if it is unable to process all SIP requests due to resource constraints. There are other failure cases in which a SIP server can successfully process incoming requests but has to reject them for other reasons. For example, a Public Switched Telephone Network (PSTN) gateway that runs out of trunk lines but still has plenty of capacity to process SIP messages should reject incoming INVITEs using a response such as 488 (Not Acceptable Here), as described in [RFC4412]. Similarly, a SIP registrar that has lost connectivity to its registration database but is still capable of processing SIP messages should reject REGISTER requests with a 500 (Server Error) response [RFC3261]. Overload control mechanisms do not apply in these cases and SIP provides appropriate response codes for them.

There are cases in which a SIP server runs other services that do not involve the processing of SIP messages (e.g., processing of RTP packets, database queries, software updates, and event handling). These services may, or may not, be correlated with the SIP message volume. These services can use up a substantial share of resources available on the server (e.g., CPU cycles) and leave the server in a condition where it is unable to process all incoming SIP requests. In these cases, the SIP server applies SIP overload control mechanisms to avoid congestion collapse on the SIP signaling plane. However, controlling the number of SIP requests may not significantly reduce the load on the server if the resource shortage was created by another service. In these cases, it is to be expected that the server uses appropriate methods of controlling the resource usage of other services. The specifics of controlling the resource usage of other services and their coordination are out of scope for this document.

The SIP protocol provides a limited mechanism for overload control through its 503 (Service Unavailable) response code and the Retry-After header. However, this mechanism cannot prevent overload of a SIP server and it cannot prevent congestion collapse. In fact, it may cause traffic to oscillate and to shift between SIP servers and thereby worsen an overload condition. A detailed discussion of the SIP overload problem, the problems with the 503 (Service Unavailable) response code and the Retry-After header, and the requirements for a SIP overload control mechanism can be found in [RFC5390]. In addition, 503 is used for other situations, not just SIP server overload. A SIP overload control process based on 503 would have to specify exactly which cause values trigger the overload control.

This document discusses the models, assumptions, and design considerations for a SIP overload control mechanism. The document originated in the SIP overload control design team and has been further developed by the SIP Overload Control (SOC) working group.

2. SIP Overload Problem

A key contributor to SIP congestion collapse [RFC5390] is the regenerative behavior of overload in the SIP protocol. When SIP is running over the UDP protocol, it will retransmit messages that were dropped or excessively delayed by a SIP server due to overload and thereby increase the offered load for the already overloaded server. This increase in load worsens the severity of the overload condition and, in turn, causes more messages to be dropped. A congestion collapse can occur [Hilt] [Noel] [Shen] [Abdelal].

Regenerative behavior under overload should ideally be avoided by any protocol as this would lead to unstable operation under overload. However, this is often difficult to achieve in practice. For example, changing the SIP retransmission timer mechanisms can reduce the degree of regeneration during overload but will impact the ability of SIP to recover from message losses. Without any retransmission, each message that is dropped due to SIP server overload will eventually lead to a failed transaction.

For a SIP INVITE transaction to be successful, a minimum of three messages need to be forwarded by a SIP server. Often an INVITE transaction consists of five or more SIP messages. If a SIP server under overload randomly discards messages without evaluating them, the chances that all messages belonging to a transaction are successfully forwarded will decrease as the load increases. Thus, the number of transactions that complete successfully will decrease even if the message throughput of a server remains up, assuming the overload behavior is fully non-regenerative. A SIP server might (partially) parse incoming messages to determine whether each is a new request or a message belonging to an existing transaction. Discarding a SIP message after spending the resources to parse it is expensive. The number of successful transactions will therefore decline with an increase in load as fewer resources can be spent on forwarding messages and more resources are consumed by inspecting messages that will eventually be dropped. The rate of the decline depends on the amount of resources spent to inspect each message.

Another challenge for SIP overload control is controlling the rate of the true traffic source. Overload is often caused by a large number of user agents (UAs), each of which creates only a single message. However, the sum of their traffic can overload a SIP server. The overload mechanisms suitable for controlling a SIP server (e.g., rate control) may not be effective for individual UAs. In some cases, there are other non-SIP mechanisms for limiting the load from the UAs. These may operate independently from, or in conjunction with, the SIP overload mechanisms described here. In either case, they are out of scope for this document.

3. Explicit vs. Implicit Overload Control

The main difference between explicit and implicit overload control is the way overload is signaled from a SIP server that is reaching overload condition to its upstream neighbors.

In an explicit overload control mechanism, a SIP server uses an explicit overload signal to indicate that it is reaching its capacity limit. Upstream neighbors receiving this signal can adjust their transmission rate according to the overload signal to a level that is acceptable to the downstream server. The overload signal enables a SIP server to steer the load it is receiving to a rate at which it can perform at maximum capacity.

Implicit overload control uses the absence of responses and packet loss as an indication of overload. A SIP server that is sensing such a condition reduces the load it is forwarding to a downstream neighbor. Since there is no explicit overload signal, this mechanism is robust, as it does not depend on actions taken by the SIP server running into overload.

The ideas of explicit and implicit overload control are in fact complementary. By considering implicit overload indications, a server can avoid overloading an unresponsive downstream neighbor. An explicit overload signal enables a SIP server to actively steer the incoming load to a desired level.

4. System Model

The model shown in Figure 1 identifies fundamental components of an explicit SIP overload control mechanism:

SIP Processor: The SIP Processor processes SIP messages and is the component that is protected by overload control.

Monitor: The Monitor measures the current load of the SIP Processor on the receiving entity. It implements the mechanisms needed to determine the current usage of resources relevant for the SIP Processor and reports load samples (S) to the Control Function.

Control Function: The Control Function implements the overload control algorithm. The Control Function uses the load samples (S) and determines if overload has occurred and a throttle (T) needs to be set to adjust the load sent to the SIP Processor on the receiving entity. The Control Function on the receiving entity sends load feedback (F) to the sending entity.

Actuator: The Actuator implements the algorithms needed to act on the throttles (T) and ensures that the amount of traffic forwarded to the receiving entity meets the criteria of the throttle. For example, a throttle may instruct the Actuator to not forward more than 100 INVITE messages per second. The Actuator implements the algorithms to achieve this objective, e.g., using message gapping. It also implements algorithms to select the messages that will be affected and determine whether they are rejected or redirected.
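
Taken together, these components form a control loop between the sending and the receiving entity. The following sketch is purely illustrative (the class and method names are hypothetical and not defined by this document); it assumes loss-rate feedback as described in Section 9.2, so that the feedback (F) can be copied directly into the throttle (T).

      # Illustrative sketch of the Figure 1 components; all names are hypothetical.
      import random

      class Monitor:
          """Receiving entity: reports load samples (S) as a utilization in [0, 1]."""
          def __init__(self):
              self.utilization = 0.0
          def sample(self):
              return self.utilization

      class ControlFunction:
          """Receiving entity: turns load samples (S) into feedback (F)."""
          def __init__(self, monitor, target=0.8):
              self.monitor, self.target = monitor, target
          def feedback(self):
              s = self.monitor.sample()
              # Ask senders to drop the share of traffic above the target load.
              return 0.0 if s <= self.target else 1.0 - self.target / s

      class Actuator:
          """Sending entity: enforces the throttle (T) derived from feedback (F)."""
          def __init__(self):
              self.loss = 0.0          # fraction of requests to reject or redirect
          def update(self, feedback):
              self.loss = feedback     # for loss-rate feedback, (F) becomes (T)
          def admit(self, request):
              return random.random() >= self.loss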

The type of feedback (F) conveyed from the receiving to the sending entity depends on the overload control method used (i.e., loss-based, rate-based, window-based, or signal-based overload control; see Section 9), the overload control algorithm (see Section 11), as well as other design parameters. The feedback (F) enables the sending entity to adjust the amount of traffic forwarded to the receiving entity to a level that is acceptable to the receiving entity without causing overload.

Figure 1 depicts a general system model for overload control. In this diagram, one instance of the control function is on the sending entity (i.e., associated with the Actuator) and one is on the receiving entity (i.e., associated with the Monitor). However, a specific mechanism may not require both elements. In this case, one of the two control function elements can be empty and simply pass along feedback. For example, if (F) is defined as a loss-rate (e.g., reduce traffic by 10%), there is no need for a control function on the sending entity as the content of (F) can be copied directly into (T).

The model in Figure 1 shows a scenario with one sending and one receiving entity. In a more realistic scenario, a receiving entity will receive traffic from multiple sending entities and vice versa (see Section 6). The feedback generated by a Monitor will therefore often be distributed across multiple Actuators. A Monitor needs to be able to split the load it can process across multiple sending entities and generate feedback that correctly adjusts the load each sending entity is allowed to send. Similarly, an Actuator needs to be prepared to receive different levels of feedback from different receiving entities and throttle traffic to these entities accordingly.

In a realistic deployment, SIP messages will flow in both directions, from server B to server A as well as server A to server B. The overload control mechanisms in each direction can be considered independently. For messages flowing from server A to server B, the sending entity is server A and the receiving entity is server B, and vice versa. The control loops in both directions operate independently.

             Sending                Receiving
              Entity                  Entity
        +----------------+      +----------------+
        |    Server A    |      |    Server B    |
        |  +----------+  |      |  +----------+  |    -+
        |  | Control  |  |  F   |  | Control  |  |     |
        |  | Function |<-+------+--| Function |  |     |
        |  +----------+  |      |  +----------+  |     |
        |     T |        |      |       ^        |     | Overload
        |       v        |      |       | S      |     | Control
        |  +----------+  |      |  +----------+  |     |
        |  | Actuator |  |      |  | Monitor  |  |     |
        |  +----------+  |      |  +----------+  |     |
        |       |        |      |       ^        |    -+
        |       v        |      |       |        |    -+
        |  +----------+  |      |  +----------+  |     |
      <-+--|   SIP    |  |      |  |   SIP    |  |     |  SIP
      --+->|Processor |--+------+->|Processor |--+->   | System
        |  +----------+  |      |  +----------+  |     |
        +----------------+      +----------------+    -+
        

Figure 1: System Model for Explicit Overload Control

5. Degree of Cooperation

A SIP request is usually processed by more than one SIP server on its path to the destination. Thus, a design choice for an explicit overload control mechanism is where to place the components of overload control along the path of a request and, in particular, where to place the Monitor and Actuator. This design choice determines the degree of cooperation between the SIP servers on the path. Overload control can be implemented hop-by-hop with the Monitor on one server and the Actuator on its direct upstream neighbor. Overload control can be implemented end-to-end with Monitors on all SIP servers along the path of a request and an Actuator on the sender. In this case, the Control Functions associated with each Monitor have to cooperate to jointly determine the overall feedback for this path. Finally, overload control can be implemented locally on a SIP server if the Monitor and Actuator reside on the same server. In this case, the sending entity and receiving entity are the same SIP server, and the Actuator and Monitor operate on the same SIP Processor (although, the Actuator typically operates on a pre-processing stage in local overload control). Local overload control is an internal overload control mechanism, as the control loop is implemented internally on one server. Hop-by-hop and end-to-end are external overload control mechanisms. All three configurations are shown in Figure 2.

                  +---------+             +------(+)---------+
         +------+ |         |             |       ^          |
         |      | |        +---+          |       |         +---+
         v      | v    //=>| C |          v       |     //=>| C |
      +---+    +---+ //    +---+       +---+    +---+ //    +---+
      | A |===>| B |                   | A |===>| B |
      +---+    +---+ \\    +---+       +---+    +---+ \\    +---+
                  ^    \\=>| D |          ^       |     \\=>| D |
                  |        +---+          |       |         +---+
                  |         |             |       v          |
                  +---------+             +------(+)---------+
        

(a) hop-by-hop (b) end-to-end

                            +-+
                            v |
       +-+      +-+        +---+
       v |      v |    //=>| C |
      +---+    +---+ //    +---+
      | A |===>| B |
      +---+    +---+ \\    +---+
                       \\=>| D |
                           +---+
                            ^ |
                            +-+
        

(c) local

       ==> SIP request flow
       <-- Overload feedback loop
        

Figure 2: Degree of Cooperation between Servers

5.1. Hop-by-Hop

The idea of hop-by-hop overload control is to instantiate a separate control loop between all neighboring SIP servers that directly exchange traffic. That is, the Actuator is located on the SIP server that is the direct upstream neighbor of the SIP server that has the corresponding Monitor. Each control loop between two servers is completely independent of the control loop between other servers further up- or downstream. In the example in Figure 2(a), three independent overload control loops are instantiated: A - B, B - C, and B - D. Each loop only controls a single hop. Overload feedback received from a downstream neighbor is not forwarded further upstream. Instead, a SIP server acts on this feedback, for example, by rejecting SIP messages if needed. If the upstream neighbor of a server also becomes overloaded, it will report this problem to its upstream neighbors, which again take action based on the reported feedback. Thus, in hop-by-hop overload control, overload is always resolved by the direct upstream neighbors of the overloaded server without the need to involve entities that are located multiple SIP hops away.

Hop-by-hop overload control reduces the impact of overload on a SIP network and can avoid congestion collapse. It is simple and scales well to networks with many SIP entities. An advantage is that it does not require feedback to be transmitted across multiple-hops, possibly crossing multiple trust domains. Feedback is sent to the next hop only. Furthermore, it does not require a SIP entity to aggregate a large number of overload status values or keep track of the overload status of SIP servers it is not communicating with.

5.2. End-to-End

End-to-end overload control implements an overload control loop along the entire path of a SIP request, from user agent client (UAC) to user agent server (UAS). An end-to-end overload control mechanism consolidates overload information from all SIP servers on the way (including all proxies and the UAS) and uses this information to throttle traffic as far upstream as possible. An end-to-end overload control mechanism has to be able to frequently collect the overload status of all servers on the potential path(s) to a destination and combine this data into meaningful overload feedback.

A UA or SIP server only throttles requests if it knows that these requests will eventually be forwarded to an overloaded server. For example, if D is overloaded in Figure 2(b), A should only throttle requests it forwards to B when it knows that they will be forwarded to D. It should not throttle requests that will eventually be forwarded to C, since server C is not overloaded. In many cases, it is difficult for A to determine which requests will be routed to C and D, since this depends on the local routing decision made by B. These routing decisions can be highly variable and, for example, depend on call-routing policies configured by the user, services invoked on a call, load-balancing policies, etc. A previous message to a target that has been routed through an overloaded server does not necessarily mean that the next message to this target will also be routed through the same server.

The main problem of end-to-end overload control is its inherent complexity, since UAC or SIP servers need to monitor all potential paths to a destination in order to determine which requests should be throttled and which requests may be sent. Even if this information is available, it is not clear which path a specific request will take.

A variant of end-to-end overload control is to implement a control loop between a set of well-known SIP servers along the path of a SIP request. For example, an overload control loop can be instantiated between a server that only has one downstream neighbor or a set of closely coupled SIP servers. A control loop spanning multiple hops can be used if the sending entity has full knowledge about the SIP servers on the path of a SIP message.

Overload control for SIP servers is different from end-to-end congestion control used by transport protocols such as TCP. The traffic exchanged between SIP servers consists of many individual SIP messages. Each SIP message is created by a SIP UA to achieve a specific goal (e.g., to start setting up a call). All messages have their own source and destination addresses. Even SIP messages containing identical SIP URIs (e.g., a SUBSCRIBE and an INVITE message to the same SIP URI) can be routed to different destinations. This is different from TCP, where the traffic exchanged between routers consists of packets belonging to a usually longer flow of messages exchanged between a source and a destination (e.g., to transmit a file). If congestion occurs, the sources can detect this condition and adjust the rate at which the next packets are transmitted.

5.3. Local Overload Control

The idea of local overload control (see Figure 2(c)) is to run the Monitor and Actuator on the same server. This enables the server to monitor the current resource usage and to reject messages that can't be processed without overusing local resources. The fundamental assumption behind local overload control is that it is less resource consuming for a server to reject messages than to process them. A server can therefore reject the excess messages it cannot process to stop all retransmissions of these messages. Since rejecting messages does consume resources on a SIP server, local overload control alone cannot prevent a congestion collapse.

Local overload control can be used in conjunction with other overload control mechanisms and provides an additional layer of protection against overload. It is fully implemented within a SIP server and does not require cooperation between servers. In general, SIP servers should apply other overload control techniques to control load before a local overload control mechanism is activated as a mechanism of last resort.

6. Topologies

The following topologies describe four generic SIP server configurations. These topologies illustrate specific challenges for an overload control mechanism. An actual SIP server topology is likely to consist of combinations of these generic scenarios.

In the "load balancer" configuration shown in Figure 3(a), a set of SIP servers (D, E, and F) receives traffic from a single source A. A load balancer is a typical example for such a configuration. In this configuration, overload control needs to prevent server A (i.e., the load balancer) from sending too much traffic to any of its downstream neighbors D, E, and F. If one of the downstream neighbors becomes overloaded, A can direct traffic to the servers that still have capacity. If one of the servers acts as a backup, it can be activated once one of the primary servers reaches overload.
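
A purely hypothetical sketch of this behavior is shown below; the neighbor names and overload flags are illustrative and would, in practice, be driven by the overload feedback loop of Figure 1.

      import itertools

      # Hypothetical sketch: server A forwards each new request to the next
      # downstream neighbor (D, E, or F) that has not signaled overload.
      neighbors = ["D", "E", "F"]
      overloaded = {"D": False, "E": True, "F": False}   # assumed state
      ring = itertools.cycle(neighbors)

      def pick_target():
          for _ in range(len(neighbors)):
              candidate = next(ring)
              if not overloaded[candidate]:
                  return candidate
          return None   # all downstream neighbors are overloaded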

If A can reliably determine that D, E, and F are its only downstream neighbors and all of them are in overload, it may choose to report overload upstream on behalf of D, E, and F. However, if the set of downstream neighbors is not fixed or only some of them are in overload, then A should not activate an overload control since A can still forward the requests destined to non-overloaded downstream neighbors. These requests would be throttled as well if A would use overload control towards its upstream neighbors.

In some cases, the servers D, E, and F are in a server farm and are configured to appear as a single server to their upstream neighbors. In this case, server A can report overload on behalf of the server farm. If the load balancer is not a SIP entity, servers D, E, and F can report the overall load of the server farm (i.e., the load of the virtual server) in their messages. As an alternative, one of the servers (e.g., server E) can report overload on behalf of the server farm. In this case, not all messages contain overload control information, and all upstream neighbors need to be served by server E periodically to ensure that updated information is received.

In the "multiple sources" configuration shown in Figure 3(b), a SIP server D receives traffic from multiple upstream sources A, B, and C. Each of these sources can contribute a different amount of traffic, which can vary over time. The set of active upstream neighbors of D can change as servers may become inactive, and previously inactive servers may start contributing traffic to D.

If D becomes overloaded, it needs to generate feedback to reduce the amount of traffic it receives from its upstream neighbors. D needs to decide by how much each upstream neighbor should reduce traffic. This decision can require the consideration of the amount of traffic sent by each upstream neighbor and it may need to be re-adjusted as the traffic contributed by each upstream neighbor varies over time. Server D can use a local fairness policy to determine how much traffic it accepts from each upstream neighbor.

In many configurations, SIP servers form a "mesh" as shown in Figure 3(c). Here, multiple upstream servers A, B, and C forward traffic to multiple alternative servers D and E. This configuration is a combination of the "load balancer" and "multiple sources" scenario.

                      +---+              +---+
                   /->| D |              | A |-\
                  /   +---+              +---+  \
                 /                               \   +---+
          +---+-/     +---+              +---+    \->|   |
          | A |------>| E |              | B |------>| D |
          +---+-\     +---+              +---+    /->|   |
                 \                               /   +---+
                  \   +---+              +---+  /
                   \->| F |              | C |-/
                      +---+              +---+
        

(a) load balancer (b) multiple sources

          +---+
          | A |---\                        a--\
          +---+-\  \---->+---+                 \
                 \/----->| D |             b--\ \--->+---+
          +---+--/\  /-->+---+                 \---->|   |
          | B |    \/                      c-------->| D |
          +---+---\/\--->+---+                       |   |
                  /\---->| E |            ...   /--->+---+
          +---+--/   /-->+---+                 /
          | C |-----/                      z--/
          +---+
        

(c) mesh (d) edge proxy

Figure 3: Topologies

Overload control that is based on reducing the number of messages a sender is allowed to send is not suited for servers that receive requests from a very large population of senders, each of which only sends a very small number of requests. This scenario is shown in Figure 3(d). An edge proxy that is connected to many UAs is a typical example for such a configuration. Since each UA typically infrequently sends requests, which are often related to the same session, it can't decrease its message rate to resolve the overload.

A SIP server that receives traffic from many sources, which each contribute only a small number of requests, can resort to local overload control by rejecting a percentage of the requests it receives with 503 (Service Unavailable) responses. Since it has many upstream neighbors, it can send 503 (Service Unavailable) to a fraction of them to gradually reduce load without entirely stopping all incoming traffic. The Retry-After header can be used in 503 (Service Unavailable) responses to ask upstream neighbors to wait a given number of seconds before trying the request again. Using 503 (Service Unavailable) can, however, not prevent overload if a large number of sources create requests (e.g., to place calls) at the same time.
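
A sketch of such local rejection is given below; the rejection fraction and Retry-After value are assumptions made only for illustration, and send_response and forward stand in for the server's actual transaction handling.

      import random

      REJECT_FRACTION = 0.2   # assumed: reject 20% of new requests while overloaded
      RETRY_AFTER = 5         # assumed: seconds suggested to the upstream neighbor

      def handle_new_request(request, send_response, forward):
          # Reject a random fraction of new requests with 503 (Service Unavailable).
          if random.random() < REJECT_FRACTION:
              send_response(request, 503, {"Retry-After": str(RETRY_AFTER)})
          else:
              forward(request)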

Note: The requirements of the "edge proxy" topology are different from the ones of the other topologies, which may require a different method for overload control.

7. Fairness

There are many different ways to define fairness between multiple upstream neighbors of a SIP server. In the context of SIP server overload, it is helpful to describe two categories of fairness: basic fairness and customized fairness. With basic fairness, a SIP server treats all requests equally and ensures that each request has the same chance of succeeding. With customized fairness, the server allocates resources according to different priorities. An example application of the basic fairness criteria is the "Third caller receives free tickets" scenario, where each call attempt should have an equal success probability in connecting through an overloaded SIP server, irrespective of the service provider in which the call was initiated. An example of customized fairness would be a server that assigns different resource allocations to its upstream neighbors (e.g., service providers) as defined in a service level agreement (SLA).
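
The difference between the two notions of fairness can be sketched as two ways of dividing a server's capacity among its upstream neighbors; the capacity, neighbor names, and weights below are illustrative only.

      def basic_fair_caps(capacity, offered_load):
          # Basic fairness: every request has the same chance of being accepted,
          # so each neighbor's offered load is scaled by the same factor.
          total = sum(offered_load.values())
          scale = min(1.0, capacity / total) if total else 1.0
          return {n: rate * scale for n, rate in offered_load.items()}

      def customized_fair_caps(capacity, sla_weights):
          # Customized fairness: capacity is split per (assumed) SLA weights.
          total = sum(sla_weights.values())
          return {n: capacity * w / total for n, w in sla_weights.items()}

      print(basic_fair_caps(100, {"P1": 120, "P2": 40}))    # both scaled by 100/160
      print(customized_fair_caps(100, {"P1": 3, "P2": 1}))  # 75/25 split per SLA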

8. Performance Metrics

The performance of an overload control mechanism can be measured using different metrics.

A key performance indicator is the goodput of a SIP server under overload. Ideally, a SIP server will be enabled to perform at its maximum capacity during periods of overload. For example, if a SIP server has a processing capacity of 140 INVITE transactions per second, then an overload control mechanism should enable it to process 140 INVITEs per second even if the offered load is much higher. The delay introduced by a SIP server is another important indicator. An overload control mechanism should ensure that the delay encountered by a SIP message is not increased significantly during periods of overload. Significantly increased delay can lead to time-outs and retransmission of SIP messages, making the overload worse.

Responsiveness and stability are other important performance indicators. An overload control mechanism should quickly react to an overload occurrence and ensure that a SIP server does not become overloaded, even during sudden peaks of load. Similarly, an overload control mechanism should quickly stop rejecting requests if the overload disappears. Stability is another important criterion. An overload control mechanism should not cause significant oscillations of load on a SIP server. The performance of SIP overload control mechanisms is discussed in [Noel], [Shen], [Hilt], and [Abdelal].

In addition to the above metrics, there are other indicators that are relevant for the evaluation of an overload control mechanism:

Fairness: Which type of fairness does the overload control mechanism implement?

Self-limiting: Is the overload control self-limiting if a SIP server becomes unresponsive?

Changes in neighbor set: How does the mechanism adapt to a changing set of sending entities?

Data points to monitor: Which and how many data points does an overload control mechanism need to monitor?

Computational load: What is the (CPU) load created by the overload "Monitor" and "Actuator"?

9. Explicit Overload Control Feedback

Explicit overload control feedback enables a receiver to indicate how much traffic it wants to receive. Explicit overload control mechanisms can be differentiated based on the type of information conveyed in the overload control feedback and whether the control function is in the receiving or sending entity (receiver- vs. sender-based overload control), or both.

9.1. Rate-Based Overload Control

The key idea of rate-based overload control is to limit the request rate at which an upstream element is allowed to forward traffic to the downstream neighbor. If overload occurs, a SIP server instructs each upstream neighbor to send, at most, X requests per second. Each upstream neighbor can be assigned a different rate cap.

每个上游邻居每秒最多发送X个请求。可以为每个上游邻居分配不同的速率上限。

An example algorithm for an Actuator in the sending entity is request gapping. After transmitting a request to a downstream neighbor, a server waits for 1/X seconds before it transmits the next request to the same neighbor. Requests that arrive during the waiting period are not forwarded and are either redirected, rejected, or buffered. Request gapping only affects requests that are targeted by overload control (e.g., requests that initiate a transaction and not retransmissions in an ongoing transaction).
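
A minimal sketch of such an Actuator is given below, assuming a rate cap of X requests per second toward a single downstream neighbor; a request arriving inside the gap would be rejected, redirected, or buffered as described above.

      import time

      class RequestGapper:
          """Forwards at most rate_cap requests per second to one neighbor."""
          def __init__(self, rate_cap):
              self.gap = 1.0 / rate_cap     # wait 1/X seconds between requests
              self.next_allowed = 0.0
          def admit(self, now=None):
              now = time.monotonic() if now is None else now
              if now >= self.next_allowed:
                  self.next_allowed = now + self.gap
                  return True               # forward this request
              return False                  # reject, redirect, or buffer it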

The rate cap ensures that the number of requests received by a SIP server never increases beyond the sum of all rate caps granted to upstream neighbors. Rate-based overload control protects a SIP server against overload, even during load spikes assuming there are no new upstream neighbors that start sending traffic. New upstream neighbors need to be considered in the rate caps assigned to all upstream neighbors. The rate assigned to upstream neighbors needs to be adjusted when new neighbors join. During periods when new neighbors are joining, overload can occur in extreme cases until the rate caps of all servers are adjusted to again match the overall rate cap of the server. The overall rate cap of a SIP server is determined by an overload control algorithm, e.g., based on system load.

Rate-based overload control requires a SIP server to assign a rate cap to each of its upstream neighbors while it is activated. Effectively, a server needs to assign a share of its overall capacity to each upstream neighbor. A server needs to ensure that the sum of all rate caps assigned to upstream neighbors does not substantially oversubscribe its actual processing capacity. This requires a SIP server to keep track of the set of upstream neighbors and to adjust the rate cap if a new upstream neighbor appears or an existing neighbor stops transmitting. For example, if the capacity of the server is X and this server is receiving traffic from two upstream neighbors, it can assign a rate of X/2 to each of them. If a third sender appears, the rate for each sender is lowered to X/3. If the overall rate cap is too high, a server may experience overload. If the cap is too low, the upstream neighbors will reject requests even though they could be processed by the server.
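
The bookkeeping described above can be sketched as follows; the capacity value is arbitrary, and an equal split is only one possible policy (see also Section 7).

      def per_neighbor_caps(server_capacity, active_neighbors):
          # Recomputed whenever an upstream neighbor appears or disappears.
          if not active_neighbors:
              return {}
          share = server_capacity / len(active_neighbors)
          return {n: share for n in active_neighbors}

      print(per_neighbor_caps(140, ["A", "B"]))        # X/2 for each neighbor
      print(per_neighbor_caps(140, ["A", "B", "C"]))   # X/3 for each neighbor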

An approach for estimating a rate cap for each upstream neighbor is using a fixed proportion of a control variable, X, where X is initially equal to the capacity of the SIP server. The server then increases or decreases X until the workload arrival rate matches the actual server capacity. Usually, this will mean that the sum of the rate caps sent out by the server (=X) exceeds its actual capacity, but enables upstream neighbors who are not generating more than their fair share of the work to be effectively unrestricted. In this approach, the server only has to measure the aggregate arrival rate. However, since the overall rate cap is usually higher than the actual capacity, brief periods of overload may occur.
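
One possible adjustment rule for the control variable X is sketched below; the step size is an assumption, and an actual algorithm would be chosen according to Section 11.

      def adjust_x(x, measured_arrival_rate, capacity, step=0.1):
          # Move X so that the aggregate arrival rate converges toward capacity.
          error = capacity - measured_arrival_rate
          return max(x + step * error, 0.0)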

9.2. Loss-Based Overload Control

A loss percentage enables a SIP server to ask an upstream neighbor to reduce the number of requests it would normally forward to this server by X%. For example, a SIP server can ask an upstream neighbor to reduce the number of requests this neighbor would normally send by 10%. The upstream neighbor then redirects or rejects 10% of the traffic that is destined for this server.

To implement a loss percentage, the sending entity may employ an algorithm to draw a random number between 1 and 100 for each request to be forwarded. The request is not forwarded to the server if the random number is less than or equal to X.
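
This algorithm corresponds directly to the following sketch:

      import random

      def forward_request(loss_percentage):
          # Draw a number between 1 and 100; drop the request if it is <= X.
          return random.randint(1, 100) > loss_percentage

      # With a loss percentage of 10, roughly 10% of requests are rejected
      # or redirected instead of being forwarded.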

An advantage of loss-based overload control is that the receiving entity does not need to track the set of upstream neighbors or the request rate it receives from each upstream neighbor. It is sufficient to monitor the overall system utilization. To reduce load, a server can ask its upstream neighbors to lower the traffic forwarded by a certain percentage. The server calculates this percentage by combining the loss percentage that is currently in use (i.e., the loss percentage the upstream neighbors are currently using when forwarding traffic), the current system utilization, and the desired system utilization. For example, if the server load approaches 90% and the current loss percentage is set to a 50% traffic reduction, then the server can decide to increase the loss percentage to 55% in order to get to a system utilization of 80%. Similarly, the server can lower the loss percentage if permitted by the system utilization.
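
One way to combine these three quantities, shown here only as an illustrative sketch, is to scale the fraction of traffic currently accepted by the ratio of the desired to the measured utilization:

      def updated_loss_percentage(current_loss, measured_util, target_util):
          accepted = 1.0 - current_loss            # fraction currently forwarded
          accepted *= target_util / measured_util  # scale toward the desired load
          return min(max(1.0 - accepted, 0.0), 1.0)

      # Reproduces the example above: 50% loss at 90% utilization with a
      # target of 80% yields a new loss percentage of roughly 55%.
      print(round(updated_loss_percentage(0.50, 0.90, 0.80), 2))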

Loss-based overload control requires that the throttle percentage be adjusted to the current overall number of requests received by the server. This is particularly important if the number of requests received fluctuates quickly. For example, if a SIP server sets a throttle value of 10% at time t1 and the number of requests increases by 20% between time t1 and t2 (t1<t2), then the server will see an increase in traffic by 10% between time t1 and t2. This is even though all upstream neighbors have reduced traffic by 10%. Thus, percentage throttling requires an adjustment of the throttling percentage in response to the traffic received and may not always be able to prevent a server from encountering brief periods of overload in extreme cases.

9.3. Window-Based Overload Control

The key idea of window-based overload control is to allow an entity to transmit a certain number of messages before it needs to receive a confirmation for the messages in transit. Each sender maintains an overload window that limits the number of messages that can be in transit without being confirmed. Window-based overload control is inspired by TCP [RFC0793].

Each sender maintains an unconfirmed message counter for each downstream neighbor it is communicating with. For each message sent to the downstream neighbor, the counter is increased. For each confirmation received, the counter is decreased. The sender stops transmitting messages to the downstream neighbor when the unconfirmed message counter has reached the current window size.

A crucial parameter for the performance of window-based overload control is the window size. Each sender has an initial window size it uses when first sending a request. This window size can be changed based on the feedback it receives from the receiver.

The sender adjusts its window size as soon as it receives the corresponding feedback from the receiver. If the new window size is smaller than the current unconfirmed message counter, the sender stops transmitting messages until more messages are confirmed and the current unconfirmed message counter is less than the window size.

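A sender-side sketch of this bookkeeping is given below; the class structure, the initial window size, and the way feedback reaches the sender are illustrative assumptions, not protocol elements defined here:

   class OverloadWindow:
       """Per-downstream-neighbor overload window (sketch)."""

       def __init__(self, initial_window=10):
           self.window_size = initial_window  # current overload window
           self.unconfirmed = 0               # messages in transit

       def can_send(self):
           # Sending is allowed only while the unconfirmed message
           # counter is below the current window size.
           return self.unconfirmed < self.window_size

       def on_message_sent(self):
           self.unconfirmed += 1

       def on_confirmation(self):
           self.unconfirmed = max(0, self.unconfirmed - 1)

       def on_feedback(self, new_window_size):
           # If the new window is smaller than the unconfirmed counter,
           # can_send() stays False until enough confirmations arrive.
           self.window_size = new_window_size
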
Note that the reception of a 100 (Trying) response does not provide a confirmation for the successful processing of a message. 100 (Trying) responses are often created by a SIP server very early in processing and do not indicate that a message has been successfully processed and cleared from the input buffer. If the downstream neighbor is a stateless proxy, it will not create 100 (Trying) responses at all and will instead pass through 100 (Trying) responses created by the next stateful server. Also, 100 (Trying) responses are typically only created for INVITE requests. Explicit message confirmations do not have these problems.

Window-based overload control is similar to rate-based overload control in that the total available receiver buffer space needs to be divided among all upstream neighbors. However, unlike rate-based overload control, window-based overload control is self-limiting and can ensure that the receiver buffer does not overflow under normal conditions. The transmission of messages by senders is clocked by message confirmations received from the receiver. A buffer overflow can occur in extreme cases when a large number of new upstream neighbors arrives at the same time. However, senders will eventually stop transmitting new requests once their initial sending window is closed.

In window-based overload control, the number of messages a sender is allowed to send can frequently be set to zero. In this state, the sender needs to be informed when it is allowed to send again and when the receiver window has opened up. However, since the sender is not allowed to transmit messages, the receiver cannot convey the new window size by piggybacking it in a response to another message. Instead, it needs to inform the sender through another mechanism, e.g., by sending a message that contains the new window size.

9.4. Overload Signal-Based Overload Control

The key idea of overload signal-based overload control is to use the transmission of a 503 (Service Unavailable) response as a signal for overload in the downstream neighbor. After receiving a 503 (Service Unavailable) response, the sender reduces the load forwarded to the downstream neighbor to avoid triggering more 503 (Service Unavailable) responses. The sender keeps reducing the load if more 503 (Service Unavailable) responses are received. Note that this scheme is based on the use of 503 (Service Unavailable) responses without the Retry-After header, as the Retry-After header would require a sender to entirely stop forwarding requests. It should also be noted that 503 responses can be generated for reasons other than overload (e.g., server maintenance).

A sender that has not received 503 (Service Unavailable) responses for a while but is still throttling traffic can start to increase the offered load. By slowly increasing the traffic forwarded, a sender can detect that overload in the downstream neighbor has been resolved and more load can be forwarded. The load is increased until the sender receives another 503 (Service Unavailable) response or is forwarding all requests it has. A possible algorithm for adjusting traffic is additive increase/multiplicative decrease (AIMD).

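A sketch of such an AIMD adjustment is shown below; the initial rate, increase step, and decrease factor are arbitrary values chosen for illustration only:

   class AimdThrottle:
       """AIMD control of the request rate forwarded downstream
       (sketch)."""

       def __init__(self, initial_rate=100.0,
                    increase_step=1.0, decrease_factor=0.5):
           self.rate = initial_rate            # requests per second
           self.increase_step = increase_step
           self.decrease_factor = decrease_factor

       def on_503_without_retry_after(self):
           # Multiplicative decrease on each 503 (Service Unavailable).
           self.rate *= self.decrease_factor

       def on_interval_without_503(self):
           # Additive increase while no 503 responses are received,
           # slowly probing whether the overload has been resolved.
           self.rate += self.increase_step
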
Overload signal-based overload control is a sender-based overload control mechanism.

9.5. On-/Off Overload Control

On-/off overload control feedback enables a SIP server to turn the traffic it is receiving either on or off. The 503 (Service Unavailable) response with a Retry-After header implements on-/off overload control. On-/off overload control is less effective in controlling load than the fine grained control methods above. All of the above methods can realize on-/off overload control, e.g., by setting the allowed rate to either zero or unlimited.

10. Implicit Overload Control

Implicit overload control ensures that the transmission of a SIP server is self-limiting. It slows down the transmission rate of a sender when there is an indication that the receiving entity is experiencing overload. Such an indication can be that the receiving entity is not responding within the expected timeframe or is not responding at all. The idea of implicit overload control is that senders should try to sense overload of a downstream neighbor even if there is no explicit overload control feedback. It prevents an overloaded server that has become unable to generate overload control feedback from being overwhelmed with requests.

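One way a sender might apply such implicit control is sketched below; treating response timeouts as the overload indication and the specific back-off factor are assumptions of this illustration, not behavior mandated by this document:

   class ImplicitControl:
       """Reduce the offered load when a downstream neighbor stops
       responding within the expected timeframe (sketch)."""

       def __init__(self, initial_rate=100.0):
           self.rate = initial_rate   # requests per second cap

       def on_response_timeout(self):
           # No response within the expected timeframe: assume the
           # neighbor may be overloaded and halve the offered load.
           self.rate = max(1.0, self.rate * 0.5)

       def on_response_received(self):
           # Responses are arriving again: probe back up slowly.
           self.rate += 1.0
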
Window-based overload control is inherently self-limiting since a sender cannot continue to pass messages without receiving confirmations. All other explicit overload control schemes described above do not have this property and require additional implicit controls to limit transmissions in case an overloaded downstream neighbor does not generate explicit feedback.

11. Overload Control Algorithms

An important aspect of the design of an overload control mechanism is the overload control algorithm. The control algorithm determines when the amount of traffic to a SIP server needs to be decreased and when it can be increased. In terms of the model described in Section 4, the control algorithm takes (S) as an input value and generates (T) as a result.

Overload control algorithms have been studied to a large extent and many different overload control algorithms exist. With many different overload control algorithms available, it seems reasonable to suggest a baseline algorithm in a specification for a SIP overload control mechanism and allow the use of other algorithms if they provide the same protocol semantics. This will also allow the development of future algorithms, which may lead to better performance. Conversely, the overload control mechanism should allow the use of different algorithms if they adhere to the defined protocol semantics.

12. Message Prioritization

Overload control can require a SIP server to prioritize requests and select requests to be rejected or redirected. The selection is largely a matter of local policy of the SIP server, the overall network, and the services the SIP server provides.

While there are many factors that can affect the prioritization of SIP requests, the Resource-Priority Header (RPH) field [RFC4412] is a prime candidate for marking the prioritization of SIP requests. Depending on the particular network and the services it offers, a particular namespace and priority value in the RPH could indicate i) a high priority request, which should be preserved if possible during overload, ii) a low priority request, which should be dropped during overload, or iii) a label, which has no impact on message prioritization in this network.

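A sketch of such a local prioritization policy is shown below; the namespaces, priority values, and the policy itself are hypothetical examples of local configuration, not values mandated by [RFC4412] or this document:

   # Hypothetical local policy: which Resource-Priority values to
   # preserve or drop during overload.
   HIGH_PRIORITY = {"ets.0", "wps.0"}        # preserve if possible
   LOW_PRIORITY = {"example-ns.routine"}     # drop first in overload

   def keep_during_overload(resource_priority_values):
       values = set(resource_priority_values)
       if values & HIGH_PRIORITY:
           return True       # i) high priority: preserve if possible
       if values & LOW_PRIORITY:
           return False      # ii) low priority: reject or redirect
       # iii) labels with no impact on prioritization in this network:
       # fall back to the server's default overload policy.
       return True
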
For a number of reasons, responses should not be targeted in order to reduce SIP server load. Responses cannot be rejected and would have to be dropped. This triggers the retransmission of the request plus the response, leading to even more load. In addition, the request associated with a response has already been processed and dropping the response will waste the efforts that have been spent on the request. Most importantly, rejecting a request effectively also removes the request and the response. If no requests are passed along, there will be no responses coming back in return.

Overload control does not change the retransmission behavior of SIP. Retransmissions are triggered using procedures defined in RFC 3261 [RFC3261] and are not subject to throttling.

13. Operational Considerations

In addition to the design considerations discussed above, implementations of a SIP overload control mechanism need to take the following operational aspects into consideration. These aspects, while important, are out of scope for this document and are left for further discussion in other documents.

Selection of feedback type: A SIP overload control mechanism can support one or multiple types of explicit overload control feedback. Using a single type of feedback (e.g., loss-based feedback) has the advantage of simplifying the protocol and implementations. Supporting multiple types of feedback (e.g., loss- and rate-based feedback) provides more flexibility; however, it requires a way to select the feedback type used between two servers.

Event reporting: Overload is a serious condition for any network of SIP servers, even if it is handled properly by an overload control mechanism. Overload events should therefore be reported by a SIP server, e.g., through a logging or management interface.

14. Security Considerations

This document presents an overview of several overload control feedback mechanisms. These mechanisms and design considerations are presented as input to other documents that will specify a particular feedback mechanism. Specific security measures pertinent to a particular overload feedback mechanism will be discussed in the context of a document specifying that security mechanism. However, there are common security considerations that must be taken into account regardless of the choice of a final mechanism.

First, the rate-based mechanism surveyed in Section 9.1 allocates a fixed portion of the total inbound traffic of a server to each of its upstream neighbors. Consequently, an attacker can introduce a new upstream server for a short duration, causing the overloaded server to lower the proportional traffic rate granted to all other existing servers. Introducing many such short-lived servers will cause the aggregate rate arriving at the overloaded server to decrease substantially, thereby effecting a reduction in the service offered by the server under attack and leading to a denial-of-service attack [RFC4732].

The same problem exists in the window-based mechanism discussed in Section 9.3; however, because of the window acknowledgments sent by the overloaded server, the effect is not as drastic (an attacker will have to expend resources by constantly sending traffic to keep the receiver window full).

All mechanisms assume that the upstream neighbors of an overloaded server follow the feedback received. In the rate- and window-based mechanisms, a server can directly verify if upstream neighbors follow the requested policies. As the loss-based mechanism described in Section 9.2 requires upstream neighbors to reduce traffic by a fraction and the current offered load in the upstream neighbor is unknown, a server cannot directly verify the compliance of upstream neighbors, except when traffic reduction is set to 100%. In this case, a server has to rely on heuristics to identify upstream neighbors that try to gain an advantage by not reducing load or not reducing it at the requested loss-rate. A policing mechanism can be used to throttle or block traffic from unfair or malicious upstream neighbors. Barring such a widespread policing mechanism, the communication link between the upstream neighbors and the overloaded server should be such that the identity of both the servers at the end of each link can be established and logged. The use of Transport Layer Security (TLS) and mutual authentication of upstream neighbors [RFC3261] [RFC5922] can be used for this purpose.

If an attacker controls a server, he or she may maliciously advertise overload feedback to all of the neighbors of the server, even if the server is not experiencing overload. This will have the effect of forcing all of the upstream neighbors to reject or queue messages arriving to them and destined for the apparently overloaded server (this, in essence, is diminishing the serving capacity of the upstream neighbors since they now have to deal with their normal traffic in addition to rejecting or quarantining the traffic destined to the overloaded server). All mechanisms allow the attacker to advertise a capacity of 0, effectively disabling all traffic destined to the server pretending to be in overload and forcing all the upstream neighbors to expend resources dealing with this condition.

As before, a remedy for this is to use a communication link such that the identity of the servers at both ends of the link is established and logged. The use of TLS and mutual authentication of neighbors [RFC3261] [RFC5922] can be used for this purpose.

If an attacker controls several servers of a load-balanced cluster, he or she may maliciously advertise overload feedback from these servers to all senders. Senders with the policy to redirect traffic that cannot be processed by an overloaded server will start to redirect this traffic to the servers that have not reported overload. This attack can be used to create a denial-of-service attack on these servers. If these servers are compromised, the attack can be used to increase the amount of traffic that is passed through the compromised servers. This attack is ineffective if servers reject traffic based on overload feedback instead of redirecting it.

15. Informative References

[Abdelal] Abdelal, A. and W. Matragi, "Signal-Based Overload Control for SIP Servers", 7th Annual IEEE Consumer Communications and Networking Conference (CCNC-10), Las Vegas, Nevada, USA, January 2010.

[Hilt] Hilt, V. and I. Widjaja, "Controlling overload in networks of SIP servers", IEEE International Conference on Network Protocols (ICNP'08), Orlando, Florida, October 2008.

[Noel] Noel, E. and C. Johnson, "Novel Overload Controls for SIP Networks", International Teletraffic Congress (ITC 21), Paris, France, September 2009.

[RFC0793] Postel, J., "Transmission Control Protocol", STD 7, RFC 793, September 1981.

[RFC3261] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A., Peterson, J., Sparks, R., Handley, M., and E. Schooler, "SIP: Session Initiation Protocol", RFC 3261, June 2002.

[RFC4412] Schulzrinne, H. and J. Polk, "Communications Resource Priority for the Session Initiation Protocol (SIP)", RFC 4412, February 2006.

[RFC4732] Handley, M., Rescorla, E., and IAB, "Internet Denial-of-Service Considerations", RFC 4732, December 2006.

[RFC5390] Rosenberg, J., "Requirements for Management of Overload in the Session Initiation Protocol", RFC 5390, December 2008.

[RFC5922] Gurbani, V., Lawrence, S., and A. Jeffrey, "Domain Certificates in the Session Initiation Protocol (SIP)", RFC 5922, June 2010.

[Shen] Shen, C., Schulzrinne, H., and E. Nahum, "Session Initiation Protocol (SIP) Server Overload Control: Design and Evaluation", Principles, Systems and Applications of IP Telecommunications (IPTComm'08), Heidelberg, Germany, July 2008.

Appendix A. Contributors

Many thanks for the contributions, comments, and feedback on this document to: Mary Barnes (Nortel), Janet Gunn (CSC), Carolyn Johnson (AT&T Labs), Paul Kyzivat (Cisco), Daryl Malas (CableLabs), Tom Phelan (Sonus Networks), Jonathan Rosenberg (Cisco), Henning Schulzrinne (Columbia University), Robert Sparks (Tekelec), Nick Stewart (British Telecommunications plc), Rich Terpstra (Level 3), Fangzhe Chang (Bell Labs/Alcatel-Lucent).

Authors' Addresses

Volker Hilt
Bell Labs/Alcatel-Lucent
791 Holmdel-Keyport Rd
Holmdel, NJ 07733
USA

   EMail: volker.hilt@alcatel-lucent.com
        

Eric Noel
AT&T Labs

   EMail: eric.noel@att.com
        

Charles Shen
Columbia University

   EMail: charles@cs.columbia.edu
        

Ahmed Abdelal
Sonus Networks

   EMail: aabdelal@sonusnet.com
        