Internet Engineering Task Force (IETF)                        E. McMurry
Request for Comments: 7068                                   B. Campbell
Category: Informational                                           Oracle
ISSN: 2070-1721                                            November 2013
        

Diameter Overload Control Requirements

Abstract

When a Diameter server or agent becomes overloaded, it needs to be able to gracefully reduce its load, typically by advising clients to reduce traffic for some period of time. Otherwise, it must continue to expend resources parsing and responding to Diameter messages, possibly resulting in a progressively severe overload condition. The existing Diameter mechanisms are not sufficient for managing overload conditions. This document describes the limitations of the existing mechanisms. Requirements for new overload management mechanisms are also provided.

Status of This Memo

This document is not an Internet Standards Track specification; it is published for informational purposes.

This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Not all documents approved by the IESG are a candidate for any level of Internet Standard; see Section 2 of RFC 5741.

Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at http://www.rfc-editor.org/info/rfc7068.

Copyright Notice

Copyright (c) 2013 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

   1. Introduction
      1.1. Documentation Conventions
      1.2. Causes of Overload
      1.3. Effects of Overload
      1.4. Overload vs. Network Congestion
      1.5. Diameter Applications in a Broader Network
   2. Overload Control Scenarios
      2.1. Peer-to-Peer Scenarios
      2.2. Agent Scenarios
      2.3. Interconnect Scenario
   3. Diameter Overload Case Studies
      3.1. Overload in Mobile Data Networks
      3.2. 3GPP Study on Core Network Overload
   4. Existing Mechanisms
   5. Issues with the Current Mechanisms
      5.1. Problems with Implicit Mechanism
      5.2. Problems with Explicit Mechanisms
   6. Extensibility and Application Independence
   7. Solution Requirements
      7.1. General
      7.2. Performance
      7.3. Heterogeneous Support for Solution
      7.4. Granular Control
      7.5. Priority and Policy
      7.6. Security
      7.7. Flexibility and Extensibility
   8. Security Considerations
      8.1. Access Control
      8.2. Denial-of-Service Attacks
      8.3. Replay Attacks
      8.4. Man-in-the-Middle Attacks
      8.5. Compromised Hosts
   9. References
      9.1. Normative References
      9.2. Informative References
   Appendix A. Contributors
   Appendix B. Acknowledgements
        
1. Introduction

A Diameter [RFC6733] node is said to be overloaded when it has insufficient resources to successfully process all of the Diameter requests that it receives. When a node becomes overloaded, it needs to be able to gracefully reduce its load, typically by advising clients to reduce traffic for some period of time. Otherwise, it must continue to expend resources parsing and responding to Diameter messages, possibly resulting in a progressively severe overload condition. The existing mechanisms provided by Diameter are not sufficient for managing overload conditions. This document describes the limitations of the existing mechanisms and provides requirements for new overload management mechanisms.

This document draws on the work done on SIP overload control ([RFC5390], [RFC6357]) as well as on experience gained via overload handling in Signaling System No. 7 (SS7) networks and studies done by the Third Generation Partnership Project (3GPP) (Section 3).

Diameter is not typically an end-user protocol; rather, it is generally used as one component in support of some end-user activity.

For example, a SIP server might use Diameter to authenticate and authorize user access. Overload in the Diameter backend infrastructure will likely impact the experience observed by the end user in the SIP application.

The impact of Diameter overload on the client application (a client application may use the Diameter protocol and other protocols to do its job) is beyond the scope of this document.

This document presents non-normative descriptions of causes of overload, along with related scenarios and studies. Finally, it offers a set of normative requirements for an improved overload indication mechanism.

1.1. Documentation Conventions

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as defined in [RFC2119], with the exception that they are not intended for interoperability of implementations. Rather, they are used to describe requirements towards future specifications where the interoperability requirements will be defined.

The terms "client", "server", "agent", "node", "peer", "upstream", and "downstream" are used as defined in [RFC6733].

1.2. Causes of Overload

Overload occurs when an element, such as a Diameter server or agent, has insufficient resources to successfully process all of the traffic it is receiving. Resources include all of the capabilities of the element used to process a request, including CPU processing, memory, I/O, and disk resources. They can also include external resources such as a database or DNS server, in which case the CPU processing, memory, I/O, and disk resources of those elements are effectively part of the logical element processing the request.

External resources can include upstream Diameter nodes; for example, a Diameter agent can become effectively overloaded if one or more upstream nodes are overloaded.

A Diameter node can become overloaded due to request levels that exceed its capacity, a reduction of available resources (for example, a local or upstream hardware failure), or a combination of the two.

Overload can occur for many reasons, including:

Inadequate capacity: When designing Diameter networks, that is, application-layer multi-node Diameter deployments, it can be very difficult to predict all scenarios that may cause elevated traffic. It may also be more costly to implement support for some scenarios than a network operator may deem worthwhile. This results in the likelihood that a Diameter network will not have adequate capacity to handle all situations.

Dependency failures: A Diameter node can become overloaded because a resource on which it depends has failed or become overloaded, greatly reducing the logical capacity of the node. In these cases, even minimal traffic might cause the node to go into overload. Examples of such dependency overloads include DNS servers, databases, disks, and network interfaces that have failed or become overloaded.

Component failures: A Diameter node can become overloaded when it is a member of a cluster of servers that each share the load of traffic and one or more of the other members in the cluster fail. In this case, the remaining nodes take over the work of the failed nodes. Normally, capacity planning takes such failures into account, and servers are typically run with enough spare capacity to handle failure of another node. However, unusual failure conditions can cause many nodes to fail at once. This is often the case with software failures, where a bad packet or bad database entry hits the same bug in a set of nodes in a cluster.

Network-initiated traffic flood: Certain access network events can precipitate floods of Diameter signaling traffic. For example, operational changes can trigger avalanche restarts, or frequent radio overlay handovers can generate excessive authorization requests. Failure of a Diameter proxy may also result in a large amount of signaling as connections and sessions are reestablished.

Subscriber-initiated traffic flood: Large gatherings of subscribers or events that result in many subscribers interacting with the network in close time proximity can result in Diameter signaling traffic floods. For example, the finale of a large fireworks show could be immediately followed by many subscribers posting messages, pictures, and videos concentrated on one portion of a network. Subscriber devices such as smartphones may use aggressive registration strategies that generate unusually high Diameter traffic loads.

DoS attacks: An attacker wishing to disrupt service in the network can cause a large amount of traffic to be launched at a target element. This can be done from a central source of traffic or through a distributed DoS attack. In all cases, the volume of traffic well exceeds the capacity of the element, sending the system into overload.

1.3. Effects of Overload

Modern Diameter networks, composed of application-layer multi-node deployments of Diameter elements, may operate at very large transaction volumes. If a Diameter node becomes overloaded or, even worse, fails completely, a large number of messages may be lost very quickly. Even with redundant servers, many messages can be lost in the time it takes for failover to complete. While a Diameter client or agent should be able to retry such requests, an overloaded peer may cause a sudden large increase in the number of transactions needing to be retried, rapidly filling local queues or otherwise contributing to local overload. Therefore, Diameter devices need to be able to shed load before critical failures can occur.
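
The retry effect described above can be illustrated with a toy model. The sketch below is not part of any Diameter specification: the function name, the per-tick model, and the assumption that every unserved request is retried on the next tick are all hypothetical simplifications chosen only to show how offered load can grow once capacity is exceeded.

```python
def simulate_retries(new_per_tick, capacity, ticks):
    """Return the load offered to a peer at each tick.

    Hypothetical model: the peer serves at most `capacity` requests per
    tick, and every request it could not serve is retried on the next
    tick in addition to `new_per_tick` fresh requests.
    """
    pending_retries = 0
    offered_history = []
    for _ in range(ticks):
        offered = new_per_tick + pending_retries
        served = min(offered, capacity)
        # Whatever was not served comes back as retry traffic.
        pending_retries = offered - served
        offered_history.append(offered)
    return offered_history
```

In this model, a peer that receives slightly more new traffic than it can serve sees its offered load rise on every tick, while a peer with spare capacity sees a flat load. That is the motivation for shedding load before queues fill.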

1.4. Overload vs. Network Congestion

This document uses the term "overload" to refer to application-layer overload at Diameter nodes. This is distinct from "network congestion", that is, congestion that occurs at the lower networking layers that may impact the delivery of Diameter messages between nodes. This document recognizes that element overload and network congestion are interrelated, and that overload can contribute to network congestion and vice versa.

Network congestion issues are better handled by the transport protocols. Diameter uses TCP and the Stream Control Transmission Protocol (SCTP), both of which include congestion management features. Analysis of whether those features are sufficient for transport-level congestion between Diameter nodes and of any work to further mitigate network congestion is out of scope for both this document and the work proposed by it.

1.5. Diameter Applications in a Broader Network

Most elements using Diameter applications do not use Diameter exclusively. It is important to realize that overload of an element can be caused by a number of factors that may be unrelated to the processing of Diameter or Diameter applications.

An element that doesn't use Diameter exclusively needs to be able to signal to Diameter peers that it is experiencing overload regardless of the cause of the overload, since the overload will affect that element's ability to process Diameter transactions. If the element communicates with protocols other than Diameter, it may also need to signal the overload situation on these protocols, depending on its function and the architecture of the network and application for which it is providing services. Whether that is necessary can only be decided within the context of that architecture and use cases. This specification details the requirements for a mechanism for signaling overload with Diameter; this mechanism provides Diameter nodes the ability to inform their Diameter peers of overload, mitigating that part of the issue. Diameter nodes may need to use this, as well as other mechanisms, to solve their broader overload issues. Indicating overload on protocols other than Diameter is out of scope for this document and for the work proposed by it.

2. Overload Control Scenarios

Several Diameter deployment scenarios exist that may impact overload management. The following scenarios help motivate the requirements for an overload management mechanism.

These scenarios are by no means exhaustive and are in general simplified for the sake of clarity. In particular, this document assumes for the sake of clarity that the client sends Diameter requests to the server, and the server sends responses to the client, even though Diameter supports bidirectional applications. Each direction in such an application can be modeled separately.

In a large-scale deployment, many of the nodes represented in these scenarios would be deployed as clusters of servers. This document assumes that such a cluster is responsible for managing its own internal load-balancing and overload management so that it appears as a single Diameter node. That is, other Diameter nodes can treat it as a single, monolithic node for the purposes of overload management.

These scenarios do not illustrate the client application. As mentioned in Section 1, Diameter is not typically an end-user protocol; rather, it is generally used in support of some other client application. These scenarios do not consider the impact of Diameter overload on the client application.

2.1. Peer-to-Peer Scenarios

This section describes Diameter peer-to-peer scenarios, that is, scenarios where a Diameter client talks directly with a Diameter server, without the use of a Diameter agent.

Figure 1 illustrates the simplest possible Diameter relationship. The client and server share a one-to-one peer-to-peer relationship. If the server becomes overloaded, either because the client exceeds the server's capacity or because the server's capacity is reduced due to some resource dependency, the client needs to reduce the amount of Diameter traffic it sends to the server. Since the client cannot forward requests to another server, it must either queue requests until the server recovers or itself become overloaded in the context of the client application and other protocols it may also use.

                            +------------------+
                            |                  |
                            |                  |
                            |     Server       |
                            |                  |
                            +--------+---------+
                                     |
                                     |
                            +--------+---------+
                            |                  |
                            |                  |
                            |     Client       |
                            |                  |
                            +------------------+
        

Figure 1: Basic Peer-to-Peer Scenario
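
The client's two options in Figure 1 (hold requests until the server recovers, or run out of room itself) can be sketched as follows. This is an illustrative sketch only: the class, its method names, the bounded queue, and the idea that the server reports an overload duration are assumptions made for the example, not a mechanism defined by this document.

```python
from collections import deque


class SingleServerClient:
    """Hypothetical client with exactly one server and a bounded queue."""

    def __init__(self, max_queue):
        self.queue = deque()
        self.max_queue = max_queue
        # Assumed: time until which the server has asked for relief.
        self.overloaded_until = 0.0

    def note_overload(self, now, duration):
        """Record that the server asked for reduced traffic for `duration`."""
        self.overloaded_until = now + duration

    def submit(self, request, now):
        """Return "sent", "queued", or "rejected" for a new request."""
        if now >= self.overloaded_until:
            return "sent"
        if len(self.queue) < self.max_queue:
            self.queue.append(request)
            return "queued"
        # No alternate server and no queue space: the client is now
        # effectively overloaded from the client application's viewpoint.
        return "rejected"

    def drain(self, now):
        """Release queued requests once the overload period has ended."""
        if now < self.overloaded_until:
            return []
        released = list(self.queue)
        self.queue.clear()
        return released
```

The "rejected" outcome corresponds to the case where the client itself becomes overloaded in the context of the client application. For brevity, the sketch does not preserve ordering between queued requests and requests submitted after recovery.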

Figure 2 shows a similar scenario, except in this case the client has multiple servers that can handle work for a specific realm and application. If Server 1 becomes overloaded, the client can forward traffic to Server 2. Assuming that Server 2 has sufficient reserve capacity to handle the forwarded traffic, the client should be able to continue serving client application protocol users. If Server 1 is approaching overload, but can still handle some number of new requests, it needs to be able to instruct the client to forward a subset of its traffic to Server 2.

               +------------------+     +------------------+
               |                  |     |                  |
               |                  |     |                  |
               |     Server 1     |     |     Server 2     |
               |                  |     |                  |
               +--------+-`.------+     +------.'+---------+
                            `.               .'
                              `.           .'
                                `.       .'
                                  `.   .'
                            +-------`.'--------+
                            |                  |
                            |                  |
                            |     Client       |
                            |                  |
                            +------------------+
        

Figure 2: Multiple-Server Peer-to-Peer Scenario
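
One way to picture "forward a subset of its traffic to Server 2" is a client that diverts a requested fraction of the requests it would otherwise send to Server 1. The sketch below is an assumption-laden illustration: the class, the notion of a divert fraction supplied by Server 1, and the deterministic credit counter are all hypothetical, and a real mechanism might express the reduction differently.

```python
from fractions import Fraction


class TwoServerClient:
    """Hypothetical client that normally prefers Server 1 (Figure 2)."""

    def __init__(self):
        self.divert_fraction = Fraction(0)  # share of traffic moved to Server 2
        self._credit = Fraction(0)

    def set_divert_fraction(self, fraction):
        """Record the share of traffic that Server 1 asked to have diverted."""
        value = Fraction(fraction).limit_denominator(1000)
        if not 0 <= value <= 1:
            raise ValueError("fraction must be between 0 and 1")
        self.divert_fraction = value

    def pick_server(self):
        """Choose a server for the next request.

        A running credit spreads diverted requests evenly instead of
        relying on random selection, which keeps the sketch deterministic.
        """
        self._credit += self.divert_fraction
        if self._credit >= 1:
            self._credit -= 1
            return "server2"
        return "server1"
```

With a divert fraction of one quarter, every fourth request goes to Server 2; with a fraction of one, Server 1 receives nothing. The sketch assumes Server 2 has enough reserve capacity to absorb the diverted traffic, as the text above notes.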

Figure 3 illustrates a peer-to-peer scenario with multiple Diameter realm and application combinations. In this example, Server 2 can handle work for both applications. Each application might have different resource dependencies. For example, a server might need to access one database for Application A and another for Application B. This creates a possibility that Server 2 could become overloaded for Application A but not for Application B, in which case the client would need to divert some part of its Application A requests to Server 1, but the client should not divert any Application B requests. This requires that Server 2 be able to distinguish between applications when it indicates an overload condition to the client.

On the other hand, it's possible that the servers host many applications. If Server 2 becomes overloaded for all applications, it would be undesirable for it to have to notify the client separately for each application. Therefore, it also needs a way to indicate that it is overloaded for all possible applications.

   +---------------------------------------------+
   | Application A       +----------------------+----------------------+
   |+------------------+ |  +----------------+  |  +------------------+|
   ||                  | |  |                |  |  |                  ||
   ||                  | |  |                |  |  |                  ||
   ||     Server 1     | |  |    Server 2    |  |  |     Server 3     ||
   ||                  | |  |                |  |  |                  ||
   |+--------+---------+ |  +-------+--------+  |  +-+----------------+|
   |         |           |          |           |    |                 |
   +---------+-----------+----------+-----------+    |                 |
             |           |          |                |                 |
             |           |          |                |  Application B  |
             |           +----------+----------------+-----------------+
             ``-.._                 |                |
                   `-..__           |            _.-''
                        `--._       |        _.-''
                             ``-._  |   _.-''
                            +-----`-.-''-----+
                            |                |
                            |                |
                            |     Client     |
                            |                |
                            +----------------+
        

Figure 3: Multiple-Application Peer-to-Peer Scenario
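
The two needs described for Figure 3 (per-application overload reports, plus a single report covering all applications) can be sketched as a small lookup table kept by the client. Everything here is hypothetical: the wildcard marker, the class, and the helper function are illustrative names, and the sketch simply excludes an overloaded server rather than diverting only part of its traffic.

```python
ALL_APPLICATIONS = "*"  # hypothetical marker meaning "every application"


class OverloadTable:
    """Hypothetical client-side record of reported overload conditions."""

    def __init__(self):
        self._overloaded = set()  # (server, application) pairs

    def report(self, server, application=ALL_APPLICATIONS):
        """Record an overload report, for one application or for all."""
        self._overloaded.add((server, application))

    def clear(self, server, application=ALL_APPLICATIONS):
        """Forget a previously recorded overload report."""
        self._overloaded.discard((server, application))

    def is_overloaded(self, server, application):
        """True if the server reported overload for this or all applications."""
        return ((server, application) in self._overloaded
                or (server, ALL_APPLICATIONS) in self._overloaded)


def candidate_servers(table, servers_for_app, application):
    """List the servers still usable for the given application."""
    return [server for server in servers_for_app[application]
            if not table.is_overloaded(server, application)]
```

Using the topology of Figure 3, where Server 2 handles both applications, a report for Application A alone leaves Server 2 available for Application B, while one report with the wildcard removes it for both without a separate notification per application.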

2.2. Agent Scenarios

This section describes scenarios that include a Diameter agent, in the form of either a Diameter relay or Diameter proxy. These scenarios do not consider Diameter redirect agents, since they are more readily modeled as end servers. The examples have been kept simple deliberately, to illustrate basic concepts. Significantly more complicated topologies are possible with Diameter, including multiple intermediate agents in a path connected in a variety of ways.

Figure 4 illustrates a simple Diameter agent scenario with a single client, agent, and server. In this case, overload can occur at the server, at the agent, or both. But in most cases, client behavior is the same whether overload occurs at the server or at the agent. From the client's perspective, server overload and agent overload are the same thing.

                           +------------------+
                           |                  |
                           |                  |
                           |     Server       |
                           |                  |
                           +--------+---------+
                                    |
                                    |
                           +--------+---------+
                           |                  |
                           |                  |
                           |      Agent       |
                           |                  |
                           +--------+---------+
                                    |
                                    |
                           +--------+---------+
                           |                  |
                           |                  |
                           |     Client       |
                           |                  |
                           +------------------+
        

Figure 4: Basic Agent Scenario

Figure 5 shows an agent scenario with multiple servers. If Server 1 becomes overloaded but Server 2 has sufficient reserve capacity, the agent may be able to transparently divert some or all Diameter requests originally bound for Server 1 to Server 2.

In most cases, the client does not have detailed knowledge of the Diameter topology upstream of the agent. If the agent uses dynamic discovery to find eligible servers, the set of eligible servers may not be enumerable from the perspective of the client. Therefore, in most cases the agent needs to deal with any upstream overload issues in a way that is transparent to the client. If one server notifies the agent that it has become overloaded, the notification should not be passed back to the client in a way that the client could mistakenly perceive the agent itself as being overloaded. If the set of all possible destinations upstream of the agent no longer has sufficient capacity for incoming load, the agent itself becomes effectively overloaded.

On the other hand, there are cases where the client needs to be able to select a particular server from behind an agent. For example, if a Diameter request is part of a multiple-round-trip authentication, or is otherwise part of a Diameter "session", it may have a Destination-Host Attribute-Value Pair (AVP) that requires that the request be served by Server 1. Therefore, the agent may need to inform a client that a particular upstream server is overloaded or otherwise unavailable. Note that there can be many ways a server can be specified, which may have different implications (e.g., by IP address, by host name, etc.).

              +------------------+     +------------------+
              |                  |     |                  |
              |                  |     |                  |
              |     Server 1     |     |     Server 2     |
              |                  |     |                  |
              +--------+-`.------+     +------.'+---------+
                           `.               .'
                             `.           .'
                               `.       .'
                                 `.   .'
                           +-------`.'--------+
                           |                  |
                           |                  |
                           |     Agent        |
                           |                  |
                           +--------+---------+
                                    |
                                    |
                                    |
                           +--------+---------+
                           |                  |
                           |                  |
                           |     Client       |
                           |                  |
                           +------------------+
        

Figure 5: Multiple-Server Agent Scenario

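
The routing choice an agent faces in the Figure 5 scenario can be sketched as follows. This is an illustrative sketch only, not part of any Diameter specification: the Agent class, its overloaded table, and the route() method are hypothetical names, and how the agent learns that a server is overloaded is left open. It shows that an unpinned request can be diverted transparently, whereas a request that carries a Destination-Host naming an overloaded server cannot.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Agent:
    # Hypothetical view of the upstream servers: name -> currently overloaded?
    overloaded: dict = field(default_factory=dict)

    def route(self, destination_host: Optional[str] = None) -> Optional[str]:
        """Return the server to use, or None if the request cannot be served."""
        if destination_host is not None:
            # The request is pinned to one server (e.g., it is part of a
            # session), so it cannot be diverted transparently.
            # Unknown hosts are treated as unavailable.
            if self.overloaded.get(destination_host, True):
                return None
            return destination_host
        # Unpinned request: divert to any server with spare capacity.
        for server, busy in self.overloaded.items():
            if not busy:
                return server
        # No upstream capacity left: the agent is effectively overloaded.
        return None


agent = Agent({"server1": True, "server2": False})
diverted = agent.route()            # unpinned request goes to server2
pinned = agent.route("server1")     # Destination-Host names the overloaded server
```

In the pinned case the agent has nothing to divert to, which is why it may need to tell the client that a particular upstream server is overloaded.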

Figure 6 shows a scenario where an agent routes requests to a set of servers for more than one Diameter realm and application. In this scenario, if Server 1 becomes overloaded or unavailable while Server 2 still has available capacity, the agent may effectively operate at reduced capacity for Application A but at full capacity for Application B. Therefore, the agent needs to be able to report that it is overloaded for one application but not for another.

   +--------------------------------------------+
   | Application A       +----------------------+----------------------+
   |+------------------+ |  +----------------+  |  +------------------+|
   ||                  | |  |                |  |  |                  ||
   ||                  | |  |                |  |  |                  ||
   ||     Server 1     | |  |    Server 2    |  |  |     Server 3     ||
   ||                  | |  |                |  |  |                  ||
   |+---------+--------+ |  +-------+--------+  |  +--+---------------+|
   |          |          |          |           |     |                |
   +----------+----------+----------+-----------+     |                |
              |          |          |                 |                |
              |          |          |                 | Application B  |
              |          +----------+-----------------+----------------+
              |                     |                 |
               ``--.__              |                _.
                      ``-.__        |          __.--''
                            `--.._  |    _..--'
                            +----``-+.''-----+
                            |                |
                            |                |
                            |    Agent       |
                            |                |
                            +-------+--------+
                                    |
                                    |
                            +-------+--------+
                            |                |
                            |                |
                            |    Client      |
                            |                |
                            +----------------+
        

Figure 6: Multiple-Application Agent Scenario

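
The per-application effect in the Figure 6 scenario can be illustrated with a small calculation. The sketch below assumes, purely for illustration, that every server contributes equal capacity, so that the fraction of reachable servers approximates the agent's remaining capacity for an application. The mapping of servers to applications is read from Figure 6; the function and variable names are hypothetical.

```python
# Servers behind the agent for each application, as drawn in Figure 6.
SERVERS_BY_APP = {
    "Application A": ["Server 1", "Server 2"],
    "Application B": ["Server 2", "Server 3"],
}


def capacity_by_app(servers_by_app, available):
    """Fraction of each application's servers that can still take requests.

    Assumes equal capacity per server, which is a simplification.
    """
    return {
        app: sum(1 for s in servers if s in available) / len(servers)
        for app, servers in servers_by_app.items()
    }


# Server 1 is overloaded or unavailable; Servers 2 and 3 are healthy.
status = capacity_by_app(SERVERS_BY_APP, available={"Server 2", "Server 3"})
```

Under these assumptions the agent is at half capacity for Application A and at full capacity for Application B, so a single "agent is overloaded" indication would misstate its condition for one of the two applications.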

2.3. Interconnect Scenario

Another scenario to consider when looking at Diameter overload is that of multiple network operators using Diameter components connected through an interconnect service, e.g., using IPX (IP Packet eXchange). IPX [IR.34] is an Inter-Operator IP Backbone that provides a roaming interconnection network between mobile operators and service providers. IPX is also used to transport Diameter signaling between operators [IR.88]. Figure 7 shows two network operators with an interconnect network between them. There could be any number of these networks between any two network operators' networks.

               +-------------------------------------------+
               |               Interconnect                |
               |                                           |
               |   +--------------+      +--------------+  |
               |   |   Server 3   |------|   Server 4   |  |
               |   +--------------+      +--------------+  |
               |         .'                      `.        |
               +------.-'--------------------------`.------+
                    .'                               `.
                 .-'                                   `.
   ------------.'-----+                             +----`.-------------
         +----------+ |                             | +----------+
         | Server 1 | |                             | | Server 2 |
         +----------+ |                             | +----------+
                      |                             |
   Network Operator 1 |                             | Network Operator 2
   -------------------+                             +-------------------
        

Figure 7: Two-Network Interconnect Scenario

The characteristics of the information that an operator would want to share over such a connection are different from the information shared between components within a network operator's network. For example, network operators may not want to convey topology or operational information; this would in turn limit how much overload and loading information can be sent. For the interconnect scenario shown in Figure 7, Server 2 may want to signal overload to Server 1, to affect traffic coming from Network Operator 1.

This case is distinct from those internal to a network operator's network, where there may be many more elements in a more complicated topology. Also, the elements in the interconnect network may not support Diameter overload control, and the network operators may not want the interconnect network to use overload or loading information. They may only want the information to pass through the interconnect network without further processing or action by the interconnect network, even if the elements in the interconnect network do support Diameter overload control.

3. Diameter Overload Case Studies
3.1. Overload in Mobile Data Networks

As the number of smartphone devices that are Third Generation (3G) and Long Term Evolution (LTE) enabled continues to expand in mobile networks, there have been situations where high signaling traffic load led to overload events at the Diameter-based Home Location Registers (HLRs) and/or Home Subscriber Servers (HSS) [TR23.843]. The root causes of the HLR overload events were manifold but included hardware failure and procedural errors. The result was high signaling traffic load on the HLR and HSS.

The 3GPP architecture [TS23.002] makes extensive use of Diameter. It is used for mobility management [TS29.272], the IP Multimedia Subsystem (IMS) [TS29.228], and policy and charging control [TS29.212], as well as other functions. The details of the architecture are out of scope for this document, but it is worth noting that there are quite a few Diameter applications, some with quite large amounts of Diameter signaling in deployed networks.

The 3GPP specifications do not currently address overload for Diameter applications or provide a load control mechanism equivalent to those provided in the more traditional SS7 elements in the Global System for Mobile Communications (GSM); see [TS29.002]. The capabilities specified in the 3GPP standards do not adequately address the abnormal condition where excessively high signaling traffic load situations are experienced.

Smartphones, which comprise an increasingly large percentage of mobile devices, contribute much more heavily, relative to non-smartphones, to the continuation of a registration surge, due to their very aggressive registration algorithms. Smartphone behavior contributes to network loading and can contribute to overload conditions. The aggressive smartphone logic is designed to:

a. always have voice and data registration, and

b. constantly try to be on 3G or LTE data (and thus on 3G voice or Voice over LTE (VoLTE) [IR.92]) for their added benefits.

Non-smartphones typically have logic to wait for a time period after registering successfully on voice and data.

The aggressive smartphone registration is problematic in two ways:

o first, by generating excessive signaling load towards the HSS that is ten times the load from a non-smartphone, and

o second, by causing continual registration attempts when a network failure affects registrations through the 3G data network.

3.2. 3GPP Study on Core Network Overload

A study in the 3GPP System Aspects working group 2 (SA2) on core network overload has produced the technical report [TR23.843]. This enumerates several causes of overload in mobile core networks, including portions that are signaled using Diameter. [TR23.843] is a work in progress and is not complete. However, it is useful for pointing out scenarios and the general need for an overload control mechanism for Diameter.

It is common for mobile networks to employ more than one radio technology and to do so in an overlay fashion with multiple technologies present in the same location (such as 2nd or 3rd generation mobile technologies, along with LTE). This presents opportunities for traffic storms when issues occur on one overlay and not another, because all devices that had been on the affected overlay switch to another one. This causes a large amount of Diameter traffic as locations and policies are updated.

Another scenario called out by this study is a flood of registration and mobility management events caused by some element in the core network failing. This flood of traffic from end nodes falls under the network-initiated traffic flood category. There is likely to also be traffic resulting directly from the component failure in this case. A similar flood can occur when elements or components recover as well.

Subscriber-initiated traffic floods are also indicated in this study as an overload mechanism where a large number of mobile devices are attempting to access services at the same time, such as in response to an entertainment event or a catastrophic event.

While this 3GPP study is concerned with the broader effects of these scenarios on wireless networks and their elements, they have implications specifically for Diameter signaling. One of the goals of this document is to provide guidance for a core mechanism that can be used to mitigate the scenarios called out by this study.

4. Existing Mechanisms

Diameter offers both implicit and explicit mechanisms for a Diameter node to learn that a peer is overloaded or unreachable. The implicit mechanism is simply the lack of responses to requests. If a client fails to receive a response in a certain time period, it assumes that the upstream peer is unavailable or is overloaded to the point of effective unavailability. The watchdog mechanism [RFC3539] ensures that transaction responses occur at a certain rate even when there is otherwise little or no other Diameter traffic.

The explicit mechanism can involve specific protocol error responses, where an agent or server tells a downstream peer that it is either too busy to handle a request (DIAMETER_TOO_BUSY) or unable to route a request to an upstream destination (DIAMETER_UNABLE_TO_DELIVER) perhaps because that destination itself is overloaded to the point of unavailability.

Another explicit mechanism, a DPR (Disconnect-Peer-Request) message, can be sent with a Disconnect-Cause of BUSY. This signals the sender's intent to close the transport connection and requests that the client not reconnect.

Once a Diameter node learns via one of these mechanisms that an upstream peer has become overloaded, it can then attempt to take action to reduce the load. This usually means forwarding traffic to an alternate destination, if available. If no alternate destination is available, the node must either reduce the number of messages it originates (in the case of a client) or inform the client to reduce traffic (in the case of an agent).

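
The reaction described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the behavior of any particular implementation: the PeerSelector class and its method names are hypothetical, and only the Result-Code value for DIAMETER_TOO_BUSY is taken from [RFC6733].

```python
# Result-Code value assigned by the Diameter base protocol [RFC6733].
DIAMETER_TOO_BUSY = 3004


class PeerSelector:
    """Hypothetical helper: tracks which upstream peers look overloaded."""

    def __init__(self, peers):
        self.peers = list(peers)   # upstream peers, in preference order
        self.suspect = set()       # peers believed overloaded or unreachable

    def on_answer(self, peer, result_code):
        # Explicit mechanism: the peer said it is too busy.
        if result_code == DIAMETER_TOO_BUSY:
            self.suspect.add(peer)

    def on_timeout(self, peer):
        # Implicit mechanism: no answer arrived within the expected time.
        self.suspect.add(peer)

    def next_peer(self):
        # Prefer an alternate destination if one is available.
        for peer in self.peers:
            if peer not in self.suspect:
                return peer
        # No alternate: the caller has to reduce the traffic it originates
        # (client) or inform its clients to reduce traffic (agent).
        return None


selector = PeerSelector(["peer-a", "peer-b"])
selector.on_answer("peer-a", DIAMETER_TOO_BUSY)
after_busy = selector.next_peer()       # falls back to peer-b
selector.on_timeout("peer-b")
after_timeout = selector.next_peer()    # nothing left to try
```

The sketch never removes a peer from the suspect set, because neither signal says how long the condition is expected to last.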

Diameter requires the use of a congestion-managed transport layer, currently TCP or SCTP, to mitigate network congestion. It is expected that these transports manage network congestion and that issues with transport (e.g., congestion propagation and window management) are managed at that level. But even with a congestion-managed transport, a Diameter node can become overloaded at the Diameter protocol or application layers due to the causes described in Section 1.2, and congestion-managed transports do not provide facilities (and are at the wrong level) to handle server overload. Transport-level congestion management is also not sufficient to address overload in cases of multi-hop and multi-destination signaling.

5. Issues with the Current Mechanisms

The currently available Diameter mechanisms for indicating an overload condition are not adequate to avoid service outages due to overload. This inadequacy may, in turn, contribute to broader impacts resulting from overload due to unresponsive Diameter nodes causing application-layer or transport-layer retransmissions. In particular, they do not allow a Diameter agent or server to shed load as it approaches overload. At best, a node can only indicate that it needs to entirely stop receiving requests, i.e., that it has effectively failed. Even that is problematic due to the inability to indicate durational validity on the transient errors available in the base Diameter protocol. Diameter offers no mechanism to allow a node to indicate different overload states for different categories of messages, for example, if it is overloaded for one Diameter application but not another.

5.1. Problems with Implicit Mechanism

The implicit mechanism doesn't allow an agent or server to inform the client of a problem until it is effectively too late to do anything about it. The client does not know that it needs to take action until the upstream node has effectively failed. A Diameter node has no opportunity to shed load early to avoid collapse in the first place.

Additionally, the implicit mechanism cannot distinguish between overload of a Diameter node and network congestion. Diameter treats the failure to receive an answer as a transport failure.

5.2. Problems with Explicit Mechanisms

The Diameter specification is ambiguous on how a client should handle receipt of a DIAMETER_TOO_BUSY response. The base specification [RFC6733] indicates that the sending client should attempt to send the request to a different peer. It makes no suggestion that the receipt of a DIAMETER_TOO_BUSY response should affect future Diameter messages in any way.

The Authentication, Authorization, and Accounting (AAA) Transport Profile [RFC3539] recommends that a AAA node that receives a "Busy" response fail over all remaining requests to a different agent or server. But while the Diameter base specification explicitly depends on [RFC3539] to define transport behavior, it does not refer to [RFC3539] in the description of behavior on receipt of a DIAMETER_TOO_BUSY error. There's a strong likelihood that at least some implementations will continue to send Diameter requests to an upstream peer even after receiving a DIAMETER_TOO_BUSY error.

BCP 41 [RFC2914] describes, among other things, how end-to-end application behavior can help avoid congestion collapse. In particular, an application should avoid sending messages that will never be delivered or processed. The DIAMETER_TOO_BUSY behavior as described in the Diameter base specification fails at this, since if an upstream node becomes overloaded, a client attempts each request and does not discover the need to fail over the request until the initial attempt fails.

The situation is improved if implementations follow the [RFC3539] recommendation to keep state about upstream peer overload. But even then, the Diameter specification offers no guidance on how long a client should wait before retrying the overloaded destination. If an agent or server supports multiple realms and/or applications, DIAMETER_TOO_BUSY offers no way to indicate that it is overloaded for one application but not another. A DIAMETER_TOO_BUSY error can only indicate overload at a "whole server" scope.

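
The difference between the two client behaviors discussed above can be made concrete with a toy count of messages. The sketch assumes a single overloaded peer that answers every request with DIAMETER_TOO_BUSY and an alternate peer that always succeeds; the function name and its parameters are hypothetical.

```python
def messages_sent(requests: int, remember_busy_peer: bool) -> dict:
    """Count messages sent to an overloaded peer and to an alternate peer.

    remember_busy_peer=False models a client that tries the overloaded peer
    first for every request; True models a client that keeps state about the
    overload, along the lines of the [RFC3539] recommendation.
    """
    to_overloaded = 0
    to_alternate = 0
    peer_known_busy = False
    for _ in range(requests):
        if not (remember_busy_peer and peer_known_busy):
            to_overloaded += 1      # initial attempt, answered TOO_BUSY
            peer_known_busy = True
        to_alternate += 1           # failover attempt (or direct send)
    return {"overloaded": to_overloaded, "alternate": to_alternate}


stateless = messages_sent(100, remember_busy_peer=False)
stateful = messages_sent(100, remember_busy_peer=True)
```

For 100 requests, the stateless client sends 100 messages to the peer that is already overloaded, while the stateful client sends one. The stateful client in this sketch never retries the overloaded peer, because the base protocol gives it no indication of when retrying would be appropriate.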

Agent processing of a DIAMETER_TOO_BUSY response is also problematic as described in the base specification. DIAMETER_TOO_BUSY is defined as a protocol error. If an agent receives a protocol error, it may either handle it locally or forward the response back towards the downstream peer. If a downstream peer receives the DIAMETER_TOO_BUSY response, it may stop sending all requests to the agent for some period of time, even though the agent may still be able to deliver requests to other upstream peers.

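
The two agent options can be contrasted in a short sketch. It is illustrative only: agent_answer() and its parameters are hypothetical, the agent is assumed to try its upstream peers in list order, and the Result-Code values are those assigned in [RFC6733].

```python
# Result-Code values assigned by the Diameter base protocol [RFC6733].
DIAMETER_SUCCESS = 2001
DIAMETER_TOO_BUSY = 3004


def agent_answer(busy_servers, servers, handle_locally):
    """Result code the agent returns downstream for one request."""
    if servers[0] not in busy_servers:
        return DIAMETER_SUCCESS
    if handle_locally:
        # Handle the protocol error locally: retry on other upstream peers.
        for alternate in servers[1:]:
            if alternate not in busy_servers:
                return DIAMETER_SUCCESS
    # Forward the protocol error back towards the downstream peer.
    return DIAMETER_TOO_BUSY


servers = ["server1", "server2"]
forwarded = agent_answer({"server1"}, servers, handle_locally=False)
absorbed = agent_answer({"server1"}, servers, handle_locally=True)
```

When the error is forwarded, the downstream peer sees DIAMETER_TOO_BUSY coming from the agent and cannot tell that server2 was still reachable.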

DIAMETER_UNABLE_TO_DELIVER errors, or using DPR with cause code BUSY, also have no mechanisms for specifying the scope or cause of the failure, or the durational validity.

The issues with error responses described in [RFC6733] extend beyond the particular issues for overload control and have been addressed in an ad hoc fashion by various implementations. Addressing these in a standard way would be a useful exercise, but it is beyond the scope of this document.

6. Extensibility and Application Independence

Given the variety of scenarios in which Diameter elements can be deployed and the variety of roles they can fulfill with Diameter and other technologies, a single algorithm for handling overload may not be sufficient. For purposes of this discussion, an algorithm is inclusive of behavior for control of overload but does not encompass the general mechanism for transporting control information. This effort cannot anticipate all possible future scenarios and roles. Extensibility, particularly of algorithms used to deal with overload, will be important to cover these cases.

Similarly, the scopes to which overload information may apply may include cases that have not yet been considered. Extensibility in this area will also be important.

The basic mechanism is intended to be application independent, that is, a Diameter node can use it across any existing and future Diameter applications and expect reasonable results. Certain Diameter applications might, however, benefit from application-specific behavior over and above the mechanism's defaults. For example, an application specification might specify relative priorities of messages or selection of a specific overload control algorithm.

7. Solution Requirements

This section proposes requirements for an improved mechanism to control Diameter overload, with the goals of addressing the issues described in Section 5 and supporting the scenarios described in Section 2. These requirements are stated primarily in terms of individual node behavior to inform the design of the improved mechanism; solution designers should keep in mind that the overall goal is improved overall system behavior across all the nodes involved, not just improved behavior from specific individual nodes.

7.1. General

REQ 1: The solution MUST provide a communication method for Diameter nodes to exchange load and overload information.

REQ 2: The solution MUST allow Diameter nodes to support overload control regardless of which Diameter applications they support. Diameter clients and agents must be able to use the received load and overload information to support graceful behavior during an overload condition. Graceful behavior under overload conditions is best described by REQ 3.

REQ 3: The solution MUST limit the impact of overload on the overall useful throughput of a Diameter server, even when the incoming load on the network is far in excess of its capacity. The overall useful throughput under load is the ultimate measure of the value of a solution.

REQ 4: Diameter allows requests to be sent from either side of a connection, and either side of a connection may need to provide its overload status. The solution MUST allow each side of a connection to independently inform the other of its overload status.

REQ 5: Diameter allows nodes to determine their peers via dynamic discovery or manual configuration. The solution MUST work consistently without regard to how peers are determined.

REQ 6: The solution designers SHOULD seek to minimize the amount of new configuration required in order to work. For example, it is better to allow peers to advertise or negotiate support for the solution, rather than to require that this knowledge be configured at each node.

7.2. Performance

REQ 7: The solution and any associated default algorithm(s) MUST ensure that the system remains stable. At some point after an overload condition has ended, the solution MUST enable capacity to stabilize and become equal to what it would be in the absence of an overload condition. Note that this also requires that the solution MUST allow nodes to shed load without introducing non-converging oscillations during or after an overload condition.

REQ 8: Supporting nodes MUST be able to distinguish current overload information from stale information.

REQ 9: The solution MUST function across fully loaded as well as quiescent transport connections. This is partially derived from the requirement for stability in REQ 7.

REQ 10: Consumers of overload information MUST be able to determine when the overload condition improves or ends.

REQ 11: The solution MUST be able to operate in networks of different sizes.

REQ 12: When a single network node fails, goes into overload, or suffers from reduced processing capacity, the solution MUST make it possible to limit the impact of the affected node on other nodes in the network. This helps to prevent a small-scale failure from becoming a widespread outage.

REQ 13: The solution MUST NOT introduce substantial additional work for a node in an overloaded state. For example, a requirement for an overloaded node to send overload information every time it received a new request would introduce substantial work.

REQ 14: Some scenarios that result in overload involve a rapid increase of traffic with little time between normal levels and levels that induce overload. The solution SHOULD provide for rapid feedback when traffic levels increase.

REQ 15: The solution MUST NOT interfere with the congestion control mechanisms of underlying transport protocols. For example, a solution that opened additional TCP connections when the network is congested would reduce the effectiveness of the underlying congestion control mechanisms.
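
As an illustration of REQ 8 and REQ 10, the following sketch shows one way a consumer of overload information might tell current reports from stale ones and notice that an overload condition has ended. The report fields (a sequence number, a requested reduction, and a validity period) and all names are assumptions made for this example; this document does not define them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OverloadReport:
    """Hypothetical report; field names are illustrative assumptions."""
    sequence: int        # increases with every new report from the same node
    reduction_pct: int   # requested traffic reduction; 0 means "overload ended"
    valid_for_s: float   # how long the receiver may act on this report


class OverloadState:
    """Tracks the most recent report received from one peer."""

    def __init__(self) -> None:
        self._report: Optional[OverloadReport] = None
        self._received_at: float = 0.0

    def update(self, report: OverloadReport, now: float) -> bool:
        """Accept the report unless it is not newer than one already seen."""
        if self._report is not None and report.sequence <= self._report.sequence:
            return False  # stale or duplicate information is ignored
        self._report = report
        self._received_at = now
        return True

    def current_reduction(self, now: float) -> int:
        """Return the reduction to apply at time `now`; 0 once the report expired."""
        if self._report is None:
            return 0
        if now - self._received_at >= self._report.valid_for_s:
            return 0  # expiry ends the overload state without a further message
        return self._report.reduction_pct
```

In this sketch, the validity period lets a receiver conclude that the condition has ended even over a quiescent connection, and it does not require the overloaded node to send anything extra, which is in the spirit of REQ 9 and REQ 13.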

7.3. Heterogeneous Support for Solution

REQ 16: The solution is likely to be deployed incrementally. The solution MUST support a mixed environment where some, but not all, nodes implement it.

REQ 17: In a mixed environment with nodes that support the solution and nodes that do not, the solution MUST NOT result in materially less useful throughput during overload than would have resulted if the solution were not present. It SHOULD result in less severe overload in this environment.

REQ 18: In a mixed environment of nodes that support the solution and nodes that do not, the solution MUST NOT preclude elements that support overload control from treating elements that do not support overload control in an equitable fashion relative to those that do. Users and operators of nodes that do not support the solution MUST NOT unfairly benefit from the solution. The solution specification SHOULD provide guidance to implementors for dealing with elements not supporting overload control.

REQ 19: It MUST be possible to use the solution between nodes in different realms and in different administrative domains.

REQ 20: Any explicit overload indication MUST be clearly distinguishable from other errors reported via Diameter.

REQ 21: In cases where a network node fails, is so overloaded that it cannot process messages, or cannot communicate due to a network failure, it may not be able to provide explicit indications of the nature of the failure or its levels of overload. The solution MUST result in at least as much useful throughput as would have resulted if the solution were not in place.

7.4. Granular Control

REQ 22: The solution MUST provide a way for a node to throttle the amount of traffic it receives from a peer node. This throttling SHOULD be graded so that it can be applied gradually as offered load increases. Overload is not a binary state; there may be degrees of overload.

REQ 23: The solution MUST provide sufficient information to enable a load-balancing node to divert messages that are rejected or otherwise throttled by an overloaded upstream node to other upstream nodes that are the most likely to have sufficient capacity to process them.

REQ 24: The solution MUST provide a mechanism for indicating load levels, even when not in an overload condition, to assist nodes in making decisions to prevent overload conditions from occurring.
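
The requirements in this section can be illustrated with a minimal sketch. It assumes, purely for the example, that an overloaded node asks its peer for a percentage reduction in traffic (REQ 22) and that nodes advertise a load figure between 0 and 100 (REQ 24), which a load-balancing node uses to divert traffic (REQ 23). The class and function names are hypothetical and are not part of any specified mechanism.

```python
from typing import Dict, Optional, Set


class GradedThrottle:
    """Sheds a requested percentage of requests, spread evenly over time."""

    def __init__(self) -> None:
        self.reduction_pct = 0  # 0 = no throttling, 100 = send nothing
        self._debt = 0          # accumulated drop credit, in percent

    def allow(self) -> bool:
        """Return True if the next request may be sent to the peer."""
        self._debt += self.reduction_pct
        if self._debt >= 100:
            self._debt -= 100
            return False  # shed this request
        return True


def pick_upstream(load_by_peer: Dict[str, int], overloaded: Set[str]) -> Optional[str]:
    """Choose the non-overloaded peer that advertises the lowest load."""
    candidates = {p: load for p, load in load_by_peer.items() if p not in overloaded}
    if not candidates:
        return None  # no upstream node is likely to have spare capacity
    return min(candidates, key=candidates.get)
```

Because the reduction is a percentage rather than an on/off flag, the same throttle can be tightened or relaxed gradually as the offered load changes.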

7.5. Priority and Policy

REQ 25: The base specification for the solution SHOULD offer general guidance on which message types might be desirable to send or process over others during times of overload, based on application-specific considerations. For example, it may be more beneficial to process messages for existing sessions ahead of new sessions. Some networks may have a requirement to give priority to requests associated with emergency sessions. Any normative or otherwise detailed definition of the relative priorities of message types during an overload condition will be the responsibility of the application specification.

REQ 26: The solution MUST NOT prevent a node from prioritizing requests based on any local policy, so that certain requests are given preferential treatment, given additional retransmission, not throttled, or processed ahead of others.
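
A local prioritization policy of the kind REQ 26 leaves room for might look like the sketch below. The three priority classes follow the examples given in REQ 25; their ordering, the `budget` parameter, and the function name are assumptions made for illustration only.

```python
from enum import IntEnum
from typing import List, Tuple


class Priority(IntEnum):
    """Hypothetical local policy; higher values are served first."""
    NEW_SESSION = 0
    EXISTING_SESSION = 1
    EMERGENCY = 2


def select_to_send(pending: List[Tuple[Priority, str]], budget: int) -> List[str]:
    """Pick at most `budget` requests, highest priority first.

    Requests of equal priority keep their arrival order because the sort is stable.
    """
    ranked = sorted(pending, key=lambda item: item[0], reverse=True)
    return [name for _, name in ranked[:max(budget, 0)]]
```

Here `budget` stands for the number of requests the node is currently willing to send after throttling; anything not selected would be delayed, diverted, or rejected according to local policy.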

7.6. Security

REQ 27: The solution MUST NOT provide new vulnerabilities to malicious attack or increase the severity of any existing vulnerabilities. This includes vulnerabilities to DoS and DDoS attacks as well as replay and man-in-the-middle attacks. Note that the Diameter base specification [RFC6733] lacks end-to-end security, and this must be considered (see Security Considerations in this document (Section 8)). Note that this requirement was expressed at a high level so as to not preclude any particular solution. It is expected that the solution will address this in more detail.

REQ 28: The solution MUST NOT depend on being deployed in environments where all Diameter nodes are completely trusted. It SHOULD operate as effectively as possible in environments where other nodes are malicious; this includes preventing malicious nodes from obtaining more than a fair share of service. Note that this does not imply any responsibility on the solution to detect, or take countermeasures against, malicious nodes.

REQ 29: It MUST be possible for a supporting node to make authorization decisions about what information will be sent to peer nodes based on the identity of those nodes. This allows a domain administrator who considers the load of their nodes to be sensitive information to restrict access to that information. Of course, in such cases, there is no expectation that the solution itself will help prevent overload from that peer node.

REQ 30: The solution MUST NOT interfere with any Diameter-compliant method that a node may use to protect itself from overload from non-supporting nodes or from denial-of-service attacks.

7.7. Flexibility and Extensibility

REQ 31: There are multiple situations where a Diameter node may be overloaded for some purposes but not others. For example, this can happen to an agent or server that supports multiple applications, or when a server depends on multiple external resources, some of which may become overloaded while others are fully available. The solution MUST allow Diameter nodes to indicate overload with sufficient granularity to allow clients to take action based on the overloaded resources without unreasonably forcing available capacity to go unused. The solution MUST support specification of overload information with granularities of at least "Diameter node", "realm", and "Diameter application" and MUST allow extensibility for others to be added in the future.

REQ 32: The solution MUST provide a method for extending the information communicated and the algorithms used for overload control.

REQ 33: The solution MUST provide a default algorithm that is mandatory to implement.

REQ 34: The solution SHOULD provide a method for exchanging overload and load information between elements that are connected by intermediaries that do not support the solution.
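
The granularities named in REQ 31 can be pictured as a scope attached to each piece of overload information. The sketch below is one possible representation, assuming that an unset field matches any value; the type, its fields, and the identifiers used are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Scope:
    """Hypothetical overload scope; a field left as None matches any value."""
    node: Optional[str] = None
    realm: Optional[str] = None
    application_id: Optional[int] = None

    def matches(self, node: str, realm: str, application_id: int) -> bool:
        """Return True if a request with these attributes falls in this scope."""
        return (
            self.node in (None, node)
            and self.realm in (None, realm)
            and self.application_id in (None, application_id)
        )


def is_throttled(scopes: Iterable[Scope], node: str, realm: str,
                 application_id: int) -> bool:
    """Return True if any reported overload scope covers the request."""
    return any(s.matches(node, realm, application_id) for s in scopes)
```

With a scope limited to one node and one application, requests for other applications on the same node remain unthrottled, so available capacity is not left unused. Further scope types could be added as extra optional fields, which is one way to read the extensibility called for in REQ 31 and REQ 32.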

8. Security Considerations

A Diameter overload control mechanism is primarily concerned with the load-related and overload-related behavior of nodes in a Diameter network, and the information used to affect that behavior. Load and overload information is shared between nodes and directly affects the behavior, and thus the information is potentially vulnerable to a number of methods of attack.

Load and overload information may also be sensitive from both business and network protection viewpoints. Operators of Diameter equipment want to control the visibility of load and overload information to keep it from being used for competitive intelligence or for targeting attacks. It is also important that the Diameter overload control mechanism not introduce any way in which any other information carried by Diameter is sent inappropriately.

Note that the Diameter base specification [RFC6733] lacks end-to-end security, making it difficult for non-adjacent nodes to verify the authenticity and ownership of load and overload information. Authentication of load and overload information helps to alleviate several of the security issues listed in this section.

This document includes requirements intended to mitigate the effects of attacks and to protect the information used by the mechanism. This section discusses potential security considerations for overload control solutions. This discussion provides the motivation for several normative requirements described in Section 7. The discussion includes specific references to the normative requirements that apply for each issue.

8.1. Access Control

To control the visibility of load and overload information, sending should be subject to some form of authentication and authorization of the receiver. It is also important to the receivers that they are confident the load and overload information they receive is from a legitimate source. REQ 28 requires that the solution work without assuming that all Diameter nodes in a network are trusted for the purposes of exchanging overload and load information. REQ 29 requires that the solution let nodes restrict unauthorized parties from seeing overload information. Note that this implies a certain amount of configurability on the nodes supporting the Diameter overload control mechanism.

8.2. Denial-of-Service Attacks

An overload control mechanism provides a very attractive target for denial-of-service attacks. A small number of messages may effect a large service disruption by falsely reporting overload conditions. Alternatively, an attacker may make it easier to attack servers that are nearing, or already in, overload by disrupting their overload indications, potentially preventing them from mitigating their overload condition.

A design goal for the Diameter overload control mechanism is to minimize or eliminate the possibility of using the mechanism for this type of attack. More strongly, REQ 27 forbids the solution from introducing new vulnerabilities to malicious attack. Additionally, REQ 30 stipulates that the solution not interfere with other mechanisms used for protection against denial-of-service attacks.

As the intent of some denial-of-service attacks is to induce overload conditions, an effective overload control mechanism should help to mitigate the effects of such an attack.

8.3. Replay Attacks

An attacker that has managed to obtain some messages from the overload control mechanism may attempt to affect the behavior of nodes supporting the mechanism by sending those messages at potentially inopportune times. In addition to time shifting, replay attacks may send messages to other nodes as well (target shifting).

A design goal for the Diameter overload control solution is to minimize or eliminate the possibility of causing disruption by using a replay attack on the Diameter overload control mechanism. (Allowing a replay attack using the overload control solution would violate REQ 27.)
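
The two forms of replay described in this section, time shifting and target shifting, can be illustrated with a small sketch. It assumes, for the sake of the example only, that sender and receiver share a secret key and that each report carries a sequence number; this document does not specify any such mechanism, and all names below are hypothetical.

```python
import hashlib
import hmac
from typing import Dict


def sign(key: bytes, sender: str, target: str, sequence: int, body: bytes) -> bytes:
    """Compute a MAC over the report and the context it was meant for."""
    header = f"{sender}|{target}|{sequence}|".encode()
    return hmac.new(key, header + body, hashlib.sha256).digest()


class ReplayGuard:
    """Rejects reports that are forged, misdirected, or not newer than the last."""

    def __init__(self, key: bytes, my_identity: str) -> None:
        self._key = key
        self._me = my_identity
        self._last_seq: Dict[str, int] = {}

    def accept(self, sender: str, target: str, sequence: int,
               body: bytes, mac: bytes) -> bool:
        expected = sign(self._key, sender, target, sequence, body)
        if not hmac.compare_digest(expected, mac):
            return False  # forged or altered in transit
        if target != self._me:
            return False  # target shifting: meant for another node
        if sequence <= self._last_seq.get(sender, -1):
            return False  # time shifting: an old report sent again
        self._last_seq[sender] = sequence
        return True
```

Binding the intended target and a sequence number into the authenticated data is what lets the receiver detect both kinds of replay; how keys would be distributed between non-adjacent nodes is outside the scope of this sketch.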

8.4. Man-in-the-Middle Attacks

By inserting themselves between two nodes supporting the Diameter overload control mechanism, an attacker may potentially both access and alter the information sent between those nodes. This can be used for information gathering for business intelligence and attack targeting, as well as direct attacks.

REQs 27, 28, and 29 imply a need to prevent man-in-the-middle attacks on the overload control solution. A transport using Transport Layer Security (TLS) and/or IPsec may be desirable for this purpose.

8.5. Compromised Hosts

A compromised host that supports the Diameter overload control mechanism could be used for information gathering as well as for sending malicious information to any Diameter node that would normally accept information from it. While it is beyond the scope of the Diameter overload control mechanism to mitigate any operational interruption to the compromised host, REQs 28 and 29 imply a need to minimize the impact that a compromised host can have on other nodes through the use of the Diameter overload control mechanism. Of course, a compromised host could be used to cause damage in a number of other ways. This is out of scope for a Diameter overload control mechanism.

9. References
9.1. Normative References

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC6733] Fajardo, V., Arkko, J., Loughney, J., and G. Zorn, "Diameter Base Protocol", RFC 6733, October 2012.

[RFC2914] Floyd, S., "Congestion Control Principles", BCP 41, RFC 2914, September 2000.

[RFC3539] Aboba, B. and J. Wood, "Authentication, Authorization and Accounting (AAA) Transport Profile", RFC 3539, June 2003.

9.2. Informative References

[RFC5390] Rosenberg, J., "Requirements for Management of Overload in the Session Initiation Protocol", RFC 5390, December 2008.

[RFC6357] Hilt, V., Noel, E., Shen, C., and A. Abdelal, "Design Considerations for Session Initiation Protocol (SIP) Overload Control", RFC 6357, August 2011.

[TR23.843] 3GPP, "Study on Core Network (CN) overload solutions", TR 23.843 1.2.0, Work in Progress, October 2013.

[IR.34] GSMA, "Inter-Service Provider IP Backbone Guidelines", IR 34 9.1, May 2013.

[IR.88] GSMA, "LTE Roaming Guidelines", IR 88 9.0, January 2013.

[IR.92] GSMA, "IMS Profile for Voice and SMS", IR 92 7.0, March 2013.

[TS23.002] 3GPP, "Network Architecture", TS 23.002 12.2.0, June 2013.

[TS29.272] 3GPP, "Evolved Packet System (EPS); Mobility Management Entity (MME) and Serving GPRS Support Node (SGSN) related interfaces based on Diameter protocol", TS 29.272 12.2.0, September 2013.

[TS29.212] 3GPP, "Policy and Charging Control (PCC) over Gx/Sd reference point", TS 29.212 12.2.0, September 2013.

[TS29.228] 3GPP, "IP Multimedia (IM) Subsystem Cx and Dx interfaces; Signalling flows and message contents", TS 29.228 12.0.0, September 2013.

[TS29.002] 3GPP, "Mobile Application Part (MAP) specification", TS 29.002 12.2.0, September 2013.

Appendix A. Contributors

Significant contributions to this document were made by Adam Roach and Eric Noel.

Appendix B. Acknowledgements

Review of, and contributions to, this specification by Martin Dolly, Carolyn Johnson, Jianrong Wang, Imtiaz Shaikh, Jouni Korhonen, Robert Sparks, Dieter Jacobsohn, Janet Gunn, Jean-Jacques Trottin, Laurent Thiebaut, Andrew Booth, and Lionel Morand were most appreciated. We would like to thank them for their time and expertise.

Authors' Addresses

   Eric McMurry
   Oracle
   17210 Campbell Rd.
   Suite 250
   Dallas, TX 75252
   US

   EMail: emcmurry@computer.org

   Ben Campbell
   Oracle
   17210 Campbell Rd.
   Suite 250
   Dallas, TX 75252
   US

   EMail: ben@nostrum.com