Network Working Group                                  G. Choudhury, Ed.
Request for Comments: 4222                                          AT&T
BCP: 112                                                    October 2005
Category: Best Current Practice
        

Prioritized Treatment of Specific OSPF Version 2 Packets and Congestion Avoidance

Status of This Memo

This document specifies an Internet Best Current Practices for the Internet Community, and requests discussion and suggestions for improvements. Distribution of this memo is unlimited.

Copyright Notice

Copyright (C) The Internet Society (2005).

Abstract

This document recommends methods that are intended to improve the scalability and stability of large networks using Open Shortest Path First (OSPF) Version 2 protocol. The methods include processing OSPF Hellos and Link State Advertisement (LSA) Acknowledgments at a higher priority compared to other OSPF packets, and other congestion avoidance procedures.

Table of Contents

   1. Introduction
   2. Recommendations
   3. Security Considerations
   4. Acknowledgments
   5. Normative References
   6. Informative References
   Appendix A. LSA Storm: Causes and Impact
   Appendix B. List of Variables and Values
   Appendix C. Other Recommendations and Suggestions
        
1. Introduction

In this document, OSPF refers to OSPFv2 [Ref1]. The scalability and stability improvement techniques described here may also apply to OSPFv3 [Ref2], but that will require further study and operational experience.

A large network running OSPF protocol may occasionally experience the simultaneous or near-simultaneous update of a large number of link state advertisements, or LSAs. This is particularly true if the OSPF traffic engineering extension [Ref3] is used, which may significantly increase the number of LSAs in the network. We call this event an LSA storm; it may be initiated by an unscheduled failure or a scheduled maintenance event. The failure may be hardware, software, or procedural in nature.

The LSA storm causes high CPU and memory utilization at the router causing incoming packets to be delayed or dropped. Delayed acknowledgments (beyond the retransmission timer value) result in retransmissions, and delayed Hello packets (beyond the router-dead interval) result in neighbor adjacencies being declared down. The retransmissions and additional LSA originations result in further CPU and memory usage, essentially causing a positive feedback loop, which, in the extreme case, may drive the network to an unstable state.

The default value of the retransmission timer is 5 seconds and that of the router-dead interval is 40 seconds. However, recently there has been a lot of interest in significantly reducing OSPF convergence time. As part of that plan, much shorter (sub-second) Hello and router-dead intervals have been proposed [Ref4]. In such a scenario, it will be more likely for Hello packets to be delayed beyond the router-dead interval during network congestion caused by an LSA storm.

In order to improve the scalability and stability of networks, we recommend steps for prioritizing critical OSPF packets and avoiding congestion. The details of the recommendations are given in Section 2. A simulation study is reported in [Ref13] that quantifies the congestion phenomenon and its impact. It also studies several of the recommendations and shows that they indeed improve the scalability and stability of networks using OSPF protocol. [Ref13] is available on request by contacting the editor or one of the authors.

Appendix A explains LSA storm scenarios and their impact in more detail and points out a few real-life examples of control-message storms. Appendix B provides a list of variables used in the recommendations and their example values. Appendix C provides some further recommendations and suggestions with similar goals.

2. Recommendations

The recommendations below are intended to improve the scalability and stability of large networks using OSPF protocol. During periods of network congestion, they would reduce retransmissions, avoid an adjacency being declared down due to Hello packets being delayed beyond the RouterDeadInterval, and take other congestion avoidance steps. The recommendations are unordered except that Recommendation 2 is to be implemented only if Recommendation 1 is not implemented.

(1) Classify all OSPF packets in two classes: a "high priority" class comprising OSPF Hello packets and Link State Acknowledgement packets, and a "low priority" class comprising all other packets. The classification is accomplished by examining the OSPF packet header. While receiving a packet from a neighbor and while transmitting a packet to a neighbor, try to process a "high priority" packet ahead of a "low priority" packet.

The prioritized processing while transmitting may cause OSPF packets from a neighbor to be received out of sequence. If Cryptographic Authentication (AuType = 2) is used (as specified in [Ref1]), then successive received valid OSPF packets from a neighbor need to have a non-decreasing "Cryptographic sequence number". To comply with this requirement, we recommend that in case Cryptographic Authentication (AuType = 2) is used [Ref1], prioritized processing not be done at the transmitter. This will avoid packets arriving at the receiver out of sequence. However, after security processing at the receiver (including sequence number checking) is complete, the OSPF packets may be kept in a "high priority" queue or a "low priority" queue based on their class and processed accordingly. The benefit of prioritized processing is clearly higher in the absence of Cryptographic Authentication since in that case prioritization can be implemented both at the transmitter and at the receiver. However, even with Cryptographic Authentication it will be beneficial to have prioritization only at the receiver (following security processing).
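The two-class treatment in Recommendation 1 can be sketched as follows. This is a minimal illustration, not a definitive implementation: the packet type codes (Hello = 1, Link State Acknowledgment = 5) and the 24-byte header layout are those of the OSPFv2 packet header in [Ref1], while the names classify, PriorityReceiver, and make_header are hypothetical and introduced here only for the example. The receiver is assumed to enqueue packets only after security processing has completed.

```python
import struct
from collections import deque

# OSPFv2 packet type codes carried in the Type field of the OSPF header.
HELLO, DB_DESCRIPTION, LS_REQUEST, LS_UPDATE, LS_ACK = 1, 2, 3, 4, 5
HIGH_PRIORITY_TYPES = {HELLO, LS_ACK}


def classify(packet: bytes) -> str:
    """Return "high" or "low" based on the Type field (second header byte)."""
    if len(packet) < 24:
        raise ValueError("truncated OSPF header")
    _version, packet_type = struct.unpack_from("!BB", packet, 0)
    return "high" if packet_type in HIGH_PRIORITY_TYPES else "low"


class PriorityReceiver:
    """Hypothetical receive-side scheduler: two FIFO queues, "high" served first."""

    def __init__(self) -> None:
        self.queues = {"high": deque(), "low": deque()}

    def enqueue(self, packet: bytes) -> None:
        # Assumed to be called only after authentication and, for AuType = 2,
        # cryptographic sequence number checking have succeeded.
        self.queues[classify(packet)].append(packet)

    def next_packet(self):
        """Return the next packet to process, or None if both queues are empty."""
        for priority in ("high", "low"):
            if self.queues[priority]:
                return self.queues[priority].popleft()
        return None


def make_header(packet_type: int) -> bytes:
    """Build a bare 24-byte OSPFv2 header for illustration (all other fields zero)."""
    return struct.pack("!BBH4s4sHH8s", 2, packet_type, 24,
                       bytes(4), bytes(4), 0, 0, bytes(8))
```

Within each class the sketch preserves arrival order, so it changes only the relative order of "high" and "low" priority packets.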

(2) If Recommendation 1 cannot be implemented, then reset the inactivity timer for an adjacency whenever any OSPF unicast packet or any OSPF packet sent to AllSPFRouters over a point-to-point link is received over that adjacency instead of resetting the inactivity timer only on receipt of the Hello packet. So OSPF would declare the adjacency to be down only if no OSPF unicast packets or no OSPF packets sent to AllSPFRouters over a point-to-point link are received over that adjacency for a period equaling or exceeding the RouterDeadInterval. The reason for not recommending this proposal in conjunction with Recommendation 1 is to avoid potential undesirable side effects. One such effect is the delay in discovering the down status of an adjacency in a case where no high priority Hello packets are being received but the inactivity timer is being reset by other stale packets in the low priority queue.
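The timer rule in Recommendation 2 can be sketched as follows. This is a minimal illustration under stated assumptions: the Adjacency class, its method names, and the boolean flags describing how a packet arrived are hypothetical, and time is passed in explicitly as seconds so the logic can be followed without a real clock.

```python
class Adjacency:
    """Hypothetical per-neighbor state for the inactivity timer of Recommendation 2."""

    def __init__(self, router_dead_interval: float, now: float = 0.0) -> None:
        self.router_dead_interval = router_dead_interval
        self.last_heard = now

    def on_packet(self, now: float, is_unicast: bool,
                  to_all_spf_routers: bool = False,
                  point_to_point: bool = False) -> None:
        """Reset the timer for any unicast OSPF packet, or for any OSPF packet
        sent to AllSPFRouters over a point-to-point link."""
        if is_unicast or (to_all_spf_routers and point_to_point):
            self.last_heard = now

    def is_down(self, now: float) -> bool:
        """True once no qualifying packet has arrived for a period equaling
        or exceeding the RouterDeadInterval."""
        return now - self.last_heard >= self.router_dead_interval
```

For example, with a RouterDeadInterval of 40 seconds, a unicast packet received at t = 30 keeps the adjacency up until t = 70, whether or not any Hello packet arrived in between.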

(3) Use an exponential backoff algorithm for determining the value of the LSA retransmission interval (RxmtInterval). Let R(i) represent the RxmtInterval value used during the i-th retransmission of an LSA. Use the following algorithm to compute R(i).

                    R(1) = Rmin
                    R(i+1) = Min(K*R(i), Rmax)  for i >= 1

where K, Rmin, and Rmax are constants and the function Min(.,.) represents the minimum value of its two arguments. Example values for K, Rmin, and Rmax may be 2, 5 seconds, and 40 seconds, respectively. Note that the example value for Rmin, the initial retransmission interval, is the same as the sample value of RxmtInterval in [Ref1].

This recommendation is motivated by the observation that during a network congestion event caused by control messages, a major source for sustaining the congestion is the repeated retransmission of LSAs. The use of an exponential backoff algorithm for the LSA retransmission interval reduces the rate of LSA retransmissions while the network experiences congestion (during which it is more likely that multiple retransmissions of the same LSA would happen). This in turn helps the network get out of the congested state.
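The retransmission interval rule of Recommendation 3 can be sketched as follows. The function names rxmt_intervals and first_intervals are hypothetical; the default arguments are the example values given above (K = 2, Rmin = 5 seconds, Rmax = 40 seconds).

```python
from itertools import islice


def rxmt_intervals(k: float = 2.0, r_min: float = 5.0, r_max: float = 40.0):
    """Yield R(1), R(2), ... with R(1) = Rmin and R(i+1) = Min(K*R(i), Rmax)."""
    interval = r_min
    while True:
        yield interval
        interval = min(k * interval, r_max)


def first_intervals(count: int, **params) -> list:
    """Return the first `count` retransmission intervals, in seconds."""
    return list(islice(rxmt_intervals(**params), count))


if __name__ == "__main__":
    print(first_intervals(6))
```

With the example values, the first six intervals are 5, 10, 20, 40, 40, and 40 seconds: the interval doubles on each retransmission of the same LSA until it is capped at Rmax.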

(4) Implicit Congestion Detection and Action Based on That: If there is control message congestion at a router, its neighbors do not know about that explicitly. However, they can implicitly detect it based on the number of unacknowledged LSAs to this router. If this number exceeds a certain "high-water mark", then the rate at which LSAs are sent to this router should be reduced progressively using an exponential backoff mechanism but not below a certain minimum rate. At a future time, if the number of unacknowledged LSAs to this router falls below a certain "low-water mark", then the rate of sending LSAs to this router should be increased progressively, again using an exponential backoff mechanism but not above a certain maximum rate. The whole algorithm is given below. Note that this algorithm is to be applied independently to each neighbor and only for unicast LSAs sent to a neighbor or LSAs sent to AllSPFRouters over a point-to-point link.

Let,

U(t) = Number of unacknowledged LSAs to neighbor at time t.

H = A high-water mark (in units of number of unacknowledged LSAs).

L = A low-water mark (in units of number of unacknowledged LSAs).

G(t) = Gap between sending successive LSAs to neighbor at time t.

F = The factor by which the above gap is to be increased during congestion and decreased after coming out of congestion.

T = Minimum time that has to elapse before the existing gap is considered for change.

Gmin = Minimum allowed value of gap.

Gmax = Maximum allowed value of gap.

       The equation below shows how the gap is to be changed after a
       time T has elapsed since the last change:
                 _
                |
                | Min(F*G(t), Gmax)   if U(t+T) > H
       G(t+T) = | G(t)                if H >= U(t+T) >= L
                | Max(G(t)/F, Gmin)   if U(t+T) < L
                |_

Min(.,.) and Max(.,.) represent the minimum and maximum values of the two arguments, respectively.

Example values for the various parameters of the algorithm are as follows: H = 20, L = 10, F = 2, T = 1 second, Gmin = 20 ms, Gmax = 1 second.

Recommendations 3 and 4 both slow down LSAs to congested neighbors based on implicitly detecting the congestion, but they have important differences. Recommendation 3 progressively slows down successive retransmissions of the same LSA, whereas Recommendation 4 progressively slows down all LSAs (new or retransmission) to a congested neighbor.
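The gap adjustment of Recommendation 4 can be sketched as follows. The function name next_gap and its parameter names are hypothetical; the defaults are the example values given above (H = 20, L = 10, F = 2, Gmin = 20 ms, Gmax = 1 second), with gaps expressed in seconds. The caller is assumed to invoke the function once per neighbor each time the interval T has elapsed since the last change.

```python
def next_gap(gap: float, unacked: int, high: int = 20, low: int = 10,
             factor: float = 2.0, g_min: float = 0.020,
             g_max: float = 1.0) -> float:
    """Return G(t+T) given G(t) and U(t+T), the current unacknowledged LSA count.

    Above the high-water mark the gap grows by `factor` (capped at g_max);
    below the low-water mark it shrinks by `factor` (floored at g_min);
    between the two marks it is left unchanged.
    """
    if unacked > high:
        return min(factor * gap, g_max)
    if unacked < low:
        return max(gap / factor, g_min)
    return gap
```

Because the gap is left unchanged when the count lies between L and H, the two water marks provide hysteresis and keep the sending rate from oscillating when the count hovers near a single threshold.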

(5) Throttling Adjacencies to Be Brought Up Simultaneously: If a router tries to bring up a large number of adjacencies to its neighbors simultaneously, then that may cause severe congestion due to database synchronization and LSA flooding activities. It is recommended that during such a situation no more than "n" adjacencies should be brought up simultaneously. Once a subset of adjacencies has been brought up successfully, newer adjacencies may be brought up as long as the number of simultaneous adjacencies being brought up does not exceed "n". The appropriate value of "n" would depend on the router processing power, total bandwidth available for control plane traffic, and propagation delay. The value of "n" should be configurable.

In the presence of throttling, an important issue is the order in which adjacencies are to be formed. We recommend a First Come First Served (FCFS) policy based on the order in which the request for adjacency formation arrives. Requests may either be from neighbors or self-generated. Among the self-generated requests, a priority list may be used to decide the order in which the requests are to be made. However, once an adjacency formation process starts it is not to be preempted except for unusual circumstances such as errors or time-outs.
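The throttling and First Come First Served ordering of Recommendation 5 can be sketched as follows. The AdjacencyThrottle class and its method names are hypothetical. The sketch assumes that the caller reports every completed, failed, or timed-out adjacency formation through finished(), and it does not model the optional priority list for self-generated requests.

```python
from collections import deque


class AdjacencyThrottle:
    """Hypothetical limiter: at most n adjacencies are being brought up at once."""

    def __init__(self, n: int) -> None:
        self.n = n                  # configurable upper limit "n"
        self.pending = deque()      # requests waiting, in arrival (FCFS) order
        self.in_progress = set()    # adjacencies currently being formed

    def request(self, neighbor: str) -> list:
        """Queue a formation request; return the neighbors that may start now."""
        self.pending.append(neighbor)
        return self._start_ready()

    def finished(self, neighbor: str) -> list:
        """Record that formation ended; return the neighbors that may start now."""
        self.in_progress.discard(neighbor)
        return self._start_ready()

    def _start_ready(self) -> list:
        started = []
        while self.pending and len(self.in_progress) < self.n:
            neighbor = self.pending.popleft()
            self.in_progress.add(neighbor)
            started.append(neighbor)
        return started
```

A formation that has started is never preempted by a later request; it leaves the in-progress set only when finished() is called for it.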

In some of the recommendations above, we refer to point-to-point links. Those references should also include cases where a broadcast network is to be treated as a point-to-point connection from the standpoint of IP routing [Ref5].

3. Security Considerations

This memo does not create any new security issues for the OSPF protocol.

4. Acknowledgments

We would like to acknowledge the support and helpful comments from OSPF WG chairs Rohit Dube, Acee Lindem, and John Moy; Routing Area directors Alex Zinin and Bill Fenner; and IESG reviewers. We acknowledge Vivek Dube, Mitchell Erblich, Mike Fox, Tony Przygienda, and Krishna Rao for comments on previous versions of the document. We also acknowledge Margaret Chiosi, Elie Francis, Jeff Han, Beth Munson, Roshan Rao, Moshe Segal, Mike Wardlow, and Pat Wirth for collaboration and encouragement in our scalability improvement efforts for Link State Protocol-based networks.

5. Normative References

[Ref1] Moy, J., "OSPF Version 2", STD 54, RFC 2328, April 1998.

[Ref2] Coltun, R., Ferguson, D., and J. Moy, "OSPF for IPv6", RFC 2740, December 1999.

6. Informative References

[Ref3] Katz, D., Kompella, K., and D. Yeung, "Traffic Engineering (TE) Extensions to OSPF Version 2", RFC 3630, September 2003.

[Ref4] Alaettinoglu, C., Jacobson, V., and H. Yu, "Towards Millisecond IGP Convergence", Work in Progress.

[Ref5] Shen, N., Lindem, A., Yuan, J., Zinin, A., White, R., and S. Previdi, "Point-to-point operation over LAN in link-state routing protocols", Work in Progress.

[Ref6] Pappalardo, D., "AT&T, customers grapple with ATM net outage", Network World, February 26, 2001.

[Ref7] "AT&T announces cause of frame-relay network outage", AT&T Press Release, April 22, 1998.

[Ref8] Cholewka, K., "MCI Outage Has Domino Effect", Inter@ctive Week, August 20, 1999.

[Ref9] Jander, M., "In Qwest Outage, ATM Takes Some Heat", Light Reading, April 6, 2001.

[Ref10] Zinin, A. and M. Shand, "Flooding Optimizations in Link-State Routing Protocols", Work in Progress.

[Ref11] Pillay-Esnault, P., "OSPF Refresh and Flooding Reduction in Stable Topologies", RFC 4136, July 2005.

[Ref12] Ash, G., Choudhury, G., Sapozhnikova, V., Sherif, M., Maunder, A., and V. Manral, "Congestion Avoidance & Control for OSPF Networks", Work in Progress.

[Ref13] Choudhury, G., Ash, G., Manral, V., Maunder, A., and V. Sapozhnikova, "Prioritized Treatment of Specific OSPF Packets and Congestion Avoidance: Algorithms and Simulations", AT&T Technical Report, August 2003.

[Ref14] Nichols, K., Blake, S., Baker, F., and D. Black, "Definition of the Differentiated Services Field (DS Field) in the IPv4 and IPv6 Headers", RFC 2474, December 1998.

Appendix A. LSA Storm: Causes and Impact

An LSA storm may be initiated due to many reasons. Here are some examples:

(a) one or more link failures due to fiber cuts,

(b) one or more router failures for some reason, e.g., software crash or some type of disaster (including power outage) in an office complex hosting many routers,

(c) link/router flapping,

(d) requirement of taking down and later bringing back many routers during a software/hardware upgrade,

(e) near synchronization of the periodic 1800-second LSA refreshes of a subset of LSAs,

(f) refresh of all LSAs in the system during a change in software version,

(g) injecting a large number of external routes to OSPF due to a procedural error,

(h) Router ID changes causing a large number of LSA re-originations (possibly LSA purges as well depending on the implementation).

In addition to the LSAs originated as a direct result of link/router failures, there may be other indirect LSAs as well. One example in MPLS networks is traffic engineering LSAs [Ref3] originated at other links as a result of significant changes in reserved bandwidth. These result from rerouting of Label Switched Paths (LSPs) that went down during the link/router failure. The LSA storm causes high CPU and memory utilization at the router processor, causing incoming packets to be delayed or dropped. Delayed acknowledgments (beyond the retransmission timer value) result in retransmissions, and delayed Hello packets (beyond the Router-Dead interval) result in links being declared down. A trunk-down event causes router LSA origination by its end-point routers. If traffic engineering LSAs are used for each link, then that type of LSA would also be originated by the end-point routers and potentially elsewhere as well due to significant changes in reserved bandwidths at other links caused by the failure and reroute of LSPs originally using the failed trunk. Eventually, when the link recovers, that would also trigger additional router LSAs and traffic engineering LSAs.

The retransmissions and additional LSA originations result in further CPU and memory usage, essentially causing a positive feedback loop. We define the LSA storm size as the number of LSAs in the original storm, not counting any additional LSAs resulting from the feedback loop described above. If the LSA storm is too large, then the positive feedback loop mentioned above may be large enough to indefinitely sustain a large CPU and memory utilization at many routers in the network, thereby driving the network to an unstable state. In the past, network outage events have been reported in IP and ATM networks using link-state protocols such as OSPF, Intermediate System to Intermediate System (IS-IS), Private Network-Network Interface (PNNI), or some proprietary variants. See, for example, [Ref6-Ref9]. In many of these examples, large-scale flooding of LSAs or other similar control messages (either naturally or triggered by some bug or inappropriate procedure) has been partly or fully responsible for network instability and outage.

In [Ref13], a simulation model is used to show that there is a certain LSA storm size threshold above which the network may show unstable behavior caused by a large number of retransmissions, link failures due to missed Hello packets, and subsequent link recoveries. It is also shown that the LSA storm size causing instability may be substantially increased by providing prioritized treatment to Hello and LSA Acknowledgment packets and by using an exponential backoff algorithm for determining the LSA retransmission interval. If it is not possible to prioritize Hello packets, then resetting the inactivity timer on receiving any valid OSPF packet can also provide the same benefit. Furthermore, if we prioritize Hello packets, then even when the network operates somewhat above the stability threshold, links are not declared down due to missed Hellos. This implies that even though there is control plane congestion due to many retransmissions, the data plane stays up and no new LSAs are originated (besides the ones in the original storm and the refreshes). These observations support the first three recommendations in Section 2. The authors of this document have also done simulations to verify that the other recommendations in Section 2 help avoid congestion and allow a graceful exit from a congested state.

One might argue that the scalability issue of large networks should be solved solely by dividing the network hierarchically into multiple areas so that flooding of LSAs remains localized within areas. However, this approach increases the network management and design complexity and may result in less optimal routing between areas. Also, Autonomous System External (ASE) LSAs are flooded throughout the AS, and it may be a problem if there are large numbers of them. Furthermore, a large number of summary LSAs may need to be flooded across areas, and their numbers would increase significantly if multiple Area Border Routers are employed for the purpose of reliability. Thus, it is important to allow the network to grow towards as large a size as possible under a single area.

The recommendations in the document are synergistic with a broader set of scalability and stability improvement proposals. [Ref10] proposes flooding overhead reduction in case more than one interface goes to the same neighbor. [Ref11] proposes a mechanism for greatly reducing LSA refreshes in stable topologies.

[Ref12] proposes a wide range of congestion control and failure recovery mechanisms (some of those ideas are covered in this document, but [Ref12] has other ideas not covered here).

Appendix B. List of Variables and Values

F = The factor by which the gap between sending successive LSAs to a neighbor is to be increased during congestion and decreased after coming out of congestion (used in Recommendation 4). Example value is 2.

G(t) = Gap between sending successive LSAs to a neighbor at time t (used in Recommendation 4).

Gmax = Maximum allowed value of gap between sending successive LSAs to a neighbor (used in Recommendation 4). Example value is 1 second.

Gmin = Minimum allowed value of gap between sending successive LSAs to a neighbor (used in Recommendation 4). Example value is 20 ms.

H = A high-water mark (in units of number of unacknowledged LSAs). Exceeding this mark would trigger a potential increase in the gap between sending successive LSAs to a neighbor (used in Recommendation 4). Example value is 20.

K = A multiplicative constant used in increasing the RxmtInterval value used during successive retransmissions of the same LSA (used in Recommendation 3). Example value is 2.

L = A low-water mark (in units of number of unacknowledged LSAs). Dropping below this mark would trigger a potential decrease in the gap between sending successive LSAs to a neighbor (used in Recommendation 4). Example value is 10.

n = Upper limit on the number of adjacencies to be brought up simultaneously (used in Recommendation 5).

R(i) = RxmtInterval value used during the i-th retransmission of an LSA (used in Recommendation 3).

Rmax = The maximum allowed value of RxmtInterval (used in Recommendation 3). Example value is 40 seconds.

Rmin = The minimum allowed value of RxmtInterval (used in Recommendation 3). Example value is 5 seconds.

T = Minimum time that has to elapse before the existing gap between sending successive LSAs to a neighbor is considered for change (used in Recommendation 4). Example value is 1 second.

U(t) = Number of unacknowledged LSAs to a neighbor at time t (used in Recommendation 4).

Appendix C. Other Recommendations and Suggestions

(1) Explicit Marking: In Section 2, we recommended that OSPF packets be classified into "high" and "low" priority classes based on examining the OSPF packet header. In some cases (particularly in the receiver), this examination may be computationally costly. An alternative would be the use of different TOS/Precedence field settings for the two priority classes. [Ref1] recommends setting the TOS field to 0 and the Precedence field to 6 for all OSPF packets. We recommend this same setting for the "low" priority OSPF packets and a different setting for the "high" priority OSPF packets in order to be able to classify them separately without having to examine the OSPF packet header. Two examples are given below:

Example 1: For "low" priority packets, set TOS field to 0 and Precedence field to 6, and for "high" priority packets set TOS field to 4 and Precedence field to 6.

Example 2: For "low" priority packets, set TOS field to 0 and Precedence field to 6, and for "high" priority packets set TOS field to 0 and Precedence field to 7.

Note that the TOS/Precedence bits have been redefined by Diffserv (RFC 2474, [Ref14]). Also note that the different TOS/Precedence field settings suggested above only need to be agreed upon among the systems on the link. This recommendation need not be followed if it is easy to examine the OSPF packet header and thereby separately classify "high" and "low" priority packets.
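
The two example settings can be turned into concrete IP header octets with a short Python sketch. It assumes the pre-Diffserv octet layout of three Precedence bits, four TOS bits, and one zero bit, and it assumes that "TOS field to 4" means the plain integer value of the four-bit TOS field. Both are assumptions of this illustration; an implementation that encodes TOS differently would produce a different octet. The function names are hypothetical.

```python
def tos_octet(precedence, tos):
    """Pack Precedence (3 bits) and TOS (4 bits) into one IPv4 header octet.

    Assumed layout, most significant bit first: PPP TTTT 0.
    """
    if not 0 <= precedence <= 7:
        raise ValueError("precedence must fit in 3 bits")
    if not 0 <= tos <= 15:
        raise ValueError("tos must fit in 4 bits")
    return (precedence << 5) | (tos << 1)


def classify_by_marking(octet, high_octet):
    """Receiver side: classify from the IP header alone, without
    parsing the OSPF packet header."""
    return "high" if octet == high_octet else "low"


LOW = tos_octet(6, 0)              # "low" priority in both examples
HIGH_EXAMPLE_1 = tos_octet(6, 4)   # Example 1: TOS 4, Precedence 6
HIGH_EXAMPLE_2 = tos_octet(7, 0)   # Example 2: TOS 0, Precedence 7

for name, value in [("low", LOW),
                    ("high, example 1", HIGH_EXAMPLE_1),
                    ("high, example 2", HIGH_EXAMPLE_2)]:
    # The upper six bits are what a Diffserv node reads as the DS codepoint.
    print(f"{name}: 0x{value:02X}, upper six bits = {value >> 2}")
```

Under these assumptions the "low" setting is 0xC0, whose upper six bits equal 48, and Example 2 gives 0xE0, whose upper six bits equal 56. Example 1 gives 0xC8, which differs from the "low" octet only in the old TOS bits, so a node that interprets the octet under Diffserv sees a different codepoint rather than a TOS value.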

(2) Further Prioritization of OSPF Packets: Besides the packets designated as "high" priority in Recommendation 1 of Section 2, there may be a need for further priority separation among the "low" priority OSPF packets. We recommend the use of three priority classes: "high", "medium", and "low". Both when receiving a packet from a neighbor and when transmitting a packet to a neighbor, try to process a "high" priority packet ahead of "medium" and "low" priority packets, and a "medium" priority packet ahead of "low" priority packets. The "high" priority packets are as designated in Recommendation 1 of Section 2. We provide below two candidate examples for "medium" priority packets. All OSPF packets not designated as "high" or "medium" priority are "low" priority. If Cryptographic Authentication (AuType = 2) is used (as specified in [Ref1]), then prioritized treatment is to be provided only at the receiver and after security processing, but not at the transmitter, since reordering at the transmitter may cause packets to arrive out of sequence and violate the requirements of AuType = 2.

One example of a "medium" priority packet is the Database Description (DBD) packet from a slave (during the database synchronization process) that is used as an acknowledgment.

A second example is an LSA carrying intra-area topology change information (this may trigger SPF calculation and rerouting of Label Switched Paths, so fast processing of this packet may improve OSPF/Label Distribution Protocol (LDP) convergence times). However, if the processing cost of identifying and separately queueing the LSA in this example is deemed to be high, then the implementer may decide not to do it.
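
The three-class scheme and the two "medium" candidates above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation. It assumes that the "high" class consists of Hello and Link State Acknowledgment packets, that a DBD packet with the Master/Slave bit clear was sent by the slave, and that the caller already knows whether a Link State Update carries an intra-area topology change (the `intra_area_change` flag is hypothetical, as are the class and function names).

```python
import heapq
import itertools

HIGH, MEDIUM, LOW = 0, 1, 2      # smaller number is served first

# OSPFv2 packet type codes.
HELLO, DBD, LS_REQUEST, LS_UPDATE, LS_ACK = 1, 2, 3, 4, 5
DBD_MS_BIT = 0x01                # Master/Slave bit in the DBD flags octet


def classify(pkt_type, dbd_flags=0, intra_area_change=False):
    """Map a packet to HIGH, MEDIUM, or LOW."""
    if pkt_type in (HELLO, LS_ACK):
        return HIGH
    if pkt_type == DBD and not (dbd_flags & DBD_MS_BIT):
        return MEDIUM            # DBD from the slave, acting as an ack
    if pkt_type == LS_UPDATE and intra_area_change:
        return MEDIUM            # hypothetical flag supplied by the caller
    return LOW


def may_prioritize(direction, au_type):
    """With AuType = 2, reorder only on receive, after security processing."""
    return direction == "rx" or au_type != 2


class PriorityScheduler:
    """Strict-priority queue that stays FIFO within each class."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()   # tie-breaker preserving arrival order

    def push(self, priority, packet):
        heapq.heappush(self._heap, (priority, next(self._seq), packet))

    def pop(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


sched = PriorityScheduler()
sched.push(classify(LS_UPDATE), "LS Update")
sched.push(classify(DBD, dbd_flags=0x00), "DBD from slave")
sched.push(classify(HELLO), "Hello")
while len(sched):
    print(sched.pop())           # Hello, then DBD from slave, then LS Update
```

The sequence counter keeps packets of the same class in arrival order, so only the relative order of different classes changes. A transmitter using AuType = 2 would skip the scheduler entirely, which is what `may_prioritize("tx", 2)` returning False is meant to express.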

(3) Processing a Large Number of LSA Purges: Occasionally, some events in the network, such as router ID changes, may result in a large number of LSA re-originations and LSA purges. In such a scenario, one may consider processing LSAs in a different order, e.g., processing LSA purges ahead of LSA originations. We, however, do not recommend out-of-order LSA processing, for three reasons. First, detecting the LSA type ahead of queueing may be computationally expensive. Second, out-of-order processing may cause subtle bugs. Third, we do not want to recommend a major change in the LSA processing paradigm for a relatively rare event such as a router ID change. However, a router whose ID is changing may flush its old LSAs gradually, so that the flush itself does not cause a storm.
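
One way to read "flush the old LSAs gradually" is to purge them in small batches separated by a fixed gap. The Python sketch below illustrates that reading only. The batch size and the gap are hypothetical tuning values, and the function name is an invention of this example; the text does not prescribe any particular pacing method.

```python
def flush_schedule(lsa_ids, batch_size=10, gap_seconds=1.0):
    """Yield (time_offset_seconds, batch) pairs that spread purges over time.

    `batch_size` and `gap_seconds` are assumed example values.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(lsa_ids), batch_size):
        offset = (start // batch_size) * gap_seconds
        yield offset, lsa_ids[start:start + batch_size]


old_lsas = list(range(25))       # stand-ins for 25 old LSA identifiers
for offset, batch in flush_schedule(old_lsas):
    print(f"t+{offset:.1f}s: purge {len(batch)} LSAs")
```

With 25 LSAs and the assumed defaults, the purges go out as three batches of 10, 10, and 5 at one-second spacing, and the LSAs are still processed in their original order.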

Contributing Authors and Their Addresses

In addition to the editor, several people contributed to this document. The names and contact information of all authors are given below.

   Anurag S. Maunder
   Erlang Technology
   2880 Scott Boulevard
   Santa Clara, CA 95052
   USA

   Phone: (408) 420-7617
   EMail: anuragm@erlangtech.com

   Gerald R. Ash
   AT&T
   Room D5-2A01
   200 Laurel Avenue
   Middletown, NJ, 07748
   USA

   Phone: (732) 420-4578
   EMail: gash@att.com

   Vishwas Manral
   Sinett Corp,
   2/1 Embassy Icon Annex,
   Infantry Road,
   Bangalore 560 001
   India

   Phone: +91-(805)-137-7023
   EMail: vishwas@sinett.com
        

   Vera D. Sapozhnikova
   AT&T
   Room C5-2C29
   200 Laurel Avenue
   Middletown, NJ, 07748
   USA

   Phone: (732) 420-2653
   EMail: sapozhnikova@att.com

Editor's Address

   Gagan L. Choudhury
   AT&T
   Room D5-3C21
   200 Laurel Avenue
   Middletown, NJ, 07748
   USA

   Phone: (732) 420-3721
   EMail: gchoudhury@att.com

Full Copyright Statement

Copyright (C) The Internet Society (2005).

This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights.

This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf-ipr@ietf.org.

Acknowledgement

Funding for the RFC Editor function is currently provided by the Internet Society.
