Internet Engineering Task Force (IETF)                          M. Shand
Request for Comments: 5715                                     S. Bryant
Category: Informational                                    Cisco Systems
ISSN: 2070-1721                                             January 2010
        
A Framework for Loop-Free Convergence

Abstract

A micro-loop is a packet forwarding loop that may occur transiently among two or more routers in a hop-by-hop packet forwarding paradigm.

This framework provides a summary of the causes and consequences of micro-loops and enables the reader to form a judgement on whether micro-looping is an issue that needs to be addressed in specific networks. It also provides a survey of the currently proposed mechanisms that may be used to prevent or to suppress the formation of micro-loops when an IP or MPLS network undergoes topology change due to failure, repair, or management action. When sufficiently fast convergence is not available and the topology is susceptible to micro-loops, use of one or more of these mechanisms may be desirable.

Status of This Memo

This document is not an Internet Standards Track specification; it is published for informational purposes.

This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Not all documents approved by the IESG are a candidate for any level of Internet Standard; see Section 2 of RFC 5741.

Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at http://www.rfc-editor.org/info/rfc5715.

Copyright Notice

Copyright (c) 2010 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

   1. Introduction
   2. The Nature of Micro-Loops
   3. Applicability
   4. Micro-Loop Control Strategies
   5. Loop Mitigation
      5.1. Fast Convergence
      5.2. PLSN
   6. Micro-Loop Prevention
      6.1. Incremental Cost Advertisement
      6.2. Nearside Tunneling
      6.3. Farside Tunnels
      6.4. Distributed Tunnels
      6.5. Packet Marking
      6.6. MPLS New Labels
      6.7. Ordered FIB Update
      6.8. Synchronised FIB Update
   7. Using PLSN in Conjunction with Other Methods
   8. Loop Suppression
   9. Compatibility Issues
   10. Comparison of Loop-Free Convergence Methods
   11. Security Considerations
   12. Acknowledgments
   13. Informative References
        
1. Introduction

When there is a change to the network topology (due to the failure or restoration of a link or router, or as a result of management action), the routers need to converge on a common view of the new topology and the paths to be used for forwarding traffic to each destination. During this process, referred to as a routing transition, packet delivery between certain source/destination pairs may be disrupted. This occurs due to the time it takes for the topology change to be propagated around the network together with the time it takes each individual router to determine and then update the forwarding information base (FIB) for the affected destinations. During this transition, packets may be lost due to the continuing attempts to use the failed component and due to forwarding loops. Forwarding loops arise due to the inconsistent FIBs that occur as a result of the difference in time taken by routers to execute the transition process. This is a problem that may occur in both IP networks and MPLS networks that use the label distribution protocol (LDP) [RFC5036] as the label switched path (LSP) signaling protocol.

The service failures caused by routing transitions are largely hidden by higher-level protocols that retransmit the lost data. However, new Internet services could emerge that are more sensitive to the packet disruption that occurs during a transition. To make the transition transparent to their users, these services would require a short routing transition. Ideally, routing transitions would be completed in zero time with no packet loss.

Regardless of how optimally the mechanisms involved have been designed and implemented, it is inevitable that a routing transition will take some minimum interval that is greater than zero. This has led to the development of a traffic engineering (TE) fast-reroute mechanism for MPLS [RFC4090]. Alternative mechanisms that might be deployed in an MPLS network or an IP network are current work items in the IETF [RFC5714]. The repair mechanism may, however, be disrupted by the formation of micro-loops during the period between the time when the failure is announced and the time when all FIBs have been updated to reflect the new topology.

One method of mitigating the effects of micro-loops is to ensure that the network reconverges in a sufficiently short time that these effects are inconsequential. Another method is to design the network topology to minimise or even eliminate the possibility of micro-loops.

The propensity to form micro-loops is highly topology dependent, and algorithms are available to identify which links in a network are subject to micro-looping. In topologies that are critically susceptible to the formation of micro-loops, there is little point in introducing new mechanisms to provide fast reroute without also deploying mechanisms that prevent the disruptive effects of micro-loops. Unless micro-loop prevention is used in these topologies, packets may not reach the repair and micro-looping packets may cause congestion, resulting in further packet loss.

The disruptive effect of micro-loops is not confined to periods when there is a component failure. Micro-loops can, for example, form when a component is put back into service following repair. Micro-loops can also form as a result of a network-maintenance action such as adding a new network component, removing a network component, or modifying a link cost.

This framework provides a summary of the causes and consequences of micro-loops and enables the reader to form a judgement on whether micro-looping is an issue that needs to be addressed in specific networks. It also provides a survey of the currently proposed micro-loop mitigation mechanisms. When sufficiently fast convergence is not available and the topology is susceptible to micro-loops, use of one or more of these mechanisms may be desirable.

2. The Nature of Micro-Loops

A micro-loop is a packet forwarding loop that may occur transiently among two or more routers in a hop-by-hop packet forwarding paradigm.

Micro-loops may form during the periods when a network is re-converging following ANY topology change and are caused by inconsistent FIBs in the routers. During the transition, micro-loops may occur over a single link between a pair of routers that temporarily use each other as the next hop for a prefix. Micro-loops may also form when each router in a cycle of three or more routers has the next router in the cycle as a next hop for a given prefix.
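
The two loop shapes described above can be illustrated with a minimal sketch. It assumes a hypothetical snapshot of the FIBs for a single prefix, represented as a map from router name to next hop; the names find_micro_loop, two_node, three_node, and converged are illustrative and do not come from any routing implementation.

```python
from typing import Dict, List, Optional

# Hypothetical FIB snapshot for ONE prefix: router name -> next hop.
# A router that delivers the prefix locally maps to None.
Fib = Dict[str, Optional[str]]


def find_micro_loop(fib: Fib, source: str) -> Optional[List[str]]:
    """Follow next hops from `source`; return the looping routers, or None."""
    path: List[str] = []
    position: Dict[str, int] = {}
    node: Optional[str] = source
    while node is not None:
        if node in position:
            # Back at a router already visited: everything from its first
            # appearance onwards forms the forwarding loop.
            return path[position[node]:]
        position[node] = len(path)
        path.append(node)
        # Routers missing from the snapshot are treated as delivering locally.
        node = fib.get(node)
    return None  # reached the router that owns the prefix


# Two routers that temporarily use each other as next hop (single-link loop).
two_node = {"S": "A", "A": "B", "B": "A", "D": None}
# Three routers, each with the next router in the cycle as its next hop.
three_node = {"A": "B", "B": "C", "C": "A", "D": None}
# Consistent FIBs: traffic reaches D without looping.
converged = {"S": "A", "A": "B", "B": "D", "D": None}

print(find_micro_loop(two_node, "S"))    # ['A', 'B']
print(find_micro_loop(three_node, "A"))  # ['A', 'B', 'C']
print(find_micro_loop(converged, "S"))   # None
```

The sketch only inspects one instant of the transition; in a real network the FIBs change while packets are in flight, so a loop found this way is transient.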

Cyclic loops may occur if one or more of the following conditions are met:

1. Asymmetric link costs.

2. An equal-cost path exists between a pair of routers, each of which makes a different decision regarding which path to use for forwarding to a particular destination. Note that even routers that do not implement equal-cost multi-path (ECMP) forwarding must make a choice between the available equal-cost paths, and unless they make the same choice, the condition for cyclic loops will be fulfilled.

3. Topology changes affecting multiple links, including single node and line card failures.

Micro-loops have two undesirable side effects: congestion and repair starvation.

o A looping packet consumes bandwidth until it either escapes as a result of the re-synchronization of the FIBs or its time to live (TTL) expires. This transiently increases the traffic over a link by as much as 128 times, and may cause the link to become congested. This congestion reduces the bandwidth available to other traffic (which is not otherwise affected by the topology change). As a result, the "innocent" traffic using the link experiences increased latency and is liable to congestive packet loss.

o In cases where the link or node failure has been protected by a fast-reroute repair, an inconsistency in the FIBs may prevent some traffic from reaching the failure, and hence being repaired. The repair may thus become starved of traffic and thereby rendered ineffective.

Although micro-loops are usually considered in the context of a failure, similar problems of congestive packet loss and starvation may also occur if the topology change is the result of management action. For example, consider the case where a link is to be taken out of service by management action. The link can be retained in service throughout the transition, thus avoiding the need for any repair. However, if micro-loops form, they may cause congestion loss and may also prevent traffic from reaching the link.

Unless otherwise controlled, micro-loops may form in any part of the network that forwards (or in the case of a new link, will forward) packets over a path that includes the affected topology change. The time taken to propagate the topology change through the network, and the non-uniform time taken by each router to calculate the new shortest path tree (SPT) and update its FIB, contribute to the duration of the packet disruption caused by the micro-loops. In some cases, a packet may be subject to disruption from micro-loops that occur sequentially at links along the path, thus further extending the period of disruption beyond that required to resolve a single loop.

3. Applicability

Loop-free convergence techniques are applicable to any situation in which micro-loops may form, for example, the convergence of a network following:

1. Component failure

2. Component repair

3. Management withdrawal of a component

4. Management insertion of a component

5. Management change of link cost (either positive or negative)

6. External cost change, for example, change of external gateway as a result of a BGP change

7. A Shared Risk Link Group (SRLG) failure

In each case, a component may be a link, a set of links, or an entire router. Throughout this document, we use the term SRLG when describing the procedure to be followed when multiple failures have occurred, whether or not they are members of an explicit SRLG. In the case of multiple independent failures, the loop-prevention method described for SRLG may be used, provided it is known that all of these failures have been repaired.

Loop-free convergence techniques are applicable to both IP networks and MPLS-enabled networks that use LDP, including LDP networks that use the single-hop tunnel fast-reroute mechanism.

An assessment of whether loop-free convergence techniques are required should take into account whether or not the interior gateway protocol (IGP) convergence is sufficiently fast that any micro-loops are of such short duration that they are not disruptive, and whether or not the topology is such that micro-loops are likely to form.

4. Micro-Loop Control Strategies

Micro-loop control strategies fall into four basic classes:

1. Micro-loop mitigation

2. Micro-loop prevention

3. Micro-loop suppression

4. Network design to minimise micro-loops

A micro-loop-mitigation scheme works by re-converging the network in such a way that it reduces, but does not eliminate, the formation of micro-loops. Such schemes cannot guarantee the productive forwarding of packets during the transition.

A micro-loop-prevention mechanism controls the re-convergence of the network in such a way that no micro-loops form. Such a micro-loop-prevention mechanism allows the continued use of any fast repair method until the network has converged on its new topology and prevents the collateral damage that occurs to other traffic for the duration of each micro-loop.

A micro-loop-suppression mechanism attempts to eliminate the collateral damage caused by micro-loops to other traffic. This may be achieved by, for example, using a packet-monitoring method that detects that a packet is looping and drops it. Such schemes make no attempt to productively forward the packet throughout the network transition.

Highly meshed topologies are less susceptible to micro-loops, thus networks may be designed to minimise the occurrence of micro-loops by appropriate link placement and metric settings. However, this approach may conflict with other design requirements, such as cost and traffic planning, and may not accurately track the evolution of the network or temporary changes due to outages.

Note that all known micro-loop-prevention mechanisms and most micro-loop-mitigation mechanisms extend the duration of the re-convergence process. When the failed component is protected by a fast-reroute repair, this implies that the converging network requires the repair to remain in place for longer than would otherwise be the case. The extended convergence time means any traffic that is not repaired by an imperfect repair experiences a significantly longer outage than it would experience with conventional convergence.

When a component is returned to service, or when a network management action has taken place, this additional delay does not cause traffic disruption because there is no repair involved. However, the extended delay is undesirable because it increases the time that the network takes to be ready for another failure, and hence leaves it vulnerable to multiple failures.

5. Loop Mitigation

There are two approaches to loop mitigation.

o Fast convergence

o A purpose-designed, loop-mitigation mechanism

5.1. Fast Convergence

The duration of micro-loops is dependent on the speed of convergence. Improving the speed of convergence may therefore be seen as a loop-mitigation technique.

5.2. PLSN

The only known purpose-designed, loop-mitigation approach is the Path Locking with Safe-Neighbors (PLSN) method described in PLSN [ANALYSIS]. In this method, a micro-loop-free next-hop safety condition is defined as follows:

In a symmetric-cost network, it is safe for router X to change to the use of neighbor Y as its next hop for a specific destination if the path through Y to that destination satisfies both of the following criteria:

1. X considers Y as its loop-free neighbor based on the topology before the change, AND

2. X considers Y as its downstream neighbor based on the topology after the change.

In an asymmetric-cost network, a stricter safety condition is needed, and the criterion is that:

X considers Y as its downstream neighbor based on the topology both before and after the change.
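
The two safety conditions can be sketched in code. The text above does not define "loop-free neighbor" or "downstream neighbor"; this sketch assumes the usual shortest-path inequalities from the loop-free alternates work (RFC 5286): Y is loop-free for X when dist(Y, D) < dist(Y, X) + dist(X, D), and Y is downstream of X when dist(Y, D) < dist(X, D). All function names and the four-router topology are hypothetical.

```python
import heapq
from typing import Dict, List, Tuple

Graph = Dict[str, Dict[str, int]]  # node -> {neighbor: cost of directed link}
INF = float("inf")


def symmetric(links: List[Tuple[str, str, int]]) -> Graph:
    """Build a graph in which every link has the same cost in both directions."""
    graph: Graph = {}
    for a, b, cost in links:
        graph.setdefault(a, {})[b] = cost
        graph.setdefault(b, {})[a] = cost
    return graph


def dist(graph: Graph, source: str, target: str) -> float:
    """Shortest-path cost from source to target (Dijkstra); INF if unreachable."""
    best = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > best[u]:
            continue  # stale heap entry
        for v, cost in graph.get(u, {}).items():
            if d + cost < best.get(v, INF):
                best[v] = d + cost
                heapq.heappush(heap, (d + cost, v))
    return best.get(target, INF)


def is_loop_free_neighbor(g: Graph, x: str, y: str, dest: str) -> bool:
    # Assumed definition: Y's shortest path to dest does not pass back through X.
    return dist(g, y, dest) < dist(g, y, x) + dist(g, x, dest)


def is_downstream_neighbor(g: Graph, x: str, y: str, dest: str) -> bool:
    # Assumed definition: Y is strictly closer to dest than X is.
    return dist(g, y, dest) < dist(g, x, dest)


def plsn_safe(before: Graph, after: Graph, x: str, y: str, dest: str,
              symmetric_costs: bool = True) -> bool:
    """Apply the symmetric-cost or the stricter asymmetric-cost condition."""
    if symmetric_costs:
        return (is_loop_free_neighbor(before, x, y, dest)
                and is_downstream_neighbor(after, x, y, dest))
    return (is_downstream_neighbor(before, x, y, dest)
            and is_downstream_neighbor(after, x, y, dest))


# Hypothetical topology: link A-D fails, so X must move from A to Y.
before = symmetric([("X", "A", 1), ("A", "D", 1), ("X", "Y", 1), ("Y", "D", 2)])
after = symmetric([("X", "A", 1), ("X", "Y", 1), ("Y", "D", 2)])

print(plsn_safe(before, after, "X", "Y", "D"))                         # True
print(plsn_safe(before, after, "X", "Y", "D", symmetric_costs=False))  # False
```

In this example Y passes the symmetric-cost test but fails the stricter one, because before the change Y and X are equally far from D. Recomputing Dijkstra on every call keeps the sketch short; a real implementation would presumably reuse the SPT results it already has.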

Based on these criteria, destinations are classified by each router into three classes:

o Type A destinations: Destinations unaffected by the change (type A1) and also destinations whose next hop after the change satisfies the safety criteria (type A2).

o Type B destinations: Destinations that cannot be sent via the new, primary next hop because the safety criteria are not satisfied, but that can be sent via another next hop that does satisfy the safety criteria.

o Type C destinations: All other destinations.

Following a topology change, type A destinations are immediately changed to go via the new topology. Type B destinations are immediately changed to go via the next hop that satisfies the safety criteria, even though this is not the shortest path. Type B destinations continue to go via this path until all routers have changed their type C destinations over to the new next hop. Routers must not change their type C destinations until all routers have changed their type A2 and B destinations to the new or intermediate (safe) next hop.
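
The classification and the immediate FIB action for each class can be sketched as follows. This is an illustration only: it assumes that "unaffected by the change" can be approximated as "the next hop is the same before and after", and it takes the safety test as a caller-supplied predicate. The names classify_destination and is_safe are hypothetical.

```python
from typing import Callable, Iterable, Tuple


def classify_destination(old_next_hop: str,
                         new_next_hop: str,
                         neighbors: Iterable[str],
                         is_safe: Callable[[str], bool]) -> Tuple[str, str]:
    """Return (destination type, next hop to install immediately)."""
    if new_next_hop == old_next_hop:
        # Assumption: an unchanged next hop means the destination is unaffected.
        return "A1", old_next_hop
    if is_safe(new_next_hop):
        # The new primary next hop satisfies the safety criteria.
        return "A2", new_next_hop
    for candidate in neighbors:
        if candidate != new_next_hop and is_safe(candidate):
            # Not the shortest path, but a safe intermediate next hop.
            return "B", candidate
    # Type C: keep the old next hop until all routers have moved their
    # type A2 and type B destinations.
    return "C", old_next_hop


neighbors = ["N1", "N2", "N3"]
safe_set = {"N2"}

print(classify_destination("N1", "N1", neighbors, safe_set.__contains__))  # ('A1', 'N1')
print(classify_destination("N1", "N2", neighbors, safe_set.__contains__))  # ('A2', 'N2')
print(classify_destination("N1", "N3", neighbors, safe_set.__contains__))  # ('B', 'N2')
print(classify_destination("N1", "N3", neighbors, lambda n: False))        # ('C', 'N1')
```

The sketch covers only the first phase. The later moves (type C to the new next hop, then type B from the intermediate to the new next hop) depend on knowing that all other routers have completed the earlier phase, which this sketch does not model.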

Simulations indicate that this approach produces a significant reduction in the number of links that are subject to micro-looping. However, unlike all of the micro-loop-prevention methods, it is only a partial solution. In particular, micro-loops may form on any link joining a pair of type C routers.

Because routers delay updating their type C destination FIB entries, they will continue to route towards the failure during the time when the routers are changing their type A and B destinations, and hence will continue to productively forward packets, provided that viable repair paths exist.

A backwards-compatibility issue arises with PLSN. If a router is not capable of micro-loop control, it will not correctly delay its FIB update. If all such routers had only type A destinations, this loop-mitigation mechanism would work as it was designed. Alternatively, if all such incapable routers had only type C destinations, the "loop-prevention" announcement mechanism used to trigger the tunnel-based schemes (see Sections 6.2 to 6.4) could be used to cause the type A and B destinations to be changed, with the incapable routers and routers having type C destinations delaying until they received the "real" announcement. Unfortunately, these two approaches are mutually incompatible.

Note that simulations indicate that in most topologies treating type B destinations as type C results in only a small degradation in loop prevention. Also note that simulation results indicate that in production networks where some, but not all, links have asymmetric costs, using the stricter asymmetric-cost criterion actually reduces the number of loop-free destinations because fewer destinations can be classified as type A or B.

This mechanism operates identically for:

o events that degrade the topology (e.g., link failure),

o events that improve the topology (e.g., link restoration), and

o shared risk link group (SRLG) failure.

6. Micro-Loop Prevention

Eight micro-loop-prevention methods have been proposed:

1. Incremental cost advertisement

2. Nearside tunneling

3. Farside tunneling

4. Distributed tunnels

5. Packet marking

6. New MPLS labels

7. Ordered FIB update

8. Synchronized FIB update

6.1. Incremental Cost Advertisement

When a link fails, the cost of the link is normally changed from its assigned metric to "infinity" in one step. However, it can be proved [OPT] that no micro-loops will form if the link cost is increased in suitable increments, and the network is allowed to stabilize before the next cost increment is advertised. Once the link cost has been increased to a value greater than that of the lowest alternative cost around the link, the link may be disabled without causing a micro-loop.

The criterion for a link cost change to be safe is that any link that is subjected to a cost change of x can only cause loops in a part of the network that has a cyclic cost less than or equal to x. Because there may exist links that have a cost of one in each direction, resulting in a cyclic cost of two, this can result in the link cost having to be raised in increments of one. However, the increment can be larger where the minimum cost permits. Recent work [OPT] has shown that there are a number of optimizations that can be applied to the problem in order to determine the exact set of cost values required, and hence minimise the number of increments.

It will be appreciated that when a link is returned to service, its cost is reduced in small steps from "infinity" to its final cost, thereby providing similar micro-loop prevention during a "good-news" event. Note that the link cost may be decreased from "infinity" to any value greater than that of the lowest alternative cost around the link in one step without causing a micro-loop.
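
The unoptimized form of this procedure can be sketched as a pair of cost schedules. The sketch reads the safety criterion as "each step must be smaller than the minimum cyclic cost in the network", which matches the increments-of-one example for a cyclic cost of two; that reading, the function names, and the parameters lowest_alternative_cost and min_cyclic_cost are assumptions made for illustration. The optimizations from [OPT] are not modelled.

```python
from typing import List


def failure_cost_schedule(metric: int,
                          lowest_alternative_cost: int,
                          min_cyclic_cost: int) -> List[int]:
    """Costs to advertise, in order, before the link can be disabled."""
    step = min_cyclic_cost - 1  # assumed: a step must stay below the cyclic cost
    if step < 1:
        raise ValueError("min_cyclic_cost must be at least 2")
    # The link may be disabled once its cost exceeds the lowest alternative.
    target = lowest_alternative_cost + 1
    schedule: List[int] = []
    cost = metric
    while cost < target:
        cost = min(cost + step, target)
        schedule.append(cost)  # the network must stabilize after each value
    return schedule


def restoration_cost_schedule(metric: int,
                              lowest_alternative_cost: int,
                              min_cyclic_cost: int) -> List[int]:
    """Costs to advertise, in order, when the link returns to service."""
    raised = failure_cost_schedule(metric, lowest_alternative_cost, min_cyclic_cost)
    # The first value is reached from "infinity" in a single step; the rest
    # walk back down to the assigned metric using the same step sizes.
    return list(reversed(raised)) + [metric]


print(failure_cost_schedule(1, 5, 2))       # [2, 3, 4, 5, 6]
print(failure_cost_schedule(1, 5, 4))       # [4, 6]
print(restoration_cost_schedule(1, 5, 2))   # [6, 5, 4, 3, 2, 1]
print(restoration_cost_schedule(10, 5, 2))  # [10]
```

The last call shows the case where the assigned metric already exceeds the lowest alternative cost, so the link can be restored in one step. The length of each schedule is the number of flood and reconvergence rounds the transition needs.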

When the failure is an SRLG, the link cost increments must be coordinated across all failing members of the SRLG. This may be achieved by completing the transition of one link before starting the next or by interleaving the changes.

The incremental cost change approach has the advantage over all other currently known loop-prevention schemes in that it requires no change to the routing protocol. It will work in any network because it does not require any cooperation from the other routers in the network.

Where the micro-loop-prevention mechanism is being used to support a planned reconfiguration of the network, the extended total reconvergence time resulting from the multiple increments is of limited consequence, particularly where the number of increments has been optimized. This, together with the ability to implement this technique in isolation, makes this method a good candidate for use with such management-initiated changes.

Where the micro-loop-prevention mechanism is being used to support failure recovery, the number of increments required, and hence the time taken to fully converge, is significant even for small numbers of increments. This is because, for the duration of the transition, some parts of the network continue to use the old forwarding path, and hence use any repair mechanism for an extended period. In the case of a failure that cannot be fully repaired, some destinations may therefore become unreachable for an extended period. In addition, the network may be vulnerable to a second failure for the duration of the controlled re-convergence.

Where large metrics are used and no optimization (such as that described above) is performed, the incremental cost method can be extremely slow. However, in cases where the per-link metric is small, either because small values have been assigned by the network designers or because of restrictions implicit in the routing protocol (e.g., RIP restricts the metric, and BGP using the autonomous system (AS) path length frequently uses an effective metric of one or a very small integer for each inter-AS hop), the number of required increments can be acceptably small even without optimizations.

6.2. Nearside Tunneling

This mechanism works by creating an overlay network using tunnels whose path is not affected by the topology change and then carrying the traffic affected by the change in that new network. When all the traffic is in the new, tunnel-based network, the real network is allowed to converge on the new topology. Because all the traffic that would be affected by the change is carried in the overlay network, no micro-loops form.

When a failure is detected (or a link is withdrawn from service), the router adjacent to the failure issues a new "loop-prevention" routing message announcing the topology change. This message is propagated through the network by all routers but is only understood by routers capable of using one of the tunnel-based, micro-loop-prevention mechanisms.

Each of the micro-loop-preventing routers builds a tunnel to the closest router adjacent to the failure. They then determine which of their traffic would transit the failure and place that traffic in the tunnel. When all of these tunnels are in place (determined, for example, by waiting a suitable interval), the failure is announced as normal. Because these tunnels will be unaffected by the transition and because the routers protecting the link will continue the repair (or forward across the link being withdrawn), no traffic will be disrupted by the failure. When the network has converged, these tunnels are withdrawn, allowing traffic to be forwarded along its new, "natural" path. The order of tunnel insertion and withdrawal is not important, provided that the tunnels are all in place before the normal announcement is issued and that the repair remains in place until normal convergence has completed.
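
As an illustration only, the traffic selection described above can be sketched as follows. The sketch assumes a link-state database held as a dictionary of symmetric link costs, a single link failure, and that "closest" means lowest cost in the old topology. The function names and the example topology are hypothetical and are not taken from any specification.

```python
import heapq

# Hypothetical five-router topology with symmetric link costs.
GRAPH = {
    "A": {"B": 1, "E": 5},
    "B": {"A": 1, "C": 1},
    "C": {"B": 1, "D": 1},
    "D": {"C": 1, "E": 5},
    "E": {"A": 5, "D": 5},
}


def shortest_paths(graph, source):
    """Dijkstra: return {node: (cost, path)} for every node reachable from source."""
    best = {source: (0, [source])}
    queue = [(0, source, [source])]
    while queue:
        cost, node, path = heapq.heappop(queue)
        if cost > best[node][0]:
            continue  # stale queue entry
        for neighbor, weight in graph[node].items():
            new_cost = cost + weight
            if neighbor not in best or new_cost < best[neighbor][0]:
                best[neighbor] = (new_cost, path + [neighbor])
                heapq.heappush(queue, (new_cost, neighbor, path + [neighbor]))
    return best


def nearside_tunnel_plan(graph, router, failed_link):
    """Map each destination whose old path crosses failed_link to a tunnel endpoint."""
    a, b = failed_link
    paths = shortest_paths(graph, router)
    # Assumption: the tunnel endpoint is the end of the failed link that is
    # nearer to this router in the old topology.
    endpoint = min((a, b), key=lambda node: paths[node][0])
    plan = {}
    for destination, (_, path) in paths.items():
        hops = list(zip(path, path[1:]))
        if (a, b) in hops or (b, a) in hops:
            plan[destination] = endpoint
    return plan
```

With link B-C failing, `nearside_tunnel_plan(GRAPH, "A", ("B", "C"))` selects destinations C and D for tunneling to B, while router E, whose old paths avoid the link, tunnels nothing.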

This method completes in bounded time and is generally much faster than the incremental cost method. Depending on the exact design, it completes in two or three flood-SPF-FIB update cycles.

At the time at which the failure is announced as normal, micro-loops may form within isolated islands of non-micro-loop-preventing routers. However, only traffic entering the network via such routers can micro-loop. All traffic entering the network via a micro-loop-preventing router will be tunneled correctly to the nearest repairing router -- including, if necessary, being tunneled via a non-micro-loop-preventing router -- and will not micro-loop.

Where there is no requirement to prevent the formation of micro-loops involving non-micro-loop-preventing routers, a single, "normal" announcement may be made and a local timer used to determine the time at which transition from tunneled forwarding to normal forwarding over the new topology may commence.

This technique has the disadvantage that it requires traffic to be tunneled during the transition. This is an issue in IP networks because not all router designs are capable of high-performance IP tunneling. It is also an issue in MPLS networks because the encapsulating router has to know the label set that the decapsulating router is distributing.

A further disadvantage of this method is that it requires cooperation from all the routers within the routing domain to fully protect the network against micro-loops.

When a new link is added, the mechanism is run in "reverse". When the loop-prevention announcement is heard, routers determine which traffic they will send over the new link and tunnel that traffic to the router on the near side of that link. This path will not be affected by the presence of the new link. When the "normal" announcement is heard, they then update their FIB to send the traffic normally, according to the new topology. Any traffic encountering a router that has not yet updated its FIB will be tunneled to the near side of the link, and will therefore not loop.

When a management change to the topology is required, again exactly the same mechanism protects against micro-looping of packets by the micro-loop-preventing routers.

When the failure is an SRLG, the required strategy is to classify traffic according to the furthest failing member of the SRLG that it will traverse on its way to the destination, and to tunnel that traffic to the repairing router for that SRLG member. This will require multiple tunnel destinations -- in the limiting case, one per SRLG member.
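
The classification step can be sketched as below. This is a minimal illustration that assumes the old-topology path to a destination is already known as a list of routers and that SRLG members are undirected links. The function name is hypothetical.

```python
def classify_by_furthest_member(path, srlg_links):
    """Return the last SRLG member link crossed by path, or None if none is crossed."""
    members = {frozenset(link) for link in srlg_links}
    furthest = None
    for hop in zip(path, path[1:]):
        if frozenset(hop) in members:
            furthest = hop  # a later hop replaces any earlier match
    return furthest
```

Under the nearside method, the first router of the returned hop would be the tunnel destination for that traffic class. For example, a path A, B, C, D, E with SRLG members B-C and D-E is classified by D-E.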

6.3. Farside Tunnels

Farside tunneling loop prevention requires the loop-preventing routers to place all of the traffic that would traverse the failure in one or more tunnels terminating at the router (or, in the case of node failure, routers) at the far side of the failure. The properties of this method are a more uniform distribution of repair traffic than is achieved using the nearside tunnel method and, in the case of node failure, a reduction in the decapsulation load on any single router.

Unlike the nearside tunnel method (which uses normal routing to the repairing router), this method requires the use of a repair path to the farside router. This may be provided by the not-via [NOT-VIA] mechanism, in which case no further computation is needed.

The mode of operation is otherwise identical to the nearside tunneling loop-prevention method (Section 6.2).

6.4. Distributed Tunnels

In the distributed tunnels loop-prevention method, each router calculates its own repair and forwards traffic affected by the failure using that repair. Unlike the fast reroute (FRR) case, the actual failure is known at the time of the calculation. The objective of the loop-preventing routers is to get the packets that would have gone via the failure into Q-space [FRR-TUNN] using routers that are in P-space. Because packets are decapsulated on entry to Q-space, rather than being forced to go to the farside of the failure, more optimal routing may be achieved. This method is subject to the same reachability constraints described in [FRR-TUNN].

The mode of operation is otherwise identical to the nearside tunneling loop-prevention method (Section 6.2).

An alternative distributed tunnel mechanism is for all routers to tunnel to the not-via address [NOT-VIA] associated with the failure.

6.5. Packet Marking

If packets could be marked in some way, this information could be used to assign them to one of:

o the new topology,

o the old topology, or

o a transition topology.

They would then be correctly forwarded during the transition. This mechanism works identically for both "bad-news" and "good-news" events. It also works identically for SRLG failure. There are three problems with this solution:

o A packet-marking bit may not be available; for example, a network supporting both the differentiated services architecture [RFC2475] and explicit congestion notification [RFC3168] uses all eight bits of the IPv4 Type of Service field.

o The mechanism would introduce a non-standard forwarding procedure.

o Packet marking using either the old or the new topology would double the size of the FIB; however, some optimizations may be possible.
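
The marking-based topology selection described at the start of this section, and the FIB doubling it implies, can be pictured with the following sketch. It assumes a one-bit marking that selects between two complete FIBs. The names `Topology`, `FIBS`, and `next_hop` are hypothetical, and no particular header field is implied.

```python
from enum import Enum


class Topology(Enum):
    OLD = "old"
    NEW = "new"


# One complete FIB per topology: this is the doubling noted in the text.
FIBS = {
    Topology.OLD: {"D": "Y"},
    Topology.NEW: {"D": "R"},
}


def next_hop(fibs, marking, destination):
    """Select the FIB named by the packet's marking, then look up the destination."""
    return fibs[marking][destination]
```

A packet marked for the old topology keeps following old next hops at every router, even if that router has already computed the new topology.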

6.6. MPLS New Labels

In an MPLS network that is using [RFC5036] for label distribution, loop-free convergence can be achieved through the use of new labels when the path that a prefix will take through the network changes.

As described in Section 6.2, the repairing routers issue a loop-prevention announcement to start the loop-free convergence process. All loop-preventing routers calculate the new topology and determine whether their FIB needs to be changed. If there is no change in the FIB, they take no part in the following process.

The routers that need to make a change to their FIB consider each change and check the new next hop to determine whether it will use a path in the OLD topology that reaches the destination without traversing the failure (i.e., the next hop is in P-space with respect to the failure [FRR-TUNN]). If so, the FIB entry can be immediately updated. For all of the remaining FIB entries, the router issues a new label to each of its neighbors. This new label is used to lock the path during the transition in a similar manner to the previously described method for loop-free convergence with tunnels (Section 6.2). Routers receiving a new label install it in their FIB for MPLS label translation, but do not yet remove the old label and do not yet use this new label to forward IP packets, i.e., they prepare to forward using the new label on the new path but do not use it yet. Any packets received continue to be forwarded the old way, using the old labels, towards the repair.
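
The next-hop check in the first step above can be sketched as follows. The sketch assumes symmetric link costs and a single failed link, and it examines only one shortest path, so a network with equal-cost paths would need a stricter test. The function names and the topology are hypothetical.

```python
import heapq

# Hypothetical old topology with symmetric link costs.
OLD_GRAPH = {
    "A": {"B": 1, "E": 5},
    "B": {"A": 1, "C": 1},
    "C": {"B": 1, "D": 1},
    "D": {"C": 1, "E": 5},
    "E": {"A": 5, "D": 5},
}


def old_path(graph, source, destination):
    """Return one shortest path from source to destination, or None."""
    queue = [(0, [source])]
    settled = set()
    while queue:
        cost, path = heapq.heappop(queue)
        node = path[-1]
        if node == destination:
            return path
        if node in settled:
            continue
        settled.add(node)
        for neighbor, weight in graph[node].items():
            if neighbor not in settled:
                heapq.heappush(queue, (cost + weight, path + [neighbor]))
    return None


def can_update_immediately(old_graph, new_next_hop, destination, failed_link):
    """True if the new next hop's OLD path to the destination avoids the failure."""
    path = old_path(old_graph, new_next_hop, destination)
    if path is None:
        return False
    crossed = {frozenset(hop) for hop in zip(path, path[1:])}
    return frozenset(failed_link) not in crossed
```

With link B-C failing, a router whose new next hop towards D is E may update at once, because E already reaches D directly. A router whose new next hop is A may not, because A's old path to D still runs through B-C, so a new label would be issued instead.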

At some time after the loop-prevention announcement, a normal routing announcement of the failure is issued. This announcement must not be issued until such time as all routers have carried out all of their activities that were triggered by the loop-prevention announcement. On receipt of the normal announcement, all routers that were delaying convergence move to their new path for both the new and the old labels. This involves changing the IP address entries to use the new labels AND changing the old labels to forward using the new labels.

Because the new label path was installed during the loop-prevention phase, packets reach their destinations as follows:

o If they do not go via any router using a new label, they go via the repairing router and the repair.

o If they meet any router that is using the new labels, they get marked with the new labels and reach their destination using the new path, back-tracking if necessary.

When all routers have changed to the new path, the network is converged. At some later time, when it can be assumed that all routers have moved to using the new path, the FIB can be cleaned up to remove the now-redundant old labels.

As with other methods, the new labels may be modified to provide loop prevention for "good news". There are also a number of optimizations of this method.

6.7. Ordered FIB Update

The ordered FIB loop prevention method is described in "Loop-free convergence using oFIB" [oFIB]. Micro-loops occur following a failure or a cost increase, when a router closer to the failed component revises its routes to take account of the failure before a router that is further away. By analyzing the reverse shortest path tree (rSPT) over which traffic is directed to the failed component in the old topology, it is possible to determine a strict ordering that ensures that nodes closer to the root always process the failure after any nodes further away, and hence micro-loops are prevented.

When the failure has been announced, each router waits a multiple of the convergence timer [LF-TIMERS]. The multiple is determined by the node's position in the rSPT, and the delay value is chosen to guarantee that a node can complete its processing within this time. The convergence time may be reduced by employing a signaling mechanism to notify the parent when all the children have completed their processing, and hence when it is safe for the parent to instantiate its new routes.
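
A minimal sketch of the timer-based ordering follows. It assumes that a router's multiple (its rank) is the depth of the rSPT branch that reaches the failure through that router, so that leaves of the rSPT have rank zero. The exact ranking rule is defined in [oFIB], so this reading is an assumption, as are the names used. The input maps each affected router to its old next hop towards the failed component, with `None` for the router adjacent to the failure.

```python
def ofib_delays(old_next_hop, convergence_timer):
    """Return {router: delay before FIB update} from an rSPT given as child -> parent."""
    children = {}
    for router, parent in old_next_hop.items():
        children.setdefault(parent, []).append(router)

    def rank(router):
        # A router with nothing further away behind it may update at once.
        if router not in children:
            return 0
        return 1 + max(rank(child) for child in children[router])

    return {router: rank(router) * convergence_timer for router in old_next_hop}


# Hypothetical rSPT: T routes via S, and both S and R route via X, which is
# adjacent to the failure.
RSPT = {"X": None, "R": "X", "S": "X", "T": "S"}
```

With a convergence timer of 2.0 seconds, `ofib_delays(RSPT, 2.0)` lets R and T update immediately, S after 2.0 seconds, and X after 4.0 seconds, so each router changes only after every router that forwards through it.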

The property of this approach is therefore that it imposes a delay that is bounded by the network diameter, although in many cases it will be much less.

When a link is returned to service, the convergence process above is reversed. A router first determines its distance (in hops) from the new link in the NEW topology. Before updating its FIB, it then waits a time equal to the value of that distance multiplied by the convergence timer.

It will be seen that network-management actions can similarly be undertaken by treating a cost increase in a manner similar to a failure and a cost decrease similar to a restoration.

The ordered FIB mechanism requires all nodes in the domain to operate according to these procedures, and the presence of non-cooperating nodes can give rise to loops for any traffic that traverses them (not just traffic that is originated through them). Without additional mechanisms, these loops could remain in place for a significant time.

It should be noted that this method requires per-router ordering but not per-prefix ordering. A router must wait its turn to update its FIB, but it should then update its entire FIB.

When an SRLG failure occurs, a router must classify traffic into the classes that pass over each member of the SRLG. Each router is then independently assigned a ranking with respect to each SRLG member for which it has a traffic class. These rankings may be different for each traffic class. The prefixes of each class are then changed in the FIB according to the ordering of their specific ranking. Again, as for the single failure case, signaling may be used to speed up the convergence process.

Note that the special SRLG case of a full or partial node failure can be dealt with without using per-prefix ordering by running a single reverse-SPF computation rooted at the failed node (or common point of the subset of failing links in the partial case).

There are two classes of signaling optimization that can be applied to the ordered FIB loop-prevention method:

o When the router makes NO change, it can signal immediately. This significantly reduces the time taken by the network to process long chains of routers that have no change to make to their FIB.

o When a router HAS changed, it can signal that it has completed. This is more problematic since this may be difficult to determine, particularly in a distributed architecture, and the optimization obtained is the difference between the actual time taken to make the FIB change and the worst-case timer value. This saving could be of the order of one second per hop.

There is another method of executing ordered FIB that is based on pure signaling [SIG]. Methods that use signaling as an optimization are safe because eventually they fall back on the established IGP mechanisms that ensure that networks converge under conditions of packet loss. However, a mechanism that relies on signaling in order to converge requires a reliable signaling mechanism that must be proven to recover from any failure circumstance.

6.8. Synchronised FIB Update

Micro-loops form because of the asynchronous nature of the FIB update process during a network transition. In many router architectures, it is the time taken to update the FIB itself that is the dominant term. One approach would be to have two FIBs and, in a synchronized action throughout the network, to switch from the old to the new. One way to achieve this synchronized change would be to signal or otherwise determine the wall clock time of the change and then execute the change at that time, using NTP [RFC1305] to synchronize the wall clocks in the routers.
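
The two-FIB switchover can be sketched as below. The sketch assumes the routers' wall clocks are already synchronized (for example, by NTP) and passes the current time in explicitly rather than reading a clock. The class and method names are hypothetical.

```python
class DualFib:
    """Holds the old FIB and a staged new FIB that takes over at an agreed time."""

    def __init__(self, old_fib):
        self.old = dict(old_fib)
        self.new = None
        self.switch_time = None

    def stage(self, new_fib, switch_time):
        """Install the new FIB alongside the old one without using it yet."""
        self.new = dict(new_fib)
        self.switch_time = switch_time

    def lookup(self, destination, now):
        """Use the new FIB once the agreed wall-clock time has been reached."""
        use_new = self.new is not None and now >= self.switch_time
        table = self.new if use_new else self.old
        return table[destination]
```

Every router stages its new FIB and the same `switch_time`. Any residual clock offset between routers is the interval during which micro-loops can still form.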

This approach has a number of major issues. Firstly, two complete FIBs are needed, which may create a scaling issue; secondly, a suitable network-wide synchronization method is needed. However, neither of these is an insurmountable problem.

Since the FIB change synchronization will not be perfect, there may be some interval during which micro-loops form. Whether this scheme is classified as a micro-loop-prevention mechanism or a micro-loop-mitigation mechanism within this taxonomy is therefore dependent on the degree of synchronization achieved.

This mechanism works identically for both "bad-news" and "good-news" events. It also works identically for SRLG failure. Further consideration needs to be given to interoperating with routers that do not support this mechanism. Without a suitable interoperating mechanism, loops may form for the duration of the synchronization delay.

7. Using PLSN in Conjunction with Other Methods

All of the tunnel methods and packet marking can be combined with PLSN (see Section 5.2 of this document and [ANALYSIS]) to reduce the traffic that needs to be protected by the advanced method. Specifically, all traffic could use PLSN except traffic between a pair of routers, both of which consider the destination to be type C. The type-C-to-type-C traffic would be protected from micro-looping through the use of a loop-prevention method.

However, determining whether the new next-hop router considers a destination to be type C may be computationally intensive. An alternative approach would be to use a loop-prevention method for all local type C destinations. This would not require any additional computation, but would require the additional loop-prevention method to be used in cases that would not have generated loops (i.e., when the new next-hop router considered this to be a type A or B destination).

The amount of traffic that would use PLSN is highly dependent on the network topology and the specific change, but would be expected to be in the range of 70% to 90% in typical networks.

However, PLSN cannot be combined safely with ordered FIB. Consider the network fragment shown below:

                      R
                     /|\
                    / | \
                  1/ 2|  \3
                  /   |   \    cost S->T = 10
           Y-----X----S----T   cost T->S = 1
           |  1     2      |
           |1              |
           D---------------+
                  20
        
On failure of link XY, according to PLSN, S will regard R as a safe neighbor for traffic to D. However, the ordered FIB rank of both R and T will be zero, and hence these can change their FIBs during the same time interval. If R changes before T, then a loop will form around R, T, and S. This can be prevented by using a stronger safety condition than PLSN currently specifies, at the cost of introducing more type C routers, and hence reducing the PLSN coverage.
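
The loop described above can be reproduced with a short calculation. The link costs are those of the figure. The code layout, the function names, and the choice to model the transient state as "S and R updated, T not yet updated" are illustrative assumptions.

```python
import heapq

# Link costs from the figure; S-T is the only asymmetric link.
SYMMETRIC = [("R", "X", 1), ("R", "S", 2), ("R", "T", 3), ("Y", "X", 1),
             ("X", "S", 2), ("Y", "D", 1), ("D", "T", 20)]
ASYMMETRIC = [("S", "T", 10), ("T", "S", 1)]


def build_graph(exclude=None):
    """Build {router: {neighbor: cost}}, optionally omitting one failed link."""
    graph = {}
    for a, b, cost in SYMMETRIC:
        if exclude and {a, b} == set(exclude):
            continue
        graph.setdefault(a, {})[b] = cost
        graph.setdefault(b, {})[a] = cost
    for a, b, cost in ASYMMETRIC:
        graph.setdefault(a, {})[b] = cost
    return graph


def next_hops(graph, destination):
    """Next hop of every router towards destination (Dijkstra over reversed links)."""
    dist = {destination: 0}
    queue = [(0, destination)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist[node]:
            continue  # stale queue entry
        for prev in graph:
            if node in graph[prev]:
                candidate = d + graph[prev][node]
                if candidate < dist.get(prev, float("inf")):
                    dist[prev] = candidate
                    heapq.heappush(queue, (candidate, prev))
    return {u: min(graph[u], key=lambda v: graph[u][v] + dist[v])
            for u in graph if u != destination}


def trace(fib, source, destination):
    """Follow next hops; stop at the destination or when a router repeats."""
    path = [source]
    while path[-1] != destination:
        hop = fib[path[-1]]
        if hop in path:
            return path + [hop]  # forwarding loop
        path.append(hop)
    return path


OLD = next_hops(build_graph(), "D")
NEW = next_hops(build_graph(exclude=("X", "Y")), "D")
# S has moved to R under PLSN and R has updated, but T still uses its old FIB.
MIXED = {**OLD, "S": NEW["S"], "R": NEW["R"]}
```

Here `trace(MIXED, "S", "D")` gives S, R, T, S, which is the loop around R, T, and S, whereas `trace(NEW, "S", "D")` gives the loop-free path S, R, T, D once T has also updated.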

8. Loop Suppression

A micro-loop-suppression mechanism recognizes that a packet is looping and drops it. One such approach would be for a router to recognize, by some means, that it had seen the same packet before. It is difficult to see how sufficiently reliable discrimination could be achieved without some form of per-router signature, such as route recording. A packet-recognizing approach therefore seems infeasible.

An alternative approach would be to recognize that a packet was looping by recognizing that it was being sent back to the place from which it had just come. This would work for the types of loop that form in symmetric-cost networks, but would not suppress the cyclic loops that form in asymmetric networks or as a result of multiple failures.
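
The limitation can be illustrated with a small simulation, offered as a sketch only. It assumes each router knows the neighbor from which a packet arrived and drops the packet if the next hop is that same neighbor. The function name, the hop limit, and the FIB layout are hypothetical.

```python
def deliver(fibs, source, destination, max_hops=16):
    """Forward hop by hop, dropping a packet that would return to its previous hop."""
    previous, current = None, source
    for _ in range(max_hops):
        if current == destination:
            return "delivered"
        hop = fibs[current][destination]
        if hop == previous:
            return "dropped"  # loop suppressed
        previous, current = current, hop
    return "looped"  # hop limit reached without suppression


# Two routers pointing at each other: the check catches the loop.
TWO_ROUTER_LOOP = {"A": {"D": "B"}, "B": {"D": "A"}}
# Three routers in a cycle: no packet ever returns to the hop it just left.
CYCLIC_LOOP = {"A": {"D": "B"}, "B": {"D": "C"}, "C": {"D": "A"}}
```

The two-router loop is suppressed at the second hop, but the three-router cycle circulates until the hop limit is reached, which corresponds to the cyclic loops that this approach does not suppress.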

This mechanism operates identically for "bad-news" events, "good-news" events, and SRLG failure.

9. Compatibility Issues

Deployment of any micro-loop-control mechanism is a major change to a network. Full consideration must be given to interoperation between routers that are capable of micro-loop control and those that are not. Additionally, there may be a desire to limit the complexity of micro-loop control by choosing a method based purely on its simplicity. Any such decision must take into account that if a more capable scheme is needed in the future, its deployment might be complicated by interaction with the scheme previously deployed.

10. Comparison of Loop-Free Convergence Methods

PLSN [ANALYSIS] is an efficient mechanism to prevent the formation of micro-loops but is only a partial solution. It is a useful adjunct to some of the complete solutions but may need modification.

Incremental cost advertisement in its simplest form is impractical as a general solution because it takes too long to complete. Optimized incremental cost advertisement, however, completes in much less time and requires no assistance from other routers in the network. It is therefore useful for network-reconfiguration operations.

Packet marking is probably impractical because of the need to find the marking bit and to change the forwarding behavior.

Of the remaining methods, distributed tunnels is significantly more complex than nearside or farside tunnels and should only be considered if there is a requirement to distribute the tunnel decapsulation load.

Synchronised FIBs is a fast method but has the issue that a suitable synchronization mechanism needs to be defined. One method would be to use NTP [RFC1305]; however, the coupling of routing convergence to a protocol that uses the network may be a problem. During the transition, there will be some micro-looping for a short interval because it is not possible to achieve complete synchronization of the FIB changeover.

The ordered FIB mechanism has the major advantage that it is a control-plane-only solution. However, SRLGs require a per-destination calculation and the convergence delay may be high, bounded by the network diameter. The use of signaling as an accelerator may reduce the number of destinations that experience the full delay, and hence reduce the total re-convergence time to an acceptable period.

The nearside and farside tunnel methods deal relatively easily with SRLGs and uncorrelated changes. The convergence delay would be small. However, these methods require the use of tunneled forwarding, which is not supported on all router hardware, and raises issues of forwarding performance. When used with PLSN, the amount of traffic that was tunneled would be significantly reduced, thus reducing the forwarding performance concerns. If the selected repair mechanism requires the use of tunnels, then a tunnel-based loop prevention scheme may be acceptable.

11. Security Considerations

This document analyzes the problem of micro-loops and summarizes a number of potential solutions that have been proposed. These solutions require only minor modifications to existing routing protocols and therefore do not add additional security risks. However, a full security analysis would need to be provided within the specification of a particular solution proposed for deployment.

12. Acknowledgments

The authors would like to acknowledge contributions to this document made by Clarence Filsfils.

13. Informative References

[ANALYSIS] Zinin, A., "Analysis and Minimization of Microloops in Link-state Routing Protocols", Work in Progress, October 2005.

[FRR-TUNN] Bryant, S., Filsfils, C., Previdi, S., and M. Shand, "IP Fast Reroute using tunnels", Work in Progress, November 2007.

[LF-TIMERS] Atlas, A., Bryant, S., and M. Shand, "Synchronisation of Loop Free Timer Values", Work in Progress, February 2008.

[NOT-VIA] Shand, M., Bryant, S., and S. Previdi, "IP Fast Reroute Using Not-via Addresses", Work in Progress, July 2009.

[OPT] Francois, P., Shand, M., and O. Bonaventure, "Disruption free topology reconfiguration in OSPF networks", IEEE INFOCOM May 2007, Anchorage.

[RFC1305] Mills, D., "Network Time Protocol (Version 3) Specification, Implementation", RFC 1305, March 1992.

[RFC2475] Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z., and W. Weiss, "An Architecture for Differentiated Services", RFC 2475, December 1998.

[RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition of Explicit Congestion Notification (ECN) to IP", RFC 3168, September 2001.

[RFC4090] Pan, P., Swallow, G., and A. Atlas, "Fast Reroute Extensions to RSVP-TE for LSP Tunnels", RFC 4090, May 2005.

[RFC5036] Andersson, L., Minei, I., and B. Thomas, "LDP Specification", RFC 5036, October 2007.

[RFC5714] Shand, M. and S. Bryant, "IP Fast Reroute Framework", RFC 5714, January 2010.

[SIG] Francois, P. and O. Bonaventure, "Avoiding transient loops during IGP convergence", IEEE INFOCOM March 2005, Miami.

[oFIB] Francois, P., "Loop-free convergence using oFIB", Work in Progress, February 2008.

Authors' Addresses

   Mike Shand
   Cisco Systems
   250, Longwater Ave,
   Green Park,
   Reading, RG2 6GB
   United Kingdom

   EMail: mshand@cisco.com

   Stewart Bryant
   Cisco Systems
   250, Longwater Ave,
   Green Park,
   Reading, RG2 6GB
   United Kingdom

   EMail: stbryant@cisco.com