Network Working Group                                      D. Meyer, Ed.
Request for Comments: 4984                                 L. Zhang, Ed.
Category: Informational                                     K. Fall, Ed.
                                                          September 2007

Report from the IAB Workshop on Routing and Addressing


Status of This Memo


This memo provides information for the Internet community. It does not specify an Internet standard of any kind. Distribution of this memo is unlimited.


Abstract


This document reports the outcome of the Routing and Addressing Workshop that was held by the Internet Architecture Board (IAB) on October 18-19, 2006, in Amsterdam, Netherlands. The primary goal of the workshop was to develop a shared understanding of the problems that the large backbone operators are facing regarding the scalability of today's Internet routing system. The key workshop findings include an analysis of the major factors that are driving routing table growth, constraints in router technology, and the limitations of today's Internet addressing architecture. It is hoped that these findings will serve as input to the IETF community and help identify next steps towards effective solutions.


Note that this document is a report on the proceedings of the workshop. The views and positions documented in this report are those of the workshop participants and not of the IAB. Furthermore, note that work on issues related to this workshop report is continuing, and this document does not intend to reflect the increased understanding of issues nor to discuss the range of potential solutions that may be the outcome of this ongoing work.


Table of Contents


   1.  Introduction
   2.  Key Findings from the Workshop
     2.1.  Problem #1: The Scalability of the Routing System
       2.1.1.  Implications of DFZ RIB Growth
       2.1.2.  Implications of DFZ FIB Growth
     2.2.  Problem #2: The Overloading of IP Address Semantics
     2.3.  Other Concerns
     2.4.  How Urgent Are These Problems?
   3.  Current Stresses on the Routing and Addressing System
     3.1.  Major Factors Driving Routing Table Growth
       3.1.1.  Avoiding Renumbering
       3.1.2.  Multihoming
       3.1.3.  Traffic Engineering
     3.2.  IPv6 and Its Potential Impact on Routing Table Size
   4.  Implications of Moore's Law on the Scaling Problem
     4.1.  Moore's Law
       4.1.1.  DRAM
       4.1.2.  Off-chip SRAM
     4.2.  Forwarding Engines
     4.3.  Chip Costs
     4.4.  Heat and Power
     4.5.  Summary
   5.  What Is on the Horizon
     5.1.  Continual Growth
     5.2.  Large Numbers of Mobile Networks
     5.3.  Orders of Magnitude Increase in Mobile Edge Devices
   6.  What Approaches Have Been Investigated
     6.1.  Lessons from MULTI6
     6.2.  SHIM6: Pros and Cons
     6.3.  GSE/Indirection Solutions: Costs and Benefits
     6.4.  Future for Indirection
   7.  Problem Statements
     7.1.  Problem #1: Routing Scalability
     7.2.  Problem #2: The Overloading of IP Address Semantics
       7.2.1.  Definition of Locator and Identifier
       7.2.2.  Consequence of Locator and Identifier Overloading
       7.2.3.  Traffic Engineering and IP Address Semantics Overload
     7.3.  Additional Issues
       7.3.1.  Routing Convergence
       7.3.2.  Misaligned Costs and Benefits
       7.3.3.  Other Concerns
     7.4.  Problem Recognition
   8.  Criteria for Solution Development
     8.1.  Criteria on Scalability
     8.2.  Criteria on Incentives and Economics
     8.3.  Criteria on Timing
     8.4.  Consideration on Existing Systems
     8.5.  Consideration on Security
     8.6.  Other Criteria
     8.7.  Understanding the Tradeoff
   9.  Workshop Recommendations
   10. Security Considerations
   11. Acknowledgments
   12. Informative References
   Appendix A.  Suggestions for Specific Steps
   Appendix B.  Workshop Participants
   Appendix C.  Workshop Agenda
   Appendix D.  Presentations
1. Introduction

It is commonly recognized that today's Internet routing and addressing system is facing serious scaling problems. The ever-increasing user population, as well as multiple other factors including multi-homing, traffic engineering, and policy routing, have been driving the growth of the Default Free Zone (DFZ) routing table size at an increasing and potentially alarming rate [DFZ][BGT04]. While it has been long recognized that the existing routing architecture may have serious scalability problems, effective solutions have yet to be identified, developed, and deployed.


As a first step towards tackling these long-standing concerns, the IAB held a "Routing and Addressing Workshop" in Amsterdam, Netherlands on October 18-19, 2006. The main objectives of the workshop were to identify existing and potential factors that have major impacts on routing scalability, and to develop a concise problem statement that may serve as input to a set of follow-on activities. This document reports on the outcome from that workshop.


The remainder of the document is organized as follows: Section 2 provides an executive summary of the workshop findings. Section 3 describes the sources of stress in the current global routing and addressing system. Section 4 discusses the relationship between Moore's law and our ability to build large routers. Section 5 describes a few foreseeable factors that may exacerbate the current problems outlined in Section 2. Section 6 describes previous work in this area. Section 7 describes the problem statements in more detail, and Section 8 discusses the criteria that constrain the solution space. Finally, Section 9 summarizes the recommendations made by the workshop participants.


The workshop participant list is attached in Appendix B. The agenda can be found in Appendix C, and Appendix D provides pointers to the presentations from the workshop.


Finally, note that this document is a report on the outcome of the workshop, not an official document of the IAB. Any opinions expressed are those of the workshop participants and not of the IAB.


2. Key Findings from the Workshop

This section provides a concise summary of the key findings from the workshop. While many other aspects of a routing and addressing system were discussed, the first two problems described in this section were deemed the most important ones by the workshop participants.


The clear, highest-priority takeaway from the workshop is the need to devise a scalable routing and addressing system, one that is scalable in the face of multihoming, and that facilitates a wide spectrum of traffic engineering (TE) requirements. Several scalability problems of the current routing and addressing systems were discussed, most related to the size of the DFZ routing table (frequently referred to as the Routing Information Base, or RIB) and its implications. Those implications included (but were not limited to) the sizes of the DFZ RIB and FIB (the Forwarding Information Base), the cost of recomputing the FIB, concerns about the BGP convergence times in the presence of growing RIB and FIB sizes, and the costs and power (and hence heat dissipation) properties of the hardware needed to route traffic in the core of the Internet.


2.1. Problem #1: The Scalability of the Routing System

The shape of the growth curve of the DFZ RIB has been the topic of much research and discussion since the early days of the Internet [H03]. There have been various hypotheses regarding the sources of this growth. The workshop identified the following factors as the main driving forces behind the rapid growth of the DFZ RIB:


o Multihoming,


o Traffic engineering,


o Non-aggregatable address allocations (a big portion of which is inherited from historical allocations), and


o Business events, such as mergers and acquisitions.


All of the above factors can lead to prefix de-aggregation and/or the injection of unaggregatable prefixes into the DFZ RIB. Prefix de-aggregation leads to an uncontrolled DFZ RIB growth because, absent some non-topologically based routing technology (for example, Routing On Flat Labels [ROFL] or any name-independent compact routing algorithm, e.g., [CNIR]), topological aggregation is the only known practical approach to control the growth of the DFZ RIB. The following section reviews the workshop discussion of the implications of the growth of the DFZ RIB.
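
As a rough illustration of why topological aggregation limits table growth, the following minimal Python sketch uses the standard-library ipaddress module. The prefixes are taken from the documentation address ranges and are assumptions chosen for illustration only; they do not come from the workshop material.

```python
import ipaddress

def aggregate(prefixes):
    """Collapse a list of prefix strings into the fewest covering prefixes."""
    networks = [ipaddress.ip_network(p) for p in prefixes]
    return [str(n) for n in ipaddress.collapse_addresses(networks)]

# Four contiguous /26 prefixes inside one provider aggregate (hypothetical).
aggregatable = [
    "192.0.2.0/26", "192.0.2.64/26", "192.0.2.128/26", "192.0.2.192/26",
]

# Three prefixes that share no covering aggregate (hypothetical).
unaggregatable = ["192.0.2.0/26", "198.51.100.0/26", "203.0.113.0/26"]

aggregated = aggregate(aggregatable)        # collapses to a single /24
not_aggregated = aggregate(unaggregatable)  # remains three separate entries
```

In this sketch, the topologically contiguous prefixes cost the DFZ one entry, while the scattered prefixes each cost one entry regardless of their size.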


2.1.1. Implications of DFZ RIB Growth

Presentations made at the workshop showed that the DFZ RIB has been growing at greater than linear rates for several years [DFZ]. While this has the obvious effects on the requirements for RIB and FIB memory sizes, the growth driven by prefix de-aggregation also exposes the core of the network to the dynamic nature of the edges, i.e., the de-aggregation leads to an increased number of BGP UPDATE messages injected into the DFZ (frequently referred to as "UPDATE churn"). Consequently, additional processing is required to maintain state for the longer prefixes and to update the FIB. Note that, although the size of the RIB is bounded by the given address space size and the number of reachable hosts (i.e., O(m*2^32) for IPv4, where <m> is the average number of peers each BGP router may have), the amount of protocol activity required to distribute dynamic topological changes is not. That is, the amount of BGP UPDATE churn that the network can experience is essentially unbounded. It was also noted that the UPDATE churn, as currently measured, is heavy-tailed [ATNAC2006]. That is, a relatively small number of Autonomous Systems (ASs) or prefixes are responsible for a disproportionately large fraction of the UPDATE churn that we observe today. Furthermore, much of the churn may turn out to be unnecessary information, possibly due to instability of edge ASs being injected into the global routing system [DynPrefix], or arbitrage of some bandwidth pricing model (see [GIH], for example, or the discussion of the behavior of AS 9121 in [BGP2005]).
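
The bound quoted above can be evaluated directly. The sketch below simply computes m*2^32 for a given average peer count; the value of 10 peers is an assumption chosen for illustration, not a figure reported at the workshop.

```python
def rib_upper_bound(avg_peers, address_bits=32):
    """Evaluate the m * 2^address_bits bound on RIB entries."""
    return avg_peers * 2 ** address_bits

# Hypothetical router with an average of 10 BGP peers.
bound_ipv4 = rib_upper_bound(avg_peers=10)
```

The result is very large but finite. No comparable expression bounds the number of UPDATE messages, since the same prefixes can be announced and withdrawn any number of times.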


Finally, it was noted by the workshop participants that the UPDATE churn situation may be exacerbated by the current Regional Internet Registry (RIR) policy in which end sites are allocated Provider-Independent (PI) addresses. These addresses are not topologically aggregatable, and as such, bring the churn problem described above into the core routing system. Of course, as noted by several participants, the RIRs have no real choice in this matter, as many enterprises demand PI addresses that allow them to multihome without the "provider lock" that Provider-Allocated (PA) [PIPA] address space creates. Some enterprises also find the renumbering cost associated with PA address assignments unacceptable.


2.1.2. Implications of DFZ FIB Growth

One surprising outcome of the workshop was the observation made by Tony Li about the relationship between "Moore's Law" [ML] and our ability to build cost-effective, high-performance routers (see Appendix D). "Moore's Law" is the empirical observation that the transistor density of integrated circuits, with respect to minimum component cost, doubles roughly every 24 months. A commonly held belief is that Moore's Law will save the day by ensuring that technology continues to scale at historical rates that surpass the growth rate of routing information handled by core router hardware. However, Li pointed out that Moore's Law does not apply to building high-end routers as far as cost is concerned.


Moore's Law applies specifically to the high-volume portion of the semiconductor industry, while the low-volume, customized silicon used in core routing is well off Moore's Law's cost curve. In particular, off-chip SRAM is commonly used for storing FIB data, and the driver for low-latency, high-capacity SRAM used to be PC cache memory. However, cache memory has recently been migrating directly onto the processor die, and cell phones are now the primary driver for off-chip SRAM. Given that cell phones require low-power, small-capacity parts that are not applicable to high-end routers, the SRAMs that are favored for router design are not volume parts and do not track Moore's Law.


2.2. Problem #2: The Overloading of IP Address Semantics

One of the fundamental assumptions underlying the scalability of routing systems was eloquently stated by Yakov Rekhter (and is sometimes referred to as "Rekhter's Law"), namely:


"Addressing can follow topology or topology can follow addressing. Choose one."


The same idea was expressed by Mike O'Dell's design of an alternate address architecture for IPv6 [GSE], where the address structure was designed specifically to enable "aggressive topological aggregation" to scale the routing system. Noel Chiappa has also written extensively on this topic (see, e.g., [EID]).


There is, however, a difficulty in creating (and maintaining) the kind of congruence envisioned by Rekhter's Law in today's Internet. The difficulty arises from the overloading of addressing with the semantics of both "who" (endpoint identifier, as used by transport layer) and "where" (locators for the routing system); some might also add that IP addresses are also overloaded with "how" [GIH]. In any event, this kind of overloading is felt to have had deep implications for the scalability of the global routing system.


A refinement to Rekhter's Law, then, is that for the Internet routing system to scale, an IP address must be assigned in such a way that it is congruent with the Internet's topology. However, because identifiers are typically assigned based upon organizational (not topological) structure and have stability as a desirable property, a "natural incongruence" arises. As a result, it is difficult (if not impossible) to make a single number space serve both purposes efficiently.


Following the logic of the previous paragraphs, workshop participants concluded that the so-called "locator/identifier overload" of the IP address semantics is one of the causes of the routing scalability problem that we see today. Thus, a "split" seems necessary to scale the routing system, although how to actually architect and implement such a split was not explored in detail.


2.3. Other Concerns

In addition to the issues described in Section 2.1 and Section 2.2, the workshop participants also identified the following three pressing, but "second tier", issues.


The first one is a general concern with IPv6 deployment. It is commonly believed that the IPv4 address space has put an effective constraint on the IPv4 RIB growth. Once this constraint is lifted by the deployment of IPv6, and in the absence of a scalable routing strategy, the rapid DFZ RIB size growth problem today can potentially be exacerbated by IPv6's much larger address space. The only routing paradigm available today for IPv6 is a combination of Classless Inter-Domain Routing (CIDR) [RFC4632] and Provider-Independent (PI) address allocation strategies [PIPA] (and possibly SHIM6 [SHIM6] when that technology is developed and deployed). Thus, the opportunity exists to create a "swamp" (unaggregatable address space) that can be many orders of magnitude larger than what we faced with IPv4. In short, the advent of IPv6 and its larger address space further underscores both the concerns raised in Section 2.1, and the importance of resolving the architectural issue raised in Section 2.2.


The second issue is slow routing convergence. In particular, the concern was that growth in the number of routes that service providers must carry will cause routing convergence to become a significant problem.


The third issue is the misalignment of costs and benefits in today's routing system. While the IETF does not typically consider the "business model" impacts of various technology choices, many participants felt that perhaps the time has come to review that philosophy.


2.4. How Urgent Are These Problems?

There was fairly universal agreement among the workshop participants that the problems outlined in Section 2.1 and Section 2.2 need immediate attention. This need was not because the participants perceived a looming, well-defined "hit the wall" date, but rather because these are difficult problems that have so far resisted solution and are likely to become more unwieldy as IPv6 deployment proceeds, and because the development and deployment of an effective solution will necessarily take at least a few years.


3. Current Stresses on the Routing and Addressing System

The primary concern voiced by the workshop participants regarding the state of the current Internet routing system was the rapid growth of the DFZ RIB. The number of entries in 2005 ranged from about 150,000 entries to 175,000 entries [BGP2005]; this number has reached 200,000 as of October 2006 [CIDRRPT], and is projected to increase to 370,000 or more within 5 years [Fuller]. Some workshop participants projected that the DFZ could reach 2 million entries within 15 years, and there might be as many as 10 million multihomed sites by 2050.
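
For a sense of scale, the projections above can be converted into implied compound annual growth rates. The sketch below is only arithmetic on the figures quoted in this section; it assumes, for illustration, that the 15-year projection starts from the October 2006 value of 200,000 entries, which the workshop participants did not state explicitly.

```python
def implied_annual_growth(start, end, years):
    """Compound annual growth rate that takes `start` to `end` in `years`."""
    return (end / start) ** (1 / years) - 1

# 200,000 entries growing to 370,000 within 5 years.
five_year_rate = implied_annual_growth(200_000, 370_000, 5)

# 200,000 entries growing to 2 million within 15 years (assumed baseline).
fifteen_year_rate = implied_annual_growth(200_000, 2_000_000, 15)
```

Under these assumptions, the two projections correspond to sustained growth of roughly 13% and 17% per year, respectively.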


Another related concern was the number of prefixes changed, added, and withdrawn as a function of time (i.e., BGP UPDATE churn). This has a detrimental impact on routing convergence, since UPDATEs frequently necessitate a re-computation and download of the FIB. For example, a BGP router may observe up to 500,000 BGP updates in a single day [DynPrefix], with the peak arrival rates over 1000 updates per second. Such UPDATE churn problems are not limited to DFZ routes; indeed, the number of internal routes carried by large ISPs also threatens convergence times, given that such internal routes include more specifics, Virtual Private Network (VPN) routes, and other routes that do not appear in the DFZ [ATNAC2006].
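
The gap between average and peak load implied by these figures can be worked out as follows. This is a minimal sketch using only the two numbers quoted above (500,000 updates per day and a peak of 1000 updates per second); the variable names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def average_update_rate(updates_per_day):
    """Average number of BGP updates per second over one day."""
    return updates_per_day / SECONDS_PER_DAY

avg_rate = average_update_rate(500_000)  # a little under 6 updates per second
peak_to_average = 1000 / avg_rate        # ratio of peak rate to average rate
```

A peak rate more than two orders of magnitude above the daily average suggests that update processing has to be provisioned for bursts rather than for the mean.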


3.1. Major Factors Driving Routing Table Growth

The growth of the DFZ RIB results from the addition of more prefixes to the table. Although some of this growth is organic (i.e., results simply from growth of the Internet), a large portion of the growth results from de-aggregation of address prefixes (i.e., more specific prefixes). In this section, we discuss in more detail why this trend is accelerating and may be cause for concern.


An increasing fraction of the more-specific prefixes found in the DFZ are due to deliberate action on the part of operators [ATNAC2006]. Motivations to advertise these more-specifics include:


o Traffic Engineering, where load is balanced across multiple links through selective advertisement of more-specific routes on different links to adjust the amount of traffic received on each; and


o Attempts to prevent prefix-hijacking by other operators who might advertise more-specifics to steer traffic toward them; there are several known instances of this behavior today [BHB06].


3.1.1. Avoiding Renumbering

The workshop participants noted that customers generally prefer to have PI address space. Having PI space gives them additional agility in selecting ISPs and helps them avoid the need to renumber. Many end systems use DHCP to assign addresses, so a cursory analysis might suggest that renumbering requires modifying only a modest number of routers and servers (perhaps rather than end hosts) at a site that is forced to renumber.


In reality, however, renumbering can be more cumbersome because IP addresses are often used for other purposes such as access control lists. They are also sometimes hard-coded into applications used in environments where failure of the DNS would be catastrophic (e.g., some remote monitoring applications). Although renumbering may be a mild inconvenience for some sites and guidelines have been developed for renumbering a network without a flag day [RFC4192], for others, the necessary changes are sufficiently difficult so as to make renumbering effectively impossible.


For these reasons, PI address space is sought by a growing number of customers. Current RIR policy reflects this trend: PI prefixes are allocated to all customers who claim a need. Routing PI prefixes requires additional entries in the DFZ routing and forwarding tables. At present, ISPs do not typically charge to route PI prefixes. Therefore, the "costs" of the additional prefixes, in terms of routing table entries and processing overhead, are borne by the global routing system as a whole, rather than directly by the users of PI space. The workshop participants observed that no strong disincentive exists to discourage the increasing use of PI address space.


3.1.2. Multihoming

Multihoming refers generically to the case in which a site is served by more than one ISP [RFC4116]. There are several reasons for the observed increase in multihoming, including the increased reliance on the Internet for mission- and business-critical applications and the general decrease in cost to obtain Internet connectivity. Multihoming provides backup routing -- Internet connection redundancy; in some circumstances, multihoming is mandatory due to contract or law. Multihoming can be accomplished using either PI or PA address space, and multihomed sites generally have their own AS numbers (although some do not; this generally occurs when such customers are statically routed).


A multihomed site using PI address space has its prefixes present in the forwarding and routing tables of each of its providers. For PA space, each prefix allocated from one provider's address allocation will be aggregatable for that provider but not the others. If the addresses are allocated from a 'primary' ISP (i.e., one that the site uses for routing unless a failure occurs), then the additional routing table entries only appear during path failures to that primary ISP. A problem with multihoming arises when a customer's PA IP prefixes are advertised by AS(es) other than that of their 'primary' ISP. Because of the longest-matching prefix forwarding rule, in this case, the customer's traffic will be directed through the non-primary AS(es). In response, the primary ISP is forced to de-aggregate the customer's prefix in order to keep the customer's traffic flowing through it instead of the non-primary AS(es).
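
The effect of the longest-matching prefix rule in this scenario can be sketched in a few lines of Python. The forwarding table, prefixes, and next-hop labels below are hypothetical and chosen only for illustration; a real FIB lookup is implemented very differently in hardware.

```python
import ipaddress

def longest_prefix_match(fib, destination):
    """Return the next hop of the most specific FIB entry covering destination."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in fib.items() if addr in net]
    if not matches:
        return None
    # The entry with the greatest prefix length wins.
    return max(matches, key=lambda item: item[0].prefixlen)[1]

# Hypothetical view from a remote router: only the primary ISP's aggregate.
fib_aggregate_only = {
    ipaddress.ip_network("198.51.100.0/24"): "primary-isp",
}

# Same router after a second AS advertises the customer's more-specific prefix.
fib_with_more_specific = dict(fib_aggregate_only)
fib_with_more_specific[ipaddress.ip_network("198.51.100.64/26")] = "non-primary-as"

before = longest_prefix_match(fib_aggregate_only, "198.51.100.70")
after = longest_prefix_match(fib_with_more_specific, "198.51.100.70")
```

Once the more-specific /26 is present, it attracts the customer's traffic regardless of the aggregate. The primary ISP can only compete by announcing the same /26 itself, which adds another entry to the DFZ.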


3.1.3. Traffic Engineering

Traffic engineering (TE) is the act of arranging for certain Internet traffic to use or avoid certain network paths (that is, TE puts traffic where capacity exists, or where some set of parameters of the path is more favorable to the traffic being placed there). TE is performed by both ISPs and customer networks, for three primary reasons:


o First, as mentioned above, to match traffic with network capacity, or to spread the traffic load across multiple links (frequently referred to as "load balancing").


o Second, to reduce costs by shifting traffic to lower cost paths or by balancing the incoming and outgoing traffic volume to maintain appropriate peering relations.


o Finally, TE is sometimes deployed to enforce certain forms of policy (e.g., Canadian government traffic may not be permitted to transit through the United States).


Few tools exist for inter-domain traffic engineering today. Network operators usually achieve traffic engineering by "tweaking" the processing of routing protocols to achieve desired results. At the BGP level, if the address range requiring TE is a portion of a larger PA address aggregate, network operators implementing TE are forced to de-aggregate otherwise aggregatable prefixes in order to steer the traffic of the particular address range to specific paths.
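
One way to picture the de-aggregation described above is the following minimal sketch, which splits an aggregate into two halves and plans a different more-specific on each of two links. The prefix, the link names, and the choice to keep announcing the covering aggregate on both links as a fallback are all assumptions made for illustration.

```python
import ipaddress

def te_announcements(aggregate, link_a, link_b):
    """Plan announcements that steer each half of an aggregate to one link."""
    net = ipaddress.ip_network(aggregate)
    low, high = net.subnets(prefixlen_diff=1)
    return {
        # Each link carries the aggregate (fallback) plus one more-specific.
        link_a: [str(net), str(low)],
        link_b: [str(net), str(high)],
    }

plan = te_announcements("203.0.113.0/24", "link-a", "link-b")

# Distinct prefixes that other networks now have to carry.
distinct_prefixes = {p for prefixes in plan.values() for p in prefixes}
```

In this sketch, a single aggregatable prefix becomes three DFZ entries, which is the table growth that results from this style of traffic engineering.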


In today's highly competitive environment, providers require TE to maintain good performance and low cost in their networks. However, the current practice of TE deployment results in an increase of the DFZ RIB; although individual operators may have a certain gain from doing TE, it leads to an overall increased cost for the Internet routing infrastructure as a whole.


3.2. IPv6 and Its Potential Impact on Routing Table Size

Due to the increased IPv6 address size over IPv4, a full immediate transition to IPv6 is estimated to lead to the RIB and FIB sizes increasing by a factor of about four. The size of the routing table based on a more realistic assumption, that of parallel IPv4 and IPv6 routing for many years, is less clear. An increasing number of allocated IPv6 address prefixes are in PI space. ARIN [ARIN] has relaxed its policy for allocation of such space and has been allocating /48 prefixes when customers request PI prefixes. Thus, the same pressures affecting IPv4 address allocations also affect IPv6 allocations.


4. Implications of Moore's Law on the Scaling Problem

[Editor's note: The information in this section is gathered from presentations given at the workshop. The presentation slides can be retrieved from the pointer provided in Appendix D. It is worth noting that this information has generated quite a bit of discussion since the workshop, and as such requires further community input.]


The workshop heard from Tony Li about the relationship between Moore's law and the ability to build cost-effective, high-performance routers. The scalability of the current routing subsystem manifests itself in the forwarding table (FIB) and routing table (RIB) of the routers in the core of the Internet. The implementation choices for FIB storage are on-chip SRAM, off-chip SRAM, or DRAM. DRAM is commonly used in lower end devices. RIB storage is done via DRAM.

[Editor's note: The exact implementation of a high-performance router's RIB and FIB memories is the subject of much debate; it is also possible that alternative designs may appear in the future.]


The scalability question then becomes whether these memory technologies can scale faster than the size of the full routing table. Intrinsic in this statement is the assumption that core routers will be continually and indefinitely upgraded on a periodic basis to keep up with the technology curve and that the costs of those upgrades will be passed along to the general Internet community.


4.1. Moore's Law

In 1965, Gordon Moore projected that the density of transistors in integrated circuits could double every two years, with respect to minimum component cost. The period was subsequently adjusted to between 18 and 24 months, and this conjecture became known as Moore's Law [ML]. The semiconductor industry has been following this density trend for the last 40 or so years.

The commonly held wisdom is that Moore's law will save the day by ensuring that technology will continue to scale at a historical rate that will surpass the growth rate of routing information. However, it is vital to understand that Moore's law comes out of the high-volume portion of the semiconductor industry, where the costs of silicon are dominated by the actual fabrication costs. The customized silicon used in core routers is produced in far lower volume, typically in the range of 1,000-10,000 parts per year, whereas microprocessors are running in the tens of millions per year. This places the router silicon well off the cost curve, where the economies of scale are not directly inherited, and yield improvements are not directly inherited from the best current practices. Thus, router silicon benefits from the technological advances made in semiconductors, but does not follow Moore's law from a cost perspective.


To date, this cost difference has not shown clearly. However, the growth in bandwidth of the Internet and the steady climb of the speed of individual links has forced router manufacturers to apply more sophisticated silicon technology continuously. There has been a new generation of router hardware that has grown at about 4x the bandwidth every three years, and increases in routing table size have been absorbed by the new generations of hardware. Now that router hardware is nearing the practical limits of per-lambda bandwidth, it is possible that upgrades solely for meeting the forwarding table scaling will become more visible.


4.1.1. DRAM

In routers, DRAM is used for storing the RIB and, in lower-end routers, is also used for storing the FIB. Historically, DRAM capacity grows at about 4x every 3.3 years. This translates to 2.4x every 2 years, so DRAM capacity actually grows faster than Moore's law would suggest. DRAM speed, however, only grows about 10% per year, or 1.2x every 2 years [DRAM] [Molinero]. This is an issue because BGP convergence time is limited by DRAM access speeds. In processing a BGP update, a BGP speaker receives a path and must compare it to all of the other paths it has stored for the prefix. It then iterates over all of the prefixes in the update stream. This results in a memory access pattern that has proven to limit the effectiveness of processor caching. As a result, BGP convergence time degrades at the routing table growth rate, divided by the speed improvement rate of DRAM. In the long run, this is likely to become a significant issue.
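The growth figures above can be put on a common two-year basis with a short calculation. This is a sketch only: the helper names are hypothetical, and the 1.3x per 2 years table growth used as input is the baseline figure quoted in Section 4.5, taken here as an assumption. Compounding 4x per 3.3 years gives roughly 2.3x per 2 years, in the same range as the 2.4x quoted above.

```python
def rescale_growth(factor, period_years, target_years):
    """Convert a compound growth factor from one period length to another."""
    return factor ** (target_years / period_years)

def convergence_slowdown(table_growth_2yr, dram_speed_2yr, years):
    """Relative BGP convergence time after `years`, assuming it scales as
    table growth divided by DRAM speed improvement (both per 2 years)."""
    return (table_growth_2yr / dram_speed_2yr) ** (years / 2.0)

dram_capacity_2yr = rescale_growth(4.0, 3.3, 2.0)  # about 2.3x per 2 years
dram_speed_2yr = rescale_growth(1.10, 1.0, 2.0)    # about 1.21x per 2 years

# Assumed input: routing table growth of 1.3x per 2 years (Section 4.5).
slowdown_10yr = convergence_slowdown(1.3, dram_speed_2yr, 10)
```

Under these assumptions, the slowdown factor is greater than one whenever the table grows faster than DRAM speed improves, and it compounds over time.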


4.1.2. Off-chip SRAM

Storing the FIB in off-chip SRAM is a popular design decision. For high-speed interfaces, this requires low-latency, high-capacity parts. The driver for this type of SRAM was formerly PC cache memory. However, this cache memory has recently been migrating directly onto the processor die, so that the volumes of cache memory have fallen off. Today, the primary driver for off-chip SRAM is cell phones, which require low-power, small-capacity parts that are not applicable to high-end router design. As a result, the SRAMs that are favored for router design are not volume parts. They have fallen off the cost curve and do not track with Moore's law.


4.2. Forwarding Engines

For many years, router companies have been building special-purpose silicon to provide high-speed packet-forwarding capabilities. This has been necessary because the architectural limitations of general purpose CPUs make them incapable of providing the high-bandwidth, low latency, low-jitter I/O interface for making high speed forwarding decisions.


As a result, the forwarding engines being built for high-end routers are some of the most sophisticated Application-specific Integrated Circuits (ASICs) being built, and are currently only one technological step behind general-purpose CPUs. This has been largely driven by the growth in bandwidth and has already pushed the technology well beyond the knee in the price/performance curve. Given that this level of technology is already a requirement to meet the performance goals, using on-chip SRAM is an interesting design alternative. If this choice is selected, then growth in the available FIB is tightly coupled to process technology improvements, which are driven by the general-purpose CPU market. While this growth rate should suffice, in general, the forwarding engine market is decidedly off the high-volume price curve, resulting in spiraling costs to support basic forwarding.


Moreover, if there is any change in Moore's law or decrease in the rate of processor technology evolution, the forwarding engine could quickly become the technological leader of silicon technology. This would rapidly result in forwarding technology becoming prohibitively expensive.


4.3. Chip Costs

Each process technology step in chip development has come at increasing cost. The milestone of sending a completed chip design to a fabricator for manufacturing is known as 'tapeout', and is the point where the designer pays for the fixed overhead of putting the chip into production. The costs of taping out a chip have been rising about 1.5x every 2 years, driven by new process technology. The actual design and development costs have been rising similarly, because each new generation of technology increases the device count by roughly a factor of 2. This allows new features and chip architectures, which inevitably lead to an increase in complexity and labor costs. If new chip development were driven solely by the need to scale up memory, and if memory structures scaled, then we would expect labor costs to remain fixed. Unfortunately, memory structures typically do not seem to scale linearly. Individual memory controllers have a non-negligible cost, leading to the design for an internal on-chip interconnect of memories. The net result is that we can expect chip development costs to continue to escalate roughly in line with the increases in tapeout costs, leading to an ongoing cost curve of about 1.5x every 2 years. Since each technology step roughly doubles memory, that implies that if demand grows faster than about (2x/1.5x) = 1.3x per 2 years, then technology refresh will not be able to remain on a constant cost curve.
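The constant-cost argument above reduces to a ratio, sketched below. The function names are hypothetical, and the model assumes one technology step every 2 years with memory doubling and cost rising 1.5x per step, as stated in the text; it is an illustration of the arithmetic, not a cost model of any real product.

```python
def sustainable_growth(memory_growth_per_step, cost_growth_per_step):
    """Demand growth per step that can be absorbed at constant cost."""
    return memory_growth_per_step / cost_growth_per_step

def relative_cost(demand_growth_per_step, sustainable_per_step, steps):
    """Cost multiplier after `steps` technology steps, relative to today."""
    return (demand_growth_per_step / sustainable_per_step) ** steps

# 2x memory per step at 1.5x cost per step: about 1.3x demand growth per step.
rate = sustainable_growth(2.0, 1.5)

# Demand that merely matches the sustainable rate keeps cost flat; demand
# that tracks memory density (2x per step) makes cost grow with every step.
flat = relative_cost(rate, rate, 5)
rising = relative_cost(2.0, rate, 5)
```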


4.4. Heat and Power

Transistors consume power both when idle ("leakage current") and when switching. The smaller and hotter the transistors, the larger the leakage current. The overall power consumption is not linear with the density increase. Thus, as the need for more powerful routers increases, cooling technology grows more taxed. At present, the existing air cooling system is starting to be a limiting factor for scaling high-performance routers.


A key metric for system evaluation is now forwarding bandwidth per Watt, expressed in (Mb/s)/W. About 60% of the power goes to the forwarding engine circuits, with the rest divided between the memories, route processors, and interconnect. Using parallelization to achieve higher bandwidths can aggravate the situation, due to increased power and cooling demands.


[Editor's note: Many in the community have commented that heat, power consumption, and the attendant heat dissipation, along with size limitations of fabrication processes for high speed parallel I/O interfaces, are the current limiting factors.]


4.5. Summary

Given the uncontrolled nature of its growth rate, there is some concern about the long-term prospects for the health and cost of the routing subsystem of the Internet. The ongoing growth will force periodic technology refreshes. However, the growth rate can possibly exceed the rate that can be supported at constant cost based on the development costs seen in the router industry. Since high-end routing is based on low-volume technology, the cost advantages that the bulk of the broader computing industry see, based on Moore's law, are not directly inherited. This leads to a sustainable growth rate of 1.3x/2yrs for the forwarding table and 1.2x/2yrs for the routing table. Given that the current baseline growth is at 1.3x/2yrs [CIDRRPT], with bursts that even exceed Moore's law, the trend is for the costs of technology refresh to continue to grow, indefinitely, even in constant dollars.


5. What Is on the Horizon

Routing and addressing are two fundamental pieces of the Internet architecture, thus any changes to them will likely impact almost all of the "IP stack", from applications to packet forwarding. In resolving the routing scalability problems, as agreed upon by the workshop attendees, we should aim at a long-term solution. This requires a clear understanding of various trends in the foreseeable future: the growth in Internet user population, the applications, and the technology.


5.1. Continual Growth

The backbone operators expect that the current Internet user population base will continue to expand, as measured by the traffic volume, the number of hosts connected to the Internet, the number of customer networks, and the number of regional providers.


5.2. Large Numbers of Mobile Networks

Boeing's Connexion service pioneered the deployment of commercial mobile networks that may change their attachment points to the Internet on a global scale. It is believed that such in-flight Internet connectivity would likely become commonplace in the not-too-distant future. When that happens, there can be multiple thousands of airplane networks in the air at any given time.


Given that today's DFZ RIB already handles over 200,000 prefixes [CIDRRPT], several thousands of mobile networks, each represented by a single prefix announcement, may not necessarily raise serious routing scalability or stability concerns. However, there is an open question regarding whether this number can become substantially larger if other types of mobile networks, such as networks on trains or ships, come into play. If such mobile networks become commonplace, then their impact on the global routing system needs to be assessed.

5.3. Orders of Magnitude Increase in Mobile Edge Devices

Today's technology trend indicates that billions of hand-held gadgets may come online in the next several years. There were different opinions regarding whether this would, or would not, have a significant impact on global routing scalability. The current solutions for mobile hosts, namely Mobile IP (e.g., [RFC3775]), handle the mobility by one level of indirection through home agents; mobile hosts do not appear any different, from a routing perspective, than stationary hosts. If we follow the same approach, new mobile devices should not present challenges beyond the increase in the size of the host population.


The workshop participants recognized that the increase in the number of mobile devices can be significant, and that if a scalable routing system supporting generic identity-locator separation were developed and introduced, billions of mobile gadgets could be supported without bringing undue impact on global routing scalability and stability.


Further investigation is needed to gain a complete understanding of the implications on the global routing system of connecting many new mobile hand-held devices (including mobile sensor networks) to the Internet.


6. What Approaches Have Been Investigated

Over the years, there have been many efforts designed to investigate scalable inter-domain routing for the Internet [IDR-REQS]. To benefit from the insights obtained from these past results, the workshop reviewed several major previous and ongoing IETF efforts:


1. The MULTI6 working group's exploration of the solution space and the lessons learned,

2. The solution to multihoming being developed by the SHIM6 Working Group, and its pros and cons,

3. The GSE proposal made by O'Dell in 1997, and its pros and cons, and

4. Map-and-Encap [RFC1955], a general indirection-based solution to scalable multihoming support.

6.1. Lessons from MULTI6

The MULTI6 working group was chartered to explore the solution space for scalable support of IPv6 multihoming. The numerous proposals collected by the MULTI6 working group generally fell into one of two major categories: resolving the above-mentioned conflict by using provider-independent address assignments, or by assigning multiple address prefixes to multihomed sites, one for each of its providers, so that all the addresses can be topologically aggregatable.


The first category includes proposals of (1) simply allocating provider-independent address space, which is effectively the current practice, and (2) assigning IP addresses based on customers' geographical locations. The first approach does not scale; the second approach represents a fundamental change to the Internet routing system and its economic model, and imposes undue constraints on ISPs. These proposals were found to be incomplete, as they offered no solutions to the new problems they introduced.


The majority of the proposals fell into the second category: assigning multiple address blocks per site. Because IP addresses have been used as identifiers by higher-level protocols and applications, these proposals face a fundamental design decision regarding which layer should be responsible for mapping the multiple locators (i.e., the multiple addresses received from ISPs) to an identifier. A related question involves which nodes are responsible for handling multiple addresses. One can implement a multi-address scheme at each individual host, at the edge routers of a site, or at both. Handling multiple addresses by edge routers provides the ability to control the traffic flow of the entire site. Conversely, handling multiple addresses by individual hosts offers each host the flexibility to choose different policies for selecting a provider; it also implies changes to all the hosts of a multihomed site.


During the process of evaluating all the proposals, two major lessons were learned:


o Changing anything in the current practice is hard: for example, inserting an additional header into the protocol would impact IP fragmentation processing, and the current congestion control assumes that each TCP connection follows a single routing path. In addition, operators ask for the ability to perform traffic engineering on a per-site basis, and specification of site policy is often interdependent with the IP address structure.

o The IP address has been used as an identifier and has been codified into many Internet applications that manipulate IP addresses directly or include IP addresses within the application layer data stream. IP addresses have also been used as identifiers in configuring network policies. Changing the semantics of an IP address, for example, using only the last 64 bits as an identifier, as proposed by GSE, would require changes to all such applications and network devices.

6.2. SHIM6: Pros and Cons

The SHIM6 working group took the second approach from the MULTI6 working group's investigation, i.e., supporting multihoming through the use of multiple addresses. SHIM6 adopted a host-based approach, where the host IP stack includes a "shim" that presents a stable "upper layer identifier" (ULID) to the upper layer protocols, but may rewrite the IP packets sent and received so that a currently working IP address is used in the transmitted packets. When needed, a SHIM6 header is also included in the packet itself, to signal to the remote stack.


With SHIM6, protocols above the IP layer use the ULID to identify endpoints (e.g., for TCP connections). The current design suggests choosing one of the locators as the ULID (borrowing a locator to be used as an identifier). This approach makes the implementation compatible with existing IPv6 upper layer protocol implementations and applications. Many of these applications have inherited the long time practice of using IP addresses as identifiers.


SHIM6 is able to isolate upper layer protocols from multiple IP layer addresses. This enables a multihomed site to use provider-allocated prefixes, one from each of its multiple providers, to facilitate provider-based prefix aggregation. However, this gain comes with several significant costs. First, SHIM6 requires modifications to all host stack implementations to support the shim processing. Second, the shim layer must maintain the mapping between the identifier and the multiple locators returned from IPv6 AAAA name resolution, and must take the responsibility to try multiple locators if failures ever occur during the end-to-end communication. At this time, the host has little information to determine the order of locators it should use in reaching a multihomed destination; however, there is ongoing effort to address this issue.
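The per-peer state that a shim layer has to keep can be sketched as follows. This is only an illustration of the identifier-to-locator mapping and the locator retry behavior described above; the class ShimContext and its methods are hypothetical and do not reproduce the SHIM6 protocol, its header, or its failure detection mechanism.

```python
class ShimContext:
    """Per-peer state: one stable ULID plus the peer's candidate locators."""

    def __init__(self, ulid, locators):
        self.ulid = ulid                # what upper layer protocols see
        self.locators = list(locators)  # e.g., addresses from AAAA records
        self.failed = set()

    def report_failure(self, locator):
        """Record that a locator stopped working for this peer."""
        self.failed.add(locator)

    def current_locator(self):
        """Return the first locator not known to have failed."""
        for locator in self.locators:
            if locator not in self.failed:
                return locator
        raise RuntimeError("no working locator left for " + self.ulid)

# Hypothetical peer with one address per provider; the ULID is borrowed
# from the first locator, as the current design suggests.
peer = ShimContext("2001:db8:a::1", ["2001:db8:a::1", "2001:db8:b::1"])
first = peer.current_locator()
peer.report_failure(first)
second = peer.current_locator()
```

The ULID stays fixed while the locator in use changes, which is what isolates upper layer protocols from the address switch. The order of the locator list is arbitrary here, reflecting the open issue noted above.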

Furthermore, as a host-based approach, SHIM6 provides little control to the service provider for effective traffic engineering. At the same time, it also imposes additional state information on the host regarding the multiple locators of the remote communication end. Such state information may not be a significant issue for individual user hosts, but can lead to larger resource demands on large application servers that handle hundreds of thousands of simultaneous TCP connections.


Yet another major issue with the SHIM6 solution is the need for renumbering when a site changes providers. Although a multihomed site is assigned multiple address blocks, none of them can be treated as a persistent identifier for the site. When the site changes one of its providers, it must purge the address block of that provider from the entire site. The current practice of using the IP address as both an identifier and a locator has been strengthened by the use of IP addresses in access control lists present in various types of policy-enforcement devices (e.g., firewalls). If SHIM6's ULIDs are to be used for policy enforcement, a change of providers may necessitate the re-configuration of many such devices.


6.3. GSE/Indirection Solutions: Costs and Benefits

The use of indirection for scalable multihoming was discussed at the workshop, including the GSE [GSE] and indirection approaches, such as Map-and-Encap [RFC1955], in general. The GSE proposal changes the IPv6 address structure to bear the semantics of both an identifier and a locator. The first n bytes of the 16-byte IPv6 address are called the Routing Goop (RG), and are used by the routing system exclusively as a locator. The last 8 bytes of the IPv6 address specify an interface on an end-system. The middle (16 - n - 8) bytes are used to identify site local topology. The border routers of a site re-write the source RG of each outgoing packet to make the source address part of the source provider's address aggregation; they also re-write the destination RG of each incoming packet to hide the site's RG from all the internal routers and hosts. Although GSE designates the lower 8 bytes of the IPv6 address as identifiers, the extent to which GSE could be made compatible with increasingly-popular cryptographically-generated addresses (CGA) remains to be determined [dGSE].
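The address layout and the border router rewriting step can be sketched as follows. The choice of n = 6 bytes for the Routing Goop, the example addresses, and the function names are assumptions made for illustration; the text above leaves n unspecified.

```python
import ipaddress

def split_gse(address, rg_bytes):
    """Split an IPv6 address into (RG, site topology, interface identifier)."""
    if not 0 <= rg_bytes <= 8:
        raise ValueError("the RG must fit in the upper 8 bytes")
    raw = ipaddress.IPv6Address(address).packed  # 16 bytes
    return raw[:rg_bytes], raw[rg_bytes:8], raw[8:]

def rewrite_rg(address, new_rg):
    """Replace the leading RG bytes, leaving the remaining bytes untouched."""
    raw = ipaddress.IPv6Address(address).packed
    return ipaddress.IPv6Address(new_rg + raw[len(new_rg):])

# Hypothetical host address with an assumed 6-byte RG.
host = "2001:db8:1:2:aaaa:bbbb:cccc:dddd"
rg, site, interface_id = split_gse(host, 6)

# A border router rewrites the source RG to the exit provider's RG.
rewritten = rewrite_rg(host, bytes.fromhex("20010db8ffff"))
```

Only the leading RG bytes change in the rewritten address; the site topology bytes and the 8-byte interface identifier are preserved, which is why upper layers would be restricted to the lower 8 bytes.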


All identifier/locator split proposals require a mapping service that can return a set of locators corresponding to a given identifier. In addition, these proposals must also address the problem of detecting locator failures and redirecting data flows to remaining locators for a multihomed site. The Map-and-Encap proposal did not address these issues. GSE proposed to use DNS for providing the mapping service, but it did not offer an effective means for locator failure recovery. GSE also requires host stack modifications, as the upper layers and applications are only allowed to use the lower 8 bytes of the IPv6 address rather than the entire address.


6.4. Future for Indirection

As the saying goes, "There is no problem in computer science that cannot be solved by an extra level of indirection". The GSE proposal can be considered a specific instantiation of a class of indirection-based solutions to scalable multihoming. Map-and-Encap [RFC1955] represents a more general form of this indirection solution, which uses tunneling, instead of locator rewriting, to cross the DFZ and support provider-based prefix aggregation. This class of solutions avoids the provider and customer conflicts regarding PA and PI prefixes by putting each in a separate name space, so that ISPs can use topologically aggregatable addresses while customers can have their globally unique and provider-independent identifiers. Thus, it supports scalable multihoming, and requires no changes to the end systems when the encapsulation is performed by the border routers of a site. It also requires no changes to the current practice of both applications as well as backbone operations.
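The tunneling variant described above can be sketched as a lookup followed by encapsulation at a site border router. The mapping table, the addresses, and all names below are hypothetical; how the mapping is distributed and how locator failures are handled are not modeled.

```python
import ipaddress
from dataclasses import dataclass

@dataclass
class Packet:
    src: str
    dst: str
    payload: object = None

def lookup_locators(mapping, identifier):
    """Return the locators registered for the prefix covering identifier."""
    addr = ipaddress.ip_address(identifier)
    for prefix, locators in mapping.items():
        if addr in ipaddress.ip_network(prefix):
            return locators
    return []

def encapsulate(packet, own_locator, mapping):
    """Wrap a packet so that only locators are visible while crossing the DFZ."""
    locators = lookup_locators(mapping, packet.dst)
    if not locators:
        raise LookupError("no locator known for " + packet.dst)
    return Packet(src=own_locator, dst=locators[0], payload=packet)

def decapsulate(outer):
    """Recover the original packet at the destination border router."""
    return outer.payload

# Hypothetical mapping: a provider-independent identifier prefix is reachable
# through two provider-aggregatable locators (one per provider).
mapping = {"203.0.113.0/24": ["192.0.2.1", "192.0.2.129"]}
inner = Packet(src="198.51.100.10", dst="203.0.113.20", payload="data")
outer = encapsulate(inner, "192.0.2.65", mapping)
```

The end systems only ever see the inner packet, which is why no host changes are needed when the border routers perform the encapsulation.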

However, all gains of an effective solution are accompanied with certain associated costs. As stated earlier in this section, a mapping service must be provided. This mapping service not only brings with it the associated complexity and cost, but it also adds another point of failure and could also be a potential target for malicious attacks. Any solution to routing scalability is necessarily a cost/benefit tradeoff. Given the high potential of its gains, this indirection approach deserves special attention in our search for scalable routing solutions.


7. Problem Statements

The fundamental goal of this workshop was to develop a prioritized problem statement regarding routing and addressing problems facing us today, and the workshop spent a considerable amount of time on reaching that goal. This section provides a description of the prioritized problem statement, together with elaborations on both the rationale and open issues.


The workshop participants noted that there exist different classes of stakeholders in the Internet community who view today's global routing system from different angles, and assign different priorities to different aspects of the problem set. The prioritized problem statement in this section is the consensus of the participants in this workshop, representing primarily large network operators and a few router vendors. It is likely that a different group of participants would produce a different list, or the same list with different priorities. For example, freedom to change providers without renumbering might make the top of the priority list assembled by a workshop of end users and enterprise network operators.


7.1. Problem #1: Routing Scalability

The workshop participants believe that routing scalability is the most important problem facing the Internet today and must be solved, although the time frame in which these problems need solutions was not directly specified. The routing scalability problem includes the size of the DFZ RIB and FIB, the implications of the growth of the RIB and FIB on routing convergence times, and the cost, power (and hence, heat dissipation) and ASIC real estate requirements of core router hardware.

It is commonly believed that the IPv4 RIB growth has been constrained by the limited IPv4 address space. However, even under this constraint, the DFZ IPv4 RIB has been growing at what appears to be an accelerating rate [DFZ]. Given that the IPv6 routing architecture is the same as the IPv4 architecture (with substantially larger address space), if/when IPv6 becomes widely deployed, it is natural to predict that routing table growth for IPv6 will only exacerbate the situation.


The increasing deployment of Virtual Private Network/Virtual Routing and Forwarding (VPN/VRF) is considered another major factor driving the routing system growth. However, there are different views regarding whether this factor has, or does not have, a direct impact on the DFZ RIB. A common practice is to delegate specific routers to handle VPN connections, thus backbone routers do not necessarily hold state for individual VPNs. Nevertheless, VPNs do represent scalability challenges in network operations.


7.2. Problem #2: The Overloading of IP Address Semantics

As we have reported in Section 3, multihoming and traffic engineering appear to be the major factors driving the growth of the DFZ RIB. Below, we elaborate on their impact on the DFZ RIB.


7.2.1. Definition of Locator and Identifier

Roughly speaking, the Internet comprises a large number of transit networks and a much larger number of customer networks containing hosts that are attached to the backbone. Viewing the Internet as a graph, transit networks have branches and customer networks with hosts hang at the edges as leaves.


As their names suggest, locators identify locations in the topology, and a network's or host's locator should be topologically constrained by its present position. Identifiers, in principle, should be network-topology independent. That is, even though a network or host may need to change its locator when it is moved to a different set of attachment points in the Internet, its identifier should remain constant.


From an ISP's viewpoint, identifiers identify customer networks and customer hosts. Note that the word "identifier" used here is defined in the context of the Internet routing system; the definition may well be different when the word "identifier" is used in other contexts. As an example, a non-routable, provider-independent IP prefix for an enterprise network could serve as an identifier for that enterprise. This block of IP addresses can be used to route packets inside the enterprise network. However, they are independent from the DFZ topology, which is why they are not globally routable on the Internet.


Note that in cases such as the last example, the definition of locators and identifiers can be context-dependent. Following the example further, a PI address may be routable in an enterprise but not the global network. If allowed to be visible in the global network, such addresses might act as identifiers from a backbone operator's point of view but locators from an enterprise operator's point of view.


7.2.2. Consequence of Locator and Identifier Overloading

In today's Internet architecture, IP addresses have been used as both locators and identifiers. Combined with the use of CIDR to perform route aggregation, a problem arises for either providers or customers (or both).


Consider, for example, a campus network C that received prefix x.y.z/24 from provider P1. When C multihomes with a second provider P2, both P1 and P2 must announce x.y.z/24 so that C can be reached through both providers. In this example, the prefix x.y.z/24 serves both as an identifier for C, as well as a (non-aggregatable) locator for C's two attachment points to the transit system.


As far as the DFZ RIB is concerned, the above example shows that customer multihoming blurs the distinction between PA and PI prefixes. Although C received a PA prefix x.y.z/24 from P1, C's multihoming forced this prefix to be announced globally (equivalent to a PI prefix), and forced the prefix's original owner, provider P1, to de-aggregate. As a result, today's multihoming practice leads to a growth of the routing table size in proportion to the number of multihomed customers. The only practical way to scale a routing system today is topological aggregation, which gets destroyed by customer multihoming.
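The proportional growth described above can be illustrated by counting announcements. The provider aggregate, the customer prefixes, and the function name are hypothetical; the sketch assumes that every multihomed customer prefix must be announced globally in addition to the provider's aggregate.

```python
import ipaddress

def dfz_announcements(provider_aggregate, multihomed_customers):
    """List the prefixes visible in the DFZ for one provider's address block."""
    aggregate = ipaddress.ip_network(provider_aggregate)
    announced = [aggregate]  # single-homed customers hide behind the aggregate
    for prefix in multihomed_customers:
        net = ipaddress.ip_network(prefix)
        if not net.subnet_of(aggregate):
            raise ValueError(prefix + " is not part of the provider aggregate")
        announced.append(net)  # one extra global entry per multihomed customer
    return announced

# Hypothetical provider P1 with a /16 aggregate and /24 customer assignments.
no_multihoming = dfz_announcements("10.0.0.0/16", [])
three_multihomed = dfz_announcements(
    "10.0.0.0/16", ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"])
```

With no multihomed customers, one entry covers the whole block; each multihomed customer adds one entry, regardless of how many single-homed customers share the aggregate.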

Although multihoming may blur the PA/PI distinction, there exists a big difference between PA and PI prefixes when a customer changes its provider(s). If the customer has used a PA prefix from a former provider P1, the prefix is supposed to be returned to P1 upon completion of the change. The customer is supposed to get a new prefix from its new provider, i.e., to renumber its network. It is necessary for providers to reclaim their PA prefixes from former customers in order to keep the topological aggregatability of their prefixes. On the other hand, renumbering is considered very painful, if not impossible, by many Internet users, especially large enterprise customers. It is not uncommon for IP addresses in such enterprises to penetrate deeply into various parts of the networking infrastructure, ranging from applications to network management (e.g., policy databases, firewall configurations, etc.). This shows how fragile the system becomes due to the overloading of IP addresses as both locators and identifiers; significant enterprise operations could be disrupted due to the otherwise simple operation of switching IP address prefix assignment.


7.2.3. Traffic Engineering and IP Address Semantics Overload

In today's practice, traffic engineering (TE) is achieved by de-aggregating IP prefixes. One can effectively adjust the traffic volume along specific routing paths by adjusting the prefix lengths and the number of prefixes announced through those paths. Thus, the very means of TE practice directly conflicts with constraining the routing table growth.
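The mechanics of TE by de-aggregation can be sketched as follows. This is an illustrative sketch under stated assumptions, not an operational recipe: the prefix 10.1.0.0/16, the link names, and the helper split_announcements are all hypothetical. It only shows how splitting one prefix into more-specifics, and announcing each through a different path, multiplies the number of routes.

```python
import ipaddress


def split_announcements(prefix, links, extra_bits=1):
    """Split prefix into 2**extra_bits more-specifics, spread over links.

    Returns a dict mapping each link name to the list of more-specific
    prefixes that would be announced through that link (round-robin).
    """
    network = ipaddress.ip_network(prefix)
    more_specifics = list(network.subnets(prefixlen_diff=extra_bits))
    plan = {link: [] for link in links}
    for index, subnet in enumerate(more_specifics):
        plan[links[index % len(links)]].append(subnet)
    return plan


# One covering prefix becomes two announcements, one per link.
plan = split_announcements("10.1.0.0/16", ["link-east", "link-west"])
for link, prefixes in plan.items():
    print(link, [str(p) for p in prefixes])

# Finer control (extra_bits=3) costs eight routes instead of one.
fine_plan = split_announcements("10.1.0.0/16", ["link-east", "link-west"], 3)
print(sum(len(p) for p in fine_plan.values()))
```

The finer the control the origin wants over its traffic, the more prefixes it announces, which is why this practice works against constraining routing table growth.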


On the surface, prefix de-aggregation induced by traffic engineering seems orthogonal to the locator-identifier overloading problem. However, this may not necessarily be true. Had all the IP prefixes been topologically aggregatable to start with, re-aggregation would be possible, or at least easier, once the finer-granularity prefix announcements propagate further away from their origins.


7.3. Additional Issues
7.3.1. Routing Convergence

There are two kinds of routing convergence issues, eBGP (global routing) convergence and IGP (enterprise or provider) routing convergence. Upon isolated topological events, eBGP convergence does not suffer from extensive path explorations in most cases [PathExp], and convergence delay is largely determined by the minimum route advertisement interval (MRAI) timer [RFC4098], except those cases when a route is withdrawn. Route withdrawals tend to suffer from path explorations and hence slow convergence; one participant's experience suggests that the withdrawal delays often last up to a couple of minutes. One may argue that, if the destination becomes unreachable, a long convergence delay would not bring further damage to applications. However, there are often cases where a more specific route (a longer prefix) has failed, yet the destination can still be reached through an aggregated route (a shorter prefix). In these cases, the long convergence delay does impact application performance.
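The case of a failed more-specific route that is still covered by an aggregate can be sketched with a toy longest-prefix-match lookup. The prefixes, the next-hop labels, and the lookup helper are assumptions for illustration only; real routers use far more efficient data structures.

```python
import ipaddress


def lookup(rib, address):
    """Return (prefix, next_hop) for the longest matching prefix, or None."""
    addr = ipaddress.ip_address(address)
    matches = [net for net in rib if addr in net]
    if not matches:
        return None
    best = max(matches, key=lambda net: net.prefixlen)
    return best, rib[best]


# Hypothetical RIB: an aggregate plus a more-specific for the same site.
aggregate = ipaddress.ip_network("10.1.0.0/16")
more_specific = ipaddress.ip_network("10.1.5.0/24")
rib = {
    aggregate: "via aggregate path",
    more_specific: "via more-specific path",
}

# While the withdrawal is still converging, the stale more-specific wins,
# so traffic keeps following the failed path.
before = lookup(rib, "10.1.5.7")

# Only once the withdrawal has been processed does the lookup fall back
# to the aggregate, through which the destination was reachable all along.
del rib[more_specific]
after = lookup(rib, "10.1.5.7")

print(before[1], "->", after[1])
```

The sketch suggests why withdrawal delay matters here: the destination stays reachable in principle, but packets are misdirected for as long as the more-specific entry lingers.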


While IGPs are designed to and do converge more quickly than BGP might, the workshop participants were concerned that, in addition to the various special purpose routes that IGPs must carry, the rapid growth of the DFZ RIB size can effectively slow down IGP convergence. The IGP convergence delay can be due to multiple factors, including


1. Delays in detecting physical failures,


2. The delay in loading updated information into the FIB, and


3. The large size of the internal RIB, often twice as big as the DFZ RIB, which can lead to both longer route computation time and longer FIB loading time.


The workshop participants hold different views regarding (1) the severity of the routing convergence problem; and (2) whether it is an architectural problem, or an implementation issue. However, people generally agree that if we solve the routing scalability problem, that will certainly help reduce the convergence delay or make the problem a much easier one to handle because of the reduced number of routes to process.


7.3.2. Misaligned Costs and Benefits

Today's rapid growth of the DFZ RIB is driven by a few major factors, including multihoming and traffic engineering, in addition to the organic growth of the Internet's user base. There is a powerful incentive to deploy each of the above features, as they bring direct benefits to the parties who make use of them. However, the beneficiaries may not bear the direct costs of the resulting routing table size increase, and there is no measurable or enforceable constraint to limit such increase.


For example, suppose that a service provider has two bandwidth-constrained transoceanic links and wants to split its prefix announcements in order to fully load each link. The origin AS benefits from performing the de-aggregation. However, if the de-aggregated announcements propagate globally, the cost is borne by all other ASs. That is, the costs of this type of TE practice are not confined to the beneficiaries. Multihoming provides a similar example (in this case, the multihomed site achieves a benefit, but the global Internet incurs the cost of carrying the additional prefix(es)).


The misalignment of cost and benefit in the current routing system has been a driver for acceleration of the routing system size growth.


7.3.3. Other Concerns

Mobility was among the most frequently mentioned issues at the workshop. It is expected that billions of mobile gadgets may be connected to the Internet in the near future. There was also a discussion on network mobility as deployed in the Connexion service provided by Boeing over the last few years. However, at this time it seems unclear (1) whether the Boeing-like network mobility support would cause a scaling issue in the routing system, and (2) exactly what would be the impact of billions of mobile hosts on the global routing system. These discussions were covered in Section 5 of this report.


Routing security is another issue that was brought up a number of times during the workshop. The consensus from the workshop participants was that, however important routing security may be, it was out of scope for this workshop, whose main goal was to produce a problem statement about addressing and routing scalability. It was duly considered that security must be one of the top design goals when we get to a solution development stage. It was also noted that, if we continue to allow the routing table to grow indefinitely, then it may be impossible to add security enhancements in the future.


7.4. Problem Recognition

The first step in solving a problem is recognizing its existence as well as its importance. However, recognizing the severity of the routing scaling issue can be a challenge by itself, because there does not exist a specific hard limit on routing system scalability that can be easily demonstrated, nor is there any specific answer to the question of how much time we may have in developing a solution. Nevertheless, a general consensus among the workshop participants is that we seem to be running out of time. The current RIB scaling leads both to accelerated hardware cost increases, as explained in Section 4, and to pressure for shorter depreciation cycles, which in turn also translates into cost increases.


8. Criteria for Solution Development

Any common problem statement may admit multiple different solutions. This section provides a set of considerations, as identified from the workshop discussion, over the solution space. Given the heterogeneity among customers and providers of the global Internet, and the elasticity of the problem, none of these considerations should inherently preclude any specific solution. Consequently, although the following considerations were initially deemed as constraints on solutions, we have instead opted to adopt the term 'criteria' to be used in guiding solution evaluations.


8.1. Criteria on Scalability

Clearly, any proposed solution must solve the problem at hand, and our number one problem concerns the scalability of the Internet's routing and addressing system(s) as outlined in previous sections. Under the assumption of continued growth of the Internet user population, continued increases of multihoming and RFC 2547 VPN [RFC2547] deployment, the solution must enable the routing system to scale gracefully, as measured by the number of


o DFZ Internet routes, and


o Internal routes.


In addition, scalable support for traffic engineering (TE) must be considered as a business necessity, not an option. Capacity planning involves placing circuits based on traffic demand over a relatively long time scale, while TE must work more immediately to match the traffic load to the existing capacity and to match the routing policy requirements.


It was recognized that different parties in the Internet may have different specific TE requirements. For example,


o End site TE: based on locally determined performance or cost policies, end sites may wish to control the traffic volume exiting to, or entering from specific providers.


o Small ISP to transit ISP TE: operators may face tight resource constraints and wish to influence the volume of entering traffic from both customers and providers along specific routing paths to best utilize the limited resources.


o Large ISP TE: given the densely connected nature of the Internet topology, a given destination normally can be reached through different routing paths. An operator may wish to be able to adjust the traffic volume sent to each of its peers based on business relations with its neighbor ASs.


At this time, it remains an open issue whether a scalable TE solution would be necessarily inside the routing protocol, or can be accomplished through means that are external to the routing system.


8.2. Criteria on Incentives and Economics

The workshop attendees concluded that one important reason for uncontrolled routing growth was the misalignment of incentives. New entries are added to the routing system to provide benefit to specific parties, while the cost is borne by everyone in the global routing system. The consensus of the workshop was that any proposed solutions should strive to provide incentives to reward practices that reduce the overall system cost, and punish the "bad" behavior that imposes undue burden on the global system.


Given the global scale and distributed nature of the Internet, there can no longer (ever) be a flag day on the Internet. To bootstrap the deployment of new solutions, the solutions should provide incentives to first movers. That is, even when a single party starts to deploy the new solution, there should be measurable benefits to balance the costs.


Independent of what kind of solutions the IETF develops, if any, it is unlikely that the resulting routing system would stay constant in size. Instead, the workshop participants believed the routing system will continue to grow, and that ISPs will continue to go through system and hardware upgrade cycles. Many attendees expressed a desire that the scaling properties of the system can allow the hardware to keep up with the Internet growth at a rate that is comparable to the current costs, for example, allowing one to keep a 5-year hardware depreciation cycle, as opposed to a situation where scaling leads to accelerated cost increases.


8.3. Criteria on Timing

Although there does not exist a specific hard deadline, the unanimous consensus among the workshop participants is that the solution development must start now. If one assumes that the solution specification can get ready within a 1 - 2 year time frame, that will be followed by another 2-year certification cycle. As a result, even in the best case scenario, we are facing a 3 - 5 year time frame in getting the solutions deployed.


8.4. Consideration on Existing Systems

The routing scalability problem is a shared one between IPv4 and IPv6, as IPv6 simply inherited IPv4's CIDR-style "Provider-based Addressing". The proposed solutions should, and are also expected to, solve the problem for both IPv4 and IPv6.


Backwards compatibility with the existing IPv4 and IPv6 protocol stack is a necessity. Although a wide deployment of IPv6 is yet to happen, there has been substantial investment into IPv6 implementation and deployment by various parties. IPv6 is considered a legacy with shipped code. Thus, a highly desired feature of any proposed solution is to avoid imposing backwards-incompatible changes on end hosts (either IPv4 or IPv6).


In the routing system itself, the solutions must allow incremental changes from the current operational Internet. The solutions should be backward compatible with the routing protocols in use today, including BGP, OSPF, IS-IS, and others, possibly with incremental enhancements.


The above backward-compatibility considerations should not constrain the exploration of the solution space. We need to first find the right solutions, and look into their backward-compatibility issues after that. This approach enables us to gain a full understanding of the tradeoffs, and of what potential gains, if any, we may achieve by relaxing the backward-compatibility concerns.


As a rule of thumb for successful deployment, for any new design, its chance of success is higher if it makes fewer changes to the existing system.


8.5. Consideration on Security

Security should be considered from day one of solution development. If nothing else, the solutions must not make securing the routing system any worse than the situation today. It is highly desirable to have a solution that makes it more difficult to inject false routing information, and makes it easier to filter out DoS traffic.


However, securing the routing system is not considered a requirement for the solution development. Security is important; having a working system in the first place is even more important.


8.6. Other Criteria

A number of other criteria were also raised that fall into various different categories. They are summarized below.


o Site renumbering forced by the routing system should be avoided.


o Site reconfiguration driven by the routing system should be minimized.


o The solutions should not force ISPs to reveal internal topology.


o Routing convergence delay must be under control.


o End-to-end data delivery paths should be stable enough for good Voice over IP (VoIP) performance.


8.7. Understanding the Tradeoff

As the old saying goes, every coin has two sides. If we let the routing table continue to grow at its present rate, rapid hardware and software upgrade and replacement cycles for deployed core routing equipment may become cost prohibitive. In the worst case, the routing table growth may exceed our ability to engineer the global routing system in a cost-effective way. On the other hand, solutions for stopping or substantially slowing down the growth in the Internet routing table will necessarily bring their own costs, perhaps showing up elsewhere and in different forms. Examples of such tradeoffs are presented in Section 6, where we examined the gains and costs of a few different approaches to scalable multihoming support (SHIM6, GSE, and a general tunneling approach). A major task in the solution development is to understand who may have to give up what, and whether that makes a worthy tradeoff.


Before ending this discussion on the solution criteria, it is worth mentioning the shortest presentation at the workshop, which was made by Tony Li (the presentation slides can be found in Appendix D). He asked a fundamental question: what is at stake? It is the Internet itself. If the routing system does not scale with the continued growth of the Internet, eventually the costs might spiral out of control, the digital divide widen, and the Internet growth slow down, stop, or retreat. Compared to this problem, he considered that none of the criteria mentioned so far (except solving the problem) was important enough to block the development and deployment of an effective solution.


9. Workshop Recommendations

The workshop attendees would like to make the following recommendations:


First of all, the workshop participants would like to reiterate the importance of solving the routing scalability problem. They noted that the concern over the scalability and flexibility of the routing and addressing system has been with us for a very long time, and the current growth rate of the DFZ RIB is exceeding our ability to engineer the routing infrastructure in an economically feasible way. We need to start developing a long-term solution that can last for the foreseeable future.


Second, because the participants of this workshop consisted of mostly large service providers and major router vendors, the workshop participants recommend that IAB/IESG organize additional workshops or use other venues of communication to reach out to other stakeholders, such as content providers, retail providers, and enterprise operators, both to communicate to them the outcome of this workshop, and to solicit the routing/addressing problems they are facing today, and their requirements on the solution development.


Third, the workshop participants recommend conducting the solution development in an open, transparent way, with broad-ranging participation from the larger networking community. A majority of the participants indicated their willingness to commit resources toward developing a solution. We must also invite participation from the research community in this process. The locator-identifier split represents a fundamental architectural issue, and the IAB should lead the investigation into understanding both how to make this architectural change and the overall impact of the change.


Fourth, given the goal of developing a long-term solution, and the fact that development and deployment cycles will necessarily take some time, it may be helpful (or even necessary) to buy some time through engineering feasible short- or intermediate-term solutions (e.g., FIB compression).
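As one illustration of what FIB compression might look like, the sketch below merges sibling prefixes that share a next hop into their parent prefix. This is one simple technique chosen for illustration, not a method described at the workshop; the function names, prefixes, and next-hop labels are hypothetical. Merging two siblings is safe for forwarding because together they cover the whole parent, so any existing parent entry was already fully shadowed.

```python
import ipaddress


def sibling_of(net):
    """Return the other half of net's parent prefix."""
    parent = net.supernet()
    return next(half for half in parent.subnets() if half != net)


def compress_fib(fib):
    """Repeatedly merge sibling prefixes that share a next hop.

    fib maps ipaddress network objects to next-hop labels. A new dict
    is returned; the input is left unchanged.
    """
    fib = dict(fib)
    changed = True
    while changed:
        changed = False
        # Visit the longest prefixes first so merges can cascade upward.
        for net in sorted(fib, key=lambda n: n.prefixlen, reverse=True):
            if net not in fib or net.prefixlen == 0:
                continue  # already merged away, or nothing above it
            sib = sibling_of(net)
            if fib.get(sib) == fib[net]:
                next_hop = fib.pop(net)
                del fib[sib]
                fib[net.supernet()] = next_hop
                changed = True
    return fib


# Hypothetical FIB: four contiguous /24s via hop A, one /24 via hop B.
fib = {ipaddress.ip_network(f"10.1.{i}.0/24"): "A" for i in range(4)}
fib[ipaddress.ip_network("10.1.4.0/24")] = "B"

compressed = compress_fib(fib)
for net, hop in sorted(compressed.items()):
    print(net, hop)
```

In this example, five entries shrink to two while every address keeps its next hop. The RIB itself is unchanged, which is why such measures can at best buy time rather than solve the underlying scaling problem.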


Fifth, the workshop participants believe the next step is to develop a roadmap from here to the solution deployment. The IAB and IESG are expected to take on the leadership role in this roadmap development, and to leverage the momentum from this successful workshop to move forward quickly. The roadmap should provide clearly defined short-, medium-, and long-term objectives to guide the solution development process, so that the community as a whole can proceed in an orchestrated way, seeing exactly where we are going when engineering necessary short-term fixes.


Finally, the workshop participants also made a number of suggestions that the IETF might consider when examining the solution space. These suggestions are captured in Appendix A.


10. Security Considerations

While the security of the routing system is of great concern, this document introduces no new protocol or protocol usage and as such presents no new security issues.


11. Acknowledgments

Jari Arkko, Vince Fuller, Darrel Lewis, Tony Li, Eric Rescorla, and Ted Seely made many insightful comments on earlier versions of this document. Finally, many thanks to Wouter Wijngaards for the fine notes he took during the workshop.


12. Informative References
12. 资料性引用

[RFC1955] Hinden, R., "New Scheme for Internet Routing and Addressing (ENCAPS) for IPNG", RFC 1955, June 1996.


[RFC2547] Rosen, E. and Y. Rekhter, "BGP/MPLS VPNs", RFC 2547, March 1999.


[RFC3775] Johnson, D., Perkins, C., and J. Arkko, "Mobility Support in IPv6", RFC 3775, June 2004.


[RFC4098] Berkowitz, H., Davies, E., Hares, S., Krishnaswamy, P., and M. Lepp, "Terminology for Benchmarking BGP Device Convergence in the Control Plane", RFC 4098, June 2005.


[RFC4116] Abley, J., Lindqvist, K., Davies, E., Black, B., and V. Gill, "IPv4 Multihoming Practices and Limitations", RFC 4116, July 2005.


[RFC4192] Baker, F., Lear, E., and R. Droms, "Procedures for Renumbering an IPv6 Network without a Flag Day", RFC 4192, September 2005.


[RFC4632] Fuller, V. and T. Li, "Classless Inter-domain Routing (CIDR): The Internet Address Assignment and Aggregation Plan", BCP 122, RFC 4632, August 2006.


[IDR-REQS] Doria, A. and E. Davies, "Analysis of IDR requirements and History", Work in Progress, February 2007.


[ARIN] "American Registry for Internet Numbers".


[PIPA] Karrenberg, D., "IPv4 Address Allocation and Assignment Policies for the RIPE NCC Service Region", RIPE-387, 2006.


[SHIM6] "Site Multihoming by IPv6 Intermediation (shim6)".


[EID] Chiappa, J., "Endpoints and Endpoint Names: A Proposed Enhancement to the Internet Architecture", 1999.


[GSE] O'Dell, M., "GSE - An Alternate Addressing Architecture for IPv6", Work in Progress, 1997.


[dGSE] Zhang, L., "An Overview of Multihoming and Open Issues in GSE", IETF Journal, ietfjournal/?p=98#more-98, 2006.


[PathExp] Oliveira, R., et al., "Quantifying Path Exploration in the Internet", Internet Measurement Conference (IMC) 2006, imc175f-oliveira.pdf.


[DynPrefix] Oliveira, R., et al., "Measurement of Highly Active Prefixes in BGP", IEEE GLOBECOM 2005.


[BHB06] Boothe, P., Hielbert, J., and R. Bush, "Short-Lived Prefix Hijacking on the Internet", NANOG 36, 2006.


[ROFL] Caesar, M., et al., "ROFL: Routing on Flat Labels", SIGCOMM 2006, discussion/showpaper.php?paper_id=34, 2006.


[CNIR] Abraham, I., et al., "Compact Name-Independent Routing with Minimum Stretch", ACM Symposium on Parallel Algorithms and Architectures, 2004.


[BGT04] Bu, T., Gao, L., and D. Towsley, "On Characterizing BGP Routing Table Growth", J. Computer and Telecomm Networking V45N1, 2004.


[Fuller] Fuller, V., "Scaling issues with ipv6 routing+ multihoming", routingandaddressing/vaf-iab-raws.pdf, 2006.


[H03] Huston, G., "Analyzing the Internet's BGP Routing Table", 2001-v4-n1-bgp/bgp.pdf, 2003.


[BGP2005] Huston, G., "2005 -- A BGP Year in Review", http:// routing-pres-huston-routing-update.pdf.


[DFZ] Huston, G., "Growth of the BGP Table - 1994 to Present", 2006.


[GIH] Huston, G., "Wither Routing?", 2006.


[ATNAC2006] Huston, G. and G. Armitage, "Projecting Future IPv4 Router Requirements from Trends in Dynamic BGP Behaviour", atnac-2006/bgp-atnac2006.pdf, 2006.


[CIDRRPT] "The CIDR Report".


[ML] "Moore's Law", Wikipedia's_law, 2006.


[Molinero] Molinero-Fernandez, P., "Technology trends in routers and switches", PhD thesis, Stanford University http:// pmf_thesis_node5.html, 2005.


[DRAM] Landler, P., "DRAM Productivity and Capacity/Demand Model", Global Economic Workshop http:// 07_econ.pdf, 1999.


Appendix A. Suggestions for Specific Steps

At the end of the workshop there was a lively round-table discussion regarding specific steps that IETF may consider undertaking towards a quick solution development, as well as potential issues to avoid. Those steps included:


o Finding a home (mailing list) to continue the discussion started from the workshop with wider participation. [Editor's note: Done -- This action has been completed. The list is]


o Considering a special process to expedite solution development, avoiding the lengthy protocol standardization cycles. For example, IESG may charter special design teams for the solution investigation.


o If a working group is to be formed, care must be taken to ensure that the scope of the charter is narrow and specific enough to allow quick progress, and that the WG chair be forceful enough to keep the WG activity focused. There was also a discussion on which area this new WG should belong to; both routing area ADs and Internet area ADs are willing to host it.


o It is desirable that the solutions be developed in an open environment and free from any Intellectual Property Right claims.


Finally, given the perceived severity of the problem at hand, the workshop participants trust that IAB/IESG/IETF will take prompt actions. However, if that were not to happen, operators and vendors would be most likely to act on their own and get a solution deployed.


Appendix B. Workshop Participants

   Loa Anderson (IAB)
   Jari Arkko (IESG)
   Ron Bonica
   Ross Callon (IESG)
   Brian Carpenter (IAB)
   David Conrad (IANA)
   Leslie Daigle (IAB Chair)
   Elwyn Davies (IAB)
   Terry Davis
   Weisi Dong
   Aaron Falk (IRTF Chair)
   Kevin Fall (IAB)
   Dino Farinacci
   Vince Fuller
   Vijay Gill
   Russ Housley (IESG)
   Geoff Huston
   Daniel Karrenberg
   Dorian Kim
   Olaf Kolkman (IAB)
   Darrel Lewis
   Tony Li
   Kurtis Lindqvist (IAB)
   Peter Lothberg
   David Meyer (IAB)
   Christopher Morrow
   Dave Oran (IAB)
   Phil Roberts (IAB Executive Director)
   Jason Schiller
   Peter Schoenmaker
   Ted Seely
   Mark Townsley (IESG)
   Iljitsch van Beijnum
   Ruediger Volk
   Magnus Westerlund (IESG)
   Lixia Zhang (IAB)

Appendix C. Workshop Agenda

IAB Routing and Addressing Workshop Agenda October 18-19 Amsterdam, Netherlands


DAY 1: the proposed goal is to collect, as complete as possible, a set of scalability problems in the routing and addressing area facing the Internet today.


   0815-0900: Welcome, framing up for the 2 days
              Moderator: Leslie Daigle


   0900-1200: Morning session
              Moderator: Elwyn Davies
              Strawman topics for the morning session:
               - Scalability
               - Multihoming support
               - Traffic Engineering
               - Routing Table Size: Rate of growth, Dynamics
                 (this is not limited to DFZ, include iBGP)
               - Causes of the growth
               - Pains from the growth (perhaps "Impact on routers"
                 can come here?)
               - How big a problem is BGP slow convergence?

1015-1030: Coffee Break


1200-1300: Lunch


   1330-1730: Afternoon session: What are the top 3 routing problems
              in your network?
              Moderator: Kurt Erik Lindqvist

1500-1530: Coffee Break


   Dinner at Indrapura, sponsored by Cisco

   DAY 2: The proposed goal is to formulate a problem statement

0800-0830: Welcome


   0830-1000: Morning session: What's on the table
              Moderator: Elwyn Davies
               - shim6
               - GSE

1000-1030: Coffee Break


   1030-1200: Problem Statement session #1: document the problems
              Moderator: David Meyer

1200-1300: Lunch


   1300-1500: Problem Statement session # 2, cont;
              Moderator: Dino Farinacci
               - Constraints on solutions

1500-1530: Coffee Break


   1530-1730: Summary and Wrap-up
              Moderator: Leslie Daigle


Appendix D. Presentations

The presentations from the workshop can be found on


Authors' Addresses


David Meyer (editor)



Lixia Zhang (editor)



Kevin Fall (editor)



Full Copyright Statement


Copyright (C) The IETF Trust (2007).


This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights.




Intellectual Property


The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79.


Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at


The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at