Internet Engineering Task Force (IETF)                   A. Sajassi, Ed.
Request for Comments: 8365                                         Cisco
Category: Standards Track                                  J. Drake, Ed.
ISSN: 2070-1721                                                  Juniper
                                                                N. Bitar
                                                                   Nokia
                                                              R. Shekhar
                                                                 Juniper
                                                               J. Uttaro
                                                                    AT&T
                                                           W. Henderickx
                                                                   Nokia
                                                              March 2018
A Network Virtualization Overlay Solution Using Ethernet VPN (EVPN)
Abstract
This document specifies how Ethernet VPN (EVPN) can be used as a Network Virtualization Overlay (NVO) solution and explores the various tunnel encapsulation options over IP and their impact on the EVPN control plane and procedures. In particular, the following encapsulation options are analyzed: Virtual Extensible LAN (VXLAN), Network Virtualization using Generic Routing Encapsulation (NVGRE), and MPLS over GRE. This specification is also applicable to Generic Network Virtualization Encapsulation (GENEVE); however, some incremental work is required, which will be covered in a separate document. This document also specifies new multihoming procedures for split-horizon filtering and mass withdrawal. It also specifies EVPN route constructions for VXLAN/NVGRE encapsulations and Autonomous System Border Router (ASBR) procedures for multihoming of Network Virtualization Edge (NVE) devices.
Status of This Memo
This is an Internet Standards Track document.
This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Further information on Internet Standards is available in Section 2 of RFC 7841.
Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at https://www.rfc-editor.org/info/rfc8365.
Copyright Notice
Copyright (c) 2018 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
Table of Contents
1. Introduction
2. Requirements Notation and Conventions
3. Terminology
4. EVPN Features
5. Encapsulation Options for EVPN Overlays
   5.1. VXLAN/NVGRE Encapsulation
      5.1.1. Virtual Identifiers Scope
      5.1.2. Virtual Identifiers to EVI Mapping
      5.1.3. Constructing EVPN BGP Routes
   5.2. MPLS over GRE
6. EVPN with Multiple Data-Plane Encapsulations
7. Single-Homing NVEs - NVE Residing in Hypervisor
   7.1. Impact on EVPN BGP Routes & Attributes for VXLAN/NVGRE
   7.2. Impact on EVPN Procedures for VXLAN/NVGRE Encapsulations
8. Multihoming NVEs - NVE Residing in ToR Switch
   8.1. EVPN Multihoming Features
      8.1.1. Multihomed ES Auto-Discovery
      8.1.2. Fast Convergence and Mass Withdrawal
      8.1.3. Split-Horizon
      8.1.4. Aliasing and Backup Path
      8.1.5. DF Election
   8.2. Impact on EVPN BGP Routes and Attributes
   8.3. Impact on EVPN Procedures
      8.3.1. Split Horizon
      8.3.2. Aliasing and Backup Path
      8.3.3. Unknown Unicast Traffic Designation
9. Support for Multicast
10. Data-Center Interconnections (DCIs)
   10.1. DCI Using GWs
   10.2. DCI Using ASBRs
      10.2.1. ASBR Functionality with Single-Homing NVEs
      10.2.2. ASBR Functionality with Multihoming NVEs
11. Security Considerations
12. IANA Considerations
13. References
   13.1. Normative References
   13.2. Informative References
Acknowledgements
Contributors
Authors' Addresses
This document specifies how Ethernet VPN (EVPN) [RFC7432] can be used as a Network Virtualization Overlay (NVO) solution and explores the various tunnel encapsulation options over IP and their impact on the EVPN control plane and procedures. In particular, the following encapsulation options are analyzed: Virtual Extensible LAN (VXLAN) [RFC7348], Network Virtualization using Generic Routing Encapsulation (NVGRE) [RFC7637], and MPLS over Generic Routing Encapsulation (GRE) [RFC4023]. This specification is also applicable to Generic Network Virtualization Encapsulation (GENEVE) [GENEVE]; however, some incremental work is required, which will be covered in a separate document [EVPN-GENEVE]. This document also specifies new multihoming procedures for split-horizon filtering and mass withdrawal. It also specifies EVPN route constructions for VXLAN/NVGRE encapsulations and Autonomous System Border Router (ASBR) procedures for multihoming of Network Virtualization Edge (NVE) devices.
In the context of this document, an NVO is a solution to address the requirements of a multi-tenant data center, especially one with virtualized hosts, e.g., Virtual Machines (VMs) or virtual workloads. The key requirements of such a solution, as described in [RFC7364], are the following:
- Isolation of network traffic per tenant
- Support for a large number of tenants (tens or hundreds of thousands)
- Extension of Layer 2 (L2) connectivity among different VMs belonging to a given tenant segment (subnet) across different Points of Delivery (PoDs) within a data center or between different data centers
- Allowing a given VM to move between different physical points of attachment within a given L2 segment
The underlay network for NVO solutions is assumed to provide IP connectivity between NVO endpoints.
This document describes how EVPN can be used as an NVO solution and explores applicability of EVPN functions and procedures. In particular, it describes the various tunnel encapsulation options for EVPN over IP and their impact on the EVPN control plane as well as procedures for two main scenarios:
(a) single-homing NVEs - when an NVE resides in the hypervisor, and
(b) multihoming NVEs - when an NVE resides in a Top-of-Rack (ToR) device.
The possible encapsulation options for EVPN overlays that are analyzed in this document are:
- VXLAN and NVGRE
- MPLS over GRE
Before getting into the description of the different encapsulation options for EVPN over IP, it is important to highlight the EVPN solution's main features, how those features are currently supported, and any impact that the encapsulation has on those features.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.
Most of the terminology used in this document comes from [RFC7432] and [RFC7365].
VXLAN: Virtual Extensible LAN
GRE: Generic Routing Encapsulation
NVGRE: Network Virtualization using Generic Routing Encapsulation
GENEVE: Generic Network Virtualization Encapsulation
PoD: Point of Delivery
NV: Network Virtualization
NVO: Network Virtualization Overlay
NVE: Network Virtualization Edge
VNI: VXLAN Network Identifier
VSID: Virtual Subnet Identifier (for NVGRE)
I-SID: Service Instance Identifier
EVPN: Ethernet VPN
EVI: EVPN Instance. An EVPN instance spanning the Provider Edge (PE) devices participating in that EVPN
MAC-VRF: A Virtual Routing and Forwarding table for Media Access Control (MAC) addresses on a PE
IP-VRF: A Virtual Routing and Forwarding table for Internet Protocol (IP) addresses on a PE
ES: Ethernet Segment. When a customer site (device or network) is connected to one or more PEs via a set of Ethernet links, then that set of links is referred to as an 'Ethernet segment'.
Ethernet Segment Identifier (ESI): A unique non-zero identifier that identifies an Ethernet segment is called an 'Ethernet Segment Identifier'.
Ethernet Tag: An Ethernet tag identifies a particular broadcast domain, e.g., a VLAN. An EVPN instance consists of one or more broadcast domains.
PE: Provider Edge
Single-Active Redundancy Mode: When only a single PE, among all the PEs attached to an ES, is allowed to forward traffic to/from that ES for a given VLAN, then the Ethernet segment is defined to be operating in Single-Active redundancy mode.
All-Active Redundancy Mode: When all PEs attached to an Ethernet segment are allowed to forward known unicast traffic to/from that ES for a given VLAN, then the ES is defined to be operating in All-Active redundancy mode.
PIM-SM: Protocol Independent Multicast - Sparse-Mode
PIM-SSM: Protocol Independent Multicast - Source-Specific Multicast
BIDIR-PIM: Bidirectional PIM
EVPN [RFC7432] was originally designed to support the requirements detailed in [RFC7209] and therefore has the following attributes which directly address control-plane scaling and ease of deployment issues.
1. Control-plane information is distributed with BGP, and broadcast and multicast traffic is sent using a shared multicast tree or with ingress replication.
2. Control-plane learning is used for MAC (and IP) addresses instead of data-plane learning. The latter requires the flooding of unknown unicast and Address Resolution Protocol (ARP) frames; whereas, the former does not require any flooding.
3. Route Reflector (RR) is used to reduce a full mesh of BGP sessions among PE devices to a single BGP session between a PE and the RR. Furthermore, RR hierarchy can be leveraged to scale the number of BGP routes on the RR.
4. Auto-discovery via BGP is used to discover PE devices participating in a given VPN, PE devices participating in a given redundancy group, tunnel encapsulation types, multicast tunnel types, multicast members, etc.
5. All-Active multihoming is used. This allows a given Customer Edge (CE) device to have multiple links to multiple PEs, and traffic to/from that CE fully utilizes all of these links.
6. When a link between a CE and a PE fails, the PEs for that EVI are notified of the failure via the withdrawal of a single EVPN route. This allows those PEs to remove the withdrawing PE as a next hop for every MAC address associated with the failed link. This is termed "mass withdrawal".
7. BGP route filtering and constrained route distribution are leveraged to ensure that the control-plane traffic for a given EVI is only distributed to the PEs in that EVI.
8. When an IEEE 802.1Q [IEEE.802.1Q] interface is used between a CE and a PE, each of the VLAN IDs (VIDs) on that interface can be mapped onto a bridge table (for up to 4094 such bridge tables). All these bridge tables may be mapped onto a single MAC-VRF (in case of VLAN-aware bundle service).
9. VM Mobility mechanisms ensure that all PEs in a given EVI know the ES with which a given VM, as identified by its MAC and IP addresses, is currently associated.
10. RTs are used to allow the operator (or customer) to define a spectrum of logical network topologies including mesh, hub and spoke, and extranets (e.g., a VPN whose sites are owned by different enterprises), without the need for proprietary software or the aid of other virtual or physical devices.
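As a non-normative illustration of the mass withdrawal behavior in item 6 above, the following Python sketch models a remote PE's view of one EVI. The class name `RemoteMacTable` and its methods are hypothetical and are not defined by this document; the sketch only assumes that each remote MAC address is associated with an Ethernet segment and that the set of PEs reachable through that segment is tracked separately.

```python
from collections import defaultdict


class RemoteMacTable:
    """Hypothetical per-EVI view of remote MACs, grouped by Ethernet segment."""

    def __init__(self):
        self.mac_to_esi = {}                 # MAC address -> ESI
        self.esi_to_pes = defaultdict(set)   # ESI -> PEs attached to that ES

    def learn_mac(self, mac, esi):
        # Models receipt of a MAC/IP Advertisement route.
        self.mac_to_esi[mac] = esi

    def add_es_route(self, esi, pe):
        # Models receipt of a per-ES route from a PE attached to the ES.
        self.esi_to_pes[esi].add(pe)

    def withdraw_es_route(self, esi, pe):
        # One withdrawal removes the PE as a next hop for every MAC on the ES.
        self.esi_to_pes[esi].discard(pe)

    def next_hops(self, mac):
        return set(self.esi_to_pes[self.mac_to_esi[mac]])


table = RemoteMacTable()
table.add_es_route("ESI-1", "PE1")
table.add_es_route("ESI-1", "PE2")
for mac in ("00:00:5e:00:53:01", "00:00:5e:00:53:02", "00:00:5e:00:53:03"):
    table.learn_mac(mac, "ESI-1")

# PE1 loses its link to the CE and withdraws a single route.
table.withdraw_es_route("ESI-1", "PE1")
```

The point of the indirection is that the cost of reacting to a link failure does not grow with the number of MAC addresses behind the failed link.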
Because the design goal for NVO is millions of instances per common physical infrastructure, the scaling properties of the control plane for NVO are extremely important. EVPN and the extensions described herein are designed with this level of scalability in mind.
Both VXLAN and NVGRE are examples of technologies that provide a data-plane encapsulation that is used to transport a packet over the common physical IP infrastructure between Network Virtualization Edges (NVEs), e.g., VXLAN Tunnel End Points (VTEPs) in a VXLAN network. Both of these technologies include the identifier of the specific NVO instance (VNI in VXLAN and VSID in NVGRE) in each packet. In the remainder of this document, we use VNI as the representation for an NVO instance, with the understanding that VSID can equally be used if the encapsulation is NVGRE, unless stated otherwise.
Note that a PE is equivalent to an NVE/VTEP.
VXLAN encapsulation is based on UDP, with an 8-byte header following the UDP header. VXLAN provides a 24-bit VNI, which typically provides a one-to-one mapping to the tenant VID, as described in [RFC7348]. In this scenario, the ingress VTEP does not include an inner VLAN tag on the encapsulated frame, and the egress VTEP discards the frames with an inner VLAN tag. This mode of operation in [RFC7348] maps to VLAN-Based Service in [RFC7432], where a tenant VID gets mapped to an EVI.
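As a non-normative illustration, the 8-byte VXLAN header described above can be sketched in Python as follows. The field layout (8 flag bits, 24 reserved bits, the 24-bit VNI, and 8 reserved bits) follows [RFC7348]; the function names `pack_vxlan_header` and `unpack_vni` are hypothetical helpers introduced only for this example.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag of [RFC7348]: the VNI field is valid


def pack_vxlan_header(vni):
    """Build the 8-byte VXLAN header that follows the outer UDP header."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags (8 bits) followed by 24 reserved bits.
    # Word 2: VNI (24 bits) followed by 8 reserved bits.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def unpack_vni(header):
    """Extract the 24-bit VNI from an 8-byte VXLAN header."""
    flags_word, vni_word = struct.unpack("!II", header[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI-valid flag is not set")
    return vni_word >> 8


header = pack_vxlan_header(5000)
```

Because the VNI is 24 bits wide, it can represent far more segments than the 12-bit VLAN ID, which is what allows one VNI to stand in for one tenant VID in the VLAN-Based Service mapping.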
VXLAN also provides an option of including an inner VLAN tag in the encapsulated frame, if explicitly configured at the VTEP. This mode of operation can map to VLAN Bundle Service in [RFC7432] because all the tenant's tagged frames map to a single bridge table / MAC-VRF, and the inner VLAN tag is not used for lookup by the disposition PE when performing VXLAN decapsulation as described in Section 6 of [RFC7348].
NVGRE encapsulation [RFC7637] is based on GRE encapsulation, and it mandates the inclusion of the optional GRE Key field, which carries the VSID. There is a one-to-one mapping between the VSID and the tenant VID, as described in [RFC7637]. The inclusion of an inner VLAN tag is prohibited. This mode of operation in [RFC7637] maps to VLAN-Based Service in [RFC7432].
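As a non-normative illustration, the GRE header used by NVGRE can be sketched in Python as follows. The sketch assumes the layout given in [RFC7637]: the Key Present bit is set, the protocol type is 0x6558 (Transparent Ethernet Bridging), and the 32-bit Key field holds the 24-bit VSID followed by an 8-bit FlowID. The function name `pack_nvgre_header` is a hypothetical helper.

```python
import struct

GRE_KEY_PRESENT = 0x2000   # K bit in the GRE flags/version word
ETHERTYPE_TEB = 0x6558     # Transparent Ethernet Bridging


def pack_nvgre_header(vsid, flow_id=0):
    """Build the 8-byte GRE header carrying the VSID in the Key field."""
    if not 0 <= vsid < 2**24:
        raise ValueError("VSID must fit in 24 bits")
    if not 0 <= flow_id < 2**8:
        raise ValueError("FlowID must fit in 8 bits")
    key = (vsid << 8) | flow_id
    return struct.pack("!HHI", GRE_KEY_PRESENT, ETHERTYPE_TEB, key)


nvgre_header = pack_nvgre_header(5000, flow_id=7)
```

As with the VXLAN VNI, the VSID is a 24-bit value, which is why the remainder of this document can treat the two identifiers interchangeably.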
As described in the next section, there is no change to the encoding of EVPN routes to support VXLAN or NVGRE encapsulation, except for the use of the BGP Encapsulation Extended Community to indicate the encapsulation type (e.g., VXLAN or NVGRE). However, there is potential impact to the EVPN procedures depending on where the NVE is located (i.e., in hypervisor or ToR) and whether multihoming capabilities are required.
Although VNIs are defined as 24-bit globally unique values, there are scenarios in which it is desirable to use a locally significant value for the VNI, especially in the context of a data-center interconnect.
In the case where NVEs in different data centers need to be interconnected, and the NVEs need to use VNIs as globally unique identifiers within a data center, then a Gateway (GW) needs to be employed at the edge of the data-center network (DCN). This is because the Gateway will provide the functionality of translating the VNI when crossing network boundaries, which may align with operator span-of-control boundaries. As an example, consider the network of Figure 1. Assume there are three network operators: one for each of the DC1, DC2, and WAN networks. The Gateways at the edge of the data centers are responsible for translating the VNIs between the values used in each of the DCNs and the values used in the WAN.
                        +--------------+
                        |              |
        +---------+     |     WAN      |     +---------+
+----+  |        +---+  +----+    +----+  +---+        |  +----+
|NVE1|--|        |   |  |WAN |    |WAN |  |   |        |--|NVE3|
+----+  |IP      |GW |--|Edge|    |Edge|--|GW | IP     |  +----+
+----+  |Fabric  +---+  +----+    +----+  +---+ Fabric |  +----+
|NVE2|--|         |     |              |     |         |--|NVE4|
+----+  +---------+     +--------------+     +---------+  +----+

|<------ DC 1 ------>                       <------ DC2 ------>|
Figure 1: Data-Center Interconnect with Gateway
In the case where NVEs in different data centers need to be interconnected, and the NVEs need to use locally assigned VNIs (e.g., similar to MPLS labels), there may be no need to employ Gateways at the edge of the DCN. More specifically, the VNI value that is used by the transmitting NVE is allocated by the NVE that is receiving the traffic (in other words, this is similar to a "downstream-assigned" MPLS label). This allows the VNI space to be decoupled between different DCNs without the need for a dedicated Gateway at the edge of the data centers. This topic is covered in Section 10.2.
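As a non-normative illustration of locally assigned VNIs, the following Python sketch models the receiving NVE choosing the VNI value and the transmitting NVE simply using the value it was told. The class `EgressNve`, its methods, and the starting VNI values are hypothetical; the advertisement itself, which in practice is an EVPN route, is reduced to a return value.

```python
import itertools


class EgressNve:
    """Hypothetical NVE that hands out locally significant VNIs."""

    def __init__(self, name, first_vni):
        self.name = name
        self._free_vnis = itertools.count(first_vni)
        self.vni_to_bridge_domain = {}

    def advertise(self, bridge_domain):
        # The receiver picks the VNI, as with a downstream-assigned MPLS label.
        vni = next(self._free_vnis)
        self.vni_to_bridge_domain[vni] = bridge_domain
        return vni

    def receive(self, vni):
        # The VNI is only meaningful to the NVE that allocated it.
        return self.vni_to_bridge_domain[vni]


nve1 = EgressNve("NVE1", first_vni=100)    # in DC 1
nve3 = EgressNve("NVE3", first_vni=5000)   # in DC 2

# Each NVE advertises its own VNI for the same tenant segment.
vni_toward_nve1 = nve1.advertise("tenant-A")
vni_toward_nve3 = nve3.advertise("tenant-A")
```

The two data centers never have to agree on a VNI value for the segment, which is why no translating Gateway is needed in this model.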
                        +--------------+
                        |              |
        +---------+     |     WAN      |     +---------+
+----+  |         |   +----+        +----+   |         |  +----+
|NVE1|--|         |   |ASBR|        |ASBR|   |         |--|NVE3|
+----+  |IP Fabric|---|    |        |    |---|IP Fabric|  +----+
+----+  |         |   +----+        +----+   |         |  +----+
|NVE2|--|         |     |              |     |         |--|NVE4|
+----+  +---------+     +--------------+     +---------+  +----+

|<------ DC 1 ----->                          <---- DC2 ------>|
Figure 2: Data-Center Interconnect with ASBR
Just like in [RFC7432], where two options existed for mapping broadcast domains (represented by VLAN IDs) to an EVI, when the EVPN control plane is used in conjunction with VXLAN (or NVGRE encapsulation), there are also two options for mapping broadcast domains represented by VXLAN VNIs (or NVGRE VSIDs) to an EVI:
Option 1: A Single Broadcast Domain per EVI
In this option, a single Ethernet broadcast domain (e.g., subnet) represented by a VNI is mapped to a unique EVI. This corresponds to the VLAN-Based Service in [RFC7432], where a tenant-facing interface, logical interface (e.g., represented by a VID), or physical interface gets mapped to an EVI. As such, a BGP Route Distinguisher (RD) and Route Target (RT) are needed per VNI on every NVE. The advantage of this model is that it allows the BGP RT constraint mechanisms to be used in order to limit the propagation and import of routes to only the NVEs that are interested in a given VNI. The disadvantage of this model may be the provisioning overhead if the RD and RT are not derived automatically from the VNI.
In this option, the MAC-VRF table is identified by the RT in the control plane and by the VNI in the data plane. In this option, the specific MAC-VRF table corresponds to only a single bridge table.
Option 2: Multiple Broadcast Domains per EVI
In this option, multiple subnets, each represented by a unique VNI, are mapped to a single EVI. For example, if a tenant has multiple segments/subnets each represented by a VNI, then all the VNIs for that tenant are mapped to a single EVI; that is, the EVI in this case represents the tenant and not a subnet. This corresponds to the VLAN-aware bundle service in [RFC7432]. The advantage of this model is that it doesn't require the provisioning of an RD/RT per VNI. However, this is a moot point when compared to Option 1 where auto-derivation is used. The disadvantage of this model is that routes would be imported by NVEs that may not be interested in a given VNI.
In this option, the MAC-VRF table is identified by the RT in the control plane; a specific bridge table for that MAC-VRF is identified by the <RT, Ethernet Tag ID> in the control plane. In this option, the VNI in the data plane is sufficient to identify a specific bridge table.
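As a non-normative illustration, the two mapping options can be contrasted with the following Python sketch of the data-plane lookup from a received VNI to a bridge table. The `MacVrf` class, the helper functions, and all RT, VNI, and Ethernet Tag values are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass, field


@dataclass
class MacVrf:
    route_target: str
    # Bridge tables keyed by Ethernet Tag ID; each maps MAC -> next hop.
    bridge_tables: dict = field(default_factory=dict)


def build_option1(vni_to_rt):
    """Option 1: one MAC-VRF, holding a single bridge table, per VNI."""
    lookup = {}
    for vni, rt in vni_to_rt.items():
        lookup[vni] = (MacVrf(rt, {0: {}}), 0)
    return lookup


def build_option2(rt, vni_to_tag):
    """Option 2: one MAC-VRF per tenant, with one bridge table per VNI."""
    vrf = MacVrf(rt, {tag: {} for tag in vni_to_tag.values()})
    return {vni: (vrf, tag) for vni, tag in vni_to_tag.items()}


option1 = build_option1({10001: "65000:1", 10002: "65000:2"})
option2 = build_option2("65000:100", {10001: 10, 10002: 20})
```

In both cases the VNI alone selects the bridge table in the data plane; the options differ in how many RTs must exist and in which NVEs end up importing routes for a VNI they do not serve.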
In order to simplify configuration, when the option of a single VNI per EVI is used, the RT used for EVPN can be auto-derived. RD can be auto-generated as described in [RFC7432], and RT can be auto-derived as described next.
Since a Gateway PE as depicted in Figure 1 participates in both the DCN and WAN BGP sessions, it is important that, when RT values are auto-derived from VNIs, there be no conflict in RT spaces between DCNs and WANs, assuming that both are operating within the same Autonomous System (AS). Also, there can be scenarios where both VXLAN and NVGRE encapsulations may be needed within the same DCN, and their corresponding VNIs are administered independently, which means VNI spaces can overlap. In order to avoid conflict in RT spaces, the 6-byte RT values with 2-octet AS number for DCNs can be auto-derived as follows:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Global Administrator      |    Local Administrator        |
+-----------------------------------------------+---------------+
| Local Administrator (Cont.)   |
+-------------------------------+

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Global Administrator      |A| TYPE| D-ID  | Service ID    |
+-----------------------------------------------+---------------+
|       Service ID (Cont.)      |
+-------------------------------+
The 6-octet RT field consists of two sub-fields:
- Global Administrator sub-field: 2 octets. This sub-field contains an AS number assigned by IANA <https://www.iana.org/assignments/as-numbers/>.
- Local Administrator sub-field: 4 octets
* A: A single-bit field indicating if this RT is auto-derived
        0: auto-derived
        1: manually derived
* Type: A 3-bit field that identifies the space in which the other 3 bytes are defined. The following spaces are defined:
        0 : VID (802.1Q VLAN ID)
        1 : VXLAN
        2 : NVGRE
        3 : I-SID
        4 : EVI
        5 : dual-VID (QinQ VLAN ID)
* D-ID: A 4-bit field that identifies the domain-id. The default value of domain-id is zero, indicating that only a single numbering space exists for a given technology. However, if more than one number space exists for a given technology (e.g., overlapping VXLAN spaces), then each of the number spaces needs to be identified by its corresponding domain-id starting from 1.
* Service ID: This 3-octet field is set to VNI, VSID, I-SID, or VID.
It should be noted that RT auto-derivation is applicable only to 2-octet AS numbers. For 4-octet AS numbers, the RT needs to be manually configured because the 3-octet VNI field cannot fit within the 2-octet Local Administrator field.
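As a non-normative illustration, the auto-derived RT layout described above can be packed as follows. This is a minimal Python sketch; the names auto_derive_rt and RT_TYPE are hypothetical helpers introduced here, and only the 6-octet RT field is produced (the Extended Community type and sub-type octets are omitted).

```python
import struct

# Values of the 3-bit Type field, as listed in the text above.
RT_TYPE = {"VID": 0, "VXLAN": 1, "NVGRE": 2, "I-SID": 3, "EVI": 4, "dual-VID": 5}


def auto_derive_rt(asn, space, service_id, domain_id=0):
    """Return the 6-octet RT field: 2-octet AS + 4-octet Local Administrator."""
    if not 0 <= asn <= 0xFFFF:
        raise ValueError("auto-derivation requires a 2-octet AS number")
    if not 0 <= domain_id <= 0xF:
        raise ValueError("D-ID is a 4-bit field")
    if not 0 <= service_id <= 0xFFFFFF:
        raise ValueError("Service ID is a 3-octet field")
    a_bit = 0  # 0 indicates an auto-derived RT
    # First Local Administrator octet: A (1 bit) | Type (3 bits) | D-ID (4 bits).
    first_octet = (a_bit << 7) | (RT_TYPE[space] << 4) | domain_id
    return struct.pack("!HB", asn, first_octet) + service_id.to_bytes(3, "big")
```

For example, under these assumptions, AS 65000 with VXLAN VNI 10000 in the default domain yields the octets fd e8 10 00 27 10.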
In EVPN, an MPLS label, for instance, identifying the forwarding table is distributed by the egress PE via the EVPN control plane and is placed in the MPLS header of a given packet by the ingress PE. This label is used upon receipt of that packet by the egress PE for disposition of that packet. This is very similar to the use of the VNI by the egress NVE, with the difference being that an MPLS label has local significance while a VNI typically has global significance. Accordingly, and specifically to support the option of locally assigned VNIs, the MPLS Label1 field in the MAC/IP Advertisement route, the MPLS label field in the Ethernet A-D per EVI route, and the MPLS label field in the P-Multicast Service Interface (PMSI) Tunnel attribute of the Inclusive Multicast Ethernet Tag (IMET) route are used to carry the VNI. For the balance of this memo, the above MPLS label fields will be referred to as the VNI field. The VNI field is used for both local and global VNIs; for either case, the entire 24-bit field is used to encode the VNI value.
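The difference between the two uses of the 3-octet label field can be sketched as follows. This Python fragment is illustrative only; encode_label_field is a hypothetical name, and the MPLS branch assumes the [RFC7432] convention that the label value occupies the high-order 20 bits of the field.

```python
def encode_label_field(value, is_vni):
    """Encode the 3-octet MPLS label field of an EVPN route."""
    if is_vni:
        # A VNI (local or global) uses the entire 24-bit field.
        if not 0 <= value <= 0xFFFFFF:
            raise ValueError("VNI must fit in 24 bits")
        return value.to_bytes(3, "big")
    # Assumption: an MPLS label sits in the high-order 20 bits.
    if not 0 <= value <= 0xFFFFF:
        raise ValueError("MPLS label must fit in 20 bits")
    return (value << 4).to_bytes(3, "big")
```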
For the VLAN-Based Service (a single VNI per MAC-VRF), the Ethernet Tag field in the MAC/IP Advertisement, Ethernet A-D per EVI, and IMET route MUST be set to zero just as in the VLAN-Based Service in [RFC7432].
For the VLAN-Aware Bundle Service (multiple VNIs per MAC-VRF with each VNI associated with its own bridge table), the Ethernet Tag field in the MAC Advertisement, Ethernet A-D per EVI, and IMET route MUST identify a bridge table within a MAC-VRF; the set of Ethernet Tags for that EVI needs to be configured consistently on all PEs within that EVI. For locally assigned VNIs, the value advertised in the Ethernet Tag field MUST be set to a VID just as in the VLAN-aware bundle service in [RFC7432]. This setting must be applied consistently on all PE devices participating in that EVI within a given domain. For global VNIs, the value advertised in the Ethernet Tag field SHOULD be set to a VNI as long as it matches the existing semantics of the Ethernet Tag, i.e., it identifies a bridge table within a MAC-VRF and the set of VNIs is configured consistently on each PE in that EVI.
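The Ethernet Tag rules for the two service types can be summarized in a short sketch. The function name and the string constants below are hypothetical; for global VNIs, the sketch follows the recommended (SHOULD) behavior rather than covering every permitted choice.

```python
def ethernet_tag(service, vni_scope, vid, vni):
    """Pick the Ethernet Tag value advertised in EVPN routes."""
    if service == "vlan-based":
        # A single VNI per MAC-VRF: the Ethernet Tag is always zero.
        return 0
    if service == "vlan-aware-bundle":
        if vni_scope == "local":
            return vid  # locally assigned VNI: advertise the VID
        if vni_scope == "global":
            return vni  # global VNI: advertise the VNI itself
        raise ValueError("vni_scope must be 'local' or 'global'")
    raise ValueError("unknown service type")
```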
In order to indicate which type of data-plane encapsulation (i.e., VXLAN, NVGRE, MPLS, or MPLS in GRE) is to be used, the BGP Encapsulation Extended Community defined in [RFC5512] is included with all EVPN routes (i.e., MAC Advertisement, Ethernet A-D per EVI, Ethernet A-D per ESI, IMET, and Ethernet Segment) advertised by an egress PE. Five new values have been assigned by IANA to extend the list of encapsulation types defined in [RFC5512]; they are listed in Section 11.
The MPLS encapsulation tunnel type, listed in Section 11, is needed in order to distinguish between an advertising node that only supports non-MPLS encapsulations and one that supports MPLS and non-MPLS encapsulations. An advertising node that only supports MPLS encapsulation does not need to advertise any encapsulation tunnel types; i.e., if the BGP Encapsulation Extended Community is not present, then either MPLS encapsulation or a statically configured encapsulation is assumed.
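The following Python sketch shows one way to build and interpret the BGP Encapsulation Extended Community. It is an illustration under stated assumptions: the 8-octet layout (type 0x03, sub-type 0x0c, four reserved octets, 2-octet tunnel type) follows [RFC5512], the tunnel type values are those registered by IANA for this document, and the function names and the default_encap parameter are hypothetical.

```python
import struct

# IANA-assigned tunnel types (see Section 11).
TUNNEL_TYPE = {"VXLAN": 8, "NVGRE": 9, "MPLS": 10, "MPLS-in-GRE": 11, "VXLAN-GPE": 12}


def encap_ext_community(encap):
    """Build the 8-octet Encapsulation Extended Community for one tunnel type."""
    return struct.pack("!BBIH", 0x03, 0x0C, 0, TUNNEL_TYPE[encap])


def advertised_tunnel_types(communities, default_encap="MPLS"):
    """Return the set of tunnel types signaled with a route.

    If no Encapsulation Extended Community is present, MPLS (or a
    statically configured encapsulation, passed as default_encap) is assumed.
    """
    found = set()
    for community in communities:
        ctype, subtype, _reserved, tunnel = struct.unpack("!BBIH", community)
        if (ctype, subtype) == (0x03, 0x0C):
            found.add(tunnel)
    return found or {TUNNEL_TYPE[default_encap]}
```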
The Next Hop field of the MP_REACH_NLRI attribute of the route MUST be set to the IPv4 or IPv6 address of the NVE. The remaining fields in each route are set as per [RFC7432].
Note that the procedure defined here -- to use the MPLS Label field to carry the VNI in the presence of a Tunnel Encapsulation Extended Community specifying the use of a VNI -- is aligned with the procedures described in Section 8.2.2.2 of [TUNNEL-ENCAP] ("When a Valid VNI has not been Signaled").
The EVPN data plane is modeled as an EVPN MPLS client layer sitting over an MPLS PSN tunnel server layer. Some of the EVPN functions (split-horizon, Aliasing, and Backup Path) are tied to the MPLS client layer. If MPLS over GRE encapsulation is used, then the EVPN MPLS client layer can be carried over an IP PSN tunnel transparently. Therefore, there is no impact to the EVPN procedures and associated data-plane operation.
[RFC4023] defines the standard for using MPLS over GRE encapsulation, which can be used for this purpose. However, when MPLS over GRE is used in conjunction with EVPN, it is recommended that the GRE key field be present and be used to provide a 32-bit entropy value only if the P nodes can perform Equal-Cost Multipath (ECMP) hashing based on the GRE key; otherwise, the GRE header SHOULD NOT include the GRE key field. The Checksum and Sequence Number fields MUST NOT be included, and the corresponding C and S bits in the GRE header MUST be set to zero. A PE capable of supporting this encapsulation SHOULD advertise its EVPN routes along with the Tunnel Encapsulation Extended Community indicating MPLS over GRE encapsulation as described in the previous section.
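The GRE header rules above can be illustrated with a small Python sketch. The bit masks assume the standard GRE flag positions (C, K, and S in the first 16 bits) and the MPLS unicast protocol type 0x8847; the function name gre_header is hypothetical.

```python
import struct

GRE_C_BIT = 0x8000  # Checksum Present
GRE_K_BIT = 0x2000  # Key Present
GRE_S_BIT = 0x1000  # Sequence Number Present
ETHERTYPE_MPLS_UNICAST = 0x8847


def gre_header(entropy_key=None):
    """Build a GRE header for MPLS over GRE.

    Pass entropy_key only when the P nodes can hash on the GRE key;
    otherwise the key field is left out entirely.
    """
    flags = 0
    if entropy_key is not None:
        flags |= GRE_K_BIT
    # The C and S bits are never set, so no Checksum or Sequence Number follows.
    assert flags & (GRE_C_BIT | GRE_S_BIT) == 0
    header = struct.pack("!HH", flags, ETHERTYPE_MPLS_UNICAST)
    if entropy_key is not None:
        header += struct.pack("!I", entropy_key & 0xFFFFFFFF)
    return header
```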
The use of the BGP Encapsulation Extended Community per [RFC5512] allows each NVE in a given EVI to know each of the encapsulations supported by each of the other NVEs in that EVI. That is, each of the NVEs in a given EVI may support multiple data-plane encapsulations. An ingress NVE can send a frame to an egress NVE only if the set of encapsulations advertised by the egress NVE forms a non-empty intersection with the set of encapsulations supported by the ingress NVE; it is at the discretion of the ingress NVE which encapsulation to choose from this intersection. (As noted in Section 5.1.3, if the BGP Encapsulation extended community is not present, then the default MPLS encapsulation or a locally configured encapsulation is assumed.)
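The selection rule can be sketched as a simple set intersection. In this illustrative Python fragment, choose_encapsulation is a hypothetical name, and treating the order of the ingress NVE's list as its local preference is an assumption; the text leaves the choice to the ingress NVE.

```python
def choose_encapsulation(ingress_supported, egress_advertised):
    """Return the encapsulation the ingress NVE will use, or None.

    ingress_supported: encapsulations of the ingress NVE, in preference order.
    egress_advertised: encapsulations advertised by the egress NVE.
    """
    for encap in ingress_supported:
        if encap in egress_advertised:
            return encap
    # Empty intersection: the ingress NVE cannot send frames to this egress NVE.
    return None
```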
When a PE advertises multiple supported encapsulations, it MUST advertise encapsulations that use the same EVPN procedures including procedures associated with split-horizon filtering described in Section 8.3.1. For example, VXLAN and NVGRE (or MPLS and MPLS over GRE) encapsulations use the same EVPN procedures; thus, a PE can advertise both of them and can support either of them or both of them simultaneously. However, a PE MUST NOT advertise VXLAN and MPLS encapsulations together because (a) the MPLS field of EVPN routes is set to either an MPLS label or a VNI, but not both and (b) some EVPN procedures (such as split-horizon filtering) are different for VXLAN/NVGRE and MPLS encapsulations.
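A configuration check for this rule might look like the following Python sketch. The grouping into two procedure families mirrors the examples in the text; the names PROCEDURE_FAMILY and check_advertised_set are hypothetical.

```python
# Encapsulations grouped by the EVPN procedures they share.
PROCEDURE_FAMILY = {
    "VXLAN": "vni",
    "NVGRE": "vni",
    "MPLS": "mpls",
    "MPLS-in-GRE": "mpls",
}


def check_advertised_set(encaps):
    """Raise ValueError if a PE would advertise incompatible encapsulations."""
    families = {PROCEDURE_FAMILY[encap] for encap in encaps}
    if len(families) > 1:
        raise ValueError(
            "encapsulations with different EVPN procedures "
            "MUST NOT be advertised together: %s" % sorted(encaps)
        )
```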
An ingress node that uses shared multicast trees for sending broadcast or multicast frames MAY maintain distinct trees for each different encapsulation type.
It is the responsibility of the operator of a given EVI to ensure that all of the NVEs in that EVI support at least one common encapsulation. If this condition is violated, it could result in service disruption or failure. The use of the BGP Encapsulation Extended Community provides a method to detect when this condition is violated, but the actions to be taken are at the discretion of the operator and are outside the scope of this document.
When an NVE and its hosts/VMs are co-located in the same physical device, e.g., when they reside in a server, the links between them are virtual and they typically share fate. That is, the subject hosts/VMs are typically not multihomed or, if they are multihomed, the multihoming is a purely local matter to the server hosting the VM and the NVEs, and it need not be "visible" to any other NVEs residing on other servers. Thus, it does not require any specific protocol mechanisms. The most common case of this is when the NVE resides on the hypervisor.
In the subsections that follow, we will discuss the impact on EVPN procedures for the case when the NVE resides on the hypervisor and the VXLAN (or NVGRE) encapsulation is used.
7.1. Impact on EVPN BGP Routes & Attributes for VXLAN/NVGRE Encapsulations
In scenarios where different groups of data centers are under different administrative domains, and these data centers are connected via one or more backbone core providers as described in [RFC7365], the RD must be a unique value per EVI or per NVE as described in [RFC7432]. In other words, whenever there is more than one administrative domain for global VNI, a unique RD must be used; or, whenever the VNI value has local significance, a unique RD must be used. Therefore, it is recommended to use a unique RD as described in [RFC7432] at all times.
When the NVEs reside on the hypervisor, the EVPN BGP routes and attributes associated with multihoming are no longer required. This reduces the required routes and attributes to the following subset of four out of the total of eight listed in Section 7 of [RFC7432]:
- MAC/IP Advertisement Route
- Inclusive Multicast Ethernet Tag Route
- MAC Mobility Extended Community
- Default Gateway Extended Community
However, as noted in Section 8.6 of [RFC7432], in order to enable a single-homing ingress NVE to take advantage of fast convergence, Aliasing, and Backup Path when interacting with multihomed egress NVEs attached to a given ES, the single-homing ingress NVE should be able to receive and process routes that are Ethernet A-D per ES and Ethernet A-D per EVI.
When the NVEs reside on the hypervisors, the EVPN procedures associated with multihoming are no longer required. This limits the procedures on the NVE to the following subset.
1. Local learning of MAC addresses received from the VMs per Section 9.1 of [RFC7432].
2. Advertising locally learned MAC addresses in BGP using the MAC/IP Advertisement routes.
3. Performing remote learning using BGP per Section 9.2 of [RFC7432].
4. Discovering other NVEs and constructing the multicast tunnels using the IMET routes.
5. Handling MAC address mobility events per the procedures of Section 15 in [RFC7432].
However, as noted in Section 8.6 of [RFC7432], in order to enable a single-homing ingress NVE to take advantage of fast convergence, Aliasing, and Backup Path when interacting with multihomed egress NVEs attached to a given ES, a single-homing ingress NVE should implement the ingress node processing of routes that are Ethernet A-D per ES and Ethernet A-D per EVI as defined in Sections 8.2 ("Fast Convergence") and 8.4 ("Aliasing and Backup Path") of [RFC7432].
In this section, we discuss the scenario where the NVEs reside in the ToR switches and the servers (where VMs are residing) are multihomed to these ToR switches. The multihoming NVE operates in All-Active or Single-Active redundancy mode. If the servers are single-homed to the ToR switches, then the scenario becomes similar to that where the NVE resides on the hypervisor, as discussed in Section 7, as far as the required EVPN functionality is concerned.
[RFC7432] defines a set of BGP routes, attributes, and procedures to support multihoming. We first describe these functions and procedures, then discuss which of these are impacted by the VXLAN (or NVGRE) encapsulation and what modifications are required. As will be seen later in this section, the only EVPN procedure that is impacted by non-MPLS overlay encapsulation (e.g., VXLAN or NVGRE) where it provides space for one ID rather than a stack of labels, is that of split-horizon filtering for multihomed ESs described in Section 8.3.1.
In this section, we will recap the multihoming features of EVPN to highlight the encapsulation dependencies. The section only describes the features and functions at a high level. For more details, the reader should refer to [RFC7432].
EVPN NVEs (or PEs) connected to the same ES (e.g., the same server via Link Aggregation Group (LAG)) can automatically discover each other with minimal to no configuration through the exchange of BGP routes.
EVPN defines a mechanism to efficiently and quickly signal, to remote NVEs, the need to update their forwarding tables upon the occurrence of a failure in connectivity to an ES (e.g., a link or a port failure). This is done by having each NVE advertise an Ethernet A-D route per ES for each locally attached segment. Upon a failure in connectivity to the attached segment, the NVE withdraws the corresponding Ethernet A-D route. This triggers all NVEs that receive the withdrawal to update their next-hop adjacencies for all MAC addresses associated with the ES in question. If no other NVE had advertised an Ethernet A-D route for the same segment, then the NVE that received the withdrawal simply invalidates the MAC entries for that segment. Otherwise, the NVE updates the next-hop adjacency list accordingly.
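The mass-withdrawal handling on a receiving NVE can be sketched as follows. The data structures are assumptions made for illustration: es_nves maps each ESI to the set of NVEs currently advertising an Ethernet A-D per ES route for it, and mac_to_esi maps each remotely learned MAC address to its ESI. The function name is hypothetical.

```python
def handle_ad_per_es_withdrawal(es_nves, mac_to_esi, esi, nve):
    """Process the withdrawal of an Ethernet A-D per ES route.

    Returns the list of MAC addresses that were invalidated.
    """
    remaining = es_nves.get(esi, set())
    remaining.discard(nve)
    if remaining:
        # Other NVEs still reach the segment: only the next-hop
        # adjacency list (the set above) changes.
        return []
    # No NVE advertises the segment any more: invalidate its MAC entries.
    es_nves.pop(esi, None)
    stale = [mac for mac, segment in mac_to_esi.items() if segment == esi]
    for mac in stale:
        del mac_to_esi[mac]
    return stale
```

A single withdrawal therefore updates every MAC entry tied to the segment, without waiting for per-MAC route withdrawals.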
If a server that is multihomed to two or more NVEs (represented by an ES, ES1) and operating in All-Active redundancy mode sends a BUM (i.e., Broadcast, Unknown unicast, or Multicast) packet to one of these NVEs, then it is important to ensure the packet is not looped back to the server via another NVE connected to this server. The filtering mechanism on the NVE that prevents such loops and packet duplication is called "split-horizon filtering".
In the case where a station is multihomed to multiple NVEs, it is possible that only a single NVE learns a set of the MAC addresses associated with traffic transmitted by the station. This leads to a situation where remote NVEs receive MAC Advertisement routes, for these addresses, from a single NVE even though multiple NVEs are connected to the multihomed station. As a result, the remote NVEs are not able to effectively load-balance traffic among the NVEs connected to the multihomed ES. For example, this could be the case when the NVEs perform data-path learning on the access and the load-balancing function on the station hashes traffic from a given source MAC address to a single NVE. Another scenario where this occurs is when the NVEs rely on control-plane learning on the access (e.g., using ARP), since ARP traffic will be hashed to a single link in the LAG.
To alleviate this issue, EVPN introduces the concept of "Aliasing". This refers to the ability of an NVE to signal that it has reachability to a given locally attached ES, even when it has learned no MAC addresses from that segment. The Ethernet A-D route per EVI is used to that end. Remote NVEs that receive MAC Advertisement routes with non-zero ESIs should consider the MAC address as reachable via all NVEs that advertise reachability to the relevant Segment using Ethernet A-D routes with the same ESI and with the Single-Active flag reset.
Backup Path is a closely related function, albeit one that applies to the case where the redundancy mode is Single-Active. In this case, the NVE signals that it has reachability to a given locally attached ES using the Ethernet A-D route as well. Remote NVEs that receive the MAC Advertisement routes, with non-zero ESI, should consider the MAC address as reachable via the advertising NVE. Furthermore, the remote NVEs should install a Backup Path, for said MAC, to the NVE that had advertised reachability to the relevant segment using an Ethernet A-D route with the same ESI and with the Single-Active flag set.
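Aliasing and Backup Path resolution on a remote NVE can be sketched together. In this illustrative Python fragment, ad_routes is an assumed structure that maps each NVE advertising an Ethernet A-D route for the MAC's ESI to the value of its Single-Active flag; resolve_next_hops is a hypothetical name.

```python
def resolve_next_hops(mac_advertiser, ad_routes):
    """Return (active, backup) next-hop lists for a MAC with a non-zero ESI.

    mac_advertiser: NVE that sent the MAC Advertisement route.
    ad_routes: dict of NVE -> Single-Active flag (True when set).
    """
    # Aliasing: every NVE whose flag is reset is usable alongside the advertiser.
    active = {mac_advertiser} | {nve for nve, single in ad_routes.items() if not single}
    # Backup Path: NVEs whose flag is set are held in reserve.
    backup = {nve for nve, single in ad_routes.items() if single} - active
    return sorted(active), sorted(backup)
```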
If a host is multihomed to two or more NVEs on an ES operating in All-Active redundancy mode, then, for a given EVI, only one of these NVEs, termed the "Designated Forwarder" (DF), is responsible for sending it broadcast, multicast, and, if configured for that EVI, unknown unicast frames.
This is required in order to prevent duplicate delivery of multi-destination frames to a multihomed host or VM, in case of All-Active redundancy.
In NVEs where frames tagged as IEEE 802.1Q [IEEE.802.1Q] are received from hosts, the DF election should be performed based on host VIDs per Section 8.5 of [RFC7432]. Furthermore, multihoming PEs of a given ES MAY perform DF election using configured IDs such as VNI, EVI, normalized VIDs, etc., as long as the IDs are configured consistently across the multihoming PEs.
In GWs where VXLAN-encapsulated frames are received, the DF election is performed on VNIs. Again, it is assumed that, for a given Ethernet segment, VNIs are unique and consistent (e.g., no duplicate VNIs exist).
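As an illustration, the default DF election of Section 8.5 of [RFC7432] (order the PEs on the ES by IP address, then select the PE at ordinal "ID mod N") can be sketched in Python as follows. The function name elect_df is hypothetical, and the identifier passed in may be a VID, VNI, or other configured ID, as discussed above.

```python
import ipaddress


def elect_df(pe_addresses, ident):
    """Return the IP address of the DF for the given identifier.

    pe_addresses: IP addresses of all PEs attached to the Ethernet segment.
    ident: the ID used for the election (e.g., a VID or a VNI).
    """
    # Order numerically, so that 10.0.0.2 sorts before 10.0.0.10.
    ordered = sorted(pe_addresses, key=lambda a: int(ipaddress.ip_address(a)))
    return ordered[ident % len(ordered)]
```

Every PE on the segment runs the same computation over the same inputs, which is why the IDs need to be configured consistently.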
Since multihoming is supported in this scenario, the entire set of BGP routes and attributes defined in [RFC7432] is used. The setting of the Ethernet Tag field in the MAC Advertisement, Ethernet A-D per EVI, and IMET routes follows that of Section 5.1.3. Furthermore, the setting of the VNI field in the MAC Advertisement and Ethernet A-D per EVI routes follows that of Section 5.1.3.
Two cases need to be examined here, depending on whether the NVEs are operating in Single-Active or in All-Active redundancy mode.
First, let's consider the case of Single-Active redundancy mode, where the hosts are multihomed to a set of NVEs; however, only a single NVE is active at a given point of time for a given VNI. In this case, Aliasing is not required, and split-horizon filtering may not be required, but other functions such as multihomed ES auto-discovery, fast convergence and mass withdrawal, Backup Path, and DF election are required.
Second, let's consider the case of All-Active redundancy mode. In this case, out of all the EVPN multihoming features listed in Section 8.1, the use of the VXLAN or NVGRE encapsulation impacts the split-horizon and Aliasing features, since those two rely on the MPLS client layer. Given that this MPLS client layer is absent with these types of encapsulations, alternative procedures and mechanisms are needed to provide the required functions. Those are discussed in detail next.
In EVPN, an MPLS label is used for split-horizon filtering to support All-Active multihoming where an ingress NVE adds a label corresponding to the site of origin (aka an ESI label) when encapsulating the packet. The egress NVE checks the ESI label when attempting to forward a multi-destination frame out an interface, and if the label corresponds to the same site identifier (ESI) associated with that interface, the packet gets dropped. This prevents the occurrence of forwarding loops.
Since VXLAN and NVGRE encapsulations do not include the ESI label, other means of performing the split-horizon filtering function must be devised for these encapsulations. The following approach is recommended for split-horizon filtering when VXLAN (or NVGRE) encapsulation is used.
Every NVE tracks the IP address(es) associated with the other NVE(s) with which it has shared multihomed ESs. When the NVE receives a multi-destination frame from the overlay network, it examines the source IP address in the tunnel header (which corresponds to the ingress NVE) and filters out the frame on all local interfaces connected to ESs that are shared with the ingress NVE. With this approach, it is required that the ingress NVE perform replication locally to all directly attached Ethernet segments (regardless of the DF election state) for all flooded traffic ingress from the access interfaces (i.e., from the hosts). This approach is referred to as "Local Bias", and has the advantage that only a single IP address need be used per NVE for split-horizon filtering, as opposed to requiring an IP address per Ethernet segment per NVE.
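The two halves of the Local Bias approach can be sketched as follows. The structures are assumptions made for illustration: local_es maps each local access interface to its ESI (None for a single-homed interface), and shared_es_by_nve maps the IP address of each remote NVE to the set of ESIs shared with it. The sketch covers only the split-horizon check; DF filtering of frames received from the overlay is not shown.

```python
def egress_interfaces(local_es, shared_es_by_nve, source_nve):
    """Interfaces eligible for a multi-destination frame from the overlay.

    source_nve is the source IP address found in the tunnel header.
    """
    blocked = shared_es_by_nve.get(source_nve, set())
    return [ifc for ifc, esi in local_es.items() if esi not in blocked]


def local_replication(local_es, ingress_interface):
    """Interfaces that receive a flooded frame arriving from a local host.

    The ingress NVE replicates to all directly attached segments,
    regardless of DF election state, except the one the frame came from.
    """
    return [ifc for ifc in local_es if ifc != ingress_interface]
```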
In order to allow proper operation of split-horizon filtering among the same group of multihoming PE devices, a mix of PE devices with MPLS over GRE encapsulations running the procedures from [RFC7432] for split-horizon filtering on the one hand and VXLAN/NVGRE encapsulation running local-bias procedures on the other on a given Ethernet segment MUST NOT be configured.
The Aliasing and the Backup Path procedures for VXLAN/NVGRE encapsulation are very similar to the ones for MPLS. In the case of MPLS, Ethernet A-D route per EVI is used for Aliasing when the corresponding ES operates in All-Active multihoming, and the same route is used for Backup Path when the corresponding ES operates in Single-Active multihoming. In the case of VXLAN/NVGRE, the same route is used for the Aliasing and the Backup Path with the difference that the Ethernet Tag and VNI fields in Ethernet A-D per EVI route are set as described in Section 5.1.3.
In EVPN, when an ingress PE uses ingress replication to flood unknown unicast traffic to egress PEs, the ingress PE uses a different EVPN MPLS label (from the one used for known unicast traffic) to identify such BUM traffic. The egress PEs use this label to identify such BUM traffic and, thus, apply DF filtering for All-Active multihomed sites. In the absence of an unknown unicast traffic designation, and when unknown unicast flooding is enabled, there can be transient duplicate traffic to All-Active multihomed sites under the following condition: the host MAC address is learned by the egress PE(s) and advertised to the ingress PE; however, the MAC Advertisement has not been received or processed by the ingress PE, resulting in the host MAC address being unknown on the ingress PE but known on the egress PE(s). Therefore, when a packet destined to that host MAC address arrives on the ingress PE, the ingress PE floods it via ingress replication to all the egress PE(s), and since the address is known to the egress PE(s), multiple copies are sent to the All-Active multihomed site. It should be noted that such transient packet duplication only happens when a) the destination host is multihomed via All-Active redundancy mode, b) flooding of unknown unicast is enabled in the network, c) ingress replication is used, and d) traffic for the destination host arrives on the ingress PE before it learns the host MAC address via BGP EVPN advertisement. If it is desired to avoid such transient packet duplication (however low its probability may be), then VXLAN-GPE encapsulation needs to be used between these PEs, and the ingress PE needs to set the BUM Traffic Bit (B bit) [VXLAN-GPE] to indicate that this is ingress-replicated BUM traffic.
The EVPN IMET route is used to discover the multicast tunnels among the endpoints associated with a given EVI (e.g., given VNI) for VLAN-Based Service and a given <EVI, VLAN> for VLAN-Aware Bundle Service. All fields of this route are set as described in Section 5.1.3. The originating router's IP address field is set to the NVE's IP address. This route is tagged with the PMSI Tunnel attribute, which is used to encode the type of multicast tunnel to be used as well as the multicast tunnel identifier. The tunnel encapsulation is encoded by adding the BGP Encapsulation Extended Community as per Section 5.1.1. For example, the PMSI Tunnel attribute may indicate the multicast tunnel is of type Protocol Independent Multicast - Sparse-Mode (PIM-SM); whereas, the BGP Encapsulation Extended Community may indicate the encapsulation for that tunnel is of type VXLAN. The following tunnel types as defined in [RFC6514] can be used in the PMSI Tunnel attribute for VXLAN/NVGRE:
     + 3 - PIM-SSM Tree
     + 4 - PIM-SM Tree
     + 5 - BIDIR-PIM Tree
     + 6 - Ingress Replication
In case of VXLAN and NVGRE encapsulations with locally assigned VNIs, just as in [RFC7432], each PE MUST advertise an IMET route to other PEs in an EVPN instance for the multicast tunnel type that it uses (i.e., ingress replication, PIM-SM, PIM-SSM, or BIDIR-PIM tunnel). However, for globally assigned VNIs, each PE MUST advertise an IMET route to other PEs in an EVPN instance for ingress replication or a PIM-SSM tunnel, and they MAY advertise an IMET route for a PIM-SM or BIDIR-PIM tunnel. In case of a PIM-SM or BIDIR-PIM tunnel, no information in the IMET route is needed by the PE to set up these tunnels.
In the scenario where the multicast tunnel is a tree, both the Inclusive as well as the Aggregate Inclusive variants may be used. In the former case, a multicast tree is dedicated to a VNI. Whereas, in the latter, a multicast tree is shared among multiple VNIs. For VNI-Based Service, the Aggregate Inclusive mode is accomplished by having the NVEs advertise multiple IMET routes with different RTs (one per VNI) but with the same tunnel identifier encoded in the PMSI Tunnel attribute. For VNI-Aware Bundle Service, the Aggregate Inclusive mode is accomplished by having the NVEs advertise multiple IMET routes with different VNIs encoded in the Ethernet Tag field, but with the same tunnel identifier encoded in the PMSI Tunnel attribute.
在多播隧道是树的场景中,可以使用Inclusive和Aggregate Inclusive变量。在前一种情况下,多播树专用于VNI。然而,在后者中,多播树在多个VNI之间共享。对于基于VNI的服务,聚合包含模式通过让NVE公布具有不同RTs(每个VNI一个)但具有在PMSI隧道属性中编码的相同隧道标识符的多个IMT路由来实现。对于支持VNI的捆绑服务,聚合包含模式是通过让NVE公布多个IMT路由来实现的,这些路由在以太网标记字段中编码了不同的VNI,但在PMSI隧道属性中编码了相同的隧道标识符。
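As an illustration of the two Aggregate Inclusive variants, the following Python sketch builds the sets of IMET routes an NVE might advertise. It is a simplified model, not a BGP encoding: the `ImetRoute` fields, the helper names, the RT strings, and the group address `239.1.1.1` used as a tunnel identifier are all assumptions made for the example. The Ethernet Tag of zero for VNI-Based Service is assumed from the route construction rules of Section 5.1.3.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set


@dataclass(frozen=True)
class ImetRoute:
    originator_ip: str   # NVE IP address
    ethernet_tag: int    # 0 for VNI-Based Service; the VNI for VNI-Aware Bundle
    route_target: str
    tunnel_id: str       # tunnel identifier carried in the PMSI Tunnel attribute


def vni_based_aggregate(nve_ip: str, vni_to_rt: Dict[int, str],
                        tunnel_id: str) -> List[ImetRoute]:
    """One IMET route per VNI: different RTs, same tunnel identifier."""
    return [ImetRoute(nve_ip, 0, rt, tunnel_id)
            for _vni, rt in sorted(vni_to_rt.items())]


def vni_aware_aggregate(nve_ip: str, vnis: Iterable[int], bundle_rt: str,
                        tunnel_id: str) -> List[ImetRoute]:
    """One IMET route per VNI: VNI in the Ethernet Tag, same tunnel identifier."""
    return [ImetRoute(nve_ip, vni, bundle_rt, tunnel_id) for vni in sorted(vnis)]


def trees_in_use(routes: Iterable[ImetRoute]) -> Set[str]:
    """Distinct multicast trees referenced by a set of IMET routes."""
    return {r.tunnel_id for r in routes}


based = vni_based_aggregate("192.0.2.1", {100: "RT:100", 200: "RT:200"}, "239.1.1.1")
aware = vni_aware_aggregate("192.0.2.1", [100, 200], "RT:bundle", "239.1.1.1")
```

In both cases two routes are advertised, but `trees_in_use` returns a single tunnel identifier, which is what makes the tree shared among the VNIs.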
For DCIs, the following two main scenarios are considered when connecting data centers running evpn-overlay (as described here) over an MPLS/IP core network:
对于DCIs,在通过MPLS/IP核心网络连接运行evpn overlay(如下所述)的数据中心时,考虑以下两种主要方案:
- Scenario 1: DCI using GWs
- 场景1:使用GWs的DCI
- Scenario 2: DCI using ASBRs
- 场景2:使用ASBR的DCI
The following two subsections describe the operations for each of these scenarios.
以下两个小节描述了每个场景的操作。
This is the typical scenario for interconnecting data centers over a WAN. In this scenario, EVPN routes are terminated and processed in each GW. MAC/IP routes are always re-advertised from DC to WAN; from WAN to DC, however, they are not re-advertised if unknown MAC addresses (and default IP addresses) are utilized in the NVEs. In this scenario, each GW maintains a MAC-VRF (and/or IP-VRF) for each EVI. The main advantage of this approach is that NVEs do not need to maintain MAC and IP addresses from any remote data centers when default IP routes and unknown MAC routes are used; that is, they only need to maintain routes that are local to their own DC. When default IP routes and unknown MAC routes are used, any unknown IP and MAC packets from NVEs are forwarded to the GWs where all the VPN MAC and IP routes are maintained. This approach reduces the size of MAC-VRF and IP-VRF significantly at NVEs. Furthermore, it results in a faster convergence time upon a link or NVE failure in a multihomed network or device redundancy scenario, because the failure-related BGP routes (such as mass withdrawal messages) do not need to be propagated all the way to the remote NVEs in the remote DCs. This approach is described in detail in Section 3.4 of [DCI-EVPN-OVERLAY].
这是通过广域网互连数据中心的典型场景。在这种情况下,EVPN路由在每个GW中终止和处理,MAC/IP路由始终从DC重新通告到WAN,但从WAN重新通告到DC,如果在NVE中使用未知MAC地址(和默认IP地址),则不会重新通告。在这种情况下,每个GW为每个EVI维护一个MAC-VRF(和/或IP-VRF)。这种方法的主要优点是,当使用默认IP路由和未知MAC路由时,NVE不需要维护来自任何远程数据中心的MAC和IP地址;也就是说,他们只需要维护他们自己的DC的本地路由。当使用默认IP路由和未知MAC路由时,来自NVE的任何未知IP和MAC数据包将转发到GWs,在GWs中维护所有VPN MAC和IP路由。这种方法在NVE上显著减小了MAC-VRF和IP-VRF的大小。此外,由于故障相关的BGP路由(如大规模撤回消息)不需要一直传播到远程DCs中的远程NVE,因此在多宿网络或设备冗余场景中发生链路故障或NVE故障时,它会导致更快的收敛时间。[DCI-EVPN-OVERLAY]第3.4节详细描述了该方法。
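The effect of the unknown MAC route on an NVE's forwarding decision can be sketched as follows. This is a minimal Python illustration under stated assumptions: the MAC-VRF is modeled as a plain dictionary, the key `UNKNOWN_MAC` stands in for the unknown MAC route, and the MAC addresses and next-hop names are invented for the example.

```python
from typing import Dict, Optional

# Hypothetical key representing the unknown MAC route installed by the GW.
UNKNOWN_MAC = "unknown-mac-route"


def nve_lookup(mac_vrf: Dict[str, str], dst_mac: str) -> Optional[str]:
    """Return the next hop for dst_mac, falling back to the unknown MAC route."""
    if dst_mac in mac_vrf:
        # The MAC is local to this DC, so the NVE holds a specific route.
        return mac_vrf[dst_mac]
    # Remote MACs are not installed; traffic is handed to the GW instead.
    return mac_vrf.get(UNKNOWN_MAC)


# The NVE only holds routes that are local to its own DC plus the fallback.
mac_vrf = {"00:00:5e:00:53:01": "NVE2", UNKNOWN_MAC: "GW1"}
local_hop = nve_lookup(mac_vrf, "00:00:5e:00:53:01")
remote_hop = nve_lookup(mac_vrf, "00:00:5e:00:53:99")
```

Here `local_hop` is `"NVE2"` and `remote_hop` is `"GW1"`: the NVE's table stays small because every MAC outside its own DC resolves through the single fallback entry.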
This approach can be considered as the opposite of the first approach. It favors simplification at DCI devices over NVEs such that larger MAC-VRF (and IP-VRF) tables need to be maintained on NVEs; whereas DCI devices don't need to maintain any MAC (and IP) forwarding tables. Furthermore, DCI devices do not need to terminate and process routes related to multihoming but rather to relay these messages for the establishment of an end-to-end Label Switched Path (LSP). In other words, DCI devices in this approach operate similarly to ASBRs for inter-AS Option B (see Section 10 of [RFC4364]). This requires locally assigned VNIs to be used just like downstream-assigned MPLS VPN labels where, for all practical purposes, the VNIs function like 24-bit VPN labels. This approach is equally applicable to data centers (or Carrier Ethernet networks) with MPLS encapsulation.
这种方法可视为与第一种方法相反。它有利于DCI设备的简化,而不是NVE,因此需要在NVE上维护更大的MAC-VRF(和IP-VRF)表;而DCI设备不需要维护任何MAC(和IP)转发表。此外,DCI设备不需要终止和处理与多归属相关的路由,而是中继这些消息以建立端到端标签交换路径(LSP)。换句话说,这种方法中的DCI设备的运行方式与AS间选项B的ASBR类似(见[RFC4364]第10节)。这要求使用本地分配的VNI,就像下游分配的MPLS VPN标签一样,在所有实际用途中,VNI功能类似于24位VPN标签。这种方法同样适用于采用MPLS封装的数据中心(或运营商以太网)。
In inter-AS Option B, when ASBR receives an EVPN route from its DC over internal BGP (iBGP) and re-advertises it to other ASBRs, it re-advertises the EVPN route by re-writing the BGP next hops to itself, thus losing the identity of the PE that originated the advertisement. This rewrite of BGP next hop impacts the EVPN mass withdrawal route (Ethernet A-D per ES) and its procedure adversely. However, it does not impact the EVPN Aliasing mechanism/procedure because when the Aliasing routes (Ethernet A-D per EVI) are advertised, the receiving PE first resolves a MAC address for a given EVI into its corresponding <ES, EVI>, and, subsequently, it resolves the <ES, EVI> into multiple paths (and their associated next hops) via which the <ES, EVI> is reachable. Since Aliasing and MAC routes are both advertised on a per-EVI-basis and they use the same RD and RT (per EVI), the receiving PE can associate them together on a per-BGP-path basis (e.g., per originating PE). Thus, it can perform recursive route resolution, e.g., a MAC is reachable via an <ES, EVI> which in turn, is reachable via a set of BGP paths; thus, the MAC is reachable via the set of BGP paths. Due to the per-EVI basis, the association of MAC routes and the corresponding Aliasing route is fixed and determined by the same RD and RT; there is no ambiguity when the BGP next hop for these routes is rewritten as these routes pass through ASBRs. That is, the receiving PE may receive multiple Aliasing routes for the same EVI from a single next hop (a single ASBR), and it can still create multiple paths toward that <ES, EVI>.
在内部AS选项B中,当ASBR通过内部BGP(iBGP)从其DC接收EVPN路由并将其重新播发给其他ASBR时,它通过将BGP下一跳重新写入自身来重新播发EVPN路由,从而丢失发起播发的PE的身份。BGP下一跳的重写对EVPN大规模撤回路由(每个ES的以太网A-D)及其过程产生不利影响。然而,它不影响EVPN混叠机制/过程,因为当混叠路由(每个EVI的以太网A-D)被通告时,接收PE首先将给定EVI的MAC地址解析为其对应的<ES,EVI>,并且随后,它将<ES,EVI>解析为多个路径(及其相关联的下一跳),通过这些路径,<ES,EVI>是可访问的。由于别名和MAC路由均基于每个EVI进行广告,并且它们使用相同的RD和RT(每个EVI),因此接收PE可以基于每个BGP路径(例如,每个发起PE)将它们关联在一起。因此,它可以执行递归路由解析,例如,MAC可经由<ES,EVI>到达,而该MAC又可经由一组BGP路径到达;因此,MAC可以通过BGP路径集访问。由于基于每EVI,MAC路由和相应的别名路由的关联是固定的,并且由相同的RD和RT确定;当这些路由通过ASBR时,重写这些路由的BGP下一跳时,不会出现歧义。也就是说,接收PE可以从单个下一跳(单个ASBR)接收同一EVI的多个别名路由,并且它仍然可以创建指向该EVI的多个路径。
However, when the BGP next-hop address corresponding to the originating PE is rewritten, the association between the mass withdrawal route (Ethernet A-D per ES) and its corresponding MAC routes cannot be made based on their RDs and RTs because the RD for the mass withdrawal route is different from the one for the MAC routes. Therefore, the functionality needed at the ASBRs and the receiving PEs depends on whether the mass withdrawal route is originated and whether there is a need to handle route resolution ambiguity for this route. The following two subsections describe the functionality needed by the ASBRs and the receiving PEs depending on whether the NVEs reside in hypervisors or in ToR switches.
然而,当重写与发起PE相对应的BGP下一跳地址时,不能基于其RDs和RTs建立大规模撤回路由(以太网A-D per ES)与其相应MAC路由之间的关联,因为大规模撤回路由的RD不同于MAC路由的RD。因此,ASBR和接收PEs所需的功能取决于是否发起大规模撤离路线,以及是否需要处理该路线的路线解析模糊性。以下两小节描述了ASBR和接收PE所需的功能,具体取决于NVE是驻留在虚拟机监控程序中还是驻留在虚拟机交换机中。
When NVEs reside in hypervisors as described in Section 7.1, there is no multihoming; thus, there is no need for the originating NVE to send Ethernet A-D per ES or Ethernet A-D per EVI routes. However, as noted in Section 7, in order to enable a single-homing ingress NVE to take advantage of fast convergence, Aliasing, and Backup Path when interacting with multihoming egress NVEs attached to a given ES, the single-homing NVE should be able to receive and process Ethernet A-D per ES and Ethernet A-D per EVI routes. The handling of these routes is described in the next section.
如第7.1节所述,当NVE驻留在虚拟机监控程序中时,不存在多宿主;因此,发起的NVE不需要发送每ES的以太网A-D路由或每EVI的以太网A-D路由。然而,如第7节所述,为了使单宿主入口NVE在与连接到给定ES的多宿主出口NVE交互时能够利用快速收敛、别名和备份路径,单宿主NVE应能够接收和处理每ES的以太网A-D路由和每EVI的以太网A-D路由。下一节将介绍这些路由的处理。
When NVEs reside in ToR switches and operate in multihoming redundancy mode, there is a need, as described in Section 8, for the originating multihoming NVE to send Ethernet A-D per ES route(s) (used for mass withdrawal) and Ethernet A-D per EVI routes (used for Aliasing). As described above, the rewrite of the BGP next hop by ASBRs creates ambiguities when Ethernet A-D per ES routes are received by the remote NVE in a different AS because the receiving NVE cannot associate that route with the MAC/IP routes of that ES advertised by the same originating NVE. This ambiguity inhibits the function of mass withdrawal per ES by the receiving NVE in a different AS.
当NVE驻留在ToR交换机中并在多主冗余模式下运行时,如第8节所述,原始多主NVE需要按照ES路由发送以太网a-D(用于大规模提取)和按照EVI路由发送以太网a-D(用于别名)。如上所述,当远程NVE在不同的ASBR中接收到以太网A-D per ES路由时,ASBR重写BGP next hop会产生歧义,因为接收到的NVE无法将该路由与同一发起NVE公布的该ES的MAC/IP路由相关联。这种模糊性抑制了接收NVE在不同AS中的批量提取功能。
As an example, consider a scenario where a CE is multihomed to PE1 and PE2, where these PEs are connected via ASBR1 and then ASBR2 to the remote PE3. Furthermore, consider that PE1, but not PE2, receives M1 from the CE. Therefore, PE1 advertises Ethernet A-D per ES1, Ethernet A-D per EVI1, and M1; whereas PE2 only advertises Ethernet A-D per ES1 and Ethernet A-D per EVI1. ASBR1 receives all five advertisements and passes them to ASBR2 (with itself as the BGP next hop). ASBR2, in turn, passes them to the remote PE3, with itself as the BGP next hop. PE3 receives these five routes, all of which have the same BGP next hop (i.e., ASBR2). Furthermore, the two Ethernet A-D per ES routes received by PE3 have the same information, i.e., the same ESI and the same BGP next hop. Although both of these routes are maintained by the BGP process in PE3 (because they have different RDs and, thus, are treated as different BGP routes), information from only one of them is used in the L2 routing table (L2 RIB).
作为一个例子,考虑一个场景,其中CE是多宿主的PE1和PE2,其中这些PES通过ASBR1连接,然后ASBR2连接到远程PE3。此外,考虑PE1从CE1接收M1,但不接收PE2。因此,PE1根据ES1宣传以太网A-D,根据EVI1宣传以太网A-D,以及M1;然而,PE2仅根据ES1宣传以太网A-D,根据EVI1宣传以太网A-D。ASBR1接收所有这五个广告并将它们传递给ASBR2(自身作为BGP下一跳)。ASBR2依次将它们传递给远程PE3,并将其自身作为BGP下一跳。PE3接收这五条路由,其中所有路由都具有相同的BGP下一跳(即ASBR2)。此外,由PE3接收的两个以太网A-D per-ES路由具有相同的信息,即相同的ESI和相同的BGP下一跳。虽然这两条路由都由PE3中的BGP进程维护(因为它们具有不同的RD,因此被视为不同的BGP路由),但L2路由表(L2 RIB)中只使用其中一条路由的信息。
      PE1
     /   \
   CE     ASBR1---ASBR2---PE3
     \   /
      PE2
Figure 3: Inter-AS Option B
图3:Inter AS选项B
Now, when the AC between PE2 and the CE fails, PE2 sends a Network Layer Reachability Information (NLRI) withdrawal for its Ethernet A-D per ES route. When this withdrawal is propagated to and received by PE3, the BGP process in PE3 removes the corresponding BGP route; however, it doesn't remove the associated information (namely ESI and BGP next hop) from the L2 routing table (L2 RIB) because it still has the other Ethernet A-D per ES route (originated from PE1) with the same information. That is why the mass withdrawal mechanism does not work when doing DCI with inter-AS Option B. However, as described previously, the Aliasing function works and so does "mass withdrawal per EVI" (which is associated with withdrawing the EVPN route associated with Aliasing, i.e., Ethernet A-D per EVI route).
现在,当PE2和CE之间的AC发生故障,并且PE2通过ES路由发送以太网A-D的网络层可达性信息(NLRI)撤回,并且该撤回被PE3传播和接收时,PE3中的BGP进程移除相应的BGP路由;但是,它不会删除相关信息(即ESI和BGP下一跳)从L2路由表(L2 RIB)中删除,因为每个ES路由(源自PE1)仍具有具有相同信息的其他以太网A-D。这就是为什么在使用inter AS选项B进行DCI时,大规模撤回机制不起作用的原因。然而,如前所述,别名功能起作用,“每EVI大规模撤回”(与撤回与别名相关的EVPN路由相关,即每EVI路由的以太网A-D)也起作用。
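The reason the per-ES mass withdrawal is not actionable can be reproduced with a toy model of PE3. In the Python sketch below, the class `RemotePe`, its methods, and the RD strings are hypothetical; the model only captures the two facts stated above, namely that BGP keeps one route per RD while the L2 RIB uses just the ESI and BGP next hop.

```python
from typing import Dict, Set, Tuple


class RemotePe:
    """Toy model of how PE3 handles Ethernet A-D per ES routes."""

    def __init__(self) -> None:
        # BGP keeps routes with different RDs as distinct routes.
        self.bgp_routes: Dict[Tuple[str, str], str] = {}  # (rd, esi) -> next hop

    def receive(self, rd: str, esi: str, next_hop: str) -> None:
        self.bgp_routes[(rd, esi)] = next_hop

    def withdraw(self, rd: str, esi: str) -> None:
        self.bgp_routes.pop((rd, esi), None)

    def l2_rib(self) -> Set[Tuple[str, str]]:
        # The L2 RIB only sees (ESI, BGP next hop); the RD is not part of it.
        return {(esi, nh) for (_rd, esi), nh in self.bgp_routes.items()}


def pe2_fails(next_hop_pe1: str, next_hop_pe2: str):
    """Return the L2 RIB before and after PE2 withdraws its per-ES route."""
    pe3 = RemotePe()
    pe3.receive("RD-PE1", "ES1", next_hop_pe1)
    pe3.receive("RD-PE2", "ES1", next_hop_pe2)
    before = pe3.l2_rib()
    pe3.withdraw("RD-PE2", "ES1")
    return before, pe3.l2_rib()


# Inter-AS Option B: ASBR2 has rewritten both next hops to itself.
inter_before, inter_after = pe2_fails("ASBR2", "ASBR2")
# Same AS: the next hops still identify the originating PEs.
same_before, same_after = pe2_fails("PE1", "PE2")
```

With the rewritten next hops, `inter_before` and `inter_after` are identical, so PE3 has nothing to act on. In the same-AS case the entry for PE2 disappears, which is the signal that mass withdrawal relies on.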
In the above example, PE3 receives two Aliasing routes with the same BGP next hop (ASBR2) but different RDs. One of the Aliasing routes has the same RD as the advertised MAC route (M1). PE3 follows the route resolution procedure specified in [RFC7432] upon receiving the two Aliasing routes; that is, it resolves M1 to <ES, EVI1>, and, subsequently, it resolves <ES, EVI1> to a BGP path list with two paths along with the corresponding VNIs/MPLS labels (one associated with PE1 and the other associated with PE2). It should be noted that even though both paths are advertised by the same BGP next hop (ASBR2), the receiving PE3 can handle them properly. Therefore, M1 is reachable via two paths. This creates two end-to-end LSPs, from PE3 to PE1 and from PE3 to PE2, for M1 such that when PE3 wants to forward traffic destined to M1, it can load-balance between the two LSPs. Although route resolution for Aliasing routes with the same BGP next hop is not explicitly mentioned in [RFC7432], this is the expected operation; thus, it is elaborated here.
在上述示例中,PE3接收具有相同BGP下一跳(ASBR2)但不同RDs的两个别名路由。其中一个别名路由与公布的MAC路由(M1)具有相同的RD。PE3在收到两条混叠路由后,遵循[RFC7432]中规定的路由解析程序;也就是说,它将M1解析为<ES,EVI1>,随后将<ES,EVI1>解析为具有两条路径以及相应的vni/MPLS标签(一条与PE1关联,另一条与PE2关联)的BGP路径列表。应该注意的是,即使两条路径都由相同的BGP下一跳(ASRB2)播发,接收PE3也可以正确处理它们。因此,M1可通过两条路径到达。这为M1创建了两个端到端LSP,从PE3到PE1,从PE3到PE2,这样当PE3想要转发到M1的流量时,它可以在两个LSP之间实现负载平衡。尽管[RFC7432]中没有明确提到具有相同BGP下一跳的别名路由的路由解析,但这是预期的操作;因此,这里对其进行了阐述。
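The recursive resolution described above (MAC to <ES, EVI>, then <ES, EVI> to a path list) can be sketched in Python as follows. The `AliasingRoute` type, the function `resolve_mac`, the RD strings, and the label values 1001 and 1002 are assumptions chosen for illustration; they stand for whatever VNIs/MPLS labels PE3 actually receives.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class AliasingRoute:
    """Simplified Ethernet A-D per EVI route as seen by PE3."""
    rd: str
    esi: str
    evi: str
    next_hop: str
    label: int   # VNI/MPLS label received with the route (hypothetical values)


def resolve_mac(mac_to_esi: Dict[Tuple[str, str], str],
                aliasing: List[AliasingRoute],
                evi: str, mac: str) -> List[Tuple[str, int]]:
    """Resolve a MAC to <ES, EVI>, then <ES, EVI> to its list of paths."""
    esi = mac_to_esi[(evi, mac)]                      # step 1: MAC -> <ES, EVI>
    return [(r.next_hop, r.label)                     # step 2: <ES, EVI> -> paths
            for r in aliasing if (r.esi, r.evi) == (esi, evi)]


# Both routes arrive with next hop ASBR2 but keep distinct RDs and labels.
aliasing_routes = [
    AliasingRoute("RD-PE1", "ES1", "EVI1", "ASBR2", 1001),
    AliasingRoute("RD-PE2", "ES1", "EVI1", "ASBR2", 1002),
]
m1_paths = resolve_mac({("EVI1", "M1"): "ES1"}, aliasing_routes, "EVI1", "M1")
```

`m1_paths` contains two entries that share the next hop ASBR2 yet differ in label, so PE3 can still load-balance across the two end-to-end LSPs.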
When the AC between the PE2 and the CE fails and PE2 sends NLRI withdrawal for Ethernet A-D per EVI routes, and these withdrawals get propagated and received by the PE3, the PE3 removes the Aliasing route and updates the path list; that is, it removes the path corresponding to the PE2. Therefore, all the corresponding MAC routes for that <ES, EVI> that point to that path list will now have the updated path list with a single path associated with PE1. This action can be considered to be the mass withdrawal at the per-EVI level. The mass withdrawal at the per-EVI level has a longer convergence time than the mass withdrawal at the per-ES level; however, it is much faster than the convergence time when the withdrawal is done on a per-MAC basis.
当PE2和CE之间的AC发生故障,并且PE2针对每个EVI路由发送以太网A-D的NLRI撤回,并且这些撤回被PE3传播和接收时,PE3移除别名路由并更新路径列表;也就是说,它删除了与PE2对应的路径。因此,指向该路径列表的<ES,EVI>的所有相应MAC路由现在将具有更新的路径列表,其中包含与PE1相关联的单个路径。该行动可视为在每EVI水平上的大规模撤回。每EVI水平的大规模退出比每ES水平的大规模退出具有更长的收敛时间;然而,当在每个MAC的基础上进行提取时,它比收敛时间快得多。
If a PE becomes detached from a given ES, then, in addition to withdrawing its previously advertised Ethernet A-D per ES routes, it MUST also withdraw its previously advertised Ethernet A-D per EVI routes for that ES. For a remote PE that is separated from the withdrawing PE by one or more EVPN inter-AS Option B ASBRs, the withdrawal of the Ethernet A-D per ES routes is not actionable. However, a remote PE is able to correlate a previously advertised Ethernet A-D per EVI route with any MAC/IP Advertisement routes also advertised by the withdrawing PE for that <ES, EVI, BD>. Hence, when it receives the withdrawal of an Ethernet A-D per EVI route, it SHOULD remove the withdrawing PE as a next hop for all MAC addresses associated with that <ES, EVI, BD>.
如果PE与给定ES分离,则除了撤回其先前公布的每个ES的以太网a-D路由外,还必须撤回其先前公布的每个ES的每个EVI的以太网a-D路由。对于通过一个或多个EVPN inter AS选项B ASBR与退出的PE分离的远程PE,每个ES路由的以太网a-D退出是不可操作的。然而,远程PE能够将先前通告的每个EVI路由的以太网a-D与任何MAC/IP通告路由相关联,该MAC/IP通告路由也由撤回的PE为该<ES,EVI,BD>通告。因此,当它接收到每个EVI路由的以太网A-D的撤回时,它应该删除撤回的PE,作为与该<ES,EVI,BD>关联的所有MAC地址的下一跳。
In the previous example, when the AC between PE2 and the CE fails, PE2 will withdraw its Ethernet A-D per ES and per EVI routes. When PE3 receives the withdrawal of an Ethernet A-D per EVI route, it removes PE2 as a valid next hop for all MAC addresses associated with the corresponding <ES, EVI, BD>. Therefore, all the MAC next hops for that <ES, EVI, BD> will now have a single next hop, viz. the LSP to PE1.
在前面的示例中,当PE2和CE之间的AC发生故障时,PE2将根据ES和EVI路由撤回其以太网A-D。当PE3收到每个EVI路由的以太网A-D的撤回时,它将PE2作为与相应的<ES,EVI,BD>相关联的所有MAC地址的有效下一跳删除。因此,<ES,EVI,BD>的所有MAC下一跳现在将有一个下一跳,即。从LSP到PE1。
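The per-EVI fallback can be illustrated with a shared path list. In the following Python sketch, all names (`withdraw_ad_per_evi`, `paths_for_mac`, the RD strings, and the second MAC address M2) are hypothetical, and the BD component of <ES, EVI, BD> is omitted for brevity. Every MAC on the segment points at the same path list, so one withdrawal updates all of them.

```python
from typing import Dict, List, Tuple

EsEvi = Tuple[str, str]


def withdraw_ad_per_evi(path_lists: Dict[EsEvi, Dict[str, str]],
                        es_evi: EsEvi, rd: str) -> None:
    """Remove the path contributed by the withdrawn Ethernet A-D per EVI route."""
    path_lists.get(es_evi, {}).pop(rd, None)


def paths_for_mac(mac_to_es_evi: Dict[str, EsEvi],
                  path_lists: Dict[EsEvi, Dict[str, str]],
                  mac: str) -> List[str]:
    """Return the current next hops for a MAC via its <ES, EVI> path list."""
    return sorted(path_lists.get(mac_to_es_evi[mac], {}).values())


# Path list for <ES1, EVI1>, keyed by the RD of each Aliasing route.
path_lists = {("ES1", "EVI1"): {"RD-PE1": "LSP-to-PE1", "RD-PE2": "LSP-to-PE2"}}
mac_to_es_evi = {"M1": ("ES1", "EVI1"), "M2": ("ES1", "EVI1")}

before = paths_for_mac(mac_to_es_evi, path_lists, "M1")
withdraw_ad_per_evi(path_lists, ("ES1", "EVI1"), "RD-PE2")   # AC at PE2 failed
after = {mac: paths_for_mac(mac_to_es_evi, path_lists, mac) for mac in mac_to_es_evi}
```

After the single withdrawal, both M1 and M2 resolve to the LSP to PE1 only, without any per-MAC withdrawal having been processed.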
In summary, it can be seen that Aliasing (and Backup Path) functionality should work as is for inter-AS Option B without requiring any additional functionality in ASBRs or PEs. However, the mass withdrawal functionality falls back from per-ES mode to per-EVI mode for inter-AS Option B. That is, PEs receiving a mass withdrawal route from the same AS take action on Ethernet A-D per ES route; whereas, PEs receiving mass withdrawal routes from different ASes take action on the Ethernet A-D per EVI route.
总之,可以看出,别名(和备份路径)功能应按as选项B的原样工作,而不需要ASBR或PEs中的任何附加功能。然而,对于inter-AS选项B,大规模撤回功能从每ES模式退回到每EVI模式。即,PEs接收来自同一AS的大规模撤回路由,并根据ES路由在以太网a-D上采取行动;然而,接收来自不同ASE的大规模撤回路由的PE根据EVI路由在以太网A-D上采取行动。
This document uses IP-based tunnel technologies to support data-plane transport. Consequently, the security considerations of those tunnel technologies apply. This document defines support for VXLAN [RFC7348] and NVGRE encapsulations [RFC7637]. The security considerations from those RFCs apply to the data-plane aspects of this document.
本文档使用基于IP的隧道技术来支持数据平面传输。因此,适用这些隧道技术的安全考虑。本文档定义了对VXLAN[RFC7348]和NVGRE封装[RFC7637]的支持。这些RFC的安全注意事项适用于本文档的数据平面方面。
As with [RFC5512], any modification of the information that is used to form encapsulation headers, to choose a tunnel type, or to choose a particular tunnel for a particular payload type may lead to user data packets getting misrouted, misdelivered, and/or dropped.
与[RFC5512]一样,对用于形成封装头、选择隧道类型或为特定有效负载类型选择特定隧道的信息的任何修改都可能导致用户数据包被错误路由、错误发送和/或丢弃。
More broadly, the security considerations for the transport of IP reachability information using BGP are discussed in [RFC4271] and [RFC4272] and are equally applicable for the extensions described in this document.
更广泛地说,[RFC4271]和[RFC4272]中讨论了使用BGP传输IP可达性信息的安全注意事项,它们同样适用于本文档中描述的扩展。
This document registers the following in the "BGP Tunnel Encapsulation Attribute Tunnel Types" registry.
本文档在“BGP隧道封装属性隧道类型”注册表中注册以下内容。
Value    Name
-----    ------------------------
8        VXLAN Encapsulation
9        NVGRE Encapsulation
10       MPLS Encapsulation
11       MPLS in GRE Encapsulation
12       VXLAN GPE Encapsulation
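An implementation might represent these registry values as an enumeration. The Python sketch below is one possible rendering; the class name `BgpTunnelType` and the member names are hypothetical, while the numeric values are those registered above.

```python
from enum import IntEnum


class BgpTunnelType(IntEnum):
    """Tunnel types registered by this document (member names are illustrative)."""
    VXLAN = 8
    NVGRE = 9
    MPLS = 10
    MPLS_IN_GRE = 11
    VXLAN_GPE = 12


def tunnel_type_name(value: int) -> str:
    """Map a received tunnel type value to a name, or flag it as unrecognized."""
    try:
        return BgpTunnelType(value).name
    except ValueError:
        return "UNRECOGNIZED"
```

For instance, `tunnel_type_name(8)` gives `"VXLAN"`, and a value outside the table gives `"UNRECOGNIZED"`.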
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, <https://www.rfc-editor.org/info/rfc2119>.
[RFC2119]Bradner,S.,“RFC中用于表示需求水平的关键词”,BCP 14,RFC 2119,DOI 10.17487/RFC2119,1997年3月<https://www.rfc-editor.org/info/rfc2119>.
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017, <https://www.rfc-editor.org/info/rfc8174>.
[RFC8174]Leiba,B.,“RFC 2119关键词中大写与小写的歧义”,BCP 14,RFC 8174,DOI 10.17487/RFC8174,2017年5月<https://www.rfc-editor.org/info/rfc8174>.
[RFC7432] Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A., Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based Ethernet VPN", RFC 7432, DOI 10.17487/RFC7432, February 2015, <https://www.rfc-editor.org/info/rfc7432>.
[RFC7432]Sajassi,A.,Ed.,Aggarwal,R.,Bitar,N.,Isaac,A.,Uttaro,J.,Drake,J.,和W.Henderickx,“基于BGP MPLS的以太网VPN”,RFC 7432,DOI 10.17487/RFC7432,2015年2月<https://www.rfc-editor.org/info/rfc7432>.
[RFC7348] Mahalingam, M., Dutt, D., Duda, K., Agarwal, P., Kreeger, L., Sridhar, T., Bursell, M., and C. Wright, "Virtual eXtensible Local Area Network (VXLAN): A Framework for Overlaying Virtualized Layer 2 Networks over Layer 3 Networks", RFC 7348, DOI 10.17487/RFC7348, August 2014, <https://www.rfc-editor.org/info/rfc7348>.
[RFC7348]Mahalingam,M.,Dutt,D.,Duda,K.,Agarwal,P.,Kreeger,L.,Sridhar,T.,Bursell,M.,和C.Wright,“虚拟可扩展局域网(VXLAN):在第3层网络上覆盖虚拟化第2层网络的框架”,RFC 7348,DOI 10.17487/RFC7348,2014年8月<https://www.rfc-editor.org/info/rfc7348>.
[RFC5512] Mohapatra, P. and E. Rosen, "The BGP Encapsulation Subsequent Address Family Identifier (SAFI) and the BGP Tunnel Encapsulation Attribute", RFC 5512, DOI 10.17487/RFC5512, April 2009, <https://www.rfc-editor.org/info/rfc5512>.
[RFC5512]Mohapatra,P.和E.Rosen,“BGP封装后续地址族标识符(SAFI)和BGP隧道封装属性”,RFC 5512,DOI 10.17487/RFC5512,2009年4月<https://www.rfc-editor.org/info/rfc5512>.
[RFC4023] Worster, T., Rekhter, Y., and E. Rosen, Ed., "Encapsulating MPLS in IP or Generic Routing Encapsulation (GRE)", RFC 4023, DOI 10.17487/RFC4023, March 2005, <https://www.rfc-editor.org/info/rfc4023>.
[RFC4023]Worster,T.,Rekhter,Y.,和E.Rosen,编辑,“在IP或通用路由封装(GRE)中封装MPLS”,RFC 4023,DOI 10.17487/RFC4023,2005年3月<https://www.rfc-editor.org/info/rfc4023>.
[RFC7637] Garg, P., Ed. and Y. Wang, Ed., "NVGRE: Network Virtualization Using Generic Routing Encapsulation", RFC 7637, DOI 10.17487/RFC7637, September 2015, <https://www.rfc-editor.org/info/rfc7637>.
[RFC7637]Garg,P.,Ed.和Y.Wang,Ed.,“NVGRE:使用通用路由封装的网络虚拟化”,RFC 7637,DOI 10.17487/RFC7637,2015年9月<https://www.rfc-editor.org/info/rfc7637>.
[RFC7209] Sajassi, A., Aggarwal, R., Uttaro, J., Bitar, N., Henderickx, W., and A. Isaac, "Requirements for Ethernet VPN (EVPN)", RFC 7209, DOI 10.17487/RFC7209, May 2014, <https://www.rfc-editor.org/info/rfc7209>.
[RFC7209]Sajassi,A.,Aggarwal,R.,Uttaro,J.,Bitar,N.,Henderickx,W.,和A.Isaac,“以太网VPN(EVPN)的要求”,RFC 7209,DOI 10.17487/RFC7209,2014年5月<https://www.rfc-editor.org/info/rfc7209>.
[RFC4272] Murphy, S., "BGP Security Vulnerabilities Analysis", RFC 4272, DOI 10.17487/RFC4272, January 2006, <https://www.rfc-editor.org/info/rfc4272>.
[RFC4272]Murphy,S.,“BGP安全漏洞分析”,RFC 4272,DOI 10.17487/RFC4272,2006年1月<https://www.rfc-editor.org/info/rfc4272>.
[RFC7364] Narten, T., Ed., Gray, E., Ed., Black, D., Fang, L., Kreeger, L., and M. Napierala, "Problem Statement: Overlays for Network Virtualization", RFC 7364, DOI 10.17487/RFC7364, October 2014, <https://www.rfc-editor.org/info/rfc7364>.
[RFC7364]Narten,T.,Ed.,Gray,E.,Ed.,Black,D.,Fang,L.,Kreeger,L.,和M.Napierala,“问题陈述:网络虚拟化覆盖”,RFC 7364,DOI 10.17487/RFC7364,2014年10月<https://www.rfc-editor.org/info/rfc7364>.
[RFC7365] Lasserre, M., Balus, F., Morin, T., Bitar, N., and Y. Rekhter, "Framework for Data Center (DC) Network Virtualization", RFC 7365, DOI 10.17487/RFC7365, October 2014, <https://www.rfc-editor.org/info/rfc7365>.
[RFC7365]Lasserre,M.,Balus,F.,Morin,T.,Bitar,N.,和Y.Rekhter,“数据中心(DC)网络虚拟化框架”,RFC 7365,DOI 10.17487/RFC7365,2014年10月<https://www.rfc-editor.org/info/rfc7365>.
[RFC6514] Aggarwal, R., Rosen, E., Morin, T., and Y. Rekhter, "BGP Encodings and Procedures for Multicast in MPLS/BGP IP VPNs", RFC 6514, DOI 10.17487/RFC6514, February 2012, <https://www.rfc-editor.org/info/rfc6514>.
[RFC6514]Aggarwal,R.,Rosen,E.,Morin,T.,和Y.Rekhter,“MPLS/BGP IP VPN中的BGP编码和多播过程”,RFC 6514,DOI 10.17487/RFC6514,2012年2月<https://www.rfc-editor.org/info/rfc6514>.
[RFC4271] Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A Border Gateway Protocol 4 (BGP-4)", RFC 4271, DOI 10.17487/RFC4271, January 2006, <https://www.rfc-editor.org/info/rfc4271>.
[RFC4271]Rekhter,Y.,Ed.,Li,T.,Ed.,和S.Hares,Ed.,“边境网关协议4(BGP-4)”,RFC 4271,DOI 10.17487/RFC4271,2006年1月<https://www.rfc-editor.org/info/rfc4271>.
[RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364, February 2006, <https://www.rfc-editor.org/info/rfc4364>.
[RFC4364]Rosen,E.和Y.Rekhter,“BGP/MPLS IP虚拟专用网络(VPN)”,RFC 4364,DOI 10.17487/RFC4364,2006年2月<https://www.rfc-editor.org/info/rfc4364>.
[TUNNEL-ENCAP] Rosen, E., Ed., Patel, K., and G. Velde, "The BGP Tunnel Encapsulation Attribute", Work in Progress, draft-ietf-idr-tunnel-encaps-09, February 2018.
[TUNNEL-ENCAP]Rosen,E.,Ed.,Patel,K.,和G.Velde,“BGP隧道封装属性”,在建工程草案-ietf-idr-TUNNEL-encaps-092018年2月。
[DCI-EVPN-OVERLAY] Rabadan, J., Ed., Sathappan, S., Henderickx, W., Sajassi, A., and J. Drake, "Interconnect Solution for EVPN Overlay networks", Work in Progress, draft-ietf-bess-dci-evpn-overlay-10, March 2018.
[DCI-EVPN-OVERLAY]Rabadan,J.,Ed.,Sathappan,S.,Henderickx,W.,Sajassi,A.,和J.Drake,“EVPN覆盖网络的互连解决方案”,正在进行中,草案-ietf-bess-DCI-EVPN-OVERLAY-10,2018年3月。
[EVPN-GENEVE] Boutros, S., Sajassi, A., Drake, J., and J. Rabadan, "EVPN control plane for Geneve", Work in Progress, draft-boutros-bess-evpn-geneve-02, March 2018.
[EVPN-GENEVE]Boutros,S.,Sajassi,A.,Drake,J.,和J.Rabadan,“GENEVE的EVPN控制平面”,正在进行的工作,草稿-Boutros-bess-EVPN-GENEVE-022018年3月。
[VXLAN-GPE] Maino, F., Kreeger, L., Ed., and U. Elzur, Ed., "Generic Protocol Extension for VXLAN", Work in Progress, draft-ietf-nvo3-vxlan-gpe-05, October 2017.
[VXLAN-GPE]Maino,F.,Kreeger,L.,Ed.,和U.Elzur,Ed.,“VXLAN的通用协议扩展”,正在进行的工作,草稿-ietf-nvo3-VXLAN-GPE-052017年10月。
[GENEVE] Gross, J., Ed., Ganga, I., Ed., and T. Sridhar, Ed., "Geneve: Generic Network Virtualization Encapsulation", Work in Progress, draft-ietf-nvo3-geneve-06, March 2018.
[GENEVE]Gross,J.,Ed.,Ganga,I.,Ed.,和T.Sridhar,Ed.,“GENEVE:通用网络虚拟化封装”,正在进行的工作,草稿-ietf-nvo3-GENEVE-062018年3月。
[IEEE.802.1Q] IEEE, "IEEE Standard for Local and metropolitan area networks - Bridges and Bridged Networks - Media Access Control (MAC) Bridges and Virtual Bridged Local Area Networks", IEEE Std 802.1Q.
[IEEE.802.1Q]IEEE,“局域网和城域网IEEE标准-网桥和桥接网络-媒体访问控制(MAC)网桥和虚拟桥接局域网”,IEEE标准802.1Q。
Acknowledgements
致谢
The authors would like to thank Aldrin Isaac, David Smith, John Mullooly, Thomas Nadeau, Samir Thoria, and Jorge Rabadan for their valuable comments and feedback. The authors would also like to thank Jakob Heitz for his contribution on Section 10.2.
作者要感谢Aldrin Isaac、David Smith、John Mullooly、Thomas Nadeau、Samir Thoria和Jorge Rabadan的宝贵评论和反馈。作者还要感谢Jakob Heitz对第10.2节的贡献。
Contributors
贡献者
S. Salam K. Patel D. Rao S. Thoria D. Cai Cisco
S.Salam K.Patel D.Rao S.Thoria D.Cai Cisco
Y. Rekhter A. Issac W. Lin N. Sheth Juniper
Y.Rekhter A.Issac W.Lin N.Sheth Juniper
L. Yong Huawei
杨华伟
Authors' Addresses
作者地址
Ali Sajassi (editor) Cisco United States of America
Ali Sajassi(编辑)美国思科公司
Email: sajassi@cisco.com
John Drake (editor) Juniper Networks United States of America
John Drake(编辑)Juniper Networks美利坚合众国
Email: jdrake@juniper.net
Nabil Bitar Nokia United States of America
诺基亚美国公司
Email: nabil.bitar@nokia.com
R. Shekhar Juniper United States of America
美利坚合众国R.Shekhar Juniper
Email: rshekhar@juniper.net
James Uttaro AT&T United States of America
詹姆斯·乌塔罗美国电话电报公司
Email: uttaro@att.com
Wim Henderickx Nokia Copernicuslaan 50 2018 Antwerp Belgium
Wim Henderickx诺基亚哥白尼50 2018比利时安特卫普
Email: wim.henderickx@nokia.com