Internet Engineering Task Force (IETF)                       A. Ghanwani
Request for Comments: 8293                                          Dell
Category: Informational                                        L. Dunbar
ISSN: 2070-1721                                               M. McBride
                                                               V. Bannai
                                                             R. Krishnan
                                                            January 2018

A Framework for Multicast in Network Virtualization over Layer 3




This document provides a framework for supporting multicast traffic in a network that uses Network Virtualization over Layer 3 (NVO3). Both infrastructure multicast and application-specific multicast are discussed. It describes the various mechanisms that can be used for delivering such traffic as well as the data plane and control plane considerations for each of the mechanisms.


Status of This Memo


This document is not an Internet Standards Track specification; it is published for informational purposes.


This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Not all documents approved by the IESG are a candidate for any level of Internet Standard; see Section 2 of RFC 7841.


Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at


Copyright Notice


Copyright (c) 2018 IETF Trust and the persons identified as the document authors. All rights reserved.


This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.


Table of Contents


   1.  Introduction
     1.1.  Infrastructure Multicast
     1.2.  Application-Specific Multicast
   2.  Terminology and Abbreviations
   3.  Multicast Mechanisms in Networks That Use NVO3
     3.1.  No Multicast Support
     3.2.  Replication at the Source NVE
     3.3.  Replication at a Multicast Service Node
     3.4.  IP Multicast in the Underlay
     3.5.  Other Schemes
   4.  Simultaneous Use of More Than One Mechanism
   5.  Other Issues
     5.1.  Multicast-Agnostic NVEs
     5.2.  Multicast Membership Management for DC with VMs
   6.  Security Considerations
   7.  IANA Considerations
   8.  Summary
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Acknowledgments
   Authors' Addresses
1. Introduction

Network Virtualization over Layer 3 (NVO3) [RFC7365] is a technology that is used to address issues that arise in building large, multi-tenant data centers (DCs) that make extensive use of server virtualization [RFC7364].


This document provides a framework for supporting multicast traffic in a network that uses NVO3. Both infrastructure multicast and application-specific multicast are considered. It describes various mechanisms, and the considerations of each of them, that can be used for delivering such traffic in networks that use NVO3.


The reader is assumed to be familiar with the terminology and concepts as defined in the NVO3 Framework [RFC7365] and NVO3 Architecture [RFC8014] documents.


1.1. Infrastructure Multicast

Infrastructure multicast refers to networking services that require multicast or broadcast delivery, such as Address Resolution Protocol (ARP), Neighbor Discovery (ND), Dynamic Host Configuration Protocol (DHCP), multicast DNS (mDNS), etc., some of which are described in Sections 5 and 6 of RFC 3819 [RFC3819]. It is possible to provide solutions for these services that do not involve multicast in the underlay network. For example, in the case of ARP/ND, a Network Virtualization Authority (NVA) can be used for distributing the IP address to Media Access Control (MAC) address mappings to all of the Network Virtualization Edges (NVEs). An NVE can then trap ARP Request and/or ND Neighbor Solicitation messages from the Tenant Systems (TSs) that are attached to it and respond to them, thereby eliminating the need for the broadcast/multicast of such messages. In the case of DHCP, the NVE can be configured to forward these messages using the DHCP relay function [RFC2131].


Of course, it is possible to support all of these infrastructure multicast protocols natively if the underlay provides multicast transport. However, even in the presence of multicast transport, it may be beneficial to use the optimizations mentioned above to reduce the amount of such traffic in the network.
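As an illustration of the ARP optimization described in this section, the following minimal Python sketch shows an NVE answering ARP Requests locally from mappings distributed by an NVA. The class and method names (NveArpProxy, install_mapping, handle_arp_request) are hypothetical and are not defined by any NVO3 document; a real NVE would also need to handle ND, mapping withdrawal, and per-VN policy.

```python
from typing import Dict, Optional, Tuple


class NveArpProxy:
    """Hypothetical NVE-side table of (VN, IP) -> MAC mappings learned from an NVA."""

    def __init__(self) -> None:
        self._mappings: Dict[Tuple[int, str], str] = {}

    def install_mapping(self, vn_id: int, ip: str, mac: str) -> None:
        # Called when the NVA distributes an IP-to-MAC mapping to this NVE.
        self._mappings[(vn_id, ip)] = mac

    def handle_arp_request(self, vn_id: int, target_ip: str) -> Optional[str]:
        # Return the MAC address to place in a locally generated ARP Reply,
        # or None if the NVA has not supplied a mapping for this target.
        return self._mappings.get((vn_id, target_ip))


proxy = NveArpProxy()
proxy.install_mapping(vn_id=100, ip="10.0.0.5", mac="52:54:00:aa:bb:05")
print(proxy.handle_arp_request(100, "10.0.0.5"))  # known target: answered locally
print(proxy.handle_arp_request(100, "10.0.0.9"))  # unknown target: None
```

Because the request is answered at the NVE to which the TS is attached, no broadcast copy of the ARP Request needs to cross the underlay in this sketch.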


1.2. Application-Specific Multicast

Application-specific multicast traffic refers to multicast traffic that is originated and consumed by user applications. Several such applications are described elsewhere [DC-MC]. Application-specific multicast may be either Source-Specific Multicast (SSM) or Any-Source Multicast (ASM) [RFC3569] and has the following characteristics:


1. Receiver hosts are expected to subscribe to multicast content using protocols such as IGMP [RFC3376] (IPv4) or Multicast Listener Discovery (MLD) [RFC2710] (IPv6). Multicast sources and listeners participate in these protocols using addresses that are in the TS address domain.


2. The set of multicast listeners for each multicast group may not be known in advance. Therefore, it may not be possible or practical for an NVA to get the list of participants for each multicast group ahead of time.


2. Terminology and Abbreviations

In this document, the terms host, Tenant System (TS), and Virtual Machine (VM) are used interchangeably to represent an end station that originates or consumes data packets.


ASM: Any-Source Multicast


IGMP: Internet Group Management Protocol


LISP: Locator/ID Separation Protocol


MSN: Multicast Service Node


RLOC: Routing Locator


NVA: Network Virtualization Authority


NVE: Network Virtualization Edge


NVGRE: Network Virtualization using GRE


PIM: Protocol-Independent Multicast


SSM: Source-Specific Multicast


TS: Tenant System


VM: Virtual Machine


VN: Virtual Network


VTEP: VXLAN Tunnel End Point


VXLAN: Virtual eXtensible LAN


3. Multicast Mechanisms in Networks That Use NVO3

In NVO3 environments, traffic between NVEs is transported using an encapsulation such as VXLAN [RFC7348] [VXLAN-GPE], Network Virtualization using Generic Routing Encapsulation (NVGRE) [RFC7637], Geneve [Geneve], Generic UDP Encapsulation [GUE], etc.


What makes networks using NVO3 different from other networks is that some NVEs, especially NVEs implemented in servers, might not support regular multicast protocols such as PIM. Instead, the only capability they may support would be that of encapsulating data packets from VMs with an outer unicast header. Therefore, it is important for networks using NVO3 to have mechanisms to support multicast as a network capability for NVEs, to map multicast traffic from VMs (users/applications) to an equivalent multicast capability inside the NVE, or to figure out the outer destination address if the NVE does not support native multicast (e.g., PIM) or IGMP.


With NVO3, there are many possible ways that multicast may be handled in such networks. We discuss some of the attributes of the following four methods:


1. No multicast support


2. Replication at the source NVE


3. Replication at a multicast service node


4. IP multicast in the underlay


These methods are briefly mentioned in the NVO3 Framework [RFC7365] and NVO3 Architecture [RFC8014] documents. This document provides more details about the basic mechanisms underlying each of these methods and discusses the issues and trade-offs of each.


We note that other methods are also possible, such as [EDGE-REP], but we focus on the above four because they are the most common.


It is worth noting that when selecting a multicast mechanism, it is useful to consider the impact of these on any multicast congestion control mechanisms that applications may be using to obtain the desired system dynamics. In addition, the same rules for Explicit Congestion Notification (ECN) would apply to multicast traffic being encapsulated, as for unicast traffic [RFC6040].


3.1. No Multicast Support

In this scenario, there is no support whatsoever for multicast traffic when using the overlay. This method can only work if the following conditions are met:


1. All of the application traffic in the network is unicast traffic, and the only multicast/broadcast traffic is from ARP/ND protocols.


2. An NVA is used by all of the NVEs to determine the mapping of a given TS's MAC and IP address to the NVE that it is attached to. In other words, there is no data-plane learning. Address resolution requests via ARP/ND that are issued by the TSs must be resolved by the NVE that they are attached to.


With this approach, it is not possible to support application-specific multicast. However, certain multicast/broadcast applications can be supported without multicast; for example, DHCP, which can be supported by use of DHCP relay function [RFC2131].


The main drawback of this approach, even for unicast traffic, is that it is not possible to initiate communication with a TS for which a mapping to an NVE does not already exist at the NVA. This is a problem in the case where the NVE is implemented in a physical switch and the TS is a physical end station that has not registered with the NVA.


3.2. Replication at the Source NVE

With this method, the overlay attempts to provide a multicast service without requiring any specific support from the underlay, other than that of a unicast service. A multicast or broadcast transmission is achieved by replicating the packet at the source NVE and making copies, one for each destination NVE that the multicast packet must be sent to.


For this mechanism to work, the source NVE must know, a priori, the IP addresses of all destination NVEs that need to receive the packet. For the purpose of ARP/ND, this would involve knowing the IP addresses of all the NVEs that have TSs in the VN of the TS that generated the request.


For the support of application-specific multicast traffic, a method similar to that of receiver-sites registration for a particular multicast group, described in [LISP-Signal-Free], can be used. The registrations from different receiver sites can be merged at the NVA, which can construct a multicast replication list inclusive of all NVEs to which receivers for a particular multicast group are attached. The replication list for each specific multicast group is maintained by the NVA. Note that using receiver-sites registration does not necessarily mean the source NVE must do replication. If the NVA indicates that multicast packets are encapsulated to multicast service nodes, then there would be no replication at the NVE.


The receiver-sites registration is achieved by egress NVEs performing IGMP/MLD snooping to maintain state about which attached TSs have subscribed to a given IP multicast group. When the members of a multicast group are outside the NVO3 domain, it is necessary for NVO3 gateways to keep track of the remote members of each multicast group. The NVEs and NVO3 gateways then communicate the multicast groups that are of interest to the NVA. If the membership is not communicated to the NVA, and if it is necessary to prevent TSs attached to an NVE that have not subscribed to a multicast group from receiving the multicast traffic, the NVE would need to maintain multicast group membership information.


In the absence of IGMP/MLD snooping, the traffic would be delivered to all TSs that are part of the VN.
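The merging of receiver-site registrations at the NVA can be sketched as follows. This is a minimal illustration under stated assumptions, not a protocol definition: the class NvaReplicationLists, its methods, and the use of a (VN identifier, group address) key are hypothetical, and the sketch ignores NVO3 gateways and registration timeouts.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

GroupKey = Tuple[int, str]  # (VN identifier, multicast group address)


class NvaReplicationLists:
    """Hypothetical NVA state: per-group set of NVEs with attached receivers."""

    def __init__(self) -> None:
        self._lists: Dict[GroupKey, Set[str]] = defaultdict(set)

    def register(self, vn_id: int, group: str, nve_ip: str) -> None:
        # An egress NVE reports (e.g., after IGMP/MLD snooping) that it has
        # at least one attached TS subscribed to the group.
        self._lists[(vn_id, group)].add(nve_ip)

    def deregister(self, vn_id: int, group: str, nve_ip: str) -> None:
        key = (vn_id, group)
        self._lists[key].discard(nve_ip)
        if not self._lists[key]:
            del self._lists[key]  # drop empty lists

    def replication_list(self, vn_id: int, group: str, source_nve: str) -> List[str]:
        # The list handed to a source NVE excludes the source NVE itself.
        members = self._lists.get((vn_id, group), set())
        return sorted(members - {source_nve})


nva = NvaReplicationLists()
nva.register(100, "239.1.1.1", "192.0.2.1")
nva.register(100, "239.1.1.1", "192.0.2.2")
nva.register(100, "239.1.1.1", "192.0.2.2")  # duplicate registrations merge
print(nva.replication_list(100, "239.1.1.1", source_nve="192.0.2.1"))
```

Keeping the list per multicast group, rather than per VN, is what allows delivery to be limited to NVEs that actually have subscribed TSs.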


In multihoming environments, i.e., in those where a TS is attached to more than one NVE, the NVA would be expected to provide information to all of the NVEs under its control about all of the NVEs to which such a TS is attached. The ingress NVE can then choose any one of those NVEs as the egress NVE for the data frames destined towards the multi-homed TS.


This method requires the source NVE to send a separate copy of the same packet to each NVE that participates in the VN. If, for example, a tenant subnet is spread across 50 NVEs, the packet would have to be replicated 50 times at the source NVE. This approach adds traffic to the network, which can cause congestion when the network load is high. It also affects the forwarding performance of the NVE.
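The replication cost at the source NVE can be illustrated with a short Python sketch. The function name replicate_at_source and the tuple representation of an encapsulated packet are assumptions made for illustration only; the addresses come from a documentation prefix.

```python
from typing import Iterable, List, Tuple

Packet = Tuple[str, str, bytes]  # (outer source, outer destination, inner frame)


def replicate_at_source(inner_frame: bytes, source_nve: str,
                        destination_nves: Iterable[str]) -> List[Packet]:
    """Build one unicast-encapsulated copy per destination NVE."""
    return [(source_nve, dst, inner_frame)
            for dst in destination_nves
            if dst != source_nve]


# A tenant subnet with receivers behind 50 remote NVEs (hypothetical addresses).
remote_nves = [f"192.0.2.{i}" for i in range(1, 51)]
copies = replicate_at_source(b"inner-multicast-frame", "192.0.2.254", remote_nves)
print(len(copies))  # 50 copies leave the source NVE for a single inner packet
```

Every copy crosses the link between the source NVE and the underlay, so the load on that link grows linearly with the number of destination NVEs.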


Note that this method is similar to what was used in Virtual Private LAN Service (VPLS) [RFC4762] prior to support of Multiprotocol Label Switching (MPLS) multicast [RFC7117]. While there are some similarities between MPLS Virtual Private Network (VPN) and NVO3, there are some key differences:


o The attachment from Customer Edge (CE) to Provider Edge (PE) in VPNs is somewhat static, whereas in a DC that allows VMs to migrate anywhere, the TS attachment to NVE is much more dynamic.


o The number of PEs to which a single VPN customer is attached in an MPLS VPN environment is normally far less than the number of NVEs to which a VN's VMs are attached in a DC.


When a VPN customer has multiple multicast groups, "Multicast VPN" [RFC6513] combines all those multicast groups within each VPN client to one single multicast group in the MPLS (or VPN) core. The result is that messages from any of the multicast groups belonging to one VPN customer will reach all the PE nodes of the client. In other words, any messages belonging to any multicast groups under customer X will reach all PEs of the customer X. When the customer X is attached to only a handful of PEs, the use of this approach does not result in an excessive waste of bandwidth in the provider's network.


In a DC environment, a typical hypervisor-based virtual switch may only support on the order of tens of VMs (as of this writing). A subnet with N VMs may be, in the worst case, spread across N virtual switches (vSwitches). Using an "MPLS VPN multicast" approach in such a scenario would require the creation of a multicast group in the core in order for the VN to reach all N NVEs. If only a small percentage of this client's VMs participate in application-specific multicast, a great number of NVEs will receive multicast traffic that is not forwarded to any of their attached VMs, resulting in a considerable waste of bandwidth.


Therefore, the multicast VPN solution may not scale in a DC environment with the dynamic attachment of VNs to NVEs and a greater number of NVEs for each VN.


3.3. Replication at a Multicast Service Node

With this method, all multicast packets would be sent using a unicast tunnel encapsulation from the ingress NVE to a Multicast Service Node (MSN). The MSN, in turn, would create multiple copies of the packet and would deliver a copy, using a unicast tunnel encapsulation, to each of the NVEs that are part of the multicast group for which the packet is intended.


This mechanism is similar to that used by the Asynchronous Transfer Mode (ATM) Forum's LAN Emulation (LANE) specification [LANE]. The MSN is similar to the Rendezvous Point (RP) in Protocol Independent Multicast - Sparse Mode (PIM-SM), but different in that the user data traffic is carried by the NVO3 tunnels.


The following are possible ways for the MSN to get the membership information for each multicast group:


o The MSN can obtain this membership information from the IGMP/MLD report messages sent by TSs in response to IGMP/MLD query messages from the MSN. The IGMP/MLD query messages are sent from the MSN to the NVEs, which then forward the query messages to TSs attached to them. An IGMP/MLD query message sent out by the MSN to an NVE is encapsulated with the MSN address in the outer IP source address field and the address of the NVE in the outer IP destination address field. An encapsulated IGMP/MLD query message also has a virtual network (VN) identifier (corresponding to the VN that the TSs belong to) in the outer header and a multicast address in the inner IP destination address field. Upon receiving the encapsulated IGMP/MLD query message, the NVE establishes a mapping for "MSN address" to "multicast address", decapsulates the received encapsulated IGMP/MLD message, and multicasts the decapsulated query message to the TSs that belong to the VN attached to that NVE. An IGMP/MLD report message sent by a TS includes the multicast address and the address of the TS. With the proper "MSN address" to "multicast address" mapping, the NVEs can encapsulate all multicast data frames containing the "multicast address" with the address of the MSN in the outer IP destination address field.


o The MSN can obtain the membership information from the NVEs that have the capability to establish multicast groups by snooping native IGMP/MLD messages (note that the communication must be specific to the multicast addresses) or by having the NVA obtain the information from the NVEs and in turn have the MSN communicate with the NVA. This approach requires an additional protocol between the MSN and the NVEs.


Unlike the method described in Section 3.2, there is no performance impact at the ingress NVE, nor are there any issues with multiple copies of the same packet from the source NVE to the MSN. However, there remain issues with multiple copies of the same packet on links that are common to the paths from the MSN to each of the egress NVEs. Additional issues that are introduced with this method include the availability of the MSN, methods to scale the services offered by the MSN, and the suboptimality of the delivery paths.


Finally, the IP address of the source NVE must be preserved in packet copies created at the multicast service node if data-plane learning is in use. This could create problems if IP source address Reverse Path Forwarding (RPF) checks are in use.
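The forwarding behavior described in this section can be sketched as follows. The class MulticastServiceNode and the function send_to_msn are hypothetical names, and membership is simply installed by method call rather than learned from IGMP/MLD reports. Note that this sketch uses the MSN's own address as the outer source of the copies, which assumes that data-plane learning is not in use.

```python
from typing import Dict, List, Set, Tuple

Packet = Tuple[str, str, bytes]  # (outer source, outer destination, inner frame)


class MulticastServiceNode:
    """Hypothetical MSN holding per-(VN, group) sets of member NVE addresses."""

    def __init__(self, address: str) -> None:
        self.address = address
        self._members: Dict[Tuple[int, str], Set[str]] = {}

    def add_member(self, vn_id: int, group: str, nve: str) -> None:
        self._members.setdefault((vn_id, group), set()).add(nve)

    def replicate(self, vn_id: int, group: str, ingress_nve: str,
                  inner_frame: bytes) -> List[Packet]:
        # One unicast-encapsulated copy per member NVE, skipping the ingress NVE.
        members = self._members.get((vn_id, group), set())
        return [(self.address, nve, inner_frame)
                for nve in sorted(members) if nve != ingress_nve]


def send_to_msn(ingress_nve: str, msn: MulticastServiceNode,
                inner_frame: bytes) -> Packet:
    # The ingress NVE sends exactly one unicast copy, addressed to the MSN.
    return (ingress_nve, msn.address, inner_frame)


msn = MulticastServiceNode("192.0.2.100")
for nve in ("192.0.2.1", "192.0.2.2", "192.0.2.3"):
    msn.add_member(100, "239.1.1.1", nve)

to_msn = send_to_msn("192.0.2.1", msn, b"frame")
from_msn = msn.replicate(100, "239.1.1.1", "192.0.2.1", b"frame")
print(to_msn[1], [dst for _, dst, _ in from_msn])
```

The replication work moves from the ingress NVE to the MSN, which is why the availability and scaling of the MSN become the main concerns.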


3.4. IP Multicast in the Underlay

In this method, the underlay supports IP multicast and the ingress NVE encapsulates the packet with the appropriate IP multicast address in the tunnel encapsulation header for delivery to the desired set of NVEs. The protocol in the underlay could be any variant of PIM, or a protocol-dependent multicast, such as [ISIS-Multicast].


If an NVE connects to its attached TSs via a Layer 2 network, there are multiple ways for NVEs to support the application-specific multicast:


o The NVE only supports the basic IGMP/MLD snooping function, while the "TS routers" handle the application-specific multicast. This scheme doesn't utilize the underlay IP multicast protocols. Instead routers, which are themselves TSs attached to the NVE, would handle multicast protocols for the application-specific multicast. We refer to such routers as TS routers.


o The NVE can act as a pseudo multicast router for the directly attached TSs and support the mapping of IGMP/MLD messages to the messages needed by the underlay IP multicast protocols.


With this method, there are none of the issues with the methods described in Sections 3.2 and 3.3 with respect to scaling and congestion. Instead, there are other issues described below.


With PIM-SM, the number of flows required would be (n*g), where n is the number of source NVEs that source packets for the group, and g is the number of groups. Bidirectional PIM (BIDIR-PIM) would offer better scalability with the number of flows required being g. Unfortunately, many vendors still do not fully support BIDIR or have limitations on its implementation. [RFC6831] describes the use of SSM as an alternative to BIDIR, provided that the NVEs have a way to learn of each other's IP addresses so that they can join all of the SSM Shortest Path Trees (SPTs) to create/maintain an underlay SSM IP multicast tunnel solution.
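As a worked example of the flow counts above, the following Python sketch evaluates both expressions. The function names and the sample values (50 source NVEs, 20 groups) are illustrative assumptions, not figures from any deployment.

```python
def pim_sm_flows(source_nves: int, groups: int) -> int:
    """Flows with PIM-SM: one per (source NVE, group) pair, i.e. n*g."""
    return source_nves * groups


def bidir_pim_flows(groups: int) -> int:
    """Flows with BIDIR-PIM: one shared tree per group, i.e. g."""
    return groups


n, g = 50, 20
print(pim_sm_flows(n, g))   # 1000
print(bidir_pim_flows(g))   # 20
```

The gap between the two grows with the number of source NVEs, which is why BIDIR-PIM is described as scaling better.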


In the absence of any additional mechanism (e.g., using an NVA for address resolution), for optimal delivery, there would have to be a separate group for each VN for infrastructure multicast plus a separate group for each application-specific multicast address within a tenant.


An additional consideration is that only the lower 23 bits of the IP address (regardless of whether IPv4 or IPv6 is in use) are mapped to the outer MAC address, and if there is equipment that prunes multicasts at Layer 2, there will be some aliasing.
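The aliasing can be demonstrated for the IPv4 case with a short Python sketch. It assumes the conventional IPv4-to-Ethernet mapping, in which the low-order 23 bits of the group address are placed under the 01:00:5e prefix; the function name ipv4_multicast_mac is hypothetical.

```python
import ipaddress


def ipv4_multicast_mac(group: str) -> str:
    """Map an IPv4 multicast group address to its Ethernet multicast MAC."""
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not an IPv4 multicast address")
    low23 = int(addr) & 0x7FFFFF        # keep only the lower 23 bits
    mac = 0x01005E000000 | low23        # fixed 01:00:5e prefix
    return ":".join(f"{(mac >> shift) & 0xFF:02x}" for shift in range(40, -1, -8))


# Two different outer group addresses that share their lower 23 bits.
print(ipv4_multicast_mac("239.1.1.1"))    # 01:00:5e:01:01:01
print(ipv4_multicast_mac("224.129.1.1"))  # 01:00:5e:01:01:01
```

Equipment that prunes on the MAC address alone cannot tell these two groups apart, so a provisioning scheme may want to avoid assigning outer group addresses that collide in this way.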


Finally, a mechanism to efficiently provision such addresses for each group would be required.


There are additional optimizations that are possible, but they come with their own restrictions. For example, a set of tenants may be restricted to some subset of NVEs, and they could all share the same outer IP multicast group address. This, however, introduces a problem of suboptimal delivery (even if a particular tenant within the group of tenants doesn't have a presence on one of the NVEs that another one does, the multicast packets would still be delivered to that NVE). It also introduces an additional network management burden to optimize which tenants should be part of the same tenant group (based on the NVEs they share), which somewhat dilutes the value proposition of NVO3 (to completely decouple the overlay and physical network design allowing complete freedom of placement of VMs anywhere within the DC).


Multicast schemes such as Bit Indexed Explicit Replication (BIER) [RFC8279] may be able to provide optimizations by allowing the underlay network to provide optimum multicast delivery without requiring routers in the core of the network to maintain per-multicast group state.


3.5. Other Schemes

There are still other mechanisms that may be used that attempt to combine some of the advantages of the above methods by offering multiple replication points, each with a limited degree of replication [EDGE-REP]. Such schemes offer a trade-off between the amount of replication at an intermediate node (e.g., router) versus performing all of the replication at the source NVE or all of the replication at a multicast service node.


4. Simultaneous Use of More Than One Mechanism

While the mechanisms discussed in the previous section have been discussed individually, it is possible for implementations to rely on more than one of these. For example, the method of Section 3.1 could be used for minimizing ARP/ND, while at the same time, multicast applications may be supported by one, or a combination, of the other methods. For small multicast groups, the methods of source NVE replication or the use of a multicast service node may be attractive, while for larger multicast groups, the use of multicast in the underlay may be preferable.
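One way an implementation might combine mechanisms is to select per multicast group based on the number of receiver NVEs. The sketch below is only an illustration: the function name choose_mechanism, the returned labels, and the threshold of 8 NVEs are all assumptions, and this document does not define any such threshold.

```python
def choose_mechanism(receiver_nves: int, underlay_multicast_available: bool,
                     small_group_limit: int = 8) -> str:
    """Pick a delivery mechanism for one multicast group (illustrative policy)."""
    if receiver_nves <= small_group_limit:
        # Small groups: the replication cost at the source NVE is modest.
        return "source-nve-replication"
    if underlay_multicast_available:
        # Larger groups: let the underlay replicate the packets.
        return "underlay-ip-multicast"
    # Larger groups without underlay multicast: offload replication to an MSN.
    return "multicast-service-node"


print(choose_mechanism(3, True))
print(choose_mechanism(200, True))
print(choose_mechanism(200, False))
```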


5. Other Issues

5.1. Multicast-Agnostic NVEs

Some hypervisor-based NVEs do not process or recognize IGMP/MLD frames, i.e., those NVEs simply encapsulate the IGMP/MLD messages in the same way as they do for regular data frames.


By default, a TS router periodically sends IGMP/MLD query messages to all the hosts in the subnet to trigger the hosts that are interested in the multicast stream to send back IGMP/MLD reports. In order for the MSN to get the updated multicast group information, the MSN can also send the IGMP/MLD query message comprising a client-specific multicast address encapsulated in an overlay header to all of the NVEs to which the TSs in the VN are attached.


However, the MSN may not always be aware of the client-specific multicast addresses. In order to perform multicast filtering, the MSN has to snoop the IGMP/MLD messages between TSs and their corresponding routers to maintain the multicast membership. For this snooping to work, the NVA needs to configure the NVE to send copies of the IGMP/MLD messages to the MSN in addition to the default behavior of sending them to the TSs' routers; e.g., the NVA has to inform the NVEs to encapsulate data frames with the Destination Address (DA) being 224.0.0.2 (DA of IGMP report) to the TSs' router and MSN.


This process is similar to "Source Replication" described in Section 3.2, except the NVEs only replicate the message to the TSs' router and MSN.
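The limited replication described above can be sketched in Python. The function name encapsulate_igmp and the tuple representation of an encapsulated packet are hypothetical; the sketch assumes the NVA has already told the NVE which NVE the TS router sits behind and what the MSN address is.

```python
from typing import List, Tuple

Packet = Tuple[str, str, bytes]  # (outer source, outer destination, inner frame)


def encapsulate_igmp(inner_frame: bytes, local_nve: str,
                     ts_router_nve: str, msn_address: str) -> List[Packet]:
    """Return the two unicast copies of an IGMP/MLD message sent by a
    multicast-agnostic NVE: one toward the TS router, one toward the MSN."""
    return [(local_nve, dst, inner_frame)
            for dst in (ts_router_nve, msn_address)]


out = encapsulate_igmp(b"igmp-report", "192.0.2.1", "192.0.2.7", "192.0.2.100")
print([dst for _, dst, _ in out])
```

Unlike full source replication, the number of copies here is fixed at two regardless of how many NVEs participate in the VN.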


5.2. Multicast Membership Management for DC with VMs

For DCs with virtualized servers, VMs can be added, deleted, or moved very easily. When VMs are added, deleted, or moved, the NVEs to which the VMs are attached are changed.


When a VM is deleted from an NVE or a new VM is added to an NVE, the VM management system should notify the MSN to send the IGMP/MLD query messages to the relevant NVEs (as described in Section 3.3) so that the multicast membership can be updated promptly.


Otherwise, if the attachment of VMs to NVEs changes within the configured default time interval that the TS routers use for IGMP/MLD queries, multicast data may not reach the VM(s) that moved.
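The notification flow in this section can be sketched as follows. The classes Msn and VmManager, and their methods, are hypothetical; the sketch records which NVEs would be queried rather than sending real IGMP/MLD messages.

```python
from typing import Dict, List, Tuple


class Msn:
    """Hypothetical MSN stub that records the (VN, NVE) pairs it queries."""

    def __init__(self) -> None:
        self.queries_sent: List[Tuple[int, str]] = []

    def send_query(self, vn_id: int, nve: str) -> None:
        # Stand-in for sending an encapsulated IGMP/MLD query to the NVE.
        self.queries_sent.append((vn_id, nve))


class VmManager:
    """Hypothetical VM management system that notifies the MSN of changes."""

    def __init__(self, msn: Msn) -> None:
        self._msn = msn
        self._attachment: Dict[str, Tuple[int, str]] = {}

    def attach(self, vm: str, vn_id: int, nve: str) -> None:
        old = self._attachment.get(vm)
        self._attachment[vm] = (vn_id, nve)
        if old is not None and old != (vn_id, nve):
            self._msn.send_query(*old)      # VM moved away: refresh old NVE
        self._msn.send_query(vn_id, nve)    # refresh the NVE it is now behind

    def delete(self, vm: str) -> None:
        old = self._attachment.pop(vm, None)
        if old is not None:
            self._msn.send_query(*old)


msn = Msn()
mgr = VmManager(msn)
mgr.attach("vm1", 100, "192.0.2.1")   # new VM
mgr.attach("vm1", 100, "192.0.2.2")   # VM moves to another NVE
mgr.delete("vm1")
print(msn.queries_sent)
```

Triggering queries on each change, instead of waiting for the periodic query interval, is what keeps the membership at the MSN current when VMs move.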


6. Security Considerations

This document does not introduce any new security considerations beyond what is described in the NVO3 Architecture document [RFC8014].


7. IANA Considerations

This document does not require any IANA actions.


8. Summary

This document has identified various mechanisms for supporting application-specific multicast in networks that use NVO3. It highlights the basics of each mechanism and some of the issues with them. As solutions are developed, the protocols would need to consider the use of these mechanisms, and coexistence may be a consideration. It also highlights some of the requirements for supporting multicast applications in an NVO3 network.


9. References
9. 工具书类
9.1. Normative References
9.1. 规范性引用文件

[RFC3376] Cain, B., Deering, S., Kouvelas, I., Fenner, B., and A. Thyagarajan, "Internet Group Management Protocol, Version 3", RFC 3376, DOI 10.17487/RFC3376, October 2002.

[RFC6513] Rosen, E., Ed. and R. Aggarwal, Ed., "Multicast in MPLS/BGP IP VPNs", RFC 6513, DOI 10.17487/RFC6513, February 2012.

[RFC7364] Narten, T., Ed., Gray, E., Ed., Black, D., Fang, L., Kreeger, L., and M. Napierala, "Problem Statement: Overlays for Network Virtualization", RFC 7364, DOI 10.17487/RFC7364, October 2014.

[RFC7365] Lasserre, M., Balus, F., Morin, T., Bitar, N., and Y. Rekhter, "Framework for Data Center (DC) Network Virtualization", RFC 7365, DOI 10.17487/RFC7365, October 2014.

[RFC8014] Black, D., Hudson, J., Kreeger, L., Lasserre, M., and T. Narten, "An Architecture for Data-Center Network Virtualization over Layer 3 (NVO3)", RFC 8014, DOI 10.17487/RFC8014, December 2016.

9.2. Informative References

[RFC2131] Droms, R., "Dynamic Host Configuration Protocol", RFC 2131, DOI 10.17487/RFC2131, March 1997.

[RFC2710] Deering, S., Fenner, W., and B. Haberman, "Multicast Listener Discovery (MLD) for IPv6", RFC 2710, DOI 10.17487/RFC2710, October 1999.

[RFC3569] Bhattacharyya, S., Ed., "An Overview of Source-Specific Multicast (SSM)", RFC 3569, DOI 10.17487/RFC3569, July 2003.

[RFC3819] Karn, P., Ed., Bormann, C., Fairhurst, G., Grossman, D., Ludwig, R., Mahdavi, J., Montenegro, G., Touch, J., and L. Wood, "Advice for Internet Subnetwork Designers", BCP 89, RFC 3819, DOI 10.17487/RFC3819, July 2004.

[RFC4762] Lasserre, M., Ed. and V. Kompella, Ed., "Virtual Private LAN Service (VPLS) Using Label Distribution Protocol (LDP) Signaling", RFC 4762, DOI 10.17487/RFC4762, January 2007.

[RFC6040] Briscoe, B., "Tunnelling of Explicit Congestion Notification", RFC 6040, DOI 10.17487/RFC6040, November 2010.

[RFC6831] Farinacci, D., Meyer, D., Zwiebel, J., and S. Venaas, "The Locator/ID Separation Protocol (LISP) for Multicast Environments", RFC 6831, DOI 10.17487/RFC6831, January 2013.

[RFC7117] Aggarwal, R., Ed., Kamite, Y., Fang, L., Rekhter, Y., and C. Kodeboniya, "Multicast in Virtual Private LAN Service (VPLS)", RFC 7117, DOI 10.17487/RFC7117, February 2014.

[RFC7348] Mahalingam, M., Dutt, D., Duda, K., Agarwal, P., Kreeger, L., Sridhar, T., Bursell, M., and C. Wright, "Virtual eXtensible Local Area Network (VXLAN): A Framework for Overlaying Virtualized Layer 2 Networks over Layer 3 Networks", RFC 7348, DOI 10.17487/RFC7348, August 2014.

[RFC7637] Garg, P., Ed. and Y. Wang, Ed., "NVGRE: Network Virtualization Using Generic Routing Encapsulation", RFC 7637, DOI 10.17487/RFC7637, September 2015.

[RFC8279] Wijnands, IJ., Ed., Rosen, E., Ed., Dolganow, A., Przygienda, T., and S. Aldrin, "Multicast Using Bit Index Explicit Replication (BIER)", RFC 8279, DOI 10.17487/RFC8279, November 2017.

[DC-MC] McBride, M. and H. Liu, "Multicast in the Data Center Overview", Work in Progress, draft-mcbride-armd-mcast-overview-02, July 2012.

[EDGE-REP] Marques, P., Fang, L., Winkworth, D., Cai, Y., and P. Lapukhov, "Edge multicast replication for BGP IP VPNs.", Work in Progress, draft-marques-l3vpn-mcast-edge-01, June 2012.

[Geneve] Gross, J., Ganga, I., and T. Sridhar, "Geneve: Generic Network Virtualization Encapsulation", Work in Progress, draft-ietf-nvo3-geneve-05, September 2017.

[GUE] Herbert, T., Yong, L., and O. Zia, "Generic UDP Encapsulation", Work in Progress, draft-ietf-intarea-gue-05, December 2017.

[ISIS-Multicast] Yong, L., Weiguo, H., Eastlake, D., Qu, A., Hudson, J., and U. Chunduri, "IS-IS Protocol Extension For Building Distribution Trees", Work in Progress, draft-yong-isis-ext-4-distribution-tree-03, October 2014.

[LANE] ATM Forum, "LAN Emulation Over ATM: Version 1.0", ATM Forum Technical Committee, af-lane-0021.000, January 1995.

[LISP-Signal-Free] Moreno, V. and D. Farinacci, "Signal-Free LISP Multicast", Work in Progress, draft-ietf-lisp-signal-free-multicast-07, November 2017.

[VXLAN-GPE] Maino, F., Kreeger, L., and U. Elzur, "Generic Protocol Extension for VXLAN", Work in Progress, draft-ietf-nvo3-vxlan-gpe-05, October 2017.




Acknowledgements

Many thanks are due to Dino Farinacci, Erik Nordmark, Lucy Yong, Nicolas Bouliane, Saumya Dikshit, Joe Touch, Olufemi Komolafe, and Matthew Bocci for their valuable comments and suggestions.


Authors' Addresses


Anoop Ghanwani Dell



Linda Dunbar Huawei Technologies 5340 Legacy Drive, Suite 1750 Plano, TX 75024 United States of America

Phone: (469) 277 5840 Email:


Mike McBride Huawei Technologies



Vinay Bannai Google


Ram Krishnan Dell