Internet Engineering Task Force (IETF)                      G. Camarillo
Request for Comments: 5888                                      Ericsson
Obsoletes: 3388                                           H. Schulzrinne
Category: Standards Track                            Columbia University
ISSN: 2070-1721                                                June 2010
        

The Session Description Protocol (SDP) Grouping Framework

Abstract

In this specification, we define a framework to group "m" lines in the Session Description Protocol (SDP) for different purposes. This framework uses the "group" and "mid" SDP attributes, both of which are defined in this specification. Additionally, we specify how to use the framework for two different purposes: for lip synchronization and for receiving a media flow consisting of several media streams on different transport addresses. This document obsoletes RFC 3388.

Status of This Memo

This is an Internet Standards Track document.

This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Further information on Internet Standards is available in Section 2 of RFC 5741.

Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at http://www.rfc-editor.org/info/rfc5888.

Copyright Notice

Copyright (c) 2010 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Overview of Operation
   4.  Media Stream Identification Attribute
   5.  Group Attribute
   6.  Use of "group" and "mid"
   7.  Lip Synchronization (LS)
     7.1.  Example of LS
   8.  Flow Identification (FID)
     8.1.  SIP and Cellular Access
     8.2.  DTMF Tones
     8.3.  Media Flow Definition
     8.4.  FID Semantics
       8.4.1.  Examples of FID
     8.5.  Scenarios That FID Does Not Cover
       8.5.1.  Parallel Encoding Using Different Codecs
       8.5.2.  Layered Encoding
       8.5.3.  Same IP Address and Port Number
   9.  Usage of the "group" Attribute in SIP
     9.1.  Mid Value in Answers
       9.1.1.  Example
     9.2.  Group Value in Answers
       9.2.1.  Example
     9.3.  Capability Negotiation
       9.3.1.  Example
     9.4.  Backward Compatibility
       9.4.1.  Offerer Does Not Support "group"
       9.4.2.  Answerer Does Not Support "group"
   10. Changes from RFC 3388
   11. Security Considerations
   12. IANA Considerations
   13. Acknowledgments
   14. References
     14.1. Normative References
     14.2. Informative References
        
1. Introduction

RFC 3388 [RFC3388] specified a media-line grouping framework for SDP [RFC4566]. This specification obsoletes RFC 3388 [RFC3388].

An SDP [RFC4566] session description typically contains one or more media lines, which are commonly known as "m" lines. When a session description contains more than one "m" line, SDP does not provide any means to express a particular relationship between two or more of them. When an application receives an SDP session description with more than one "m" line, it is up to the application to determine what to do with them. SDP does not carry any information about grouping media streams.

While in some environments this information can be carried out of band, it is necessary to have a mechanism in SDP to express how different media streams within a session description relate to each other. The framework defined in this specification is such a mechanism.

2. Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

3. Overview of Operation

This section provides a non-normative description of how the SDP Grouping Framework defined in this document works. In a given session description, each "m" line is identified by a token, which is carried in a "mid" attribute below the "m" line. The session description carries session-level "group" attributes that group different "m" lines (identified by their tokens) using different group semantics. The semantics of a group describe the purpose for which the "m" lines are grouped. For example, the "group" line in the session description below indicates that the "m" lines identified by tokens 1 and 2 (the audio and the video "m" lines, respectively) are grouped for the purpose of lip synchronization (LS).

          v=0
          o=Laura 289083124 289083124 IN IP4 one.example.com
          c=IN IP4 192.0.2.1
          t=0 0
          a=group:LS 1 2
          m=audio 30000 RTP/AVP 0
          a=mid:1
          m=video 30002 RTP/AVP 31
          a=mid:2
        
4. Media Stream Identification Attribute

This document defines the "media stream identification" media attribute, which is used for identifying media streams within a session description. Its formatting in SDP [RFC4566] is described by the following Augmented Backus-Naur Form (ABNF) [RFC5234]:

           mid-attribute      = "a=mid:" identification-tag
           identification-tag = token
                                ; token is defined in RFC 4566
        

The identification-tag MUST be unique within an SDP session description.
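
The uniqueness requirement can be checked mechanically. The following is a minimal, non-normative sketch in Python; the function names collect_mids and check_unique are hypothetical and not part of this specification, and the character ranges in TOKEN are the token-char ranges of the RFC 4566 ABNF.

```python
import re

# token = 1*(token-char), using the token-char ranges of the RFC 4566 ABNF.
TOKEN = re.compile(r"[\x21\x23-\x27\x2A\x2B\x2D\x2E\x30-\x39\x41-\x5A\x5E-\x7E]+")


def collect_mids(sdp: str) -> list:
    """Return the identification-tags of all "a=mid:" lines, in order."""
    mids = []
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("a=mid:"):
            tag = line[len("a=mid:"):]
            if not TOKEN.fullmatch(tag):
                raise ValueError(f"identification-tag is not a token: {tag!r}")
            mids.append(tag)
    return mids


def check_unique(mids: list) -> None:
    """Raise ValueError if any identification-tag appears more than once."""
    seen = set()
    for tag in mids:
        if tag in seen:
            raise ValueError(f"duplicate identification-tag: {tag!r}")
        seen.add(tag)
```

A receiver might call check_unique(collect_mids(sdp)) before processing any "a=group" line; how to react to a duplicate is left to the application.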

5. Group Attribute

This document defines the "group" session-level attribute, which is used for grouping together different media streams. Its formatting in SDP is described by the following ABNF [RFC5234]:

           group-attribute     = "a=group:" semantics
                                 *(SP identification-tag)
           semantics           = "LS" / "FID" / semantics-extension
           semantics-extension = token
                                 ; token is defined in RFC 4566
        

This document defines two standard semantics: Lip Synchronization (LS) and Flow Identification (FID). Semantics extensions follow the Standards Action policy [RFC5226].
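
As a non-normative illustration of the ABNF above, the sketch below splits a "group" attribute line into its semantics field and its list of identification-tags. The function name parse_group is hypothetical. The sketch only checks the single-space separators; validating each field against the token production is omitted for brevity.

```python
def parse_group(line: str):
    """Split an "a=group:" line into (semantics, [identification-tags])."""
    prefix = "a=group:"
    if not line.startswith(prefix):
        raise ValueError("not an a=group line")
    # The ABNF separates fields with exactly one SP, so an empty field
    # (double or trailing space) marks a malformed line.
    fields = line[len(prefix):].split(" ")
    semantics, tags = fields[0], fields[1:]
    if semantics == "" or "" in tags:
        raise ValueError("malformed a=group line")
    return semantics, tags
```

For the line "a=group:LS 1 2", this returns the semantics "LS" and the tags "1" and "2". A line with no tags, such as "a=group:FID", is accepted because the ABNF allows zero identification-tags.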

6. Use of "group" and "mid"

All of the "m" lines of a session description that uses "group" MUST be identified with a "mid" attribute whether they appear in the group line(s) or not. If a session description contains at least one "m" line that has no "mid" identification, the application MUST NOT perform any grouping of media lines.

"a=group" lines are used to group together several "m" lines that are identified by their "mid" attribute. "a=group" lines that contain identification-tags that do not correspond to any "m" line within the session description MUST be ignored. The application acts as if the "a=group" line did not exist. The behavior of an application receiving an SDP description with grouped "m" lines is defined by the semantics field in the "a=group" line.

There MAY be several "a=group" lines in a session description. The "a=group" lines of a session description can use the same or different semantics. An "m" line identified by its "mid" attribute MAY appear in more than one "a=group" line.
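
The rules of this section can be sketched as a single pass over a session description. The Python below is a non-normative illustration; the name effective_groups is hypothetical, and the parsing is deliberately simplistic (it assumes well-formed lines and that "a=group" appears only at session level, before the first "m" line).

```python
def effective_groups(sdp: str) -> list:
    """Return the (semantics, tags) groups a receiver would honor."""
    groups = []  # session-level a=group lines, in order of appearance
    mids = []    # one entry per "m" line; None until its a=mid is seen
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            mids.append(None)
        elif line.startswith("a=mid:") and mids:
            mids[-1] = line[len("a=mid:"):]
        elif line.startswith("a=group:") and not mids:
            fields = line[len("a=group:"):].split()
            if fields:
                semantics, *tags = fields
                groups.append((semantics, tags))
    # An "m" line without "mid": no grouping is performed at all.
    if any(mid is None for mid in mids):
        return []
    # A group naming an unknown identification-tag is ignored as a whole.
    known = set(mids)
    return [(s, t) for s, t in groups if all(tag in known for tag in t)]
```

Note that a single "m" line lacking a "mid" attribute disables every group, whereas an unknown identification-tag only disables the "a=group" line that contains it.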

7. Lip Synchronization (LS)

An application that receives a session description that contains "m" lines that are grouped together using LS semantics MUST synchronize the playout of the corresponding media streams. Note that LS semantics apply not only to a video stream that has to be synchronized with an audio stream; the playout of two streams of the same type can be synchronized as well.

For RTP streams, synchronization is typically performed using the RTP Control Protocol (RTCP), which provides enough information to map time stamps from the different streams into a local absolute time value. However, the concept of media stream synchronization MAY also apply to media streams that do not make use of RTP. If this is the case, the application MUST recover the original timing relationship between the streams using whatever mechanism is available.
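
For the RTP case, the mapping can be illustrated as follows. An RTCP sender report carries a wallclock (NTP) timestamp together with the RTP timestamp that corresponds to the same instant, so any later RTP timestamp of that stream can be placed on the sender's wallclock. The sketch below is non-normative; the function name rtp_to_wallclock and its parameter names are hypothetical, and the NTP time is assumed to have been converted to seconds already.

```python
def rtp_to_wallclock(rtp_ts: int, sr_rtp_ts: int,
                     sr_ntp_seconds: float, clock_rate: int) -> float:
    """Map an RTP timestamp to wallclock seconds using one sender report."""
    # Interpret the difference as a signed 32-bit value so that RTP
    # timestamp wraparound (and packets older than the report) work.
    delta = (rtp_ts - sr_rtp_ts) & 0xFFFFFFFF
    if delta >= 0x80000000:
        delta -= 0x100000000
    return sr_ntp_seconds + delta / clock_rate


# Hypothetical values: an 8000 Hz audio stream and a 90000 Hz video stream.
audio_time = rtp_to_wallclock(168000, 160000, 1000.0, 8000)
video_time = rtp_to_wallclock(990000, 900000, 1000.5, 90000)
# Positive skew: the video sample was captured later than the audio sample.
skew = video_time - audio_time
```

With both samples on a common time line, a receiver can delay the playout of one stream so that samples captured at the same instant are presented together. Clock drift and jitter handling are outside the scope of this sketch.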

7.1. Example of LS

The following example shows a session description of a conference that is being multicast. The first media stream (mid:1) contains the voice of the speaker, who speaks in English. The second media stream (mid:2) contains the video component, and the third media stream (mid:3) carries the Spanish translation of what she is saying. The first and second media streams have to be synchronized.

          v=0
          o=Laura 289083124 289083124 IN IP4 two.example.com
          c=IN IP4 233.252.0.1/127
          t=0 0
          a=group:LS 1 2
          m=audio 30000 RTP/AVP 0
          a=mid:1
          m=video 30002 RTP/AVP 31
          a=mid:2
          m=audio 30004 RTP/AVP 0
          i=This media stream contains the Spanish translation
          a=mid:3
        

Note that although the third media stream is not present in the group line, it still has to contain a "mid" attribute (mid:3), as stated before.

8. Flow Identification (FID)

An "m" line in an SDP session description defines a media stream. However, SDP does not define what a media stream is. This definition can be found in the Real Time Streaming Protocol (RTSP) specification. The RTSP RFC [RFC2326] defines a media stream as "a single media instance, e.g., an audio stream or a video stream as well as a single whiteboard or shared application group. When using RTP, a stream consists of all RTP and RTCP packets created by a source within an RTP session".

This definition assumes that a single audio (or video) stream maps into an RTP session. The RTP RFC [RFC1889] (at present obsoleted by [RFC3550]) used to define an RTP session as follows: "For each participant, the session is defined by a particular pair of destination transport addresses (one network address plus a port pair for RTP and RTCP)".

While the previous definitions cover the most common cases, there are situations where a single media instance (e.g., an audio stream or a video stream) is sent using more than one RTP session. Two examples (among many others) of this kind of situation are cellular systems using the Session Initiation Protocol (SIP; [RFC3261]) and systems receiving Dual-Tone Multi-Frequency (DTMF) tones on a different host than the voice.

8.1. SIP and Cellular Access

Systems using a cellular access and SIP as a signalling protocol need to receive media over the air. During a session, the media can be encoded using different codecs. The encoded media has to traverse the radio interface. The radio interface is generally characterized as being prone to bit errors and associated with relatively high packet transfer delays. In addition, radio interface resources in a cellular environment are scarce and thus expensive, which calls for special measures in providing a highly efficient transport. In order to get an appropriate speech quality in combination with an efficient transport, precise knowledge of codec properties is required so that a proper radio bearer for the RTP session can be configured before transferring the media. These radio bearers are dedicated bearers per media type (i.e., codec).

Cellular systems typically configure different radio bearers on different port numbers. Therefore, incoming media has to have different destination port numbers for the different possible codecs in order to be routed properly to the correct radio bearer. Thus, this is an example in which several RTP sessions are used to carry a single media instance (the encoded speech from the sender).

8.2. DTMF Tones

Some voice sessions include DTMF tones. Sometimes, the voice handling is performed by a different host than the DTMF handling. It is common to have an application server in the network gathering DTMF tones for the user while the user receives the encoded speech on his user agent. In this situation, it is necessary to establish two RTP sessions: one for the voice and the other for the DTMF tones. Both RTP sessions are logically part of the same media instance.

8.3. Media Flow Definition
8.3. 媒体流定义

The previous examples show that the definition of a media stream in [RFC2326] does not cover some scenarios. It cannot be assumed that a single media instance maps into a single RTP session. Therefore, we introduce the definition of a media flow:

A media flow consists of a single media instance, e.g., an audio stream or a video stream as well as a single whiteboard or shared application group. When using RTP, a media flow comprises one or more RTP sessions.

8.4. FID Semantics

Several "m" lines grouped together using FID semantics form a media flow. A media agent handling a media flow that comprises several "m" lines MUST send a copy of the media to every "m" line that is part of the flow as long as the codecs and the direction attribute present in a particular "m" line allow it.

It is assumed that the application uses only one codec at a time to encode the media produced. This codec MAY change dynamically during the session, but at any particular moment, only one codec is in use.

The application encodes the media using the current codec and checks, one by one, all of the "m" lines that are part of the flow. If a particular "m" line contains the codec being used and the direction attribute is "sendonly" or "sendrecv", a copy of the encoded media is sent to the address/port specified in that particular media stream. If either the "m" line does not contain the codec being used or the direction attribute is neither "sendonly" nor "sendrecv", nothing is sent over this media stream.

The application typically ends up sending media to different destinations (IP address/port number) depending on the codec used at any moment.
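
The per-line check described above can be sketched as a small filter. The Python below is non-normative; FlowLine and fid_destinations are hypothetical names. The sketch assumes that the direction is expressed from the sending application's own point of view, which is how the examples in Section 8.4.1 appear to read it (a stream that the remote party marks "recvonly" still receives media).

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class FlowLine:
    """One "m" line of a FID group (hypothetical structure)."""
    address: str                    # destination IP address of this stream
    port: int                       # destination port of this stream
    payload_types: Tuple[int, ...]  # formats listed on the "m" line
    # Direction as seen by the sender (assumption, see the text above);
    # "sendrecv" is the SDP default when no direction attribute is present.
    direction: str = "sendrecv"


def fid_destinations(flow: List[FlowLine],
                     current_pt: int) -> List[Tuple[str, int]]:
    """Return every (address, port) that gets a copy of the encoded media."""
    return [
        (line.address, line.port)
        for line in flow
        if current_pt in line.payload_types
        and line.direction in ("sendonly", "sendrecv")
    ]
```

The function is evaluated again whenever the codec in use changes, which is what makes the destination set follow the codec.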

8.4.1. Examples of FID

The session description below might be sent by a SIP user agent using a cellular access. The user agent supports GSM (Global System for Mobile communications) on port 30000 and AMR (Adaptive Multi-Rate) on port 30002. When the remote party sends GSM, it will send RTP packets to port number 30000. When AMR is the codec chosen, packets will be sent to port 30002. Note that the remote party can switch between both codecs dynamically in the middle of the session. However, in this example, only one media stream at a time carries voice. The other remains "muted" while its corresponding codec is not in use.

            v=0
            o=Laura 289083124 289083124 IN IP4 three.example.com
            c=IN IP4 192.0.2.1
            t=0 0
            a=group:FID 1 2
            m=audio 30000 RTP/AVP 3
            a=rtpmap:3 GSM/8000
            a=mid:1
            m=audio 30002 RTP/AVP 97
            a=rtpmap:97 AMR/8000
            a=fmtp:97 mode-set=0,2,5,7; mode-change-period=2;
          mode-change-neighbor; maxframes=1
            a=mid:2
        

(The linebreak in the fmtp line accommodates RFC formatting restrictions; SDP does not have continuation lines.)

In the previous example, a system receives media on the same IP address on different port numbers. The following example shows how a system can receive different codecs on different IP addresses.

           v=0
           o=Laura 289083124 289083124 IN IP4 four.example.com
           c=IN IP4 192.0.2.1
           t=0 0
           a=group:FID 1 2
           m=audio 20000 RTP/AVP 0
           c=IN IP4 192.0.2.2
           a=rtpmap:0 PCMU/8000
           a=mid:1
           m=audio 30002 RTP/AVP 97
           a=rtpmap:97 AMR/8000
           a=fmtp:97 mode-set=0,2,5,7; mode-change-period=2;
         mode-change-neighbor; maxframes=1
           a=mid:2
        

(The linebreak in the fmtp line accommodates RFC formatting restrictions; SDP does not have continuation lines.)

The cellular terminal in this example only supports the AMR codec. However, many current IP phones only support PCM (Pulse-Code Modulation; payload 0). In order to be able to interoperate with them, the cellular terminal uses a transcoder whose IP address is 192.0.2.2. The cellular terminal includes the transcoder IP address in its SDP description to provide support for PCM. Remote systems will send AMR directly to the terminal, but PCM will be sent to the transcoder. The transcoder will be configured (using whatever method is preferred) to convert the incoming PCM audio to AMR and send it to the terminal.

The next example shows how the "group" attribute used with FID semantics can indicate the use of two different codecs in the two directions of a bidirectional media stream.

          v=0
          o=Laura 289083124 289083124 IN IP4 five.example.com
          c=IN IP4 192.0.2.1
          t=0 0
          a=group:FID 1 2
          m=audio 30000 RTP/AVP 0
          a=mid:1
          m=audio 30002 RTP/AVP 8
          a=recvonly
          a=mid:2
        

A user agent that receives the SDP description above knows that, at a certain moment, it can send either PCM u-law to port number 30000 or PCM A-law to port number 30002. However, the media agent also knows that the other end will only send PCM u-law (payload 0).

The following example shows a session description with different "m" lines grouped together using FID semantics that contain the same codec.

          v=0
          o=Laura 289083124 289083124 IN IP4 six.example.com
          c=IN IP4 192.0.2.1
          t=0 0
          a=group:FID 1 2 3
          m=audio 30000 RTP/AVP 0
          a=mid:1
          m=audio 30002 RTP/AVP 8
          a=mid:2
          m=audio 20000 RTP/AVP 0 8
          c=IN IP4 192.0.2.2
          a=recvonly
          a=mid:3
        

At a particular point in time, if the media agent receiving the SDP message above is sending PCM u-law (payload 0), it sends RTP packets to 192.0.2.1 on port 30000 and to 192.0.2.2 on port 20000 (first and third "m" lines). If it is sending PCM A-law (payload 8), it sends RTP packets to 192.0.2.1 on port 30002 and to 192.0.2.2 on port 20000 (second and third "m" lines).

The system that generated the SDP description above supports PCM u-law on port 30000 and PCM A-law on port 30002. In addition, it uses an application server, whose IP address is 192.0.2.2, that records the conversation. The application server does not need to understand the media content (it actually performs an RTP dump, so it can effectively receive any codec), which is why it always receives a copy of the audio stream, regardless of the codec and payload type in use at any given moment.

Remember that if several "m" lines that are grouped together using the FID semantics contain the same codec, the media agent MUST send copies of the same media stream as several RTP sessions at the same time.
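
The three-stream example above can be worked through in code. The following non-normative Python sketch parses that session description and lists the destinations for a given payload type. The names parse_media and destinations are hypothetical, the parser handles only the line types that appear in the example, and the direction attributes are read as written by the remote party (so "recvonly" and "sendrecv" both mean that media may be sent to that stream).

```python
SDP = """\
v=0
o=Laura 289083124 289083124 IN IP4 six.example.com
c=IN IP4 192.0.2.1
t=0 0
a=group:FID 1 2 3
m=audio 30000 RTP/AVP 0
a=mid:1
m=audio 30002 RTP/AVP 8
a=mid:2
m=audio 20000 RTP/AVP 0 8
c=IN IP4 192.0.2.2
a=recvonly
a=mid:3
"""

DIRECTIONS = ("a=sendrecv", "a=sendonly", "a=recvonly", "a=inactive")


def parse_media(sdp: str) -> list:
    """Return one dict per "m" line: addr, port, pts, dir, mid."""
    session_addr = None
    media = []
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("c="):
            addr = line.split()[2]      # c=<nettype> <addrtype> <address>
            if media:
                media[-1]["addr"] = addr  # media-level c= overrides
            else:
                session_addr = addr
        elif line.startswith("m="):
            fields = line[2:].split()   # <media> <port> <proto> <fmt> ...
            media.append({"addr": session_addr, "port": int(fields[1]),
                          "pts": [int(pt) for pt in fields[3:]],
                          "dir": "sendrecv", "mid": None})
        elif media and line in DIRECTIONS:
            media[-1]["dir"] = line[2:]
        elif media and line.startswith("a=mid:"):
            media[-1]["mid"] = line[len("a=mid:"):]
    return media


def destinations(sdp: str, fid_tags: list, payload_type: int) -> list:
    """List (address, port) pairs that receive a copy of the media."""
    return [(m["addr"], m["port"]) for m in parse_media(sdp)
            if m["mid"] in fid_tags
            and payload_type in m["pts"]
            and m["dir"] in ("sendrecv", "recvonly")]


to_ulaw = destinations(SDP, ["1", "2", "3"], 0)
to_alaw = destinations(SDP, ["1", "2", "3"], 8)
```

Here to_ulaw holds the first and third streams and to_alaw the second and third, matching the behavior described for payload types 0 and 8.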

The last example in this section deals with DTMF tones. DTMF tones can be transmitted using a regular voice codec or can be transmitted as telephony events. The RTP payload for DTMF tones treated as telephone events is described in [RFC4733]. Below, there is an example of an SDP session description using FID semantics and this payload type.

          v=0
          o=Laura 289083124 289083124 IN IP4 seven.example.com
          c=IN IP4 192.0.2.1
          t=0 0
          a=group:FID 1 2
          m=audio 30000 RTP/AVP 0
          a=mid:1
          m=audio 20000 RTP/AVP 97
          c=IN IP4 192.0.2.2
          a=rtpmap:97 telephone-event/8000
          a=mid:2
        

The remote party would send PCM encoded voice (payload 0) to 192.0.2.1 and DTMF tones encoded as telephony events to 192.0.2.2. Note that only voice or DTMF is sent at a particular point in time. When DTMF tones are sent, the first media stream does not carry any data and, when voice is sent, there is no data in the second media stream. FID semantics provide different destinations for alternative codecs.

远程方将向192.0.2.1发送PCM编码的语音(有效载荷0),并向192.0.2.2发送作为电话事件编码的DTMF音调。请注意,在特定时间点仅发送语音或DTMF。当发送DTMF音调时,第一媒体流不携带任何数据,当发送语音时,第二媒体流中没有数据。FID语义为备选编解码器提供了不同的目的地。
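The mapping from an FID group to its destinations can be illustrated with a small parser. The following Python sketch is not part of this specification; the function names `parse_sdp` and `fid_destinations` are hypothetical, and the parser assumes a well-formed description in which every identification-tag of a group has a matching "mid" line. It only shows how the session-level "c" line, the media-level "c" line, and the "mid" values combine to yield one transport address per alternative codec.

```python
def parse_sdp(sdp):
    """Return (groups, media) for an SDP description given as a string.

    groups: list of (semantics, [identification-tag, ...]) tuples.
    media:  list of dicts with 'addr', 'port', 'formats', and 'mid' keys.
    """
    session_addr = None
    groups, media = [], []
    for raw in sdp.strip().splitlines():
        line = raw.strip()
        if line.startswith("c="):
            addr = line.split()[2]            # c=IN IP4 <address>
            if media:
                media[-1]["addr"] = addr      # media-level "c" line overrides
            else:
                session_addr = addr           # session-level default
        elif line.startswith("a=group:"):
            semantics, *tags = line[len("a=group:"):].split()
            groups.append((semantics, tags))
        elif line.startswith("m="):
            _media_type, port, _proto, *formats = line[2:].split()
            media.append({"addr": session_addr, "port": int(port),
                          "formats": formats, "mid": None})
        elif line.startswith("a=mid:") and media:
            media[-1]["mid"] = line[len("a=mid:"):]
    return groups, media


def fid_destinations(sdp):
    """List, per FID group, the (address, port, formats) of each grouped "m" line."""
    groups, media = parse_sdp(sdp)
    by_mid = {m["mid"]: m for m in media if m["mid"] is not None}
    return [
        [(by_mid[tag]["addr"], by_mid[tag]["port"], by_mid[tag]["formats"])
         for tag in tags]
        for semantics, tags in groups
        if semantics == "FID"
    ]


DTMF_EXAMPLE = """
v=0
o=Laura 289083124 289083124 IN IP4 seven.example.com
c=IN IP4 192.0.2.1
t=0 0
a=group:FID 1 2
m=audio 30000 RTP/AVP 0
a=mid:1
m=audio 20000 RTP/AVP 97
c=IN IP4 192.0.2.2
a=rtpmap:97 telephone-events
a=mid:2
"""

for address, port, formats in fid_destinations(DTMF_EXAMPLE)[0]:
    print(address, port, formats)
```

For the DTMF example, the sketch reports payload 0 at 192.0.2.1 port 30000 and payload 97 at 192.0.2.2 port 20000, which matches the behavior described for the remote party.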

8.5. Scenarios That FID Does Not Cover
8.5. FID未涵盖的场景

It is worthwhile mentioning some scenarios where the "group" attribute using existing semantics (particularly FID) might seem to be applicable but is not.

值得一提的是,在某些场景中,使用现有语义(特别是FID)的“group”属性似乎是适用的,但实际上并不适用。

8.5.1. Parallel Encoding Using Different Codecs
8.5.1. 使用不同编解码器的并行编码

FID semantics are useful when the application only uses one codec at a time. An application that encodes the same media using different codecs simultaneously MUST NOT use FID to group those media lines. Some systems that handle DTMF tones are a typical example of parallel encoding using different codecs. Some systems implement the RTP payload defined in RFC 4733 [RFC4733], but when they send DTMF tones, they do not mute the voice channel. Therefore, in effect they are sending two copies of the same DTMF tone: encoded as voice and encoded as a telephony event. When the receiver gets both copies, it typically uses the telephony event rather than the tone encoded as voice. FID semantics MUST NOT be used in this context to group both media streams, since such a system is not using alternative codecs but rather different parallel encodings for the same information.

当应用程序一次只使用一个编解码器时,FID语义非常有用。同时使用不同编解码器对同一媒体进行编码的应用程序不得使用FID对这些媒体行进行分组。一些处理DTMF音调的系统是使用不同编解码器进行并行编码的典型示例。一些系统实现RFC 4733[RFC4733]中定义的RTP有效负载,但当它们发送DTMF音调时,它们不会使语音信道静音。因此,实际上他们发送的是同一DTMF音调的两个副本:编码为语音和编码为电话事件。当接收器获得两个副本时,它通常使用电话事件,而不是编码为语音的音调。在此上下文中,不能使用FID语义对两个媒体流进行分组,因为这样的系统不使用替代的编解码器,而是对相同的信息使用不同的并行编码。

8.5.2. Layered Encoding
8.5.2. 分层编码

Layered encoding schemes encode media in different layers. The quality of the media stream at the receiver varies depending on the number of layers received. SDP provides a means to group together contiguous multicast addresses that transport different layers. The "c" line below:

分层编码方案在不同的层中对媒体进行编码。接收器处的媒体流的质量取决于接收的层数。SDP提供了一种将传输不同层的连续多播地址分组的方法。下面的“c”行:

          c=IN IP4 233.252.0.1/127/3
        

is equivalent to the following three "c" lines:

相当于以下三条“c”线:

          c=IN IP4 233.252.0.1/127
          c=IN IP4 233.252.0.2/127
          c=IN IP4 233.252.0.3/127
        

FID MUST NOT be used to group "m" lines that do not represent the same information. Therefore, FID MUST NOT be used to group "m" lines that contain the different layers of layered encoding schemes. Besides, we do not define new group semantics to provide a more flexible way of grouping different layers, because the already existing SDP mechanism covers the most useful scenarios.

不得使用FID对不代表相同信息的“m”行进行分组。因此,不得使用FID对包含分层编码方案不同层的“m”行进行分组。此外,我们没有定义新的组语义来提供更灵活的方式来分组不同的层,因为已经存在的SDP机制涵盖了最有用的场景。
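The address-count notation of the "c" line used earlier in this section can be expanded mechanically. The following Python sketch is an illustration only; the function name `expand_connection` is hypothetical, and the sketch assumes an IPv4 multicast "c" line of the form `<base address>/<ttl>/<number of addresses>`. Any other "c" line is returned unchanged.

```python
import ipaddress


def expand_connection(c_line):
    """Expand 'c=IN IP4 <base>/<ttl>/<count>' into one "c" line per address."""
    prefix, addrtype, value = c_line.strip().split()
    parts = value.split("/")
    if addrtype != "IP4" or len(parts) != 3:
        return [c_line.strip()]               # nothing to expand
    base = ipaddress.IPv4Address(parts[0])
    ttl, count = parts[1], int(parts[2])
    # Contiguous addresses: base, base + 1, ..., base + count - 1
    return [f"{prefix} {addrtype} {base + i}/{ttl}" for i in range(count)]


for line in expand_connection("c=IN IP4 233.252.0.1/127/3"):
    print(line)
```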

8.5.3. Same IP Address and Port Number
8.5.3. 相同的IP地址和端口号

If media streams using several different codecs have to be sent to the same IP address and port, the traditional SDP syntax of listing several codecs in the same "m" line MUST be used. FID MUST NOT be used to group "m" lines with the same IP address/port. Therefore, an SDP description like the one below MUST NOT be generated.

如果必须将使用多个不同编解码器的媒体流发送到同一IP地址和端口,则必须使用在同一“m”行中列出多个编解码器的传统SDP语法。不得使用FID将具有相同IP地址/端口的“m”行分组。因此,不能生成如下SDP描述。

          v=0
          o=Laura 289083124 289083124 IN IP4 eight.example.com
          c=IN IP4 192.0.2.1
          t=0 0
          a=group:FID 1 2
          m=audio 30000 RTP/AVP 0
          a=mid:1
          m=audio 30000 RTP/AVP 8
          a=mid:2
        

The correct SDP description for the session above would be the following one:

上述会话的正确SDP描述如下:

          v=0
          o=Laura 289083124 289083124 IN IP4 nine.example.com
          c=IN IP4 192.0.2.1
          t=0 0
          m=audio 30000 RTP/AVP 0 8

If two "m" lines are grouped using FID, they MUST differ in their transport addresses (i.e., IP address plus port).

如果使用FID对两条“m”线进行分组,则它们的传输地址(即IP地址加端口)必须不同。
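An implementation could enforce this rule before generating or accepting a description. The following Python sketch is one possible check, not a normative algorithm; the function name `fid_group_is_valid` and the shape of its `transports` argument (a mapping from "mid" value to an (IP address, port) pair, already resolved from the "c" and "m" lines) are assumptions made for illustration.

```python
def fid_group_is_valid(group_mids, transports):
    """Return True if every "m" line in the FID group has a distinct transport address.

    group_mids: identification-tags listed in the "a=group:FID" line.
    transports: dict mapping each mid to its (ip_address, port) tuple.
    """
    addresses = [transports[mid] for mid in group_mids]
    return len(set(addresses)) == len(addresses)


# The description that MUST NOT be generated: both "m" lines use 192.0.2.1:30000.
invalid = {"1": ("192.0.2.1", 30000), "2": ("192.0.2.1", 30000)}
# A valid alternative: same IP address, different ports.
valid = {"1": ("192.0.2.1", 30000), "2": ("192.0.2.1", 30002)}

print(fid_group_is_valid(["1", "2"], invalid))
print(fid_group_is_valid(["1", "2"], valid))
```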

9. Usage of the "group" Attribute in SIP
9. SIP中“group”属性的用法

SDP descriptions are used by several different protocols, SIP among them. We include a section about SIP, because the "group" attribute will most likely be used mainly by SIP systems.

SDP描述由几种不同的协议使用,其中包括SIP。我们包含了一个关于SIP的部分,因为“group”属性很可能主要由SIP系统使用。

SIP [RFC3261] is an application layer protocol for establishing, terminating, and modifying multimedia sessions. SIP carries session descriptions in the bodies of the SIP messages but is independent from the protocol used for describing sessions. SDP [RFC4566] is one of the protocols that can be used for this purpose.

SIP[RFC3261]是用于建立、终止和修改多媒体会话的应用层协议。SIP在SIP消息体中承载会话描述,但独立于用于描述会话的协议。SDP[RFC4566]是可用于此目的的协议之一。

At session establishment, SIP provides a three-way handshake (INVITE-200 OK-ACK) between end systems. However, just two of these three messages carry SDP, as described in [RFC3264].

在会话建立时,SIP在终端系统之间提供三方握手(INVITE-200 OK-ACK)。然而,这三条消息中只有两条携带SDP,如[RFC3264]所述。

9.1. Mid Value in Answers
9.1. 应答中的mid值

The "mid" attribute is an identifier for a particular media stream. Therefore, the "mid" value in the offer MUST be the same as the "mid" value in the answer. Besides, subsequent offers (e.g., in a re-INVITE) SHOULD use the same "mid" value for the already existing media streams.

“mid”属性是特定媒体流的标识符。因此,提议(offer)中的“mid”值必须与应答(answer)中的“mid”值相同。此外,后续提议(例如,在re-INVITE中)应该对已经存在的媒体流使用相同的“mid”值。

[RFC3264] describes the usage of SDP in the context of SIP. The offerer and the answerer align their media descriptions so that the nth media stream ("m=" line) in the offerer's session description corresponds to the nth media stream in the answerer's description.

[RFC3264]描述了SDP在SIP上下文中的用法。提议方和应答方对齐其媒体描述,以便提议方会话描述中的第n个媒体流(“m=”行)对应于应答方描述中的第n个媒体流。

The presence of the "group" attribute in an SDP session description does not modify this behavior.

SDP会话描述中存在“group”属性不会修改此行为。

Since the "mid" attribute provides a means to label "m" lines, it would be possible to perform media alignment using "mid" labels rather than matching nth "m" lines. However, this would not bring any gain and would add complexity to implementations. Therefore, SIP systems MUST perform media alignment matching nth lines regardless of the presence of the "group" or "mid" attributes.

由于“mid”属性提供了标记“m”行的方法,因此可以使用“mid”标签而不是匹配第n个“m”行来执行媒体对齐。然而,这不会带来任何好处,并且会增加实现的复杂性。因此,无论是否存在“group”或“mid”属性,SIP系统都必须通过匹配第n行来执行媒体对齐。

If a media stream that contained a particular "mid" identifier in the offer contains a different identifier in the answer, the application ignores all of the "mid" and "group" lines that might appear in the session description. The following example illustrates this scenario.

如果在报价中包含特定“mid”标识符的媒体流在应答中包含不同的标识符,则应用程序将忽略会话描述中可能出现的所有“mid”和“组”行。下面的示例演示了此场景。

9.1.1. Example
9.1.1. 实例

Two SIP entities exchange SDPs during session establishment. The INVITE contains the SDP description below:

两个SIP实体在会话建立期间交换SDP。INVITE包含以下SDP说明:

          v=0
          o=Laura 289083124 289083124 IN IP4 ten.example.com
          c=IN IP4 192.0.2.1
          t=0 0
          a=group:FID 1 2
          m=audio 30000 RTP/AVP 0 8
          a=mid:1
          m=audio 30002 RTP/AVP 0 8
          a=mid:2
        

The 200 OK response contains the following SDP description:

200 OK响应包含以下SDP说明:

          v=0
          o=Bob 289083122 289083122 IN IP4 eleven.example.com
          c=IN IP4 192.0.2.3
          t=0 0
          a=group:FID 1 2
          m=audio 25000 RTP/AVP 0 8
          a=mid:2
          m=audio 25002 RTP/AVP 0 8
          a=mid:1
        

Since alignment of "m" lines is performed based on matching of nth lines, the first stream had "mid:1" in the INVITE and "mid:2" in the 200 OK. Therefore, the application ignores every "mid" and "group" line contained in the SDP description.

由于“m”行的对齐是基于第n行的匹配来执行的,因此第一个流在INVITE中具有“mid:1”,在200 OK中具有“mid:2”。因此,应用程序将忽略SDP描述中包含的每个“mid”和“group”行。
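The check described above can be sketched as follows. This Python fragment is an illustration under stated assumptions, not a required procedure: the function name `mids_usable` is hypothetical, each argument lists the "mid" values in "m" line order, and an "m" line without a "mid" attribute is represented by `None` and treated as a mismatch.

```python
def mids_usable(offer_mids, answer_mids):
    """Return True if the nth "m" line carries the same mid in offer and answer.

    When this returns False, the application ignores every "mid" and
    "group" line in the session description.
    """
    if len(offer_mids) != len(answer_mids):
        return False
    return all(o == a for o, a in zip(offer_mids, answer_mids))


# INVITE has mid:1, mid:2; the misbehaving 200 OK has mid:2, mid:1.
print(mids_usable(["1", "2"], ["2", "1"]))
# A well-behaved answer keeps the same mid on each aligned "m" line.
print(mids_usable(["1", "2"], ["1", "2"]))
```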

A well-behaved SIP user agent would have returned the SDP description below in the 200 OK response.

行为良好的SIP用户代理将在下面的200OK响应中返回SDP描述。

          v=0
          o=Bob 289083122 289083122 IN IP4 twelve.example.com
          c=IN IP4 192.0.2.3
          t=0 0
          a=group:FID 1 2
          m=audio 25002 RTP/AVP 0 8
          a=mid:1
          m=audio 25000 RTP/AVP 0 8
          a=mid:2
        
9.2. Group Value in Answers
9.2. 应答中的group值

A SIP entity that receives an offer that contains an "a=group" line with semantics that it does not understand MUST return an answer without the "group" line. Note that, as described in the previous section, the "mid" lines MUST still be present in the answer.

如果SIP实体收到的提议包含“a=group”行,而该行的语义它并不理解,则必须返回不带“group”行的应答。请注意,如前一节所述,“mid”行必须仍然存在于应答中。

A SIP entity that receives an offer that contains an "a=group" line with semantics that are understood MUST return an answer that contains an "a=group" line with the same semantics. The identification-tags contained in this "a=group" line MUST be the same as those received in the offer, or a subset of them (zero identification-tags is a valid subset). When the identification-tags in the answer are a subset, the "group" value to be used in the session MUST be the one present in the answer.

如果SIP实体收到的提议包含“a=group”行,且该行的语义已被理解,则必须返回包含具有相同语义的“a=group”行的应答。此“a=group”行中包含的标识标签必须与提议中收到的标识标签相同,或者是其中的一个子集(零个标识标签也是有效的子集)。当应答中的标识标签是子集时,会话中使用的“group”值必须是应答中的值。

SIP entities refuse media streams by setting the port to zero in the corresponding "m" line. "a=group" lines MUST NOT contain identification-tags that correspond to "m" lines with the port set to zero.

SIP实体通过在相应的“m”行中将端口设置为零来拒绝媒体流。“a=group”行不得包含与端口设置为零的“m”行相对应的标识标签。

Note that grouping of "m" lines MUST always be requested by the offerer, but never by the answerer. Since SIP provides a two-way SDP exchange, an answerer that requested grouping would not know whether the "group" attribute was accepted by the offerer or not. An answerer that wants to group media lines issues another offer after having responded to the first one (in a re-INVITE, for instance).

请注意,“m”行的分组必须始终由提议方请求,而绝不能由应答方请求。由于SIP提供双向SDP交换,请求分组的应答方将无法知道“group”属性是否被提议方接受。想要对媒体行进行分组的应答方应在回应第一个提议之后(例如,在re-INVITE中)发出另一个提议。
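The rules of this section for building the "a=group" lines of an answer can be sketched in a few lines of Python. The sketch is not normative; the function name `answer_group_lines` and its arguments are hypothetical. It assumes the offer's groups have already been parsed into (semantics, identification-tags) pairs and that the answerer knows which "mid" values belong to "m" lines it refuses with port zero.

```python
def answer_group_lines(offer_groups, supported, refused_mids):
    """Build the "a=group" lines of an answer.

    offer_groups: list of (semantics, [identification-tag, ...]) from the offer.
    supported:    set of semantics values the answerer understands.
    refused_mids: set of mids whose "m" line is answered with port zero.
    """
    lines = []
    for semantics, tags in offer_groups:
        if semantics not in supported:
            continue                          # semantics not understood: omit the line
        kept = [tag for tag in tags if tag not in refused_mids]
        lines.append(" ".join([f"a=group:{semantics}"] + kept))
    return lines


# Offer groups streams 1, 2, and 3; the answerer refuses stream 2.
print(answer_group_lines([("FID", ["1", "2", "3"])], {"LS", "FID"}, {"2"}))
```

Note that the sketch only ever removes identification-tags from a group; it never adds a group that the offer did not contain, since grouping is requested by the offerer.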

9.2.1. Example
9.2.1. 实例

The example below shows how the callee refuses a media stream offered by the caller by setting its port number to zero. The "mid" value corresponding to that media stream is removed from the "group" value in the answer.

下面的示例显示被叫方如何通过将端口号设置为零来拒绝主叫方提供的媒体流。与该媒体流相对应的“mid”值将从应答中的“group”值中删除。

SDP description in the INVITE from caller to callee:

从呼叫者到被呼叫者的邀请中的SDP说明:

          v=0
          o=Laura 289083124 289083124 IN IP4 thirteen.example.com
          c=IN IP4 192.0.2.1
          t=0 0
          a=group:FID 1 2 3
          m=audio 30000 RTP/AVP 0
          a=mid:1
          m=audio 30002 RTP/AVP 8
          a=mid:2
          m=audio 30004 RTP/AVP 3
          a=mid:3
        

SDP description in the 200 OK response from callee to caller:

从被叫方到主叫方的200 OK响应中的SDP说明:

          v=0
          o=Bob 289083125 289083125 IN IP4 fourteen.example.com
          c=IN IP4 192.0.2.3
          t=0 0
          a=group:FID 1 3
          m=audio 20000 RTP/AVP 0
          a=mid:1
          m=audio 0 RTP/AVP 8
          a=mid:2
          m=audio 20002 RTP/AVP 3
          a=mid:3
        
9.3. Capability Negotiation
9.3. 能力谈判

A client that understands "group" and "mid", but does not want to use these SDP features in a particular session, may still want to indicate that it supports these features. To indicate this support, a client can add an "a=group" line with no identification-tags for every semantics value it understands.

理解“group”和“mid”但不想在特定会话中使用这些SDP功能的客户端可能仍然希望表明它支持这些功能。为了表示这种支持,客户端可以为其理解的每个语义值添加一个不带标识标签的“a=group”行。

If a server receives an offer that contains empty "a=group" lines, it SHOULD add its capabilities also in the form of empty "a=group" lines to its answer.

如果服务器收到包含空“a=group”行的报价,则应以空“a=group”行的形式将其功能添加到其答案中。
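One way a server might produce these capability lines is sketched below in Python. The function name `capability_lines` is hypothetical, and the sketch reads "its capabilities" literally: when the offer contains at least one empty "a=group" line, the answer lists every semantics value the server understands, in sorted order for determinism. Other readings (for example, echoing only the offered semantics) are possible.

```python
def capability_lines(offer_lines, supported):
    """Return the empty "a=group" lines a server adds to its answer.

    offer_lines: the lines of the offer, one string per line.
    supported:   set of semantics values the server understands.
    """
    # An empty group line has a semantics value and no identification-tags.
    offer_has_empty_group = any(
        line.strip().startswith("a=group:") and len(line.split()) == 1
        for line in offer_lines
    )
    if not offer_has_empty_group:
        return []
    return [f"a=group:{semantics}" for semantics in sorted(supported)]


offer = ["a=group:LS", "a=group:FID", "m=audio 20000 RTP/AVP 0 8"]
print(capability_lines(offer, {"FID"}))
```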

9.3.1. Example
9.3.1. 实例

A system that supports both LS and FID semantics but does not want to group any media stream for this particular session generates the following SDP description:

支持LS和FID语义但不希望为此特定会话对任何媒体流进行分组的系统将生成以下SDP描述:

          v=0
          o=Bob 289083125 289083125 IN IP4 fifteen.example.com
          c=IN IP4 192.0.2.3
          t=0 0
          a=group:LS
          a=group:FID
          m=audio 20000 RTP/AVP 0 8
        

The server that receives that offer supports FID but not LS. It responds with the SDP description below:

接收该报价的服务器支持FID,但不支持LS。其响应如下SDP描述:

          v=0
          o=Laura 289083124 289083124 IN IP4 sixteen.example.com
          c=IN IP4 192.0.2.1
          t=0 0
          a=group:FID
          m=audio 30000 RTP/AVP 0
        
9.4. Backward Compatibility
9.4. 向后兼容性

This document does not define any SIP "Require" header field. Therefore, if one of the SIP user agents does not understand the "group" attribute, the standard SDP fall-back mechanism MUST be used, namely, attributes that are not understood are simply ignored.

本文档未定义任何SIP“Require”标题字段。因此,如果其中一个SIP用户代理不理解“group”属性,则必须使用标准SDP回退机制,即忽略不理解的属性。

9.4.1. Offerer Does Not Support "group"
9.4.1. 提议方不支持“group”

This situation does not represent a problem, because grouping requests are always performed by offerers and not by answerers. If the offerer does not support "group", this attribute will simply not be used.

这种情况并不代表问题,因为分组请求总是由提供方而不是应答方执行。如果报价人不支持“组”,则该属性将不被使用。

9.4.2. Answerer Does Not Support "group"
9.4.2. 应答方不支持“group”

The answerer will ignore the "group" attribute since it does not understand it and will also ignore the "mid" attribute. For LS semantics, the answerer might decide to perform, or not to perform, synchronization between media streams.

回答者将忽略“group”属性,因为它不理解该属性,还将忽略“mid”属性。对于LS语义,应答者可能决定在媒体流之间执行或不执行同步。

For FID semantics, the answerer will consider the session to consist of several media streams.

对于FID语义,应答者将考虑会话由多个媒体流组成。

Different implementations will behave in different ways.

不同的实现将以不同的方式运行。

In the case of audio and different "m" lines for different codecs, an implementation might decide to act as a mixer with the different incoming RTP sessions, which is the correct behavior.

在音频和不同编解码器的不同“m”线的情况下,实现可能会决定充当不同传入RTP会话的混音器,这是正确的行为。

An implementation might also decide to refuse the request (e.g., 488 Not Acceptable Here, or 606 Not Acceptable), because it contains several "m" lines. In this case, the server does not support the type of session that the caller wanted to establish. In case the client is willing to establish a simpler session anyway, the client can re-try the request without the "group" attribute and with only one "m" line per flow.

实现还可能决定拒绝请求(例如,此处488不可接受,或606不可接受),因为它包含多个“m”行。在这种情况下,服务器不支持调用方想要建立的会话类型。如果客户机愿意建立更简单的会话,那么客户机可以在不使用“group”属性的情况下重新尝试请求,并且每个流只有一行“m”。

10. Changes from RFC 3388
10. RFC 3388的变更

Section 3 (Overview of Operation) has been added for clarity. The AMR and GSM acronyms are now expanded on their first use. The examples now use IP addresses in the range suitable for examples.

为了清晰起见,增加了第3节(操作概述)。AMR和GSM首字母缩略词现在在首次使用时已扩展。示例现在使用适合示例的范围内的IP地址。

The grouping mechanism is now defined as an extensible framework. Earlier, RFC 3388 [RFC3388] used to discourage extensions to this mechanism in favor of using new session description protocols.

分组机制现在被定义为一个可扩展的框架。早些时候,RFC3388[RFC3388]用于阻止对该机制的扩展,以支持使用新的会话描述协议。

Given a semantics value, RFC 3388 [RFC3388] used to restrict "m" line identifiers to only appear in a single group using that semantics. That restriction has been lifted in this specification. From conversations with implementers, existing (i.e., legacy) implementations enforce this restriction on a per-semantics basis. That is, they only enforce this restriction for supported semantics. Because of the nature of existing semantics, implementations will only use a single "m" line identifier across groups using a given semantics even after the restriction has been lifted by this specification. Consequently, the lifting of this restriction will not cause backward-compatibility problems, because implementations supporting new semantics will be updated to not enforce this restriction at the same time as they are updated to support the new semantics.

给定语义值,RFC 3388[RFC3388]用于限制“m”行标识符仅出现在使用该语义的单个组中。本规范取消了该限制。从与实现者的对话中,现有(即遗留)实现在每个语义的基础上强制执行此限制。也就是说,它们仅对受支持的语义实施此限制。由于现有语义的性质,即使本规范取消了限制,实现也只会在使用给定语义的组之间使用单个“m”行标识符。因此,解除此限制不会导致向后兼容性问题,因为支持新语义的实现将在更新以支持新语义的同时更新以不强制执行此限制。

11. Security Considerations
11. 安全考虑

Using the "group" parameter with FID semantics, an entity that managed to modify the session descriptions exchanged between the participants to establish a multimedia session could force the participants to send a copy of the media to any destination of its choosing.

使用具有FID语义的“group”参数,一个能够修改参与者之间交换的会话描述以建立多媒体会话的实体可以强制参与者向其选择的任何目的地发送媒体副本。

Integrity mechanisms provided by protocols used to exchange session descriptions and media encryption can be used to prevent this attack. In SIP, Secure/Multipurpose Internet Mail Extensions (S/MIME) [RFC5750] and Transport Layer Security (TLS) [RFC5246] can be used to protect session description exchanges in an end-to-end and a hop-by-hop fashion, respectively.

用于交换会话描述和媒体加密的协议提供的完整性机制可用于防止此攻击。在SIP中,安全/多用途Internet邮件扩展(S/MIME)[RFC5750]和传输层安全(TLS)[RFC5246]可分别用于以端到端和逐跳方式保护会话描述交换。

12. IANA Considerations
12. IANA考虑

This document defines two SDP attributes: "mid" and "group".

本文档定义了两个SDP属性:“mid”和“group”。

The "mid" attribute is used to identify media streams within a session description, and its format is defined in Section 4.

“mid”属性用于标识会话描述中的媒体流,其格式在第4节中定义。

The "group" attribute is used for grouping together different media streams, and its format is defined in Section 5.

“group”属性用于将不同的媒体流分组在一起,其格式在第5节中定义。

This document defines a framework to group media lines in SDP using different semantics. Semantics values to be used with this framework are registered by the IANA following the Standards Action policy [RFC5226].

本文档定义了一个使用不同语义对SDP中的媒体行进行分组的框架。IANA根据标准操作策略[RFC5226]注册与此框架一起使用的语义值。

The IANA Considerations section of the RFC MUST include the following information, which appears in the IANA registry along with the RFC number of the publication.

RFC的IANA注意事项部分必须包括以下信息,这些信息与出版物的RFC编号一起出现在IANA注册表中。

o A brief description of the semantics.

o 对语义的简要描述。

o Token to be used within the "group" attribute. This token may be of any length, but SHOULD be no more than four characters long.

o 要在“组”属性中使用的令牌。此标记可以是任意长度,但长度不应超过四个字符。

o Reference to a standards track RFC.

o 参考标准跟踪RFC。

The following are the current entries in the registry:

以下是注册表中的当前条目:

      Semantics                          Token  Reference
      ---------------------------------  -----  -----------
      Lip Synchronization                 LS     [RFC5888]
      Flow Identification                 FID    [RFC5888]
      Single Reservation Flow             SRF    [RFC3524]
      Alternative Network Address Types   ANAT   [RFC4091]
      Forward Error Correction            FEC    [RFC4756]
      Decoding Dependency                 DDP    [RFC5583]
        
13. Acknowledgments
13. 致谢

Goran Eriksson and Jan Holler were coauthors of RFC 3388 [RFC3388].

Goran Eriksson和Jan Holler是RFC 3388[RFC3388]的合著者。

14. References
14. 参考文献

14.1. Normative References
14.1. 规范性引用文件

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC2119]Bradner,S.,“RFC中用于表示需求水平的关键词”,BCP 14,RFC 2119,1997年3月。

[RFC3261] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A., Peterson, J., Sparks, R., Handley, M., and E. Schooler, "SIP: Session Initiation Protocol", RFC 3261, June 2002.

[RFC3261]Rosenberg,J.,Schulzrinne,H.,Camarillo,G.,Johnston,A.,Peterson,J.,Sparks,R.,Handley,M.,和E.Schooler,“SIP:会话启动协议”,RFC 3261,2002年6月。

[RFC3264] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model with Session Description Protocol (SDP)", RFC 3264, June 2002.

[RFC3264]Rosenberg,J.和H.Schulzrinne,“具有会话描述协议(SDP)的提供/应答模型”,RFC 3264,2002年6月。

[RFC4566] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session Description Protocol", RFC 4566, July 2006.

[RFC4566]Handley,M.,Jacobson,V.,和C.Perkins,“SDP:会话描述协议”,RFC4566,2006年7月。

[RFC5226] Narten, T. and H. Alvestrand, "Guidelines for Writing an IANA Considerations Section in RFCs", BCP 26, RFC 5226, May 2008.

[RFC5226]Narten,T.和H.Alvestrand,“在RFCs中编写IANA注意事项部分的指南”,BCP 26,RFC 5226,2008年5月。

[RFC5234] Crocker, D. and P. Overell, "Augmented BNF for Syntax Specifications: ABNF", STD 68, RFC 5234, January 2008.

[RFC5234]Crocker,D.和P.Overell,“语法规范的扩充BNF:ABNF”,STD 68,RFC 5234,2008年1月。

[RFC5246] Dierks, T. and E. Rescorla, "The Transport Layer Security (TLS) Protocol Version 1.2", RFC 5246, August 2008.

[RFC5246]Dierks,T.和E.Rescorla,“传输层安全(TLS)协议版本1.2”,RFC 5246,2008年8月。

[RFC5750] Ramsdell, B. and S. Turner, "Secure/Multipurpose Internet Mail Extensions (S/MIME) Version 3.2 Certificate Handling", RFC 5750, January 2010.

[RFC5750]Ramsdell,B.和S.Turner,“安全/多用途Internet邮件扩展(S/MIME)版本3.2证书处理”,RFC 5750,2010年1月。

14.2. Informative References
14.2. 资料性引用

[RFC1889] Schulzrinne, H., Casner, S., Frederick, R., and V. Jacobson, "RTP: A Transport Protocol for Real-Time Applications", RFC 1889, January 1996.

[RFC1889]Schulzrinne,H.,Casner,S.,Frederick,R.,和V.Jacobson,“RTP:实时应用的传输协议”,RFC 1889,1996年1月。

[RFC2326] Schulzrinne, H., Rao, A., and R. Lanphier, "Real Time Streaming Protocol (RTSP)", RFC 2326, April 1998.

[RFC2326]Schulzrinne,H.,Rao,A.,和R.Lanphier,“实时流协议(RTSP)”,RFC2326,1998年4月。

[RFC3388] Camarillo, G., Eriksson, G., Holler, J., and H. Schulzrinne, "Grouping of Media Lines in the Session Description Protocol (SDP)", RFC 3388, December 2002.

[RFC3388]Camarillo,G.,Eriksson,G.,Holler,J.,和H.Schulzrinne,“会话描述协议(SDP)中媒体线路的分组”,RFC 3388,2002年12月。

[RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V. Jacobson, "RTP: A Transport Protocol for Real-Time Applications", STD 64, RFC 3550, July 2003.

[RFC3550]Schulzrinne,H.,Casner,S.,Frederick,R.,和V.Jacobson,“RTP:实时应用的传输协议”,STD 64,RFC 3550,2003年7月。

[RFC4733] Schulzrinne, H. and T. Taylor, "RTP Payload for DTMF Digits, Telephony Tones, and Telephony Signals", RFC 4733, December 2006.

[RFC4733]Schulzrinne,H.和T.Taylor,“DTMF数字、电话音和电话信号的RTP有效载荷”,RFC 4733,2006年12月。

Authors' Addresses

作者地址

   Gonzalo Camarillo
   Ericsson
   Hirsalantie 11
   Jorvas 02420
   FINLAND

Gonzalo Camarillo Ericsson Hirsalantie 11 Jorvas 02420芬兰

   EMail: Gonzalo.Camarillo@ericsson.com
        

   Henning Schulzrinne
   Columbia University
   1214 Amsterdam Avenue
   New York, NY 10027
   USA

美国纽约州纽约市阿姆斯特丹大道1214号亨宁·舒尔兹林内哥伦比亚大学,邮编:10027

   EMail: schulzrinne@cs.columbia.edu
        