Network Working Group F. de Bont Request for Comments: 5691 Philips Electronics Updates: 3640 S. Doehla Category: Standards Track Fraunhofer IIS M. Schmidt Dolby Laboratories R. Sperschneider Fraunhofer IIS October 2009
Network Working Group F. de Bont Request for Comments: 5691 Philips Electronics Updates: 3640 S. Doehla Category: Standards Track Fraunhofer IIS M. Schmidt Dolby Laboratories R. Sperschneider Fraunhofer IIS October 2009
RTP Payload Format for Elementary Streams with MPEG Surround Multi-Channel Audio
具有MPEG环绕多声道音频的基本流的RTP有效负载格式
Abstract
摘要
This memo describes extensions for the RTP payload format defined in RFC 3640 for the transport of MPEG Surround multi-channel audio. Additional Media Type parameters are defined to signal backwards-compatible transmission inside an MPEG-4 Audio elementary stream. In addition, a layered transmission scheme that doesn't use the MPEG-4 systems framework is presented to transport an MPEG Surround elementary stream via RTP in parallel with an RTP stream containing the downmixed audio data.
本备忘录描述了RFC 3640中定义的用于传输MPEG环绕多声道音频的RTP有效负载格式的扩展。额外的媒体类型参数被定义为在MPEG-4音频基本流内信号向后兼容传输。此外,提出了一种不使用MPEG-4系统框架的分层传输方案,通过RTP将MPEG环绕基本流与包含下混音频数据的RTP流并行传输。
Status of This Memo
关于下段备忘
This document specifies an Internet standards track protocol for the Internet community, and requests discussion and suggestions for improvements. Please refer to the current edition of the "Internet Official Protocol Standards" (STD 1) for the standardization state and status of this protocol. Distribution of this memo is unlimited.
本文件规定了互联网社区的互联网标准跟踪协议,并要求进行讨论和提出改进建议。有关本协议的标准化状态和状态,请参考当前版本的“互联网官方协议标准”(STD 1)。本备忘录的分发不受限制。
Copyright Notice
版权公告
Copyright (c) 2009 IETF Trust and the persons identified as the document authors. All rights reserved.
版权所有(c)2009 IETF信托基金和确定为文件作者的人员。版权所有。
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the BSD License.
本文件受BCP 78和IETF信托有关IETF文件的法律规定的约束(http://trustee.ietf.org/license-info)自本文件出版之日起生效。请仔细阅读这些文件,因为它们描述了您对本文件的权利和限制。从本文件中提取的代码组件必须包括《信托法律条款》第4.e节中所述的简化BSD许可文本,并且提供BSD许可中所述的代码组件时不提供任何担保。
Table of Contents
目录
1. Introduction ....................................................2 2. Conventions .....................................................3 3. Definitions and Abbreviations ...................................3 3.1. Definitions ................................................3 3.2. Abbreviations ..............................................4 4. Transport of MPEG Surround ......................................4 4.1. Embedded Spatial Audio Data in AAC Payloads ................4 4.2. MPEG Surround Elementary Stream ............................5 4.2.1. Low Bitrate MPEG Surround ...........................7 4.2.2. High Bitrate MPEG Surround ..........................8 5. IANA Considerations .............................................8 5.1. Media Type Registration ....................................9 5.2. Registration of Mode Definitions with IANA .................9 5.3. Usage of SDP ..............................................10 6. Security Considerations ........................................10 7. References .....................................................11 7.1. Normative References ......................................11 7.2. Informative References ....................................11
1. Introduction ....................................................2 2. Conventions .....................................................3 3. Definitions and Abbreviations ...................................3 3.1. Definitions ................................................3 3.2. Abbreviations ..............................................4 4. Transport of MPEG Surround ......................................4 4.1. Embedded Spatial Audio Data in AAC Payloads ................4 4.2. MPEG Surround Elementary Stream ............................5 4.2.1. Low Bitrate MPEG Surround ...........................7 4.2.2. High Bitrate MPEG Surround ..........................8 5. IANA Considerations .............................................8 5.1. Media Type Registration ....................................9 5.2. Registration of Mode Definitions with IANA .................9 5.3. Usage of SDP ..............................................10 6. Security Considerations ........................................10 7. References .....................................................11 7.1. Normative References ......................................11 7.2. Informative References ....................................11
MPEG Surround (Spatial Audio Coding, SAC) [23003-1] is an International Standard that was finalized by MPEG in January 2007. It is capable of re-creating N channels based on M < N transmitted channels and additional control data. In the preferred modes of operating the Spatial Audio Coding system, the M channels can either be a single mono channel or a stereo channel pair. The control data represents a significantly lower data rate than the data rate required for transmitting all N channels, making the coding very efficient while at the same time ensuring compatibility with M channel devices.
MPEG环绕(空间音频编码,SAC)[23003-1]是MPEG于2007年1月最终确定的国际标准。它能够基于M<N个传输通道和附加控制数据重新创建N个通道。在操作空间音频编码系统的优选模式中,M个信道可以是单个单声道或立体声信道对。控制数据表示比传输所有N个信道所需的数据速率低得多的数据速率,使得编码非常有效,同时确保与M个信道设备的兼容性。
The MPEG Surround standard incorporates a number of tools that enable features that allow for broad application of the standard. A key feature is the ability to scale the spatial image quality gradually from very low spatial overhead towards transparency. Another key feature is that the decoder input can be made compatible to existing matrixed surround technologies.
MPEG环绕声标准包含了许多工具,这些工具支持广泛应用该标准的功能。一个关键特性是能够将空间图像质量从非常低的空间开销逐渐扩展到透明度。另一个关键特性是解码器输入可以与现有矩阵环绕技术兼容。
As an example, for 5.1 multi-channel audio, the MPEG Surround encoder creates a stereo (or mono) downmix signal and spatial information describing the full 5.1 material in a highly efficient, parameterised format. The spatial information is transmitted alongside the downmix.
例如,对于5.1多声道音频,MPEG环绕声编码器以高效的参数化格式创建立体声(或单声道)下混音信号和描述完整5.1材质的空间信息。空间信息沿下混频传输。
By using MPEG Surround, existing services can easily be upgraded to provide surround sound in a backwards-compatible fashion. While a stereo decoder in an existing legacy consumer device ignores the MPEG Surround data and plays back the stereo signal without any quality degradation, an MPEG-Surround-enabled decoder will deliver high quality, multi-channel audio.
通过使用MPEG环绕,可以轻松升级现有服务,以向后兼容的方式提供环绕声。虽然现有传统消费设备中的立体声解码器忽略MPEG环绕声数据并播放立体声信号而不降低任何质量,但启用MPEG环绕声的解码器将提供高质量的多声道音频。
The MPEG Surround decoder can operate in modes that render the multi-channel signal to multi-channel or stereo output, or it can operate in a two-channel headphone mode to produce a virtual surround output signal.
MPEG环绕声解码器可以在将多声道信号呈现为多声道或立体声输出的模式下工作,也可以在双声道耳机模式下工作以产生虚拟环绕声输出信号。
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].
本文件中的关键词“必须”、“不得”、“必需”、“应”、“不应”、“应”、“不应”、“建议”、“可”和“可选”应按照[RFC2119]中所述进行解释。
This memo makes use of the definitions specified in [14496-1], [14496-3], [23003-1], and [RFC3640]. Frequently used terms are summed up for convenience:
本备忘录使用了[14496-1]、[14496-3]、[23003-1]和[RFC3640]中规定的定义。为方便起见,对常用术语进行了总结:
Access Unit: An MPEG Access Unit is the smallest data entity to which timing information is attributed. In the case of audio, an Access Unit is the smallest individually accessible portion of coded audio data within an elementary stream.
访问单元:MPEG访问单元是时间信息所属的最小数据实体。在音频的情况下,访问单元是基本流中编码音频数据的最小的单独可访问部分。
AudioSpecificConfig(): Extends the class DecoderSpecificInfo(), as defined in [14496-1], when the objectType indication refers to a stream complying with [14496-3]. AudioSpecificConfig() is used as the configuration structure for MPEG-4 audio as specified in [14496-3]. It contains the field audioObjectType, which distinguishes between the different audio codecs defined in [14496-3], general audio information (e.g., the sampling frequency and number of channels), and further codec-dependent information structures.
AudioSpecificConfig():当objectType指示引用符合[14496-3]的流时,扩展[14496-1]中定义的DecoderSpecificInfo()类。AudioSpecificConfig()用作[14496-3]中指定的MPEG-4音频的配置结构。它包含字段audioObjectType,用于区分[14496-3]中定义的不同音频编解码器、一般音频信息(例如,采样频率和通道数)以及其他与编解码器相关的信息结构。
SpatialSpecificConfig(): Configuration structure for MPEG Surround audio coding, as specified in [23003-1]. An AudioSpecificConfig() with an audioObjectType of value 30 contains a SpatialSpecificConfig() structure.
SpatialSpecificConfig():MPEG环绕音频编码的配置结构,如[23003-1]中所述。audioObjectType值为30的AudioSpecificConfig()包含一个SpatialSpecificConfig()结构。
AOT: Audio Object Type AAC: Advanced Audio Coding ASC: AudioSpecificConfig() structure AU: Access Unit HE AAC: High Efficiency AAC PLI: Profile and Level Indication SSC: SpatialSpecificConfig() structure
AOT: Audio Object Type AAC: Advanced Audio Coding ASC: AudioSpecificConfig() structure AU: Access Unit HE AAC: High Efficiency AAC PLI: Profile and Level Indication SSC: SpatialSpecificConfig() structure
From a top-level perspective, MPEG Surround data can be subdivided into configuration data contained in the SpatialSpecificConfig() (SSC) and the SpatialFrame(), which contains the MPEG Surround payload. The configuration data can be signaled in-band or out-of-band. In the case of in-band signaling the SSC is conveyed in a SacDataFrame() jointly with a SpatialFrame(). In the case of out-of-band signaling, the SSC is transmitted to the decoder separately, e.g., by Session Description Protocol (SDP) [RFC4566] means.
从顶层的角度来看,MPEG环绕数据可以细分为包含在SpatialSpecificConfig()(SSC)和包含MPEG环绕负载的SpatialFrame()中的配置数据。配置数据可以在带内或带外发送信号。在带内信令的情况下,SSC在SacDataFrame()中与SpatialFrame()一起传输。在带外信令的情况下,例如通过会话描述协议(SDP)[rfc456]装置,将SSC单独发送到解码器。
SpatialFrame()s may be transmitted either embedded into the downmix stream (Section 4.1) or as individual elementary streams besides the downmix audio stream (Section 4.2).
SpatialFrame()可以嵌入到下混音流(第4.1节)中,或者作为下混音音频流(第4.2节)之外的单个基本流进行传输。
The buffer definition for AAC decoders limits the size of an AU, as specified in [14496-3]. For high-bitrate applications that exceed this limit, all MPEG Surround data MUST be put in a separate stream, as defined in Section 4.2.
AAC解码器的缓冲区定义限制了AU的大小,如[14496-3]所述。对于超过此限制的高比特率应用,所有MPEG环绕数据必须放在单独的流中,如第4.2节所定义。
[14496-3] defines the extension_payload() as a mechanism for transport of extension data inside AAC payloads. Typical extension data include Spectral Band Replication (SBR) data and MPEG Surround data, i.e., a SacDataFrame() in extension_payload()s of type EXT_SAC_DATA. extension_payload()s reside inside the downmix AAC elementary stream. The resulting single elementary stream is transported as specified in [RFC3640]. As AAC decoders are required to skip unknown extension data, MPEG Surround data can be embedded in backwards-compatible fashion and be transported with the mechanism already described in [RFC3640].
[14496-3]将扩展_有效载荷()定义为在AAC有效载荷内传输扩展数据的机制。典型的扩展数据包括光谱带复制(SBR)数据和MPEG环绕数据,即EXT_SAC_数据类型的扩展载荷()中的SacDataFrame()。扩展_payload()驻留在downmix AAC基本流中。按照[RFC3640]中的规定传输生成的单个基本流。由于AAC解码器需要跳过未知的扩展数据,MPEG环绕数据可以以向后兼容的方式嵌入,并使用[RFC3640]中已经描述的机制进行传输。
The SacDataFrame() includes a SpatialFrame() and an optional header that contains an SSC. Any SSC in a SacDataFrame() MUST be identical to the SSC conveyed via SDP for that stream.
SacDataFrame()包括SpatialFrame()和包含SSC的可选标头。SacDataFrame()中的任何SSC必须与该流通过SDP传输的SSC相同。
No new mode is introduced for SpatialFrame()s being embedded into AAC payloads. Either the mode AAC-lbr or the mode AAC-hbr SHOULD be used. The additional Media Type parameters, as defined in Section 5.1, SHOULD be present when SpatialFrame()s are embedded into AAC payloads.
没有为将SpatialFrame()嵌入AAC有效负载引入新模式。应使用AAC lbr模式或AAC hbr模式。当SpatialFrame()嵌入AAC有效载荷时,应提供第5.1节中定义的附加媒体类型参数。
For example:
例如:
m=audio 5000 RTP/AVP 96 a=rtpmap:96 mpeg4-generic/48000/2 a=fmtp:96 streamType=5; profile-level-id=44; mode=AAC-hbr; config=131 056E598; sizeLength=13; indexLength=3; indexDeltaLength=3; constant Duration=2048; MPS-profile-level-id=55; MPS-config=F1B4CF920442029B 501185B6DA00;
m=audio 5000 RTP/AVP 96 a=rtpmap:96 mpeg4-generic/48000/2 a=fmtp:96 streamType=5; profile-level-id=44; mode=AAC-hbr; config=131 056E598; sizeLength=13; indexLength=3; indexDeltaLength=3; constant Duration=2048; MPS-profile-level-id=55; MPS-config=F1B4CF920442029B 501185B6DA00;
In this example, the stream specifies the HE AAC Profile at Level 2 [Profile and Level Indication (PLI) 44] and the config string contains the hexadecimal representation of the HE AAC ASC [audioObjectType=2 (AAC LC); extensionAudioObjectType=5 (SBR); samplingFrequencyIndex=0x6 (24kHz); extensionSamplingFrequencyIndex=0x3 (48kHz); channelConfiguration=2 (2.0 channels)] of the downmix AAC elementary stream that is using explicit backwards-compatible signaling.
In this example, the stream specifies the HE AAC Profile at Level 2 [Profile and Level Indication (PLI) 44] and the config string contains the hexadecimal representation of the HE AAC ASC [audioObjectType=2 (AAC LC); extensionAudioObjectType=5 (SBR); samplingFrequencyIndex=0x6 (24kHz); extensionSamplingFrequencyIndex=0x3 (48kHz); channelConfiguration=2 (2.0 channels)] of the downmix AAC elementary stream that is using explicit backwards-compatible signaling.
Furthermore, the stream specifies the MPEG Surround Baseline Profile at Level 3 (PLI55) and the MPS-config string contains the hexadecimal representation of the MPEG Surround ASC [audioObjectType=30 (MPEG Surround); samplingFrequencyIndex=0x3 (48kHz); channelConfiguration=6 (5.1 channels); sacPayloadEmbedding=1; SSC=(48 kHz; 32 slots; 525 tree; ResCoding=1; ResBands=[0,13,13,13])].
此外,流在级别3(PLI55)指定MPEG环绕基线配置文件,MPS配置字符串包含MPEG环绕ASC的十六进制表示形式[audioObjectType=30(MPEG环绕);samplingFrequencyIndex=0x3(48kHz);channelConfiguration=6(5.1个通道);sacPayloadEmbedding=1;SSC=(48 kHz;32个插槽;525个树;重新编码=1;重新频带=[0,13,13,13])。
Note that the a=fmtp line of the example above has been wrapped to fit the page; it would comprise a single line in the SDP file.
注意,上面示例的a=fmtp行已被包装以适合页面;它将包含SDP文件中的一行。
MPEG Surround SpatialFrame()s can be present in an individual elementary stream. This stream complements the stream containing the downmix audio data, which may be coded by an arbitrary coding scheme. MPEG Surround elementary streams are packetized as specified in [RFC3640]. The mode signaled and used for an MPEG Surround elementary stream MUST be either MPS-hbr or MPS-lbr. The MPS-hbr mode SHALL be used when the frame size may exceed 63 bytes, e.g., when high-bitrate residual coding is in use.
MPEG环绕空间帧()可以出现在单个基本流中。该流补充包含下混音音频数据的流,下混音音频数据可以通过任意编码方案进行编码。MPEG环绕基本流按照[RFC3640]中的规定进行打包。发送信号并用于MPEG环绕基本流的模式必须是MPS hbr或MPS lbr。当帧大小可能超过63字节时,例如,当使用高比特率剩余编码时,应使用MPS hbr模式。
The dependency relationships between the MPEG Surround elementary stream and the downmix stream are signaled as specified in [RFC5583].
MPEG环绕基本流和下混音流之间的依赖关系按照[RFC5583]中的规定发出信号。
The media clocks of the MPEG Surround elementary stream and the downmix stream SHALL operate in the same clock domain, i.e., the clocks are derived from a common clock and MUST NOT drift. RTCP sender reports MUST indicate that the stream timestamps are not drifting, i.e., that a single sender report for each stream is sufficient to establish unambiguous timing. The sampling rate of the MPEG Surround signal and the decoded downmix signal MUST be identical.
MPEG环绕基本流和下混频流的媒体时钟应在相同的时钟域中运行,即,时钟源自公共时钟,且不得漂移。RTCP发送方报告必须表明流时间戳没有漂移,即每个流的单个发送方报告足以确定明确的时间。MPEG环绕声信号和解码的下混音信号的采样率必须相同。
If HE AAC is used as the coding scheme for the downmix, the RTP clock-rate of the downmix MAY be the sampling rate of the AAC core, i.e., the clock-rate of the MPEG Surround elementary stream is an integer multiple of the clock-rate of the downmix stream.
如果将AAC用作下混频的编码方案,则下混频的RTP时钟速率可以是AAC核心的采样速率,即,MPEG环绕基本流的时钟速率是下混频流的时钟速率的整数倍。
Note that separate RTP streams have different random RTP timestamp offsets, and therefore RTCP MUST be used to synchronize the coded downmix audio data and the MPEG Surround elementary stream.
注意,单独的RTP流具有不同的随机RTP时间戳偏移,因此必须使用RTCP来同步编码的下混音音频数据和MPEG环绕基本流。
For example:
例如:
a=group:DDP L1 L2
a=组:DDP L1 L2
m=audio 5000 RTP/AVP 96 a=rtpmap:96 mpeg4-generic/48000/2 a=fmtp:96 streamType=5; profile-level-id=44; mode=AAC-hbr; config=2B1 18800; sizeLength=13; indexLength=3; indexDeltaLength=3; constantDu ration=2048 a=mid:L1
m=audio 5000 RTP/AVP 96 a=rtpmap:96 mpeg4-generic/48000/2 a=fmtp:96 streamType=5; profile-level-id=44; mode=AAC-hbr; config=2B1 18800; sizeLength=13; indexLength=3; indexDeltaLength=3; constantDu ration=2048 a=mid:L1
m=audio 5002 RTP/AVP 97 a=rtpmap:97 mpeg4-generic/48000/6 a=fmtp:97 streamType=5; profile-level-id=55; mode=MPS-hbr; config=F1B 0CF920460029B601189E79E70; sizeLength=13; indexLength=3; indexDelt aLength=3; constantDuration=2048 a=mid:L2 a=depend:97 lay L1:96
m=audio 5002 RTP/AVP 97 a=rtpmap:97 mpeg4-generic/48000/6 a=fmtp:97 streamType=5; profile-level-id=55; mode=MPS-hbr; config=F1B 0CF920460029B601189E79E70; sizeLength=13; indexLength=3; indexDelt aLength=3; constantDuration=2048 a=mid:L2 a=depend:97 lay L1:96
In this example, the first stream specifies the HE AAC Profile at Level 2 (PLI44) and the config string contains the hexadecimal representation of the HE AAC ASC [audioObjectType=2 (AAC LC); extensionAudioObjectType=5 (SBR); samplingFrequencyIndex=0x6 (24kHz); extensionSamplingFrequencyIndex=0x3 (48kHz); channelConfiguration=2 (2.0 channels)].
In this example, the first stream specifies the HE AAC Profile at Level 2 (PLI44) and the config string contains the hexadecimal representation of the HE AAC ASC [audioObjectType=2 (AAC LC); extensionAudioObjectType=5 (SBR); samplingFrequencyIndex=0x6 (24kHz); extensionSamplingFrequencyIndex=0x3 (48kHz); channelConfiguration=2 (2.0 channels)].
The second stream specifies Baseline MPEG Surround Profile at Level 3 (PLI55) and the config string contains the hexadecimal representation of the ASC [AOT=30(MPEG Surround); 48 kHz; 5.1-ch; sacPayloadEmbedding=0; SSC=(48 kHz; 32 slots; 525 tree; ResCoding=1; ResBands=[7,7,7,7])].
The second stream specifies Baseline MPEG Surround Profile at Level 3 (PLI55) and the config string contains the hexadecimal representation of the ASC [AOT=30(MPEG Surround); 48 kHz; 5.1-ch; sacPayloadEmbedding=0; SSC=(48 kHz; 32 slots; 525 tree; ResCoding=1; ResBands=[7,7,7,7])].
Note that the a=fmtp lines of the example above have been wrapped to fit the page; they would each comprise a single line in the SDP file.
注意,上面示例中的a=fmtp行已被包装以适合页面;它们将分别包含SDP文件中的一行。
This mode is signaled by mode=MPS-lbr. This mode supports the transport of one or more complete Access Units, each consisting of a single MPEG Surround SpatialFrame(). The AUs can be variably sized and interleaved. The maximum size of a SpatialFrame() is 63 bytes. Fragmentation MUST NOT be used in this mode. Receivers MUST support de-interleaving.
此模式由模式=MPS lbr发出信号。此模式支持传输一个或多个完整的访问单元,每个单元由单个MPEG环绕空间帧()组成。AU可以是可变大小和交错的。SpatialFrame()的最大大小为63字节。在此模式下不得使用碎片。接收机必须支持解交织。
The payload configuration is the same as in the AAC-lbr mode. It consists of the AU Header Section, followed by concatenated AUs. Note that Access Units are byte-aligned. The Auxiliary Section MUST be empty in the MPS-lbr mode. The 1-octet AU-header MUST provide:
有效负载配置与AAC lbr模式中的相同。它由AU头部分组成,后面是连接的AU。请注意,访问单元是字节对齐的。在MPS lbr模式下,辅助部分必须为空。1-octet AU头必须提供:
1. the size of each AAC frame, encoded as 6 bits.
1. 每个AAC帧的大小,编码为6位。
2. 2 bits of index information for computing the sequence (and hence timing) of each SpatialFrame().
2. 2位索引信息,用于计算每个空间帧()的序列(因此也用于计时)。
The concatenated AU Header Section MUST be preceded by the 16-bit AU-headers-length field.
连接的AU标头部分前面必须有16位AU标头长度字段。
In addition to the required Media format parameters, the following parameters MUST be present with fixed values: sizeLength (fixed value 6), indexLength (fixed value 2), and indexDeltaLength (fixed value 2). The parameter maxDisplacement MUST be present when interleaving. SpatialFrame()s always have a fixed duration per AU; the fixed duration MUST be signaled by the Media format parameter constantDuration.
除所需的媒体格式参数外,以下参数必须具有固定值:sizeLength(固定值6)、indexLength(固定值2)和IndexDeltagLength(固定值2)。交错时必须存在参数maxDisplacement。SpatialFrame()的每个AU始终具有固定的持续时间;固定的持续时间必须由媒体格式参数constantDuration发出信号。
The value of the "config" parameter is the hexadecimal representation of the ASC, as defined in [14496-3], with an AOT of 30 and the sacPayloadEmbedding flag set to 0.
“config”参数的值是[14496-3]中定义的ASC的十六进制表示形式,AOT为30,sacPayloadEmbedding标志设置为0。
The "profile-level-id" parameter SHALL contain a valid PLI for MPEG Surround, as specified in [14496-3].
“profile level id”参数应包含[14496-3]中规定的MPEG环绕的有效PLI。
This mode is signaled by mode=MPS-hbr. This mode supports the transportation of either one fragment of an Access Unit or one complete AU or several complete AUs. Each AU consists of a single MPEG Surround SpatialFrame(). The AUs can be variably sized and interleaved. The maximum size of a SpatialFrame() is 8191 bytes. Receivers MUST support de-interleaving.
该模式由模式=MPS hbr发出信号。此模式支持一个访问单元的一个片段或一个完整AU或多个完整AU的传输。每个AU由单个MPEG环绕空间帧()组成。AU可以是可变大小和交错的。SpatialFrame()的最大大小为8191字节。接收机必须支持解交织。
The payload configuration is the same as in the AAC-hbr mode. It consists of the AU Header Section, followed by either one SpatialFrame(), a fragment of a SpatialFrame(), or several concatenated SpatialFrame()s. Note that Access Units are byte-aligned. The Auxiliary Section MUST be empty in the MPS-hbr mode. The 2-octet AU-header MUST provide:
有效负载配置与AAC hbr模式中的相同。它由AU头部分组成,后跟一个SpatialFrame()、一个SpatialFrame()的片段或几个串联的SpatialFrame()。请注意,访问单元是字节对齐的。在MPS hbr模式下,辅助部分必须为空。2-octet AU头必须提供:
1. the size of each AAC frame, encoded as 13 bits.
1. 每个AAC帧的大小,编码为13位。
2. 3 bits of index information for computing the sequence (and hence timing) of each SpatialFrame(), i.e., the AU-Index or AU-Index-delta field.
2. 3位索引信息,用于计算每个SpatialFrame()的序列(以及由此产生的定时),即AU索引或AU索引增量字段。
Each AU-Index field MUST be coded with the value 0. The concatenated AU Header Section MUST be preceded by the 16-bit AU-headers-length field.
每个AU索引字段必须用值0编码。连接的AU标头部分前面必须有16位AU标头长度字段。
In addition to the required Media format parameters, the following parameters MUST be present with fixed values: sizeLength (fixed value 13), indexLength (fixed value 3), and indexDeltaLength (fixed value 3). The parameter maxDisplacement MUST be present when interleaving. SpatialFrame()s always have a fixed duration per AU; the fixed duration MUST be signaled by the Media format parameter constantDuration.
除所需的媒体格式参数外,以下参数必须具有固定值:sizeLength(固定值13)、indexLength(固定值3)和IndexDeltagLength(固定值3)。交错时必须存在参数maxDisplacement。SpatialFrame()的每个AU始终具有固定的持续时间;固定的持续时间必须由媒体格式参数constantDuration发出信号。
The value of the "config" parameter is the hexadecimal representation of the ASC, as defined in [14496-3], with an AOT of 30 and the sacPayloadEmbedding flag set to 0.
“config”参数的值是[14496-3]中定义的ASC的十六进制表示形式,AOT为30,sacPayloadEmbedding标志设置为0。
The "profile-level-id" parameter SHALL contain a valid PLI for MPEG Surround, as specified in [14496-3].
“profile level id”参数应包含[14496-3]中规定的MPEG环绕的有效PLI。
This memo defines additional optional format parameters to the Media type "audio" and its subtype "mpeg4-generic". These parameters SHALL only be used in combination with the AAC-lbr or AAC-hbr modes (cf. Section 3.3 of [RFC3640]) of "mpeg4-generic".
此备忘录定义了媒体类型“audio”及其子类型“mpeg4 generic”的其他可选格式参数。这些参数只能与“mpeg4通用”的AAC lbr或AAC hbr模式(参见[RFC3640]第3.3节)结合使用。
This memo defines the following additional optional parameters, which SHALL be used if MPEG Surround data is present inside the payload of an AAC elementary stream.
本备忘录定义了以下附加可选参数,如果AAC基本流的有效负载中存在MPEG环绕数据,则应使用这些参数。
MPS-profile-level-id: A decimal representation of the MPEG Surround Profile and Level indication as defined in [14496-3]. This parameter MUST be used in the capability exchange or session set-up procedure to indicate the MPEG Surround Profile and Level that the decoder must be capable of in order to decode the stream.
MPS配置文件级别id:MPEG环绕配置文件和级别指示的十进制表示,如[14496-3]中所定义。此参数必须在能力交换或会话设置过程中使用,以指示解码器必须能够解码流的MPEG环绕模式和级别。
MPS-config: A hexadecimal representation of an octet string that expresses the AudioSpecificConfig (ASC), as defined in [14496-3], for MPEG Surround. The ASC is mapped onto the hexadecimal octet string in a most significant bit (MSB)-first basis. The AOT in this ASC SHALL have the value 30. The SSC inside the ASC MUST have the sacPayloadEmbedding flag set to 1.
MPS配置:八位字节字符串的十六进制表示形式,表示[14496-3]中定义的MPEG环绕的AudioSpecificConfig(ASC)。ASC以最高有效位(MSB)为基础映射到十六进制八位字符串。本ASC中的AOT值应为30。ASC内的SSC必须将sacPayloadEmbedding标志设置为1。
This section of this memo requests the registration of the "MPS-hbr" value and the "MPS-lbr" value for the "mode" parameter of the "mpeg4- generic" media subtype within the media type "audio". The "mpeg4- generic" media subtype is defined in [RFC3640], and [RFC3640] defines a repository for the "mode" parameter. This memo registers the modes "MPS-hbr" and "MPS-lbr" to support MPEG Surround elementary streams.
本备忘录本节要求在媒体类型“音频”中注册“mpeg4-通用”媒体子类型的“模式”参数的“MPS hbr”值和“MPS lbr”值。[RFC3640]中定义了“mpeg4-通用”媒体子类型,[RFC3640]定义了“模式”参数的存储库。此备忘录注册了“MPS hbr”和“MPS lbr”模式,以支持MPEG环绕基本流。
Media type name:
媒体类型名称:
audio
音频
Subtype name:
子类型名称:
mpeg4-generic
mpeg4通用
Required parameters:
所需参数:
The "mode" parameter is required by [RFC3640]. This memo specifies the additional modes "MPS-hbr" and "MPS-lbr", in accordance with [RFC3640].
[RFC3640]需要“模式”参数。本备忘录根据[RFC3640]规定了附加模式“MPS hbr”和“MPS lbr”。
Optional parameters:
可选参数:
For the modes "AAC-hbr" and "AAC-lbr", this memo specifies the additional optional parameters "MPS-profile-level-id" and "MPS-config". See Section 4.1 for usage details.
对于模式“AAC hbr”和“AAC lbr”,本备忘录规定了附加可选参数“MPS配置文件级别id”和“MPS配置”。有关用法的详细信息,请参见第4.1节。
Optional parameters for the modes "MPS-hbr" and "MPS-lbr" may be used as specified in [RFC3640]. The optional parameters "MPS-profile-level-id" and "MPS-config" SHALL NOT be used for the modes "MPS-hbr" and "MPS-lbr".
可按照[RFC3640]中的规定使用模式“MPS hbr”和“MPS lbr”的可选参数。可选参数“MPS配置文件级别id”和“MPS配置”不得用于模式“MPS hbr”和“MPS lbr”。
It is assumed that the Media format parameters are conveyed via an SDP message, as specified in Section 4.4 of [RFC3640].
根据[RFC3640]第4.4节的规定,假设通过SDP消息传送媒体格式参数。
RTP packets using the payload format defined in this specification are subject to the security considerations discussed in the RTP specification [RFC3550], in the RTP payload format specification for MPEG-4 elementary streams [RFC3640] (which is extended with this memo), and in any applicable RTP profile. The main security considerations for the RTP packet carrying the RTP payload format defined within this memo are confidentiality, integrity, and source authenticity. Confidentiality is achieved by encryption of the RTP payload. Integrity of the RTP packets is achieved through a suitable cryptographic integrity-protection mechanism. Such a cryptographic system may also allow the authentication of the source of the payload. A suitable security mechanism for this RTP payload format should provide confidentiality, integrity protection, and source authentication capable of at least determining if an RTP packet is from a member of the RTP session.
使用本规范中定义的有效载荷格式的RTP数据包应遵守RTP规范[RFC3550]、MPEG-4基本流的RTP有效载荷格式规范[RFC3640](随本备忘录扩展)以及任何适用RTP配置文件中讨论的安全注意事项。携带本备忘录中定义的RTP有效载荷格式的RTP数据包的主要安全注意事项是机密性、完整性和源真实性。保密性是通过对RTP有效负载进行加密来实现的。RTP数据包的完整性是通过合适的密码完整性保护机制实现的。这样的密码系统还可以允许对有效载荷的源进行认证。此RTP有效载荷格式的合适安全机制应提供机密性、完整性保护和源认证,其至少能够确定RTP分组是否来自RTP会话的成员。
The AAC audio codec includes an extension mechanism to transmit extra data within a stream that is gracefully skipped by decoders that do not support this extra data. This covert channel may be used to transmit unauthorized data in an otherwise valid stream.
AAC音频编解码器包括一个扩展机制,用于在不支持此额外数据的解码器正常跳过的流中传输额外数据。此隐蔽通道可用于在其他有效流中传输未经授权的数据。
Note that the appropriate mechanism to provide security to RTP and payloads following this memo may vary. It is dependent on the application, the transport, and the signaling protocol employed. Therefore, a single mechanism is not sufficient; although, if suitable, usage of the Secure Real-time Transport Protocol (SRTP) [RFC3711] is recommended. Other mechanisms that may be used are IPsec [RFC4301] and Transport Layer Security (TLS) [RFC5246] (RTP over TCP); other alternatives may exist.
请注意,根据本备忘录为RTP和有效负载提供安全性的适当机制可能会有所不同。它取决于应用程序、传输和所采用的信令协议。因此,单一的机制是不够的;但是,如果合适,建议使用安全实时传输协议(SRTP)[RFC3711]。可使用的其他机制包括IPsec[RFC4301]和传输层安全(TLS)[RFC5246](TCP上的RTP);可能存在其他替代方案。
[14496-1] MPEG, "ISO/IEC International Standard 14496-1 - Coding of audio-visual objects, Part 1 Systems", 2004.
[14496-1]MPEG,“ISO/IEC国际标准14496-1-视听对象编码,第1部分系统”,2004年。
[14496-3] MPEG, "ISO/IEC International Standard 14496-3 - Coding of audio-visual objects, Part 3 Audio", 2009.
[14496-3]MPEG,“ISO/IEC国际标准14496-3-视听对象编码,第3部分音频”,2009年。
[23003-1] MPEG, "ISO/IEC International Standard 23003-1 - MPEG Surround (MPEG D)", 2007.
[23003-1]MPEG,“ISO/IEC国际标准23003-1-MPEG环绕(MPEG D)”,2007年。
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC2119]Bradner,S.,“RFC中用于表示需求水平的关键词”,BCP 14,RFC 2119,1997年3月。
[RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V. Jacobson, "RTP: A Transport Protocol for Real-Time Applications", STD 64, RFC 3550, July 2003.
[RFC3550]Schulzrinne,H.,Casner,S.,Frederick,R.,和V.Jacobson,“RTP:实时应用的传输协议”,STD 64,RFC 35502003年7月。
[RFC3640] van der Meer, J., Mackie, D., Swaminathan, V., Singer, D., and P. Gentric, "RTP Payload Format for Transport of MPEG-4 Elementary Streams", RFC 3640, November 2003.
[RFC3640]van der Meer,J.,Mackie,D.,Swaminathan,V.,Singer,D.,和P.Gentric,“MPEG-4基本流传输的RTP有效载荷格式”,RFC 36402003年11月。
[RFC4566] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session Description Protocol", RFC 4566, July 2006.
[RFC4566]Handley,M.,Jacobson,V.,和C.Perkins,“SDP:会话描述协议”,RFC4566,2006年7月。
[RFC5583] Schierl, T. and S. Wenger, "Signaling Media Decoding Dependency in the Session Description Protocol (SDP)", RFC 5583, July 2009.
[RFC5583]Schierl,T.和S.Wenger,“会话描述协议(SDP)中的信令媒体解码依赖性”,RFC 5583,2009年7月。
[RFC3711] Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K. Norrman, "The Secure Real-time Transport Protocol (SRTP)", RFC 3711, March 2004.
[RFC3711]Baugher,M.,McGrew,D.,Naslund,M.,Carrara,E.,和K.Norrman,“安全实时传输协议(SRTP)”,RFC 37112004年3月。
[RFC4301] Kent, S. and K. Seo, "Security Architecture for the Internet Protocol", RFC 4301, December 2005.
[RFC4301]Kent,S.和K.Seo,“互联网协议的安全架构”,RFC 43012005年12月。
[RFC5246] Dierks, T. and E. Rescorla, "The Transport Layer Security (TLS) Protocol Version 1.2", RFC 5246, August 2008.
[RFC5246]Dierks,T.和E.Rescorla,“传输层安全(TLS)协议版本1.2”,RFC 5246,2008年8月。
Authors' Addresses
作者地址
Frans de Bont Philips Electronics High Tech Campus 5 5656 AE Eindhoven, NL
弗朗斯·德·邦特飞利浦电子高科技园区5 5656埃因霍温,NL
Phone: ++31 40 2740234 EMail: frans.de.bont@philips.com
Phone: ++31 40 2740234 EMail: frans.de.bont@philips.com
Stefan Doehla Fraunhofer IIS Am Wolfmantel 33 91058 Erlangen, DE
斯特凡·多赫拉·弗劳恩霍夫(Stefan Doehla Fraunhofer)位于德国埃尔兰根市沃尔夫曼特尔33 91058号
Phone: +49 9131 776 6042 EMail: stefan.doehla@iis.fraunhofer.de
Phone: +49 9131 776 6042 EMail: stefan.doehla@iis.fraunhofer.de
Malte Schmidt Dolby Laboratories Deutschherrnstr. 15-19 90537 Nuernberg, DE
德国马尔特施密特杜比实验室。德国纽伦堡15-19 90537
Phone: +49 911 928 91 42 EMail: malte.schmidt@dolby.com
Phone: +49 911 928 91 42 EMail: malte.schmidt@dolby.com
Ralph Sperschneider Fraunhofer IIS Am Wolfmantel 33 91058 Erlangen, DE
拉尔夫·斯珀施奈德·弗劳恩霍夫,德国埃尔兰根沃尔夫曼特尔33 91058
Phone: +49 9131 776 6167 EMail: ralph.sperschneider@iis.fraunhofer.de
Phone: +49 9131 776 6167 EMail: ralph.sperschneider@iis.fraunhofer.de