Network Working Group                                      M. Westerlund
Request for Comments: 5404                                  I. Johansson
Category: Standards Track                                    Ericsson AB
                                                            January 2009
RTP Payload Format for G.719
Status of This Memo
This document specifies an Internet standards track protocol for the Internet community, and requests discussion and suggestions for improvements. Please refer to the current edition of the "Internet Official Protocol Standards" (STD 1) for the standardization state and status of this protocol. Distribution of this memo is unlimited.
Copyright Notice
Copyright (c) 2008 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document.
Abstract
This document specifies the payload format for packetization of the G.719 full-band codec encoded audio signals into the Real-time Transport Protocol (RTP). The payload format supports transmission of multiple channels, multiple frames per payload, and interleaving.
Table of Contents
   1.  Introduction
   2.  Definitions and Conventions
   3.  G.719 Description
   4.  Payload Format Capabilities
       4.1.  Multi-Rate Encoding and Rate Adaptation
       4.2.  Support for Multi-Channel Sessions
       4.3.  Robustness against Packet Loss
             4.3.1.  Use of Forward Error Correction (FEC)
             4.3.2.  Use of Frame Interleaving
   5.  Payload Format
       5.1.  RTP Header Usage
       5.2.  Payload Structure
             5.2.1.  Basic ToC Element
       5.3.  Basic Mode
       5.4.  Interleaved Mode
       5.5.  Audio Data
       5.6.  Implementation Considerations
             5.6.1.  Receiving Redundant Frames
             5.6.2.  Interleaving
             5.6.3.  Decoding Validation
   6.  Payload Examples
       6.1.  3 Mono Frames with 2 Different Bitrates
       6.2.  2 Stereo Frame-Blocks of the Same Bitrate
       6.3.  4 Mono Frames Interleaved
   7.  Payload Format Parameters
       7.1.  Media Type Definition
       7.2.  Mapping to SDP
             7.2.1.  Offer/Answer Considerations
             7.2.2.  Declarative SDP Considerations
   8.  IANA Considerations
   9.  Congestion Control
   10. Security Considerations
   11. Acknowledgements
   12. References
       12.1.  Normative References
       12.2.  Informative References
This document specifies the payload format for packetization of the G.719 full-band (FB) codec encoded audio signals into the Real-time Transport Protocol (RTP) [RFC3550]. The payload format supports transmission of multiple channels, multiple frames per payload, and packet loss robustness methods using redundancy or interleaving.
This document starts with conventions, a brief description of the codec, and the payload format's capabilities. The payload format is specified in Section 5. Examples can be found in Section 6. The media type and its mappings to the Session Description Protocol (SDP) and usage in SDP offer/answer are then specified. The document ends with considerations regarding congestion control and security.
The term "frame-block" is used in this document to describe the time-synchronized set of audio frames in a multi-channel audio session. In particular, in an N-channel session, a frame-block will contain N audio frames, one from each of the channels, and all N audio frames represent exactly the same time period.
This document contains depictions of bit fields. The most significant bit is always leftmost in the figure on each row and has the lowest enumeration. For fields that are depicted over multiple rows, the upper row is more significant than the next.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].
The ITU-T G.719 full-band codec is a transform coder based on the Modulated Lapped Transform (MLT). G.719 is a low-complexity full-bandwidth codec for conversational speech and audio coding. The encoder input and decoder output are sampled at 48 kHz. The codec covers the full audio bandwidth from 20 Hz to 20 kHz and encodes speech, music, and general audio content at rates from 32 kbit/s up to 128 kbit/s. The codec operates on 20-ms frames and has an algorithmic delay of 40 ms.
The codec provides excellent quality for speech, music, and other types of audio. Some of the applications for which this coder is suitable are:
o Real-time communications such as video conferencing and telephony
o Streaming audio
o Archival and messaging
The encoding and decoding algorithm can change the bitrate at any 20-ms frame boundary. The encoder receives the audio sampled at 48 kHz. The support of other sampling rates is possible by re-sampling the input signal to the codec's sampling rate, i.e., 48 kHz; however, this functionality is not part of the standard.
The encoding is performed on equally sized frames. For each frame, the encoder decides between two encoding modes, a transient mode and a stationary mode. The decision is based on statistics derived from the input signal. The stationary mode uses a long MLT that leads to a spectrum of 960 coefficients, while the transient encoding mode uses a short MLT (higher time resolution transform) that results in 4 spectra (4 x 240 = 960 coefficients). The encoding of the spectrum is done in two steps. First, the spectral envelope is computed, quantized, and Huffman encoded. The envelope is computed on a non-uniform frequency subdivision. From the coded spectral envelope, a weighted spectral envelope is derived and is used for bit allocation; this process is also repeated at the decoder. Thus, only the spectral envelope is transmitted. The output of the bit allocation is used to quantize the spectra. In addition, for stationary frames, the encoder estimates the noise level. The decoder applies the reverse operation upon reception of the bit stream. The non-coded coefficients (i.e., those with no bits allocated) are replaced by entries of a noise codebook that is built based on the decoded coefficients.
This payload format has a number of capabilities, and this section discusses them in some detail.
G.719 supports multi-rate encoding, which allows the encoding rate to vary on a per-frame basis. This enables support for bitrate adaptation and congestion control. The possibility to aggregate multiple audio frames into a single RTP payload is another dimension of adaptation: aggregation reduces the RTP and payload format overhead at the cost of increased delay and reduced packet-loss robustness.
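As a rough illustration of this trade-off, the sketch below computes the per-packet header overhead for a few aggregation levels. The header sizes are assumptions about a typical deployment (IPv4 without options, UDP, and an RTP header without CSRC entries or extensions), the 2-octet ToC entry corresponds to the basic mode with all frames of equal length, and the function name is hypothetical.

```python
# Hypothetical helper: header overhead when aggregating G.719 frames.
# Assumptions: IPv4 (20) + UDP (8) + RTP fixed header (12) bytes, and one
# 2-octet basic-mode ToC entry because all frames share the same length.
HEADER_BYTES = 20 + 8 + 12
TOC_ENTRY_BYTES = 2
FRAME_MS = 20  # G.719 frame duration


def overhead(frame_bytes: int, frames_per_packet: int) -> tuple[float, int]:
    """Return (overhead as a fraction of the packet, audio duration in ms)."""
    audio = frame_bytes * frames_per_packet
    extra = HEADER_BYTES + TOC_ENTRY_BYTES
    return extra / (audio + extra), FRAME_MS * frames_per_packet


for n in (1, 2, 4):
    frac, duration = overhead(80, n)  # 80-byte frames, i.e. 32 kbit/s
    print(f"{n} frame(s)/packet: {frac:.1%} overhead, {duration} ms of audio")
```

The audio duration per packet is a lower bound on the extra packetization delay, and it is also the amount of audio lost when a single packet is dropped.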
The RTP payload format defined in this document supports multi-channel audio content (e.g., stereophonic or surround audio sessions). Although the G.719 codec itself does not support encoding of multi-channel audio content into a single bit stream, it can be used to separately encode and decode each of the individual channels. To transport (or store) the separately encoded multi-channel content, the audio frames for all channels that are framed and encoded for the same 20-ms period are logically collected in a "frame-block".
At the session setup, out-of-band signaling must be used to indicate the number of channels in the payload type. The order of the audio frames within the frame-block depends on the number of the channels and follows the definition in Section 4.1 of the RTP/AVP profile [RFC3551]. When using SDP for signaling, the number of channels is specified in the rtpmap attribute.
The payload format supports several means, including forward error correction (FEC) and frame interleaving, to increase robustness against packet loss.
Generic forward error correction within RTP is defined, for example, in RFC 5109 [RFC5109]. Audio redundancy coding is defined in RFC 2198 [RFC2198]. Either scheme can be used to add redundant information to the RTP packet stream and make it more resilient to packet losses, at the expense of a higher bitrate. Please see either of the RFCs for a discussion of the implications of the higher bitrate to network congestion.
In addition to these media-unaware mechanisms, this memo specifies a G.719-specific form of audio redundancy coding, which may be beneficial in terms of packetization overhead. Conceptually, previously transmitted transport frames are aggregated together with new ones. A sliding window can be used to group the frames to be sent in each payload. However, irregular or non-consecutive patterns are also possible by inserting NO_DATA frames between primary and redundant transmissions. Figure 1 below shows an example.
   --+--------+--------+--------+--------+--------+--------+--------+--
     | f(n-2) | f(n-1) |  f(n)  | f(n+1) | f(n+2) | f(n+3) | f(n+4) |
   --+--------+--------+--------+--------+--------+--------+--------+--

     <---- p(n-1) ---->
              <----- p(n) ----->
                       <---- p(n+1) ---->
                                <---- p(n+2) ---->
                                         <---- p(n+3) ---->
                                                  <---- p(n+4) ---->
Figure 1: An example of redundant transmission
Here, each frame is retransmitted once in the following RTP payload packet. f(n-2)...f(n+4) denote a sequence of audio frames, and p(n-1)...p(n+4) a sequence of payload packets.
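The sliding window of Figure 1 can be sketched as follows. This is a minimal illustration, not part of the payload format: the function name and the `copies` parameter are hypothetical, and frames are represented by labels rather than encoded audio.

```python
# Sketch of the sliding-window redundancy in Figure 1 (names are hypothetical).
def redundant_payloads(frames, copies=1):
    """For each new frame, return the frames carried in its payload:
    up to `copies` previously sent frames followed by the new frame,
    oldest first."""
    payloads = []
    for i in range(len(frames)):
        start = max(0, i - copies)
        payloads.append(frames[start:i + 1])
    return payloads


def received_frames(payloads, lost_packets):
    """Frames still available when the packets with the given indices are lost."""
    available = set()
    for index, payload in enumerate(payloads):
        if index not in lost_packets:
            available.update(payload)
    return available


frames = ["f0", "f1", "f2", "f3"]
payloads = redundant_payloads(frames)
print(payloads)
# Losing one packet in the middle of the stream loses no frame.
print(sorted(received_frames(payloads, {1})))
```

With one redundant copy per frame, any isolated packet loss is repaired by the next packet, but two consecutive losses still cost one frame.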
The mechanism described does not really require signaling at the session setup. However, signaling has been defined to allow the sender to voluntarily limit the buffering and delay requirements. If nothing is signaled, the use of this mechanism is allowed and unbounded. For a certain timestamp, the receiver may receive multiple copies of a frame containing encoded audio data, even at different encoding rates. The cost of this scheme is bandwidth and the receiver delay necessary to allow the redundant copy to arrive.
This redundancy scheme provides a functionality similar to the one described in RFC 2198, but it works only if both original frames and redundant representations are G.719 frames. When the use of other media coding schemes is desirable, one has to resort to RFC 2198.
The sender is responsible for selecting an appropriate amount of redundancy based on feedback about the channel conditions, e.g., in the RTP Control Protocol (RTCP) [RFC3550] receiver reports. The sender is also responsible for avoiding congestion, which may be exacerbated by redundancy (see Section 9 for more details).
To decrease protocol overhead, the payload design allows several audio transport frames to be encapsulated into a single RTP packet. One of the drawbacks of such an approach is that in the case of packet loss, several consecutive frames are lost. Consecutive frame loss normally renders error concealment less efficient and usually causes clearly audible and annoying distortions in the reconstructed audio. Interleaving of transport frames can improve the audio quality in such cases by distributing the consecutive losses into a number of isolated frame losses, which are easier to conceal.
However, interleaving and bundling several frames per payload also increases end-to-end delay and sets higher buffering requirements. Therefore, interleaving is not appropriate for all use cases or devices. Streaming applications should most likely be able to exploit interleaving to improve audio quality in lossy transmission conditions.
Note that this payload design supports the use of frame interleaving as an option. The usage of this feature needs to be negotiated in the session setup.
The interleaving supported by this format is rather flexible. For example, a continuous pattern can be defined, as depicted in Figure 2.
   --+--------+--------+--------+--------+--------+--------+--------+--
     | f(n-2) | f(n-1) |  f(n)  | f(n+1) | f(n+2) | f(n+3) | f(n+4) |
   --+--------+--------+--------+--------+--------+--------+--------+--

              [ p(n)   ]
     [ p(n+1) ]                 [ p(n+1) ]
                       [ p(n+2) ]                 [ p(n+2) ]
                                         [ p(n+3) ]
                                                           [ p(n+4) ]
Figure 2: An example of interleaving pattern that has constant delay
In Figure 2, the consecutive frames, denoted f(n-2) to f(n+4), are aggregated into packets p(n) to p(n+4), each packet carrying two frames. This approach provides an interleaving pattern that allows for constant delay in both the interleaving and de-interleaving processes. The de-interleaving buffer needs to have room for at least three frames, including the one that is ready to be consumed. The storage space for three frames is needed, for example, when f(n) is the next frame to be decoded: since frame f(n) was received in packet p(n+2), which also carried frame f(n+3), both these frames are stored in the buffer. Furthermore, frame f(n+1) received in the previous packet, p(n+1), is also in the de-interleaving buffer. Note also that in this example the buffer occupancy varies: when frame f(n+1) is the next one to be decoded, there are only two frames, f(n+1) and f(n+3), in the buffer.
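The buffer occupancy described above can be checked with a small simulation. The sketch rests on assumptions that are one reading of Figure 2 rather than text from this memo: frame and packet indices are taken relative to n, packet p(n+j) carries frames f(n+2j-4) and f(n+2j-1), a packet arrives with zero network delay as soon as its newest frame has been captured (time slot 2j, in 20-ms units), and frame f(n+k) is decoded in time slot k+4. All names are hypothetical.

```python
# Simulation of the de-interleaving buffer for the pattern in Figure 2.
# Assumed mapping (relative to n): p(j) carries f(2j-4) and f(2j-1).
def packet_frames(j: int) -> tuple[int, int]:
    """Frame indices carried by packet p(n+j), relative to f(n)."""
    return (2 * j - 4, 2 * j - 1)


def buffered_frames(next_frame: int, delay_slots: int = 4) -> list[int]:
    """Frames held in the buffer when `next_frame` is about to be decoded."""
    now = next_frame + delay_slots      # current time in 20-ms slots
    newest_packet = now // 2            # p(j) is assumed to arrive at slot 2j
    held = []
    # Packets older than three positions back only carry frames that
    # have already been decoded, so they need not be examined.
    for j in range(newest_packet - 3, newest_packet + 1):
        held.extend(f for f in packet_frames(j) if f >= next_frame)
    return sorted(held)


for k in range(4):
    print(f"next frame f(n+{k}): buffer holds {buffered_frames(k)}")
```

Under these assumptions, the occupancy alternates between three and two frames, which agrees with the requirement of room for at least three frames.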
The main purpose of the payload design for G.719 is to exploit the full potential of the codec with as little overhead as possible. In the design, both basic and interleaved modes have been included, as the codec is suitable both for conversational and other low-delay applications as well as for streaming, where more delay is acceptable.
The main structural difference between the basic and interleaved modes is the extension of the table of contents entries with frame displacement fields in the interleaved mode. The basic mode supports aggregation of multiple consecutive frames in a payload. The interleaved mode supports aggregation of multiple frames that are non-consecutive in time. In both modes, it is possible to have frames encoded with different frame types in the same payload.
The payload format also supports the usage of G.719 for carrying multi-channel content using one discrete encoder per channel all using the same bitrate. In this case, a complete frame-block with data from all channels is included in the RTP payload. The data is the concatenation of all the encoded audio frames in the order specified for that number of included channels. Also, interleaving is done on complete frame-blocks rather than on individual audio frames.
The RTP timestamp corresponds to the sampling instant of the first sample encoded for the first frame-block in the packet. The timestamp clock frequency SHALL be 48000 Hz. The timestamp is also used to recover the correct decoding order of the frame-blocks.
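Since the clock runs at 48000 Hz and each frame-block covers 20 ms, consecutive frame-blocks are 960 timestamp units apart. The sketch below derives the timestamps of the frame-blocks in a basic-mode payload, where the frame-blocks are consecutive in time; the function name is hypothetical, and the modulo reflects the 32-bit RTP timestamp field.

```python
# Timestamps of consecutive frame-blocks in a basic-mode payload.
CLOCK_HZ = 48_000
FRAME_MS = 20
TICKS_PER_FRAME_BLOCK = CLOCK_HZ * FRAME_MS // 1000  # 960 ticks


def basic_mode_timestamps(rtp_timestamp: int, num_frame_blocks: int) -> list[int]:
    """Timestamp of each frame-block, starting from the RTP header timestamp.

    The RTP timestamp is a 32-bit field, so the values wrap modulo 2**32.
    """
    return [
        (rtp_timestamp + k * TICKS_PER_FRAME_BLOCK) % 2**32
        for k in range(num_frame_blocks)
    ]


print(basic_mode_timestamps(0, 3))
```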
The RTP header marker bit (M) SHALL be set to 1 whenever the first frame-block carried in the packet is the first frame-block in a talkspurt (see definition of the talkspurt in Section 4.1 of [RFC3551]). For all other packets, the marker bit SHALL be set to zero (M=0).
The assignment of an RTP payload type for the format defined in this memo is outside the scope of this document. The RTP profiles in use currently mandate binding the payload type dynamically for this payload format. This is basically necessary because the payload type expresses the configuration of the payload itself, i.e., basic or interleaved mode, and the number of channels carried.
The remaining RTP header fields are used as specified in [RFC3550].
The payload consists of one or more table of contents (ToC) entries followed by the audio data corresponding to the ToC entries. The following sections describe both the basic mode and the interleaved mode. Each ToC entry MUST be padded to a byte boundary to ensure octet alignment. The rules regarding maximum payload size given in Section 3.2 of [RFC5405] SHOULD be followed.
All the different formats and modes in this document use a common basic ToC that may be extended in the different options described below.
    0 1 2 3 4 5 6 7
   +-+-+-+-+-+-+-+-+
   |F|    L    |R|R|
   +-+-+-+-+-+-+-+-+

Figure 3: Basic ToC element
F (1 bit): If set to 1, indicates that this ToC entry is followed by another ToC entry; if set to zero, indicates that this ToC entry is the last one in the ToC.
L (5 bits): A field that gives the frame length of each individual frame within the frame-block.
   L        length(bytes)
   ============================
   0        0 NO_DATA
   1-7      N/A (reserved)
   8-22     80+10*(L-8)
   23-27    240+20*(L-23)
   28-31    N/A (reserved)
Figure 4: How to map L values to frame lengths
L=0 (NO_DATA) is used to indicate an empty frame, which is useful if frames are missing (e.g., at re-packetization), or to insert gaps when sending redundant frames together with primary frames in the same payload. The value ranges [1..7] and [28..31], inclusive, are reserved for future use in this document version; if these values occur in a ToC, the entire packet SHOULD be treated as invalid and discarded. A few examples are given below where the frame size and the corresponding codec bitrate are computed based on the value L.
   L     Bytes   Codec Bitrate(kbps)
   ===================================
   8      80       32
   9      90       36
   10    100       40
   12    120       48
   16    160       64
   22    220       88
   23    240       96
   25    280      112
   27    320      128
Figure 5: Examples of L values and corresponding frame lengths
This encoding yields a granularity of 4 kbps between 32 and 88 kbps and a granularity of 8 kbps between 88 and 128 kbps with a defined range of 32-128 kbps for the codec data.
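The mapping in Figure 4 and the bitrates in Figure 5 can be reproduced with a few lines of code. The sketch below is illustrative only: the function names are hypothetical, and raising an exception for reserved values is merely one way for a caller to learn that the packet should be discarded.

```python
# Mapping of the 5-bit L field to frame length and codec bitrate.
FRAME_MS = 20  # G.719 frame duration


def frame_length(l_value: int) -> int:
    """Frame length in bytes for a given L value (Figure 4)."""
    if l_value == 0:
        return 0                              # NO_DATA
    if 8 <= l_value <= 22:
        return 80 + 10 * (l_value - 8)
    if 23 <= l_value <= 27:
        return 240 + 20 * (l_value - 23)
    raise ValueError(f"reserved or out-of-range L value: {l_value}")


def bitrate_kbps(l_value: int) -> int:
    """Codec bitrate in kbit/s: bits per frame divided by 20 ms."""
    return frame_length(l_value) * 8 // FRAME_MS


for l_value in (8, 9, 16, 22, 23, 27):
    print(l_value, frame_length(l_value), bitrate_kbps(l_value))
```

The integer division is exact because every defined frame length is a multiple of 10 bytes.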
R (2 bits): Reserved bits. SHALL be set to zero on sending and SHALL be ignored on reception.
The basic ToC element shown in Figure 3 is followed by a 1-octet field for the number of frame-blocks (#frames) to form the ToC entry. The frame-blocks field tells how many frame-blocks of the same length the ToC entry relates to.
    0 1 2 3 4 5 6 7
   +-+-+-+-+-+-+-+-+
   |    #frames    |
   +-+-+-+-+-+-+-+-+
Figure 6: Number of frame-blocks field
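A receiver might walk the basic-mode ToC as sketched below. This is a minimal illustration under stated assumptions: the function name and the returned tuple layout are hypothetical, the input is the RTP payload with any RTP padding already removed, and reserved L values cause the whole packet to be rejected, in line with the SHOULD above.

```python
# Sketch of basic-mode ToC parsing (names and return layout are hypothetical).
def parse_basic_toc(payload: bytes) -> tuple[list[tuple[int, int]], int]:
    """Return ([(L, number_of_frame_blocks), ...], offset_of_audio_data)."""
    entries = []
    pos = 0
    while True:
        if pos + 2 > len(payload):
            raise ValueError("truncated ToC")
        first, count = payload[pos], payload[pos + 1]
        pos += 2
        follow = first >> 7              # F: another ToC entry follows
        l_value = (first >> 2) & 0x1F    # L: 5-bit frame length code
        # The two reserved bits (first & 0x03) are ignored on reception.
        if 1 <= l_value <= 7 or l_value >= 28:
            raise ValueError("reserved L value, packet should be discarded")
        entries.append((l_value, count))
        if not follow:
            return entries, pos


# Hypothetical ToC: two frame-blocks with L=8, then one frame-block with L=9.
toc = bytes([0x80 | (8 << 2), 2, 9 << 2, 1])
print(parse_basic_toc(toc + bytes(250)))
```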
The basic ToC is followed by a 1-octet field for the number of frame-blocks (#frames) and then the DIS fields to form a ToC entry in interleaved mode. The frame-blocks field tells how many frame-blocks of the same length the ToC relates to. The DIS fields, one for each frame-block indicated by the #frames field, express the interleaving distance between audio frames carried in the payload. If necessary to achieve octet alignment, a 4-bit padding is added.
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    #frames    |  DIS1 |  ...  |  DISi |  ...  |  DISn |  Padd |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 7: Number of frame-block + interleave fields
DIS1...DISn (4 bits each): A list of n (n=#frames) displacement fields indicating the displacement of the i-th (i=1..n) audio frame-block relative to the preceding frame-block in the payload, in units of 20-ms long audio frame-blocks. The 4-bit unsigned integer displacement values may be between zero and 15, indicating the number of audio frame-blocks in decoding order between the (i-1)-th and the i-th frame-block in the payload. Note that for the first ToC entry of the payload, the value of DIS1 is meaningless. It SHALL be set to zero by a sender and SHALL be ignored by a receiver. This frame-block's location in the decoding order is uniquely defined by the RTP timestamp. Note that for subsequent ToC entries, DIS1 indicates the number of frame-blocks between the last frame-block of the previous group and the first frame-block of this group.
Padd (4 bits): To ensure octet alignment, 4 padding bits SHALL be included at the end of the ToC entry in case there is an odd number of frame-blocks in the group referenced by this ToC entry. These bits SHALL be set to zero and SHALL be ignored by the receiver. If a group containing an even number of frames is referenced by this ToC entry, these padding bits SHALL NOT be included in the payload.
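The layout of an interleaved-mode ToC entry can be illustrated with the following sketch. It assumes that the DIS fields are packed two per octet with the earlier field in the more significant half, following the bit-order convention of this document; the function name and the dictionary keys are hypothetical. The sketch only extracts the fields and does not interpret the displacement values.

```python
# Sketch of parsing one interleaved-mode ToC entry (names are hypothetical).
def parse_interleaved_entry(buf: bytes, pos: int = 0) -> tuple[dict, int]:
    """Return (entry, position of the next ToC entry or of the audio data)."""
    if pos + 2 > len(buf):
        raise ValueError("truncated ToC entry")
    first, n = buf[pos], buf[pos + 1]
    dis_octets = (n + 1) // 2            # 4 bits each, padded if n is odd
    raw = buf[pos + 2:pos + 2 + dis_octets]
    if len(raw) < dis_octets:
        raise ValueError("truncated DIS fields")
    dis = []
    for i in range(n):
        octet = raw[i // 2]
        # Even positions use the high nibble, odd positions the low nibble.
        dis.append(octet >> 4 if i % 2 == 0 else octet & 0x0F)
    entry = {"follow": bool(first >> 7), "L": (first >> 2) & 0x1F, "dis": dis}
    return entry, pos + 2 + dis_octets


# Hypothetical entry: F=0, L=8, three frame-blocks, DIS values 0, 2, 2,
# followed by 4 padding bits because the number of frame-blocks is odd.
print(parse_interleaved_entry(bytes([8 << 2, 3, 0x02, 0x20])))
```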
The audio data part follows the table of contents. All the octets comprising an audio frame SHALL be appended to the payload as a unit. For each frame-block, the audio frames are concatenated in the order indicated by the table in Section 4.1 of [RFC3551] for the number of channels configured for the payload type in use. So the first channel (leftmost) indicated comes first followed by the next channel. The audio frame-blocks are packetized in increasing timestamp order within each group of frame-blocks (per ToC entry), i.e., oldest frame-block first. The groups of frame-blocks are packetized in the same order as their corresponding ToC entries.
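The resulting layout can be sketched as a routine that cuts the audio data part into frame-blocks. The names are hypothetical, the per-group frame lengths are assumed to have been derived from the ToC already, and the final length check assumes that any RTP padding has been removed beforehand.

```python
# Sketch of splitting the audio data part into frame-blocks (hypothetical names).
def split_audio(audio: bytes, groups: list[tuple[int, int]],
                channels: int) -> list[list[bytes]]:
    """Split audio data into frame-blocks.

    groups:   one (frame_bytes, num_frame_blocks) pair per ToC entry, in order.
    channels: number of channels configured for the payload type.
    Returns a list of frame-blocks, each a list with one frame per channel.
    """
    blocks = []
    pos = 0
    for frame_bytes, count in groups:
        for _ in range(count):
            block = []
            for _ in range(channels):
                frame = audio[pos:pos + frame_bytes]
                if len(frame) != frame_bytes:
                    raise ValueError("audio data shorter than the ToC indicates")
                block.append(frame)
                pos += frame_bytes
            blocks.append(block)
    if pos != len(audio):
        raise ValueError("audio data longer than the ToC indicates")
    return blocks


# Two stereo frame-blocks of 80-byte frames: 2 * 2 * 80 = 320 bytes of audio.
audio = bytes(range(256)) + bytes(64)
blocks = split_audio(audio, [(80, 2)], channels=2)
print(len(blocks), [len(frame) for frame in blocks[0]])
```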
The audio frames are specified in ITU recommendation [ITU-T-G719].
The G.719 bit stream is split into a sequence of octets and transmitted in order from the leftmost (most significant (MSB)) bit to the rightmost (least significant (LSB)) bit.
An application implementing this payload format MUST understand all the payload parameters specified in this specification. Any mapping of the parameters to a signaling protocol MUST support all parameters. So an implementation of this payload format in an application using SDP is required to understand all the payload parameters in their SDP-mapped form. This requirement ensures that an implementation always can decide whether it is capable of communicating when the communicating entities support this version of the specification.
Basic mode SHALL be implemented and the interleaved mode SHOULD be implemented. The implementation burden of both is rather small, and supporting both ensures interoperability. However, interleaving is not mandated as it has limited applicability for conversational applications that require tight delay boundaries.
The reception of redundant audio frames, i.e., more than one audio frame from the same source for the same time slot, MUST be supported by the implementation. In the case that the receiver gets multiple audio frames in different bitrates for the same time slot, it is RECOMMENDED that the receiver keeps the one with the highest bitrate.
该实现必须支持接收冗余音频帧,即在同一时隙内从同一来源接收多个音频帧。如果接收器在同一时隙中以不同比特率获得多个音频帧,建议接收器保留具有最高比特率的音频帧。
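The recommendation above can be sketched as follows. This is a minimal illustration, not part of the specification: the function name and the (timestamp, bytes) representation are assumptions. Frame length is used as a stand-in for bitrate, which holds because every frame covers the same time span (80 bytes at 32 kbps, 120 bytes at 48 kbps).

```python
def keep_highest_bitrate(frames):
    """Return one frame per RTP timestamp, preferring the longest encoding.

    `frames` is an iterable of (rtp_timestamp, frame_bytes) pairs that all
    come from the same source (SSRC). Hypothetical helper for illustration.
    """
    best = {}
    for timestamp, data in frames:
        current = best.get(timestamp)
        # A longer frame for the same time slot means a higher bitrate.
        if current is None or len(data) > len(current):
            best[timestamp] = data
    return best


received = [
    (960, bytes(80)),    # primary copy at 32 kbps
    (960, bytes(120)),   # redundant copy of the same time slot at 48 kbps
    (1920, bytes(80)),
]
selected = keep_highest_bitrate(received)
```

A receiver that prefers the first copy to arrive (for lower delay) would still conform, since keeping the highest bitrate is only RECOMMENDED.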
The use of interleaving requires further considerations. As presented in the example in Section 4.3.2, a given interleaving pattern requires a certain amount of the de-interleaving buffer. This buffer space, expressed in a number of transport frame slots, is indicated by the "interleaving" media type parameter. The number of frame slots needed can be converted into actual memory requirements by considering the 320 bytes per frame used by the highest bitrate of G.719.
交织的使用需要进一步考虑。如第4.3.2节中的示例所示,给定的交织模式需要一定数量的解交织缓冲器。该缓冲区空间以传输帧时隙的数量表示,由“交错”媒体类型参数表示。通过考虑G.719的最高比特率所使用的每帧320字节,可以将所需的帧槽数量转换为实际内存需求。
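As a rough sketch of the conversion described above, the number of slots can be multiplied by the 320-byte worst-case frame size. The function name is hypothetical, and multiplying by the channel count is an assumption based on the "interleaving" parameter counting frame-block slots.

```python
MAX_FRAME_BYTES = 320  # bytes per frame at the highest G.719 bitrate (from the text)


def deinterleave_buffer_bytes(interleaving, channels=1):
    """Worst-case memory, in bytes, for a de-interleaving buffer.

    `interleaving` is the value of the "interleaving" media type parameter
    (number of slots). Scaling by `channels` is an assumption of this
    sketch: one slot is taken to hold one frame per channel.
    """
    if interleaving < 1:
        raise ValueError("interleaving must be greater than zero")
    if channels < 1:
        raise ValueError("channels must be at least 1")
    return interleaving * channels * MAX_FRAME_BYTES
```

For instance, a buffer declared with 13 slots needs at most 13 * 320 = 4160 bytes for a mono stream.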
The information about the frame buffer size is not always sufficient to determine when it is appropriate to start consuming frames from the interleaving buffer. Additional information is needed when the interleaving pattern changes. The "int-delay" media type parameter is defined to convey this information. It allows a sender to indicate the minimal media time that needs to be present in the buffer before the decoder can start consuming frames from the buffer. Because the sender has full control over the interleaving pattern, it can calculate this value.
关于帧缓冲区大小的信息并不总是足以确定何时开始使用交织缓冲区中的帧是合适的。当交织模式改变时,需要额外的信息。定义“int-delay”媒体类型参数以传递此信息。它允许发送方指示在解码器开始使用缓冲区中的帧之前,缓冲区中需要存在的最小媒体时间。因为发送方完全控制交织模式,所以它可以计算这个值。
In certain cases (for example, if joining a multicast session with interleaving mid-session), a receiver may initially receive only part of the packets in the interleaving pattern. This initial partial reception (in frame sequence order) of frames can yield too few frames for acceptable quality from the audio decoding. This problem also arises when using encryption for access control, and the receiver does not have the previous key. Although the G.719 is robust and thus tolerant to a high random frame erasure rate, it would have difficulties handling consecutive frame losses at startup. Thus, some special implementation considerations are described.
在某些情况下(例如,在会话中途加入使用交织的多播会话),接收机最初可能仅接收交织模式中的部分分组。帧的这种初始部分接收(按帧序列顺序)可能会产生太少的帧,无法从音频解码获得可接受的质量。当使用加密进行访问控制,并且接收方没有以前的密钥时,也会出现此问题。虽然G.719是健壮的,因此能够承受高随机帧擦除率,但在启动时处理连续帧丢失会有困难。因此,描述了一些特殊的实现注意事项。
In order to handle this type of startup efficiently, decoding can start provided that:
为了有效地处理此类启动,可以启动解码,前提是:
1. There are at least two consecutive frames available.
1. 至少有两个连续帧可用。
2. At least half of the frames are available in the time period between the point where decoding was planned to start and the most forward frame received.
2. 在计划开始解码的时间点与已接收到的最靠前的帧之间的时间段内,至少有一半的帧可用。
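The two startup conditions can be sketched as a simple check. This is one possible reading, not a normative algorithm: the function name, the use of integer frame indices, and the choice to accept a consecutive pair anywhere in the window are all assumptions.

```python
def can_start_decoding(received, start, latest):
    """Check the two startup conditions over frame indices start..latest.

    `received` is a set of frame indices currently in the buffer, `start`
    is the index where decoding was planned to begin, and `latest` is the
    most forward frame index received so far. All names are hypothetical.
    """
    if latest < start:
        return False
    available = [index in received for index in range(start, latest + 1)]
    # Condition 1: at least two consecutive frames are available.
    has_consecutive_pair = any(
        a and b for a, b in zip(available, available[1:])
    )
    # Condition 2: at least half of the frames in the period are available.
    at_least_half = 2 * sum(available) >= len(available)
    return has_consecutive_pair and at_least_half
```

With frames 13, 14, and 18 received and decoding planned to start at frame 13, three of the six frames in the window are present and two of them are adjacent, so the check passes.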
After receiving a number of packets, in the worst case as many packets as the interleaving pattern covers, the previously described effects disappear and normal decoding is resumed. Similar issues arise when a receiver leaves a session or has lost access to the stream. If the receiver leaves the session, this would be a minor issue since playout is normally stopped. The sender can avoid this type of problem in many sessions by starting and ending interleaving patterns correctly when risks of losses occur. One such example is a key-change done for access control to encrypted streams. If only some keys are provided to clients and there is a risk they will receive content for which they do not have the key, it is recommended that interleaving patterns do not overlap key changes.
在接收到多个分组之后,在最坏的情况下,与交织模式覆盖的分组一样多的分组,先前描述的效果消失并且恢复正常解码。当接收器离开会话或失去对流的访问时,也会出现类似的问题。如果接收器离开会话,这将是一个小问题,因为播放通常会停止。发送方可以通过在发生丢失风险时正确启动和结束交织模式,在许多会话中避免此类问题。一个这样的例子是为对加密流进行访问控制而进行的密钥更改。如果只向客户端提供了一些密钥,并且存在接收到没有密钥的内容的风险,建议交错模式不要与密钥更改重叠。
If the receiver finds a mismatch between the size of a received payload and the size indicated by the ToC of the payload, the receiver SHOULD discard the packet. This is recommended because decoding a frame parsed from a payload based on erroneous ToC data could severely degrade the audio quality.
如果接收器发现接收到的有效载荷的大小与有效载荷的ToC指示的大小不匹配,则接收器应丢弃该数据包。建议这样做,因为根据错误的ToC数据对从有效负载解析的帧进行解码可能会严重降低音频质量。
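A minimal sketch of this consistency check for basic mode is shown below. It assumes that each ToC entry occupies two octets (as in the payload examples of this document) and that the ToC has already been parsed into (frame length, number of frame-blocks) pairs. The function names are hypothetical.

```python
def expected_payload_size(toc_entries, channels=1):
    """Payload size in octets implied by a basic-mode ToC.

    `toc_entries` is a list of (frame_bytes, n_frame_blocks) pairs.
    Each frame-block carries one frame per channel.
    """
    toc_octets = 2 * len(toc_entries)  # assumption: 2 octets per ToC entry
    audio_octets = sum(
        frame_bytes * n_blocks * channels
        for frame_bytes, n_blocks in toc_entries
    )
    return toc_octets + audio_octets


def payload_is_consistent(payload, toc_entries, channels=1):
    """True if the received payload length matches what the ToC indicates."""
    return len(payload) == expected_payload_size(toc_entries, channels)
```

A receiver following the recommendation would discard any packet for which `payload_is_consistent` returns False. Interleaved mode adds displacement fields to the ToC, which this sketch does not count.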
A few examples to highlight the payload format follow.
下面是几个突出显示有效负载格式的示例。
The first example is a payload consisting of 3 mono frames where the first 2 frames correspond to a bitrate of 32 kbps (80 bytes/frame) and the last is 48 kbps (120 bytes/frame).
第一个示例是由3个单帧组成的有效负载,其中前2个帧对应于32 kbps(80字节/帧)的比特率,最后一个是48 kbps(120字节/帧)。
The first 32 bits are ToC fields. Bit 0 is '1' as another ToC field follows. Bits 1..5 are '01000' = 80 bytes/frame. Bits 8..15 are '00000010' = 2 frame-blocks with 80 bytes/frame. Bit 16 is '0', no more ToC follows. Bits 17..21 are '01100' = 120 bytes/frame. Bits 24..31 are '00000001' = 1 frame-block with 120 bytes/frame.
前32位是ToC字段。第0位为“1”,后面是另一个ToC字段。位1..5为“01000”=80字节/帧。位8..15是'00000010'=2个帧块,每帧80字节。第16位为“0”,后面不再有ToC。位17..21为“01100”=120字节/帧。位24..31为'00000001'=1帧块,每帧120字节。
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |1|0 1 0 0 0|0 0|0 0 0 0 0 0 1 0|0|0 1 1 0 0|0 0|0 0 0 0 0 0 0 1|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |d(0)                       frame 1                             |
   .                                                               .
   |                                                         d(639)|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |d(0)                       frame 2                             |
   .                                                               .
   |                                                         d(639)|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |d(0)                       frame 3                             |
   .                                                               .
   |                                                         d(959)|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
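The bit layout of this first example can be decoded with a few shifts and masks. The sketch below is illustrative only: the function names are hypothetical, and the length table holds just the two codes that the examples in this section confirm ('01000' = 80 bytes, '01100' = 120 bytes). The full table is the one defined with the basic ToC element.

```python
# Length codes confirmed by the examples in this section only.
KNOWN_LENGTHS = {8: 80, 12: 120}


def frame_length_bytes(length_code):
    """Map a 5-bit length code to a frame size, for the known codes only."""
    if length_code not in KNOWN_LENGTHS:
        raise ValueError("length code not covered by this sketch")
    return KNOWN_LENGTHS[length_code]


def parse_basic_toc(payload):
    """Parse basic-mode ToC entries; return (entries, audio_offset).

    Each entry is (length_code, n_frame_blocks). Bit 0 of an entry says
    whether another entry follows, bits 1..5 hold the length code, and
    bits 8..15 hold the number of frame-blocks.
    """
    entries = []
    pos = 0
    while True:
        if pos + 2 > len(payload):
            raise ValueError("truncated table of contents")
        first, n_blocks = payload[pos], payload[pos + 1]
        entries.append(((first >> 2) & 0x1F, n_blocks))
        pos += 2
        if not first & 0x80:  # bit 0 cleared: no more ToC entries
            return entries, pos


# ToC of the example above, followed by 2 * 80 + 120 octets of audio data.
example = bytes([0xA0, 0x02, 0x30, 0x01]) + bytes(280)
entries, audio_offset = parse_basic_toc(example)
```

Here `entries` holds one pair per ToC entry, and `audio_offset` is the index of the first audio data octet.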
The second example is a payload consisting of 2 stereo frames that correspond to a bitrate of 32 kbps (80 bytes/frame) per channel. The receiver calculates the number of frames in the audio block by multiplying the value of the "channels" parameter (2) with the #frames field value (2) to derive that there are 4 audio frames in the payload.
第二个示例是由2个立体声帧组成的有效负载,对应于每个通道32 kbps(80字节/帧)的比特率。接收器通过将“通道”参数(2)的值乘以#frames字段值(2)来计算音频块中的帧数,从而得出有效负载中有4个音频帧。
The first 16 bits are the ToC field. Bit 0 is '0' as no ToC field follows. Bits 1..5 are '01000' = 80 bytes/frame. Bits 8..15 are '00000010' = 2 frame-blocks with 80 bytes/frame.
前16位是ToC字段。位0为“0”,因为后面没有ToC字段。位1..5为“01000”=80字节/帧。位8..15是'00000010'=2个帧块,每帧80字节。
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0|0 1 0 0 0|0 0|0 0 0 0 0 0 1 0|  d(0)    frame 1 left ch.     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   .                                                               .
   |                         d(639)|  d(0)    frame 1 right ch.    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   .                                                               .
   |                         d(639)|  d(0)    frame 2 left ch.     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   .                                                               .
   |                         d(639)|  d(0)    frame 2 right ch.    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   .                                                               .
   |                         d(639)|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
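The channel ordering in this stereo example can be sketched as a splitting routine: frame-blocks appear oldest first, and within each frame-block the channels appear in their configured order (left, then right). The function name and return shape are assumptions made for illustration.

```python
def split_frame_blocks(audio, frame_bytes, n_blocks, channels):
    """Split the audio data of one ToC entry into per-channel frames.

    Returns a list of frame-blocks, oldest first; each frame-block is a
    list with one frame per channel, in channel order.
    """
    if len(audio) != frame_bytes * n_blocks * channels:
        raise ValueError("audio data size does not match the ToC entry")
    blocks = []
    pos = 0
    for _ in range(n_blocks):
        block = []
        for _ in range(channels):
            block.append(audio[pos:pos + frame_bytes])
            pos += frame_bytes
        blocks.append(block)
    return blocks


# Four 80-byte frames, filled with marker values 1..4 for illustration.
audio = b"".join(bytes([marker]) * 80 for marker in (1, 2, 3, 4))
blocks = split_frame_blocks(audio, frame_bytes=80, n_blocks=2, channels=2)
```

With this input, `blocks[0]` holds frame 1 (left, right) and `blocks[1]` holds frame 2 (left, right), giving the 4 audio frames derived in the text.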
The third example is a payload consisting of 4 mono frames that correspond to a bitrate of 32 kbps (80 bytes/frame) interleaved. A pattern of interleaving for constant delay when aggregating 4 frames is used in the example below. The actual packet illustrated is packet n, while the previous and following packets' frame-block content is shown to illustrate the pattern.
第三个示例是由4个单帧组成的有效负载,对应于32 kbps(80字节/帧)的交织比特率。在下面的示例中,使用了在聚合4帧时用于恒定延迟的交织模式。所示的实际分组是分组n,而示出先前和后续分组的帧块内容以示出该模式。
   Packet n-3:  1,  6, 11, 16
   Packet n-2:  5, 10, 15, 20
   Packet n-1:  9, 14, 19, 24
   Packet n:   13, 18, 23, 28
   Packet n+1: 17, 22, 27, 32
   Packet n+2: 21, 26, 31, 36
The first 32 bits are the ToC field. Bit 0 is '0' as there is no ToC field following. Bits 1..5 are '01000' = 80 bytes/frame. Bits 8..15 are '00000100' = 4 frame-blocks with 80 bytes/frame. Bits 16..19 are '0000' = DIS1 (0). Bits 20..23 are '0100' = DIS2 (4). Bits 24..27 are '0100' = DIS3 (4). Bits 28..31 are '0100' = DIS4 (4).
前32位是ToC字段。位0为“0”,因为后面没有ToC字段。位1..5为“01000”=80字节/帧。位8..15是“00000100”=4个帧块,每帧80字节。位16..19是'0000'=DIS1(0)。位20..23是'0100'=DIS2(4)。位24..27为'0100'=DIS3(4)。位28..31是'0100'=DIS4(4)。
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0|0 1 0 0 0|0 0|0 0 0 0 0 1 0 0|0 0 0 0|0 1 0 0|0 1 0 0|0 1 0 0|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  d(0)                     frame 13                            |
   .                                                               .
   |                                                         d(639)|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  d(0)                     frame 18                            |
   .                                                               .
   |                                                         d(639)|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  d(0)                     frame 23                            |
   .                                                               .
   |                                                         d(639)|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  d(0)                     frame 28                            |
   .                                                               .
   |                                                         d(639)|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
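The interleaved ToC of this third example can be decoded as sketched below. Several points are assumptions inferred from the example rather than taken from the normative definition of interleaved mode: that DIS1 is an offset from the frame identified by the RTP timestamp, that each later DIS value counts the frame-blocks skipped since the previous one, and that an odd number of 4-bit DIS fields is padded to an octet boundary. The function names are hypothetical.

```python
def parse_interleaved_toc_element(buf, pos=0):
    """Parse one interleaved-mode ToC element starting at `pos`.

    Returns (more_follows, length_code, dis_values, next_pos).
    """
    if pos + 2 > len(buf):
        raise ValueError("truncated ToC element")
    first, n_blocks = buf[pos], buf[pos + 1]
    pos += 2
    dis_octets = (n_blocks + 1) // 2  # assumed padding to a whole octet
    if pos + dis_octets > len(buf):
        raise ValueError("truncated DIS fields")
    dis = []
    for i in range(n_blocks):
        octet = buf[pos + i // 2]
        # Even-numbered fields sit in the high nibble, odd ones in the low.
        dis.append(octet >> 4 if i % 2 == 0 else octet & 0x0F)
    return bool(first & 0x80), (first >> 2) & 0x1F, dis, pos + dis_octets


def frame_numbers(base_frame, dis):
    """Rebuild frame numbers from DIS values (assumed semantics, see above)."""
    numbers = [base_frame + dis[0]]
    for displacement in dis[1:]:
        numbers.append(numbers[-1] + displacement + 1)
    return numbers


toc = bytes([0x20, 0x04, 0x04, 0x44])  # the 32 ToC bits of packet n
more, length_code, dis, offset = parse_interleaved_toc_element(toc)
```

Under these assumptions, `frame_numbers(13, dis)` reproduces the frame-block numbers listed for packet n.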
This RTP payload format is identified using the media type audio/G719, which is registered in accordance with [RFC4855] and uses the template of [RFC4288].
该RTP有效负载格式使用媒体类型audio/G719标识,该媒体类型根据[RFC4855]注册,并使用[RFC4288]模板。
The media type for the G.719 codec is allocated from the IETF tree since G.719 has the potential to become a widely used audio codec in general Voice over IP (VoIP), teleconferencing, and streaming applications. This media type registration covers real-time transfer via RTP.
G.719编解码器的媒体类型是从IETF树中分配的,因为G.719有可能成为通用IP语音(VoIP)、电话会议和流媒体应用中广泛使用的音频编解码器。这种媒体类型注册包括通过RTP进行实时传输。
Note, any unspecified parameter MUST be ignored by the receiver to ensure that additional parameters can be added in any future revision of this specification.
注意,接收器必须忽略任何未指定的参数,以确保在本规范的任何未来版本中可以添加其他参数。
Type name: audio
类型名称:音频
Subtype name: G719
子类型名称:G719
Required parameters: none
所需参数:无
Optional parameters:
可选参数:
interleaving: Indicates that interleaved mode SHALL be used for the payload. The parameter specifies the number of frame-block slots available in a de-interleaving buffer (including the frame that is ready to be consumed) for each source. Its value is equal to one plus the maximum number of frames that can precede any frame in transmission order and follow the frame in RTP timestamp order. The value MUST be greater than zero. If this parameter is not present, interleaved mode SHALL NOT be used.
交织:表示有效负载应使用交织模式。该参数指定每个源的解交错缓冲区(包括准备使用的帧)中可用的帧块插槽数。其值等于1加上传输顺序中任何帧之前和RTP时间戳顺序中帧之后的最大帧数。该值必须大于零。如果该参数不存在,则不应使用交织模式。
int-delay: The minimal media time delay in milliseconds that is needed to avoid underrun in the de-interleaving buffer before starting decoding, i.e., the difference in RTP timestamp ticks between the earliest and latest audio frame present in the de-interleaving buffer expressed in milliseconds. The value is a stream property and provided per source. The allowed values are zero to the largest value expressible by an unsigned 16-bit integer (65535). Please note that in practice, the largest value that can be used is equal to the declared size of the interleaving buffer of the receiver. If the value for some reason is larger than the receiver buffer declared by or for the receiver, this value defaults to the size of the receiver buffer. For sources for which this value hasn't been provided, the value defaults to the size of the receiver buffer. The format is a comma-separated list of synchronization source (SSRC) ":" delay in ms pairs, which in ABNF [RFC5234] is expressed as:
int delay:在开始解码之前,为避免解交错缓冲区中的欠运行所需的最小媒体时间延迟(以毫秒为单位),即,解交错缓冲区中存在的最早和最新音频帧之间的RTP时间戳刻度差(以毫秒为单位)。该值是一个流属性,按源提供。允许的值为0到可由无符号16位整数(65535)表示的最大值。请注意,在实践中,可以使用的最大值等于接收机交错缓冲区的声明大小。如果由于某种原因,该值大于由接收器或为接收器声明的接收器缓冲区,则该值默认为接收器缓冲区的大小。对于尚未提供此值的源,该值默认为接收器缓冲区的大小。格式为以逗号分隔的同步源列表(SSRC)“:”毫秒对延迟,在ABNF[RFC5234]中表示为:
   int-delay = "int-delay:" source-delay *("," source-delay)
   source-delay = SSRC ":" delay-value
   SSRC = 1*8HEXDIG ; The 32-bit SSRC encoded in hex format
   delay-value = 1*5DIGIT ; The delay value in milliseconds

   Example: int-delay=ABCD1234:1000,4321DCB:640
NOTE: No white space allowed in the parameter before the end of all the value pairs
注意:在所有值对结束之前,参数中不允许有空格
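A receiver might parse the value of this parameter as sketched below. The function name is hypothetical, and the input is assumed to be the text after "int-delay=" as it appears in the example above. The regular expression mirrors the SSRC and delay-value rules, and the range check reflects the 16-bit limit stated in the parameter description.

```python
import re

# One source-delay pair: 1 to 8 hex digits, ":", 1 to 5 decimal digits.
_PAIR = re.compile(r"([0-9A-Fa-f]{1,8}):([0-9]{1,5})")


def parse_int_delay(value):
    """Parse an "int-delay" value into a {ssrc: delay_ms} dictionary."""
    delays = {}
    for item in value.split(","):
        match = _PAIR.fullmatch(item)
        if match is None:
            # Also rejects white space, which is not allowed in the value.
            raise ValueError(f"malformed source-delay pair: {item!r}")
        delay_ms = int(match.group(2))
        if delay_ms > 65535:
            raise ValueError("delay exceeds the unsigned 16-bit range")
        delays[int(match.group(1), 16)] = delay_ms
    return delays


delays = parse_int_delay("ABCD1234:1000,4321DCB:640")
```

Sources that do not appear in the returned dictionary fall back to the default described above, namely the size of the receiver buffer.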
max-red: The maximum duration in milliseconds that elapses between the primary (first) transmission of a frame and any redundant transmission that the sender will use. This parameter allows a receiver to have a bounded delay when redundancy is used. Allowed values are between zero (no redundancy will be used) and 65535. If the parameter is omitted, no limitation on the use of redundancy is present.
最大红色:帧的主要(第一次)传输和发送方将使用的任何冗余传输之间经过的最大持续时间(毫秒)。当使用冗余时,此参数允许接收器具有有界延迟。允许值介于零(不使用冗余)和65535之间。如果省略该参数,则对冗余的使用没有限制。
channels: The number of audio channels. The possible values (1-6) and their respective channel order is specified in Section 4.1 of [RFC3551]. If omitted, it has the default value of 1.
频道:音频频道的数量。[RFC3551]第4.1节规定了可能的值(1-6)及其各自的通道顺序。如果省略,则默认值为1。
CBR: Constant Bitrate (CBR) indicates the exact codec bitrate in bits per second (not including the overhead from packetization, RTP header, or lower layers) that the codec MUST use. "CBR" is to be used when the dynamic rate cannot be supported (one case is, e.g., gateway to H.320). "CBR" is mostly used for gateways to circuit switch networks. Therefore, the "CBR" is the rate not including any FEC as specified in Section 4.3.1. If FEC is to be used, the "b=" parameter MUST be used to allow the extra bitrate needed to send the redundant information. It is RECOMMENDED that this parameter is only used when necessary to establish a working communication. The usage of this parameter has implications for congestion control that need to be considered; see Section 9.
CBR:恒定比特率(CBR)表示编解码器必须使用的确切编解码器比特率,单位为比特/秒(不包括分组、RTP头或较低层的开销)。当无法支持动态速率时(例如,H.320网关),将使用“CBR”。“CBR”主要用于电路交换网络的网关。因此,“CBR”是指不包括第4.3.1节规定的任何FEC的费率。如果要使用FEC,则必须使用“b=”参数来允许发送冗余信息所需的额外比特率。建议仅在需要建立工作通信时使用此参数。该参数的使用对需要考虑的拥塞控制有影响;见第9节。
ptime: see [RFC4566].
ptime:请参阅[RFC4566]。
maxptime: see [RFC4566].
maxptime:请参阅[RFC4566]。
Encoding considerations: This media type is framed and binary; see Section 4.8 of [RFC4288].
编码注意事项:此媒体类型为框架和二进制;见[RFC4288]第4.8节。
Security considerations: See Section 10 of RFC 5404.
安全注意事项:见RFC 5404第10节。
Interoperability considerations: The support of the Interleaving mode is not mandatory and needs to be negotiated. See Section 7.2 for how to do that for SDP-based protocols.
互操作性注意事项:交错模式的支持不是强制性的,需要协商。关于如何为基于SDP的协议执行此操作,请参见第7.2节。
Published specification: RFC 5404
已发布规范:RFC 5404
Applications that use this media type: Real-time audio applications like Voice over IP and teleconference, and multi-media streaming.
使用这种媒体类型的应用程序:实时音频应用程序,如IP语音和电话会议,以及多媒体流。
Additional information: none
其他信息:无
Person & email address to contact for further information: Ingemar Johansson <ingemar.s.johansson@ericsson.com>
联系人和电子邮件地址,以获取更多信息:Ingemar Johansson <ingemar.s.johansson@ericsson.com>
Intended usage: COMMON
预期用途:普通
Restrictions on usage: This media type depends on RTP framing, and hence is only defined for transfer via RTP [RFC3550]. Transport within other framing protocols is not defined at this time.
使用限制:此媒体类型取决于RTP帧,因此仅定义为通过RTP传输[RFC3550]。此时未定义其他帧协议内的传输。
Author: Ingemar Johansson <ingemar.s.johansson@ericsson.com>
        Magnus Westerlund <magnus.westerlund@ericsson.com>
Change controller: IETF Audio/Video Transport working group delegated from the IESG.
变更控制员:IESG授权的IETF音频/视频传输工作组。
Additionally, note that file storage of G.719-encoded audio in ISO base media file format is specified in Annex A of [ITU-T-G719]. Thus, media file formats such as MP4 (audio/mp4 or video/mp4) [RFC4337] and 3GP (audio/3GPP and video/3GPP) [RFC3839] can contain G.719-encoded audio.
此外,请注意,ISO基本媒体文件格式的G.719编码音频的文件存储在[ITU-T-G719]的附录A中有规定。因此,诸如MP4(音频/MP4或视频/MP4)[RFC4337]和3GP(音频/3GPP和视频/3GPP)[RFC3839]之类的媒体文件格式可以包含G.719编码的音频。
The information carried in the media type specification has a specific mapping to fields in the Session Description Protocol (SDP) [RFC4566], which is commonly used to describe RTP sessions. When SDP is used to specify sessions employing the G.719 codec, the mapping is as follows:
媒体类型规范中包含的信息与会话描述协议(SDP)[RFC4566]中的字段具有特定映射,该协议通常用于描述RTP会话。当使用SDP指定使用G.719编解码器的会话时,映射如下:
o The media type ("audio") goes in SDP "m=" as the media name.
o 媒体类型(“音频”)以SDP“m=”作为媒体名称。
o The media subtype (payload format name) goes in SDP "a=rtpmap" as the encoding name. The RTP clock rate in "a=rtpmap" MUST be 48000, and the encoding parameter "channels" (Section 7.1) MUST either be explicitly set to N or omitted, implying a default value of 1. The values of N that are allowed are specified in Section 4.1 in [RFC3551].
o 媒体子类型(有效负载格式名称)以SDP“a=rtpmap”作为编码名称。“a=rtpmap”中的RTP时钟频率必须为48000,编码参数“通道”(第7.1节)必须明确设置为N或省略,这意味着默认值为1。[RFC3551]第4.1节规定了允许的N值。
o The parameters "ptime" and "maxptime" go in the SDP "a=ptime" and "a=maxptime" attributes, respectively.
o 参数“ptime”和“maxptime”分别位于SDP“a=ptime”和“a=maxptime”属性中。
o Any remaining parameters go in the SDP "a=fmtp" attribute by copying them directly from the media type parameter string as a semicolon-separated list of parameter=value pairs.
o 通过直接从媒体类型参数字符串中以分号分隔的参数=值对列表形式复制其余参数,将其放入SDP“a=fmtp”属性中。
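The mapping rules above can be sketched as a small generator of SDP lines. The function name, the port and payload type numbers, and the choice of the RTP/AVP profile are assumptions made for illustration. Python keyword arguments cannot contain hyphens, so underscores are converted (max_red becomes "max-red").

```python
def g719_sdp_media(port, payload_type, channels=1,
                   ptime=None, maxptime=None, **fmtp):
    """Build the SDP media-level lines for one G.719 payload type."""
    lines = [f"m=audio {port} RTP/AVP {payload_type}"]
    rtpmap = f"a=rtpmap:{payload_type} G719/48000"  # clock rate MUST be 48000
    if channels != 1:
        rtpmap += f"/{channels}"  # omitted for the default of one channel
    lines.append(rtpmap)
    if fmtp:
        # Remaining parameters go into a=fmtp, separated by semicolons.
        params = "; ".join(
            f"{name.replace('_', '-')}={value}" for name, value in fmtp.items()
        )
        lines.append(f"a=fmtp:{payload_type} {params}")
    if ptime is not None:
        lines.append(f"a=ptime:{ptime}")
    if maxptime is not None:
        lines.append(f"a=maxptime:{maxptime}")
    return lines


sdp = g719_sdp_media(49120, 97, channels=2, ptime=20,
                     interleaving=13, max_red=0)
```

Joining `sdp` with CRLF gives a media description for a stereo, interleaved configuration. The "channels" parameter is carried in "a=rtpmap" rather than in "a=fmtp".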
The following considerations apply when using SDP offer/answer procedures to negotiate the use of G.719 payload in RTP:
当使用SDP报价/应答程序协商RTP中G.719有效载荷的使用时,应考虑以下因素:
o Each combination of the RTP payload transport format configuration parameters ("interleaving" and "channels") is unique in its bit pattern and not compatible with any other combination. When creating an offer in an application desiring to use the more advanced features (interleaving or more than one channel), the offerer is RECOMMENDED to also offer a payload type containing only the configuration with a single channel. If multiple configurations are of interest to the application, they may all be offered; however, care should be taken not to offer too many payload types. An SDP answerer MUST include, in the SDP answer for a payload type, the following parameters unmodified from the SDP offer (unless it removes the payload type): "interleaving" and "channels". However, the value of the "interleaving" parameter MAY be changed. The SDP offerer and answerer MUST generate G.719 packets as described by these parameters.
o RTP有效负载传输格式配置参数(“交织”和“信道”)的每个组合在其位模式中是唯一的,并且与任何其他组合不兼容。在希望使用更高级功能(交错或多个通道)的应用程序中创建报价时,建议报价人也提供仅包含单通道配置的有效负载类型。如果应用程序对多个配置感兴趣,则可以提供所有配置;但是,应注意不要提供太多的有效负载类型。SDP应答器必须在有效负载类型的SDP应答中包含未经SDP报价修改的以下参数(除非删除有效负载类型):“交错”和“通道”。然而,“交织”参数的值可以改变。SDP提供方和应答方必须根据这些参数生成G.719数据包。
o The "interleaving" and "int-delay" parameters' values have a specific relationship that needs to be considered. It also depends on the directionality of the streams and their delivery method. The high-level explanation that can be understood from the definition is that the value of "interleaving" declares the size of the receiver buffer, while "int-delay" is a stream property provided by the sender to inform how much buffer space it in practice is using for the stream it sends.
o “交错”和“int delay”参数的值具有需要考虑的特定关系。它还取决于流的方向性及其传输方法。从定义中可以理解的高级解释是,“交错”的值声明了接收方缓冲区的大小,而“int delay”是发送方提供的流属性,用于告知它实际上为它发送的流使用了多少缓冲区空间。
* For media streams that are sent over multicast, the value of "interleaving" SHALL NOT be changed by the answerer. It shall either be accepted or the payload type deleted. The value of the "int-delay" parameter is a stream property and provided by the offer/answer agent that intends to send media with this payload type, and for each stream coming from that agent (one or more). The value MUST be between zero and what corresponds to the buffer size declared by the value of the "interleaving" parameter.
* 对于通过多播发送的媒体流,“交织”的值不应由应答者更改。应接受或删除有效负载类型。“int delay”参数的值是一个流属性,由打算发送具有此有效负载类型的媒体的提供/应答代理提供,并且针对来自该代理的每个流(一个或多个)。该值必须介于零和由“interleaving”参数的值声明的缓冲区大小之间。
* For unicast streams that the offerer declares as send-only, the value of the "interleaving" parameter is the size that the answerer is RECOMMENDED to use by the offerer. The answerer MAY change it to any allowed value. The "int-delay" parameter value will be the one the offerer intends to use unless the answerer reduces the value of the "interleaving" parameter below what is needed for that "int-delay" value. If the "interleaving" value in the answer is smaller than the offer's "int-delay" value, the "int-delay" value is per default reduced to be corresponding to the "interleaving" value. If the offerer is not satisfied with this, he will need to perform another round of offer/answer. As the answerer will not send any media, it doesn't include any "int-delay" in the answer.
* 对于报价人声明为仅发送的单播流,“交错”参数的值是报价人建议应答人使用的大小。回答者可以将其更改为任何允许的值。“int delay”参数值将是报价人打算使用的参数值,除非应答人将“交错”参数值降低到低于该“int delay”值所需的值。如果答案中的“交错”值小于报价的“int delay”值,“int delay”值按默认值减少,以与“交错”值相对应。如果报价人对此不满意,他将需要进行另一轮报价/答复。由于回答者不会发送任何媒体,因此回答中不包含任何“int延迟”。
* For unicast streams that the offerer declares as recvonly, the value of "interleaving" in the offer will be the offerer's size of the interleaving buffer. The answerer indicates its preferred size of the interleaving buffer for any future round of offer/answer. The offerer will not provide any "int-delay" parameter as it is not sending any media. The answerer is recommended to include in its answer an "int-delay" parameter to declare what the property is for the stream it is going to send. The answerer is expected to be capable of selecting a valid parameter value that is between zero and the declared maximum number of slots in the de-interleaving buffer.
* 对于报价人声明为recvonly的单播流,报价中“交错”的值将是报价人交错缓冲区的大小。应答者指出其在未来任何一轮报价/应答中首选的交织缓冲区大小。报价人不提供任何“int-delay”参数,因为它不发送任何媒体。建议应答者在其应答中包含一个“int-delay”参数,以声明它将要发送的流的属性。应答者应该能够选择一个有效的参数值,该参数值介于零和解交织缓冲区中声明的最大插槽数之间。
* For unicast streams that the offerer declares as sendrecv streams, the value of the "interleaving" parameter in the offer will be the offerer's size of the interleaving buffer. The answerer will in the answer indicate the size of its actual interleaving buffer. It is recommended that this value is at least as big as the offer's. The offerer is recommended to include an "int-delay" parameter that is selected based on the answerer having at least as much interleaving space as the offerer, unless something else is known. As the answerer's interleaving buffer size is not yet known, this may fail, in which case the default rule is to downgrade the value of the "int-delay" to correspond to the full size of the answerer's interleaving buffer. If the offerer isn't satisfied with this, it will need to initiate another round of offer/answer. The answerer is recommended in its answer to include an "int-delay" parameter to declare what the property is for the stream(s) it is going to send. The answerer is expected to be capable of selecting a valid parameter value that is between zero and the declared maximum number of slots in the de-interleaving buffer.
* 对于要约声明为sendrecv流的单播流,要约中“交织”参数的值将是要约人交织缓冲区的大小。回答者将在回答中指出其实际交错缓冲区的大小。建议该值至少与报价一样大。建议报价人包括一个“int delay”参数,该参数是基于应答人至少具有与报价人相同的交织空间而选择的,除非不知道其他情况。由于报价人的交织缓冲区大小未知,这可能会失败,在这种情况下,默认规则是降低“int delay”的值,以对应于应答人交织缓冲区的完整大小。如果报价人对此不满意,则需要发起另一轮报价/答复。建议应答器在其回答中包含一个“int delay”参数,以声明它将要发送的流的属性。答案应该能够选择一个有效的参数值,该参数值介于零和反交错缓冲区中声明的最大插槽数之间。
o In most cases, the parameters "maxptime" and "ptime" will not affect interoperability; however, the setting of the parameters can affect the performance of the application. The SDP offer/answer handling of the "ptime" parameter is described in [RFC3264]. The "maxptime" parameter MUST be handled in the same way.
o 在大多数情况下,参数“maxptime”和“ptime”不会影响互操作性;但是,参数的设置可能会影响应用程序的性能。[RFC3264]中描述了“ptime”参数的SDP提供/应答处理。必须以相同的方式处理“maxptime”参数。
o The parameter "max-red" is a stream property parameter. For sendonly or sendrecv unicast media streams, the parameter declares the limitation on redundancy that the stream sender will use. For recvonly streams, it indicates the desired value for the stream sent to the receiver. The answerer MAY change the value, but is RECOMMENDED to use the same limitation as the offer declares. In the case of multicast, the offerer MAY declare a limitation; this SHALL be answered using the same value. A media sender using this payload format is RECOMMENDED to always include the "max-red" parameter. This information is likely to simplify the media stream handling in the receiver. This is especially true if no redundancy will be used, in which case "max-red" is set to zero.
o 参数“最大红色”是一个流属性参数。对于sendonly或sendrecv单播媒体流,参数声明流发送方将使用的冗余限制。对于recvonly streams,它指示发送到接收器的流的所需值。回答者可以更改值,但建议使用与报价声明相同的限制。在多播的情况下,报价人可以声明限制;应使用相同的值对此进行回答。建议使用此有效负载格式的媒体发送器始终包含“max red”参数。该信息可能简化接收器中的媒体流处理。如果不使用冗余,尤其如此,在这种情况下,“最大红色”设置为零。
o Any unknown parameter in an offer SHALL be removed in the answer.
o 报价中的任何未知参数应在答复中删除。
o The "b=" SDP parameter SHOULD be used to negotiate the maximum bandwidth to be used for the audio stream. The offerer may offer a maximum rate and the answer may contain a lower rate. If no "b=" parameter is present in the offer or answer, it implies a rate up to 128 kbps.
o “b=”SDP参数应用于协商音频流使用的最大带宽。报价人可以提供最高价格,而答复可能包含较低的价格。如果报价或答复中没有“b=”参数,则表示最高速率为128 kbps。
o The parameter "CBR" is a receiver capability; i.e., only receivers that really require a constant bitrate should use it. Usage of this parameter has a negative impact on the possibility to perform congestion control; see Section 9. For recvonly and sendrecv streams, it indicates the desired constant bitrate that the receiver wants to accept. A sender MUST be able to send a constant bitrate stream since it is a subset of the variable bitrate capability. If the offer includes this parameter, the answerer MUST send G.719 audio at the constant bitrate if it is within the allowed session bitrate ("b=" parameter). If the answerer cannot support the stated CBR, this payload type must be refused in the answer. The answerer SHOULD only include this parameter if the answerer itself requires to receive at a constant bitrate, even if the offer did not include the "CBR" parameter. In this case, the offerer SHALL send at the constant bitrate, but SHALL be able to accept media at a variable bitrate. An answerer is RECOMMENDED to use the same CBR as in the offer, as symmetric usage is more likely to work. If both sides require a particular CBR, there is the possibility of communication failure when one or both sides can't transmit the requested rate. In this case, the agent detecting this issue will have to perform a second round of offer/answer to try to find another working configuration or end the established session. In case the offer contained a "CBR" parameter but the answer does not, then the offerer is free to transmit at any rate to the answerer, but the answerer is restricted to the declared rate.
o 参数“CBR”是接收机能力;即,只有真正需要恒定比特率的接收机才应该使用它。使用此参数会对执行拥塞控制的可能性产生负面影响;见第9节。对于recvonly和sendrecv流,它指示接收器想要接受的所需恒定比特率。发送方必须能够发送恒定比特率流,因为它是可变比特率功能的子集。如果报价包含此参数,应答者必须以恒定比特率发送G.719音频,前提是该音频在允许的会话比特率(“b=”参数)内。如果应答者不能支持规定的CBR,则必须在应答中拒绝该有效载荷类型。仅当应答者本身要求以恒定比特率接收时,应答者才应包含此参数,即使报价不包括“CBR”参数。在这种情况下,报价人应以恒定比特率发送,但应能够以可变比特率接收媒体。建议应答者使用与报价中相同的CBR,因为对称使用更可能有效。如果双方都需要特定的CBR,当一方或双方无法传输请求的速率时,可能会出现通信故障。在这种情况下,检测到此问题的代理必须执行第二轮提供/应答,以尝试找到另一个工作配置或结束已建立的会话。如果报价包含“CBR”参数,但应答不包含,则报价人可自由以任何速率向应答人发送,但应答人仅限于声明的速率。
In declarative usage, like SDP in the Real Time Streaming Protocol (RTSP) [RFC2326] or the Session Announcement Protocol (SAP) [RFC2974], the parameters SHALL be interpreted as follows:
在声明性使用中,如实时流协议(RTSP)[RFC2326]或会话公告协议(SAP)[RFC2974]中的SDP,参数应解释如下:
o The payload format configuration parameters ("interleaving" and "channels") are all declarative, and a participant MUST use the configuration(s) that is provided for the session. More than one configuration may be provided if necessary by declaring multiple RTP payload types; however, the number of types should be kept small.
o 有效负载格式配置参数(“交错”和“通道”)都是声明性的,参与者必须使用为会话提供的配置。如有必要,可通过声明多个RTP有效负载类型来提供多个配置;但是,类型的数量应保持在较小的范围内。
o It might not be possible to know the SSRC values that are going to be used by the sources at the time of sending the SDP. This is not a major issue as the size of the interleaving buffer can be tailored towards the values that are actually going to be used, thus ensuring that the default values for "int-delay" are not resulting in too much extra buffering.
o 在发送SDP时,可能无法知道源将使用的SSRC值。这不是一个主要问题,因为交错缓冲区的大小可以根据实际要使用的值进行定制,从而确保“int delay”的默认值不会导致太多额外的缓冲。
o Any "maxptime" and "ptime" values should be selected with care to ensure that the session's participants can achieve reasonable performance.
o 应谨慎选择任何“maxptime”和“ptime”值,以确保课程参与者能够实现合理的绩效。
o The parameter "CBR" if included applies to all RTP streams using that payload type for which a particular CBR is declared. Usage of this parameter has a negative impact on the possibility to perform congestion control; see Section 9.
o 参数“CBR”(如果包括)适用于使用为其声明特定CBR的有效负载类型的所有RTP流。使用此参数会对执行拥塞控制的可能性产生负面影响;见第9节。
One media type (audio/G719) has been defined and registered in the media types registry; see Section 7.1.
已在媒体类型注册表中定义并注册了一种媒体类型(音频/G719);见第7.1节。
The general congestion control considerations for transporting RTP data apply; see RTP [RFC3550] and any applicable RTP profile like AVP [RFC3551]. However, the multi-rate capability of G.719 audio coding provides a mechanism that may help to control congestion, since the bandwidth demand can be adjusted (within the limits of the codec) by selecting a different encoding bitrate.
传输RTP数据的一般拥塞控制注意事项适用;参见RTP[RFC3550]和任何适用的RTP配置文件,如AVP[RFC3551]。然而,G.719音频编码的多速率能力提供了一种可能有助于控制拥塞的机制,因为可以通过选择不同的编码比特率来调整带宽需求(在编解码器的限制范围内)。
The number of frames encapsulated in each RTP payload highly influences the overall bandwidth of the RTP stream due to header overhead constraints. Packetizing more frames in each RTP payload can reduce the number of packets sent and hence the header overhead, at the expense of increased delay and reduced error robustness. If forward error correction (FEC) is used, the amount of FEC-induced redundancy needs to be regulated such that the use of FEC itself does not cause a congestion problem. In other words, a sender SHALL NOT increase the total bitrate when adding redundancy in response to packet loss, and needs instead to adjust it down in accordance to the congestion control algorithm being run. Thus, when adding redundancy, the media bitrate will need to be reduced to provide room for the redundancy.
The "CBR" signaling parameter allows a receiver to lock down an RTP payload type to use a single encoding rate. As this prevents the codec rate from being lowered when congestion is experienced, the sender is constrained to either change the packetization or abort the transmission. Since these responses to congestion are severely limited, implementations SHOULD NOT use the "CBR" parameter unless they are interacting with a device that cannot support a variable bitrate (e.g., a gateway to H.320 systems). When using CBR mode, a receiver MUST monitor the packet loss rate to ensure congestion is not caused, following the guidelines in Section 2 of RFC 3551.
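The two responses left to a sender in CBR mode (repacketize or abort) can be sketched as follows. This is an illustration only, under stated assumptions: a 40-byte IPv4/UDP/RTP header per packet, 20 ms frames, payload header and ToC bytes ignored, and a hypothetical `allowed_bps` value supplied by whatever congestion control algorithm is in use. The function name `cbr_congestion_response` is not defined by this memo.

```python
HEADER_BYTES = 40  # assumed IPv4 + UDP + RTP header size per packet
FRAME_MS = 20      # G.719 frame duration in milliseconds


def cbr_congestion_response(cbr_bps, frames_per_packet, maxptime_ms, allowed_bps):
    """Pick a packetization that fits allowed_bps, or None to abort.

    The codec rate is fixed by "CBR", so the only adjustable quantity is
    the number of frames per packet, bounded above by "maxptime".
    """
    frame_bytes = cbr_bps * FRAME_MS // 8000
    max_frames = maxptime_ms // FRAME_MS
    for n in range(frames_per_packet, max_frames + 1):
        wire_bps = (HEADER_BYTES + n * frame_bytes) * 8 * 1000 / (n * FRAME_MS)
        if wire_bps <= allowed_bps:
            return n
    return None  # no packetization fits: abort the transmission
```

For example, with CBR=64000, one frame per packet, and a maxptime of 80 ms, an allowed rate of 70 kbit/s is met by moving to three frames per packet, while an allowed rate of 60 kbit/s cannot be met at all because the codec rate alone exceeds it.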
RTP packets using the payload format defined in this specification are subject to the security considerations discussed in the RTP specification [RFC3550] and in any applicable RTP profile. The main security considerations for the RTP packet carrying the RTP payload format defined within this memo are confidentiality, integrity, and source authenticity. Confidentiality is achieved by encryption of the RTP payload. Integrity of the RTP packets is achieved through a suitable cryptographic integrity protection mechanism. Such a cryptographic system may also allow the authentication of the source of the payload. A suitable security mechanism for this RTP payload format should provide confidentiality, integrity protection, and at least source authentication capable of determining if an RTP packet is from a member of the RTP session.
Note that the appropriate mechanism to provide security to RTP and payloads following this memo may vary. It depends on the application, the transport, and the signaling protocol employed. Therefore, no single mechanism is sufficient for all cases, although the use of the Secure Real-time Transport Protocol (SRTP) [RFC3711] is recommended where it is suitable. Other mechanisms that may be used are IPsec [RFC4301] and Transport Layer Security (TLS) [RFC5246] (RTP over TCP); other alternatives may exist.
The use of interleaving in conjunction with encryption can have a negative impact on confidentiality for a short period of time. Consider the following packets (in brackets) containing frame numbers as indicated: {10, 14, 18}, {13, 17, 21}, {16, 20, 24} (a popular continuous diagonal interleaving pattern). The originator wishes to deny some participants the ability to hear material starting at time 16. Simply changing the key on the packet with the timestamp at or after 16, and denying that new key to those participants, does not achieve this; frames 17, 18, and 21 have been supplied in prior packets under the prior key, and error concealment may make the audio intelligible at least as far as frame 18 or 19, and possibly further.
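The exposure in this example can be checked mechanically. The sketch below is illustrative only; it assumes that a packet's timestamp equals the smallest frame number it carries, and the helper name `frames_exposed_under_old_key` is hypothetical.

```python
def frames_exposed_under_old_key(packets, cutoff):
    """List frames numbered >= cutoff that were sent before the rekey.

    Each packet is a list of frame numbers. A packet is assumed to be
    sent under the new key only if its timestamp (taken here as its
    smallest frame number) is at or after the cutoff.
    """
    exposed = []
    for frames in packets:
        if min(frames) >= cutoff:
            continue  # protected by the new key
        exposed.extend(f for f in frames if f >= cutoff)
    return sorted(exposed)


packets = [[10, 14, 18], [13, 17, 21], [16, 20, 24]]
leaked = frames_exposed_under_old_key(packets, cutoff=16)
print(leaked)  # [17, 18, 21]
```

Only the third packet is protected by the new key, so frames 17, 18, and 21 remain readable by the excluded participants, matching the description above.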
This RTP payload format and its media decoder do not exhibit any significant non-uniformity in the receiver-side computational complexity for packet processing, and thus are unlikely to pose a denial-of-service threat due to the receipt of pathological data. Nor does the RTP payload format contain any active content.
The authors would like to thank Roni Even and Anisse Taleb for their help with this document. We would also like to thank the people who have provided feedback: Colin Perkins, Mark Baker, and Stephen Botzko.
[ITU-T-G719] ITU-T, "Specification: ITU-T G.719 extension for 20 kHz fullband audio", April 2008.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC3264] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model with Session Description Protocol (SDP)", RFC 3264, June 2002.
[RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V. Jacobson, "RTP: A Transport Protocol for Real-Time Applications", STD 64, RFC 3550, July 2003.
[RFC3551] Schulzrinne, H. and S. Casner, "RTP Profile for Audio and Video Conferences with Minimal Control", STD 65, RFC 3551, July 2003.
[RFC4566] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session Description Protocol", RFC 4566, July 2006.
[RFC5234] Crocker, D. and P. Overell, "Augmented BNF for Syntax Specifications: ABNF", STD 68, RFC 5234, January 2008.
[RFC5405] Eggert, L. and G. Fairhurst, "Unicast UDP Usage Guidelines for Application Designers", BCP 145, RFC 5405, November 2008.
[RFC2198] Perkins, C., Kouvelas, I., Hodson, O., Hardman, V., Handley, M., Bolot, J., Vega-Garcia, A., and S. Fosse-Parisis, "RTP Payload for Redundant Audio Data", RFC 2198, September 1997.
[RFC2326] Schulzrinne, H., Rao, A., and R. Lanphier, "Real Time Streaming Protocol (RTSP)", RFC 2326, April 1998.
[RFC2974] Handley, M., Perkins, C., and E. Whelan, "Session Announcement Protocol", RFC 2974, October 2000.
[RFC3711] Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K. Norrman, "The Secure Real-time Transport Protocol (SRTP)", RFC 3711, March 2004.
[RFC3839] Castagno, R. and D. Singer, "MIME Type Registrations for 3rd Generation Partnership Project (3GPP) Multimedia files", RFC 3839, July 2004.
[RFC4288] Freed, N. and J. Klensin, "Media Type Specifications and Registration Procedures", BCP 13, RFC 4288, December 2005.
[RFC4301] Kent, S. and K. Seo, "Security Architecture for the Internet Protocol", RFC 4301, December 2005.
[RFC4337] Lim, Y. and D. Singer, "MIME Type Registration for MPEG-4", RFC 4337, March 2006.
[RFC4855] Casner, S., "Media Type Registration of RTP Payload Formats", RFC 4855, February 2007.
[RFC5109] Li, A., "RTP Payload Format for Generic Forward Error Correction", RFC 5109, December 2007.
[RFC5246] Dierks, T. and E. Rescorla, "The Transport Layer Security (TLS) Protocol Version 1.2", RFC 5246, August 2008.
Authors' Addresses
Magnus Westerlund
Ericsson AB
Torshamnsgatan 21-23
SE-164 83 Stockholm
SWEDEN

Phone: +46 10 7190000
EMail: magnus.westerlund@ericsson.com

Ingemar Johansson
Ericsson AB
Laboratoriegrand 11
SE-971 28 Lulea
SWEDEN

Phone: +46 10 7190000
EMail: ingemar.s.johansson@ericsson.com