Network Working Group A. Sollaud Request for Comments: 5459 France Telecom Updates: 4749 January 2009 Category: Standards Track
Network Working Group A. Sollaud Request for Comments: 5459 France Telecom Updates: 4749 January 2009 Category: Standards Track
G.729.1 RTP Payload Format Update: Discontinuous Transmission (DTX) Support
G.729.1 RTP有效载荷格式更新:不连续传输(DTX)支持
Status of This Memo
关于下段备忘
This document specifies an Internet standards track protocol for the Internet community, and requests discussion and suggestions for improvements. Please refer to the current edition of the "Internet Official Protocol Standards" (STD 1) for the standardization state and status of this protocol. Distribution of this memo is unlimited.
本文件规定了互联网社区的互联网标准跟踪协议,并要求进行讨论和提出改进建议。有关本协议的标准化状态和状态,请参考当前版本的“互联网官方协议标准”(STD 1)。本备忘录的分发不受限制。
Copyright Notice
版权公告
Copyright (c) 2009 IETF Trust and the persons identified as the document authors. All rights reserved.
版权所有(c)2009 IETF信托基金和确定为文件作者的人员。版权所有。
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/ license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document.
本文件受BCP 78和IETF信托有关IETF文件的法律规定的约束(http://trustee.ietf.org/ 许可证信息)在本文件发布之日生效。请仔细阅读这些文件,因为它们描述了您对本文件的权利和限制。
Abstract
摘要
This document updates the Real-time Transport Protocol (RTP) payload format to be used for the International Telecommunication Union (ITU-T) Recommendation G.729.1 audio codec. It adds Discontinuous Transmission (DTX) support to the RFC 4749 specification, in a backward-compatible way. An updated media type registration is included for this payload format.
本文件更新了用于国际电信联盟(ITU-T)建议G.729.1音频编解码器的实时传输协议(RTP)有效载荷格式。它以向后兼容的方式向RFC 4749规范添加了不连续传输(DTX)支持。此有效负载格式包含更新的媒体类型注册。
Table of Contents
目录
1. Introduction ....................................................2 2. Background ......................................................2 3. RTP Header Usage ................................................3 4. Payload Format ..................................................3 5. Payload Format Parameters .......................................4 5.1. Media Type Registration Update .............................4 5.2. Mapping to SDP Parameters ..................................5 5.2.1. DTX Offer-Answer Model Considerations ...............5 5.2.2. DTX Declarative SDP Considerations ..................6 6. Congestion Control ..............................................6 7. Security Considerations .........................................6 8. IANA Considerations .............................................6 9. References ......................................................6 9.1. Normative References .......................................6 9.2. Informative References .....................................7
1. Introduction ....................................................2 2. Background ......................................................2 3. RTP Header Usage ................................................3 4. Payload Format ..................................................3 5. Payload Format Parameters .......................................4 5.1. Media Type Registration Update .............................4 5.2. Mapping to SDP Parameters ..................................5 5.2.1. DTX Offer-Answer Model Considerations ...............5 5.2.2. DTX Declarative SDP Considerations ..................6 6. Congestion Control ..............................................6 7. Security Considerations .........................................6 8. IANA Considerations .............................................6 9. References ......................................................6 9.1. Normative References .......................................6 9.2. Informative References .....................................7
The International Telecommunication Union (ITU-T) Recommendation G.729.1 [ITU-G.729.1] is a scalable and wideband extension of the Recommendation G.729 [ITU-G.729] audio codec. [RFC4749] specifies the payload format for packetization of G.729.1-encoded audio signals into the Real-time Transport Protocol (RTP) [RFC3550].
国际电信联盟(ITU-T)建议G.729.1[ITU-G.729.1]是建议G.729[ITU-G.729]音频编解码器的可扩展宽带扩展。[RFC4749]指定将G.729.1编码音频信号打包到实时传输协议(RTP)[RFC3550]中的有效负载格式。
Annex C of G.729.1 [ITU-G.729.1-C] adds Discontinuous Transmission (DTX) support to G.729.1. This document updates the RTP payload format to allow usage of this Annex.
G.729.1[ITU-G.729.1-C]的附录C为G.729.1增加了不连续传输(DTX)支持。本文件更新了RTP有效载荷格式,以允许使用本附件。
Only changes or additions to [RFC4749] will be described in the following sections.
以下章节仅描述对[RFC4749]的更改或添加。
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].
本文件中的关键词“必须”、“不得”、“必需”、“应”、“不应”、“应”、“不应”、“建议”、“可”和“可选”应按照[RFC2119]中所述进行解释。
G.729.1 supports Discontinuous Transmission (DTX), a.k.a. "silence suppression". It means that the coder includes a Voice Activity Detection (VAD) algorithm to determine if an audio frame contains silence or actual audio. During silence periods, the coder may significantly decrease the transmitted bit rate by sending a small frame called a Silence Insertion Descriptor (SID) and then stop transmission. The receiver's decoder will generate comfort noise (CNG) according to the parameters contained in the SID. This DTX/CNG scheme is specified in [ITU-G.729.1-C].
G.729.1支持不连续传输(DTX),也称为“静音抑制”。这意味着编码器包括语音活动检测(VAD)算法,以确定音频帧是否包含静音或实际音频。在静默期期间,编码器可以通过发送称为静默插入描述符(SID)的小帧,然后停止传输来显著降低传输比特率。接收器的解码器将根据SID中包含的参数产生舒适噪声(CNG)。该DTX/CNG方案在[ITU-G.729.1-C]中有规定。
The G.729.1 SID has an embedded structure. The core SID is the same as the legacy G.729 SID [ITU-G.729-B]. The first enhancement layer adds some parameters for narrowband comfort noise, while a second enhancement layer adds wideband information. The G.729.1 SID can be 2, 3, or 6 octets long.
G.729.1 SID具有嵌入式结构。核心SID与传统G.729 SID相同[ITU-G.729-B]。第一增强层为窄带舒适噪声添加一些参数,而第二增强层添加宽带信息。G.729.1 SID的长度可以是2、3或6个八位字节。
The fields of the RTP header must be used as described in [RFC4749], except for the Marker (M) bit.
RTP报头的字段必须按照[RFC4749]中的说明使用,但标记(M)位除外。
If DTX is used, the first packet of a talkspurt -- that is, the first packet after a silence period during which packets have not been transmitted contiguously -- MUST be distinguished by setting the M bit in the RTP data header to 1. The M bit in all other packets MUST be set to 0. The beginning of a talkspurt MAY be used to adjust the playout delay to reflect changing network delays.
如果使用DTX,则必须通过将RTP数据报头中的M位设置为1来区分TalkSput的第一个数据包,即在数据包未连续传输的静默期之后的第一个数据包。所有其他数据包中的M位必须设置为0。TalkSport的开始可用于调整播放延迟以反映不断变化的网络延迟。
If DTX is not used, the M bit MUST be set to 0 in all packets.
如果未使用DTX,则所有数据包中的M位必须设置为0。
The payload format is the same as in [RFC4749], with the option to add a SID at the end.
有效负载格式与[RFC4749]中的相同,可以选择在末尾添加SID。
So the complete payload consists of a payload header of 1 octet (MBS (maximum bit rate supported) and FT (frame type) fields), followed by zero or more consecutive audio frames at the same bit rate, followed by zero or one SID.
因此,完整的有效负载由1个八位字节(MBS(支持的最大比特率)和FT(帧类型)字段组成的有效负载头组成,后面是零个或多个相同比特率的连续音频帧,后面是零个或一个SID。
Note that this is consistent with the payload format of G.729 described in section 4.5.6 of [RFC3551].
注意,这与[RFC3551]第4.5.6节中描述的G.729有效载荷格式一致。
To be able to transport a SID alone -- that is, without actual audio frames -- we assign the FT value 14 to the SID. When using FT=14, only a single SID frame SHALL be included in the payload. The actual SID size (2, 3, or 6 octets) is inferred from the payload size: it is the size of what is left after the payload header.
为了能够单独传输SID——也就是说,没有实际的音频帧——我们将FT值14分配给SID。当使用FT=14时,有效载荷中仅应包含一个SID帧。实际SID大小(2、3或6个八位字节)是从有效负载大小推断出来的:它是有效负载头之后剩余的大小。
When a SID is appended to actual audio frames, the FT value remains the one describing the encoding rate of the audio frames. Since the SID is much smaller than any other frame, it can be easily detected at the receiver side, and it will not hinder the calculation of the number of frames. The actual SID size is inferred from the payload size: it is the size of what is left after the audio frames.
当SID附加到实际音频帧时,FT值仍然是描述音频帧编码速率的值。由于SID比任何其他帧小得多,因此可以在接收器侧容易地检测到它,并且它不会妨碍帧数的计算。实际SID大小是根据有效负载大小推断出来的:它是音频帧后剩余内容的大小。
Section 5.4 of [RFC4749] mandates to ignore the remaining bytes after complete frames. This document overrides that statement: the receiver of the payload must consider these remaining bytes as a SID frame. If the size of this remainder is not a valid SID frame size (2, 3, or 6 octets), the receiver MUST ignore these bytes.
[RFC4749]第5.4节要求忽略完整帧后的剩余字节。此文档重写该语句:有效载荷的接收器必须考虑这些剩余字节作为SID帧。如果此余数的大小不是有效的SID帧大小(2、3或6个八位字节),则接收器必须忽略这些字节。
The full FT table is given for convenience:
为方便起见,提供了完整的FT表:
+-------+---------------+-------------------+ | FT | encoding rate | frame size | +-------+---------------+-------------------+ | 0 | 8 kbps | 20 octets | | 1 | 12 kbps | 30 octets | | 2 | 14 kbps | 35 octets | | 3 | 16 kbps | 40 octets | | 4 | 18 kbps | 45 octets | | 5 | 20 kbps | 50 octets | | 6 | 22 kbps | 55 octets | | 7 | 24 kbps | 60 octets | | 8 | 26 kbps | 65 octets | | 9 | 28 kbps | 70 octets | | 10 | 30 kbps | 75 octets | | 11 | 32 kbps | 80 octets | | 12-13 | (reserved) | - | | 14 | SID | 2, 3, or 6 octets | | 15 | NO_DATA | 0 | +-------+---------------+-------------------+
+-------+---------------+-------------------+ | FT | encoding rate | frame size | +-------+---------------+-------------------+ | 0 | 8 kbps | 20 octets | | 1 | 12 kbps | 30 octets | | 2 | 14 kbps | 35 octets | | 3 | 16 kbps | 40 octets | | 4 | 18 kbps | 45 octets | | 5 | 20 kbps | 50 octets | | 6 | 22 kbps | 55 octets | | 7 | 24 kbps | 60 octets | | 8 | 26 kbps | 65 octets | | 9 | 28 kbps | 70 octets | | 10 | 30 kbps | 75 octets | | 11 | 32 kbps | 80 octets | | 12-13 | (reserved) | - | | 14 | SID | 2, 3, or 6 octets | | 15 | NO_DATA | 0 | +-------+---------------+-------------------+
DTX has no impact on the MBS definition and use.
DTX对MBS的定义和使用没有影响。
Parameters defined in [RFC4749] are not modified. We add a new optional parameter to configure DTX.
[RFC4749]中定义的参数未修改。我们添加了一个新的可选参数来配置DTX。
We add a new optional parameter to the audio/G7291 media subtype:
我们向audio/G7291媒体子类型添加了一个新的可选参数:
dtx: indicates that Discontinuous Transmission (DTX) is used or preferred. Permissible values are 0 and 1. 0 means no DTX. 1 means DTX support, as described in Annex C of ITU-T Recommendation G.729.1. 0 is implied if this parameter is omitted.
dtx:表示使用或首选不连续传输(dtx)。允许值为0和1。0表示没有DTX。1表示DTX支持,如ITU-T建议G.729.1附录C所述。如果省略此参数,则表示为0。
When DTX is turned off, the RTP payload MUST NOT contain a SID, and the FT value 14 MUST NOT be used.
当DTX关闭时,RTP有效负载不得包含SID,且不得使用FT值14。
The information carried in the media type specification has a specific mapping to fields in the Session Description Protocol (SDP) [RFC4566], which is commonly used to describe RTP sessions.
媒体类型规范中包含的信息与会话描述协议(SDP)[RFC4566]中的字段具有特定映射,该协议通常用于描述RTP会话。
The mapping described in [RFC4749] remains unchanged.
[RFC4749]中描述的映射保持不变。
The "dtx" parameter goes in the SDP "a=fmtp" attribute.
“dtx”参数位于SDP“a=fmtp”属性中。
Some example partial SDP session descriptions utilizing G.729.1 encodings follow.
下面是一些使用G.729.1编码的部分SDP会话描述示例。
Example 1: default parameters (DTX off)
示例1:默认参数(DTX关闭)
m=audio 57586 RTP/AVP 96 a=rtpmap:96 G7291/16000
m=audio 57586 RTP/AVP 96 a=rtpmap:96 G7291/16000
Example 2: recommended packet duration of 40 ms (=2 frames), maximum bit rate is 20 kbps, DTX supported and preferred.
示例2:建议的数据包持续时间为40 ms(=2帧),最大比特率为20 kbps,支持DTX且首选。
m=audio 49987 RTP/AVP 97 a=rtpmap:97 G7291/16000 a=fmtp:97 maxbitrate=20000; dtx=1 a=ptime:40
m=audio 49987 RTP/AVP 97 a=rtpmap:97 G7291/16000 a=fmtp:97 maxbitrate=20000; dtx=1 a=ptime:40
The offer-answer model considerations of [RFC4749] fully apply. In this section, we only define the management of the "dtx" parameter.
[RFC4749]的报价-应答模型考虑因素完全适用。在本节中,我们仅定义“dtx”参数的管理。
The "dtx" parameter concerns both sending and receiving, so both sides of a bi-directional session MUST have the same DTX setting. If one party indicates that it does not support DTX, DTX must be deactivated both ways. In other words, DTX is actually activated if, and only if, "dtx=1" in the offer and in the answer.
“dtx”参数涉及发送和接收,因此双向会话的双方必须具有相同的dtx设置。如果一方表示不支持DTX,则必须以两种方式停用DTX。换句话说,当且仅当报价和答案中的“DTX=1”时,DTX才被激活。
A special rule applies for multicast: the "dtx" parameter becomes declarative and MUST NOT be negotiated. This parameter is fixed, and a participant MUST use the configuration that is provided for the session.
一个特殊的规则适用于多播:“dtx”参数变成声明性的,不能协商。此参数是固定的,参与者必须使用为会话提供的配置。
An RTP receiver compliant with only RFC 4749 and not this specification will ignore the "dtx" parameter and will not include it in its answer, so DTX will not be activated in this case. As a remark, if that happened, this kind of receiver would not be confused by an RTP stream with DTX because RFC 4749 requires that the bytes that are now used for SID frames be ignored. But during the silence
仅符合RFC 4749而不符合本规范的RTP接收器将忽略“dtx”参数,并且不会将其包含在其答案中,因此在这种情况下不会激活dtx。作为说明,如果发生这种情况,这种接收器不会被RTP流与DTX混淆,因为RFC 4749要求忽略现在用于SID帧的字节。但是在沉默中
period, the receiver would then react using packet loss concealment instead of comfort noise generation, leading to bad audio quality. This justifies the use of the "dtx" parameter, even if the payload format is backward-compatible at the binary level.
在此期间,接收器将使用包丢失隐藏而不是舒适噪声产生作出反应,从而导致不良的音频质量。这证明了使用“dtx”参数的合理性,即使有效负载格式在二进制级别是向后兼容的。
The "dtx" parameter is declarative and provides the parameter that SHALL be used when receiving and/or sending the configured stream.
“dtx”参数是声明性的,并提供在接收和/或发送配置流时应使用的参数。
The congestion control considerations of [RFC4749] apply.
[RFC4749]中的拥塞控制注意事项适用。
The use of DTX can help congestion control by reducing the number of transmitted RTP packets and the average bandwidth of audio streams.
DTX的使用可以通过减少传输的RTP数据包的数量和音频流的平均带宽来帮助拥塞控制。
The security considerations of [RFC4749] apply.
[RFC4749]的安全注意事项适用。
By observing the RTP flow with DTX, an attacker could see when and for how long people are speaking. This is a general fact for DTX, and G.729.1 DTX introduces no new specific issue.
通过使用DTX观察RTP流,攻击者可以看到人们说话的时间和持续时间。这是DTX的一般事实,G.729.1 DTX没有引入新的特定问题。
IANA has assigned a new optional parameter for media subtype (audio/ G7291); see Section 5.1.
IANA为媒体子类型(audio/G7291)分配了一个新的可选参数;见第5.1节。
[ITU-G.729.1] International Telecommunications Union, "G.729 based Embedded Variable bit-rate coder: An 8-32 kbit/s scalable wideband coder bitstream interoperable with G.729", ITU-T Recommendation G.729.1, May 2006.
[ITU-G.729.1]国际电信联盟,“基于G.729的嵌入式可变比特率编码器:可与G.729互操作的8-32 kbit/s可伸缩宽带编码器比特流”,ITU-T建议G.729.1,2006年5月。
[ITU-G.729.1-C] International Telecommunications Union, "G.729.1 DTX/CNG scheme", ITU-T Recommendation G.729.1 Annex C, May 2008.
[ITU-G.729.1-C]国际电信联盟,“G.729.1 DTX/CNG方案”,ITU-T建议G.729.1附件C,2008年5月。
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC2119]Bradner,S.,“RFC中用于表示需求水平的关键词”,BCP 14,RFC 2119,1997年3月。
[RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V. Jacobson, "RTP: A Transport Protocol for Real-Time Applications", STD 64, RFC 3550, July 2003.
[RFC3550]Schulzrinne,H.,Casner,S.,Frederick,R.,和V.Jacobson,“RTP:实时应用的传输协议”,STD 64,RFC 35502003年7月。
[RFC4566] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session Description Protocol", RFC 4566, July 2006.
[RFC4566]Handley,M.,Jacobson,V.,和C.Perkins,“SDP:会话描述协议”,RFC4566,2006年7月。
[RFC4749] Sollaud, A., "RTP Payload Format for the G.729.1 Audio Codec", RFC 4749, October 2006.
[RFC4749]Sollaud,A.,“G.729.1音频编解码器的RTP有效载荷格式”,RFC 4749,2006年10月。
[ITU-G.729] International Telecommunications Union, "Coding of speech at 8 kbit/s using conjugate-structure algebraic-code-excited linear-prediction (CS-ACELP)", ITU-T Recommendation G.729, March 1996.
[ITU-G.729]国际电信联盟,“使用共轭结构代数码激励线性预测(CS-ACELP)对8kbit/s语音进行编码”,ITU-T建议G.729,1996年3月。
[ITU-G.729-B] International Telecommunications Union, "A silence compression scheme for G.729 optimized for terminals conforming to Recommendation V.70", ITU-T Recommendation G.729 Annex B, October 1996.
[ITU-G.729-B]国际电信联盟,“为符合建议V.70的终端优化的G.729静音压缩方案”,ITU-T建议G.729附录B,1996年10月。
[RFC3551] Schulzrinne, H. and S. Casner, "RTP Profile for Audio and Video Conferences with Minimal Control", STD 65, RFC 3551, July 2003.
[RFC3551]Schulzrinne,H.和S.Casner,“具有最小控制的音频和视频会议的RTP配置文件”,STD 65,RFC 3551,2003年7月。
Author's Address
作者地址
Aurelien Sollaud France Telecom 2 avenue Pierre Marzin Lannion Cedex 22307 France
法国皮埃尔·马津·拉尼翁大道2号奥雷林·索洛德法国电信公司Cedex 22307
Phone: +33 2 96 05 15 06 EMail: aurelien.sollaud@orange-ftgroup.com
Phone: +33 2 96 05 15 06 EMail: aurelien.sollaud@orange-ftgroup.com