Network Working Group R. Zopf Request for Comments: 3389 Lucent Technologies Category: Standards Track September 2002
Network Working Group R. Zopf Request for Comments: 3389 Lucent Technologies Category: Standards Track September 2002
Real-time Transport Protocol (RTP) Payload for Comfort Noise (CN)
舒适性噪声(CN)的实时传输协议(RTP)有效载荷
Status of this Memo
本备忘录的状况
This document specifies an Internet standards track protocol for the Internet community, and requests discussion and suggestions for improvements. Please refer to the current edition of the "Internet Official Protocol Standards" (STD 1) for the standardization state and status of this protocol. Distribution of this memo is unlimited.
本文件规定了互联网社区的互联网标准跟踪协议,并要求进行讨论和提出改进建议。有关本协议的标准化状态和状态,请参考当前版本的“互联网官方协议标准”(STD 1)。本备忘录的分发不受限制。
Copyright Notice
版权公告
Copyright (C) The Internet Society (2002). All Rights Reserved.
版权所有(C)互联网协会(2002年)。版权所有。
Abstract
摘要
This document describes a Real-time Transport Protocol (RTP) payload format for transporting comfort noise (CN). The CN payload type is primarily for use with audio codecs that do not support comfort noise as part of the codec itself such as ITU-T Recommendations G.711, G.726, G.727, G.728, and G.722.
本文档描述了用于传输舒适性噪声(CN)的实时传输协议(RTP)有效负载格式。CN有效载荷类型主要用于不支持舒适噪声作为编解码器本身一部分的音频编解码器,如ITU-T建议G.711、G.726、G.727、G.728和G.722。
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC-2119 [7].
本文件中的关键词“必须”、“不得”、“要求”、“应”、“不应”、“应”、“不应”、“建议”、“可”和“可选”应按照RFC-2119[7]中所述进行解释。
This document describes a RTP [1] payload format for transporting comfort noise. The payload format is based on Appendix II of ITU-T Recommendation G.711 [8] which defines a comfort noise payload format (or bit-stream) for ITU-T G.711 [2] use in packet-based multimedia communication systems. The payload format is generic and may also be used with other audio codecs without built-in Discontinuous Transmission (DTX) capability such as ITU-T Recommendations G.726 [3], G.727 [4], G.728 [5], and G.722 [6]. The payload format provides a minimum interoperability specification for communication of comfort noise parameters. The comfort noise analysis and synthesis as well as the Voice Activity Detection (VAD) and DTX algorithms are unspecified and left implementation-specific.
本文档描述了用于传输舒适性噪声的RTP[1]有效载荷格式。有效载荷格式基于ITU-T建议G.711[8]的附录II,该附录定义了用于基于分组的多媒体通信系统的ITU-T G.711[2]的舒适噪声有效载荷格式(或比特流)。有效载荷格式是通用的,也可以与其他没有内置不连续传输(DTX)功能的音频编解码器一起使用,如ITU-T建议G.726[3]、G.727[4]、G.728[5]和G.722[6]。有效载荷格式为舒适性噪声参数的通信提供了最低互操作性规范。舒适性噪声分析和合成以及语音活动检测(VAD)和DTX算法未指定,且只针对具体实现。
However, an example solution for G.711 has been tested and is described in the Appendix [8]. It uses the VAD and DTX of G.729 Annex B [9] and a comfort noise generation algorithm (CNG) which is provided in the Appendix for information.
然而,G.711的示例解决方案已经过测试,并在附录[8]中进行了描述。它使用了G.729附录B[9]中的VAD和DTX以及附录中提供的舒适性噪声产生算法(CNG),以供参考。
The comfort noise payload, which is also known as a Silence Insertion Descriptor (SID) frame, consists of a single octet description of the noise level and MAY contain spectral information in subsequent octets. An earlier version of the CN payload format consisting only of the noise level byte was defined in draft revisions of the RFC 1890. The extended payload format defined in this document should be backward compatible with implementations of the earlier version assuming that only the first byte is interpreted and any additional spectral information bytes are ignored.
舒适性噪声有效载荷也称为静音插入描述符(SID)帧,由噪声级的单个八位组描述组成,并可能包含后续八位组中的频谱信息。RFC1890的修订草案中定义了仅由噪声级字节组成的CN有效载荷格式的早期版本。本文档中定义的扩展有效负载格式应与早期版本的实现向后兼容,前提是仅解释第一个字节,而忽略任何其他光谱信息字节。
The comfort noise payload consists of a description of the noise level and spectral information in the form of reflection coefficients for an all-pole model of the noise. The inclusion of spectral information is OPTIONAL and the model order (number of coefficients) is left unspecified. The encoder may choose an appropriate model order based on such considerations as quality, complexity, expected environmental noise, and signal bandwidth. The model order is not explicitly transmitted since the number of coefficients can be derived from the length of the payload at the receiver. The decoder may reduce the model order by setting higher order reflection coefficients to zero if desired to reduce complexity or for other reasons.
舒适性噪声有效载荷包括噪声级的描述和噪声全极点模型反射系数形式的频谱信息。光谱信息的包含是可选的,模型顺序(系数数量)未指定。编码器可以基于诸如质量、复杂性、预期环境噪声和信号带宽等考虑来选择适当的模型顺序。由于可以从接收器处的有效载荷长度导出系数的数量,因此不显式传输模型阶数。如果需要降低复杂性或出于其他原因,解码器可以通过将高阶反射系数设置为零来降低模型阶数。
The magnitude of the noise level is packed into the least significant bits of the noise-level byte with the most significant bit unused and always set to 0 as shown below in Figure 1. The least significant bit of the noise level magnitude is packed into the least significant bit of the byte.
噪声级的大小被压缩到噪声级字节的最低有效位,最高有效位未使用,并始终设置为0,如图1所示。噪声级幅值的最低有效位被压缩到字节的最低有效位。
The noise level is expressed in -dBov, with values from 0 to 127 representing 0 to -127 dBov. dBov is the level relative to the overload of the system. (Note: Representation relative to the overload point of a system is particularly useful for digital implementations, since one does not need to know the relative calibration of the analog circuitry.) For example, in the case of a u-law system, the reference would be a square wave with values +/- 8031, and this square wave represents 0dBov. This translates into 6.18dBm0.
噪声级用-dBov表示,0到127之间的值表示0到-127 dBov。dBov是相对于系统过载的水平。(注:相对于系统过载点的表示对于数字实现特别有用,因为不需要知道模拟电路的相对校准。)例如,在u定律系统的情况下,参考值为+/-8031的方波,该方波表示0dBov。这转化为6.18dBm0。
0 1 2 3 4 5 6 7 +-+-+-+-+-+-+-+-+ |0| level | +-+-+-+-+-+-+-+-+
0 1 2 3 4 5 6 7 +-+-+-+-+-+-+-+-+ |0| level | +-+-+-+-+-+-+-+-+
Figure 1: Noise Level Packing
图1:噪声级填料
The spectral information is transmitted using reflection coefficients [8]. Each reflection coefficient can have values between -1 and 1 and is quantized uniformly using 8 bits. The quantized value is represented by the 8 bit index N, where N=0..,254, and index N=255 is reserved for future use. Each index N is packed into a separate byte with the MSB first. The quantized value of each reflection coefficient, k_i, can be obtained from its corresponding index using:
使用反射系数传输光谱信息[8]。每个反射系数的值可以介于-1和1之间,并使用8位均匀量化。量化值由8位索引N表示,其中N=0..254和索引N=255保留供将来使用。每个索引N都被压缩到一个单独的字节中,MSB位于第一位。每个反射系数的量化值k_i可使用以下公式从其相应的指数中获得:
k_i(N_i) = 258*(N_i-127) for N_i = 0...254; -1 < k_i < 1 ------------- 32768
k_i(N_i) = 258*(N_i-127) for N_i = 0...254; -1 < k_i < 1 ------------- 32768
The first byte of the payload MUST contain the noise level as shown in Figure 1. Quantized reflection coefficients are packed in subsequent bytes in ascending order as in Figure 2 where M is the model order. The total length of the payload is M+1 bytes. Note that a 0th order model (i.e., no spectral envelope information) reduces to transmitting only the energy level.
有效负载的第一个字节必须包含噪声级,如图1所示。量化反射系数按升序压缩在后续字节中,如图2所示,其中M是模型顺序。有效负载的总长度为M+1字节。注意,0阶模型(即,无光谱包络信息)简化为仅传输能级。
Byte 1 2 3 ... M+1 +-----+-----+-----+-----+-----+ |level| N1 | N2 | ... | NM | +-----+-----+-----+-----+-----+
Byte 1 2 3 ... M+1 +-----+-----+-----+-----+-----+ |level| N1 | N2 | ... | NM | +-----+-----+-----+-----+-----+
Figure 2: CN Payload Packing Format
图2:CN有效负载打包格式
The RTP header for the comfort noise packet SHOULD be constructed as if the comfort noise were an independent codec. Thus, the RTP timestamp designates the beginning of the comfort noise period. When this payload format is used under the RTP profile specified in RFC 1890 [10], a static payload type of 13 is assigned for RTP timestamp clock rate of 8,000 Hz; if other rates are needed, they MUST be defined through dynamic payload types. The RTP packet SHOULD NOT have the marker bit set.
舒适噪声数据包的RTP报头应被构造为舒适噪声是独立的编解码器。因此,RTP时间戳指定舒适噪声周期的开始。当在RFC 1890[10]中规定的RTP配置文件下使用此有效负载格式时,为8000 Hz的RTP时间戳时钟速率分配静态有效负载类型13;如果需要其他速率,则必须通过动态有效负载类型来定义它们。RTP数据包不应设置标记位。
Each RTP packet containing comfort noise MUST contain exactly one CN payload per channel. This is required since the CN payload has a variable length. If multiple audio channels are used, each channel MUST use the same spectral model order 'M'.
每个包含舒适性噪声的RTP数据包必须在每个信道中仅包含一个CN有效负载。这是必需的,因为CN有效负载具有可变长度。如果使用多个音频通道,每个通道必须使用相同的频谱模型顺序“M”。
An audio codec with DTX capabilities generally includes VAD, DTX, and CNG algorithms. The job of the VAD is to discriminate between active and inactive voice segments in the input signal. During inactive voice segments, the role of the CNG is to sufficiently describe the ambient noise while minimizing the transmission rate. A CN payload (or SID frame) containing a description of the noise is sent to the receiver to drive the CNG. The DTX algorithm determines when a CN payload is transmitted. During active voice segments, packets of the voice codec are transmitted and indicated in the RTP header by the static or dynamic payload type for that codec. At the beginning of an inactive voice segment (silence period), a CN packet is transmitted in the same RTP stream and indicated by the CN payload type. The CN packet update rate is left implementation specific. For example, the CN packet may be sent periodically or only when there is a significant change in the background noise characteristics. The CNG algorithm at the receiver uses the information in the CN payload to update its noise generation model and then produce an appropriate amount of comfort noise.
具有DTX功能的音频编解码器通常包括VAD、DTX和CNG算法。VAD的工作是区分输入信号中的活动和非活动语音段。在非活动语音段期间,CNG的作用是充分描述环境噪声,同时最小化传输速率。包含噪声描述的CN有效载荷(或SID帧)被发送到接收器以驱动CNG。DTX算法确定何时发送CN有效负载。在活动语音段期间,语音编解码器的分组被传输,并通过该编解码器的静态或动态有效负载类型在RTP报头中指示。在非活动语音段(静默期)的开始处,在相同的RTP流中发送CN分组,并由CN有效负载类型指示。CN数据包更新率取决于具体实现。例如,可以周期性地或仅当背景噪声特性中存在显著变化时才发送CN分组。接收器处的CNG算法使用CN有效载荷中的信息更新其噪声产生模型,然后产生适当数量的舒适噪声。
The CN payload format provides a minimum interoperability specification for communication of comfort noise parameters. The comfort noise analysis and synthesis as well as the VAD and DTX algorithms are unspecified and left implementation-specific. However, an example solution for G.711 has been tested and is described in Appendix II of ITU-T Recommendation G.711 [8]. It uses the VAD and DTX of G.729 Annex B [9] and a comfort noise generation algorithm (CNG), which is provided in the Appendix for information. Additional guidelines for use such as the factors affecting system performance in the design of the VAD/DTX/CNG algorithms are described in the Appendix.
CN有效载荷格式为舒适性噪声参数的通信提供了最低互操作性规范。舒适性噪声分析和合成以及VAD和DTX算法未指定,且只针对具体实现。然而,G.711的示例解决方案已经过测试,并在ITU-T建议G.711[8]的附录II中进行了描述。它使用G.729附录B[9]中的VAD和DTX以及舒适噪声产生算法(CNG),该算法在附录中提供,以供参考。附录中描述了其他使用指南,如VAD/DTX/CNG算法设计中影响系统性能的因素。
When using the Session Description Protocol (SDP) [11] to specify RTP payload information, the use of comfort noise is indicated by the inclusion of a payload type for CN on the media description line. When using CN with the RTP/AVP profile [10] and a codec whose RTP timestamp clock rate is 8000 Hz, such as G.711 (PCMU, static payload type 0), the static payload type 13 for CN can be used:
当使用会话描述协议(SDP)[11]来指定RTP有效载荷信息时,舒适性噪声的使用通过在媒体描述行上包括CN的有效载荷类型来指示。当将CN与RTP/AVP配置文件[10]和RTP时间戳时钟频率为8000 Hz的编解码器一起使用时,例如G.711(PCMU,静态有效负载类型0),可以使用CN的静态有效负载类型13:
m=audio 49230 RTP/AVP 0 13
m=音频49230 RTP/AVP 0 13
When using CN with a codec that has a different RTP timestamp clock rate, a dynamic payload type mapping (rtpmap attribute) is required.
将CN与具有不同RTP时间戳时钟速率的编解码器一起使用时,需要动态有效负载类型映射(rtpmap属性)。
This example shows CN used with the G.722.1 codec (see RFC 3047 [12]):
此示例显示了与G.722.1编解码器一起使用的CN(请参见RFC 3047[12]):
m=audio 49230 RTP/AVP 101 102 a=rtpmap:101 G7221/16000 a=fmtp:121 bitrate=24000 a=rtpmap:102 CN/16000
m=audio 49230 RTP/AVP 101 102 a=rtpmap:101 G7221/16000 a=fmtp:121 bitrate=24000 a=rtpmap:102 CN/16000
Omission of a payload type for CN on the media description line implies that this comfort noise payload format will not be used, but it does not imply that silence will not be suppressed. RTP allows discontinuous transmission (silence suppression) on any audio payload format. The receiver can detect silence suppression on the first packet received after the silence by observing that the RTP timestamp is not contiguous with the end of the interval covered by the previous packet even though the RTP sequence number has incremented only by one. The RTP marker bit is also normally set on such a packet.
在媒体描述行中省略CN的有效负载类型意味着不会使用该舒适噪声有效负载格式,但并不意味着不会抑制静音。RTP允许在任何音频有效负载格式上进行不连续传输(静音抑制)。接收机可以通过观察RTP时间戳与前一分组所覆盖的间隔的末端不连续来检测在静默之后接收的第一分组上的静默抑制,即使RTP序列号仅增加了一个。RTP标记位通常也设置在这样的数据包上。
This section defines a new RTP payload name and associated MIME type, CN (audio/CN). The payload format specified in this document is also assigned payload type 13 in the RTP Payload Types table of the RTP Parameters registry maintained by the Internet Assigned Numbers Authority (IANA).
本节定义了一个新的RTP有效负载名称和相关的MIME类型CN(audio/CN)。本文件中规定的有效载荷格式也在由互联网分配号码管理局(IANA)维护的RTP参数注册表的RTP有效载荷类型表中分配了有效载荷类型13。
MIME media type name: audio
MIME媒体类型名称:音频
MIME subtype name: CN
MIME子类型名称:CN
Required parameters: None
所需参数:无
Optional parameters: rate: specifies the RTP timestamp clock rate, which is usually (but not always) equal to the sampling rate. This parameter should have the same value as the codec used in conjunction with comfort noise. The default value is 8000.
可选参数:速率:指定RTP时间戳时钟速率,它通常(但不总是)等于采样速率。此参数的值应与与comfort noise结合使用的编解码器的值相同。默认值为8000。
Encoding considerations: This type is only defined for transfer via RTP [RFC 1889].
编码注意事项:此类型仅为通过RTP[RFC 1889]传输而定义。
Security considerations: see Section 7 "Security Considerations".
安全注意事项:见第7节“安全注意事项”。
Interoperability considerations: none
互操作性注意事项:无
Published specification: This document and Appendix II of ITU-T Recommendation G.711
出版规范:本文件和ITU-T建议G.711附录II
Applications which use this media type: Audio and video streaming and conferencing tools.
使用此媒体类型的应用程序:音频和视频流和会议工具。
Additional information: none
其他信息:无
Person & email address to contact for further information: Robert Zopf zopf@lucent.com
联系人和电子邮件地址,以获取更多信息:Robert Zopfzopf@lucent.com
Intended usage: COMMON
预期用途:普通
Author/Change controller: Author: Robert Zopf Change controller: IETF AVT Working Group
作者/变更负责人:作者:Robert Zopf变更负责人:IETF AVT工作组
RTP packets using the payload format defined in this specification are subject to the security considerations discussed in the RTP specification [1]. This implies that confidentiality of the media streams is achieved by encryption. Because the payload format is arranged end-to-end, encryption MAY be performed after encapsulation so there is no conflict between the two operations.
使用本规范中定义的有效负载格式的RTP数据包应遵守RTP规范[1]中讨论的安全注意事项。这意味着媒体流的机密性是通过加密实现的。因为有效负载格式是端到端排列的,所以可以在封装之后执行加密,因此两个操作之间没有冲突。
As this format transports background noise, there are no significant security, confidentiality, or authentication concerns.
由于这种格式传输背景噪声,因此不存在重大的安全性、机密性或身份验证问题。
[1] Schulzrinne, H., Casner, S., Frederick, R. and V. Jacobson, "RTP: A Transport Protocol for Real-Time Applications", RFC 1889, January 1996.
[1] Schulzrinne,H.,Casner,S.,Frederick,R.和V.Jacobson,“RTP:实时应用的传输协议”,RFC 1889,1996年1月。
[2] ITU Recommendation G.711 (11/88) - Pulse code modulation (PCM) of voice frequencies.
[2] ITU建议G.711(11/88)-语音频率的脉冲编码调制(PCM)。
[3] ITU Recommendation G.726 (12/90) - 40, 32, 24, 16 kbit/s Adaptive Differential Pulse Code Modulation (ADPCM).
[3] ITU建议G.726(12/90)-40、32、24、16kbit/s自适应差分脉冲编码调制(ADPCM)。
[4] ITU Recommendation G.727 (12/90) - 5-, 4-, 3- and 2-bits sample embedded adaptive differential pulse code modulation (ADPCM).
[4] ITU建议G.727(12/90)-5、4、3和2位采样嵌入式自适应差分脉冲编码调制(ADPCM)。
[5] ITU Recommendation G.728 (09/92) - Coding of speech at 16 kbits/s using low-delay code excited linear prediction.
[5] ITU建议G.728(09/92)-使用低延迟码激励线性预测对16 kbits/s的语音进行编码。
[6] ITU Recommendation G.722 (11/88) - 7 kHz audio-coding within 64 kbit/s.
[6] ITU建议G.722(11/88)-64kbit/s范围内的7kHz音频编码。
[7] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.
[7] Bradner,S.,“RFC中用于表示需求水平的关键词”,BCP 14,RFC 2119,1997年3月。
[8] Appendix II to Recommendation G.711 (02/2000) - A comfort noise payload definition for ITU-T G.711 use in packet-based multimedia communication systems.
[8] 建议G.711(02/2000)的附录II——ITU-T G.711在分组多媒体通信系统中使用的舒适性噪声有效载荷定义。
[9] Annex B (08/97) to Recommendation G.729 - C source code and test vectors for implementation verification of the algorithm of the G.729 silence compression scheme.
[9] 建议G.729的附录B(08/97)-C G.729静默压缩方案算法实现验证的源代码和测试向量。
[10] Schulzrinne, H., "RTP Profile for Audio and Video Conferences with Minimal Control", RFC 1890, January 1996.
[10] Schulzrinne,H.,“具有最小控制的音频和视频会议的RTP配置文件”,RFC 1890,1996年1月。
[11] Handley, M. and V. Jacobson, "SDP: Session Description Protocol", RFC 2327, April 1998.
[11] Handley,M.和V.Jacobson,“SDP:会话描述协议”,RFC 2327,1998年4月。
[12] Luthi, P., "RTP Payload Format for ITU-T Recommendation G.722.1", RFC 3047, January 2001.
[12] Luthi,P.,“ITU-T建议G.722.1的RTP有效载荷格式”,RFC3047,2001年1月。
Robert Zopf Lucent Technologies INS Access VoIP Networks 2G-234A 101 Crawfords Corner Rd Holmdel, NJ 07733-3030 US
Robert Zopf Lucent Technologies INS接入VoIP网络美国新泽西州霍姆德尔克劳福德角路101号2G-234A 07733-3030
Phone: 1-732-949-1667 Fax: 1-732-949-7016 EMail: zopf@lucent.com
电话:1-732-949-1667传真:1-732-949-7016电子邮件:zopf@lucent.com
Copyright (C) The Internet Society (2002). All Rights Reserved.
版权所有(C)互联网协会(2002年)。版权所有。
This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English.
本文件及其译本可复制并提供给他人,对其进行评论或解释或协助其实施的衍生作品可全部或部分编制、复制、出版和分发,不受任何限制,前提是上述版权声明和本段包含在所有此类副本和衍生作品中。但是,不得以任何方式修改本文件本身,例如删除版权通知或对互联网协会或其他互联网组织的引用,除非出于制定互联网标准的需要,在这种情况下,必须遵循互联网标准过程中定义的版权程序,或根据需要将其翻译成英语以外的其他语言。
The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assigns.
上述授予的有限许可是永久性的,互联网协会或其继承人或受让人不会撤销。
This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
本文件和其中包含的信息是按“原样”提供的,互联网协会和互联网工程任务组否认所有明示或暗示的保证,包括但不限于任何保证,即使用本文中的信息不会侵犯任何权利,或对适销性或特定用途适用性的任何默示保证。
Acknowledgement
确认
Funding for the RFC Editor function is currently provided by the Internet Society.
RFC编辑功能的资金目前由互联网协会提供。