Internet Engineering Task Force (IETF)                   M. Ramalho, Ed.
Request for Comments: 7655                                      P. Jones
Category: Standards Track                                  Cisco Systems
ISSN: 2070-1721                                                N. Harada
                                                                     NTT
                                                              M. Perumal
                                                                Ericsson
                                                                 L. Miao
                                                     Huawei Technologies
                                                           November 2015
RTP Payload Format for G.711.0
G.711.0的RTP有效负载格式
Abstract
摘要
This document specifies the Real-time Transport Protocol (RTP) payload format for ITU-T Recommendation G.711.0. ITU-T Rec. G.711.0 defines a lossless and stateless compression for G.711 packet payloads typically used in IP networks. This document also defines a storage mode format for G.711.0 and a media type registration for the G.711.0 RTP payload format.
本文件规定了ITU-T建议G.711.0的实时传输协议(RTP)有效载荷格式。ITU-T Rec.G.711.0定义了通常用于IP网络的G.711数据包有效载荷的无损和无状态压缩。本文档还定义了G.711.0的存储模式格式和G.711.0 RTP有效负载格式的媒体类型注册。
Status of This Memo
关于下段备忘
This is an Internet Standards Track document.
这是一份互联网标准跟踪文件。
This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Further information on Internet Standards is available in Section 2 of RFC 5741.
本文件是互联网工程任务组(IETF)的产品。它代表了IETF社区的共识。它已经接受了公众审查,并已被互联网工程指导小组(IESG)批准出版。有关互联网标准的更多信息,请参见RFC 5741第2节。
Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at http://www.rfc-editor.org/info/rfc7655.
有关本文件当前状态、任何勘误表以及如何提供反馈的信息,请访问http://www.rfc-editor.org/info/rfc7655.
Copyright Notice
版权公告
Copyright (c) 2015 IETF Trust and the persons identified as the document authors. All rights reserved.
版权所有(c)2015 IETF信托基金和确定为文件作者的人员。版权所有。
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
本文件受BCP 78和IETF信托有关IETF文件的法律规定的约束(http://trustee.ietf.org/license-info)自本文件出版之日起生效。请仔细阅读这些文件,因为它们描述了您对本文件的权利和限制。从本文件中提取的代码组件必须包括信托法律条款第4.e节中所述的简化BSD许可证文本,并提供简化BSD许可证中所述的无担保。
Table of Contents
目录
1. Introduction
2. Requirements Language
3. G.711.0 Codec Background
   3.1. General Information and Use of the ITU-T G.711.0 Codec
   3.2. Key Properties of G.711.0 Design
   3.3. G.711 Input Frames to G.711.0 Output Frames
      3.3.1. Multiple G.711.0 Output Frames per RTP Payload Considerations
4. RTP Header and Payload
   4.1. G.711.0 RTP Header
   4.2. G.711.0 RTP Payload
      4.2.1. Single G.711.0 Frame per RTP Payload Example
      4.2.2. G.711.0 RTP Payload Definition
         4.2.2.1. G.711.0 RTP Payload Encoding Process
      4.2.3. G.711.0 RTP Payload Decoding Process
      4.2.4. G.711.0 RTP Payload for Multiple Channels
5. Payload Format Parameters
   5.1. Media Type Registration
   5.2. Mapping to SDP Parameters
   5.3. Offer/Answer Considerations
   5.4. SDP Examples
      5.4.1. SDP Example 1
      5.4.2. SDP Example 2
6. G.711.0 Storage Mode Conventions and Definition
   6.1. G.711.0 PLC Frame
   6.2. G.711.0 Erasure Frame
   6.3. G.711.0 Storage Mode Definition
7. IANA Considerations
8. Security Considerations
9. Congestion Control
10. References
   10.1. Normative References
   10.2. Informative References
Acknowledgements
Contributors
Authors' Addresses
The International Telecommunication Union (ITU-T) Recommendation G.711.0 [G.711.0] specifies a stateless and lossless compression for G.711 packet payloads typically used in Voice over IP (VoIP) networks. This document specifies the Real-time Transport Protocol (RTP) [RFC3550] payload format and storage modes for this compression.
国际电信联盟(ITU-T)建议G.711.0[G.711.0]规定了通常用于IP语音(VoIP)网络的G.711数据包有效载荷的无状态无损压缩。本文件规定了此压缩的实时传输协议(RTP)RFC 3550[RFC3550]有效负载格式和存储模式。
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].
本文件中的关键词“必须”、“不得”、“要求”、“应”、“不应”、“应”、“不应”、“建议”、“可”和“可选”应按照RFC 2119[RFC2119]中所述进行解释。
ITU-T Recommendation G.711.0 [G.711.0] is a lossless and stateless compression mechanism for ITU-T Recommendation G.711 [G.711] and thus is not a "codec" in the sense of "lossy" codecs typically carried by RTP. When negotiated end-to-end, ITU-T Rec. G.711.0 is negotiated as if it were a codec, with the understanding that ITU-T Rec. G.711.0 losslessly encoded the underlying (lossy) G.711 Pulse Code Modulation (PCM) sample representation of an audio signal. For this reason, ITU-T Rec. G.711.0 will be interchangeably referred to in this document as a "lossless data compression algorithm" or a "codec", depending on context. Within this document, individual G.711 PCM samples will be referred to as "G.711 symbols" or just "symbols" for brevity.
ITU-T建议G.711.0[G.711.0]是ITU-T建议G.711[G.711]的无损和无状态压缩机制,因此不是RTP通常携带的“有损”编解码器意义上的“编解码器”。当协商端到端时,ITU-T Rec.G.711.0被协商为一个编解码器,理解为ITU-T Rec.G.711.0对音频信号的底层(有损)G.711脉冲编码调制(PCM)样本表示进行无损编码。因此,ITU-T Rec.G.711.0在本文件中将根据上下文互换地称为“无损数据压缩算法”或“编解码器”。在本文件中,为简洁起见,单个G.711 PCM样品将被称为“G.711符号”或仅称为“符号”。
This section describes the ITU-T Recommendation G.711.0 [G.711.0] codec, its properties, its typical use cases, and its key design properties.
本节介绍了ITU-T建议G.711[G.711]编解码器、其特性、典型用例及其关键设计特性。
ITU-T Recommendation G.711 is the benchmark standard for narrowband telephony. It has been successful for many decades because of its proven voice quality, ubiquity, and utility. A new ITU-T recommendation, G.711.0, has been established to define a stateless and lossless compression for G.711 packet payloads typically used in VoIP networks. ITU-T Rec. G.711.0 is also known as ITU-T Rec. G.711 Annex A [G.711-A1], as ITU-T Rec. G.711 Annex A is effectively a pointer to ITU-T Rec. G.711.0. Henceforth in this document, ITU-T Rec. G.711.0 will simply be referred to as "G.711.0" and ITU-T Rec. G.711 simply as "G.711".
ITU-T建议G.711是窄带电话的基准标准。几十年来,由于其经验证的语音质量、普遍性和实用性,它已经取得了成功。新的ITU-T建议G.711.0已经建立,用于定义VoIP网络中通常使用的G.711数据包有效载荷的无状态无损压缩。ITU-T Rec.G.711.0也称为ITU-T Rec.G.711附录A[G.711-A1],因为ITU-T Rec.G.711附录A实际上是ITU-T Rec.G.711.0的指针。此后,在本文件中,ITU-T Rec.G.711.0将简称为“G.711.0”,ITU-T Rec.G.711将简称为“G.711”。
G.711.0 may be employed end-to-end, in which case the RTP payload format specification and use is nearly identical to the G.711 RTP specification found in RFC 3551 [RFC3551]. The only significant differences for G.711.0 are the required use of a dynamic payload type (the static PT of 0 or 8 is presently almost always used with G.711 even though dynamic assignment of other payload types is allowed) and the recommendation not to use Voice Activity Detection (see Section 4.1).
G.711.0可以端到端使用,在这种情况下,RTP有效负载格式规范和使用与RFC 3551[RFC3551]中的G.711 RTP规范几乎相同。G.711.0唯一的显著区别是需要使用动态有效负载类型(目前,静态PT为0或8几乎总是与G.711一起使用,即使允许动态分配其他有效负载类型),以及建议不要使用语音活动检测(见第4.1节)。
G.711.0, being both lossless and stateless, may also be employed as a lossless compression mechanism for G.711 payloads anywhere between end systems that have negotiated use of G.711. Because the only significant difference between the G.711 RTP payload format header and the G.711.0 payload format header defined in this document is the payload type, a G.711 RTP packet can be losslessly converted to a G.711.0 RTP packet simply by compressing the G.711 payload (thus creating a G.711.0 payload), changing the payload type to the dynamic value desired and copying all the remaining G.711 RTP header fields into the corresponding G.711.0 RTP header. In a similar manner, the corresponding decompression of the G.711.0 RTP packet thus created back to the original source G.711 RTP packet can be accomplished by losslessly decompressing the G.711.0 payload back to the original source G.711 payload, changing the payload type back to the payload type of the original G.711 RTP packet and copying all the remaining G.711.0 RTP header fields into the corresponding G.711 RTP header. As a packet produced by the compression and decompression as described above is indistinguishable in every detail to the source G.711 packet, such compression can be made invisible to the end systems. Specification of how systems on the path between the end systems discover each other and negotiate the use of G.711.0 compression as described in this paragraph is outside the scope of this document.
G.711.0既无损又无状态,也可用作协商使用G.711的终端系统之间G.711有效载荷的无损压缩机制。由于本文件中定义的G.711 RTP有效负载格式报头和G.711.0有效负载格式报头之间的唯一显著差异是有效负载类型,因此只需压缩G.711有效负载(从而创建G.711.0有效负载),即可将G.711 RTP包无损地转换为G.711.0 RTP包,将有效负载类型更改为所需的动态值,并将所有剩余的G.711 RTP标头字段复制到相应的G.711.0 RTP标头中。以类似的方式,可以通过无损地将G.711.0有效载荷解压缩回原始源G.711有效载荷来完成将由此创建的G.711.0 RTP分组解压缩回原始源G.711 RTP分组的相应解压缩,将有效负载类型更改回原始G.711 RTP数据包的有效负载类型,并将所有剩余的G.711.0 RTP报头字段复制到相应的G.711 RTP报头中。由于通过如上所述的压缩和解压缩产生的分组在每个细节上与源G.711分组不可区分,因此可以使这样的压缩对终端系统不可见。本段所述的终端系统之间路径上的系统如何相互发现并协商使用G.711.0压缩的规范不在本文件范围内。
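The conversion described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the function name convert_rtp_packet and the transform callable are hypothetical, and the callable merely stands in for a real G.711.0 compressor or decompressor, which is not implemented here. The sketch also assumes packets without RTP padding or header extensions.

```python
from typing import Callable

RTP_FIXED_HEADER_LEN = 12  # octets in the fixed RTP header (RFC 3550)


def convert_rtp_packet(packet: bytes, new_pt: int,
                       transform: Callable[[bytes], bytes]) -> bytes:
    """Return a copy of `packet` with the payload type replaced by `new_pt`
    and the payload replaced by `transform(payload)`.

    `transform` is a hypothetical stand-in for a G.711.0 compressor
    (G.711 -> G.711.0) or decompressor (G.711.0 -> G.711).
    All other header fields are copied unchanged.
    """
    if len(packet) < RTP_FIXED_HEADER_LEN:
        raise ValueError("packet shorter than the RTP fixed header")
    if not 0 <= new_pt <= 127:
        raise ValueError("payload type must fit in 7 bits")
    first = packet[0]
    if first & 0x30:  # P bit (0x20) or X bit (0x10) set
        raise ValueError("sketch does not handle padding or extensions")
    csrc_count = first & 0x0F
    header_len = RTP_FIXED_HEADER_LEN + 4 * csrc_count
    if len(packet) < header_len:
        raise ValueError("packet shorter than its CSRC list")
    header = bytearray(packet[:header_len])
    # Keep the marker bit (top bit of octet 1); replace the 7-bit PT.
    header[1] = (header[1] & 0x80) | new_pt
    return bytes(header) + transform(packet[header_len:])
```

Because every field other than the payload type and the payload is copied verbatim, applying the function once with a compressor and once more with the matching decompressor and the original payload type yields the source packet again.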
It is informative to note that G.711.0, being both lossless and stateless, can be employed multiple times (e.g., on multiple, individual hops or series of hops) of a given flow with no degradation of quality relative to end-to-end G.711. Stated another way, multiple "lossless transcodes" from/to G.711.0/G.711 do not affect voice quality, unlike the degradation that typically occurs with lossy transcodes to/from dissimilar codecs.
值得注意的是,无损和无状态的G.711.0可以多次(例如,在多个、单个跳或一系列跳上)使用给定流,而不会降低相对于端到端G.711的质量。换句话说,从G.711.0/G.711到G.711.0/G.711的多个“无损转码”不会像从不同编解码器到不同编解码器的有损转码那样影响语音质量。
Lastly, it is expected that G.711.0 will be used as an archival format for recorded G.711 streams. Therefore, a G.711.0 Storage Mode Format is also included in this document.
最后,预计G.711.0将用作记录的G.711流的存档格式。因此,本文件中还包括G.711.0存储模式格式。
The fundamental design of G.711.0 resulted from the desire to losslessly encode and compress frames of G.711 symbols independent of what types of signals those G.711 frames contained. The primary G.711.0 use case is for G.711 encoded, zero-mean, acoustic signals (such as speech and music).
G.711.0的基本设计源于对G.711符号的帧进行无损编码和压缩的愿望,而与这些G.711符号包含的信号类型无关。主要的G.711.0用例用于G.711编码的零均值声音信号(如语音和音乐)。
G.711.0 attributes are below:
G.711.0属性如下:
A1 Compression for zero-mean acoustic signals: The primary use case for which G.711.0 was designed is the compression of G.711 payloads that contain "speech" or other zero-mean acoustic signals. G.711.0 obtains greater than 50% average compression in service provider environments [ICASSP].
A1零平均声信号压缩:G.711.0被设计为压缩包含“语音”或其他零平均声信号的G.711有效载荷的主要用例。G.711.0在服务提供商环境[ICASSP]中获得超过50%的平均压缩。
A2 Lossless for any G.711 payload: G.711.0 was designed to be lossless for any valid G.711 payload - even if the payload consisted of apparently random G.711 symbols (e.g., a modem or FAX payload). G.711.0 could be used for "aggregate 64 kbps G.711 channels" carried over IP without explicit concern if a subset of these channels happened to be carrying something other than voice or general audio. To the extent that a particular channel carried something other than voice or general audio, G.711.0 ensured that it was carried losslessly, if not significantly compressed.
A2任何G.711有效载荷的无损:G.711.0设计为任何有效G.711有效载荷的无损-即使有效载荷由明显随机的G.711符号组成(例如调制解调器或传真有效载荷)。G.711.0可用于通过IP传输的“聚合64 kbps G.711通道”,而无需明确关注这些通道的子集是否恰好承载语音或一般音频以外的内容。在某种程度上,如果某个特定频道承载的不是语音或一般音频,则G.711.0确保该频道即使没有显著压缩,也能无损承载。
A3 Stateless: Compression of a frame of G.711 symbols was only to be dependent on that frame and not on any prior frame. Although greater compression is usually available by observing a longer history of past G.711 symbols, it was decided that the compression design would be stateless to completely eliminate error propagation common in many lossy codec designs (e.g., ITU-T Rec. G.729 [G.729] and ITU-T Rec. G.722 [G.722]). That is, the decoding process need not be concerned about lost prior packets because the decompression of a given G.711.0 frame is not dependent on potentially lost prior G.711.0 frames. Owing to this stateless property, the frames input to the G.711.0 encoder may be changed "on-the-fly" (a 5 ms encoding could be followed by a 20 ms encoding).
A3无状态:G.711符号帧的压缩仅依赖于该帧,而不依赖于任何先前帧。虽然通过观察过去的G.711符号的较长历史通常可以获得更大的压缩,但确定压缩设计将是无状态的,以完全消除许多有损编解码器设计中常见的错误传播(例如,ITU-T Rec.G.729[G.729]和ITU-T Rec.G.722[G.722])。也就是说,解码过程不需要关心丢失的先前分组,因为给定G.711.0帧的解压缩不依赖于潜在丢失的先前G.711.0帧。由于这种无状态属性,输入到G.711.0编码器的帧可能会“在运行中”更改(5毫秒的编码后可能是20毫秒的编码)。
A4 Self-describing: This property is defined as the ability to determine how many source G.711 samples are contained within the G.711.0 frame solely by information contained within the G.711.0 frame. Generally, the number of source G.711 symbols can be determined by decoding the initial octets of the compressed G.711.0 frame (these octets are called "prefix codes" in the standard). A G.711.0 decoder need not know how many symbols are contained in the original G.711 frame (e.g., parameter ptime in the Session Description Protocol (SDP) [RFC4566]), as it is able to decompress the G.711.0 frame presented to it without signaling knowledge.
A4自描述:该属性定义为仅通过G.711.0帧内包含的信息确定G.711.0帧内包含多少源G.711样本的能力。通常,源G.711符号的数量可以通过解码压缩的G.711.0帧的初始八位字节来确定(这些八位字节在标准中称为“前缀码”)。G.711.0解码器不需要知道原始G.711帧中包含多少符号(例如,会话描述协议(SDP)[RFC4566]中的参数ptime),因为它能够在不知道信令的情况下解压缩呈现给它的G.711.0帧。
A5 Accommodate G.711 payload sizes typically used in IP: G.711 input frames of length typically found in VoIP applications represent SDP ptime values of 5 ms, 10 ms, 20 ms, 30 ms, or 40 ms. Because the dominant sampling frequency for G.711 is 8000 samples per second, G.711.0 was designed to compress G.711 input frames of 40, 80, 160, 240, or 320 samples.
A5适应IP中典型使用的G.711有效负载大小:VoIP应用中典型长度的G.711输入帧表示5毫秒、10毫秒、20毫秒、30毫秒或40毫秒的SDP ptime值。由于G.711的主要采样频率为每秒8000个样本,G.711.0设计用于压缩40、80、160、240的G.711输入帧,或320个样本。
A6 Bounded expansion: Since attribute A2 above requires G.711.0 to be lossless for any payload (which could consist of any combination of octets with each octet spanning the entire space of 2^8 values), by definition there exists at least one potential G.711 payload that must be "uncompressible". Since the quantum of compression is an octet, the minimum expansion of such an uncompressible payload was designed to be the minimum possible of one octet. Thus, G.711.0 "compressed" frames can be of length one octet to X+1 octets, where X is the size of the input G.711 frame in octets. G.711.0 can therefore be viewed as a Variable Bit Rate (VBR) encoding in which the size of the G.711.0 output frame is a function of the G.711 symbols input to it.
A6有界扩展:由于上述属性A2要求G.711.0对任何有效载荷都是无损的(可能由八位字节的任意组合组成,每个八位字节跨越2^8个值的整个空间),因此根据定义,至少存在一个必须是“不可压缩”的潜在G.711有效载荷。由于压缩量是一个八位元,因此这种不可压缩有效载荷的最小扩展被设计为一个八位元的最小可能。因此,G.711.0“压缩”帧的长度可以是一个八位字节到X+1个八位字节,其中X是输入G.711帧的大小(以八位字节为单位)。因此,G.711.0可被视为可变比特率(VBR)编码,其中G.711.0输出帧的大小是G.711符号输入的函数。
A7 Algorithmic delay: G.711.0 was designed to have the algorithmic delay equal to the time represented by the number of samples in the G.711 input frame (i.e., no "look-ahead").
A7算法延迟:G.711.0设计的算法延迟等于G.711输入帧中样本数表示的时间(即,无“前瞻”)。
A8 Low Complexity: Less than 1.0 Weighted Million Operations Per Second (WMOPS) average and low memory footprint (~5k octets RAM, ~5.7k octets ROM, and ~3.6 basic operations) [ICASSP] [G.711.0].
A8低复杂性:平均每秒不到100万次加权运算(WMOPS),内存占用率低(~5k八位字节RAM、~5.7k八位字节ROM和~3.6次基本运算)[ICASSP][G.711.0]。
A9 Both A-law and mu-law supported: G.711 has two operating laws, A-law and mu-law. These two laws are also known as PCMA and PCMU in RTP applications [RFC3551].
A9 A-law和mu-law均受支持:G.711有两个运行法则,A-law和mu-law。这两条定律在RTP应用中也称为PCMA和PCMU[RFC3551]。
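The arithmetic behind attributes A5 and A6 can be made concrete with a short sketch. The function names are illustrative assumptions, not part of G.711.0 or this specification; the constants come from the attribute list above (8000 samples per second, one octet per G.711 symbol, frames of 40, 80, 160, 240, or 320 samples, and output frames of 1 to X+1 octets). The bit rates computed here cover the codec payload only and exclude RTP, UDP, and IP headers.

```python
SAMPLE_RATE_HZ = 8000                       # dominant G.711 sampling rate (A5)
SUPPORTED_FRAME_SAMPLES = (40, 80, 160, 240, 320)


def samples_for_ptime(ptime_ms: int) -> int:
    """Number of G.711 symbols (octets) in a frame of `ptime_ms` ms."""
    samples, remainder = divmod(SAMPLE_RATE_HZ * ptime_ms, 1000)
    if remainder or samples not in SUPPORTED_FRAME_SAMPLES:
        raise ValueError(f"ptime {ptime_ms} ms is not a supported frame size")
    return samples


def output_size_bounds(input_octets: int) -> tuple:
    """Smallest and largest G.711.0 frame for an X-octet input (A6)."""
    return (1, input_octets + 1)


def payload_bitrate_bounds_bps(ptime_ms: int) -> tuple:
    """Payload-only bit-rate range for one G.711.0 frame per `ptime_ms`."""
    smallest, largest = output_size_bounds(samples_for_ptime(ptime_ms))
    frames_per_second = 1000 / ptime_ms
    return (smallest * 8 * frames_per_second,
            largest * 8 * frames_per_second)
```

For a 20 ms frame this gives 160 input octets, an output frame of 1 to 161 octets, and a payload bit rate between 400 bit/s and 64,400 bit/s, compared with the constant 64,000 bit/s of uncompressed G.711.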
These attributes generally make it trivial to compress a G.711 input frame consisting of 40, 80, 160, 240, or 320 samples. After the input frame is presented to a G.711.0 encoder, a G.711.0 "self-describing" output frame is produced. The number of samples contained within this frame is easily determined at the G.711.0 decoder by virtue of attribute A4. The G.711.0 decoder can decode the G.711.0 frame back to a G.711 frame by using only data within the G.711.0 frame.
这些属性通常使压缩由40、80、160、240或320个样本组成的G.711输入帧变得简单。将输入帧呈现给G.711.0编码器后,生成G.711.0“自描述”输出帧。在G.711.0解码器处,借助属性A4可以容易地确定该帧中包含的样本数。通过仅使用G.711.0帧内的数据,G.711.0解码器可以将G.711.0帧解码回G.711帧。
Lastly, we note that losing a G.711.0 encoded packet is identical in effect to losing a G.711 packet (when using RTP); this is because a G.711.0 payload, like the corresponding G.711 payload, is stateless. Thus, it is anticipated that existing G.711 Packet Loss Concealment (PLC) mechanisms will be employed when a G.711.0 packet is lost and that the resulting Mean Opinion Score (MOS) degradation will be identical to that caused by the loss of a G.711 packet.
最后,我们注意到丢失G.711.0编码的数据包在效果上与丢失G.711数据包相同(当使用RTP时);这是因为G.711.0有效负载与相应的G.711有效负载一样是无状态的。因此,预计当G.711.0分组丢失时,将采用现有的G.711分组丢失隐藏(PLC)机制,并且将实现与G.711丢失相同的MOS降级。
G.711.0 is a lossless and stateless compression of G.711 frames. Figure 1 depicts this where "A" is the process of G.711.0 encoding and "B" is the process of G.711.0 decoding.
G.711.0是G.711帧的无损和无状态压缩。图1描述了其中“A”是G.711.0编码过程,“B”是G.711.0解码过程。
   |--------------------------|  A   |------------------------------|
   |    G.711 Input Frame     |----->|     G.711.0 Output Frame     |
   |       of X Octets        |      |  containing 1 to X+1 Octets  |
   |  (where X MUST be 40, 80,|      | (precise value dependent on  |
   |  160, 240, or 320 octets)|<-----| G.711.0 ability to compress) |
   |__________________________|  B   |______________________________|
Figure 1: 1:1 Mapping from G.711 Input Frame to G.711.0 Output Frame
图1:G.711输入帧到G.711.0输出帧的1:1映射
Note that the mapping is 1:1 (lossless) in both directions, subject to two constraints. The first constraint is that the input frame provided to the G.711.0 encoder (process "A") has a specific number of input G.711 symbols consistent with attribute A5 (40, 80, 160, 240, or 320 octets). The second constraint is that the companding law used to create the G.711 input frame (A-law or mu-law) must be known, consistent with attribute A9.
请注意,映射在两个方向上都是1:1(无损),受两个约束。第一个约束是,提供给G.711.0编码器(过程“A”)的输入帧具有与属性A5一致的特定数量的输入G.711符号(40、80、160、240或320个八位字节)。第二个约束条件是,用于创建G.711输入帧的压扩法则(A法则或mu法则)必须已知,与属性A9一致。
Subject to these two constraints, the input G.711 frame is processed by the G.711.0 encoder ("process A") and produces a "self-describing" G.711.0 output frame, consistent with attribute A4. Depending on the source G.711 symbols, the G.711.0 output frame can contain anywhere from 1 to X+1 octets, where X is the number of input G.711 symbols. Compression is achieved for virtually every zero-mean acoustic signal encoded by G.711.0.
根据这两个约束条件,输入G.711帧由G.711.0编码器处理(“过程A”),并产生与属性A4一致的“自描述”G.711.0输出帧。根据源G.711符号,G.711.0输出帧可以包含1到X+1个八位字节,其中X是输入G.711符号的数量。几乎所有由G.711.0编码的零平均声信号的压缩结果。
Since the G.711.0 output frame is "self-describing", a G.711.0 decoder (process "B") can losslessly reproduce the original G.711 input frame with only the knowledge of which companding law was used (A-law or mu-law). The first octet of a G.711.0 frame is called the "Prefix Code" octet; the information within this octet conveys how many G.711 symbols the decoder is to create from a given G.711.0 input frame (i.e., 0, 40, 80, 160, 240, or 320). The Prefix Code value of 0x00 is used to denote zero G.711 source symbols, which allows the use of 0x00 as a payload padding octet (described later in Section 3.3.1).
由于G.711.0输出帧是“自描述的”,因此G.711.0解码器(过程“B”)可以仅在知道使用了哪种压扩律(a律或mu律)的情况下无损地再现原始G.711输入帧。G.711.0帧的第一个八位组称为“前缀码”八位组;该八位组中的信息表示解码器将从给定的G.711.0输入帧(即0、40、80、160、240或320)创建多少G.711符号。前缀代码值0x00用于表示零G.711源符号,这允许使用0x00作为有效负载填充八位字节(稍后在第3.3.1节中描述)。
Since G.711.0 was designed with typical G.711 payload lengths as a design constraint (attribute A5), this lossless encoding requires knowledge of only the companding law being used. This information is anticipated to be signaled in SDP and is described later in this document.
由于G.711.0是以典型的G.711有效载荷长度作为设计约束(属性A5)进行设计的,因此这种无损编码只能在了解压缩法则的情况下进行。该信息预计将在SDP中发出信号,并在本文件后面进行描述。
If the original inputs were known to be from a zero-mean acoustic signal coded by G.711, an intelligent G.711.0 encoder could infer the G.711 companding law in use (via G.711 input signal amplitude histogram statistics). Likewise, an intelligent G.711.0 decoder producing G.711 from the G.711.0 frames could also infer which encoding law is in use. Thus, G.711.0 could be designed for use in applications that have limited stream signaling between the G.711 endpoints (i.e., they only know "G.711 at 8k sampling is being used", but nothing more). Such usage is not further described in this document. Additionally, if the original inputs were known to come from zero-mean acoustic signals, an intelligent G.711.0 encoder could tell if the G.711 payload had been encrypted -- as the symbols would not have the distribution expected in either companding law and would appear random. Such determination is also not further discussed in this document.
如果已知原始输入来自由G.711编码的零平均声信号,则智能G.711.0编码器可推断使用中的G.711压扩定律(通过G.711输入信号振幅直方图统计)。同样,从G.711.0帧生成G.711的智能G.711.0解码器也可以推断使用的是哪种编码法则。因此,G.711.0可设计用于在G.711端点之间具有有限流信令的应用中(即,它们仅知道“正在使用8k采样的G.711”,仅此而已)。本文档中不再进一步描述这种用法。此外,如果已知原始输入来自零平均声信号,则智能G.711.0编码器可以判断G.711.0有效载荷是否已加密,因为符号不会具有压扩定律中预期的分布,并且看起来是随机的。本文件中也不进一步讨论此类确定。
It is easily seen that this process is 1:1 and that lossless compression based on G.711.0 can be employed multiple times, as the original G.711 input symbols are always reproduced with 100% fidelity.
很容易看出,该过程为1:1,并且基于G.711.0的无损压缩可以多次使用,因为原始G.711输入符号始终以100%保真度再现。
As a general rule, G.711.0 frames containing more source G.711 symbols (from a given channel) will typically result in higher compression, but there are exceptions to this rule. A G.711.0 encoder may choose to encode 20 ms of input G.711 symbols as: 1) a single 20 ms G.711.0 frame, or 2) as two 10 ms G.711.0 frames, or 3) any other combination of 5 ms or 10 ms G.711.0 frames -- depending on which encoding resulted in fewer bits. As an example, an intelligent encoder might encode 20 ms of G.711 symbols as two 10 ms G.711.0 frames if the first 10 ms was "silence" and two G.711.0 frames took fewer bits than any other possible encoding combination of G.711.0 frame sizes.
作为一般规则,包含更多源G.711符号(来自给定信道)的G.711.0帧通常会导致更高的压缩,但此规则也有例外。G.711.0编码器可选择将20ms的输入G.711符号编码为:1)单个20ms G.711.0帧,或2)两个10ms G.711.0帧,或3)5ms或10ms G.711.0帧的任何其他组合,具体取决于哪种编码产生的比特数较少。例如,如果前10ms为“静默”,且两个G.711.0帧比G.711.0帧大小的任何其他可能编码组合占用更少比特,则智能编码器可将20ms的G.711符号编码为两个10ms的G.711.0帧。
During the process of G.711.0 standardization, it was recognized that although it is sometimes advantageous to encode integer multiples of 40 G.711 symbols in whatever input symbol format resulted in the most compression (as per above), the simplest choice is to encode the entire ptime's worth of input G.711 symbols into one G.711.0 frame (if the ptime supported it). This is especially so since the larger number of source G.711 symbols typically resulted in the highest compression anyway and there is added complexity in searching for other possibilities (involving more G.711.0 frames) that were unlikely to produce a more bit efficient result.
在G.711.0标准化过程中,人们认识到,尽管有时以导致最大压缩的任何输入符号格式对40个G.711符号的整数倍进行编码是有利的(如上所述),但最简单的选择是将整个ptime值的输入G.711符号编码到一个G.711.0帧中(如果ptime支持)。尤其如此,因为较大数量的源G.711符号通常会导致最高的压缩,而且搜索其他不太可能产生更高比特效率结果的可能性(涉及更多G.711.0帧)会增加复杂性。
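The search over frame-size combinations mentioned above can be sketched as a small dynamic program. This is an illustration under stated assumptions, not the method used by any particular encoder: best_split is a hypothetical name, and encoded_size is a hypothetical callable that returns the number of octets a real G.711.0 encoder would produce for a given block of G.711 symbols (no such encoder is implemented here).

```python
from typing import Callable, List

FRAME_SAMPLES = (40, 80, 160, 240, 320)  # frame sizes supported by G.711.0


def best_split(symbols: bytes,
               encoded_size: Callable[[bytes], int]) -> List[int]:
    """Return the list of frame sizes (in symbols) that minimizes the
    total encoded octets for `symbols`.

    `encoded_size` is a hypothetical stand-in for running a G.711.0
    encoder on one candidate frame and measuring the output length.
    """
    n = len(symbols)
    if n == 0 or n % 40:
        raise ValueError("input must be a non-zero multiple of 40 symbols")
    best = [float("inf")] * (n + 1)   # best[i]: min octets for symbols[:i]
    choice = [0] * (n + 1)            # choice[i]: size of last frame used
    best[0] = 0
    for end in range(40, n + 1, 40):
        for size in FRAME_SAMPLES:
            start = end - size
            if start < 0:
                break                 # sizes are ascending; none larger fit
            cost = best[start] + encoded_size(symbols[start:end])
            if cost < best[end]:
                best[end] = cost
                choice[end] = size
    sizes = []
    position = n
    while position > 0:               # walk back through recorded choices
        sizes.append(choice[position])
        position -= choice[position]
    sizes.reverse()
    return sizes
```

Each candidate frame requires a trial encoding, which is the added complexity the text refers to; an encoder that always emits one frame per ptime skips this search entirely.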
The design of ITU-T Rec. G.711.0 [G.711.0] foresaw the possibility of multiple G.711.0 input frames in that the decoder was defined to decode what it refers to as an incoming "bit stream". For this specification, the bit stream is the G.711.0 RTP payload itself. Thus, the decoder will take the G.711.0 RTP payload and will produce an output frame containing the original G.711 symbols independent of how many G.711.0 frames were present in it. Additionally, any number of 0x00 padding octets placed between the G.711.0 frames will be silently (and safely) ignored by the G.711.0 decoding process (Section 4.2.3).
ITU-T Rec.G.711.0[G.711.0]的设计预见了多个G.711.0输入帧的可能性,因为解码器被定义为解码它所指的传入“比特流”。对于本规范,比特流是G.711.0 RTP有效负载本身。因此,解码器将采用G.711.0 RTP有效载荷,并将产生包含原始G.711符号的输出帧,而与其中存在多少G.711.0帧无关。此外,G.711.0解码过程(第4.2.3节)将无声(安全)忽略放置在G.711.0帧之间的任何数量的0x00填充八位字节。
To recap, a G.711.0 encoder may choose to encode incoming G.711 symbols into one or more G.711.0 frames and put the resultant frame(s) into the G.711.0 RTP payload. Zero or more 0x00 padding octets may also be included in the G.711.0 RTP payload. The G.711.0 decoder, being insensitive to the number of G.711.0 encoded frames contained within the payload, will decode the G.711.0 RTP payload into the source G.711 symbols. Although examples of single and multiple G.711.0 frame cases are illustrated in Section 4.2, the multiple G.711.0 frame cases MUST be supported, and no negotiation (SDP or otherwise) is required for them.
总而言之,G.711.0编码器可以选择将传入的G.711符号编码为一个或多个G.711.0帧,并将结果帧放入G.711.0 RTP有效载荷。G.711.0 RTP有效负载中还可能包含零个或多个0x00填充八位字节。G.711.0解码器对其中包含的G.711.0编码帧的数量不敏感,将G.711.0 RTP有效载荷解码为源G.711符号。尽管第4.2节中说明了单个或多个G.711框架案例的示例,但必须支持多个G.711.0框架案例,且无需协商(SDP或其他)。
In this section, we describe the precise format for G.711.0 frames carried via RTP. We begin with an RTP header description relative to G.711, then provide two G.711.0 payload examples.
在本节中,我们将描述通过RTP传输的G.711.0帧的精确格式。我们从与G.711相关的RTP头描述开始,然后提供两个G.711.0有效负载示例。
Relative to G.711 RTP headers, the utilization of G.711.0 does not create any special requirements with respect to the contents of the RTP packet header. The only significant difference is that the payload type (PT) RTP header field MUST have a value corresponding to the dynamic payload type assigned to the flow. This is in contrast to most current uses of G.711 that typically use the static payload assignment of PT = 0 (PCMU) or PT = 8 (PCMA) [RFC3551] even though the negotiation and use of dynamic payload types is allowed for G.711. With the exception of rare PT exhaustion cases, the existing G.711 PT values of 0 and 8 MUST NOT be used for G.711.0 (helping to avoid possible payload confusion with G.711 payloads).
相对于G.711 RTP报头,G.711.0的使用不会对RTP数据包报头的内容产生任何特殊要求。唯一显著的区别是有效负载类型(PT)RTP报头字段必须具有与分配给流的动态有效负载类型相对应的值。这与G.711的大多数当前使用形成对比,G.711通常使用PT=0(PCMU)或PT=8(PCMA)[RFC3551]的静态有效负载分配,即使G.711允许协商和使用动态有效负载类型。除极少数PT耗尽情况外,现有G.711 PT值0和8不得用于G.711.0(有助于避免可能与G.711有效载荷混淆)。
Voice Activity Detection (VAD) SHOULD NOT be used when G.711.0 is negotiated because G.711.0 obtains high compression during "VAD silence intervals" and one of the advantages of G.711.0 over G.711 with VAD is the lack of any VAD-inducing artifacts in the received signal. However, if VAD is employed, the Marker bit (M) MUST be set in the first packet of a talkspurt (the first packet after a silence period in which packets have not been transmitted contiguously as per rules specified in [RFC3551] for G.711 payloads). This definition, being consistent with the G.711 RTP VAD use, further allows lossless transcoding between G.711 RTP packets and G.711.0 RTP packets as described in Section 3.1.
当协商G.711.0时,不应使用语音活动检测(VAD),因为G.711.0在“VAD静默间隔”期间获得高压缩,并且G.711.0相对于G.711和VAD的优点之一是接收信号中没有任何VAD诱发伪影。然而,如果采用VAD,则必须在TalkSport的第一个分组中设置标记位(M)(静默期之后的第一个分组,在静默期中,分组没有按照[RFC3551]中针对G.711有效负载规定的规则连续发送)。该定义与G.711 RTP VAD的使用一致,进一步允许G.711 RTP数据包和G.711.0 RTP数据包之间的无损转码,如第3.1节所述。
With this introduction, the RTP packet header fields are defined as follows:
在此介绍中,RTP数据包头字段定义如下:
V - As per [RFC3550]
V-根据[RFC3550]
P - As per [RFC3550]
P-根据[RFC3550]
X - As per [RFC3550]
X-根据[RFC3550]
CC - As per [RFC3550]
CC-根据[RFC3550]
M - As per [RFC3550] and [RFC3551]
M-根据[RFC3550]和[RFC3551]
PT - The assignment of an RTP payload type for the format defined in this memo is outside the scope of this document. The RTP profiles in use currently mandate binding the payload type dynamically for this payload format (e.g., see [RFC3550] and [RFC4585]).
PT-本备忘录中定义的格式的RTP有效负载类型的分配不在本文件的范围内。当前使用的RTP配置文件要求为此有效负载格式动态绑定有效负载类型(例如,请参见[RFC3550]和[RFC4585])。
SN - As per [RFC3550]
序号-根据[RFC3550]
timestamp - As per [RFC3550]
时间戳-根据[RFC3550]
SSRC - As per [RFC3550]
SSRC-根据[RFC3550]
CSRC - As per [RFC3550]
CSRC-根据[RFC3550]
V (version bits), P (padding bit), X (extension bit), CC (CSRC count), M (marker bit), PT (payload type), SN (sequence number), timestamp, SSRC (synchronization source), and CSRC (contributing sources) are as defined in [RFC3550] and are used as they typically are with G.711. PT (payload type) is as defined in [RFC3551].
V(版本位)、P(填充位)、X(扩展位)、CC(CSRC计数)、M(标记位)、PT(有效负载类型)、SN(序列号)、时间戳、SSRC(同步源)和CSRC(贡献源)如[RFC3550]中所定义,其用法与G.711的通常用法相同。PT(有效负载类型)的定义见[RFC3551]。
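As a minimal sketch of the header rules above, the following Python packs and parses the 12-octet fixed RTP header of [RFC3550] and rejects the static G.711 payload types 0 and 8. The function names and the use of 98 as the dynamic payload type are illustrative assumptions, not part of this specification. CSRC lists and header extensions are omitted, and the rare PT exhaustion exception is not modeled.

```python
import struct

STATIC_G711_PTS = {0, 8}  # PCMU and PCMA static assignments from RFC 3551


def build_rtp_header(pt, seq, timestamp, ssrc, marker=False):
    """Pack a 12-octet RTP fixed header (V=2, P=0, X=0, CC=0)."""
    if pt in STATIC_G711_PTS:
        raise ValueError("PT 0 and 8 are used by G.711; use a dynamic PT")
    if not 0 <= pt <= 127:
        raise ValueError("PT must fit in 7 bits")
    byte0 = 2 << 6                      # version 2, no padding/extension/CSRC
    byte1 = (int(marker) << 7) | pt     # marker bit, then payload type
    return struct.pack("!BBHII", byte0, byte1, seq & 0xFFFF,
                       timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)


def parse_rtp_header(data):
    """Unpack the fixed header fields into a dict."""
    byte0, byte1, seq, timestamp, ssrc = struct.unpack("!BBHII", data[:12])
    return {
        "version": byte0 >> 6,
        "padding": (byte0 >> 5) & 1,
        "extension": (byte0 >> 4) & 1,
        "cc": byte0 & 0x0F,
        "marker": byte1 >> 7,
        "pt": byte1 & 0x7F,
        "seq": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }
```

If VAD is in use, a sender would pass marker=True only for the first packet of a talkspurt, exactly as it would for a G.711 stream.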
This section defines the G.711.0 RTP payload and illustrates it by means of two examples.
本节定义了G.711.0 RTP有效载荷,并通过两个示例对其进行了说明。
The first example, in Section 4.2.1, depicts the case in which carrying only one G.711.0 frame in the RTP payload is desired. This case is expected to be the dominant use case and is shown separately for the purposes of clarity.
第4.2.1节中的第一个示例描述了需要在RTP有效载荷中仅承载一个G.711.0帧的情况。该用例预计将成为主要用例,为清晰起见,将单独显示。
The second example, in Section 4.2.2, depicts the general case in which carrying one or more G.711.0 frames in the RTP payload is desired. This is the actual definition of the G.711.0 RTP payload.
第二个示例(第4.2.2节)描述了一般情况,其中需要在RTP有效载荷中承载一个或多个G.711.0帧。这是G.711.0 RTP有效载荷的实际定义。
This example depicts a single G.711.0 frame in the RTP payload. This is expected to be the dominant RTP payload case for G.711.0, as the G.711.0 encoding process supports the SDP packet times (ptime and maxptime, see [RFC4566]) commonly used when G.711 is transported in RTP. Additionally, as mentioned previously, larger G.711.0 frames generally compress more effectively than a multiplicity of smaller G.711.0 frames.
此示例描述RTP有效负载中的单个G.711.0帧。这预计将是G.711.0的主要RTP有效负载情况,因为G.711.0编码过程支持在RTP中传输G.711时常用的SDP数据包时间(ptime和maxptime,请参见[RFC4566])。此外,如前所述,较大的G.711.0帧通常比多个较小的G.711.0帧更有效地压缩。
The following figure illustrates the single G.711.0 frame per RTP payload case.
下图显示了每个RTP有效负载情况下的单个G.711.0帧。
|-------------------|-------------------|
| One G.711.0 Frame | Zero or more 0x00 |
|                   | Padding Octets    |
|___________________|___________________|
Figure 2: Single G.711.0 Frame in RTP Payload Case
图2:RTP有效负载情况下的单个G.711.0帧
Encoding Process: A single G.711.0 frame is inserted into the RTP payload. The amount of time represented by the G.711 symbols compressed in the G.711.0 frame MUST correspond to the ptime signaled for applications using SDP. Although generally not desired, padding MAY be added to the RTP payload by placing one or more 0x00 octets after the G.711.0 frame. Such padding may be desired based on the Security Considerations (see Section 8).
编码过程:将单个G.711.0帧插入RTP有效负载。在G.711.0帧中压缩的G.711符号所表示的时间量必须与使用SDP的应用程序发出的ptime信号相对应。虽然通常不需要,但是可以通过在G.711.0帧之后放置一个或多个0x00八位字节来创建G.711.0帧之后的RTP有效载荷中需要的填充。基于安全考虑,可能需要这种填充(参见第8节)。
Decoding Process: Passing the entire RTP payload to the G.711.0 decoder is sufficient for the G.711.0 decoder to create the source G.711 symbols. Any padding inserted after the G.711.0 frame (i.e., the 0x00 octets) present in the RTP payload is silently ignored by the G.711.0 decoding process. The decoding process is fully described in Section 4.2.3.
解码过程:将整个RTP有效载荷传递给G.711.0解码器足以使G.711.0解码器创建源G.711符号。RTP有效负载中G.711.0帧之后插入的任何填充(即0x00八位字节)都会被G.711.0解码过程静默忽略。第4.2.3节详细描述了解码过程。
This section defines the G.711.0 RTP payload and illustrates the case in which one or more G.711.0 frames are to be placed in the payload. All G.711.0 RTP decoders MUST support the general case described in this section (rationale presented previously in Section 3.3.1).
本节定义了G.711.0 RTP有效载荷,并说明了将一个或多个G.711.0帧放置在有效载荷中的情况。所有G.711.0 RTP解码器必须支持本节所述的一般情况(之前第3.3.1节给出的基本原理)。
Note that since each G.711.0 frame is self-describing (see Attribute A4 in Section 3.2), the individual G.711.0 frames in the RTP payload need not represent the same duration of time (i.e., a 5 ms G.711.0 frame could be followed by a 20 ms G.711.0 frame). Owing to this, the amount of time represented in the RTP payload MAY be any integer multiple of 5 ms (as 5 ms is the smallest interval of time that can be represented in a G.711.0 frame).
注意,由于每个G.711.0帧都是自描述的(参见第3.2节中的属性A4),RTP有效载荷中的各个G.711.0帧不需要表示相同的持续时间(即,5 ms G.711.0帧后面可以是20 ms G.711.0帧)。因此,RTP有效载荷中表示的时间量可以是5ms的任意整数倍(因为5ms是可以在G.711.0帧中表示的最小时间间隔)。
The following figure illustrates the one or more G.711.0 frames per RTP payload case where the number of G.711.0 frames placed in the RTP payload is N. We note that when N is equal to 1, this case is identical to the previous example.
下图说明了每个RTP有效负载一个或多个G.711.0帧的情况,其中放置在RTP有效负载中的G.711.0帧的数量为N。我们注意到,当N等于1时,这种情况与前面的示例相同。
|----------|---------|----------|---------|----------------|
| First    | Second  |          | Nth     | Zero or more   |
| G.711.0  | G.711.0 |   ...    | G.711.0 | 0x00           |
| Frame    | Frame   |          | Frame   | Padding Octets |
|__________|_________|__________|_________|________________|
Figure 3: One or More G.711.0 Frames in RTP Payload Case
图3:RTP有效负载情况下的一个或多个G.711.0帧
We note here that when we have multiple G.711.0 frames, the individual frames can be, and generally are, of different lengths. The decoding process described in Section 4.2.3 is used to determine the frame boundaries.
这里我们注意到,当我们有多个G.711.0帧时,单个帧可以并且通常具有不同的长度。第4.2.3节中描述的解码过程用于确定帧边界。
Encoding Process: One or more G.711.0 frames are placed in the RTP payload simply by concatenating the G.711.0 frames together. The amount of time represented by the G.711 symbols compressed in all the G.711.0 frames in the RTP payload MUST correspond to the ptime signaled for applications using SDP. Although not generally desired, padding in the RTP payload SHOULD be placed after the last G.711.0 frame in the payload and MAY be created by placing one or more 0x00 octets after the last G.711.0 frame. Such padding may be desired based on security considerations (see Section 8). Additional details about the encoding process and considerations are specified later in Section 4.2.2.1.
编码过程:只需将G.711.0帧连接在一起,即可将一个或多个G.711.0帧放置在RTP有效负载中。RTP有效载荷中所有G.711.0帧中压缩的G.711符号所表示的时间量必须对应于使用SDP的应用所发信号的ptime。虽然通常不需要,但RTP有效负载中的填充应放置在有效负载中最后一个G.711.0帧之后,并且可以通过在最后一个G.711.0帧之后放置一个或多个0x00八位字节来创建。基于安全考虑,可能需要这种填充(参见第8节)。有关编码过程和注意事项的更多详细信息,请参见第4.2.2.1节。
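The encoding process above amounts to concatenation plus optional trailing padding. A minimal sketch follows, assuming the frames have already been produced by a G.711.0 encoder. The function name build_payload and the validation checks are illustrative; the checks only restate facts given in this document (a valid frame never starts with 0x00 and is at most 321 octets long).

```python
def build_payload(frames, pad_octets=0):
    """Concatenate G.711.0 frames and append optional 0x00 padding."""
    if not frames:
        raise ValueError("at least one G.711.0 frame is required")
    for frame in frames:
        if not frame or frame[0] == 0x00:
            raise ValueError("a G.711.0 frame cannot start with 0x00")
        if len(frame) > 321:
            raise ValueError("a G.711.0 frame is at most 321 octets")
    # Padding, if any, goes after the last frame.
    return b"".join(frames) + b"\x00" * pad_octets
```

With a single-element list this reduces to the single-frame case of Figure 2.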
Decoding Process: As G.711.0 frames can be of varying length, the payload decoding process described in Section 4.2.3 is used to determine where the individual G.711.0 frame boundaries are. Any padding octets inserted before or after any G.711.0 frame in the RTP payload are silently (and safely) ignored by the G.711.0 decoding process specified in Section 4.2.3.
解码过程:由于G.711.0帧可以具有不同的长度,因此使用第4.2.3节中描述的有效载荷解码过程来确定各个G.711.0帧边界的位置。在RTP有效载荷中任何G.711.0帧之前或之后插入的任何填充八位字节都会被第4.2.3节中规定的G.711.0解码过程无声地(安全地)忽略。
ITU-T G.711.0 supports five possible input frame lengths: 40, 80, 160, 240, and 320 samples per frame, and the rationale for choosing those lengths was given in the description of property A5 in Section 3.2. Assuming a frequency of 8000 samples per second, these lengths correspond to input frames representing 5 ms, 10 ms, 20 ms, 30 ms, or 40 ms. So while the standard assumed the input "bit stream" consisted of G.711 symbols of some integer multiple of 5 ms in length, it did not specify exactly what frame lengths to use as input to the G.711.0 encoder itself. The intent of this section is to provide some guidance for the selection.
ITU-T G.711.0支持五种可能的输入帧长度:每帧40、80、160、240和320个样本,选择这些长度的基本原理在第3.2节属性A5的描述中给出。假设频率为每秒8000个样本,这些长度对应于表示5ms、10ms、20ms、30ms或40ms的输入帧。因此,虽然标准假设输入“比特流”由长度为5ms的整数倍的G.711符号组成,它没有明确指定使用什么帧长度作为G.711.0编码器本身的输入。本节旨在为选择提供一些指导。
Consider a typical IETF use case of 20 ms (160 octets) of G.711 input samples represented in a G.711.0 payload and signaled by using the SDP parameter ptime. As described in Section 3.3.1, the simplest way to encode these 160 octets is to pass the entire 160 octets to the G.711.0 encoder, resulting in precisely one G.711.0 compressed frame, and put that singular frame into the G.711.0 RTP payload. However, neither the ITU-T G.711.0 standard nor this IETF payload format mandates this. In fact, 20 ms of input G.711 symbols can be encoded as 1, 2, 3, or 4 G.711.0 frames in any one of six combinations (i.e., {20ms}, {10ms:10ms}, {10ms:5ms:5ms}, {5ms:10ms:5ms}, {5ms:5ms:10ms}, {5ms:5ms:5ms:5ms}) and any of these combinations would decompress into the same source 160 G.711 octets. As an aside, we note that the first octet of any G.711.0 frame will be the prefix code octet and information in this octet determines how many G.711 symbols are represented in the G.711.0 frame.
考虑一个典型的IETF用例:20毫秒(160个八位字节)的G.711输入样本以G.711.0有效载荷表示,并使用SDP参数ptime发出信号。如第3.3.1节所述,对这160个八位字节进行编码的最简单方法是将全部160个八位字节传递给G.711.0编码器,从而恰好生成一个G.711.0压缩帧,并将该单个帧放入G.711.0 RTP有效载荷中。然而,无论是ITU-T G.711.0标准还是本IETF有效载荷格式都不要求这样做。事实上,20毫秒的输入G.711符号可以按六种组合中的任何一种(即{20ms}、{10ms:10ms}、{10ms:5ms:5ms}、{5ms:10ms:5ms}、{5ms:5ms:10ms}、{5ms:5ms:5ms:5ms})编码为1、2、3或4个G.711.0帧,并且这些组合中的任何一种都将解压缩为相同的160个源G.711八位字节。另外,我们注意到,任何G.711.0帧的第一个八位字节都是前缀码八位字节,该八位字节中的信息确定G.711.0帧中表示了多少个G.711符号。
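The count of six combinations for 20 ms can be checked by enumeration. The sketch below lists every ordered sequence of the five permitted frame durations that sums to a given payload duration. The names FRAME_DURATIONS_MS and frame_combinations are illustrative, and an 8000 Hz sampling rate is assumed, as in the text above.

```python
FRAME_DURATIONS_MS = (5, 10, 20, 30, 40)  # 40, 80, 160, 240, 320 samples at 8 kHz


def frame_combinations(total_ms):
    """Return every ordered sequence of frame durations summing to total_ms."""
    if total_ms == 0:
        return [()]
    combos = []
    for d in FRAME_DURATIONS_MS:
        if d <= total_ms:
            for rest in frame_combinations(total_ms - d):
                combos.append((d,) + rest)
    return combos
```

A duration that is not a multiple of 5 ms yields an empty list, which matches the rule that a payload carries an integer multiple of 5 ms.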
Notwithstanding the above, we expect one of two encodings to be used by implementers: the simplest possible (one 160-byte input to the G.711.0 encoder that usually results in the highest compression) or the combination of possible input frames to a G.711.0 encoder that results in the highest compression for the payload. The explicit mention of this issue in this IETF document was deemed important because the ITU-T G.711.0 standard is silent on this issue and there is a desire for this issue to be documented in a formal Standards Developing Organization (SDO) document (i.e., here).
尽管有上述规定,我们仍希望实现者使用两种编码中的一种:可能的最简单编码(向G.711.0编码器输入一个160字节的编码,通常会产生最高的压缩)或向G.711.0编码器输入可能的帧的组合,从而产生有效负载的最高压缩。本IETF文件中明确提及该问题被认为是重要的,因为ITU-T G.711.0标准对该问题没有提及,并且希望将该问题记录在正式的标准开发组织(SDO)文件中(即此处)。
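The second strategy mentioned above (picking the combination of input frames that compresses best) could be sketched as a small dynamic program over 5 ms boundaries. This is only one possible approach and is not taken from ITU-T Rec. G.711.0. The encode_frame argument is a hypothetical stand-in for a real G.711.0 encoder, supplied by the caller, and the name best_split is likewise illustrative.

```python
SAMPLES_PER_FRAME = (40, 80, 160, 240, 320)  # permitted encoder input lengths


def best_split(g711, encode_frame):
    """Split g711 (bytes) into frames minimizing total encoded size.

    encode_frame is a caller-supplied stand-in for a G.711.0 encoder:
    it takes 40/80/160/240/320 G.711 octets and returns one encoded frame.
    """
    n = len(g711)
    if n == 0 or n % 40:
        raise ValueError("input must be a non-zero multiple of 40 samples")
    best = {0: (0, [])}          # offset -> (total octets, list of frames)
    for end in range(40, n + 1, 40):
        candidates = []
        for size in SAMPLES_PER_FRAME:
            start = end - size
            if start in best:
                frame = encode_frame(g711[start:end])
                total, frames = best[start]
                candidates.append((total + len(frame), frames + [frame]))
        # Keep the cheapest way of covering the first `end` samples.
        best[end] = min(candidates, key=lambda c: c[0])
    return best[n][1]
```

The simpler strategy, one encoder call on the whole input, needs no search at all and is expected to be the common case.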
The G.711.0 decoding process is a standard part of G.711.0 bit stream decoding and is implemented in the ITU-T Rec. G.711.0 reference code. The decoding process algorithm described in this section is a slight enhancement of the ITU-T reference code to explicitly accommodate RTP padding (as described above).
G.711.0解码过程是G.711.0比特流解码的标准部分,在ITU-T Rec.G.711.0参考码中实现。本节中描述的解码处理算法是对ITU-T参考码的轻微增强,以明确适应RTP填充(如上所述)。
Before describing the decoding, we note here that the largest possible G.711.0 frame is created whenever the largest number of G.711 symbols is encoded (320 from Section 3.2, property A5) and these 320 symbols are "uncompressible" by the G.711.0 encoder. In this case (via property A6 in Section 3.2), the G.711.0 output frame will be 321 octets long. We also note that the value 0x00 chosen for the optional padding cannot be the first octet of a valid ITU-T Rec. G.711.0 frame (see [G.711.0]). We also note that whenever more than one G.711.0 frame is contained in the RTP payload, decoding of the individual G.711.0 frames will occur multiple times.
在描述解码之前,我们在此注意到,只要对最大数量的G.711符号进行编码(第3.2节,属性A5中的320个),并且G.711.0编码器“不可压缩”这些320个符号,就会创建最大可能的G.711.0帧。在这种情况下(通过第3.2节中的属性A6),G.711.0输出帧的长度为321个八位字节。我们还注意到,为可选填充选择的值0x00不能是有效ITU-T Rec.G.711.0帧的第一个八位字节(参见[G.711.0])。我们还注意到,每当RTP有效载荷中包含多个G.711.0帧时,各个G.711.0帧的解码将发生多次。
For the decoding algorithm below, let N be the number of octets in the RTP payload (i.e., excluding any RTP padding, but including any RTP payload padding), let P equal the number of RTP payload octets processed by the G.711.0 decoding process, let K be the number of G.711 symbols presently in the output buffer, let Q be the number of octets contained in the G.711.0 frame being processed, and let "!=" represent not equal to. The keyword "STOP" is used below to indicate the end of the processing of G.711.0 frames in the RTP payload. The algorithm below assumes an output buffer for the decoded G.711 source symbols of length sufficient to accommodate the expected number of G.711 symbols and an input buffer of length 321 octets.
对于下面的解码算法,设N为RTP有效载荷中的八位字节数(即,不包括任何RTP填充,但包括任何RTP有效载荷填充),设P等于由G.711.0解码过程处理的RTP有效载荷八位字节数,设K为当前在输出缓冲器中的G.711符号数,设Q为正在处理的G.711.0帧中包含的八位字节数,“!=”表示不等于。下面使用关键字“STOP”表示RTP有效负载中G.711.0帧的处理结束。下面的算法假设解码的G.711源符号的输出缓冲器的长度足以容纳G.711符号的预期数量,输入缓冲器的长度为321个八位字节。
G.711.0 RTP Payload Decoding Heuristic:
G.711.0 RTP有效载荷解码启发式:
H1 Initialization of counters: Initialize P, the number of processed octets counter, to zero. Initialize K, the counter for how many G.711 symbols are in the output buffer, to zero. Initialize N to the number of octets in the RTP payload (including any RTP payload padding). Go to H2.
H1计数器初始化:将处理的八位字节数计数器P初始化为零。将输出缓冲区中有多少G.711符号的计数器K初始化为零。将N初始化为RTP有效负载中的八位字节数(包括任何RTP有效负载填充)。去H2。
H2 Read internal buffer: Read min{320+1, (N-P)} octets into the internal buffer, starting from the (P+1)th octet of the RTP payload. We note that at this point N-P octets have yet to be processed and that 320+1 octets is the largest possible G.711.0 frame. Also note that in the common case of zero-based array indexing of a uint8 array of octets, this operation reads octets from index P through index P + min{320+1, (N-P)} - 1 of the RTP payload. Go to H3.
H2读取内部缓冲区:从RTP有效负载的第(P+1)个八位字节开始,将min{320+1,(N-P)}个八位字节读入内部缓冲区。我们注意到,此时还有N-P个八位字节尚未处理,而320+1个八位字节是可能的最大G.711.0帧。还请注意,在对uint8八位字节数组使用从零开始的数组索引的常见情况下,此操作将读取RTP有效负载中从索引P到索引P+min{320+1,(N-P)}-1的八位字节。转到H3。
H3 Analyze the first octet in the internal buffer: If this octet is 0x00 (a padding octet), go to H4; otherwise, go to H5 (process a G.711.0 frame).
H3分析内部缓冲区中的第一个八位字节:如果该八位字节为0x00(填充八位字节),则转到H4;否则,转到H5(处理G.711.0帧)。
H4 Process padding octet (no G.711 symbols generated): Increment the processed octets counter by one (set P = P + 1). If this increment results in P >= N, then STOP (as all RTP Payload octets have been processed); otherwise, go to H2.
H4处理填充八位字节(未生成G.711符号):将已处理八位字节计数器增加1(设置P=P+1)。如果此增量导致P>=N,则停止(因为所有RTP有效负载八位字节都已处理);否则,转到H2。
H5 Process an individual G.711.0 frame (produce G.711 samples in the output buffer): Pass the internal buffer to the G.711.0 decoder. The G.711.0 decoder will read the first octet (called the "prefix code" octet in ITU-T Rec. G.711.0 [G.711.0]) to determine M, the number of source G.711 samples contained in this G.711.0 frame. The G.711.0 decoder will produce exactly M G.711 source symbols (M can only have values of 0, 40, 80, 160, 240, or 320). If K = 0, these M symbols will be the first in the output buffer and are placed at the beginning of the output buffer. If K != 0, concatenate these M symbols with the prior symbols in the output buffer (there are K prior symbols in the buffer). Set K = K + M (as there are now this many G.711 source symbols in the output buffer). The G.711.0 decoder will have consumed some number of octets, Q, in the internal buffer to produce the M G.711 symbols. Increment the processed octets counter by this quantity (set P = P + Q). If this increment results in P >= N, then STOP (as all RTP Payload octets have been processed); otherwise, go to H2.
H5处理单个G.711.0帧(在输出帧中产生G.711样本):将内部缓冲区传递给G.711.0解码器。G.711.0解码器将读取第一个八位字节(在ITU-T Rec.G.711.0[G.711.0]中称为“前缀码”八位字节),以确定此G.711.0帧中包含的源G.711样本M的数量。G.711.0解码器将精确生成M个G.711源符号(M的值只能为0、40、80、160、240或320)。如果K=0,这些M符号将是输出缓冲区中的第一个符号,并放置在输出缓冲区的开头。如果K!=0,将这些M个符号与输出缓冲区中的前一个符号连接(缓冲区中有K个前一个符号)。设置K=K+M(因为现在输出缓冲区中有这么多G.711源符号)。G.711.0解码器将在内部缓冲器中消耗一定数量的八位字节Q,以产生M G.711符号。将计数器处理的有效负载八位字节数增加此数量(设置P=P+Q)。如果此增量的结果导致P>=N,则停止(因为所有RTP有效负载八位字节都已处理);否则,转到H2。
At this point, the output buffer will contain precisely K G.711 source symbols that should correspond to the ptime signaled if SDP was used and the encoding process was without error. If ptime was signaled via SDP and the number of G.711 symbols in the output buffer is something other than what corresponds to ptime, the packet MUST be discarded unless other system design knowledge allows for otherwise (e.g., occasional 5 ms clock slips causing one more or one less G.711.0 frame than nominal to be in the payload). Lastly, due to the buffer reads in H2 being bounded (to 321 octets or less), N being bounded to the size of the G.711.0 RTP payload, and M being bounded to the number of source G.711 symbols, there is no buffer overrun risk.
此时,输出缓冲区将精确包含K G.711源符号,如果使用SDP且编码过程没有错误,则该源符号应对应于发送的ptime信号。如果ptime通过SDP发出信号,且输出缓冲区中的G.711符号的数量与ptime对应的数量不同,则必须丢弃该数据包,除非其他系统设计知识另有规定(例如,偶尔出现5毫秒时钟滑动,导致有效载荷中的G.711.0帧多于或少于标称值)。最后,由于H2中的缓冲区读取有界(到321个八位字节或更少),N有界于G.711.0 RTP有效负载的大小,M有界于源G.711符号的数量,因此不存在缓冲区溢出风险。
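Steps H1 through H5 and the ptime check that follows them can be sketched as a single loop. This is a sketch under stated assumptions rather than the reference implementation: decode_frame is a hypothetical stand-in for the G.711.0 frame decoder and must return the decoded symbols together with the number of octets it consumed (Q). Discarding the packet is modeled here by raising ValueError.

```python
MAX_FRAME_OCTETS = 320 + 1  # largest possible G.711.0 frame


def decode_payload(payload, decode_frame, expected_symbols=None):
    """Run steps H1-H5 over an RTP payload and return the G.711 symbols.

    decode_frame is a caller-supplied stand-in for the G.711.0 frame
    decoder: given a buffer that starts at a frame boundary, it returns
    (symbols, consumed_octets).
    """
    n = len(payload)                 # H1: N, P = 0, empty output (K = len(out))
    p = 0
    out = bytearray()
    while p < n:
        buf = payload[p:p + min(MAX_FRAME_OCTETS, n - p)]   # H2
        if buf[0] == 0x00:           # H3/H4: padding octet, no symbols produced
            p += 1
            continue
        symbols, q = decode_frame(buf)                      # H5
        if q < 1 or q > len(buf):
            raise ValueError("framing error: decoder consumed %d octets" % q)
        out += symbols               # K = K + M
        p += q                       # P = P + Q
    if expected_symbols is not None and len(out) != expected_symbols:
        raise ValueError("decoded symbol count does not match ptime")
    return bytes(out)
```

For a 20 ms ptime at 8000 Hz, a caller would pass expected_symbols=160. Because each read is capped at 321 octets and never extends past the end of the payload, the loop cannot read beyond the received data.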
We also note, as an aside, that the algorithm above (and the ITU-T G.711.0 reference code) accommodates padding octets (0x00) placed anywhere between G.711.0 frames in the RTP payload as well as prior to or after any or all G.711.0 frames. The ITU-T G.711.0 reference code does not have Steps H3 and H4 as separate steps (i.e., Step H5 immediately follows H2) at the added computational cost of some additional buffer passing to/from the G.711.0 frame decoder functions. That is, the G.711.0 decoder in the reference code "silently ignores" 0x00 padding octets at the beginning of what it believes to be a frame boundary encoded by G.711.0. Thus, Steps H3 and H4 above are an optimization over the reference code, shown separately for clarity.
另外,我们还注意到,上述算法(以及ITU-T G.711.0参考代码)允许填充八位字节(0x00)出现在RTP有效载荷中G.711.0帧之间的任何位置,以及任何或所有G.711.0帧之前或之后。ITU-T G.711.0参考代码没有将步骤H3和H4作为单独的步骤(即,步骤H5紧跟在H2之后),其代价是向G.711.0帧解码器函数传入/传出缓冲区时增加了一些计算开销。也就是说,参考代码中的G.711.0解码器会在其认为是G.711.0编码帧边界的开始处“静默地忽略”0x00填充八位字节。因此,上述步骤H3和H4是对参考代码的一种优化,在此单独列出是为了清晰起见。
If the decoder is at a playout endpoint location, this G.711 buffer SHOULD be used in the same manner as a received G.711 RTP payload would have been used (passed to a playout buffer, to a PLC implementation, etc.).
如果解码器位于播放端点位置,则该G.711缓冲区的使用方式应与接收到的G.711 RTP有效载荷的使用方式相同(传递到播放缓冲区、PLC实现等)。
We explicitly note that a framing error condition will result whenever the buffer sent to a G.711.0 decoder does not begin with a valid first G.711.0 frame octet (i.e., a valid G.711.0 prefix code or a 0x00 padding octet). The expected result is that the decoder will not produce the desired/correct G.711 source symbols. However, as already noted, the output returned by the G.711.0 decoder will be bounded (to less than 321 octets per G.711.0 decode request) and if the number of the (presumed) G.711 symbols produced is known to be in error, the decoded output MUST be discarded.
我们明确指出,只要发送到G.711.0解码器的缓冲区不是以有效的第一个G.711.0帧八位组(即,有效的G.711.0前缀代码或0x00填充八位组)开始,就会出现帧错误情况。预期结果是解码器将不会产生所需/正确的G.711源符号。然而,如前所述,G.711.0解码器返回的输出将有界(每个G.711.0解码请求少于321个八位字节),并且如果已知产生的(假定的)G.711符号的数量有误,则必须丢弃解码输出。
In this section, we describe the use of multiple "channels" of G.711 data encoded by G.711.0 compression.
在本节中,我们描述了G.711.0压缩编码的G.711数据的多个“通道”的使用。
The dominant use of G.711 in RTP transport has been for single channel use cases. For this case, the above G.711.0 encoding and decoding process is used. However, the multiple channel case for G.711.0 (a frame-based compression) is different from G.711 (a sample-based encoding) and is described separately here.
G.711在RTP传输中的主要用途是用于单通道用例。对于这种情况,使用上述G.711.0编码和解码过程。然而,G.711.0(基于帧的压缩)的多信道情况不同于G.711(基于样本的编码),并且在这里单独描述。
Section 4 of RFC 3551 [RFC3551] provides guidelines for encoding audio channels and Section 4.1 of RFC 3551 [RFC3551] for the ordering of the channels within the RTP payload. The ordering guidelines in Section 4.1 of RFC 3551 SHOULD be used unless an application-specific channel ordering is more appropriate.
RFC 3551[RFC3551]第4节提供了音频信道编码指南,RFC 3551[RFC3551]第4.1节提供了RTP有效负载内信道排序指南。应使用RFC 3551第4.1节中的订购指南,除非特定于应用的频道订购更合适。
An implicit assumption in RFC 3551 is that all the channel data multiplexed into an RTP payload MUST represent the same physical time span. The case for G.711.0 is no different; the underlying G.711 data for all channels in a G.711.0 RTP payload MUST span the same interval in time (e.g., the same "ptime" for a SDP-specified codec negotiation).
RFC3551中的一个隐含假设是,多路复用到RTP有效负载中的所有信道数据必须表示相同的物理时间跨度。G.711.0的情况也不例外;G.711.0 RTP有效载荷中所有信道的底层G.711数据必须跨越相同的时间间隔(例如,SDP指定编解码器协商的相同“ptime”)。
Section 4.3 of RFC 3551 provides guidelines for sample-based encodings such as G.711. This guidance is tantamount to interleaving the individual samples in that they SHOULD be packed in consecutive octets.
RFC 3551的第4.3节为基于样本的编码(如G.711)提供了指南。这一指导相当于交错各个样本,因为它们应打包在连续的八位字节中。
RFC 3551 also provides guidelines for frame-based encodings, in which the frames are interleaved. That guidance rests on the stated assumption that "the frame size for the frame-oriented codecs is given". This assumption is not valid for G.711.0, in that individual consecutive G.711.0 frames (as per Section 4.2.2 of this document) can:
RFC3551提供了帧交错的基于帧的编码准则。然而,本指南源于“给定了面向帧编解码器的帧大小”的假设。但是,该假设对G.711.0无效,因为单个连续G.711.0帧(根据本文件第4.2.2节)可以:
1. represent different time spans (e.g., two 5 ms G.711.0 frames in lieu of one 10 ms G.711.0 frame), and
1. 表示不同的时间跨度(例如,两个5ms g.711.0帧代替一个10ms g.711.0帧),以及
2. be of different lengths in octets (and typically are).
2. 以八位字节为单位具有不同的长度(通常为)。
Therefore, a different, but also simple, concatenation-based approach is specified in this RFC.
因此,在此RFC中指定了一种不同但也简单的基于串联的方法。
For the multiple channel G.711.0 case, each G.711 channel is independently encoded into one or more G.711.0 frames defined here as a "G.711.0 channel superframe". Each one of these superframes is identical to the multiple G.711.0 frame case illustrated in Figure 3 of Section 4.2.2 in which each superframe can have one or more individual G.711.0 frames within it. Then each G.711.0 channel superframe is concatenated -- in channel order -- into a G.711.0 RTP payload. Then, if optional G.711.0 padding octets (0x00) are desired, it is RECOMMENDED that these octets are placed after the last G.711.0 channel superframe. As per above, such padding may be desired based on Security Considerations (see Section 8). This is depicted in Figure 4.
对于多信道G.711.0情况,每个G.711信道独立地编码到一个或多个G.711.0帧中,在这里定义为“G.711.0信道超帧”。这些超帧中的每一个都与第4.2.2节图3所示的多个G.711.0帧情况相同,其中每个超帧内可以有一个或多个单独的G.711.0帧。然后,每个G.711.0信道超帧按信道顺序连接成G.711.0 RTP有效载荷。然后,如果需要可选的G.711.0填充八位字节(0x00),建议将这些八位字节放置在最后一个G.711.0通道超帧之后。如上所述,基于安全考虑,可能需要这种填充(参见第8节)。这如图4所示。
|----------|---------|----------|---------|---------|
| First    | Second  |          | Nth     | Zero    |
| G.711.0  | G.711.0 |   ...    | G.711.0 | or more |
| Channel  | Channel |          | Channel | 0x00    |
| Super-   | Super-  |          | Super-  | Padding |
| Frame    | Frame   |          | Frame   | Octets  |
|__________|_________|__________|_________|_________|
Figure 4: Multiple G.711.0 Channel Superframes in RTP Payload
图4:RTP有效载荷中的多个G.711.0信道超帧
We note that although the individual superframes can be of different lengths in octets (and usually are), the number of G.711 source symbols represented -- in compressed form -- in each channel superframe is identical (since all the channels represent the identically same time interval).
我们注意到,尽管各个超帧可以是不同长度的八位字节(通常是),但在每个信道超帧中以压缩形式表示的G.711源符号的数量是相同的(因为所有信道表示相同的时间间隔)。
The G.711.0 decoder at the receiving end simply decodes the entire G.711.0 (multiple channel) payload into individual G.711 symbols. If M such G.711 symbols result and there were N channels, then the first M/N G.711 samples would be from the first channel, the second M/N G.711 samples would be from the second channel, and so on until the Nth set of G.711 samples is found. Similarly, if the number of channels was not known, but the payload "ptime" was known, one could infer (knowing the sampling rate) how many G.711 symbols each channel contained; then, with this knowledge, the number of channels of data contained in the payload could be determined. When SDP is used, the number of channels is known because the optional "channels" parameter is a MUST when more than one channel is negotiated (see Section 5.1). Additionally, when SDP is used, the parameter ptime is a RECOMMENDED optional parameter. We note that if both the channels and ptime parameters are known, each can serve as a check on the other. Whichever algorithm is used to determine the number of channels, if the number of source G.711 symbols in the payload (M) is not an integer multiple of the number of channels (N), then the packet SHOULD be discarded.
接收端的G.711.0解码器将整个G.711.0(多信道)有效载荷简单地解码为单个G.711符号。如果产生M个这样的G.711符号并且存在N个信道,则第一个M/N G.711样本将来自第一个信道,第二个M/N G.711样本将来自第二个信道,依此类推,直到找到第N组G.711样本。类似地,如果信道的数量未知,但有效载荷“ptime”已知,则可以推断(知道采样率)每个信道包含多少个G.711符号;然后,利用该知识,可以确定有效载荷中包含的数据信道的数量。使用SDP时,通道数是已知的,因为当协商了多个通道时,必须使用可选参数(见第5.1节)。此外,使用SDP时,参数ptime是推荐的可选参数。我们注意到,如果两个参数通道和ptime都已知,一个可以检查另一个,反之亦然。无论使用哪种算法来确定信道数,如果有效载荷(M)中的源G.711符号的长度不是信道数(N)的整数倍,则应丢弃该分组。
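The channel demultiplexing described above is simple arithmetic on the decoded symbol count. A minimal sketch follows, assuming the payload has already been decoded into one buffer of M symbols. The function name split_channels and its parameters are illustrative, and discarding the packet is modeled by raising ValueError.

```python
def split_channels(symbols, channels=None, ptime_ms=None, rate_hz=8000):
    """Split decoded G.711 symbols into per-channel buffers.

    Either channels or ptime_ms must be given; if both are given they
    are cross-checked against each other.
    """
    m = len(symbols)
    if ptime_ms is not None:
        per_channel = rate_hz * ptime_ms // 1000   # symbols per channel
        if per_channel == 0 or m % per_channel:
            raise ValueError("symbol count inconsistent with ptime")
        inferred = m // per_channel
        if channels is not None and channels != inferred:
            raise ValueError("channels and ptime disagree")
        channels = inferred
    if not channels or m % channels:
        raise ValueError("symbol count is not a multiple of channel count")
    size = m // channels
    # The first M/N symbols belong to channel 1, the next M/N to channel 2, ...
    return [symbols[i * size:(i + 1) * size] for i in range(channels)]
```

For example, 320 decoded symbols with ptime 20 at 8000 Hz imply two channels of 160 symbols each.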
Lastly, we note that although any padding for the multiple channel G.711.0 payload is RECOMMENDED to be placed at the end of the payload, the G.711.0 decoding algorithm described in Section 4.2.3 will successfully decode the payload in Figure 4 if the 0x00 padding octet is placed anywhere before or after any individual G.711.0 frame in the RTP payload. The number of padding octets introduced at any G.711.0 frame boundary therefore does not affect the number M of the source G.711 symbols produced. Thus, the decision for padding MAY be made on a per-superframe basis.
最后,我们注意到,尽管建议将多信道G.711.0有效载荷的任何填充放置在有效载荷的末端,如果0x00填充八位组位于RTP有效载荷中任何单独G.711.0帧之前或之后,则第4.2.3节中描述的G.711.0解码算法将成功解码图4中的有效载荷。因此,在任何G.711.0帧边界引入的填充八位组的数量不会影响产生的源G.711符号的数量M。因此,可以基于每个超帧作出填充的决定。
This section defines the parameters that may be used to configure optional features in the G.711.0 RTP transmission.
本节定义了可用于配置G.711.0 RTP传输中可选功能的参数。
The parameters defined here are a part of the media subtype registration for the G.711.0 codec. Mapping of the parameters into SDP RFC 4566 [RFC4566] is also provided for those applications that use SDP.
此处定义的参数是G.711.0编解码器的媒体子类型注册的一部分。还为使用SDP的应用程序提供了参数到SDP RFC 4566[RFC4566]的映射。
Type name: audio
类型名称:音频
Subtype name: G711-0
子类型名称:G711-0
Required parameters:
所需参数:
clock rate: The RTP timestamp clock rate, which is equal to the sampling rate. The typical rate used with G.711 encoding is 8000, but other rates may be specified. The default rate is 8000.
时钟速率:RTP时间戳时钟速率,等于采样速率。与G.711编码一起使用的典型速率为8000,但可以指定其他速率。默认速率为8000。
complaw: This format-specific parameter, specified on the "a=fmtp:" line, indicates the companding law (A-law or mu-law) employed. This format-specific parameter, as per RFC 4566 [RFC4566], is given unchanged to the media tool using this format. The case-insensitive values "complaw=al" and "complaw=mu" are used for A-law and mu-law, respectively.
complaw:此特定于格式的参数在“a=fmtp:”行上指定,表示所采用的压扩律(A律或mu律)。根据RFC 4566[RFC4566]的规定,此特定于格式的参数原样提供给使用此格式的媒体工具。不区分大小写的值“complaw=al”和“complaw=mu”分别用于A律和mu律。
Optional parameters:
可选参数:
channels: See RFC 4566 [RFC4566] for definition. Specifies how many audio streams are represented in the G.711.0 payload and MUST be present if the number of channels is greater than one. This parameter defaults to 1 if not present (as per RFC 4566) and is typically a non-zero, small-valued positive integer. It is expected that implementations that specify multiple channels will also define a mechanism to map the channels appropriately within their system design; otherwise, the channel order specified in Section 4.1 of RFC 3551 [RFC3551] will be assumed (e.g., left, right, center). Similar to the usual interpretation in RFC 3551 [RFC3551], the number of channels SHALL be a non-zero, positive integer.
通道:有关定义,请参见RFC 4566[RFC4566]。指定G.711.0有效负载中表示的音频流的数量,如果通道数大于1,则必须显示音频流的数量。如果不存在该参数(根据RFC 4566),则该参数默认为1,通常为非零的小值正整数。预期指定多个通道的实现也将定义一种机制,以便在其系统设计中适当地映射通道;否则,将采用RFC 3551[RFC3551]第4.1节中规定的信道顺序(例如,左、右、中)。与RFC 3551[RFC3551]中的通常解释类似,通道数应为非零正整数。
maxptime: See RFC 4566 [RFC4566] for definition.
maxptime:有关定义,请参见RFC 4566[RFC4566]。
ptime: See RFC 4566 [RFC4566] for definition. The inclusion of "ptime" is RECOMMENDED and SHOULD be in the SDP unless there is an application-specific reason not to include it (e.g., an application that has a variable ptime on a packet-by-packet basis). For constant ptime applications, it is considered good form to include "ptime" in the SDP for session diagnostic purposes. For the constant ptime multiple channel case described in Section 4.2.2, the inclusion of "ptime" can provide a desirable payload check.
ptime:有关定义,请参见RFC 4566[RFC4566]。建议在SDP中包含“ptime”,除非有特定于应用程序的理由不包含它(例如,在分组基础上具有可变ptime的应用程序)。对于固定的ptime应用程序,将“ptime”包含在SDP中用于会话诊断被认为是一种良好的形式。对于第4.2.2节中描述的恒定ptime多信道情况,包含“ptime”可提供理想的有效负载检查。
Encoding considerations:
编码注意事项:
This media type is framed binary data (see Section 4.8 in RFC 6838 [RFC6838]) compressed as per ITU-T Rec. G.711.0.
这种媒体类型是按ITU-T Rec.G.711.0压缩的帧二进制数据(见RFC 6838[RFC6838]第4.8节)。
Security considerations:
安全考虑:
See Section 8.
见第8节。
Interoperability considerations: none
互操作性注意事项:无
Published specification:
已发布的规范:
ITU-T Rec. G.711.0 and RFC 7655 (this document).
ITU-T Rec.G.711.0和RFC 7655(本文件)。
Applications that use this media type:
使用此媒体类型的应用程序:
Although initially conceived for VoIP, the use of G.711.0, like G.711 before it, may find use within audio and video streaming and/or conferencing applications for the audio portion of those applications.
虽然最初设想用于VoIP,但与之前的G.711一样,G.711.0的使用可能会在音频和视频流和/或会议应用程序中用于这些应用程序的音频部分。
Additional information:
其他信息:
The following applies to stored-file transfer methods:
以下内容适用于存储的文件传输方法:
Magic numbers: #!G7110A\n or #!G7110M\n (for A-law or MU-law encodings respectively, see Section 6).
神奇数字:#!G7110A\n或#!G7110M\n(有关A-law或MU-law编码,请参见第6节)。
File Extensions: None
文件扩展名:无
Macintosh file type code: None
Macintosh文件类型代码:无
Object identifier or OID: None
对象标识符或OID:无
Person & email address to contact for further information:
联系人和电子邮件地址,以获取更多信息:
Michael A. Ramalho <mramalho@cisco.com> or <mar42@cornell.edu>
Michael A. Ramalho <mramalho@cisco.com> or <mar42@cornell.edu>
Intended usage: COMMON
预期用途:普通
Restrictions on usage:
使用限制:
This media type depends on RTP framing, and hence is only defined for transfer via RTP [RFC3550]. Transport within other framing protocols is not defined at this time.
此媒体类型取决于RTP帧,因此仅定义为通过RTP传输[RFC3550]。此时未定义其他帧协议内的传输。
Author: Michael A. Ramalho
作者:Michael A.Ramalho
Change controller:
更改控制器:
IETF Payload working group delegated from the IESG.
IESG授权的IETF有效载荷工作组。
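The magic numbers listed in the registration above let a reader of a stored file determine the companding law before decoding. A minimal sketch follows; the function name storage_mode_law is illustrative, and the remainder of the storage mode format (Section 6) is not modeled here.

```python
MAGIC = {b"#!G7110A\n": "al", b"#!G7110M\n": "mu"}  # A-law, mu-law


def storage_mode_law(data):
    """Return 'al' or 'mu' from the leading magic number, or None."""
    for magic, law in MAGIC.items():
        if data.startswith(magic):
            return law
    return None
```

The returned strings deliberately match the "complaw" parameter values so that the same value can be reused when signaling.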
The information carried in the media type specification has a specific mapping to fields in SDP, which is commonly used to describe an RTP session. When SDP is used to specify sessions employing G.711.0, the mapping is as follows:
媒体类型规范中包含的信息具有到SDP中字段的特定映射,SDP通常用于描述RTP会话。当SDP用于指定使用G.711.0的会话时,映射如下:
o The media type ("audio") goes in SDP "m=" as the media name.
o 媒体类型(“音频”)以SDP“m=”作为媒体名称。
o The media subtype ("G711-0") goes in SDP "a=rtpmap" as the encoding name.
o 媒体子类型(“G711-0”)以SDP“a=rtpmap”作为编码名称。
o The required parameter "rate" also goes in "a=rtpmap" as the clock rate.
o 所需的参数“rate”也作为时钟频率进入“a=rtpmap”。
o The parameters "ptime" and "maxptime" go in the SDP "a=ptime" and "a=maxptime" attributes, respectively.
o 参数“ptime”和“maxptime”分别位于SDP“a=ptime”和“a=maxptime”属性中。
o Remaining parameters go in the SDP "a=fmtp" attribute by copying them directly from the media type string as a semicolon-separated list of parameter=value pairs.
o 其余参数直接从媒体类型字符串中复制为参数=值对的分号分隔列表,进入SDP“a=fmtp”属性。
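The mapping rules above can be sketched as a small helper that emits the SDP attribute lines for one G.711.0 payload type. The function name and its arguments are illustrative assumptions. The "m=" line is left to the caller, and the channel count is carried as the encoding parameter of "a=rtpmap", following the examples in this document.

```python
def g7110_sdp_attributes(pt, complaw, rate=8000, channels=1,
                         ptime=None, maxptime=None):
    """Return the SDP "a=" lines describing one G.711.0 payload type."""
    if complaw.lower() not in ("al", "mu"):
        raise ValueError("complaw must be 'al' or 'mu'")
    if channels < 1:
        raise ValueError("channels must be a positive integer")
    rtpmap = "a=rtpmap:%d G711-0/%d" % (pt, rate)
    if channels > 1:
        rtpmap += "/%d" % channels      # required when more than one channel
    lines = [rtpmap]
    if ptime is not None:
        lines.append("a=ptime:%d" % ptime)
    if maxptime is not None:
        lines.append("a=maxptime:%d" % maxptime)
    # complaw is the one remaining (required) parameter and goes in a=fmtp.
    lines.append("a=fmtp:%d complaw=%s" % (pt, complaw.lower()))
    return lines
```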
The following considerations apply when using the SDP offer/answer mechanism [RFC3264] to negotiate the "channels" attribute.
当使用SDP提供/应答机制[RFC3264]协商“通道”属性时,以下注意事项适用。
o If the offering endpoint specifies a value for the optional channels parameter that is greater than one, and the answering endpoint both understands the parameter and cannot support that value requested, the answer MUST contain the optional channels parameter with the highest value it can support.
o If the offering endpoint specifies a value for the optional channels parameter, the answer MUST contain the optional channels parameter unless the only value the answering endpoint can support is one, in which case the answer MAY contain the optional channels parameter with a value of 1.
o If the offering endpoint specifies a value for the ptime parameter that the answering endpoint cannot support, the answer MUST contain the optional ptime parameter.
o If the offering endpoint specifies a value for the maxptime parameter that the answering endpoint cannot support, the answer MUST contain the optional maxptime parameter.
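The two "channels" rules above can be read as a small decision function. The following Python sketch is one possible reading, not normative text: the name `answer_channels` and the `(value, must_include)` return shape are assumptions made for illustration, and the "ptime" and "maxptime" rules are not modeled.

```python
def answer_channels(offered, max_supported):
    """Decide the "channels" value an answer carries.

    offered: channels value from the offer, or None if the offer had none.
    max_supported: highest channel count the answerer can support (>= 1).
    Returns (value, must_include); value is None when the offer carried
    no channels parameter, a case the rules above do not address.
    """
    if max_supported < 1:
        raise ValueError("answerer must support at least one channel")
    if offered is None:
        return None, False
    # Answer with the offered value, or the highest supportable value
    # when the offered value cannot be supported.
    value = min(offered, max_supported)
    # The parameter MUST appear unless the only supportable value is one,
    # in which case it MAY appear with a value of 1.
    must_include = max_supported != 1
    return value, must_include


print(answer_channels(2, 1))  # (1, False)
print(answer_channels(4, 2))  # (2, True)
```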
The following examples illustrate how to signal G.711.0 via SDP.
   m=audio RTP/AVP 98
   a=rtpmap:98 G711-0/8000
   a=fmtp:98 complaw=mu
In the above example, the dynamic payload type 98 is mapped to G.711.0 via the "a=rtpmap" attribute. The mandatory "complaw" parameter is on the "a=fmtp" attribute line. Note that neither of the optional parameters "ptime" and "channels" is present, although it is generally good form to include "ptime" in the SDP for diagnostic purposes if the session is a constant ptime session.
The following example illustrates an offering endpoint requesting 2 channels, but the answering endpoint can only support (or render) one channel.
Offer:
   m=audio RTP/AVP 98
   a=rtpmap:98 G711-0/8000/2
   a=ptime:20
   a=fmtp:98 complaw=al
Answer:
   m=audio RTP/AVP 98
   a=rtpmap:98 G711-0/8000/1
   a=ptime:20
   a=fmtp:98 complaw=al
In this example, the offer had an optional channels parameter. The answer must also carry the optional channels parameter unless its value is one. The answer shown here contains the channels parameter explicitly; it could have omitted the parameter, in which case it would be interpreted as one channel. As mentioned previously, it is considered good form to include "ptime" in the SDP for session diagnostic purposes if the session is a constant ptime session.
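The "absent means one channel" interpretation can be sketched as a small "a=rtpmap" parser in Python. The function name `parse_rtpmap` and the returned dictionary keys are assumptions made for this illustration only.

```python
def parse_rtpmap(line):
    """Parse 'a=rtpmap:<pt> <name>/<rate>[/<channels>]' into a dict.

    Hypothetical helper; a missing channels field is read as one channel.
    """
    prefix = "a=rtpmap:"
    if not line.startswith(prefix):
        raise ValueError("not an rtpmap attribute")
    pt_text, encoding = line[len(prefix):].split(None, 1)
    parts = encoding.strip().split("/")
    if len(parts) not in (2, 3):
        raise ValueError("malformed rtpmap encoding")
    channels = int(parts[2]) if len(parts) == 3 else 1
    return {"pt": int(pt_text), "name": parts[0],
            "rate": int(parts[1]), "channels": channels}


explicit = parse_rtpmap("a=rtpmap:98 G711-0/8000/1")
implicit = parse_rtpmap("a=rtpmap:98 G711-0/8000")
print(explicit == implicit)  # True
```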
The G.711.0 storage mode definition in this section is similar to many other IETF codecs (e.g., iLBC RFC 3951 [RFC3951] and EVRC-NW RFC 6884 [RFC6884]), and is essentially a concatenation of individual G.711.0 frames.
We note that something must be stored for any G.711.0 frames that are not received at the receiving endpoint, no matter what the cause. In this section, we describe two mechanisms, a "G.711.0 PLC Frame" and a "G.711.0 Erasure Frame". These G.711.0 PLC and G.711.0 Erasure Frames are described prior to the G.711.0 storage mode definition for clarity.
When G.711 RTP payloads are not received by a rendering endpoint, a PLC mechanism is typically employed to "fill in" the missing G.711 symbols with something that is auditorially pleasing; thus, the loss may not be noticed by a listener. Such a PLC mechanism for G.711 is specified in ITU-T Rec. G.711 - Appendix 1 [G.711-AP1].
A natural extension when creating G.711.0 frames for storage environments is to employ such a PLC mechanism to create G.711 symbols for the span of time in which G.711.0 payloads were not received -- and then to compress the resulting "G.711 PLC symbols" via G.711.0 compression. The G.711.0 frame(s) created by such a process are called "G.711.0 PLC Frames".
Since PLC mechanisms are designed to render missing audio data with the best fidelity and intelligibility, G.711.0 frames created via such processing are likely best for most recording situations (such as voicemail storage) unless there is a requirement not to fabricate (audio) data not actually received.
After such PLC G.711 symbols have been generated and then encoded by a G.711.0 encoder, the resulting frames may be stored in G.711.0 frame format. As a result, there is nothing to specify here -- the G.711.0 PLC frames are stored as if they were received by the receiving endpoint. In other words, PLC-generated G.711.0 frames appear as "normal" or "ordinary" G.711.0 frames in the storage mode file.
"Erasure Frames", or equivalently "Null Frames", have been designed for many frame-based codecs since G.711 was standardized. These null/erasure frames explicitly represent data from incoming audio that were either not received by the receiving system or represent data that a transmitting system decided not to send. Transmitting systems may choose not to send data for a variety of reasons (e.g., not enough wireless link capacity in radio-based systems) and can choose to send a "null frame" in lieu of the actual audio. It is also envisioned that erasure frames would be used in storage mode applications for specific archival purposes where there is a requirement not to fabricate audio data that was not actually received.
Thus, a G.711.0 erasure frame is a representation of the amount of time in G.711.0 frames that were not received or not encoded by the transmitting system.
Prior to defining a G.711.0 erasure frame, it is beneficial to note what many G.711 RTP systems send when the endpoint is "muted". When muted, many of these systems will send an entire G.711 payload of either 0+ or 0- (i.e., one of the two levels closest to "analog zero" in either G.711 companding law). Next, we note that a desirable property for a G.711.0 erasure frame is for "non-G.711.0 Erasure Frame-aware" endpoints to be able to play back a G.711.0 erasure frame with the existing G.711.0 ITU-T reference code.
A G.711.0 Erasure Frame is defined as any G.711.0 frame for which the corresponding G.711 sample values are either the value 0++ or the value 0-- for the entirety of the G.711.0 frame. The levels of 0++ and 0-- are defined to be the two levels above or below analog zero, respectively. An entire frame of value 0++ or 0-- is expected to be extraordinarily rare when the frame was in fact generated by a natural signal, as analog inputs such as speech and music are zero-mean and are typically acoustically coupled to digital sampling systems. Note that the playback of a G.711.0 frame characterized as an erasure frame is auditorially equivalent to a muted signal (a very low value constant).
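As an illustration of this definition, the Python sketch below tests whether the G.711 symbols recovered from one G.711.0 frame make up an erasure frame. The function name `is_erasure_frame` is an assumption. The two octet values are passed in rather than hard-coded, because the text above does not give the G.711 code points for 0++ and 0--; the values in the usage lines are placeholders.

```python
def is_erasure_frame(g711_samples: bytes, zero_pp: int, zero_mm: int) -> bool:
    """Return True if every decoded G.711 sample in the frame is 0++,
    or every sample is 0--.

    zero_pp / zero_mm: G.711 octet values for the 0++ and 0-- levels,
    supplied by the caller for the companding law in use.
    """
    if not g711_samples:
        return False
    first = g711_samples[0]
    if first not in (zero_pp, zero_mm):
        return False
    # The whole frame must hold that single constant value.
    return g711_samples == bytes([first]) * len(g711_samples)


ZERO_PP, ZERO_MM = 0xFE, 0x7E  # placeholder code points for illustration
print(is_erasure_frame(bytes([ZERO_PP]) * 160, ZERO_PP, ZERO_MM))          # True
print(is_erasure_frame(bytes([ZERO_PP, ZERO_MM]) * 80, ZERO_PP, ZERO_MM))  # False
```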
These G.711.0 erasure frames can be reasonably characterized as null or erasure frames while meeting the desired playback goal of being decoded by the G.711.0 ITU-T reference code. Thus, similarly to G.711.0 PLC frames, the G.711.0 erasure frames appear as "normal" or "ordinary" G.711.0 frames in the storage mode format.
The storage format is used for storing G.711.0 encoded frames. The format for the G.711.0 storage mode file defined by this RFC is shown below.
|---------------------------|----------|--------------|
|       Magic Number        |          |              |
|                           | Version  | Concatenated |
| "#!G7110A\n" (for A-law)  |  Octet   |   G.711.0    |
|            or             |          |    Frames    |
| "#!G7110M\n" (for mu-law) |  "0x00"  |              |
|___________________________|__________|______________|
Figure 5: G.711.0 Storage Mode Format
The storage mode file consists of a magic number and a version octet followed by the individual G.711.0 frames concatenated together.
The magic number for G.711.0 A-law corresponds to the ASCII character string "#!G7110A\n", i.e., "0x23 0x21 0x47 0x37 0x31 0x31 0x30 0x41 0x0A". Likewise, the magic number for G.711.0 mu-law corresponds to the ASCII character string "#!G7110M\n", i.e., "0x23 0x21 0x47 0x37 0x31 0x31 0x30 0x4D 0x0A".
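The octet values of the two magic numbers follow directly from the ASCII strings. The short Python sketch below derives them; the constant and function names are illustrative assumptions.

```python
# ASCII magic strings from the storage mode definition.
MAGIC_A_LAW = b"#!G7110A\n"
MAGIC_MU_LAW = b"#!G7110M\n"


def as_hex_octets(data: bytes) -> str:
    """Format bytes as space-separated 0xNN octets."""
    return " ".join(f"0x{octet:02X}" for octet in data)


print(as_hex_octets(MAGIC_A_LAW))   # 0x23 0x21 0x47 0x37 0x31 0x31 0x30 0x41 0x0A
print(as_hex_octets(MAGIC_MU_LAW))  # 0x23 0x21 0x47 0x37 0x31 0x31 0x30 0x4D 0x0A
```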
The version number octet allows for the future specification of other G.711.0 storage mode formats. The specification of other storage mode formats may be desirable as G.711.0 frames are of variable length and a future format may include an indexing methodology that would enable playout far into a long G.711.0 recording without the necessity of decoding all the G.711.0 frames since the beginning of the recording. Other future format specifications may include support for multiple channels, metadata, and the like. For these reasons, it was determined that a versioning strategy was desirable for the G.711.0 storage mode definition specified by this RFC. This RFC only specifies Version 0 and thus the value of "0x00" MUST be used for the storage mode defined by this RFC.
The G.711.0 codec data frames, including any necessary erasure or PLC frames, are stored in consecutive order concatenated together as shown in Section 4.2.2. As the Version 0 storage mode only supports a single channel, the RTP payload format supporting multiple channels defined in Section 4.2.4 is not supported in this storage mode definition.
The algorithm presented in Section 4.2.2 may be used to decode the individual G.711.0 frames. If the version octet is determined not to be zero, the remainder of the payload MUST NOT be passed to the G.711.0 decoder, as the ITU-T G.711.0 reference decoder can only decode concatenated G.711.0 frames and has not been designed to decode elements in yet-to-be-specified future storage mode formats.
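A minimal Python sketch of writing and reading the Version 0 storage mode layout follows. The function names, the dictionary keys "A" and "mu", and the error messages are assumptions made for illustration. The reader stops at the header: it returns the concatenated frame octets without splitting them, since frame boundaries come from the G.711.0 decoding process itself.

```python
MAGIC_BY_LAW = {"A": b"#!G7110A\n", "mu": b"#!G7110M\n"}
VERSION_0 = 0x00


def write_storage_mode(law, frames):
    """Return file bytes: magic number, version octet, concatenated frames."""
    return MAGIC_BY_LAW[law] + bytes([VERSION_0]) + b"".join(frames)


def read_storage_mode(data):
    """Validate the header and return (law, concatenated_frame_bytes).

    Raises ValueError on an unknown magic number or a non-zero version,
    so that such content is never handed to a G.711.0 decoder.
    """
    for law, magic in MAGIC_BY_LAW.items():
        if data.startswith(magic):
            break
    else:
        raise ValueError("not a G.711.0 storage mode file")
    if len(data) <= len(magic):
        raise ValueError("missing version octet")
    version = data[len(magic)]
    if version != VERSION_0:
        raise ValueError(f"unsupported storage mode version {version}")
    return law, data[len(magic) + 1:]


# Placeholder octets stand in for real G.711.0 frames here.
blob = write_storage_mode("mu", [b"\x01", b"\x02\x03"])
print(read_storage_mode(blob))  # ('mu', b'\x01\x02\x03')
```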
One media type (audio/G711-0) has been defined and registered in IANA's "Media Types" registry. See Section 5.1 for details.
RTP packets using the payload format defined in this specification are subject to the security considerations discussed in the RTP specification [RFC3550], and in any applicable RTP profile (such as RTP/AVP [RFC3551], RTP/AVPF [RFC4585], RTP/SAVP [RFC3711], or RTP/SAVPF [RFC5124]). However, as "Securing the RTP Framework: Why RTP Does Not Mandate a Single Media Security Solution" [RFC7202] discusses, it is not the responsibility of an RTP payload format to discuss or mandate what solutions are used to meet basic security goals such as confidentiality, integrity, and source authenticity for RTP in general. This responsibility lies with anyone using RTP in an application. They can find guidance on available security mechanisms and important considerations in "Options for Securing RTP Sessions" [RFC7201]. Applications SHOULD use one or more appropriate strong security mechanisms. The rest of this Security Considerations section discusses the security-impacting properties of the payload format itself.
Because the data compression used with this payload format is applied end-to-end, any encryption needs to be performed after compression.
Note that end-to-end security with either authentication, integrity, or confidentiality protection will prevent a network element not within the security context from performing media-aware operations other than discarding complete packets. To allow any (media-aware) intermediate network element to perform its operations, it is required to be a trusted entity that is included in the security context establishment.
G.711.0 has no known denial-of-service (DoS) attacks due to decoding, as data posing as a desired G.711.0 payload will be decoded into something (as per the decoding algorithm) with a finite amount of computation. This is due to the decompression algorithm having a finite worst-case processing path (no infinite computational loops are possible). We also note that the data read by the G.711.0 decoder is controlled by the length of the individual encoded G.711.0 frame(s) contained in the RTP payload. The decoding algorithm specified previously in Section 4.2.3 ensures that the G.711.0 decoder will not read beyond the length of the internal buffer specified (which is in turn specified to be no greater than the largest possible G.711.0 frame of 321 octets). Therefore, a G.711.0 payload does not carry "active content" that could impose malicious side-effects upon the receiver.
G.711.0 is a VBR audio codec. There have been recent concerns with VBR speech codecs where a passive observer can identify phrases from a standard speech corpus by means of the packet lengths produced by the encoder, even when the payload is encrypted [IEEE]. That paper determined that some Code-Excited Linear Prediction (CELP) codecs would produce discrete packet lengths for some phonemes. Furthermore, with the use of appropriately designed Hidden Markov Models (HMMs), such a system could predict phrases with unexpected accuracy. One CELP codec studied, SPEEX, produced 21 different packet lengths in its wideband mode, and these packet lengths probabilistically mapped to phonemes that an HMM system could be trained on. The paper determined that a mitigation technique would be to pad the output of the encoder with random padding lengths, to the effect 1) that more discrete payload sizes would result and 2) that the probabilistic mapping to phonemes would become less clear.
G.711 is not a speech-model-based codec, and neither is G.711.0. A G.711.0 encoding, during talking periods, produces frames of varying lengths that are not likely to have a strong mapping to phonemes. Thus, G.711.0 is not expected to have this same vulnerability. It should be noted that "silence" (only one value of G.711 in the entire G.711 input frame) or "near silence" (only a few G.711 values) is easily detectable as G.711.0 frame lengths of one or a few octets. If one desires to mitigate silence/non-silence detection, statistically variable padding should be added to very small G.711.0 frames (those of less than about 20% of the symbols of the corresponding G.711 input frame). Methods of introducing padding in the G.711.0 payloads have been provided in the G.711.0 RTP payload definition in Section 4.2.2.
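The silence/non-silence mitigation can be sketched in Python as follows. Several points are assumptions rather than statements of this specification: the function name `pad_small_frame`, the `max_pad` bound, the uniform choice of padding length, and the use of 0x00 octets appended after the frame as padding. The authoritative padding rules are those of the RTP payload definition in Section 4.2.2.

```python
import random


def pad_small_frame(frame: bytes, n_input_symbols: int,
                    max_pad: int, rng=None) -> bytes:
    """Append a random number of padding octets to a very small frame.

    frame: one encoded G.711.0 frame.
    n_input_symbols: number of G.711 symbols that the frame encodes.
    max_pad: largest padding length to add (illustrative parameter).
    rng: object with randint(); defaults to an OS-backed generator.
    """
    # Frames at or above about 20% of the input size are left untouched.
    if len(frame) >= 0.2 * n_input_symbols:
        return frame
    rng = rng or random.SystemRandom()
    # Assumption: padding octets are 0x00 and may follow the frame.
    return frame + b"\x00" * rng.randint(0, max_pad)


# Placeholder one-octet frame standing in for an encoded silent frame.
padded = pad_small_frame(b"\x01", n_input_symbols=160, max_pad=8)
print(1 <= len(padded) <= 9)  # True
```

Drawing the padding length from an OS-backed generator rather than a seeded one keeps the added lengths unpredictable to an observer.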
The G.711 codec is a Constant Bit Rate (CBR) codec that does not have a means to regulate the bitrate. The G.711.0 lossless compression algorithm typically compresses the G.711 CBR stream into a lower-bandwidth VBR stream. However, being lossless, it does not possess means of further reducing the bitrate beyond the compression result based on G.711.0. The G.711.0 RTP payloads can be made arbitrarily large by means of adding optional padding bytes (subject only to MTU limitations).
Therefore, there are no explicit ways to regulate the bit rate of the transmissions outlined in this RTP payload format except by means of modulating the number of optional padding bytes in the RTP payload.
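As a worked illustration of how padding is the only lever on the payload bit rate, the Python sketch below computes the payload bit rate for one packetization interval. The function name and the 70-octet compressed size are hypothetical; the 160-octet figure is simply 20 ms of G.711 at 8000 symbols per second. RTP, UDP, and IP header overhead is not included.

```python
def payload_bitrate_bps(frame_octets: int, padding_octets: int,
                        ptime_ms: int) -> float:
    """Payload bit rate in bit/s for one payload sent every ptime_ms."""
    return (frame_octets + padding_octets) * 8 * 1000 / ptime_ms


print(payload_bitrate_bps(160, 0, 20))  # 64000.0 (uncompressed G.711)
print(payload_bitrate_bps(70, 0, 20))   # 28000.0 (hypothetical G.711.0 frame)
print(payload_bitrate_bps(70, 10, 20))  # 32000.0 (same frame plus 10 padding octets)
```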
[G.711] ITU-T, "Pulse Code Modulation (PCM) of Voice Frequencies", ITU-T Recommendation G.711 PCM, 1988.
[G.711-A1] ITU-T, "New Annex A on Lossless Encoding of PCM Frames", ITU-T Recommendation G.711 Amendment 1, 2009.
[G.711-AP1] ITU-T, "A high quality low-complexity algorithm for packet loss concealment with G.711", ITU-T Recommendation G.711 AP1, 1999.
[G.711.0] ITU-T, "Lossless Compression of G.711 Pulse Code Modulation", ITU-T Recommendation G.711 LC PCM, 2009.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, <http://www.rfc-editor.org/info/rfc2119>.
[RFC3264] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model with Session Description Protocol (SDP)", RFC 3264, DOI 10.17487/RFC3264, June 2002, <http://www.rfc-editor.org/info/rfc3264>.
[RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V. Jacobson, "RTP: A Transport Protocol for Real-Time Applications", STD 64, RFC 3550, DOI 10.17487/RFC3550, July 2003, <http://www.rfc-editor.org/info/rfc3550>.
[RFC3551] Schulzrinne, H. and S. Casner, "RTP Profile for Audio and Video Conferences with Minimal Control", STD 65, RFC 3551, DOI 10.17487/RFC3551, July 2003, <http://www.rfc-editor.org/info/rfc3551>.
[RFC3711] Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K. Norrman, "The Secure Real-time Transport Protocol (SRTP)", RFC 3711, DOI 10.17487/RFC3711, March 2004, <http://www.rfc-editor.org/info/rfc3711>.
[RFC3951] Andersen, S., Duric, A., Astrom, H., Hagen, R., Kleijn, W., and J. Linden, "Internet Low Bit Rate Codec (iLBC)", RFC 3951, DOI 10.17487/RFC3951, December 2004, <http://www.rfc-editor.org/info/rfc3951>.
[RFC4566] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session Description Protocol", RFC 4566, DOI 10.17487/RFC4566, July 2006, <http://www.rfc-editor.org/info/rfc4566>.
[RFC4585] Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey, "Extended RTP Profile for Real-time Transport Control Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585, DOI 10.17487/RFC4585, July 2006, <http://www.rfc-editor.org/info/rfc4585>.
[RFC5124] Ott, J. and E. Carrara, "Extended Secure RTP Profile for Real-time Transport Control Protocol (RTCP)-Based Feedback (RTP/SAVPF)", RFC 5124, DOI 10.17487/RFC5124, February 2008, <http://www.rfc-editor.org/info/rfc5124>.
[RFC6838] Freed, N., Klensin, J., and T. Hansen, "Media Type Specifications and Registration Procedures", BCP 13, RFC 6838, DOI 10.17487/RFC6838, January 2013, <http://www.rfc-editor.org/info/rfc6838>.
[RFC6884] Fang, Z., "RTP Payload Format for the Enhanced Variable Rate Narrowband-Wideband Codec (EVRC-NW)", RFC 6884, DOI 10.17487/RFC6884, March 2013, <http://www.rfc-editor.org/info/rfc6884>.
[RFC7201] Westerlund, M. and C. Perkins, "Options for Securing RTP Sessions", RFC 7201, DOI 10.17487/RFC7201, April 2014, <http://www.rfc-editor.org/info/rfc7201>.
[RFC7202] Perkins, C. and M. Westerlund, "Securing the RTP Framework: Why RTP Does Not Mandate a Single Media Security Solution", RFC 7202, DOI 10.17487/RFC7202, April 2014, <http://www.rfc-editor.org/info/rfc7202>.
[G.722] ITU-T, "7 kHz audio-coding within 64 kbit/s", ITU-T Recommendation G.722, 1988.
[G.729] ITU-T, "Coding of speech at 8 kbit/s using conjugate-structure algebraic-code-excited linear prediction (CS-ACELP)", ITU-T Recommendation G.729, 2007.
[ICASSP] Harada, N., Yamamoto, Y., Moriya, T., Hiwasaki, Y., Ramalho, M., Netsch, L., Stachurski, J., Miao, L., Taddei, H., and F. Qi, "Emerging ITU-T Standard G.711.0 - Lossless Compression of G.711 Pulse Code Modulation", International Conference on Acoustics Speech and Signal Processing (ICASSP), ISBN 978-1-4244-4295-9, March 2010.
[IEEE] Wright, C., Ballard, L., Coull, S., Monrose, F., and G. Masson, "Spot Me if You Can: Uncovering Spoken Phrases in Encrypted VoIP Conversations", IEEE Symposium on Security and Privacy, ISBN 978-0-7695-3168-7, May 2008.
Acknowledgements
There have been many people contributing to G.711.0 in the course of its development. The people listed here deserve special mention: Takehiro Moriya, Claude Lamblin, Herve Taddei, Simao Campos, Yusuke Hiwasaki, Jacek Stachurski, Lorin Netsch, Paul Coverdale, Patrick Luthi, Paul Barrett, Jari Hagqvist, Pengjun (Jeff) Huang, John Gibbs, Yutaka Kamamoto, and Csaba Kos. The review and oversight by the IETF Payload working group chairs Ali Begen and Roni Even during the development of this RFC is appreciated. Additionally, the careful review by Richard Barnes, the extensive review by David Black, and the reviews provided by the IESG are likewise very much appreciated.
Contributors
The authors thank everyone who has contributed to this document. The people listed here deserve special mention: Ali Begen, Roni Even, and Hadriel Kaplan.
Authors' Addresses
Michael A. Ramalho (editor)
Cisco Systems, Inc.
6310 Watercrest Way Unit 203
Lakewood Ranch, FL 34202
United States

Phone: +1 919 476 2038
Email: mramalho@cisco.com

Paul E. Jones
Cisco Systems, Inc.
7025 Kit Creek Road
Research Triangle Park, NC 27709
United States

Phone: +1 919 476 2048
Email: paulej@packetizer.com

Noboru Harada
NTT Communications Science Labs
3-1 Morinosato-Wakamiya
Atsugi, Kanagawa 243-0198
Japan

Phone: +81 46 240 3676
Email: harada.noboru@lab.ntt.co.jp

Muthu Arul Mozhi Perumal
Ericsson
Ferns Icon
Doddanekundi, Mahadevapura
Bangalore, Karnataka 560037
India

Phone: +91 9449288768
Email: muthu.arul@gmail.com

Lei Miao
Huawei Technologies Co. Ltd
Q22-2-A15R, Environment Protection Park
No. 156 Beiqing Road
HaiDian District
Beijing 100095
China

Phone: +86 1059728300
Email: lei.miao@huawei.com