Internet Engineering Task Force (IETF)                         S. Ikonin
Request for Comments: 6262                                    SPIRIT DSP
Category: Standards Track                                    August 2011
ISSN: 2070-1721
        

RTP Payload Format for IP-MR Speech Codec

Abstract

This document specifies the payload format for packetization of SPIRIT IP-MR encoded speech signals into the Real-time Transport Protocol (RTP). The payload format supports transmission of multiple frames per packet and introduces redundancy for robustness against packet loss and bit errors.

Status of This Memo

This is an Internet Standards Track document.

This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Further information on Internet Standards is available in Section 2 of RFC 5741.

Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at http://www.rfc-editor.org/info/rfc6262.

Copyright Notice

Copyright (c) 2011 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

This document may contain material from IETF Documents or IETF Contributions published or made publicly available before November 10, 2008. The person(s) controlling the copyright in some of this material may not have granted the IETF Trust the right to allow modifications of such material outside the IETF Standards Process. Without obtaining an adequate license from the person(s) controlling the copyright in such materials, this document may not be modified outside the IETF Standards Process, and derivative works of it may not be created outside the IETF Standards Process, except to format it for publication as an RFC or to translate it into languages other than English.

Table of Contents

   1. Introduction
   2. IP-MR Codec Description
   3. Payload Format
      3.1. RTP Header Usage
      3.2. RTP Payload Structure
      3.3. Speech Payload Header
      3.4. Speech Payload Table of Contents
      3.5. Speech Payload Data
      3.6. Redundancy Payload Header
      3.7. Redundancy Payload Table of Contents
      3.8. Redundancy Payload Data
   4. Payload Examples
      4.1. Payload Carrying a Single Frame
      4.2. Payload Carrying Multiple Frames with Redundancy
   5. Congestion Control
   6. Security Considerations
   7. Payload Format Parameters
      7.1. Media Type Registration
      7.2. Mapping Media Type Parameters into SDP
   8. IANA Considerations
   9. Normative References
   Appendix A. Retrieving Frame Information
      A.1. get_frame_info.c
        
        
1. Introduction

This document specifies the payload format for packetization of SPIRIT IP-MR encoded speech signals into the Real-time Transport Protocol (RTP). The payload format supports transmission of multiple frames per packet and introduces redundancy for robustness against packet loss and bit errors.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].

2. IP-MR Codec Description

IP-MR is a wideband speech codec designed by SPIRIT for conferencing services over packet-switched networks such as the Internet.

IP-MR is a scalable codec. This means that the source not only has the ability to change transmission rate on the fly, but the gateway is also able to decrease bandwidth at any time without performance overhead. There are 6 coding rates from 7.7 to 34.2 kbps available.

The codec operates on a frame-by-frame basis with a frame size of 20 ms at a 16 kHz sampling rate with a total end-to-end delay of 25 ms. Each compressed frame is represented as a sequence of layers. The first (base) layer is mandatory while the other (enhancement) layers can be safely discarded. Information about the particular frame structure is available from the payload header. In order to adjust outgoing bandwidth, the gateway MUST read the frame(s) structure from the payload header, define which enhancement layers to discard, and compose a new RTP packet according to this specification.
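
As an illustration of the gateway behaviour described above, the following sketch picks the rate index a gateway could scale a packet down to. It relies on the rule stated in Section 3.3 that a speech payload can be scaled to any rate index between BR and CR. The function name ipmr_pick_rate and the parameter wanted are hypothetical and not part of this specification; actually removing the layers still requires the frame structure returned by the helper routine in Appendix A.

```c
#include <assert.h>

/* Hypothetical helper (not defined by this document): choose the rate
 * index a gateway scales a packet to.  br and cr are the base and coding
 * rate indices from the speech payload header (0..5, br <= cr); wanted is
 * the highest rate index the outgoing link can afford.  The result never
 * drops below br, because the base layer is mandatory, and never exceeds
 * cr, because layers above cr are not in the packet. */
int ipmr_pick_rate(int br, int cr, int wanted)
{
    assert(br >= 0 && br <= cr && cr <= 5);
    if (wanted < br)
        return br;
    if (wanted > cr)
        return cr;
    return wanted;
}
```

The returned index would become the CR value of the rewritten packet, with BR left unchanged.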

In fact, not all bits within a frame are equally tolerant to distortion. IP-MR defines 6 classes ('A'-'F') of sensitivity to bit errors. Any damage of class 'A' bits causes significant reconstruction artifacts while the loss in class 'F' may not even be perceived by the listener. Note that only the base layer in a bitstream is represented as a set of classes.

The IP-MR payload format allows frames to be duplicated across packets to improve robustness against packet loss (Section 3.6). The base layer can be retransmitted completely or as a subset of its most sensitive classes. Enhancement layers cannot be retransmitted.

The fine-grained redundancy in conjunction with bitrate scalability allows applications to adjust the trade-off between overhead and robustness against packet loss. Note that this approach is supported natively within a packet and requires no out-of-band signals or session-initialization procedures.

The main IP-MR features are as follows:

o High-quality wideband speech codec.

o Bitrate scalable with 6 average rates from 7.7 to 34.2 kbps.

o Built-in discontinuous transmission (DTX) and comfort noise generation (CNG) support.

o Flexible in-band redundancy control scheme for packet-loss protection.

3. Payload Format

The payload format consists of the RTP header and the IP-MR payload.

3.1. RTP Header Usage

The format of the RTP header is specified in [RFC3550]. This payload format uses the fields of the header in a manner consistent with that specification.

The RTP timestamp corresponds to the sampling instant of the first sample encoded for the first frame-block in the packet. The timestamp clock frequency SHALL be 16 kHz. The duration of one frame is 20 ms, which corresponds to 320 samples per frame. Thus, the timestamp is increased by 320 for each consecutive frame. The timestamp is also used to recover the correct decoding order of the frame-blocks.
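
The timestamp arithmetic above can be sketched as follows. This is a minimal illustration that assumes consecutive packets with no gap between them (for example, no DTX pause); the constant and function names are hypothetical, and gr stands for the grouping field defined in Section 3.3.

```c
#include <assert.h>
#include <stdint.h>

#define IPMR_CLOCK_HZ          16000u  /* RTP timestamp clock, Section 3.1 */
#define IPMR_FRAME_MS          20u     /* duration of one frame */
#define IPMR_SAMPLES_PER_FRAME (IPMR_CLOCK_HZ * IPMR_FRAME_MS / 1000u) /* 320 */

/* Timestamp of the packet that directly follows one carrying (gr + 1)
 * frame-blocks.  gr is the 2-bit grouping field (0..3).  The result is
 * reduced modulo 2^32 by the uint32_t return type, as RTP timestamps
 * wrap around. */
uint32_t ipmr_next_timestamp(uint32_t ts, unsigned gr)
{
    assert(gr <= 3u);
    return (uint32_t)(ts + (gr + 1u) * IPMR_SAMPLES_PER_FRAME);
}
```

A packet with GR=2 carries three frames, so the packet after it starts 960 timestamp units later.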

The RTP header marker bit (M) SHALL be set to 1 whenever the first frame-block carried in the packet is the first frame-block in a talkspurt (see definition of talkspurt in Section 4.1 of [RFC3551]). For all other packets, the marker bit SHALL be set to zero (M=0).

The assignment of an RTP payload type for the format defined in this memo is outside the scope of this document. The RTP profiles in use currently mandate binding the payload type dynamically for this payload format. This is basically necessary because the payload type expresses the configuration of the payload itself, i.e., basic or interleaved mode, and the number of channels carried.

The remaining RTP header fields are used as specified in [RFC3550].

3.2. RTP Payload Structure

The IP-MR payload is composed of two payloads, one for current speech and one for redundancy. Both payloads are represented in this form: Header, Table of Contents (TOC), and Data. Redundancy payload carries data for preceding and pre-preceding packets.

     +--------+-----+----------------------+- - - - +- -  +- - - - - +
     | Header | TOC | Data                 | Header | TOC | Data     |
     +--------+-----+----------------------+- - - - +- -  +- - - - - +
     |<- Speech -------------------------->|<- Redundancy (opt) ---->|
        
3.3. Speech Payload Header

This header carries parameters that are common for all frames in the packet:

                        0                   1
                        0 1 2 3 4 5 6 7 8 9 0 1
                       +-+-+-+-+-+-+-+-+-+-+-+-+
                       |T| CR  | BR  |D|A|GR |R|
                       +-+-+-+-+-+-+-+-+-+-+-+-+
        

o T (1 bit): Reserved. MUST always be set to 0. Receiver MAY discard packet if the 'T' bit is not equal to 0.

o CR (3 bits): Coding rate index. Identifies the top enhancement layer available. The CR value 7 (NO_DATA) indicates that there is no speech data (and thus no speech TOC) in the payload. This MAY be used to transmit redundancy data only.

o BR (3 bits): Base rate index - base layer bitrate. Speech payload can be scaled to any rate index between BR and CR. Packets with BR = 6 or BR > CR MUST be discarded. Redundancy data is also considered to have a base rate of BR.

o D (1 bit): Reserved. MUST always be set to 1. Receiver MAY discard packet if the 'D' bit is zero.

o A (1 bit): Byte alignment. The value of 1 specifies that padding bits were added so that each compressed frame (Section 3.5) starts on a byte (8-bit) boundary. The value of 0 specifies unaligned frames. Note that the end of the speech payload is always padded to a byte boundary, regardless of the 'A' bit value.

o GR (2 bits): Number of frames in packet (grouping size). Actual grouping size is GR + 1; thus, the maximum grouping supported is 4.

o R (1 bit): Redundancy presence. Value of 1 indicates redundancy payload presence.

Note that the values of 'T' and 'D' bits are fixed; any other values are not allowed by specification. Padding bits ('P' bits) MUST always be set to zero.
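
The field layout above can be read with a few shifts and masks. The following is a sketch only: it assumes that bit 0 of the diagram is the most significant bit of the first payload byte, and the structure and function names are hypothetical. It rejects packets with T != 0 or D != 1, which this document merely permits (MAY), and packets that use the reserved rate index 6 or have BR > CR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fields of the 12-bit speech payload header (Section 3.3). */
struct ipmr_speech_hdr {
    unsigned t, cr, br, d, a, gr, r;
};

/* Parse the header from the first two payload bytes.  Assumes bit 0 of
 * the diagram is the most significant bit of p[0].  Returns 0 on success
 * and -1 when this sketch discards the packet: T != 0 or D != 1 (allowed
 * but not required by the text), CR or BR equal to the reserved index 6,
 * or BR > CR. */
int ipmr_parse_speech_hdr(const uint8_t *p, struct ipmr_speech_hdr *h)
{
    assert(p != NULL && h != NULL);
    h->t  = (p[0] >> 7) & 0x1u;
    h->cr = (p[0] >> 4) & 0x7u;
    h->br = (p[0] >> 1) & 0x7u;
    h->d  =  p[0]       & 0x1u;
    h->a  = (p[1] >> 7) & 0x1u;
    h->gr = (p[1] >> 5) & 0x3u;
    h->r  = (p[1] >> 4) & 0x1u;
    if (h->t != 0u || h->d != 1u)
        return -1;
    if (h->cr == 6u || h->br == 6u || h->br > h->cr)
        return -1;
    return 0;
}
```

The lower four bits of the second byte are not part of the header; they already belong to the speech TOC and the data that follows.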

The following table defines the mapping between rate index and rate value:

                    +------------+--------------+
                    | rate index | avg. bitrate |
                    +------------+--------------+
                    |      0     |   7.7 kbps   |
                    |      1     |   9.8 kbps   |
                    |      2     |  14.3 kbps   |
                    |      3     |  20.8 kbps   |
                    |      4     |  27.9 kbps   |
                    |      5     |  34.2 kbps   |
                    |      6     |  (reserved)  |
                    |      7     |   NO_DATA    |
                    +------------+--------------+
        

The value of 6 is reserved. A packet that carries this value MUST be discarded.
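
For illustration, the rate table can be held as a small lookup array. The sketch below also derives the average number of bits in one 20 ms frame; the identifiers are hypothetical, and because the table lists average bitrates, the length of an individual frame can differ from this figure.

```c
#include <assert.h>

/* Average bitrate per rate index in units of 100 bit/s (7.7 kbps -> 77).
 * Index 6 is reserved and index 7 is NO_DATA; both map to -1 here. */
static const int ipmr_rate_x100bps[8] = { 77, 98, 143, 208, 279, 342, -1, -1 };

/* Average number of bits in one 20 ms frame at the given rate index,
 * or -1 for the reserved and NO_DATA indices. */
int ipmr_avg_frame_bits(int index)
{
    assert(index >= 0 && index <= 7);
    if (ipmr_rate_x100bps[index] < 0)
        return -1;
    /* (rate * 100 bit/s) * 0.020 s = rate * 2 bits */
    return ipmr_rate_x100bps[index] * 2;
}
```

At rate index 0 this gives 154 bits per frame on average, and 684 bits at rate index 5.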

3.4. Speech Payload Table of Contents

The speech TOC is a bitmask indicating the presence of each frame in the packet. TOC is only available if the 'CR' value is not equal to 7 (NO_DATA).

                               0 1 2 3
                              +-+-+-+-+
                              |E|E|E|E|
                              +-+-+-+-+
                              |<----->| <-- #(GR+1)
        

o E (1 bit): Frame existence indicator. The value of 0 indicates speech data is not present for the corresponding frame. The IP-MR encoder sets the 'E' flag to 0 for the periods of silence in DTX mode. Applications MUST set this bit to 0 if the frame is known to be damaged.
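
Reading the speech TOC amounts to extracting GR+1 single bits that follow the 12-bit speech payload header. The sketch below assumes most-significant-bit-first bit order within each byte; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read one bit at absolute bit offset pos, counting from the most
 * significant bit of buf[0] (assumed bit order). */
static unsigned ipmr_get_bit(const uint8_t *buf, size_t pos)
{
    return (unsigned)(buf[pos / 8] >> (7 - pos % 8)) & 0x1u;
}

/* Read the speech TOC, which starts right after the 12-bit header and
 * holds gr + 1 'E' flags.  present[i] is set to 1 when frame i carries
 * data.  Returns the number of frames present.  Must not be called when
 * CR = 7 (NO_DATA), because then there is no speech TOC. */
unsigned ipmr_read_speech_toc(const uint8_t *buf, unsigned gr,
                              unsigned present[4])
{
    unsigned i, count = 0;
    assert(buf != NULL && present != NULL && gr <= 3u);
    for (i = 0; i <= gr; i++) {
        present[i] = ipmr_get_bit(buf, 12u + i);
        count += present[i];
    }
    return count;
}
```

Only the first GR+1 entries of present are written; the caller decides how to treat the rest.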

3.5. Speech Payload Data

Speech data contains (GR+1) compressed IP-MR frames (20 ms of data). A compressed frame has a length of zero if the corresponding TOC flag is zero.

The beginning of each compressed frame is aligned if the 'A' bit is nonzero, while the end of the speech payload is always aligned to a byte (8-bit) boundary:

   +- - -+------------+------------+------------+------------+
   | TOC | Frame1     | Frame2     | Frame3     | Frame4     |
   +- - -+------------+------------+------------+------------+   ALWAYS
         |<- aligned  |<- aligned  |<- aligned  |<- aligned  |<- ALIGNED
        

Marked regions MUST be padded only if the 'A' bit is set to '1'.
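
The padding rules can be checked with simple bit arithmetic. The following sketch computes the total size of a speech payload from the header, the TOC, and the frame lengths; the function names are hypothetical, and the frame lengths are assumed to be known already (in practice they come from the helper routine in Appendix A).

```c
#include <assert.h>
#include <stddef.h>

/* Number of zero padding bits needed to move bit position pos to the
 * next byte boundary (0 when pos is already aligned). */
size_t ipmr_pad_bits(size_t pos)
{
    return (8u - pos % 8u) % 8u;
}

/* Total size in bits of a speech payload: 12 header bits, gr + 1 TOC
 * bits, then the frames.  frame_bits[i] is 0 for a frame whose 'E' flag
 * is 0.  With a = 1 each frame starts on a byte boundary; the end of the
 * payload is padded to a byte boundary in either case. */
size_t ipmr_speech_payload_bits(unsigned a, unsigned gr,
                                const size_t frame_bits[4])
{
    size_t pos = 12u + gr + 1u;
    unsigned i;
    assert(gr <= 3u && frame_bits != NULL);
    for (i = 0; i <= gr; i++) {
        if (a)
            pos += ipmr_pad_bits(pos);
        pos += frame_bits[i];
    }
    return pos + ipmr_pad_bits(pos);
}
```

For the two payloads shown in Section 4, this yields 208 bits (26 bytes) for the single unaligned 194-bit frame and 288 bits (36 bytes) for the aligned frames of 93 and 172 bits.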

The compressed frame structure is as follows:

   |<---- sensitive classes ------>|<----- enhancement layers -------->|
   +-------------------------------+----+-----+------+- - - - - +------+
   | L1 (Base Layer)               | L2 | L3  | L4   |          | LN   |
   +-------------------------------+----+-----+------+- - - - - +------+
   |<- A --->|<- B ->| ... |<- F ->|                                   |
   |<- BR rate ------------------->|                                   |
   |<- CR rate ------------------------------------------------------->|
        

Appendix A of this document provides a helper routine written in C that MUST be used to extract the sensitivity classes and the bounds of the enhancement layers from the compressed frame data.

3.6. Redundancy Payload Header

The redundancy payload presence is signaled by the 'R' bit of the speech payload header. The redundancy header is composed of two fields of 3 bits each:

                               0 1 2 3 4 5
                              +-+-+-+-+-+-+
                              | CL1 | CL2 |
                              +-+-+-+-+-+-+
        

The 'CL1' and 'CL2' fields specify the sensitivity classes available for the preceding and pre-preceding packets, respectively.

                    +-------+--------------------+
                    |  CL   | Redundancy classes |
                    |       |      available     |
                    +-------+--------------------+
                    |   0   |       NONE         |
                    |   1   |        A           |
                    |   2   |        A-B         |
                    |   3   |        A-C         |
                    |   4   |        A-D         |
                    |   5   |        A-E         |
                    |   6   |        A-F         |
                    |   7   |    (reserved)      |
                    +-------+--------------------+
        

A receiver can reconstruct the base layer of preceding packets completely (CL=6) or partially (0 < CL < 6) based on the sensitivity classes delivered. A decoder MUST discard the redundancy payload if 'CL' is equal to 0 or 7.

Note that the base rate index and the grouping parameter are not transmitted for the redundancy payload. Applications MUST assume that 'BR' and 'GR' are the same as for the current packet.
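
The CL table maps directly to a count of sensitivity classes. The sketch below splits the 6-bit redundancy header and converts a CL value into that count. The function names are hypothetical, most-significant-bit-first order is assumed, and treating each CL field on its own (rather than discarding the whole redundancy payload when either field is 0 or 7) is an interpretation, not something this document states.

```c
#include <assert.h>
#include <stddef.h>

/* Number of sensitivity classes (counted from 'A') signalled by a 3-bit
 * CL value: 1..6 map to classes A..A-F.  Returns 0 for CL = 0 (NONE) and
 * CL = 7 (reserved), for which no redundancy data is usable. */
int ipmr_cl_class_count(unsigned cl)
{
    assert(cl <= 7u);
    return (cl >= 1u && cl <= 6u) ? (int)cl : 0;
}

/* Split the 6-bit redundancy header (CL1 in the upper three bits,
 * assuming most-significant-bit-first order) into its two fields. */
void ipmr_split_red_hdr(unsigned hdr6, unsigned *cl1, unsigned *cl2)
{
    assert(hdr6 <= 0x3Fu && cl1 != NULL && cl2 != NULL);
    *cl1 = (hdr6 >> 3) & 0x7u;
    *cl2 = hdr6 & 0x7u;
}
```

A header of binary 010001 therefore carries CL1=2 (classes A-B) and CL2=1 (class A).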

3.7. Redundancy Payload Table of Contents

The redundancy TOC is a bitmask indicating the presence of each frame in the redundancy payload. The redundancy TOC is only available if the 'CL' value is not equal to 0 or 7.

                 0 1 ...
                +-+-+-+-+-+-+-+-+
                |E|E|E|E|E|E|E|E|
                +-+-+-+-+-+-+-+-+
                |       |<----->| pre-preceding payload #(GR+1)
                |<----->| preceding payload #(GR+1)
        

o E (1 bit): Redundancy frame existence indicator. The value of 0 indicates that redundancy data is not present for the corresponding frame.
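
As a sketch of how the redundancy TOC divides between the two earlier packets, the following function splits a TOC of 2*(GR+1) flags into its halves and counts the frames present. Holding the TOC in an integer with the first flag in the most significant used bit is an assumed representation, and the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Split a redundancy TOC of 2 * (gr + 1) bits into its two halves.
 * toc holds the flags as an integer with the first flag in the most
 * significant used bit (an assumed representation).  The first gr + 1
 * flags belong to the preceding packet, the rest to the pre-preceding
 * one.  Returns the total number of frames that carry redundancy data. */
unsigned ipmr_split_red_toc(unsigned toc, unsigned gr,
                            unsigned *preceding, unsigned *pre_preceding)
{
    unsigned n = gr + 1u;
    unsigned mask = (1u << n) - 1u;
    unsigned all, count = 0;
    assert(gr <= 3u && preceding != NULL && pre_preceding != NULL);
    *preceding     = (toc >> n) & mask;
    *pre_preceding = toc & mask;
    for (all = toc & ((1u << (2u * n)) - 1u); all != 0u; all >>= 1)
        count += all & 1u;
    return count;
}
```

With GR=2 and a TOC of binary 111011, the preceding half is 111, the pre-preceding half is 011, and five of the six frames carry redundancy data.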

3.8. Redundancy Payload Data

IP-MR defines 6 classes ('A'-'F') of sensitivity to bit errors. Any damage to class 'A' bits causes significant reconstruction artifacts, while a loss in class 'F' may not even be perceived by the listener. Note that only the base layer in a bitstream is represented as a set of classes. Together, the sensitivity classes and the redundancy mechanism allow IP-MR to duplicate frames across packets to improve robustness against packet loss.

Redundancy data carries a number of sensitivity classes for preceding and pre-preceding packets as indicated by the 'CL1' and 'CL2' fields of the redundancy header. The sensitivity classes' data is available individually for each frame only if the corresponding 'E' bit of the redundancy TOC is nonzero:

   +---+---+----+----+-----+-----+-----+-----+-----+-----+-----+
   |A-C|A-B|1000|1001|cl_A1|cl_B1|cl_C1|cl_A1|cl_B1|cl_A4|cl_B4|
   +---+---+----+----+-----+-----+-----+-----+-----+-----+-----+
   |<- CL >|<- TOC ->|<- preceding --->|<- pre-preceding ----->|
        

Redundancy data is only available if the base rates (BRs) and coding rates (CRs) of preceding and pre-preceding packets are the same as for the current packet.

A receiver MAY use redundancy data to compensate for packet loss (note that in this case, the 'CL' field MUST also be passed to the decoder). The helper routine provided in Appendix A MUST be used to extract sensitivity classes' length for each frame. The following pseudocode describes the sequence of operations:

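      /* Pseudocode: find how many bits of redundancy data each frame */
      /* carries and step through the redundancy payload data.        */
      int sensitivityBits[numOfRedundancyFrames][6]; /* bits per class */
      int redundancyBits [numOfRedundancyFrames];    /* bits per frame */
      int i, j;
      for (i = 0; i < numOfRedundancyFrames; i++) {
          /* Appendix A helper; only the class sizes are used here */
          GetFrameInfo(CR, BR, pRedundancyPayloadData, dummy,
                       sensitivityBits[i], dummy);
          redundancyBits[i] = 0;
          /* CL[i] is taken to be CL1 or CL2, whichever covers frame i */
          for (j = 0; j < CL[i]; j++) {
              redundancyBits[i] += sensitivityBits[i][j];
          }
          /* advance past the redundancy data of this frame */
          flushBits(pRedundancyPayloadData, redundancyBits[i]);
      }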
        
        
4. Payload Examples

This section provides detailed examples of the IP-MR payload format.

4.1. Payload Carrying a Single Frame

The following diagram shows a typical IP-MR payload carrying one (GR=0) non-aligned (A=0) speech frame without redundancy (R=0). The base layer is coded at 7.7 kbps (BR=0) while the coding rate is 9.8 kbps (CR=1). The 'E' bit value of 1 signals that compressed frame bits s(0) - s(193) are present. There is a padding bit 'P' to maintain speech payload size alignment.

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |0|CR=1 |BR=0 |1|0|0 0|0|1|s(0)                                 |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                       s(193)|P|
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
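
The header bits of this example can be reproduced by packing the fields in diagram order. The sketch below assumes that bit 0 of the diagram is the most significant bit of the first byte; the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Pack the 12-bit speech payload header into out[0] and the upper four
 * bits of out[1] (most significant bit first, an assumed bit order).
 * T is written as 0 and D as 1, as Section 3.3 requires.  The lower
 * four bits of out[1] are cleared for the caller to fill with TOC and
 * frame bits. */
void ipmr_pack_speech_hdr(uint8_t out[2], unsigned cr, unsigned br,
                          unsigned a, unsigned gr, unsigned r)
{
    assert(cr <= 7u && br <= 7u && a <= 1u && gr <= 3u && r <= 1u);
    out[0] = (uint8_t)((cr << 4) | (br << 1) | 1u);
    out[1] = (uint8_t)((a << 7) | (gr << 5) | (r << 4));
}
```

For CR=1, BR=0, A=0, GR=0, and R=0, the first byte is 0x11 and the header part of the second byte is zero; the TOC flag E=1 then occupies the next bit, followed by s(0).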
        
        
4.2. Payload Carrying Multiple Frames with Redundancy

The following diagram shows a payload carrying 3 (GR=2) aligned (A=1) speech frames with redundancy (R=1). The TOC value of '101' indicates that speech data is present for the first frame (bits sp1(0)-sp1(92)) and the third frame (bits sp3(0)-sp3(171)). There are no enhancement layers because the base and coding rates are equal (BR=CR=0). Padding bits 'P' are inserted to maintain the necessary alignment.

The redundancy payload is present for both the preceding and pre-preceding payloads (CL1=A-B, CL2=A), but redundancy data is only available for 5 (TOC='111011') of 6 (2*(GR+1)) frames. There is redundancy data of 20, 39, and 35 bits for each of the three frames of the preceding packet and 15 and 19 bits for the two frames of the pre-preceding packet.

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |0|CR=0 |BR=0 |1|1|1 0|1|1 0 1|P|sp1(0)                         |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                  sp1(92)|P|P|P|sp3(0)                         |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                                               sp3(171)|P|P|P|P|
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |CL1=2|CL2=1|1 1 1|0 1 1|red1_1_AB(0)              red1_1_AB(19)|
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |red1_2_AB(0)                                                   |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |red1_2_AB(38)|red1_3_AB(0)                                     |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |      red1_3_AB(34)|red2_2_A(0)      red2_2_A(14)|red2_3_A(0)  |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |           red2_3_A(18)|P|P|P|P|
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
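
The bit counts in the diagram can be checked with a short calculation. The sketch below is illustrative only: the field widths are read off the diagram above, and the function names pad_to_octet and example_payload_bits are hypothetical and not part of this specification. It assumes that, with A=1, the payload header and each speech frame are padded to an octet boundary, while the redundancy section is padded only once at its end, as drawn.

```c
#include <stddef.h>

/* Hypothetical helper: round a bit count up to a whole number of octets. */
int pad_to_octet(int bits)
{
    return (bits + 7) / 8 * 8;
}

/* Total size in bits of the example payload, using widths from the diagram. */
int example_payload_bits(void)
{
    /* Header fields plus the 3-bit speech TOC: 1+3+3+1+1+2+1+3 = 15 bits. */
    const int header_bits = 1 + 3 + 3 + 1 + 1 + 2 + 1 + 3;
    /* Speech data: sp1(0..92) and sp3(0..171); frame 2 carries no data. */
    const int speech_bits[] = { 93, 172 };
    /* Redundancy header: CL1 (3 bits), CL2 (3 bits), redundancy TOC (6 bits). */
    const int red_header_bits = 3 + 3 + 6;
    /* Redundancy data sizes listed in the text above. */
    const int red_bits[] = { 20, 39, 35, 15, 19 };
    int total, red;
    size_t i;

    /* Assumption: header and each speech frame are octet aligned (A=1). */
    total = pad_to_octet(header_bits);
    for (i = 0; i < sizeof speech_bits / sizeof speech_bits[0]; i++) {
        total += pad_to_octet(speech_bits[i]);
    }

    /* Assumption: redundancy data is packed and padded once at the end. */
    red = red_header_bits;
    for (i = 0; i < sizeof red_bits / sizeof red_bits[0]; i++) {
        red += red_bits[i];
    }
    total += pad_to_octet(red);

    return total;
}
```

Under these assumptions, example_payload_bits() returns 432 bits (54 octets), which matches the thirteen full 32-bit rows plus one 16-bit row in the diagram.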
        
5. Congestion Control

The general congestion control considerations for transporting RTP data apply to IP-MR speech over RTP (see RTP [RFC3550] and any applicable RTP profile, such as the Audio-Visual Profile (AVP) [RFC3551]). However, the multi-rate capability of IP-MR speech coding provides a mechanism that may help to control congestion, since the bandwidth demand can be adjusted by selecting a different encoding mode.

The number of frames encapsulated in each RTP payload highly influences the overall bandwidth of the RTP stream due to header overhead constraints. Packetizing more frames in each RTP payload can reduce the number of packets sent and thus reduce the overhead from IP/UDP/RTP headers, at the expense of increased delay.
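
As a rough illustration of this trade-off, the sketch below estimates the on-the-wire bitrate for a given number of frames per packet. It assumes 20 ms frames (consistent with the allowed ptime values of 20 to 80 ms), IPv4 without options (20 octets), UDP (8 octets), and an RTP header without CSRC entries or extensions (12 octets). It ignores the payload header, padding, and link-layer framing. The function name stream_bitrate_bps is hypothetical.

```c
#include <assert.h>

/* Assumed header sizes in octets: IPv4 (no options) + UDP + RTP (no CSRC). */
enum { IP_UDP_RTP_HEADER_OCTETS = 20 + 8 + 12 };

/* Assumed duration of one speech frame in milliseconds. */
enum { FRAME_MS = 20 };

/*
 * Estimate the total stream bitrate in bits per second.
 *   codec_bps         - speech bitrate produced by the encoder
 *   frames_per_packet - number of frames in each RTP payload (1..4)
 */
long stream_bitrate_bps(long codec_bps, int frames_per_packet)
{
    long packet_ms, header_bps;

    assert(frames_per_packet >= 1 && frames_per_packet <= 4);

    packet_ms = (long)FRAME_MS * frames_per_packet;

    /* One set of IP/UDP/RTP headers is sent every packet_ms milliseconds. */
    header_bps = IP_UDP_RTP_HEADER_OCTETS * 8L * 1000L / packet_ms;

    return codec_bps + header_bps;
}
```

Under these assumptions, a 7.7 kbps stream costs about 23.7 kbps on the wire with one frame per packet and about 11.7 kbps with four frames per packet, while the packetization delay grows from 20 ms to 80 ms.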

Because the IP-MR codec is scalable, the transmission rate can be reduced at any transport stage to fit the channel bandwidth. The minimal rate is specified by the BR field of the payload header and can be as low as 7.7 kbps. It is up to the application to keep the balance between coding quality (high BR) and bitstream scalability (low BR). Because coding quality depends on the coding rate (CR) rather than the base rate (BR), it is NOT RECOMMENDED to use high BR values for real-time communications.

Applications MAY utilize bitstream redundancy to combat packet loss. However, the gateway is free to choose any option to reduce the transmission rate; either the coding layers or the redundancy bits can be dropped. For this reason, it is NOT RECOMMENDED for applications to increase the total bitrate when adding redundancy in response to packet loss.

6. Security Considerations

RTP packets using the payload format defined in this specification are subject to the security considerations discussed in the RTP specification [RFC3550] and in any applicable RTP profile. The main security considerations for the RTP packet carrying the RTP payload format defined within this memo are confidentiality, integrity, and source authenticity. Confidentiality is achieved by encryption of the RTP payload. Integrity of the RTP packets is achieved through a suitable cryptographic integrity-protection mechanism. Such a cryptographic system may also allow the authentication of the source of the payload. A suitable security mechanism for this RTP payload format should provide confidentiality, integrity protection, and source authentication at least capable of determining if an RTP packet is from a member of the RTP session.

Note that the appropriate mechanisms to provide security to RTP and payloads following this memo may vary. The security mechanisms are dependent on the application, the transport, and the signaling protocol employed. Therefore, no single mechanism is sufficient for all cases, although the Secure Real-time Transport Protocol (SRTP) [RFC3711] is recommended where suitable. Other mechanisms that may be used are IPsec [RFC4301] and Transport Layer Security (TLS) [RFC5246] (RTP over TCP); other alternatives may exist.

This payload format does not exhibit any significant non-uniformity in the receiver-side computational complexity for packet processing and thus is unlikely to pose a denial-of-service threat due to the receipt of pathological data.

7. Payload Format Parameters

This section describes the media types and names associated with this payload format.

The IP-MR media subtype is defined as 'ip-mr_v2.5'. This subtype was registered to specify an internal codec version. Later, this version was accepted as final, the bitstream was frozen, and IP-MR v2.5 was published under the name of IP-MR. Currently, the terms 'IP-MR' and 'IP-MR v2.5' are synonyms. The subtype name 'ip-mr_v2.5' is being used in implementations.

7.1. Media Type Registration

Media Type name: audio

Media Subtype name: ip-mr_v2.5

Required parameters: none

Optional parameters: These parameters apply to RTP transfer only.

ptime: The media packet length in milliseconds. Allowed values are: 20, 40, 60, and 80.

Encoding considerations: This media type is framed and binary (see RFC 4288, Section 4.8).

Security considerations: See Section 6 of RFC 6262.

Interoperability considerations: none

Published specification: RFC 6262

Applications that use this media type: Real-time audio applications like voice over IP, teleconference, and multimedia streaming.

Additional information: none

   Person & email address to contact for further information:
      V. Sviridenko <vladimirs@spiritdsp.com>
        
Intended usage: COMMON

Restrictions on usage: This media type depends on RTP framing and thus is only defined for transfer via RTP [RFC3550].

   Authors:
      Sergey Ikonin <info@spiritdsp.com>
      Dmitry Yudin <info@spiritdsp.com>
        
Change controller: IETF Audio/Video Transport working group delegated from the IESG.

7.2. Mapping Media Type Parameters into SDP

The information carried in the media type specification has a specific mapping to fields in the Session Description Protocol (SDP) [RFC4566], which is commonly used to describe RTP sessions. When SDP is used to specify sessions employing the IP-MR codec, the mapping is as follows:

o The media type ("audio") goes in SDP "m=" as the media name.

o The media subtype (payload format name) goes in SDP "a=rtpmap" as the encoding name. The RTP clock rate in "a=rtpmap" MUST be 16000.

o The parameter "ptime" goes in the SDP "a=ptime" attribute.

Any remaining parameters go in the SDP "a=fmtp" attribute by copying them directly from the media type parameter string as a semicolon-separated list of parameter=value pairs.

Note that the payload format (encoding) names are commonly shown in uppercase. Media subtypes are commonly shown in lowercase. These names are case-insensitive in both places.
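
To make the mapping concrete, the sketch below formats the media-level SDP lines for an IP-MR stream. The function name format_ipmr_sdp, the port, and the dynamic payload type used in the usage note are hypothetical values chosen for illustration. Only the encoding name, the 16000 Hz clock rate, and the allowed ptime values come from this document.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Write the media-level SDP lines for an IP-MR stream into out.
 * Returns the number of characters written, or -1 if ptime_ms is not
 * one of the allowed values or the buffer is too small.
 * port and payload_type are chosen by the caller.
 */
int format_ipmr_sdp(char *out, size_t out_size,
                    int port, int payload_type, int ptime_ms)
{
    int n;

    /* Allowed ptime values per the media type registration. */
    if (ptime_ms != 20 && ptime_ms != 40 &&
        ptime_ms != 60 && ptime_ms != 80) {
        return -1;
    }

    n = snprintf(out, out_size,
                 "m=audio %d RTP/AVP %d\r\n"          /* media type   */
                 "a=rtpmap:%d ip-mr_v2.5/16000\r\n"   /* subtype/rate */
                 "a=ptime:%d\r\n",                    /* ptime        */
                 port, payload_type, payload_type, ptime_ms);

    if (n < 0 || (size_t)n >= out_size) {
        return -1;
    }
    return n;
}
```

For example, format_ipmr_sdp(buf, sizeof buf, 49170, 97, 40) produces the three lines "m=audio 49170 RTP/AVP 97", "a=rtpmap:97 ip-mr_v2.5/16000", and "a=ptime:40". Since ptime is the only optional parameter and it maps to "a=ptime", no "a=fmtp" line is needed in this sketch.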

8. IANA Considerations

One media type (ip-mr_v2.5) has been defined and registered in the media types registry.

9. Normative References

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V. Jacobson, "RTP: A Transport Protocol for Real-Time Applications", STD 64, RFC 3550, July 2003.

[RFC3551] Schulzrinne, H. and S. Casner, "RTP Profile for Audio and Video Conferences with Minimal Control", STD 65, RFC 3551, July 2003.

[RFC3711] Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K. Norrman, "The Secure Real-time Transport Protocol (SRTP)", RFC 3711, March 2004.

[RFC4301] Kent, S. and K. Seo, "Security Architecture for the Internet Protocol", RFC 4301, December 2005.

[RFC4566] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session Description Protocol", RFC 4566, July 2006.

[RFC5246] Dierks, T. and E. Rescorla, "The Transport Layer Security (TLS) Protocol Version 1.2", RFC 5246, August 2008.

Appendix A. Retrieving Frame Information

This appendix contains C code that implements the frame-parsing function. This function extracts information about a coded frame, including the frame size, the number of layers, the size of each layer, and the sizes of the perceptual sensitivity classes.

A.1. get_frame_info.c
   /*
     Copyright (c) 2011 IETF Trust and the persons identified as
     authors of the code. All rights reserved.

     Redistribution and use in source and binary forms, with or
     without modification, are permitted provided that the following
     conditions are met:

     - Redistributions of source code must retain the above copyright
       notice, this list of conditions and the following disclaimer.

     - Redistributions in binary form must reproduce the above
       copyright notice, this list of conditions and the following
       disclaimer in the documentation and/or other materials provided
       with the distribution.

     - Neither the name of Internet Society, IETF or IETF Trust, nor
       the names of specific contributors, may be used to endorse or
       promote products derived from this software without specific
       prior written permission.

     THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
     CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES,
     INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
     MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
     DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS
     BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
     EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
     TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
     DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
     ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
     OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
     OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
     POSSIBILITY OF SUCH DAMAGE.
   */

   /******************************************************************

     get_frame_info.c

     Retrieving frame information for IP-MR Speech Codec

   ******************************************************************/

   #define RATES_NUM       6   // number of codec rates
   #define SENSE_CLASSES   6   // number of sensitivity classes (A..F)

   // frame types
   #define FT_SPEECH       0   // active speech
   #define FT_DTX_SID      1   // silence insertion descriptor

   // get specified bit from coded data
   // (bit 0 is the least significant bit of buf[0])
   int GetBit(const unsigned char *buf, int curBit)
   {
       return (buf[curBit>>3]>>(curBit%8))&1;
   }

   // retrieve frame information
   int GetFrameInfo(               // o: frame size in bits
       short rate,                 // i: encoding rate (0..5)
       short base_rate,            // i: base (core) layer rate,
       const unsigned char buf[2], // i: coded bit frame
       int size,                   // i: coded bit frame size in bytes
       short pLayerBits[RATES_NUM],     // o: number of bits in layers
       short pSenseBits[SENSE_CLASSES], // o: number of bits in
                                        //    sensitivity classes
       short *nLayers                   // o: number of layers
   )
   {
       static const short Bits_1[4]    = {  0, 9, 9,15};
       static const short Bits_2[16]   = { 43,50,36,31,46,48,40,44,
                                           47,43,44,45,43,44,47,36};
       static const short Bits_3[2][6] = {{13,11,23,33,36,31},
                                          {25, 0,23,32,36,31},};
       int FrType;
       int i, nBits = 0;

       if (rate < 0 || rate > 5) {
           return 0; // incorrect stream
       }

       if (size < 1) {
           return 0; // not enough input data to read the frame type
       }

       // extract frame type bit
       FrType = GetBit(buf, nBits++) ? FT_SPEECH : FT_DTX_SID;

       if(FrType != FT_DTX_SID && size < 2) {
           return 0; // not enough input data
       }

       for(i = 0; i < SENSE_CLASSES; i++) {
           pSenseBits[i] = 0;
       }

       {
           int cw_0;
           int b[14] = {0};

           // extract the significant bits without reading past the
           // end of the input (a one-byte SID frame supplies only
           // the first seven of them)
           for(i = 0 ; i < 14 && (nBits >> 3) < size; i++) {
               b[i] = GetBit(buf, nBits++);
           }

           // parse
           if(FrType == FT_DTX_SID) {
               cw_0 = (b[0]<<0)|(b[1]<<1)|(b[2]<<2)|(b[3]<<3);
               rate = 0;
               pSenseBits[0] = 10 + Bits_2[cw_0];
           } else {
               int idx;
               int nFlag_1, nFlag_2;
               int cw_1 = 0, cw_2 = 0; // codewords must start at zero

               // even-numbered bits form the first codeword
               nFlag_1 = b[0] + b[2] + b[4] + b[6];
               cw_1 = (cw_1 << 1) | b[0];
               cw_1 = (cw_1 << 1) | b[2];
               cw_1 = (cw_1 << 1) | b[4];
               cw_1 = (cw_1 << 1) | b[6];

               // odd-numbered bits form the second codeword
               nFlag_2 = b[1] + b[3] + b[5] + b[7];
               cw_2 = (cw_2 << 1) | b[1];
               cw_2 = (cw_2 << 1) | b[3];
               cw_2 = (cw_2 << 1) | b[5];
               cw_2 = (cw_2 << 1) | b[7];

               cw_0 = (b[10]<<0)|(b[11]<<1)|(b[12]<<2)|(b[13]<<3);
               if (base_rate < 0)    base_rate = 0;
               if (base_rate > rate) base_rate = rate;
               idx = base_rate == 0 ? 0 : 1;

               pSenseBits[0] = 15+Bits_2[cw_0];
               pSenseBits[1] = Bits_1[(cw_1>>0)&0x3] +
                               Bits_1[(cw_1>>2)&0x3];
               pSenseBits[2] = nFlag_1*5;
               pSenseBits[3] = nFlag_2*30;

               pSenseBits[5] = (4 - nFlag_2)*(Bits_3[idx][0]);

               for (i = 1; i < rate+1; i++) {
                   pLayerBits[i] = 4*Bits_3[idx][i];
               }
           }

           // the base layer holds all sensitivity classes
           pLayerBits[0] = 0;
           for (i = 0; i < SENSE_CLASSES; i++) {
               pLayerBits[0] += pSenseBits[i];
           }

           *nLayers = rate+1;
       }

       {
           // count total frame size
           int payloadBitCount = 0;
           for (i = 0; i < *nLayers; i++) {
               payloadBitCount += pLayerBits[i];
           }
           return payloadBitCount;
       }
   }
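
The parser reads bits least-significant first within each octet: GetBit(buf, n) returns bit (n mod 8) of octet n/8. The sketch below copies that indexing into two hypothetical helpers, get_bit and read_bits_lsb_first, to show how a multi-bit codeword such as cw_0 is assembled for a SID frame. The helper names and the sample octet in the usage note are illustrative and not part of the codec.

```c
/* Same indexing as GetBit: bit 0 is the least significant bit of buf[0]. */
int get_bit(const unsigned char *buf, int bit_index)
{
    return (buf[bit_index >> 3] >> (bit_index % 8)) & 1;
}

/*
 * Hypothetical helper: read `count` bits starting at `first_bit`,
 * placing the first bit read in the least significant position,
 * which is how GetFrameInfo builds cw_0 from b[0]..b[3].
 */
int read_bits_lsb_first(const unsigned char *buf, int first_bit, int count)
{
    int i, value = 0;

    for (i = 0; i < count; i++) {
        value |= get_bit(buf, first_bit + i) << i;
    }
    return value;
}
```

With buf[0] = 0x16, bit 0 is 0, which GetFrameInfo treats as a SID frame, and read_bits_lsb_first(buf, 1, 4) returns 11. GetFrameInfo would therefore report 10 + Bits_2[11] = 55 bits for that frame.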
        
Author's Address

   Sergey Ikonin
   SPIRIT DSP
   Building 27, A. Solzhenitsyna Street
   109004, Moscow
   Russia

   Tel: +7 495 661-2178
   Fax: +7 495 912-6786
   EMail: s.ikonin@gmail.com
        