Internet Engineering Task Force (IETF)                        J. Spittka
Request for Comments: 7587
Category: Standards Track                                         K. Vos
ISSN: 2070-1721                                                  vocTone
                                                               JM. Valin
                                                                 Mozilla
                                                               June 2015
        

RTP Payload Format for the Opus Speech and Audio Codec

Abstract

This document defines the Real-time Transport Protocol (RTP) payload format for packetization of Opus-encoded speech and audio data necessary to integrate the codec in the most compatible way. It also provides an applicability statement for the use of Opus over RTP. Further, it describes media type registrations for the RTP payload format.

Status of This Memo

This is an Internet Standards Track document.

This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Further information on Internet Standards is available in Section 2 of RFC 5741.

Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at http://www.rfc-editor.org/info/rfc7587.

Copyright Notice

Copyright (c) 2015 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Conventions, Definitions, and Acronyms Used in This Document
   3.  Opus Codec
     3.1.  Network Bandwidth
       3.1.1.  Recommended Bitrate
       3.1.2.  Variable versus Constant Bitrate
       3.1.3.  Discontinuous Transmission (DTX)
     3.2.  Complexity
     3.3.  Forward Error Correction (FEC)
     3.4.  Stereo Operation
   4.  Opus RTP Payload Format
     4.1.  RTP Header Usage
     4.2.  Payload Structure
   5.  Congestion Control
   6.  IANA Considerations
     6.1.  Opus Media Type Registration
   7.  SDP Considerations
     7.1.  SDP Offer/Answer Considerations
     7.2.  Declarative SDP Considerations for Opus
   8.  Security Considerations
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Acknowledgements
   Authors' Addresses
        
1. Introduction

Opus [RFC6716] is a speech and audio codec developed within the IETF Internet Wideband Audio Codec working group. The codec has a very low algorithmic delay, and it is highly scalable in terms of audio bandwidth, bitrate, and complexity. Further, it provides different modes to efficiently encode speech signals as well as music signals, thus making it the codec of choice for various applications using the Internet or similar networks.

This document defines the Real-time Transport Protocol (RTP) [RFC3550] payload format for packetization of Opus-encoded speech and audio data necessary to integrate Opus in the most compatible way. It also provides an applicability statement for the use of Opus over RTP. Further, it describes media type registrations for the RTP payload format.

2. Conventions, Definitions, and Acronyms Used in This Document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

audio bandwidth: The range of audio frequencies being coded

CBR: Constant bitrate

CPU: Central Processing Unit

DTX: Discontinuous Transmission

FEC: Forward Error Correction

IP: Internet Protocol

samples: Speech or audio samples (per channel)

SDP: Session Description Protocol

SSRC: Synchronization source

VBR: Variable bitrate

Throughout this document, we refer to the following definitions:

   +--------------+----------------+-----------------+-----------------+
   | Abbreviation |      Name      | Audio Bandwidth |  Sampling Rate  |
   |              |                |       (Hz)      |       (Hz)      |
   +--------------+----------------+-----------------+-----------------+
   |      NB      |   Narrowband   |     0 - 4000    |       8000      |
   |              |                |                 |                 |
   |      MB      |   Mediumband   |     0 - 6000    |      12000      |
   |              |                |                 |                 |
   |      WB      |    Wideband    |     0 - 8000    |      16000      |
   |              |                |                 |                 |
   |     SWB      | Super-wideband |    0 - 12000    |      24000      |
   |              |                |                 |                 |
   |      FB      |    Fullband    |    0 - 20000    |      48000      |
   +--------------+----------------+-----------------+-----------------+
        

Table 1: Audio Bandwidth Naming

3. Opus Codec

Opus encodes speech signals as well as general audio signals. Two different modes can be chosen, a voice mode or an audio mode, to allow the most efficient coding depending on the type of the input signal, the sampling frequency of the input signal, and the intended application.

The voice mode allows efficient encoding of voice signals at lower bitrates while the audio mode is optimized for general audio signals at medium and higher bitrates.

Opus is highly scalable in terms of audio bandwidth, bitrate, and complexity. Further, Opus allows transmitting stereo signals with in-band signaling in the bitstream.

3.1. Network Bandwidth

Opus supports bitrates from 6 kbit/s to 510 kbit/s. The bitrate can be changed dynamically within that range. All other parameters being equal, higher bitrates result in higher audio quality.

3.1.1. Recommended Bitrate

For a frame size of 20 ms, these are the bitrate "sweet spots" for Opus in various configurations:

o 8-12 kbit/s for NB speech,

o 16-20 kbit/s for WB speech,

o 28-40 kbit/s for FB speech,

o 48-64 kbit/s for FB mono music, and

o 64-128 kbit/s for FB stereo music.

3.1.2. Variable versus Constant Bitrate

For the same average bitrate, variable bitrate (VBR) can achieve higher audio quality than constant bitrate (CBR). For the majority of voice transmission applications, VBR is the best choice. One reason for choosing CBR is the potential information leak that _might_ occur when encrypting the compressed stream. See [RFC6562] for guidelines on when VBR is appropriate for encrypted audio communications. In the case where an existing VBR stream needs to be converted to CBR for security reasons, the Opus padding mechanism described in [RFC6716] is the RECOMMENDED way to achieve padding because the RTP padding bit is unencrypted.

The bitrate can be adjusted at any point in time. To avoid congestion, the average bitrate SHOULD NOT exceed the available network bandwidth. If no target bitrate is specified, the bitrates specified in Section 3.1.1 are RECOMMENDED.
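
The fallback to the Section 3.1.1 bitrates can be sketched as a small lookup. The ranges below are the 20 ms "sweet spots" listed in that section. The dictionary keys, the function name, and the choice of the range midpoint as the target are illustrative assumptions and are not defined by this document.

```python
# Bitrate "sweet spots" from Section 3.1.1 (20 ms frames), in bit/s.
# Key strings and the midpoint policy are assumptions for illustration.
SWEET_SPOTS = {
    "NB speech": (8000, 12000),
    "WB speech": (16000, 20000),
    "FB speech": (28000, 40000),
    "FB mono music": (48000, 64000),
    "FB stereo music": (64000, 128000),
}


def default_target_bitrate(configuration):
    """Pick a target bitrate when the application specifies none.

    Returns the midpoint of the recommended range; any value inside
    the range would satisfy the recommendation equally well.
    """
    low, high = SWEET_SPOTS[configuration]
    return (low + high) // 2
```

An application would still need to cap the result so that the average bitrate does not exceed the available network bandwidth.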

3.1.3. Discontinuous Transmission (DTX)

Opus can, as described in Section 3.1.2, be operated with a variable bitrate. In that case, the encoder will automatically reduce the bitrate for certain input signals, like periods of silence. When using continuous transmission, it will reduce the bitrate when the characteristics of the input signal permit, but it will never interrupt the transmission to the receiver. Therefore, the received signal will maintain the same high level of audio quality over the full duration of a transmission while minimizing the average bitrate over time.

In cases where the bitrate of Opus needs to be reduced even further or in cases where only constant bitrate is available, the Opus encoder can use Discontinuous Transmission (DTX), where parts of the encoded signal that correspond to periods of silence in the input speech or audio signal are not transmitted to the receiver. A receiver can distinguish between DTX and packet loss by looking for gaps in the sequence number, as described by Section 4.1 of [RFC3551].

On the receiving side, the non-transmitted parts will be handled by a frame loss concealment unit in the Opus decoder, which generates a comfort noise signal to replace the non-transmitted parts of the speech or audio signal. Using Comfort Noise as defined in [RFC3389] with Opus is discouraged. The transmitter MUST drop whole frames only, based on the size of the last transmitted frame, to ensure successive RTP timestamps differ by a multiple of 120 and to allow the receiver to use whole frames for concealment.
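
A receiver-side check for telling DTX apart from packet loss might look like the following sketch. It assumes packets are examined in arrival order after any reordering has been resolved, and that the duration of the previous packet is known in 48 kHz ticks. The function name and return labels are hypothetical.

```python
def classify_gap(prev_seq, prev_ts, prev_duration, seq, ts):
    """Classify the interval between two consecutively processed packets.

    prev_duration is the duration of the previous packet in 48 kHz ticks.
    Sequence numbers are 16 bits and timestamps 32 bits, so both deltas
    are computed modulo their field width to survive wraparound.
    """
    seq_delta = (seq - prev_seq) & 0xFFFF
    ts_delta = (ts - prev_ts) & 0xFFFFFFFF
    if seq_delta != 1:
        # A hole in the sequence numbers means packets went missing.
        return "loss"
    if ts_delta > prev_duration:
        # Consecutive sequence numbers with a timestamp jump: the sender
        # deliberately skipped frames (DTX).
        return "dtx"
    return "continuous"
```

In both the "loss" and "dtx" cases the receiver would hand the gap to the decoder's concealment, but only "loss" is evidence of a network problem.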

DTX can be used with both variable and constant bitrate. It will have a slightly lower speech or audio quality than continuous transmission. Therefore, using continuous transmission is RECOMMENDED unless constraints on available network bandwidth are severe.

3.2. Complexity

Complexity of the encoder can be scaled to optimize for CPU resources in real time, mostly as a trade-off between audio quality and bitrate. Also, different modes of Opus have different complexity.

3.3. Forward Error Correction (FEC)

The voice mode of Opus allows for embedding in-band Forward Error Correction (FEC) data into the Opus bitstream. This FEC scheme adds redundant information about the previous packet (N-1) to the current output packet N. For each frame, the encoder decides whether to use FEC based on (1) an externally provided estimate of the channel's packet loss rate; (2) an externally provided estimate of the channel's capacity; (3) the sensitivity of the audio or speech signal to packet loss; and (4) whether the receiving decoder has indicated it can take advantage of in-band FEC information. The decision to send in-band FEC information is entirely controlled by the encoder; therefore, no special precautions for the payload have to be taken.

On the receiving side, the decoder can take advantage of this additional information when it loses a packet and the next packet is available. In order to use the FEC data, the jitter buffer needs to provide access to payloads with the FEC data. Instead of performing loss concealment for a missing packet, the receiver can then configure its decoder to decode the FEC data from the next packet.
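
The receiver-side choice between normal decoding, FEC recovery, and loss concealment can be sketched as follows. The jitter buffer is modeled as a plain mapping from sequence number to payload, and the action labels are hypothetical; a real receiver would pass the selected payload to its Opus decoder with FEC decoding enabled. Whether the next payload actually carries FEC data is left to the decoder in this sketch.

```python
def plan_decode(buffer, seq):
    """Decide how to produce audio for sequence number `seq`.

    `buffer` maps RTP sequence numbers to Opus payloads (bytes).
    Returns a tuple (action, payload).
    """
    if seq in buffer:
        return ("decode", buffer[seq])
    next_seq = (seq + 1) & 0xFFFF  # sequence numbers wrap at 16 bits
    if next_seq in buffer:
        # Packet N+1 may carry redundant data describing packet N.
        return ("decode_fec", buffer[next_seq])
    # Neither the packet nor its successor is available.
    return ("conceal", None)
```

This is why the jitter buffer has to expose the following payload before it is due for playback: the FEC path needs packet N+1 at the moment packet N is being played out.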

Any compliant Opus decoder is capable of ignoring FEC information when it is not needed, so encoding with FEC cannot cause interoperability problems. However, if FEC cannot be used on the receiving side, then FEC SHOULD NOT be used, as it leads to an inefficient usage of network resources. Decoder support for FEC SHOULD be indicated at the time a session is set up.

3.4. Stereo Operation

Opus allows for transmission of stereo audio signals. This operation is signaled in-band in the Opus bitstream and no special arrangement is needed in the payload format. An Opus decoder is capable of handling a stereo encoding, but an application might only be capable of consuming a single audio channel.

If a decoder cannot take advantage of the benefits of a stereo signal, this SHOULD be indicated at the time a session is set up. In that case, the sending side SHOULD NOT send stereo signals as it leads to an inefficient usage of network resources.

4. Opus RTP Payload Format

The payload format for Opus consists of the RTP header and Opus payload data.

4.1. RTP Header Usage

The format of the RTP header is specified in [RFC3550]. The use of the fields of the RTP header by the Opus payload format is consistent with that specification.

The payload length of Opus is an integer number of octets; therefore, no padding is necessary. The payload MAY be padded by an integer number of octets according to [RFC3550], although the Opus internal padding is preferred.

The timestamp, sequence number, and marker bit (M) of the RTP header are used in accordance with Section 4.1 of [RFC3551].

The RTP payload type for Opus is to be assigned dynamically.

The receiving side MUST be prepared to receive duplicate RTP packets. The receiver MUST provide at most one of those payloads to the Opus decoder for decoding, and it MUST discard the others.
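
One way to meet the duplicate-handling requirement is a bounded record of recently seen sequence numbers, as in this sketch. The class name and the window size are assumptions. One filter instance per SSRC is assumed, and the window must stay far below 65536 so that a wrapped sequence number is not mistaken for a duplicate.

```python
from collections import OrderedDict


class DuplicateFilter:
    """Remember recently seen RTP sequence numbers for one stream."""

    def __init__(self, window=512):
        self.window = window
        self.seen = OrderedDict()

    def accept(self, seq):
        """Return True the first time `seq` is seen, False for a duplicate."""
        if seq in self.seen:
            return False
        self.seen[seq] = True
        if len(self.seen) > self.window:
            # Forget the oldest entry so memory stays bounded.
            self.seen.popitem(last=False)
        return True
```

Only payloads for which accept() returns True would be passed on toward the decoder; the rest are discarded.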

Opus supports 5 different audio bandwidths, which can be adjusted during a stream. The RTP timestamp is incremented with a 48000 Hz clock rate for all modes of Opus and all sampling rates. The unit for the timestamp is samples per single (mono) channel. The RTP timestamp corresponds to the sample time of the first encoded sample in the encoded frame. For data encoded with sampling rates other than 48000 Hz, the sampling rate has to be adjusted to 48000 Hz.

4.2. Payload Structure

The Opus encoder can output encoded frames representing 2.5, 5, 10, 20, 40, or 60 ms of speech or audio data. Further, an arbitrary number of frames can be combined into a packet, up to a maximum packet duration representing 120 ms of speech or audio data. The grouping of one or more Opus frames into a single Opus packet is defined in Section 3 of [RFC6716]. An RTP payload MUST contain exactly one Opus packet as defined by that document.
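
The duration of a received payload can be derived from the first one or two octets of the Opus packet. The sketch below relies on the table-of-contents (TOC) layout from Section 3.1 of [RFC6716], which is not reproduced in this document: the top five bits select the configuration and the low two bits select the frame-count code. The function names are hypothetical.

```python
MAX_PACKET_SAMPLES = 5760  # 120 ms at 48 kHz


def frame_samples_48k(config):
    """Frame duration in 48 kHz samples for a TOC configuration number."""
    if config < 12:  # SILK-only: 10, 20, 40, 60 ms
        return (480, 960, 1920, 2880)[config & 3]
    if config < 16:  # Hybrid: 10, 20 ms
        return (480, 960)[config & 1]
    return (120, 240, 480, 960)[config & 3]  # CELT-only: 2.5, 5, 10, 20 ms


def packet_samples_48k(payload):
    """Total duration of one Opus packet in 48 kHz samples."""
    if len(payload) < 1:
        raise ValueError("empty payload")
    toc = payload[0]
    config = toc >> 3
    code = toc & 0x03
    if code == 0:
        frames = 1
    elif code in (1, 2):
        frames = 2
    else:
        if len(payload) < 2:
            raise ValueError("missing frame count byte")
        frames = payload[1] & 0x3F  # low six bits hold the frame count
        if frames == 0:
            raise ValueError("frame count must not be zero")
    total = frames * frame_samples_48k(config)
    if total > MAX_PACKET_SAMPLES:
        raise ValueError("packet exceeds 120 ms")
    return total
```

Because the RTP clock for Opus always runs at 48000 Hz, the returned value is also the amount by which the RTP timestamp advances for that packet.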

Figure 1 shows the structure combined with the RTP header.

                        +----------+--------------+
                        |RTP Header| Opus Payload |
                        +----------+--------------+
        

Figure 1: Packet Structure with RTP Header

Table 2 shows supported frame sizes in milliseconds of encoded speech or audio data for the speech and audio modes (Mode) and sampling rates (fs) of Opus, and it shows how the timestamp is incremented for packetization (ts incr). If the Opus encoder outputs multiple encoded frames into a single packet, the timestamp increment is the sum of the increments for the individual frames.

    +---------+-----------------+-----+-----+-----+-----+------+------+
    |   Mode  |        fs       | 2.5 |  5  |  10 |  20 |  40  |  60  |
    +---------+-----------------+-----+-----+-----+-----+------+------+
    | ts incr |       all       | 120 | 240 | 480 | 960 | 1920 | 2880 |
    |         |                 |     |     |     |     |      |      |
    |  voice  | NB/MB/WB/SWB/FB |  x  |  x  |  o  |  o  |  o   |  o   |
    |         |                 |     |     |     |     |      |      |
    |  audio  |   NB/WB/SWB/FB  |  o  |  o  |  o  |  o  |  x   |  x   |
    +---------+-----------------+-----+-----+-----+-----+------+------+
        

Table 2: Supported Opus frame sizes and timestamp increments are marked with an o. Unsupported ones are marked with an x.
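
The "ts incr" row follows directly from the fixed 48000 Hz clock: each millisecond of audio advances the timestamp by 48 ticks. A sender-side sketch of that arithmetic, including the per-mode restrictions from Table 2, is shown below. The names are illustrative.

```python
RTP_CLOCK_HZ = 48000  # fixed RTP clock rate for all Opus modes

# Frame sizes (ms) marked "o" in Table 2.
SUPPORTED_MS = {
    "voice": (10, 20, 40, 60),
    "audio": (2.5, 5, 10, 20),
}


def ts_increment(mode, frame_ms_list):
    """Timestamp increment for a packet holding the given frames."""
    total = 0
    for ms in frame_ms_list:
        if ms not in SUPPORTED_MS[mode]:
            raise ValueError(f"{ms} ms frames are not supported in {mode} mode")
        total += int(ms * RTP_CLOCK_HZ / 1000)
    if total > 120 * RTP_CLOCK_HZ // 1000:
        raise ValueError("packet duration exceeds 120 ms")
    return total
```

For example, three 20 ms voice frames in one packet advance the timestamp by 2880, the same as a single 60 ms frame.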

5. Congestion Control

The target bitrate of Opus can be adjusted at any point in time, thus allowing efficient congestion control. Furthermore, the amount of encoded speech or audio data encoded in a single packet can be used for congestion control, since the transmission rate is inversely proportional to the packet duration. A lower packet transmission rate reduces the amount of header overhead, but at the same time increases latency and loss sensitivity, so it ought to be used with care.
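
The header-overhead trade-off can be made concrete with a short calculation. The sketch assumes the minimum header sizes for RTP over UDP over IPv4 (20 + 8 + 12 octets, with no IP options, CSRC entries, or header extensions); other stacks will differ.

```python
HEADER_BYTES = 20 + 8 + 12  # IPv4 + UDP + RTP, minimum sizes (assumption)


def packets_per_second(packet_ms):
    """Packet rate is inversely proportional to packet duration."""
    return 1000 / packet_ms


def header_overhead_bps(packet_ms, header_bytes=HEADER_BYTES):
    """Bitrate consumed by headers alone, in bit/s."""
    return header_bytes * 8 * packets_per_second(packet_ms)
```

Under these assumptions, 20 ms packets cost 16000 bit/s in headers, while 60 ms packets cost about 5333 bit/s. The saving comes at the price of 40 ms of extra packetization delay and three times as much audio lost per dropped packet.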

Since UDP does not provide congestion control, applications that use RTP over UDP SHOULD implement their own congestion control above the UDP layer [RFC5405]. Work in the RMCAT working group [rmcat] describes the interactions and conceptual interfaces necessary between the application components that relate to congestion control, including the RTP layer, the higher-level media codec control layer, and the lower-level transport interface, as well as components dedicated to congestion control functions.

6. IANA Considerations

One media subtype (audio/opus) has been defined and registered as described in the following section.

6.1. Opus Media Type Registration

Media type registration is done according to [RFC6838] and [RFC4855].

Type name: audio

Subtype name: opus

Required parameters:

rate: the RTP timestamp is incremented with a 48000 Hz clock rate for all modes of Opus and all sampling rates. For data encoded with sampling rates other than 48000 Hz, the sampling rate has to be adjusted to 48000 Hz.

Optional parameters:

maxplaybackrate: a hint about the maximum output sampling rate that the receiver is capable of rendering in Hz. The decoder MUST be capable of decoding any audio bandwidth, but, due to hardware limitations, only signals up to the specified sampling rate can be played back. Sending signals with higher audio bandwidth results in higher than necessary network usage and encoding complexity, so an encoder SHOULD NOT encode frequencies above the audio bandwidth specified by maxplaybackrate. This parameter can take any value between 8000 and 48000, although commonly the value will match one of the Opus bandwidths (Table 1). By default, the receiver is assumed to have no limitations, i.e., 48000.

sprop-maxcapturerate: a hint about the maximum input sampling rate that the sender is likely to produce. This is not a guarantee that the sender will never send any higher bandwidth (e.g., it could send a prerecorded prompt that uses a higher bandwidth), but it indicates to the receiver that frequencies above this maximum can safely be discarded. This parameter is useful to avoid wasting receiver resources by operating the audio processing pipeline (e.g., echo cancellation) at a higher rate than necessary. This parameter can take any value between 8000 and 48000, although commonly the value will match one of the Opus bandwidths (Table 1). By default, the sender is assumed to have no limitations, i.e., 48000.

maxptime: the maximum duration of media represented by a packet (according to Section 6 of [RFC4566]) that a decoder wants to receive, in milliseconds rounded up to the next full integer value. Possible values are 3, 5, 10, 20, 40, 60, or an arbitrary multiple of an Opus frame size rounded up to the next full integer value, up to a maximum value of 120, as defined in Section 4. If no value is specified, the default is 120.

ptime: the preferred duration of media represented by a packet (according to Section 6 of [RFC4566]) that a decoder wants to receive, in milliseconds rounded up to the next full integer value. Possible values are 3, 5, 10, 20, 40, 60, or an arbitrary multiple of an Opus frame size rounded up to the next full integer value, up to a maximum value of 120, as defined in Section 4. If no value is specified, the default is 20.

maxaveragebitrate: specifies the maximum average receive bitrate of a session in bits per second (bit/s). The actual value of the bitrate can vary, as it is dependent on the characteristics of the media in a packet. Note that the maximum average bitrate MAY be modified dynamically during a session. Any positive integer is allowed, but values outside the range 6000 to 510000 SHOULD be ignored. If no value is specified, the maximum value specified in Section 3.1.1 for the corresponding mode of Opus and corresponding maxplaybackrate is the default.

stereo: specifies whether the decoder prefers receiving stereo or mono signals. Possible values are 1 and 0, where 1 specifies that stereo signals are preferred, and 0 specifies that only mono signals are preferred. Independent of the stereo parameter, every receiver MUST be able to receive and decode stereo signals, but sending stereo signals to a receiver that signaled a preference for mono signals may result in higher than necessary network utilization and encoding complexity. If no value is specified, the default is 0 (mono).

sprop-stereo: specifies whether the sender is likely to produce stereo audio. Possible values are 1 and 0, where 1 specifies that stereo signals are likely to be sent, and 0 specifies that the sender will likely only send mono. This is not a guarantee that the sender will never send stereo audio (e.g., it could send a prerecorded prompt that uses stereo), but it indicates to the receiver that the received signal can be safely downmixed to mono. This parameter is useful to avoid wasting receiver resources by operating the audio processing pipeline (e.g., echo cancellation) in stereo when not necessary. If no value is specified, the default is 0 (mono).

cbr: specifies if the decoder prefers the use of a constant bitrate versus a variable bitrate. Possible values are 1 and 0, where 1 specifies constant bitrate, and 0 specifies variable bitrate. If no value is specified, the default is 0 (vbr). When cbr is 1, the maximum average bitrate can still change, e.g., to adapt to changing network conditions.

useinbandfec: specifies that the decoder has the capability to take advantage of the Opus in-band FEC. Possible values are 1 and 0. Providing 0 when FEC cannot be used on the receiving side is RECOMMENDED. If no value is specified, useinbandfec is assumed to be 0. This parameter is only a preference, and the receiver MUST be able to process packets that include FEC information, even if it means the FEC part is discarded.

usedtx: specifies if the decoder prefers the use of DTX. Possible values are 1 and 0. If no value is specified, the default is 0.
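
As an illustration of the defaults listed above, the following Python sketch parses the parameter portion of an "a=fmtp" line and fills in the default of 0 for any of the five 0/1 parameters that is absent. The names OPUS_FLAG_DEFAULTS and parse_opus_flags are hypothetical, and keeping the default when a value is neither 0 nor 1 is an assumption of this sketch, not a requirement of this registration.

```python
# Defaults for the 0/1 parameters described above (all default to 0).
# The dictionary and function names are illustrative assumptions.
OPUS_FLAG_DEFAULTS = {
    "stereo": 0,
    "sprop-stereo": 0,
    "cbr": 0,
    "useinbandfec": 0,
    "usedtx": 0,
}


def parse_opus_flags(fmtp_params: str) -> dict:
    """Parse 'name=value; name=value' and apply the documented defaults."""
    result = dict(OPUS_FLAG_DEFAULTS)
    for item in fmtp_params.split(";"):
        name, sep, value = item.strip().partition("=")
        name, value = name.strip(), value.strip()
        if not sep or name not in OPUS_FLAG_DEFAULTS:
            continue  # parameters outside this sketch are skipped
        if value in ("0", "1"):  # assumption: other values keep the default
            result[name] = int(value)
    return result


if __name__ == "__main__":
    print(parse_opus_flags("stereo=1; useinbandfec=1; usedtx=0"))
```

Because every one of these parameters is only a preference, a receiver still has to decode whatever arrives; the parsed values merely guide what the sender is asked to produce.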

Encoding considerations:

The Opus media type is framed and consists of binary data according to Section 4.8 of [RFC6838].

Security considerations:

See Section 8 of this document.

Interoperability considerations: none

Published specification: RFC 7587

Applications that use this media type:

Any application that requires the transport of speech or audio data can use this media type. Examples include, but are not limited to, audio and video conferencing, Voice over IP, and media streaming.

Fragment identifier considerations: N/A

Person & email address to contact for further information:

SILK Support, silksupport@skype.net

Jean-Marc Valin, jmvalin@jmvalin.ca

Intended usage: COMMON

Restrictions on usage:

For transfer over RTP, the RTP payload format (Section 4 of this document) SHALL be used.

Authors:

Julian Spittka, jspittka@gmail.com

Koen Vos, koenvos74@gmail.com

Jean-Marc Valin, jmvalin@jmvalin.ca

Change controller: IETF Payload working group delegated from the IESG

7. SDP Considerations

The information described in the media type specification has a specific mapping to fields in the Session Description Protocol (SDP) [RFC4566], which is commonly used to describe RTP sessions. When SDP is used to specify sessions employing Opus, the mapping is as follows:

o The media type ("audio") goes in SDP "m=" as the media name.

o The media subtype ("opus") goes in SDP "a=rtpmap" as the encoding name. The RTP clock rate in "a=rtpmap" MUST be 48000, and the number of channels MUST be 2.

o The OPTIONAL media type parameters "ptime" and "maxptime" are mapped to "a=ptime" and "a=maxptime" attributes, respectively, in the SDP.

o The OPTIONAL media type parameters "maxaveragebitrate", "maxplaybackrate", "stereo", "cbr", "useinbandfec", and "usedtx", when present, MUST be included in the "a=fmtp" attribute in the SDP, expressed as a media type string in the form of a semicolon-separated list of parameter=value pairs (e.g., maxplaybackrate=48000). They MUST NOT be specified in an SSRC-specific "fmtp" source-level attribute (as defined in Section 6.3 of [RFC5576]).

o The OPTIONAL media type parameters "sprop-maxcapturerate" and "sprop-stereo" MAY be mapped to the "a=fmtp" SDP attribute by copying them directly from the media type parameter string as part of the semicolon-separated list of parameter=value pairs (e.g., sprop-stereo=1). These same OPTIONAL media type parameters MAY also be specified using an SSRC-specific "fmtp" source-level attribute as described in Section 6.3 of [RFC5576]. They MAY be specified in both places, in which case the parameter in the source-level attribute overrides the one found on the "a=fmtp" line. The value of any parameter that is not specified in a source-level attribute MUST be taken from the "a=fmtp" line, if it is present there.

Below are some examples of SDP session descriptions for Opus:

Example 1: Standard mono session with 48000 Hz clock rate

       m=audio 54312 RTP/AVP 101
       a=rtpmap:101 opus/48000/2
        

Example 2: 16000 Hz clock rate, maximum packet size of 40 ms, recommended packet size of 40 ms, maximum average bitrate of 20000 bit/s, prefers to receive stereo but only plans to send mono, FEC is desired, DTX is not desired

       m=audio 54312 RTP/AVP 101
       a=rtpmap:101 opus/48000/2
       a=fmtp:101 maxplaybackrate=16000; sprop-maxcapturerate=16000;
       maxaveragebitrate=20000; stereo=1; useinbandfec=1; usedtx=0
       a=ptime:40
       a=maxptime:40
        

Example 3: Two-way full-band stereo preferred

       m=audio 54312 RTP/AVP 101
       a=rtpmap:101 opus/48000/2
       a=fmtp:101 stereo=1; sprop-stereo=1
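
The mapping rules and examples in this section can also be expressed programmatically. The Python sketch below assembles the SDP lines for an Opus media description and resolves a "sprop-" parameter that may appear both on the "a=fmtp" line and in a source-level attribute. The function names build_opus_sdp and effective_sprop, and the use of plain dictionaries for parameters, are hypothetical choices made for illustration only.

```python
def build_opus_sdp(port, payload_type, fmtp=None, ptime=None, maxptime=None):
    """Return the SDP lines for one Opus media description.

    The rtpmap line always uses a clock rate of 48000 and 2 channels,
    as required by the mapping rules above.
    """
    lines = [
        f"m=audio {port} RTP/AVP {payload_type}",
        f"a=rtpmap:{payload_type} opus/48000/2",
    ]
    if fmtp:
        # Semicolon-separated list of parameter=value pairs.
        params = "; ".join(f"{name}={value}" for name, value in fmtp.items())
        lines.append(f"a=fmtp:{payload_type} {params}")
    if ptime is not None:
        lines.append(f"a=ptime:{ptime}")
    if maxptime is not None:
        lines.append(f"a=maxptime:{maxptime}")
    return lines


def effective_sprop(name, fmtp, source_level, default=0):
    """Resolve a sprop parameter: the source-level value overrides a=fmtp."""
    if name in source_level:
        return source_level[name]
    return fmtp.get(name, default)


if __name__ == "__main__":
    # Reproduces Example 3 above.
    for line in build_opus_sdp(54312, 101, {"stereo": 1, "sprop-stereo": 1}):
        print(line)
```

Keeping the clock rate and channel count fixed inside the helper reflects the rule that these two "a=rtpmap" values are not negotiable; everything else is passed in by the caller.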
        
7.1. SDP Offer/Answer Considerations

When using the offer/answer procedure described in [RFC3264] to negotiate the use of Opus, the following considerations apply:

o Opus supports several clock rates. For signaling purposes, only the highest, i.e., 48000, is used. The actual clock rate of the corresponding media is signaled inside the payload and is not restricted by this payload format description. The decoder MUST be capable of decoding every received clock rate. An example is shown below:

       m=audio 54312 RTP/AVP 100
       a=rtpmap:100 opus/48000/2
        

o The "ptime" and "maxptime" parameters are unidirectional receive-only parameters and typically will not compromise interoperability; however, some values might cause application performance to suffer. [RFC3264] defines the SDP offer/answer handling of the "ptime" parameter. The "maxptime" parameter MUST be handled in the same way.

o The "maxplaybackrate" parameter is a unidirectional receive-only parameter that reflects limitations of the local receiver. When sending to a single destination, a sender MUST NOT use an audio bandwidth higher than necessary to make full use of audio sampled at a sampling rate of "maxplaybackrate". Gateways or senders that are sending the same encoded audio to multiple destinations SHOULD NOT use an audio bandwidth higher than necessary to represent audio sampled at "maxplaybackrate", as this would lead to inefficient use of network resources. The "maxplaybackrate" parameter does not affect interoperability. Also, this parameter SHOULD NOT be used to adjust the audio bandwidth as a function of the bitrate, as this is the responsibility of the Opus encoder implementation.

o The "maxaveragebitrate" parameter is a unidirectional receive-only parameter that reflects limitations of the local receiver. The sender of the other side MUST NOT send with an average bitrate higher than "maxaveragebitrate" as it might overload the network and/or receiver. The "maxaveragebitrate" parameter typically will not compromise interoperability; however, some values might cause application performance to suffer and ought to be set with care.

o The "sprop-maxcapturerate" and "sprop-stereo" parameters are unidirectional sender-only parameters that reflect limitations of the sender side. They allow the receiver to set up a reduced-complexity audio processing pipeline if the sender is not planning to use the full range of Opus's capabilities. Neither "sprop-maxcapturerate" nor "sprop-stereo" affects interoperability, and the receiver MUST be capable of receiving any signal.

o The "stereo" parameter is a unidirectional receive-only parameter. When sending to a single destination, a sender MUST NOT use stereo when "stereo" is 0. Gateways or senders that are sending the same encoded audio to multiple destinations SHOULD NOT use stereo when "stereo" is 0, as this would lead to inefficient use of network resources. The "stereo" parameter does not affect interoperability.

o The "cbr" parameter is a unidirectional receive-only parameter.

o The "useinbandfec" parameter is a unidirectional receive-only parameter.

o The "usedtx" parameter is a unidirectional receive-only parameter.

o Any unknown parameter in an offer MUST be ignored by the receiver and MUST be removed from the answer.

The Opus parameters in an SDP offer/answer exchange are completely orthogonal, and there is no relationship between the SDP offer and the answer.
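
The receive-only parameters above translate into limits that a sender applies when encoding for a single destination. The Python sketch below is one possible reading, not a normative algorithm: it assumes the remote "a=fmtp" parameters have already been parsed into a dictionary, assumes a default "maxplaybackrate" of 48000 (no limitation) when the parameter is absent, and picks the widest Opus audio bandwidth whose sampling rate does not exceed "maxplaybackrate". The names KNOWN_PARAMS, BANDWIDTHS, drop_unknown_params, and sender_limits are hypothetical.

```python
# Parameters named in this document's "a=fmtp" mapping.
KNOWN_PARAMS = {
    "maxplaybackrate", "sprop-maxcapturerate", "maxaveragebitrate",
    "stereo", "sprop-stereo", "cbr", "useinbandfec", "usedtx",
}

# (sampling rate in Hz, Opus audio bandwidth abbreviation), lowest first.
BANDWIDTHS = [(8000, "NB"), (12000, "MB"), (16000, "WB"),
              (24000, "SWB"), (48000, "FB")]


def drop_unknown_params(offered: dict) -> dict:
    """Ignore unknown offered parameters so they are not echoed in the answer."""
    return {k: v for k, v in offered.items() if k in KNOWN_PARAMS}


def sender_limits(remote: dict) -> dict:
    """Derive sender-side limits from the remote receive-only parameters."""
    rate = int(remote.get("maxplaybackrate", 48000))  # assumed default
    bandwidth = "NB"
    for hz, name in BANDWIDTHS:
        if hz <= rate:
            bandwidth = name  # widest bandwidth not exceeding the limit
    bitrate = remote.get("maxaveragebitrate")
    return {
        "max_bandwidth": bandwidth,
        "allow_stereo": int(remote.get("stereo", 0)) == 1,
        "max_average_bitrate": None if bitrate is None else int(bitrate),
    }


if __name__ == "__main__":
    offer = {"maxplaybackrate": "16000", "stereo": "1",
             "maxaveragebitrate": "20000", "x-unknown": "1"}
    print(sender_limits(drop_unknown_params(offer)))
```

Each direction is evaluated separately: the limits computed from the remote description constrain only what the local side sends, which matches the statement that the offer and the answer are independent.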

7.2. Declarative SDP Considerations for Opus

For declarative use of SDP such as in the Session Announcement Protocol (SAP) [RFC2974] and the Real Time Streaming Protocol (RTSP) [RFC2326] for Opus, the following needs to be considered:

o The values for "maxptime", "ptime", "maxplaybackrate", and "maxaveragebitrate" ought to be selected carefully to ensure that a reasonable performance can be achieved for the participants of a session.

o The values for "maxptime" and "ptime" in the payload format configuration are recommendations by the decoding side to ensure the best performance for the decoder.

o All other parameters of the payload format configuration are declarative and a participant MUST use the configurations that are provided for the session. More than one configuration can be provided if necessary by declaring multiple RTP payload types; however, the number of types ought to be kept small.

8. Security Considerations

Use of VBR is subject to the security considerations in [RFC6562].

RTP packets using the payload format defined in this specification are subject to the security considerations discussed in the RTP specification [RFC3550] and in any applicable RTP profile such as RTP/AVP [RFC3551], RTP/AVPF [RFC4585], RTP/SAVP [RFC3711], or RTP/SAVPF [RFC5124]. However, as "Securing the RTP Framework: Why RTP Does Not Mandate a Single Media Security Solution" [RFC7202] discusses, it is not an RTP payload format's responsibility to discuss or mandate what solutions are used to meet the basic security goals like confidentiality, integrity, and source authenticity for RTP in general. This responsibility lies on anyone using RTP in an application. They can find guidance on available security mechanisms and important considerations in "Options for Securing RTP Sessions" [RFC7201]. Applications SHOULD use one or more appropriate strong security mechanisms.

This payload format and the Opus encoding do not exhibit any significant non-uniformity in the receiver-end computational load and thus are unlikely to pose a denial-of-service threat due to the receipt of pathological datagrams.

9. References
9.1. Normative References

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, <http://www.rfc-editor.org/info/rfc2119>.

[RFC2326] Schulzrinne, H., Rao, A., and R. Lanphier, "Real Time Streaming Protocol (RTSP)", RFC 2326, DOI 10.17487/RFC2326, April 1998, <http://www.rfc-editor.org/info/rfc2326>.

[RFC3264] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model with Session Description Protocol (SDP)", RFC 3264, DOI 10.17487/RFC3264, June 2002, <http://www.rfc-editor.org/info/rfc3264>.

[RFC3389] Zopf, R., "Real-time Transport Protocol (RTP) Payload for Comfort Noise (CN)", RFC 3389, DOI 10.17487/RFC3389, September 2002, <http://www.rfc-editor.org/info/rfc3389>.

[RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V. Jacobson, "RTP: A Transport Protocol for Real-Time Applications", STD 64, RFC 3550, DOI 10.17487/RFC3550, July 2003, <http://www.rfc-editor.org/info/rfc3550>.

[RFC3551] Schulzrinne, H. and S. Casner, "RTP Profile for Audio and Video Conferences with Minimal Control", STD 65, RFC 3551, DOI 10.17487/RFC3551, July 2003, <http://www.rfc-editor.org/info/rfc3551>.

[RFC3711] Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K. Norrman, "The Secure Real-time Transport Protocol (SRTP)", RFC 3711, DOI 10.17487/RFC3711, March 2004, <http://www.rfc-editor.org/info/rfc3711>.

[RFC4566] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session Description Protocol", RFC 4566, DOI 10.17487/RFC4566, July 2006, <http://www.rfc-editor.org/info/rfc4566>.

[RFC4855] Casner, S., "Media Type Registration of RTP Payload Formats", RFC 4855, DOI 10.17487/RFC4855, February 2007, <http://www.rfc-editor.org/info/rfc4855>.

[RFC5576] Lennox, J., Ott, J., and T. Schierl, "Source-Specific Media Attributes in the Session Description Protocol (SDP)", RFC 5576, DOI 10.17487/RFC5576, June 2009, <http://www.rfc-editor.org/info/rfc5576>.

[RFC6562] Perkins, C. and JM. Valin, "Guidelines for the Use of Variable Bit Rate Audio with Secure RTP", RFC 6562, DOI 10.17487/RFC6562, March 2012, <http://www.rfc-editor.org/info/rfc6562>.

[RFC6716] Valin, JM., Vos, K., and T. Terriberry, "Definition of the Opus Audio Codec", RFC 6716, DOI 10.17487/RFC6716, September 2012, <http://www.rfc-editor.org/info/rfc6716>.

[RFC6838] Freed, N., Klensin, J., and T. Hansen, "Media Type Specifications and Registration Procedures", BCP 13, RFC 6838, DOI 10.17487/RFC6838, January 2013, <http://www.rfc-editor.org/info/rfc6838>.

9.2. Informative References

[RFC2974] Handley, M., Perkins, C., and E. Whelan, "Session Announcement Protocol", RFC 2974, DOI 10.17487/RFC2974, October 2000, <http://www.rfc-editor.org/info/rfc2974>.

[RFC4585] Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey, "Extended RTP Profile for Real-time Transport Control Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585, DOI 10.17487/RFC4585, July 2006, <http://www.rfc-editor.org/info/rfc4585>.

[RFC5124] Ott, J. and E. Carrara, "Extended Secure RTP Profile for Real-time Transport Control Protocol (RTCP)-Based Feedback (RTP/SAVPF)", RFC 5124, DOI 10.17487/RFC5124, February 2008, <http://www.rfc-editor.org/info/rfc5124>.

[RFC5405] Eggert, L. and G. Fairhurst, "Unicast UDP Usage Guidelines for Application Designers", BCP 145, RFC 5405, DOI 10.17487/RFC5405, November 2008, <http://www.rfc-editor.org/info/rfc5405>.

[RFC7201] Westerlund, M. and C. Perkins, "Options for Securing RTP Sessions", RFC 7201, DOI 10.17487/RFC7201, April 2014, <http://www.rfc-editor.org/info/rfc7201>.

[RFC7202] Perkins, C. and M. Westerlund, "Securing the RTP Framework: Why RTP Does Not Mandate a Single Media Security Solution", RFC 7202, DOI 10.17487/RFC7202, April 2014, <http://www.rfc-editor.org/info/rfc7202>.

[rmcat] "RTP Media Congestion Avoidance Techniques (rmcat) Documents", <https://datatracker.ietf.org/wg/rmcat/documents/>.

Acknowledgements

Many people have made useful comments and suggestions contributing to this document. In particular, we would like to thank Tina le Grand, Cullen Jennings, Jonathan Lennox, Gregory Maxwell, Colin Perkins, Jan Skoglund, Timothy B. Terriberry, Martin Thompson, Justin Uberti, Magnus Westerlund, and Mo Zanaty.

Authors' Addresses

Julian Spittka

   Email: jspittka@gmail.com
        

Koen Vos
vocTone

   Email: koenvos74@gmail.com
        

Jean-Marc Valin
Mozilla
331 E. Evelyn Avenue
Mountain View, CA 94041
United States

   Email: jmvalin@jmvalin.ca
        