Internet Engineering Task Force (IETF)                        Y.-K. Wang
Request for Comments: 6184                                       R. Even
Obsoletes: 3984                                      Huawei Technologies
Category: Standards Track                                  T. Kristensen
ISSN: 2070-1721                                                 Tandberg
                                                                R. Jesup
                                                WorldGate Communications
                                                                May 2011
        

RTP Payload Format for H.264 Video

Abstract

This memo describes an RTP payload format for the ITU-T Recommendation H.264 video codec and the technically identical ISO/IEC International Standard 14496-10 video codec, excluding the Scalable Video Coding (SVC) extension and the Multiview Video Coding extension, for which the RTP payload formats are defined elsewhere. The RTP payload format allows for packetization of one or more Network Abstraction Layer Units (NALUs), produced by an H.264 video encoder, in each RTP payload. The payload format has wide applicability, as it supports applications from simple low bitrate conversational usage, to Internet video streaming with interleaved transmission, to high bitrate video-on-demand.

This memo obsoletes RFC 3984. Changes from RFC 3984 are summarized in Section 14. Issues on backward compatibility to RFC 3984 are discussed in Section 15.

Status of This Memo

This is an Internet Standards Track document.

This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Further information on Internet Standards is available in Section 2 of RFC 5741.

Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at http://www.rfc-editor.org/info/rfc6184.

Copyright Notice

Copyright (c) 2011 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

   1. Introduction
      1.1. The H.264 Codec
      1.2. Parameter Set Concept
      1.3. Network Abstraction Layer Unit Types
   2. Conventions
   3. Scope
   4. Definitions and Abbreviations
      4.1. Definitions
      4.2. Abbreviations
   5. RTP Payload Format
      5.1. RTP Header Usage
      5.2. Payload Structures
      5.3. NAL Unit Header Usage
      5.4. Packetization Modes
      5.5. Decoding Order Number (DON)
      5.6. Single NAL Unit Packet
      5.7. Aggregation Packets
           5.7.1. Single-Time Aggregation Packet (STAP)
           5.7.2. Multi-Time Aggregation Packets (MTAPs)
      5.8. Fragmentation Units (FUs)
   6. Packetization Rules
      6.1. Common Packetization Rules
      6.2. Single NAL Unit Mode
      6.3. Non-Interleaved Mode
      6.4. Interleaved Mode
   7. De-Packetization Process
      7.1. Single NAL Unit and Non-Interleaved Mode
      7.2. Interleaved Mode
           7.2.1. Size of the De-Interleaving Buffer
           7.2.2. De-Interleaving Process
      7.3. Additional De-Packetization Guidelines
   8. Payload Format Parameters
      8.1. Media Type Registration
      8.2. SDP Parameters
           8.2.1. Mapping of Payload Type Parameters to SDP
           8.2.2. Usage with the SDP Offer/Answer Model
           8.2.3. Usage in Declarative Session Descriptions
      8.3. Examples
      8.4. Parameter Set Considerations
      8.5. Decoder Refresh Point Procedure Using In-Band
           Transport of Parameter Sets (Informative)
           8.5.1. IDR Procedure to Respond to a Request for
                  a Decoder Refresh Point
           8.5.2. Gradual Recovery Procedure to Respond to
                  a Request for a Decoder Refresh Point
   9. Security Considerations
   10. Congestion Control
   11. IANA Considerations
   12. Informative Appendix: Application Examples
      12.1. Video Telephony According to Annex A of ITU-T
            Recommendation H.241
      12.2. Video Telephony, No Slice Data Partitioning, No
            NAL Unit Aggregation
      12.3. Video Telephony, Interleaved Packetization Using
            NAL Unit Aggregation
      12.4. Video Telephony with Data Partitioning
      12.5. Video Telephony or Streaming with FUs and Forward
            Error Correction
      12.6. Low Bitrate Streaming
      12.7. Robust Packet Scheduling in Video Streaming
   13. Informative Appendix: Rationale for Decoding Order Number
      13.1. Introduction
      13.2. Example of Multi-Picture Slice Interleaving
      13.3. Example of Robust Packet Scheduling
      13.4. Robust Transmission Scheduling of Redundant Coded
            Slices
      13.5. Remarks on Other Design Possibilities
   14. Changes from RFC 3984
   15. Backward Compatibility to RFC 3984
   16. Acknowledgements
   17. References
      17.1. Normative References
      17.2. Informative References
        
1. Introduction

This memo specifies an RTP payload format for the video coding standard known as ITU-T Recommendation H.264 [1] and ISO/IEC International Standard 14496-10 [2] (both also known as Advanced Video Coding (AVC)). In this memo, the name H.264 is used for the codec and the standard, but this memo is equally applicable to the ISO/IEC counterpart of the coding standard.

This memo obsoletes RFC 3984. Changes from RFC 3984 are summarized in Section 14. Issues on backward compatibility to RFC 3984 are discussed in Section 15.

1.1. The H.264 Codec

The H.264 video codec has a very broad application range that covers all forms of digital compressed video, from low bitrate Internet streaming applications to HDTV broadcast and Digital Cinema applications with nearly lossless coding. Compared to the current state of technology, bitrate savings of 50% or more are reported for H.264. Digital Satellite TV quality, for example, was reported to be achievable at 1.5 Mbit/s, compared to the current operation point of MPEG-2 video at around 3.5 Mbit/s [10].

The codec specification [1] itself conceptually distinguishes between a Video Coding Layer (VCL) and a Network Abstraction Layer (NAL). The VCL contains the signal processing functionality of the codec: mechanisms such as transform, quantization, and motion-compensated prediction, as well as a loop filter. It follows the general concept of most of today's video codecs: a macroblock-based coder that uses inter picture prediction with motion compensation and transform coding of the residual signal. The VCL encoder outputs slices. Each slice is a bit string that contains the macroblock data of an integer number of macroblocks and the information of the slice header (containing the spatial address of the first macroblock in the slice, the initial quantization parameter, and similar information). Macroblocks in slices are arranged in scan order unless a different macroblock allocation is specified using the syntax of slice groups. In-picture prediction is used only within a slice. More information is provided in [10].

The NAL encoder encapsulates the slice output of the VCL encoder into Network Abstraction Layer Units (NALUs), which are suitable for transmission over packet networks or for use in packet-oriented multiplex environments. Annex B of H.264 defines an encapsulation process to transmit such NALUs over bytestream-oriented networks. In the scope of this memo, Annex B is not relevant.

Internally, the NAL uses NAL units. A NAL unit consists of a one-byte header and the payload byte string. The header indicates the type of the NAL unit, the (potential) presence of bit errors or syntax violations in the NAL unit payload, and information regarding the relative importance of the NAL unit for the decoding process. This RTP payload specification is designed to be unaware of the bit string in the NAL unit payload.

One of the main properties of H.264 is the complete decoupling of the transmission time, the decoding time, and the sampling or presentation time of slices and pictures. The decoding process specified in H.264 is unaware of time, and the H.264 syntax does not carry information such as the number of skipped frames (as is common in the form of the Temporal Reference in earlier video compression standards). Also, there are NAL units that affect many pictures and that are, therefore, inherently timeless. For this reason, the handling of the RTP timestamp requires some special considerations for NAL units for which the sampling or presentation time is not defined or, at transmission time, is unknown.

1.2. Parameter Set Concept

A fundamental design concept of H.264 is to generate self-contained packets, making mechanisms such as the header duplication of RFC 4629 [11] or MPEG-4 Visual's Header Extension Code (HEC) [12] unnecessary. This was achieved by decoupling information relevant to more than one slice from the media stream. This higher-layer meta information should be sent reliably, asynchronously, and in advance of the RTP packet stream that contains the slice packets. (Provisions for sending this information in-band are also available for applications that do not have an out-of-band transport channel appropriate for the purpose.) The combination of the higher-level parameters is called a parameter set. The H.264 specification includes two types of parameter sets: sequence parameter sets and picture parameter sets. An active sequence parameter set remains unchanged throughout a coded video sequence, and an active picture parameter set remains unchanged within a coded picture. The sequence and picture parameter set structures contain information such as picture size, optional coding modes employed, and macroblock to slice group map.

To be able to change picture parameters (such as the picture size) without having to transmit parameter set updates synchronously to the slice packet stream, the encoder and decoder can maintain a list of more than one sequence and picture parameter set. Each slice header contains a codeword that indicates the sequence and picture parameter set to be used.

This mechanism allows the transmission of parameter sets to be decoupled from the packet stream, so that they can be transmitted by external means (e.g., as a side effect of the capability exchange) or through a (reliable or unreliable) control protocol. It may even be possible that they are never transmitted but are fixed by an application design specification.

1.3. Network Abstraction Layer Unit Types

Tutorial information on the NAL design can be found in [13], [14], and [15].

Every NAL unit starts with a single NAL unit type octet, which also serves as the payload header of this RTP payload format. The payload of the NAL unit follows this octet.

The syntax and semantics of the NAL unit type octet are specified in [1], but the essential properties of the NAL unit type octet are summarized below. The NAL unit type octet has the following format:

      +---------------+
      |0|1|2|3|4|5|6|7|
      +-+-+-+-+-+-+-+-+
      |F|NRI|  Type   |
      +---------------+
        

The semantics of the components of the NAL unit type octet, as specified in the H.264 specification, are described briefly below.

F: 1 bit forbidden_zero_bit. The H.264 specification declares a value of 1 as a syntax violation.

NRI: 2 bits nal_ref_idc. A value of 00 indicates that the content of the NAL unit is not used to reconstruct reference pictures for inter picture prediction. Such NAL units can be discarded without risking the integrity of the reference pictures. Values greater than 00 indicate that the decoding of the NAL unit is required to maintain the integrity of the reference pictures.

Type: 5 bits nal_unit_type. This component specifies the NAL unit payload type as defined in Table 7-1 of [1] and later within this memo. For a reference of all currently defined NAL unit types and their semantics, please refer to Section 7.4.1 in [1].
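
Informative example: the field layout described above can be sketched as a small bit-unpacking routine. This is a minimal illustration only, assuming bit 0 in the diagram is the most significant bit of the octet. The names NalHeader and parse_nal_header are hypothetical and are not defined by this memo or by [1].

```python
from typing import NamedTuple


class NalHeader(NamedTuple):
    """Fields of the NAL unit type octet (hypothetical helper type)."""
    forbidden_zero_bit: int  # F, 1 bit
    nal_ref_idc: int         # NRI, 2 bits
    nal_unit_type: int       # Type, 5 bits


def parse_nal_header(octet: int) -> NalHeader:
    """Split one NAL unit type octet into F, NRI, and Type."""
    if not 0 <= octet <= 0xFF:
        raise ValueError("NAL unit type octet must fit in one byte")
    return NalHeader(
        forbidden_zero_bit=(octet >> 7) & 0x01,  # bit 0 of the diagram
        nal_ref_idc=(octet >> 5) & 0x03,         # bits 1-2
        nal_unit_type=octet & 0x1F,              # bits 3-7
    )


if __name__ == "__main__":
    header = parse_nal_header(0x65)
    print(header.forbidden_zero_bit, header.nal_ref_idc, header.nal_unit_type)
```

For the octet 0x65 (binary 0110 0101), the sketch yields F = 0, NRI = 3, and Type = 5.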

This memo introduces new NAL unit types, which are presented in Section 5.2. The NAL unit types defined in this memo are marked as unspecified in [1]. Moreover, this specification extends the semantics of F and NRI as described in Section 5.3.

2. Conventions

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [4].

This specification uses the notion of setting and clearing a bit when bit fields are handled. Setting a bit is the same as assigning that bit the value of 1 (On). Clearing a bit is the same as assigning that bit the value of 0 (Off).

3. Scope

This payload specification can only be used to carry the "naked" H.264 NAL unit stream over RTP and not the bitstream format discussed in Annex B of H.264. Likely, the first applications of this specification will be in the conversational multimedia field, video telephony or video conferencing, but the payload format also covers other applications, such as Internet streaming and TV over IP.

4. Definitions and Abbreviations
4.1. Definitions

This document uses the definitions of [1]. The following terms, defined in [1], are summed up for convenience:

access unit: A set of NAL units always containing a primary coded picture. In addition to the primary coded picture, an access unit may also contain one or more redundant coded pictures or other NAL units not containing slices or slice data partitions of a coded picture. The decoding of an access unit always results in a decoded picture.

coded video sequence: A sequence of access units that consists, in decoding order, of an instantaneous decoding refresh (IDR) access unit followed by zero or more non-IDR access units including all subsequent access units up to but not including any subsequent IDR access unit.

IDR access unit: An access unit in which the primary coded picture is an IDR picture.

IDR picture: A coded picture containing only slices with I or SI slice types that causes a "reset" in the decoding process. After the decoding of an IDR picture, all following coded pictures in decoding order can be decoded without inter prediction from any picture decoded prior to the IDR picture.

primary coded picture: The coded representation of a picture to be used by the decoding process for a bitstream conforming to H.264. The primary coded picture contains all macroblocks of the picture.

redundant coded picture: A coded representation of a picture or a part of a picture. The content of a redundant coded picture shall not be used by the decoding process for a bitstream conforming to H.264. The content of a redundant coded picture may be used by the decoding process for a bitstream that contains errors or losses.

VCL NAL unit: A collective term used to refer to coded slice and coded data partition NAL units.

In addition, the following definitions apply:

decoding order number (DON): A field in the payload structure or a derived variable indicating NAL unit decoding order. Values of DON are in the range of 0 to 65535, inclusive. After reaching the maximum value, the value of DON wraps around to 0.
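
Informative example: the wrap-around of DON values can be sketched with 16-bit modular arithmetic. This is a minimal illustration, assuming consecutive DON values simply increase by one. The names DON_MODULUS and next_don are hypothetical and are not defined by this memo.

```python
DON_MODULUS = 1 << 16  # DON values lie in the range 0..65535, inclusive


def next_don(don: int) -> int:
    """Return the DON that follows `don`, wrapping from 65535 to 0."""
    if not 0 <= don < DON_MODULUS:
        raise ValueError("DON must be in the range 0..65535")
    return (don + 1) % DON_MODULUS


if __name__ == "__main__":
    print(next_don(65534), next_don(65535))
```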

NAL unit decoding order: A NAL unit order that conforms to the constraints on NAL unit order given in Section 7.4.1.2 in [1].

NALU-time: The value that the RTP timestamp would have if the NAL unit were transported in its own RTP packet.

transmission order: The order of packets in ascending RTP sequence number order (in modulo arithmetic). Within an aggregation packet, the NAL unit transmission order is the same as the order of appearance of NAL units in the packet.

media-aware network element (MANE): A network element, such as a middlebox or application layer gateway, that is capable of parsing certain aspects of the RTP payload headers or the RTP payload and reacting to the contents.

Informative note: The concept of a MANE goes beyond normal routers or gateways in that a MANE has to be aware of the signaling (e.g., to learn about the payload type mappings of the media streams) and that it has to be trusted when working with Secure Real-time Transport Protocol (SRTP). The advantage of using MANEs is that they allow packets to be dropped according to the needs of the media coding. For example, if a MANE has to drop packets due to congestion on a certain link, it can identify and remove those packets whose elimination produces the least adverse effect on the user experience.
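
Informative example: one way a MANE might rank packets for dropping is sketched below. This is an illustration under stated assumptions, not a procedure defined by this memo. It assumes that every payload is a single NAL unit packet whose first octet is the NAL unit type octet, and that nal_ref_idc (NRI) alone decides the drop order. The function name drop_order is hypothetical.

```python
from typing import List, Sequence


def drop_order(payloads: Sequence[bytes]) -> List[int]:
    """Return payload indices ordered from first-to-drop to last-to-drop.

    Assumption: a lower nal_ref_idc means the packet can be dropped with
    less harm; packets with NRI == 0 are not used for reference pictures.
    """
    def nri(payload: bytes) -> int:
        if not payload:
            raise ValueError("empty RTP payload")
        return (payload[0] >> 5) & 0x03  # bits 1-2 of the first octet

    # sorted() is stable, so packets with equal NRI keep arrival order.
    return sorted(range(len(payloads)), key=lambda i: nri(payloads[i]))


if __name__ == "__main__":
    packets = [bytes([0x65]), bytes([0x01]), bytes([0x41])]
    print(drop_order(packets))
```

In this example the three first octets carry NRI values 3, 0, and 2, so the sketch ranks the second packet as the first candidate to drop.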

static macroblock: A certain number of macroblocks in the video stream can be designated as static, as defined in Section 8.3.2.8 of [3]. Static macroblocks free up additional processing cycles for the handling of non-static macroblocks. Based on a given amount of video processing resources and a given resolution, a higher number of static macroblocks enables a correspondingly higher frame rate.

default sub-profile: The subset of coding tools, which may be all coding tools of one profile or the common subset of coding tools of more than one profile, indicated by the profile-level-id parameter.

default level: The level indicated by the profile-level-id parameter, which consists of three octets, profile_idc, profile-iop, and level_idc. The default level is indicated by level_idc in most cases, and, in some cases, additionally by profile-iop.
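
Informative example: splitting a profile-level-id value into its three octets can be sketched as follows. This sketch assumes the parameter is carried as six hexadecimal digits, and the value "42e01f" is only an illustrative input. The function name parse_profile_level_id is hypothetical and is not defined by this memo.

```python
from typing import Tuple


def parse_profile_level_id(value: str) -> Tuple[int, int, int]:
    """Return (profile_idc, profile_iop, level_idc) from a hex string."""
    raw = bytes.fromhex(value)  # raises ValueError on non-hex input
    if len(raw) != 3:
        raise ValueError("profile-level-id must encode exactly three octets")
    profile_idc, profile_iop, level_idc = raw
    return profile_idc, profile_iop, level_idc


if __name__ == "__main__":
    print(parse_profile_level_id("42e01f"))
```

For the input "42e01f", the sketch returns the octets 0x42, 0xE0, and 0x1F.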

4.2. Abbreviations

   DON:        Decoding Order Number
   DONB:       Decoding Order Number Base
   DOND:       Decoding Order Number Difference
   FEC:        Forward Error Correction
   FU:         Fragmentation Unit
   IDR:        Instantaneous Decoding Refresh
   IEC:        International Electrotechnical Commission
   ISO:        International Organization for Standardization
   ITU-T:      International Telecommunication Union,
               Telecommunication Standardization Sector
   MANE:       Media-Aware Network Element
   MTAP:       Multi-Time Aggregation Packet
   MTAP16:     MTAP with 16-bit timestamp offset
   MTAP24:     MTAP with 24-bit timestamp offset
   NAL:        Network Abstraction Layer
   NALU:       NAL Unit
   SAR:        Sample Aspect Ratio
   SEI:        Supplemental Enhancement Information
   STAP:       Single-Time Aggregation Packet
   STAP-A:     STAP type A
   STAP-B:     STAP type B
   TS:         Timestamp
   VCL:        Video Coding Layer
   VUI:        Video Usability Information

5. RTP Payload Format
5.1. RTP Header Usage

The format of the RTP header is specified in RFC 3550 [5] and reprinted in Figure 1 for convenience. This payload format uses the fields of the header in a manner consistent with that specification.

RFC 3550[5]中规定了RTP标头的格式,为了方便起见,在图1中重新打印了RTP标头。此有效负载格式以与该规范一致的方式使用报头的字段。

When one NAL unit is encapsulated per RTP packet, the RECOMMENDED RTP payload format is specified in Section 5.6. The RTP payload (and the settings for some RTP header bits) for aggregation packets and fragmentation units are specified in Sections 5.7.2 and 5.8, respectively.

当每个RTP数据包封装一个NAL单元时,第5.6节规定了推荐的RTP有效负载格式。第5.7.2节和第5.8节分别规定了聚合数据包和分段单元的RTP有效负载(以及某些RTP报头位的设置)。

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |V=2|P|X|  CC   |M|     PT      |       sequence number         |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                           timestamp                           |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |           synchronization source (SSRC) identifier            |
      +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
      |            contributing source (CSRC) identifiers             |
      |                             ....                              |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        

Figure 1. RTP header according to RFC 3550

图1。符合RFC 3550的RTP报头
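Informative note: As a minimal, non-normative sketch of the layout in Figure 1, the following Python function unpacks the fixed 12-byte RTP header and the CSRC list. The names RtpHeader and parse_rtp_header are illustrative assumptions, not part of this memo or of RFC 3550, and header extensions and padding are deliberately not handled.

```python
import struct
from typing import NamedTuple, Tuple


class RtpHeader(NamedTuple):
    """Fields of the RTP header shown in Figure 1 (hypothetical helper type)."""
    version: int
    padding: bool
    extension: bool
    csrc_count: int
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int
    csrcs: Tuple[int, ...]


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Parse the fixed RTP header plus CSRC identifiers from a packet."""
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the 12-byte fixed header")
    # Network byte order: two octets, 16-bit SN, 32-bit timestamp, 32-bit SSRC.
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    cc = b0 & 0x0F                      # number of CSRC identifiers
    end = 12 + 4 * cc
    if len(packet) < end:
        raise ValueError("truncated CSRC list")
    csrcs = struct.unpack("!%dI" % cc, packet[12:end])
    return RtpHeader(
        version=b0 >> 6,
        padding=bool(b0 & 0x20),
        extension=bool(b0 & 0x10),
        csrc_count=cc,
        marker=bool(b1 & 0x80),         # M bit
        payload_type=b1 & 0x7F,         # PT, 7 bits
        sequence_number=seq,
        timestamp=ts,
        ssrc=ssrc,
        csrcs=csrcs,
    )
```

A receiver would typically reject packets whose version field is not 2 before looking at the payload.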

The RTP header information to be set according to this RTP payload format is set as follows:

要根据此RTP有效负载格式设置的RTP报头信息设置如下:

Marker bit (M): 1 bit Set for the very last packet of the access unit indicated by the RTP timestamp, in line with the normal use of the M bit in video formats, to allow an efficient playout buffer handling. For aggregation packets (STAP and MTAP), the marker bit in the RTP header MUST be set to the value that the marker bit of the last NAL unit of the aggregation packet would have been if it were transported in its own RTP packet. Decoders MAY use this bit as an early indication of the last packet of an access unit but MUST NOT rely on this property.

标记位(M):1位 为RTP时间戳指示的访问单元的最后一个数据包设置,与视频格式中M位的通常用法一致,以便高效地处理播放缓冲区。对于聚合数据包(STAP和MTAP),RTP报头中的标记位必须设置为聚合数据包中最后一个NAL单元在自己的RTP数据包中单独传输时其标记位应有的值。解码器可以使用该位作为访问单元最后一个数据包的早期指示,但不得依赖于该属性。

Informative note: Only one M bit is associated with an aggregation packet carrying multiple NAL units. Thus, if a gateway has re-packetized an aggregation packet into several packets, it cannot reliably set the M bit of those packets.

资料性说明:只有一个M位与承载多个NAL单元的聚合数据包相关联。因此,如果网关已将聚合数据包重新打包为多个数据包,则无法可靠地设置这些数据包的M位。

Payload type (PT): 7 bits The assignment of an RTP payload type for this new packet format is outside the scope of this document and will not be specified here. The assignment of a payload type has to be performed either through the profile used or in a dynamic way.

有效负载类型(PT):7位此新数据包格式的RTP有效负载类型的分配不在本文档的范围内,此处将不指定。有效负载类型的分配必须通过使用的配置文件或以动态方式执行。

Sequence number (SN): 16 bits Set and used in accordance with RFC 3550. For the single NALU and non-interleaved packetization mode, the sequence number is used to determine decoding order for the NALU.

序列号(SN):根据RFC 3550设置和使用的16位。对于单个NALU和非交错分组模式,序列号用于确定NALU的解码顺序。

Timestamp: 32 bits The RTP timestamp is set to the sampling timestamp of the content. A 90 kHz clock rate MUST be used.

时间戳:32位RTP时间戳设置为内容的采样时间戳。必须使用90 kHz的时钟频率。
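Informative note: The 90 kHz requirement can be illustrated with a small, non-normative sketch that converts a sampling instant in seconds into a 32-bit RTP timestamp. The function name and the initial_offset parameter are assumptions made for this example; RFC 3550 recommends a random initial timestamp value, which the offset stands in for.

```python
from fractions import Fraction

RTP_CLOCK_HZ = 90_000  # clock rate required by this payload format


def rtp_timestamp(sample_time: Fraction, initial_offset: int = 0) -> int:
    """Map a sampling instant (seconds) to a 32-bit RTP timestamp.

    sample_time is kept as an exact fraction so that frame rates such as
    30000/1001 do not accumulate floating-point error.
    """
    ticks = round(sample_time * RTP_CLOCK_HZ)
    return (initial_offset + ticks) % (1 << 32)  # timestamps wrap modulo 2^32
```

For example, the frame sampled 30 frame periods into a 30000/1001 fps stream has sample_time = 30 * 1001/30000 s, which corresponds to 90090 ticks.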

If the NAL unit has no timing properties of its own (e.g., parameter set and SEI NAL units), the RTP timestamp is set to the RTP timestamp of the primary coded picture of the access unit in which the NAL unit is included, according to Section 7.4.1.2 of [1].

如果NAL单元没有自己的定时属性(例如,参数集和SEI-NAL单元),则根据[1]的第7.4.1.2节,将RTP时间戳设置为包含NAL单元的接入单元的主编码图片的RTP时间戳。

The setting of the RTP timestamp for MTAPs is defined in Section 5.7.2.

MTAP的RTP时间戳设置见第5.7.2节。

Receivers SHOULD ignore any picture timing SEI messages included in access units that have only one display timestamp. Instead, receivers SHOULD use the RTP timestamp for synchronizing the display process.

接收器应忽略仅具有一个显示时间戳的访问单元中包含的任何图片定时SEI消息。相反,接收器应该使用RTP时间戳来同步显示过程。

If one access unit has more than one display timestamp carried in a picture timing SEI message, then the information in the SEI message SHOULD be treated as relative to the RTP timestamp, with the earliest event occurring at the time given by the RTP timestamp and subsequent events later, as given by the difference in picture time values carried in the picture timing SEI message.

如果一个访问单元在图片定时SEI消息中携带了多个显示时间戳,则SEI消息中的信息应被视为相对于RTP时间戳,最早的事件发生在RTP时间戳给出的时间,随后的事件发生在后面,由图片定时SEI消息中携带的图片时间值的差异给出。

Let tSEI1, tSEI2, ..., tSEIn be the display timestamps carried in the SEI message of an access unit, where tSEI1 is the earliest of all such timestamps. Let tmadjst() be a function that adjusts the SEI messages time scale to a 90-kHz time scale. Let TS be the RTP timestamp. Then, the display time for the event associated with tSEI1 is TS. The display time for the event with tSEIx, where x is [2..n], is TS + tmadjst (tSEIx - tSEI1).

设tSEI1、tSEI2、…、tSEIn为接入单元的SEI消息中携带的显示时间戳,其中tSEI1是所有此类时间戳中最早的。让tmadjst()是一个将SEI消息时间刻度调整为90 kHz时间刻度的函数。设TS为RTP时间戳。然后,与tSEI1关联的事件的显示时间是TS。与tSEIx关联的事件的显示时间,其中x是[2..n],是TS+tmadjst(tSEIx-tSEI1)。
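Informative note: The rule above can be sketched as follows, non-normatively. The function display_times and its parameters are hypothetical names; in particular, sei_time_scale (the number of SEI time units per second) is assumed to be known to the receiver, and the inner tmadjst() is only one possible realization of the time-scale adjustment described above.

```python
from fractions import Fraction
from typing import List


def display_times(ts: int, sei_times: List[int], sei_time_scale: int) -> List[int]:
    """Return the display time, in 90 kHz RTP units, for each SEI timestamp.

    ts             -- RTP timestamp (TS) of the access unit
    sei_times      -- tSEI1..tSEIn in SEI time units, in any order
    sei_time_scale -- SEI time units per second (assumed known)
    """
    t_first = min(sei_times)  # tSEI1 is the earliest timestamp

    def tmadjst(delta: int) -> int:
        # Rescale a difference of SEI times to the 90 kHz time scale.
        return round(Fraction(delta * 90_000, sei_time_scale))

    # The earliest event is displayed at TS, later ones at TS + tmadjst(...).
    return [(ts + tmadjst(t - t_first)) % (1 << 32) for t in sei_times]
```

With ts = 1000, SEI times 500, 1500, and 2500 on a 60000 Hz scale, the three events are displayed at 1000, 2500, and 4000.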

Informative note: Displaying coded frames as fields is needed commonly in an operation known as 3:2 pulldown, in which film content that consists of coded frames is displayed on a display using interlaced scanning. The picture timing SEI message enables carriage of multiple timestamps for the same coded picture, and therefore the 3:2 pulldown process is perfectly controlled. The picture timing SEI message mechanism is necessary because only one timestamp per coded frame can be conveyed in the RTP timestamp.

资料性说明:在称为3:2下拉的操作中,通常需要将编码帧显示为字段,在该操作中,使用隔行扫描在显示器上显示由编码帧组成的胶片内容。图片定时SEI消息允许为同一编码图片传送多个时间戳,因此3:2下拉过程得到完美控制。图片定时SEI消息机制是必要的,因为RTP时间戳中每个编码帧只能传送一个时间戳。

5.2. Payload Structures
5.2. 有效载荷结构

The payload format defines three different basic payload structures. A receiver can identify the payload structure by the first byte of the RTP packet payload, which co-serves as the RTP payload header and, in some cases, as the first byte of the payload. This byte is always structured as a NAL unit header. The NAL unit type field indicates which structure is present. The possible structures are as follows.

有效载荷格式定义了三种不同的基本有效载荷结构。接收机可以通过RTP分组有效载荷的第一字节来识别有效载荷结构,该有效载荷共同用作RTP有效载荷报头,并且在某些情况下作为有效载荷的第一字节。此字节始终被构造为NAL单元头。NAL单元类型字段指示存在的结构。可能的结构如下。

Single NAL Unit Packet: Contains only a single NAL unit in the payload. The NAL header type field is equal to the original NAL unit type, i.e., in the range of 1 to 23, inclusive. Specified in Section 5.6.

单个NAL单元数据包:在有效负载中仅包含单个NAL单元。NAL标头类型字段等于原始NAL单位类型,即范围为1到23(包括1到23)。第5.6节中规定。

Aggregation Packet: Packet type used to aggregate multiple NAL units into a single RTP payload. This packet exists in four versions, the Single-Time Aggregation Packet type A (STAP-A), the Single-Time Aggregation Packet type B (STAP-B), Multi-Time Aggregation Packet (MTAP) with 16-bit offset (MTAP16), and Multi-Time Aggregation Packet (MTAP) with 24-bit offset (MTAP24). The NAL unit type numbers assigned for STAP-A, STAP-B, MTAP16, and MTAP24 are 24, 25, 26, and 27, respectively. Specified in Section 5.7.

聚合数据包:用于将多个NAL单元聚合为单个RTP有效负载的数据包类型。此数据包有四个版本,即单次聚合数据包类型A(STAP-A)、单次聚合数据包类型B(STAP-B)、具有16位偏移量的多时间聚合数据包(MTAP)(MTAP16)和具有24位偏移量的多时间聚合数据包(MTAP)(MTAP24)。为STAP-A、STAP-B、MTAP16和MTAP24分配的NAL单元类型号分别为24、25、26和27。第5.7节中规定。

Fragmentation Unit: Used to fragment a single NAL unit over multiple RTP packets. Exists with two versions, FU-A and FU-B, identified with the NAL unit type numbers 28 and 29, respectively. Specified in Section 5.8.

分段单元:用于在多个RTP数据包上对单个NAL单元进行分段。存在两个版本,FU-A和FU-B,分别用NAL装置类型编号28和29标识。第5.8节中规定。

Informative note: This specification does not limit the size of NAL units encapsulated in single NAL unit packets and fragmentation units. The maximum size of a NAL unit encapsulated in any aggregation packet is 65535 bytes.

资料性说明:本规范不限制封装在单个NAL单元数据包和碎片单元中的NAL单元的大小。封装在任何聚合数据包中的NAL单元的最大大小为65535字节。
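Informative note: The identification of the payload structure from the first payload byte can be sketched as below. This is a non-normative illustration; the function name and the returned strings are assumptions chosen for the example.

```python
def payload_structure(payload: bytes) -> str:
    """Classify an RTP payload by the Type field of its first byte."""
    if not payload:
        raise ValueError("empty RTP payload")
    nal_type = payload[0] & 0x1F          # low 5 bits of the NAL unit header
    if 1 <= nal_type <= 23:
        return "single NAL unit packet"   # Section 5.6
    names = {
        24: "STAP-A", 25: "STAP-B",       # Section 5.7.1
        26: "MTAP16", 27: "MTAP24",       # Section 5.7.2
        28: "FU-A", 29: "FU-B",           # Section 5.8
    }
    return names.get(nal_type, "reserved")  # 0, 30, and 31
```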

Table 1 summarizes NAL unit types and the corresponding RTP packet types when each of these NAL units is directly used as a packet payload, and where the types are described in this memo.

表1总结了NAL单元类型和当每个NAL单元直接用作数据包有效负载时的相应RTP数据包类型,这些类型在本备忘录中进行了描述。

Table 1. Summary of NAL unit types and the corresponding packet types

表1。NAL单元类型和相应数据包类型的摘要

      NAL Unit  Packet    Packet Type Name               Section
      Type      Type
      -------------------------------------------------------------
      0        reserved                                     -
      1-23     NAL unit  Single NAL unit packet             5.6
      24       STAP-A    Single-time aggregation packet     5.7.1
      25       STAP-B    Single-time aggregation packet     5.7.1
      26       MTAP16    Multi-time aggregation packet      5.7.2
      27       MTAP24    Multi-time aggregation packet      5.7.2
      28       FU-A      Fragmentation unit                 5.8
      29       FU-B      Fragmentation unit                 5.8
      30-31    reserved                                     -
        
5.3. NAL Unit Header Usage
5.3. NAL单元头使用

The structure and semantics of the NAL unit header were introduced in Section 1.3. For convenience, the format of the NAL unit header is reprinted below:

第1.3节介绍了NAL单元头的结构和语义。为方便起见,NAL单元标题的格式重印如下:

      +---------------+
      |0|1|2|3|4|5|6|7|
      +-+-+-+-+-+-+-+-+
      |F|NRI|  Type   |
      +---------------+
        

This section specifies the semantics of F and NRI according to this specification.

本节根据本规范规定了F和NRI的语义。
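Informative note: As a non-normative sketch, the three fields of the NAL unit header octet can be extracted and reassembled with simple bit operations. NalHeader, parse_nal_header, and build_nal_header are hypothetical names used only for this illustration.

```python
from typing import NamedTuple


class NalHeader(NamedTuple):
    f: int          # forbidden_zero_bit, 1 bit
    nri: int        # nal_ref_idc, 2 bits
    nal_type: int   # nal_unit_type, 5 bits


def parse_nal_header(octet: int) -> NalHeader:
    """Split the one-byte NAL unit header into F, NRI, and Type."""
    return NalHeader((octet >> 7) & 0x01, (octet >> 5) & 0x03, octet & 0x1F)


def build_nal_header(f: int, nri: int, nal_type: int) -> int:
    """Assemble a NAL unit header octet from its three fields."""
    if f not in (0, 1) or not 0 <= nri <= 3 or not 0 <= nal_type <= 31:
        raise ValueError("field value out of range")
    return (f << 7) | (nri << 5) | nal_type
```

For instance, the octet 0x67 decodes to F = 0, NRI = 11 (binary), and Type = 7.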

F: 1 bit forbidden_zero_bit. A value of 0 indicates that the NAL unit type octet and payload should not contain bit errors or other syntax violations. A value of 1 indicates that the NAL unit type octet and payload may contain bit errors or other syntax violations.

F:1位 forbidden_zero_bit。值0表示NAL单元类型八位字节和有效负载不应包含位错误或其他语法冲突。值1表示NAL单元类型八位字节和有效负载可能包含位错误或其他语法冲突。

MANEs SHOULD set the F bit to indicate detected bit errors in the NAL unit. The H.264 specification requires that the F bit be equal to 0. When the F bit is set, the decoder is advised that bit errors or any other syntax violations may be present in the payload or in the NAL unit type octet. The simplest decoder reaction to a NAL unit in which the F bit is equal to 1 is to discard such a NAL unit and to conceal the lost data in the discarded NAL unit.

MANE应设置F位,以指示NAL单元中检测到的位错误。H.264规范要求F位等于0。当设置F位时,建议解码器在有效载荷或NAL单元类型八位字节中可能存在位错误或任何其他语法冲突。对于F位等于1的NAL单元,解码器最简单的反应是丢弃这样的NAL单元,并在丢弃的NAL单元中隐藏丢失的数据。

NRI: 2 bits nal_ref_idc. The semantics of value 00 and a non-zero value remain unchanged from the H.264 specification. In other words, a value of 00 indicates that the content of the NAL unit is not used to reconstruct reference pictures for inter picture prediction. Such NAL units can be discarded without risking the integrity of the reference pictures. Values greater than 00 indicate that the decoding of the NAL unit is required to maintain the integrity of the reference pictures.

NRI:2位 nal_ref_idc。值00和非零值的语义与H.264规范保持不变。换言之,值00表示NAL单元的内容不用于重建用于画面间预测的参考图片。这样的NAL单元可以被丢弃,而不会危及参考图片的完整性。大于00的值表示需要对NAL单元进行解码以保持参考图片的完整性。

In addition to the specification above, according to this RTP payload specification, values of NRI indicate the relative transport priority, as determined by the encoder. MANEs can use this information to protect more important NAL units better than they do less important NAL units. The highest transport priority is 11, followed by 10, and then by 01; finally, 00 is the lowest.

除上述规范外,根据该RTP有效负载规范,NRI值表示编码器确定的相对传输优先级。与不太重要的NAL单元相比,MANE可以利用这些信息更好地保护更重要的NAL单元。最高传输优先级是11,其次是10,然后是01;最后,00是最低的。

Informative note: Any non-zero value of NRI is handled identically in H.264 decoders. Therefore, receivers need not manipulate the value of NRI when passing NAL units to the decoder.

资料性说明:NRI的任何非零值在H.264解码器中的处理方式相同。因此,当将NAL单元传递给解码器时,接收机不需要操纵NRI的值。

An H.264 encoder MUST set the value of NRI according to the H.264 specification (Subclause 7.4.1) when the value of nal_unit_type is in the range of 1 to 12, inclusive. In particular, the H.264 specification requires that the value of NRI SHALL be equal to 0 for all NAL units having nal_unit_type equal to 6, 9, 10, 11, or 12.

当nal_unit_type的值在1到12范围内(包括1和12)时,H.264编码器必须根据H.264规范(第7.4.1款)设置NRI值。特别是,H.264规范要求,对于nal_unit_type等于6、9、10、11或12的所有NAL单元,NRI的值应等于0。

For NAL units having nal_unit_type equal to 7 or 8 (indicating a sequence parameter set or a picture parameter set, respectively), an H.264 encoder SHOULD set the value of NRI to 11 (in binary format). For coded slice NAL units of a primary coded picture having nal_unit_type equal to 5 (indicating a coded slice belonging to an IDR picture), an H.264 encoder SHOULD set the value of NRI to 11 (in binary format).

对于NAL_unit_type等于7或8(分别表示序列参数集或图片参数集)的NAL单元,H.264编码器应将NRI的值设置为11(二进制格式)。对于NAL_unit_type等于5的主编码图片的编码片段NAL单元(表示属于IDR图片的编码片段),H.264编码器应将NRI的值设置为11(二进制格式)。

For a mapping of the remaining nal_unit_types to NRI values, the following example MAY be used and has been shown to be efficient in a certain environment [14]. Other mappings MAY also be desirable, depending on the application and the H.264 profile in use.

对于其余nal_unit_type到NRI值的映射,可以使用以下示例,该示例已被证明在特定环境中是有效的[14]。根据应用程序和所使用的H.264配置文件,也可能需要其他映射。

Informative note: Data partitioning is not available in certain profiles, e.g., in the Main or Baseline profiles. Consequently, the NAL unit types 2, 3, and 4 can occur only if the video bitstream conforms to a profile in which data partitioning is allowed and not in streams that conform to the Main or Baseline profiles.

资料性说明:数据分区在某些配置文件中不可用,例如在主配置文件或基线配置文件中。因此,仅当视频比特流符合其中允许数据分区的简档而不是符合主简档或基线简档的流时,才可以出现NAL单元类型2、3和4。

Table 2. Example of NRI values for coded slices and coded slice data partitions of primary coded reference pictures

表2。主编码参考图片的编码切片和编码切片数据分区的NRI值示例

       NAL Unit Type     Content of NAL Unit              NRI (binary)
       ----------------------------------------------------------------
        1              non-IDR coded slice                         10
        2              Coded slice data partition A                10
        3              Coded slice data partition B                01
        4              Coded slice data partition C                01
        

Informative note: As mentioned before, the NRI value of non-reference pictures is 00 as mandated by H.264.

资料性说明:如前所述,根据H.264的规定,非参考图片的NRI值为00。

An H.264 encoder SHOULD set the value of NRI for coded slice and coded slice data partition NAL units of redundant coded reference pictures equal to 01 (in binary format).

H.264编码器应将冗余编码参考图片的编码片段和编码片段数据分区NAL单元的NRI值设置为等于01(二进制格式)。

Definitions of the values for NRI for NAL unit types 24 to 29, inclusive, are given in Sections 5.7 and 5.8 of this memo.

本备忘录第5.7节和第5.8节给出了24至29型NAL装置的NRI值定义。

No recommendation for the value of NRI is given for NAL units having nal_unit_type in the range of 13 to 23, inclusive, because these values are reserved for ITU-T and ISO/IEC. No recommendation for the value of NRI is given for NAL units having nal_unit_type equal to 0 or in the range of 30 to 31, inclusive, as the semantics of these values are not specified in this memo.

对于nal_unit_type在13到23(包括13和23)范围内的NAL单元,没有给出NRI值的建议,因为这些值是为ITU-T和ISO/IEC保留的。对于nal_unit_type等于0或在30到31(含30和31)范围内的NAL单元,未给出NRI值的建议,因为本备忘录中未规定这些值的语义。
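Informative note: The NRI guidance of this section can be condensed into a small lookup, shown here as a non-normative sketch. The function name and the is_reference and is_redundant flags are assumptions of this example; the values for types 1 to 4 are taken from the example mapping in Table 2, which an application is free to replace.

```python
from typing import Optional


def suggested_nri(nal_unit_type: int,
                  is_reference: bool = True,
                  is_redundant: bool = False) -> Optional[int]:
    """Return an NRI value for a NAL unit, or None if none is recommended."""
    if nal_unit_type in (6, 9, 10, 11, 12):
        return 0b00                      # required by the H.264 specification
    if nal_unit_type in (7, 8):
        return 0b11                      # sequence / picture parameter sets
    if 1 <= nal_unit_type <= 5:
        if not is_reference:
            return 0b00                  # non-reference pictures
        if is_redundant:
            return 0b01                  # redundant coded reference pictures
        if nal_unit_type == 5:
            return 0b11                  # IDR slices of a primary coded picture
        # Example mapping of Table 2 for primary coded reference pictures.
        return {1: 0b10, 2: 0b10, 3: 0b01, 4: 0b01}[nal_unit_type]
    return None                          # types 0 and 13-31: no recommendation here
```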

5.4. Packetization Modes
5.4. 打包方式

This memo specifies three cases of packetization modes:

本备忘录规定了三种打包模式:

o Single NAL unit mode

o 单NAL单元模式

o Non-interleaved mode

o 非交织模式

o Interleaved mode

o 交织模式

The single NAL unit mode is targeted for conversational systems that comply with ITU-T Recommendation H.241 [3] (see Section 12.1). The non-interleaved mode is targeted for conversational systems that may not comply with ITU-T Recommendation H.241. In the non-interleaved mode, NAL units are transmitted in NAL unit decoding order. The interleaved mode is targeted for systems that do not require very low end-to-end latency. The interleaved mode allows transmission of NAL units out of NAL unit decoding order.

单NAL单元模式适用于符合ITU-T建议H.241[3](见第12.1节)的会话系统。非交织模式适用于可能不符合ITU-T建议H.241的会话系统。在非交织模式中,NAL单元按NAL单元解码顺序传输。交织模式适用于不要求极低端到端延迟的系统。交织模式允许不按NAL单元解码顺序传输NAL单元。

The packetization mode in use MAY be signaled by the value of the OPTIONAL packetization-mode media type parameter. The used packetization mode governs which NAL unit types are allowed in RTP payloads. Table 3 summarizes the allowed packet payload types for each packetization mode. Packetization modes are explained in more detail in Section 6.

正在使用的打包模式可以通过可选打包模式媒体类型参数的值来表示。使用的打包模式控制RTP有效负载中允许的NAL单元类型。表3总结了每个分组化模式允许的分组有效负载类型。第6节将更详细地解释打包模式。

Table 3. Summary of allowed NAL unit types for each packetization mode (yes = allowed, no = disallowed, ig = ignore)

表3。每个打包模式允许的NAL单元类型汇总(是=允许,否=不允许,ig=忽略)

      Payload Packet    Single NAL    Non-Interleaved    Interleaved
      Type    Type      Unit Mode           Mode             Mode
      -------------------------------------------------------------
      0      reserved      ig               ig               ig
      1-23   NAL unit     yes              yes               no
      24     STAP-A        no              yes               no
      25     STAP-B        no               no              yes
      26     MTAP16        no               no              yes
      27     MTAP24        no               no              yes
      28     FU-A          no              yes              yes
      29     FU-B          no               no              yes
      30-31  reserved      ig               ig               ig
        

Some NAL unit or payload type values (indicated as reserved in Table 3) are reserved for future extensions. NAL units of those types SHOULD NOT be sent by a sender (direct as packet payloads, as aggregation units in aggregation packets, or as fragmented units in FU packets) and MUST be ignored by a receiver. For example, the payload types 1-23, with the associated packet type "NAL unit", are allowed in "Single NAL Unit Mode" and in "Non-Interleaved Mode" but disallowed in "Interleaved Mode". However, NAL units of NAL unit types 1-23 can be used in "Interleaved Mode" as aggregation units in STAP-B, MTAP16, and MTAP24 packets as well as fragmented units in FU-A and FU-B packets. Similarly, NAL units of NAL unit types 1-23 can also be used in the "Non-Interleaved Mode" as aggregation units in STAP-A packets or fragmented units in FU-A packets, in addition to being directly used as packet payloads.

一些NAL单元或有效负载类型值(表3中标为保留)保留用于将来的扩展。这些类型的NAL单元不应由发送方发送(直接作为数据包有效载荷,作为聚合数据包中的聚合单元,或作为FU数据包中的分段单元),接收方必须忽略。例如,具有相关分组类型“NAL单元”的有效负载类型1-23在“单NAL单元模式”和“非交织模式”中允许,但在“交织模式”中不允许。但是,NAL单元类型1-23的NAL单元可以在“交织模式”中用作STAP-B、MTAP16和MTAP24数据包中的聚合单元以及FU-A和FU-B数据包中的分段单元。类似地,NAL单元类型1-23的NAL单元除了直接用作分组有效载荷之外,还可以在“非交织模式”中用作STAP-A分组中的聚合单元或FU-A分组中的分段单元。
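Informative note: Table 3 can be expressed as a lookup that a receiver might consult before processing a payload. The following is a non-normative sketch; the mode strings and the function name are hypothetical, and the return values mirror the yes/no/ig entries of the table.

```python
# Payload types allowed directly as RTP payloads in each packetization mode.
ALLOWED = {
    "single": set(range(1, 24)),
    "non-interleaved": set(range(1, 24)) | {24, 28},
    "interleaved": {25, 26, 27, 28, 29},
}


def payload_type_status(mode: str, nal_type: int) -> str:
    """Return "yes", "no", or "ig" for a payload type in the given mode."""
    if mode not in ALLOWED:
        raise ValueError("unknown packetization mode: %r" % mode)
    if not 0 <= nal_type <= 31:
        raise ValueError("NAL unit type must be in 0..31")
    if nal_type in (0, 30, 31):
        return "ig"                       # reserved: ignored by receivers
    return "yes" if nal_type in ALLOWED[mode] else "no"
```

Note that the lookup covers only the outermost payload type; NAL units of types 1-23 carried inside aggregation packets or fragmentation units are governed by the type of the enclosing packet.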

5.5. Decoding Order Number (DON)
5.5. 解码顺序号(DON)

In the interleaved packetization mode, the transmission order of NAL units is allowed to differ from the decoding order of the NAL units. Decoding order number (DON) is a field in the payload structure or a derived variable that indicates the NAL unit decoding order. Rationale and examples of use cases for transmission out of decoding order and for the use of DON are given in Section 13.

在交织分组模式中,允许NAL单元的传输顺序不同于NAL单元的解码顺序。解码顺序号(DON)是有效负载结构中的一个字段,或指示NAL单元解码顺序的派生变量。第13节给出了非解码顺序传输和DON使用的基本原理和用例示例。

The coupling of transmission and decoding order is controlled by the OPTIONAL sprop-interleaving-depth media type parameter as follows. When the value of the OPTIONAL sprop-interleaving-depth media type parameter is equal to 0 (explicitly or per default), the transmission order of NAL units MUST conform to the NAL unit decoding order. When the value of the OPTIONAL sprop-interleaving-depth media type parameter is greater than 0:

传输顺序与解码顺序的耦合由可选的sprop-interleaving-depth媒体类型参数控制,如下所示。当可选的sprop-interleaving-depth媒体类型参数的值等于0(显式设置或取默认值)时,NAL单元的传输顺序必须符合NAL单元解码顺序。当可选的sprop-interleaving-depth媒体类型参数的值大于0时:

o the order of NAL units in an MTAP16 and an MTAP24 is not required to be the NAL unit decoding order, and

o MTAP16和MTAP24中NAL单元的顺序不要求是NAL单元解码顺序,并且

o the order of NAL units generated by de-packetizing STAP-Bs, MTAPs, and FUs in two consecutive packets is not required to be the NAL unit decoding order.

o 对两个连续数据包中的STAP Bs、MTAP和FUs进行反打包生成的NAL单元的顺序不要求是NAL单元解码顺序。

The RTP payload structures for a single NAL unit packet, an STAP-A, and an FU-A do not include DON. STAP-B and FU-B structures include DON, and the structure of MTAPs enables derivation of DON, as specified in Section 5.7.2.

单个NAL单元数据包、STAP-A和FU-A的RTP有效载荷结构不包括DON。STAP-B和FU-B结构包括DON,MTAP的结构允许推导DON,如第5.7.2节所述。

Informative note: When an FU-A occurs in interleaved mode, it always follows an FU-B, which sets its DON.

资料性说明:当FU-A以交错模式出现时,它总是跟随FU-B,后者设置其DON。

Informative note: If a transmitter wants to encapsulate a single NAL unit per packet and transmit packets out of their decoding order, STAP-B packet type can be used.

资料性说明:如果发送器希望每个数据包封装一个NAL单元,并且不按解码顺序发送数据包,则可以使用STAP-B数据包类型。

In the single NAL unit packetization mode, the transmission order of NAL units, determined by the RTP sequence number, MUST be the same as their NAL unit decoding order. In the non-interleaved packetization mode, the transmission order of NAL units in single NAL unit packets, STAP-As, and FU-As MUST be the same as their NAL unit decoding order. The NAL units within an STAP MUST appear in the NAL unit decoding order. Thus, the decoding order is first provided through the implicit order within an STAP and then provided through the RTP sequence number for the order between STAPs, FUs, and single NAL unit packets.

在单NAL单元分组模式中,由RTP序列号确定的NAL单元的传输顺序必须与其NAL单元解码顺序相同。在非交织分组模式中,单个NAL单元数据包、STAP-A和FU-A中NAL单元的传输顺序必须与其NAL单元解码顺序相同。STAP内的NAL单元必须按NAL单元解码顺序出现。因此,解码顺序首先通过STAP内的隐式顺序提供,然后通过RTP序列号提供STAP、FU和单个NAL单元数据包之间的顺序。

The signaling of the value of DON for NAL units carried in STAP-B, MTAP, and a series of fragmentation units starting with an FU-B is specified in Sections 5.7.1, 5.7.2, and 5.8, respectively. The DON value of the first NAL unit in transmission order MAY be set to any value. Values of DON are in the range of 0 to 65535, inclusive. After reaching the maximum value, the value of DON wraps around to 0.

第5.7.1、5.7.2和5.8节分别规定了STAP-B、MTAP和以FU-B开头的一系列碎片单元中NAL单元的DON值的信令。传输顺序中的第一NAL单元的DON值可以设置为任何值。DON的值在0到65535之间(含0到65535)。达到最大值后,DON的值将变为0。

The decoding order of two NAL units contained in any STAP-B, MTAP, or a series of fragmentation units starting with an FU-B is determined as follows. Let DON(i) be the decoding order number of the NAL unit having index i in the transmission order. Function don_diff(m,n) is specified as follows:

包含在任何STAP-B、MTAP或以FU-B开头的一系列分段单元中的两个NAL单元的解码顺序确定如下。假设DON(i)是在传输顺序中具有索引i的NAL单元的解码顺序号。函数don_diff(m,n)指定如下:

      If DON(m) == DON(n), don_diff(m,n) = 0

      If (DON(m) < DON(n) and DON(n) - DON(m) < 32768),
      don_diff(m,n) = DON(n) - DON(m)

      If (DON(m) > DON(n) and DON(m) - DON(n) >= 32768),
      don_diff(m,n) = 65536 - DON(m) + DON(n)

      If (DON(m) < DON(n) and DON(n) - DON(m) >= 32768),
      don_diff(m,n) = - (DON(m) + 65536 - DON(n))

      If (DON(m) > DON(n) and DON(m) - DON(n) < 32768),
      don_diff(m,n) = - (DON(m) - DON(n))

A positive value of don_diff(m,n) indicates that the NAL unit having transmission order index n follows, in decoding order, the NAL unit having transmission order index m. When don_diff(m,n) is equal to 0, the NAL unit decoding order of the two NAL units can be in either order. A negative value of don_diff(m,n) indicates that the NAL unit having transmission order index n precedes, in decoding order, the NAL unit having transmission order index m.

don_diff(m,n)的正值表示具有传输顺序索引n的NAL单元以解码顺序跟随具有传输顺序索引m的NAL单元。当don_diff(m,n)等于0时,两个NAL单元的NAL单元解码顺序可以是任意顺序。don_diff(m,n)的负值表示具有传输顺序索引n的NAL单元以解码顺序先于具有传输顺序索引m的NAL单元。
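Informative note: The five cases of don_diff() translate directly into code. The sketch below is non-normative; it takes the two DON values themselves rather than transmission indices, and the helper decoding_order() is a hypothetical illustration that is only meaningful when all DON values involved lie within a window of fewer than 32768 of each other. It is not the de-packetization process of Section 7.

```python
from typing import List


def don_diff(don_m: int, don_n: int) -> int:
    """don_diff for two DON values in 0..65535, following the cases above."""
    if don_m == don_n:
        return 0
    if don_m < don_n:
        gap = don_n - don_m
        if gap < 32768:
            return gap                       # n follows m, no wrap-around
        return -(don_m + 65536 - don_n)      # n precedes m across the wrap
    gap = don_m - don_n
    if gap >= 32768:
        return 65536 - don_m + don_n         # n follows m across the wrap
    return -gap                              # n precedes m, no wrap-around


def decoding_order(dons: List[int]) -> List[int]:
    """Return transmission indices sorted into decoding order.

    Each unit is keyed by its distance from the first received DON; the
    sort is stable, so units with equal DON keep their transmission order.
    """
    return sorted(range(len(dons)), key=lambda i: don_diff(dons[0], dons[i]))
```

For example, units received with DON values 65534, 1, 65535, 0 are decoded in the index order 0, 2, 3, 1, which correctly spans the wrap-around from 65535 to 0.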

Values of DON-related fields (DON, DONB, and DOND; see Section 5.7) MUST be such that the decoding order determined by the values of DON, as specified above, conforms to the NAL unit decoding order.

DON相关字段(DON、DONB和DOND;见第5.7节)的值必须确保由上述DON值确定的解码顺序符合NAL单元解码顺序。

If the order of two NAL units in NAL unit decoding order is switched and the new order does not conform to the NAL unit decoding order, the NAL units MUST NOT have the same value of DON. If the order of two consecutive NAL units in the NAL unit stream is switched and the new order still conforms to the NAL unit decoding order, the NAL units MAY have the same value of DON. For example, when arbitrary slice order is allowed by the video coding profile in use, all the coded slice NAL units of a coded picture are allowed to have the same value of DON. Consequently, NAL units having the same value of DON can be decoded in any order, and two NAL units having a different value of DON should be passed to the decoder in the order specified above. When two consecutive NAL units in the NAL unit decoding order have a different value of DON, the value of DON for the second NAL unit in decoding order SHOULD be the value of DON for the first, incremented by one.

如果切换了NAL单元解码顺序中两个NAL单元的顺序,并且新的顺序不符合NAL单元解码顺序,则NAL单元不得具有相同的DON值。如果切换了NAL单元流中的两个连续NAL单元的顺序,并且新的顺序仍然符合NAL单元解码顺序,则NAL单元可以具有相同的DON值。例如,当使用中的视频编码简档允许任意片段顺序时,允许编码图片的所有编码片段NAL单元具有相同的DON值。因此,具有相同DON值的NAL单元可以以任何顺序被解码,并且具有不同DON值的两个NAL单元应当以上面指定的顺序被传递给解码器。当NAL单元解码顺序中的两个连续NAL单元具有不同的DON值时,解码顺序中的第二个NAL单元的DON值应为第一个NAL单元的DON值,递增1。

An example of the de-packetization process to recover the NAL unit decoding order is given in Section 7.

第7节给出了恢复NAL单元解码顺序的解分组过程的示例。

Informative note: Receivers should not expect that the absolute difference of values of DON for two consecutive NAL units in the NAL unit decoding order will be equal to one, even in error-free transmission. An increment by one is not required, as at the time of associating values of DON to NAL units, it may not be known whether all NAL units are delivered to the receiver. For example, a gateway may not forward coded slice NAL units of non-reference pictures or SEI NAL units when there is a shortage of bitrate in the network to which the packets are forwarded. In another example, a live broadcast is interrupted by pre-encoded content, such as commercials, from time to time. The first intra picture of a pre-encoded clip is transmitted in advance to ensure that it is readily available in the receiver. When transmitting the first intra picture, the originator does not exactly know how many NAL units will be encoded before the first intra picture of the pre-encoded clip follows in decoding order. Thus, the values of DON for the NAL units of the first intra picture of the pre-encoded clip have to be estimated when they are transmitted, and gaps in values of DON may occur.

资料性说明:即使在无差错传输中,接收机也不应期望NAL单元解码顺序中两个连续NAL单元的DON值的绝对差值等于1。不要求递增1,因为在将DON值与NAL单元相关联时,可能还不知道是否所有NAL单元都会交付给接收机。例如,当数据包转发到的网络中比特率不足时,网关可以不转发非参考图片的编码片段NAL单元或SEI NAL单元。在另一示例中,实时广播不时地被预编码的内容(例如商业广告)中断。预编码片段的第一帧内图片被提前发送,以确保其在接收机中随时可用。当发送第一帧内图片时,发起者并不确切知道在预编码片段的第一帧内图片按解码顺序出现之前将编码多少NAL单元。因此,预编码片段的第一帧内图片的NAL单元的DON值在传输时必须进行估计,并且DON值中可能出现间隙。

5.6. Single NAL Unit Packet
5.6. 单NAL单元数据包

The single NAL unit packet defined here MUST contain only one NAL unit of the types defined in [1]. This means that neither an aggregation packet nor a fragmentation unit can be used within a single NAL unit packet. A NAL unit stream composed by de-packetizing single NAL unit packets in RTP sequence number order MUST conform to the NAL unit decoding order. The structure of the single NAL unit packet is shown in Figure 2.

此处定义的单个NAL单元数据包必须仅包含[1]中定义类型的一个NAL单元。这意味着在单个NAL单元分组中既不能使用聚合分组也不能使用分段单元。以RTP序列号顺序对单个NAL单元数据包进行去分组构成的NAL单元流必须符合NAL单元解码顺序。单个NAL单元数据包的结构如图2所示。

Informative note: The first byte of a NAL unit co-serves as the RTP payload header.

资料性说明:NAL单元的第一个字节同时用作RTP有效负载报头。

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |F|NRI|  Type   |                                               |
    +-+-+-+-+-+-+-+-+                                               |
    |                                                               |
    |               Bytes 2..n of a single NAL unit                 |
    |                                                               |
    |                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                               :...OPTIONAL RTP padding        |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        
Figure 2. RTP payload format for single NAL unit packet
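
Informative example: The field layout in Figure 2 can be read with a few bit operations. The following Python sketch is illustrative only and is not part of this specification. The names NalHeader, parse_nal_header, and is_single_nal_unit_packet are hypothetical. The sketch assumes that any RTP padding has already been removed and that types 1 through 23 identify single NAL unit packets, as listed in the NAL unit type summary earlier in this memo.

```python
from typing import NamedTuple


class NalHeader(NamedTuple):
    f: int         # forbidden_zero_bit, 1 bit
    nri: int       # nal_ref_idc, 2 bits
    nal_type: int  # nal_unit_type, 5 bits


def parse_nal_header(payload: bytes) -> NalHeader:
    """Split the first payload octet into its F, NRI, and Type fields."""
    if not payload:
        raise ValueError("empty RTP payload")
    octet = payload[0]
    return NalHeader(f=octet >> 7, nri=(octet >> 5) & 0x03, nal_type=octet & 0x1F)


def is_single_nal_unit_packet(payload: bytes) -> bool:
    """Assumption: types 1..23 are carried as single NAL unit packets."""
    return 1 <= parse_nal_header(payload).nal_type <= 23
```

Because the first octet of the NAL unit co-serves as the payload header, a receiver of a single NAL unit packet can hand the whole payload to the decoder without removing any bytes.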

5.7. Aggregation Packets

Aggregation packets are the NAL unit aggregation scheme of this payload specification. The scheme is introduced to reflect the dramatically different MTU sizes of two key target networks: wireline IP networks (with an MTU size that is often limited by the Ethernet MTU size, roughly 1500 bytes) and IP-based or non-IP-based (e.g., ITU-T H.324/M) wireless communication systems with preferred transmission unit sizes of 254 bytes or less. To prevent media transcoding between the two worlds, and to avoid undesirable packetization overhead, a NAL unit aggregation scheme is introduced.

Two types of aggregation packets are defined by this specification:

o Single-time aggregation packet (STAP): aggregates NAL units with identical NALU-times. Two types of STAPs are defined, one without DON (STAP-A) and another including DON (STAP-B).

o Multi-time aggregation packet (MTAP): aggregates NAL units with potentially differing NALU-times. Two different MTAPs are defined, differing in the length of the NAL unit timestamp offset.

Each NAL unit to be carried in an aggregation packet is encapsulated in an aggregation unit. Please see below for the four different aggregation units and their characteristics.

The structure of the RTP payload format for aggregation packets is presented in Figure 3.

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |F|NRI|  Type   |                                               |
    +-+-+-+-+-+-+-+-+                                               |
    |                                                               |
    |             one or more aggregation units                     |
    |                                                               |
    |                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                               :...OPTIONAL RTP padding        |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        
Figure 3. RTP payload format for aggregation packets

MTAPs and STAPs share the following packetization rules:

o The RTP timestamp MUST be set to the earliest of the NALU-times of all the NAL units to be aggregated.

o The type field of the NAL unit type octet MUST be set to the appropriate value, as indicated in Table 4.

o The F bit MUST be cleared if all F bits of the aggregated NAL units are zero; otherwise, it MUST be set.

o The value of NRI MUST be the maximum of all the NAL units carried in the aggregation packet.

Table 4. Type field for STAPs and MTAPs

      Type   Packet    Timestamp offset   DON-related fields
                       field length       (DON, DONB, DOND)
                       (in bits)          present
      --------------------------------------------------------
      24     STAP-A       0                 no
      25     STAP-B       0                 yes
      26     MTAP16      16                 yes
      27     MTAP24      24                 yes
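
Informative example: The shared rules for the F bit, the NRI value, and the type field can be combined into one header octet. The Python sketch below is illustrative only; the function name aggregation_header and the constant names are hypothetical, and each NAL unit is assumed to be a bytes object that starts with its own NAL unit header octet.

```python
STAP_A, STAP_B, MTAP16, MTAP24 = 24, 25, 26, 27  # type values from Table 4


def aggregation_header(nal_units, packet_type=STAP_A):
    """Return the NAL unit type octet for an aggregation packet."""
    if packet_type not in (STAP_A, STAP_B, MTAP16, MTAP24):
        raise ValueError("not an aggregation packet type")
    if not nal_units:
        raise ValueError("at least one NAL unit is required")
    f_bit = 0
    nri = 0
    for nalu in nal_units:
        f_bit |= nalu[0] >> 7                  # set if any aggregated F bit is set
        nri = max(nri, (nalu[0] >> 5) & 0x03)  # maximum NRI of all NAL units
    return (f_bit << 7) | (nri << 5) | packet_type
```

For example, aggregating NAL units whose header octets are 0x67 and 0x41 into an STAP-A gives F = 0, NRI = 3, and Type = 24, that is, the octet 0x78.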
        
The marker bit in the RTP header is set to the value that the marker bit of the last NAL unit of the aggregation packet would have if it were transported in its own RTP packet.

The payload of an aggregation packet consists of one or more aggregation units. See Sections 5.7.1 and 5.7.2 for the four different types of aggregation units. An aggregation packet can carry as many aggregation units as necessary; however, the total amount of data in an aggregation packet obviously MUST fit into an IP packet, and the size SHOULD be chosen so that the resulting IP packet is smaller than the MTU size. An aggregation packet MUST NOT contain fragmentation units, as specified in Section 5.8. Aggregation packets MUST NOT be nested; that is, an aggregation packet MUST NOT contain another aggregation packet.
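
Informative example: One possible way to follow the size recommendation is to group NAL units greedily against a payload budget. The Python sketch below is illustrative only and does not describe a required sender behavior. The function name group_for_stap_a and the parameter max_payload are hypothetical. The sketch assumes that all input NAL units share the same NALU-time, that max_payload is the number of bytes the sender allows for the RTP payload, and that a NAL unit too large for the budget is left alone in its own group, where a sender would need the fragmentation units of Section 5.8.

```python
def group_for_stap_a(nal_units, max_payload):
    """Split NAL units into groups whose STAP-A payload fits max_payload."""
    groups = []
    current = []
    size = 1  # one octet for the STAP-A NAL unit type octet
    for nalu in nal_units:
        unit_size = 2 + len(nalu)  # 16-bit size field plus the NAL unit
        if current and size + unit_size > max_payload:
            groups.append(current)
            current, size = [], 1
        current.append(nalu)
        size += unit_size
    if current:
        groups.append(current)
    return groups
```

As an assumption for illustration, a sender on an Ethernet path using IPv4 and UDP without options or header extensions might choose max_payload = 1500 - 20 - 8 - 12 = 1460. A group that ends up with a single NAL unit can be sent as a single NAL unit packet instead.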

5.7.1. Single-Time Aggregation Packet (STAP)

A single-time aggregation packet (STAP) SHOULD be used whenever NAL units are aggregated that all share the same NALU-time. The payload of an STAP-A does not include DON and consists of at least one single-time aggregation unit, as presented in Figure 4. The payload of an STAP-B consists of a 16-bit unsigned decoding order number (DON) (in network byte order) followed by at least one single-time aggregation unit, as presented in Figure 5.

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                    :                                               |
    +-+-+-+-+-+-+-+-+                                               |
    |                                                               |
    |                single-time aggregation units                  |
    |                                                               |
    |                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                               :
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        
Figure 4. Payload format for STAP-A

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                    :  decoding order number (DON)  |               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+               |
    |                                                               |
    |                single-time aggregation units                  |
    |                                                               |
    |                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                               :
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        
Figure 5. Payload format for STAP-B

The DON field specifies the value of DON for the first NAL unit in an STAP-B in transmission order. For each successive NAL unit in appearance order in an STAP-B, the value of DON is equal to (the value of DON of the previous NAL unit in the STAP-B + 1) % 65536, in which '%' stands for the modulo operation.

A single-time aggregation unit consists of 16-bit unsigned size information (in network byte order) that indicates the size of the following NAL unit in bytes (excluding these two octets, but including the NAL unit type octet of the NAL unit), followed by the NAL unit itself, including its NAL unit type byte. A single-time aggregation unit is byte aligned within the RTP payload, but it may not be aligned on a 32-bit word boundary. Figure 6 presents the structure of the single-time aggregation unit.

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                    :        NAL unit size          |               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+               |
    |                                                               |
    |                           NAL unit                            |
    |                                                               |
    |                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                               :
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        
Figure 6. Structure for single-time aggregation unit

Figure 7 presents an example of an RTP packet that contains an STAP-A. The STAP contains two single-time aggregation units, labeled as 1 and 2 in the figure.

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                          RTP Header                           |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |STAP-A NAL HDR |         NALU 1 Size           | NALU 1 HDR    |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                         NALU 1 Data                           |
    :                                                               :
    +               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |               | NALU 2 Size                   | NALU 2 HDR    |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                         NALU 2 Data                           |
    :                                                               :
    |                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                               :...OPTIONAL RTP padding        |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        
Figure 7. An example of an RTP packet including an STAP-A containing two single-time aggregation units
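
Informative example: The STAP-A layout of Figure 7 can be produced and consumed as sketched below in Python. The sketch is illustrative only; the function names build_stap_a and parse_stap_a are hypothetical. It assumes that the input is the RTP payload with the RTP header and any RTP padding already removed, and that all NAL units passed to build_stap_a share the same NALU-time.

```python
import struct

STAP_A = 24  # type value from Table 4


def build_stap_a(nal_units):
    """Wrap NAL units that share one NALU-time into an STAP-A payload."""
    f_bit = 0
    nri = 0
    body = bytearray()
    for nalu in nal_units:
        if not 1 <= len(nalu) <= 0xFFFF:
            raise ValueError("NAL unit size must fit the 16-bit size field")
        f_bit |= nalu[0] >> 7
        nri = max(nri, (nalu[0] >> 5) & 0x03)
        # The size counts the NAL unit including its header octet,
        # but not the two size octets themselves.
        body += struct.pack("!H", len(nalu)) + nalu
    if not body:
        raise ValueError("an STAP-A carries at least one aggregation unit")
    return bytes([(f_bit << 7) | (nri << 5) | STAP_A]) + bytes(body)


def parse_stap_a(payload):
    """Return the NAL units carried in an STAP-A payload, in order."""
    if not payload or payload[0] & 0x1F != STAP_A:
        raise ValueError("not an STAP-A payload")
    units = []
    pos = 1
    while pos < len(payload):
        if pos + 2 > len(payload):
            raise ValueError("truncated NAL unit size field")
        (size,) = struct.unpack_from("!H", payload, pos)
        pos += 2
        if size == 0 or pos + size > len(payload):
            raise ValueError("NAL unit size exceeds payload")
        units.append(payload[pos:pos + size])
        pos += size
    return units
```

The parser checks every size field against the remaining payload length, since a corrupted or hostile size value would otherwise cause a read beyond the end of the packet.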

Figure 8 presents an example of an RTP packet that contains an STAP-B. The STAP contains two single-time aggregation units, labeled as 1 and 2 in the figure.

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                          RTP Header                           |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |STAP-B NAL HDR | DON                           | NALU 1 Size   |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    | NALU 1 Size   | NALU 1 HDR    | NALU 1 Data                   |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               +
    :                                                               :
    +               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |               | NALU 2 Size                   | NALU 2 HDR    |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                       NALU 2 Data                             |
    :                                                               :
    |                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                               :...OPTIONAL RTP padding        |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        
Figure 8. An example of an RTP packet including an STAP-B containing two single-time aggregation units
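
Informative example: An STAP-B differs from an STAP-A only in the 16-bit DON field that follows the NAL unit type octet. The Python sketch below is illustrative only; the function name parse_stap_b is hypothetical. It assumes that the RTP header and any RTP padding have already been removed, and it assigns each NAL unit its DON by incrementing modulo 65536 in appearance order.

```python
import struct

STAP_B = 25  # type value from Table 4


def parse_stap_b(payload):
    """Return a list of (DON, NAL unit) pairs from an STAP-B payload."""
    if len(payload) < 3 or payload[0] & 0x1F != STAP_B:
        raise ValueError("not an STAP-B payload")
    (don,) = struct.unpack_from("!H", payload, 1)  # DON of the first NAL unit
    units = []
    pos = 3
    while pos < len(payload):
        if pos + 2 > len(payload):
            raise ValueError("truncated NAL unit size field")
        (size,) = struct.unpack_from("!H", payload, pos)
        pos += 2
        if size == 0 or pos + size > len(payload):
            raise ValueError("NAL unit size exceeds payload")
        units.append((don, payload[pos:pos + size]))
        pos += size
        don = (don + 1) % 65536  # next NAL unit in appearance order
    return units
```

With a DON field of 65535 and two aggregation units, the NAL units receive the DON values 65535 and 0, which shows the wraparound of the 16-bit counter.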

5.7.2. Multi-Time Aggregation Packets (MTAPs)

The NAL unit payload of MTAPs consists of a 16-bit unsigned decoding order number base (DONB) (in network byte order) and one or more multi-time aggregation units, as presented in Figure 9. DONB MUST contain the value of DON for the first NAL unit in the NAL unit decoding order among the NAL units of the MTAP.

Informative note: The first NAL unit in the NAL unit decoding order is not necessarily the first NAL unit in the order in which the NAL units are encapsulated in an MTAP.

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                    :  decoding order number base   |               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+               |
    |                                                               |
    |                 multi-time aggregation units                  |
    |                                                               |
    |                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                               :
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        
Figure 9. NAL unit payload format for MTAPs

Two different multi-time aggregation units are defined in this specification. Both of them consist of 16 bits of unsigned size information of the following NAL unit (in network byte order), an 8-bit unsigned decoding order number difference (DOND), and n bits (in network byte order) of timestamp offset (TS offset) for this NAL unit, whereby n can be 16 or 24. The choice between the different MTAP types (MTAP16 and MTAP24) is application dependent: the larger the timestamp offset is, the higher the flexibility of the MTAP, but the overhead is also higher.

The structures of the multi-time aggregation units for MTAP16 and MTAP24 are presented in Figures 10 and 11, respectively. The starting or ending position of an aggregation unit within a packet is not required to be on a 32-bit word boundary. The DON of the NAL unit contained in a multi-time aggregation unit is equal to (DONB + DOND) % 65536, in which '%' denotes the modulo operation. This memo does not specify how the NAL units within an MTAP are ordered, but, in most cases, NAL unit decoding order SHOULD be used.

The timestamp offset field MUST be set to a value equal to the value of the following formula: if the NALU-time is larger than or equal to the RTP timestamp of the packet, then the timestamp offset equals (the NALU-time of the NAL unit - the RTP timestamp of the packet). If the NALU-time is smaller than the RTP timestamp of the packet, then the timestamp offset is equal to the NALU-time + (2^32 - the RTP timestamp of the packet).
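
Informative example: The two cases of the formula above amount to a subtraction modulo 2^32. The Python sketch below is illustrative only; the names timestamp_offset and smallest_mtap_type are hypothetical. The second function rests on the assumption that an offset is usable only if it fits the 16-bit or 24-bit timestamp offset field of the chosen MTAP type.

```python
RTP_TS_MODULUS = 1 << 32  # RTP timestamps are 32-bit values


def timestamp_offset(nalu_time, rtp_timestamp):
    """Timestamp offset of a NAL unit, following the two cases in the text."""
    if nalu_time >= rtp_timestamp:
        return nalu_time - rtp_timestamp
    # The NALU-time has wrapped around relative to the RTP timestamp.
    return nalu_time + (RTP_TS_MODULUS - rtp_timestamp)


def smallest_mtap_type(offsets):
    """Return 26 (MTAP16) or 27 (MTAP24), or None if an offset fits neither."""
    widest = max(offsets)
    if widest < (1 << 16):
        return 26
    if widest < (1 << 24):
        return 27
    return None
```

For a NALU-time of 100 and an RTP timestamp of 2^32 - 50, the second case applies and the offset is 150, the same value as (100 - (2^32 - 50)) modulo 2^32.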

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    :        NAL unit size          |      DOND     |  TS offset    |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |  TS offset    |                                               |
    +-+-+-+-+-+-+-+-+              NAL unit                         |
    |                                                               |
    |                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                               :
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        
Figure 10. Multi-time aggregation unit for MTAP16

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    :        NAL unit size          |      DOND     |  TS offset    |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |         TS offset             |                               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               |
    |                              NAL unit                         |
    |                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                               :
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        
Figure 11. Multi-time aggregation unit for MTAP24

For the "earliest" multi-time aggregation unit in an MTAP, the timestamp offset MUST be zero. Hence, the RTP timestamp of the MTAP itself is identical to the earliest NALU-time.

Informative note: The "earliest" multi-time aggregation unit is the one that would have the smallest extended RTP timestamp among all the aggregation units of an MTAP if the NAL units contained in the aggregation units were encapsulated in single NAL unit packets. An extended timestamp is a timestamp that has more than 32 bits and is capable of counting the wraparound of the timestamp field, thus enabling one to determine the smallest value if the timestamp wraps. Such an "earliest" aggregation unit may not be the first one in the order in which the aggregation units are encapsulated in an MTAP. The "earliest" NAL unit need not be the same as the first NAL unit in the NAL unit decoding order either.
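
Informative example: With extended timestamps, selecting the "earliest" aggregation unit reduces to taking a minimum. The Python sketch below is illustrative only; the function name mtap_timing is hypothetical, and the input is assumed to be a list of extended NALU-times, that is, integers that keep counting past 2^32 instead of wrapping.

```python
RTP_TS_MODULUS = 1 << 32  # width of the RTP timestamp field


def mtap_timing(extended_times):
    """Return (RTP timestamp, per-unit timestamp offsets) for an MTAP."""
    earliest = min(extended_times)
    rtp_timestamp = earliest % RTP_TS_MODULUS  # value carried in the RTP header
    # The earliest unit gets offset zero; all other offsets are positive.
    offsets = [t - earliest for t in extended_times]
    return rtp_timestamp, offsets
```

For the extended NALU-times 2^32 + 500, 2^32 - 1000, and 2^32 + 2000, the second unit is the earliest although it is not the first in encapsulation order. The RTP timestamp becomes 2^32 - 1000 and the offsets are 1500, 0, and 3000.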

Figure 12 presents an example of an RTP packet carrying a multi-time aggregation packet of type MTAP16 with two multi-time aggregation units, labeled as 1 and 2 in the figure.

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                          RTP Header                           |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |MTAP16 NAL HDR |  decoding order number base   | NALU 1 Size   |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |  NALU 1 Size  |  NALU 1 DOND  |       NALU 1 TS offset        |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |  NALU 1 HDR   |  NALU 1 DATA                                  |
    +-+-+-+-+-+-+-+-+                                               +
    :                                                               :
    +               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |               | NALU 2 SIZE                   |  NALU 2 DOND  |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |       NALU 2 TS offset        |  NALU 2 HDR   |  NALU 2 DATA  |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+               |
    :                                                               :
    |                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                               :...OPTIONAL RTP padding        |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        
Figure 12. An RTP packet including a multi-time aggregation packet of type MTAP16 containing two multi-time aggregation units
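
Informative example: The MTAP16 and MTAP24 layouts differ only in the width of the timestamp offset field, so one routine can read both. The Python sketch below is illustrative only; the function name parse_mtap is hypothetical. It assumes that the RTP header and any RTP padding have already been removed, and, as a labeled assumption, that the NAL unit size field counts only the bytes of the NAL unit and not the DOND and timestamp offset fields, mirroring the single-time aggregation unit. The NALU-time is recovered by adding the offset to the RTP timestamp modulo 2^32, which inverts the timestamp offset formula.

```python
import struct

MTAP16, MTAP24 = 26, 27  # type values from Table 4


def parse_mtap(payload, rtp_timestamp):
    """Return (DON, NALU-time, NAL unit) tuples from an MTAP payload."""
    if len(payload) < 3:
        raise ValueError("payload too short for an MTAP")
    nal_type = payload[0] & 0x1F
    if nal_type == MTAP16:
        ts_len = 2
    elif nal_type == MTAP24:
        ts_len = 3
    else:
        raise ValueError("not an MTAP payload")
    (donb,) = struct.unpack_from("!H", payload, 1)
    units = []
    pos = 3
    while pos < len(payload):
        if pos + 3 + ts_len > len(payload):
            raise ValueError("truncated multi-time aggregation unit header")
        (size,) = struct.unpack_from("!H", payload, pos)
        dond = payload[pos + 2]
        ts_offset = int.from_bytes(payload[pos + 3:pos + 3 + ts_len], "big")
        pos += 3 + ts_len
        # Assumption: size covers the NAL unit only (see lead-in).
        if size == 0 or pos + size > len(payload):
            raise ValueError("NAL unit size exceeds payload")
        don = (donb + dond) % 65536
        nalu_time = (rtp_timestamp + ts_offset) % (1 << 32)
        units.append((don, nalu_time, payload[pos:pos + size]))
        pos += size
    return units
```

The returned list is in encapsulation order. A receiver that needs decoding order would sort by DON, taking the 16-bit wraparound into account.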

Figure 13 presents an example of an RTP packet carrying a multi-time aggregation packet of type MTAP24 with two multi-time aggregation units, labeled as 1 and 2 in the figure.

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                          RTP Header                           |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |MTAP24 NAL HDR |  decoding order number base   | NALU 1 Size   |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |  NALU 1 Size  |  NALU 1 DOND  |       NALU 1 TS offs          |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |NALU 1 TS offs |  NALU 1 HDR   |  NALU 1 DATA                  |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               +
    :                                                               :
    +               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |               | NALU 2 SIZE                   |  NALU 2 DOND  |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |       NALU 2 TS offset                        |  NALU 2 HDR   |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |  NALU 2 DATA                                                  |
    :                                                               :
    |                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                               :...OPTIONAL RTP padding        |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        

Figure 13. An RTP packet including a multi-time aggregation packet of type MTAP24 containing two multi-time aggregation units

图13。一种RTP数据包,包括MTAP24类型的多时间聚合数据包,其中包含两个多时间聚合单元

5.8. Fragmentation Units (FUs)
5.8. 碎片单位(FUs)

This payload type allows fragmenting a NAL unit into several RTP packets. Doing so on the application layer instead of relying on lower-layer fragmentation (e.g., by IP) has the following advantages:

此有效负载类型允许将NAL单元分段为多个RTP数据包。在应用层上这样做而不是依赖较低层的碎片(例如,通过IP)具有以下优点:

o The payload format can transport NAL units larger than 64 kbytes over an IPv4 network. Such NAL units may be present in pre-recorded video, particularly in High-Definition formats (the number of slices per picture is limited, which limits the number of NAL units per picture and may in turn result in large NAL units).

o 有效负载格式能够通过IPv4网络传输大于64 KB的NAL单元,该NAL单元可能存在于预先录制的视频中,特别是在高清晰度格式中(存在每个图片的切片数限制,这导致每个图片的NAL单元限制,这可能导致大NAL单元)。

o The fragmentation mechanism allows fragmenting a single NAL unit and applying generic forward error correction as described in Section 12.5.

o 分段机制允许对单个NAL单元进行分段,并应用第12.5节所述的通用前向纠错。

Fragmentation is defined only for a single NAL unit and not for any aggregation packets. A fragment of a NAL unit consists of an integer number of consecutive octets of that NAL unit. Each octet of the NAL unit MUST be part of exactly one fragment of that NAL unit. Fragments of the same NAL unit MUST be sent in consecutive order with ascending RTP sequence numbers (with no other RTP packets within the same RTP packet stream being sent between the first and last fragment). Similarly, a NAL unit MUST be reassembled in RTP sequence number order.

碎片仅为单个NAL单元定义,不为任何聚合数据包定义。NAL单元的片段由该NAL单元的整数个连续八位字节组成。NAL单元的每个八位组必须恰好是该NAL单元的一个片段的一部分。同一NAL单元的片段必须以递增RTP序列号的连续顺序发送(同一RTP数据包流中没有其他RTP数据包在第一个片段和最后一个片段之间发送)。同样,NAL单元必须按照RTP序列号顺序重新组装。

When a NAL unit is fragmented and conveyed within fragmentation units (FUs), it is referred to as a fragmented NAL unit. STAPs and MTAPs MUST NOT be fragmented. FUs MUST NOT be nested; that is, an FU MUST NOT contain another FU.

当NAL单元被分段并在分段单元(FUs)内传送时,它被称为分段NAL单元。STAP和MTAP不得分割。FU不得嵌套;也就是说,一个FU不得包含另一个FU。

The RTP timestamp of an RTP packet carrying an FU is set to the NALU-time of the fragmented NAL unit.

携带FU的RTP分组的RTP时间戳被设置为分段NAL单元的NALU时间。

Figure 14 presents the RTP payload format for FU-As. An FU-A consists of a fragmentation unit indicator of one octet, a fragmentation unit header of one octet, and a fragmentation unit payload.

图14显示了FU-A的RTP有效负载格式。FU-A由一个八位字节的碎片单元指示符、一个八位字节的碎片单元头和碎片单元有效载荷组成。

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    | FU indicator  |   FU header   |                               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               |
    |                                                               |
    |                         FU payload                            |
    |                                                               |
    |                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                               :...OPTIONAL RTP padding        |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        

Figure 14. RTP payload format for FU-A

图14。FU-A的RTP有效载荷格式

Figure 15 presents the RTP payload format for FU-Bs. An FU-B consists of a fragmentation unit indicator of one octet, a fragmentation unit header of one octet, a decoding order number (DON) (in network byte order), and a fragmentation unit payload. In other words, the structure of FU-B is the same as the structure of FU-A, except for the additional DON field.

图15显示了FU-B的RTP有效负载格式。FU-B由一个八位字节的分段单元指示符、一个八位字节的分段单元报头、解码顺序号(DON)(以网络字节顺序)和分段单元有效载荷组成。换句话说,除了附加的DON字段外,FU-B的结构与FU-A的结构相同。

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    | FU indicator  |   FU header   |               DON             |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-|
    |                                                               |
    |                         FU payload                            |
    |                                                               |
    |                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                               :...OPTIONAL RTP padding        |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        

Figure 15. RTP payload format for FU-B

图15。FU-B的RTP有效载荷格式

NAL unit type FU-B MUST be used in the interleaved packetization mode for the first fragmentation unit of a fragmented NAL unit. NAL unit type FU-B MUST NOT be used in any other case. In other words, in the interleaved packetization mode, each NALU that is fragmented has an FU-B as the first fragment, followed by one or more FU-A fragments.

对于分段NAL单元的第一个分段单元,必须在交织分组模式下使用NAL单元类型FU-B。在任何其他情况下,不得使用NAL单元类型FU-B。换言之,在交织分组模式中,被分段的每个NALU具有一个FU-B作为第一个片段,后跟一个或多个FU-A片段。

The FU indicator octet has the following format:

FU指示器八位字节的格式如下:

       +---------------+
       |0|1|2|3|4|5|6|7|
       +-+-+-+-+-+-+-+-+
       |F|NRI|  Type   |
       +---------------+
        

Values equal to 28 and 29 in the type field of the FU indicator octet identify an FU-A and an FU-B, respectively. The use of the F bit is described in Section 5.3. The value of the NRI field MUST be set according to the value of the NRI field in the fragmented NAL unit.

FU指示剂八位字节类型字段中等于28和29的值分别表示FU-A和FU-B。第5.3节介绍了F位的使用。必须根据分段NAL单元中NRI字段的值设置NRI字段的值。

The FU header has the following format:

FU标题具有以下格式:

      +---------------+
      |0|1|2|3|4|5|6|7|
      +-+-+-+-+-+-+-+-+
      |S|E|R|  Type   |
      +---------------+
        

S: 1 bit. When set to one, the Start bit indicates the start of a fragmented NAL unit. When the following FU payload is not the start of a fragmented NAL unit payload, the Start bit is set to zero.

S:1位。当设置为1时,起始位表示分段NAL单元的开始。当以下FU有效负载不是分段NAL单元有效负载的开始时,开始位设置为零。

E: 1 bit. When set to one, the End bit indicates the end of a fragmented NAL unit, i.e., the last byte of the payload is also the last byte of the fragmented NAL unit. When the following FU payload is not the last fragment of a fragmented NAL unit, the End bit is set to zero.

E:1位。当设置为1时,结束位表示分段NAL单元的结束,即有效负载的最后一个字节也是分段NAL单元的最后一个字节。当以下FU有效负载不是分段NAL单元的最后一个片段时,结束位设置为零。

R: 1 bit. The Reserved bit MUST be equal to 0 and MUST be ignored by the receiver.

R:1位。保留位必须等于0,并且必须被接收器忽略。

Type: 5 bits. The NAL unit payload type as defined in Table 7-1 of [1].

类型:5位。[1]表7-1中定义的NAL单元有效载荷类型。

The value of DON in FU-Bs is selected as described in Section 5.5.

如第5.5节所述,选择FU-B中的DON值。

Informative note: The DON field in FU-Bs allows gateways to fragment NAL units to FU-Bs without organizing the incoming NAL units to the NAL unit decoding order.

资料性说明:FU-B中的DON字段允许网关将NAL单元分段为FU-B,而无需将传入的NAL单元按NAL单元解码顺序排列。

A fragmented NAL unit MUST NOT be transmitted in one FU; that is, the Start bit and End bit MUST NOT both be set to one in the same FU header.

碎片NAL单元不得在一个FU中传输;也就是说,在同一FU头中,起始位和结束位不能同时设置为一。
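As a rough illustration of the FU indicator and FU header layouts described above, the following Python sketch decodes the first two octets of an RTP payload. The names `FuFields` and `parse_fu_octets` are hypothetical and are not defined by this memo; the sketch only mirrors the bit positions shown in the diagrams.

```python
from typing import NamedTuple

FU_A = 28
FU_B = 29


class FuFields(NamedTuple):
    f: int          # F bit of the FU indicator
    nri: int        # NRI field of the FU indicator
    fu_type: int    # 28 (FU-A) or 29 (FU-B)
    start: bool     # S bit of the FU header
    end: bool       # E bit of the FU header
    nal_type: int   # type of the fragmented NAL unit


def parse_fu_octets(payload: bytes) -> FuFields:
    """Decode the FU indicator and FU header at the start of an RTP payload."""
    if len(payload) < 2:
        raise ValueError("payload too short for FU indicator and FU header")
    indicator, header = payload[0], payload[1]
    fu_type = indicator & 0x1F
    if fu_type not in (FU_A, FU_B):
        raise ValueError(f"not a fragmentation unit (type {fu_type})")
    start = bool(header & 0x80)
    end = bool(header & 0x40)
    # The R bit (0x20) is deliberately not inspected: receivers ignore it.
    if start and end:
        raise ValueError("S and E must not both be set in one FU header")
    return FuFields(
        f=indicator >> 7,
        nri=(indicator >> 5) & 0x03,
        fu_type=fu_type,
        start=start,
        end=end,
        nal_type=header & 0x1F,
    )
```

For instance, the octets 0x7C 0x85 decode as an FU-A with NRI 3 that starts a fragmented NAL unit of type 5.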

The FU payload consists of fragments of the payload of the fragmented NAL unit so that if the fragmentation unit payloads of consecutive FUs are sequentially concatenated, the payload of the fragmented NAL unit can be reconstructed. The NAL unit type octet of the fragmented NAL unit is not included as such in the fragmentation unit payload, but rather the information of the NAL unit type octet of the fragmented NAL unit is conveyed in the F and NRI fields of the FU indicator octet of the fragmentation unit and in the type field of the FU header. An FU payload MAY have any number of octets and MAY be empty.

FU有效载荷由分段NAL单元的有效载荷的片段组成,因此如果连续FU的分段单元有效载荷被顺序串联,则可以重构分段NAL单元的有效载荷。片段化NAL单元的NAL单元类型八位组不包括在片段化单元有效载荷中,而是片段化NAL单元的NAL单元类型八位组的信息在片段化单元的FU指示符八位组的F和NRI字段以及FU报头的类型字段中传送。FU有效载荷可以有任意数量的八位字节,并且可以为空。
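The sender side of the rules above can be sketched as follows. This is a minimal Python illustration, not a normative procedure: `fragment_nal_unit` and `max_fu_payload` are hypothetical names, and the size limit counts FU payload octets only (it excludes the FU indicator, the FU header, and the DON field of an FU-B). Passing a DON value models the interleaved mode, where the first fragment is an FU-B.

```python
from typing import List, Optional

FU_A = 28
FU_B = 29


def fragment_nal_unit(nal: bytes, max_fu_payload: int,
                      don: Optional[int] = None) -> List[bytes]:
    """Split one NAL unit into FU payloads.

    With don=None every fragment is an FU-A. With a DON value (interleaved
    mode) the first fragment is an FU-B carrying the 16-bit DON.
    """
    if max_fu_payload < 1:
        raise ValueError("max_fu_payload must be at least 1")
    header, body = nal[0], nal[1:]
    chunks = [body[i:i + max_fu_payload]
              for i in range(0, len(body), max_fu_payload)]
    if len(chunks) < 2:
        # S and E must not share one FU header, so a NAL unit that fits
        # into a single FU is not fragmented at all in this sketch.
        raise ValueError("NAL unit fits in one FU; send it unfragmented")
    f_nri = header & 0xE0        # F and NRI are copied into the FU indicator
    nal_type = header & 0x1F     # the type moves into the FU header
    fus = []
    for index, chunk in enumerate(chunks):
        first = index == 0
        last = index == len(chunks) - 1
        fu_type = FU_B if (first and don is not None) else FU_A
        fu = bytearray([f_nri | fu_type,
                        (0x80 if first else 0) | (0x40 if last else 0)
                        | nal_type])
        if fu_type == FU_B:
            fu += (don & 0xFFFF).to_bytes(2, "big")  # network byte order
        fu += chunk
        fus.append(bytes(fu))
    return fus
```

The NAL unit header octet itself is never copied into a fragment; its fields are carried in the FU indicator and FU header, as the paragraph above describes.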

Informative note: Empty FUs are allowed to reduce the latency of a certain class of senders in nearly lossless environments. These senders can be characterized in that they packetize NALU fragments before the NALU is completely generated and, hence, before the NALU size is known. If zero-length NALU fragments were not allowed, the sender would have to generate at least one bit of data of the following fragment before the current fragment could be sent. Due to the characteristics of H.264, where sometimes several macroblocks occupy zero bits, this is undesirable and can add delay. However, the (potential) use of zero-length NALU fragments should be carefully weighed against the increased risk of the loss of at least a part of the NALU because of the additional packets employed for its transmission.

信息性说明:在几乎无损的环境中,允许使用空FU来减少某类发送方的延迟。这些发送器的特征在于,它们在NALU完全生成之前(因此在NALU大小已知之前)打包NALU片段。如果不允许使用长度为零的NALU片段,则发送方必须生成以下片段的至少一位数据,然后才能发送当前片段。由于H.264的特点,有时几个宏块占用零位,这是不可取的,并且可能增加延迟。然而,应仔细权衡零长度NALU片段的(潜在)使用,以避免由于其传输所使用的额外分组而导致至少一部分NALU丢失的风险增加。

If a fragmentation unit is lost, the receiver SHOULD discard all following fragmentation units in transmission order corresponding to the same fragmented NAL unit.

如果碎片单元丢失,则接收器应按照与相同碎片NAL单元对应的传输顺序丢弃所有后续碎片单元。

A receiver in an endpoint or in a MANE MAY aggregate the first n-1 fragments of a NAL unit to an (incomplete) NAL unit, even if fragment n of that NAL unit is not received. In this case, the forbidden_zero_bit of the NAL unit MUST be set to one to indicate a syntax violation.

端点或MANE中的接收器可以将NAL单元的前n-1个片段聚合为(不完整的)NAL单元,即使没有接收到该NAL单元的片段n。在这种情况下,NAL单元的forbidden_zero_bit必须设置为1,以指示语法冲突。
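The loss handling described in the two preceding paragraphs might look like the Python sketch below. It assumes the caller has already grouped the FU-A payloads of one fragmented NAL unit and sorted them in sending order; `reassemble_fu_a` is a hypothetical helper name. Returning a partial NAL unit is only one permitted behavior, since a receiver may equally drop the whole unit.

```python
from typing import Iterable, Optional, Tuple


def reassemble_fu_a(packets: Iterable[Tuple[int, bytes]]) -> Optional[bytes]:
    """Rebuild one NAL unit from (RTP sequence number, FU-A payload) pairs.

    Returns None when the first fragment is missing. When a later fragment
    is missing, the received prefix is returned with forbidden_zero_bit set.
    """
    nal = bytearray()
    expected_seq = None
    complete = False
    for seq, payload in packets:
        indicator, header = payload[0], payload[1]
        if expected_seq is None:
            if not header & 0x80:
                return None  # start fragment lost: nothing usable
            # NAL header: F and NRI from the indicator, type from the header.
            nal.append((indicator & 0xE0) | (header & 0x1F))
        elif seq != expected_seq:
            break  # gap: discard this and all following fragments
        nal += payload[2:]
        expected_seq = (seq + 1) & 0xFFFF  # sequence numbers wrap at 16 bits
        if header & 0x40:
            complete = True
            break
    if not nal:
        return None
    if not complete:
        nal[0] |= 0x80  # forbidden_zero_bit = 1 signals a syntax violation
    return bytes(nal)
```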

6. Packetization Rules
6. 打包规则

The packetization modes are introduced in Section 5.2. The packetization rules common to more than one of the packetization modes are specified in Section 6.1. The packetization rules for the single NAL unit mode, the non-interleaved mode, and the interleaved mode are specified in Sections 6.2, 6.3, and 6.4, respectively.

第5.2节介绍了打包模式。第6.1节规定了一种以上包装模式通用的包装规则。第6.2、6.3和6.4节分别规定了单NAL单元模式、非交织模式和交织模式的分组规则。

6.1. Common Packetization Rules
6.1. 通用分组规则

All senders MUST enforce the following packetization rules, regardless of the packetization mode in use:

无论使用何种打包模式,所有发件人都必须强制执行以下打包规则:

o Coded slice NAL units or coded slice data partition NAL units belonging to the same coded picture (and thus sharing the same RTP timestamp value) MAY be sent in any order; however, for delay-critical systems, they SHOULD be sent in their original decoding order to minimize the delay. Note that the decoding order is the order of the NAL units in the bitstream.

o 可以以任何顺序发送属于相同编码图片(并因此共享相同RTP时间戳值)的编码片NAL单元或编码片数据分区NAL单元;然而,对于延迟关键系统,它们应该按照其原始解码顺序发送,以最小化延迟。注意,解码顺序是比特流中NAL单元的顺序。

o Parameter sets are handled in accordance with the rules and recommendations given in Section 8.4.

o 根据第8.4节给出的规则和建议处理参数集。

o MANEs MUST NOT duplicate any NAL unit except for sequence or picture parameter set NAL units, as neither this memo nor the H.264 specification provides means to identify duplicated NAL units. Sequence and picture parameter set NAL units MAY be duplicated to make their correct reception more probable, but any such duplication MUST NOT affect the contents of any active sequence or picture parameter set. Duplication SHOULD be performed on the application layer and not by duplicating RTP packets (with identical sequence numbers).

o 除序列或图片参数集NAL单元外,MANE不得复制任何NAL单元,因为本备忘录和H.264规范均未提供识别重复NAL单元的方法。序列和图片参数集NAL单元可被复制,以使其更可能被正确接收,但任何此类复制不得影响任何活动序列或图片参数集的内容。复制应在应用层执行,而不是通过复制RTP数据包(具有相同的序列号)。

Senders using the non-interleaved mode and the interleaved mode MUST enforce the following packetization rule:

使用非交织模式和交织模式的发送方必须执行以下分组规则:

o In an RTP translator, MANEs MAY convert single NAL unit packets into one aggregation packet, convert an aggregation packet into several single NAL unit packets, or mix both concepts. The RTP translator SHOULD take into account at least the following parameters: path MTU size, unequal protection mechanisms (e.g., through packet-based FEC according to RFC 5109 [18], especially for sequence and picture parameter set NAL units and coded slice data partition A NAL units), bearable latency of the system, and buffering capabilities of the receiver.

o 在RTP转换器中,MANE可以将单个NAL单元分组转换为一个聚合分组,将聚合分组转换为多个单个NAL单元分组,或者混合这两个概念。RTP转换器应至少考虑以下参数:路径MTU大小、不等保护机制(例如,根据RFC 5109[18],通过基于分组的FEC),尤其是序列和图片参数集NAL单元和编码切片数据分区NAL单元)、系统的可承受延迟,以及接收器的缓冲能力。

Informative note: An RTP translator is required to handle RTP Control Protocol (RTCP) as per RFC 3550.

资料性说明:根据RFC 3550,需要一个RTP转换器来处理RTP控制协议(RTCP)。

6.2. Single NAL Unit Mode
6.2. 单NAL单元模式

This mode is in use when the value of the OPTIONAL packetization-mode media type parameter is equal to 0 or the packetization-mode is not present. All receivers MUST support this mode. It is primarily intended for low-delay applications that are compatible with systems using ITU-T Recommendation H.241 [3] (see Section 12.1). Only single NAL unit packets MAY be used in this mode. STAPs, MTAPs, and FUs MUST NOT be used. The transmission order of single NAL unit packets MUST comply with the NAL unit decoding order.

当可选的packetization-mode媒体类型参数的值等于0或packetization-mode不存在时,使用此模式。所有接收器必须支持此模式。它主要用于与使用ITU-T建议H.241[3]的系统兼容的低延迟应用(见第12.1节)。在此模式中只能使用单个NAL单元数据包。不得使用STAP、MTAP和FU。单个NAL单元数据包的传输顺序必须符合NAL单元解码顺序。

6.3. Non-Interleaved Mode
6.3. 非交织模式

This mode is in use when the value of the OPTIONAL packetization-mode media type parameter is equal to 1. This mode SHOULD be supported. It is primarily intended for low-delay applications. Only single NAL unit packets, STAP-As, and FU-As MAY be used in this mode. STAP-Bs, MTAPs, and FU-Bs MUST NOT be used. The transmission order of NAL units MUST comply with the NAL unit decoding order.

当可选的packetization-mode媒体类型参数的值等于1时,使用此模式。应支持此模式。它主要用于低延迟应用。在此模式中只能使用单个NAL单元数据包、STAP-A和FU-A。不得使用STAP-B、MTAP和FU-B。NAL单元的传输顺序必须符合NAL单元解码顺序。

6.4. Interleaved Mode
6.4. 交织模式

This mode is in use when the value of the OPTIONAL packetization-mode media type parameter is equal to 2. Some receivers MAY support this mode. STAP-Bs, MTAPs, FU-As, and FU-Bs MAY be used. STAP-As and single NAL unit packets MUST NOT be used. The transmission order of packets and NAL units is constrained as specified in Section 5.5.

当可选的packetization-mode媒体类型参数的值等于2时,使用此模式。一些接收机可能支持这种模式。可以使用STAP-B、MTAP、FU-A和FU-B。不得使用STAP-A和单个NAL单元数据包。数据包和NAL单元的传输顺序受第5.5节规定的约束。
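The packet types permitted in each of the three modes of Sections 6.2 to 6.4 can be summarized in a small lookup table. The Python sketch below is illustrative only; `ALLOWED_TYPES` and `is_allowed` are hypothetical names, and the numeric type values (1 to 23 for single NAL unit packets, 24 STAP-A, 25 STAP-B, 26 MTAP16, 27 MTAP24, 28 FU-A, 29 FU-B) are taken from the packet type definitions elsewhere in this memo.

```python
SINGLE_NAL = set(range(1, 24))  # NAL unit types 1..23
STAP_A, STAP_B, MTAP16, MTAP24, FU_A, FU_B = 24, 25, 26, 27, 28, 29

ALLOWED_TYPES = {
    0: SINGLE_NAL,                             # single NAL unit mode
    1: SINGLE_NAL | {STAP_A, FU_A},            # non-interleaved mode
    2: {STAP_B, MTAP16, MTAP24, FU_A, FU_B},   # interleaved mode
}


def is_allowed(packetization_mode: int, payload: bytes) -> bool:
    """Return True if the payload's packet type may appear in the mode."""
    if not payload:
        return False
    packet_type = payload[0] & 0x1F
    return packet_type in ALLOWED_TYPES.get(packetization_mode, set())
```

A receiver could use such a check to reject packets that violate the negotiated packetization-mode before attempting de-packetization.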

7. De-Packetization Process
7. 去包装过程

The de-packetization process is implementation dependent. Therefore, the following description should be seen as an example of a suitable implementation. Other schemes may also be used as long as the output for the same input is the same as the process described below. The same output means that the resulting NAL units and their order are identical. Optimizations relative to the described algorithms are likely possible. Section 7.1 presents the de-packetization process for the single NAL unit and non-interleaved packetization modes, whereas Section 7.2 describes the process for the interleaved mode. Section 7.3 includes additional de-packetization guidelines for intelligent receivers.

反打包过程取决于实现。因此,应将以下描述视为适当实现的示例。只要相同输入的输出与下面描述的过程相同,也可以使用其他方案。相同的输出意味着生成的NAL单位及其顺序相同。与所述算法相关的优化是可能的。第7.1节介绍了单个NAL单元和非交织分组模式的解分组过程,而第7.2节介绍了交织模式的过程。第7.3节包括智能接收器的附加反包装指南。

All normal RTP mechanisms related to buffer management apply. In particular, duplicated or outdated RTP packets (as indicated by the RTP sequence number and the RTP timestamp) are removed. To determine the exact time for decoding, factors such as a possible intentional delay to allow for proper inter-stream synchronization must be factored in.

所有与缓冲区管理相关的正常RTP机制都适用。特别是,删除重复或过时的RTP数据包(如RTP序列号和RTP时间戳所示)。为了确定解码的确切时间,必须考虑一些因素,例如允许适当的流间同步的可能故意延迟。

7.1. Single NAL Unit and Non-Interleaved Mode
7.1. 单NAL单元和非交织模式

The receiver includes a receiver buffer to compensate for transmission delay jitter. The receiver stores incoming packets in reception order into the receiver buffer. Packets are de-packetized in RTP sequence number order. If a de-packetized packet is a single NAL unit packet, the NAL unit contained in the packet is passed directly to the decoder. If a de-packetized packet is an STAP-A, the NAL units contained in the packet are passed to the decoder in the order in which they are encapsulated in the packet. For all the FU-A packets containing fragments of a single NAL unit, the de-packetized fragments are concatenated in their sending order to recover the NAL unit, which is then passed to the decoder.

接收机包括用于补偿传输延迟抖动的接收机缓冲器。接收机按接收顺序将传入的数据包存储到接收机缓冲器中。数据包按RTP序列号顺序进行反打包。如果解分组分组是单个NAL单元分组,则分组中包含的NAL单元直接传递给解码器。如果解除分组的分组是STAP-a,则分组中包含的NAL单元按照它们被封装在分组中的顺序传递给解码器。对于包含单个NAL单元的片段的所有FU-A分组,解分组的片段按其发送顺序连接以恢复NAL单元,然后将其传递给解码器。

Informative note: If the decoder supports arbitrary slice order, coded slices of a picture can be passed to the decoder in any order, regardless of their reception and transmission order.

资料性说明:如果解码器支持任意切片顺序,则图片的编码切片可以以任何顺序传递给解码器,而不管它们的接收和传输顺序如何。
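A simplified view of the process in Section 7.1 is sketched below in Python. It assumes that the payloads are already sorted in RTP sequence number order with no losses, and that each STAP-A aggregation unit is a 16-bit size in network byte order followed by the NAL unit, as defined for STAP-A earlier in this memo. `depacketize` is a hypothetical name, and a production receiver would add loss and error handling.

```python
from typing import Iterable, Iterator


def depacketize(payloads: Iterable[bytes]) -> Iterator[bytes]:
    """Yield NAL units from RTP payloads given in RTP sequence number order."""
    fragments = None  # bytearray of the NAL unit being reassembled, if any
    for payload in payloads:
        if not payload:
            continue
        packet_type = payload[0] & 0x1F
        if 1 <= packet_type <= 23:           # single NAL unit packet
            yield payload
        elif packet_type == 24:              # STAP-A
            pos = 1                          # skip the STAP-A NAL header
            while pos + 2 <= len(payload):
                size = int.from_bytes(payload[pos:pos + 2], "big")
                pos += 2
                if size == 0 or pos + size > len(payload):
                    break                    # malformed aggregation unit
                yield payload[pos:pos + size]
                pos += size
        elif packet_type == 28 and len(payload) >= 2:   # FU-A
            header = payload[1]
            if header & 0x80:                # S bit: start a new NAL unit
                fragments = bytearray(
                    [(payload[0] & 0xE0) | (header & 0x1F)])
            if fragments is None:
                continue                     # start fragment was never seen
            fragments += payload[2:]
            if header & 0x40:                # E bit: NAL unit is complete
                yield bytes(fragments)
                fragments = None
```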

7.2. Interleaved Mode
7.2. 交织模式

The general concept behind these de-packetization rules is to reorder NAL units from transmission order to the NAL unit decoding order.

这些去分组规则背后的一般概念是将NAL单元从传输顺序重新排序为NAL单元解码顺序。

The receiver includes a receiver buffer, which is used to compensate for transmission delay jitter and to reorder NAL units from transmission order to the NAL unit decoding order. In this section, the receiver operation is described under the assumption that there is no transmission delay jitter. To differentiate the receiver buffer from a practical receiver buffer that is also used for compensation of transmission delay jitter, the receiver buffer is hereafter called the de-interleaving buffer in this section. Receivers SHOULD also prepare for transmission delay jitter, i.e., either reserve separate buffers for transmission delay jitter buffering and de-interleaving buffering or use a receiver buffer for both transmission delay jitter and de-interleaving. Moreover, receivers SHOULD take transmission delay jitter into account in the buffering operation, e.g., by additional initial buffering before starting of decoding and playback.

接收机包括接收机缓冲器,其用于补偿传输延迟抖动并将NAL单元从传输顺序重新排序到NAL单元解码顺序。在本节中,在假设没有传输延迟抖动的情况下描述接收机操作。为了将接收机缓冲器与也用于补偿传输延迟抖动的实际接收机缓冲器区分开来,在本节中,接收机缓冲器在下文中称为解交织缓冲器。接收机还应为传输延迟抖动做好准备,即,为传输延迟抖动缓冲和解交织缓冲保留单独的缓冲器,或为传输延迟抖动和解交织使用接收机缓冲器。此外,接收器应在缓冲操作中考虑传输延迟抖动,例如,在开始解码和回放之前通过附加初始缓冲。

This section is organized as follows: Subsection 7.2.1 presents how to calculate the size of the de-interleaving buffer. Subsection 7.2.2 specifies how the receiver arranges received NAL units into the NAL unit decoding order.

本节组织如下:第7.2.1小节介绍了如何计算解交错缓冲器的大小。第7.2.2小节规定了如何将接收到的NAL单元组织到NAL单元解码顺序的接收器过程。

7.2.1. Size of the De-Interleaving Buffer
7.2.1. 解交错缓冲区的大小

In either Offer/Answer or declarative Session Description Protocol (SDP) usage, the sprop-deint-buf-req media type parameter signals the requirement for the de-interleaving buffer size. Therefore, it is RECOMMENDED to set the de-interleaving buffer size, in terms of number of bytes, equal to or greater than the value of the sprop-deint-buf-req media type parameter.

在提供/应答或声明性会话描述协议(SDP)使用中,sprop-deint-buf-req媒体类型参数表示对解交织缓冲区大小的要求。因此,建议将解交织缓冲区大小(以字节数计)设置为等于或大于sprop-deint-buf-req媒体类型参数的值。

When the SDP Offer/Answer model or any other capability exchange procedure is used in session setup, the properties of the received stream SHOULD be such that the receiver capabilities are not exceeded. In the SDP Offer/Answer model, the receiver can indicate its capabilities to allocate a de-interleaving buffer with the deint-buf-cap media type parameter. See Section 8.1 for further information on the deint-buf-cap and sprop-deint-buf-req media type parameters and Section 8.2.2 for further information on their use in the SDP Offer/Answer model.

在会话设置中使用SDP提供/应答模型或任何其他能力交换过程时,接收流的属性应确保不会超过接收器的能力。在SDP提供/应答模型中,接收机可以使用deint-buf-cap媒体类型参数指示其分配解交织缓冲器的能力。有关deint-buf-cap和sprop-deint-buf-req媒体类型参数的更多信息,请参见第8.1节;有关在SDP提供/应答模型中使用这些参数的更多信息,请参见第8.2.2节。

7.2.2. De-Interleaving Process
7.2.2. 解交织过程

There are two buffering states in the receiver: initial buffering and buffering while playing. Initial buffering occurs when the RTP session is initialized. After initial buffering, decoding and playback are started, and the buffering-while-playing mode is used.

接收器中有两种缓冲状态:初始缓冲和播放时缓冲。初始化RTP会话时发生初始缓冲。初始缓冲后,开始解码和播放,并使用播放时缓冲模式。

Regardless of the buffering state, the receiver stores incoming NAL units, in reception order, in the de-interleaving buffer as follows. NAL units of aggregation packets are stored in the de-interleaving buffer individually. The value of DON is calculated and stored for each NAL unit.

不管缓冲状态如何,接收机按照接收顺序将传入的NAL单元存储在解交错缓冲器中,如下所示。聚合数据包的NAL单元分别存储在解交织缓冲区中。计算并存储每个NAL单位的DON值。

The receiver operation is described below with the help of the following functions and constants:

在以下函数和常数的帮助下,接收器操作如下所述:

o Function AbsDON is specified in Section 8.1.

o 第8.1节规定了功能AbsDON。

o Function don_diff is specified in Section 5.5.

o 功能don_diff在第5.5节中有规定。

o Constant N is the value of the OPTIONAL sprop-interleaving-depth media type parameter (see Section 8.1) incremented by 1.

o 常数N是可选的sprop-interleaving-depth媒体类型参数(见第8.1节)的值加1。

Initial buffering lasts until one of the following conditions is fulfilled:

初始缓冲持续到满足以下条件之一:

o There are N or more VCL NAL units in the de-interleaving buffer.

o 解交织缓冲器中有N个或多个VCL NAL单元。

o If sprop-max-don-diff is present, don_diff(m,n) is greater than the value of sprop-max-don-diff, in which n corresponds to the NAL unit having the greatest value of AbsDON among the received NAL units and m corresponds to the NAL unit having the smallest value of AbsDON among the received NAL units.

o 如果存在sprop-max-don-diff,则don_diff(m,n)大于sprop-max-don-diff的值,其中n对应于接收到的NAL单元中AbsDON值最大的NAL单元,m对应于接收到的NAL单元中AbsDON值最小的NAL单元。

o Initial buffering has lasted for a duration equal to or greater than the value of the OPTIONAL sprop-init-buf-time media type parameter.

o 初始缓冲的持续时间等于或大于可选的sprop-init-buf-time媒体类型参数的值。

The NAL units to be removed from the de-interleaving buffer are determined as follows:

要从解交织缓冲器中移除的NAL单元确定如下:

o If the de-interleaving buffer contains at least N VCL NAL units, NAL units are removed from the de-interleaving buffer and passed to the decoder in the order specified below until the buffer contains N-1 VCL NAL units.

o 如果解交错缓冲器包含至少N个VCL NAL单元,则NAL单元将从解交错缓冲器中移除,并按照下面指定的顺序传递给解码器,直到缓冲器包含N-1个VCL NAL单元。

o If sprop-max-don-diff is present, all NAL units m for which don_diff(m,n) is greater than sprop-max-don-diff are removed from the de-interleaving buffer and passed to the decoder in the order specified below. Herein, n corresponds to the NAL unit having the greatest value of AbsDON among the NAL units in the de-interleaving buffer.

o 如果存在sprop-max-don-diff,则don_diff(m,n)大于sprop-max-don-diff的所有NAL单元m将从解交织缓冲器中移除,并按照下面指定的顺序传递给解码器。这里,n对应于在解交织缓冲器中的NAL单元中具有最大AbsDON值的NAL单元。
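The two removal rules above can be sketched in Python as follows. Several points are assumptions made for illustration: `don_diff` is transcribed from its definition in Section 5.5 (outside this section), AbsDON is taken as a precomputed input rather than derived as in Section 8.1, and units leave the buffer in ascending AbsDON order as a stand-in for the DON-distance ordering specified next. `Entry` and `units_to_remove` are hypothetical names.

```python
from typing import List, NamedTuple, Optional


class Entry(NamedTuple):
    don: int        # 16-bit decoding order number
    abs_don: int    # AbsDON, assumed to be computed elsewhere
    is_vcl: bool    # True for VCL NAL units
    nal: bytes


def don_diff(don_m: int, don_n: int) -> int:
    """Signed DON difference with wraparound; positive if n follows m."""
    if don_m == don_n:
        return 0
    if don_m < don_n:
        gap = don_n - don_m
        return gap if gap < 32768 else -(don_m + 65536 - don_n)
    gap = don_m - don_n
    return 65536 - don_m + don_n if gap >= 32768 else -gap


def units_to_remove(buffer: List[Entry], n_const: int,
                    max_don_diff: Optional[int]) -> List[Entry]:
    """Select the NAL units that should leave the de-interleaving buffer."""
    remaining = sorted(buffer, key=lambda e: e.abs_don)
    removed: List[Entry] = []
    # Rule 1: drain until only N-1 VCL NAL units remain in the buffer.
    vcl_count = sum(e.is_vcl for e in remaining)
    while remaining and vcl_count >= n_const:
        entry = remaining.pop(0)
        vcl_count -= entry.is_vcl
        removed.append(entry)
    # Rule 2: remove every unit m with don_diff(m, n) > sprop-max-don-diff,
    # where n is the buffered unit with the greatest AbsDON.
    if max_don_diff is not None and buffer:
        newest = max(buffer, key=lambda e: e.abs_don)
        late = [e for e in remaining
                if don_diff(e.don, newest.don) > max_don_diff]
        removed += late
    return removed
```

Here `n_const` corresponds to the constant N and `max_don_diff` to sprop-max-don-diff, with `None` meaning the parameter is absent.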

The order in which NAL units are passed to the decoder is specified as follows:

NAL单元传递给解码器的顺序规定如下:

o Let PDON be a variable that is initialized to 0 at the beginning of the RTP session.

o 设PDON为在RTP会话开始时初始化为0的变量。

o For each NAL unit associated with a value of DON, a DON distance is calculated as follows. If the value of DON of the NAL unit is larger than the value of PDON, the DON distance is equal to DON - PDON. Otherwise, the DON distance is equal to 65535 - PDON + DON + 1.

o 对于与DON值相关联的每个NAL单元,DON距离计算如下。如果NAL单元的DON值大于PDON值,则DON距离等于DON-PDON。否则,DON距离等于65535-PDON+DON+1。

o NAL units are delivered to the decoder in ascending order of DON distance. If several NAL units share the same value of DON distance, they can be passed to the decoder in any order.

o NAL单元按DON距离的升序传送到解码器。如果多个NAL单元共享相同的DON距离值,则可以按任意顺序将它们传递给解码器。

o When a desired number of NAL units have been passed to the decoder, the value of PDON is set to the value of DON for the last NAL unit passed to the decoder.

o 当已将所需数量的NAL单元传递给解码器时,PDON的值被设置为传递给解码器的最后一个NAL单元的DON值。
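The DON distance computation and the resulting delivery order can be expressed compactly. The following Python sketch follows the arithmetic given above; the function names and the representation of NAL units as (DON, bytes) pairs are hypothetical choices made for the example.

```python
from typing import List, Tuple


def don_distance(don: int, pdon: int) -> int:
    """DON distance of a NAL unit relative to the previous DON (PDON)."""
    if don > pdon:
        return don - pdon
    return 65535 - pdon + don + 1


def delivery_order(units: List[Tuple[int, bytes]], pdon: int
                   ) -> Tuple[List[bytes], int]:
    """Sort (DON, NAL unit) pairs for the decoder and return the new PDON."""
    ordered = sorted(units, key=lambda unit: don_distance(unit[0], pdon))
    # PDON becomes the DON of the last NAL unit passed to the decoder.
    new_pdon = ordered[-1][0] if ordered else pdon
    return [nal for _, nal in ordered], new_pdon
```

With PDON equal to 65534, units carrying DON values 1, 65535, and 0 have DON distances 3, 1, and 2, so they are delivered in the order 65535, 0, 1, which shows how the distance handles wraparound of the 16-bit DON.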

7.3. Additional De-Packetization Guidelines
7.3. 附加反包装指南

The following additional de-packetization rules may be used to implement an operational H.264 de-packetizer:

以下附加反打包规则可用于实现可操作的H.264反打包器:

o Intelligent RTP receivers (e.g., in gateways) may identify lost coded slice data partitions A (DPAs). If a lost DPA is detected, after taking into account possible retransmission and FEC, a gateway may decide not to send the corresponding coded slice data partitions B and C, as their information is meaningless for H.264 decoders. In this way, a MANE can reduce network load by discarding useless packets without parsing a complex bitstream.

o 智能RTP接收器(例如,在网关中)可识别丢失的编码片数据分区A(DPA)。如果检测到丢失的DPA,则在考虑可能的重传和FEC之后,网关可以决定不发送相应的编码片数据分区B和C,因为它们的信息对于H.264解码器来说是无意义的。通过这种方式,MANE可以通过丢弃无用的包而不解析复杂的比特流来减少网络负载。

o Intelligent RTP receivers (e.g., in gateways) may identify lost FUs. If a lost FU is found, a gateway may decide not to send the following FUs of the same fragmented NAL unit, as their information is meaningless for H.264 decoders. In this way, a MANE can reduce network load by discarding useless packets without parsing a complex bitstream.

o 智能RTP接收器(例如,在网关中)可以识别丢失的FU。如果发现丢失的FU,网关可能会决定不发送相同分段NAL单元的以下FU,因为它们的信息对于H.264解码器没有意义。通过这种方式,MANE可以通过丢弃无用的包而不解析复杂的比特流来减少网络负载。

o Intelligent receivers having to discard packets or NALUs should first discard all packets/NALUs in which the value of the NRI field of the NAL unit type octet is equal to 0. This will minimize the impact on user experience and keep the reference pictures intact. If more packets have to be discarded, then packets with a numerically lower NRI value should be discarded before packets with a numerically higher NRI value. However, discarding any packets with an NRI bigger than 0 very likely leads to decoder drift and SHOULD be avoided.

o 必须丢弃数据包或NALU的智能接收器应首先丢弃NAL单元类型八位字节的NRI字段值等于0的所有数据包/NALU。这将最大限度地减少对用户体验的影响,并保持参考图片的完整性。如果必须丢弃更多的数据包,则NRI值数值较低的数据包应在NRI值数值较高的数据包之前丢弃。然而,丢弃NRI大于0的任何数据包很可能会导致解码器漂移,应该避免。
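The discard priority in the last guideline can be illustrated with a short Python sketch. It assumes every payload is non-empty and that the NRI field of the first payload octet reflects the importance of the packet as a whole; `choose_discards` is a hypothetical helper, not something defined by this memo.

```python
from typing import List


def choose_discards(payloads: List[bytes], count: int) -> List[int]:
    """Return the indices of `count` payloads to drop, lowest NRI first."""
    def nri(index: int) -> int:
        return (payloads[index][0] >> 5) & 0x03

    # sorted() is stable, so payloads with equal NRI keep arrival order.
    by_priority = sorted(range(len(payloads)), key=nri)
    return sorted(by_priority[:count])
```

A caller that must shed load would drop the returned indices and, where possible, keep `count` small enough that only payloads with NRI equal to 0 are affected.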

8. Payload Format Parameters
8. 有效载荷格式参数

This section specifies the parameters that MAY be used to select optional features of the payload format and certain features of the bitstream. The parameters are specified here as part of the media subtype registration for the ITU-T H.264 | ISO/IEC 14496-10 codec. A mapping of the parameters into the Session Description Protocol (SDP) [6] is also provided for applications that use SDP. Equivalent parameters could be defined elsewhere for use with control protocols that do not use SDP.

本节规定了可用于选择有效负载格式的可选特征和比特流的某些特征的参数。此处指定的参数是ITU-T H.264 | ISO/IEC 14496-10编解码器的媒体子类型注册的一部分。还为使用SDP的应用程序提供了参数到会话描述协议(SDP)[6]的映射。可以在其他地方定义等效参数,以便与不使用SDP的控制协议一起使用。

Some parameters provide a receiver with the properties of the stream that will be sent. The names of all these parameters start with "sprop" for stream properties. Some of these "sprop" parameters are limited by other payload or codec configuration parameters. For example, the sprop-parameter-sets parameter is constrained by the profile-level-id parameter.

一些参数向接收器提供将要发送的流的属性。对于流属性,所有这些参数的名称都以“sprop”开头。其中一些“sprop”参数受到其他有效负载或编解码器配置参数的限制。例如,sprop-parameter-sets参数受profile-level-id参数的约束。

8.1. Media Type Registration
8.1. 媒体类型注册

The media subtype for the ITU-T H.264 | ISO/IEC 14496-10 codec has been allocated from the IETF tree.

ITU-T H.264 | ISO/IEC 14496-10编解码器的媒体子类型已从IETF树中分配。

Media Type name: video

媒体类型名称:视频

Media subtype name: H264

媒体子类型名称:H264

Required parameters: none

所需参数:无

OPTIONAL parameters:

可选参数:

profile-level-id: A base16 [7] (hexadecimal) representation of the following three bytes in the sequence parameter set NAL unit is specified in [1]: 1) profile_idc, 2) a byte herein referred to as profile-iop, composed of the values of constraint_set0_flag, constraint_set1_flag, constraint_set2_flag, constraint_set3_flag, constraint_set4_flag, constraint_set5_flag, and reserved_zero_2bits in bit-significance order, starting from the most-significant bit, and 3) level_idc. Note that reserved_zero_2bits is required to be equal to 0 in [1], but other values for it may be specified in the future by ITU-T or ISO/IEC.

The profile-level-id parameter indicates the default sub-profile (i.e., the subset of coding tools that may have been used to generate the stream or that the receiver supports) and the default level (i.e., the level of the stream or the level that the receiver supports).

The default sub-profile is indicated collectively by the profile_idc byte and some fields in the profile-iop byte. Depending on the values of the fields in the profile-iop byte, the default sub-profile may be the set of coding tools supported by one profile, or a common subset of coding tools of multiple profiles, as specified in Section 7.4.2.1.1 of [1]. The default level is indicated by the level_idc byte, and, when profile_idc is equal to 66, 77, or 88 (the Baseline, Main, or Extended profile) and level_idc is equal to 11, additionally by bit 4 (constraint_set3_flag) of the profile-iop byte. When profile_idc is equal to 66, 77, or 88 (the Baseline, Main, or Extended profile), level_idc is equal to 11, and bit 4 (constraint_set3_flag) of the profile-iop byte is equal to 1, the default level is Level 1b.
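
Informative example (not part of the payload format): the byte layout and the Level 1b rule described above can be sketched in Python as follows. The function name and the returned dictionary are hypothetical, chosen for illustration only. The sketch applies only the Level 1b rule stated in this section for profile_idc 66, 77, or 88; for all other cases it simply reports level_idc divided by 10.

```python
def parse_profile_level_id(value: str) -> dict:
    """Split a profile-level-id value (six hex digits) into its three bytes."""
    raw = bytes.fromhex(value)
    if len(raw) != 3:
        raise ValueError("profile-level-id must encode exactly three bytes")
    profile_idc, profile_iop, level_idc = raw

    # profile-iop carries constraint_set0_flag in its most-significant bit,
    # so constraint_set3_flag is bit 4 (mask 0x10).
    constraint_set3_flag = bool(profile_iop & 0x10)

    # Level 1b rule for the Baseline (66), Main (77), and Extended (88) profiles.
    if profile_idc in (66, 77, 88) and level_idc == 11 and constraint_set3_flag:
        level = "1b"
    else:
        level = f"{level_idc / 10:g}"

    return {
        "profile_idc": profile_idc,
        "profile_iop": profile_iop,
        "level_idc": level_idc,
        "level": level,
    }
```

For instance, the value 42e01f splits into profile_idc 0x42 (66), profile-iop 0xE0, and level_idc 0x1F (31).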

Table 5 lists all profiles defined in Annex A of [1] and, for each of the profiles, the possible combinations of profile_idc and profile-iop that represent the same sub-profile.

Table 5. Combinations of profile_idc and profile-iop representing the same sub-profile corresponding to the full set of coding tools supported by one profile. In the following, x may be either 0 or 1, while the profile names are indicated as follows. CB: Constrained Baseline profile, B: Baseline profile, M: Main profile, E: Extended profile, H: High profile, H10: High 10 profile, H42: High 4:2:2 profile, H44: High 4:4:4 Predictive profile, H10I: High 10 Intra profile, H42I: High 4:2:2 Intra profile, H44I: High 4:4:4 Intra profile, and C44I: CAVLC 4:4:4 Intra profile.

      Profile     profile_idc        profile-iop
                  (hexadecimal)      (binary)

      CB          42 (B)             x1xx0000
         same as: 4D (M)             1xxx0000
         same as: 58 (E)             11xx0000
      B           42 (B)             x0xx0000
         same as: 58 (E)             10xx0000
      M           4D (M)             0x0x0000
      E           58                 00xx0000
      H           64                 00000000
      H10         6E                 00000000
      H42         7A                 00000000
      H44         F4                 00000000
      H10I        6E                 00010000
      H42I        7A                 00010000
      H44I        F4                 00010000
      C44I        2C                 00010000

For example, in the table above, profile_idc equal to 58 (Extended) with profile-iop equal to 11xx0000 indicates the same sub-profile corresponding to profile_idc equal to 42 (Baseline) with profile-iop equal to x1xx0000. Note that other combinations of profile_idc and profile-iop (not listed in Table 5) may represent a sub-profile equivalent to the common subset of coding tools for more than one profile. Note also that a decoder conforming to a certain profile may be able to decode bitstreams conforming to other profiles.
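
Informative example (not part of the payload format): a receiver that wants to recognize the equivalent combinations in Table 5 could match the profile-iop byte against the binary patterns, treating "x" as a wildcard. The sketch below is one possible way to do this in Python; the names TABLE_5, iop_matches, and lookup_profile are hypothetical. Combinations not listed in Table 5 return None here, although, as noted above, they may still represent a common subset of several profiles.

```python
# (profile abbreviation, profile_idc, profile-iop pattern), copied from Table 5.
TABLE_5 = [
    ("CB", 0x42, "x1xx0000"),
    ("CB", 0x4D, "1xxx0000"),
    ("CB", 0x58, "11xx0000"),
    ("B", 0x42, "x0xx0000"),
    ("B", 0x58, "10xx0000"),
    ("M", 0x4D, "0x0x0000"),
    ("E", 0x58, "00xx0000"),
    ("H", 0x64, "00000000"),
    ("H10", 0x6E, "00000000"),
    ("H42", 0x7A, "00000000"),
    ("H44", 0xF4, "00000000"),
    ("H10I", 0x6E, "00010000"),
    ("H42I", 0x7A, "00010000"),
    ("H44I", 0xF4, "00010000"),
    ("C44I", 0x2C, "00010000"),
]


def iop_matches(pattern: str, profile_iop: int) -> bool:
    """Return True if profile_iop fits the pattern; 'x' matches either bit."""
    bits = format(profile_iop, "08b")
    return all(p in ("x", b) for p, b in zip(pattern, bits))


def lookup_profile(profile_idc: int, profile_iop: int):
    """Return the Table 5 abbreviation for the combination, or None."""
    for name, idc, pattern in TABLE_5:
        if idc == profile_idc and iop_matches(pattern, profile_iop):
            return name
    return None
```

With this sketch, profile_idc 0x58 with profile-iop 11000000 and profile_idc 0x42 with profile-iop 01000000 both map to CB, matching the example in the text.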

If the profile-level-id parameter is used to indicate properties of a NAL unit stream, it indicates that, to decode the stream, the minimum subset of coding tools a decoder has to support is the default sub-profile, and the lowest level the decoder has to support is the default level.

If the profile-level-id parameter is used for capability exchange or session setup, it indicates the subset of coding tools, which is equal to the default sub-profile, that the codec supports for both receiving and sending. If max-recv-level is not present, the default level from profile-level-id indicates the highest level the codec wishes to support. If max-recv-level is present, it indicates the highest level the codec supports for receiving. For either receiving or sending, all levels that are lower than the highest level supported MUST also be supported.

Informative note: Capability exchange and session setup procedures should provide means to list the capabilities for each supported sub-profile separately. For example, the one-of-N codec selection procedure of the SDP Offer/Answer model can be used (Section 10.2 of [8]). The one-of-N codec selection procedure may also be used to provide different combinations of profile_idc and profile-iop that represent the same sub-profile. When there are many different combinations of profile_idc and profile-iop that represent the same sub-profile, using the one-of-N codec selection procedure may result in a fairly large SDP message. Therefore, a receiver should understand the different equivalent combinations of profile_idc and profile-iop that represent the same sub-profile and be ready to accept an offer using any of the equivalent combinations.

If no profile-level-id is present, the Baseline profile without additional constraints, at Level 1, MUST be inferred.

max-recv-level: This parameter MAY be used to indicate the highest level a receiver supports when the highest level is higher than the default level (the level indicated by profile-level-id). The value of max-recv-level is a base16 (hexadecimal) representation of the two bytes after the syntax element profile_idc in the sequence parameter set NAL unit specified in [1]: profile-iop (as defined above) and level_idc. If the level_idc byte of max-recv-level is equal to 11 and bit 4 of the profile-iop byte of max-recv-level is equal to 1 or if the level_idc byte of max-recv-level is equal to 9 and bit 4 of the profile-iop byte of max-recv-level is equal to 0, the highest level the receiver supports is Level 1b. Otherwise, the highest level the receiver supports is equal to the level_idc byte of max-recv-level divided by 10.

max-recv-level MUST NOT be present if the highest level the receiver supports is not higher than the default level.
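
Informative example (not part of the payload format): the rule for deriving the highest receive level from max-recv-level can be sketched as below. The function name is hypothetical, and the level is returned as a string only for readability.

```python
def max_recv_level(value: str) -> str:
    """Derive the highest supported level from a max-recv-level value.

    The value is four hex digits: profile-iop followed by level_idc.
    """
    raw = bytes.fromhex(value)
    if len(raw) != 2:
        raise ValueError("max-recv-level must encode exactly two bytes")
    profile_iop, level_idc = raw

    # Bit 4 of profile-iop (mask 0x10) is constraint_set3_flag.
    bit4 = bool(profile_iop & 0x10)

    if (level_idc == 11 and bit4) or (level_idc == 9 and not bit4):
        return "1b"
    return f"{level_idc / 10:g}"
```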

max-mbps, max-smbps, max-fs, max-cpb, max-dpb, and max-br: These parameters MAY be used to signal the capabilities of a receiver implementation. These parameters MUST NOT be used for any other purpose. The highest level conveyed in the value of the profile-level-id parameter or the max-recv-level parameter MUST be a level that the receiver is fully capable of supporting. max-mbps, max-smbps, max-fs, max-cpb, max-dpb, and max-br MAY be used to indicate capabilities of the receiver that extend the required capabilities of the signaled highest level, as specified below.

When more than one parameter from the set (max-mbps, max-smbps, max-fs, max-cpb, max-dpb, max-br) is present, the receiver MUST support all signaled capabilities simultaneously. For example, if both max-mbps and max-br are present, the signaled highest level with the extension of both the frame rate and bitrate is supported. That is, the receiver is able to decode NAL unit streams in which the macroblock processing rate is up to max-mbps (inclusive), the bitrate is up to max-br (inclusive), the coded picture buffer size is derived as specified in the semantics of the max-br parameter below, and the other properties comply with the highest level specified in the value of the profile-level-id parameter or the max-recv-level parameter.

If a receiver can support all the properties of Level A, the highest level specified in the value of the profile-level-id parameter or the max-recv-level parameter MUST be Level A (i.e., MUST NOT be lower than Level A). In other words, a receiver MUST NOT signal values of max-mbps, max-fs, max-cpb, max-dpb, and max-br that taken together meet the requirements of a higher level compared to the highest level specified in the value of the profile-level-id parameter or the max-recv-level parameter.

Informative note: When the OPTIONAL media type parameters are used to signal the properties of a NAL unit stream, max-mbps, max-smbps, max-fs, max-cpb, max-dpb, and max-br are not present, and the value of profile-level-id must always be such that the NAL unit stream complies fully with the specified profile and level.

max-mbps: The value of max-mbps is an integer indicating the maximum macroblock processing rate in units of macroblocks per second. The max-mbps parameter signals that the receiver is capable of decoding video at a higher rate than is required by the signaled highest level conveyed in the value of the profile-level-id parameter or the max-recv-level parameter.

When max-mbps is signaled, the receiver MUST be able to decode NAL unit streams that conform to the signaled highest level, with the exception that the MaxMBPS value in Table A-1 of [1] for the signaled highest level is replaced with the value of max-mbps. The value of max-mbps MUST be greater than or equal to the value of MaxMBPS given in Table A-1 of [1] for the highest level. Senders MAY use this knowledge to send pictures of a given size at a higher picture rate than is indicated in the signaled highest level.
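
Informative example (not part of the payload format): a sender could estimate the picture rate that a signaled max-mbps value permits for a given picture size, as sketched below. The function name is hypothetical. The sketch assumes 16x16-sample macroblocks and considers only the macroblock processing rate; the other limits of the signaled highest level (for example, the frame size limit) still apply and are not checked here.

```python
import math


def max_frame_rate(max_mbps: int, width_px: int, height_px: int) -> float:
    """Upper bound on frames per second implied by a macroblock rate."""
    # A frame is covered by whole 16x16 macroblocks, so round up per dimension.
    mbs_per_frame = math.ceil(width_px / 16) * math.ceil(height_px / 16)
    return max_mbps / mbs_per_frame
```

With the illustrative value max-mbps=108000, a 1280x720 picture (3600 macroblocks) could be sent at up to 30 frames per second under this bound.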

max-smbps: The value of max-smbps is an integer indicating the maximum static macroblock processing rate in units of static macroblocks per second, under the hypothetical assumption that all macroblocks are static macroblocks. When max-smbps is signaled, the MaxMBPS value in Table A-1 of [1] should be replaced with the result of the following computation:

o If the parameter max-mbps is signaled, set a variable MaxMacroblocksPerSecond to the value of max-mbps. Otherwise, set MaxMacroblocksPerSecond equal to the value of MaxMBPS in Table A-1 of [1] for the signaled highest level conveyed in the value of the profile-level-id parameter or the max-recv-level parameter.

o Set a variable P_non-static to the proportion of non-static macroblocks in picture n.

o Set a variable P_static to the proportion of static macroblocks in picture n.

o The value of MaxMBPS in Table A-1 of [1] should be considered by the encoder to be equal to:

            MaxMacroblocksPerSecond * max-smbps / (P_non-static *
            max-smbps + P_static * MaxMacroblocksPerSecond)
        

The encoder should recompute this value for each picture. The value of max-smbps MUST be greater than or equal to the value of MaxMBPS given explicitly as the value of the max-mbps parameter or implicitly in Table A-1 of [1] for the signaled highest level. Senders MAY use this knowledge to send pictures of a given size at a higher picture rate than is indicated in the signaled highest level.
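
Informative example (not part of the payload format): the per-picture computation above can be sketched as follows. The function and argument names are hypothetical; p_non_static and p_static are the proportions of non-static and static macroblocks in picture n, which this sketch assumes sum to 1.

```python
def effective_max_mbps(
    max_macroblocks_per_second: float,
    max_smbps: float,
    p_non_static: float,
    p_static: float,
) -> float:
    """Value the encoder treats as MaxMBPS for the current picture."""
    return (max_macroblocks_per_second * max_smbps) / (
        p_non_static * max_smbps + p_static * max_macroblocks_per_second
    )
```

With illustrative values MaxMacroblocksPerSecond = 40500 and max-smbps = 81000, a picture with no static macroblocks yields 40500, a fully static picture yields 81000, and a picture that is half static yields 54000.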

max-fs: The value of max-fs is an integer indicating the maximum frame size in units of macroblocks. The max-fs parameter signals that the receiver is capable of decoding larger picture sizes than are required by the signaled highest level conveyed in the value of the profile-level-id parameter or the max-recv-level parameter. When max-fs is signaled, the receiver MUST be able to decode NAL unit streams that conform to the signaled highest level, with the exception that the MaxFS value in Table A-1 of [1] for the signaled highest level is replaced with the value of max-fs. The value of max-fs MUST be greater than or equal to the value of MaxFS given in Table A-1 of [1] for the highest level. Senders MAY use this knowledge to send larger pictures at a proportionally lower frame rate than is indicated in the signaled highest level.

max-cpb: The value of max-cpb is an integer indicating the maximum coded picture buffer size in units of 1000 bits for the VCL HRD parameters and in units of 1200 bits for the NAL HRD parameters. Note that this parameter does not use units of cpbBrVclFactor and cpbBrNALFactor (see Table A-1 of [1]). The max-cpb parameter signals that the receiver has more memory than the minimum amount of coded picture buffer memory required by the signaled highest level conveyed in the value of the profile-level-id parameter or the max-recv-level parameter. When max-cpb is signaled, the receiver MUST be able to decode NAL unit streams that conform to the signaled highest level, with the exception that the MaxCPB value in Table A-1 of [1] for the signaled highest level is replaced with the value of max-cpb (after taking cpbBrVclFactor and cpbBrNALFactor into consideration when needed). The value of max-cpb (after taking cpbBrVclFactor and cpbBrNALFactor into consideration when needed) MUST be greater than or equal to the value of MaxCPB given in Table A-1 of [1] for the highest level. Senders MAY use this knowledge to construct coded video streams with greater variation of bitrate than can be achieved with the MaxCPB value in Table A-1 of [1].

Informative note: The coded picture buffer is used in the hypothetical reference decoder (Annex C of H.264). The use of the hypothetical reference decoder is recommended in H.264 encoders to verify that the produced bitstream conforms to the standard and to control the output bitrate. Thus, the coded picture buffer is conceptually independent of any other potential buffers in the receiver, including de-interleaving and de-jitter buffers. The coded picture buffer need not be implemented in decoders as specified in Annex C of H.264, but rather standard-compliant decoders can have any buffering arrangements provided that they can decode standard-compliant bitstreams. Thus, in practice, the input buffer for a video decoder can be integrated with de-interleaving and de-jitter buffers of the receiver.

max-dpb: The value of max-dpb is an integer indicating the maximum decoded picture buffer size in units of 8/3 macroblocks. The max-dpb parameter signals that the receiver has more memory than the minimum amount of decoded picture buffer memory required by the signaled highest level conveyed in the value of the profile-level-id parameter or the max-recv-level parameter. When max-dpb is signaled, the receiver MUST be able to decode NAL unit streams that conform to the signaled highest level, with the exception that the MaxDpbMbs value in Table A-1 of [1] for the signaled highest level is replaced with the value of max-dpb * 3 / 8. Consequently, a receiver that signals max-dpb MUST be capable of storing the following number of decoded frames, complementary field pairs, and non-paired fields in its decoded picture buffer:

            Min(max-dpb * 3 / 8 / ( PicWidthInMbs * FrameHeightInMbs),
            16)
        

Wherein PicWidthInMbs and FrameHeightInMbs are defined in [1].

The value of max-dpb MUST be greater than or equal to the value of MaxDpbMbs * 3 / 8, wherein the value of MaxDpbMbs is given in Table A-1 of [1] for the highest level. Senders MAY use this knowledge to construct coded video streams with improved compression.
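
Informative example (not part of the payload format): the decoded picture buffer capacity implied by max-dpb can be sketched as below. The function name is hypothetical, and rounding the quotient down to a whole number of frames is an assumption of this sketch.

```python
def dpb_frame_capacity(
    max_dpb: int, pic_width_in_mbs: int, frame_height_in_mbs: int
) -> int:
    """Frames the receiver must be able to hold, capped at 16.

    Implements Min(max-dpb * 3 / 8 / (PicWidthInMbs * FrameHeightInMbs), 16)
    with integer (floor) division.
    """
    frames = (max_dpb * 3) // (8 * pic_width_in_mbs * frame_height_in_mbs)
    return min(frames, 16)
```

For an illustrative max-dpb of 48000 and a 1280x720 picture (80 x 45 macroblocks), the sketch yields 5 frames.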

Informative note: This parameter was added primarily to complement a similar codepoint in the ITU-T Recommendation H.245, so as to facilitate signaling gateway designs. The decoded picture buffer stores reconstructed samples. There is no relationship between the size of the decoded picture buffer and the buffers used in RTP, especially de-interleaving and de-jitter buffers.

Informative note: In RFC 3984, which this document obsoletes, the unit of this parameter was 1024 bytes. The unit has been changed to 8/3 macroblocks in this document. This change was made because of the changes from the 2003 version of the H.264 specification referenced by RFC 3984 to the 2010 version of the H.264 specification referenced by this document, particularly the changes to Table A-1 in the H.264 specification due to the addition of color formats and bit depths not supported earlier. The changed semantics of this parameter keep backward compatibility with RFC 3984 and support all profiles defined in the 2010 version of the H.264 specification.

max-br: The value of max-br is an integer indicating the maximum video bitrate in units of 1000 bits per second for the VCL HRD parameters and in units of 1200 bits per second for the NAL HRD parameters. Note that this parameter does not use units of cpbBrVclFactor and cpbBrNALFactor (see Table A-1 of [1]).

The max-br parameter signals that the video decoder of the receiver is capable of decoding video at a higher bitrate than is required by the signaled highest level conveyed in the value of the profile-level-id parameter or the max-recv-level parameter.

When max-br is signaled, the video codec of the receiver MUST be able to decode NAL unit streams that conform to the signaled highest level, with the following exceptions in the limits specified by the highest level:

o The value of max-br (after taking cpbBrVclFactor and cpbBrNALFactor into consideration when needed) replaces the MaxBR value in Table A-1 of [1] for the highest level.

o When the max-cpb parameter is not present, the result of the following formula replaces the value of MaxCPB in Table A-1 of [1]: (MaxCPB of the signaled level) * max-br / (MaxBR of the signaled highest level).

For example, if a receiver signals capability for Main profile Level 1.2 with max-br equal to 1550, this indicates a maximum video bitrate of 1550 kbits/sec for VCL HRD parameters, a maximum video bitrate of 1860 kbits/sec for NAL HRD parameters, and a CPB size of 4036458 bits (1550000 / 384000 * 1000 * 1000).
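
Informative example (not part of the payload format): the arithmetic of the example above can be reproduced with the sketch below, which applies only when max-cpb is not present. The function name is hypothetical. The level limits are passed in by the caller; the values 384 and 1000 (in units of 1000 bits per second and 1000 bits, respectively) are the ones implied by the example for Level 1.2.

```python
def limits_from_max_br(max_br: int, level_max_br: int, level_max_cpb: int):
    """Return (VCL bitrate, NAL bitrate, VCL CPB size) in bits/s, bits/s, bits."""
    vcl_bitrate = max_br * 1000  # VCL HRD parameters use units of 1000 bits/s
    nal_bitrate = max_br * 1200  # NAL HRD parameters use units of 1200 bits/s
    # MaxCPB of the signaled level is scaled by max-br / MaxBR.
    cpb_bits = (level_max_cpb * 1000 * max_br) // level_max_br
    return vcl_bitrate, nal_bitrate, cpb_bits
```

Calling limits_from_max_br(1550, 384, 1000) gives 1550000 bits/s, 1860000 bits/s, and 4036458 bits, in agreement with the figures in the text.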

The value of max-br (after taking cpbBrVclFactor and cpbBrNALFactor into consideration when needed) MUST be greater than or equal to the value MaxBR given in Table A-1 of [1] for the signaled highest level.

Senders MAY use this knowledge to send higher bitrate video as allowed in the level definition of Annex A of H.264 to achieve improved video quality.

Informative note: This parameter was added primarily to complement a similar codepoint in the ITU-T Recommendation H.245, so as to facilitate signaling gateway designs. The assumption that the network is capable of handling such bitrates at any given time cannot be made from the value of this parameter. In particular, no conclusion can be drawn that the signaled bitrate is possible under congestion control constraints.

redundant-pic-cap: This parameter signals the capabilities of a receiver implementation. When equal to 0, the parameter indicates that the receiver makes no attempt to use redundant coded pictures to correct incorrectly decoded primary coded pictures. When equal to 0, the receiver is not capable of using redundant slices; therefore, a sender SHOULD avoid sending redundant slices to save bandwidth. When equal to 1, the receiver is capable of decoding any such redundant slice that covers a corrupted area in a primary decoded picture (at least partly), and therefore a sender MAY send redundant slices. When the parameter is not present, a value of 0 MUST be used for redundant-pic-cap. When present, the value of redundant-pic-cap MUST be either 0 or 1.

When the profile-level-id parameter is present in the same signaling as the redundant-pic-cap parameter and the profile indicated in profile-level-id is such that it disallows the use of redundant coded pictures (e.g., Main profile), the value of redundant-pic-cap MUST be equal to 0. When a receiver indicates redundant-pic-cap equal to 0, the received stream SHOULD NOT contain redundant coded pictures.

Informative note: Even if redundant-pic-cap is equal to 0, the decoder is able to ignore redundant coded pictures provided that the decoder supports a profile (Baseline, Extended) in which redundant coded pictures are allowed.

Informative note: Even if redundant-pic-cap is equal to 1, the receiver may also choose other error concealment strategies to replace or complement decoding of redundant slices.

sprop-parameter-sets: This parameter MAY be used to convey any sequence and picture parameter set NAL units (herein referred to as the initial parameter set NAL units) that can be placed in the NAL unit stream to precede any other NAL units in decoding order. The parameter MUST NOT be used to indicate codec capability in any capability exchange procedure. The value of the parameter is a comma-separated (',') list of base64 [7] representations of parameter set NAL units as specified in Sections 7.3.2.1 and 7.3.2.2 of [1]. Note that the number of bytes in a parameter set NAL unit is typically less than 10, but a picture parameter set NAL unit can contain several hundred bytes.

Informative note: When several payload types are offered in the SDP Offer/Answer model, each with its own sprop-parameter-sets parameter, the receiver cannot assume that those parameter sets do not use conflicting storage locations (i.e., identical values of parameter set identifiers). Therefore, a receiver should buffer all sprop-parameter-sets and make them available to the decoder instance that decodes a certain payload type.

The sprop-parameter-sets parameter MUST only contain parameter sets that are conforming to the profile-level-id, i.e., the subset of coding tools indicated by any of the parameter sets MUST be equal to the default sub-profile, and the level indicated by any of the parameter sets MUST be equal to the default level.
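
Informative example (not part of the payload format): a receiver could decode a sprop-parameter-sets value and read the profile and level bytes of each sequence parameter set as sketched below. The function names are hypothetical. The sketch relies on the NAL unit type being carried in the low five bits of the first byte (7 for a sequence parameter set, 8 for a picture parameter set) and on profile_idc, profile-iop, and level_idc being the next three bytes of a sequence parameter set.

```python
import base64


def parse_sprop_parameter_sets(value: str):
    """Decode the comma-separated base64 list into (nal_unit_type, bytes)."""
    nal_units = []
    for item in value.split(","):
        nal = base64.b64decode(item)
        nal_unit_type = nal[0] & 0x1F  # 7 = SPS, 8 = PPS
        nal_units.append((nal_unit_type, nal))
    return nal_units


def sps_profile_level_hex(sps: bytes) -> str:
    """Return profile_idc, profile-iop, level_idc of an SPS as six hex digits."""
    return sps[1:4].hex()
```

For the illustrative value "Z0IACpZTBYmI,aMljiA==", the first NAL unit is a sequence parameter set whose three bytes read 42000a, which a receiver can compare against the signaled profile-level-id.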

sprop-level-parameter-sets: This parameter MAY be used to convey any sequence and picture parameter set NAL units (herein referred to as the initial parameter set NAL units) that can be placed in the NAL unit stream to precede any other NAL units in decoding order and that are associated with one or more levels different than the default level. The parameter MUST NOT be used to indicate codec capability in any capability exchange procedure.

The sprop-level-parameter-sets parameter contains parameter sets for one or more levels that are different than the default level. All parameter sets associated with one level are clustered and prefixed with a three-byte field that has the same syntax as profile-level-id. This enables the receiver to install the parameter sets for one level and discard the rest. The three-byte field is named PLId, and all parameter sets associated with one level are named PSL, which has the same syntax as sprop-parameter-sets. Parameter sets for each level are represented in the form of PLId:PSL, i.e., PLId followed by a colon (':') and the base64 [7] representation of the initial parameter set NAL units for the level. Each pair of PLId:PSLs is also separated by a colon. Note that a PSL can contain multiple parameter sets for that level, separated with commas (',').

The subset of coding tools indicated by each PLId field MUST be equal to the default sub-profile, and the level indicated by each PLId field MUST be different than the default level. All sequence parameter sets contained in each PSL MUST have the three bytes from profile_idc to level_idc, inclusive, equal to the preceding PLId.

Informative note: This parameter allows for efficient level downgrade or upgrade in SDP Offer/Answer and out-of-band transport of parameter sets simultaneously.
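
Informative example (not part of the payload format): the PLId:PSL structure of sprop-level-parameter-sets can be split as sketched below, which lets a receiver keep the parameter sets for one level and discard the rest. The function name is hypothetical. The sketch assumes each PLId is written as six hexadecimal digits, as for profile-level-id, and it rejects a sequence parameter set whose three bytes from profile_idc to level_idc differ from the preceding PLId.

```python
import base64


def parse_sprop_level_parameter_sets(value: str) -> dict:
    """Map each PLId (lowercase hex) to its list of parameter set NAL units."""
    fields = value.split(":")
    if len(fields) % 2 != 0:
        raise ValueError("expected PLId:PSL pairs separated by colons")

    result = {}
    for plid, psl in zip(fields[0::2], fields[1::2]):
        plid = plid.lower()
        nal_units = [base64.b64decode(item) for item in psl.split(",")]
        for nal in nal_units:
            # NAL unit type 7 is a sequence parameter set.
            if nal[0] & 0x1F == 7 and nal[1:4].hex() != plid:
                raise ValueError("SPS profile/level bytes do not match PLId")
        result[plid] = nal_units
    return result
```

A receiver that accepts the stream at a given level would then look up that level's PLId in the returned mapping and pass only those NAL units to its decoder.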

use-level-src-parameter-sets: This parameter MAY be used to indicate a receiver capability. The value MAY be equal to either 0 or 1. When the parameter is not present, the value MUST be inferred to be equal to 0. The value 0 indicates that the receiver does not understand the sprop-level-parameter-sets parameter, does not understand the "fmtp" source attribute as specified in Section 6.3 of [9], will ignore sprop-level-parameter-sets when present, and will ignore sprop-parameter-sets when conveyed using the "fmtp" source attribute. The value 1 indicates that the receiver understands the sprop-level-parameter-sets parameter, understands the "fmtp" source attribute as specified in Section 6.3 of [9], and is capable of using parameter sets contained in the sprop-level-parameter-sets or contained in the sprop-parameter-sets that is conveyed using the "fmtp" source attribute.

Informative note: An RFC 3984 receiver does not understand sprop-level-parameter-sets, use-level-src-parameter-sets, or the "fmtp" source attribute as specified in Section 6.3 of [9]. Therefore, during SDP Offer/Answer, an RFC 3984 receiver as the answerer will simply ignore sprop-level-parameter-sets when present in an offer and sprop-parameter-sets conveyed using the "fmtp" source attribute, as specified in Section 6.3 of [9]. Assume that the offered payload type was accepted at a level lower than the default level. If the offered payload type included sprop-level-parameter-sets or included sprop-parameter-sets conveyed using the "fmtp" source attribute and if the offerer sees that the answerer has not included use-level-src-parameter-sets equal to 1 in the answer, the offerer knows that in-band transport of parameter sets is needed.

in-band-parameter-sets: This parameter MAY be used to indicate a receiver capability. The value MAY be equal to either 0 or 1. The value 1 indicates that the receiver discards out-of-band parameter sets in sprop-parameter-sets and sprop-level-parameter-sets; therefore, the sender MUST transmit all parameter sets in-band. The value 0 indicates that the receiver utilizes out-of-band parameter sets included in sprop-parameter-sets and/or sprop-level-parameter-sets. However, in this case, the sender MAY still choose to send parameter sets in-band. When in-band-parameter-sets is equal to 1, use-level-src-parameter-sets MUST NOT be present or MUST be equal to 0. When the parameter is not present, this receiver capability is not specified, and therefore the sender MAY send out-of-band parameter sets only, it MAY send in-band parameter sets only, or it MAY send both.

level-asymmetry-allowed: This parameter MAY be used in SDP Offer/Answer to indicate whether level asymmetry, i.e., sending media encoded at a different level in the offerer-to-answerer direction than the level in the answerer-to-offerer direction, is allowed. The value MAY be equal to either 0 or 1. When the parameter is not present, the value MUST be inferred to be equal to 0. The value 1 in both the offer and the answer indicates that level asymmetry is allowed. The value of 0 in either the offer or the answer indicates that level asymmetry is not allowed.

If level-asymmetry-allowed is equal to 0 (or not present) in either the offer or the answer, level asymmetry is not allowed. In this case, the level to use in the direction from the offerer to the answerer MUST be the same as the level to use in the opposite direction.

packetization-mode: This parameter signals the properties of an RTP payload type or the capabilities of a receiver implementation. Only a single configuration point can be indicated; thus, when capabilities to support more than one packetization-mode are declared, multiple configuration points (RTP payload types) must be used.

When the value of packetization-mode is equal to 0 or packetization-mode is not present, the single NAL mode MUST be used. This mode is in use in standards using ITU-T Recommendation H.241 [3] (see Section 12.1). When the value of packetization-mode is equal to 1, the non-interleaved mode MUST be used. When the value of packetization-mode is equal to 2, the interleaved mode MUST be used. The value of packetization-mode MUST be an integer in the range of 0 to 2, inclusive.
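
Informative example (not part of this memo): the mapping above can be sketched in Python as follows. The names PacketizationMode and parse_packetization_mode are hypothetical and chosen for illustration only; the input is assumed to be the raw parameter value taken from the "a=fmtp" line, or None when the parameter is absent.

```python
from enum import IntEnum
from typing import Optional


class PacketizationMode(IntEnum):
    """Hypothetical names for the three packetization modes."""
    SINGLE_NAL = 0        # single NAL unit mode
    NON_INTERLEAVED = 1   # non-interleaved mode
    INTERLEAVED = 2       # interleaved mode


def parse_packetization_mode(value: Optional[str]) -> PacketizationMode:
    """Map the fmtp value (None when absent) to a packetization mode."""
    if value is None:
        # An absent parameter selects the single NAL unit mode.
        return PacketizationMode.SINGLE_NAL
    if value not in ("0", "1", "2"):
        raise ValueError(f"packetization-mode must be 0, 1, or 2, got {value!r}")
    return PacketizationMode(int(value))
```

Rejecting out-of-range values with an exception is a design choice of this sketch; an implementation could instead reject the payload type.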

sprop-interleaving-depth: This parameter MUST NOT be present when packetization-mode is not present or the value of packetization-mode is equal to 0 or 1. This parameter MUST be present when the value of packetization-mode is equal to 2.

This parameter signals the properties of an RTP packet stream. It specifies the maximum number of VCL NAL units that precede any VCL NAL unit in the RTP packet stream in transmission order and that follow the VCL NAL unit in decoding order. Consequently, it is guaranteed that receivers can reconstruct NAL unit decoding order when the buffer size for NAL unit decoding order recovery is at least the value of sprop-interleaving-depth + 1 in terms of VCL NAL units.

The value of sprop-interleaving-depth MUST be an integer in the range of 0 to 32767, inclusive.
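
Informative example (not part of this memo): a minimal sketch of how a sender might derive sprop-interleaving-depth from the definition above. It assumes the VCL NAL units are listed in transmission order and that each is represented by its position in decoding order as a non-wrapping integer. The function name interleaving_depth is hypothetical.

```python
from typing import Sequence


def interleaving_depth(decoding_positions: Sequence[int]) -> int:
    """Return the largest count of earlier-sent units that decode later.

    decoding_positions[k] is the decoding-order position of the k-th
    VCL NAL unit in transmission order (an assumption of this sketch).
    """
    depth = 0
    for k, position in enumerate(decoding_positions):
        # Units sent before unit k that follow it in decoding order.
        ahead = sum(1 for earlier in decoding_positions[:k] if earlier > position)
        depth = max(depth, ahead)
    return depth
```

This quadratic scan favors readability; a receiver buffer of interleaving_depth(...) + 1 VCL NAL units then matches the guarantee stated above.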

sprop-deint-buf-req: This parameter MUST NOT be present when packetization-mode is not present or the value of packetization-mode is equal to 0 or 1. It MUST be present when the value of packetization-mode is equal to 2.

sprop-deint-buf-req signals the required size of the de-interleaving buffer for the RTP packet stream. The value of the parameter MUST be greater than or equal to the maximum buffer occupancy (in units of bytes) required in such a de-interleaving buffer that is specified in Section 7.2. It is guaranteed that receivers can perform the de-interleaving of interleaved NAL units into NAL unit decoding order, when the de-interleaving buffer size is at least the value of sprop-deint-buf-req in terms of bytes.

The value of sprop-deint-buf-req MUST be an integer in the range of 0 to 4294967295, inclusive.
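
Informative example (not part of this memo): the presence and range rules for sprop-interleaving-depth and sprop-deint-buf-req can be checked as sketched below. The function name check_interleaving_params and the dictionary representation of the "a=fmtp" parameters are assumptions made for illustration.

```python
from typing import Mapping

UINT32_MAX = 4294967295

# Upper bounds (inclusive) for the two interleaving stream properties.
_UPPER_BOUNDS = {
    "sprop-interleaving-depth": 32767,
    "sprop-deint-buf-req": UINT32_MAX,
}


def check_interleaving_params(fmtp: Mapping[str, str]) -> None:
    """Raise ValueError if the interleaving parameters violate the rules."""
    # An absent packetization-mode is treated like mode 0.
    mode = int(fmtp.get("packetization-mode", "0"))
    if not 0 <= mode <= 2:
        raise ValueError("packetization-mode out of range")
    for name, upper in _UPPER_BOUNDS.items():
        present = name in fmtp
        if mode == 2 and not present:
            raise ValueError(f"{name} MUST be present in interleaved mode")
        if mode != 2 and present:
            raise ValueError(f"{name} MUST NOT be present in mode {mode}")
        if present and not 0 <= int(fmtp[name]) <= upper:
            raise ValueError(f"{name} out of range")
```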

Informative note: sprop-deint-buf-req indicates the required size of the de-interleaving buffer only. When network jitter can occur, an appropriately sized jitter buffer has to be provisioned as well.

deint-buf-cap: This parameter signals the capabilities of a receiver implementation and indicates the amount of de-interleaving buffer space in units of bytes that the receiver has available for reconstructing the NAL unit decoding order. A receiver is able to handle any stream for which the value of the sprop-deint-buf-req parameter is smaller than or equal to this parameter.

If the parameter is not present, then a value of 0 MUST be used for deint-buf-cap. The value of deint-buf-cap MUST be an integer in the range of 0 to 4294967295, inclusive.
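
Informative example (not part of this memo): the relationship between sprop-deint-buf-req and deint-buf-cap, including the default of 0 for an absent deint-buf-cap, can be sketched as follows. The function name guaranteed_to_handle is hypothetical.

```python
from typing import Optional

UINT32_MAX = 4294967295


def guaranteed_to_handle(sprop_deint_buf_req: int,
                         deint_buf_cap: Optional[int] = None) -> bool:
    """True if the receiver's declared capacity covers the stream's need."""
    # An absent deint-buf-cap is treated as 0.
    cap = 0 if deint_buf_cap is None else deint_buf_cap
    for value in (sprop_deint_buf_req, cap):
        if not 0 <= value <= UINT32_MAX:
            raise ValueError("value out of range 0..4294967295")
    return sprop_deint_buf_req <= cap
```

Both values are in bytes and cover the de-interleaving buffer only; jitter buffering is outside the scope of this check.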

Informative note: deint-buf-cap indicates the maximum possible size of the de-interleaving buffer of the receiver only. When network jitter can occur, an appropriately sized jitter buffer has to be provisioned as well.

sprop-init-buf-time: This parameter MAY be used to signal the properties of an RTP packet stream. The parameter MUST NOT be present if the value of packetization-mode is equal to 0 or 1.

The parameter signals the initial buffering time that a receiver MUST wait before starting decoding to recover the NAL unit decoding order from the transmission order. The parameter is the maximum value of (transmission time of a NAL unit - decoding time of the NAL unit), assuming reliable and instantaneous transmission, the same timeline for transmission and decoding, and commencement of decoding when the first packet arrives.

An example of specifying the value of sprop-init-buf-time follows. A NAL unit stream is sent in the following interleaved order, in which the value corresponds to the decoding time and the transmission order is from left to right:

0 2 1 3 5 4 6 8 7 ...

Assuming a steady transmission rate of NAL units, the transmission times are:

0 1 2 3 4 5 6 7 8 ...

Subtracting the decoding time from the transmission time column-wise results in the following series:

0 -1 1 0 -1 1 0 -1 1 ...

Thus, in terms of intervals of NAL unit transmission times, the value of sprop-init-buf-time in this example is 1. The parameter is coded as a non-negative base10 integer representation in clock ticks of a 90-kHz clock. If the parameter is not present, then no initial buffering time value is defined. Otherwise, the value of sprop-init-buf-time MUST be an integer in the range of 0 to 4294967295, inclusive.
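
Informative example (not part of this memo): the worked example above can be reproduced with the sketch below, which follows the column-wise subtraction shown in the example (transmission time minus decoding time) and then scales the result to 90-kHz clock ticks. The function names and the figure of 3000 ticks per NAL unit interval (30 NAL units per second) are assumptions for illustration only.

```python
from typing import Sequence


def init_buf_intervals(decoding_times: Sequence[int],
                       transmission_times: Sequence[int]) -> int:
    """Largest (transmission time - decoding time), never below zero."""
    differences = (t - d for d, t in zip(decoding_times, transmission_times))
    return max(0, max(differences, default=0))


def init_buf_ticks(decoding_times: Sequence[int],
                   transmission_times: Sequence[int],
                   ticks_per_interval: int) -> int:
    """Scale the interval count to 90-kHz clock ticks."""
    return init_buf_intervals(decoding_times, transmission_times) * ticks_per_interval


decoding = [0, 2, 1, 3, 5, 4, 6, 8, 7]   # decoding times, in transmission order
transmission = list(range(9))            # steady transmission rate
value = init_buf_ticks(decoding, transmission, ticks_per_interval=3000)
```

With the assumed 3000 ticks per interval, the one-interval result of the example corresponds to a parameter value of 3000.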

In addition to the signaled sprop-init-buf-time, receivers SHOULD take into account the transmission delay jitter buffering, including buffering for the delay jitter caused by mixers, translators, gateways, proxies, traffic-shapers, and other network elements.

sprop-max-don-diff: This parameter MAY be used to signal the properties of an RTP packet stream. It MUST NOT be used to signal transmitter, receiver, or codec capabilities. The parameter MUST NOT be present if the value of packetization-mode is equal to 0 or 1. sprop-max-don-diff is an integer in the range of 0 to 32767, inclusive. If sprop-max-don-diff is not present, the value of the parameter is unspecified. sprop-max-don-diff is calculated as follows:

sprop-max-don-diff = max{AbsDON(i) - AbsDON(j)}, for any i and any j>i,

where i and j indicate the index of the NAL unit in the transmission order and AbsDON denotes a decoding order number of the NAL unit that does not wrap around to 0 after 65535. In other words, AbsDON is calculated as follows: let m and n be consecutive NAL units in transmission order. For the very first NAL unit in transmission order (whose index is 0), AbsDON(0) = DON(0). For other NAL units, AbsDON is calculated as follows:

            If DON(m) == DON(n), AbsDON(n) = AbsDON(m)
        
            If (DON(m) < DON(n) and DON(n) - DON(m) < 32768),
              AbsDON(n) = AbsDON(m) + DON(n) - DON(m)
        
            If (DON(m) > DON(n) and DON(m) - DON(n) >= 32768),
              AbsDON(n) = AbsDON(m) + 65536 - DON(m) + DON(n)
        
            If (DON(m) < DON(n) and DON(n) - DON(m) >= 32768),
              AbsDON(n) = AbsDON(m) - (DON(m) + 65536 - DON(n))
        
            If (DON(m) > DON(n) and DON(m) - DON(n) < 32768),
              AbsDON(n) = AbsDON(m) - (DON(m) - DON(n))
        

where DON(i) is the decoding order number of the NAL unit having index i in the transmission order. The decoding order number is specified in Section 5.5.
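
Informative example (not part of this memo): the AbsDON rules and the sprop-max-don-diff formula above can be sketched as follows. The function names are hypothetical. Clamping the result at 0 when no NAL unit is sent out of decoding order is an assumption of this sketch, made because the parameter range starts at 0.

```python
from typing import List, Sequence


def abs_dons(dons: Sequence[int]) -> List[int]:
    """Unwrap 16-bit DON values, given in transmission order, into AbsDON."""
    if not dons:
        return []
    result = [dons[0]]                      # AbsDON(0) = DON(0)
    for m, n in zip(dons, dons[1:]):        # consecutive units m, n
        if m == n:
            delta = 0
        elif m < n and n - m < 32768:
            delta = n - m                   # forward, no wrap
        elif m > n and m - n >= 32768:
            delta = 65536 - m + n           # forward across the wrap
        elif m < n and n - m >= 32768:
            delta = -(m + 65536 - n)        # backward across the wrap
        else:
            delta = -(m - n)                # backward, no wrap
        result.append(result[-1] + delta)
    return result


def sprop_max_don_diff(dons: Sequence[int]) -> int:
    """max{AbsDON(i) - AbsDON(j)} over all j > i, clamped at 0."""
    values = abs_dons(dons)
    best = 0
    if values:
        highest = values[0]                 # largest AbsDON seen so far
        for value in values[1:]:
            best = max(best, highest - value)
            highest = max(highest, value)
    return best
```

Tracking the running maximum keeps the computation linear in the number of NAL units while yielding the same result as comparing every pair.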

Informative note: Receivers may use sprop-max-don-diff to determine which NAL units in the receiver buffer can be passed to the decoder.

max-rcmd-nalu-size: This parameter MAY be used to signal the capabilities of a receiver. The parameter MUST NOT be used for any other purposes. The value of the parameter indicates the largest NALU size in bytes that the receiver can handle efficiently. The parameter value is a recommendation, not a strict upper boundary. The sender MAY create larger NALUs but must be aware that the handling of these may come at a higher cost than NALUs conforming to the limitation.

The value of max-rcmd-nalu-size MUST be an integer in the range of 0 to 4294967295, inclusive. If this parameter is not specified, no known limitation to the NALU size exists. Senders still have to consider the MTU size available between the sender and the receiver and SHOULD run MTU discovery for this purpose.

This parameter is motivated by, for example, an IP to H.223 video telephony gateway, where NALUs smaller than the H.223 transport data unit will be more efficient. A gateway may terminate IP; thus, MTU discovery will normally not work beyond the gateway.

Informative note: Setting this parameter to a lower than necessary value may have a negative impact.

sar-understood: This parameter MAY be used to indicate a receiver capability and nothing else. The parameter indicates the maximum value of aspect_ratio_idc (specified in [1]) smaller than 255 that the receiver understands. Table E-1 of [1] specifies aspect_ratio_idc equal to 0 as "unspecified"; 1 to 16, inclusive, as specific Sample Aspect Ratios (SARs); 17 to 254, inclusive, as "reserved"; and 255 as the Extended SAR, for which SAR width and SAR height are explicitly signaled. Therefore, a receiver with a decoder according to [1] understands aspect_ratio_idc in the range of 1 to 16, inclusive, and aspect_ratio_idc equal to 255, in the sense that the receiver knows exactly what the SAR is. For such a receiver, the value of sar-understood is 16. In the future, if Table E-1 of [1] is extended, e.g., such that the SAR for aspect_ratio_idc equal to 17 is specified, then for a receiver with a decoder that understands the extension, the value of sar-understood is 17. For a receiver with a decoder according to the 2003 version of [1], the value of sar-understood is 13, as the minimum reserved aspect_ratio_idc therein is 14.

When sar-understood is not present, the value MUST be inferred to be equal to 13.

sar-supported: This parameter MAY be used to indicate a receiver capability and nothing else. The value of this parameter is an integer in the range of 1 to sar-understood, inclusive, or equal to 255. The value of sar-supported equal to N smaller than 255 indicates that the receiver supports all the SARs corresponding to H.264 aspect_ratio_idc values (see Table E-1 of [1]) in the range from 1 to N, inclusive, without geometric distortion. The value of sar-supported equal to 255 indicates that the receiver supports all sample aspect ratios that are expressible using two 16-bit integer values as the numerator and denominator, i.e., those that are expressible using the H.264 aspect_ratio_idc value of 255 (Extended_SAR, see Table E-1 of [1]), without geometric distortion.

H.264-compliant encoders SHOULD NOT send an aspect_ratio_idc equal to 0 or an aspect_ratio_idc larger than sar-understood and smaller than 255. H.264-compliant encoders SHOULD send an aspect_ratio_idc that the receiver is able to display without geometrical distortion. However, H.264-compliant encoders MAY choose to send pictures using any SAR.
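
Informative example (not part of this memo): the SHOULD NOT rule above, together with the inferred default of 13 for an absent sar-understood, can be sketched as follows. The function name should_not_send is hypothetical.

```python
from typing import Optional

EXTENDED_SAR = 255            # aspect_ratio_idc value for Extended_SAR
DEFAULT_SAR_UNDERSTOOD = 13   # inferred when sar-understood is absent


def should_not_send(aspect_ratio_idc: int,
                    sar_understood: Optional[int] = None) -> bool:
    """True if an encoder SHOULD NOT send this aspect_ratio_idc."""
    understood = DEFAULT_SAR_UNDERSTOOD if sar_understood is None else sar_understood
    if aspect_ratio_idc == 0:
        return True   # "unspecified"
    # Values above sar-understood but below 255 are not understood.
    return understood < aspect_ratio_idc < EXTENDED_SAR
```

The check covers only what the receiver understands; whether a given SAR is displayed without geometric distortion is governed separately by sar-supported.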

Note that the actual sample aspect ratio or extended sample aspect ratio, when present, of the stream is conveyed in the Video Usability Information (VUI) part of the sequence parameter set.

Encoding considerations: This type is only defined for transfer via RTP (RFC 3550).

Security considerations: See Section 9 of RFC 6184.

Public specification: Please refer to RFC 6184 and its Section 17.

Additional information: None

File extensions: none

Macintosh file type code: none

Object identifier or OID: none

Person & email address to contact for further information: Ye-Kui Wang, yekui.wang@huawei.com

Intended usage: COMMON

Author: Ye-Kui Wang, yekui.wang@huawei.com

Change controller: IETF Audio/Video Transport working group delegated from the IESG.

8.2. SDP Parameters

The receiver MUST ignore any parameter unspecified in this memo.

8.2.1. Mapping of Payload Type Parameters to SDP

The media type video/H264 string is mapped to fields in the Session Description Protocol (SDP) [6] as follows:

o The media name in the "m=" line of SDP MUST be video.

o The encoding name in the "a=rtpmap" line of SDP MUST be H264 (the media subtype).

o The clock rate in the "a=rtpmap" line MUST be 90000.

o The OPTIONAL parameters profile-level-id, max-recv-level, max-mbps, max-smbps, max-fs, max-cpb, max-dpb, max-br, redundant-pic-cap, use-level-src-parameter-sets, in-band-parameter-sets, level-asymmetry-allowed, packetization-mode, sprop-interleaving-depth, sprop-deint-buf-req, deint-buf-cap, sprop-init-buf-time, sprop-max-don-diff, max-rcmd-nalu-size, sar-understood, and sar-supported, when present, MUST be included in the "a=fmtp" line of SDP. These parameters are expressed as a media type string, in the form of a semicolon-separated list of parameter=value pairs.

o The OPTIONAL parameters sprop-parameter-sets and sprop-level-parameter-sets, when present, MUST be included in the "a=fmtp" line of SDP or conveyed using the "fmtp" source attribute as specified in Section 6.3 of [9]. For a particular media format (i.e., RTP payload type), sprop-parameter-sets or sprop-level-parameter-sets MUST NOT be both included in the "a=fmtp" line of SDP and conveyed using the "fmtp" source attribute. When included in the "a=fmtp" line of SDP, these parameters are expressed as a media type string, in the form of a semicolon-separated list of parameter=value pairs. When conveyed using the "fmtp" source attribute, these parameters are only associated with the given source and payload type as parts of the "fmtp" source attribute.

Informative note: Conveyance of sprop-parameter-sets and sprop-level-parameter-sets using the "fmtp" source attribute allows for out-of-band transport of parameter sets in topologies like Topo-Video-switch-MCU [29].

An example of media representation in SDP is as follows (Baseline profile, Level 3.0, some of the constraints of the Main profile may not be obeyed):

      m=video 49170 RTP/AVP 98
      a=rtpmap:98 H264/90000
      a=fmtp:98 profile-level-id=42A01E;
                packetization-mode=1;
                sprop-parameter-sets=<parameter sets data>
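
Informative example (not part of this memo): a receiver might split such an "a=fmtp" attribute into its payload type and parameter=value pairs as sketched below. The function name parse_fmtp is hypothetical, the attribute is assumed to have been unfolded onto a single line, and the parameter sets value used in the usage lines is a made-up placeholder.

```python
from typing import Dict, Tuple


def parse_fmtp(attribute: str) -> Tuple[int, Dict[str, str]]:
    """Split 'a=fmtp:<pt> name=value;...' into (payload type, parameters)."""
    prefix = "a=fmtp:"
    if not attribute.startswith(prefix):
        raise ValueError("not an a=fmtp attribute")
    payload_type, _, params = attribute[len(prefix):].partition(" ")
    parsed: Dict[str, str] = {}
    for item in params.split(";"):
        item = item.strip()
        if not item:
            continue
        # Split on the first '=' only, so '=' padding in a value survives.
        name, separator, value = item.partition("=")
        if not separator:
            raise ValueError(f"malformed parameter: {item!r}")
        parsed[name.strip()] = value.strip()
    return int(payload_type), parsed


pt, params = parse_fmtp(
    "a=fmtp:98 profile-level-id=42A01E; packetization-mode=1; "
    "sprop-parameter-sets=AAAA,BBBB=="   # placeholder data
)
```

Parameters not specified in this memo would still appear in the returned dictionary; a caller is expected to ignore them.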
        
8.2.2. Usage with the SDP Offer/Answer Model

When H.264 is offered over RTP using SDP in an Offer/Answer model [8] for negotiation for unicast usage, the following limitations and rules apply:

o The parameters identifying a media format configuration for H.264 are profile-level-id and packetization-mode. These media format configuration parameters (except for the level part of profile-level-id) MUST be used symmetrically; that is, the answerer MUST either maintain all configuration parameters or remove the media format (payload type) completely if one or more of the parameter values are not supported. Note that the level part of profile-level-id includes level_idc, and, for indication of Level 1b when profile_idc is equal to 66, 77, or 88, bit 4 (constraint_set3_flag) of profile-iop. The level part of profile-level-id is changeable.

Informative note: The requirement for symmetric use does not apply for the level part of profile-level-id and does not apply for the other stream properties and capability parameters.

Informative note: In H.264 [1], all the levels except for Level 1b are equal to the value of level_idc divided by 10. Level 1b is a level higher than Level 1.0 but lower than Level 1.1 and is signaled in an ad hoc manner, because the level was specified after Level 1.0 and Level 1.1. For the Baseline, Main, and Extended profiles (with profile_idc equal to 66, 77, and 88, respectively), Level 1b is indicated by level_idc equal to 11 (i.e., same as Level 1.1) and constraint_set3_flag equal to 1. For other profiles, Level 1b is indicated by level_idc equal to 9 (but note that Level 1b for these profiles is still higher than Level 1, which has level_idc equal to 10, and lower than Level 1.1). In SDP Offer/Answer, an answer to an offer may indicate a level equal to or lower than the level indicated in the offer. Due to the ad hoc indication of Level 1b, offerers and answerers must check the value of bit 4 (constraint_set3_flag) of the middle octet of the parameter profile-level-id, when profile_idc is equal to 66, 77, or 88 and level_idc is equal to 11.

To simplify the handling and matching of these configurations, the same RTP payload type number used in the offer SHOULD also be used in the answer, as specified in [8]. An answer MUST NOT contain the payload type number used in the offer unless the configuration is exactly the same as in the offer.

Informative note: When an offerer receives an answer, it has to compare payload types not declared in the offer based on the media type (i.e., video/H264) and the above media configuration parameters with any payload types it has already declared. This will enable it to determine whether the configuration in question is new or if it is equivalent to configuration already offered, since a different payload type number may be used in the answer.
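
Informative example (not part of this memo): the level part of profile-level-id, including the ad hoc signaling of Level 1b described above, can be extracted as sketched below. The function names are hypothetical, and the value is assumed to be the six hexadecimal digits carried in the "a=fmtp" line.

```python
from typing import Tuple


def parse_profile_level_id(value: str) -> Tuple[int, int, int]:
    """Return (profile_idc, profile_iop, level_idc) from six hex digits."""
    if len(value) != 6:
        raise ValueError("profile-level-id must be six hexadecimal digits")
    profile_idc, profile_iop, level_idc = bytes.fromhex(value)
    return profile_idc, profile_iop, level_idc


def level_name(profile_idc: int, profile_iop: int, level_idc: int) -> str:
    """Return the level as text, e.g. '3.0', or '1b' for Level 1b."""
    # Bit 4 of the middle octet carries constraint_set3_flag.
    constraint_set3_flag = (profile_iop >> 4) & 1
    if profile_idc in (66, 77, 88):
        if level_idc == 11 and constraint_set3_flag == 1:
            return "1b"
    elif level_idc == 9:
        return "1b"
    return f"{level_idc // 10}.{level_idc % 10}"
```

For instance, level_name(*parse_profile_level_id("42A01E")) yields "3.0", matching the Baseline profile, Level 3.0 example given earlier in this section.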

o When present, the parameter max-recv-level declares the highest level supported for receiving. In case max-recv-level is not present, the highest level supported for receiving is equal to the default level indicated by the level part of profile-level-id. When present, max-recv-level MUST be higher than the default level.

o The parameter level-asymmetry-allowed indicates whether level asymmetry is allowed.

If level-asymmetry-allowed is equal to 0 (or not present) in either the offer or the answer, level asymmetry is not allowed. In this case, the level to use in the direction from the offerer to the answerer MUST be the same as the level to use in the opposite direction, and the common level to use is equal to the lower value of the default level in the offer and the default level in the answer.

Otherwise, level-asymmetry-allowed equals 1 in both the offer and the answer, and level asymmetry is allowed. In this case, the level to use in the offerer-to-answerer direction MUST be equal to the highest level the answerer supports for receiving, and the level to use in the answerer-to-offerer direction MUST be equal to the highest level the offerer supports for receiving.

When level asymmetry is not allowed, level upgrade is not allowed, i.e., the default level in the answer MUST be equal to or lower than the default level in the offer.
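
Informative example (not part of this memo): the level selection rules above can be sketched as follows. Levels are assumed to be represented by integers that preserve the level ordering (how Level 1b is ranked is left to the caller), and the function name levels_to_use is hypothetical.

```python
from typing import Optional, Tuple


def levels_to_use(offer_default: int,
                  answer_default: int,
                  offer_max_recv: Optional[int] = None,
                  answer_max_recv: Optional[int] = None,
                  offer_asymmetry: bool = False,
                  answer_asymmetry: bool = False) -> Tuple[int, int]:
    """Return (offerer-to-answerer level, answerer-to-offerer level)."""
    if offer_asymmetry and answer_asymmetry:
        # Highest level each side supports for receiving: max-recv-level
        # when present, otherwise that side's default level.
        offerer_recv = offer_default if offer_max_recv is None else offer_max_recv
        answerer_recv = answer_default if answer_max_recv is None else answer_max_recv
        return answerer_recv, offerer_recv
    if answer_default > offer_default:
        raise ValueError("level upgrade requires level asymmetry")
    common = min(offer_default, answer_default)
    return common, common
```

The two boolean arguments correspond to level-asymmetry-allowed equal to 1 in the offer and in the answer, respectively; an absent parameter maps to False.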

o The parameters sprop-deint-buf-req, sprop-interleaving-depth, sprop-max-don-diff, and sprop-init-buf-time describe the properties of the RTP packet stream that the offerer or answerer is sending for the media format configuration. This differs from the normal usage of the Offer/Answer parameters: normally such parameters declare the properties of the stream that the offerer or the answerer is able to receive. When dealing with H.264, the offerer assumes that the answerer will be able to receive media encoded using the configuration being offered.

Informative note: The above parameters apply for any stream sent by a declaring entity with the same configuration; i.e., they are dependent on their source. Rather than being bound to the payload type, the values may have to be applied to another payload type when being sent, as they apply for the configuration.

o The capability parameters max-mbps, max-smbps, max-fs, max-cpb, max-dpb, max-br, redundant-pic-cap, max-rcmd-nalu-size, sar-understood, and sar-supported MAY be used to declare further capabilities of the offerer or answerer for receiving. These parameters MUST NOT be present when the direction attribute is "sendonly" and when the parameters describe the limitations of what the offerer or answerer accepts for receiving streams.

o An offerer has to include the size of the de-interleaving buffer, sprop-deint-buf-req, in the offer for an interleaved H.264 stream. To enable the offerer and answerer to inform each other about their capabilities for de-interleaving buffering in receiving streams, both parties are RECOMMENDED to include deint-buf-cap. For interleaved streams, it is also RECOMMENDED to consider offering multiple payload types with different buffering requirements when the capabilities of the receiver are unknown.

o The sprop-parameter-sets or sprop-level-parameter-sets parameter, when present (included in the "a=fmtp" line of SDP or conveyed using the "fmtp" source attribute as specified in Section 6.3 of [9]), is used for out-of-band transport of parameter sets. However, when out-of-band transport of parameter sets is used, parameter sets MAY still be additionally transported in-band.

The answerer MAY use either out-of-band or in-band transport of parameter sets for the stream it is sending, regardless of whether out-of-band parameter sets transport has been used in the offerer-to-answerer direction. Parameter sets included in an answer are independent of those parameter sets included in the offer, as they are used for decoding two different video streams, one from the answerer to the offerer and the other in the opposite direction.

The following rules apply to transport of parameter sets in the offerer-to-answerer direction.

o An offer MAY include either or both of sprop-parameter-sets and sprop-level-parameter-sets. If neither sprop-parameter-sets nor sprop-level-parameter-sets is present in the offer, then only in-band transport of parameter sets is used.

o If the answer includes in-band-parameter-sets equal to 1, then the offerer MUST transmit parameter sets in-band. Otherwise, the following applies.

o If the level to use in the offerer-to-answerer direction is equal to the default level in the offer, the following applies.

When there is a sprop-parameter-sets included in the "a=fmtp" line in the offer, the answerer MUST be prepared to use the parameter sets included in the sprop-parameter-sets for decoding the incoming NAL unit stream.

When there is a sprop-parameter-sets conveyed using the "fmtp" source attribute in the offer, the following applies. If the answer includes use-level-src-parameter-sets equal to 1 or the "fmtp" source attribute, the answerer MUST be prepared to use the parameter sets included in the sprop-parameter-sets for decoding the incoming NAL unit stream; otherwise, the offerer MUST transmit parameter sets in-band.

When sprop-parameter-sets is not present in the offer, the offerer MUST transmit parameter sets in-band.

The answerer MUST ignore sprop-level-parameter-sets, when present (either included in the "a=fmtp" line or conveyed using the "fmtp" source attribute) in the offer.

o Otherwise, the level to use in the offerer-to-answerer direction is not equal to the default level in the offer, and the following applies.

The answerer MUST ignore sprop-parameter-sets, when present (either included in the "a=fmtp" line or conveyed using the "fmtp" source attribute) in the offer.

When neither use-level-src-parameter-sets is equal to 1 nor the "fmtp" source attribute is present in the answer, the answerer MUST ignore sprop-level-parameter-sets, when present in the offer, and the offerer MUST transmit parameter sets in-band.

When either use-level-src-parameter-sets is equal to 1 or the "fmtp" source attribute is present in the answer, the answerer MUST be prepared to use the parameter sets that are included in sprop-level-parameter-sets for the accepted level (i.e., the default level in the answer), when present in the offer, for decoding the incoming NAL unit stream, and ignore all other parameter sets included in sprop-level-parameter-sets.

When no parameter sets for the level to use in the offerer-to-answerer direction are present in sprop-level-parameter-sets in the offer, the offerer MUST transmit parameter sets in-band.
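
As a rough illustration, the offerer-to-answerer rules above can be condensed into a single decision function. The following Python sketch is not part of this memo: the Offer and Answer classes, their field names, and the function name are hypothetical, and levels are modeled as plain strings.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class Offer:
    # Hypothetical model of the parameter-set-related parts of an offer.
    default_level: str
    sprop_parameter_sets: Optional[str] = None
    sprop_via_source_attr: bool = False  # conveyed with the "fmtp" source attribute
    sprop_level_parameter_sets: Dict[str, str] = field(default_factory=dict)


@dataclass
class Answer:
    # Hypothetical model of the relevant parts of the answer.
    in_band_parameter_sets: bool = False
    use_level_src_parameter_sets: bool = False
    has_fmtp_source_attr: bool = False


def offerer_to_answerer_sets(offer: Offer, answer: Answer,
                             level: str) -> Tuple[str, Optional[str]]:
    """Return ("out-of-band", sets) or ("in-band", None) for the level in use."""
    in_band = ("in-band", None)
    if answer.in_band_parameter_sets:
        return in_band
    src_ok = answer.use_level_src_parameter_sets or answer.has_fmtp_source_attr
    if level == offer.default_level:
        # sprop-level-parameter-sets is ignored at the default level.
        if offer.sprop_parameter_sets is None:
            return in_band
        if offer.sprop_via_source_attr and not src_ok:
            return in_band
        return ("out-of-band", offer.sprop_parameter_sets)
    # Non-default level: sprop-parameter-sets is ignored.
    if not src_ok:
        return in_band
    sets = offer.sprop_level_parameter_sets.get(level)
    return ("out-of-band", sets) if sets is not None else in_band
```

In this model, an answer that carries neither use-level-src-parameter-sets nor the "fmtp" source attribute always falls back to in-band transport once the level is downgraded, which matches the legacy RFC 3984 case.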

The following rules apply to the transport of parameter sets in the answerer-to-offerer direction.

o An answer MAY include either sprop-parameter-sets or sprop-level-parameter-sets but MUST NOT include both. If neither sprop-parameter-sets nor sprop-level-parameter-sets is present in the answer, then only in-band transport of parameter sets is used.

o If the offer includes in-band-parameter-sets equal to 1, the answerer MUST NOT include sprop-parameter-sets or sprop-level-parameter-sets in the answer and MUST transmit parameter sets in-band. Otherwise, the following applies.

o If the level to use in the answerer-to-offerer direction is equal to the default level in the answer, the following applies.

When there is a sprop-parameter-sets included in the "a=fmtp" line in the answer, the offerer MUST be prepared to use the parameter sets included in the sprop-parameter-sets for decoding the incoming NAL unit stream.

When there is a sprop-parameter-sets conveyed using the "fmtp" source attribute in the answer, the following applies. If the offer includes use-level-src-parameter-sets equal to 1 or the "fmtp" source attribute, the offerer MUST be prepared to use the parameter sets included in the sprop-parameter-sets for decoding the incoming NAL unit stream; otherwise, the answerer MUST transmit parameter sets in-band.

When sprop-parameter-sets is not present in the answer, the answerer MUST transmit parameter sets in-band.

The offerer MUST ignore sprop-level-parameter-sets, when present (either included in the "a=fmtp" line or conveyed using the "fmtp" source attribute) in the answer.

o Otherwise, the level to use in the answerer-to-offerer direction is not equal to the default level in the answer, and the following applies.

The offerer MUST ignore sprop-parameter-sets when present (either included in the "a=fmtp" line of SDP or conveyed using the "fmtp" source attribute) in the answer.

When neither use-level-src-parameter-sets is equal to 1 nor the "fmtp" source attribute is present in the offer, the offerer MUST ignore sprop-level-parameter-sets, when present, and the answerer MUST transmit parameter sets in-band.

When either use-level-src-parameter-sets is equal to 1 or the "fmtp" source attribute is present in the offer, the offerer MUST be prepared to use the parameter sets that are included in sprop-level-parameter-sets for the level to use in the answerer-to-offerer direction, when present in the answer, for decoding the incoming NAL unit stream, and ignore all other parameter sets included in sprop-level-parameter-sets in the answer.

When no parameter sets for the level to use in the answerer-to-offerer direction are present in sprop-level-parameter-sets in the answer, the answerer MUST transmit parameter sets in-band.
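
The two restrictions that apply only to the answer (never both sprop parameters, and neither of them when the offer asked for in-band transport) lend themselves to a simple check. The sketch below is illustrative only; it assumes that the "a=fmtp" parameters of the offer and the answer have already been parsed into dictionaries, and the function name is hypothetical.

```python
from typing import Dict, List


def answer_violations(offer_fmtp: Dict[str, str],
                      answer_fmtp: Dict[str, str]) -> List[str]:
    """List the parameter-set rules that an answer breaks (empty if none)."""
    has_sps = "sprop-parameter-sets" in answer_fmtp
    has_level_sps = "sprop-level-parameter-sets" in answer_fmtp
    problems = []
    if has_sps and has_level_sps:
        problems.append("answer carries both sprop-parameter-sets and "
                        "sprop-level-parameter-sets")
    if offer_fmtp.get("in-band-parameter-sets") == "1" and (has_sps or has_level_sps):
        problems.append("offer set in-band-parameter-sets=1 but answer carries "
                        "out-of-band parameter sets")
    return problems
```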

When sprop-parameter-sets or sprop-level-parameter-sets is conveyed using the "fmtp" source attribute as specified in Section 6.3 of [9], the receiver of the parameters MUST store the parameter sets included in the sprop-parameter-sets or sprop-level-parameter-sets for the accepted level and associate them with the source given as a part of the "fmtp" source attribute. Parameter sets associated with one source MUST only be used to decode NAL units conveyed in RTP packets from the same source. When this mechanism is in use, SSRC collision detection and resolution MUST be performed as specified in [9].

Informative note: Conveyance of sprop-parameter-sets and sprop-level-parameter-sets using the "fmtp" source attribute may be used in topologies like Topo-Video-switch-MCU [29] to enable out-of-band transport of parameter sets.
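
One way to honor the per-source association is to key the stored parameter sets by SSRC and never fall back to another source's entries. The class below is a minimal sketch under that assumption; the class and method names are hypothetical, and SSRC collision handling per [9] is left to the caller.

```python
from collections import defaultdict
from typing import Dict, List


class SourceParameterSets:
    """Keeps out-of-band parameter sets separate per RTP source (SSRC)."""

    def __init__(self) -> None:
        self._by_ssrc: Dict[int, List[bytes]] = defaultdict(list)

    def store(self, ssrc: int, parameter_sets: List[bytes]) -> None:
        self._by_ssrc[ssrc].extend(parameter_sets)

    def for_source(self, ssrc: int) -> List[bytes]:
        # Only the sets stored for this exact source are returned.
        return list(self._by_ssrc.get(ssrc, []))

    def drop(self, ssrc: int) -> None:
        # Assumption: called by the application when a source goes away
        # or an SSRC collision has been resolved.
        self._by_ssrc.pop(ssrc, None)
```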

For streams being delivered over multicast, the following rules apply:

o The media format configuration is identified by "profile-level-id", including the level part, and packetization-mode. These media format configuration parameters (including the level part of profile-level-id) MUST be used symmetrically; that is, the answerer MUST either maintain all configuration parameters or remove the media format (payload type) completely. Note that this implies that the level part of profile-level-id for Offer/Answer in multicast is not changeable.

To simplify the handling and matching of these configurations, the same RTP payload type number used in the offer SHOULD also be used in the answer, as specified in [8]. An answer MUST NOT contain a payload type number used in the offer unless the configuration is the same as in the offer.

o Parameter sets received MUST be associated with the originating source and MUST only be used in decoding the incoming NAL unit stream from the same source.

o The rules for other parameters are the same as above for unicast as long as the above rules are obeyed.
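
The symmetry requirement for multicast can be checked mechanically: a payload type number taken over from the offer has to keep the offered profile-level-id (level part included) and packetization-mode. The following sketch assumes each media format has been reduced to a (profile-level-id, packetization-mode) pair keyed by payload type; the names are hypothetical.

```python
from typing import Dict, Tuple

Config = Tuple[str, int]  # (profile-level-id, packetization-mode)


def multicast_answer_is_valid(offer: Dict[int, Config],
                              answer: Dict[int, Config]) -> bool:
    """True if every payload type reused from the offer keeps its configuration."""
    for payload_type, (plid, mode) in answer.items():
        if payload_type not in offer:
            continue  # payload types not taken from the offer are not checked here
        offered_plid, offered_mode = offer[payload_type]
        # profile-level-id is hexadecimal, so compare it case-insensitively.
        if plid.upper() != offered_plid.upper() or mode != offered_mode:
            return False
    return True
```

Removing a payload type from the answer is always acceptable in this model; only a changed configuration under a reused number is flagged.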

Table 6 lists the interpretation of all the media type parameters that MUST be used for the different direction attributes.

Table 6. Interpretation of parameters for different direction attributes

                                              sendonly --+
                                           recvonly --+  |
                                        sendrecv --+  |  |
                                                   |  |  |
                profile-level-id                   C  C  P
                max-recv-level                     R  R  -
                packetization-mode                 C  C  P
                sprop-deint-buf-req                P  -  P
                sprop-interleaving-depth           P  -  P
                sprop-max-don-diff                 P  -  P
                sprop-init-buf-time                P  -  P
                max-mbps                           R  R  -
                max-smbps                          R  R  -
                max-fs                             R  R  -
                max-cpb                            R  R  -
                max-dpb                            R  R  -
                max-br                             R  R  -
                redundant-pic-cap                  R  R  -
                deint-buf-cap                      R  R  -
                max-rcmd-nalu-size                 R  R  -
                sar-understood                     R  R  -
                sar-supported                      R  R  -
                in-band-parameter-sets             R  R  -
                use-level-src-parameter-sets       R  R  -
                level-asymmetry-allowed            O  -  -
                sprop-parameter-sets               S  -  S
                sprop-level-parameter-sets         S  -  S
        
Legend:

             C: configuration for sending and receiving streams
             O: offer/answer mode
             P: properties of the stream to be sent
             R: receiver capabilities
             S: out-of-band parameter sets
             -: not usable (when present, SHOULD be ignored)
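
For implementers, Table 6 can be carried as a small lookup table. The sketch below transcribes the table into Python, one letter per direction attribute in column order (sendrecv, recvonly, sendonly), and drops the parameters marked "-" for a given direction. The names TABLE_6, DIRECTIONS, and usable_parameters are hypothetical, and parameters not listed in the table are passed through unchanged as a simplification.

```python
from typing import Dict

# One letter per direction attribute, in Table 6 column order.
DIRECTIONS = ("sendrecv", "recvonly", "sendonly")
TABLE_6 = {
    "profile-level-id": "CCP",
    "max-recv-level": "RR-",
    "packetization-mode": "CCP",
    "sprop-deint-buf-req": "P-P",
    "sprop-interleaving-depth": "P-P",
    "sprop-max-don-diff": "P-P",
    "sprop-init-buf-time": "P-P",
    "max-mbps": "RR-",
    "max-smbps": "RR-",
    "max-fs": "RR-",
    "max-cpb": "RR-",
    "max-dpb": "RR-",
    "max-br": "RR-",
    "redundant-pic-cap": "RR-",
    "deint-buf-cap": "RR-",
    "max-rcmd-nalu-size": "RR-",
    "sar-understood": "RR-",
    "sar-supported": "RR-",
    "in-band-parameter-sets": "RR-",
    "use-level-src-parameter-sets": "RR-",
    "level-asymmetry-allowed": "O--",
    "sprop-parameter-sets": "S-S",
    "sprop-level-parameter-sets": "S-S",
}


def usable_parameters(fmtp: Dict[str, str], direction: str) -> Dict[str, str]:
    """Drop parameters that Table 6 marks as not usable for this direction."""
    column = DIRECTIONS.index(direction)
    return {name: value for name, value in fmtp.items()
            if TABLE_6.get(name, "???")[column] != "-"}
```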
        
Parameters used for declaring receiver capabilities are in general downgradable; that is, they express the upper limit for a sender's possible behavior. Thus, a sender MAY configure its encoder using values that are lower than or equal to the values of these parameters.
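
Treating receiver-capability parameters as upper limits amounts to a simple bounds check on the sender side. The sketch below assumes numeric parameters already converted to integers; the function name is hypothetical, and parameters the receiver did not declare are not checked here.

```python
from typing import Dict


def within_receiver_limits(receiver_caps: Dict[str, int],
                           sender_values: Dict[str, int]) -> bool:
    """True if no sender value exceeds the corresponding declared upper limit."""
    return all(value <= receiver_caps[name]
               for name, value in sender_values.items()
               if name in receiver_caps)
```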

Parameters declaring a configuration point are not changeable, with the exception of the level part of the profile-level-id parameter for unicast usage.

When a sender's capabilities are declared and non-downgradable parameters are used in this declaration, these parameters express a configuration that is acceptable for the sender to receive streams. In order to achieve high interoperability levels, it is often advisable to offer multiple alternative configurations, e.g., for the packetization mode. It is impossible to offer multiple configurations in a single payload type. Thus, when multiple configuration offers are made, each offer requires its own RTP payload type associated with the offer.

A receiver SHOULD understand all media type parameters, even if it only supports a subset of the payload format's functionality. This ensures that a receiver is capable of understanding when an offer to receive media can be downgraded to what is supported by the receiver of the offer.

An answerer MAY extend the offer with additional media format configurations. However, to enable their usage, in most cases, a second offer is required from the offerer to provide the stream property parameters that the media sender will use. This also has the effect that the offerer has to be able to receive this media format configuration, not only to send it.

If an offerer wishes to have non-symmetric capabilities between sending and receiving, the offerer can allow asymmetric levels via level-asymmetry-allowed being equal to 1. Alternatively, the offerer could offer different RTP sessions, i.e., different media lines declared as "recvonly" and "sendonly", respectively. This may have further implications on the system and may require additional external semantics to associate the two media lines.

8.2.3. Usage in Declarative Session Descriptions

When H.264 over RTP is offered with SDP in a declarative style, as in Real Time Streaming Protocol (RTSP) [27] or Session Announcement Protocol (SAP) [28], the following considerations are necessary.

o All parameters capable of indicating both stream properties and receiver capabilities are used to indicate only stream properties. For example, in this case, the parameter profile-level-id declares only the values used by the stream, not the capabilities for receiving streams. The result of this is that the following interpretation of the parameters MUST be used:

Declaring actual configuration or stream properties:

- profile-level-id
- packetization-mode
- sprop-interleaving-depth
- sprop-deint-buf-req
- sprop-max-don-diff
- sprop-init-buf-time

Out-of-band transporting of parameter sets:

- sprop-parameter-sets
- sprop-level-parameter-sets

Not usable (when present, they SHOULD be ignored):

- max-mbps
- max-smbps
- max-fs
- max-cpb
- max-dpb
- max-br
- max-recv-level
- redundant-pic-cap
- max-rcmd-nalu-size
- deint-buf-cap
- sar-understood
- sar-supported
- in-band-parameter-sets
- level-asymmetry-allowed
- use-level-src-parameter-sets

o A receiver of the SDP is required to support all parameters and values of the parameters provided; otherwise, the receiver MUST reject (RTSP) or not participate in (SAP) the session. It falls on the creator of the session to use values that are expected to be supported by the receiving application.
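
A receiver of a declarative session description could sort the "a=fmtp" parameters into the three groups listed above before deciding whether to join. The sketch below is one possible arrangement, not a mandated procedure: the set and function names are hypothetical, and parameter names outside the three groups are returned separately so that the caller can decide whether to reject (RTSP) or not participate in (SAP) the session.

```python
from typing import Dict, List, Tuple

STREAM_PROPERTIES = {
    "profile-level-id", "packetization-mode", "sprop-interleaving-depth",
    "sprop-deint-buf-req", "sprop-max-don-diff", "sprop-init-buf-time",
}
OUT_OF_BAND_SETS = {"sprop-parameter-sets", "sprop-level-parameter-sets"}
NOT_USABLE = {
    "max-mbps", "max-smbps", "max-fs", "max-cpb", "max-dpb", "max-br",
    "max-recv-level", "redundant-pic-cap", "max-rcmd-nalu-size",
    "deint-buf-cap", "sar-understood", "sar-supported",
    "in-band-parameter-sets", "level-asymmetry-allowed",
    "use-level-src-parameter-sets",
}


def split_declarative(
        fmtp: Dict[str, str]) -> Tuple[Dict[str, str], Dict[str, str], List[str]]:
    """Return (stream properties, out-of-band sets, unrecognized names)."""
    known = STREAM_PROPERTIES | OUT_OF_BAND_SETS | NOT_USABLE
    properties = {k: v for k, v in fmtp.items() if k in STREAM_PROPERTIES}
    parameter_sets = {k: v for k, v in fmtp.items() if k in OUT_OF_BAND_SETS}
    unrecognized = sorted(k for k in fmtp if k not in known)
    return properties, parameter_sets, unrecognized
```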

8.3. Examples

An SDP Offer/Answer exchange wherein both parties are expected to both send and receive could look like the following. Only the media-codec-specific parts of the SDP are shown. Some lines are wrapped due to text constraints.

Offerer -> Answerer SDP message:

      m=video 49170 RTP/AVP 100 99 98
      a=rtpmap:98 H264/90000
      a=fmtp:98 profile-level-id=42A01E; packetization-mode=0;
        sprop-parameter-sets=<parameter sets data#0>
      a=rtpmap:99 H264/90000
      a=fmtp:99 profile-level-id=42A01E; packetization-mode=1;
        sprop-parameter-sets=<parameter sets data#1>
      a=rtpmap:100 H264/90000
      a=fmtp:100 profile-level-id=42A01E; packetization-mode=2;
        sprop-parameter-sets=<parameter sets data#2>;
        sprop-interleaving-depth=45; sprop-deint-buf-req=64000;
        sprop-init-buf-time=102478; deint-buf-cap=128000
        
The above offer presents the same codec configuration in three different packetization formats. Payload type 98 represents single NALU mode, payload type 99 represents non-interleaved mode, and payload type 100 indicates the interleaved mode. In the interleaved mode case, the interleaving parameters that the offerer would use if the answer indicates support for payload type 100 are also included. In all three cases, the parameter sprop-parameter-sets conveys the initial parameter sets that are required by the answerer when receiving a stream from the offerer when this configuration is accepted. Note that the value for sprop-parameter-sets could be different for each payload type.
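
To work with such an offer programmatically, each "a=fmtp" line first has to be split into its payload type and its semicolon-separated parameters. The following sketch assumes that wrapped lines have already been joined into a single line; the function name is hypothetical.

```python
from typing import Dict, Tuple


def parse_fmtp(line: str) -> Tuple[int, Dict[str, str]]:
    """Split an unwrapped "a=fmtp:" line into payload type and parameters."""
    prefix = "a=fmtp:"
    if not line.startswith(prefix):
        raise ValueError("not an a=fmtp line")
    payload_type, _, rest = line[len(prefix):].partition(" ")
    params = {}
    for item in rest.split(";"):
        # Split at the first "=" only, so values may themselves contain "=".
        name, sep, value = item.strip().partition("=")
        if sep:  # skip empty items such as a trailing ";"
            params[name] = value
    return int(payload_type), params
```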

Answerer -> Offerer SDP message:

      m=video 49170 RTP/AVP 100 99 97
      a=rtpmap:97 H264/90000
      a=fmtp:97 profile-level-id=42A01E; packetization-mode=0;
        sprop-parameter-sets=<parameter sets data#3>
      a=rtpmap:99 H264/90000
      a=fmtp:99 profile-level-id=42A01E; packetization-mode=1;
        sprop-parameter-sets=<parameter sets data#4>;
        max-rcmd-nalu-size=3980
      a=rtpmap:100 H264/90000
      a=fmtp:100 profile-level-id=42A01E; packetization-mode=2;
        sprop-parameter-sets=<parameter sets data#5>;
        sprop-interleaving-depth=60;
        sprop-deint-buf-req=86000; sprop-init-buf-time=156320;
        deint-buf-cap=128000; max-rcmd-nalu-size=3980
        
As the Offer/Answer negotiation covers both sending and receiving streams, an offer indicates the exact parameters for what the offerer is willing to receive, whereas the answer indicates the same for what the answerer is willing to receive. In this case, the offerer declared that it is willing to receive payload type 98. The answerer accepts this by declaring an equivalent payload type 97; that is, it has identical values for the two parameters profile-level-id and packetization-mode (since packetization-mode is equal to 0 and sprop-deint-buf-req is not present). As the offered payload type 98 is accepted, the answerer needs to store parameter sets included in sprop-parameter-sets=<parameter sets data#0> in case the offerer finally decides to use this configuration. In the answer, the answerer includes the parameter sets in sprop-parameter-sets=<parameter sets data#3> that the answerer would use in the stream sent from the answerer if this configuration is finally used.
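
Because the answer may use a different payload type number for an equivalent configuration (97 for the offered 98 in this example), matching has to be done on the configuration rather than on the number. The sketch below compares only profile-level-id and packetization-mode, treating an absent packetization-mode as 0 (single NAL unit mode); the function name is hypothetical, and level downgrading is deliberately not modeled.

```python
from typing import Dict, Optional


def find_equivalent(offered: Dict[str, str],
                    answered: Dict[int, Dict[str, str]]) -> Optional[int]:
    """Return the answer payload type whose configuration equals the offered one."""
    def key(fmtp: Dict[str, str]):
        return (fmtp["profile-level-id"].upper(),
                fmtp.get("packetization-mode", "0"))

    for payload_type, fmtp in answered.items():
        if key(fmtp) == key(offered):
            return payload_type
    return None
```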

The answerer also accepts the reception of the two configurations that payload types 99 and 100 represent. Again, the answerer needs to store parameter sets included in sprop-parameter-sets=<parameter sets data#1> and sprop-parameter-sets=<parameter sets data#2> in case the offerer finally decides to use either of these two configurations. The answerer provides the initial parameter sets for the answerer-to-offerer direction, i.e., the parameter sets in sprop-parameter-sets=<parameter sets data#4> and sprop-parameter-sets=<parameter sets data#5>, for payload types 99 and 100, respectively, that it will use to send the payload types. The answerer also provides the offerer with its memory limit for de-interleaving operations by providing a deint-buf-cap parameter. This is only useful if the offerer decides on making a second offer, where it can take the new value into account. The max-rcmd-nalu-size indicates that the answerer can efficiently process NALUs up to the size of 3980 bytes. However, there is no guarantee that the network supports this size.

In the following example, the offer is accepted without level downgrading (i.e., the default level, Level 3.0, is accepted), and both sprop-parameter-sets and sprop-level-parameter-sets are present in the offer. The answerer must ignore sprop-level-parameter-sets=<parameter sets data#1> and store parameter sets in sprop-parameter-sets=<parameter sets data#0> for decoding the incoming NAL unit stream. The offerer must store the parameter sets in sprop-parameter-sets=<parameter sets data#2> in the answer for decoding the incoming NAL unit stream. Note that in this example, parameter sets in sprop-parameter-sets=<parameter sets data#2> must be associated with Level 3.0.

Offer SDP:

      m=video 49170 RTP/AVP 98
      a=rtpmap:98 H264/90000
      a=fmtp:98 profile-level-id=42A01E; //Baseline profile, Level 3.0
        packetization-mode=1;
        sprop-parameter-sets=<parameter sets data#0>;
        sprop-level-parameter-sets=<parameter sets data#1>
        
Answer SDP:

      m=video 49170 RTP/AVP 98
      a=rtpmap:98 H264/90000
      a=fmtp:98 profile-level-id=42A01E; //Baseline profile, Level 3.0
        packetization-mode=1;
        sprop-parameter-sets=<parameter sets data#2>
        
In the following example, the offer (Baseline profile, Level 1.1) is accepted with level downgrading (the accepted level is Level 1b), and both sprop-parameter-sets and sprop-level-parameter-sets are present in the offer. The answerer must ignore sprop-parameter-sets=<parameter sets data#0> and all parameter sets not for the accepted level (Level 1b) in sprop-level-parameter-sets=<parameter sets data#1> and must store parameter sets for the accepted level (Level 1b) in sprop-level-parameter-sets=<parameter sets data#1> for decoding the incoming NAL unit stream. The offerer must store the parameter sets in sprop-parameter-sets=<parameter sets data#2> in the answer for decoding the incoming NAL unit stream. Note that in this example, parameter sets in sprop-parameter-sets=<parameter sets data#2> must be associated with Level 1b.

Offer SDP:

      m=video 49170 RTP/AVP 98
      a=rtpmap:98 H264/90000
      a=fmtp:98 profile-level-id=42A00B; //Baseline profile, Level 1.1
        packetization-mode=1;
        sprop-parameter-sets=<parameter sets data#0>;
        sprop-level-parameter-sets=<parameter sets data#1>
        
Answer SDP:

      m=video 49170 RTP/AVP 98
      a=rtpmap:98 H264/90000
      a=fmtp:98 profile-level-id=42B00B; //Baseline profile, Level 1b
        packetization-mode=1;
        sprop-parameter-sets=<parameter sets data#2>;
        use-level-src-parameter-sets=1
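
The profile-level-id values in these examples are three hexadecimal bytes: profile_idc, the constraint flags byte (profile-iop), and level_idc. The sketch below decodes them in a way that is consistent with the comments in the examples (42A01E as Baseline, Level 3.0; 42A00B as Level 1.1; 42B00B as Level 1b). The function name is hypothetical, and Level 1b is recognized only for profile_idc 66, 77, and 88, where it is signaled by level_idc 11 together with constraint_set3_flag; other profiles are not covered.

```python
from typing import Tuple


def decode_profile_level_id(value: str) -> Tuple[int, int, str]:
    """Split profile-level-id into (profile_idc, profile-iop, level name)."""
    raw = bytes.fromhex(value)
    if len(raw) != 3:
        raise ValueError("profile-level-id must be six hexadecimal digits")
    profile_idc, profile_iop, level_idc = raw
    constraint_set3 = bool(profile_iop & 0x10)
    # Baseline (66), Main (77), Extended (88): level_idc 11 plus
    # constraint_set3_flag signals Level 1b instead of Level 1.1.
    if level_idc == 11 and constraint_set3 and profile_idc in (66, 77, 88):
        level = "1b"
    else:
        level = f"{level_idc // 10}.{level_idc % 10}"
    return profile_idc, profile_iop, level
```

Comparing the decoded level of the answer with that of the offer is one way to detect that level downgrading has taken place.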
        
In the following example, the offer (Baseline profile, Level 1.1) is accepted with level downgrading (the accepted level is Level 1b), and both sprop-parameter-sets and sprop-level-parameter-sets are present in the offer. However, the answerer is a legacy RFC 3984 implementation and does not understand sprop-level-parameter-sets; hence, it does not include use-level-src-parameter-sets (which the answerer does not understand either) in the answer. Therefore, the answerer must ignore both sprop-parameter-sets=<parameter sets data#0> and sprop-level-parameter-sets=<parameter sets data#1>, and the offerer must transport parameter sets in-band.

Offer SDP:

      m=video 49170 RTP/AVP 98
      a=rtpmap:98 H264/90000
      a=fmtp:98 profile-level-id=42A00B; //Baseline profile, Level 1.1
        packetization-mode=1;
        sprop-parameter-sets=<parameter sets data#0>;
        sprop-level-parameter-sets=<parameter sets data#1>
        
Answer SDP:

      m=video 49170 RTP/AVP 98
      a=rtpmap:98 H264/90000
      a=fmtp:98 profile-level-id=42B00B; //Baseline profile, Level 1b
        packetization-mode=1
        
In the following example, the offer is accepted without level downgrading, and sprop-parameter-sets is present in the offer. Parameter sets in sprop-parameter-sets=<parameter sets data#0> must be stored and used by the encoder of the offerer and the decoder of the answerer, and parameter sets in sprop-parameter-sets=<parameter sets data#1> must be used by the encoder of the answerer and the decoder of the offerer. Note that sprop-parameter-sets=<parameter sets data#0> is basically independent of sprop-parameter-sets=<parameter sets data#1>.

Offer SDP:

      m=video 49170 RTP/AVP 98
      a=rtpmap:98 H264/90000
      a=fmtp:98 profile-level-id=42A01E; //Baseline profile, Level 3.0
        packetization-mode=1;
        sprop-parameter-sets=<parameter sets data#0>
        
Answer SDP:

      m=video 49170 RTP/AVP 98
      a=rtpmap:98 H264/90000
      a=fmtp:98 profile-level-id=42A01E; //Baseline profile, Level 3.0
        packetization-mode=1;
        sprop-parameter-sets=<parameter sets data#1>
        

In the following example, the offer is accepted without level downgrading, and neither sprop-parameter-sets nor sprop-level-parameter-sets is present in the offer, meaning that there is no out-of-band transmission of parameter sets, which then have to be transported in-band.

在以下示例中,接受报价时不进行级别降级,报价中既不存在sprop-parameter-sets,也不存在sprop-level-parameter-sets,这意味着不存在参数集的带外传输,这些参数集随后必须在带内传输。

Offer SDP:

提供SDP:

      m=video 49170 RTP/AVP 98
      a=rtpmap:98 H264/90000
      a=fmtp:98 profile-level-id=42A01E; //Baseline profile, Level 3.0
        packetization-mode=1
        

Answer SDP:

答复:

      m=video 49170 RTP/AVP 98
      a=rtpmap:98 H264/90000
      a=fmtp:98 profile-level-id=42A01E; //Baseline profile, Level 3.0
        packetization-mode=1
        

In the following example, the offer is accepted with level downgrading, and sprop-parameter-sets is present in the offer. Because sprop-parameter-sets=<parameter sets data#0> contains level_idc indicating Level 3.0 while the answerer wants Level 2.0, it cannot be used: the answerer must ignore it, and in-band parameter sets must be used.

在以下示例中,报价在级别降级的情况下被接受,并且报价中存在sprop-parameter-sets。由于sprop-parameter-sets=<parameter sets data#0>包含表示级别3.0的level_idc,而应答人需要级别2.0,因此它无法使用:应答人必须忽略它,并且必须使用带内参数集。

Offer SDP:

提供SDP:

      m=video 49170 RTP/AVP 98
      a=rtpmap:98 H264/90000
      a=fmtp:98 profile-level-id=42A01E; //Baseline profile, Level 3.0
        packetization-mode=1;
        sprop-parameter-sets=<parameter sets data#0>
        

Answer SDP:

答复:

      m=video 49170 RTP/AVP 98
      a=rtpmap:98 H264/90000
      a=fmtp:98 profile-level-id=42A014; //Baseline profile, Level 2.0
        packetization-mode=1
        

In the following example, the offer is also accepted with level downgrading, and neither sprop-parameter-sets nor sprop-level-parameter-sets is present in the offer, meaning that there is no out-of-band transmission of parameter sets, which then have to be transported in-band.

在以下示例中,报价同样在级别降级的情况下被接受,报价中既不存在sprop-parameter-sets,也不存在sprop-level-parameter-sets,这意味着不存在参数集的带外传输,这些参数集随后必须在带内传输。

Offer SDP:

提供SDP:

      m=video 49170 RTP/AVP 98
      a=rtpmap:98 H264/90000
      a=fmtp:98 profile-level-id=42A01E; //Baseline profile, Level 3.0
        packetization-mode=1
        

Answer SDP:

答复:

      m=video 49170 RTP/AVP 98
      a=rtpmap:98 H264/90000
      a=fmtp:98 profile-level-id=42A014; //Baseline profile, Level 2.0
        packetization-mode=1
        

In the following example, the offer is accepted with level upgrading, and neither sprop-parameter-sets nor sprop-level-parameter-sets is present in the offer or the answer, meaning that there is no out-of-band transmission of parameter sets, which then have to be transported in-band. The level to use in the offerer-to-answerer direction is Level 3.0, and the level to use in the answerer-to-offerer direction is Level 2.0. The answerer is allowed to send at any level up to and including Level 2.0, and the offerer is allowed to send at any level up to and including Level 3.0.

在以下示例中,报价在级别升级的情况下被接受,报价和应答中既不存在sprop-parameter-sets,也不存在sprop-level-parameter-sets,这意味着不存在参数集的带外传输,这些参数集随后必须在带内传输。在报价人到应答人方向中使用的级别为3.0级,在应答人到报价人方向中使用的级别为2.0级。应答人可以在2.0级及以下的任何级别发送,报价人可以在3.0级及以下的任何级别发送。

Offer SDP:

提供SDP:

      m=video 49170 RTP/AVP 98
      a=rtpmap:98 H264/90000
      a=fmtp:98 profile-level-id=42A014; //Baseline profile, Level 2.0
        packetization-mode=1; level-asymmetry-allowed=1
        

Answer SDP:

答复:

      m=video 49170 RTP/AVP 98
      a=rtpmap:98 H264/90000
      a=fmtp:98 profile-level-id=42A01E; //Baseline profile, Level 3.0
        packetization-mode=1; level-asymmetry-allowed=1
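
The level selection in this example can be sketched as follows. This is an illustrative Python fragment under stated assumptions, not normative text: the function name negotiated_send_levels is hypothetical, levels are given as level_idc integers (20 for Level 2.0, 30 for Level 3.0), Level 1b is not handled, and the symmetric case simply takes the lower of the two levels.

```python
def negotiated_send_levels(offer_level: int, answer_level: int,
                           offer_asymmetry: bool, answer_asymmetry: bool):
    """Return (offerer_max_send_level, answerer_max_send_level).

    Levels are level_idc values. With level-asymmetry-allowed=1 on both
    sides, each side states what it can receive, so the peer may send up
    to that level. Otherwise the same level applies in both directions.
    """
    if offer_asymmetry and answer_asymmetry:
        return answer_level, offer_level
    # Assumption of this sketch: without asymmetry, both directions use
    # the lower of the two signaled levels.
    common = min(offer_level, answer_level)
    return common, common


# Offer Level 2.0, answer Level 3.0, asymmetry allowed by both sides.
offerer_max, answerer_max = negotiated_send_levels(20, 30, True, True)
print(offerer_max, answerer_max)
```

For the offer and answer shown above, the offerer may send up to level_idc 30 (Level 3.0) and the answerer up to level_idc 20 (Level 2.0).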
        

In the following example, the offerer is a Multipoint Control Unit (MCU) in a topology like Topo-Video-switch-MCU [29], offering parameter sets received (using out-of-band transport) from three other participants (B, C, and D) and receiving parameter sets from the participant A, which is the answerer. The participants are identified by their values of canonical name (CNAME), which are mapped to different SSRC values. The same codec configuration is used by all four participants. The participant A stores and associates the parameter sets included in <parameter sets data#B>, <parameter sets data#C>, and <parameter sets data#D> to participants B, C, and D, respectively, and uses <parameter sets data#B> for decoding NAL units carried in RTP packets originating from participant B only, uses <parameter sets data#C> for decoding NAL units carried in RTP packets originating from participant C only, and uses <parameter sets data#D> for decoding NAL units carried in RTP packets originating from participant D only.

在以下示例中,报价人是类似Topo-Video-switch-MCU[29]的拓扑结构中的多点控制单元(MCU),它提供从其他三个参与者(B、C和D)接收(使用带外传输)的参数集,并从作为应答人的参与者A接收参数集。参与者通过其规范名称(CNAME)的值进行标识,这些值映射到不同的SSRC值。所有四个参与者都使用相同的编解码器配置。参与者A将<parameter sets data#B>、<parameter sets data#C>和<parameter sets data#D>中包含的参数集分别存储并关联到参与者B、C和D,并且仅使用<parameter sets data#B>对源自参与者B的RTP分组中携带的NAL单元进行解码,仅使用<parameter sets data#C>对源自参与者C的RTP分组中携带的NAL单元进行解码,仅使用<parameter sets data#D>对源自参与者D的RTP分组中携带的NAL单元进行解码。

Offer SDP:

提供SDP:

      m=video 49170 RTP/AVP 98
      a=ssrc:SSRC-B cname:CNAME-B
      a=ssrc:SSRC-C cname:CNAME-C
      a=ssrc:SSRC-D cname:CNAME-D
      a=ssrc:SSRC-B fmtp:98
        sprop-parameter-sets=<parameter sets data#B>
      a=ssrc:SSRC-C fmtp:98
        sprop-parameter-sets=<parameter sets data#C>
      a=ssrc:SSRC-D fmtp:98
        sprop-parameter-sets=<parameter sets data#D>
      a=rtpmap:98 H264/90000
      a=fmtp:98 profile-level-id=42A01E; //Baseline profile, Level 3.0
        packetization-mode=1
        

Answer SDP:

答复:

      m=video 49170 RTP/AVP 98
      a=ssrc:SSRC-A cname:CNAME-A
      a=ssrc:SSRC-A fmtp:98
        sprop-parameter-sets=<parameter sets data#A>
      a=rtpmap:98 H264/90000
      a=fmtp:98 profile-level-id=42A01E; //Baseline profile, Level 3.0
        packetization-mode=1
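
The per-source association in this example can be sketched as a store keyed by SSRC. The following Python fragment is only an illustration: ParameterSetStore, read_ue, and parameter_set_key are hypothetical names, the sprop-parameter-sets value is assumed to be a comma-separated list of base64-encoded NAL units, and emulation prevention bytes are ignored when reading the identifiers.

```python
import base64

NAL_SPS, NAL_PPS = 7, 8


def read_ue(data: bytes, bit_pos: int):
    """Decode one unsigned Exp-Golomb value; return (value, next_bit_pos)."""
    def bit(i: int) -> int:
        return (data[i >> 3] >> (7 - (i & 7))) & 1

    zeros = 0
    while bit(bit_pos) == 0:
        zeros += 1
        bit_pos += 1
    bit_pos += 1  # skip the terminating 1 bit
    suffix = 0
    for _ in range(zeros):
        suffix = (suffix << 1) | bit(bit_pos)
        bit_pos += 1
    return (1 << zeros) - 1 + suffix, bit_pos


def parameter_set_key(nal: bytes):
    """Return (nal_unit_type, parameter set identifier) for an SPS or PPS."""
    nal_type = nal[0] & 0x1F
    if nal_type == NAL_SPS:
        # Skip NAL header, profile_idc, constraint flags and level_idc.
        set_id, _ = read_ue(nal, 32)
    elif nal_type == NAL_PPS:
        set_id, _ = read_ue(nal, 8)  # pic_parameter_set_id comes first
    else:
        raise ValueError("not a parameter set NAL unit")
    return nal_type, set_id


class ParameterSetStore:
    """Keeps parameter sets separate per source, as identifiers may collide."""

    def __init__(self):
        self._sets = {}

    def add_sprop(self, ssrc, sprop_parameter_sets: str) -> None:
        per_source = self._sets.setdefault(ssrc, {})
        for item in sprop_parameter_sets.split(","):
            nal = base64.b64decode(item)
            per_source[parameter_set_key(nal)] = nal

    def lookup(self, ssrc, nal_type: int, set_id: int):
        return self._sets.get(ssrc, {}).get((nal_type, set_id))
```

A receiver such as participant A would call add_sprop once per a=ssrc fmtp line and then look up parameter sets using the SSRC of the incoming RTP packet, so that two sources using the same identifier do not overwrite each other.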
        
8.4. Parameter Set Considerations
8.4. 参数集注意事项

The H.264 parameter sets are a fundamental part of the video codec and vital to its operation (see Section 1.2). Due to their characteristics and their importance for the decoding process, lost or erroneously transmitted parameter sets can hardly be concealed locally at the receiver. A reference to a corrupt parameter set normally has fatal results to the decoding process. Corruption could occur, for example, due to the erroneous transmission or loss of a parameter set NAL unit but also due to the untimely transmission of a parameter set update. A parameter set update refers to a change of at least one parameter in a picture parameter set or sequence parameter set for which the picture parameter set or sequence parameter set identifier remains unchanged. Therefore, the following recommendations are provided as a guideline for the implementer of the RTP sender.

H.264参数集是视频编解码器的基本组成部分,对其运行至关重要(见第1.2节)。由于其特性及其对解码过程的重要性,丢失或错误传输的参数集很难在接收器处被本地隐藏。对损坏参数集的引用通常会对解码过程产生致命的结果。例如,由于参数集NAL单元的错误传输或丢失,以及由于参数集更新的不及时传输,可能会发生损坏。参数集更新是指图片参数集或序列参数集标识符保持不变的图片参数集或序列参数集中至少一个参数的更改。因此,以下建议作为RTP发送器实现者的指南提供。

Parameter set NALUs can be transported using three different principles:

可以使用三种不同的原则传输参数集NALU:

A. Using a session control protocol (out-of-band) prior to the actual RTP session.

A.在实际RTP会话之前使用会话控制协议(带外)。

B. Using a session control protocol (out-of-band) during an ongoing RTP session.

B.在正在进行的RTP会话期间使用会话控制协议(带外)。

C. Within the RTP packet stream in the payload (in-band) during an ongoing RTP session.

C.在正在进行的RTP会话期间,在有效负载(带内)中的RTP分组流内。

It is recommended to implement principles A and B within a session control protocol. SIP and SDP can be used as described in the SDP Offer/Answer model and in the previous sections of this memo. Section 8.2.2 includes a detailed discussion on transport of parameter sets in-band or out-of-band in SDP Offer/Answer using media type parameters sprop-parameter-sets, sprop-level-parameter-sets, use-level-src-parameter-sets, and in-band-parameter-sets. This section contains guidelines on how principles A and B should be implemented within session control protocols. It is independent of the particular protocol used. Principle C is supported by the RTP payload format defined in this specification. There are topologies like Topo-Video-switch-MCU [29] for which the use of principle C may be desirable.

建议在会话控制协议中实现原则A和原则B。SIP和SDP可按照SDP报价/应答模型和本备忘录前面章节的说明使用。第8.2.2节详细讨论了使用媒体类型参数sprop-parameter-sets、sprop-level-parameter-sets、use-level-src-parameter-sets和in-band-parameter-sets在SDP报价/应答中带内或带外传输参数集。本节包含如何在会话控制协议中实现原则A和原则B的指南。它独立于所使用的特定协议。原则C由本规范中定义的RTP有效负载格式支持。对于某些拓扑,如Topo-Video-switch-MCU[29],使用原则C可能是可取的。

If in-band signaling of parameter sets is used, the picture and sequence parameter set NALUs SHOULD be transmitted in the RTP payload using a reliable method of delivering of RTP (see below), as a loss of a parameter set of either type will likely prevent decoding of a considerable portion of the corresponding RTP packet stream.

如果使用参数集的带内信令,则应使用可靠的RTP传送方法(参见下文)在RTP有效载荷中传送图片和序列参数集NALUs,因为任一类型的参数集的丢失可能会阻止相应RTP分组流的相当大一部分的解码。

If in-band signaling of parameter sets is used, the sender SHOULD take the error characteristics into account and use mechanisms to provide a high probability for delivering the parameter sets correctly. Mechanisms that increase the probability for a correct reception include packet repetition, FEC, and retransmission. The use of an unreliable, out-of-band control protocol has disadvantages similar to those of in-band signaling (possible loss) and, in addition, may lead to difficulties in synchronization (see below). Therefore, it is NOT RECOMMENDED.

如果使用参数集的带内信令,发送方应考虑错误特性,并使用机制提供正确交付参数集的高概率。增加正确接收概率的机制包括分组重复、FEC和重传。使用不可靠的带外控制协议与带内信令(可能丢失)具有类似的缺点,此外,还可能导致同步困难(见下文)。因此,不建议这样做。

Parameter sets MAY be added or updated during the lifetime of a session using principles B and C. It is required that parameter sets be present at the decoder prior to the NAL units that refer to them. Update or addition of parameter sets can result in further problems; therefore, the following recommendations should be considered.

在会话的生命周期内,可以使用原则B和C添加或更新参数集。要求参数集在引用它们的NAL单元之前出现在解码器中。更新或添加参数集可能导致进一步的问题;因此,应考虑以下建议。

- When parameter sets are added or updated, care SHOULD be taken to ensure that any parameter set is delivered prior to its usage. When new parameter sets are added, previously unused parameter set identifiers are used. It is common that no synchronization is present between out-of-band signaling and in-band traffic. If out-of-band signaling is used, it is RECOMMENDED that a sender not start sending NALUs requiring the added or updated parameter sets prior to acknowledgement of delivery from the signaling protocol.

- 添加或更新参数集时,应注意确保任何参数集在使用前交付。添加新参数集时,将使用以前未使用的参数集标识符。带外信令和带内业务之间通常不存在同步。如果使用带外信令,建议发送方在确认信令协议的发送之前,不要开始发送需要添加或更新参数集的NALU。

- When parameter sets are updated, the following synchronization issue should be taken into account. When overwriting a parameter set at the receiver, the sender has to ensure that the parameter set in question is not needed by any NALU present in the network or receiver buffers. Otherwise, decoding with a wrong parameter set may occur. To lessen this problem, it is RECOMMENDED either to overwrite only those parameter sets that have not been used for a sufficiently long time (to ensure that all related NALUs have been consumed) or to add a new parameter set instead (which may have negative consequences for the efficiency of the video coding).

- 更新参数集时,应考虑以下同步问题。当覆盖接收器上的参数集时,发送方必须确保网络或接收器缓冲区中的任何NALU都不需要有问题的参数集。否则,可能会使用错误的参数集进行解码。为了减少这个问题,建议只覆盖那些在足够长的时间内没有使用的参数集(以确保所有相关的NALU都已使用),或者添加一个新的参数集(这可能会对视频编码的效率产生负面影响)。

Informative note: In some topologies like Topo-Video-switch-MCU [29], the parameter sets may originate from multiple sources that use non-unique parameter set identifiers. In this case, an offer may overwrite an existing parameter set if no other mechanism that enables uniqueness of the parameter sets in the out-of-band channel exists.

资料性说明:在一些拓扑中,如Topo Video switch MCU[29],整套参数集的来源可能来自多个可能使用非唯一参数集标识符的来源。在这种情况下,如果不存在在带外信道中实现参数集唯一性的其他机制,则报价可能会覆盖现有参数集。

- In a multiparty session, one participant MUST associate parameter sets coming from different sources with the source identification whenever possible, e.g., by conveying out-of-band transported parameter sets, as different sources typically use independent parameter set identifier value spaces.

- 在多方会话中,一个参与者必须尽可能将来自不同来源的参数集与来源标识相关联,例如,通过传送带外传输的参数集,因为不同来源通常使用独立的参数集标识符值空间。

- Adding or modifying parameter sets by using both principles B and C in the same RTP session may lead to inconsistencies of the parameter sets because of the lack of synchronization between the control and the RTP channel. Therefore, principles B and C MUST NOT both be used in the same session unless sufficient synchronization can be provided.

- 在同一RTP会话中使用原则B和原则C添加或修改参数集可能会导致参数集不一致,因为控件和RTP通道之间缺乏同步。因此,除非能够提供足够的同步,否则原则B和C不得同时用于同一会话。

In some scenarios (e.g., when only the subset of this payload format specification corresponding to H.241 is used) or topologies, it is not possible to employ out-of-band parameter set transmission. In this case, parameter sets have to be transmitted in-band. Here, the synchronization with the non-parameter-set-data in the bitstream is implicit, but the possibility of a loss has to be taken into account.

在某些情况下(例如,当仅使用与H.241对应的有效载荷格式规范的子集时)或拓扑中,不可能采用带外参数集传输。在这种情况下,参数集必须在频带内传输。这里,与比特流中的非参数集数据的同步是隐式的,但是必须考虑丢失的可能性。

The loss probability should be reduced using the mechanisms discussed above. In case a loss of a parameter set is detected, recovery may be achieved using a Decoder Refresh Point procedure, for example, using RTCP feedback Full Intra Request (FIR) [30]. Two example Decoder Refresh Point procedures are provided in the informative Section 8.5.

应使用上述机制降低损失概率。在检测到参数集丢失的情况下,可使用解码器刷新点程序实现恢复,例如,使用RTCP反馈全帧内请求(FIR)[30]。第8.5节提供了两个示例解码器刷新点程序。

- When parameter sets are initially provided using principle A and then later added or updated in-band (principle C), there is a risk associated with updating the parameter sets delivered out-of-band. If receivers miss some in-band updates (for example, because of a loss or a late tune-in), those receivers attempt to decode the bitstream using outdated parameters. It is therefore RECOMMENDED that parameter set IDs be partitioned between the out-of-band and in-band parameter sets.

- 当最初使用原则A提供参数集,然后在带内添加或更新(原则C)时,更新带外交付的参数集存在风险。如果接收机错过了一些带内更新(例如,由于丢失或延迟调谐),这些接收机将尝试使用过时的参数解码比特流。因此,建议在带外和带内参数集之间划分参数集ID。
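
One way to partition parameter set IDs between out-of-band and in-band use is to reserve a fixed low range for the sets delivered out-of-band. The following Python sketch is illustrative only: PartitionedIdAllocator and the split point are hypothetical, and it relies on the H.264 identifier ranges of 0 to 31 for sequence parameter sets and 0 to 255 for picture parameter sets.

```python
MAX_SPS_ID = 31   # seq_parameter_set_id ranges from 0 to 31
MAX_PPS_ID = 255  # pic_parameter_set_id ranges from 0 to 255


class PartitionedIdAllocator:
    """Hands out parameter set IDs from two disjoint ranges.

    IDs 0 .. out_of_band_count-1 are reserved for sets sent out-of-band
    (principle A); the remaining IDs are used for in-band sets
    (principle C), so an in-band update never overwrites an
    out-of-band parameter set.
    """

    def __init__(self, max_id: int, out_of_band_count: int):
        if not 0 < out_of_band_count <= max_id:
            raise ValueError("both partitions must be non-empty")
        self._out_of_band = iter(range(0, out_of_band_count))
        self._in_band = iter(range(out_of_band_count, max_id + 1))

    def next_out_of_band(self) -> int:
        return self._take(self._out_of_band)

    def next_in_band(self) -> int:
        return self._take(self._in_band)

    @staticmethod
    def _take(ids) -> int:
        try:
            return next(ids)
        except StopIteration:
            raise RuntimeError("no free parameter set ID in this partition")


sps_ids = PartitionedIdAllocator(MAX_SPS_ID, out_of_band_count=4)
print(sps_ids.next_out_of_band(), sps_ids.next_in_band())
```

A receiver that misses an in-band update then still holds valid out-of-band parameter sets, because the two groups never share an identifier.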

8.5. Decoder Refresh Point Procedure Using In-Band Transport of Parameter Sets (Informative)

8.5. 使用参数集带内传输的解码器刷新点程序(信息性)

When a sender with a video encoder according to [1] receives a request for a decoder refresh point, the encoder shall enter the fast update mode by using one of the procedures specified in Sections 8.5.1 or 8.5.2. The procedure in Section 8.5.1 is the preferred response in a lossless transmission environment. Both procedures satisfy the requirement to enter the fast update mode for H.264 video encoding.

当根据[1]配备视频编码器的发送方收到解码器刷新点的请求时,编码器应使用第8.5.1节或第8.5.2节规定的程序之一进入快速更新模式。第8.5.1节中的程序是无损传输环境中的首选响应。这两个过程都满足H.264视频编码进入快速更新模式的要求。

8.5.1. IDR Procedure to Respond to a Request for a Decoder Refresh Point

8.5.1. 响应解码器刷新点请求的IDR过程

This section gives one possible way to respond to a request for a decoder refresh point.

本节给出了一种可能的方法来响应解码器刷新点的请求。

The encoder shall, in the order presented here:

编码器应按照此处给出的顺序:

1) Immediately prepare to send an IDR picture.

1) 立即准备发送IDR图片。

2) Send a sequence parameter set to be used by the IDR picture to be sent. The encoder may optionally also send other sequence parameter sets.

2) 发送要发送的IDR图片使用的序列参数集。编码器还可以可选地发送其他序列参数集。

3) Send a picture parameter set to be used by the IDR picture to be sent. The encoder may optionally also send other picture parameter sets.

3) 发送要发送的IDR图片使用的图片参数集。编码器还可以可选地发送其他图片参数集。

4) Send the IDR picture.

4) 发送IDR图片。

5) From this point forward in time, send any other sequence or picture parameter sets that have not yet been sent in this procedure, prior to their reference by any NAL unit, regardless of whether such parameter sets were previously sent prior to receiving the request for a decoder refresh point. As needed, such parameter sets may be sent in a batch, one at a time, or in any combination of these two methods. Parameter sets may be re-sent at any time for redundancy. Caution should be taken when parameter set updates are present, as described above in Section 8.4.

5) 从该时间点向前,在任何NAL单元引用之前,发送在此过程中尚未发送的任何其他序列或图片参数集,而不管这些参数集是否在接收解码器刷新点请求之前发送。根据需要,这些参数集可以成批发送,一次发送一个,或者以这两种方法的任意组合发送。参数集可在任何时候重新发送以实现冗余。如上文第8.4节所述,当存在参数集更新时,应小心。
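
The sending order of steps 2 to 4 can be sketched as follows. This Python fragment is an illustration under assumptions: idr_refresh_order is a hypothetical helper, NAL units are passed as byte strings whose first byte is the NAL unit header, and the encoder itself is outside the scope of the sketch.

```python
NAL_IDR_SLICE, NAL_SPS, NAL_PPS = 5, 7, 8


def nal_type(nal: bytes) -> int:
    """nal_unit_type is the low five bits of the NAL unit header."""
    return nal[0] & 0x1F


def idr_refresh_order(sps, pps, idr_slices, extra_sps=(), extra_pps=()):
    """Return NAL units in the order: SPS(s), PPS(s), IDR picture slices."""
    groups = [
        (NAL_SPS, [sps, *extra_sps]),        # step 2
        (NAL_PPS, [pps, *extra_pps]),        # step 3
        (NAL_IDR_SLICE, list(idr_slices)),   # step 4
    ]
    order = []
    for expected, nals in groups:
        for nal in nals:
            if nal_type(nal) != expected:
                raise ValueError(f"expected NAL unit type {expected}")
            order.append(nal)
    return order


units = idr_refresh_order(b"\x67\x42\xa0\x1e", b"\x68\xce",
                          [b"\x65\x88", b"\x65\x99"])
print([nal_type(u) for u in units])
```

The check on nal_unit_type is a design choice of this sketch; it catches a caller passing a non-IDR slice where an IDR slice is required.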

8.5.2. Gradual Recovery Procedure to Respond to a Request for a Decoder Refresh Point

8.5.2. 响应解码器刷新点请求的渐进恢复过程

This section gives another possible way to respond to a request for a decoder refresh point.

本节给出了响应解码器刷新点请求的另一种可能方式。

The encoder shall, in the order presented here:

编码器应按照此处给出的顺序:

1) Send a recovery point SEI message (see Sections D.1.7 and D.2.7 of [1]).

1) 发送恢复点SEI消息(见[1]第D.1.7节和第D.2.7节)。

2) Repeat any sequence and picture parameter sets that were sent before the recovery point SEI message, prior to their reference by a NAL unit.

2) 在NAL单元引用之前,重复在恢复点SEI消息之前发送的任何序列和图片参数集。

The encoder shall ensure that the decoder has access to all reference pictures for inter prediction of pictures at or after the recovery point, which is indicated by the recovery point SEI message, in output order, assuming that the transmission from now on is error-free.

编码器应确保解码器能够访问所有参考图片,以便按照输出顺序在恢复点处或之后对图片进行帧间预测,这由恢复点SEI消息指示,假设从现在开始的传输是无错误的。

The value of the recovery_frame_cnt syntax element in the recovery point SEI message should be small enough to ensure a fast recovery.

恢复点SEI消息中recovery_frame_cnt语法元素的值应足够小,以确保快速恢复。

As needed, such parameter sets may be re-sent in a batch, one at a time, or in any combination of these two methods. Parameter sets may be re-sent at any time for redundancy. Caution should be taken when parameter set updates are present, as described above in Section 8.4.

根据需要,这些参数集可以批量重新发送,一次发送一个,或者以这两种方法的任意组合重新发送。参数集可在任何时候重新发送以实现冗余。如上文第8.4节所述,当存在参数集更新时,应小心。

9. Security Considerations
9. 安全考虑

RTP packets using the payload format defined in this specification are subject to the security considerations discussed in the RTP specification [5] and in any appropriate RTP profile (for example, [16]). This implies that confidentiality of the media streams is achieved by encryption, for example, through the application of SRTP [26]. Because the data compression used with this payload format is applied end-to-end, any encryption needs to be performed after compression. A potential denial-of-service threat exists for data encodings using compression techniques that have non-uniform receiver-end computational load. The attacker can inject pathological datagrams into the stream that are complex to decode and that cause the receiver to be overloaded. H.264 is particularly vulnerable to such attacks, as it is extremely simple to generate datagrams containing NAL units that affect the decoding process of many future NAL units. Therefore, the usage of data origin authentication and data integrity protection of at least the RTP packet is RECOMMENDED, for example, with SRTP [26].

使用本规范中定义的有效负载格式的RTP数据包受RTP规范[5]和任何适当RTP配置文件(例如[16])中讨论的安全注意事项的约束。这意味着媒体流的机密性是通过加密实现的,例如,通过应用SRTP[26]。由于与此有效负载格式一起使用的数据压缩是端到端应用的,因此任何加密都需要在压缩后执行。使用压缩技术的数据编码存在潜在的拒绝服务威胁,这种压缩技术具有非均匀的接收端计算负载。攻击者可以向流中注入难以解码的病理数据报,从而导致接收器过载。H.264特别容易受到此类攻击,因为生成包含影响许多未来NAL单元解码过程的NAL单元的数据报非常简单。因此,建议至少使用RTP分组的数据源认证和数据完整性保护,例如,使用SRTP[26]。

Note that the appropriate mechanism to ensure confidentiality and integrity of RTP packets and their payloads is very dependent on the application and on the transport and signaling protocols employed. Thus, although SRTP is given as an example above, other possible choices exist.

请注意,确保RTP数据包及其有效负载的机密性和完整性的适当机制非常依赖于应用程序以及所采用的传输和信令协议。因此,尽管上面给出了SRTP作为示例,但存在其他可能的选择。

Decoders MUST exercise caution with respect to the handling of user data SEI messages, particularly if they contain active elements, and MUST restrict their domain of applicability to the presentation containing the stream.

解码器必须谨慎处理用户数据SEI消息,特别是如果它们包含活动元素,并且必须将其适用范围限制为包含流的表示。

End-to-end security with either authentication, integrity, or confidentiality protection will prevent a MANE from performing media-aware operations other than discarding complete packets. In the case of confidentiality protection, it will even be prevented from discarding packets in a media-aware way. To be allowed to perform its operations, a MANE is required to be a trusted entity that is included in the security context establishment.

具有身份验证、完整性或机密性保护的端到端安全性将防止MANE执行除丢弃完整数据包以外的媒体感知操作。在保密保护的情况下,甚至可以防止它以媒体感知的方式丢弃数据包。为了允许执行其操作,MANE必须是包含在安全上下文建立中的受信任实体。

10. Congestion Control
10. 拥塞控制

Congestion control for RTP SHALL be used in accordance with RFC 3550 [5] and with any applicable RTP profile, e.g., RFC 3551 [16]. If best-effort service is being used, an additional requirement is that users of this payload format MUST monitor packet loss to ensure that the packet loss rate is within acceptable parameters. Packet loss is considered acceptable if a TCP flow across the same network path, and experiencing the same network conditions, would achieve an average throughput, measured on a reasonable timescale, that is not less than the RTP flow is achieving. This condition can be satisfied by implementing congestion control mechanisms to adapt the transmission rate (or the number of layers subscribed for a layered multicast session) or by arranging for a receiver to leave the session if the loss rate is unacceptably high.

RTP的拥塞控制应根据RFC 3550[5]和任何适用的RTP配置文件(如RFC 3551[16])使用。如果正在使用尽力而为服务,另一个要求是,此有效负载格式的用户必须监控数据包丢失,以确保数据包丢失率在可接受的参数范围内。如果通过相同网络路径并经历相同网络条件的TCP流将实现在合理时间尺度上测量的平均吞吐量,即不小于RTP流所实现的平均吞吐量,则认为丢包是可接受的。通过实现拥塞控制机制以适应传输速率(或分层多播会话订阅的层数),或者如果丢失率高得令人无法接受,则通过安排接收机离开会话,可以满足该条件。
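
The acceptability test described above can be approximated in code. The sketch below is not part of this specification: it assumes the simplified steady-state TCP throughput model of Mathis et al., throughput = (MSS / RTT) * sqrt(3 / (2p)), as a stand-in for "a TCP flow across the same network path", and the function names are hypothetical.

```python
import math


def tcp_throughput_estimate(mss_bytes: int, rtt_s: float,
                            loss_rate: float) -> float:
    """Approximate TCP throughput in bits per second (simplified model)."""
    if rtt_s <= 0 or not 0 < loss_rate <= 1:
        raise ValueError("rtt_s must be > 0 and loss_rate in (0, 1]")
    return (mss_bytes * 8 / rtt_s) * math.sqrt(1.5 / loss_rate)


def loss_is_acceptable(rtp_rate_bps: float, mss_bytes: int,
                       rtt_s: float, loss_rate: float) -> bool:
    """True if a comparable TCP flow would do at least as well as the RTP flow."""
    if loss_rate == 0:
        return True  # no loss observed, nothing to adapt to
    return tcp_throughput_estimate(mss_bytes, rtt_s, loss_rate) >= rtp_rate_bps


# 1% loss, 100 ms round-trip time, 1460-byte segments.
print(loss_is_acceptable(1_000_000, 1460, 0.1, 0.01))
print(loss_is_acceptable(2_000_000, 1460, 0.1, 0.01))
```

When the check fails, the sender would lower its bitrate or the receiver would leave the session, as described above. A deployed system would measure loss and round-trip time over a reasonable timescale, for example from RTCP reports.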

The bitrate adaptation necessary for obeying the congestion control principle is easily achievable when real-time encoding is used. However, when pre-encoded content is being transmitted, bandwidth adaptation requires the availability of more than one coded representation of the same content, at different bitrates, or the existence of non-reference pictures or sub-sequences [22] in the bitstream. The switching between the different representations can normally be performed in the same RTP session, e.g., by employing a concept known as SI/SP slices of the Extended profile or by switching streams at IDR picture boundaries. Only when non-downgradable parameters (such as the profile part of the profile/level ID) are required to be changed does it become necessary to terminate and restart the media stream. This may be accomplished by using a different RTP payload type.

当使用实时编码时,遵守拥塞控制原则所需的比特率自适应很容易实现。然而,当传输预编码内容时,带宽自适应要求以不同比特率提供相同内容的多个编码表示,或者比特流中存在非参考图片或子序列[22]。不同表示之间的切换通常可以在同一RTP会话中执行,例如,通过采用称为扩展简档的SI/SP片的概念,或者通过在IDR图片边界处切换流。只有当需要更改不可降级的参数(如配置文件/级别ID的配置文件部分)时,才需要终止并重新启动媒体流。这可以通过使用不同的RTP有效负载类型来实现。

MANEs MAY follow the suggestions outlined in Section 7.3 and remove certain unusable packets from the packet stream when that stream was damaged due to previous packet losses. This can help reduce the network load in certain special cases.

MANE可遵循第7.3节中概述的建议,并在数据包流因先前的数据包丢失而损坏时,从数据包流中删除某些不可用的数据包。在某些特殊情况下,这有助于减少网络负载。

11. IANA Considerations
11. IANA考虑

The H264 media subtype name specified by RFC 3984 has been updated as defined in Section 8.1 of this memo.

RFC 3984指定的H264媒体子类型名称已按照本备忘录第8.1节的定义进行了更新。

12. Informative Appendix: Application Examples
12. 资料性附录:应用示例

This payload specification is very flexible in its use, in order to cover the extremely wide application space anticipated for H.264. However, this great flexibility also makes it difficult for an implementer to decide on a reasonable packetization scheme. Some information on how to apply this specification to real-world scenarios is likely to appear in the form of academic publications and a test model software and description in the near future. However, some preliminary usage scenarios are described here as well.

该有效负载规范在使用上非常灵活,以覆盖H.264预期的极其广泛的应用空间。然而,这种巨大的灵活性也使得实现者很难决定一个合理的打包方案。在不久的将来,有关如何将本规范应用于实际场景的一些信息可能会以学术出版物、测试模型软件和描述的形式出现。然而,这里也描述了一些初步的使用场景。

12.1. Video Telephony According to Annex A of ITU-T Recommendation H.241

12.1. 符合ITU-T建议H.241附件A的视频电话

H.323-based video telephony systems that use H.264 as an optional video compression scheme are required to support Annex A of H.241 [3] as a packetization scheme. The packetization mechanism defined in this Annex is technically identical with a small subset of this specification.

使用H.264作为可选视频压缩方案的基于H.323的视频电话系统需要支持H.241[3]的附录A作为分组方案。本附件中定义的打包机制在技术上与本规范的一小部分相同。

When a system operates according to Annex A of H.241, parameter set NAL units are sent in-band. Only single NAL unit packets are used. Many such systems do not send IDR pictures regularly, but only when required by user interaction or by control protocol means, e.g., when switching between video channels in a Multipoint Control Unit or for error recovery requested by feedback.

当系统按照H.241附录A运行时,参数集NAL单元在带内发送。仅使用单个NAL单元数据包。许多这样的系统并不定期发送IDR图片,而只在用户交互或控制协议手段要求时发送,例如,在多点控制单元中的视频通道之间切换时,或用于反馈请求的错误恢复时。

12.2. Video Telephony, No Slice Data Partitioning, No NAL Unit Aggregation

12.2. 视频电话,无切片数据分区,无NAL单元聚合

The RTP part of this scheme is implemented and tested (though not the control-protocol part; see below).

该方案的RTP部分已经实现和测试(但不是控制协议部分;见下文)。

In most real-world video telephony applications, picture parameters such as picture size or optional modes never change during the lifetime of a connection. Therefore, all necessary parameter sets (usually only one) are sent as a side effect of the capability exchange/announcement process, e.g., according to the SDP syntax specified in Section 8.2 of this document. As all necessary parameter set information is established before the RTP session starts, there is no need for sending any parameter set NAL units. Slice data partitioning is not used either. Thus, the RTP packet stream basically consists of NAL units that carry single coded slices.

在大多数现实世界的视频电话应用程序中,图片参数(如图片大小或可选模式)在连接的生命周期内不会改变。因此,所有必要的参数集(通常只有一个)作为能力交换/公告过程的副作用发送,例如,根据本文件第8.2节规定的SDP语法。由于所有必要的参数集信息都是在RTP会话开始之前建立的,因此不需要发送任何参数集NAL单元。也不使用切片数据分区。因此,RTP分组流基本上由携带单个编码片段的NAL单元组成。

The encoder chooses the size of coded slice NAL units so that they offer the best performance. Often, this is done by adapting the coded slice size to the MTU size of the IP network. For small picture sizes, this may result in a one-picture-per-one-packet strategy. Intra refresh algorithms clean up the loss of packets and the resulting drift-related artifacts.

编码器选择编码片NAL单元的大小,以便它们提供最佳性能。通常,这是通过使编码片大小适应IP网络的MTU大小来实现的。对于较小的图片大小,这可能导致每包一张图片的策略。帧内刷新算法可清除数据包丢失和由此产生的漂移相关伪影。

12.3. Video Telephony, Interleaved Packetization Using NAL Unit Aggregation

12.3. 视频电话,使用NAL单元聚合的交错分组

This scheme allows better error concealment and is used in H.263-based designs using RFC 4629 packetization [11]. It has been implemented, and good results were reported [13].

该方案允许更好的错误隐藏,并用于基于H.263的设计中,使用RFC 4629分组[11]。它已经实施,并报告了良好的结果[13]。

The VCL encoder codes the source picture so that all macroblocks (MBs) of one MB line are assigned to one slice. All slices with even MB row addresses are combined into one STAP, and all slices with odd MB row addresses are combined into another. Those STAPs are transmitted as RTP packets. The establishment of the parameter sets is performed as discussed above.

VCL编码器对源图片进行编码,以便将一个MB行的所有宏块(MB)分配给一个切片。具有偶数MB行地址的所有片合并到一个STAP中,具有奇数MB行地址的所有片合并到另一个STAP中。这些STAP作为RTP数据包传输。参数集的建立如上所述。
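
The even/odd grouping and the header overhead it saves can be sketched as follows. This Python fragment is an illustration under assumptions: the helper names are hypothetical, the header size assumes IPv4 (20 bytes), UDP (8 bytes), and the fixed RTP header (12 bytes), and the aggregation packet is sized like a STAP-B (1-byte header, 2-byte DON, and a 2-byte size field per NAL unit).

```python
IP_UDP_RTP_HEADER = 20 + 8 + 12  # IPv4 + UDP + fixed RTP header, in bytes


def group_by_row_parity(slices):
    """Split (mb_row, nal_unit) pairs into even-row and odd-row NAL lists."""
    even = [nal for row, nal in slices if row % 2 == 0]
    odd = [nal for row, nal in slices if row % 2 == 1]
    return even, odd


def header_overhead(packet_count: int) -> int:
    """Total IP/UDP/RTP header bytes for the given number of packets."""
    return packet_count * IP_UDP_RTP_HEADER


def stap_b_payload_size(nals) -> int:
    """STAP-B header (1) + DON (2) + per NAL unit: size field (2) + data."""
    return 1 + 2 + sum(2 + len(nal) for nal in nals)


# A CIF picture (352x288) has 18 macroblock rows, hence 18 slices here.
cif_slices = [(row, bytes(50)) for row in range(18)]
even, odd = group_by_row_parity(cif_slices)
print(header_overhead(len(cif_slices)), header_overhead(2))
print(stap_b_payload_size(even), stap_b_payload_size(odd))
```

With one slice per packet, the 18 packets carry 720 bytes of IP/UDP/RTP headers per picture; with two aggregation packets, this drops to 80 bytes plus a few bytes of aggregation overhead.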

Note that the use of STAPs is essential here, as the high number of individual slices (18 for a Common Intermediate Format (CIF) picture) would lead to unacceptably high IP/UDP/RTP header overhead (unless the source coding tool FMO is used, which is not assumed in this scenario). Furthermore, some wireless video transmission systems, such as H.324M and the IP-based video telephony specified in 3GPP, are likely to use relatively small transport packet size. For example, a typical MTU size of H.223 AL3 SDU is around 100 bytes [17]. Coding individual slices according to this packetization scheme provides further advantage in communication between wired and wireless networks, as individual slices are likely to be smaller than the preferred maximum packet size of wireless systems. Consequently, a gateway can convert the STAPs used in a wired network into several RTP packets with only one NAL unit, which are preferred in a wireless network, and vice versa.

请注意,在这里使用STAP是至关重要的,因为大量的单个片段(通用中间格式(CIF)图片为18个)将导致不可接受的高IP/UDP/RTP报头开销(除非使用源编码工具FMO,本场景中不假定使用该工具)。此外,一些无线视频传输系统,例如H.324M和3GPP中指定的基于IP的视频电话,很可能使用相对较小的传输分组大小。例如,H.223 AL3 SDU的典型MTU大小约为100字节[17]。根据该分组方案编码各个片段在有线和无线网络之间的通信中提供了进一步的优势,因为各个片段可能小于无线系统的优选最大分组大小。因此,网关可以将有线网络中使用的STAP转换为仅具有一个NAL单元的多个RTP分组(这在无线网络中是优选的),反之亦然。

12.4. Video Telephony with Data Partitioning
12.4. 具有数据分区的视频电话

This scheme has been implemented and has been shown to offer good performance, especially at higher packet loss rates [13].

该方案已经实施,并被证明具有良好的性能,特别是在较高的丢包率下[13]。

Data partitioning is known to be useful only when some form of unequal error protection is available. Normally, in single-session RTP environments, even error characteristics are assumed; that is, the packet loss probability of all packets of the session is the same statistically. However, there are means to reduce the packet loss probability of individual packets in an RTP session. A FEC packet according to RFC 5109 [18], for example, specifies which media packets are associated with the FEC packet.

In all cases, the incurred overhead is substantial but is in the same order of magnitude as the number of bits that have otherwise been spent for intra information. However, this mechanism does not add any delay to the system.

Again, the complete parameter set establishment is performed through control protocol means.

12.5. Video Telephony or Streaming with FUs and Forward Error Correction

This scheme has been implemented and has been shown to provide good performance, especially at higher packet loss rates [19].

The most efficient means to combat packet losses for scenarios where retransmissions are not applicable is forward error correction (FEC). Although application layer, end-to-end use of FEC is often less efficient than a FEC-based protection of individual links (especially when links of different characteristics are in the transmission path), application layer, end-to-end FEC is unavoidable in some scenarios. RFC 5109 [18] provides means to use generic, application layer, end-to-end FEC in packet loss environments. A binary forward error correcting code is generated by applying the XOR operation to the bits at the same bit position in different packets. The binary code can be specified by the parameters (n,k), in which k is the number of information packets used in the connection and n is the total number of packets generated for k information packets; that is, n-k parity packets are generated for k information packets.

When a code is used with parameters (n,k) within the RFC 5109 framework, the following properties are well known:

a) If applied over one RTP packet, RFC 5109 provides only packet repetition.

b) RFC 5109 is most bitrate efficient if XOR-connected packets have equal length.

c) At the same packet loss probability p and for a fixed k, the greater the value of n, the smaller the residual error probability becomes. For example, for a packet loss probability of 10%, k=1, and n=2, the residual error probability is about 1%, whereas for n=3, the residual error probability is about 0.1%.

d) At the same packet loss probability p and for a fixed code rate k/n, the greater the value of n, the smaller the residual error probability becomes. For example, at a packet loss probability of p=10%, k=1, and n=2, the residual error rate is about 1%, whereas for an extended Golay code with k=12 and n=24, the residual error rate is about 0.01%.
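The k=1 figures quoted in c) can be checked with a few lines of arithmetic. The following Python sketch assumes statistically independent packet losses and an (n, k=1) code, for which the information packet is unrecoverable only when all n transmitted packets are lost. The function name is illustrative and is not part of RFC 5109.

```python
def residual_loss_repetition(p: float, n: int) -> float:
    """Residual loss probability of an (n, k=1) code.

    Assumption: packet losses are independent with probability p, so the
    single information packet is lost only if all n packets are lost.
    """
    if not 0.0 <= p <= 1.0 or n < 1:
        raise ValueError("need 0 <= p <= 1 and n >= 1")
    return p ** n


if __name__ == "__main__":
    # Compare n=2 and n=3 at a packet loss probability of 10%.
    for n in (2, 3):
        print(f"n={n}: residual loss = {residual_loss_repetition(0.10, n):.4%}")
```

The sketch deliberately covers only the k=1 case; the residual error of codes with k>1, such as the (24,12) example in d), depends on the decoding capability of the specific code and is not modeled here.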

For applying RFC 5109 in combination with H.264 baseline-coded video without using FUs, several options might be considered:

1) The video encoder produces NAL units for which each video frame is coded in a single slice. Applying FEC, one could use a simple code, e.g., (n=2, k=1). That is, each NAL unit would basically just be repeated. The disadvantage is obviously the bad code performance according to d), above, and the low flexibility, as only (n, k=1) codes can be used.

2) The video encoder produces NAL units for which each video frame is encoded in one or more consecutive slices. Applying FEC, one could use a better code, e.g., (n=24, k=12), over a sequence of NAL units. Depending on the number of RTP packets per frame, a loss may introduce a significant delay, which is reduced when more RTP packets are used per frame. Packets of completely different lengths might also be connected, which decreases bitrate efficiency according to b), above. However, with some care and for slices of 1 kb or larger, similar length (100-200 bytes difference) may be produced, which will not lower the bit efficiency catastrophically.

3) The video encoder produces NAL units, for which a certain frame contains k slices of possibly almost equal length. Then, applying FEC, a better code, e.g., (n=24, k=12), can be used over the sequence of NAL units for each frame. The delay compared to that of 2), above, may be reduced, but several disadvantages are obvious. First, the coding efficiency of the encoded video is lowered significantly, as slice-structured coding reduces intra-frame prediction and additional slice overhead is necessary. Second, pre-encoded content or, when operating over a gateway, the video is usually not appropriately coded with k slices such that FEC can be applied. Finally, the encoding of video producing k slices of equal length is not straightforward and might require more than one encoding pass.

Many of the mentioned disadvantages can be avoided by applying FUs in combination with FEC. Each NAL unit can be split into any number of FUs of basically equal length; therefore, FEC, with a reasonable k and n, can be applied, even if the encoder made no effort to produce slices of equal length. For example, a coded slice NAL unit containing an entire frame can be split to k FUs, and a parity check code (n=k+1, k) can be applied. However, this has the disadvantage that unless all created fragments can be recovered, the whole slice will be lost. Thus, a larger section is lost than would be if the frame had been split into several slices.
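The idea of splitting one NAL unit into k fragments of nearly equal length and protecting them with a single parity packet (n=k+1, k) can be sketched as follows. This is a simplified illustration in Python, not the RFC 5109 wire format: it omits the FU indicator and FU header octets as well as the FEC header, and all function names are hypothetical.

```python
def split_into_fragments(payload: bytes, k: int) -> list:
    """Split a NAL unit payload into k fragments whose lengths differ by at most 1."""
    if k < 1 or k > len(payload):
        raise ValueError("need 1 <= k <= len(payload)")
    base, extra = divmod(len(payload), k)
    fragments, pos = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        fragments.append(payload[pos:pos + size])
        pos += size
    return fragments


def xor_parity(fragments: list) -> bytes:
    """XOR all fragments together; shorter fragments act as if zero-padded."""
    parity = bytearray(max(len(f) for f in fragments))
    for fragment in fragments:
        for i, byte in enumerate(fragment):
            parity[i] ^= byte
    return bytes(parity)


def recover_missing(received: list, parity: bytes, missing_length: int) -> bytes:
    """Rebuild the single missing fragment (marked None) from the parity packet."""
    if sum(1 for f in received if f is None) != 1:
        raise ValueError("a single parity packet can repair exactly one loss")
    rebuilt = bytearray(parity)
    for fragment in received:
        if fragment is not None:
            for i, byte in enumerate(fragment):
                rebuilt[i] ^= byte
    return bytes(rebuilt[:missing_length])
```

Because the fragments differ in length by at most one octet, the parity packet is no longer than the longest fragment, which is the condition named in b) for bitrate-efficient use of the XOR code. As the text notes, losing two of the n packets makes the whole slice unrecoverable in this scheme.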

The presented technique makes it possible to achieve good transmission error tolerance, even if no additional source coding layer redundancy (such as periodic intra frames) is present. Consequently, the same coded video sequence can be used to achieve the maximum compression efficiency and quality over error-free transmission and for transmission over error-prone networks. Furthermore, the technique allows the application of FEC to pre-encoded sequences without adding delay. In this case, pre-encoded sequences that are not encoded for error-prone networks can still be transmitted almost reliably without adding extensive delays. In addition, FUs of equal length result in a bitrate efficient use of RFC 5109.

If the error probability depends on the length of the transmitted packet (e.g., in case of mobile transmission [15]), the benefits of applying FUs with FEC are even more obvious. Basically, the flexibility of the size of FUs allows appropriate FEC to be applied for each NAL unit and unequal error protection of NAL units.

When FUs and FEC are used, the incurred overhead is substantial but is in the same order of magnitude as the number of bits that have to be spent for intra-coded macroblocks if no FEC is applied. In [19], it was shown that the FEC-based approach enhances overall quality at the same error rate and the same overall bitrate, including the overhead.

12.6. Low Bitrate Streaming

This scheme has been implemented with H.263 and non-standard RTP packetization and has given good results [20]. There is no technical reason why similarly good results could not be achieved with H.264.

In today's Internet streaming, some of the offered bitrates are relatively low in order to allow terminals with dial-up modems to access the content. In wired IP networks, relatively large packets, say 500 - 1500 bytes, are preferred to smaller and more frequently occurring packets in order to reduce network congestion. Moreover, use of large packets decreases the amount of RTP/UDP/IP header overhead. For low bitrate video, the use of large packets means that sometimes up to a few pictures should be encapsulated in one packet.

However, the loss of a packet including many coded pictures would have drastic consequences for visual quality, as there is practically no way to conceal the loss of an entire picture other than repeating the previous one. One way to construct relatively large packets and maintain possibilities for successful loss concealment is to construct MTAPs that contain interleaved slices from several pictures. An MTAP should not contain spatially adjacent slices from the same picture or spatially overlapping slices from any picture. If a packet is lost, it is likely that a lost slice is surrounded by spatially adjacent slices of the same picture and spatially corresponding slices of the temporally previous and succeeding pictures. Consequently, concealment of the lost slice is likely to be relatively successful.

12.7. Robust Packet Scheduling in Video Streaming

Robust packet scheduling has been implemented with MPEG-4 Part 2 and simulated in a wireless streaming environment [21]. There is no technical reason why similar or better results could not be achieved with H.264.

Streaming clients typically have a receiver buffer that is capable of storing a relatively large amount of data. Initially, when a streaming session is established, a client does not start playing the stream back immediately. Rather, it typically buffers the incoming data for a few seconds. This buffering helps maintain continuous playback, as, in case of occasional increased transmission delays or network throughput drops, the client can decode and play buffered data. Otherwise, without initial buffering, the client has to freeze the display, stop decoding, and wait for incoming data. The buffering is also necessary for either automatic or selective retransmission in any protocol level. If any part of a picture is lost, a retransmission mechanism may be used to resend the lost data. If the retransmitted data is received before its scheduled decoding or playback time, the loss is recovered perfectly.

Coded pictures can be ranked according to their importance in the subjective quality of the decoded sequence. For example, non-reference pictures, such as conventional B pictures, are subjectively least important, as their absence does not affect decoding of any other pictures. In addition to non-reference pictures, the ITU-T H.264 | ISO/IEC 14496-10 standard includes a temporal scalability method called sub-sequences [22]. Subjective ranking can also be made on coded slice data partition or slice group basis. Coded slices and coded slice data partitions that are subjectively the most important can be sent earlier than their decoding order indicates, whereas coded slices and coded slice data partitions that are subjectively the least important can be sent later than their natural coding order indicates. Consequently, any retransmitted parts of the most important slices and coded slice data partitions are more likely to be received before their scheduled decoding or playback time compared to the least important slices and slice data partitions.

13. Informative Appendix: Rationale for Decoding Order Number
13.1. Introduction

The Decoding Order Number (DON) concept was introduced mainly to enable efficient multi-picture slice interleaving (see Section 12.6) and robust packet scheduling (see Section 12.7). In both of these applications, NAL units are transmitted out of decoding order. DON indicates the decoding order of NAL units and should be used in the receiver to recover the decoding order. Example use cases for efficient multi-picture slice interleaving and for robust packet scheduling are given in Sections 13.2 and 13.3, respectively. Section 13.4 describes the benefits of the DON concept in error resiliency achieved by redundant coded pictures. Section 13.5 summarizes considered alternatives to DON and justifies why DON was chosen for this RTP payload specification.

13.2. Example of Multi-Picture Slice Interleaving

An example of multi-picture slice interleaving follows. A subset of a coded video sequence is depicted below in output order. R denotes a reference picture, N denotes a non-reference picture, and the number indicates a relative output time.

... R1 N2 R3 N4 R5 ...

The decoding order of these pictures from left to right is as follows:

... R1 R3 N2 R5 N4 ...

The NAL units of pictures R1, R3, N2, R5, and N4 are marked with a DON equal to 1, 2, 3, 4, and 5, respectively.

Each reference picture consists of three slice groups that are scattered as follows (a number denotes the slice group number for each macroblock in a Quarter Common Intermediate Format (QCIF) frame):

0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2

0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2

For the sake of simplicity, we assume that all the macroblocks of a slice group are included in one slice. Three MTAPs are constructed from three consecutive reference pictures so that each MTAP contains three aggregation units, each of which contains all the macroblocks from one slice group. The first MTAP contains slice group 0 of picture R1, slice group 1 of picture R3, and slice group 2 of picture R5. The second MTAP contains slice group 1 of picture R1, slice group 2 of picture R3, and slice group 0 of picture R5. The third MTAP contains slice group 2 of picture R1, slice group 0 of picture R3, and slice group 1 of picture R5. Each non-reference picture is encapsulated into an STAP-B.
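The rotation described above follows a simple rule: the MTAP with index m carries, for the picture with index j, slice group (m + j) mod 3. A minimal Python sketch of that assignment follows; the function name and the tuple representation are illustrative assumptions, and no actual MTAP payload is built.

```python
def interleave_slice_groups(pictures: list, num_groups: int) -> list:
    """Assign slice groups of several pictures to MTAPs in rotated order.

    Returns one list per MTAP; each entry is (picture, slice_group).
    MTAP m carries slice group (m + j) % num_groups of picture j, so no
    MTAP holds two slice groups of the same picture.
    """
    mtaps = []
    for m in range(num_groups):
        mtaps.append(
            [(picture, (m + j) % num_groups) for j, picture in enumerate(pictures)]
        )
    return mtaps


if __name__ == "__main__":
    for index, mtap in enumerate(interleave_slice_groups(["R1", "R3", "R5"], 3)):
        print(f"MTAP {index + 1}: {mtap}")
```

With this rule, every slice group of every picture is sent exactly once, and the loss of one MTAP removes a different slice group from each of the three pictures.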

Consequently, the transmission order of NAL units is the following:

      R1, slice group 0, DON 1, carried in MTAP, RTP SN: N
      R3, slice group 1, DON 2, carried in MTAP, RTP SN: N
      R5, slice group 2, DON 4, carried in MTAP, RTP SN: N
      R1, slice group 1, DON 1, carried in MTAP, RTP SN: N+1
      R3, slice group 2, DON 2, carried in MTAP, RTP SN: N+1
      R5, slice group 0, DON 4, carried in MTAP, RTP SN: N+1
      R1, slice group 2, DON 1, carried in MTAP, RTP SN: N+2
      R3, slice group 0, DON 2, carried in MTAP, RTP SN: N+2
      R5, slice group 1, DON 4, carried in MTAP, RTP SN: N+2
      N2, DON 3, carried in STAP-B, RTP SN: N+3
      N4, DON 5, carried in STAP-B, RTP SN: N+4
        

The receiver is able to organize the NAL units back in decoding order based on the value of DON associated with each NAL unit.
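As a rough illustration of this reordering, the Python sketch below sorts received NAL units by their DON. It is a simplification under stated assumptions: each unit is represented as a hypothetical (picture, slice_group, don) tuple, and the wraparound of the 16-bit DON value is ignored, which a real de-packetizer would have to handle.

```python
def recover_decoding_order(received_units: list) -> list:
    """Return NAL units sorted by DON.

    Each unit is a (picture, slice_group, don) tuple. Python's sort is
    stable, so units sharing a DON keep their reception order.
    Assumption: DON values do not wrap around within the list.
    """
    return sorted(received_units, key=lambda unit: unit[2])


if __name__ == "__main__":
    # Reception order of the example: three MTAPs, then two STAP-Bs.
    received = [
        ("R1", 0, 1), ("R3", 1, 2), ("R5", 2, 4),
        ("R1", 1, 1), ("R3", 2, 2), ("R5", 0, 4),
        ("R1", 2, 1), ("R3", 0, 2), ("R5", 1, 4),
        ("N2", None, 3), ("N4", None, 5),
    ]
    for unit in recover_decoding_order(received):
        print(unit)
```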

If one of the MTAPs is lost, the spatially adjacent and temporally co-located macroblocks are received and can be used to conceal the loss efficiently. If one of the STAPs is lost, the effect of the loss does not propagate temporally.

13.3. Example of Robust Packet Scheduling

An example of robust packet scheduling follows. The communication system used in the example consists of the following components in the order that the video is processed from source to sink:

o camera and capturing
o pre-encoding buffer
o encoder
o encoded picture buffer
o transmitter
o transmission channel
o receiver
o receiver buffer
o decoder
o decoded picture buffer
o display

The video communication system used in this example operates as follows. Note that processing of the video stream happens gradually and at the same time in all components of the system. The source video sequence is shot and captured to a pre-encoding buffer. The pre-encoding buffer can be used to order pictures from sampling order to encoding order or to analyze multiple uncompressed frames for bitrate control purposes, for example. In some cases, the pre-encoding buffer may not exist; instead, the sampled pictures are encoded right away. The encoder encodes pictures from the pre-encoding buffer and stores the output (i.e., coded pictures) to the encoded picture buffer. The transmitter encapsulates the coded pictures from the encoded picture buffer to transmission packets and sends them to a receiver through a transmission channel. The receiver stores the received packets to the receiver buffer. The receiver buffering process typically includes buffering for transmission delay jitter. The receiver buffer can also be used to recover correct decoding order of coded data. The decoder reads coded data from the receiver buffer and produces decoded pictures as output into the decoded picture buffer. The decoded picture buffer is used to recover the output (or display) order of pictures. Finally, pictures are displayed.

In the following example figures, I denotes an IDR picture, R denotes a reference picture, N denotes a non-reference picture, and the number after I, R, or N indicates the sampling time relative to the previous IDR picture in decoding order. Values below the sequence of pictures indicate scaled system clock timestamps. The system clock is initialized arbitrarily in this example, and time runs from left to right. Each I, R, and N picture is mapped into the same timeline compared to the previous processing step, if any, assuming that encoding, transmission, and decoding take no time. Thus, events happening at the same time are located in the same column throughout all example figures.

A subset of a sequence of coded pictures is depicted below in sampling order.

       ...  N58 N59 I00 N01 N02 R03 N04 N05 R06 ... N58 N59 I00 N01 ...
       ... --|---|---|---|---|---|---|---|---|- ... -|---|---|---|- ...
       ...  58  59  60  61  62  63  64  65  66  ... 128 129 130 131 ...
        

Figure 16. Sequence of pictures in sampling order

The sampled pictures are buffered in the pre-encoding buffer to arrange them in encoding order. In this example, we assume that the non-reference pictures are predicted from both the previous and the next reference picture in output order, except for the non-reference pictures immediately preceding an IDR picture, which are predicted only from the previous reference picture in output order. Thus, the pre-encoding buffer has to contain at least two pictures, and the buffering causes a delay of two picture intervals. The output of the pre-encoding buffering process and the encoding (and decoding) order of the pictures are as follows:

       ... N58 N59 I00 R03 N01 N02 R06 N04 N05 ...
       ... -|---|---|---|---|---|---|---|---|- ...
       ... 60  61  62  63  64  65  66  67  68  ...
        

Figure 17. Reordered pictures in the pre-encoding buffer
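The reordering from sampling order (Figure 16) to encoding order (Figure 17) can be sketched in Python as follows. The sketch encodes only the prediction assumptions of this example: non-reference pictures wait for the next reference picture, except that those immediately preceding an IDR picture are emitted before it. Picture labels are strings whose first letter gives the type, and the function name is hypothetical.

```python
def to_encoding_order(sampling_order: list) -> list:
    """Reorder pictures from sampling order to encoding (decoding) order.

    'N' pictures are held back until the next 'R' picture, which is
    emitted first because they are predicted from it. 'N' pictures
    immediately before an 'I' (IDR) picture depend only on the previous
    reference picture, so they are emitted ahead of the IDR picture.
    """
    encoded, pending = [], []
    for picture in sampling_order:
        kind = picture[0]
        if kind == "N":
            pending.append(picture)
        elif kind == "R":
            encoded.append(picture)
            encoded.extend(pending)
            pending = []
        elif kind == "I":
            encoded.extend(pending)
            encoded.append(picture)
            pending = []
        else:
            raise ValueError(f"unknown picture type: {picture}")
    encoded.extend(pending)  # flush pictures still waiting at the end
    return encoded


if __name__ == "__main__":
    sampled = "N58 N59 I00 N01 N02 R03 N04 N05 R06".split()
    print(" ".join(to_encoding_order(sampled)))
```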

The encoder or the transmitter can set the value of DON for each picture to a value of DON for the previous picture in decoding order plus one.
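A minimal Python sketch of this numbering rule follows. It assumes one DON per picture, as in this example, and wraps the counter at 65536 to stay within the 16-bit DON range; the function name and the starting value are illustrative.

```python
DON_MODULUS = 65536  # DON is a 16-bit value, so it wraps after 65535


def assign_don(decoding_order: list, first_don: int = 0) -> list:
    """Pair each picture with a DON that increases by one in decoding order."""
    return [
        (picture, (first_don + index) % DON_MODULUS)
        for index, picture in enumerate(decoding_order)
    ]


if __name__ == "__main__":
    for picture, don in assign_don("N58 N59 I00 R03 N01 N02".split(), first_don=65534):
        print(picture, don)
```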

For the sake of simplicity, let us assume that:

o the frame rate of the sequence is constant,
o each picture consists of only one slice,
o each slice is encapsulated in a single NAL unit packet,
o there is no transmission delay, and
o pictures are transmitted at constant intervals (that is, 1 / (frame rate)).

When pictures are transmitted in decoding order, they are received as follows:

       ... N58 N59 I00 R03 N01 N02 R06 N04 N05 ...
       ... -|---|---|---|---|---|---|---|---|- ...
       ... 60  61  62  63  64  65  66  67  68  ...
        

Figure 18. Received pictures in decoding order

The OPTIONAL sprop-interleaving-depth media type parameter is set to 0, as the transmission (or reception) order is identical to the decoding order.

Initially, the decoder has to buffer for one picture interval in its decoded picture buffer to organize pictures from decoding order to output order, as depicted below:

       ... N58 N59 I00 N01 N02 R03 N04 N05 R06 ...
       ... -|---|---|---|---|---|---|---|---|- ...
       ... 61  62  63  64  65  66  67  68  69  ...
        

Figure 19. Output order

The amount of required initial buffering in the decoded picture buffer can be signaled in the buffering period SEI message or with the num_reorder_frames syntax element of H.264 video usability information. num_reorder_frames indicates the maximum number of frames, complementary field pairs, or non-paired fields that precede any frame, complementary field pair, or non-paired field in the sequence in decoding order and that follow it in output order. For the sake of simplicity, we assume that num_reorder_frames is used to indicate the initial buffer in the decoded picture buffer. In this example, num_reorder_frames is equal to 1.

It can be observed that if the IDR picture I00 is lost during transmission and a retransmission request is issued when the value of the system clock is 62, there is one picture interval of time (until the system clock reaches timestamp 63) to receive the retransmitted IDR picture I00.

Let us then assume that IDR pictures are transmitted two frame intervals earlier than their decoding position; that is, the pictures are transmitted as follows:

       ...  I00 N58 N59 R03 N01 N02 R06 N04 N05 ...
       ... --|---|---|---|---|---|---|---|---|- ...
       ...  62  63  64  65  66  67  68  69  70  ...
        

Figure 20. Interleaving: Early IDR pictures in sending order

The OPTIONAL sprop-interleaving-depth media type parameter is set equal to 1 according to its definition. (The value of sprop-interleaving-depth in this example can be derived as follows: picture I00 is the only picture preceding picture N58 or N59 in transmission order and following it in decoding order. Except for pictures I00, N58, and N59, the transmission order is the same as the decoding order of pictures. As a coded picture is encapsulated into exactly one NAL unit, the value of sprop-interleaving-depth is equal to the maximum number of pictures preceding any picture in transmission order and following the picture in decoding order).
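The derivation in parentheses can be reproduced with a short computation. The Python sketch below assumes one NAL unit per picture and unique picture labels, as in this example, and counts, for each unit, how many units precede it in transmission order but follow it in decoding order. The function name is hypothetical.

```python
def interleaving_depth(transmission_order: list, decoding_order: list) -> int:
    """Maximum number of units sent before a unit but decoded after it.

    Assumption: both lists contain the same unique labels, one NAL unit
    per picture.
    """
    decode_position = {unit: i for i, unit in enumerate(decoding_order)}
    depth = 0
    for sent_index, unit in enumerate(transmission_order):
        sent_earlier_decoded_later = sum(
            1
            for other in transmission_order[:sent_index]
            if decode_position[other] > decode_position[unit]
        )
        depth = max(depth, sent_earlier_decoded_later)
    return depth


if __name__ == "__main__":
    decoding = "N58 N59 I00 R03 N01 N02 R06 N04 N05".split()
    sending = "I00 N58 N59 R03 N01 N02 R06 N04 N05".split()
    print(interleaving_depth(sending, decoding))
```

Applied to a stream sent in decoding order, the same function yields 0, which matches the value used for Figure 18.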

The receiver buffering process contains two pictures at a time according to the value of the sprop-interleaving-depth parameter and orders pictures from the reception order to the correct decoding order based on the value of DON associated with each picture. The output of the receiver buffering process is as follows:

       ... N58 N59 I00 R03 N01 N02 R06 N04 N05 ...
       ... -|---|---|---|---|---|---|---|---|- ...
       ... 63  64  65  66  67  68  69  70  71  ...
        

Figure 21. Interleaving: Receiver buffer

Again, an initial buffering delay of one picture interval is needed to organize pictures from decoding order to output order, as depicted below:

        ... N58 N59 I00 N01 N02 R03 N04 N05 ...
        ... -|---|---|---|---|---|---|---|- ...
        ... 64  65  66  67  68  69  70  71  ...
        

Figure 22. Interleaving: Receiver buffer after reordering

Note that the maximum delay that IDR pictures can undergo during transmission, including possible application, transport, or link layer retransmission, is equal to three picture intervals. Thus, the loss resiliency of IDR pictures is improved in systems supporting retransmission compared to the case in which pictures are transmitted in their decoding order.

13.4. Robust Transmission Scheduling of Redundant Coded Slices

A redundant coded picture is a coded representation of a picture or a part of a picture that is not used in the decoding process if the corresponding primary coded picture is correctly decoded. There should be no noticeable difference between any area of the decoded primary picture and a corresponding area that would result from application of the H.264 decoding process for any redundant picture in the same access unit. A redundant coded slice is a coded slice that is a part of a redundant coded picture.

Redundant coded pictures can be used to provide unequal error protection in error-prone video transmission. If a primary coded representation of a picture is decoded incorrectly, a corresponding redundant coded picture can be decoded. Examples of applications and coding techniques using the redundant coded picture feature include the video redundancy coding [23] and the protection of "key pictures" in multicast streaming [24].

One property of many error-prone video communications systems is that transmission errors are often bursty. Therefore, they may affect more than one consecutive transmission packet in transmission order. In low bitrate video communication, it is relatively common for an entire coded picture to be encapsulated into one transmission packet. Consequently, a primary coded picture and the corresponding redundant coded pictures may be transmitted in consecutive packets in transmission order. To make the transmission scheme more tolerant of bursty transmission errors, it is beneficial to transmit the primary coded picture and redundant coded picture separated by more than a single packet. The DON concept enables this.

许多容易出错的视频通信系统的一个特点是传输错误通常是突发的。因此,它们可能影响传输顺序中的多个连续传输分组。在低比特率视频通信中,将整个编码图片封装到一个传输分组中是相对常见的。因此,主编码图片和相应的冗余编码图片可以按照传输顺序以连续分组的形式传输。为了使传输方案更能容忍突发传输错误,传输由多个分组分隔的主编码图片和冗余编码图片是有益的。DON概念实现了这一点。
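
The scheduling idea above can be sketched as follows. This is only an illustration under stated assumptions, not part of the payload format: one coded picture per packet is assumed (the low bitrate case described above), and the names schedule, min_separation, decoding_order, and redundant_delay, as well as the tuple layout, are hypothetical. DON values are assigned in decoding order, and each redundant coded picture is sent a fixed number of picture slots after its primary coded picture.

```python
def schedule(num_pictures, redundant_delay=3):
    """Return a transmission order of (don, kind, picture_index) tuples.

    DON values follow decoding order: primary picture i gets DON 2*i and
    its redundant copy gets DON 2*i + 1. The redundant copy is sent
    `redundant_delay` picture slots later than its primary picture.
    """
    slots = {}
    for i in range(num_pictures):
        slots.setdefault(i, []).append((2 * i, "primary", i))
        slots.setdefault(i + redundant_delay, []).append((2 * i + 1, "redundant", i))
    order = []
    for slot in sorted(slots):
        order.extend(slots[slot])
    return order


def min_separation(order):
    """Smallest distance, in packets, between a primary picture and its redundant copy."""
    position = {(kind, pic): idx for idx, (_, kind, pic) in enumerate(order)}
    pictures = {pic for _, _, pic in order}
    return min(position[("redundant", p)] - position[("primary", p)] for p in pictures)


def decoding_order(received):
    """Receiver side: restore decoding order by sorting on DON.

    DON wraparound is not handled in this small sketch.
    """
    return sorted(received, key=lambda item: item[0])


if __name__ == "__main__":
    for packet in schedule(4, redundant_delay=3):
        print(packet)
```

With redundant_delay set to 0, each redundant coded picture immediately follows its primary coded picture, so a loss burst of two packets can remove both. With a delay of 3, every pair is at least three packets apart in this sketch, and the receiver still recovers the decoding order from the DON values.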

13.5. Remarks on Other Design Possibilities
13.5. 关于其他设计可能性的评论

The slice header syntax structure of the H.264 coding standard contains the frame_num syntax element that can indicate the decoding order of coded frames. However, using the frame_num syntax element to recover the decoding order is neither feasible nor desirable, for the following reasons:

H.264编码标准的切片头语法结构包含frame_num语法元素,该元素可以指示编码帧的解码顺序。然而,由于以下原因,使用frame_num语法元素来恢复解码顺序是不可行或不可取的:

o The receiver is required to parse at least one slice header per coded picture (before passing the coded data to the decoder).

o 接收器需要对每个编码图片至少解析一个切片头(在将编码数据传递给解码器之前)。

o Coded slices from multiple coded video sequences cannot be interleaved, as the frame_num syntax element is reset to 0 in each IDR picture.

o 来自多个编码视频序列的编码片段不能交错,因为在每个IDR图片中帧编号语法元素重置为0。

o The coded fields of a complementary field pair share the same value of the frame_num syntax element. Thus, the decoding order of the coded fields of a complementary field pair cannot be recovered based on the frame_num syntax element or any other syntax element of the H.264 coding syntax.

o 互补字段对的编码字段共享frame_num语法元素的相同值。因此,不能基于H.264编码语法的frame_num语法元素或任何其他语法元素来恢复互补字段对的编码字段的解码顺序。
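
The second reason in the list above can be illustrated with a small sketch. It is a simplification: frame_num is treated as a plain per-picture counter, and the tuple layout, the sequence labels, and the DON values are hypothetical. Two coded video sequences each start with an IDR picture, so frame_num restarts at 0 in both; once their slices are interleaved, ordering by frame_num mixes the two sequences, whereas ordering by DON does not.

```python
# Each slice is (sequence_label, frame_num, don). frame_num restarts at 0
# at the IDR picture that begins each coded video sequence.
seq_a = [("A", 0, 0), ("A", 1, 1), ("A", 2, 2)]
seq_b = [("B", 0, 3), ("B", 1, 4), ("B", 2, 5)]

# Hypothetical interleaved transmission order.
received = [seq_a[0], seq_b[0], seq_a[1], seq_b[1], seq_a[2], seq_b[2]]

# Ordering by frame_num cannot separate the two sequences.
by_frame_num = sorted(received, key=lambda s: s[1])
# Ordering by DON restores the intended decoding order.
by_don = sorted(received, key=lambda s: s[2])

labels_by_frame_num = [s[0] for s in by_frame_num]
labels_by_don = [s[0] for s in by_don]

if __name__ == "__main__":
    print(labels_by_frame_num)
    print(labels_by_don)
```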

The RTP payload format for transport of MPEG-4 elementary streams [25] enables interleaving of access units and transmission of multiple access units in the same RTP packet. An access unit is specified in the H.264 coding standard to comprise all NAL units associated with a primary coded picture according to Subclause 7.4.1.2 of [1]. Because that format interleaves whole access units, slices of different pictures cannot be interleaved, and the multi-picture slice interleaving technique (see Section 12.6) for improved error resilience cannot be used.

用于传输MPEG-4基本流的RTP有效载荷格式[25]支持在同一RTP数据包中交错接入单元和传输多个接入单元。根据[1]的子条款7.4.1.2,H.264编码标准中规定了接入单元,以包括与主编码图片相关联的所有NAL单元。因此,不同图片的切片不能交错,并且不能使用用于提高错误恢复能力的多图片切片交错技术(见第12.6节)。

14. Changes from RFC 3984
14. RFC 3984的变更

The following is the list of technical changes (including bug fixes) from RFC 3984. Besides this list of technical changes, numerous editorial changes have been made, but they are not documented in this section. Note that Section 8.2.2 is where many of the important changes in this memo occur, and it deserves particular attention.

以下是RFC 3984的技术更改列表(包括错误修复)。除此技术变更列表外,还进行了许多编辑性变更,但本节未记录这些变更。请注意,第8.2.2节是本备忘录中许多重要变更发生的地方,值得特别注意。

1) In Sections 5.4, 5.5, 6.2, 6.3, and 6.4, removed the statement that the packetization mode in use may be signaled by external means.

1) 在第5.4节、第5.5节、第6.2节、第6.3节和第6.4节中,删除了使用中的打包模式可通过外部方式发出信号的规定。

2) In Section 7.2.2, changed the sentence

2) 在第7.2.2节中,更改了句子

There are N VCL NAL units in the de-interleaving buffer.

解交织缓冲器中有N个VCL NAL单元。

to

There are N or more VCL NAL units in the de-interleaving buffer.

解交织缓冲器中有N个或多个VCL NAL单元。

3) In Section 8.1, in the semantics of sprop-init-buf-time (paragraph 2), changed the sentence

3) 在第8.1节中,在sprop-init-buf-time的语义(第2段)中,更改了句子

The parameter is the maximum value of (transmission time of a NAL unit - decoding time of the NAL unit), assuming reliable and instantaneous transmission, the same timeline for transmission and decoding, and that decoding starts when the first packet arrives.

该参数是(NAL单元的传输时间-NAL单元的解码时间)的最大值,假设可靠和瞬时传输,传输和解码的时间线相同,并且解码在第一个数据包到达时开始。

to

The parameter is the maximum value of (decoding time of the NAL unit - transmission time of a NAL unit), assuming reliable and instantaneous transmission, the same timeline for transmission and decoding, and that decoding starts when the first packet arrives.

该参数是(NAL单元的解码时间-NAL单元的传输时间)的最大值,假设可靠和瞬时传输,传输和解码的时间线相同,并且解码在第一个数据包到达时开始。

4) Added media type parameters max-smbps, sprop-level-parameter-sets, use-level-src-parameter-sets, in-band-parameter-sets, sar-understood, and sar-supported.

4) 添加了媒体类型参数max-smbps、sprop-level-parameter-sets、use-level-src-parameter-sets、in-band-parameter-sets、sar-understood和sar-supported。

5) In Section 8.1, removed the specification of parameter-add. Other descriptions of parameter-add (in Sections 8.2 and 8.4) were also removed.

5) 在第8.1节中,删除了参数add的规范。参数添加的其他说明(第8.2节和第8.4节)也被删除。

6) In Section 8.1, added a constraint to sprop-parameter-sets such that it can only contain parameter sets for the same profile and level as indicated by profile-level-id.

6) 在第8.1节中,向sprop参数集添加了一个约束,使其只能包含由profile-level-id指示的相同纵断面和标高的参数集。

7) In Section 8.2.1, added that sprop-parameter-sets and sprop-level-parameter-sets may be either included in the "a=fmtp" line of SDP or conveyed using the "fmtp" source attribute as specified in Section 6.3 of [9].

7) 在第8.2.1节中,补充了sprop参数集和sprop级别参数集可以包含在SDP的“a=fmtp”行中,或者使用[9]第6.3节中规定的“fmtp”源属性进行传输。

8) In Section 8.2.2, removed sprop-deint-buf-req from being part of the media format configuration in usage with the SDP Offer/Answer model.

8) 在第8.2.2节中,删除了sprop deint buf req,使其不再是SDP提供/应答模式使用的媒体格式配置的一部分。

9) In Section 8.2.2, made it clear that level is downgradable in the SDP Offer/Answer model, i.e., the use of the level part of profile-level-id does not need to be symmetric (the level included in the answer can be lower than or equal to the level included in the offer).

9) 在第8.2.2节中,明确了SDP报价/应答模型中的级别可降级,即配置文件级别id的级别部分的使用不需要对称(应答中包含的级别可以低于或等于报价中包含的级别)。

10) In Section 8.2.2, removed the statement that the capability parameters may be used to declare encoding capabilities.

10) 在第8.2.2节中,删除了能力参数可用于声明编码能力的内容。

11) In Section 8.2.2, added rules on how to use sprop-parameter-sets and sprop-level-parameter-sets for out-of-band transport of parameter sets, with or without level downgrading.

11) 在第8.2.2节中,添加了关于如何使用sprop参数集和sprop级别参数集进行参数集带外传输的规则,包括或不包括级别降级。

12) In Section 8.2.2, clarified the rules of using the media type parameters with SDP Offer/Answer for multicast.

12) 在第8.2.2节中,阐明了将媒体类型参数与SDP提供/应答用于多播的规则。

13) In Section 8.2.2, completed and corrected the list of how different media type parameters shall be interpreted in the different combinations of offer or answer and direction attribute.

13) 在第8.2.2节中,完成并更正了不同媒体类型参数在报价或应答和方向属性的不同组合中的解释列表。

14) In Section 8.4, changed the text such that both out-of-band and in-band transport of parameter sets are allowed, and neither is recommended or required.

14) 在第8.4节中,更改了文本,使参数集的带外和带内传输均被允许,且两者均不作推荐或要求。

15) Added Section 8.5 (informative) providing example methods for decoder refresh to handle parameter set losses.

15) 增加了第8.5节(资料性),提供了解码器刷新处理参数集丢失的示例方法。

16) Added media type parameters max-recv-level and level-asymmetry-allowed and adjusted associated text and examples for level upgrade and asymmetry.

16) 添加了媒体类型参数max-recv-level和level-asymmetry-allowed,并调整了级别升级和不对称的相关文本和示例。
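
The level downgrading rule of item 9 above can be sketched as a simple check. The sketch assumes that profile-level-id is the base16 representation of three bytes (profile_idc, profile-iop, and level_idc) and compares only the level_idc byte. It deliberately ignores the special signaling of Level 1b as well as the max-recv-level and level-asymmetry-allowed parameters of item 16, and it does not check that the profiles match. The function names are hypothetical.

```python
def parse_profile_level_id(value):
    """Split a 6-hex-digit profile-level-id into its three bytes."""
    raw = bytes.fromhex(value)
    if len(raw) != 3:
        raise ValueError("profile-level-id must be exactly 6 hex digits")
    return {"profile_idc": raw[0], "profile_iop": raw[1], "level_idc": raw[2]}


def answer_level_acceptable(offer_plid, answer_plid):
    """True if the level in the answer is lower than or equal to the level in the offer.

    Simplification: level_idc values are compared numerically, which does
    not account for Level 1b.
    """
    offer = parse_profile_level_id(offer_plid)
    answer = parse_profile_level_id(answer_plid)
    return answer["level_idc"] <= offer["level_idc"]


if __name__ == "__main__":
    # Offer at level_idc 31, answer downgraded to level_idc 30.
    print(answer_level_acceptable("42E01F", "42E01E"))
```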

15. Backward Compatibility to RFC 3984
15. 向后兼容RFC 3984

The current document is a revision of RFC 3984 and obsoletes it. The technical changes relative to RFC 3984 are listed in Section 14. This section addresses the backward compatibility issues.

本文档是RFC 3984的修订版,并废弃了RFC 3984。第14节列出了相对于RFC 3984的技术变更。本节讨论向后兼容性问题。

In the majority of cases, legacy implementations per RFC 3984 and new implementations per this document interwork without compatibility issues. Compatibility issues may occur only when both of the following conditions are true: 1) legacy implementations and new implementations are interworking, and 2) parameter sets are transported out-of-band. When such compatibility issues occur, the following analyses make it straightforward to debug and find the reason for the incompatibility.

应该注意的是,在大多数情况下,根据RFC 3984的传统实现和根据本文档的新实现不会存在兼容性问题。只有当以下两种情况都成立时,才可能出现兼容性问题:1)旧实现和新实现相互作用,2)参数集传输到带外。当出现此类兼容性问题时,使用以下分析很容易进行调试并找到不兼容的原因。

Items 1, 2, 3, 7, 9, 10, 12, and 13 are bug-fix types of changes and do not incur any backward compatibility issues.

第1、2、3、7、9、10、12和13项是bug修复类型的更改,不会引起任何向后兼容性问题。

Item 4 (addition of six new media type parameters) does not incur any backward compatibility issues for SDP Offer/Answer-based applications, as legacy RFC 3984 receivers ignore these parameters, and it is fine for legacy RFC 3984 senders not to use these parameters as they are optional. However, there is a backward compatibility issue for declarative-usage-based applications (only for the parameter sprop-level-parameter-sets, as the other five parameters are not usable in declarative usage). For example, declarative-usage-based applications using RTSP and SAP have a backward compatibility issue because the SDP receiver per RFC 3984 cannot accept a session for which the SDP includes an unrecognized parameter. Therefore, the RTSP or SAP server may have to prepare two sets of streams, one for legacy RFC 3984 receivers and one for receivers according to this memo.

第4项(添加六个新的媒体类型参数)不会对基于SDP提供/应答的应用程序产生任何向后兼容性问题,因为旧版RFC 3984接收器忽略这些参数,旧版RFC 3984发送器不使用这些参数也是可以的,因为它们是可选的。但是,基于声明性用法的应用程序存在向后兼容性问题(仅针对参数sprop-level-parameter-sets,因为其他五个参数在声明性用法中不可用)。例如,使用RTSP和SAP的基于声明性用法的应用程序存在向后兼容性问题,因为符合RFC 3984的SDP接收器无法接受SDP中包含无法识别参数的会话。因此,RTSP或SAP服务器可能必须准备两组流,一组用于旧版RFC 3984接收器,另一组用于符合本备忘录的接收器。
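
The behavior described for item 4 can be sketched as follows. This is a hedged illustration of the two cases in the paragraph above, not an implementation of any real SDP stack: the function names parse_fmtp and legacy_view are hypothetical, and only the six parameter names listed in item 4 of Section 14 are treated as unknown to a legacy receiver.

```python
# The six media type parameters added by this memo (item 4 in Section 14).
NEW_IN_THIS_MEMO = {
    "max-smbps",
    "sprop-level-parameter-sets",
    "use-level-src-parameter-sets",
    "in-band-parameter-sets",
    "sar-understood",
    "sar-supported",
}


def parse_fmtp(params):
    """Parse the parameter part of an a=fmtp line into a name -> value dict."""
    result = {}
    for item in params.split(";"):
        item = item.strip()
        if not item:
            continue
        # Split on the first "=" only, so base64 padding in values survives.
        name, _, value = item.partition("=")
        result[name.strip().lower()] = value.strip()
    return result


def legacy_view(params, declarative):
    """Model how a legacy RFC 3984 receiver treats the parameters.

    Offer/Answer: unrecognized parameters are ignored.
    Declarative usage: a session with an unrecognized parameter cannot be
    accepted, which is modeled here by returning None.
    """
    parsed = parse_fmtp(params)
    unknown = [name for name in parsed if name in NEW_IN_THIS_MEMO]
    if declarative and unknown:
        return None
    return {n: v for n, v in parsed.items() if n not in NEW_IN_THIS_MEMO}


if __name__ == "__main__":
    print(legacy_view("profile-level-id=42e01f; max-smbps=40500", declarative=False))
```

A server facing both kinds of receivers in declarative usage could use such a check to decide which of its two session descriptions to hand out; how that decision is made is outside the scope of this sketch.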

Items 5, 6, and 11 are related to out-of-band transport of parameter sets. The following backward compatibility issues exist:

第5、6和11项与参数集的带外传输有关。存在以下向后兼容性问题。

1) When a legacy sender per RFC 3984 includes in sprop-parameter-sets parameter sets for a level different from the default level indicated by profile-level-id, the value of sprop-parameter-sets is invalid to a receiver per this memo; therefore, the session may be rejected.

1) 当RFC 3984规定的传统发送方包含与sprop参数集的配置文件级别id指示的默认级别不同的级别的参数集时,根据此备忘录,sprop参数集的参数值对接收方无效;因此,会议可能会被拒绝。

2) In SDP Offer/Answer between a legacy offerer per RFC 3984 and an answerer per this memo, when the answerer includes in the answer parameter sets that are not a superset of the parameter sets included in the offer, the parameter value of sprop-parameter-sets is invalid to the offerer, and the session may not be initiated properly (related to change item 11).

2) 在根据RFC 3984的传统报价人与根据本备忘录的应答人之间的SDP报价/应答中,当应答人包含的应答参数集不是报价中包含的参数集的超集时,sprop参数集的参数值对报价人无效,会话可能无法正确启动(与变更项目11相关)。

3) When one endpoint A per this memo includes in-band-parameter-sets equal to 1, the other side B per RFC 3984 does not understand that it must transmit parameter sets in-band, and B may still omit parameter sets from the stream it is sending. Consequently, endpoint A cannot decode the stream it receives.

3) 当此备忘录中的一个端点A包含等于1的带内参数集时,RFC 3984中的另一方B不理解其必须在带内传输参数集,并且B仍可能排除其发送的带内流中的参数集。因此,端点A无法解码它接收的流。
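
A receiver per this memo could detect the first issue above with a check along the following lines. This is a minimal sketch under stated assumptions: sprop-parameter-sets is a comma-separated list of base64-encoded parameter set NAL units, the sequence parameter set (NAL unit type 7) carries profile_idc in its second byte and level_idc in its fourth byte, and only those two bytes are compared with profile-level-id (the constraint flags and Level 1b are ignored). The function names and the sample strings are illustrative only.

```python
import base64


def sps_profile_and_level(nal_b64):
    """Return (profile_idc, level_idc) for a base64 SPS NAL unit, else None."""
    nal = base64.b64decode(nal_b64)
    # The low five bits of the NAL unit header give the type; 7 is an SPS.
    if len(nal) < 4 or (nal[0] & 0x1F) != 7:
        return None
    return nal[1], nal[3]


def sprop_matches_profile_level_id(profile_level_id, sprop_parameter_sets):
    """True if every SPS carries the profile_idc and level_idc of profile-level-id."""
    raw = bytes.fromhex(profile_level_id)
    if len(raw) != 3:
        raise ValueError("profile-level-id must be exactly 6 hex digits")
    expected = (raw[0], raw[2])
    for item in sprop_parameter_sets.split(","):
        found = sps_profile_and_level(item)
        if found is not None and found != expected:
            return False
    return True


if __name__ == "__main__":
    # Sample value: the SPS decodes to profile_idc 0x42 and level_idc 0x0A.
    print(sprop_matches_profile_level_id("42000A", "Z0IACpZTBYmI,aMljiA=="))
```

A False result corresponds to the situation in which the session may be rejected; whether to reject or to fall back to in-band parameter sets is a policy decision that this sketch does not make.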

Item 7 (allowance of conveying sprop-parameter-sets and sprop-level-parameter-sets using the "fmtp" source attribute as specified in Section 6.3 of [9]) is similar to item 4. It does not incur any backward compatibility issues for SDP Offer/Answer-based applications, as legacy RFC 3984 receivers ignore the "fmtp" source attribute, and it is fine for legacy RFC 3984 senders not to use the "fmtp" source attribute as it is optional. However, there is a backward compatibility issue for SDP declarative-usage-based applications, e.g., those using RTSP and SAP, because the SDP receiver per RFC 3984 cannot accept a session for which the SDP includes an unrecognized parameter (i.e., the "fmtp" source attribute). Therefore, the RTSP or SAP server may have to prepare two sets of streams, one for legacy RFC 3984 receivers and one for receivers according to this memo.

第7项(允许使用[9]第6.3节中规定的“fmtp”源属性传送sprop参数集和sprop级别参数集)与第4项类似。对于基于SDP提供/应答的应用程序,它不会产生任何向后兼容性问题,因为旧版RFC 3984接收器忽略“fmtp”源属性,旧版RFC 3984发送器不使用“fmtp”源属性是可以的,因为它是可选的。但是,基于SDP声明性使用的应用程序(例如使用RTSP和SAP的应用程序)存在向后兼容性问题,因为根据RFC 3984的SDP接收器无法接受SDP包含无法识别参数(即“fmtp”源属性)的会话。因此,RTSP或SAP服务器可能必须准备两组流,一组用于传统RFC 3984接收器,另一组用于根据本备忘录的接收器。

Item 14 does not incur any backward compatibility issues, as out-of-band transport of parameter sets is still allowed.

第14项不会产生任何向后兼容性问题,因为仍然允许参数集的带外传输。

Item 15 does not incur any backward compatibility issues, as the added Section 8.5 is informative.

第15项不产生任何向后兼容性问题,因为增加的第8.5节是信息性的。

Item 16 does not create any backward compatibility issues as the handling of the default level is the same if either end is RFC 3984 compliant, and, furthermore, RFC-3984-compliant ends would simply ignore the new media type parameters, if present.

第16项不会产生任何向后兼容性问题,因为如果任一端符合RFC 3984,则默认级别的处理是相同的,而且,符合RFC-3984的端只会忽略新的媒体类型参数(如果存在)。

16. Acknowledgements
16. 致谢

Stephan Wenger, Miska Hannuksela, Thomas Stockhammer, Magnus Westerlund, and David Singer are thanked as the authors of RFC 3984. Dave Lindbergh, Philippe Gentric, Gonzalo Camarillo, Gary Sullivan, Joerg Ott, and Colin Perkins are thanked for careful review during the development of RFC 3984. Stephen Botzko, Magnus Westerlund, Alex Eleftheriadis, Thomas Schierl, Tom Taylor, Ali Begen, Aaron Wells, Stuart Taylor, Robert Sparks, Dan Romascanu, and Niclas Comstedt are thanked for their valuable comments and input during the development of this memo.

斯蒂芬·温格、米斯卡·汉努克塞拉、托马斯·斯托克哈默、马格努斯·韦斯特隆德和大卫·辛格是RFC3984的作者,我们对此表示感谢。感谢Dave Lindbergh、Philippe Gentric、Gonzalo Camarillo、Gary Sullivan、Joerg Ott和Colin Perkins在RFC 3984开发过程中进行的仔细审查。感谢Stephen Botzko、Magnus Westerlund、Alex Eleftheriadis、Thomas Schierl、Tom Taylor、Ali Begen、Aaron Wells、Stuart Taylor、Robert Sparks、Dan Romascanu和Niclas Comstedt在编制本备忘录期间提出的宝贵意见和投入。

17. References
17. 参考文献
17.1. Normative References
17.1. 规范性引用文件

[1] ITU-T Recommendation H.264, "Advanced video coding for generic audiovisual services", March 2010.

[1] ITU-T建议H.264,“通用视听服务的高级视频编码”,2010年3月。

[2] ISO/IEC International Standard 14496-10:2008.

[2] ISO/IEC国际标准14496-10:2008。

[3] ITU-T Recommendation H.241, "Extended video procedures and control signals for H.300-series terminals", May 2006.

[3] ITU-T建议H.241,“H.300系列终端的扩展视频程序和控制信号”,2006年5月。

[4] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[4] Bradner,S.,“RFC中用于表示需求水平的关键词”,BCP 14,RFC 2119,1997年3月。

[5] Schulzrinne, H., Casner, S., Frederick, R., and V. Jacobson, "RTP: A Transport Protocol for Real-Time Applications", STD 64, RFC 3550, July 2003.

[5] Schulzrinne,H.,Casner,S.,Frederick,R.,和V.Jacobson,“RTP:实时应用的传输协议”,STD 64,RFC 3550,2003年7月。

[6] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session Description Protocol", RFC 4566, July 2006.

[6] Handley,M.,Jacobson,V.,和C.Perkins,“SDP:会话描述协议”,RFC4566,2006年7月。

[7] Josefsson, S., "The Base16, Base32, and Base64 Data Encodings", RFC 4648, October 2006.

[7] Josefsson,S.,“Base16、Base32和Base64数据编码”,RFC4648,2006年10月。

[8] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model with Session Description Protocol (SDP)", RFC 3264, June 2002.

[8] Rosenberg,J.和H.Schulzrinne,“具有会话描述协议(SDP)的提供/应答模型”,RFC 3264,2002年6月。

[9] Lennox, J., Ott, J., and T. Schierl, "Source-Specific Media Attributes in the Session Description Protocol (SDP)", RFC 5576, June 2009.

[9] Lennox,J.,Ott,J.,和T.Schierl,“会话描述协议(SDP)中的源特定媒体属性”,RFC 5576,2009年6月。

17.2. Informative References
17.2. 资料性引用

[10] Luthra, A., Sullivan, G.J., and T. Wiegand (eds.), "Introduction to the special issue on the H.264/AVC video coding standard", IEEE Transactions on Circuits and Systems for Video Technology, Vol. 13, No. 7, July 2003.

[10] Luthra,A.,Sullivan,G.J.和T.Wiegand(编辑),“H.264/AVC视频编码标准特刊介绍”,IEEE视频技术电路和系统交易,第13卷,第7期,2003年7月。

[11] Ott, J., Bormann, C., Sullivan, G., Wenger, S., and R. Even, Ed., "RTP Payload Format for ITU-T Rec. H.263 Video", RFC 4629, January 2007.

[11] Ott,J.,Bormann,C.,Sullivan,G.,Wenger,S.,和R.Even,编辑,“ITU-T Rec.H.263视频的RTP有效载荷格式”,RFC 4629,2007年1月。

[12] ISO/IEC International Standard 14496-2:2004.

[12] ISO/IEC国际标准14496-2:2004。

[13] Wenger, S., "H.264/AVC over IP", IEEE Transactions on Circuits and Systems for Video Technology, Vol. 13, No. 7, July 2003.

[13] Wenger,S.,“H.264/AVC over IP”,IEEE视频技术电路和系统交易,第13卷,第7期,2003年7月。

[14] Wenger, S., "H.26L over IP: The IP-Network Adaptation Layer", Proceedings Packet Video Workshop, April 2002.

[14] Wenger,S.,“H.26L over IP:IP网络适配层”,包视频研讨会论文集,2002年4月。

[15] Stockhammer, T., Hannuksela, M.M., and S. Wenger, "H.26L/JVT Coding Network Abstraction Layer and IP-Based Transport", IEEE International Conference on Image Processing (ICIP 2002), Rochester, NY, September 2002.

[15] Stockhammer,T.,Hannuksela,M.M.和S.Wenger,“H.26L/JVT编码网络抽象层和基于IP的传输”,IEEE图像处理国际会议(ICIP 2002),纽约州罗切斯特,2002年9月。

[16] Schulzrinne, H. and S. Casner, "RTP Profile for Audio and Video Conferences with Minimal Control", STD 65, RFC 3551, July 2003.

[16] Schulzrinne,H.和S.Casner,“具有最小控制的音频和视频会议的RTP配置文件”,STD 65,RFC 3551,2003年7月。

[17] ITU-T Recommendation H.223, "Multiplexing protocol for low bit rate multimedia communication", July 2001.

[17] ITU-T建议H.223,“低比特率多媒体通信的多路复用协议”,2001年7月。

[18] Li, A., Ed., "RTP Payload Format for Generic Forward Error Correction", RFC 5109, December 2007.

[18] Li,A.,编辑,“通用前向纠错的RTP有效载荷格式”,RFC 5109,2007年12月。

[19] Stockhammer, T., Wiegand, T., Oelbaum, T., and F. Obermeier, "Video Coding and Transport Layer Techniques for H.264/AVC-Based Transmission over Packet-Lossy Networks", IEEE International Conference on Image Processing (ICIP 2003), Barcelona, Spain, September 2003.

[19] Stockhammer,T.,Wiegand,T.,Oelbaum,T.,和F.Obermier,“分组有损网络上基于H.264/AVC传输的视频编码和传输层技术”,IEEE国际图像处理会议(ICIP 2003),西班牙巴塞罗那,2003年9月。

[20] Varsa, V. and M. Karczewicz, "Slice interleaving in compressed video packetization", Packet Video Workshop 2000.

[20] Varsa,V.和M.Karczewicz,“压缩视频分组中的切片交织”,分组视频研讨会2000年。

[21] Kang, S.H. and A. Zakhor, "Packet scheduling algorithm for wireless video streaming", Packet Video Workshop 2002.

[21] Kang,S.H.和A.Zakhor,“无线视频流的分组调度算法”,分组视频研讨会2002。

[22] Hannuksela, M.M., "Enhanced Concept of GOP", JVT-B042, available at http://ftp3.itu.int/av-arch/video-site/0201_Gen/JVT-B042.doc, January 2002.

[22] Hannuksela,M.M.,“GOP的增强概念”,JVT-B042,可供查阅http://ftp3.itu.int/av-arch/video-site/0201_Gen/JVT-B042.doc,2002年1月。

[23] Wenger, S., "Video Redundancy Coding in H.263+", 1997 International Workshop on Audio-Visual Services over Packet Networks, September 1997.

[23] Wenger,S.,“H.263+中的视频冗余编码”,1997年分组网络视听服务国际研讨会,1997年9月。

[24] Wang, Y.-K., Hannuksela, M.M., and M. Gabbouj, "Error Resilient Video Coding Using Unequally Protected Key Pictures", in Proc. International Workshop VLBV03, September 2003.

[24] Wang,Y.-K.,Hannuksela,M.M.和M.Gabbouj,“使用不平等保护的关键图片的抗错误视频编码”,载于VLBV03国际研讨会论文集,2003年9月。

[25] van der Meer, J., Mackie, D., Swaminathan, V., Singer, D., and P. Gentric, "RTP Payload Format for Transport of MPEG-4 Elementary Streams", RFC 3640, November 2003.

[25] van der Meer,J.,Mackie,D.,Swaminathan,V.,Singer,D.,和P.Gentric,“MPEG-4基本流传输的RTP有效载荷格式”,RFC 3640,2003年11月。

[26] Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K. Norrman, "The Secure Real-time Transport Protocol (SRTP)", RFC 3711, March 2004.

[26] Baugher,M.,McGrew,D.,Naslund,M.,Carrara,E.,和K.Norrman,“安全实时传输协议(SRTP)”,RFC 3711,2004年3月。

[27] Schulzrinne, H., Rao, A., and R. Lanphier, "Real Time Streaming Protocol (RTSP)", RFC 2326, April 1998.

[27] Schulzrinne,H.,Rao,A.,和R.Lanphier,“实时流协议(RTSP)”,RFC2326,1998年4月。

[28] Handley, M., Perkins, C., and E. Whelan, "Session Announcement Protocol", RFC 2974, October 2000.

[28] Handley,M.,Perkins,C.,和E.Whelan,“会话公告协议”,RFC 2974,2000年10月。

[29] Westerlund, M. and S. Wenger, "RTP Topologies", RFC 5117, January 2008.

[29] Westerlund,M.和S.Wenger,“RTP拓扑”,RFC 5117,2008年1月。

[30] Wenger, S., Chandra, U., Westerlund, M., and B. Burman, "Codec Control Messages in the RTP Audio-Visual Profile with Feedback (AVPF)", RFC 5104, February 2008.

[30] Wenger,S.,Chandra,U.,Westerlund,M.,和B.Burman,“带反馈的RTP视听配置文件(AVPF)中的编解码器控制消息”,RFC 5104,2008年2月。

Authors' Addresses

作者地址

   Ye-Kui Wang
   Huawei Technologies
   400 Crossing Blvd, 2nd Floor
   Bridgewater, NJ 08807
   USA

美国新泽西州布里奇沃特市横穿大道400号2楼华为科技公司王业奎08807

   Phone: +1-908-541-3518
   EMail: yekui.wang@huawei.com
        

   Roni Even
   Huawei Technologies
   14 David Hamelech
   Tel Aviv 64953
   Israel

Roni Even 华为技术 以色列特拉维夫 David Hamelech 14号,邮编64953

   Phone: +972-545481099
   EMail: even.roni@huawei.com
        

   Tom Kristensen
   TANDBERG
   Philip Pedersens vei 22
   N-1366 Lysaker
   Norway

Tom Kristensen TANDBERG Philip Pedersens vei 22 N-1366挪威莱赛克

   Phone: +47 67125125
   EMail: tom.kristensen@tandberg.com, tomkri@ifi.uio.no
        

   Randell Jesup
   WorldGate Communications
   3800 Horizon Blvd, Suite #103
   Trevose, PA 19053-4947
   USA

Randell Jesup WorldGate Communications 3800 Horizon Blvd,美国宾夕法尼亚州特雷沃斯103号套房,邮编:19053-4947

   Phone: +1-215-354-5166
   EMail: rjesup@wgate.com, randell_ietf@jesup.org
        