Network Working Group                                          S. Wenger
Request for Comments: 3984                               M.M. Hannuksela
Category: Standards Track                                 T. Stockhammer
                                                           M. Westerlund
                                                               D. Singer
                                                           February 2005
        

RTP Payload Format for H.264 Video

Status of This Memo

This document specifies an Internet standards track protocol for the Internet community, and requests discussion and suggestions for improvements. Please refer to the current edition of the "Internet Official Protocol Standards" (STD 1) for the standardization state and status of this protocol. Distribution of this memo is unlimited.

Copyright Notice

Copyright (C) The Internet Society (2005).

Abstract

This memo describes an RTP Payload format for the ITU-T Recommendation H.264 video codec and the technically identical ISO/IEC International Standard 14496-10 video codec. The RTP payload format allows for packetization of one or more Network Abstraction Layer Units (NALUs), produced by an H.264 video encoder, in each RTP payload. The payload format has wide applicability, as it supports applications from simple low bit-rate conversational usage, to Internet video streaming with interleaved transmission, to high bit-rate video-on-demand.

Table of Contents

   1.  Introduction
       1.1.  The H.264 Codec
       1.2.  Parameter Set Concept
       1.3.  Network Abstraction Layer Unit Types
   2.  Conventions
   3.  Scope
   4.  Definitions and Abbreviations
       4.1.  Definitions
   5.  RTP Payload Format
       5.1.  RTP Header Usage
       5.2.  Common Structure of the RTP Payload Format
       5.3.  NAL Unit Octet Usage
       5.4.  Packetization Modes
       5.5.  Decoding Order Number (DON)
       5.6.  Single NAL Unit Packet
       5.7.  Aggregation Packets
       5.8.  Fragmentation Units (FUs)
   6.  Packetization Rules
       6.1.  Common Packetization Rules
       6.2.  Single NAL Unit Mode
       6.3.  Non-Interleaved Mode
       6.4.  Interleaved Mode
   7.  De-Packetization Process (Informative)
       7.1.  Single NAL Unit and Non-Interleaved Mode
       7.2.  Interleaved Mode
       7.3.  Additional De-Packetization Guidelines
   8.  Payload Format Parameters
       8.1.  MIME Registration
       8.2.  SDP Parameters
       8.3.  Examples
       8.4.  Parameter Set Considerations
   9.  Security Considerations
   10. Congestion Control
   11. IANA Considerations
   12. Informative Appendix: Application Examples
       12.1. Video Telephony according to ITU-T Recommendation H.241 Annex A
       12.2. Video Telephony, No Slice Data Partitioning, No NAL Unit Aggregation
       12.3. Video Telephony, Interleaved Packetization Using NAL Unit Aggregation
       12.4. Video Telephony with Data Partitioning
       12.5. Video Telephony or Streaming with FUs and Forward Error Correction
       12.6. Low Bit-Rate Streaming
       12.7. Robust Packet Scheduling in Video Streaming
   13. Informative Appendix: Rationale for Decoding Order Number
       13.1. Introduction
       13.2. Example of Multi-Picture Slice Interleaving
       13.3. Example of Robust Packet Scheduling
       13.4. Robust Transmission Scheduling of Redundant Coded Slices
       13.5. Remarks on Other Design Possibilities
   14. Acknowledgements
   15. References
       15.1. Normative References
       15.2. Informative References
   Authors' Addresses
   Full Copyright Statement
        
1. Introduction
1.1. The H.264 Codec

This memo specifies an RTP payload format for the video coding standard known as ITU-T Recommendation H.264 [1] and ISO/IEC International Standard 14496 Part 10 [2] (both also known as Advanced Video Coding, or AVC). Recommendation H.264 was approved by ITU-T in May 2003, and the approved draft specification is available for public review [8]. In this memo the H.264 acronym is used for the codec and the standard, but the memo is equally applicable to the ISO/IEC counterpart of the coding standard.

The H.264 video codec has a very broad application range that covers all forms of digital compressed video, from low bit-rate Internet streaming applications to HDTV broadcast and Digital Cinema applications with nearly lossless coding. Compared to the current state of technology, the overall performance of H.264 is such that bit rate savings of 50% or more are reported. Digital Satellite TV quality, for example, was reported to be achievable at 1.5 Mbit/s, compared to the current operation point of MPEG 2 video at around 3.5 Mbit/s [9].

The codec specification [1] itself distinguishes conceptually between a video coding layer (VCL) and a network abstraction layer (NAL). The VCL contains the signal processing functionality of the codec: mechanisms such as transform, quantization, and motion compensated prediction, and a loop filter. It follows the general concept of most of today's video codecs: a macroblock-based coder that uses inter picture prediction with motion compensation and transform coding of the residual signal. The VCL encoder outputs slices: a bit string that contains the macroblock data of an integer number of macroblocks, and the information of the slice header (containing the spatial address of the first macroblock in the slice, the initial quantization parameter, and similar information). Macroblocks in slices are arranged in scan order unless a different macroblock allocation is specified by using the so-called Flexible Macroblock Ordering syntax. In-picture prediction is used only within a slice. More information is provided in [9].

The Network Abstraction Layer (NAL) encoder encapsulates the slice output of the VCL encoder into Network Abstraction Layer Units (NAL units), which are suitable for transmission over packet networks or use in packet oriented multiplex environments. Annex B of H.264 defines an encapsulation process to transmit such NAL units over byte-stream oriented networks. In the scope of this memo, Annex B is not relevant.

Internally, the NAL uses NAL units. A NAL unit consists of a one-byte header and the payload byte string. The header indicates the type of the NAL unit, the (potential) presence of bit errors or syntax violations in the NAL unit payload, and information regarding the relative importance of the NAL unit for the decoding process. This RTP payload specification is designed to be unaware of the bit string in the NAL unit payload.

One of the main properties of H.264 is the complete decoupling of the transmission time, the decoding time, and the sampling or presentation time of slices and pictures. The decoding process specified in H.264 is unaware of time, and the H.264 syntax does not carry information such as the number of skipped frames (as is common in the form of the Temporal Reference in earlier video compression standards). Also, there are NAL units that affect many pictures and that are, therefore, inherently timeless. For this reason, the handling of the RTP timestamp requires some special considerations for NAL units for which the sampling or presentation time is not defined or, at transmission time, unknown.

1.2. Parameter Set Concept

One very fundamental design concept of H.264 is to generate self-contained packets, to make mechanisms such as the header duplication of RFC 2429 [10] or MPEG-4's Header Extension Code (HEC) [11] unnecessary. This was achieved by decoupling information relevant to more than one slice from the media stream. This higher layer meta information should be sent reliably, asynchronously, and in advance from the RTP packet stream that contains the slice packets. (Provisions for sending this information in-band are also available for applications that do not have an out-of-band transport channel appropriate for the purpose.) The combination of the higher-level parameters is called a parameter set. The H.264 specification includes two types of parameter sets: sequence parameter set and picture parameter set. An active sequence parameter set remains unchanged throughout a coded video sequence, and an active picture parameter set remains unchanged within a coded picture. The sequence and picture parameter set structures contain information such as picture size, optional coding modes employed, and macroblock to slice group map.

To be able to change picture parameters (such as the picture size) without having to transmit parameter set updates synchronously to the slice packet stream, the encoder and decoder can maintain a list of more than one sequence and picture parameter set. Each slice header contains a codeword that indicates the sequence and picture parameter set to be used.

This mechanism allows the decoupling of the transmission of parameter sets from the packet stream, and the transmission of them by external means (e.g., as a side effect of the capability exchange), or through a (reliable or unreliable) control protocol. It may even be possible that they are never transmitted but are fixed by an application design specification.
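The parameter set lookup described in this section can be illustrated with a minimal sketch. The class and field names (ParameterSetStore, sps_id, pps_id, width, height) are hypothetical and chosen only for this illustration. The sketch also assumes that a slice names a picture parameter set, which in turn names its sequence parameter set; that indirection follows the H.264 syntax and is not spelled out in this section. How the parameter sets reach the store (in-band, out-of-band, or fixed by design) is deliberately left open.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class SequenceParameterSet:
    sps_id: int
    width: int   # picture width, as an example of sequence-level information
    height: int  # picture height


@dataclass(frozen=True)
class PictureParameterSet:
    pps_id: int
    sps_id: int  # the sequence parameter set this picture parameter set refers to


class ParameterSetStore:
    """Holds more than one sequence and picture parameter set at a time."""

    def __init__(self) -> None:
        self._sps: Dict[int, SequenceParameterSet] = {}
        self._pps: Dict[int, PictureParameterSet] = {}

    def put_sps(self, sps: SequenceParameterSet) -> None:
        self._sps[sps.sps_id] = sps

    def put_pps(self, pps: PictureParameterSet) -> None:
        self._pps[pps.pps_id] = pps

    def resolve(self, pps_id: int) -> Tuple[SequenceParameterSet, PictureParameterSet]:
        """Return the parameter sets selected by the identifier in a slice header.

        Raises KeyError if the referenced parameter set has not been received.
        """
        pps = self._pps[pps_id]
        sps = self._sps[pps.sps_id]
        return sps, pps


store = ParameterSetStore()
store.put_sps(SequenceParameterSet(sps_id=0, width=352, height=288))
store.put_sps(SequenceParameterSet(sps_id=1, width=704, height=576))
store.put_pps(PictureParameterSet(pps_id=0, sps_id=0))
store.put_pps(PictureParameterSet(pps_id=1, sps_id=1))

# A slice that references picture parameter set 1 switches to the larger
# picture size without any parameter set being sent alongside the slice.
sps, pps = store.resolve(1)
```

Because both sets are already stored, the change of picture size costs nothing in the slice packet stream beyond the identifier in the slice header.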

1.3. Network Abstraction Layer Unit Types

Tutorial information on the NAL design can be found in [12], [13], and [14].

Every NAL unit begins with a single NAL unit type octet, which also serves as the payload header of this RTP payload format. The payload of the NAL unit follows immediately.

The syntax and semantics of the NAL unit type octet are specified in [1], but the essential properties of the NAL unit type octet are summarized below. The NAL unit type octet has the following format:

      +---------------+
      |0|1|2|3|4|5|6|7|
      +-+-+-+-+-+-+-+-+
      |F|NRI|  Type   |
      +---------------+
        

The semantics of the components of the NAL unit type octet, as specified in the H.264 specification, are described briefly below.

F: 1 bit forbidden_zero_bit. The H.264 specification declares a value of 1 as a syntax violation.

NRI: 2 bits nal_ref_idc. A value of 00 indicates that the content of the NAL unit is not used to reconstruct reference pictures for inter picture prediction. Such NAL units can be discarded without risking the integrity of the reference pictures. Values greater than 00 indicate that the decoding of the NAL unit is required to maintain the integrity of the reference pictures.

Type: 5 bits nal_unit_type. This component specifies the NAL unit payload type as defined in table 7-1 of [1], and later within this memo. For a reference of all currently defined NAL unit types and their semantics, please refer to section 7.4.1 in [1].

This memo introduces new NAL unit types, which are presented in section 5.2. The NAL unit types defined in this memo are marked as unspecified in [1]. Moreover, this specification extends the semantics of F and NRI as described in section 5.3.
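The layout of the NAL unit type octet shown above can be sketched as a small parser. This is an illustration only: the names NalHeader, parse_nal_header, and may_discard are hypothetical, and the discard check reflects only the nal_ref_idc semantics summarized in this section, not the extended semantics of section 5.3.

```python
from typing import NamedTuple


class NalHeader(NamedTuple):
    f: int         # forbidden_zero_bit, 1 bit (bit 0 in the figure, the MSB)
    nri: int       # nal_ref_idc, 2 bits
    nal_type: int  # nal_unit_type, 5 bits


def parse_nal_header(octet: int) -> NalHeader:
    """Split the NAL unit type octet into its F, NRI, and Type fields."""
    if not 0 <= octet <= 0xFF:
        raise ValueError("NAL unit type octet must fit in one byte")
    return NalHeader(
        f=(octet >> 7) & 0x01,
        nri=(octet >> 5) & 0x03,
        nal_type=octet & 0x1F,
    )


def may_discard(header: NalHeader) -> bool:
    """True if NRI is 00, i.e. the NAL unit is not needed for reference pictures."""
    return header.nri == 0


header = parse_nal_header(0x65)  # binary 0 11 00101
```

For the octet 0x65, the fields are F = 0, NRI = 3, and Type = 5.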

2. Conventions

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14, RFC 2119 [3].

This specification uses the notion of setting and clearing a bit when bit fields are handled. Setting a bit is the same as assigning that bit the value of 1 (On). Clearing a bit is the same as assigning that bit the value of 0 (Off).

3. Scope

This payload specification can only be used to carry the "naked" H.264 NAL unit stream over RTP, and not the bitstream format discussed in Annex B of H.264. Likely, the first applications of this specification will be in the conversational multimedia field, video telephony or video conferencing, but the payload format also covers other applications, such as Internet streaming and TV over IP.

4. Definitions and Abbreviations
4.1. Definitions

This document uses the definitions of [1]. The following terms, defined in [1], are summed up for convenience:

access unit: A set of NAL units always containing a primary coded picture. In addition to the primary coded picture, an access unit may also contain one or more redundant coded pictures or other NAL units not containing slices or slice data partitions of a coded picture. The decoding of an access unit always results in a decoded picture.

coded video sequence: A sequence of access units that consists, in decoding order, of an instantaneous decoding refresh (IDR) access unit followed by zero or more non-IDR access units including all subsequent access units up to but not including any subsequent IDR access unit.

IDR access unit: An access unit in which the primary coded picture is an IDR picture.

IDR picture: A coded picture containing only slices with I or SI slice types that causes a "reset" in the decoding process. After the decoding of an IDR picture, all following coded pictures in decoding order can be decoded without inter prediction from any picture decoded prior to the IDR picture.

primary coded picture: The coded representation of a picture to be used by the decoding process for a bitstream conforming to H.264. The primary coded picture contains all macroblocks of the picture.

redundant coded picture: A coded representation of a picture or a part of a picture. The content of a redundant coded picture shall not be used by the decoding process for a bitstream conforming to H.264. The content of a redundant coded picture may be used by the decoding process for a bitstream that contains errors or losses.

VCL NAL unit: A collective term used to refer to coded slice and coded data partition NAL units.

In addition, the following definitions apply:

decoding order number (DON): A field in the payload structure, or a derived variable indicating NAL unit decoding order. Values of DON are in the range of 0 to 65535, inclusive. After reaching the maximum value, the value of DON wraps around to 0.

NAL unit decoding order: A NAL unit order that conforms to the constraints on NAL unit order given in section 7.4.1.2 in [1].

transmission order: The order of packets in ascending RTP sequence number order (in modulo arithmetic). Within an aggregation packet, the NAL unit transmission order is the same as the order of appearance of NAL units in the packet.

media aware network element (MANE): A network element, such as a middlebox or application layer gateway that is capable of parsing certain aspects of the RTP payload headers or the RTP payload and reacting to the contents.

Informative note: The concept of a MANE goes beyond normal routers or gateways in that a MANE has to be aware of the signaling (e.g., to learn about the payload type mappings of the media streams), and in that it has to be trusted when working with SRTP. The advantage of using MANEs is that they allow packets to be dropped according to the needs of the media coding. For example, if a MANE has to drop packets due to congestion on a certain link, it can identify those packets whose dropping has the smallest negative impact on the user experience and remove them in order to remove the congestion and/or keep the delay low.
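The wrap-around of DON and the "modulo arithmetic" ordering of RTP sequence numbers in the definitions above can be sketched as follows. The function names are hypothetical. The comparison rule (treating a as earlier than b when the forward distance from a to b is less than half the 16-bit number space) is a common convention assumed here for illustration; it is not defined in this section.

```python
MODULUS_16 = 1 << 16  # both DON and the RTP sequence number are 16-bit values


def next_don(don: int) -> int:
    """Advance a DON value by one, wrapping from 65535 back to 0."""
    return (don + 1) % MODULUS_16


def seq_is_before(a: int, b: int) -> bool:
    """True if sequence number a precedes b in modulo-65536 arithmetic.

    Assumption: the two numbers are less than half the number space apart.
    """
    return a != b and ((b - a) % MODULUS_16) < MODULUS_16 // 2


wrapped = next_don(65535)             # 0
across_wrap = seq_is_before(65535, 0) # True: 0 follows 65535 after the wrap
```

A plain integer comparison would order 0 before 65535 and so misplace the first packet after a wrap, which is why the comparison is done on the modular distance.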

Abbreviations

DON:     Decoding Order Number
DONB:    Decoding Order Number Base
DOND:    Decoding Order Number Difference
FEC:     Forward Error Correction
FU:      Fragmentation Unit
IDR:     Instantaneous Decoding Refresh
IEC:     International Electrotechnical Commission
ISO:     International Organization for Standardization
ITU-T:   International Telecommunication Union, Telecommunication Standardization Sector
MANE:    Media Aware Network Element
MTAP:    Multi-Time Aggregation Packet
MTAP16:  MTAP with 16-bit timestamp offset
MTAP24:  MTAP with 24-bit timestamp offset
NAL:     Network Abstraction Layer
NALU:    NAL Unit
SEI:     Supplemental Enhancement Information
STAP:    Single-Time Aggregation Packet
STAP-A:  STAP type A
STAP-B:  STAP type B
TS:      Timestamp
VCL:     Video Coding Layer

5. RTP Payload Format
5.1. RTP Header Usage

The format of the RTP header is specified in RFC 3550 [4] and reprinted in Figure 1 for convenience. This payload format uses the fields of the header in a manner consistent with that specification.

When one NAL unit is encapsulated per RTP packet, the RECOMMENDED RTP payload format is specified in section 5.6. The RTP payload (and the settings for some RTP header bits) for aggregation packets and fragmentation units are specified in sections 5.7 and 5.8, respectively.

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |V=2|P|X|  CC   |M|     PT      |       sequence number         |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                           timestamp                           |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |           synchronization source (SSRC) identifier            |
      +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
      |            contributing source (CSRC) identifiers             |
      |                             ....                              |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        

Figure 1. RTP header according to RFC 3550
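As an illustration of the header layout in Figure 1, the following sketch unpacks the fixed 12-octet header and the CSRC list from a received packet. The names RtpHeader and parse_rtp_header are hypothetical. The sketch does not handle header extensions or padding and is not a complete RTP implementation.

```python
import struct
from typing import NamedTuple, Tuple


class RtpHeader(NamedTuple):
    version: int
    padding: bool
    extension: bool
    csrc_count: int
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int
    csrcs: Tuple[int, ...]


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Decode the fields of Figure 1 from the start of an RTP packet."""
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the 12-octet fixed header")
    # Network byte order: two octets, 16-bit sequence number, two 32-bit words.
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    cc = b0 & 0x0F
    end = 12 + 4 * cc
    if len(packet) < end:
        raise ValueError("RTP packet truncated inside the CSRC list")
    csrcs = struct.unpack("!%dI" % cc, packet[12:end])
    return RtpHeader(
        version=b0 >> 6,
        padding=bool(b0 & 0x20),
        extension=bool(b0 & 0x10),
        csrc_count=cc,
        marker=bool(b1 & 0x80),
        payload_type=b1 & 0x7F,
        sequence_number=seq,
        timestamp=ts,
        ssrc=ssrc,
        csrcs=csrcs,
    )


# Example packet: V=2, M=1, PT=96 (an arbitrary dynamic payload type chosen
# for this example), followed by one payload octet.
example = struct.pack("!BBHII", 0x80, 0x80 | 96, 1234, 90000, 0xDEADBEEF) + b"\x65"
parsed = parse_rtp_header(example)
```

The payload begins at offset 12 + 4 * CC when no header extension is present; in this payload format its first octet is the NAL unit type octet described in section 1.3.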

The RTP header information to be set according to this RTP payload format is set as follows:

Marker bit (M): 1 bit
   Set for the very last packet of the access unit indicated by the RTP timestamp, in line with the normal use of the M bit in video formats, to allow an efficient playout buffer handling. For aggregation packets (STAP and MTAP), the marker bit in the RTP header MUST be set to the value that the marker bit of the last NAL unit of the aggregation packet would have been if it were transported in its own RTP packet. Decoders MAY use this bit as an early indication of the last packet of an access unit, but MUST NOT rely on this property.

Informative note: Only one M bit is associated with an aggregation packet carrying multiple NAL units. Thus, if a gateway has re-packetized an aggregation packet into several packets, it cannot reliably set the M bit of those packets.

Payload type (PT): 7 bits
   The assignment of an RTP payload type for this new packet format is outside the scope of this document and will not be specified here. The assignment of a payload type has to be performed either through the profile used or in a dynamic way.

Sequence number (SN): 16 bits
   Set and used in accordance with RFC 3550. For the single NALU and non-interleaved packetization mode, the sequence number is used to determine decoding order for the NALU.

Timestamp: 32 bits
   The RTP timestamp is set to the sampling timestamp of the content. A 90 kHz clock rate MUST be used.

If the NAL unit has no timing properties of its own (e.g., parameter set and SEI NAL units), the RTP timestamp is set to the RTP timestamp of the primary coded picture of the access unit in which the NAL unit is included, according to section 7.4.1.2 of [1].

The setting of the RTP Timestamp for MTAPs is defined in section 5.7.2.

Receivers SHOULD ignore any picture timing SEI messages included in access units that have only one display timestamp. Instead, receivers SHOULD use the RTP timestamp for synchronizing the display process.

接收器应忽略仅具有一个显示时间戳的访问单元中包含的任何图片定时SEI消息。相反,接收器应该使用RTP时间戳来同步显示过程。

RTP senders SHOULD NOT transmit picture timing SEI messages for pictures that are not supposed to be displayed as multiple fields.

RTP发送方不应为不应显示为多个字段的图片发送图片定时SEI消息。

If one access unit has more than one display timestamp carried in a picture timing SEI message, then the information in the SEI message SHOULD be treated as relative to the RTP timestamp, with the earliest event occurring at the time given by the RTP timestamp, and subsequent events later, as given by the difference in SEI message picture timing values. Let tSEI1, tSEI2, ..., tSEIn be the display timestamps carried in the SEI message of an access unit, where tSEI1 is the earliest of all such timestamps. Let tmadjst() be a function that adjusts the SEI message's time scale to a 90-kHz time scale. Let TS be the RTP timestamp. Then, the display time for the event associated with tSEI1 is TS. The display time for the event associated with tSEIx, where x is in [2..n], is TS + tmadjst(tSEIx - tSEI1).

如果一个访问单元在图片定时SEI消息中携带了多个显示时间戳,则SEI消息中的信息应被视为相对于RTP时间戳,最早的事件发生在RTP时间戳给出的时间,随后的事件发生在RTP时间戳之后,由SEI消息图片定时值的差异给出。设tSEI1、tSEI2、…、tSEIn为接入单元的SEI消息中携带的显示时间戳,其中tSEI1是所有此类时间戳中最早的。让tmadjst()是一个将SEI消息时间刻度调整为90 kHz时间刻度的函数。设TS为RTP时间戳。然后,与tSEI1关联的事件的显示时间是TS。与tSEIx关联的事件的显示时间,其中x是[2..n]是TS+tmadjst(tSEIx-tSEI1)。
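The rule above can be sketched in Python as follows. This is a minimal illustration, assuming that the SEI display timestamps are integer tick counts and that their time scale is known in ticks per second; the parameter name `sei_time_scale_hz` and the function `display_times` are assumptions of this example, while `tmadjst` follows the name used in the text.

```python
RTP_CLOCK_HZ = 90_000
RTP_TS_MODULUS = 1 << 32


def tmadjst(delta_ticks, sei_time_scale_hz):
    """Rescale a difference of SEI timestamps to the 90 kHz RTP clock."""
    return delta_ticks * RTP_CLOCK_HZ // sei_time_scale_hz


def display_times(ts, sei_timestamps, sei_time_scale_hz):
    """Return the display time, in RTP ticks, of each SEI event.

    ts is the RTP timestamp of the access unit. The earliest SEI
    timestamp (tSEI1 in the text) maps to ts itself; later ones are
    offset by the rescaled difference.
    """
    t_first = min(sei_timestamps)
    return [
        (ts + tmadjst(t - t_first, sei_time_scale_hz)) % RTP_TS_MODULUS
        for t in sei_timestamps
    ]
```

With `ts = 900000` and SEI timestamps `[0, 1001, 2002]` on a 30000 Hz time scale, the sketch yields `[900000, 903003, 906006]`.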

Informative note: Displaying coded frames as fields is needed commonly in an operation known as 3:2 pulldown, in which film content that consists of coded frames is displayed on a display using interlaced scanning. The picture timing SEI message enables carriage of multiple timestamps for the same coded picture, and therefore the 3:2 pulldown process is perfectly controlled. The picture timing SEI message mechanism is necessary because only one timestamp per coded frame can be conveyed in the RTP timestamp.

资料性说明:在称为3:2下拉的操作中,通常需要将编码帧显示为字段,在该操作中,使用隔行扫描在显示器上显示由编码帧组成的胶片内容。图片定时SEI消息允许为同一编码图片传送多个时间戳,因此3:2下拉过程得到完美控制。图片定时SEI消息机制是必要的,因为RTP时间戳中每个编码帧只能传送一个时间戳。

Informative note: Because H.264 allows the decoding order to be different from the display order, values of RTP timestamps may not be monotonically non-decreasing as a function of RTP sequence numbers. Furthermore, the value for interarrival jitter reported in the RTCP reports may not be a trustworthy indication of the network performance, as the calculation rules for interarrival jitter (section 6.4.1 of RFC 3550) assume that the RTP timestamp of a packet is directly proportional to its transmission time.

资料性说明:由于H.264允许解码顺序不同于显示顺序,因此RTP时间戳的值作为RTP序列号的函数可能不是单调非递减的。此外,RTCP报告中报告的到达间抖动值可能不是网络性能的可靠指示,因为到达间抖动的计算规则(RFC 3550第6.4.1节)假设数据包的RTP时间戳与其传输时间成正比。

5.2. Common Structure of the RTP Payload Format
5.2. RTP有效负载格式的通用结构

The payload format defines three different basic payload structures. A receiver can identify the payload structure by the first byte of the RTP payload, which co-serves as the RTP payload header and, in some cases, as the first byte of the payload. This byte is always structured as a NAL unit header. The NAL unit type field indicates which structure is present. The possible structures are as follows:

有效载荷格式定义了三种不同的基本有效载荷结构。接收机可以通过RTP有效负载的第一个字节来识别有效负载结构,该字节共同充当RTP有效负载报头,并且在某些情况下充当有效负载的第一个字节。此字节始终被构造为NAL单元头。NAL单元类型字段指示存在的结构。可能的结构如下:

Single NAL Unit Packet: Contains only a single NAL unit in the payload. The NAL header type field will be equal to the original NAL unit type; i.e., in the range of 1 to 23, inclusive. Specified in section 5.6.

单个NAL单元数据包:在有效负载中仅包含单个NAL单元。NAL标头类型字段将等于原始NAL单元类型,即在1至23(含)的范围内。第5.6节中规定。

Aggregation packet: Packet type used to aggregate multiple NAL units into a single RTP payload. This packet exists in four versions, the Single-Time Aggregation Packet type A (STAP-A), the Single-Time Aggregation Packet type B (STAP-B), Multi-Time Aggregation Packet (MTAP) with 16-bit offset (MTAP16), and Multi-Time Aggregation Packet (MTAP) with 24-bit offset (MTAP24). The NAL unit type numbers assigned for STAP-A, STAP-B, MTAP16, and MTAP24 are 24, 25, 26, and 27, respectively. Specified in section 5.7.

聚合数据包:用于将多个NAL单元聚合为单个RTP有效负载的数据包类型。此数据包有四个版本,即单次聚合数据包类型A(STAP-A)、单次聚合数据包类型B(STAP-B)、具有16位偏移量的多时间聚合数据包(MTAP)(MTAP16)和具有24位偏移量的多时间聚合数据包(MTAP)(MTAP24)。为STAP-A、STAP-B、MTAP16和MTAP24分配的NAL单元类型号分别为24、25、26和27。第5.7节中规定。

Fragmentation unit: Used to fragment a single NAL unit over multiple RTP packets. Exists with two versions, FU-A and FU-B, identified with the NAL unit type numbers 28 and 29, respectively. Specified in section 5.8.

分段单元:用于在多个RTP数据包上对单个NAL单元进行分段。存在两个版本,FU-A和FU-B,分别用NAL装置类型编号28和29标识。第5.8节中规定。

Table 1. Summary of NAL unit types and their payload structures

表1。NAL装置类型及其有效载荷结构概述

      Type   Packet    Type name                        Section
      ---------------------------------------------------------
      0      undefined                                    -
      1-23   NAL unit  Single NAL unit packet per H.264   5.6
      24     STAP-A    Single-time aggregation packet     5.7.1
      25     STAP-B    Single-time aggregation packet     5.7.1
      26     MTAP16    Multi-time aggregation packet      5.7.2
      27     MTAP24    Multi-time aggregation packet      5.7.2
      28     FU-A      Fragmentation unit                 5.8
      29     FU-B      Fragmentation unit                 5.8
      30-31  undefined                                    -
        

Informative note: This specification does not limit the size of NAL units encapsulated in single NAL unit packets and fragmentation units. The maximum size of a NAL unit encapsulated in any aggregation packet is 65535 bytes.

资料性说明:本规范不限制封装在单个NAL单元数据包和碎片单元中的NAL单元的大小。封装在任何聚合数据包中的NAL单元的最大大小为65535字节。
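Table 1 can be read as a lookup on the low five bits of the first payload byte. The Python sketch below illustrates this; the function name `payload_structure` and the returned strings are assumptions chosen for this example.

```python
# Packet names for the NAL unit type values that this memo assigns.
_PACKET_NAMES = {
    24: "STAP-A",
    25: "STAP-B",
    26: "MTAP16",
    27: "MTAP24",
    28: "FU-A",
    29: "FU-B",
}


def payload_structure(rtp_payload: bytes) -> str:
    """Classify an RTP payload by the Type field of its first byte."""
    if not rtp_payload:
        raise ValueError("empty RTP payload")
    nal_type = rtp_payload[0] & 0x1F   # Type is the low 5 bits
    if 1 <= nal_type <= 23:
        return "single NAL unit packet"
    # Types 0, 30, and 31 are undefined in Table 1.
    return _PACKET_NAMES.get(nal_type, "undefined")
```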

5.3. NAL Unit Octet Usage
5.3. NAL单位八位字节用法

The structure and semantics of the NAL unit octet were introduced in section 1.3. For convenience, the format of the NAL unit type octet is reprinted below:

第1.3节介绍了NAL单位八位元的结构和语义。为方便起见,NAL单元类型八位字节的格式如下所示:

      +---------------+
      |0|1|2|3|4|5|6|7|
      +-+-+-+-+-+-+-+-+
      |F|NRI|  Type   |
      +---------------+
        

This section specifies the semantics of F and NRI according to this specification.

本节根据本规范规定了F和NRI的语义。

F: 1 bit
forbidden_zero_bit. A value of 0 indicates that the NAL unit type octet and payload should not contain bit errors or other syntax violations. A value of 1 indicates that the NAL unit type octet and payload may contain bit errors or other syntax violations.

F:1位
forbidden_zero_bit。值0表示NAL单元类型八位字节和有效负载不应包含位错误或其他语法冲突。值1表示NAL单元类型八位字节和有效负载可能包含位错误或其他语法冲突。

MANEs SHOULD set the F bit to indicate detected bit errors in the NAL unit. The H.264 specification requires that the F bit is equal to 0. When the F bit is set, the decoder is advised that bit errors or any other syntax violations may be present in the payload or in the NAL unit type octet. The simplest decoder reaction to a NAL unit in which the F bit is equal to 1 is to discard such a NAL unit and to conceal the lost data in the discarded NAL unit.

MANE应设置F位,以指示NAL单元中检测到的位错误。H.264规范要求F位等于0。当设置F位时,建议解码器在有效载荷或NAL单元类型八位字节中可能存在位错误或任何其他语法冲突。对于F位等于1的NAL单元,解码器最简单的反应是丢弃这样的NAL单元,并在丢弃的NAL单元中隐藏丢失的数据。

NRI: 2 bits
nal_ref_idc. The semantics of value 00 and a non-zero value remain unchanged from the H.264 specification. In other words, a value of 00 indicates that the content of the NAL unit is not used to reconstruct reference pictures for inter picture prediction. Such NAL units can be discarded without risking the integrity of the reference pictures. Values greater than 00 indicate that the decoding of the NAL unit is required to maintain the integrity of the reference pictures.

NRI:2位
nal_ref_idc。值00和非零值的语义与H.264规范保持不变。换言之,值00表示NAL单元的内容不用于重建用于画面间预测的参考画面。这样的NAL单元可以被丢弃,而不会危及参考图片的完整性。大于00的值表示需要对NAL单元进行解码以保持参考图片的完整性。

In addition to the specification above, according to this RTP payload specification, values of NRI greater than 00 indicate the relative transport priority, as determined by the encoder. MANEs can use this information to protect more important NAL units better than they do less important NAL units. The highest transport priority is 11, followed by 10, and then by 01; finally, 00 is the lowest.

除上述规范外,根据本RTP有效载荷规范,大于00的NRI值表示由编码器确定的相对传输优先级。MANE可以使用此信息对较重要的NAL单元提供比次要NAL单元更好的保护。最高传输优先级是11,其次是10,然后是01;最后,00是最低的。

Informative note: Any non-zero value of NRI is handled identically in H.264 decoders. Therefore, receivers need not manipulate the value of NRI when passing NAL units to the decoder.

资料性说明:NRI的任何非零值在H.264解码器中的处理方式相同。因此,当将NAL单元传递给解码器时,接收机不需要操纵NRI的值。
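The octet layout reprinted at the start of this section can be unpacked with a few shifts and masks. The Python sketch below is illustrative only; the names `parse_nal_header` and `build_nal_header` are assumptions of this example.

```python
def parse_nal_header(octet: int):
    """Split a NAL unit type octet into (F, NRI, Type)."""
    if not 0 <= octet <= 0xFF:
        raise ValueError("octet must fit in 8 bits")
    f = octet >> 7              # bit 0: forbidden_zero_bit
    nri = (octet >> 5) & 0x03   # bits 1-2: nal_ref_idc
    nal_type = octet & 0x1F     # bits 3-7: NAL unit type
    return f, nri, nal_type


def build_nal_header(f: int, nri: int, nal_type: int) -> int:
    """Assemble a NAL unit type octet from its three fields."""
    if f not in (0, 1) or not 0 <= nri <= 3 or not 0 <= nal_type <= 31:
        raise ValueError("field value out of range")
    return (f << 7) | (nri << 5) | nal_type
```

For instance, the octet 0x67 decodes to F = 0, NRI = 11 (binary), and Type = 7.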

An H.264 encoder MUST set the value of NRI according to the H.264 specification (subclause 7.4.1) when the value of nal_unit_type is in the range of 1 to 12, inclusive. In particular, the H.264 specification requires that the value of NRI SHALL be equal to 0 for all NAL units having nal_unit_type equal to 6, 9, 10, 11, or 12.

当nal_unit_type的值在1到12范围内(含1和12)时,H.264编码器必须根据H.264规范(第7.4.1款)设置NRI的值。特别是,H.264规范要求,对于nal_unit_type等于6、9、10、11或12的所有NAL单元,NRI的值应等于0。

For NAL units having nal_unit_type equal to 7 or 8 (indicating a sequence parameter set or a picture parameter set, respectively), an H.264 encoder SHOULD set the value of NRI to 11 (in binary format). For coded slice NAL units of a primary coded picture having nal_unit_type equal to 5 (indicating a coded slice belonging to an IDR picture), an H.264 encoder SHOULD set the value of NRI to 11 (in binary format).

对于nal_unit_type等于7或8(分别表示序列参数集或图片参数集)的NAL单元,H.264编码器应将NRI的值设置为11(二进制格式)。对于nal_unit_type等于5的主编码图片的编码片段NAL单元(表示属于IDR图片的编码片段),H.264编码器应将NRI的值设置为11(二进制格式)。

For a mapping of the remaining nal_unit_types to NRI values, the following example MAY be used and has been shown to be efficient in a certain environment [13]. Other mappings MAY also be desirable, depending on the application and the H.264/AVC Annex A profile in use.

对于其余nal_unit_type到NRI值的映射,可以使用以下示例,并且已经证明该示例在特定环境中是有效的[13]。根据所使用的应用和H.264/AVC附录A配置文件,也可能需要其他映射。

Informative note: Data Partitioning is not available in certain profiles; e.g., in the Main or Baseline profiles. Consequently, the nal unit types 2, 3, and 4 can occur only if the video bitstream conforms to a profile in which data partitioning is allowed and not in streams that conform to the Main or Baseline profiles.

资料性说明:数据分区在某些配置文件中不可用,例如在Main或Baseline配置文件中。因此,仅当视频比特流符合允许数据分区的配置文件时,才可能出现NAL单元类型2、3和4;符合Main或Baseline配置文件的流中不会出现这些类型。

Table 2. Example of NRI values for coded slices and coded slice data partitions of primary coded reference pictures

表2。主编码参考图片的编码切片和编码切片数据分区的NRI值示例

      NAL Unit Type     Content of NAL unit              NRI (binary)
      ----------------------------------------------------------------
       1              non-IDR coded slice                         10
       2              Coded slice data partition A                10
       3              Coded slice data partition B                01
       4              Coded slice data partition C                01
        

Informative note: As mentioned before, the NRI value of non-reference pictures is 00 as mandated by H.264/AVC.

资料性说明:如前所述,根据H.264/AVC的规定,非参考图片的NRI值为00。
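The requirements and recommendations for NAL unit types 1 to 12, together with the example mapping of Table 2, can be collected into one lookup. The Python sketch below is one possible reading, not a normative mapping; the function name `example_nri` and the `is_reference` flag are assumptions of this example, and other mappings may suit other applications.

```python
def example_nri(nal_unit_type: int, is_reference: bool = True) -> int:
    """Return an NRI value for primary coded pictures, types 1 to 12."""
    if nal_unit_type in (6, 9, 10, 11, 12):
        return 0b00             # required by the H.264 specification
    if nal_unit_type in (5, 7, 8):
        return 0b11             # IDR slices and parameter sets (SHOULD)
    if nal_unit_type in (1, 2, 3, 4):
        if not is_reference:
            return 0b00         # non-reference pictures, per H.264
        # Example values from Table 2 for reference pictures.
        return {1: 0b10, 2: 0b10, 3: 0b01, 4: 0b01}[nal_unit_type]
    raise ValueError("no NRI recommendation for this NAL unit type")
```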

An H.264 encoder SHOULD set the value of NRI for coded slice and coded slice data partition NAL units of redundant coded reference pictures equal to 01 (in binary format).

H.264编码器应将冗余编码参考图片的编码片段和编码片段数据分区NAL单元的NRI值设置为等于01(二进制格式)。

Definitions of the values for NRI for NAL unit types 24 to 29, inclusive, are given in sections 5.7 and 5.8 of this memo.

本备忘录第5.7节和第5.8节给出了24至29型NAL装置的NRI值定义。

No recommendation for the value of NRI is given for NAL units having nal_unit_type in the range of 13 to 23, inclusive, because these values are reserved for ITU-T and ISO/IEC. No recommendation for the value of NRI is given for NAL units having nal_unit_type equal to 0 or in the range of 30 to 31, inclusive, as the semantics of these values are not specified in this memo.

对于nal_unit_type在13到23(含)范围内的NAL单元,没有给出NRI值的建议,因为这些值是为ITU-T和ISO/IEC保留的。对于nal_unit_type等于0或在30到31(含)范围内的NAL单元,未给出NRI值的建议,因为本备忘录中未规定这些值的语义。

5.4. Packetization Modes
5.4. 打包方式

This memo specifies three cases of packetization modes:

本备忘录规定了三种打包模式:

o Single NAL unit mode
o Non-interleaved mode
o Interleaved mode

o 单NAL单元模式
o 非交织模式
o 交织模式

The single NAL unit mode is targeted for conversational systems that comply with ITU-T Recommendation H.241 [15] (see section 12.1). The non-interleaved mode is targeted for conversational systems that may not comply with ITU-T Recommendation H.241. In the non-interleaved mode, NAL units are transmitted in NAL unit decoding order. The interleaved mode is targeted for systems that do not require very low end-to-end latency. The interleaved mode allows transmission of NAL units out of NAL unit decoding order.

单NAL单元模式适用于符合ITU-T建议H.241[15](见第12.1节)的对话系统。非交织模式针对可能不符合ITU-T建议H.241的会话系统。在非交织模式中,以NAL单元解码顺序发送NAL单元。交织模式的目标是不需要非常低的端到端延迟的系统。交织模式允许不按NAL单元解码顺序传输NAL单元。

The packetization mode in use MAY be signaled by the value of the OPTIONAL packetization-mode MIME parameter or by external means. The used packetization mode governs which NAL unit types are allowed in RTP payloads. Table 3 summarizes the allowed NAL unit types for each packetization mode. Some NAL unit type values (indicated as undefined in Table 3) are reserved for future extensions. NAL units of those types SHOULD NOT be sent by a sender and MUST be ignored by a receiver. For example, the Types 1-23, with the associated packet type "NAL unit", are allowed in "Single NAL Unit Mode" and in "Non-Interleaved Mode", but disallowed in "Interleaved Mode". Packetization modes are explained in more detail in section 6.

正在使用的打包模式可以通过可选打包模式MIME参数的值或通过外部方式发出信号。使用的打包模式控制RTP有效负载中允许的NAL单元类型。表3总结了每个打包模式允许的NAL单元类型。一些NAL单元类型值(表3中未定义)保留供将来扩展。这些类型的NAL单元不应由发送方发送,接收方必须忽略。例如,在“单NAL单元模式”和“非交织模式”中允许具有相关分组类型“NAL单元”的类型1-23,但在“交织模式”中不允许。第6节将更详细地解释打包模式。

Table 3. Summary of allowed NAL unit types for each packetization mode (yes = allowed, no = disallowed, ig = ignore)

表3。每个打包模式允许的NAL单元类型汇总(是=允许,否=不允许,ig=忽略)

      Type   Packet    Single NAL    Non-Interleaved    Interleaved
                       Unit Mode           Mode             Mode
      -------------------------------------------------------------

      0      undefined     ig               ig               ig
      1-23   NAL unit     yes              yes               no
      24     STAP-A        no              yes               no
      25     STAP-B        no               no              yes
      26     MTAP16        no               no              yes
      27     MTAP24        no               no              yes
      28     FU-A          no              yes              yes
      29     FU-B          no               no              yes
      30-31  undefined     ig               ig               ig
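A receiver or sender could encode Table 3 as a small lookup, for example to validate incoming payloads against the negotiated packetization mode. The Python sketch below is illustrative; the mode strings and the name `packet_policy` are assumptions of this example.

```python
# Rows of Table 3: (first type, last type, single NAL unit mode,
# non-interleaved mode, interleaved mode).
_TABLE_3 = [
    (0, 0, "ig", "ig", "ig"),
    (1, 23, "yes", "yes", "no"),
    (24, 24, "no", "yes", "no"),
    (25, 25, "no", "no", "yes"),
    (26, 26, "no", "no", "yes"),
    (27, 27, "no", "no", "yes"),
    (28, 28, "no", "yes", "yes"),
    (29, 29, "no", "no", "yes"),
    (30, 31, "ig", "ig", "ig"),
]
_MODE_COLUMN = {"single": 2, "non-interleaved": 3, "interleaved": 4}


def packet_policy(nal_type: int, mode: str) -> str:
    """Return "yes", "no", or "ig" for a type in a packetization mode."""
    column = _MODE_COLUMN[mode]
    for row in _TABLE_3:
        if row[0] <= nal_type <= row[1]:
            return row[column]
    raise ValueError("NAL unit type must be in the range 0 to 31")
```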

5.5. Decoding Order Number (DON)
5.5. 解码顺序号(DON)

In the interleaved packetization mode, the transmission order of NAL units is allowed to differ from the decoding order of the NAL units. Decoding order number (DON) is a field in the payload structure or a derived variable that indicates the NAL unit decoding order. Rationale and examples of use cases for transmission out of decoding order and for the use of DON are given in section 13.

在交织分组模式中,允许NAL单元的传输顺序不同于NAL单元的解码顺序。解码顺序号(DON)是有效负载结构中的一个字段,或指示NAL单元解码顺序的派生变量。第13节给出了非解码顺序传输和DON使用的基本原理和用例示例。

The coupling of transmission and decoding order is controlled by the OPTIONAL sprop-interleaving-depth MIME parameter as follows. When the value of the OPTIONAL sprop-interleaving-depth MIME parameter is equal to 0 (explicitly or per default) or transmission of NAL units out of their decoding order is disallowed by external means, the transmission order of NAL units MUST conform to the NAL unit decoding order. When the value of the OPTIONAL sprop-interleaving-depth MIME parameter is greater than 0 or transmission of NAL units out of their decoding order is allowed by external means,

传输和解码顺序的耦合由可选的sprop交错深度MIME参数控制,如下所示。当可选sprop interleaving depth MIME参数的值等于0(显式或默认值)或外部方式不允许传输超出其解码顺序的NAL单元时,NAL单元的传输顺序必须符合NAL单元解码顺序。当可选sprop交错深度MIME参数的值大于0或允许通过外部方式传输超出其解码顺序的NAL单元时,

o the order of NAL units in an MTAP16 and an MTAP24 is NOT REQUIRED to be the NAL unit decoding order, and

o MTAP16和MTAP24中NAL单元的顺序不要求是NAL单元解码顺序,并且

o the order of NAL units generated by decapsulating STAP-Bs, MTAPs, and FUs in two consecutive packets is NOT REQUIRED to be the NAL unit decoding order.

o 通过在两个连续数据包中解封装STAP Bs、MTAP和FU而生成的NAL单元的顺序不需要是NAL单元解码顺序。

The RTP payload structures for a single NAL unit packet, an STAP-A, and an FU-A do not include DON. STAP-B and FU-B structures include DON, and the structure of MTAPs enables derivation of DON as specified in section 5.7.2.

单个NAL单元分组、STAP-a和FU-a的RTP有效载荷结构不包括DON。STAP-B和FU-B结构包括DON,MTAP的结构使DON的推导符合第5.7.2节的规定。

Informative note: When an FU-A occurs in interleaved mode, it always follows an FU-B, which sets its DON.

资料性说明:当FU-A以交错模式出现时,它总是跟随FU-B,后者设置其DON。

Informative note: If a transmitter wants to encapsulate a single NAL unit per packet and transmit packets out of their decoding order, STAP-B packet type can be used.

资料性说明:如果发送器希望每个数据包封装一个NAL单元,并且不按解码顺序发送数据包,则可以使用STAP-B数据包类型。

In the single NAL unit packetization mode, the transmission order of NAL units, determined by the RTP sequence number, MUST be the same as their NAL unit decoding order. In the non-interleaved packetization mode, the transmission order of NAL units in single NAL unit packets, STAP-As, and FU-As MUST be the same as their NAL unit decoding order. The NAL units within an STAP MUST appear in the NAL unit decoding order. Thus, the decoding order is first provided through the implicit order within a STAP, and second provided through the RTP sequence number for the order between STAPs, FUs, and single NAL unit packets.

在单NAL单元分组模式中,由RTP序列号确定的NAL单元的传输顺序必须与其NAL单元解码顺序相同。在非交织分组模式中,单个NAL单元分组、STAP As和FU As中NAL单元的传输顺序必须与其NAL单元解码顺序相同。STAP中的NAL单元必须以NAL单元解码顺序出现。因此,解码顺序首先通过STAP内的隐式顺序提供,其次通过RTP序列号提供STAP、FUs和单个NAL单元分组之间的顺序。

Signaling of the value of DON for NAL units carried in STAP-B, MTAP, and a series of fragmentation units starting with an FU-B is specified in sections 5.7.1, 5.7.2, and 5.8, respectively. The DON value of the first NAL unit in transmission order MAY be set to any value. Values of DON are in the range of 0 to 65535, inclusive. After reaching the maximum value, the value of DON wraps around to 0.

第5.7.1节、第5.7.2节和第5.8节分别规定了STAP-B、MTAP和以FU-B开头的一系列碎片单元中NAL单元的DON值的信令。传输顺序中的第一NAL单元的DON值可以设置为任何值。DON的值在0到65535之间(含0到65535)。达到最大值后,DON的值将变为0。

The decoding order of two NAL units contained in any STAP-B, MTAP, or a series of fragmentation units starting with an FU-B is determined as follows. Let DON(i) be the decoding order number of the NAL unit having index i in the transmission order. Function don_diff(m,n) is specified as follows:

包含在任何STAP-B、MTAP或以FU-B开头的一系列分段单元中的两个NAL单元的解码顺序确定如下。假设DON(i)是在传输顺序中具有索引i的NAL单元的解码顺序号。函数don_diff(m,n)指定如下:

      If DON(m) == DON(n), don_diff(m,n) = 0

      If (DON(m) < DON(n) and DON(n) - DON(m) < 32768),
      don_diff(m,n) = DON(n) - DON(m)

      If (DON(m) > DON(n) and DON(m) - DON(n) >= 32768),
      don_diff(m,n) = 65536 - DON(m) + DON(n)

      If (DON(m) < DON(n) and DON(n) - DON(m) >= 32768),
      don_diff(m,n) = - (DON(m) + 65536 - DON(n))

      If (DON(m) > DON(n) and DON(m) - DON(n) < 32768),
      don_diff(m,n) = - (DON(m) - DON(n))

A positive value of don_diff(m,n) indicates that the NAL unit having transmission order index n follows, in decoding order, the NAL unit having transmission order index m. When don_diff(m,n) is equal to 0, then the NAL unit decoding order of the two NAL units can be in either order. A negative value of don_diff(m,n) indicates that the NAL unit having transmission order index n precedes, in decoding order, the NAL unit having transmission order index m.

don_diff(m,n)的正值表示具有传输顺序索引n的NAL单元以解码顺序跟随具有传输顺序索引m的NAL单元。当don_diff(m,n)等于0时,两个NAL单元的NAL单元解码顺序可以是任意顺序。don_diff(m,n)的负值表示具有传输顺序索引n的NAL单元以解码顺序先于具有传输顺序索引m的NAL单元。
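The five cases of don_diff translate directly into code. The Python sketch below takes the two DON values as arguments rather than transmission order indices, which is a simplification assumed for this example.

```python
def don_diff(don_m: int, don_n: int) -> int:
    """Signed distance in decoding order from DON(m) to DON(n).

    Both arguments are 16-bit DON values in the range 0 to 65535.
    """
    if don_m == don_n:
        return 0
    if don_m < don_n:
        if don_n - don_m < 32768:
            return don_n - don_m              # n follows m, no wrap
        return -(don_m + 65536 - don_n)       # n precedes m across a wrap
    # Remaining cases have don_m > don_n.
    if don_m - don_n >= 32768:
        return 65536 - don_m + don_n          # n follows m across a wrap
    return -(don_m - don_n)                   # n precedes m, no wrap
```

For example, `don_diff(65530, 4)` evaluates to 10, because the DON wrapped around between the two NAL units, whereas `don_diff(4, 65530)` evaluates to -10.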

Values of DON related fields (DON, DONB, and DOND; see section 5.7) MUST be such that the decoding order determined by the values of DON, as specified above, conforms to the NAL unit decoding order. If the order of two NAL units in NAL unit decoding order is switched and the new order does not conform to the NAL unit decoding order, the NAL units MUST NOT have the same value of DON. If the order of two consecutive NAL units in the NAL unit stream is switched and the new order still conforms to the NAL unit decoding order, the NAL units MAY have the same value of DON. For example, when arbitrary slice order is allowed by the video coding profile in use, all the coded slice NAL units of a coded picture are allowed to have the same value of DON. Consequently, NAL units having the same value of DON can be decoded in any order, and two NAL units having a different value of DON should be passed to the decoder in the order specified above. When two consecutive NAL units in the NAL unit decoding order have a different value of DON, the value of DON for the second NAL unit in decoding order SHOULD be the value of DON for the first, incremented by one.

DON相关字段(DON、DONB和DOND;见第5.7节)的值必须确保由上述DON值确定的解码顺序符合NAL单元解码顺序。如果切换了NAL单元解码顺序中两个NAL单元的顺序,并且新的顺序不符合NAL单元解码顺序,则NAL单元不得具有相同的DON值。如果切换了NAL单元流中的两个连续NAL单元的顺序,并且新的顺序仍然符合NAL单元解码顺序,则NAL单元可以具有相同的DON值。例如,当使用中的视频编码简档允许任意片段顺序时,允许编码图片的所有编码片段NAL单元具有相同的DON值。因此,具有相同DON值的NAL单元可以以任何顺序被解码,并且具有不同DON值的两个NAL单元应当以上面指定的顺序被传递给解码器。当NAL单元解码顺序中的两个连续NAL单元具有不同的DON值时,解码顺序中的第二个NAL单元的DON值应为第一个NAL单元的DON值,递增1。

An example of the decapsulation process to recover the NAL unit decoding order is given in section 7.

第7节给出了恢复NAL单元解码顺序的去封装过程的示例。

Informative note: Receivers should not expect that the absolute difference of values of DON for two consecutive NAL units in the NAL unit decoding order will be equal to one, even in error-free transmission. An increment by one is not required, as at the time of associating values of DON to NAL units, it may not be known whether all NAL units are delivered to the receiver. For example, a gateway may not forward coded slice NAL units of non-reference pictures or SEI NAL units when there is a shortage of bit rate in the network to which the packets are forwarded. In another example, a live broadcast is interrupted by pre-encoded content, such as commercials, from time to time. The first intra picture of a pre-encoded clip is transmitted in advance to ensure that it is readily available in the receiver. When transmitting the first intra picture, the originator does not exactly know how many NAL units will be encoded before the first intra picture of the pre-encoded clip follows in decoding order. Thus, the values of DON for the NAL units of the first intra picture of the pre-encoded clip have to be estimated when they are transmitted, and gaps in values of DON may occur.

资料性说明:即使在无差错传输中,接收机也不应期望NAL单元解码顺序中两个连续NAL单元的DON值的绝对差值等于1。不需要增加1,因为在将DON的值与NAL单元相关联时,可能不知道是否所有NAL单元都交付给接收机。例如,当分组转发到的网络中的比特率不足时,网关可以不转发非参考图片或序列单元的编码片段。在另一示例中,实时广播不时地被预编码的内容(例如商业广告)中断。预先发送预编码片段的第一帧内图片,以确保其在接收机中随时可用。当发送第一帧内图片时,发起者不确切地知道在预编码片段的第一帧内图片以解码顺序跟随之前将编码多少NAL单元。因此,当传输预编码片段的第一帧内图片的NAL单元时,必须估计它们的DON值,并且DON值中可能出现间隙。

5.6. Single NAL Unit Packet
5.6. 单NAL单元数据包

The single NAL unit packet defined here MUST contain only one NAL unit, of the types defined in [1]. This means that neither an aggregation packet nor a fragmentation unit can be used within a single NAL unit packet. A NAL unit stream composed by decapsulating single NAL unit packets in RTP sequence number order MUST conform to the NAL unit decoding order. The structure of the single NAL unit packet is shown in Figure 2.

此处定义的单个NAL单元数据包必须仅包含[1]中定义的类型中的一个NAL单元。这意味着在单个NAL单元分组中既不能使用聚合分组也不能使用分段单元。由以RTP序列号顺序对单个NAL单元数据包进行去封装而构成的NAL单元流必须符合NAL单元解码顺序。单个NAL单元数据包的结构如图2所示。

Informative note: The first byte of a NAL unit co-serves as the RTP payload header.

资料性说明:NAL单元的第一个字节同时用作RTP有效负载报头。

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |F|NRI|  type   |                                               |
      +-+-+-+-+-+-+-+-+                                               |
      |                                                               |
      |               Bytes 2..n of a Single NAL unit                 |
      |                                                               |
      |                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                               :...OPTIONAL RTP padding        |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        

Figure 2. RTP payload format for single NAL unit packet

图2。单个NAL单元数据包的RTP有效负载格式
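Because the NAL unit header doubles as the payload header, forming a single NAL unit packet amounts to copying the NAL unit into the RTP payload after a type check. A minimal Python sketch follows; the function name `single_nal_unit_payload` is an assumption of this example, and RTP header construction and padding are left out.

```python
def single_nal_unit_payload(nal_unit: bytes) -> bytes:
    """Return the RTP payload of a single NAL unit packet.

    The payload is the NAL unit itself; no extra header is added.
    """
    if not nal_unit:
        raise ValueError("empty NAL unit")
    nal_type = nal_unit[0] & 0x1F
    if not 1 <= nal_type <= 23:
        # Types 24 to 29 are aggregation packets and fragmentation
        # units, which cannot appear in a single NAL unit packet.
        raise ValueError("NAL unit type must be in the range 1 to 23")
    return bytes(nal_unit)
```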

5.7. Aggregation Packets
5.7. 聚合数据包

Aggregation packets are the NAL unit aggregation scheme of this payload specification. The scheme is introduced to reflect the dramatically different MTU sizes of two key target networks: wireline IP networks (with an MTU size that is often limited by the Ethernet MTU size; roughly 1500 bytes), and IP or non-IP (e.g., ITU-T H.324/M) based wireless communication systems with preferred transmission unit sizes of 254 bytes or less. To prevent media transcoding between the two worlds, and to avoid undesirable packetization overhead, a NAL unit aggregation scheme is introduced.

聚合数据包是本有效负载规范的NAL单元聚合方案。引入该方案是为了反映两个关键目标网络的显著不同的MTU大小:有线IP网络(MTU大小通常受到以太网MTU大小的限制;大约1500字节)和基于IP或非IP(例如,ITU-T H.324/M)的无线通信系统,首选传输单元大小为254字节或更小。为了防止两个世界之间的媒体转码,并避免不必要的分组开销,引入了NAL单元聚合方案。

Two types of aggregation packets are defined by this specification:

本规范定义了两种类型的聚合数据包:

o Single-time aggregation packet (STAP): aggregates NAL units with identical NALU-time. Two types of STAPs are defined, one without DON (STAP-A) and another including DON (STAP-B).

o 单时间聚合数据包(STAP):聚合具有相同NALU时间的NAL单元。定义了两种类型的STAP,一种没有DON(STAP-A),另一种包括DON(STAP-B)。

o Multi-time aggregation packet (MTAP): aggregates NAL units with potentially differing NALU-time. Two different MTAPs are defined, differing in the length of the NAL unit timestamp offset.

o 多时间聚合数据包(MTAP):聚合具有潜在不同NALU时间的NAL单元。定义了两个不同的MTAP,其NAL单位时间戳偏移量的长度不同。

The term NALU-time is defined as the value that the RTP timestamp would have if that NAL unit would be transported in its own RTP packet.

术语NALU时间定义为如果NAL单元将在其自己的RTP数据包中传输,RTP时间戳将具有的值。

Each NAL unit to be carried in an aggregation packet is encapsulated in an aggregation unit. Please see below for the four different aggregation units and their characteristics.

要在聚合分组中携带的每个NAL单元被封装在聚合单元中。请参见下面的四个不同聚合单元及其特征。

The structure of the RTP payload format for aggregation packets is presented in Figure 3.

聚合数据包的RTP有效负载格式的结构如图3所示。

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |F|NRI|  type   |                                               |
      +-+-+-+-+-+-+-+-+                                               |
      |                                                               |
      |             one or more aggregation units                     |
      |                                                               |
      |                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                               :...OPTIONAL RTP padding        |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        

Figure 3. RTP payload format for aggregation packets

图3。聚合数据包的RTP有效负载格式

MTAPs and STAPs share the following packetization rules: The RTP timestamp MUST be set to the earliest of the NALU times of all the NAL units to be aggregated. The type field of the NAL unit type octet MUST be set to the appropriate value, as indicated in Table 4. The F bit MUST be cleared if all F bits of the aggregated NAL units are zero; otherwise, it MUST be set. The value of NRI MUST be the maximum of all the NAL units carried in the aggregation packet.

MTAP和STAP共享以下打包规则:RTP时间戳必须设置为要聚合的所有NAL单元的最早NALU时间。NAL单元类型八位字节的类型字段必须设置为适当的值,如表4所示。如果聚合NAL单元的所有F位为零,则必须清除F位;否则,必须设置它。NRI的值必须是聚合数据包中携带的所有NAL单元的最大值。

Table 4. Type field for STAPs and MTAPs

表4。STAP和MTAP的类型字段

      Type   Packet    Timestamp offset   DON related fields
                       field length       (DON, DONB, DOND)
                       (in bits)          present
      --------------------------------------------------------
      24     STAP-A       0                 no
      25     STAP-B       0                 yes
      26     MTAP16      16                 yes
      27     MTAP24      24                 yes
        

The marker bit in the RTP header is set to the value that the marker bit of the last NAL unit of the aggregation packet would have if it were transported in its own RTP packet.

The payload of an aggregation packet consists of one or more aggregation units. See sections 5.7.1 and 5.7.2 for the four different types of aggregation units. An aggregation packet can carry as many aggregation units as necessary; however, the total amount of data in an aggregation packet obviously MUST fit into an IP packet, and the size SHOULD be chosen so that the resulting IP packet is smaller than the MTU size. An aggregation packet MUST NOT contain fragmentation units specified in section 5.8. Aggregation packets MUST NOT be nested; i.e., an aggregation packet MUST NOT contain another aggregation packet.
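
A sender might check these constraints before aggregating, along the lines of the following Python sketch. The function names are hypothetical. The type values 28 and 29 for FU-A and FU-B come from the NAL unit type table elsewhere in this memo, and the default of 40 octets of lower-layer headers (IPv4 + UDP + RTP without options or CSRC entries) is an assumption made for illustration.

```python
AGGREGATION_TYPES = {24, 25, 26, 27}  # STAP-A, STAP-B, MTAP16, MTAP24
FRAGMENTATION_TYPES = {28, 29}        # FU-A, FU-B (section 5.8)


def can_be_aggregated(nal_unit):
    """True if this NAL unit may be placed inside an aggregation packet."""
    if not nal_unit:
        raise ValueError("empty NAL unit")
    nal_type = nal_unit[0] & 0x1F
    # No nesting of aggregation packets and no fragmentation units inside.
    return nal_type not in AGGREGATION_TYPES | FRAGMENTATION_TYPES


def fits_in_mtu(rtp_payload_len, mtu=1500, lower_layer_headers=40):
    """True if the resulting IP packet is smaller than the MTU.

    The strict comparison follows the wording of the text above.
    """
    return lower_layer_headers + rtp_payload_len < mtu
```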

5.7.1. Single-Time Aggregation Packet

A single-time aggregation packet (STAP) SHOULD be used whenever the NAL units to be aggregated all share the same NALU-time. The payload of an STAP-A does not include DON and consists of at least one single-time aggregation unit, as presented in Figure 4. The payload of an STAP-B consists of a 16-bit unsigned decoding order number (DON) (in network byte order) followed by at least one single-time aggregation unit, as presented in Figure 5.

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                      :                                               |
      +-+-+-+-+-+-+-+-+                                               |
      |                                                               |
      |                single-time aggregation units                  |
      |                                                               |
      |                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                               :
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        

Figure 4. Payload format for STAP-A

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                      :  decoding order number (DON)  |               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+               |
      |                                                               |
      |                single-time aggregation units                  |
      |                                                               |
      |                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                               :
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        

Figure 5. Payload format for STAP-B

The DON field specifies the value of DON for the first NAL unit in an STAP-B in transmission order. For each successive NAL unit in appearance order in an STAP-B, the value of DON is equal to (the value of DON of the previous NAL unit in the STAP-B + 1) % 65536, in which '%' stands for the modulo operation.
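
As an illustration, a receiver could derive the DON of every NAL unit in an STAP-B from the single DON field as sketched below. The function name stap_b_dons is hypothetical.

```python
def stap_b_dons(first_don, nal_unit_count):
    """Return the DON of each NAL unit in an STAP-B, in appearance order."""
    if not 0 <= first_don <= 0xFFFF:
        raise ValueError("DON is a 16-bit unsigned value")
    # Each successive NAL unit has the previous DON plus one, modulo 65536.
    return [(first_don + i) % 65536 for i in range(nal_unit_count)]


dons = stap_b_dons(65534, 4)
```

Here the DON field is 65534 and the packet carries four NAL units, so the values wrap around to 65534, 65535, 0, 1.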

A single-time aggregation unit consists of 16-bit unsigned size information (in network byte order) that indicates the size of the following NAL unit in bytes (excluding these two octets, but including the NAL unit type octet of the NAL unit), followed by the NAL unit itself, including its NAL unit type byte. A single-time aggregation unit is byte aligned within the RTP payload, but it is not necessarily aligned on a 32-bit word boundary. Figure 6 presents the structure of the single-time aggregation unit.
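
The following Python sketch shows one way to serialize and parse a sequence of single-time aggregation units. The function names are hypothetical, and the parser assumes that the aggregation packet header octet, any DON field, and any RTP padding have already been removed from the data it receives.

```python
import struct


def pack_single_time_units(nal_units):
    """Concatenate NAL units, each preceded by its 16-bit size."""
    out = bytearray()
    for nal in nal_units:
        if not 1 <= len(nal) <= 0xFFFF:
            raise ValueError("NAL unit size must fit in 16 bits")
        out += struct.pack("!H", len(nal))  # size in network byte order
        out += nal                          # NAL unit, header octet included
    return bytes(out)


def parse_single_time_units(data):
    """Split a run of single-time aggregation units into NAL units."""
    units = []
    pos = 0
    while pos < len(data):
        if pos + 2 > len(data):
            raise ValueError("truncated NAL unit size field")
        (size,) = struct.unpack_from("!H", data, pos)
        pos += 2
        if pos + size > len(data):
            raise ValueError("NAL unit size exceeds remaining payload")
        units.append(bytes(data[pos:pos + size]))
        pos += size
    return units


packed = pack_single_time_units([b"\x67\xaa", b"\x68"])
```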

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                      :        NAL unit size          |               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+               |
      |                                                               |
      |                           NAL unit                            |
      |                                                               |
      |                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                               :
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        

Figure 6. Structure for single-time aggregation unit

Figure 7 presents an example of an RTP packet that contains an STAP-A. The STAP contains two single-time aggregation units, labeled as 1 and 2 in the figure.

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                          RTP Header                           |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |STAP-A NAL HDR |         NALU 1 Size           | NALU 1 HDR    |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                         NALU 1 Data                           |
      :                                                               :
      +               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |               | NALU 2 Size                   | NALU 2 HDR    |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                         NALU 2 Data                           |
      :                                                               :
      |                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                               :...OPTIONAL RTP padding        |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        

Figure 7. An example of an RTP packet including an STAP-A and two single-time aggregation units

Figure 8 presents an example of an RTP packet that contains an STAP-B. The STAP contains two single-time aggregation units, labeled as 1 and 2 in the figure.

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                          RTP Header                           |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |STAP-B NAL HDR | DON                           | NALU 1 Size   |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      | NALU 1 Size   | NALU 1 HDR    | NALU 1 Data                   |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               +
      :                                                               :
      +               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |               | NALU 2 Size                   | NALU 2 HDR    |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                       NALU 2 Data                             |
      :                                                               :
      |                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                               :...OPTIONAL RTP padding        |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        

Figure 8. An example of an RTP packet including an STAP-B and two single-time aggregation units

5.7.2. Multi-Time Aggregation Packets (MTAPs)

The NAL unit payload of MTAPs consists of a 16-bit unsigned decoding order number base (DONB) (in network byte order) and one or more multi-time aggregation units, as presented in Figure 9. DONB MUST contain the value of DON for the first NAL unit in the NAL unit decoding order among the NAL units of the MTAP.

Informative note: The first NAL unit in the NAL unit decoding order is not necessarily the first NAL unit in the order in which the NAL units are encapsulated in an MTAP.

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                      :  decoding order number base   |               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+               |
      |                                                               |
      |                 multi-time aggregation units                  |
      |                                                               |
      |                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                               :
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        

Figure 9. NAL unit payload format for MTAPs

Two different multi-time aggregation units are defined in this specification. Both of them consist of 16-bit unsigned size information of the following NAL unit (in network byte order), an 8-bit unsigned decoding order number difference (DOND), and n bits (in network byte order) of timestamp offset (TS offset) for this NAL unit, whereby n can be 16 or 24. The choice between the different MTAP types (MTAP16 and MTAP24) is application dependent: the larger the timestamp offset field is, the higher the flexibility of the MTAP, but the overhead is also higher.
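
A single multi-time aggregation unit could be serialized as in the Python sketch below. The function name pack_mtap_unit and its parameters are hypothetical. The sketch also assumes that the size field counts only the octets of the NAL unit itself, which is one reading of "size information of the following NAL unit".

```python
import struct


def pack_mtap_unit(nal_unit, dond, ts_offset, ts_offset_bits):
    """Serialize one multi-time aggregation unit (MTAP16 or MTAP24)."""
    if ts_offset_bits not in (16, 24):
        raise ValueError("TS offset is 16 bits (MTAP16) or 24 bits (MTAP24)")
    if not 0 <= dond <= 0xFF:
        raise ValueError("DOND must fit in 8 bits")
    if not 0 <= ts_offset < (1 << ts_offset_bits):
        raise ValueError("TS offset does not fit in the field")
    if len(nal_unit) > 0xFFFF:
        raise ValueError("NAL unit too large for the 16-bit size field")
    return (
        struct.pack("!HB", len(nal_unit), dond)           # size, DOND
        + ts_offset.to_bytes(ts_offset_bits // 8, "big")  # TS offset
        + bytes(nal_unit)                                 # NAL unit
    )


unit16 = pack_mtap_unit(b"\x41\x9a", 1, 3000, 16)
unit24 = pack_mtap_unit(b"\x41\x9a", 1, 3000, 24)
```

The MTAP24 unit is one octet longer than the MTAP16 unit for the same NAL unit, which is the per-unit overhead mentioned above.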

The structures of the multi-time aggregation units for MTAP16 and MTAP24 are presented in Figures 10 and 11, respectively. The starting or ending position of an aggregation unit within a packet is NOT REQUIRED to be on a 32-bit word boundary. The DON of the following NAL unit is equal to (DONB + DOND) % 65536, in which % denotes the modulo operation. This memo does not specify how the NAL units within an MTAP are ordered, but, in most cases, NAL unit decoding order SHOULD be used.
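
The relation between DONB, DOND, and DON can be illustrated with the Python sketch below. Both function names are hypothetical. The sender-side helper assumes that the DON values are supplied in NAL unit decoding order, so that the first entry is the one DONB has to carry.

```python
def mtap_don(donb, dond):
    """Receiver side: recover the DON of a NAL unit in an MTAP."""
    return (donb + dond) % 65536


def mtap_don_fields(dons_in_decoding_order):
    """Sender side: derive DONB and the per-unit DOND values."""
    donb = dons_in_decoding_order[0]
    donds = []
    for don in dons_in_decoding_order:
        dond = (don - donb) % 65536
        if dond > 0xFF:
            raise ValueError("DOND does not fit in 8 bits")
        donds.append(dond)
    return donb, donds


donb, donds = mtap_don_fields([65534, 65535, 1])
```

In this example DONB is 65534 and the DOND values are 0, 1, and 3; applying mtap_don to the last pair gives back the DON value 1 despite the wraparound.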

The timestamp offset field MUST be set to a value equal to the value of the following formula: If the NALU-time is larger than or equal to the RTP timestamp of the packet, then the timestamp offset equals (the NALU-time of the NAL unit - the RTP timestamp of the packet). If the NALU-time is smaller than the RTP timestamp of the packet, then the timestamp offset is equal to the NALU-time + (2^32 - the RTP timestamp of the packet).
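
The two cases of the formula can be written out as in the following Python sketch. The function name timestamp_offset is hypothetical, and both arguments are assumed to be 32-bit unsigned RTP timestamp values.

```python
def timestamp_offset(nalu_time, rtp_timestamp):
    """Timestamp offset of a NAL unit relative to the packet's RTP timestamp."""
    if nalu_time >= rtp_timestamp:
        return nalu_time - rtp_timestamp
    # The NALU-time has wrapped around the 32-bit timestamp space.
    return nalu_time + (2**32 - rtp_timestamp)


plain = timestamp_offset(5000, 2000)
wrapped = timestamp_offset(100, 4294967290)
```

Both cases give the same result as (NALU-time - RTP timestamp) modulo 2^32. The sender still has to verify that the result fits into the 16-bit or 24-bit field of the chosen MTAP type.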

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      :        NAL unit size          |      DOND     |  TS offset    |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |  TS offset    |                                               |
      +-+-+-+-+-+-+-+-+              NAL unit                         |
      |                                                               |
      |                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                               :
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        

Figure 10. Multi-time aggregation unit for MTAP16

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      :        NAL unit size          |      DOND     |  TS offset    |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |         TS offset             |                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               |
      |                              NAL unit                         |
      |                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                               :
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        

Figure 11. Multi-time aggregation unit for MTAP24

For the "earliest" multi-time aggregation unit in an MTAP, the timestamp offset MUST be zero. Hence, the RTP timestamp of the MTAP itself is identical to the earliest NALU-time.

Informative note: The "earliest" multi-time aggregation unit is the one that would have the smallest extended RTP timestamp among all the aggregation units of an MTAP if the aggregation units were encapsulated in single NAL unit packets. An extended timestamp is a timestamp that has more than 32 bits and is capable of counting the wraparound of the timestamp field, thus enabling one to determine the smallest value if the timestamp wraps. Such an "earliest" aggregation unit may not be the first one in the order in which the aggregation units are encapsulated in an MTAP. The "earliest" NAL unit need not be the same as the first NAL unit in the NAL unit decoding order either.
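
The role of the extended timestamp can be illustrated with the Python sketch below. The function name earliest_unit is hypothetical, and the sketch assumes the sender already tracks an extended (wider than 32 bits) NALU-time for each aggregation unit.

```python
def earliest_unit(extended_times):
    """Find the "earliest" unit and derive the MTAP timestamp fields.

    extended_times: extended NALU-time of each aggregation unit, in the
    order in which the units are encapsulated.
    Returns (index of earliest unit, 32-bit RTP timestamp, TS offsets).
    """
    earliest = min(extended_times)
    index = extended_times.index(earliest)
    rtp_timestamp = earliest % 2**32
    offsets = [t - earliest for t in extended_times]
    return index, rtp_timestamp, offsets


times = [2**32 + 100, 2**32 - 200, 2**32 + 3100]
index, rtp_timestamp, offsets = earliest_unit(times)
```

The 32-bit values of these three times are 100, 4294967096, and 3100. Comparing the 32-bit values directly would pick the first unit, whereas the extended values show that the second unit is the earliest and therefore carries a timestamp offset of zero.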

Figure 12 presents an example of an RTP packet that contains a multi-time aggregation packet of type MTAP16 with two multi-time aggregation units, labeled as 1 and 2 in the figure.

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                          RTP Header                           |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |MTAP16 NAL HDR |  decoding order number base   | NALU 1 Size   |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |  NALU 1 Size  |  NALU 1 DOND  |       NALU 1 TS offset        |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |  NALU 1 HDR   |  NALU 1 DATA                                  |
      +-+-+-+-+-+-+-+-+                                               +
      :                                                               :
      +               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |               | NALU 2 SIZE                   |  NALU 2 DOND  |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |       NALU 2 TS offset        |  NALU 2 HDR   |  NALU 2 DATA  |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+               |
      :                                                               :
      |                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                               :...OPTIONAL RTP padding        |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        

Figure 12. An RTP packet including a multi-time aggregation packet of type MTAP16 and two multi-time aggregation units

Figure 13 presents an example of an RTP packet that contains a multi-time aggregation packet of type MTAP24 with two multi-time aggregation units, labeled as 1 and 2 in the figure.

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                          RTP Header                           |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |MTAP24 NAL HDR |  decoding order number base   | NALU 1 Size   |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |  NALU 1 Size  |  NALU 1 DOND  |       NALU 1 TS offs          |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |NALU 1 TS offs |  NALU 1 HDR   |  NALU 1 DATA                  |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               +
      :                                                               :
      +               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |               | NALU 2 SIZE                   |  NALU 2 DOND  |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |       NALU 2 TS offset                        |  NALU 2 HDR   |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |  NALU 2 DATA                                                  |
      :                                                               :
      |                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                               :...OPTIONAL RTP padding        |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        

Figure 13. An RTP packet including a multi-time aggregation packet of type MTAP24 and two multi-time aggregation units

5.8. Fragmentation Units (FUs)

This payload type allows fragmenting a NAL unit into several RTP packets. Doing so on the application layer instead of relying on lower layer fragmentation (e.g., by IP) has the following advantages:

o The payload format is capable of transporting NAL units bigger than 64 kbytes over an IPv4 network. Such NAL units may be present in pre-recorded video, particularly in High Definition formats (there is a limit on the number of slices per picture, which results in a limit on the number of NAL units per picture, which in turn may result in big NAL units).

o The fragmentation mechanism allows fragmenting a single picture and applying generic forward error correction as described in section 12.5.

Fragmentation is defined only for a single NAL unit and not for any aggregation packets. A fragment of a NAL unit consists of an integer number of consecutive octets of that NAL unit. Each octet of the NAL unit MUST be part of exactly one fragment of that NAL unit. Fragments of the same NAL unit MUST be sent in consecutive order with ascending RTP sequence numbers (with no other RTP packets within the same RTP packet stream being sent between the first and last fragment). Similarly, a NAL unit MUST be reassembled in RTP sequence number order.
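
The partitioning and ordering rules can be sketched in Python as follows. The function names are hypothetical. The sketch only shows how octets are divided and rejoined: it does not build the FU indicator or FU header octets, and it assumes that the sequence numbers passed to reassemble have already been extended so that 16-bit wraparound does not occur.

```python
def split_into_fragments(octets, max_fragment_size):
    """Split octets into consecutive, non-overlapping fragments."""
    if max_fragment_size <= 0:
        raise ValueError("fragment size must be positive")
    # Every octet ends up in exactly one fragment.
    return [
        octets[i:i + max_fragment_size]
        for i in range(0, len(octets), max_fragment_size)
    ]


def reassemble(packets):
    """Rejoin fragments in RTP sequence number order.

    packets: (sequence number, fragment) pairs in any arrival order.
    """
    ordered = sorted(packets, key=lambda p: p[0])
    for (prev_seq, _), (seq, _) in zip(ordered, ordered[1:]):
        if seq != prev_seq + 1:
            raise ValueError("fragments are not consecutive")
    return b"".join(fragment for _, fragment in ordered)


nal = bytes(range(10))
fragments = split_into_fragments(nal, 4)
rebuilt = reassemble([(12, fragments[2]), (10, fragments[0]), (11, fragments[1])])
```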

When a NAL unit is fragmented and conveyed within fragmentation units (FUs), it is referred to as a fragmented NAL unit. STAPs and MTAPs MUST NOT be fragmented. FUs MUST NOT be nested; i.e., an FU MUST NOT contain another FU.

The RTP timestamp of an RTP packet carrying an FU is set to the NALU-time of the fragmented NAL unit.

Figure 14 presents the RTP payload format for FU-As. An FU-A consists of a fragmentation unit indicator of one octet, a fragmentation unit header of one octet, and a fragmentation unit payload.

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      | FU indicator  |   FU header   |                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               |
      |                                                               |
      |                         FU payload                            |
      |                                                               |
      |                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                               :...OPTIONAL RTP padding        |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        

Figure 14. RTP payload format for FU-A

Figure 15 presents the RTP payload format for FU-Bs. An FU-B consists of a fragmentation unit indicator of one octet, a fragmentation unit header of one octet, a decoding order number (DON) (in network byte order), and a fragmentation unit payload. In other words, the structure of FU-B is the same as the structure of FU-A, except for the additional DON field.

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      | FU indicator  |   FU header   |               DON             |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-|
      |                                                               |
      |                         FU payload                            |
      |                                                               |
      |                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                               :...OPTIONAL RTP padding        |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        

Figure 15. RTP payload format for FU-B

NAL unit type FU-B MUST be used in the interleaved packetization mode for the first fragmentation unit of a fragmented NAL unit. NAL unit type FU-B MUST NOT be used in any other case. In other words, in the interleaved packetization mode, each NALU that is fragmented has an FU-B as the first fragment, followed by one or more FU-A fragments.

The FU indicator octet has the following format:

      +---------------+
      |0|1|2|3|4|5|6|7|
      +-+-+-+-+-+-+-+-+
      |F|NRI|  Type   |
      +---------------+
        

Values equal to 28 and 29 in the Type field of the FU indicator octet identify an FU-A and an FU-B, respectively. The use of the F bit is described in section 5.3. The value of the NRI field MUST be set according to the value of the NRI field in the fragmented NAL unit.

The FU header has the following format:

      +---------------+
      |0|1|2|3|4|5|6|7|
      +-+-+-+-+-+-+-+-+
      |S|E|R|  Type   |
      +---------------+
        

S: 1 bit
   When set to one, the Start bit indicates the start of a fragmented NAL unit. When the following FU payload is not the start of a fragmented NAL unit payload, the Start bit is set to zero.

E: 1 bit
   When set to one, the End bit indicates the end of a fragmented NAL unit, i.e., the last byte of the payload is also the last byte of the fragmented NAL unit. When the following FU payload is not the last fragment of a fragmented NAL unit, the End bit is set to zero.

R: 1 bit
   The Reserved bit MUST be equal to 0 and MUST be ignored by the receiver.

Type: 5 bits
   The NAL unit payload type as defined in table 7-1 of [1].
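The bit layout of the two octets can be illustrated with a few shift and mask operations. The following Python sketch is illustrative only: the function names are hypothetical and are not defined by this memo. It assumes, as in the figures above, that bit 0 is the most significant bit of each octet.

```python
FU_A = 28  # Type value identifying an FU-A
FU_B = 29  # Type value identifying an FU-B


def pack_fu_indicator(f, nri, fu_type):
    """Build the FU indicator octet: F (1 bit) | NRI (2 bits) | Type (5 bits)."""
    if fu_type not in (FU_A, FU_B):
        raise ValueError("FU indicator Type must be 28 (FU-A) or 29 (FU-B)")
    return ((f & 0x1) << 7) | ((nri & 0x3) << 5) | fu_type


def pack_fu_header(start, end, nal_type):
    """Build the FU header octet: S | E | R (always 0) | Type (5 bits)."""
    if start and end:
        raise ValueError("S and E MUST NOT both be set in the same FU header")
    return (int(start) << 7) | (int(end) << 6) | (nal_type & 0x1F)


def unpack_fu_header(octet):
    """Return (S, E, Type); the R bit is ignored, as required of receivers."""
    return bool(octet & 0x80), bool(octet & 0x40), octet & 0x1F
```

For a fragmented NAL unit with NRI equal to 3 and NAL unit type 5, the first FU-A would begin with the octets 0x7C and 0x85.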

The value of DON in FU-Bs is selected as described in section 5.5.

Informative note: The DON field in FU-Bs allows gateways to fragment NAL units to FU-Bs without organizing the incoming NAL units to the NAL unit decoding order.

A fragmented NAL unit MUST NOT be transmitted in one FU; i.e., the Start bit and End bit MUST NOT both be set to one in the same FU header.

The FU payload consists of fragments of the payload of the fragmented NAL unit so that if the fragmentation unit payloads of consecutive FUs are sequentially concatenated, the payload of the fragmented NAL unit can be reconstructed. The NAL unit type octet of the fragmented NAL unit is not included as such in the fragmentation unit payload; rather, its information is conveyed in the F and NRI fields of the FU indicator octet and in the Type field of the FU header. An FU payload MAY have any number of octets and MAY be empty.

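The fragmentation rules can be sketched as follows. This is a minimal Python illustration, not a normative algorithm. The function name and the max_fragment parameter (a caller-chosen limit on FU payload octets, e.g., derived from the path MTU) are assumptions made for this example. When a DON value is supplied, the first fragment is emitted as an FU-B, as required in the interleaved mode; all other fragments are FU-As.

```python
import struct

FU_A, FU_B = 28, 29


def fragment_nal_unit(nal, max_fragment, don=None):
    """Split one NAL unit (bytes, header octet included) into FU payloads.

    max_fragment is the largest number of NAL payload octets per FU.
    If don is given, the first FU is an FU-B carrying that DON.
    """
    nal_header, payload = nal[0], nal[1:]
    f_nri = nal_header & 0xE0      # F and NRI are copied to the FU indicator
    nal_type = nal_header & 0x1F   # original type goes into the FU header
    chunks = [payload[i:i + max_fragment]
              for i in range(0, len(payload), max_fragment)]
    if len(chunks) < 2:
        # A single FU would need both S and E set, which is not allowed.
        raise ValueError("NAL unit fits in one packet; do not fragment it")
    packets = []
    for index, chunk in enumerate(chunks):
        first, last = index == 0, index == len(chunks) - 1
        fu_type = FU_B if (first and don is not None) else FU_A
        header = bytes([f_nri | fu_type,
                        (first << 7) | (last << 6) | nal_type])
        if fu_type == FU_B:
            header += struct.pack("!H", don)  # DON in network byte order
        packets.append(header + chunk)
    return packets
```

Each returned element is an RTP payload; the sender is expected to place the elements in RTP packets with consecutive sequence numbers and the same RTP timestamp.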

Informative note: Empty FUs are allowed to reduce the latency of a certain class of senders in nearly lossless environments. These senders can be characterized in that they packetize NALU fragments before the NALU is completely generated and, hence, before the NALU size is known. If zero-length NALU fragments were not allowed, the sender would have to generate at least one bit of data of the following fragment before the current fragment could be sent. Due to the characteristics of H.264, where sometimes several macroblocks occupy zero bits, this is undesirable and can add delay. However, the (potential) use of zero-length NALU fragments should be carefully weighed against the increased risk of the loss of the NALU because of the additional packets employed for its transmission.

If a fragmentation unit is lost, the receiver SHOULD discard all following fragmentation units in transmission order corresponding to the same fragmented NAL unit.

A receiver in an endpoint or in a MANE MAY aggregate the first n-1 fragments of a NAL unit to an (incomplete) NAL unit, even if fragment n of that NAL unit is not received. In this case, the forbidden_zero_bit of the NAL unit MUST be set to one to indicate a syntax violation.

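The receiver behavior described above might be sketched as follows. This Python example is an assumption-laden illustration: it handles FU-As only, expects the caller to supply (sequence number, payload) pairs already sorted in RTP sequence number order, and uses a hypothetical function name. Fragments following a gap are discarded, and a partial NAL unit is returned with its forbidden_zero_bit set to one.

```python
def reassemble_fu_a(fus):
    """Rebuild a NAL unit from FU-A payloads given as (seq, payload) tuples.

    fus must be in ascending RTP sequence number order and start with the
    FU whose S bit is set.  Returns (nal_unit, complete).
    """
    first_seq, first = fus[0]
    if not first[1] & 0x80:
        raise ValueError("first fragment does not have the S bit set")
    # Restore the NAL unit header octet: F and NRI come from the FU
    # indicator, Type comes from the FU header.
    nal = bytearray([(first[0] & 0xE0) | (first[1] & 0x1F)])
    expected_seq, complete = first_seq, False
    for seq, payload in fus:
        if seq != expected_seq:
            break  # a fragment was lost: drop this and all following FUs
        nal += payload[2:]
        expected_seq = (seq + 1) & 0xFFFF  # sequence numbers wrap at 2^16
        if payload[1] & 0x40:  # E bit: last fragment
            complete = True
            break
    if not complete:
        nal[0] |= 0x80  # set forbidden_zero_bit to signal a syntax violation
    return bytes(nal), complete
```

Whether an incomplete NAL unit is forwarded at all remains an implementation choice; a receiver may equally discard it.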

6. Packetization Rules

The packetization modes are introduced in section 5.2. The packetization rules common to more than one of the packetization modes are specified in section 6.1. The packetization rules for the single NAL unit mode, the non-interleaved mode, and the interleaved mode are specified in sections 6.2, 6.3, and 6.4, respectively.

6.1. Common Packetization Rules

All senders MUST enforce the following packetization rules regardless of the packetization mode in use:

o Coded slice NAL units or coded slice data partition NAL units belonging to the same coded picture (and thus sharing the same RTP timestamp value) MAY be sent in any order permitted by the applicable profile defined in [1]; however, for delay-critical systems, they SHOULD be sent in their original coding order to minimize the delay. Note that the coding order is not necessarily the scan order, but the order in which the NAL packets become available to the RTP stack.

o Parameter sets are handled in accordance with the rules and recommendations given in section 8.4.

o MANEs MUST NOT duplicate any NAL unit except for sequence or picture parameter set NAL units, as neither this memo nor the H.264 specification provides means to identify duplicated NAL units. Sequence and picture parameter set NAL units MAY be duplicated to make their correct reception more probable, but any such duplication MUST NOT affect the contents of any active sequence or picture parameter set. Duplication SHOULD be performed on the application layer and not by duplicating RTP packets (with identical sequence numbers).

Senders using the non-interleaved mode and the interleaved mode MUST enforce the following packetization rule:

o MANEs MAY convert single NAL unit packets into one aggregation packet, convert an aggregation packet into several single NAL unit packets, or mix both concepts, in an RTP translator. The RTP translator SHOULD take into account at least the following parameters: path MTU size, unequal protection mechanisms (e.g., through packet-based FEC according to RFC 2733 [18], especially for sequence and picture parameter set NAL units and coded slice data partition A NAL units), bearable latency of the system, and buffering capabilities of the receiver.

Informative note: An RTP translator is required to handle RTCP as per RFC 3550.

6.2. Single NAL Unit Mode

This mode is in use when the value of the OPTIONAL packetization-mode MIME parameter is equal to 0, the packetization-mode is not present, or no other packetization mode is signaled by external means. All receivers MUST support this mode. It is primarily intended for low-delay applications that are compatible with systems using ITU-T Recommendation H.241 [15] (see section 12.1). Only single NAL unit packets MAY be used in this mode. STAPs, MTAPs, and FUs MUST NOT be used. The transmission order of single NAL unit packets MUST comply with the NAL unit decoding order.

6.3. Non-Interleaved Mode

This mode is in use when the value of the OPTIONAL packetization-mode MIME parameter is equal to 1 or the mode is turned on by external means. This mode SHOULD be supported. It is primarily intended for low-delay applications. Only single NAL unit packets, STAP-As, and FU-As MAY be used in this mode. STAP-Bs, MTAPs, and FU-Bs MUST NOT be used. The transmission order of NAL units MUST comply with the NAL unit decoding order.

6.4. Interleaved Mode

This mode is in use when the value of the OPTIONAL packetization-mode MIME parameter is equal to 2 or the mode is turned on by external means. Some receivers MAY support this mode. STAP-Bs, MTAPs, FU-As, and FU-Bs MAY be used. STAP-As and single NAL unit packets MUST NOT be used. The transmission order of packets and NAL units is constrained as specified in section 5.5.

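The packet types permitted in each mode (sections 6.2 through 6.4) can be summarized as a lookup table. The Python sketch below is illustrative. The numeric Type values (1-23 for single NAL unit packets, 24 for STAP-A, 25 for STAP-B, 26 for MTAP16, 27 for MTAP24) are taken from Table 1 of this memo, which lies outside this section; the names ALLOWED_TYPES and is_allowed are hypothetical.

```python
SINGLE_NAL = set(range(1, 24))  # NAL unit types 1-23
STAP_A, STAP_B, MTAP16, MTAP24, FU_A, FU_B = 24, 25, 26, 27, 28, 29

# Key: value of the packetization-mode parameter.
ALLOWED_TYPES = {
    0: SINGLE_NAL,                              # single NAL unit mode
    1: SINGLE_NAL | {STAP_A, FU_A},             # non-interleaved mode
    2: {STAP_B, MTAP16, MTAP24, FU_A, FU_B},    # interleaved mode
}


def is_allowed(packetization_mode, payload_first_octet):
    """Check the Type field of the first payload octet against the mode."""
    return (payload_first_octet & 0x1F) in ALLOWED_TYPES[packetization_mode]
```

A receiver could use such a check to reject packets that violate the negotiated mode before attempting to decapsulate them.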

7. De-Packetization Process (Informative)

The de-packetization process is implementation dependent. Therefore, the following description should be seen as an example of a suitable implementation. Other schemes may be used as well. Optimizations relative to the described algorithms are likely possible. Section 7.1 presents the de-packetization process for the single NAL unit and non-interleaved packetization modes, whereas section 7.2 describes the process for the interleaved mode. Section 7.3 includes additional decapsulation guidelines for intelligent receivers.

All normal RTP mechanisms related to buffer management apply. In particular, duplicated or outdated RTP packets (as indicated by the RTP sequence number and the RTP timestamp) are removed. To determine the exact time for decoding, factors such as a possible intentional delay to allow for proper inter-stream synchronization must be taken into account.

7.1. Single NAL Unit and Non-Interleaved Mode

The receiver includes a receiver buffer to compensate for transmission delay jitter. The receiver stores incoming packets in reception order into the receiver buffer. Packets are decapsulated in RTP sequence number order. If a decapsulated packet is a single NAL unit packet, the NAL unit contained in the packet is passed directly to the decoder. If a decapsulated packet is an STAP-A, the NAL units contained in the packet are passed to the decoder in the order in which they are encapsulated in the packet. If a decapsulated packet is an FU-A, all the fragments of the fragmented NAL unit are concatenated and passed to the decoder.

Informative note: If the decoder supports Arbitrary Slice Order, coded slices of a picture can be passed to the decoder in any order regardless of their reception and transmission order.

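As an illustration of the STAP-A case in section 7.1, the following Python sketch extracts the NAL units of an STAP-A in encapsulation order. It assumes the STAP-A layout defined earlier in this memo (outside this section): a one-octet STAP-A header followed by aggregation units, each consisting of a 16-bit NAL unit size in network byte order and the NAL unit itself. The function name is hypothetical.

```python
import struct


def split_stap_a(payload):
    """Return the NAL units of an STAP-A payload in encapsulation order."""
    if payload[0] & 0x1F != 24:
        raise ValueError("not an STAP-A payload")
    nal_units, offset = [], 1  # skip the STAP-A header octet
    while offset < len(payload):
        if offset + 2 > len(payload):
            raise ValueError("truncated NAL unit size field")
        (size,) = struct.unpack_from("!H", payload, offset)
        offset += 2
        if size == 0 or offset + size > len(payload):
            raise ValueError("malformed aggregation unit")
        nal_units.append(payload[offset:offset + size])
        offset += size
    return nal_units
```

The returned NAL units would then be passed to the decoder in list order. The sketch assumes that any RTP padding has already been removed from the payload.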

7.2. Interleaved Mode

The general concept behind these de-packetization rules is to reorder NAL units from transmission order to the NAL unit decoding order.

The receiver includes a receiver buffer, which is used to compensate for transmission delay jitter and to reorder packets from transmission order to the NAL unit decoding order. In this section, the receiver operation is described under the assumption that there is no transmission delay jitter. To distinguish it from a practical receiver buffer that is also used to compensate for transmission delay jitter, the receiver buffer is hereafter called the deinterleaving buffer in this section. Receivers SHOULD also prepare for transmission delay jitter; i.e., either reserve separate buffers for transmission delay jitter buffering and deinterleaving buffering or use a receiver buffer for both transmission delay jitter and deinterleaving. Moreover, receivers SHOULD take transmission delay jitter into account in the buffering operation; e.g., by additional initial buffering before starting of decoding and playback.

This section is organized as follows: subsection 7.2.1 presents how to calculate the size of the deinterleaving buffer. Subsection 7.2.2 specifies the receiver process for organizing received NAL units into the NAL unit decoding order.

7.2.1. Size of the Deinterleaving Buffer

When the SDP Offer/Answer model or any other capability exchange procedure is used in session setup, the properties of the received stream SHOULD be such that the receiver capabilities are not exceeded. In the SDP Offer/Answer model, the receiver can indicate its capabilities to allocate a deinterleaving buffer with the deint-buf-cap MIME parameter. The sender indicates the requirement for the deinterleaving buffer size with the sprop-deint-buf-req MIME parameter. It is therefore RECOMMENDED to set the deinterleaving buffer size, in terms of number of bytes, equal to or greater than the value of the sprop-deint-buf-req MIME parameter. See section 8.1 for further information on the deint-buf-cap and sprop-deint-buf-req MIME parameters and section 8.2.2 for further information on their use in the SDP Offer/Answer model.

When a declarative session description is used in session setup, the sprop-deint-buf-req MIME parameter signals the requirement for the deinterleaving buffer size. It is therefore RECOMMENDED to set the deinterleaving buffer size, in terms of number of bytes, equal to or greater than the value of the sprop-deint-buf-req MIME parameter.

7.2.2. Deinterleaving Process

There are two buffering states in the receiver: initial buffering and buffering while playing. Initial buffering occurs when the RTP session is initialized. After initial buffering, decoding and playback are started, and the buffering-while-playing mode is used.

Regardless of the buffering state, the receiver stores incoming NAL units, in reception order, in the deinterleaving buffer as follows. NAL units of aggregation packets are stored in the deinterleaving buffer individually. The value of DON is calculated and stored for all NAL units.

The receiver operation is described below with the help of the following functions and constants:

o Function AbsDON is specified in section 8.1.

o Function don_diff is specified in section 5.5.

o Constant N is the value of the OPTIONAL sprop-interleaving-depth MIME type parameter (see section 8.1) incremented by 1.

Initial buffering lasts until one of the following conditions is fulfilled:

o There are N VCL NAL units in the deinterleaving buffer.

o If sprop-max-don-diff is present, don_diff(m,n) is greater than the value of sprop-max-don-diff, in which n corresponds to the NAL unit having the greatest value of AbsDON among the received NAL units and m corresponds to the NAL unit having the smallest value of AbsDON among the received NAL units.

o Initial buffering has lasted for the duration equal to or greater than the value of the OPTIONAL sprop-init-buf-time MIME parameter.

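The three termination conditions for initial buffering can be sketched as a single predicate. The Python example below is a non-normative illustration with several assumptions: the Nal record and function names are hypothetical; don_diff is written from the definition in section 5.5 and should be checked against that section; AbsDON values are assumed to have been computed already as specified in section 8.1; and the elapsed time is assumed to be expressed in the same units as sprop-init-buf-time.

```python
from collections import namedtuple

# Hypothetical record for a buffered NAL unit; abs_don is its AbsDON value.
Nal = namedtuple("Nal", "don abs_don is_vcl")


def don_diff(don_m, don_n):
    """Signed DON difference with 16-bit wrap-around (after section 5.5)."""
    if don_m == don_n:
        return 0
    if don_m < don_n:
        gap = don_n - don_m
        return gap if gap < 32768 else -(don_m + 65536 - don_n)
    gap = don_m - don_n
    return 65536 - don_m + don_n if gap >= 32768 else -gap


def initial_buffering_done(buffer, interleaving_depth, max_don_diff=None,
                           elapsed=None, init_buf_time=None):
    """Return True once any of the three conditions is fulfilled."""
    n_const = interleaving_depth + 1  # constant N
    if sum(1 for nal in buffer if nal.is_vcl) >= n_const:
        return True
    if max_don_diff is not None and buffer:
        m = min(buffer, key=lambda nal: nal.abs_don)  # smallest AbsDON
        n = max(buffer, key=lambda nal: nal.abs_don)  # greatest AbsDON
        if don_diff(m.don, n.don) > max_don_diff:
            return True
    if init_buf_time is not None and elapsed is not None:
        return elapsed >= init_buf_time
    return False
```

Optional parameters that are absent from the session description are passed as None, so that the corresponding condition is simply skipped.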

The NAL units to be removed from the deinterleaving buffer are determined as follows:

o If the deinterleaving buffer contains at least N VCL NAL units, NAL units are removed from the deinterleaving buffer and passed to the decoder in the order specified below until the buffer contains N-1 VCL NAL units.

o If sprop-max-don-diff is present, all NAL units m for which don_diff(m,n) is greater than sprop-max-don-diff are removed from the deinterleaving buffer and passed to the decoder in the order specified below. Herein, n corresponds to the NAL unit having the greatest value of AbsDON among the received NAL units.

The order in which NAL units are passed to the decoder is specified as follows:

o Let PDON be a variable that is initialized to 0 at the beginning of an RTP session.

o For each NAL unit associated with a value of DON, a DON distance is calculated as follows. If the value of DON of the NAL unit is larger than the value of PDON, the DON distance is equal to DON - PDON. Otherwise, the DON distance is equal to 65535 - PDON + DON + 1.

o NAL units are delivered to the decoder in ascending order of DON distance. If several NAL units share the same value of DON distance, they can be passed to the decoder in any order.

o When a desired number of NAL units have been passed to the decoder, the value of PDON is set to the value of DON for the last NAL unit passed to the decoder.

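The output ordering rules can be sketched as follows. In this illustrative Python example, the buffer is modeled as a list of (DON, NAL unit) tuples and the number of NAL units to deliver is assumed to have been determined by the removal rules above; the function names are hypothetical.

```python
def don_distance(don, pdon):
    """DON distance of a NAL unit relative to PDON, with 16-bit wrap."""
    return don - pdon if don > pdon else 65535 - pdon + don + 1


def drain(buffer, pdon, count):
    """Remove `count` NAL units from buffer in ascending DON distance.

    buffer is a list of (don, nal_bytes) tuples and is modified in place.
    Returns (delivered, new_pdon).
    """
    buffer.sort(key=lambda item: don_distance(item[0], pdon))
    delivered = buffer[:count]
    del buffer[:count]
    if delivered:
        pdon = delivered[-1][0]  # DON of the last NAL unit passed on
    return delivered, pdon
```

Because the sort is stable, NAL units with equal DON distance keep their reception order, which is one of the orders the rules permit.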

7.3. Additional De-Packetization Guidelines

The following additional de-packetization rules may be used to implement an operational H.264 de-packetizer:

o Intelligent RTP receivers (e.g., in gateways) may identify lost coded slice data partitions A (DPAs). If a lost DPA is found, a gateway may decide not to send the corresponding coded slice data partitions B and C, as their information is meaningless for H.264 decoders. In this way a MANE can reduce network load by discarding useless packets without parsing a complex bitstream.

o Intelligent RTP receivers (e.g., in gateways) may identify lost FUs. If a lost FU is found, a gateway may decide not to send the following FUs of the same fragmented NAL unit, as their information is meaningless for H.264 decoders. In this way a MANE can reduce network load by discarding useless packets without parsing a complex bitstream.

o Intelligent receivers having to discard packets or NALUs should first discard all packets/NALUs in which the value of the NRI field of the NAL unit type octet is equal to 0. This will minimize the impact on user experience and keep the reference pictures intact. If more packets have to be discarded, then packets with a numerically lower NRI value should be discarded before packets with a numerically higher NRI value. However, discarding any packets with an NRI greater than 0 very likely leads to decoder drift and SHOULD be avoided.

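The NRI-based discard priority in the last guideline can be sketched as follows. This Python example is illustrative only; the function names are hypothetical, and the policy for deciding how many units must be dropped is left to the caller.

```python
def nri(unit):
    """Extract the 2-bit NRI field from the first octet of a NAL unit."""
    return (unit[0] >> 5) & 0x3


def discard_candidates(units, how_many):
    """Pick units to drop: lowest NRI first, reception order within ties.

    Returns (kept, dropped); kept preserves the original order.
    """
    ranked = sorted(range(len(units)), key=lambda i: nri(units[i]))
    drop = set(ranked[:how_many])
    kept = [u for i, u in enumerate(units) if i not in drop]
    dropped = [units[i] for i in ranked[:how_many]]
    return kept, dropped
```

Units with NRI equal to 0 are always selected first; selecting units with a higher NRI should be a last resort, since it is likely to cause decoder drift.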

8. Payload Format Parameters

This section specifies the parameters that MAY be used to select optional features of the payload format and certain features of the bitstream. The parameters are specified here as part of the MIME subtype registration for the ITU-T H.264 | ISO/IEC 14496-10 codec. A mapping of the parameters into the Session Description Protocol (SDP) [5] is also provided for applications that use SDP. Equivalent parameters could be defined elsewhere for use with control protocols that do not use MIME or SDP.

Some parameters provide a receiver with the properties of the stream that will be sent. The name of all these parameters starts with "sprop" for stream properties. Some of these "sprop" parameters are limited by other payload or codec configuration parameters. For example, the sprop-parameter-sets parameter is constrained by the profile-level-id parameter. The media sender selects all "sprop" parameters rather than the receiver. This uncommon characteristic of the "sprop" parameters may not be compatible with some signaling protocol concepts, in which case the use of these parameters SHOULD be avoided.

8.1. MIME Registration

The MIME subtype for the ITU-T H.264 | ISO/IEC 14496-10 codec is allocated from the IETF tree.

The receiver MUST ignore any unspecified parameter.

Media Type name: video

Media subtype name: H264

Required parameters: none

OPTIONAL parameters:

profile-level-id: A base16 [6] (hexadecimal) representation of the following three bytes in the sequence parameter set NAL unit specified in [1]: 1) profile_idc, 2) a byte herein referred to as profile-iop, composed of the values of constraint_set0_flag, constraint_set1_flag, constraint_set2_flag, and reserved_zero_5bits in bit-significance order, starting from the most significant bit, and 3) level_idc. Note that reserved_zero_5bits is required to be equal to 0 in [1], but other values for it may be specified in the future by ITU-T or ISO/IEC.

If the profile-level-id parameter is used to indicate properties of a NAL unit stream, it indicates the profile and level that a decoder has to support in order to comply with [1] when it decodes the stream. The profile-iop byte indicates whether the NAL unit stream also obeys all constraints of the indicated profiles as follows. If bit 7 (the most significant bit), bit 6, or bit 5 of profile-iop is equal to 1, all constraints of the Baseline profile, the Main profile, or the Extended profile, respectively, are obeyed in the NAL unit stream.

If the profile-level-id parameter is used for capability exchange or session setup procedure, it indicates the profile that the codec supports and the highest level supported for the signaled profile. The profile-iop byte indicates whether the codec has additional limitations whereby only the common subset of the algorithmic features and limitations of the profiles signaled with the profile-iop byte and of the profile indicated by profile_idc is supported by the codec. For example, if a codec supports only the common subset of the coding tools of the Baseline profile and the Main profile at level 2.1 and below, the profile-level-id becomes 42E015, in which 42 stands for the Baseline profile, E0 indicates that only the common subset for all profiles is supported, and 15 indicates level 2.1.

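The structure of the profile-level-id value can be illustrated by decoding the example above. The following Python sketch is non-normative; the function name and the dictionary keys are chosen for this example only.

```python
def parse_profile_level_id(value):
    """Split a six-digit hexadecimal profile-level-id into its fields."""
    raw = bytes.fromhex(value)
    if len(raw) != 3:
        raise ValueError("profile-level-id must be exactly three bytes")
    profile_idc, profile_iop, level_idc = raw
    return {
        "profile_idc": profile_idc,
        "constraint_set0_flag": bool(profile_iop & 0x80),  # bit 7
        "constraint_set1_flag": bool(profile_iop & 0x40),  # bit 6
        "constraint_set2_flag": bool(profile_iop & 0x20),  # bit 5
        "reserved_zero_5bits": profile_iop & 0x1F,
        "level_idc": level_idc,
    }
```

For the value 42E015, this yields profile_idc 66 (hexadecimal 42), all three constraint flags set, and level_idc 21 (hexadecimal 15), which corresponds to level 2.1 in the example.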

Informative note: Capability exchange and session setup procedures should provide means to list the capabilities for each supported codec profile separately. For example, the one-of-N codec selection procedure of the SDP Offer/Answer model can be used (section 10.2 of [7]).

If no profile-level-id is present, the Baseline Profile without additional constraints at Level 1 MUST be implied.

max-mbps, max-fs, max-cpb, max-dpb, and max-br: These parameters MAY be used to signal the capabilities of a receiver implementation. These parameters MUST NOT be used for any other purpose. The profile-level-id parameter MUST be present in the same receiver capability description that contains any of these parameters. The level conveyed in the value of the profile-level-id parameter MUST be one that the receiver is fully capable of supporting. max-mbps, max-fs, max-cpb, max-dpb, and max-br MAY be used to indicate capabilities of the receiver that extend the required capabilities of the signaled level, as specified below.

When more than one parameter from the set (max-mbps, max-fs, max-cpb, max-dpb, max-br) is present, the receiver MUST support all signaled capabilities simultaneously. For example, if both max-mbps and max-br are present, the signaled level with the extension of both the frame rate and bit rate is supported. That is, the receiver is able to decode NAL unit streams in which the macroblock processing rate is up to max-mbps (inclusive), the bit rate is up to max-br (inclusive), the coded picture buffer size is derived as specified in the semantics of the max-br parameter below, and other properties comply with the level specified in the value of the profile-level-id parameter.

A receiver MUST NOT signal values of max-mbps, max-fs, max-cpb, max-dpb, and max-br that meet the requirements of a higher level, referred to as level A herein, compared to the level specified in the value of the profile-level-id parameter, if the receiver can support all the properties of level A.

Informative note: When the OPTIONAL MIME type parameters are used to signal the properties of a NAL unit stream, max-mbps, max-fs, max-cpb, max-dpb, and max-br are not present, and the value of profile-level-id must always be such that the NAL unit stream complies fully with the specified profile and level.

max-mbps: The value of max-mbps is an integer indicating the maximum macroblock processing rate in units of macroblocks per second. The max-mbps parameter signals that the receiver is capable of decoding video at a higher rate than is required by the signaled level conveyed in the value of the profile-level-id parameter. When max-mbps is signaled, the receiver MUST be able to decode NAL unit streams that conform to the signaled level, with the exception that the MaxMBPS value in Table A-1 of [1] for the signaled level is replaced with the value of max-mbps. The value of max-mbps MUST be greater than or equal to the value of MaxMBPS for the level given in Table A-1 of [1]. Senders MAY use this knowledge to send pictures of a given size at a higher picture rate than is indicated in the signaled level.
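
The trade-off between picture size and picture rate described above can be illustrated with a short calculation. The following is a minimal sketch, not part of the specification: the function name is hypothetical, the CIF picture size of 22 x 18 macroblocks and the Level 1.2 MaxMBPS value of 6000 are taken from H.264 itself, and the value 11880 is only an illustrative max-mbps setting.

```python
from fractions import Fraction


def max_frame_rate(max_mbps: int, frame_size_in_mbs: int) -> Fraction:
    """Upper bound on pictures per second for one picture size.

    max_mbps is the macroblock processing rate (macroblocks per second);
    frame_size_in_mbs is the picture size in macroblocks.
    """
    if max_mbps <= 0 or frame_size_in_mbs <= 0:
        raise ValueError("both arguments must be positive")
    return Fraction(max_mbps, frame_size_in_mbs)


# CIF is 22 x 18 = 396 macroblocks.
cif_mbs = 22 * 18

# Limit implied by Level 1.2 alone (MaxMBPS = 6000).
print(float(max_frame_rate(6000, cif_mbs)))

# Limit when the receiver additionally signals max-mbps=11880 (illustrative).
print(float(max_frame_rate(11880, cif_mbs)))  # prints 30.0
```

The result is only an upper bound derived from the macroblock rate; the other level limits still apply.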

max-fs: The value of max-fs is an integer indicating the maximum frame size in units of macroblocks. The max-fs parameter signals that the receiver is capable of decoding larger picture sizes than are required by the signaled level conveyed in the value of the profile-level-id parameter. When max-fs is signaled, the receiver MUST be able to decode NAL unit streams that conform to the signaled level, with the exception that the MaxFS value in Table A-1 of [1] for the signaled level is replaced with the value of max-fs. The value of max-fs MUST be greater than or equal to the value of MaxFS for the level given in Table A-1 of [1]. Senders MAY use this knowledge to send larger pictures at a proportionally lower frame rate than is indicated in the signaled level.

max-cpb: The value of max-cpb is an integer indicating the maximum coded picture buffer size in units of 1000 bits for the VCL HRD parameters (see A.3.1 item i of [1]) and in units of 1200 bits for the NAL HRD parameters (see A.3.1 item j of [1]). The max-cpb parameter signals that the receiver has more memory than the minimum amount of coded picture buffer memory required by the signaled level conveyed in the value of the profile-level-id parameter. When max-cpb is signaled, the receiver MUST be able to decode NAL unit streams that conform to the signaled level, with the exception that the MaxCPB value in Table A-1 of [1] for the signaled level is replaced with the value of max-cpb. The value of max-cpb MUST be greater than or equal to the value of MaxCPB for the level given in Table A-1 of [1]. Senders MAY use this knowledge to construct coded video streams with greater variation of bit rate than can be achieved with the MaxCPB value in Table A-1 of [1].

Informative note: The coded picture buffer is used in the hypothetical reference decoder (Annex C) of H.264. The use of the hypothetical reference decoder is recommended in H.264 encoders to verify that the produced bitstream conforms to the standard and to control the output bitrate. Thus, the coded picture buffer is conceptually independent of any other potential buffers in the receiver, including de-interleaving and de-jitter buffers. The coded picture buffer need not be implemented in decoders as specified in Annex C of H.264, but rather standard-compliant decoders can have any buffering arrangements provided that they can decode standard-compliant bitstreams. Thus, in practice, the input buffer for the video decoder can be integrated with the de-interleaving and de-jitter buffers of the receiver.

max-dpb: The value of max-dpb is an integer indicating the maximum decoded picture buffer size in units of 1024 bytes. The max-dpb parameter signals that the receiver has more memory than the minimum amount of decoded picture buffer memory required by the signaled level conveyed in the value of the profile-level-id parameter. When max-dpb is signaled, the receiver MUST be able to decode NAL unit streams that conform to the signaled level, with the exception that the MaxDPB value in Table A-1 of [1] for the signaled level is replaced with the value of max-dpb. Consequently, a receiver that signals max-dpb MUST be capable of storing the following number of decoded frames, complementary field pairs, and non-paired fields in its decoded picture buffer:

                        Min(1024 * max-dpb / ( PicWidthInMbs *
                        FrameHeightInMbs * 256 * ChromaFormatFactor ),
                        16)
        

PicWidthInMbs, FrameHeightInMbs, and ChromaFormatFactor are defined in [1].
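
The frame-count formula above can be evaluated as in the following minimal sketch. The function name is hypothetical. Two points are assumptions rather than statements of this memo: ChromaFormatFactor is taken as 1.5 (4:2:0 sampling), and the quotient is truncated to a whole number of frames.

```python
from fractions import Fraction


def max_dpb_frames(max_dpb: int,
                   pic_width_in_mbs: int,
                   frame_height_in_mbs: int,
                   chroma_format_factor: Fraction = Fraction(3, 2)) -> int:
    """Evaluate Min(1024 * max-dpb / (PicWidthInMbs * FrameHeightInMbs *
    256 * ChromaFormatFactor), 16).

    max_dpb is in units of 1024 bytes. The default chroma_format_factor of
    3/2 (4:2:0) and the truncation to an integer are assumptions.
    """
    bytes_per_frame = (pic_width_in_mbs * frame_height_in_mbs
                       * 256 * chroma_format_factor)
    return min(int(Fraction(1024 * max_dpb) / bytes_per_frame), 16)


# CIF (22 x 18 macroblocks) with max-dpb=891.
print(max_dpb_frames(891, 22, 18))  # prints 6

# QCIF (11 x 9 macroblocks): the quotient exceeds 16, so the cap applies.
print(max_dpb_frames(891, 11, 9))   # prints 16
```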

The value of max-dpb MUST be greater than or equal to the value of MaxDPB for the level given in Table A-1 of [1]. Senders MAY use this knowledge to construct coded video streams with improved compression.

Informative note: This parameter was added primarily to complement a similar codepoint in the ITU-T Recommendation H.245, so as to facilitate signaling gateway designs. The decoded picture buffer stores reconstructed samples and is a property of the video decoder only. There is no relationship between the size of the decoded picture buffer and the buffers used in RTP, especially de-interleaving and de-jitter buffers.

max-br: The value of max-br is an integer indicating the maximum video bit rate in units of 1000 bits per second for the VCL HRD parameters (see A.3.1 item i of [1]) and in units of 1200 bits per second for the NAL HRD parameters (see A.3.1 item j of [1]).

The max-br parameter signals that the video decoder of the receiver is capable of decoding video at a higher bit rate than is required by the signaled level conveyed in the value of the profile-level-id parameter. The value of max-br MUST be greater than or equal to the value of MaxBR for the level given in Table A-1 of [1].

When max-br is signaled, the video codec of the receiver MUST be able to decode NAL unit streams that conform to the signaled level, conveyed in the profile-level-id parameter, with the following exceptions in the limits specified by the level:

o The value of max-br replaces the MaxBR value of the signaled level (in Table A-1 of [1]).

o When the max-cpb parameter is not present, the result of the following formula replaces the value of MaxCPB in Table A-1 of [1]: (MaxCPB of the signaled level) * max-br / (MaxBR of the signaled level).

For example, if a receiver signals capability for Level 1.2 with max-br equal to 1550, this indicates a maximum video bitrate of 1550 kbits/sec for VCL HRD parameters, a maximum video bitrate of 1860 kbits/sec for NAL HRD parameters, and a CPB size of 4036458 bits (1550000 / 384000 * 1000 * 1000).
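
The arithmetic of this example can be reproduced as in the following minimal sketch. The function and key names are hypothetical. The Level 1.2 values MaxBR = 384 and MaxCPB = 1000 are the ones implied by the example; truncating the derived CPB size to whole bits is an assumption chosen to match the figure quoted above.

```python
def limits_from_max_br(max_br, level_max_br, level_max_cpb, max_cpb=None):
    """Derive bit-rate and CPB limits from max-br.

    max_br and level_max_br are in units of 1000 bits/s (VCL) or
    1200 bits/s (NAL); level_max_cpb and max_cpb are in units of
    1000 bits (VCL). Only the VCL CPB size is computed here.
    """
    if max_cpb is None:
        # (MaxCPB of the signaled level) * max-br / (MaxBR of the level),
        # converted to bits and truncated (assumption).
        cpb_vcl_bits = level_max_cpb * max_br * 1000 // level_max_br
    else:
        # An explicit max-cpb takes the place of the derived value.
        cpb_vcl_bits = max_cpb * 1000
    return {
        "vcl_bits_per_second": max_br * 1000,
        "nal_bits_per_second": max_br * 1200,
        "cpb_vcl_bits": cpb_vcl_bits,
    }


# Level 1.2 with max-br=1550 and no max-cpb.
print(limits_from_max_br(1550, level_max_br=384, level_max_cpb=1000))
```

For the inputs shown, the three values are 1550000, 1860000, and 4036458, matching the figures in the example.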

The value of max-br MUST be greater than or equal to the value MaxBR for the signaled level given in Table A-1 of [1].

Senders MAY use this knowledge to send higher bitrate video as allowed in the level definition of Annex A of H.264, to achieve improved video quality.

Informative note: This parameter was added primarily to complement a similar codepoint in the ITU-T Recommendation H.245, so as to facilitate signaling gateway designs. No assumption can be made from the value of this parameter that the network is capable of handling such bit rates at any given time. In particular, no conclusion can be drawn that the signaled bit rate is possible under congestion control constraints.

redundant-pic-cap: This parameter signals the capabilities of a receiver implementation. When equal to 0, the parameter indicates that the receiver makes no attempt to use redundant coded pictures to correct incorrectly decoded primary coded pictures. When equal to 0, the receiver is not capable of using redundant slices; therefore, a sender SHOULD avoid sending redundant slices to save bandwidth. When equal to 1, the receiver is capable of decoding any such redundant slice that covers a corrupted area in a primary decoded picture (at least partly), and therefore a sender MAY send redundant slices. When the parameter is not present, then a value of 0 MUST be used for redundant-pic-cap. When present, the value of redundant-pic-cap MUST be either 0 or 1.

When the profile-level-id parameter is present in the same capability signaling as the redundant-pic-cap parameter, and the profile indicated in profile-level-id is such that it disallows the use of redundant coded pictures (e.g., Main Profile), the value of redundant-pic-cap MUST be equal to 0. When a receiver indicates redundant-pic-cap equal to 0, the received stream SHOULD NOT contain redundant coded pictures.

Informative note: Even if redundant-pic-cap is equal to 0, the decoder is able to ignore redundant coded pictures provided that the decoder supports such a profile (Baseline, Extended) in which redundant coded pictures are allowed.

Informative note: Even if redundant-pic-cap is equal to 1, the receiver may also choose other error concealment strategies to replace or complement decoding of redundant slices.

sprop-parameter-sets: This parameter MAY be used to convey any sequence and picture parameter set NAL units (herein referred to as the initial parameter set NAL units) that MUST precede any other NAL units in decoding order. The parameter MUST NOT be used to indicate codec capability in any capability exchange procedure. The value of the parameter is the base64 [6] representation of the initial parameter set NAL units as specified in sections 7.3.2.1 and 7.3.2.2 of [1]. The parameter sets are conveyed in decoding order, and no framing of the parameter set NAL units takes place. A comma is used to separate any pair of parameter sets in the list. Note that the number of bytes in a parameter set NAL unit is typically less than 10, but a picture parameter set NAL unit can contain several hundreds of bytes.
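
As an illustration of the encoding described above, the following minimal sketch splits a sprop-parameter-sets value on commas and base64-decodes each element. The function names are hypothetical. The sample value is the one used in the SDP example of section 8.2.1; the NAL unit type is read from the low five bits of the first byte, as defined in H.264 (7 is a sequence parameter set, 8 is a picture parameter set).

```python
import base64


def decode_sprop_parameter_sets(value: str) -> list:
    """Return the parameter set NAL units carried in the parameter value,
    in the order given (decoding order)."""
    nal_units = []
    for item in value.split(","):
        nal = base64.b64decode(item, validate=True)
        if not nal:
            raise ValueError("empty parameter set in sprop-parameter-sets")
        nal_units.append(nal)
    return nal_units


def nal_unit_type(nal: bytes) -> int:
    """nal_unit_type is the low 5 bits of the first NAL unit byte."""
    return nal[0] & 0x1F


for nal in decode_sprop_parameter_sets("Z0IACpZTBYmI,aMljiA=="):
    print(nal_unit_type(nal), nal.hex())
```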

Informative note: When several payload types are offered in the SDP Offer/Answer model, each with its own sprop-parameter-sets parameter, then the receiver cannot assume that those parameter sets do not use conflicting storage locations (i.e., identical values of parameter set identifiers). Therefore, a receiver should double-buffer all sprop-parameter-sets and make them available to the decoder instance that decodes a certain payload type.

parameter-add: This parameter MAY be used to signal whether the receiver of this parameter is allowed to add parameter sets in its signaling response using the sprop-parameter-sets MIME parameter. The value of this parameter is either 0 or 1. 0 is equal to false; i.e., it is not allowed to add parameter sets. 1 is equal to true; i.e., it is allowed to add parameter sets. If the parameter is not present, its value MUST be 1.

packetization-mode: This parameter signals the properties of an RTP payload type or the capabilities of a receiver implementation. Only a single configuration point can be indicated; thus, when capabilities to support more than one packetization-mode are declared, multiple configuration points (RTP payload types) must be used.

When the value of packetization-mode is equal to 0 or packetization-mode is not present, the single NAL mode, as defined in section 6.2 of RFC 3984, MUST be used. This mode is in use in standards using ITU-T Recommendation H.241 [15] (see section 12.1). When the value of packetization-mode is equal to 1, the non-interleaved mode, as defined in section 6.3 of RFC 3984, MUST be used. When the value of packetization-mode is equal to 2, the interleaved mode, as defined in section 6.4 of RFC 3984, MUST be used. The value of packetization-mode MUST be an integer in the range of 0 to 2, inclusive.
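
The mapping from the parameter value to the packetization mode can be sketched as follows. This is illustrative only: the function name and the dictionary representation of the parsed parameters are assumptions, not something this memo defines.

```python
# Mode names as used in the text above.
PACKETIZATION_MODES = {
    "0": "single NAL mode",
    "1": "non-interleaved mode",
    "2": "interleaved mode",
}


def packetization_mode_name(params: dict) -> str:
    """Resolve the mode from parsed parameters (hypothetical dict of
    name -> string value). An absent parameter is treated as 0."""
    raw = params.get("packetization-mode", "0")
    if raw not in PACKETIZATION_MODES:
        raise ValueError("packetization-mode must be 0, 1, or 2")
    return PACKETIZATION_MODES[raw]


print(packetization_mode_name({}))                           # single NAL mode
print(packetization_mode_name({"packetization-mode": "2"}))  # interleaved mode
```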

sprop-interleaving-depth: This parameter MUST NOT be present when packetization-mode is not present or the value of packetization-mode is equal to 0 or 1. This parameter MUST be present when the value of packetization-mode is equal to 2.

This parameter signals the properties of a NAL unit stream. It specifies the maximum number of VCL NAL units that precede any VCL NAL unit in the NAL unit stream in transmission order and follow the VCL NAL unit in decoding order. Consequently, it is guaranteed that receivers can reconstruct NAL unit decoding order when the buffer size for NAL unit decoding order recovery is at least the value of sprop-interleaving-depth + 1 in terms of VCL NAL units.
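
The definition above can be made concrete with a brute-force calculation over a known stream. The following is a minimal sketch with a hypothetical function name; it assumes the sender knows the decoding-order position of every VCL NAL unit it transmits.

```python
def interleaving_depth(decoding_positions: list) -> int:
    """decoding_positions[i] is the decoding-order position of the i-th
    VCL NAL unit in transmission order.

    For each unit, count the units that precede it in transmission order
    but follow it in decoding order; the result is the maximum count.
    """
    depth = 0
    for i, position in enumerate(decoding_positions):
        preceding_but_later = sum(
            1 for earlier in decoding_positions[:i] if earlier > position
        )
        depth = max(depth, preceding_but_later)
    return depth


# Units sent in the order 0 2 1 3 5 4 (values are decoding positions).
print(interleaving_depth([0, 2, 1, 3, 5, 4]))  # prints 1
```

With a depth of 1, a buffer of two VCL NAL units (depth + 1) suffices to recover decoding order in this example.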

The value of sprop-interleaving-depth MUST be an integer in the range of 0 to 32767, inclusive.

sprop-deint-buf-req: This parameter MUST NOT be present when packetization-mode is not present or the value of packetization-mode is equal to 0 or 1. It MUST be present when the value of packetization-mode is equal to 2.

sprop-deint-buf-req signals the required size of the deinterleaving buffer for the NAL unit stream. The value of the parameter MUST be greater than or equal to the maximum buffer occupancy (in units of bytes) required in such a deinterleaving buffer that is specified in section 7.2 of RFC 3984. It is guaranteed that receivers can perform the deinterleaving of interleaved NAL units into NAL unit decoding order, when the deinterleaving buffer size is at least the value of sprop-deint-buf-req in terms of bytes.

The value of sprop-deint-buf-req MUST be an integer in the range of 0 to 4294967295, inclusive.

Informative note: sprop-deint-buf-req indicates the required size of the deinterleaving buffer only. When network jitter can occur, an appropriately sized jitter buffer has to be provisioned for as well.

deint-buf-cap: This parameter signals the capabilities of a receiver implementation and indicates the amount of deinterleaving buffer space in units of bytes that the receiver has available for reconstructing the NAL unit decoding order. A receiver is able to handle any stream for which the value of the sprop-deint-buf-req parameter is smaller than or equal to this parameter.

If the parameter is not present, then a value of 0 MUST be used for deint-buf-cap. The value of deint-buf-cap MUST be an integer in the range of 0 to 4294967295, inclusive.
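
The comparison between a stream's sprop-deint-buf-req and a receiver's deint-buf-cap can be sketched as follows. The function name and the dictionary representation of parsed parameters are assumptions made for illustration.

```python
UINT32_MAX = 4294967295


def receiver_can_deinterleave(stream_params: dict, receiver_params: dict) -> bool:
    """True if sprop-deint-buf-req of the stream is smaller than or equal
    to deint-buf-cap of the receiver. An absent deint-buf-cap counts as 0."""
    required = int(stream_params["sprop-deint-buf-req"])
    capacity = int(receiver_params.get("deint-buf-cap", "0"))
    for value in (required, capacity):
        if not 0 <= value <= UINT32_MAX:
            raise ValueError("value outside the range 0 to 4294967295")
    return required <= capacity


print(receiver_can_deinterleave({"sprop-deint-buf-req": "65536"},
                                {"deint-buf-cap": "128000"}))  # prints True
print(receiver_can_deinterleave({"sprop-deint-buf-req": "65536"}, {}))  # prints False
```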

Informative note: deint-buf-cap indicates the maximum possible size of the deinterleaving buffer of the receiver only. When network jitter can occur, an appropriately sized jitter buffer has to be provisioned for as well.

sprop-init-buf-time: This parameter MAY be used to signal the properties of a NAL unit stream. The parameter MUST NOT be present, if the value of packetization-mode is equal to 0 or 1.

The parameter signals the initial buffering time that a receiver MUST buffer before starting decoding to recover the NAL unit decoding order from the transmission order. The parameter is the maximum value of (transmission time of a NAL unit - decoding time of the NAL unit), assuming reliable and instantaneous transmission, the same timeline for transmission and decoding, and that decoding starts when the first packet arrives.

An example of specifying the value of sprop-init-buf-time follows. A NAL unit stream is sent in the following interleaved order, in which the value corresponds to the decoding time and the transmission order is from left to right:

0 2 1 3 5 4 6 8 7 ...

Assuming a steady transmission rate of NAL units, the transmission times are:

0 1 2 3 4 5 6 7 8 ...

Subtracting the decoding time from the transmission time column-wise results in the following series:

0 -1 1 0 -1 1 0 -1 1 ...

Thus, in terms of intervals of NAL unit transmission times, the value of sprop-init-buf-time in this example is 1.

The parameter is coded as a non-negative base10 integer representation in clock ticks of a 90-kHz clock. If the parameter is not present, then no initial buffering time value is defined. Otherwise, the value of sprop-init-buf-time MUST be an integer in the range of 0 to 4294967295, inclusive.
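
The worked example for sprop-init-buf-time can be reproduced as in the following minimal sketch. The function names are hypothetical. The sketch assumes a steady transmission rate, so that the i-th NAL unit is sent at time i; the NAL unit interval of 1/30 second used for the conversion to 90-kHz ticks, and rounding up to a whole tick, are illustrative assumptions.

```python
import math
from fractions import Fraction


def init_buf_intervals(decoding_times: list) -> int:
    """Maximum of (transmission time - decoding time), in NAL unit
    intervals, where the i-th unit in the list is transmitted at time i."""
    return max(tx - dec for tx, dec in enumerate(decoding_times))


def to_90khz_ticks(intervals: int, interval_seconds: Fraction) -> int:
    """Convert a number of NAL unit intervals to ticks of a 90-kHz clock,
    rounding up (assumption) so the signaled time is never too short."""
    return math.ceil(intervals * Fraction(interval_seconds) * 90000)


# Transmission order from the example; values are decoding times.
intervals = init_buf_intervals([0, 2, 1, 3, 5, 4, 6, 8, 7])
print(intervals)                                    # prints 1
print(to_90khz_ticks(intervals, Fraction(1, 30)))   # prints 3000
```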

In addition to the signaled sprop-init-buf-time, receivers SHOULD take into account the transmission delay jitter buffering, including buffering for the delay jitter caused by mixers, translators, gateways, proxies, traffic-shapers, and other network elements.

sprop-max-don-diff: This parameter MAY be used to signal the properties of a NAL unit stream. It MUST NOT be used to signal transmitter or receiver or codec capabilities. The parameter MUST NOT be present if the value of packetization-mode is equal to 0 or 1. sprop-max-don-diff is an integer in the range of 0 to 32767, inclusive. If sprop-max-don-diff is not present, the value of the parameter is unspecified. sprop-max-don-diff is calculated as follows:

sprop-max-don-diff = max{AbsDON(i) - AbsDON(j)}, for any i and any j>i,

where i and j indicate the index of the NAL unit in the transmission order and AbsDON denotes a decoding order number of the NAL unit that does not wrap around to 0 after 65535. In other words, AbsDON is calculated as follows: Let m and n be consecutive NAL units in transmission order. For the very first NAL unit in transmission order (whose index is 0), AbsDON(0) = DON(0). For other NAL units, AbsDON is calculated as follows:

                        If DON(m) == DON(n), AbsDON(n) = AbsDON(m)

                        If (DON(m) < DON(n) and DON(n) - DON(m) <
                        32768),
                        AbsDON(n) = AbsDON(m) + DON(n) - DON(m)

                        If (DON(m) > DON(n) and DON(m) - DON(n) >=
                        32768),
                        AbsDON(n) = AbsDON(m) + 65536 - DON(m) + DON(n)

                        If (DON(m) < DON(n) and DON(n) - DON(m) >=
                        32768),
                        AbsDON(n) = AbsDON(m) - (DON(m) + 65536 -
                        DON(n))

                        If (DON(m) > DON(n) and DON(m) - DON(n) <
                        32768),
                        AbsDON(n) = AbsDON(m) - (DON(m) - DON(n))

where DON(i) is the decoding order number of the NAL unit having index i in the transmission order. The decoding order number is specified in section 5.5 of RFC 3984.
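
The AbsDON rules and the sprop-max-don-diff formula can be sketched in code as follows. The function names are hypothetical, and the input is assumed to be the list of 16-bit DON values in transmission order. Because the parameter is limited to the range 0 to 32767, the sketch returns 0 when no NAL unit is sent out of order; that clamping is an assumption, not something stated in this memo.

```python
def abs_don_sequence(dons: list) -> list:
    """Unwrap 16-bit DON values (in transmission order) into AbsDON values
    that do not wrap around to 0 after 65535."""
    abs_dons = []
    for n, don_n in enumerate(dons):
        if n == 0:
            abs_dons.append(don_n)  # AbsDON(0) = DON(0)
            continue
        don_m, abs_m = dons[n - 1], abs_dons[-1]
        if don_m == don_n:
            abs_n = abs_m
        elif don_m < don_n and don_n - don_m < 32768:
            abs_n = abs_m + don_n - don_m
        elif don_m > don_n and don_m - don_n >= 32768:
            abs_n = abs_m + 65536 - don_m + don_n   # forward wrap-around
        elif don_m < don_n:                          # difference >= 32768
            abs_n = abs_m - (don_m + 65536 - don_n)  # backward wrap-around
        else:                                        # don_m > don_n, < 32768
            abs_n = abs_m - (don_m - don_n)
        abs_dons.append(abs_n)
    return abs_dons


def sprop_max_don_diff(dons: list) -> int:
    """max{AbsDON(i) - AbsDON(j)} for any i and any j > i, clamped to 0."""
    abs_dons = abs_don_sequence(dons)
    if not abs_dons:
        return 0
    best = 0
    highest_so_far = abs_dons[0]
    for value in abs_dons[1:]:
        best = max(best, highest_so_far - value)
        highest_so_far = max(highest_so_far, value)
    return best


print(abs_don_sequence([65534, 65535, 0, 1]))
print(sprop_max_don_diff([0, 2, 1, 3, 5, 4]))  # prints 1
```

Tracking the highest AbsDON seen so far avoids comparing every pair of NAL units while yielding the same maximum.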

Informative note: Receivers may use sprop-max-don-diff to trigger which NAL units in the receiver buffer can be passed to the decoder.

max-rcmd-nalu-size: This parameter MAY be used to signal the capabilities of a receiver. The parameter MUST NOT be used for any other purposes. The value of the parameter indicates the largest NALU size in bytes that the receiver can handle efficiently. The parameter value is a recommendation, not a strict upper boundary. The sender MAY create larger NALUs but must be aware that the handling of these may come at a higher cost than NALUs conforming to the limitation.

The value of max-rcmd-nalu-size MUST be an integer in the range of 0 to 4294967295, inclusive. If this parameter is not specified, no known limitation to the NALU size exists. Senders still have to consider the MTU size available between the sender and the receiver and SHOULD run MTU discovery for this purpose.

This parameter is motivated by, for example, an IP to H.223 video telephony gateway, where NALUs smaller than the H.223 transport data unit will be more efficient. A gateway may terminate IP; thus, MTU discovery will normally not work beyond the gateway.

Informative note: Setting this parameter to a lower than necessary value may have a negative impact.

Encoding considerations: This type is only defined for transfer via RTP (RFC 3550).

A file format of H.264/AVC video is defined in [29]. This definition is utilized by other file formats, such as the 3GPP multimedia file format (MIME type video/3gpp) [30] or the MP4 file format (MIME type video/mp4).

Security considerations: See section 9 of RFC 3984.

Public specification: Please refer to RFC 3984 and its section 15.

Additional information: None

File extensions: none

Macintosh file type code: none

Object identifier or OID: none

Person & email address to contact for further information: stewe@stewe.org

Intended usage: COMMON

Author: stewe@stewe.org

Change controller: IETF Audio/Video Transport working group delegated from the IESG.

8.2. SDP Parameters
8.2.1. Mapping of MIME Parameters to SDP

The MIME media type video/H264 string is mapped to fields in the Session Description Protocol (SDP) [5] as follows:

o The media name in the "m=" line of SDP MUST be video.

o The encoding name in the "a=rtpmap" line of SDP MUST be H264 (the MIME subtype).

o The clock rate in the "a=rtpmap" line MUST be 90000.

o The OPTIONAL parameters "profile-level-id", "max-mbps", "max-fs", "max-cpb", "max-dpb", "max-br", "redundant-pic-cap", "sprop-parameter-sets", "parameter-add", "packetization-mode", "sprop-interleaving-depth", "deint-buf-cap", "sprop-deint-buf-req", "sprop-init-buf-time", "sprop-max-don-diff", and "max-rcmd-nalu-size", when present, MUST be included in the "a=fmtp" line of SDP. These parameters are expressed as a MIME media type string, in the form of a semicolon separated list of parameter=value pairs.

An example of media representation in SDP is as follows (Baseline Profile, Level 3.0, some of the constraints of the Main profile may not be obeyed):

      m=video 49170 RTP/AVP 98
      a=rtpmap:98 H264/90000
      a=fmtp:98 profile-level-id=42A01E;
                sprop-parameter-sets=Z0IACpZTBYmI,aMljiA==
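
A receiver of such a description might split the "a=fmtp" line into its parameter=value pairs as in the following minimal sketch. The function names are hypothetical, and the attribute is assumed to arrive as a single unfolded line. The three-byte layout of profile-level-id (profile_idc, profile-iop, level_idc) follows the definition of that parameter in section 8.1.

```python
def parse_fmtp(line: str):
    """Split an 'a=fmtp:<pt> name=value; name=value' line into the payload
    type and a dict of parameters."""
    prefix, _, rest = line.partition(" ")
    if not prefix.startswith("a=fmtp:"):
        raise ValueError("not an a=fmtp line")
    payload_type = int(prefix[len("a=fmtp:"):])
    params = {}
    for item in rest.split(";"):
        item = item.strip()
        if not item:
            continue
        # Split on the first '=' only: base64 values may end in '='.
        name, _, value = item.partition("=")
        params[name.strip()] = value.strip()
    return payload_type, params


def split_profile_level_id(value: str):
    """Return (profile_idc, profile_iop, level_idc) from the hex string."""
    raw = bytes.fromhex(value)
    if len(raw) != 3:
        raise ValueError("profile-level-id must encode exactly three bytes")
    return raw[0], raw[1], raw[2]


pt, params = parse_fmtp(
    "a=fmtp:98 profile-level-id=42A01E; "
    "sprop-parameter-sets=Z0IACpZTBYmI,aMljiA=="
)
print(pt, params)
print(split_profile_level_id(params["profile-level-id"]))  # prints (66, 160, 30)
```

Here profile_idc 66 is the Baseline Profile and level_idc 30 corresponds to Level 3.0, as stated in the example's introduction.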
        
8.2.2. Usage with the SDP Offer/Answer Model

When H.264 is offered over RTP using SDP in an Offer/Answer model [7] for negotiation for unicast usage, the following limitations and rules apply:

o The parameters identifying a media format configuration for H.264 are "profile-level-id", "packetization-mode", and, if required by "packetization-mode", "sprop-deint-buf-req". These three parameters MUST be used symmetrically; i.e., the answerer MUST either maintain all configuration parameters or remove the media format (payload type) completely, if one or more of the parameter values are not supported.

Informative note: The requirement for symmetric use applies only for the above three parameters and not for the other stream properties and capability parameters.

To simplify handling and matching of these configurations, the same RTP payload type number used in the offer SHOULD also be used in the answer, as specified in [7]. An answer MUST NOT contain a payload type number used in the offer unless the configuration ("profile-level-id", "packetization-mode", and, if present, "sprop-deint-buf-req") is the same as in the offer.

Informative note: An offerer, when receiving the answer, has to compare payload types not declared in the offer based on media type (i.e., video/h264) and the above three parameters with any payload types it has already declared, in order to determine whether the configuration in question is new or equivalent to a configuration already offered.
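
The comparison of configurations described above can be sketched as follows. The function names and the dictionary representation of parsed parameters are hypothetical. Two details are assumptions: the hexadecimal profile-level-id is compared case-insensitively, and sprop-deint-buf-req is considered only when packetization-mode is 2.

```python
def configuration_key(params: dict):
    """Reduce parsed fmtp parameters to the three values that identify a
    media format configuration."""
    mode = params.get("packetization-mode", "0")  # absent means mode 0
    profile = params.get("profile-level-id")
    deint = params.get("sprop-deint-buf-req") if mode == "2" else None
    return (
        profile.upper() if profile is not None else None,
        mode,
        int(deint) if deint is not None else None,
    )


def same_configuration(offer_params: dict, answer_params: dict) -> bool:
    """True if both parameter sets describe the same configuration, in
    which case the payload type number of the offer may be reused."""
    return configuration_key(offer_params) == configuration_key(answer_params)


offer = {"profile-level-id": "42A01E", "packetization-mode": "1"}
print(same_configuration(offer, {"profile-level-id": "42a01e",
                                 "packetization-mode": "1"}))  # prints True
print(same_configuration(offer, {"profile-level-id": "42A01E"}))  # prints False
```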

o The parameters "sprop-parameter-sets", "sprop-deint-buf-req", "sprop-interleaving-depth", "sprop-max-don-diff", and "sprop-init-buf-time" describe the properties of the NAL unit stream that the offerer or answerer is sending for this media format configuration. This differs from the normal usage of the Offer/Answer parameters: normally such parameters declare the properties of the stream that the offerer or the answerer is able to receive. When dealing with H.264, the offerer assumes that the answerer will be able to receive media encoded using the configuration being offered.

Informative note: The above parameters apply for any stream sent by the declaring entity with the same configuration; i.e., they are dependent on their source. Rather than being bound to the payload type, the values may have to be applied to another payload type when being sent, as they apply for the configuration.

o The capability parameters ("max-mbps", "max-fs", "max-cpb", "max-dpb", "max-br", "redundant-pic-cap", "max-rcmd-nalu-size") MAY be used to declare further capabilities. Their interpretation depends on the direction attribute. When the direction attribute is sendonly, then the parameters describe the limits of the RTP packets and the NAL unit stream that the sender is capable of producing. When the direction attribute is sendrecv or recvonly, then the parameters describe the limitations of what the receiver accepts.

o 能力参数(“最大mbps”、“最大fs”、“最大cpb”、“最大dpb”、“最大br”、“冗余pic cap”、“最大rcmd nalu大小”)可用于声明进一步的能力。它们的解释取决于方向属性。当方向属性为sendonly时,参数描述了RTP数据包和发送方能够产生的NAL单位流的限制。当direction属性为sendrecv或recvonly时,参数描述接收器接受的限制。

o As specified above, an offerer has to include the size of the deinterleaving buffer in the offer for an interleaved H.264 stream. To enable the offerer and answerer to inform each other about their capabilities for deinterleaving buffering, both parties are RECOMMENDED to include "deint-buf-cap". This information MAY be used when the value for "sprop-deint-buf-req" is selected in a second round of offer and answer. For interleaved streams, it is also RECOMMENDED to consider offering multiple payload types with different buffering requirements when the capabilities of the receiver are unknown.

o 如上所述,要约人必须在交织H.264流的要约中包括解交织缓冲器的大小。为了使报价人和应答人能够相互告知其解交织缓冲能力,建议双方加入“解交织缓冲区上限”。当在第二轮报价和应答中选择“sprop deint buf req”值时,可使用此信息。对于交错流,还建议考虑当接收机的能力未知时,提供具有不同缓冲要求的多个有效载荷类型。

o The "sprop-parameter-sets" parameter is used as described above. In addition, an answerer MUST maintain all parameter sets received in the offer in its answer. Depending on the value of the "parameter-add" parameter, different rules apply: If "parameter-add" is false (0), the answer MUST NOT add any additional parameter sets. If "parameter-add" is true (1), the answerer, in its answer, MAY add additional parameter sets to the "sprop-parameter-sets" parameter. The answerer MUST also, independent of the value of "parameter-add", accept to receive a video stream using the sprop-parameter-sets it declared in the answer.

o 如上所述,使用“sprop参数集”参数。此外,应答者必须在其应答中维护报价中收到的所有参数集。根据“parameter add”参数的值,不同的规则适用:如果“parameter add”为false(0),则答案不得添加任何其他参数集。如果“参数添加”为真(1),回答者可在其回答中向“sprop参数集”参数添加其他参数集。应答者还必须独立于“parameter add”的值,接受使用其在应答中声明的sprop参数集接收视频流。

Informative note: care must be taken when parameter sets are added not to cause overwriting of already transmitted parameter sets by using conflicting parameter set identifiers.

资料性说明:添加参数集时必须小心,以免使用冲突的参数集标识符覆盖已传输的参数集。
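
The "parameter-add" rule above can be expressed as a small check. The following Python sketch is illustrative only: the function names and the handling of "sprop-parameter-sets" as a plain comma-separated string are assumptions made for this example, and the check does not detect conflicting parameter set identifiers.

```python
def split_sets(sprop_parameter_sets: str) -> list[str]:
    """Split a comma-separated "sprop-parameter-sets" value into its entries."""
    return [entry for entry in sprop_parameter_sets.split(",") if entry]


def check_answer_parameter_sets(offer: str, answer: str, parameter_add: bool) -> bool:
    """Return True if the answer follows the "sprop-parameter-sets" rules."""
    offered = split_sets(offer)
    answered = split_sets(answer)
    # The answerer MUST maintain every parameter set received in the offer.
    if any(entry not in answered for entry in offered):
        return False
    # Additional parameter sets are allowed only if "parameter-add" is true (1).
    added = [entry for entry in answered if entry not in offered]
    return parameter_add or not added
```

An answer that drops an offered parameter set fails the check regardless of "parameter-add"; an answer that appends new sets passes only when "parameter-add" is true.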

For streams being delivered over multicast, the following rules apply in addition:

对于通过多播传送的流,还应适用以下规则:

o The stream properties parameters ("sprop-parameter-sets", "sprop-deint-buf-req", "sprop-interleaving-depth", "sprop-max-don-diff", and "sprop-init-buf-time") MUST NOT be changed by the answerer. Thus, a payload type can either be accepted unaltered or removed.

o 应答者不得更改流属性参数(“sprop参数集”、“sprop deint buf req”、“sprop交织深度”、“sprop max don diff”和“sprop init buf time”)。因此,有效负载类型可以不改变地接受,也可以删除。

o The receiver capability parameters "max-mbps", "max-fs", "max-cpb", "max-dpb", "max-br", and "max-rcmd-nalu-size" MUST be supported by the answerer for all streams declared as sendrecv or recvonly; otherwise, one of the following actions MUST be performed: the media format is removed, or the session rejected.

o 应答器必须为所有声明为sendrecv或RecvoOnly的流支持接收机能力参数“最大mbps”、“最大fs”、“最大cpb”、“最大dpb”、“最大br”和“最大rcmd nalu size”;否则,必须执行以下操作之一:删除媒体格式或拒绝会话。

o The receiver capability parameter "redundant-pic-cap" SHOULD be supported by the answerer for all streams declared as sendrecv or recvonly as follows: The answerer SHOULD NOT include redundant coded pictures in the transmitted stream if the offerer indicated "redundant-pic-cap" equal to 0. Otherwise (when "redundant-pic-cap" is equal to 1), it is beyond the scope of this memo to recommend how the answerer should use redundant coded pictures.

o 对于声明为sendrecv或recvonly的所有流,应答方应支持接收机能力参数冗余pic cap,如下所示:如果报价方指示冗余pic cap等于0,则应答方不应在传输流中包含冗余编码图片。否则(当冗余图片上限等于1时),建议回答者如何使用冗余编码图片超出本备忘录的范围。

Below are the complete lists of how the different parameters shall be interpreted in the different combinations of offer or answer and direction attribute.

以下是如何在不同的报价或应答和方向属性组合中解释不同参数的完整列表。

o In offers and answers for which "a=sendrecv" or no direction attribute is used, or in offers and answers for which "a=recvonly" is used, the following interpretation of the parameters MUST be used.

o 在使用“a=sendrecv”或未使用方向属性的报价和应答中,或在使用“a=recvonly”的报价和应答中,必须使用以下参数解释。

Declaring actual configuration or properties for receiving:

声明要接收的实际配置或属性:

- profile-level-id
- packetization-mode

- 配置文件级别id-打包模式

Declaring actual properties of the stream to be sent (applicable only when "a=sendrecv" or no direction attribute is used):

声明要发送的流的实际属性(仅当“a=sendrecv”或未使用方向属性时适用):

- sprop-deint-buf-req
- sprop-interleaving-depth
- sprop-parameter-sets
- sprop-max-don-diff
- sprop-init-buf-time

- sprop deint buf req-sprop交织深度-sprop参数集-sprop max don diff-sprop init buf time

Declaring receiver implementation capabilities:

声明接收器实现功能:

- max-mbps
- max-fs
- max-cpb
- max-dpb
- max-br
- redundant-pic-cap
- deint-buf-cap
- max-rcmd-nalu-size

- 最大mbps-最大fs-最大cpb-最大dpb-最大br-冗余pic cap-设计buf cap-最大rcmd nalu尺寸

Declaring how Offer/Answer negotiation shall be performed:

说明如何进行报价/应答谈判:

- parameter-add

- 参数添加

o In an offer or answer for which the direction attribute "a=sendonly" is included for the media stream, the following interpretation of the parameters MUST be used:

o 在媒体流包含方向属性“a=sendonly”的报价或应答中,必须使用以下参数解释:

Declaring actual configuration and properties of stream proposed to be sent:

声明建议发送的流的实际配置和属性:

- profile-level-id
- packetization-mode
- sprop-deint-buf-req

- 配置文件级别id-打包模式-sprop deint buf req

- sprop-max-don-diff
- sprop-init-buf-time
- sprop-parameter-sets
- sprop-interleaving-depth

- sprop max don diff-sprop init buf time-sprop参数集-sprop交织深度

Declaring the capabilities of the sender when it receives a stream:

在接收流时声明发送方的功能:

- max-mbps
- max-fs
- max-cpb
- max-dpb
- max-br
- redundant-pic-cap
- deint-buf-cap
- max-rcmd-nalu-size

- 最大mbps-最大fs-最大cpb-最大dpb-最大br-冗余pic cap-设计buf cap-最大rcmd nalu尺寸

Declaring how Offer/Answer negotiation shall be performed:

说明如何进行报价/应答谈判:

- parameter-add

- 参数添加
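
The lists above amount to a lookup from (parameter, direction attribute) to an interpretation. The Python sketch below shows one way to encode that lookup; the function name `interpret` and the returned label strings are invented for this illustration and carry no normative meaning. A missing direction attribute is treated like "sendrecv", as stated above; other direction values are rejected because the lists do not cover them.

```python
CONFIGURATION = {"profile-level-id", "packetization-mode"}
STREAM_PROPERTIES = {
    "sprop-deint-buf-req", "sprop-interleaving-depth", "sprop-parameter-sets",
    "sprop-max-don-diff", "sprop-init-buf-time",
}
CAPABILITIES = {
    "max-mbps", "max-fs", "max-cpb", "max-dpb", "max-br",
    "redundant-pic-cap", "deint-buf-cap", "max-rcmd-nalu-size",
}
NEGOTIATION = {"parameter-add"}


def interpret(param: str, direction: str = "sendrecv") -> str:
    """Return an informal label for how a parameter is to be read."""
    if direction not in ("sendrecv", "recvonly", "sendonly"):
        raise ValueError("direction not covered by the lists: " + direction)
    sending_only = direction == "sendonly"
    if param in CONFIGURATION:
        return "configuration of stream to send" if sending_only else "configuration for receiving"
    if param in STREAM_PROPERTIES:
        # With "a=recvonly" nothing is sent, so stream properties do not apply.
        return "not applicable" if direction == "recvonly" else "property of stream to send"
    if param in CAPABILITIES:
        return "sender capability" if sending_only else "receiver capability"
    if param in NEGOTIATION:
        return "negotiation control"
    raise ValueError("unknown parameter: " + param)
```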

Furthermore, the following considerations are necessary:

此外,有必要考虑以下因素:

o Parameters used for declaring receiver capabilities are in general downgradable; i.e., they express the upper limit for a sender's possible behavior. Thus, a sender MAY configure its encoder using values of these parameters that are lower than or equal to the declared ones. "sprop-parameter-sets" MUST NOT be used in a sender's declaration of its capabilities, as the limits of the values that are carried inside the parameter sets are implied by the profile and level used.

o 用于声明接收器功能的参数通常是可降级的;i、 例如,它们表示发送者可能行为的上限。因此,发送方可以选择仅使用这些参数的较低/较小或相等值来设置其编码器。“sprop参数集”不得用于发送方的功能声明中,因为参数集内携带的值的限制与所使用的配置文件和级别是隐式的。

o Parameters declaring a configuration point are not downgradable, with the exception of the level part of the "profile-level-id" parameter. This expresses values a receiver expects to be used and must be used verbatim on the sender side.

o 声明配置点的参数不可降级,但“概要文件级别id”参数的级别部分除外。这表示接收者期望使用的值,并且必须在发送者端逐字使用。

o When a sender's capabilities are declared, and non-downgradable parameters are used in this declaration, then these parameters express a configuration that is acceptable. In order to achieve high interoperability levels, it is often advisable to offer multiple alternative configurations; e.g., for the packetization mode. It is impossible to offer multiple configurations in a single payload type. Thus, when multiple configuration offers are made, each offer requires its own RTP payload type associated with the offer.

o 当声明发送方的功能,并且在此声明中使用不可降级的参数时,这些参数表示可接受的配置。为了实现高互操作性级别,通常建议提供多种备选配置;e、 例如,对于打包模式。不可能在一种有效负载类型中提供多种配置。因此,当做出多个配置报价时,每个报价都需要与报价关联的自己的RTP有效负载类型。

o A receiver SHOULD understand all MIME parameters, even if it only supports a subset of the payload format's functionality. This ensures that a receiver is capable of understanding when an offer to receive media can be downgraded to what is supported by the receiver of the offer.

o 接收器应该理解所有MIME参数,即使它只支持有效负载格式功能的一个子集。这确保接收者能够理解何时可以将接收媒体的要约降级为要约接收者支持的内容。

o An answerer MAY extend the offer with additional media format configurations. However, to enable their usage, in most cases a second offer is required from the offerer to provide the stream properties parameters that the media sender will use. This also has the effect that the offerer has to be able to receive this media format configuration, not only to send it.

o 应答者可以通过附加媒体格式配置来延长报价。然而,为了能够使用它们,在大多数情况下,需要提供方提供第二次提供,以提供媒体发送方将使用的流属性参数。这也意味着,报价人必须能够接收此媒体格式配置,而不仅仅是发送它。

o If an offerer wishes to have non-symmetric capabilities between sending and receiving, the offerer has to offer different RTP sessions; i.e., different media lines declared as "recvonly" and "sendonly", respectively. This may have further implications on the system.

o 如果发盘方希望在发送和接收之间具有非对称能力,发盘方必须提供不同的RTP会话;i、 例如,不同的媒体行分别声明为“RecvoOnly”和“sendonly”。这可能会对系统产生进一步的影响。
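
Because a single payload type cannot carry multiple configurations, an offer with alternatives needs one payload type per configuration. The Python sketch below generates the corresponding SDP lines under stated assumptions: the function name `build_media_section` is hypothetical, payload types are taken consecutively from the dynamic range 96-127, and each configuration is passed as an ordered dictionary of fmtp parameters.

```python
def build_media_section(port: int, configs: list[dict], first_pt: int = 96) -> list[str]:
    """Build m=, a=rtpmap and a=fmtp lines, one payload type per configuration."""
    pts = list(range(first_pt, first_pt + len(configs)))
    if any(pt < 96 or pt > 127 for pt in pts):
        raise ValueError("dynamic payload types are limited to 96-127")
    lines = ["m=video %d RTP/AVP %s" % (port, " ".join(str(pt) for pt in pts))]
    for pt, cfg in zip(pts, configs):
        # Every alternative configuration gets its own rtpmap/fmtp pair.
        lines.append("a=rtpmap:%d H264/90000" % pt)
        params = "; ".join("%s=%s" % (name, value) for name, value in cfg.items())
        lines.append("a=fmtp:%d %s" % (pt, params))
    return lines
```

For example, two configurations that differ only in "packetization-mode" produce two rtpmap/fmtp pairs under a single m= line.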

8.2.3. Usage in Declarative Session Descriptions
8.2.3. 声明性会话描述中的用法

When H.264 over RTP is offered with SDP in a declarative style, as in RTSP [27] or SAP [28], the following considerations are necessary.

当H.264 over RTP与SDP一起以声明式风格提供时,如在RTSP[27]或SAP[28]中,以下注意事项是必要的。

o All parameters capable of indicating the properties of both a NAL unit stream and a receiver are used to indicate the properties of a NAL unit stream. For example, in this case, the parameter "profile-level-id" declares the values used by the stream, instead of the capabilities of the sender. Consequently, the following interpretation of the parameters MUST be used:

o 能够指示NAL单元流和接收器两者的属性的所有参数用于指示NAL单元流的属性。例如,在本例中,参数“profile level id”声明流使用的值,而不是发送方的功能。这导致必须使用以下参数解释:

Declaring actual configuration or properties:

声明实际配置或属性:

- profile-level-id
- sprop-parameter-sets
- packetization-mode
- sprop-interleaving-depth
- sprop-deint-buf-req
- sprop-max-don-diff
- sprop-init-buf-time

- 配置文件级别id-sprop参数集-打包模式-sprop交错深度-sprop deint buf req-sprop max don diff-sprop init buf time

Not usable:

不可用:

- max-mbps
- max-fs
- max-cpb
- max-dpb
- max-br
- redundant-pic-cap
- max-rcmd-nalu-size
- parameter-add
- deint-buf-cap

- 最大mbps-最大fs-最大cpb-最大dpb-最大br-冗余pic cap-最大rcmd nalu大小-参数添加-设计buf cap

o A receiver of the SDP is required to support all parameters and values of the parameters provided; otherwise, the receiver MUST reject (RTSP) or not participate in (SAP) the session. It falls on the creator of the session to use values that are expected to be supported by the receiving application.

o SDP接收器需要支持提供的所有参数和参数值;否则,接收方必须拒绝(RTSP)或不参与(SAP)会话。会话的创建者需要使用接收应用程序预期支持的值。

8.3. Examples
8.3. 例子

A SIP Offer/Answer exchange wherein both parties are expected to both send and receive could look like the following. Only the media codec specific parts of the SDP are shown. Some lines are wrapped due to text constraints.

SIP提供/应答交换,其中双方都希望发送和接收,如下所示。仅显示SDP的媒体编解码器特定部分。由于文本约束,某些行被换行。

Offerer -> Answerer SDP message:

报价人->回答SDP消息:

      m=video 49170 RTP/AVP 100 99 98
      a=rtpmap:98 H264/90000
      a=fmtp:98 profile-level-id=42A01E; packetization-mode=0;
                sprop-parameter-sets=Z0IACpZTBYmI,aMljiA==
      a=rtpmap:99 H264/90000
      a=fmtp:99 profile-level-id=42A01E; packetization-mode=1;
                sprop-parameter-sets=Z0IACpZTBYmI,aMljiA==
      a=rtpmap:100 H264/90000
      a=fmtp:100 profile-level-id=42A01E; packetization-mode=2;
                 sprop-parameter-sets=Z0IACpZTBYmI,aMljiA==;
                 sprop-interleaving-depth=45; sprop-deint-buf-req=64000;
                 sprop-init-buf-time=102478; deint-buf-cap=128000
        

The above offer presents the same codec configuration in three different packetization formats. PT 98 represents single NALU mode, PT 99 non-interleaved mode; PT 100 indicates the interleaved mode. In the interleaved mode case, the interleaving parameters that the offerer would use if the answer indicates support for PT 100 are also included. In all three cases the parameter "sprop-parameter-sets" conveys the initial parameter sets that are required for the answerer when receiving a stream from the offerer when this configuration (profile-level-id and packetization mode) is accepted. Note that the value for "sprop-parameter-sets", although identical in the example above, could be different for each payload type.

上述产品以三种不同的打包格式提供了相同的编解码器配置。PT 98表示单NALU模式,PT 99表示非交织模式;PT 100表示交织模式。在交织模式的情况下,如果应答指示支持PT 100,则还包括发盘方将使用的交织参数。在所有这三种情况下,参数“sprop parameter sets”传达了当此配置(配置文件级别id和打包模式)被接受时,应答者在收到来自发盘方的流时所需的初始参数集。请注意,“sprop参数集”的值虽然在上述示例中相同,但对于每种有效负载类型可能不同。
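
To work with an offer like the one above, an implementation first has to split each "a=fmtp" attribute into its parameters. The following Python sketch shows one possible approach; the function name `parse_fmtp` is an assumption of this example, and the input is expected to be a single unwrapped line (the examples in this memo are wrapped only for presentation). Values are split at the first "=" only, because base64 padding in "sprop-parameter-sets" also uses "=".

```python
def parse_fmtp(line: str) -> tuple[int, dict[str, str]]:
    """Parse "a=fmtp:<pt> name=value; ..." into (payload type, parameters)."""
    prefix = "a=fmtp:"
    if not line.startswith(prefix):
        raise ValueError("not an fmtp attribute")
    pt_text, _, rest = line[len(prefix):].partition(" ")
    params = {}
    for item in rest.split(";"):
        item = item.strip()
        if not item:
            continue
        # Split at the first "=" so that base64 padding stays in the value.
        name, _, value = item.partition("=")
        params[name.strip()] = value.strip()
    return int(pt_text), params
```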

Answerer -> Offerer SDP message:

应答人->报价人SDP消息:

     m=video 49170 RTP/AVP 100 99 97
     a=rtpmap:97 H264/90000
     a=fmtp:97 profile-level-id=42A01E; packetization-mode=0;
               sprop-parameter-sets=Z0IACpZTBYmI,aMljiA==,As0DEWlsIOp==,
               KyzFGleR
     a=rtpmap:99 H264/90000
     a=fmtp:99 profile-level-id=42A01E; packetization-mode=1;
               sprop-parameter-sets=Z0IACpZTBYmI,aMljiA==,As0DEWlsIOp==,
               KyzFGleR; max-rcmd-nalu-size=3980
     a=rtpmap:100 H264/90000
     a=fmtp:100 profile-level-id=42A01E; packetization-mode=2;
               sprop-parameter-sets=Z0IACpZTBYmI,aMljiA==,As0DEWlsIOp==,
               KyzFGleR; sprop-interleaving-depth=60;
               sprop-deint-buf-req=86000; sprop-init-buf-time=156320;
               deint-buf-cap=128000; max-rcmd-nalu-size=3980
        

As the Offer/Answer negotiation covers both sending and receiving streams, an offer indicates the exact parameters for what the offerer is willing to receive, whereas the answer indicates the same for what the answerer accepts to receive. In this case the offerer declared that it is willing to receive payload type 98. The answerer accepts this by declaring an equivalent payload type 97; i.e., it has identical values for the three parameters "profile-level-id", "packetization-mode", and "sprop-deint-buf-req". This has the following implications for both the offerer and the answerer concerning the parameters that declare properties. The offerer initially declared a certain value of the "sprop-parameter-sets" in the payload definition for PT=98. However, as the answerer accepted this as PT=97, the values of "sprop-parameter-sets" in PT=98 must now be used instead when the offerer sends PT=97. Similarly, when the answerer sends PT=98 to the offerer, it has to use the properties parameters it declared in PT=97.

由于要约/应答协商包括发送流和接收流,要约表示要约人愿意接收的确切参数,而应答表示应答人接受接收的确切参数。在这种情况下,报价人声明其愿意接收有效载荷类型98。应答者通过声明等效有效负载类型97来接受这一点;i、 例如,三个参数“配置文件级别id”、“打包模式”和“sprop deint buf req”的值相同。关于声明属性的参数,这对报价人和应答人都有以下影响。报价人最初在PT=98的有效载荷定义中声明了“sprop参数集”的某个值。然而,由于回答者接受这一点为PT=97,当报价人发送PT=97时,现在必须使用PT=98中的“sprop参数集”值。类似地,当应答者向发盘者发送PT=98时,它必须使用在PT=97中声明的属性参数。
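
The equivalence test described above compares three parameters. A minimal Python sketch follows; the names `config_key` and `find_equivalent` are hypothetical, and the normalization choices are assumptions of this example rather than requirements stated here: a missing "packetization-mode" or "sprop-deint-buf-req" is compared as 0, and the hexadecimal "profile-level-id" is compared case-insensitively.

```python
from typing import Optional


def config_key(params: dict) -> tuple:
    """Reduce fmtp parameters to the three values used for comparison."""
    profile_level_id = params.get("profile-level-id", "").upper()
    packetization_mode = int(params.get("packetization-mode", 0))
    deint_buf_req = int(params.get("sprop-deint-buf-req", 0))
    return (profile_level_id, packetization_mode, deint_buf_req)


def find_equivalent(answer_params: dict, offered: dict) -> Optional[int]:
    """Return the offered payload type with the same configuration, if any.

    `offered` maps payload type numbers to their fmtp parameter dictionaries.
    """
    key = config_key(answer_params)
    for pt, params in offered.items():
        if config_key(params) == key:
            return pt
    return None
```

When `find_equivalent` returns a payload type, the answer's payload type is another name for a configuration already offered; when it returns `None`, the answer introduces a new configuration.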

The answerer also accepts the reception of the two configurations that payload types 99 and 100 represent. It provides the initial parameter sets for the answerer-to-offerer direction, and for buffering related parameters that it will use to send the payload types. It also provides the offerer with its memory limit for deinterleaving operations by providing a "deint-buf-cap" parameter. This is only useful if the offerer decides on making a second offer, where it can take the new value into account. The "max-rcmd-nalu-size" indicates that the answerer can efficiently process NALUs up to the size of 3980 bytes. However, there is no guarantee that the network supports this size.

应答器还接受有效负载类型99和100所代表的两种配置的接收。它为应答者提供了初始参数集,以提供方向,并用于缓冲将用于发送有效负载类型的相关参数。它还通过提供“deint buf cap”参数,向报价人提供其用于解交织操作的内存限制。这只有在报价人决定进行第二次报价时才有用,因为报价人可以考虑新的价值。“最大rcmd nalu大小”表示应答者可以有效地处理大小最多为3980字节的NALU。但是,不能保证网络支持这种大小。

Please note that the parameter sets in the above example do not represent a legal operation point of an H.264 codec. The base64 strings are only used for illustration.

请注意,上述示例中的参数集并不代表H.264编解码器的合法操作点。base64字符串仅用于说明。
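
Although the strings are illustrative, each comma-separated entry of "sprop-parameter-sets" is the base64 encoding of one NAL unit, so the NAL unit header can still be inspected. The Python sketch below (the function name `nal_unit_types` is an assumption of this example) decodes each entry and reads nal_unit_type from the low five bits of the first octet. Applied to the value used in the offer above, it yields types 7 and 8, i.e., a sequence parameter set followed by a picture parameter set.

```python
import base64


def nal_unit_types(sprop_parameter_sets: str) -> list[int]:
    """Return the nal_unit_type of every entry in "sprop-parameter-sets"."""
    types = []
    for entry in sprop_parameter_sets.split(","):
        nalu = base64.b64decode(entry)
        if not nalu:
            raise ValueError("empty NAL unit")
        # First octet: forbidden_zero_bit (1), nal_ref_idc (2), nal_unit_type (5).
        types.append(nalu[0] & 0x1F)
    return types
```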

8.4. Parameter Set Considerations
8.4. 参数集注意事项

The H.264 parameter sets are a fundamental part of the video codec and vital to its operation; see section 1.2. Due to their characteristics and their importance for the decoding process, lost or erroneously transmitted parameter sets can hardly be concealed locally at the receiver. A reference to a corrupt parameter set normally has fatal consequences for the decoding process. Corruption could occur, for example, due to the erroneous transmission or loss of a parameter set data structure, but also due to the untimely transmission of a parameter set update. Therefore, the following recommendations are provided as a guideline for the implementer of the RTP sender.

H.264参数集是视频编解码器的基本组成部分,对其运行至关重要;见第1.2节。由于其特性及其对解码过程的重要性,丢失或错误传输的参数集很难在接收器处被本地隐藏。对损坏参数集的引用通常会对解码过程产生致命的结果。例如,由于参数集数据结构的错误传输或丢失,以及参数集更新的不及时传输,可能会发生损坏。因此,以下建议作为RTP发送器实现者的指南提供。

Parameter set NALUs can be transported using three different principles:

可以使用三种不同的原则传输参数集NALU:

A. Using a session control protocol (out-of-band) prior to the actual RTP session.

A.在实际RTP会话之前使用会话控制协议(带外)。

B. Using a session control protocol (out-of-band) during an ongoing RTP session.

B.在正在进行的RTP会话期间使用会话控制协议(带外)。

C. Within the RTP stream in the payload (in-band) during an ongoing RTP session.

C.在正在进行的RTP会话期间,有效负载(带内)中的RTP流内。

It is necessary to implement principles A and B within a session control protocol. SIP and SDP can be used as described in the SDP Offer/Answer model and in the previous sections of this memo. This section contains guidelines on how principles A and B must be implemented within session control protocols. It is independent of the particular protocol used. Principle C is supported by the RTP payload format defined in this specification.

有必要在会话控制协议中实现原则A和原则B。SIP和SDP可按照SDP报价/应答模型和本备忘录前面章节的说明使用。本节包含如何在会话控制协议中实现原则A和原则B的指南。它独立于所使用的特定协议。原则C由本规范中定义的RTP有效负载格式支持。

The picture and sequence parameter set NALUs SHOULD NOT be transmitted in the RTP payload unless reliable transport is provided for RTP, as a loss of a parameter set of either type will likely prevent decoding of a considerable portion of the corresponding RTP stream. Thus, the transmission of parameter sets using a reliable session control protocol (i.e., usage of principle A or B above) is RECOMMENDED.

除非为RTP提供了可靠的传输,否则图片和序列参数集NALUs不应在RTP有效载荷中传输,因为任何一种类型的参数集的丢失都可能会阻止相应RTP流的相当大一部分的解码。因此,建议使用可靠的会话控制协议(即,使用上述原则A或原则B)传输参数集。

In the rest of the section it is assumed that out-of-band signaling provides reliable transport of parameter set NALUs and that in-band transport does not. If in-band signaling of parameter sets is used, the sender SHOULD take the error characteristics into account and use mechanisms to provide a high probability for delivering the parameter sets correctly. Mechanisms that increase the probability for a correct reception include packet repetition, FEC, and retransmission. The use of an unreliable, out-of-band control protocol has disadvantages similar to those of in-band signaling (possible loss) and, in addition, may also lead to difficulties in the synchronization (see below). Therefore, it is NOT RECOMMENDED.

在本节的其余部分中,假定带外信令提供参数集NALUs的可靠传输,而带内传输不提供。如果使用参数集的带内信令,发送方应考虑错误特性,并使用机制提供正确交付参数集的高概率。增加正确接收概率的机制包括分组重复、FEC和重传。使用不可靠的带外控制协议与带内信令(可能丢失)具有类似的缺点,此外,还可能导致同步困难(见下文)。因此,不建议这样做。

Parameter sets MAY be added or updated during the lifetime of a session using principles B and C. It is required that parameter sets are present at the decoder prior to the NAL units that refer to them. Updating or adding of parameter sets can result in further problems, and therefore the following recommendations should be considered.

可以使用原则B和C在会话的生存期内添加或更新参数集。要求参数集在引用它们的NAL单元之前出现在解码器中。更新或添加参数集可能会导致进一步的问题,因此应考虑以下建议。

- When parameter sets are added or updated, principle C is vulnerable to transmission errors as described above, and therefore principle B is RECOMMENDED.

- 当添加或更新参数集时,原则C容易出现如上所述的传输错误,因此建议使用原则B。

- When parameter sets are added or updated, care SHOULD be taken to ensure that any parameter set is delivered prior to its usage. It is common that no synchronization is present between out-of-band signaling and in-band traffic. If out-of-band signaling is used, it is RECOMMENDED that a sender does not start sending NALUs requiring the updated parameter sets prior to acknowledgement of delivery from the signaling protocol.

- 添加或更新参数集时,应注意确保任何参数集在使用前交付。带外信令和带内业务之间通常不存在同步。如果使用带外信令,建议发送方在确认信令协议的发送之前,不要开始发送需要更新参数集的NALU。

- When parameter sets are updated, the following synchronization issue should be taken into account. When overwriting a parameter set at the receiver, the sender has to ensure that the parameter set in question is not needed by any NALU present in the network or receiver buffers. Otherwise, decoding with a wrong parameter set may occur. To lessen this problem, it is RECOMMENDED either to overwrite only those parameter sets that have not been used for a sufficiently long time (to ensure that all related NALUs have been consumed), or to add a new parameter set instead (which may have negative consequences for the efficiency of the video coding).

- 更新参数集时,应考虑以下同步问题。当覆盖接收器上的参数集时,发送方必须确保网络或接收器缓冲区中的任何NALU都不需要有问题的参数集。否则,可能会使用错误的参数集进行解码。为了减少这个问题,建议只覆盖那些在足够长的时间内没有使用的参数集(以确保所有相关的NALU都已使用),或者添加一个新的参数集(这可能会对视频编码的效率产生负面影响)。

- When new parameter sets are added, previously unused parameter set identifiers are used. This avoids the problem identified in the previous paragraph. However, in a multiparty session, unless a synchronized control protocol is used, there is a risk that multiple entities try to add different parameter sets for the same identifier, which has to be avoided.

- 添加新参数集时,将使用以前未使用的参数集标识符。这样可以避免上一段中指出的问题。但是,在多方会话中,除非使用同步控制协议,否则存在多个实体试图为同一标识符添加不同参数集的风险,这必须避免。

- Adding or modifying parameter sets by using both principles B and C in the same RTP session may lead to inconsistencies of the parameter sets because of the lack of synchronization between the control and the RTP channel. Therefore, principles B and C MUST NOT both be used in the same session unless sufficient synchronization can be provided.

- 在同一RTP会话中使用原则B和原则C添加或修改参数集可能会导致参数集不一致,因为控件和RTP通道之间缺乏同步。因此,除非能够提供足够的同步,否则原则B和C不得同时用于同一会话。

In some scenarios (e.g., when only the subset of this payload format specification corresponding to H.241 is used), it is not possible to employ out-of-band parameter set transmission. In this case, parameter sets have to be transmitted in-band. Here, the synchronization with the non-parameter-set-data in the bitstream is implicit, but the possibility of a loss has to be taken into account. The loss probability should be reduced using the mechanisms discussed above.

在某些情况下(例如,当仅使用与H.241对应的有效载荷格式规范的子集时),不可能采用带外参数集传输。在这种情况下,参数集必须在频带内传输。这里,与比特流中的非参数集数据的同步是隐式的,但是必须考虑丢失的可能性。应使用上述机制降低损失概率。

- When parameter sets are initially provided using principle A and then later added or updated in-band (principle C), there is a risk associated with updating the parameter sets delivered out-of-band. If receivers miss some in-band updates (for example, because of a loss or a late tune-in), those receivers attempt to decode the bitstream using out-dated parameters. It is RECOMMENDED that parameter set IDs be partitioned between the out-of-band and in-band parameter sets.

- 当最初使用原则A提供参数集,然后在带内添加或更新(原则C)时,更新带外交付的参数集存在风险。如果接收机错过了一些带内更新(例如,由于丢失或延迟调谐),这些接收机将尝试使用过时的参数解码比特流。建议在带外参数集和带内参数集之间划分参数集ID。
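
The recommended partitioning of parameter set IDs can be sketched as a simple split of the identifier space. In the Python example below, the function name `partition_ids` and the choice of a contiguous split point are assumptions of this illustration; the identifier ranges (0 to 31 for sequence parameter sets, 0 to 255 for picture parameter sets) are those defined by H.264.

```python
SPS_ID_RANGE = range(0, 32)    # seq_parameter_set_id values allowed by H.264
PPS_ID_RANGE = range(0, 256)   # pic_parameter_set_id values allowed by H.264


def partition_ids(id_range: range, out_of_band_count: int) -> tuple[range, range]:
    """Reserve the first IDs for out-of-band use and the rest for in-band use."""
    if not 0 <= out_of_band_count <= len(id_range):
        raise ValueError("out_of_band_count does not fit the identifier range")
    split = id_range.start + out_of_band_count
    return range(id_range.start, split), range(split, id_range.stop)
```

With such a split, an in-band update can never overwrite a parameter set that was delivered out-of-band, so a receiver that misses in-band updates does not end up decoding with an out-of-band set that has silently changed.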

To allow for maximum flexibility and best performance from the H.264 coder, it is recommended, if possible, to allow any sender to add its own parameter sets to be used in a session. Setting the "parameter-add" parameter to false should only be done in cases where the session topology prevents a participant from adding its own parameter sets.

为了使H.264编码器具有最大的灵活性和最佳性能,如果可能的话,建议允许任何发送方添加其自己的参数集以在会话中使用。只有在会话拓扑阻止参与者添加自己的参数集的情况下,才应将“parameter add”参数设置为false。

9. Security Considerations
9. 安全考虑

RTP packets using the payload format defined in this specification are subject to the security considerations discussed in the RTP specification [4], and in any appropriate RTP profile (for example, [16]). This implies that confidentiality of the media streams is achieved by encryption; for example, through the application of SRTP [26]. Because the data compression used with this payload format is applied end-to-end, any encryption needs to be performed after compression.

使用本规范中定义的有效负载格式的RTP数据包受RTP规范[4]和任何适当RTP配置文件(例如[16])中讨论的安全注意事项的约束。这意味着媒体流的机密性是通过加密实现的;例如,通过应用SRTP[26]。由于与此有效负载格式一起使用的数据压缩是端到端应用的,因此任何加密都需要在压缩后执行。

A potential denial-of-service threat exists for data encodings using compression techniques that have non-uniform receiver-end computational load. The attacker can inject pathological datagrams into the stream that are complex to decode and that cause the receiver to be overloaded. H.264 is particularly vulnerable to such attacks, as it is extremely simple to generate datagrams containing NAL units that affect the decoding process of many future NAL units. Therefore, the usage of data origin authentication and data integrity protection of at least the RTP packet is RECOMMENDED; for example, with SRTP [26].

使用压缩技术的数据编码存在潜在的拒绝服务威胁,这种压缩技术具有非均匀的接收端计算负载。攻击者可以向流中注入难以解码的病理数据报,从而导致接收器过载。H.264特别容易受到此类攻击,因为生成包含影响许多未来NAL单元解码过程的NAL单元的数据报非常简单。因此,建议使用至少RTP分组的数据源认证和数据完整性保护;例如,使用SRTP[26]。

Note that the appropriate mechanism to ensure confidentiality and integrity of RTP packets and their payloads is very dependent on the application and on the transport and signaling protocols employed. Thus, although SRTP is given as an example above, other possible choices exist.

请注意,确保RTP数据包及其有效负载的机密性和完整性的适当机制非常依赖于应用程序以及所采用的传输和信令协议。因此,尽管上面给出了SRTP作为示例,但存在其他可能的选择。

Decoders MUST exercise caution with respect to the handling of user data SEI messages, particularly if they contain active elements, and MUST restrict their domain of applicability to the presentation containing the stream.

解码器必须谨慎处理用户数据SEI消息,特别是如果它们包含活动元素,并且必须将其适用范围限制为包含流的表示。

End-to-end security with authentication, integrity, or confidentiality protection will prevent a MANE from performing media-aware operations other than discarding complete packets. In the case of confidentiality protection, it will even be prevented from discarding packets in a media-aware way. To allow a MANE to perform its operations, it must be a trusted entity that is included in the security context establishment.

具有身份验证、完整性或机密性保护的端到端安全性将防止MANE执行除丢弃完整数据包以外的媒体感知操作。并且在保密保护的情况下,它甚至将被阻止以媒体感知的方式执行数据包丢弃。为了允许任何MANE执行其操作,它必须是安全上下文建立中包含的受信任实体。

10. Congestion Control
10. 拥塞控制

Congestion control for RTP SHALL be used in accordance with RFC 3550 [4], and with any applicable RTP profile; e.g., RFC 3551 [16]. An additional requirement if best-effort service is being used is: users of this payload format MUST monitor packet loss to ensure that the packet loss rate is within acceptable parameters. Packet loss is considered acceptable if a TCP flow across the same network path, and experiencing the same network conditions, would achieve an average throughput, measured on a reasonable timescale, that is not less than the RTP flow is achieving. This condition can be satisfied by implementing congestion control mechanisms to adapt the transmission rate (or the number of layers subscribed for a layered multicast session), or by arranging for a receiver to leave the session if the loss rate is unacceptably high.

RTP的拥塞控制应根据RFC 3550[4]和任何适用的RTP配置文件使用;e、 g.,RFC 3551[16]。如果使用尽力而为服务,另一个要求是:此有效负载格式的用户必须监控数据包丢失,以确保数据包丢失率在可接受的参数范围内。如果通过相同网络路径并经历相同网络条件的TCP流将实现在合理时间尺度上测量的平均吞吐量,即不小于RTP流所实现的平均吞吐量,则认为丢包是可接受的。可以通过实现拥塞控制机制来适应传输速率(或分层多播会话订阅的层数),或者如果丢失率高得令人无法接受,则通过安排接收机离开会话来满足该条件。
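
The acceptability criterion above compares the RTP flow with a hypothetical TCP flow on the same path. This memo does not prescribe how to estimate TCP throughput; the Python sketch below substitutes the well-known simplified steady-state approximation, throughput = (MSS / RTT) * sqrt(3 / (2 * p)), purely as an illustration. The function names and the choice of this model are assumptions of the example.

```python
import math


def tcp_throughput_bps(mss_bytes: int, rtt_s: float, loss_rate: float) -> float:
    """Estimate TCP throughput in bit/s with the simplified sqrt(3/(2p)) model."""
    if rtt_s <= 0 or not 0 < loss_rate <= 1:
        raise ValueError("rtt_s must be > 0 and loss_rate in (0, 1]")
    return (mss_bytes * 8 / rtt_s) * math.sqrt(1.5 / loss_rate)


def loss_is_acceptable(rtp_rate_bps: float, mss_bytes: int,
                       rtt_s: float, loss_rate: float) -> bool:
    """True if a TCP flow on the same path would achieve at least the RTP rate."""
    if loss_rate == 0:
        return True  # no observed loss, nothing to react to
    return tcp_throughput_bps(mss_bytes, rtt_s, loss_rate) >= rtp_rate_bps
```

A sender that finds the loss unacceptable would then lower its transmission rate, or a receiver would leave the session, as described above.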

The bit rate adaptation necessary for obeying the congestion control principle is easily achievable when real-time encoding is used. However, when pre-encoded content is being transmitted, bandwidth adaptation requires the availability of more than one coded representation of the same content, at different bit rates, or the existence of non-reference pictures or sub-sequences [22] in the bitstream. The switching between the different representations can normally be performed in the same RTP session; e.g., by employing a concept known as SI/SP slices of the Extended Profile, or by switching streams at IDR picture boundaries. Only when non-downgradable parameters (such as the profile part of the profile/level ID) are required to be changed does it become necessary to terminate and re-start the media stream. This may be accomplished by using a different RTP payload type.

当使用实时编码时,遵守拥塞控制原则所需的比特率自适应很容易实现。然而,当传输预编码内容时,带宽自适应要求以不同比特率提供相同内容的多个编码表示,或者比特流中存在非参考图片或子序列[22]。不同表示之间的切换通常可以在同一RTP会话中执行;e、 例如,通过采用称为扩展配置文件的SI/SP片的概念,或通过在IDR图片边界处切换流。只有当需要更改不可降级的参数(如配置文件/级别ID的配置文件部分)时,才需要终止并重新启动媒体流。这可以通过使用不同的RTP有效负载类型来实现。

MANEs MAY follow the suggestions outlined in section 7.3 and remove certain unusable packets from the packet stream when that stream was damaged due to previous packet losses. This can help reduce the network load in certain special cases.

MANE可遵循第7.3节中概述的建议,并在数据包流因先前的数据包丢失而损坏时,从数据包流中删除某些不可用的数据包。在某些特殊情况下,这有助于减少网络负载。

11. IANA Consideration
11. IANA考虑

IANA has registered one new MIME type; see section 8.1.

IANA注册了一个新的MIME类型;见第8.1节。

12. Informative Appendix: Application Examples
12. 资料性附录:应用示例

This payload specification is very flexible in its use, in order to cover the extremely wide application space anticipated for H.264. However, this great flexibility also makes it difficult for an implementer to decide on a reasonable packetization scheme. Some information on how to apply this specification to real-world scenarios is likely to appear in the form of academic publications and a test model software and description in the near future. However, some preliminary usage scenarios are described here as well.

该有效负载规范在使用上非常灵活,以覆盖H.264预期的极其广泛的应用空间。然而,这种巨大的灵活性也使得实现者很难决定一个合理的打包方案。在不久的将来,有关如何将本规范应用于实际场景的一些信息可能会以学术出版物、测试模型软件和描述的形式出现。然而,这里也描述了一些初步的使用场景。

12.1. Video Telephony according to ITU-T Recommendation H.241 Annex A

12.1. 符合ITU-T建议H.241附录A的视频电话

H.323-based video telephony systems that use H.264 as an optional video compression scheme are required to support H.241 Annex A [15] as a packetization scheme. The packetization mechanism defined in this Annex is technically identical with a small subset of this specification.

使用H.264作为可选视频压缩方案的基于H.323的视频电话系统需要支持H.241附录A[15]作为分组方案。本附件中定义的打包机制在技术上与本规范的一小部分相同。

When a system operates according to H.241 Annex A, parameter set NAL units are sent in-band. Only Single NAL unit packets are used. Many such systems are not sending IDR pictures regularly, but only when required by user interaction or by control protocol means; e.g., when switching between video channels in a Multipoint Control Unit or for error recovery requested by feedback.

当系统按照H.241附录a运行时,参数集NAL单元在频带内发送。仅使用单个NAL单元数据包。许多这样的系统不定期发送IDR图片,而是仅在用户交互或控制协议方式需要时发送;e、 例如,在多点控制单元中的视频通道之间切换时,或用于反馈请求的错误恢复时。

12.2. Video Telephony, No Slice Data Partitioning, No NAL Unit Aggregation

12.2. 视频电话,无切片数据分区,无NAL单元聚合

The RTP part of this scheme is implemented and tested (though not the control-protocol part; see below).


In most real-world video telephony applications, picture parameters such as picture size or optional modes never change during the lifetime of a connection. Therefore, all necessary parameter sets (usually only one) are sent as a side effect of the capability exchange/announcement process, e.g., according to the SDP syntax specified in section 8.2 of this document. As all necessary parameter set information is established before the RTP session starts, there is no need for sending any parameter set NAL units. Slice data partitioning is not used, either. Thus, the RTP packet stream basically consists of NAL units that carry single coded slices.


The encoder chooses the size of coded slice NAL units so that they offer the best performance. Often, this is done by adapting the coded slice size to the MTU size of the IP network. For small picture sizes, this may result in a one-picture-per-one-packet strategy. Intra refresh algorithms clean up the loss of packets and the resulting drift-related artifacts.
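
As a rough illustration of adapting the coded slice size to the MTU, the following sketch computes how many bytes remain for one NAL unit in a single NAL unit packet once the lower-layer headers are subtracted. The header sizes assume IPv4 without options, UDP, and an RTP header without CSRC entries or extension; the function name and the example MTU are illustrative assumptions, not values taken from this specification.

```python
IPV4_HEADER = 20  # bytes, IPv4 header without options (assumption)
UDP_HEADER = 8    # bytes
RTP_HEADER = 12   # bytes, fixed RTP header without CSRC list or extension


def max_nal_unit_size(mtu):
    """Return the largest NAL unit (in bytes) that fits into one packet."""
    budget = mtu - (IPV4_HEADER + UDP_HEADER + RTP_HEADER)
    if budget <= 0:
        raise ValueError("MTU too small to carry any payload")
    return budget
```

With an assumed Ethernet-style MTU of 1500 bytes, the encoder would target coded slices of at most max_nal_unit_size(1500) bytes.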

12.3. Video Telephony, Interleaved Packetization Using NAL Unit Aggregation


This scheme allows better error concealment and is used in H.263-based designs using RFC 2429 packetization [10]. It has been implemented, and good results were reported [12].

The VCL encoder codes the source picture so that all macroblocks (MBs) of one MB line are assigned to one slice. All slices with even MB row addresses are combined into one STAP, and all slices with odd MB row addresses into another. Those STAPs are transmitted as RTP packets. The establishment of the parameter sets is performed as discussed above.


Note that the use of STAPs is essential here, as the high number of individual slices (18 for a CIF picture) would lead to unacceptably high IP/UDP/RTP header overhead (unless the source coding tool FMO is used, which is not assumed in this scenario). Furthermore, some wireless video transmission systems, such as H.324M and the IP-based video telephony specified in 3GPP, are likely to use relatively small transport packet sizes. For example, a typical MTU size of an H.223 AL3 SDU is around 100 bytes [17]. Coding individual slices according to this packetization scheme provides a further advantage in communication between wired and wireless networks, as individual slices are likely to be smaller than the preferred maximum packet size of wireless systems. Consequently, a gateway can convert the STAPs used in a wired network into several RTP packets with only one NAL unit each, which are preferred in a wireless network, and vice versa.
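
The even/odd grouping and the header overhead argument can be sketched as follows. This is only an illustration: it assumes one slice per macroblock row of a CIF picture (18 rows), 40 bytes of IPv4/UDP/RTP headers per packet, and the STAP-A layout (a one-byte STAP-A NAL unit header plus a two-byte size field per aggregated NAL unit). The function names are hypothetical.

```python
CIF_MB_ROWS = 288 // 16   # 18 macroblock rows in a CIF picture
LOWER_HEADERS = 40        # bytes of IPv4 + UDP + RTP headers (assumption)


def group_rows_into_staps(num_rows):
    """Return the row (slice) indices carried by the even and the odd STAP."""
    even = [row for row in range(num_rows) if row % 2 == 0]
    odd = [row for row in range(num_rows) if row % 2 == 1]
    return even, odd


def single_packet_overhead(num_slices):
    """Header bytes when every slice travels in its own RTP packet."""
    return num_slices * LOWER_HEADERS


def stap_a_overhead(num_packets, num_slices):
    """Header bytes when the slices are aggregated into STAP-A packets."""
    stap_header = 1  # STAP-A NAL unit header
    size_field = 2   # NAL unit size field preceding each aggregated unit
    return num_packets * (LOWER_HEADERS + stap_header) + num_slices * size_field
```

Under these assumptions, 18 separate packets cost 720 header bytes per picture, whereas two STAPs cost 118.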

12.4. Video Telephony with Data Partitioning

This scheme has been implemented and has been shown to offer good performance, especially at higher packet loss rates [12].


Data Partitioning is known to be useful only when some form of unequal error protection is available. Normally, in single-session RTP environments, even error characteristics are assumed; i.e., the packet loss probability of all packets of the session is statistically the same. However, there are means to reduce the packet loss probability of individual packets in an RTP session. A FEC packet according to RFC 2733 [18], for example, specifies which media packets are associated with the FEC packet.

In all cases, the incurred overhead is substantial but is of the same order of magnitude as the number of bits that would otherwise have been spent for intra information. However, this mechanism does not add any delay to the system.

Again, the complete parameter set establishment is performed through control protocol means.


12.5. Video Telephony or Streaming with FUs and Forward Error Correction


This scheme has been implemented and has been shown to provide good performance, especially at higher packet loss rates [19].


The most efficient means to combat packet losses for scenarios where retransmissions are not applicable is forward error correction (FEC). Although application-layer, end-to-end use of FEC is often less efficient than FEC-based protection of individual links (especially when links of different characteristics are in the transmission path), application-layer, end-to-end FEC is unavoidable in some scenarios. RFC 2733 [18] provides a means to use generic, application-layer, end-to-end FEC in packet-loss environments. A binary forward error correcting code is generated by applying the XOR operation to the bits at the same bit position in different packets. The binary code can be specified by the parameters (n,k), in which k is the number of information packets used in the connection and n is the total number of packets generated for k information packets; i.e., n-k parity packets are generated for k information packets.
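
The XOR construction can be illustrated with a minimal sketch of a single-parity code (n=k+1, k). It is not an implementation of RFC 2733: the FEC header, the protection of RTP header fields, and the recovery of the original payload length are all omitted, and shorter packets are simply padded with zero bytes. The function names are hypothetical.

```python
from functools import reduce


def xor_bytes(a, b):
    """XOR two byte strings of equal length, position by position."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(packets):
    """Build one parity packet over k information packets."""
    length = max(len(p) for p in packets)
    padded = [p.ljust(length, b"\x00") for p in packets]
    return reduce(xor_bytes, padded)


def recover_missing(received, parity):
    """Rebuild the single lost packet from the k-1 received ones.

    The result is padded with zero bytes up to the parity length.
    """
    padded = [p.ljust(len(parity), b"\x00") for p in received]
    return reduce(xor_bytes, padded, parity)
```

When the parity is computed over a single packet, make_parity returns a copy of that packet, which is the plain packet repetition case. Padding is also the reason why packets of unequal length waste bit rate: the parity packet is always as long as the longest protected packet.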

When a code is used with parameters (n,k) within the RFC 2733 framework, the following properties are well known:


a) If applied over one RTP packet, RFC 2733 provides only packet repetition.


b) RFC 2733 is most bit-rate efficient if XOR-connected packets have equal length.

c) At the same packet loss probability p and for a fixed k, the greater the value of n is, the smaller the residual error probability becomes. For example, for a packet loss probability of 10%, k=1, and n=2, the residual error probability is about 1%, whereas for n=3, the residual error probability is about 0.1%.


d) At the same packet loss probability p and for a fixed code rate k/n, the greater the value of n is, the smaller the residual error probability becomes. For example, at a packet loss probability of p=10%, k=1 and n=2, the residual error rate is about 1%, whereas for an extended Golay code with k=12 and n=24, the residual error rate is about 0.01%.
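
The figures quoted for k=1 can be checked with a short calculation. The sketch assumes independent packet losses, so that with an (n, k=1) repetition code the information is lost only when all n copies are lost; the function name is hypothetical, and the Golay code figure is not reproduced here.

```python
def residual_loss_repetition(p, n):
    """Residual loss probability of an (n, k=1) code with independent losses."""
    if not 0.0 <= p <= 1.0 or n < 1:
        raise ValueError("expected 0 <= p <= 1 and n >= 1")
    return p ** n
```

For p = 0.1 this gives about 0.01 (1%) for n=2 and about 0.001 (0.1%) for n=3.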

For applying RFC 2733 in combination with H.264 baseline coded video without using FUs, several options might be considered:


1) The video encoder produces NAL units for which each video frame is coded in a single slice. Applying FEC, one could use a simple code; e.g., (n=2, k=1). That is, each NAL unit would basically just be repeated. The disadvantage is obviously the bad code performance according to d), above, and the low flexibility, as only (n, k=1) codes can be used.


2) The video encoder produces NAL units for which each video frame is encoded in one or more consecutive slices. Applying FEC, one could use a better code, e.g., (n=24, k=12), over a sequence of NAL units. Depending on the number of RTP packets per frame, a loss may introduce a significant delay, which is reduced when more RTP packets are used per frame. Packets of completely different length might also be connected, which decreases bit rate efficiency according to b), above. However, with some care and for slices of 1kb or larger, similar length (100-200 bytes difference) may be produced, which will not lower the bit efficiency catastrophically.


3) The video encoder produces NAL units, for which a certain frame contains k slices of possibly almost equal length. Then, applying FEC, a better code, e.g., (n=24, k=12), can be used over the sequence of NAL units for each frame. The delay compared to that of 2), above, may be reduced, but several disadvantages are obvious. First, the coding efficiency of the encoded video is lowered significantly, as slice-structured coding reduces intra-frame prediction and additional slice overhead is necessary. Second, pre-encoded content, or video that passes through a gateway, is usually not coded with k slices in a way that allows FEC to be applied. Finally, the encoding of video producing k slices of equal length is not straightforward and might require more than one encoding pass.

Many of the mentioned disadvantages can be avoided by applying FUs in combination with FEC. Each NAL unit can be split into any number of FUs of basically equal length; therefore, FEC with a reasonable k and n can be applied, even if the encoder made no effort to produce slices of equal length. For example, a coded slice NAL unit containing an entire frame can be split into k FUs, and a parity check code (n=k+1, k) can be applied. However, this has the disadvantage that unless all created fragments can be recovered, the whole slice will be lost. Thus a larger section is lost than would be if the frame had been split into several slices.
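
Splitting a NAL unit payload into k fragments of basically equal length can be sketched as below. This is only an illustration: the FU indicator and FU header octets that precede each fragment on the wire are not modeled, and the function name is hypothetical.

```python
def split_into_fragments(payload, k):
    """Split payload into k consecutive fragments whose lengths differ by at most one byte."""
    if k < 1 or k > len(payload):
        raise ValueError("k must be between 1 and the payload length")
    base, extra = divmod(len(payload), k)
    fragments = []
    offset = 0
    for index in range(k):
        # The first `extra` fragments carry one additional byte.
        size = base + (1 if index < extra else 0)
        fragments.append(payload[offset:offset + size])
        offset += size
    return fragments
```

Because the fragment lengths differ by at most one byte, an XOR parity packet computed over them carries almost no padding.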

The presented technique makes it possible to achieve good transmission error tolerance, even if no additional source coding layer redundancy (such as periodic intra frames) is present. Consequently, the same coded video sequence can be used to achieve the maximum compression efficiency and quality over error-free transmission and for transmission over error-prone networks. Furthermore, the technique allows the application of FEC to pre-encoded sequences without adding delay. In this case, pre-encoded sequences that are not encoded for error-prone networks can still be transmitted almost reliably without adding extensive delays. In addition, FUs of equal length result in a bit rate efficient use of RFC 2733.


If the error probability depends on the length of the transmitted packet (e.g., in case of mobile transmission [14]), the benefits of applying FUs with FEC are even more obvious. Basically, the flexibility of the size of FUs allows appropriate FEC to be applied for each NAL unit and unequal error protection of NAL units.


When FUs and FEC are used, the incurred overhead is substantial but is of the same order of magnitude as the number of bits that have to be spent for intra-coded macroblocks if no FEC is applied. In [19], it was shown that the FEC-based approach improved the overall quality at the same error rate and the same overall bit rate, including the overhead.

12.6. Low Bit-Rate Streaming

This scheme has been implemented with H.263 and non-standard RTP packetization and has given good results [20]. There is no technical reason why similarly good results could not be achieved with H.264.

In today's Internet streaming, some of the offered bit rates are relatively low in order to allow terminals with dial-up modems to access the content. In wired IP networks, relatively large packets, say 500 - 1500 bytes, are preferred to smaller and more frequently occurring packets in order to reduce network congestion. Moreover, use of large packets decreases the amount of RTP/UDP/IP header overhead. For low bit-rate video, the use of large packets means that sometimes up to a few pictures should be encapsulated in one packet.

However, loss of a packet including many coded pictures would have drastic consequences for visual quality, as there is practically no other way to conceal a loss of an entire picture than to repeat the previous one. One way to construct relatively large packets and maintain possibilities for successful loss concealment is to construct MTAPs that contain interleaved slices from several pictures. An MTAP should not contain spatially adjacent slices from the same picture or spatially overlapping slices from any picture. If a packet is lost, it is likely that a lost slice is surrounded by spatially adjacent slices of the same picture and spatially corresponding slices of the temporally previous and succeeding pictures. Consequently, concealment of the lost slice is likely to be relatively successful.


12.7. Robust Packet Scheduling in Video Streaming

Robust packet scheduling has been implemented with MPEG-4 Part 2 and simulated in a wireless streaming environment [21]. There is no technical reason why similar or better results could not be achieved with H.264.

Streaming clients typically have a receiver buffer that is capable of storing a relatively large amount of data. Initially, when a streaming session is established, a client does not start playing the stream back immediately. Rather, it typically buffers the incoming data for a few seconds. This buffering helps maintain continuous playback, as, in case of occasional increased transmission delays or network throughput drops, the client can decode and play buffered data. Otherwise, without initial buffering, the client has to freeze the display, stop decoding, and wait for incoming data. The buffering is also necessary for either automatic or selective retransmission at any protocol level. If any part of a picture is lost, a retransmission mechanism may be used to resend the lost data. If the retransmitted data is received before its scheduled decoding or playback time, the loss is recovered perfectly.

Coded pictures can be ranked according to their importance in the subjective quality of the decoded sequence. For example, non-reference pictures, such as conventional B pictures, are subjectively least important, as their absence does not affect decoding of any other pictures. In addition to non-reference pictures, the ITU-T H.264 | ISO/IEC 14496-10 standard includes a temporal scalability method called sub-sequences [22]. Subjective ranking can also be made on a coded slice data partition or slice group basis. Coded slices and coded slice data partitions that are subjectively the most important can be sent earlier than their decoding order indicates, whereas coded slices and coded slice data partitions that are subjectively the least important can be sent later than their natural coding order indicates. Consequently, any retransmitted parts of the most important slices and coded slice data partitions are more likely to be received before their scheduled decoding or playback time compared to the least important slices and slice data partitions.

13. Informative Appendix: Rationale for Decoding Order Number
13.1. Introduction

The Decoding Order Number (DON) concept was introduced mainly to enable efficient multi-picture slice interleaving (see section 12.6) and robust packet scheduling (see section 12.7). In both of these applications, NAL units are transmitted out of decoding order. DON indicates the decoding order of NAL units and should be used in the receiver to recover the decoding order. Example use cases for efficient multi-picture slice interleaving and for robust packet scheduling are given in sections 13.2 and 13.3, respectively. Section 13.4 describes the benefits of the DON concept in error resiliency achieved by redundant coded pictures. Section 13.5 summarizes considered alternatives to DON and justifies why DON was chosen for this RTP payload specification.

13.2. Example of Multi-Picture Slice Interleaving

An example of multi-picture slice interleaving follows. A subset of a coded video sequence is depicted below in output order. R denotes a reference picture, N denotes a non-reference picture, and the number indicates a relative output time.


... R1 N2 R3 N4 R5 ...


The decoding order of these pictures from left to right is as follows:


... R1 R3 N2 R5 N4 ...


The NAL units of pictures R1, R3, N2, R5, and N4 are marked with a DON equal to 1, 2, 3, 4, and 5, respectively.


Each reference picture consists of three slice groups that are scattered as follows (a number denotes the slice group number for each macroblock in a QCIF frame):


0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2

0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2
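
The scattered pattern above (a QCIF frame of 11 x 9 macroblocks in raster-scan order) can be reproduced with a few lines of code. The sketch simply assigns slice group numbers cyclically along the raster scan; it is an assumption made for illustration and is not claimed to be one of the slice group map types defined by H.264. The function name is hypothetical.

```python
QCIF_MB_COLS = 176 // 16  # 11 macroblocks per row
QCIF_MB_ROWS = 144 // 16  # 9 macroblock rows


def scattered_slice_group_map(cols, rows, groups=3):
    """Assign slice groups cyclically along the raster scan of macroblocks."""
    return [[(row * cols + col) % groups for col in range(cols)]
            for row in range(rows)]
```

With 11 columns and three slice groups, every macroblock ends up in a different slice group than its left, right, upper, and lower neighbors, which is what makes concealment from adjacent macroblocks possible.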

For the sake of simplicity, we assume that all the macroblocks of a slice group are included in one slice. Three MTAPs are constructed from three consecutive reference pictures so that each MTAP contains three aggregation units, each of which contains all the macroblocks from one slice group. The first MTAP contains slice group 0 of picture R1, slice group 1 of picture R3, and slice group 2 of picture R5. The second MTAP contains slice group 1 of picture R1, slice group 2 of picture R3, and slice group 0 of picture R5. The third MTAP contains slice group 2 of picture R1, slice group 0 of picture R3, and slice group 1 of picture R5. Each non-reference picture is encapsulated into an STAP-B.


Consequently, the transmission order of NAL units is the following:


      R1, slice group 0, DON 1, carried in MTAP,   RTP SN: N
      R3, slice group 1, DON 2, carried in MTAP,   RTP SN: N
      R5, slice group 2, DON 4, carried in MTAP,   RTP SN: N
      R1, slice group 1, DON 1, carried in MTAP,   RTP SN: N+1
      R3, slice group 2, DON 2, carried in MTAP,   RTP SN: N+1
      R5, slice group 0, DON 4, carried in MTAP,   RTP SN: N+1
      R1, slice group 2, DON 1, carried in MTAP,   RTP SN: N+2
      R3, slice group 0, DON 2, carried in MTAP,   RTP SN: N+2
      R5, slice group 1, DON 4, carried in MTAP,   RTP SN: N+2
      N2,                DON 3, carried in STAP-B, RTP SN: N+3
      N4,                DON 5, carried in STAP-B, RTP SN: N+4

The receiver is able to organize the NAL units back in decoding order based on the value of DON associated with each NAL unit.

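
The construction of the three MTAPs and the recovery of decoding order at the receiver can be sketched as follows. The sketch follows the prose description of the example (MTAP m carries slice group (m + i) mod 3 of the i-th reference picture), models RTP sequence numbers as offsets from N, and ignores DON wraparound. All names are hypothetical.

```python
from collections import namedtuple

NalUnit = namedtuple("NalUnit", "picture slice_group don rtp_sn")

# DON values of the example: decoding order is R1 R3 N2 R5 N4.
DON = {"R1": 1, "R3": 2, "N2": 3, "R5": 4, "N4": 5}


def transmission_order(reference_pictures, non_reference_pictures, groups=3):
    """List NAL units in transmission order; rtp_sn is an offset from N."""
    units = []
    sn = 0
    for mtap in range(groups):
        for index, picture in enumerate(reference_pictures):
            group = (mtap + index) % groups
            units.append(NalUnit(picture, group, DON[picture], sn))
        sn += 1
    for picture in non_reference_pictures:  # one STAP-B per picture
        units.append(NalUnit(picture, None, DON[picture], sn))
        sn += 1
    return units


def decoding_order(units):
    """Recover decoding order at the receiver by sorting on DON."""
    return sorted(units, key=lambda unit: unit.don)
```

Sorting the eleven NAL units on DON yields the three slice groups of R1, then those of R3, then N2, the three slice groups of R5, and finally N4.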

If one of the MTAPs is lost, the spatially adjacent and temporally co-located macroblocks are received and can be used to conceal the loss efficiently. If one of the STAPs is lost, the effect of the loss does not propagate temporally.


13.3. Example of Robust Packet Scheduling

An example of robust packet scheduling follows. The communication system used in the example consists of the following components in the order that the video is processed from source to sink:


o camera and capturing
o pre-encoding buffer
o encoder
o encoded picture buffer
o transmitter
o transmission channel
o receiver
o receiver buffer
o decoder
o decoded picture buffer
o display

The video communication system used in the example operates as follows. Note that processing of the video stream happens gradually and at the same time in all components of the system. The source video sequence is shot and captured to a pre-encoding buffer. The pre-encoding buffer can be used to order pictures from sampling order to encoding order or to analyze multiple uncompressed frames for bit rate control purposes, for example. In some cases, the pre-encoding buffer may not exist; instead, the sampled pictures are encoded right away. The encoder encodes pictures from the pre-encoding buffer and stores the output, i.e., coded pictures, in the encoded picture buffer. The transmitter encapsulates the coded pictures from the encoded picture buffer into transmission packets and sends them to a receiver through a transmission channel. The receiver stores the received packets in the receiver buffer. The receiver buffering process typically includes buffering for transmission delay jitter. The receiver buffer can also be used to recover the correct decoding order of coded data. The decoder reads coded data from the receiver buffer and produces decoded pictures as output into the decoded picture buffer. The decoded picture buffer is used to recover the output (or display) order of pictures. Finally, pictures are displayed.

In the following example figures, I denotes an IDR picture, R denotes a reference picture, N denotes a non-reference picture, and the number after I, R, or N indicates the sampling time relative to the previous IDR picture in decoding order. Values below the sequence of pictures indicate scaled system clock timestamps. The system clock is initialized arbitrarily in this example, and time runs from left to right. Each I, R, and N picture is mapped into the same timeline compared to the previous processing step, if any, assuming that encoding, transmission, and decoding take no time. Thus, events happening at the same time are located in the same column throughout all example figures.

A subset of a sequence of coded pictures is depicted below in sampling order.


       ...  N58 N59 I00 N01 N02 R03 N04 N05 R06 ... N58 N59 I00 N01 ...
       ... --|---|---|---|---|---|---|---|---|- ... -|---|---|---|- ...
       ...  58  59  60  61  62  63  64  65  66  ... 128 129 130 131 ...
        

Figure 16. Sequence of pictures in sampling order


The sampled pictures are buffered in the pre-encoding buffer to arrange them in encoding order. In this example, we assume that the non-reference pictures are predicted from both the previous and the next reference picture in output order, except for the non-reference pictures immediately preceding an IDR picture, which are predicted only from the previous reference picture in output order. Thus, the pre-encoding buffer has to contain at least two pictures, and the buffering causes a delay of two picture intervals. The output of the pre-encoding buffering process and the encoding (and decoding) order of the pictures are as follows:


                ... N58 N59 I00 R03 N01 N02 R06 N04 N05 ...
                ... -|---|---|---|---|---|---|---|---|- ...
                ... 60  61  62  63  64  65  66  67  68  ...
        

Figure 17. Re-ordered pictures in the pre-encoding buffer


The encoder or the transmitter can set the value of DON for each picture to the value of DON of the previous picture in decoding order plus one.
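
The reordering from sampling order to encoding order and the DON assignment can be sketched as follows. The sketch assumes the prediction structure of this example: non-reference pictures wait for the next reference picture in output order, except those immediately preceding an IDR picture, which are emitted before it. DON is treated as a 16-bit value that wraps modulo 65536; the function names and the starting DON are hypothetical.

```python
def encoding_order(sampling_order):
    """Reorder picture names such as 'N58', 'I00', 'R03' for encoding."""
    output, pending = [], []
    for picture in sampling_order:
        kind = picture[0]
        if kind == "N":
            pending.append(picture)   # wait for the next reference picture
        elif kind == "R":
            output.append(picture)    # reference picture goes first
            output.extend(pending)
            pending = []
        else:                         # "I": IDR picture
            output.extend(pending)    # earlier pictures cannot refer to it
            pending = []
            output.append(picture)
    output.extend(pending)
    return output


def assign_don(pictures, first_don=0):
    """Give each picture the DON of its predecessor plus one (mod 65536)."""
    return [(picture, (first_don + index) % 65536)
            for index, picture in enumerate(pictures)]
```

Applied to the sampling order N58 N59 I00 N01 N02 R03 N04 N05 R06, encoding_order returns N58 N59 I00 R03 N01 N02 R06 N04 N05.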

For the sake of simplicity, let us assume that:


o the frame rate of the sequence is constant,
o each picture consists of only one slice,
o each slice is encapsulated in a single NAL unit packet,
o there is no transmission delay, and
o pictures are transmitted at constant intervals (that is, 1 / frame rate).

When pictures are transmitted in decoding order, they are received as follows:


                ... N58 N59 I00 R03 N01 N02 R06 N04 N05 ...
                ... -|---|---|---|---|---|---|---|---|- ...
                ... 60  61  62  63  64  65  66  67  68  ...
        

Figure 18. Received pictures in decoding order


The OPTIONAL sprop-interleaving-depth MIME type parameter is set to 0, as the transmission (or reception) order is identical to the decoding order.


The decoder has to buffer for one picture interval initially in its decoded picture buffer to organize pictures from decoding order to output order as depicted below:


                    ... N58 N59 I00 N01 N02 R03 N04 N05 R06 ...
                    ... -|---|---|---|---|---|---|---|---|- ...
                    ... 61  62  63  64  65  66  67  68  69  ...
        

Figure 19. Output order


The amount of required initial buffering in the decoded picture buffer can be signaled in the buffering period SEI message or with the num_reorder_frames syntax element of H.264 video usability information. The syntax element num_reorder_frames indicates the maximum number of frames, complementary field pairs, or non-paired fields that precede any frame, complementary field pair, or non-paired field in the sequence in decoding order and that follow it in output order. For the sake of simplicity, we assume that num_reorder_frames is used to indicate the initial buffering in the decoded picture buffer. In this example, num_reorder_frames is equal to 1.
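
The effect of num_reorder_frames equal to 1 can be illustrated with a small reordering buffer. This is a deliberately simplified model rather than the decoded picture buffer process of H.264: pictures enter in decoding order, and the picture with the smallest output time is released whenever more than num_reorder_frames pictures are waiting. The function name and the absolute output times are assumptions made for the example.

```python
import heapq


def reorder_for_output(decoded, num_reorder_frames):
    """Reorder (name, output_time) pairs from decoding order to output order."""
    waiting, output = [], []
    for name, output_time in decoded:
        heapq.heappush(waiting, (output_time, name))
        if len(waiting) > num_reorder_frames:
            output.append(heapq.heappop(waiting)[1])
    while waiting:  # flush the remaining pictures at the end of the sequence
        output.append(heapq.heappop(waiting)[1])
    return output
```

Feeding the decoding order of this example with output times equal to the sampling positions (N58 at 58, N59 at 59, I00 at 60, R03 at 63, and so on) reproduces the output order N58 N59 I00 N01 N02 R03 N04 N05 R06.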

It can be observed that if the IDR picture I00 is lost during transmission and a retransmission request is issued when the value of the system clock is 62, there is one picture interval of time (until the system clock reaches timestamp 63) to receive the retransmitted IDR picture I00.


Let us then assume that IDR pictures are transmitted two frame intervals earlier than their decoding position; i.e., the pictures are transmitted as follows:


                       ...  I00 N58 N59 R03 N01 N02 R06 N04 N05 ...
                       ... --|---|---|---|---|---|---|---|---|- ...
                       ...  62  63  64  65  66  67  68  69  70  ...
        

Figure 20. Interleaving: Early IDR pictures in sending order


The OPTIONAL sprop-interleaving-depth MIME type parameter is set equal to 1 according to its definition. (The value of sprop-interleaving-depth in this example can be derived as follows: Picture I00 is the only picture preceding picture N58 or N59 in transmission order and following it in decoding order. Except for pictures I00, N58, and N59, the transmission order is the same as the decoding order of pictures. As a coded picture is encapsulated into exactly one NAL unit, the value of sprop-interleaving-depth is equal to the maximum number of pictures preceding any picture in transmission order and following the picture in decoding order.)

The receiver buffering process contains two pictures at a time according to the value of the sprop-interleaving-depth parameter and orders pictures from the reception order to the correct decoding order based on the value of DON associated with each picture. The output of the receiver buffering process is as follows:
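
The reordering step can be sketched as follows. This is a simplified illustration rather than the complete de-interleaving process: the DON values are hypothetical (the figures do not list them) and are simply assigned in decoding order, DON wrap-around is ignored, and the function name deinterleave is an assumption of this example.

```python
import heapq


def deinterleave(received, depth):
    """Yield picture names in DON order.

    received: iterable of (don, name) pairs in reception order.
    depth:    value of sprop-interleaving-depth; a picture is released
              once the buffer holds more than `depth` pictures.
    """
    buffer = []
    for don, name in received:
        heapq.heappush(buffer, (don, name))
        if len(buffer) > depth:
            # Release the picture with the smallest DON.
            yield heapq.heappop(buffer)[1]
    while buffer:  # flush the remaining pictures at the end of the stream
        yield heapq.heappop(buffer)[1]


# Reception order of Figure 20; DON values are assumed for illustration.
received = [(2, "I00"), (0, "N58"), (1, "N59"), (3, "R03"), (4, "N01"),
            (5, "N02"), (6, "R06"), (7, "N04"), (8, "N05")]

print(list(deinterleave(received, depth=1)))
```

With depth equal to 1, the buffer holds two pictures at the moment a picture is released, and the output order matches Figure 21.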

                            ... N58 N59 I00 R03 N01 N02 R06 N04 N05 ...
                            ... -|---|---|---|---|---|---|---|---|- ...
                            ... 63  64  65  66  67  68  69  70  71  ...
        

Figure 21. Interleaving: Receiver buffer

Again, an initial buffering delay of one picture interval is needed to organize pictures from decoding order to output order, as depicted below:

                                ... N58 N59 I00 N01 N02 R03 N04 N05 ...
                                ... -|---|---|---|---|---|---|---|- ...
                                ... 64  65  66  67  68  69  70  71  ...
        

Figure 22. Interleaving: Receiver buffer after reordering

Note that the maximum delay that IDR pictures can undergo during transmission, including possible application, transport, or link layer retransmission, is equal to three picture intervals. Thus, the loss resiliency of IDR pictures is improved in systems supporting retransmission compared to the case in which pictures were transmitted in their decoding order.
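
The three-interval figure follows directly from the clock values in Figures 20 and 21. The short Python sketch below reproduces that arithmetic; the helper name delay_budget and the dictionary layout are assumptions made for illustration only.

```python
# Clock values read from Figure 20 (sending order) and Figure 21
# (output of the receiver buffering process).
pictures_sent = ["I00", "N58", "N59", "R03", "N01", "N02", "R06", "N04", "N05"]
pictures_out = ["N58", "N59", "I00", "R03", "N01", "N02", "R06", "N04", "N05"]

send_time = {pic: 62 + i for i, pic in enumerate(pictures_sent)}
release_time = {pic: 63 + i for i, pic in enumerate(pictures_out)}


def delay_budget(pic):
    """Picture intervals between sending a picture and its release
    from the receiver buffer toward the decoder."""
    return release_time[pic] - send_time[pic]


print(delay_budget("I00"))  # 3
```

I00 is sent at clock value 62 and leaves the receiver buffer at 65, so any retransmission of I00 has three picture intervals in which to arrive, compared with the single interval available when pictures are sent in decoding order.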

13.4. Robust Transmission Scheduling of Redundant Coded Slices

A redundant coded picture is a coded representation of a picture or a part of a picture that is not used in the decoding process if the corresponding primary coded picture is correctly decoded. There should be no noticeable difference between any area of the decoded primary picture and a corresponding area that would result from application of the H.264 decoding process for any redundant picture in the same access unit. A redundant coded slice is a coded slice that is a part of a redundant coded picture.

Redundant coded pictures can be used to provide unequal error protection in error-prone video transmission. If a primary coded representation of a picture is decoded incorrectly, a corresponding redundant coded picture can be decoded. Examples of applications and coding techniques using the redundant coded picture feature include video redundancy coding [23] and the protection of "key pictures" in multicast streaming [24].

One property of many error-prone video communications systems is that transmission errors are often bursty. Therefore, they may affect more than one consecutive transmission packet in transmission order. In low bit-rate video communication, it is relatively common that an entire coded picture can be encapsulated into one transmission packet. Consequently, a primary coded picture and the corresponding redundant coded pictures may be transmitted in consecutive packets in transmission order. To make the transmission scheme more tolerant of bursty transmission errors, it is beneficial to transmit the primary coded picture and redundant coded picture separated by more than a single packet. The DON concept enables this.
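
One possible schedule is sketched below in Python. It is a hypothetical illustration, not a scheme defined by this memo: the function name schedule, the gap parameter, the labels Pn/Rn (primary and redundant coded picture n), and the DON numbering are all assumptions. Each redundant picture is given the DON that immediately follows its primary picture, but is sent "gap" pictures later.

```python
def schedule(num_pictures, gap):
    """Return the transmission order as a list of (don, label) pairs.

    Primary picture n gets DON 2n, its redundant picture gets DON 2n+1,
    and the redundant picture is delayed by `gap` sending steps.
    """
    order = []
    for step in range(num_pictures + gap):
        if step < num_pictures:
            order.append((2 * step, f"P{step}"))
        delayed = step - gap
        if delayed >= 0:
            order.append((2 * delayed + 1, f"R{delayed}"))
    return order


sent = schedule(num_pictures=4, gap=2)
print([label for _, label in sent])
# ['P0', 'P1', 'P2', 'R0', 'P3', 'R1', 'R2', 'R3']

# The receiver restores decoding order by sorting on DON.
print([label for _, label in sorted(sent)])
# ['P0', 'R0', 'P1', 'R1', 'P2', 'R2', 'P3', 'R3']
```

In this example, every primary picture and its redundant copy are at least three packets apart in transmission order, so a burst that destroys two consecutive packets cannot remove both representations of the same picture.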

13.5. Remarks on Other Design Possibilities

The slice header syntax structure of the H.264 coding standard contains the frame_num syntax element that can indicate the decoding order of coded frames. However, using the frame_num syntax element to recover the decoding order is neither feasible nor desirable, for the following reasons:

o The receiver is required to parse at least one slice header per coded picture (before passing the coded data to the decoder).

o Coded slices from multiple coded video sequences cannot be interleaved, as the frame number syntax element is reset to 0 in each IDR picture.

o The coded fields of a complementary field pair share the same value of the frame_num syntax element. Thus, the decoding order of the coded fields of a complementary field pair cannot be recovered based on the frame_num syntax element or any other syntax element of the H.264 coding syntax.

The RTP payload format for transport of MPEG-4 elementary streams [25] enables interleaving of access units and transmission of multiple access units in the same RTP packet. An access unit is specified in the H.264 coding standard to comprise all NAL units associated with a primary coded picture according to subclause 7.4.1.2 of [1]. Consequently, slices of different pictures cannot be interleaved, and the multi-picture slice interleaving technique (see section 12.6) for improved error resilience cannot be used.

14. Acknowledgements

The authors thank Roni Even, Dave Lindbergh, Philippe Gentric, Gonzalo Camarillo, Gary Sullivan, Joerg Ott, and Colin Perkins for careful review.

15. References
15.1. Normative References

[1] ITU-T Recommendation H.264, "Advanced video coding for generic audiovisual services", May 2003.

[2] ISO/IEC International Standard 14496-10:2003.

[3] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[4] Schulzrinne, H., Casner, S., Frederick, R., and V. Jacobson, "RTP: A Transport Protocol for Real-Time Applications", STD 64, RFC 3550, July 2003.

[5] Handley, M. and V. Jacobson, "SDP: Session Description Protocol", RFC 2327, April 1998.

[6] Josefsson, S., "The Base16, Base32, and Base64 Data Encodings", RFC 3548, July 2003.

[7] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model with Session Description Protocol (SDP)", RFC 3264, June 2002.

15.2. Informative References

[8] "Draft ITU-T Recommendation and Final Draft International Standard of Joint Video Specification (ITU-T Rec. H.264 | ISO/IEC 14496-10 AVC)", available from http://ftp3.itu.int/av-arch/jvt-site/2003_03_Pattaya/JVT-G050r1.zip, May 2003.

[9] Luthra, A., Sullivan, G.J., and T. Wiegand (eds.), Special Issue on H.264/AVC. IEEE Transactions on Circuits and Systems for Video Technology, July 2003.

[10] Bormann, C., Cline, L., Deisher, G., Gardos, T., Maciocco, C., Newell, D., Ott, J., Sullivan, G., Wenger, S., and C. Zhu, "RTP Payload Format for the 1998 Version of ITU-T Rec. H.263 Video (H.263+)", RFC 2429, October 1998.

[11] ISO/IEC IS 14496-2.

[12] Wenger, S., "H.26L over IP", IEEE Transactions on Circuits and Systems for Video Technology, Vol. 13, No. 7, July 2003.

[13] Wenger, S., "H.26L over IP: The IP Network Adaptation Layer", Proceedings Packet Video Workshop 02, April 2002.

[14] Stockhammer, T., Hannuksela, M.M., and S. Wenger, "H.26L/JVT Coding Network Abstraction Layer and IP-based Transport" in Proc. ICIP 2002, Rochester, NY, September 2002.

[15] ITU-T Recommendation H.241, "Extended video procedures and control signals for H.300 series terminals", 2004.

[16] Schulzrinne, H. and S. Casner, "RTP Profile for Audio and Video Conferences with Minimal Control", STD 65, RFC 3551, July 2003.

[17] ITU-T Recommendation H.223, "Multiplexing protocol for low bit rate multimedia communication", July 2001.

[18] Rosenberg, J. and H. Schulzrinne, "An RTP Payload Format for Generic Forward Error Correction", RFC 2733, December 1999.

[19] Stockhammer, T., Wiegand, T., Oelbaum, T., and F. Obermeier, "Video Coding and Transport Layer Techniques for H.264/AVC-Based Transmission over Packet-Lossy Networks", IEEE International Conference on Image Processing (ICIP 2003), Barcelona, Spain, September 2003.

[20] Varsa, V. and M. Karczewicz, "Slice interleaving in compressed video packetization", Packet Video Workshop 2000.

[21] Kang, S.H. and A. Zakhor, "Packet scheduling algorithm for wireless video streaming," International Packet Video Workshop 2002.

[22] Hannuksela, M.M., "Enhanced concept of GOP", JVT-B042, available from http://ftp3.itu.int/av-arch/video-site/0201_Gen/JVT-B042.doc, January 2002.

[23] Wenger, S., "Video Redundancy Coding in H.263+", 1997 International Workshop on Audio-Visual Services over Packet Networks, September 1997.

[24] Wang, Y.-K., Hannuksela, M.M., and M. Gabbouj, "Error Resilient Video Coding Using Unequally Protected Key Pictures", in Proc. International Workshop VLBV03, September 2003.

[25] van der Meer, J., Mackie, D., Swaminathan, V., Singer, D., and P. Gentric, "RTP Payload Format for Transport of MPEG-4 Elementary Streams", RFC 3640, November 2003.

[26] Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K. Norrman, "The Secure Real-time Transport Protocol (SRTP)", RFC 3711, March 2004.

[27] Schulzrinne, H., Rao, A., and R. Lanphier, "Real Time Streaming Protocol (RTSP)", RFC 2326, April 1998.

[28] Handley, M., Perkins, C., and E. Whelan, "Session Announcement Protocol", RFC 2974, October 2000.

[29] ISO/IEC 14496-15: "Information technology - Coding of audio-visual objects - Part 15: Advanced Video Coding (AVC) file format".

[30] Castagno, R. and D. Singer, "MIME Type Registrations for 3rd Generation Partnership Project (3GPP) Multimedia files", RFC 3839, July 2004.

Authors' Addresses

Stephan Wenger
TU Berlin / Teles AG
Franklinstr. 28-29
D-10587 Berlin
Germany

   Phone: +49-172-300-0813
   EMail: stewe@stewe.org
        

Miska M. Hannuksela
Nokia Corporation
P.O. Box 100
33721 Tampere
Finland

   Phone: +358-7180-73151
   EMail: miska.hannuksela@nokia.com
        

Thomas Stockhammer
Nomor Research
D-83346 Bergen
Germany

   Phone: +49-8662-419407
   EMail: stockhammer@nomor.de
        

Magnus Westerlund
Multimedia Technologies
Ericsson Research EAB/TVA/A
Ericsson AB
Torshamsgatan 23
SE-164 80 Stockholm
Sweden

   Phone: +46-8-7190000
   EMail: magnus.westerlund@ericsson.com
        

David Singer
QuickTime Engineering
Apple
1 Infinite Loop MS 302-3MT
Cupertino
CA 95014
USA

   Phone: +1 408 974-3162
   EMail: singer@apple.com

Full Copyright Statement

Copyright (C) The Internet Society (2005).

This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights.

This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the IETF's procedures with respect to rights in IETF Documents can be found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf-ipr@ietf.org.

Acknowledgement

Funding for the RFC Editor function is currently provided by the Internet Society.
