Internet Engineering Task Force (IETF)                         S. Wenger
Request for Comments: 6190                                   Independent
Category: Standards Track                                     Y.-K. Wang
ISSN: 2070-1721                                      Huawei Technologies
                                                              T. Schierl
                                                          Fraunhofer HHI
                                                        A. Eleftheriadis
                                                                   Vidyo
                                                                May 2011
RTP Payload Format for Scalable Video Coding
Abstract
This memo describes an RTP payload format for Scalable Video Coding (SVC) as defined in Annex G of ITU-T Recommendation H.264, which is technically identical to Amendment 3 of ISO/IEC International Standard 14496-10. The RTP payload format allows for packetization of one or more Network Abstraction Layer (NAL) units in each RTP packet payload, as well as fragmentation of a NAL unit in multiple RTP packets. Furthermore, it supports transmission of an SVC stream over a single as well as multiple RTP sessions. The payload format defines a new media subtype name "H264-SVC", but is still backward compatible to RFC 6184 since the base layer, when encapsulated in its own RTP stream, must use the H.264 media subtype name ("H264") and the packetization method specified in RFC 6184. The payload format has wide applicability in videoconferencing, Internet video streaming, and high-bitrate entertainment-quality video, among others.
Status of This Memo
This is an Internet Standards Track document.
This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Further information on Internet Standards is available in Section 2 of RFC 5741.
Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at http://www.rfc-editor.org/info/rfc6190.
Copyright Notice
Copyright (c) 2011 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
This document may contain material from IETF Documents or IETF Contributions published or made publicly available before November 10, 2008. The person(s) controlling the copyright in some of this material may not have granted the IETF Trust the right to allow modifications of such material outside the IETF Standards Process. Without obtaining an adequate license from the person(s) controlling the copyright in such materials, this document may not be modified outside the IETF Standards Process, and derivative works of it may not be created outside the IETF Standards Process, except to format it for publication as an RFC or to translate it into languages other than English.
Table of Contents
   1. Introduction
      1.1. The SVC Codec
         1.1.1. Overview
         1.1.2. Parameter Sets
         1.1.3. NAL Unit Header
      1.2. Overview of the Payload Format
         1.2.1. Design Principles
         1.2.2. Transmission Modes and Packetization Modes
         1.2.3. New Payload Structures
   2. Conventions
   3. Definitions and Abbreviations
      3.1. Definitions
         3.1.1. Definitions from the SVC Specification
         3.1.2. Definitions Specific to This Memo
      3.2. Abbreviations
   4. RTP Payload Format
      4.1. RTP Header Usage
      4.2. NAL Unit Extension and Header Usage
         4.2.1. NAL Unit Extension
         4.2.2. NAL Unit Header Usage
      4.3. Payload Structures
      4.4. Transmission Modes
      4.5. Packetization Modes
         4.5.1. Packetization Modes for Single-Session Transmission
         4.5.2. Packetization Modes for Multi-Session Transmission
      4.6. Single NAL Unit Packets
      4.7. Aggregation Packets
         4.7.1. Non-Interleaved Multi-Time Aggregation Packets (NI-MTAPs)
      4.8. Fragmentation Units (FUs)
      4.9. Payload Content Scalability Information (PACSI) NAL Unit
      4.10. Empty NAL unit
      4.11. Decoding Order Number (DON)
         4.11.1. Cross-Session DON (CS-DON) for Multi-Session Transmission
   5. Packetization Rules
      5.1. Packetization Rules for Single-Session Transmission
      5.2. Packetization Rules for Multi-Session Transmission
         5.2.1. NI-T/NI-TC Packetization Rules
         5.2.2. NI-C/NI-TC Packetization Rules
         5.2.3. I-C Packetization Rules
         5.2.4. Packetization Rules for Non-VCL NAL Units
         5.2.5. Packetization Rules for Prefix NAL Units
   6. De-Packetization Process
      6.1. De-Packetization Process for Single-Session Transmission
      6.2. De-Packetization Process for Multi-Session Transmission
         6.2.1. Decoding Order Recovery for the NI-T and NI-TC Modes
            6.2.1.1. Informative Algorithm for NI-T Decoding Order
                     Recovery within an Access Unit
         6.2.2. Decoding Order Recovery for the NI-C, NI-TC, and I-C Modes
   7. Payload Format Parameters
      7.1. Media Type Registration
      7.2. SDP Parameters
         7.2.1. Mapping of Payload Type Parameters to SDP
         7.2.2. Usage with the SDP Offer/Answer Model
         7.2.3. Dependency Signaling in Multi-Session Transmission
         7.2.4. Usage in Declarative Session Descriptions
      7.3. Examples
         7.3.1. Example for Offering a Single SVC Session
         7.3.2. Example for Offering a Single SVC Session Using
                scalable-layer-id
         7.3.3. Example for Offering Multiple Sessions in MST
         7.3.4. Example for Offering Multiple Sessions in MST Including
                Operation with Answerer Using scalable-layer-id
         7.3.5. Example for Negotiating an SVC Stream with a Constrained
                Base Layer in SST
      7.4. Parameter Set Considerations
   8. Security Considerations
   9. Congestion Control
   10. IANA Considerations
   11. Informative Appendix: Application Examples
      11.1. Introduction
      11.2. Layered Multicast
      11.3. Streaming
      11.4. Videoconferencing (Unicast to MANE, Unicast to Endpoints)
      11.5. Mobile TV (Multicast to MANE, Unicast to Endpoint)
   12. Acknowledgements
   13. References
      13.1. Normative References
      13.2. Informative References
This memo specifies an RTP [RFC3550] payload format for the Scalable Video Coding (SVC) extension of the H.264/AVC video coding standard. SVC is specified in Amendment 3 to ISO/IEC 14496 Part 10 [ISO/IEC14496-10] and equivalently in Annex G of ITU-T Rec. H.264 [H.264]. In this memo, unless explicitly stated otherwise, "H.264/AVC" refers to the specification of [H.264] excluding Annex G.
SVC covers the entire application range of H.264/AVC, from low-bitrate mobile applications, to High-Definition Television (HDTV) broadcasting, and even Digital Cinema that requires nearly lossless coding and hundreds of megabits per second. The scalability features that SVC adds to H.264/AVC enable several system-level functionalities related to the ability of a system to adapt the signal to different system conditions with no or minimal processing. The adaptation relates both to the capabilities of potentially heterogeneous receivers (differing in screen resolution, processing speed, etc.), and to differing or time-varying network conditions. The adaptation can be performed at the source, the destination, or in intermediate media-aware network elements (MANEs). The payload format specified in this memo exposes these system-level functionalities so that system designers can take direct advantage of these features.
Informative note: Since SVC streams contain, by design, a sub-stream that is compliant with H.264/AVC, it is trivial for a MANE to filter the stream so that all SVC-specific information is removed. This memo, in fact, defines a media type parameter (sprop-avc-ready, Section 7.2) that indicates whether or not the stream can be converted to one compliant with [RFC6184] by eliminating RTP packets, and rewriting RTP Control Protocol (RTCP) to match the changes to the RTP packet stream as specified in Section 7 of [RFC3550].
This memo defines two basic modes for transmission of SVC data, single-session transmission (SST) and multi-session transmission (MST). In SST, a single RTP session is used for the transmission of all scalability layers comprising an SVC bitstream; in MST, the scalability layers are transported on different RTP sessions. In SST, packetization is a straightforward extension of [RFC6184]. For MST, four different modes are defined in this memo. They differ on whether or not they allow interleaving, i.e., transmitting Network Abstraction Layer (NAL) units in an order different than the decoding order, and by the technique used to effect inter-session NAL unit decoding order recovery. Decoding order recovery is performed using either inter-session timestamp alignment [RFC3550] or cross-session decoding order numbers (CS-DONs). One of the MST modes supports both decoding order recovery techniques, so that receivers can select their preferred technique. More details can be found in Section 1.2.2.
This memo further defines three new NAL unit types. The first type is the payload content scalability information (PACSI) NAL unit, which is used to provide an informative summary of the scalability information of the data contained in an RTP packet, as well as ancillary data (e.g., CS-DON values). The second and third new NAL unit types are the empty NAL unit and the non-interleaved multi-time aggregation packet (NI-MTAP) NAL unit. The empty NAL unit is used to ensure inter-session timestamp alignment required for decoding order recovery in MST. The NI-MTAP is used as a new payload structure allowing the grouping of NAL units of different time instances in decoding order. More details about the new packet structures can be found in Section 1.2.3.
This memo also defines the signaling support for SVC transport over RTP, including a new media subtype name (H264-SVC).
A non-normative overview of the SVC codec and the payload is given in the remainder of this section.
SVC defines a coded video representation in which a given bitstream offers representations of the source material at different levels of fidelity (hence the term "scalable"). Scalable video coding bitstreams, or scalable bitstreams, are constructed in a pyramidal fashion: the coding process creates bitstream components that improve the fidelity of hierarchically lower components.
The fidelity dimensions offered by SVC are spatial (picture size), quality (or Signal-to-Noise Ratio (SNR)), and temporal (pictures per second). Bitstream components associated with a given level of spatial, quality, and temporal fidelity are identified using corresponding parameters in the bitstream: dependency_id, quality_id, and temporal_id (see also Section 1.1.3). The fidelity identifiers have integer values, where higher values designate components that are higher in the hierarchy. It is noted that SVC offers significant flexibility in terms of how an encoder may choose to structure the dependencies between the various components. Decoding of a particular component requires the availability of all the components it depends upon, either directly, or indirectly. An operation point of an SVC bitstream consists of the bitstream components required to be able to decode a particular dependency_id, quality_id, and temporal_id combination.
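As a rough illustration of how an operation point could be selected from the three fidelity identifiers, the following Python sketch keeps every component whose identifiers do not exceed a target combination. The threshold rule, the LayerIds type, and the function names are assumptions made for illustration only. Because an encoder is free to structure dependencies in many ways, this rule is best read as keeping a conservative superset of the required components; the normative extraction process is the one in [H.264].

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LayerIds:
    """Fidelity identifiers of one bitstream component (hypothetical type)."""
    dependency_id: int  # spatial dimension
    quality_id: int     # quality (SNR) dimension
    temporal_id: int    # temporal dimension


def in_operation_point(unit: LayerIds, target: LayerIds) -> bool:
    """Assumed threshold rule: keep components at or below the target."""
    if unit.temporal_id > target.temporal_id:
        return False
    if unit.dependency_id > target.dependency_id:
        return False
    # Quality is only capped inside the target dependency layer; lower
    # dependency layers are kept whole, since they may be depended upon.
    if (unit.dependency_id == target.dependency_id
            and unit.quality_id > target.quality_id):
        return False
    return True


def select_operation_point(units: List[LayerIds],
                           target: LayerIds) -> List[LayerIds]:
    """Return the components retained for the target combination."""
    return [u for u in units if in_operation_point(u, target)]
```

A caller would pass the identifiers parsed from each NAL unit header together with the desired target, for example LayerIds(1, 0, 1) for the first spatial enhancement at the second temporal level.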
The term "layer" is used in various contexts in this memo. For example, in the terms "Video Coding Layer" and "Network Abstraction Layer" it refers to conceptual organization levels. When referring to bitstream syntax elements such as block layer or macroblock layer, it refers to hierarchical bitstream structure levels. When used in the context of bitstream scalability, e.g., "AVC base layer", it refers to a level of representation fidelity of the source signal with a specific set of NAL units included. The correct interpretation is supported by providing the appropriate context.
SVC maintains the bitstream organization introduced in H.264/AVC. Specifically, all bitstream components are encapsulated in Network Abstraction Layer (NAL) units, which are organized as Access Units (AUs). An AU is associated with a single sampling instance in time. A subset of the NAL unit types correspond to the Video Coding Layer (VCL), and contain the coded picture data associated with the source content. Non-VCL NAL units carry ancillary data that may be necessary for decoding (e.g., parameter sets as explained below) or that facilitate certain system operations but are not needed by the decoding process itself. Coded picture data at the various fidelity dimensions are organized in slices. Within one AU, a coded picture of an operation point consists of all the coded slices required for decoding up to the particular combination of dependency_id and quality_id values at the time instance corresponding to the AU.
It is noted that the concept of temporal scalability is already present in H.264/AVC, as profiles defined in Annex A of [H.264] already support it. Specifically, in H.264/AVC, the concept of sub-sequences has been introduced to allow optional use of temporal layers through Supplemental Enhancement Information (SEI) messages. SVC extends this approach by exposing the temporal scalability information using the temporal_id parameter, alongside (and unified with) the dependency_id and quality_id values that are used for spatial and quality scalability, respectively. For coded picture data defined in Annex G of [H.264], this is accomplished by using a new type of NAL unit, namely, coded slice in scalable extension NAL unit (type 20), where the fidelity parameters are part of its header. For coded picture data that follow H.264/AVC, and to ensure compatibility with existing H.264/AVC decoders, another new type of NAL unit, namely, prefix NAL unit (type 14), has been defined to carry this header information. SVC additionally specifies a third new type of NAL unit, namely, subset sequence parameter set NAL unit (type 15), to contain sequence parameter set information for quality and spatial enhancement layers. All these three newly specified NAL unit types (14, 15, and 20) are among those reserved in H.264/AVC and are to be ignored by decoders conforming to one or more of the profiles specified in Annex A of [H.264].
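The effect of a decoder ignoring the three SVC-specific NAL unit types can be sketched as a simple filter. The Python fragment below is only an illustration: it assumes each NAL unit is available as a bytes object whose first octet is the NAL unit header, with the type in the five least significant bits, and the names SVC_NAL_UNIT_TYPES and drop_svc_nal_units are hypothetical. It operates on NAL units, not on RTP packets, so it is not the RTP-level conversion associated with sprop-avc-ready.

```python
from typing import Dict, List

# NAL unit types newly specified by SVC (reserved in H.264/AVC).
SVC_NAL_UNIT_TYPES: Dict[int, str] = {
    14: "prefix NAL unit",
    15: "subset sequence parameter set",
    20: "coded slice in scalable extension",
}


def nal_unit_type(nal_unit: bytes) -> int:
    """Return the 5-bit type from the first octet of a NAL unit."""
    if not nal_unit:
        raise ValueError("empty NAL unit")
    return nal_unit[0] & 0x1F


def drop_svc_nal_units(nal_units: List[bytes]) -> List[bytes]:
    """Keep only the NAL units whose type is not 14, 15, or 20."""
    return [n for n in nal_units
            if nal_unit_type(n) not in SVC_NAL_UNIT_TYPES]
```

What remains after the filter corresponds to what a decoder conforming to the Annex A profiles would actually process.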
Within an AU, the VCL NAL units associated with a given dependency_id and quality_id are referred to as a "layer representation". The layer representation corresponding to the lowest values of dependency_id and quality_id (i.e., zero for both) is compliant by design to H.264/AVC. The set of VCL and associated non-VCL NAL units across all AUs in a bitstream associated with a particular combination of values of dependency_id and quality_id, and regardless of the value of temporal_id, is conceptually a scalable layer. For backward compatibility with H.264/AVC, it is important to differentiate, however, whether or not SVC-specific NAL units are present in a given bitstream. This is particularly important for the lowest fidelity values in terms of dependency_id and quality_id (zero for both), as the corresponding VCL data are compliant with H.264/AVC, and may or may not be accompanied by associated prefix NAL units. This memo therefore uses the term "AVC base layer" to designate the layer that does not contain SVC-specific NAL units, and "SVC base layer" to designate the same layer but with the addition of the associated SVC prefix NAL units. Note that the SVC specification uses the term "base layer" for what in this memo will be referred to as "AVC base layer". Similarly, it is also important to be able to differentiate, within a layer, the temporal fidelity components it contains. This memo uses the term "T0" to indicate, within a particular layer, the subset that contains the NAL units associated with temporal_id equal to 0.
SNR scalability in SVC is offered in two different ways. In what is called coarse-grain scalability (CGS), scalability is provided by including or excluding a complete layer when decoding a particular bitstream. In contrast, in medium-grain scalability (MGS), scalability is provided by selectively omitting the decoding of specific NAL units belonging to MGS layers. The selection of the NAL units to omit can be based on fixed-length fields present in the NAL unit header (see also Sections 1.1.3 and 4.2).
SVC maintains the parameter sets concept in H.264/AVC and introduces a new type of sequence parameter set, referred to as the subset sequence parameter set [H.264]. Subset sequence parameter sets have NAL unit type equal to 15, which is different from the NAL unit type value (7) of sequence parameter sets. VCL NAL units of NAL unit type 1 to 5 must only (indirectly) refer to sequence parameter sets, while VCL NAL units of NAL unit type 20 must only (indirectly) refer to subset sequence parameter sets. The references are indirect because VCL NAL units refer to picture parameter sets (in their slice header), which in turn refer to regular or subset sequence parameter sets. Subset sequence parameter sets use an identifier value space that is separate from that of sequence parameter sets.
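The reference rule and the separate identifier spaces can be illustrated with a small lookup structure. This Python sketch is an assumption-laden illustration rather than decoder logic: ParameterSetStore and its methods are hypothetical names, the payload is treated as opaque bytes, and the sequence parameter set identifier is assumed to have already been resolved through the picture parameter set named in the slice header.

```python
SPS_NAL_TYPE = 7          # sequence parameter set
SUBSET_SPS_NAL_TYPE = 15  # subset sequence parameter set


def sps_type_for_vcl(vcl_nal_unit_type: int) -> int:
    """Return the parameter set NAL type a VCL NAL unit may refer to."""
    if 1 <= vcl_nal_unit_type <= 5:
        return SPS_NAL_TYPE
    if vcl_nal_unit_type == 20:
        return SUBSET_SPS_NAL_TYPE
    raise ValueError("not a VCL NAL unit type covered by this sketch")


class ParameterSetStore:
    """Keeps both kinds of parameter sets in separate identifier spaces."""

    def __init__(self):
        # Keyed by (NAL unit type, identifier), so identifier 0 of type 7
        # and identifier 0 of type 15 never collide.
        self._sets = {}

    def put(self, nal_unit_type: int, set_id: int, payload: bytes) -> None:
        if nal_unit_type not in (SPS_NAL_TYPE, SUBSET_SPS_NAL_TYPE):
            raise ValueError("not a (subset) sequence parameter set type")
        self._sets[(nal_unit_type, set_id)] = payload

    def lookup_for_vcl(self, vcl_nal_unit_type: int, set_id: int) -> bytes:
        # set_id is assumed to come from the referenced picture parameter set.
        return self._sets[(sps_type_for_vcl(vcl_nal_unit_type), set_id)]
```

Keying the store on the pair of type and identifier is one simple way to model the two identifier spaces; two separate dictionaries would work equally well.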
In SVC, coded picture data from different layers may use the same or different sequence and picture parameter sets. Let the variable DQId be equal to dependency_id * 16 + quality_id. At any time instant during the decoding process there is one active sequence parameter set for the layer representation with the highest value of DQId and one or more active layer SVC sequence parameter set(s) for layer representations with lower values of DQId. The active sequence parameter set or an active layer SVC sequence parameter set remains unchanged throughout a coded video sequence in the scalable layer in which the active sequence parameter set or active layer SVC sequence parameter set is referred to. This means that the referred sequence parameter set or subset sequence parameter set can only change at instantaneous decoding refresh (IDR) access units for any layer. At any time instant during the decoding process there may be one active picture parameter set (for the layer representation with the highest value of DQId) and one or more active layer picture parameter set(s) (for layer representations with lower values of DQId). The active picture parameter set or an active layer picture parameter set remains unchanged throughout a layer representation in which the active picture parameter set or active layer picture parameter set is referred to, but may change from one AU to the next.
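The DQId variable defined above is a simple arithmetic combination of two identifiers, which the following Python sketch works through. The range checks assume the field widths used for these identifiers in the SVC NAL unit header extension (3 bits for dependency_id, 4 bits for quality_id); the function names are illustrative and not taken from any specification.

```python
from typing import Iterable, Tuple


def dq_id(dependency_id: int, quality_id: int) -> int:
    """Compute DQId = dependency_id * 16 + quality_id."""
    if not 0 <= dependency_id <= 7:
        raise ValueError("dependency_id assumed to be a 3-bit value")
    if not 0 <= quality_id <= 15:
        raise ValueError("quality_id assumed to be a 4-bit value")
    return dependency_id * 16 + quality_id


def split_dq_id(value: int) -> Tuple[int, int]:
    """Recover (dependency_id, quality_id) from a DQId value."""
    return divmod(value, 16)


def highest_layer_representation(
        pairs: Iterable[Tuple[int, int]]) -> Tuple[int, int]:
    """Pick the (dependency_id, quality_id) pair with the highest DQId."""
    return max(pairs, key=lambda p: dq_id(p[0], p[1]))
```

For instance, dependency_id 2 with quality_id 3 gives DQId 2 * 16 + 3 = 35, and the layer representation selected by highest_layer_representation is the one whose sequence parameter set is the active one in the sense described above.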
SVC extends the one-byte H.264/AVC NAL unit header by three additional octets for NAL units of types 14 and 20. The header indicates the type of the NAL unit, the (potential) presence of bit errors or syntax violations in the NAL unit payload, information regarding the relative importance of the NAL unit for the decoding process, the layer identification information, and other fields as discussed below.
The syntax and semantics of the NAL unit header are specified in [H.264], but the essential properties of the NAL unit header are summarized below for convenience.
NAL单元头的语法和语义在[H.264]中有规定,但为了方便起见,NAL单元头的基本属性总结如下。
The first byte of the NAL unit header has the following format (the bit fields are the same as defined for the one-byte H.264/AVC NAL unit header, while the semantics of some fields have changed slightly, in a backward-compatible way):
NAL单元头的第一个字节具有以下格式(位字段与为单字节H.264/AVC NAL单元头定义的相同,而某些字段的语义以向后兼容的方式略有变化):
      +---------------+
      |0|1|2|3|4|5|6|7|
      +-+-+-+-+-+-+-+-+
      |F|NRI|  Type   |
      +---------------+
The semantics of the components of the NAL unit type octet, as specified in [H.264], are described briefly below. In addition to the name and size of each field, the corresponding syntax element name in [H.264] is also provided.
下文简要描述了[H.264]中规定的NAL单元类型八位字节的组件语义。除了每个字段的名称和大小外,[H.264]中还提供了相应的语法元素名称。
F: 1 bit forbidden_zero_bit. H.264/AVC declares a value of 1 as a syntax violation.
F:1位禁止\u零位\u位。H.264/AVC将值1声明为语法冲突。
NRI: 2 bits nal_ref_idc. A value of "00" (in binary form) indicates that the content of the NAL unit is not used to reconstruct reference pictures for future prediction. Such NAL units can be discarded without risking the integrity of the reference pictures in the same layer. A value greater than "00" indicates that the decoding of the NAL unit is required to maintain the integrity of reference pictures in the same layer or that the NAL unit contains parameter sets.
NRI:2位nal\U ref\U idc。值“00”(二进制形式)表示NAL单元的内容不用于重构参考图片以供将来预测。这样的NAL单元可以被丢弃,而不会危及同一层中参考图片的完整性。大于“00”的值表示需要解码NAL单元以保持同一层中参考图片的完整性,或者NAL单元包含参数集。
Type: 5 bits nal_unit_type. This component specifies the NAL unit type as defined in Table 7-1 of [H.264], and later within this memo. For a reference of all currently defined NAL unit types and their semantics, please refer to Section 7.4.1 in [H.264].
类型:5位nal\U单元\U类型。该组件指定了[H.264]表7-1中定义的NAL单元类型,以及本备忘录后面的内容。有关所有当前定义的NAL单元类型及其语义的参考,请参考[H.264]中的第7.4.1节。
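As an illustration of the first-octet layout described above, the following sketch splits one octet into the F, NRI, and Type fields. The class and function names are hypothetical and are not taken from [H.264] or [RFC6184].

```python
from typing import NamedTuple


class NalHeaderOctet(NamedTuple):
    f: int              # forbidden_zero_bit (1 bit)
    nri: int            # nal_ref_idc (2 bits)
    nal_unit_type: int  # nal_unit_type (5 bits)


def parse_first_octet(octet: int) -> NalHeaderOctet:
    """Split the first NAL unit header octet into F, NRI, and Type."""
    if not 0 <= octet <= 0xFF:
        raise ValueError("octet must be in 0..255")
    return NalHeaderOctet(
        f=octet >> 7,                # bit 0 (most significant bit)
        nri=(octet >> 5) & 0x03,     # bits 1-2
        nal_unit_type=octet & 0x1F,  # bits 3-7
    )
```

For instance, the octet 0x6E decodes to F = 0, NRI = 3, and Type = 14.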
In H.264/AVC, NAL unit types 14, 15, and 20 are reserved for future extensions. SVC uses these three NAL unit types as follows: NAL unit type 14 is used for prefix NAL unit, NAL unit type 15 is used for subset sequence parameter set, and NAL unit type 20 is used for coded slice in scalable extension (see Section 7.4.1 in [H.264]). NAL unit types 14 and 20 indicate the presence of three additional octets in the NAL unit header, as shown below.
在H.264/AVC中,NAL单元类型14、15和20保留用于将来的扩展。SVC使用这三种NAL单元类型如下:NAL单元类型14用于前缀NAL单元,NAL单元类型15用于子集序列参数集,NAL单元类型20用于可伸缩扩展中的编码切片(见[H.264]第7.4.1节)。NAL单元类型14和20表示NAL单元头中存在三个额外的八位字节,如下所示。
      +---------------+---------------+---------------+
      |0|1|2|3|4|5|6|7|0|1|2|3|4|5|6|7|0|1|2|3|4|5|6|7|
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |R|I|   PRID    |N| DID |  QID  | TID |U|D|O| RR|
      +---------------+---------------+---------------+
R: 1 bit reserved_one_bit. Reserved bit for future extension. R MUST be equal to 1. The value of R MUST be ignored by decoders.
R:1位保留\ 1位。为将来的扩展保留位。R必须等于1。解码器必须忽略R的值。
I: 1 bit idr_flag. This component specifies whether the layer representation is an instantaneous decoding refresh (IDR) layer representation (when equal to 1) or not (when equal to 0).
I:1位idr_标志。此组件指定层表示是瞬时解码刷新(IDR)层表示(当等于1时)还是非瞬时解码刷新(IDR)层表示(当等于0时)。
PRID: 6 bits priority_id. This component specifies a priority identifier for the NAL unit. A lower value of PRID indicates a higher priority.
PRID:6位优先级\u id。此标志指定NAL单元的优先级标识符。PRID值越低表示优先级越高。
N: 1 bit no_inter_layer_pred_flag. This flag specifies, when present in a coded slice NAL unit, whether inter-layer prediction may be used for decoding the coded slice (when equal to 0) or not (when equal to 1).
N:1位无层间层pred标志。当在编码片段NAL单元中存在时,该标志指定层间预测是否可用于解码编码片段(当等于1时)或不(当等于0时)。
DID: 3 bits dependency_id. This component indicates the inter-layer coding dependency level of a layer representation. At any access unit, a layer representation with a given dependency_id may be used for inter-layer prediction for coding of a layer representation with a higher dependency_id, while a layer representation with a given dependency_id shall not be used for inter-layer prediction for coding of a layer representation with a lower dependency_id.
DID:3位依赖项\ id。此组件表示层表示的层间编码依赖项级别。在任何接入单元,具有给定依赖性的层表示可以用于层间预测,用于编码具有较高依赖性的层表示,而具有给定依赖性的层表示不应用于层间预测,用于编码具有较低依赖性的层表示。
QID: 4 bits quality_id. This component indicates the quality level of an MGS layer representation. At any access unit and for identical dependency_id values, a layer representation with quality_id equal to ql uses a layer representation with quality_id equal to ql-1 for inter-layer prediction.
QID:4位质量标识。该组件表示MGS层表示的质量级别。在任何访问单元上,对于相同的依赖项id值,质量id等于ql的层表示使用质量id等于ql-1的层表示进行层间预测。
TID: 3 bits temporal_id. This component indicates the temporal level of a layer representation. The temporal_id is associated with the frame rate, with lower values of temporal_id corresponding to lower frame rates. A layer representation at a given temporal_id typically depends on layer representations with lower temporal_id values, but it never depends on layer representations with higher temporal_id values.
TID:3位时间id。该组件表示层表示的时间级别。时间id与帧速率相关联,较低的时间id值对应于较低的帧速率。给定时间id的层表示通常依赖于具有较低时间id值的层表示,但它从不依赖于具有较高时间id值的层表示。
U: 1 bit use_ref_base_pic_flag. A value of 1 indicates that only reference base pictures are used during the inter prediction process. A value of 0 indicates that the reference base pictures are not used during the inter prediction process.
U:1位使用\参考\基础\图片\标志。值1表示在帧间预测处理期间仅使用参考基础图片。值0表示在帧间预测处理期间不使用基准图片。
D: 1 bit discardable_flag. A value of 1 indicates that the current NAL unit is not used for decoding NAL units with values of dependency_id higher than the one of the current NAL unit, in the current and all subsequent access units. Such NAL units can be discarded without risking the integrity of layers with higher dependency_id values. discardable_flag equal to 0 indicates that the decoding of the NAL unit is required to maintain the integrity of layers with higher dependency_id.
D:1位可丢弃的_标志。值1表示在当前和所有后续接入单元中,当前NAL单元不用于解码依赖性_id的值高于当前NAL单元的值的NAL单元。这样的NAL单元可以被丢弃,而不会危及具有较高依赖性id值的层的完整性。可丢弃_标志等于0表示需要解码NAL单元以保持具有较高依赖性_id的层的完整性。
O: 1 bit output_flag. Affects the decoded picture output process as defined in Annex C of [H.264].
O:1位输出_标志:影响[H.264]附录C中定义的解码图片输出过程。
RR: 2 bits reserved_three_2bits. Reserved bits for future extension. RR MUST be equal to "11" (in binary form). The value of RR MUST be ignored by decoders.
RR:2位保留3位。为将来的扩展保留位。RR必须等于“11”(二进制形式)。解码器必须忽略RR的值。
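The three additional header octets carried by NAL units of types 14 and 20 can be unpacked as in the following sketch. It assumes the caller has already removed the first header octet. The function name and the dictionary keys (which reuse the field abbreviations from the diagram) are illustrative only.

```python
def parse_svc_extension(ext: bytes) -> dict:
    """Unpack the three-octet SVC NAL unit header extension into its fields."""
    if len(ext) != 3:
        raise ValueError("the SVC header extension is exactly three octets")
    b0, b1, b2 = ext
    return {
        # First octet: R (1 bit), I (1 bit), PRID (6 bits)
        "R": b0 >> 7,
        "I": (b0 >> 6) & 0x01,
        "PRID": b0 & 0x3F,
        # Second octet: N (1 bit), DID (3 bits), QID (4 bits)
        "N": b1 >> 7,
        "DID": (b1 >> 4) & 0x07,
        "QID": b1 & 0x0F,
        # Third octet: TID (3 bits), U, D, O (1 bit each), RR (2 bits)
        "TID": b2 >> 5,
        "U": (b2 >> 4) & 0x01,
        "D": (b2 >> 3) & 0x01,
        "O": (b2 >> 2) & 0x01,
        "RR": b2 & 0x03,
    }
```

A receiver following the text above would not reject a NAL unit based on the values of R or RR, since decoders ignore them.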
This memo extends the semantics of F, NRI, I, PRID, DID, QID, TID, U, and D per Annex G of [H.264] as described in Section 4.2.
本备忘录根据[H.264]附录G扩展了F、NRI、I、PRID、DID、QID、TID、U和D的语义,如第4.2节所述。
Similar to [RFC6184], this payload format can only be used to carry the raw NAL unit stream over RTP and not the bytestream format specified in Annex B of [H.264].
与[RFC6184]类似,该有效载荷格式只能用于通过RTP传输原始NAL单元流,而不是[H.264]附录B中规定的ByTestStream格式。
The design principles, transmission modes, and packetization modes as well as new payload structures are summarized in this section. It is assumed that the reader is familiar with the terminology and concepts defined in [RFC6184].
本节总结了设计原则、传输模式、打包模式以及新的有效载荷结构。假设读者熟悉[RFC6184]中定义的术语和概念。
The following design principles have been observed for this payload format:
此有效载荷格式遵循以下设计原则:
o Backward compatibility with [RFC6184] wherever possible.
o 尽可能向后兼容[RFC6184]。
o The SVC base layer or any H.264/AVC compatible subset of the SVC base layer, when transmitted in its own RTP stream, must be encapsulated using [RFC6184]. This ensures that such an RTP stream can be understood by [RFC6184] receivers.
o SVC基本层或SVC基本层的任何H.264/AVC兼容子集在其自身RTP流中传输时,必须使用[RFC6184]进行封装。这确保[RFC6184]接收机可以理解这样的RTP流。
o Media-aware network elements (MANEs) as defined in [RFC6184] are signaling-aware, rely on signaling information, and have state.
o [RFC6184]中定义的媒体感知网元(MANE)具有信令感知、依赖信令信息并具有状态。
o MANEs can aggregate multiple RTP streams, possibly from multiple RTP sessions.
o MANE可以聚合多个RTP流,可能来自多个RTP会话。
o MANEs can perform media-aware stream thinning (selective elimination of packets or portions thereof). By using the payload header information identifying layers within an RTP session, MANEs are able to remove packets or portions thereof from the incoming RTP packet stream. This implies rewriting the RTP headers of the outgoing packet stream, and rewriting of RTCP packets as specified in Section 7 of [RFC3550].
o mane可以执行媒体感知流细化(选择性地消除分组或其部分)。通过使用用于识别RTP会话内的层的有效载荷报头信息,mane能够从传入RTP分组流中移除分组或其部分。这意味着按照[RFC3550]第7节的规定,重写传出数据包流的RTP报头,并重写RTCP数据包。
This memo allows the packetization of SVC data for both single-session transmission (SST) and multi-session transmission (MST). In the case of SST all SVC data are carried in a single RTP session. In the case of MST two or more RTP sessions are used to carry the SVC data, in accordance with the MST-specific packetization modes defined in this memo, which are based on the packetization modes defined in [RFC6184]. In MST, each RTP session is associated with one RTP stream, which may carry one or more layers.
此备忘录允许对单会话传输(SST)和多会话传输(MST)的SVC数据进行打包。在SST的情况下,所有SVC数据都在单个RTP会话中传输。在MST的情况下,根据本备忘录中定义的MST特定打包模式(基于[RFC6184]中定义的打包模式),使用两个或多个RTP会话来传输SVC数据。在MST中,每个RTP会话与一个RTP流相关联,该RTP流可以承载一个或多个层。
The base layer is, by design, compatible with H.264/AVC. During transmission, the associated prefix NAL units, which are introduced by SVC and, when present, are ignored by H.264/AVC decoders, may be encapsulated within the same RTP packet stream as the H.264/AVC VCL NAL units or in a different RTP packet stream (when MST is used). For convenience, the term "AVC base layer" is used to refer to the base layer without prefix NAL units, while the term "SVC base layer" is used to refer to the base layer with prefix NAL units.
根据设计,基本层与H.264/AVC兼容。在传输期间,由SVC引入且当存在时被H.264/AVC解码器忽略的相关前缀NAL单元可以封装在与H.264/AVC VCL NAL单元相同的RTP分组流中,或者封装在不同的RTP分组流中(当使用MST时)。为方便起见,术语“AVC基本层”用于指代没有前缀NAL单元的基本层,而术语“SVC基本层”用于指代具有前缀NAL单元的基本层。
Furthermore, the base layer may have multiple temporal components (i.e., supporting different frame rates). As a result, the lowest temporal component ("T0") of the AVC or SVC base layer is used as the starting point of the SVC bitstream hierarchy.
此外,基本层可以具有多个时间分量(即,支持不同的帧速率)。结果,AVC或SVC基本层的最低时间分量(“T0”)被用作SVC比特流层次结构的起点。
This memo allows encapsulating in a given RTP stream any of the following three alternatives of layer combinations:
此备忘录允许在给定RTP流中封装以下三种层组合的任意一种:
1. the T0 AVC base layer or the T0 SVC base layer only;
2. one or more enhancement layers only; or
3. the T0 SVC base layer, and one or more enhancement layers.
1. 仅T0 AVC基层或T0 SVC基层;2.仅一个或多个增强层;或3。T0 SVC基本层和一个或多个增强层。
SST should be used in point-to-point unicast applications and, in general, whenever the potential benefit of using multiple RTP sessions does not justify the added complexity. When SST is used, the layer combination cases 1 and 3 above can be used. When an H.264/AVC compatible subset of the SVC base layer is transmitted using SST, the packetization of [RFC6184] must be used, thus ensuring compatibility with [RFC6184] receivers. When, however, one or more SVC quality or spatial enhancement layers are transmitted using SST, the packetization defined in this memo must be used. In SST, any of the three [RFC6184] packetization modes, namely, single NAL unit mode, non-interleaved mode, and interleaved mode, can be used.
SST应用于点对点单播应用程序中,通常,当使用多个RTP会话的潜在好处不足以证明增加的复杂性时。当使用SST时,可以使用上面的图层组合情况1和3。当使用SST传输SVC基本层的H.264/AVC兼容子集时,必须使用[RFC6184]的分组,从而确保与[RFC6184]接收机的兼容性。但是,当使用SST传输一个或多个SVC质量或空间增强层时,必须使用本备忘录中定义的打包。在SST中,可以使用三种[RFC6184]分组模式中的任何一种,即单NAL单元模式、非交织模式和交织模式。
MST should be used in a multicast session when different receivers may request different layers of the scalable bitstream. An operation point for an SVC bitstream, as defined in this memo, corresponds to a set of layers that together conform to one of the profiles defined in Annex A or G of [H.264] and, when decoded, offer a representation of the original video at a certain fidelity. The number of streams used in MST should be at least equal to the number of operation points that may be requested by the receivers. Depending on the application, this may result in each layer being carried in its own RTP session, or in having multiple layers encapsulated within one RTP session.
当不同的接收器可能请求不同层次的可伸缩比特流时,应在多播会话中使用MST。本备忘录中定义的SVC比特流的操作点对应于一组层,这些层共同符合[H.264]附录a或G中定义的配置文件之一,并在解码时以一定保真度提供原始视频的表示。MST中使用的流的数量应至少等于接收机可能请求的操作点的数量。根据应用程序的不同,这可能会导致每个层在其自己的RTP会话中承载,或者在一个RTP会话中封装多个层。
Informative note: Layered multicast is a term commonly used to describe the application where multicast is used to transmit layered or scalable data that has been encapsulated into more than one RTP session. This application allows different receivers in the multicast session to receive different operation points of the scalable bitstream. Layered multicast, among other application examples, is discussed in more detail in Section 11.2.
资料性说明:分层多播是一个常用术语,用于描述多播用于传输已封装到多个RTP会话中的分层或可伸缩数据的应用程序。该应用程序允许多播会话中的不同接收器接收可伸缩比特流的不同操作点。在其他应用示例中,第11.2节将更详细地讨论分层多播。
When MST is used, any of the three layer combinations above can be used for each of the sessions. When an H.264/AVC compatible subset of the SVC base layer is transmitted in its own session in MST, the packetization of [RFC6184] must be used, such that [RFC6184] receivers can be part of the MST and receive only this session. For MST, this memo defines four different MST-specific packetization modes, namely, non-interleaved timestamp (NI-T) based mode, non-interleaved CS-DON (NI-C) based mode, non-interleaved combined timestamp and CS-DON mode (NI-TC), and interleaved CS-DON (I-C) based mode (detailed in Section 4.5.2). The modes differ depending on whether the SVC data are allowed to be interleaved, i.e., to be transmitted in an order different than the intended decoding order,
当使用MST时,上述三个层组合中的任何一个都可以用于每个会话。当SVC基本层的H.264/AVC兼容子集在MST中自己的会话中传输时,必须使用[RFC6184]的分组,以便[RFC6184]接收机可以是MST的一部分,并且只接收该会话。对于MST,本备忘录定义了四种不同的MST特定打包模式,即基于非交错时间戳(NI-T)的模式、基于非交错CS-DON(NI-C)的模式、非交错组合时间戳和CS-DON模式(NI-TC)以及基于交错CS-DON(I-C)的模式(详见第4.5.2节)。这些模式取决于SVC数据是否允许交织,即以不同于预期解码顺序的顺序发送,
and they also differ in the mechanisms provided in order to recover the correct decoding order of the NAL units across the multiple RTP sessions. These four MST modes reuse the packetization modes introduced in [RFC6184] for the packetization of NAL units in each of their individual RTP sessions.
而且它们在提供的机制上也有所不同,以便在多个RTP会话中恢复NAL单元的正确解码顺序。这四种MST模式重用了[RFC6184]中介绍的打包模式,用于NAL单元在各自RTP会话中的打包。
As the names of the MST packetization modes imply, the NI-T, NI-C, and NI-TC modes do not allow interleaved transmission, while the I-C mode allows interleaved transmission. With any of the three non-interleaved MST packetization modes, legacy [RFC6184] receivers with implementation of the non-interleaved mode specified in [RFC6184] can join a multi-session transmission of SVC, to receive the base RTP session encapsulated according to [RFC6184].
正如MST分组模式的名称所暗示的,NI-T、NI-C和NI-TC模式不允许交织传输,而I-C模式允许交织传输。使用三种非交错MST分组模式中的任何一种,实现了[RFC6184]中指定的非交错模式的传统[RFC6184]接收机可以加入SVC的多会话传输,以接收根据[RFC6184]封装的基本RTP会话。
[RFC6184] specifies three basic payload structures, namely, single NAL unit packet, aggregation packet, and fragmentation unit. Depending on the basic payload structure, an RTP packet may contain a NAL unit not aggregating other NAL units, one or more NAL units aggregated in another NAL unit, or a fragment of a NAL unit not aggregating other NAL units. Each NAL unit of a type specified in [H.264] (i.e., 1 to 23, inclusive) may be carried in its entirety in a single NAL unit packet, may be aggregated in an aggregation packet, or may be fragmented and carried in a number of fragmentation unit packets. To enable aggregation or fragmentation of NAL units while still ensuring that the RTP packet payload is only composed of NAL units, [RFC6184] introduced six new NAL unit types (24-29) to be used as payload structures, selected from the NAL unit types left unspecified in [H.264].
[RFC6184]指定了三种基本有效负载结构,即单个NAL单元数据包、聚合数据包和碎片单元。根据基本有效负载结构,RTP分组可以包含不聚合其他NAL单元的NAL单元、在另一NAL单元中聚合的一个或多个NAL单元,或者不聚合其他NAL单元的NAL单元的片段。[H.264]中指定类型的每个NAL单元(即,1到23,包括1到23)可以在单个NAL单元分组中整体携带,可以在聚合分组中聚合,或者可以在多个分段单元分组中分段和携带。为了在仍然确保RTP数据包有效负载仅由NAL单元组成的情况下实现NAL单元的聚合或分段,[RFC6184]引入了六种新的NAL单元类型(24-29),用作有效负载结构,从[H.264]中未指定的NAL单元类型中选择。
This memo reuses all the payload structures used in [RFC6184]. Furthermore, three new types of NAL units are defined: payload content scalability information (PACSI) NAL unit, empty NAL unit, and non-interleaved multi-time aggregation packet (NI-MTAP) (specified in Sections 4.9, 4.10, and 4.7.1, respectively).
此备忘录重用[RFC6184]中使用的所有有效负载结构。此外,定义了三种新类型的NAL单元:有效负载内容可伸缩性信息(PACSI)NAL单元、空NAL单元和非交织多时间聚合包(NI-MTAP)(分别在第4.9、4.10和4.7.1节中规定)。
PACSI NAL units may be used for the following purposes:
PACSI NAL装置可用于以下目的:
o To enable MANEs to decide whether to forward, process, or discard aggregation packets, by checking in PACSI NAL units the scalability information and other characteristics of the aggregated NAL units, rather than looking into the aggregated NAL units themselves, which are defined by the video coding specification.
o 通过在PACSI NAL单元中检查聚合NAL单元的可伸缩性信息和其他特征,而不是查看由视频编码规范定义的聚合NAL单元本身,使MANE能够决定是转发、处理还是丢弃聚合分组。
o To enable correct decoding order recovery in MST using the NI-C or NI-TC mode, with the help of the CS-DON information included in PACSI NAL units.
o 借助PACSI NAL单元中包含的CS-DON信息,使用NI-C或NI-TC模式在MST中实现正确的解码顺序恢复。
o To improve resilience to packet losses, e.g., by utilizing the following data or information included in PACSI NAL units: repeated Supplemental Enhancement Information (SEI) messages, information regarding the start and end of layer representations, and the indices to layer representations of the lowest temporal subset.
o 例如,通过利用PACSI NAL单元中包括的以下数据或信息来提高对分组丢失的恢复能力:重复补充增强信息(SEI)消息、关于层表示的开始和结束的信息以及最低时间子集的层表示的索引。
Empty NAL units may be used to enable correct decoding order recovery in MST using the NI-T or NI-TC mode. NI-MTAP NAL units may be used to aggregate NAL units from multiple access units but without interleaving.
空NAL单元可用于使用NI-T或NI-TC模式在MST中实现正确的解码顺序恢复。NI-MTAP NAL单元可用于聚合来自多个接入单元的NAL单元,但无需交错。
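The relationship between the four MST packetization modes and the decoding order recovery aids named above can be summarized in a small lookup. This is only a restatement of the overview text as data; the dictionary layout and function names are hypothetical, and the normative behavior of each mode is defined in Section 4.5.2.

```python
# One entry per MST packetization mode, following the mode names in the text.
MST_MODES = {
    "NI-T":  {"interleaved": False, "timestamp_based": True,  "cs_don_based": False},
    "NI-C":  {"interleaved": False, "timestamp_based": False, "cs_don_based": True},
    "NI-TC": {"interleaved": False, "timestamp_based": True,  "cs_don_based": True},
    "I-C":   {"interleaved": True,  "timestamp_based": False, "cs_don_based": True},
}


def recovery_aids(mode: str) -> list:
    """List the new NAL unit types the overview associates with a mode."""
    if mode not in MST_MODES:
        raise ValueError("unknown MST packetization mode: " + mode)
    aids = []
    if mode in ("NI-C", "NI-TC"):
        aids.append("PACSI NAL unit")  # carries CS-DON information
    if mode in ("NI-T", "NI-TC"):
        aids.append("empty NAL unit")
    return aids  # the overview names no such aid for I-C


def legacy_rfc6184_receiver_can_join(mode: str) -> bool:
    """True for the three non-interleaved modes, per the overview text."""
    return not MST_MODES[mode]["interleaved"]
```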
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14, RFC 2119 [RFC2119].
本文件中的关键词“必须”、“不得”、“必需”、“应”、“不应”、“应”、“不应”、“建议”、“可”和“可选”应按照BCP 14、RFC 2119[RFC2119]中的说明进行解释。
This specification uses the notion of setting and clearing a bit when bit fields are handled. Setting a bit is the same as assigning that bit the value of 1 (On). Clearing a bit is the same as assigning that bit the value of 0 (Off).
本规范使用在处理位字段时设置和清除位的概念。设置位与将该位的值指定为1(On)相同。清除一个位与将该位赋值为0(关闭)相同。
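As a small illustration of this convention, the sketch below sets and clears a single bit of one octet. The helper names and the F_BIT mask are hypothetical, and the mask assumes the F bit is the most significant bit of the first NAL unit header octet.

```python
F_BIT = 0x80  # assumed mask for the most significant bit of an octet


def set_bit(octet: int, mask: int) -> int:
    """Setting a bit assigns it the value 1."""
    return (octet | mask) & 0xFF


def clear_bit(octet: int, mask: int) -> int:
    """Clearing a bit assigns it the value 0."""
    return octet & ~mask & 0xFF
```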
This document uses the terms and definitions of [H.264]. Section 3.1.1 lists relevant definitions copied from [H.264] for convenience.
本文档使用[H.264]中的术语和定义。第3.1.1节列出了为方便起见从[H.264]复制的相关定义。
When there is discrepancy, the definitions in [H.264] take precedence. Section 3.1.2 gives definitions specific to this memo. Some of the definitions in Section 3.1.2 are also present in [RFC6184] and copied here with slight adaptations as needed.
如果存在差异,则以[H.264]中的定义为准。第3.1.2节给出了本备忘录的具体定义。第3.1.2节中的一些定义也出现在[RFC6184]中,并在此处复制,根据需要稍作修改。
access unit: A set of NAL units always containing exactly one primary coded picture. In addition to the primary coded picture, an access unit may also contain one or more redundant coded pictures, one auxiliary coded picture, or other NAL units not containing slices or slice data partitions of a coded picture. The decoding of an access unit always results in a decoded picture.
存取单元:一组NAL单元,通常只包含一个主编码图片。除了主编码图片之外,访问单元还可以包含一个或多个冗余编码图片、一个辅助编码图片或不包含编码图片的切片或切片数据分区的其他NAL单元。访问单元的解码总是导致解码图片。
base layer: A bitstream subset that contains all the NAL units with the nal_unit_type syntax element equal to 1 or 5 of the bitstream and does not contain any NAL unit with the nal_unit_type syntax element equal to 14, 15, or 20 and conforms to one or more of the profiles specified in Annex A of [H.264].
基本层:一个比特流子集,包含所有NAL单元,其NAL单元的NAL单元类型语法元素等于比特流的1或5,不包含任何NAL单元的NAL单元的NAL单元类型语法元素等于14、15或20,并符合[H.264]附录A中规定的一个或多个配置文件。
base quality layer representation: The layer representation of the target dependency representation of an access unit that is associated with the quality_id syntax element equal to 0.
基本质量层表示:与等于0的quality_id语法元素关联的访问单元的目标依赖关系表示的层表示。
coded video sequence: A sequence of access units that consists, in decoding order, of an IDR access unit followed by zero or more non-IDR access units including all subsequent access units up to but not including any subsequent IDR access unit.
编码视频序列:按解码顺序由一个IDR访问单元和零个或多个非IDR访问单元组成的一系列访问单元,包括所有后续访问单元,但不包括任何后续IDR访问单元。
dependency representation: A subset of Video Coding Layer (VCL) NAL units within an access unit that are associated with the same value of the dependency_id syntax element, which is provided as part of the NAL unit header or by an associated prefix NAL unit. A dependency representation consists of one or more layer representations.
依赖关系表示:访问单元内的视频编码层(VCL)NAL单元的子集,与依赖关系id语法元素的相同值相关联,作为NAL单元头的一部分或由相关前缀NAL单元提供。依赖关系表示由一个或多个图层表示组成。
IDR access unit: An access unit in which the primary coded picture is an IDR picture.
IDR访问单元:其中主编码图片为IDR图片的访问单元。
IDR picture: Instantaneous decoding refresh picture. A coded picture in which all slices of the target dependency representation within the access unit are I or EI slices that causes the decoding process to mark all reference pictures as "unused for reference" immediately after decoding the IDR picture. After the decoding of an IDR picture all following coded pictures in decoding order can be decoded without inter prediction from any picture decoded prior to the IDR picture. The first picture of each coded video sequence is an IDR picture.
IDR图片:即时解码刷新图片。一种编码图片,其中接入单元内的目标依赖性表示的所有片都是I或EI片,使得解码过程在解码IDR图片之后立即将所有参考图片标记为“未使用以供参考”。在对IDR图片进行解码之后,可以按照解码顺序对所有后续编码图片进行解码,而无需从在IDR图片之前解码的任何图片进行帧间预测。每个编码视频序列的第一个图片是IDR图片。
layer representation: A subset of VCL NAL units within an access unit that are associated with the same values of the dependency_id and quality_id syntax elements, which are provided as part of the VCL NAL unit header or by an associated prefix NAL unit. One or more layer representations represent a dependency representation.
层表示:访问单元中与依赖项id和质量id语法元素的相同值相关联的VCL NAL单元的子集,这些元素作为VCL NAL单元头的一部分或由相关前缀NAL单元提供。一个或多个图层表示表示依赖关系表示。
prefix NAL unit: A NAL unit with nal_unit_type equal to 14 that immediately precedes in decoding order a NAL unit with nal_unit_type equal to 1, 5, or 12. The NAL unit that immediately succeeds in decoding order the prefix NAL unit is referred to as the associated NAL unit. The prefix NAL unit contains data associated with the associated NAL unit, which are considered to be part of the associated NAL unit.
前缀NAL单元:NAL_单元_类型等于14的NAL单元,其解码顺序紧跟在NAL_单元_类型等于1、5或12的NAL单元之前。按照前缀NAL单元的解码顺序立即成功的NAL单元称为相关联的NAL单元。前缀NAL单元包含与关联NAL单元相关联的数据,这些数据被视为关联NAL单元的一部分。
reference base picture: A reference picture that is obtained by decoding a base quality layer representation with the nal_ref_idc syntax element not equal to 0 and the store_ref_base_pic_flag syntax element equal to 1 of an access unit and all layer representations of the access unit that are referred to by inter-layer prediction of the base quality layer representation. A reference base picture is not an output of the decoding process, but the samples of a reference base picture may be used for inter prediction in the decoding process of subsequent pictures in decoding order. Reference base picture is a collective term for a reference base field or a reference base frame.
参考基本图片:通过解码基本质量层表示而获得的参考图片,其中nal_ref_idc语法元素不等于0,store_ref_base_pic_flag语法元素等于访问单元的1,以及通过基本质量层的层间预测而参考的访问单元的所有层表示质量层表示。参考基本图片不是解码处理的输出,但是参考基本图片的样本可以用于按照解码顺序的后续图片的解码处理中的帧间预测。参考基准图片是参考基准字段或参考基准帧的集合术语。
scalable bitstream: A bitstream with the property that one or more bitstream subsets that are not identical to the scalable bitstream form another bitstream that conforms to the SVC specification [H.264].
可伸缩比特流:一种比特流,其特性是与可伸缩比特流不同的一个或多个比特流子集形成另一个符合SVC规范[H.264]的比特流。
target dependency representation: The dependency representation of an access unit that is associated with the largest value of the dependency_id syntax element for all dependency representations of the access unit.
目标依赖关系表示法:访问单元的依赖关系表示法,它与访问单元的所有依赖关系表示法的依赖关系id语法元素的最大值相关联。
target layer representation: The layer representation of the target dependency representation of an access unit that is associated with the largest value of the quality_id syntax element for all layer representations of the target dependency representation of the access unit.
目标层表示:访问单元的目标依赖关系表示的层表示,与访问单元的目标依赖关系表示的所有层表示的quality_id语法元素的最大值关联。
anchor layer representation: An anchor layer representation is such a layer representation that, if decoding of the operation point corresponding to the layer starts from the access unit containing this layer representation, all the following layer representations of the layer, in output order, can be correctly decoded. The output order is defined in [H.264] as the order in which decoded pictures are output from the decoded picture buffer of the decoder. As H.264 does not specify the picture display process, this more general term is used instead of display order. An anchor layer representation is a random access point to the layer to which the anchor layer representation belongs. However, some layer representations, succeeding an anchor layer representation in decoding order but preceding the anchor layer representation in output order, may refer to earlier layer representations for inter prediction, and hence the decoding may be incorrect if random access is performed at the anchor layer representation.
锚定层表示:锚定层表示是这样一种层表示,如果对应于该层的操作点的解码从包含该层表示的接入单元开始,则该层的所有以下层表示可以按输出顺序正确解码。输出顺序在[H.264]中定义为从解码器的解码图片缓冲器输出解码图片的顺序。由于H.264没有指定图片显示过程,因此使用这个更通用的术语来代替显示顺序。锚定层表示是锚定层表示所属层的随机访问点。然而,在解码顺序中继锚定层表示而在输出顺序中在锚定层表示之前的一些层表示可以参考用于帧间预测的较早的层表示,因此,如果在锚定层表示处执行随机访问,则解码可能是不正确的。
AVC base layer: The subset of the SVC base layer in which all prefix NAL units (type 14) are removed. Note that this is equivalent to the term "base layer" as defined in Annex G of [H.264].
AVC基本层:删除所有前缀NAL单元(类型14)的SVC基本层的子集。注意,这相当于[H.264]附录G中定义的术语“基层”。
base RTP session: When multi-session transmission is used, the RTP session that carries the RTP stream containing the T0 AVC base layer or the T0 SVC base layer, and zero or more enhancement layers. This RTP session does not depend on any other RTP session as indicated by mechanisms defined in Section 7.2.3. The base RTP session may carry NAL units of NAL unit type equal to 14 and 15.
基本RTP会话:使用多会话传输时,承载包含T0 AVC基本层或T0 SVC基本层的RTP流以及零个或多个增强层的RTP会话。如第7.2.3节中定义的机制所示,该RTP会话不依赖于任何其他RTP会话。基本RTP会话可携带NAL单元类型等于14和15的NAL单元。
decoding order number (DON): A field in the payload structure or a derived variable indicating NAL unit decoding order. Values of DON are in the range of 0 to 65535, inclusive. After reaching the maximum value, the value of DON wraps around to 0. Note that this definition also exists in [RFC6184] in exactly the same form.
解码顺序号(DON):有效负载结构中的一个字段或表示NAL单元解码顺序的派生变量。DON的值在0到65535之间(含0到65535)。达到最大值后,DON的值将变为0。请注意,此定义在[RFC6184]中也以完全相同的形式存在。
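The wraparound of DON values can be sketched with modular arithmetic. The constant and function names below are hypothetical; only the value range of 0 to 65535 and the wrap to 0 come from the definition above.

```python
DON_MODULUS = 65536  # DON values lie in the range 0..65535, inclusive


def advance_don(don: int, step: int = 1) -> int:
    """Return the DON reached after moving `step` positions forward."""
    if not 0 <= don < DON_MODULUS:
        raise ValueError("DON must be in 0..65535")
    if step < 0:
        raise ValueError("step must be non-negative")
    return (don + step) % DON_MODULUS
```

Because of the wraparound, a numerically smaller DON does not necessarily indicate an earlier NAL unit, so receivers cannot compare DON values as plain integers.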
Empty NAL unit: A NAL unit with NAL unit type equal to 31 and sub-type equal to 1. An empty NAL unit consists of only the two-byte NAL unit header with an empty payload.
空NAL单元:NAL单元类型为31且子类型为1的NAL单元。空的NAL单元仅由带空有效负载的两字节NAL单元头组成。
enhancement RTP session: When multi-session transmission is used, an RTP session that is not the base RTP session. An enhancement RTP session typically contains an RTP stream that depends on at least one other RTP session as indicated by mechanisms defined in Section 7.2.3. A lower RTP session to an enhancement RTP session is an RTP session on which the enhancement RTP session depends. The lowest RTP session for a receiver is the RTP session that does not depend on any other RTP session received by the receiver. The highest RTP session for a receiver is the RTP session on which no other RTP session received by the receiver depends.
增强RTP会话:当使用多会话传输时,不是基本RTP会话的RTP会话。增强RTP会话通常包含一个RTP流,该RTP流依赖于第7.2.3节中定义的机制所指示的至少一个其他RTP会话。增强RTP会话的下层RTP会话是增强RTP会话所依赖的RTP会话。接收机的最低RTP会话是不依赖于接收机接收的任何其他RTP会话的RTP会话。接收机的最高RTP会话是接收机接收到的其他RTP会话都不依赖的RTP会话。
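The notions of lowest and highest RTP session for a receiver can be illustrated as follows. The sketch assumes the session dependencies have already been learned from signaling (Section 7.2.3) and are available as a mapping from each session to the sessions it depends on. The session labels and the function name are hypothetical.

```python
def lowest_and_highest(received, depends_on):
    """Return (lowest, highest) session lists for one receiver.

    received:   set of session labels the receiver has joined
    depends_on: dict mapping a session label to the set of sessions it
                depends on
    """
    # Lowest: depends on no other session received by this receiver.
    lowest = sorted(
        s for s in received
        if not (depends_on.get(s, set()) & (received - {s}))
    )
    # Highest: no other received session depends on it.
    highest = sorted(
        s for s in received
        if not any(s in depends_on.get(o, set()) for o in received if o != s)
    )
    return lowest, highest
```

For a well-formed dependency chain, each returned list is expected to hold exactly one session.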
cross-session decoding order number (CS-DON): A derived variable indicating NAL unit decoding order number over all NAL units within all the session-multiplexed RTP sessions that carry the same SVC bitstream.
跨会话解码顺序号(CS-DON):一个派生变量,指示携带相同SVC比特流的所有会话多路复用RTP会话中所有NAL单元上的NAL单元解码顺序号。
default level: The level indicated by the profile-level-id parameter. In Session Description Protocol (SDP) Offer/Answer, the level is downgradable, i.e., the answer may either use the default level or a lower level. Note that this definition also exists in [RFC6184] in a slightly different form.
默认级别:配置文件级别id参数指示的级别。在会话描述协议(SDP)提供/应答中,级别是可降级的,即应答可以使用默认级别或更低级别。请注意,该定义在[RFC6184]中的形式也略有不同。
default sub-profile: The subset of coding tools, which may be all coding tools of one profile or the common subset of coding tools of more than one profile, indicated by the profile-level-id parameter. In SDP Offer/Answer, the default sub-profile must be used in a
默认子概要文件:编码工具的子集,可以是一个概要文件的所有编码工具,也可以是多个概要文件的编码工具的公共子集,由概要文件级别id参数表示。在SDP报价/应答中,必须在
symmetric manner, i.e., the answer must either use the same sub-profile as the offer or reject the offer. Note that this definition also exists in [RFC6184] in a slightly different form.
对称方式,即答案必须使用与报价相同的子配置文件或拒绝报价。请注意,该定义在[RFC6184]中的形式也略有不同。
enhancement layer: A layer in which at least one of the values of dependency_id or quality_id is higher than 0, or a layer in which none of the NAL units is associated with the value of temporal_id equal to 0. An operation point constructed using the maximum temporal_id, dependency_id, and quality_id values associated with an enhancement layer may or may not conform to one or more of the profiles specified in Annex A of [H.264].
增强层:其中至少一个依赖项id或质量id的值大于0的层,或其中没有任何NAL单元与等于0的时间id的值相关联的层。使用与增强层相关联的最大时间id、依赖性id和质量id值构建的操作点可能符合也可能不符合[H.264]附录A中规定的一个或多个配置文件。
H.264/AVC compatible: The property of a bitstream subset of conforming to one or more of the profiles specified in Annex A of [H.264].
H.264/AVC兼容:符合[H.264]附录a中规定的一个或多个配置文件的比特流子集的属性。
intra layer representation: A layer representation that contains only slices that use intra prediction, and hence do not refer to any earlier layer representation in decoding order in the same layer. Note that in SVC intra prediction includes intra-layer intra prediction as well as inter-layer intra prediction.
层内表示:仅包含使用帧内预测的切片的层表示,因此不参考同一层中解码顺序中的任何早期层表示。注意,在SVC中,帧内预测包括层内预测以及层间帧内预测。
layer: A bitstream subset in which all NAL units of type 1, 5, 12, 14, or 20 have the same values of dependency_id and quality_id, either directly through their NAL unit header (for NAL units of type 14 or 20) or through association to a prefix (type 14) NAL unit (for NAL unit type 1, 5, or 12). A layer may contain NAL units associated with more than one value of temporal_id.
层:一种比特流子集,其中类型1、5、12、14或20的所有NAL单元都具有相同的依赖性标识和质量标识值,可以直接通过它们的NAL单元头(对于类型14或20的NAL单元),也可以通过与前缀(类型14)NAL单元(对于类型1、5或12的NAL单元)的关联。层可以包含与时间id的多个值相关联的NAL单元。
media-aware network element (MANE): A network element, such as a middlebox or application layer gateway, that is capable of parsing certain aspects of the RTP payload headers or the RTP payload and reacting to their contents. Note that this definition also exists in [RFC6184] in exactly the same form.
媒体感知网元(MANE):能够解析RTP有效负载头或RTP有效负载的某些方面并对其内容作出反应的网元,如中间盒或应用层网关。请注意,此定义在[RFC6184]中也以完全相同的形式存在。
Informative note: The concept of a MANE goes beyond normal routers or gateways in that a MANE has to be aware of the signaling (e.g., to learn about the payload type mappings of the media streams), and in that it has to be trusted when working with Secure Real-time Transport Protocol (SRTP). The advantage of using MANEs is that they allow packets to be dropped according to the needs of the media coding. For example, if a MANE has to drop packets due to congestion on a certain link, it can identify and remove those packets whose elimination produces the least adverse effect on the user experience. After dropping packets, MANEs must rewrite RTCP packets to match the changes to the RTP packet stream as specified in Section 7 of [RFC3550].
资料性说明:MANE的概念超出了普通路由器或网关的范围,因为MANE必须了解信令(例如,了解媒体流的有效负载类型映射),并且在使用安全实时传输协议(SRTP)时必须信任它。使用mane的优点是,它们允许根据媒体编码的需要丢弃数据包。例如,如果MANE由于某一链路上的拥塞而不得不丢弃分组,则它可以识别并移除那些其消除对用户体验产生最小不利影响的分组。丢弃数据包后,MANE必须重写RTCP数据包,以匹配[RFC3550]第7节中规定的RTP数据包流更改。
multi-session transmission: The transmission mode in which the SVC stream is transmitted over multiple RTP sessions. Dependency between RTP sessions MUST be signaled according to Section 7.2.3 of this memo.
多会话传输:通过多个RTP会话传输SVC流的传输模式。RTP会话之间的依赖关系必须根据本备忘录第7.2.3节发出信号。
NAL unit decoding order: A NAL unit order that conforms to the constraints on NAL unit order given in Section G.7.4.1.2 in [H.264]. Note that this definition also exists in [RFC6184] in a slightly different form.
NAL单元解码顺序:符合[H.264]第G.7.4.1.2节中给出的NAL单元顺序约束的NAL单元顺序。请注意,该定义在[RFC6184]中的形式也略有不同。
NALU-time: The value that the RTP timestamp would have if the NAL unit would be transported in its own RTP packet. Note that this definition also exists in [RFC6184] in exactly the same form.
NALU时间:如果NAL单元将在其自己的RTP数据包中传输,则RTP时间戳将具有的值。请注意,此定义在[RFC6184]中也以完全相同的形式存在。
operation point: An operation point is identified by a set of values of temporal_id, dependency_id, and quality_id. A bitstream corresponding to an operation point can be constructed by removing all NAL units associated with a higher value of dependency_id, and all NAL units associated with the same value of dependency_id but higher values of quality_id or temporal_id. An operation point bitstream conforms to at least one of the profiles defined in Annex A or G of [H.264], and offers a representation of the original video signal at a certain fidelity.
操作点:一个操作点由一组temporal_id、dependency_id和quality_id的值标识。通过移除与更高dependency_id值相关联的所有NAL单元,以及与相同dependency_id值但更高quality_id或temporal_id值相关联的所有NAL单元,可以构造与操作点对应的比特流。操作点比特流符合[H.264]附录A或G中定义的至少一个配置文件,并以一定保真度提供原始视频信号的表示。
Informative note: Additional NAL units may be removed (with lower dependency_id or same dependency_id but lower quality_id) if they are not required for decoding the bitstream at the particular operation point. The resulting bitstream, however, may no longer conform to any of the profiles defined in Annex A or G of [H.264].
资料性说明:如果在特定操作点解码比特流时不需要某些额外的NAL单元(具有较低的dependency_id,或具有相同的dependency_id但较低的quality_id),则可以将其移除。然而,由此产生的比特流可能不再符合[H.264]附录A或G中定义的任何配置文件。
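Informative example: the construction rule in the operation point definition can be sketched in Python as below. This is a minimal illustration, not a normative procedure. The `NalUnit` container and the function name are hypothetical, and the three identifiers are assumed to have been extracted from the NAL unit headers (or the associated prefix NAL units) already. The sketch applies the definition literally: NAL units with a lower dependency_id are kept regardless of their quality_id or temporal_id.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NalUnit:
    """Hypothetical container holding only the scalability identifiers."""
    dependency_id: int
    quality_id: int
    temporal_id: int
    payload: bytes = b""


def extract_operation_point(nal_units, dependency_id, quality_id, temporal_id):
    """Return the NAL units that remain for the requested operation point."""
    kept = []
    for nal in nal_units:
        # Rule 1: drop everything with a higher dependency_id.
        if nal.dependency_id > dependency_id:
            continue
        # Rule 2: at the same dependency_id, drop higher quality_id
        # or higher temporal_id.
        if nal.dependency_id == dependency_id and (
            nal.quality_id > quality_id or nal.temporal_id > temporal_id
        ):
            continue
        kept.append(nal)
    return kept
```

The further removal permitted by the informative note above is deliberately not implemented here, since the resulting bitstream may no longer conform to a profile.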
operation point representation: The set of all NAL units of an operation point within the same access unit.
操作点表示:同一访问单元内一个操作点的所有NAL单元的集合。
RTP packet stream: A sequence of RTP packets with increasing sequence numbers (except for wrap-around), identical payload type and identical SSRC (Synchronization Source), carried in one RTP session. Within the scope of this memo, one RTP packet stream is utilized to transport one or more layers.
RTP数据包流:一个RTP会话中携带的RTP数据包序列,序列号增加(环绕除外),有效负载类型相同,SSRC(同步源)相同。在本备忘录的范围内,一个RTP数据包流用于传输一个或多个层。
single-session transmission: The transmission mode in which the SVC bitstream is transmitted over a single RTP session.
单会话传输:通过单个RTP会话传输SVC比特流的传输模式。
SVC base layer: The layer that includes all NAL units associated with dependency_id and quality_id values both equal to 0, including prefix NAL units (NAL unit type 14).
SVC基本层:包含所有与依赖项id和质量id值均等于0相关联的NAL单元的层,包括前缀NAL单元(NAL单元类型14)。
SVC enhancement layer: A layer in which at least one of the values of dependency_id or quality_id is higher than 0. An operation point constructed using the maximum dependency_id and quality_id values and any temporal_id value associated with an SVC enhancement layer does not conform to any of the profiles specified in Annex A of [H.264].
SVC增强层:其中至少有一个依赖项id或质量id值大于0的层。使用与SVC增强层相关联的最大依赖性_id和质量_id值以及任何时间_id值构建的操作点不符合[H.264]附录A中规定的任何配置文件。
SVC NAL unit: A NAL unit of NAL unit type 14, 15, or 20 as specified in Annex G of [H.264].
SVC NAL装置:NAL装置类型为14、15或20的NAL装置,如[H.264]附录G所述。
SVC NAL unit header: A four-byte header resulting from the addition of a three-byte SVC-specific header extension added in NAL unit types 14 and 20.
SVC NAL单元头:由于在NAL单元类型14和20中添加了三字节SVC特定头扩展而产生的四字节头。
SVC RTP session: Either the base RTP session or an enhancement RTP session.
SVC RTP会话:基本RTP会话或增强RTP会话。
T0 AVC base layer: A subset of the AVC base layer constructed by removing all VCL NAL units associated with temporal_id values higher than 0 and non-VCL NAL units and SEI messages associated only with the VCL NAL units being removed.
T0 AVC基本层:AVC基本层的子集,通过移除与大于0的时间_id值相关联的所有VCL NAL单元、非VCL NAL单元以及仅与被移除的VCL NAL单元相关联的SEI消息而构建。
T0 SVC base layer: A subset of the SVC base layer constructed by removing all VCL NAL units associated with temporal_id values higher than 0 as well as prefix NAL units, non-VCL NAL units, and SEI messages associated only with the VCL NAL units being removed.
T0 SVC基本层:SVC基本层的子集,通过移除与大于0的时间_id值相关联的所有VCL NAL单元以及前缀NAL单元、非VCL NAL单元和仅与被移除的VCL NAL单元相关联的SEI消息而构建。
transmission order: The order of packets in ascending RTP sequence number order (in modulo arithmetic). Within an aggregation packet, the NAL unit transmission order is the same as the order of appearance of NAL units in the packet. Note that this definition also exists in [RFC6184] in exactly the same form.
传输顺序:以RTP序列号升序排列的数据包顺序(在模运算中)。在聚合分组内,NAL单元传输顺序与分组中NAL单元的出现顺序相同。请注意,此定义在[RFC6184]中也以完全相同的形式存在。
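Informative example: "ascending RTP sequence number order (in modulo arithmetic)" can be illustrated with the following Python sketch. It relies on the RTP sequence number being a 16-bit field, as in [RFC3550]. The function names are hypothetical, and the comparison is only meaningful when the compared packets lie within half the sequence number space (32768) of each other, which is an assumption of this sketch.

```python
from functools import cmp_to_key


def seq_precedes(a: int, b: int) -> bool:
    """True if 16-bit sequence number a comes before b, allowing wrap-around."""
    return a != b and ((b - a) & 0xFFFF) < 0x8000


def _compare(a: int, b: int) -> int:
    if a == b:
        return 0
    return -1 if seq_precedes(a, b) else 1


def in_transmission_order(seq_numbers):
    """Sort sequence numbers that all fall within one half-window."""
    return sorted(seq_numbers, key=cmp_to_key(_compare))
```

For instance, a packet with sequence number 65535 precedes a packet with sequence number 0 under this comparison.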
In addition to the abbreviations defined in [RFC6184], the following abbreviations are used in this memo.
除[RFC6184]中定义的缩写外,本备忘录中还使用了以下缩写。
   CGS:    Coarse-Grain Scalability
   CS-DON: Cross-Session Decoding Order Number
   MGS:    Medium-Grain Scalability
   MST:    Multi-Session Transmission
   PACSI:  Payload Content Scalability Information
   SST:    Single-Session Transmission
   SNR:    Signal-to-Noise Ratio
   SVC:    Scalable Video Coding

   CGS:    粗粒度可伸缩性
   CS-DON: 跨会话解码顺序号
   MGS:    中粒度可伸缩性
   MST:    多会话传输
   PACSI:  有效负载内容可伸缩性信息
   SST:    单会话传输
   SNR:    信噪比
   SVC:    可伸缩视频编码
In addition to Section 5.1 of [RFC6184], the following rules apply.
除[RFC6184]第5.1节外,以下规则适用。
o Setting of the M bit:
o M位的设置:
The M bit of an RTP packet for which the packet payload is an NI-MTAP MUST be equal to 1 if the last NAL unit, in decoding order, of the access unit associated with the RTP timestamp is contained in the packet.
如果与RTP时间戳相关联的接入单元的最后一个NAL单元(按解码顺序)包含在分组中,则分组有效载荷为NI-MTAP的RTP分组的M位必须等于1。
o Setting of the RTP timestamp:
o RTP时间戳的设置:
For an RTP packet for which the packet payload is an empty NAL unit, the RTP timestamp must be set according to Section 4.10.
对于包有效负载为空NAL单元的RTP包,必须根据第4.10节设置RTP时间戳。
For an RTP packet for which the packet payload is a PACSI NAL unit, the RTP timestamp MUST be equal to the NALU-time of the next non-PACSI NAL unit in transmission order. Recall that the NALU-time of a NAL unit in an MTAP is defined in [RFC6184] as the value that the RTP timestamp would have if that NAL unit would be transported in its own RTP packet.
对于包有效负载为PACSI NAL单元的RTP包,RTP时间戳必须等于传输顺序中下一个非PACSI NAL单元的NALU时间。回想一下,MTAP中NAL单元的NALU时间在[RFC6184]中定义为RTP时间戳的值,如果该NAL单元将在其自身的RTP数据包中传输。
o Setting of the SSRC:
o SSRC的设置:
For both SST and MST, the SSRC values MUST be set according to [RFC3550].
对于SST和MST,必须根据[RFC3550]设置SSRC值。
This memo specifies a NAL unit extension mechanism to allow for introduction of new types of NAL units, beyond the three NAL unit types left undefined in [RFC6184] (i.e., 0, 30, and 31). The extension mechanism utilizes the NAL unit type value 31 and is specified as follows. When the NAL unit type value is equal to 31, the one-byte NAL unit header consisting of the F, NRI, and Type fields as specified in Section 1.1.3 is extended by one additional octet, which consists of a 5-bit field named Subtype and three 1-bit fields named J, K, and L, respectively. The additional octet is shown in the following figure.
本备忘录规定了NAL单元扩展机制,以允许引入新的NAL单元类型,超出[RFC6184]中未定义的三种NAL单元类型(即0、30和31)。扩展机制利用NAL单元类型值31,并指定如下。当NAL单元类型值等于31时,由第1.1.3节中规定的F、NRI和类型字段组成的单字节NAL单元头扩展一个额外的八位字节,该八位字节分别由一个名为Subtype的5位字段和三个名为J、K和L的1位字段组成。下图显示了附加的八位字节。
   +---------------+
   |0|1|2|3|4|5|6|7|
   +-+-+-+-+-+-+-+-+
   | Subtype |J|K|L|
   +---------------+
The Subtype value determines the (extended) NAL unit type of this NAL unit. The interpretation of the fields J, K, and L depends on the Subtype. The semantics of the fields are as follows.
“子类型”值确定此NAL单元的(扩展)NAL单元类型。字段J、K和L的解释取决于子类型。字段的语义如下所示。
When Subtype is equal to 1, the NAL unit is an empty NAL unit as specified in Section 4.10. When Subtype is equal to 2, the NAL unit is an NI-MTAP NAL unit as specified in Section 4.7.1. All other values of Subtype (0, 3-31) are reserved for future extensions, and receivers MUST ignore the entire NAL unit when Subtype is equal to any of these reserved values.
当子类型等于1时,NAL单位为第4.10节规定的空NAL单位。当子类型等于2时,NAL单位为第4.7.1节规定的NI-MTAP NAL单位。子类型(0,3-31)的所有其他值保留用于将来的扩展,并且当子类型等于这些保留值中的任何一个时,接收器必须忽略整个NAL单元。
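Informative example: the following Python sketch shows one way a receiver might parse the two-byte header of a Type 31 NAL unit. It assumes the usual IETF bit numbering, in which bit 0 of the figure is the most significant bit of the octet. The function name and the returned dictionary layout are hypothetical.

```python
def parse_extended_header(payload: bytes) -> dict:
    """Parse the F/NRI/Type octet and the Subtype/J/K/L octet of a Type 31 unit."""
    if len(payload) < 2:
        raise ValueError("a Type 31 NAL unit needs at least two header bytes")
    first, ext = payload[0], payload[1]
    nal_type = first & 0x1F
    if nal_type != 31:
        raise ValueError("not an extended (Type 31) NAL unit")
    subtype = ext >> 3  # bits 0-4 of the additional octet
    # Subtype 1 and 2 are defined; 0 and 3-31 are reserved.
    kind = {1: "empty", 2: "NI-MTAP"}.get(subtype, "reserved")
    return {
        "F": first >> 7,
        "NRI": (first >> 5) & 0x3,
        "Type": nal_type,
        "Subtype": subtype,
        "J": (ext >> 2) & 1,
        "K": (ext >> 1) & 1,
        "L": ext & 1,
        "kind": kind,
        # Receivers ignore the entire NAL unit for reserved Subtype values.
        "ignore": kind == "reserved",
    }
```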
The structure and semantics of the NAL unit header according to the H.264 specification [H.264] were introduced in Section 1.1.3. This section specifies the extended semantics of the NAL unit header fields F, NRI, I, PRID, DID, QID, TID, U, and D, according to this memo. When the Type field is equal to 31, the semantics of the fields in the extension NAL unit header were specified in Section 4.2.1.
第1.1.3节介绍了符合H.264规范[H.264]的NAL单元头的结构和语义。根据本备忘录,本节规定了NAL单元标题字段F、NRI、I、PRID、DID、QID、TID、U和D的扩展语义。当类型字段等于31时,第4.2.1节规定了扩展NAL单元标题中字段的语义。
The semantics of F specified in Section 5.3 of [RFC6184] also apply in this memo. That is, a value of 0 for F indicates that the NAL unit type octet and payload should not contain bit errors or other syntax violations, whereas a value of 1 for F indicates that the NAL unit type octet and payload may contain bit errors or other syntax violations. MANEs SHOULD set the F bit to indicate bit errors in the NAL unit.
[RFC6184]第5.3节中规定的F的语义也适用于本备忘录。也就是说,F的值为0表示NAL单元类型八位字节和有效负载不应包含位错误或其他语法冲突,而F的值为1表示NAL单元类型八位字节和有效负载可能包含位错误或其他语法冲突。MANEs应设置F位以指示NAL单元中的位错误。
For NRI, for a bitstream conforming to one of the profiles defined in Annex A of [H.264] and transported using [RFC6184], the semantics specified in Section 5.3 of [RFC6184] apply, i.e., NRI also indicates the relative importance of NAL units. For a bitstream conforming to one of the profiles defined in Annex G of [H.264] and transported using this memo, in addition to the semantics specified in Annex G of [H.264], NRI also indicates the relative importance of NAL units within a layer.
对于NRI,对于符合[H.264]附录a中定义的配置文件之一并使用[RFC6184]传输的比特流,[RFC6184]第5.3节中规定的语义适用,即,NRI还指示NAL单元的相对重要性。对于符合[H.264]附录G中定义的配置文件之一并使用本备忘录传输的比特流,除了[H.264]附录G中规定的语义外,NRI还指示层内NAL单元的相对重要性。
For I, in addition to the semantics specified in Annex G of [H.264], according to this memo, MANEs MAY use this information to protect NAL units with I equal to 1 better than NAL units with I equal to 0. MANEs MAY also utilize information of NAL units with I equal to 1 to decide when to forward more packets for an RTP packet stream. For example, when it is detected that spatial layer switching has happened such that the operation point has changed to a higher value of DID, MANEs MAY start to forward NAL units with the higher value of DID only after forwarding a NAL unit with I equal to 1 with the higher value of DID.
对于I,除[H.264]附录G中规定的语义外,根据本备忘录,MANE可以使用该信息对I等于1的NAL单元提供比I等于0的NAL单元更好的保护。MANE还可以利用I等于1的NAL单元的信息来决定何时为RTP数据包流转发更多数据包。例如,当检测到发生了空间层切换、使得操作点已变为更高的DID值时,MANE可以仅在转发了具有该更高DID值且I等于1的NAL单元之后,才开始转发具有该更高DID值的NAL单元。
Note that, in the context of this section, "protecting a NAL unit" means any RTP or network transport mechanism that could improve the probability of successful delivery of the packet conveying the NAL unit, including applying a Quality of Service (QoS) enabled network, Forward Error Correction (FEC), retransmissions, and advanced scheduling behavior, whenever possible.
注意,在本节的上下文中,“保护NAL单元”是指任何能够提高承载该NAL单元的数据包成功送达概率的RTP或网络传输机制,包括在可能的情况下使用支持服务质量(QoS)的网络、前向纠错(FEC)、重传和高级调度行为。
For PRID, the semantics specified in Annex G of [H.264] apply. Note that MANEs implementing unequal error protection MAY use this information to protect NAL units with smaller PRID values better than those with larger PRID values, for example, by including only the more important NAL units in a FEC protection mechanism. The importance for the decoding process decreases as the PRID value increases.
对于PRID,[H.264]附录G中规定的语义适用。注意,实现不等差错保护的MANE可以使用该信息,对PRID值较小的NAL单元提供比PRID值较大的NAL单元更好的保护,例如,仅将更重要的NAL单元纳入FEC保护机制。对解码过程的重要性随着PRID值的增加而降低。
For DID, QID, or TID, in addition to the semantics specified in Annex G of [H.264], according to this memo, values of DID, QID, or TID indicate the relative importance in their respective dimension. A lower value of DID, QID, or TID indicates a higher importance if the other two components are identical. MANEs MAY use this information to protect more important NAL units better than less important NAL units.
对于DID、QID或TID,除了[H.264]附录G中规定的语义外,根据本备忘录,DID、QID或TID的值表示其各自维度中的相对重要性。如果其他两个组件相同,则DID、QID或TID值越低表示重要性越高。与不太重要的NAL单元相比,MANE可以使用此信息更好地保护更重要的NAL单元。
For U, in addition to the semantics specified in Annex G of [H.264], according to this memo, MANEs MAY use this information to protect NAL units with U equal to 1 better than NAL units with U equal to 0.
对于U,除了[H.264]附录G中规定的语义外,根据本备忘录,MANE可以使用此信息对U等于1的NAL单元提供比U等于0的NAL单元更好的保护。
For D, in addition to the semantics specified in Annex G of [H.264], according to this memo, MANEs MAY use this information to determine whether a given NAL unit is required for successfully decoding a certain Operation Point of the SVC bitstream, hence to decide whether to forward the NAL unit.
对于D,除了[H.264]的附录G中规定的语义之外,根据本备忘录,MANEs可以使用该信息来确定是否需要给定的NAL单元来成功解码SVC比特流的某个操作点,从而决定是否转发NAL单元。
The NAL unit structure is central to H.264/AVC, [RFC6184], as well as SVC and this memo. In H.264/AVC and SVC, all coded bits for representing a video signal are encapsulated in NAL units. In [RFC6184], each RTP packet payload is structured as a NAL unit, which contains one or a part of one NAL unit specified in H.264/AVC, or aggregates one or more NAL units specified in H.264/AVC.
NAL单元结构是H.264/AVC[RFC6184]以及SVC和本备忘录的核心。在H.264/AVC和SVC中,用于表示视频信号的所有编码比特都封装在NAL单元中。在[RFC6184]中,每个RTP数据包有效负载被构造为一个NAL单元,它包含H.264/AVC中指定的一个NAL单元的一个或一部分,或聚合H.264/AVC中指定的一个或多个NAL单元。
[RFC6184] specifies three basic payload structures (in Section 5.2 of [RFC6184]): single NAL unit packet, aggregation packet, fragmentation unit, and six new types (24 to 29) of NAL units. The value of the Type field of the RTP packet payload header (i.e., the first byte of the payload) may be equal to any value from 1 to 23 for a single NAL unit packet, any value from 24 to 27 for an aggregation packet, and 28 or 29 for a fragmentation unit.
[RFC6184]指定了三种基本有效负载结构(见[RFC6184]第5.2节):单个NAL单元数据包、聚合数据包、碎片单元和六种新类型(24至29)的NAL单元。RTP分组有效载荷报头的类型字段的值(即,有效载荷的第一字节)可以等于单个NAL单元分组的1到23之间的任何值,聚合分组的24到27之间的任何值,以及分段单元的28或29。
In addition to the NAL unit types defined originally for H.264/AVC, SVC defines three new NAL unit types specifically for SVC: coded slice in scalable extension NAL units (type 20), prefix NAL units (type 14), and subset sequence parameter set NAL units (type 15), as described in Section 1.1.
除了最初为H.264/AVC定义的NAL单元类型外,SVC还专门为SVC定义了三种新的NAL单元类型:可扩展NAL单元中的编码片(类型20)、前缀NAL单元(类型14)和子集序列参数集NAL单元(类型15),如第1.1节所述。
This memo further introduces three new types of NAL units, PACSI NAL unit (NAL unit type 30) as specified in Section 4.9, empty NAL unit (type 31, subtype 1) as specified in Section 4.10, and NI-MTAP NAL unit (type 31, subtype 2) as specified in Section 4.7.1.
本备忘录进一步介绍了三种新型NAL装置:第4.9节规定的PACSI NAL装置(NAL装置类型30)、第4.10节规定的空NAL装置(类型31,子类型1)和第4.7.1节规定的NI-MTAP NAL装置(类型31,子类型2)。
The RTP packet payload structure in [RFC6184] is maintained with slight extensions in this memo, as follows. Each RTP packet payload is still structured as a NAL unit, which contains one or a part of one NAL unit specified in H.264/AVC and SVC, or contains one PACSI NAL unit or one empty NAL unit, or aggregates zero or more NAL units specified in H.264/AVC and SVC, zero or one PACSI NAL unit, and zero or more empty NAL units.
[RFC6184]中的RTP数据包有效负载结构在本备忘录中进行了轻微扩展,如下所示。每个RTP数据包有效负载仍被构造为NAL单元,其包含H.264/AVC和SVC中指定的一个NAL单元的一个或一部分,或包含一个PACSI NAL单元或一个空NAL单元,或聚合H.264/AVC和SVC中指定的零个或多个NAL单元、零个或一个PACSI NAL单元以及零个或多个空NAL单元。
In this memo, one of the three basic payload structures, fragmentation unit, remains the same as in [RFC6184], and the other two, single NAL unit packet and aggregation packet, are extended as follows. The value of the Type field of the payload header may be equal to any value from 1 to 23, inclusive, and 30 to 31, inclusive, for a single NAL unit packet, and any value from 24 to 27, inclusive, and 31, for an aggregation packet. When the Type field of the payload header is equal to 31 and the Subtype field of the payload header is equal to 2, the packet is an aggregation packet (containing an NI-MTAP NAL unit). When the Type field of the payload header is equal to 31 and the Subtype field of the payload header is equal to 1, the packet is a single NAL unit packet (containing an empty NAL unit).
在本备忘录中,三种基本有效负载结构中的一种,即碎片单元,与[RFC6184]中的相同,另外两种,即单个NAL单元数据包和聚合数据包,扩展如下。对于单个NAL单元分组,有效载荷报头的类型字段的值可以等于1到23(包括)和30到31(包括)之间的任何值,对于聚合分组,有效载荷报头的类型字段的值可以等于24到27(包括)和31之间的任何值。当有效负载标头的类型字段等于31且有效负载标头的子类型字段等于2时,该数据包为聚合数据包(包含NI-MTAP NAL单元)。当有效负载报头的类型字段等于31且有效负载报头的子类型字段等于1时,该分组是单个NAL单元分组(包含空NAL单元)。
Note that, in this memo, the length of the payload header varies depending on the value of the Type field in the first byte of the RTP packet payload. If the value is equal to 14, 20, or 30, the first four bytes of the packet payload form the payload header; otherwise, if the value is equal to 31, the first two bytes of the payload form the payload header; otherwise, the payload header is the first byte of the packet payload.
请注意,在此备忘录中,有效负载头的长度根据RTP数据包有效负载第一个字节中类型字段的值而变化。如果该值等于14、20或30,则分组有效载荷的前四个字节形成有效载荷报头;否则,如果该值等于31,则有效负载的前两个字节形成有效负载报头;否则,有效负载报头是数据包有效负载的第一个字节。
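Informative example: the rule above for the payload header length maps directly onto a small lookup, sketched here in Python. The function name is hypothetical; the Type field is taken as the five least significant bits of the first payload byte.

```python
def payload_header_length(payload: bytes) -> int:
    """Return the payload header length in bytes, based on the Type field."""
    if not payload:
        raise ValueError("empty RTP payload")
    nal_type = payload[0] & 0x1F
    if nal_type in (14, 20, 30):
        return 4  # prefix NAL unit, coded slice in scalable extension, PACSI
    if nal_type == 31:
        return 2  # F/NRI/Type octet plus the Subtype/J/K/L octet
    return 1      # all other types: one-byte header
```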
Table 1 lists the NAL unit types introduced in SVC and this memo and where they are described in this memo. Table 2 summarizes the basic payload structure types for all NAL unit types when they are directly used as RTP packet payloads according to this memo. Table 3 summarizes the NAL unit types allowed to be aggregated (i.e., used as aggregation units in aggregation packets) or fragmented (i.e., carried in fragmentation units) according to this memo.
表1列出了SVC和本备忘录中介绍的NAL机组类型,以及本备忘录中描述的NAL机组类型。表2总结了根据本备忘录直接用作RTP数据包有效载荷时所有NAL单元类型的基本有效载荷结构类型。表3总结了根据本备忘录允许聚合(即,用作聚合数据包中的聚合单元)或分段(即,在分段单元中携带)的NAL单元类型。
Table 1. NAL unit types introduced in SVC and this memo
表1。SVC和本备忘录中介绍的NAL机组类型
   Type  Subtype  NAL Unit Name                        Section Numbers
   -----------------------------------------------------------
   14     -       Prefix NAL unit                      1.1
   15     -       Subset sequence parameter set        1.1
   20     -       Coded slice in scalable extension    1.1
   30     -       PACSI NAL unit                       4.9
   31     0       reserved                             4.2.1
   31     1       Empty NAL unit                       4.10
   31     2       NI-MTAP                              4.7.1
   31     3-31    reserved                             4.2.1
Table 2. Basic payload structure types for all NAL unit types when they are directly used as RTP packet payloads
表2。所有NAL单元类型直接用作RTP数据包有效载荷时的基本有效载荷结构类型
   Type   Subtype    Basic Payload Structure
   ------------------------------------------
   0      -          reserved
   1-23   -          Single NAL Unit Packet
   24-27  -          Aggregation Packet
   28-29  -          Fragmentation Unit
   30     -          Single NAL Unit Packet
   31     0          reserved
   31     1          Single NAL Unit Packet
   31     2          Aggregation Packet
   31     3-31       reserved
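Informative example: Table 2 can be expressed as a classification function, sketched below in Python. The function name and the returned strings are hypothetical labels chosen for this illustration.

```python
def payload_structure(nal_type: int, subtype=None) -> str:
    """Classify an RTP payload by its Type (and Subtype when Type is 31)."""
    if not 0 <= nal_type <= 31:
        raise ValueError("Type is a 5-bit field")
    if 1 <= nal_type <= 23 or nal_type == 30:
        return "single NAL unit packet"
    if 24 <= nal_type <= 27:
        return "aggregation packet"
    if nal_type in (28, 29):
        return "fragmentation unit"
    if nal_type == 31:
        if subtype is None:
            raise ValueError("Type 31 requires the Subtype field")
        if subtype == 1:
            return "single NAL unit packet"  # empty NAL unit
        if subtype == 2:
            return "aggregation packet"      # NI-MTAP
    return "reserved"                        # Type 0, or Type 31 Subtype 0, 3-31
```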
Table 3. Summary of the NAL unit types allowed to be aggregated or fragmented (yes = allowed, no = disallowed, - = not applicable/not specified)
表3。允许聚合或分段的NAL单元类型摘要(是=允许,否=不允许,-=不适用/未指定)
   Type   Subtype  STAP-A  STAP-B  MTAP16  MTAP24  FU-A  FU-B  NI-MTAP
   -------------------------------------------------------------
   0       -         -       -       -       -      -     -      -
   1-23    -        yes     yes     yes     yes    yes   yes    yes
   24-29   -        no      no      no      no     no    no     no
   30      -        yes     yes     yes     yes    no    no     yes
   31      0         -       -       -       -      -     -      -
   31      1        yes     no      no      no     no    no     yes
   31      2        no      no      no      no     no    no     no
   31      3-31      -       -       -       -      -     -      -
This memo enables transmission of an SVC bitstream over one or more RTP sessions. If only one RTP session is used for transmission of the SVC bitstream, the transmission mode is referred to as single-session transmission (SST); otherwise (more than one RTP session is used for transmission of the SVC bitstream), the transmission mode is referred to as multi-session transmission (MST).
此备忘录允许通过一个或多个RTP会话传输SVC比特流。如果仅使用一个RTP会话来传输SVC比特流,则传输模式称为单会话传输(SST);否则(不止一个RTP会话用于SVC比特流的传输),传输模式被称为多会话传输(MST)。
SST SHOULD be used for point-to-point unicast scenarios, while MST SHOULD be used for point-to-multipoint multicast scenarios where different receivers require different operation points of the same SVC bitstream, to improve bandwidth utilization efficiency.
SST应用于点对点单播场景,而MST应用于点对多点多播场景,其中不同的接收器需要相同SVC比特流的不同操作点,以提高带宽利用效率。
If the OPTIONAL mst-mode media type parameter (see Section 7.1) is not present, SST MUST be used; otherwise (mst-mode is present), MST MUST be used.
如果可选的mst-mode媒体类型参数(见第7.1节)不存在,则必须使用SST;否则(存在mst-mode),必须使用MST。
When SST is in use, Section 5.4 of [RFC6184] applies with the following extensions.
当使用SST时,[RFC6184]第5.4节适用于以下扩展。
The packetization modes specified in Section 5.4 of [RFC6184], namely, single NAL unit mode, non-interleaved mode, and interleaved mode, are also referred to as session packetization modes. Table 4 summarizes the allowed session packetization modes for SST.
[RFC6184]第5.4节中规定的分组模式,即单NAL单元模式、非交织模式和交织模式,也称为会话分组模式。表4总结了SST允许的会话打包模式。
Table 4. Summary of allowed session packetization modes (denoted as "Session Mode" for simplicity) for SST (yes = allowed, no = disallowed)
表4。SST允许的会话打包模式摘要(为简单起见,表示为“会话模式”)(是=允许,否=不允许)
   Session Mode            Allowed
   -------------------------------------
   Single NAL Unit Mode      yes
   Non-Interleaved Mode      yes
   Interleaved Mode          yes
For NAL unit types in the range of 0 to 29, inclusive, the NAL unit types allowed to be directly used as packet payloads for each session packetization mode are the same as specified in Section 5.4 of [RFC6184]. For other NAL unit types, which are newly introduced in this memo, the NAL unit types allowed to be directly used as packet payloads for each session packetization mode are summarized in Table 5.
对于0到29(含0到29)范围内的NAL单元类型,允许直接用作每个会话分组模式的分组有效载荷的NAL单元类型与[RFC6184]第5.4节中的规定相同。对于本备忘录中新引入的其他NAL单元类型,表5总结了允许直接用作每个会话分组模式的分组有效载荷的NAL单元类型。
Table 5. New NAL unit types allowed to be directly used as packet payloads for each session packetization mode (yes = allowed, no = disallowed, - = not applicable/not specified)
   Type   Subtype    Single NAL    Non-Interleaved    Interleaved
                     Unit Mode         Mode              Mode
   -------------------------------------------------------------
   30     -            yes            no                no
   31     0             -              -                 -
   31     1            yes            yes               no
   31     2            no             yes               no
   31     3-31          -              -                 -
For MST, this memo specifies four MST packetization modes:
对于MST,本备忘录规定了四种MST打包模式:
o Non-interleaved timestamp based mode (NI-T);
o 基于非交织时间戳的模式(NI-T);
o Non-interleaved cross-session decoding order number (CS-DON) based mode (NI-C);
o 基于非交织跨会话解码顺序号(CS-DON)的模式(NI-C);
o Non-interleaved combined timestamp and CS-DON mode (NI-TC); and
o 非交织组合时间戳和CS-DON模式(NI-TC);和
o Interleaved CS-DON (I-C) mode.
o 交织CS-DON(I-C)模式。
These four modes differ in two ways. First, they differ in terms of whether NAL units are required to be transmitted within each RTP session in decoding order (i.e., non-interleaved), or they are allowed to be transmitted in a different order (i.e., interleaved).
这四种模式有两种不同。首先,它们在以下方面有所不同:是需要在每个RTP会话中以解码顺序(即,非交织)发送NAL单元,还是允许以不同顺序(即,交织)发送NAL单元。
Second, they differ in the mechanisms they provide in order to recover the correct decoding order of the NAL units across all RTP sessions involved.
其次,它们提供的机制不同,以便在所有涉及的RTP会话中恢复NAL单元的正确解码顺序。
The NI-T, NI-C, and NI-TC modes do not allow interleaving, and are thus targeted for systems that require relatively low end-to-end latency, e.g., conversational systems. The I-C mode allows interleaving and is thus targeted for systems that do not require very low end-to-end latency. The benefits of interleaving are the same as those of the interleaved mode specified in [RFC6184].
NI-T、NI-C和NI-TC模式不允许交织,因此针对需要相对较低端到端延迟的系统,例如会话系统。I-C模式允许交织,因此针对不需要非常低的端到端延迟的系统。交织的好处与[RFC6184]中规定的交织模式的好处相同。
The NI-T mode uses timestamps to recover the decoding order of NAL units, whereas the NI-C and I-C modes both use the CS-DON mechanism (explained later) to do so. The NI-TC mode provides both timestamps and the CS-DON method; receivers in this case may choose to use either method for performing decoding order recovery. The MST packetization mode in use MUST be signaled by the value of the OPTIONAL mst-mode media type parameter. The used MST packetization mode governs which session packetization modes are allowed in the associated RTP sessions, which in turn govern which NAL unit types are allowed to be directly used as RTP packet payloads.
NI-T模式使用时间戳来恢复NAL单元的解码顺序,而NI-C和I-C模式都使用CS-DON机制(稍后解释)来恢复NAL单元的解码顺序。NI-TC模式提供时间戳和CS-DON方法;在这种情况下,接收机可以选择使用任一方法来执行解码顺序恢复。使用中的MST打包模式必须通过可选MST mode media type参数的值发出信号。使用的MST打包模式控制相关RTP会话中允许的会话打包模式,进而控制允许直接用作RTP包有效负载的NAL单元类型。
Table 6 summarizes the allowed session packetization modes for NI-T, NI-C, and NI-TC. Table 7 summarizes the allowed session packetization modes for I-C.
表6总结了NI-T、NI-C和NI-TC允许的会话打包模式。表7总结了I-C允许的会话打包模式。
Table 6. Summary of allowed session packetization modes (denoted as "Session Mode" for simplicity) for NI-T, NI-C, and NI-TC (yes = allowed, no = disallowed)
表6。NI-T、NI-C和NI-TC(是=允许,否=不允许)允许的会话打包模式摘要(为简单起见,表示为“会话模式”)
   Session Mode            Base Session    Enhancement Session
   -----------------------------------------------------------
   Single NAL Unit Mode        yes              no
   Non-Interleaved Mode        yes              yes
   Interleaved Mode            no               no
Table 7. Summary of allowed session packetization modes (denoted as "Session Mode" for simplicity) for I-C (yes = allowed, no = disallowed)
表7。I-C允许的会话打包模式摘要(为简单起见,表示为“会话模式”)(是=允许,否=不允许)
   Session Mode            Base Session    Enhancement Session
   -----------------------------------------------------------
   Single NAL Unit Mode        no               no
   Non-Interleaved Mode        no               no
   Interleaved Mode            yes              yes
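Informative example: Tables 6 and 7 can be encoded as a lookup from MST packetization mode to the session packetization modes allowed in the base and enhancement RTP sessions. The Python sketch below is one possible encoding. The mode labels reuse the abbreviations of this memo, while the session mode strings and the function name are hypothetical.

```python
# Table 6: NI-T, NI-C, and NI-TC share the same allowed session modes.
_NON_INTERLEAVED_MST = {
    "base": {"single-nal", "non-interleaved"},
    "enhancement": {"non-interleaved"},
}

ALLOWED_SESSION_MODES = {
    "NI-T": _NON_INTERLEAVED_MST,
    "NI-C": _NON_INTERLEAVED_MST,
    "NI-TC": _NON_INTERLEAVED_MST,
    # Table 7: I-C permits only the interleaved mode in every session.
    "I-C": {"base": {"interleaved"}, "enhancement": {"interleaved"}},
}


def session_mode_allowed(mst_mode: str, session: str, session_mode: str) -> bool:
    """Check a session packetization mode against Tables 6 and 7."""
    try:
        return session_mode in ALLOWED_SESSION_MODES[mst_mode][session]
    except KeyError:
        raise ValueError(f"unknown MST mode or session kind: {mst_mode}, {session}")
```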
For NAL unit types in the range of 0 to 29, inclusive, the NAL unit types allowed to be directly used as packet payloads for each session packetization mode are the same as specified in Section 5.4 of [RFC6184]. For other NAL unit types, which are newly introduced in this memo, the NAL unit types allowed to be directly used as packet payloads for each allowed session packetization mode for NI-T, NI-C, NI-TC, and I-C are summarized in Tables 8, 9, 10, and 11, respectively.
对于0到29(含0到29)范围内的NAL单元类型,允许直接用作每个会话分组模式的分组有效载荷的NAL单元类型与[RFC6184]第5.4节中的规定相同。对于本备忘录中新引入的其他NAL单元类型,表8、表9、表10和表11分别总结了允许直接用作NI-T、NI-C、NI-TC和I-C的每个允许会话打包模式的数据包有效载荷的NAL单元类型。
Table 8. New NAL unit types allowed to be directly used as packet payloads for each allowed session packetization mode when NI-T is in use (yes = allowed, no = disallowed, - = not applicable/not specified)
表8。当NI-T正在使用时,允许将新的NAL单元类型直接用作每个允许会话打包模式的数据包有效载荷(是=允许,否=不允许,-=不适用/未指定)
   Type   Subtype    Single NAL    Non-Interleaved
                     Unit Mode         Mode
   ---------------------------------------------------
   30     -            yes            no
   31     0             -              -
   31     1            yes            yes
   31     2            no             yes
   31     3-31          -              -
Table 9. New NAL unit types allowed to be directly used as packet payloads for each allowed session packetization mode when NI-C is in use (yes = allowed, no = disallowed, - = not applicable/not specified)
表9。当NI-C正在使用时,允许将新的NAL单元类型直接用作每个允许会话打包模式的数据包有效载荷(是=允许,否=不允许,-=不适用/未指定)
   Type   Subtype    Single NAL    Non-Interleaved
                     Unit Mode         Mode
   ---------------------------------------------------
   30     -            yes            yes
   31     0             -              -
   31     1            no             no
   31     2            no             yes
   31     3-31          -              -
Table 10. New NAL unit types allowed to be directly used as packet payloads for each allowed session packetization mode when NI-TC is in use (yes = allowed, no = disallowed, - = not applicable/not specified)
表10。当NI-TC正在使用时,允许将新的NAL单元类型直接用作每个允许会话打包模式的数据包有效载荷(是=允许,否=不允许,-=不适用/未指定)
   Type   Subtype    Single NAL    Non-Interleaved
                     Unit Mode         Mode
   ---------------------------------------------------
   30     -            yes            yes
   31     0             -              -
   31     1            yes            yes
   31     2            no             yes
   31     3-31          -              -
Table 11. New NAL unit types allowed to be directly used as packet payloads for the allowed session packetization mode when I-C is in use (yes = allowed, no = disallowed, - = not applicable/not specified)
表11。当I-C正在使用时,允许将新的NAL单元类型直接用作允许会话打包模式的数据包有效载荷(是=允许,否=不允许,-=不适用/未指定)
   Type   Subtype    Interleaved Mode
   ------------------------------------
   30     -               no
   31     0                -
   31     1               no
   31     2               no
   31     3-31             -
When MST is in use and the MST packetization mode in use is NI-C, empty NAL units (type 31, subtype 1) MUST NOT be used, i.e., no RTP packet is allowed to contain one or more empty NAL units.
当使用MST且使用的MST打包模式为NI-C时,不得使用空NAL单元(类型31,子类型1),即不允许RTP数据包包含一个或多个空NAL单元。
When MST is in use and the MST packetization mode in use is I-C, both empty NAL units (type 31, subtype 1) and NI-MTAP NAL units (type 31, subtype 2) MUST NOT be used, i.e., no RTP packet is allowed to contain one or more empty NAL units or an NI-MTAP NAL unit.
当使用MST且使用的MST打包模式为I-C时,不得使用空NAL单元(类型31,子类型1)和NI-MTAP NAL单元(类型31,子类型2),即RTP数据包不允许包含一个或多个空NAL单元或NI-MTAP NAL单元。
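Informative example: the entries of Tables 8 through 11 can be collected into one lookup, as in the Python sketch below. It answers whether one of the new NAL unit types may be used directly as an RTP packet payload for a given MST packetization mode and session packetization mode. The constant names, session mode strings, and function name are hypothetical, and reserved Subtype values are rejected because the tables do not specify them.

```python
PACSI = (30, None)   # Type 30
EMPTY = (31, 1)      # Type 31, Subtype 1
NI_MTAP = (31, 2)    # Type 31, Subtype 2

# (MST mode, session packetization mode) -> new NAL unit types allowed.
ALLOWED_DIRECT_PAYLOADS = {
    ("NI-T", "single-nal"): {PACSI, EMPTY},               # Table 8
    ("NI-T", "non-interleaved"): {EMPTY, NI_MTAP},
    ("NI-C", "single-nal"): {PACSI},                      # Table 9
    ("NI-C", "non-interleaved"): {PACSI, NI_MTAP},
    ("NI-TC", "single-nal"): {PACSI, EMPTY},              # Table 10
    ("NI-TC", "non-interleaved"): {PACSI, EMPTY, NI_MTAP},
    ("I-C", "interleaved"): set(),                        # Table 11
}


def direct_payload_allowed(mst_mode, session_mode, nal_type, subtype=None):
    """Look up a new NAL unit type in Tables 8-11."""
    key = (mst_mode, session_mode)
    if key not in ALLOWED_DIRECT_PAYLOADS:
        raise ValueError("session mode not allowed for this MST mode")
    unit = (nal_type, subtype)
    if unit not in (PACSI, EMPTY, NI_MTAP):
        raise ValueError("only Type 30 and Type 31 Subtype 1 or 2 are covered")
    return unit in ALLOWED_DIRECT_PAYLOADS[key]
```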
Section 5.6 of [RFC6184] applies with the following extensions.
[RFC6184]第5.6节适用于以下扩展。
The payload of a single NAL unit packet MAY be a PACSI NAL unit (Type 30) or an empty NAL unit (Type 31 and Subtype 1), in addition to a NAL unit with NAL unit type equal to any value from 1 to 23, inclusive.
单个NAL单元分组的有效载荷可以是PACSI NAL单元(类型30)或空NAL单元(类型31和子类型1),以及NAL单元类型等于1到23之间的任何值(包括1到23)的NAL单元。
If the Type field of the first byte of the payload is not equal to 31, the payload header is the first byte of the payload. Otherwise, (the Type field of the first byte of the payload is equal to 31), the payload header is the first two bytes of the payload.
如果有效负载第一个字节的类型字段不等于31,则有效负载报头是有效负载的第一个字节。否则,(有效负载的第一个字节的类型字段等于31),有效负载头是有效负载的前两个字节。
In addition to Section 5.7 of [RFC6184], the following applies in this memo.
除[RFC6184]第5.7节外,以下内容适用于本备忘录。
One new NAL unit type introduced in this memo is the non-interleaved multi-time aggregation packet (NI-MTAP). An NI-MTAP consists of one or more non-interleaved multi-time aggregation units.
本备忘录中引入的一种新的NAL单元类型是非交错多时间聚合数据包(NI-MTAP)。NI-MTAP由一个或多个非交错多时间聚合单元组成。
The NAL units contained in NI-MTAPs MUST be aggregated in decoding order.
NI MTAP中包含的NAL单元必须按解码顺序聚合。
A non-interleaved multi-time aggregation unit for the NI-MTAP consists of 16 bits of unsigned size information of the following NAL unit (in network byte order), and 16 bits (in network byte order) of timestamp offset (TS offset) for the NAL unit. The structure is presented in Figure 1. The starting or ending position of an aggregation unit within a packet may or may not be on a 32-bit word boundary. The NAL units in the NI-MTAP are ordered in NAL unit decoding order.
NI-MTAP的非交织多时间聚合单元由以下NAL单元的16位无符号大小信息(按网络字节顺序)和NAL单元的16位时间戳偏移(TS偏移)组成。结构如图1所示。数据包内聚合单元的起始或结束位置可以在32位字边界上,也可以不在32位字边界上。NI-MTAP中的NAL单元按NAL单元解码顺序排序。
The Type field of the NI-MTAP MUST be set equal to "31".
NI-MTAP的类型字段必须设置为“31”。
The F bit MUST be set to 0 if all the F bits of the aggregated NAL units are zero; otherwise, it MUST be set to 1.
如果聚合NAL单元的所有F位均为零,则F位必须设置为0;否则,必须将其设置为1。
The value of NRI MUST be the maximum value of NRI across all NAL units carried in the NI-MTAP packet.
NRI值必须是NI-MTAP数据包中所有NAL单元的最大NRI值。
The field Subtype MUST be equal to 2.
字段子类型必须等于2。
If the field J is equal to 1, the optional DON field MUST be present for each of the non-interleaved multi-time aggregation units. For SST, the J field MUST be equal to 0. For MST, in the NI-T mode the J field MUST be equal to 0, whereas in the NI-C or NI-TC mode the J field MUST be equal to 1. When the NI-C or NI-TC mode is in use, the DON field, when present, MUST represent the CS-DON value for the particular NAL unit as defined in Section 6.2.2.
如果字段J等于1,则每个非交错多时间聚合单元必须存在可选的DON字段。对于SST,J字段必须等于0。对于MST,在NI-T模式下,J场必须等于0,而在NI-C或NI-TC模式下,J场必须等于1。当使用NI-C或NI-TC模式时,DON字段(如果存在)必须表示第6.2.2节中定义的特定NAL装置的CS-DON值。
The fields K and L MUST be both equal to 0.
字段K和L必须都等于0。
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   :        NAL unit size          |        TS offset              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |        DON (optional)         |                               |
   |-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+     NAL unit                  |
   |                                                               |
   |                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                               :
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 1. Non-interleaved multi-time aggregation unit for NI-MTAP
图1。NI-MTAP的非交错多时间聚合单元
Let TS be the RTP timestamp of the packet carrying the NAL unit. Recall that the NALU-time of a NAL unit in an MTAP is defined in [RFC6184] as the value that the RTP timestamp would have if that NAL unit were transported in its own RTP packet. The timestamp offset field MUST be set to the value given by the following formula:
设TS为携带NAL单元的分组的RTP时间戳。回想一下,MTAP中NAL单元的NALU时间在[RFC6184]中定义为RTP时间戳的值,如果该NAL单元将在其自身的RTP数据包中传输。时间戳偏移字段必须设置为等于以下公式值的值:
      if NALU-time >= TS, TS offset = NALU-time - TS
      else, TS offset = NALU-time + (2^32 - TS)
For the "earliest" multi-time aggregation unit in an NI-MTAP, the timestamp offset MUST be zero. Hence, the RTP timestamp of the NI-MTAP itself is identical to the earliest NALU-time.
对于NI-MTAP中的“最早”多时间聚合单元,时间戳偏移量必须为零。因此,NI-MTAP本身的RTP时间戳与最早的NALU时间相同。
Informative note: The "earliest" multi-time aggregation unit is the one that would have the smallest extended RTP timestamp among all the aggregation units of an NI-MTAP if the aggregation units were encapsulated in single NAL unit packets. An extended timestamp is a timestamp that has more than 32 bits and is capable of counting the wraparound of the timestamp field, thus enabling one to determine the smallest value if the timestamp wraps. Such an "earliest" aggregation unit may or may not be the first one in the order in which the aggregation units are encapsulated in an NI-MTAP. The "earliest" NAL unit need not be the same as the first NAL unit in the NAL unit decoding order either.
资料性说明:“最早”的多次聚合单元是指如果聚合单元封装在单个NAL单元数据包中,则在NI-MTAP的所有聚合单元中具有最小扩展RTP时间戳的单元。扩展时间戳是具有超过32位的时间戳,并且能够对时间戳字段的环绕进行计数,从而使得能够在时间戳环绕时确定最小值。这种“最早的”聚合单元可能是也可能不是聚合单元封装在NI-MTAP中的顺序中的第一个聚合单元。“最早的”NAL单元也不必与NAL单元解码顺序中的第一个NAL单元相同。
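The timestamp offset formula above can be sketched as follows. This is a non-normative illustration; the function name ts_offset is hypothetical, and the range check reflects the 16-bit width of the TS offset field shown in Figure 1.

```python
def ts_offset(nalu_time: int, ts: int) -> int:
    """Compute the TS offset of a NAL unit carried in an NI-MTAP.

    nalu_time and ts are 32-bit RTP timestamps. The two branches
    mirror the formula in the text; the else branch handles
    wraparound of the 32-bit timestamp.
    """
    if nalu_time >= ts:
        offset = nalu_time - ts
    else:
        offset = nalu_time + (2**32 - ts)
    if offset > 0xFFFF:
        # The TS offset field is only 16 bits wide.
        raise ValueError("TS offset does not fit in 16 bits")
    return offset
```

The "earliest" aggregation unit has NALU-time equal to TS, so its offset is zero. A NALU-time of 100 with TS equal to 2^32 - 200 yields an offset of 300 through the wraparound branch.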
Figure 2 presents an example of an RTP packet that contains an NI-MTAP that contains two non-interleaved multi-time aggregation units, labeled as 1 and 2 in the figure.
图2显示了一个RTP数据包的示例,其中包含一个NI-MTAP,该NI-MTAP包含两个非交错多时间聚合单元,在图中标记为1和2。
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          RTP Header                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |F|NRI|  Type   | Subtype |J|K|L|                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               |
   |                                                               |
   |        Non-interleaved multi-time aggregation unit #1         |
   :                                                               :
   |                                 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                 | Non-interleaved multi-time  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                           |
   |                      aggregation unit #2                      |
   :                                                               :
   |                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                               :...OPTIONAL RTP padding        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 2. An RTP packet including an NI-MTAP containing two non-interleaved multi-time aggregation units
图2。一种RTP数据包,包括一个NI-MTAP,其中包含两个非交错的多次聚合单元
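The NI-MTAP layout of Figures 1 and 2 can be walked with a short parser such as the following sketch. It is illustrative only: the names AggregationUnit and parse_ni_mtap are hypothetical, the RTP header and any RTP padding are assumed to have been removed already, and the size field is assumed to count only the bytes of the NAL unit that follows the TS offset and optional DON fields.

```python
import struct
from typing import List, NamedTuple, Optional


class AggregationUnit(NamedTuple):
    ts_offset: int          # 16-bit timestamp offset
    don: Optional[int]      # 16-bit DON, present only when J == 1
    nal_unit: bytes         # the aggregated NAL unit


def parse_ni_mtap(payload: bytes) -> List[AggregationUnit]:
    """Split an NI-MTAP payload into its aggregation units.

    Assumption: payload starts at the two-byte payload header and
    excludes the RTP header and RTP padding.
    """
    if len(payload) < 2:
        raise ValueError("payload too short for an NI-MTAP header")
    nal_type = payload[0] & 0x1F
    subtype = payload[1] >> 3
    j_bit = (payload[1] >> 2) & 1
    if nal_type != 31 or subtype != 2:
        raise ValueError("not an NI-MTAP (Type 31, Subtype 2)")

    units = []
    pos = 2
    while pos < len(payload):
        if pos + 4 > len(payload):
            raise ValueError("truncated aggregation unit header")
        # Size and TS offset are both 16 bits in network byte order.
        size, ts_off = struct.unpack_from("!HH", payload, pos)
        pos += 4
        don = None
        if j_bit:
            if pos + 2 > len(payload):
                raise ValueError("truncated DON field")
            (don,) = struct.unpack_from("!H", payload, pos)
            pos += 2
        if pos + size > len(payload):
            raise ValueError("NAL unit size exceeds payload")
        units.append(AggregationUnit(ts_off, don, payload[pos:pos + size]))
        pos += size
    return units
```

A receiver would add each ts_offset to the RTP timestamp of the packet (modulo 2^32) to recover the NALU-time of the corresponding NAL unit.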
Section 5.8 of [RFC6184] applies.
[RFC6184]第5.8节适用。
Informative note: In case a NAL unit with the four-byte SVC NAL unit header is fragmented, the three-byte SVC-specific header extension is considered as part of the NAL unit payload. That is, the three-byte SVC-specific header extension is only available in the first fragment of the fragmented NAL unit.
资料性说明:如果具有四字节SVC NAL单元头的NAL单元被分段,则三字节SVC特定头扩展被视为NAL单元有效负载的一部分。也就是说,三字节SVC特定的报头扩展仅在分段NAL单元的第一个片段中可用。
Another new type of NAL unit specified in this memo is the payload content scalability information (PACSI) NAL unit. The Type field of PACSI NAL units MUST be equal to 30 (a NAL unit type value left unspecified in [H.264] and [RFC6184]). A PACSI NAL unit MAY be carried in a single NAL unit packet or an aggregation packet, and MUST NOT be fragmented.
PACSI NAL units may be used for the following purposes:
PACSI NAL装置可用于以下目的:
o To enable MANEs to decide whether to forward, process, or discard aggregation packets, by checking in PACSI NAL units the scalability information and other characteristics of the aggregated NAL units, rather than looking into the aggregated NAL units themselves, which are defined by the video coding specification;
o 通过在PACSI NAL单元中检查被聚合NAL单元的可伸缩性信息和其他特征,而不是查看由视频编码规范定义的被聚合NAL单元本身,使MANE能够决定是转发、处理还是丢弃聚合数据包;
o To enable correct decoding order recovery in MST using the NI-C or NI-TC mode, with the help of the CS-DON information included in PACSI NAL units; and
o 借助PACSI NAL单元中包含的CS-DON信息,使用NI-C或NI-TC模式在MST中实现正确的解码顺序恢复;和
o To improve resilience to packet losses, e.g., by utilizing the following data or information included in PACSI NAL units: repeated Supplemental Enhancement Information (SEI) messages, information regarding the start and end of layer representations, and the indices to layer representations of the lowest temporal subset.
o 例如,通过利用PACSI NAL单元中包括的以下数据或信息来提高对分组丢失的恢复能力:重复补充增强信息(SEI)消息、关于层表示的开始和结束的信息以及最低时间子集的层表示的索引。
PACSI NAL units MAY be ignored in the NI-T mode without affecting the decoding order recovery process.
在NI-T模式中,可以忽略PACSI NAL单元,而不影响解码顺序恢复过程。
When a PACSI NAL unit is present in an aggregation packet, the following applies.
当聚合数据包中存在PACSI NAL单元时,以下情况适用。
o The PACSI NAL unit MUST be the first aggregated NAL unit in the aggregation packet.
o PACSI NAL单元必须是聚合数据包中的第一个聚合NAL单元。
o There MUST be at least one additional aggregated NAL unit in the aggregation packet.
o 聚合数据包中必须至少有一个额外的聚合NAL单元。
o The RTP header fields and the payload header fields of the aggregation packet are set as if the PACSI NAL unit was not included in the aggregation packet.
o 将聚合分组的RTP报头字段和有效负载报头字段设置为聚合分组中不包括PACSI NAL单元。
o If the aggregation packet is an MTAP16, MTAP24, or NI-MTAP with the J field equal to 1, the decoding order number (DON) for the PACSI NAL unit MUST be set to indicate that the PACSI NAL unit has an identical DON to the first NAL unit in decoding order among the remaining NAL units in the aggregation packet.
o 如果聚合数据包是J字段等于1的MTAP16、MTAP24或NI-MTAP,则必须设置PACSI NAL单元的解码顺序号(DON),以指示PACSI NAL单元在聚合数据包中的其余NAL单元中具有与解码顺序相同的第一个NAL单元的DON。
When a PACSI NAL unit is included in a single NAL unit packet, it is associated with the next non-PACSI NAL unit in transmission order, and the RTP header fields of the packet are set as if the next non-PACSI NAL unit in transmission order was included in a single NAL unit packet.
当PACSI NAL单元包括在单个NAL单元分组中时,它与传输顺序中的下一个非PACSI NAL单元相关联,并且分组的RTP报头字段被设置为好像传输顺序中的下一个非PACSI NAL单元包括在单个NAL单元分组中。
The PACSI NAL unit structure is as follows. The first four octets are exactly the same as the four-byte SVC NAL unit header discussed in Section 1.1.3. They are followed by one octet containing several flags, then five optional octets, and finally zero or more SEI NAL units. Each SEI NAL unit is preceded by a 16-bit unsigned size field (in network byte order) that indicates the size of the following NAL unit in bytes (excluding these two octets, but including the NAL unit header octet of the SEI NAL unit). Figure 3 illustrates the PACSI NAL unit structure and an example of a PACSI NAL unit containing two SEI NAL units.
PACSI NAL单元结构如下所示。前四个八位字节与第1.1.3节中讨论的四字节SVC NAL单元头完全相同。它们后面是一个包含多个标志的八位字节,然后是五个可选的八位字节,最后是零个或多个SEI NAL单元。每个SEI NAL单元前面都有一个16位无符号大小字段(按网络字节顺序),表示其后NAL单元的字节大小(不包括这两个八位字节,但包括SEI NAL单元的NAL单元头八位字节)。图3说明了PACSI NAL单元结构和包含两个SEI NAL单元的PACSI NAL单元示例。
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |F|NRI|  Type   |R|I|   PRID    |N| DID |  QID  | TID |U|D|O| RR|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |X|Y|T|A|P|C|S|E| TL0PICIDX (o) |        IDRPICID (o)           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          DONC (o)             |        NAL unit size 1        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |                        SEI NAL unit 1                         |
   |                                                               |
   |               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |               |        NAL unit size 2        |               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+               |
   |                                                               |
   |                        SEI NAL unit 2                         |
   |                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 3. PACSI NAL unit structure. Fields suffixed by "(o)" are OPTIONAL.
图3。PACSI NAL单元结构。后缀为“(o)”的字段是可选的。
The bits A, P, and C are specified only if the bit X is equal to 1. The bits S and E are specified, and the fields TL0PICIDX and IDRPICID are present, only if the bit Y is equal to 1. The field DONC is present only if the bit T is equal to 1. The field T MUST be equal to 0 if the PACSI NAL unit is contained in an STAP-B, MTAP16, MTAP24, or NI-MTAP with the J field equal to 1.
仅当位X等于1时,才指定位A、P和C。仅当位Y等于1时,才指定位S和E,并且存在字段TL0PICIDX和IDRPICID。只有当位T等于1时,字段DONC才存在。如果PACSI NAL单位包含在STAP-B、MTAP16、MTAP24或NI-MTAP中,且J字段等于1,则字段T必须等于0。
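The PACSI NAL unit layout of Figure 3 can be read with a parser along the following lines. This is a sketch under stated assumptions, not a definitive implementation: the names Pacsi and parse_pacsi are hypothetical, the optional fields are assumed to appear in the order shown in Figure 3 (TL0PICIDX, IDRPICID, DONC), and the flag bits X through E are assumed to run from the most significant to the least significant bit of the fifth octet.

```python
import struct
from typing import Dict, List, NamedTuple, Optional


class Pacsi(NamedTuple):
    svc_header: bytes               # first four octets, unparsed
    flags: Dict[str, int]           # X, Y, T, A, P, C, S, E
    tl0picidx: Optional[int]        # present only when Y == 1
    idrpicid: Optional[int]         # present only when Y == 1
    donc: Optional[int]             # present only when T == 1
    sei_nal_units: List[bytes]      # zero or more SEI NAL units


def parse_pacsi(nal: bytes) -> Pacsi:
    """Parse a PACSI NAL unit (Type 30) into its fields."""
    if len(nal) < 5:
        raise ValueError("PACSI NAL unit needs at least five octets")
    if nal[0] & 0x1F != 30:
        raise ValueError("not a PACSI NAL unit (Type 30)")
    flag_octet = nal[4]
    flags = {name: (flag_octet >> (7 - i)) & 1
             for i, name in enumerate("XYTAPCSE")}

    pos = 5
    tl0picidx = idrpicid = donc = None
    if flags["Y"]:
        # One octet of TL0PICIDX followed by two octets of IDRPICID.
        tl0picidx, idrpicid = struct.unpack_from("!BH", nal, pos)
        pos += 3
    if flags["T"]:
        (donc,) = struct.unpack_from("!H", nal, pos)
        pos += 2

    sei = []
    while pos < len(nal):
        # Each SEI NAL unit is preceded by a 16-bit size field.
        (size,) = struct.unpack_from("!H", nal, pos)
        pos += 2
        if pos + size > len(nal):
            raise ValueError("SEI NAL unit size exceeds PACSI length")
        sei.append(nal[pos:pos + size])
        pos += size
    return Pacsi(nal[:4], flags, tl0picidx, idrpicid, donc, sei)
```

Note that struct.unpack_from raises struct.error when an optional field is truncated; a production parser would likely convert that into its own error type.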
The values of the fields in PACSI NAL unit MUST be set as follows.
PACSI NAL单元中各字段的值必须按如下方式设置。
o The F bit MUST be set to 1 if the F bit in at least one of the remaining NAL units in the aggregation packet is equal to 1 (when the PACSI NAL unit is included in an aggregation packet) or if the next non-PACSI NAL unit in transmission order has the F bit equal to 1 (when the PACSI NAL unit is included in a single NAL unit packet). Otherwise, the F bit MUST be set to 0.
o 如果聚合分组中至少一个剩余NAL单元中的F位等于1(当PACSI NAL单元包括在聚合分组中时),或者如果传输顺序中的下一个非PACSI NAL单元的F位等于1(当PACSI NAL单元包括在单个NAL单元分组中时),则F位必须设置为1。否则,F位必须设置为0。
o The NRI field MUST be set to the highest value of NRI field among all the remaining NAL units in the aggregation packet (when the PACSI NAL unit is included in an aggregation packet) or the value of the NRI field of the next non-PACSI NAL unit in transmission order (when the PACSI NAL unit is included in a single NAL unit packet).
o NRI字段必须设置为聚合数据包中所有剩余NAL单元中NRI字段的最高值(当PACSI NAL单元包含在聚合数据包中时),或传输顺序中下一个非PACSI NAL单元的NRI字段的值(当PACSI NAL单元包含在单个NAL单元数据包中时)。
o The Type field MUST be set to 30.
o 类型字段必须设置为30。
o The R bit MUST be set to 1. Receivers MUST ignore the value of R.
o R位必须设置为1。接收者必须忽略R的值。
o The I bit MUST be set to 1 if the I bit of at least one of the remaining NAL units in the aggregation packet is equal to 1 (when the PACSI NAL unit is included in an aggregation packet) or if the I bit of the next non-PACSI NAL unit in transmission order is equal to 1 (when the PACSI NAL unit is included in a single NAL unit packet). Otherwise, the I bit MUST be set to 0.
o 如果聚合分组中至少一个剩余NAL单元的I位等于1(当PACSI NAL单元包括在聚合分组中时),或者如果传输顺序中的下一个非PACSI NAL单元的I位等于1(当PACSI NAL单元包括在单个NAL单元分组中时),则I位必须设置为1。否则,I位必须设置为0。
o The PRID field MUST be set to the lowest value of the PRID values of the remaining NAL units in the aggregation packet (when the PACSI NAL unit is included in an aggregation packet) or the PRID value of the next non-PACSI NAL unit in transmission order (when the PACSI NAL unit is included in a single NAL unit packet).
o PRID字段必须设置为聚合数据包中剩余NAL单元的PRID值的最低值(当PACSI NAL单元包含在聚合数据包中时)或传输顺序中下一个非PACSI NAL单元的PRID值(当PACSI NAL单元包含在单个NAL单元数据包中时)。
o The N bit MUST be set to 1 if the N bit of all the remaining NAL units in the aggregation packet is equal to 1 (when the PACSI NAL unit is included in an aggregation packet) or if the N bit of the next non-PACSI NAL unit in transmission order is equal to 1 (when the PACSI NAL unit is included in a single NAL unit packet). Otherwise, the N bit MUST be set to 0.
o 如果聚合数据包中所有剩余NAL单元的N位等于1(当PACSI NAL单元包含在聚合数据包中时),或者如果传输顺序中的下一个非PACSI NAL单元的N位等于1(当PACSI NAL单元包含在单个NAL单元数据包中时),则N位必须设置为1。否则,N位必须设置为0。
o The DID field MUST be set to the lowest value of the DID values of the remaining NAL units in the aggregation packet (when the PACSI NAL unit is included in an aggregation packet) or the DID value of the next non-PACSI NAL unit in transmission order (when the PACSI NAL unit is included in a single NAL unit packet).
o DID字段必须设置为聚合数据包中剩余NAL单元的DID值的最低值(当PACSI NAL单元包含在聚合数据包中时)或传输顺序中下一个非PACSI NAL单元的DID值(当PACSI NAL单元包含在单个NAL单元数据包中时)。
o The QID field MUST be set to the lowest value of the QID values of the remaining NAL units with the lowest value of DID in the aggregation packet (when the PACSI NAL unit is included in an aggregation packet) or the QID value of the next non-PACSI NAL unit in transmission order (when the PACSI NAL unit is included in a single NAL unit packet).
o QID字段必须设置为聚合数据包中DID值最低的剩余NAL单元的QID值的最低值(当PACSI NAL单元包含在聚合数据包中时)或传输顺序中的下一个非PACSI NAL单元的QID值(当PACSI NAL单元包含在单个NAL单元数据包中时)。
o The TID field MUST be set to the lowest value of the TID values of the remaining NAL units with the lowest value of DID in the aggregation packet (when the PACSI NAL unit is included in an aggregation packet) or the TID value of the next non-PACSI NAL unit in transmission order (when the PACSI NAL unit is included in a single NAL unit packet).
o TID字段必须设置为聚合数据包中具有最低DID值的剩余NAL单元的TID值的最低值(当PACSI NAL单元包含在聚合数据包中时)或传输顺序中下一个非PACSI NAL单元的TID值(当PACSI NAL单元包含在单个NAL单元数据包中时)。
o The U bit MUST be set to 1 if the U bit of at least one of the remaining NAL units in the aggregation packet is equal to 1 (when the PACSI NAL unit is included in an aggregation packet) or if the U bit of the next non-PACSI NAL unit in transmission order is equal to 1 (when the PACSI NAL unit is included in a single NAL unit packet). Otherwise, the U bit MUST be set to 0.
o 如果聚合分组中至少一个剩余NAL单元的U位等于1(当PACSI NAL单元包括在聚合分组中时),或者如果传输顺序中的下一个非PACSI NAL单元的U位等于1(当PACSI NAL单元包括在单个NAL单元分组中时),则U位必须设置为1。否则,U位必须设置为0。
o The D bit MUST be set to 1 if the D value of all the remaining NAL units in the aggregation packet is equal to 1 (when the PACSI NAL unit is included in an aggregation packet) or if the D bit of the next non-PACSI NAL unit in transmission order is equal to 1 (when the PACSI NAL unit is included in a single NAL unit packet). Otherwise, the D bit MUST be set to 0.
o 如果聚合数据包中所有剩余NAL单元的D值等于1(当PACSI NAL单元包含在聚合数据包中时),或者如果传输顺序中的下一个非PACSI NAL单元的D位等于1(当PACSI NAL单元包含在单个NAL单元数据包中时),则D位必须设置为1。否则,D位必须设置为0。
o The O bit MUST be set to 1 if the O bit of at least one of the remaining NAL units in the aggregation packet is equal to 1 (when the PACSI NAL unit is included in an aggregation packet) or if the O bit of the next non-PACSI NAL unit in transmission order is equal to 1 (when the PACSI NAL unit is included in a single NAL unit packet). Otherwise, the O bit MUST be set to 0.
o 如果聚合分组中至少一个剩余NAL单元的O位等于1(当PACSI NAL单元包括在聚合分组中时),或者如果传输顺序中的下一个非PACSI NAL单元的O位等于1(当PACSI NAL单元包括在单个NAL单元分组中时),则O位必须设置为1。否则,O位必须设置为0。
o The RR field MUST be set to "11" (in binary form). Receivers MUST ignore the value of RR.
o RR字段必须设置为“11”(二进制形式)。接收者必须忽略RR的值。
o If the X bit is equal to 1, the bits A, P, and C are specified as below. Otherwise, the bits A, P, and C are unspecified, and receivers MUST ignore the values of these bits. The X bit SHOULD be identical for all the PACSI NAL units in all the RTP sessions carrying the same SVC bitstream.
o 如果X位等于1,则位A、P和C指定如下。否则,位A、P和C未指定,接收器必须忽略这些位的值。对于承载相同SVC比特流的所有RTP会话中的所有PACSI NAL单元,X位应相同。
o If the Y bit is equal to 1, the OPTIONAL fields TL0PICIDX and IDRPICID MUST be present and specified as below, and the bits S and E are also specified as below. Otherwise, the fields TL0PICIDX and IDRPICID MUST NOT be present, while the S and E bits are unspecified and receivers MUST ignore the values of these bits. The Y bit MUST be identical for all the PACSI NAL units in all the RTP sessions carrying the same SVC bitstream. The Y bit MUST be equal to 0 when the parameter packetization-mode is equal to 2.
o 如果Y位等于1,则可选字段TL0PICIDX和IDRPICID必须存在并按如下方式指定,位S和E也按如下方式指定。否则,字段TL0PICIDX和IDRPICID不得存在,而S和E位未指定,接收器必须忽略这些位的值。对于承载相同SVC比特流的所有RTP会话中的所有PACSI NAL单元,Y位必须相同。当参数打包模式等于2时,Y位必须等于0。
o If the T bit is equal to 1, the OPTIONAL field DONC MUST be present and specified as below. Otherwise, the field DONC MUST NOT be present. The field T MUST be equal to 0 if the PACSI NAL unit is contained in an STAP-B, MTAP16, MTAP24, or NI-MTAP.
o 如果T位等于1,则必须存在可选字段DONC,并按如下方式指定。否则,字段DONC不得出现。如果PACSI NAL单位包含在STAP-B、MTAP16、MTAP24或NI-MTAP中,则字段T必须等于0。
o The A bit MUST be set to 1 if at least one of the remaining NAL units in the aggregation packet belongs to an anchor layer representation (when the PACSI NAL unit is included in an aggregation packet) or if the next non-PACSI NAL unit in transmission order belongs to an anchor layer representation (when the PACSI NAL unit is included in a single NAL unit packet). Otherwise, the A bit MUST be set to 0.
o 如果聚合数据包中剩余的NAL单元中至少有一个属于锚定层表示(当PACSI NAL单元包括在聚合数据包中时),或者如果传输顺序中的下一个非PACSI NAL单元属于锚定层表示,则A位必须设置为1(当PACSI NAL单元包含在单个NAL单元数据包中时)。否则,a位必须设置为0。
Informative note: The A bit indicates whether CGS or spatial layer switching at a non-IDR layer representation (a layer representation with nal_unit_type not equal to 5 and idr_flag not equal to 1) can be performed. With some picture coding structures a non-IDR intra layer representation can be used for random access. Compared to using only IDR layer representations, higher coding efficiency can be achieved. The H.264/AVC or SVC solution to indicate the random accessibility of a non-IDR intra layer representation is using a recovery point SEI message. The A bit offers direct access to this information, without having to parse the recovery point SEI message, which may be buried deeply in an SEI NAL unit. Furthermore, the SEI message may or may not be present in the bitstream.
资料性说明:A位指示是否可以在非IDR层表示(nal_单位_类型不等于5且IDR_标志不等于1的层表示)下执行CGS或空间层切换。对于一些图片编码结构,非IDR层内表示可用于随机访问。与仅使用IDR层表示相比,可以实现更高的编码效率。用于指示非IDR层内表示的随机可访问性的H.264/AVC或SVC解决方案使用恢复点SEI消息。A位提供了对该信息的直接访问,而无需解析恢复点SEI消息,该消息可能被深埋在SEI NAL单元中。此外,SEI消息可能存在于比特流中,也可能不存在于比特流中。
o The P bit MUST be set to 1 if all the remaining NAL units in the aggregation packet have redundant_pic_cnt greater than 0 (when the PACSI NAL unit is included in an aggregation packet) or the next non-PACSI NAL unit in transmission order has redundant_pic_cnt greater than 0 (when the PACSI NAL unit is included in a single NAL unit packet). Otherwise, the P bit MUST be set to 0.
o 如果聚合数据包中的所有剩余NAL单元的冗余_picu cnt大于0(当PACSI NAL单元包含在聚合数据包中时),或者传输顺序中的下一个非PACSI NAL单元的冗余_picu cnt大于0(当PACSI NAL单元包含在单个NAL单元数据包中时),则P位必须设置为1。否则,P位必须设置为0。
Informative note: The P bit indicates whether a packet can be discarded because it contains only redundant slice NAL units. Without this bit, the corresponding information can be obtained from the syntax element redundant_pic_cnt, which is contained in the variable-length coded slice header.
资料性说明:P位表示是否可以丢弃数据包,因为它只包含冗余的片NAL单元。如果没有该位,则可以从语法元素redundant_pic_cnt获得相应的信息,该语法元素包含在可变长度编码的片头中。
o The C bit MUST be set to 1 if at least one of the remaining NAL units in the aggregation packet belongs to an intra layer representation (when the PACSI NAL unit is included in an aggregation packet) or if the next non-PACSI NAL unit in transmission order belongs to an intra layer representation (when the PACSI NAL unit is included in a single NAL unit packet). Otherwise, the C bit MUST be set to 0.
o 如果聚合数据包中剩余的NAL单元中至少有一个属于层内表示(当PACSI NAL单元包括在聚合数据包中时),或者如果传输顺序中的下一个非PACSI NAL单元属于层内表示,则C位必须设置为1(当PACSI NAL单元包含在单个NAL单元数据包中时)。否则,C位必须设置为0。
Informative note: The C bit indicates whether a packet contains intra slices, which may be the only packets to be forwarded, e.g., when the network conditions are particularly adverse.
信息性说明:C位表示一个数据包是否包含帧内片,例如,当网络条件特别不利时,帧内片可能是唯一要转发的数据包。
o The S bit MUST be set to 1 if the first NAL unit following the PACSI NAL unit in an aggregation packet is the first VCL NAL unit, in decoding order, of a layer representation (when the PACSI NAL unit is included in an aggregation packet) or if the next non-PACSI NAL unit in transmission order is the first VCL NAL unit, in decoding order, of a layer representation (when the PACSI NAL unit is included in a single NAL unit packet). Otherwise, the S bit MUST be set to 0.
o 如果聚合分组中PACSI NAL单元之后的第一个NAL单元是层表示的第一个VCL NAL单元(解码顺序)(当PACSI NAL单元包括在聚合分组中时),或者如果传输顺序中的下一个非PACSI NAL单元是第一个VCL NAL单元(解码顺序),则S位必须设置为1,层表示(当PACSI NAL单元包含在单个NAL单元数据包中时)。否则,S位必须设置为0。
o The E bit MUST be set to 1 if the last NAL unit following the PACSI NAL unit in an aggregation packet is the last VCL NAL unit, in decoding order, of a layer representation (when the PACSI NAL unit is included in an aggregation packet) or if the next non-PACSI NAL unit in transmission order is the last VCL NAL unit, in decoding order, of a layer representation (when the PACSI NAL unit is included in a single NAL unit packet). Otherwise, the E bit MUST be set to 0.
o 如果聚合分组中PACSI NAL单元后面的最后一个NAL单元是层表示的最后一个VCL NAL单元(当PACSI NAL单元包括在聚合分组中时),或者如果传输顺序中的下一个非PACSI NAL单元是最后一个VCL NAL单元,则E位必须设置为1,层表示(当PACSI NAL单元包含在单个NAL单元数据包中时)。否则,E位必须设置为0。
Informative note: In an aggregation packet it is always possible to detect the beginning or end of a layer representation by detecting changes in the values of dependency_id, quality_id, and temporal_id in NAL unit headers, except for the first and last NAL units of a packet. The S or E bits are used to provide this information, for both single NAL unit and aggregation packets, so that previous or following packets do not have to be examined. This enables MANEs to detect slice loss and take proper action such as requesting a retransmission as soon as possible, as well as to allow efficient playout buffer handling similarly to the M bit present in the RTP header. The M bit in the RTP header still indicates the end of an access unit, not the end of a layer representation.
资料性说明:在聚合数据包中,始终可以通过检测NAL单元头中的依赖项id、质量id和时间id的值的变化来检测层表示的开始或结束,数据包的第一个和最后一个NAL单元除外。S或E位用于为单个NAL单元和聚合分组提供此信息,以便不必检查之前或之后的分组。这使得mane能够检测片丢失并采取适当的措施,例如尽快请求重传,以及允许与RTP报头中存在的M位类似的高效播放缓冲区处理。RTP报头中的M位仍然表示访问单元的结束,而不是层表示的结束。
o When present, the TL0PICIDX field MUST be set equal to tl0_dep_rep_idx as specified in Annex G of [H.264] for the layer representation containing the first NAL unit following the PACSI NAL unit in the aggregation packet (when the PACSI NAL unit is included in an aggregation packet) or containing the next non-PACSI NAL unit in transmission order (when the PACSI NAL unit is included in a single NAL unit packet).
o 当存在时,TL0PICIDX字段必须设置为等于[H.264]附录G中规定的tl0_dep_rep_idx,用于包含聚合数据包中PACSI NAL单元之后的第一个NAL单元的层表示(当PACSI NAL单元包含在聚合数据包中时)或包含传输顺序中的下一个非PACSI NAL单元(当PACSI NAL单元包含在单个NAL单元分组中时)。
o When present, the IDRPICID field MUST be set equal to effective_idr_pic_id as specified in Annex G of [H.264] for the layer representation containing the first NAL unit following the PACSI NAL unit in the aggregation packet (when the PACSI NAL unit is included in an aggregation packet) or containing the next non-PACSI NAL unit in transmission order (when the PACSI NAL unit is included in a single NAL unit packet).
o 当存在时,IDRPICID字段必须设置为等于[H.264]附录G中规定的有效_idr_picu id,用于包含聚合数据包中PACSI NAL单元之后的第一个NAL单元的层表示(当PACSI NAL单元包含在聚合数据包中时)或包含传输顺序中的下一个非PACSI NAL单元(当PACSI NAL单元包含在单个NAL单元分组中时)。
Informative note: The TL0PICIDX and IDRPICID fields enable the detection of the loss of layer representations in the most important temporal layer (with temporal_id equal to 0) by receivers as well as MANEs. SVC provides a solution that uses SEI messages, which are harder to parse and may or may not be present in the bitstream. When the PACSI NAL unit is part of an NI-MTAP packet, it is possible to infer the correct values of tl0_dep_rep_idx and idr_pic_id for all layer representations contained in the NI-MTAP by following the rules that specify how these parameters are set as given in Annex G of [H.264] and by detecting the different layer representations contained in the NI-MTAP packet by detecting changes in the values of dependency_id, quality_id, and temporal_id in the NAL unit headers as well as using the S and E flags. The only exception is if NAL units of an IDR picture are present in the NI-MTAP in a position other than the first NAL unit following the PACSI NAL unit, in which case the value of idr_pic_id cannot be inferred. In this case the NAL unit has to be partially parsed to obtain the idr_pic_id. Note that, due to the large size of IDR pictures, their inclusion in an NI-MTAP, and especially in a position other than the first NAL unit following the PACSI NAL unit, may be neither practical nor useful.
资料性说明:TL0PICIDX和IDRPICID字段允许接收机和MANE在最重要的时间层(时间id等于0)中检测层表示的丢失。SVC提供了一种使用SEI消息的解决方案,SEI消息更难解析,并且可能存在于比特流中,也可能不存在于比特流中。当PACSI NAL单元是NI-MTAP数据包的一部分时,可以按照[H.264]附录G中规定的规则,推断NI-MTAP中包含的所有图层表示的tl0_dep_rep_idx和idr_picu id的正确值以及通过检测NAL单元报头中的dependency_id_、quality_id和temporal_id的值的变化以及使用S和E标志来检测NI-MTAP数据包中包含的不同层表示。唯一的例外情况是,如果IDR图片的NAL单位出现在NI-MTAP中,而不是PACSI NAL单位之后的第一个NAL单位,则无法推断IDR_picu_id的值。在这种情况下,必须对NAL单元进行部分解析以获得idr_picu_id。请注意,由于idr图片的大小较大,因此将其包含在NI-MTAP中,尤其是在PACSI NAL单元后面的第一个NAL单元以外的位置,可能既不实用也不有用。
o When present, the field DONC indicates the cross-session decoding order number (CS-DON) for the first of the remaining NAL units in the aggregation packet (when the PACSI NAL unit is included in an aggregation packet) or the CS-DON of the next non-PACSI NAL unit in transmission order (when the PACSI NAL unit is included in a single NAL unit packet). CS-DON is further discussed in Section 4.11.
o 当存在时,字段DONC指示聚合分组中剩余NAL单元中的第一个的跨会话解码顺序号(CS-DON)(当PACSI-NAL单元包括在聚合分组中时)或传输顺序中的下一个非PACSI-NAL单元的CS-DON(当PACSI-NAL单元包括在单个NAL单元分组中时). 第4.11节将进一步讨论CS-DON。
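For a PACSI NAL unit carried in an aggregation packet, the rules above for the first four octets reduce to "any", "all", "minimum", and "maximum" operations over the remaining NAL units. The following sketch illustrates this. The names SvcHeader and pacsi_header_fields are hypothetical, and the sketch assumes the caller can supply all listed header values for every remaining NAL unit, which is an assumption of this example rather than something this memo states.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SvcHeader:
    """Header values of one remaining NAL unit (hypothetical type)."""
    f: int
    nri: int
    i: int
    prid: int
    n: int
    did: int
    qid: int
    tid: int
    u: int
    d: int
    o: int


def pacsi_header_fields(units: List[SvcHeader]) -> Dict[str, int]:
    """Derive the PACSI header fields for an aggregation packet."""
    if not units:
        raise ValueError("at least one additional NAL unit is required")
    lowest_did = min(h.did for h in units)
    # QID and TID are taken only from units with the lowest DID.
    at_lowest_did = [h for h in units if h.did == lowest_did]
    return {
        "F": int(any(h.f for h in units)),       # 1 if any F bit is 1
        "NRI": max(h.nri for h in units),        # highest NRI
        "Type": 30,
        "R": 1,
        "I": int(any(h.i for h in units)),       # 1 if any I bit is 1
        "PRID": min(h.prid for h in units),      # lowest PRID
        "N": int(all(h.n for h in units)),       # 1 only if all N bits are 1
        "DID": lowest_did,
        "QID": min(h.qid for h in at_lowest_did),
        "TID": min(h.tid for h in at_lowest_did),
        "U": int(any(h.u for h in units)),
        "D": int(all(h.d for h in units)),
        "O": int(any(h.o for h in units)),
        "RR": 0b11,
    }
```

When the PACSI NAL unit is carried in a single NAL unit packet, the same function applies with a list holding only the next non-PACSI NAL unit in transmission order, since each operation over a one-element list returns that element's value.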
The PACSI NAL unit MAY include a subset of the SEI NAL units associated with the access unit to which the first non-PACSI NAL unit in the aggregation packet belongs, and MUST NOT contain SEI NAL units associated with any other access unit.
PACSI NAL单元可以包括与聚合分组中第一个非PACSI NAL单元所属的访问单元相关联的SEI NAL单元的子集,并且不得包含与任何其他访问单元相关联的SEI NAL单元。
Informative note: In H.264/AVC and SVC, within each access unit, SEI NAL units must appear before any VCL NAL unit in decoding order. Therefore, without using PACSI NAL units, SEI messages are typically only conveyed in the first of the packets carrying an access unit. Senders may repeat SEI NAL units in PACSI NAL units, so that they are repeated in more than one packet and thus increase robustness against packet losses. Receivers may use the repeated SEI messages in place of missing SEI messages.
资料性说明:在H.264/AVC和SVC中,在每个接入单元中,SEI-NAL单元必须以解码顺序出现在任何VCL-NAL单元之前。因此,在不使用PACSI NAL单元的情况下,SEI消息通常仅在承载接入单元的第一个分组中传送。发送方可以在PACSI NAL单元中重复SEI NAL单元,以便在多个分组中重复SEI NAL单元,从而提高对分组丢失的鲁棒性。接收者可以使用重复的SEI消息代替丢失的SEI消息。
For a PACSI NAL unit included in an aggregation packet, an SEI message SHOULD NOT be included in the PACSI NAL unit and also included in one of the remaining NAL units contained in the same aggregation packet.
对于聚合数据包中包含的PACSI NAL单元,同一条SEI消息不应既包含在该PACSI NAL单元中,又包含在同一聚合数据包的其余NAL单元之一中。
An empty NAL unit MAY be included in a single NAL unit packet, an STAP-A or an NI-MTAP packet. Empty NAL units MUST have an RTP timestamp (when transported in a single NAL unit packet) or NALU-time (when transported in an aggregation packet) that is associated with an access unit for which there exists at least one NAL unit of type 1, 5, or 20. When MST is used, the type 1, 5, or 20 NAL unit may be in a different RTP session. Empty NAL units may be used in the decoding order recovery process of the NI-T mode as described in Section 5.2.1.
空NAL单元可包括在单个NAL单元分组、STAP-a或NI-MTAP分组中。空NAL单元必须具有RTP时间戳(当在单个NAL单元分组中传输时)或NALU时间(当在聚合分组中传输时),该时间戳或NALU时间与存在至少一个类型为1、5或20的NAL单元的接入单元相关联。当使用MST时,类型1、5或20 NAL单元可能处于不同的RTP会话中。空NAL单元可用于第5.2.1节所述的NI-T模式的解码顺序恢复过程。
The packet structure is shown in the following figure.
数据包结构如下图所示。
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |F|NRI|  Type   | Subtype |J|K|L|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 4. Empty NAL unit structure.
图4。空NAL单元结构。
The fields MUST be set as follows:
必须按如下方式设置字段:
      F       MUST be equal to 0
      NRI     MUST be equal to 3
      Type    MUST be equal to 31
      Subtype MUST be equal to 1
      J       MUST be equal to 0
      K       MUST be equal to 0
      L       MUST be equal to 0
F必须等于0 NRI必须等于3类型必须等于31子类型必须等于1 J必须等于0 K必须等于0 L必须等于0
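With the field values above, an empty NAL unit is a fixed two-byte pattern. The following sketch packs the fields according to the bit layout of Figure 4; the function name build_empty_nal_unit is hypothetical.

```python
def build_empty_nal_unit() -> bytes:
    """Build the two-byte empty NAL unit (Type 31, Subtype 1)."""
    f, nri, nal_type = 0, 3, 31
    subtype, j, k, l = 1, 0, 0, 0
    # First byte: F (1 bit), NRI (2 bits), Type (5 bits).
    byte0 = (f << 7) | (nri << 5) | nal_type
    # Second byte: Subtype (5 bits), then J, K, and L (1 bit each).
    byte1 = (subtype << 3) | (j << 2) | (k << 1) | l
    return bytes([byte0, byte1])
```

The result is the byte sequence 0x7F 0x08.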
The DON concept is introduced in [RFC6184] and is used to recover the decoding order when interleaving is used within a single session. Section 5.5 of [RFC6184] applies when using SST.
[RFC6184]中引入了DON概念,用于在单个会话中使用交织时恢复解码顺序。使用SST时,[RFC6184]第5.5节适用。
When using MST, it is necessary to recover the decoding order across the various RTP sessions regardless of whether interleaving is used. In addition to the timestamp mechanism described later, the CS-DON mechanism is an extension of the DON facility that can be used for this purpose, and is defined in the following section.
在使用MST时,无论是否使用交织,都必须恢复各个RTP会话的解码顺序。除了后面描述的时间戳机制外,CS-DON机制是可用于此目的的DON设施的扩展,并在下一节中定义。
The cross-session decoding order number (CS-DON) is a number that indicates the decoding order of NAL units across all RTP sessions involved in MST. It is similar to the DON concept in [RFC6184], but contrary to [RFC6184] where the DON was used only for interleaved packetization, in this memo it is used not only in the interleaved MST mode (I-C) but also in two of the non-interleaved MST modes (NI-C and NI-TC).
跨会话解码顺序号(CS-DON)是一个数字,指示MST中涉及的所有RTP会话中NAL单元的解码顺序。它类似于[RFC6184]中的DON概念,但与[RFC6184]中DON仅用于交织打包不同,在本备忘录中,它不仅用于交错MST模式(I-C),还用于两种非交错MST模式(NI-C和NI-TC)。
When the NI-C or NI-TC MST modes are in use, the packetization of each session MUST be as specified in Section 5.2.2. In PACSI NAL units the CS-DON value is explicitly coded in the field DONC. For non-PACSI NAL units the CS-DON value is derived as follows. Let SN indicate the RTP sequence number of a packet.
当使用NI-C或NI-TC MST模式时,每个会话的打包必须符合第5.2.2节的规定。在PACSI NAL单位中,CS-DON值在DONC字段中显式编码。对于非PACSI NAL单位,CS-DON值推导如下。让SN表示数据包的RTP序列号。
o For each non-PACSI NAL unit carried in a session using the single NAL unit session packetization mode, the CS-DON value of the NAL unit is equal to (DONC_prev_PACSI + SN_diff - 1) % 65536, wherein "%" is the modulo operation, DONC_prev_PACSI is the DONC value of the previous PACSI NAL unit with the same NALU-time as the current NAL unit, and SN_diff is calculated as follows:
o 对于使用单个NAL单元会话打包模式的会话中携带的每个非PACSI NAL单元,NAL单元的CS-DON值等于(DONC_prev_PACSI+SN_diff-1)%65536,其中“%”是模运算,DONC_prev_PACSI是与当前NAL单元具有相同NALU时间的前一个PACSI NAL单元的DONC值,SN_diff的计算如下:
   if SN1 > SN2, SN_diff = SN1 - SN2
   else SN_diff = SN1 + 65536 - SN2
where SN1 and SN2 are the SNs of the current NAL unit and the previous PACSI NAL unit with the same NALU-time, respectively.
其中SN1和SN2分别是具有相同NALU时间的当前NAL单元和先前PACSI NAL单元的sn。
o For non-PACSI NAL units carried in a session using the non-interleaved session packetization mode, the CS-DON value of each non-PACSI NAL unit is derived as follows.
o 对于使用非交织会话分组模式的会话中携带的非PACSI NAL单元,每个非PACSI NAL单元的CS-DON值推导如下。
For a non-PACSI NAL unit in a single NAL unit packet, the following applies.
对于单个NAL单元数据包中的非PACSI NAL单元,以下内容适用。
If the previous PACSI NAL unit is contained in a single NAL unit packet, the CS-DON value of the NAL unit is calculated as above;
如果先前的PACSI NAL单元包含在单个NAL单元分组中,则如上所述计算NAL单元的CS-DON值;
otherwise (the previous PACSI NAL unit is contained in an STAP-A packet), the CS-DON value of the NAL unit is calculated as above, with DONC_prev_PACSI being replaced by the CS-DON value of the previous non-PACSI NAL unit in decoding order (i.e., the CS-DON value of the last NAL unit of the STAP-A packet).
否则(先前的PACSI NAL单元包含在STAP-A分组中),如上所述计算NAL单元的CS-DON值,以解码顺序将DONC_prev_PACSI替换为先前的非PACSI NAL单元的CS-DON值(即,STAP-A分组的最后一个NAL单元的CS-DON值)。
For a non-PACSI NAL unit in an STAP-A packet, the following applies.
对于STAP-a数据包中的非PACSI NAL单元,以下内容适用。
If the non-PACSI NAL unit is the first non-PACSI NAL unit in the STAP-A packet, the CS-DON value of the NAL unit is equal to DONC of the PACSI NAL unit in the STAP-A packet;
如果非PACSI NAL单元是STAP-A分组中的第一个非PACSI NAL单元,则NAL单元的CS-DON值等于STAP-A分组中PACSI NAL单元的DONC;
otherwise (the non-PACSI NAL unit is not the first non-PACSI NAL unit in the STAP-A packet), the CS-DON value of the NAL unit is equal to: (the CS-DON value of the previous non-PACSI NAL unit in decoding order + 1) % 65536, wherein "%" is the modulo operation.
否则(非PACSI NAL单元不是STAP-A分组中的第一个非PACSI NAL单元),NAL单元的CS-DON值等于:(解码顺序为+1的前一个非PACSI NAL单元的CS-DON值)%65536,其中“%”是模运算。
For a non-PACSI NAL unit in a number of FU-A packets, the CS-DON value of the NAL unit is calculated the same way as when the single NAL unit session packetization mode is in use, with SN1 being the SN value of the first FU-A packet.
对于多个FU-a分组中的非PACSI NAL单元,NAL单元的CS-DON值的计算方式与使用单个NAL单元会话分组化模式时相同,SN1是第一个FU-a分组的SN值。
For a non-PACSI NAL unit in an NI-MTAP packet, the CS-DON value is equal to the value of the DON field of the non-interleaved multi-time aggregation unit.
对于NI-MTAP数据包中的非PACSI NAL单元,CS-DON值等于非交错多时间聚合单元的DON字段的值。
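The derivation rules above can be sketched in Python as follows. This is a minimal illustration, not a normative implementation: the function names are hypothetical, and the wraparound branch of SN_diff is written as SN1 + 65536 - SN2, on the assumption that SN1 (the current packet) has wrapped past SN2 (the previous PACSI NAL unit) in 16-bit sequence number space.

```python
SEQ_MOD = 65536  # RTP sequence numbers and CS-DON values are 16-bit


def sn_diff(sn1: int, sn2: int) -> int:
    """Distance from the previous PACSI packet (sn2) to the current packet (sn1)."""
    if sn1 > sn2:
        return sn1 - sn2
    return sn1 + SEQ_MOD - sn2  # sn1 wrapped around (assumed intent)


def cs_don_single_nal(donc_prev_pacsi: int, sn1: int, sn2: int) -> int:
    """CS-DON of a non-PACSI NAL unit in a single NAL unit packet."""
    return (donc_prev_pacsi + sn_diff(sn1, sn2) - 1) % SEQ_MOD


def cs_don_stap_a(donc_pacsi: int, num_non_pacsi: int) -> list:
    """CS-DON values of the non-PACSI NAL units of one STAP-A, in order."""
    return [(donc_pacsi + i) % SEQ_MOD for i in range(num_non_pacsi)]
```

In this sketch the NAL unit carried in the packet directly after the PACSI NAL unit inherits the DONC value, which mirrors the STAP-A rule for the first non-PACSI NAL unit.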
When the I-C MST packetization mode is in use, the DON values derived according to [RFC6184] for all the NAL units in each of the RTP sessions MUST indicate CS-DON values.
当使用I-C MST打包模式时,根据[RFC6184]为每个RTP会话中的所有NAL单元导出的DON值必须指示CS-DON值。
Section 6 of [RFC6184] applies in this memo, with the following additions.
[RFC6184]第6节适用于本备忘录,并增加以下内容。
All receivers MUST support the single NAL unit packetization mode to provide backward compatibility to endpoints supporting only the single NAL unit mode of [RFC6184]. However, the use of single NAL unit packetization mode (packetization-mode equal to 0) SHOULD be avoided whenever possible, because encapsulating NAL units of small sizes in their own packets (e.g., small NAL units containing parameter sets, prefix NAL units, or SEI messages) is less efficient due to the packet header overhead.
所有接收器都必须支持单NAL单元打包模式,以向仅支持[RFC6184]单NAL单元模式的端点提供向后兼容性。然而,应尽可能避免使用单个NAL单元打包模式(打包模式等于0),因为将小尺寸的NAL单元封装在它们自己的包中(例如,包含参数集、前缀NAL单元或SEI消息的小NAL单元)由于包头开销而效率较低。
All receivers MUST support the non-interleaved mode.
所有接收机必须支持非交织模式。
Informative note: The non-interleaved mode of [RFC6184] does allow an application to encapsulate a single NAL unit in a single RTP packet. Historically, the single NAL unit mode has been included in [RFC6184] only for compatibility with ITU-T Rec. H.241 Annex A [H.241]. There is no point in carrying this historic ballast towards a new application space such as the one provided with SVC. The implementation complexity increase for supporting the additional mechanisms of the non-interleaved mode (namely, STAP-A and FU-A) is minor, whereas the benefits are significant. As a result, the support of STAP-A and FU-A is required. Additionally, support for two of the three NAL unit types defined in this memo, namely, empty NAL units and NI-MTAP is needed, as specified in Section 4.5.1.
资料性说明:[RFC6184]的非交错模式允许应用程序将单个NAL单元封装在单个RTP数据包中。从历史上看,[RFC6184]中包含的单个NAL单元模式仅用于与ITU-T Rec.H.241附录A[H.241]的兼容性。将这一历史性压舱物带到新的应用空间(如配备SVC的应用空间)是没有意义的。支持非交织模式的附加机制(即STAP-A和FU-A)的实现复杂性增加很小,但好处是显著的。因此,需要STAP-A和FU-A的支持。另外,根据第4.5.1节的规定,需要支持本备忘录中定义的三种NAL单元类型中的两种,即空NAL单元和NI-MTAP。
A NAL unit of small size SHOULD be encapsulated in an aggregation packet together with one or more other NAL units. For example, non-VCL NAL units such as access unit delimiters, parameter sets, or SEI NAL units are typically small.
较小的NAL单元应与一个或多个其他NAL单元一起封装在聚合数据包中。例如,非VCL NAL单元(如访问单元分隔符、参数集或序列单元)通常很小。
A prefix NAL unit and the NAL unit with which it is associated, and which follows the prefix NAL unit in decoding order, SHOULD be included in the same aggregation packet whenever an aggregation packet is used for the associated NAL unit, unless this would violate session MTU constraints or if fragmentation units are used for the associated NAL unit.
每当聚合分组用于相关联的NAL单元时,前缀NAL单元及其相关联的NAL单元,以及在解码顺序上跟随前缀NAL单元的NAL单元,应包括在相同的聚合分组中,除非这会违反会话MTU约束,或者相关NAL单元使用分段单元。
Informative note: Although the prefix NAL unit is ignored by an H.264/AVC decoder, it is necessary in the SVC decoding process.
资料性说明:尽管H.264/AVC解码器忽略了前缀NAL单元,但它在SVC解码过程中是必要的。
Given the small size of the prefix NAL unit, it is best if it is transported in the same RTP packet as its associated NAL unit.
考虑到前缀NAL单元的小尺寸,最好在与其相关联的NAL单元相同的RTP数据包中进行传输。
When only an H.264/AVC compatible subset of the SVC base layer is transmitted in an RTP session, the subset MUST be encapsulated according to [RFC6184]. This way, an [RFC6184] receiver will be able to receive the H.264/AVC compatible bitstream subset.
当在RTP会话中仅传输SVC基本层的H.264/AVC兼容子集时,该子集必须根据[RFC6184]进行封装。这样,[RFC6184]接收机将能够接收H.264/AVC兼容的比特流子集。
When a set of layers including one or more SVC enhancement layers is transmitted in an RTP session, the set SHOULD be carried in one RTP stream that SHOULD be encapsulated according to this memo.
当在RTP会话中传输包括一个或多个SVC增强层的一组层时,该组层应携带在一个RTP流中,该RTP流应根据本备忘录进行封装。
When MST is used, the packetization rules specified in Section 5.1 still apply. In addition, the following packetization rules MUST be followed, to ensure that decoding order of NAL units carried in the sessions can be correctly recovered for each of the MST packetization modes using the de-packetization process specified in Section 6.2.
使用MST时,第5.1节中规定的打包规则仍然适用。此外,必须遵循以下分组规则,以确保会话中携带的NAL单元的解码顺序可以使用第6.2节中规定的反分组过程,针对每个MST分组模式正确恢复。
The NI-T and NI-TC modes both use timestamps to recover the decoding order. In order to be able to do so, it is necessary for the RTP packet stream to contain data for all sampling instances of a given RTP session in all enhancement RTP sessions that depend on the given RTP session. The NI-C and I-C modes do not have this limitation, and use the CS-DON values as a means to explicitly indicate decoding order, either directly coded in PACSI NAL units, or inferred from them using the packetization rules. It is noted that the NI-TC mode offers both alternatives and it is up to the receiver to select which one to use.
NI-T和NI-TC模式都使用时间戳来恢复解码顺序。为了能够这样做,RTP分组流必须在依赖于给定RTP会话的所有增强RTP会话中包含给定RTP会话的所有采样实例的数据。NI-C和I-C模式没有此限制,并且使用CS-DON值作为明确指示解码顺序的手段,CS-DON值可以直接编码在PACSI NAL单元中,也可以使用打包规则从中推断。值得注意的是,NI-TC模式提供了两种选择,由接收机选择使用哪一种。
When using the NI-T mode and a PACSI NAL unit is present, the T bit MUST be equal to 0, i.e., the DONC field MUST NOT be present.
当使用NI-T模式且存在PACSI NAL单元时,T位必须等于0,即DONC字段不得存在。
When using the NI-T mode, the optional parameters sprop-mst-remux-buf-size, sprop-remux-buf-req, remux-buf-cap, sprop-remux-init-buf-time, sprop-mst-max-don-diff MUST NOT be present.
使用NI-T模式时,可选参数sprop mst remux buf size、sprop remux buf req、remux buf cap、sprop remux init buf time、sprop mst max don diff不得存在。
When the NI-T or NI-TC MST mode is in use, the following applies.
当使用NI-T或NI-TC MST模式时,以下情况适用。
If one or more NAL units of an access unit of sampling time instance t is present in RTP session A, then one or more NAL units of the same access unit MUST be present in any enhancement RTP session that depends on RTP session A.
如果采样时间实例t的接入单元的一个或多个NAL单元存在于RTP会话A中,则同一接入单元的一个或多个NAL单元必须存在于依赖于RTP会话A的任何增强RTP会话中。
Informative note: The mapping between RTP and NTP format timestamps is conveyed in RTCP SR packets. In addition, the mechanisms for faster media timestamp synchronization discussed in [RFC6051] may be used to speed up the acquisition of the RTP-to-wall-clock mapping.
资料性说明:RTP和NTP格式时间戳之间的映射在RTCP SR数据包中传输。此外,[RFC6051]中讨论的更快的媒体时间戳同步机制可用于加快RTP到墙壁时钟映射的获取。
Informative note: The rule above may require the insertion of NAL units, typically when temporal scalability is used, i.e., an enhancement RTP session does not contain any NAL units for an access unit with a particular NTP timestamp (media timestamp), which, however, is present in a lower enhancement RTP session or the base RTP session. There are two ways to insert additional NAL units in order to satisfy this rule:
信息性说明:上述规则可能需要插入NAL单元,通常在使用时间可伸缩性时,即增强RTP会话不包含具有特定NTP时间戳(媒体时间戳)的接入单元的任何NAL单元,然而,该NTP时间戳存在于较低增强RTP会话或基本RTP会话中。有两种方法可以插入额外的NAL单元以满足此规则:
- One option for adding additional NAL units is to use empty NAL units (defined in Section 4.10), which can be used by the process described in Section 6.2.1 for the access unit reordering process.
- 添加额外NAL单元的一个选项是使用空NAL单元(定义见第4.10节),该单元可由第6.2.1节中描述的访问单元重新排序过程使用。
- Additional NAL units may also be added by the encoder itself, for example, by transmitting coded data that simply instruct the decoder to repeat the previous picture. This option, however, may be difficult to use with pre-encoded content.
- 另外的NAL单元也可以由编码器本身添加,例如,通过发送简单地指示解码器重复先前图片的编码数据。但是,此选项可能难以用于预编码内容。
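A sender or MANE could locate the access units that need an inserted (for example, empty) NAL unit with a check along the following lines. This is only a sketch under simplifying assumptions: each session is represented as a set of media timestamps for which it carries data, the dependency map lists every lower session of an enhancement session, and all names are hypothetical.

```python
def timestamps_needing_insertion(present, depends_on):
    """Find, per session, the media timestamps that violate the NI-T/NI-TC rule.

    present:    dict mapping session name -> set of media timestamps with data
    depends_on: dict mapping session name -> set of all lower sessions
    Returns a dict mapping session name -> sorted list of missing timestamps.
    """
    missing = {}
    for session, lower_sessions in depends_on.items():
        required = set()
        for lower in lower_sessions:
            required |= present[lower]       # every lower-session AU must appear
        gaps = sorted(required - present[session])
        if gaps:
            missing[session] = gaps
    return missing


if __name__ == "__main__":
    present = {"A": {1, 3}, "B": {1, 2, 3}, "C": {1, 2}}
    depends_on = {"A": set(), "B": {"A"}, "C": {"A", "B"}}
    print(timestamps_needing_insertion(present, depends_on))
```

In the usage example, session C lacks data for timestamp 3, which is present in sessions A and B, so one NAL unit would have to be inserted into C for that access unit.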
If a packet must be inserted in order to satisfy the above rule, e.g., in case of a MANE generating multiple RTP streams out of a single RTP stream, the inserted packet must have an RTP timestamp that maps to the same wall-clock time (in NTP format) as the one of the RTP timestamp of any packet of the access unit present in any lower enhancement RTP session or the base RTP session. This is easy to accomplish if the NAL unit or the packet can be inserted at the time of the RTP stream generation, since the media timestamp (NTP timestamp) must be the same for the inserted packet and the packet of the corresponding access unit. If there is no knowledge of the media time at RTP stream generation or if the RTP streams are not generated at the same instant, this can also be applied later in the transmission process. In this case the RTP timestamp of the inserted packet can be calculated as follows.
如果必须插入数据包以满足上述规则,例如,在MANE从单个RTP流生成多个RTP流的情况下,则插入的数据包的RTP时间戳所映射的挂钟时间(NTP格式)必须与任何较低增强RTP会话或基本RTP会话中存在的该接入单元的任何分组的RTP时间戳所映射的挂钟时间相同。如果可以在RTP流生成时插入NAL单元或分组,则这是容易实现的,因为对于插入的分组和相应接入单元的分组,媒体时间戳(NTP时间戳)必须相同。如果在RTP流生成时不知道媒体时间,或者RTP流不是在同一时刻生成的,则这也可以在稍后的传输过程中应用。在这种情况下,插入的分组的RTP时间戳可以如下计算。
Assume that a packet A2 of an access unit with RTP timestamp TS_A2 is present in base RTP session A, and that no packet of that access unit is present in enhancement RTP session B, as shown in Figure 5. Thus, a packet B2 must be inserted into session B following the rule above. The most recent RTCP sender report in session A carries the NTP timestamp NTP_A and the RTP timestamp TS_A. The sender report in session B whose NTP timestamp is lower than NTP_A carries the NTP timestamp NTP_B and the RTP timestamp TS_B.
假设具有RTP时间戳TS_A2的接入单元的分组A2存在于基本RTP会话a中,并且该接入单元的分组不存在于增强RTP会话B中,如图5所示。因此,必须按照上述规则将分组B2插入会话B。会话A中最新的RTCP发送方报告带有NTP时间戳NTP_A和RTP时间戳TS_A。会话B中NTP时间戳低于NTP_A的发送方报告是NTP_B,并带有RTP时间戳TS_B。
RTP session B:..B0........B1........(B2)......................
RTCP session B:.....SR(NTP_B,TS_B).............................
RTP session A:..A0........A1........A2........................
RTCP session A:..................SR(NTP_A,TS_A)................
-----------------|--x------|-----x---|------------------------> NTP time
--------------------+<---------->+<->+------------------------> RTP TS(B) time
                         t1         t2
Figure 5. Example calculation of RTP timestamp for packet insertion in an enhancement layer RTP session
图5。增强层RTP会话中用于分组插入的RTP时间戳的示例计算
The vertical bars ("|") in the NTP time line in the figure above indicate that access unit data is present in at least one of the sessions. The "x" marks indicate the times of the sender reports. The RTP timestamp time line for session B, shown right below the NTP time line, indicates two time segments, t1 and t2. t1 is the time difference between the sender reports of the two sessions, expressed in RTP timestamp clock ticks, and t2 is the time difference from the session A sender report to the A2 packet, again expressed in RTP timestamp clock ticks. The sum of these differences is added to the RTP timestamp of the sender report from session B in order to derive the correct RTP timestamp for the inserted packet B2. In other words:
上图中NTP时间线中的垂直条(“|”)表示至少一个会话中存在访问单元数据。“x”标记表示发送方报告的时间。会话B的RTP时间戳时间线显示在NTP时间线的正下方,表示两个时间段t1和t2。t1是两个会话的发送方报告之间的时间差,以RTP时间戳时钟滴答数表示,t2是会话A发送方报告到A2数据包之间的时间差,同样以RTP时间戳时钟滴答数表示。将这些差值的总和加到会话B的发送方报告的RTP时间戳上,以便为插入的数据包B2导出正确的RTP时间戳。换言之:
TS_B2 = TS_B + t1 + t2
TS_B2 = TS_B + t1 + t2
Let toRTP() be a function that calculates the RTP time difference (in clock ticks of the used clock) given an NTP timestamp difference, and effRTPdiff() be a function that calculates the effective difference between two timestamps, including wraparounds:
假设toRTP()是一个函数,用于计算给定NTP时间戳差的RTP时间差(以所用时钟的时钟滴答数为单位),effRTPdiff()是一个函数,用于计算两个时间戳之间的有效差,包括wraparounds:
effRTPdiff( ts1, ts2 ):

   if( ts1 >= ts2 ) then
      effRTPdiff := ts1 - ts2
   else
      effRTPdiff := (4294967296 + ts1) - ts2

We have:
t1 = toRTP(NTP_A - NTP_B) and t2 = effRTPdiff(TS_A2, TS_A)
t1 = toRTP(NTP_A - NTP_B) and t2 = effRTPdiff(TS_A2, TS_A)
Hence in order to generate the RTP timestamp TS_B2 for the inserted packet B2, the RTP timestamp for packet B2 TS_B2 can be calculated as follows.
因此,为了为插入的分组B2生成RTP时间戳TS_B2,分组B2的RTP时间戳TS_B2可以如下计算。
TS_B2 = TS_B + toRTP(NTP_A - NTP_B) + effRTPdiff(TS_A2, TS_A)
TS_B2 = TS_B + toRTP(NTP_A - NTP_B) + effRTPdiff(TS_A2, TS_A)
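The calculation of TS_B2 can be illustrated with the short Python sketch below. It makes several assumptions that are not stated in this memo: NTP timestamps are given as seconds in floating point, the RTP clock rate defaults to 90 kHz, the effective difference is taken as ts1 minus ts2 modulo 2^32, and the result is reduced modulo 2^32 to fit the RTP timestamp field.

```python
RTP_TS_MOD = 1 << 32  # RTP timestamps are 32-bit and wrap around


def to_rtp(ntp_diff_seconds: float, clock_rate: int = 90000) -> int:
    """Convert an NTP time difference in seconds to RTP clock ticks."""
    return round(ntp_diff_seconds * clock_rate)


def eff_rtp_diff(ts1: int, ts2: int) -> int:
    """Ticks elapsed from ts2 to ts1, accounting for 32-bit wraparound."""
    return (ts1 - ts2) % RTP_TS_MOD


def inserted_timestamp(ts_b, ntp_a, ntp_b, ts_a2, ts_a, clock_rate=90000):
    """TS_B2 = TS_B + toRTP(NTP_A - NTP_B) + effRTPdiff(TS_A2, TS_A)."""
    t1 = to_rtp(ntp_a - ntp_b, clock_rate)
    t2 = eff_rtp_diff(ts_a2, ts_a)
    return (ts_b + t1 + t2) % RTP_TS_MOD
```

For example, with the two sender reports 0.5 seconds apart and packet A2 lying 1000 ticks after the session A sender report, the inserted packet is stamped 46000 ticks after TS_B.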
When the NI-C or NI-TC MST mode is in use, the following applies for each of the RTP sessions.
当使用NI-C或NI-TC MST模式时,以下内容适用于每个RTP会话。
o For each single NAL unit packet containing a non-PACSI NAL unit, the previous packet, if present, MUST have the same RTP timestamp as the single NAL unit packet, and the following applies.
o 对于包含非PACSI NAL单元的每个单个NAL单元数据包,前一个数据包(如果存在)必须具有与单个NAL单元数据包相同的RTP时间戳,并且以下内容适用。
o If the NALU-time of the non-PACSI NAL unit is not equal to the NALU-time of the previous non-PACSI NAL unit in decoding order, the previous packet MUST contain a PACSI NAL unit containing the DONC field.
o 如果非PACSI NAL单元的NALU时间不等于解码顺序中前一个非PACSI NAL单元的NALU时间,则前一个数据包必须包含包含DONC字段的PACSI NAL单元。
o In an STAP-A packet the first NAL unit in the STAP-A packet MUST be a PACSI NAL unit containing the DONC field.
o 在STAP-A数据包中,STAP-A数据包中的第一个NAL单元必须是包含DONC字段的PACSI NAL单元。
o For an FU-A packet the previous packet MUST have the same RTP timestamp as the FU-A packet, and the following applies.
o 对于FU-A数据包,前一个数据包必须具有与FU-A数据包相同的RTP时间戳,并且以下内容适用。
o If the FU-A packet is the start of the fragmented NAL unit, the following applies.
o 如果FU-A数据包是分段NAL单元的开始,则以下情况适用。
o If the NALU-time of the fragmented NAL unit is not equal to the NALU-time of the previous non-PACSI NAL unit in decoding order, the previous packet MUST contain a PACSI NAL unit containing the DONC field;
o 如果分段NAL单元的NALU时间不等于解码顺序中先前非PACSI NAL单元的NALU时间,则先前分组必须包含包含DONC字段的PACSI NAL单元;
o Otherwise, (the NALU-time of the fragmented NAL unit is equal to the NALU-time of the previous non-PACSI NAL unit in decoding order), the previous packet MAY contain a PACSI NAL unit containing the DONC field.
o 否则,(分段NAL单元的NALU时间等于解码顺序中的先前非PACSI NAL单元的NALU时间),先前分组可以包含包含DONC字段的PACSI NAL单元。
o Otherwise, if the FU-A packet is the end of the fragmented NAL unit, the following applies.
o 否则,如果FU-A分组是分段的NAL单元的末端,则以下情况适用。
o If the next non-PACSI NAL unit in decoding order has NALU-time equal to the NALU-time of the fragmented NAL unit, and is carried in a number of FU-A packets or a single NAL unit packet, the next packet MUST be a single NAL unit packet containing a PACSI NAL unit containing the DONC field.
o 如果解码顺序中的下一个非PACSI NAL单元的NALU时间等于分段NAL单元的NALU时间,并且在多个FU-a分组或单个NAL单元分组中携带,则下一个分组必须是包含包含包含DONC字段的PACSI NAL单元的单个NAL单元分组。
o Otherwise (the FU-A packet is neither the start nor the end of the fragmented NAL unit), the previous packet MUST be a FU-A packet.
o 否则(FU-A分组既不是分段NAL单元的开始也不是结束),前一个分组必须是FU-A分组。
o For each single NAL unit packet containing a PACSI NAL unit, if present, the PACSI NAL unit MUST contain the DONC field.
o 对于包含PACSI NAL单元的每个NAL单元数据包(如果存在),PACSI NAL单元必须包含DONC字段。
o When the optional media type parameter sprop-mst-csdon-always-present is equal to 1, the session packetization mode in use MUST be the non-interleaved mode, and only STAP-A and NI-MTAP packets can be used.
o 当可选媒体类型参数sprop mst csdon always present等于1时,正在使用的会话分组模式必须是非交错模式,并且只能使用STAP-A和NI-MTAP数据包。
When the I-C MST packetization mode is in use, the following applies.
当使用I-C MST打包模式时,以下情况适用。
o When a PACSI NAL unit is present, the T bit MUST be equal to 0, i.e., the DONC field is not present, and the Y bit MUST be equal to 0, i.e., the TL0PICIDX and IDRPICID are not present.
o 当存在PACSI NAL单元时,T位必须等于0,即DONC字段不存在,Y位必须等于0,即TL0PICIDX和IDRPICID不存在。
NAL units that do not directly encode video slices are known in H.264 as non-VCL NAL units. Non-VCL units that are only used by, or only relevant to, enhancement RTP sessions SHOULD be sent in the lowest session to which they are relevant.
不直接编码视频片段的NAL单元在H.264中称为非VCL NAL单元。仅由增强RTP会话使用或仅与增强RTP会话相关的非VCL单元应在其相关的最低会话中发送。
Some senders, however, such as those sending pre-encoded data, may be unable to easily determine which non-VCL units are relevant to which session. Thus, non-VCL NAL units MAY, instead, be sent in a session on which the session using these non-VCL NAL units depends (e.g., the base RTP session).
然而,一些发送方,例如发送预编码数据的发送方,可能无法轻松确定哪些非VCL单元与哪个会话相关。因此,可以在使用这些非VCL NAL单元的会话所依赖的会话(例如,基本RTP会话)中发送非VCL NAL单元。
If a non-VCL unit is relevant to more than one RTP session, neither of which depends on the other(s), the NAL unit MAY be sent in another session on which all these sessions depend.
如果非VCL单元与一个以上的RTP会话相关,其中任何一个都不依赖于其他会话,则可以在所有这些会话所依赖的另一个会话中发送NAL单元。
Section 5.1 of this memo applies, with the following addition. If the base layer is sent in a base RTP session using [RFC6184], prefix NAL units MAY be sent in the lowest enhancement RTP session rather than in the base RTP session.
本备忘录第5.1节适用,并添加以下内容。如果使用[RFC6184]在基本RTP会话中发送基本层,则可以在最低增强RTP会话中而不是在基本RTP会话中发送前缀NAL单元。
For single-session transmission, where a single RTP session is used, the de-packetization process specified in Section 7 of [RFC6184] applies.
对于使用单个RTP会话的单会话传输,[RFC6184]第7节中规定的反打包过程适用。
For multi-session transmission, where more than one RTP session is used to receive data from the same SVC bitstream, the de-packetization process is specified as follows.
对于多会话传输,其中使用多个RTP会话从同一SVC比特流接收数据,反打包过程指定如下。
As for a single RTP session, the general concept behind the de-packetization process is to reorder NAL units from transmission order to the NAL unit decoding order.
对于单个RTP会话,反打包过程背后的一般概念是将NAL单元从传输顺序重新排序为NAL单元解码顺序。
The sessions to be received MUST be identified by mechanisms specified in Section 7.2.3. An enhancement RTP session typically contains an RTP stream that depends on at least one other RTP session, as indicated by mechanisms defined in Section 7.2.3. A lower RTP session to an enhancement RTP session is an RTP session on which the enhancement RTP session depends. The lowest RTP session for a receiver is the base RTP session, which does not depend on any other RTP session received by the receiver. The highest RTP session for a receiver is the RTP session on which no other RTP session received by the receiver depends.
必须通过第7.2.3节规定的机制确定要接收的会话。增强RTP会话通常包含依赖于至少一个其他RTP会话的RTP流,如第7.2.3节中定义的机制所示。增强RTP会话的下层RTP会话是增强RTP会话所依赖的RTP会话。接收机的最低RTP会话是基本RTP会话,它不依赖于接收机接收的任何其他RTP会话。接收机的最高RTP会话是接收机接收到的其他RTP会话都不依赖的RTP会话。
For each of the RTP sessions, the RTP reception process as specified in RFC 3550 is applied. Then the received packets are passed into the payload de-packetization process as defined in this memo.
对于每个RTP会话,应用RFC 3550中规定的RTP接收过程。然后,接收到的数据包被传递到有效载荷反打包过程中,如本备忘录中所定义。
The decoding order of the NAL units carried in all the associated RTP sessions is then recovered by applying one of the following subsections, depending on which of the MST packetization modes is in use.
然后,根据使用的是哪种MST分组模式,通过应用以下小节之一来恢复在所有相关RTP会话中携带的NAL单元的解码顺序。
The following process MUST be applied when the NI-T packetization mode is in use. The following process MAY be applied when the NI-TC packetization mode is in use.
当使用NI-T包装模式时,必须采用以下程序。当使用NI-TC包装模式时,可采用以下程序。
The process is based on RTP session dependency signaling, RTP sequence numbers, and timestamps.
该过程基于RTP会话依赖信令、RTP序列号和时间戳。
The decoding order of NAL units within an RTP packet stream in RTP session is given by the ordering of sequence numbers SN of the RTP packets that contain the NAL units, and the order of appearance of NAL units within a packet.
RTP会话中RTP分组流中NAL单元的解码顺序由包含NAL单元的RTP分组的序列号SN的顺序以及分组中NAL单元的出现顺序给出。
Timing information according to the media timestamp TS, i.e., the NTP timestamp as derived from the RTP timestamp of an RTP packet, is associated with all NAL units contained in the same RTP packet received in an RTP session.
根据媒体时间戳TS的定时信息,即从RTP分组的RTP时间戳导出的NTP时间戳,与包含在RTP会话中接收的相同RTP分组中的所有NAL单元相关联。
For NI-MTAP packets the NALU-time is derived for each contained NAL unit by using the "TS offset" value in the NI-MTAP packet as defined in Section 4.10, and is used instead of the RTP packet timestamp to derive the media timestamp, e.g., using the NTP wall clock as provided via RTCP sender reports. NAL units contained in fragmentation packets are handled as defragmented, entire NAL units with their own media timestamps. All NAL units associated with the same value of media timestamp TS are part of the same access unit AU(TS). Any empty NAL units SHOULD be kept as, effectively, access unit indicators in the reordering process. Empty NAL units and PACSI NAL units SHOULD be removed before passing access unit data to the decoder.
对于NI-MTAP数据包,通过使用第4.10节中定义的NI-MTAP数据包中的“TS偏移量”值,为每个包含的NAL单元导出NALU时间,并代替RTP数据包时间戳来导出媒体时间戳,例如,使用RTCP发送方报告提供的NTP挂钟。碎片数据包中包含的NAL单元作为碎片处理,整个NAL单元具有自己的媒体时间戳。与媒体时间戳TS的相同值相关联的所有NAL单元都是相同访问单元AU(TS)的一部分。在重新排序过程中,任何空的NAL单元都应作为有效的访问单元指示器保留。空NAL单元和PACSI NAL单元应在将访问单元数据传递给解码器之前移除。
Informative note: These empty NAL units are used to associate NAL units present in other RTP sessions with RTP sessions not containing any data for an access unit of a particular time instance. They act as access unit indicators in sessions that would otherwise contain no data for the particular access unit. The presence of these NAL units is ensured by the packetization rules in Section 5.2.1.
资料性说明:这些空NAL单元用于将其他RTP会话中的NAL单元与不包含特定时间实例的访问单元的任何数据的RTP会话相关联。它们在会话中充当访问单元指示符,否则会话将不包含特定访问单元的数据。这些NAL单元的存在由第5.2.1节中的包装规则保证。
It is assumed that the receiver has established an operation point (DID, QID, and TID values), and has identified the highest enhancement RTP session for this operation point. The decoding order of NAL units from multiple RTP streams in multiple RTP sessions MUST be recovered into a single sequence of NAL units, grouped into access units, by performing any process equivalent to the following steps. The general process is described in Section 4.2 of [RFC6051]. For convenience the instructions of [RFC6051] are repeated and applied to NAL units rather than to full RTP packets. Additionally, SVC-specific extensions to the procedure in Section 4.2 of [RFC6051] are presented in the following list:
假设接收器已经建立了一个操作点(DID、QID和TID值),并且已经确定了该操作点的最高增强RTP会话。在多个RTP会话中,来自多个RTP流的NAL单元的解码顺序必须通过执行与以下步骤等效的任何过程恢复为单个NAL单元序列,分组为接入单元。[RFC6051]第4.2节描述了一般过程。为方便起见,[RFC6051]的指令被重复并应用于NAL单元,而不是完整的RTP数据包。此外,第4.2节中程序的SVC特定扩展。[RFC6051]的定义如下表所示:
o The process should be started with the NAL units received in the highest RTP session with the first media timestamp TS (in NTP format) available in the session's (de-jittering) buffer. It is assumed that packets in the de-jittering buffer are already stored in RTP sequence number order.
o 该过程应使用在最高RTP会话中接收的NAL单元启动,该会话的(去抖动)缓冲区中有第一个媒体时间戳TS(NTP格式)。假设解抖动缓冲区中的数据包已按RTP序列号顺序存储。
o Collect all NAL units associated with the same value of media timestamp TS, starting from the highest RTP session, from all the (de-jittering) buffers of the received RTP sessions. The collected NAL units will be those associated with the access unit AU(TS).
o 从最高RTP会话开始,从接收到的RTP会话的所有(去抖动)缓冲区收集与相同媒体时间戳TS值相关联的所有NAL单元。收集的NAL单元将是与访问单元AU(TS)相关联的单元。
o Place the collected NAL units in the order of session dependency as derived by the dependency indication as specified in Section 7.2.3, starting from the lowest RTP session.
o 从最低RTP会话开始,按照第7.2.3节规定的依赖项指示导出的会话依赖项顺序放置收集的NAL单元。
o Place the session ordered NAL units in decoding order within the particular access unit by satisfying the NAL unit ordering rules for SVC access units, as described in the informative algorithm provided in Section 6.2.1.1.
o 如第6.2.1.1节中提供的信息算法所述,通过满足SVC接入单元的NAL单元顺序规则,将会话顺序的NAL单元按解码顺序放置在特定接入单元内。
o Remove NI-MTAP and any PACSI NAL units from the access unit AU(TS).
o 从访问单元AU(TS)上卸下NI-MTAP和任何PACSI NAL单元。
o The access units can then be transferred to the decoder. Access units AU(TS) are transferred to the decoder in the order of appearance (given by the order of RTP sequence numbers) of media timestamp values TS in the highest RTP session associated with access unit AU(TS).
o 然后可以将接入单元传送到解码器。访问单元AU(TS)按照与访问单元AU(TS)相关联的最高RTP会话中的媒体时间戳值TS的出现顺序(由RTP序列号的顺序给出)被传送到解码器。
Informative note: Due to packet loss it is possible that not all sessions may have NAL units present for the media timestamp value TS present in the highest RTP session. In such a case, an algorithm may: a) proceed to the next complete access unit with NAL units present in all the received RTP sessions; or b) consider a new highest RTP session, the highest RTP session for which the access unit is complete, and apply the process above. The algorithm may return to the original highest RTP session when a complete and error-free access unit that contains NAL units in all the sessions is received.
资料性说明:由于数据包丢失,对于最高RTP会话中存在的媒体时间戳值TS,可能并非所有会话都有NAL单元。在这种情况下,算法可以:a)进入下一个在所有接收到的RTP会话中都存在NAL单元的完整接入单元;或b)考虑一个新的最高RTP会话,即访问单元完整的最高RTP会话,并应用上述过程。当接收到在所有会话中包含NAL单元的完整且无错误的访问单元时,该算法可以返回到原始的最高RTP会话。
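The steps listed above can be sketched as follows. This is a simplified illustration only: each de-jittering buffer is modeled as a list of (media timestamp, kind, payload) tuples already in RTP sequence number order with NI-MTAP packets unpacked, the session list is given from lowest to highest, the within-access-unit reordering of Section 6.2.1.1 is omitted, and loss handling is not covered. The tuple layout and all names are assumptions made for this example.

```python
def recover_access_units(buffers, session_order):
    """Group NAL units into access units for the NI-T process.

    buffers:       dict mapping session -> list of (ts, kind, payload) tuples
                   in RTP sequence number order
    session_order: sessions from lowest (base) to highest
    Returns a list of (ts, [payload, ...]) in the order in which the media
    timestamps first appear in the highest session.
    """
    highest = session_order[-1]
    # Order of first appearance of each media timestamp in the highest session.
    ts_order = list(dict.fromkeys(ts for ts, _, _ in buffers[highest]))
    access_units = []
    for ts in ts_order:
        nal_units = [
            payload
            for session in session_order          # lowest session first
            for t, kind, payload in buffers[session]
            # Empty and PACSI NAL units are not passed on to the decoder.
            if t == ts and kind not in ("empty", "pacsi")
        ]
        access_units.append((ts, nal_units))
    return access_units
```

Timestamps that occur only in lower sessions before the first timestamp of the highest session are skipped here, in line with the remove/ignore step of the process.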
The following gives an informative example.
下面给出了一个信息丰富的示例。
The example shown in Figure 6 refers to three RTP sessions A, B, and C containing an SVC bitstream transmitted as 3 sources. In the example, the dependency signaling (described in Section 7.2.3) indicates that session A is the base RTP session, B is the first enhancement RTP session and depends on A, and C is the second enhancement RTP session and depends on A and B. A hierarchical picture coding prediction structure is used, in which session A has the lowest frame rate and sessions B and C have the same but higher frame rate.
图6所示的示例涉及三个RTP会话A、B和C,其中包含作为3个源传输的SVC比特流。在该示例中,依赖信令(在第7.2.3节中描述)指示会话A是基本RTP会话,B是第一个增强RTP会话并依赖于A,C是第二个增强RTP会话并依赖于A和B。使用分层图片编码预测结构,其中会话A具有最低的帧速率,会话B和C具有相同但更高的帧速率。
The figure shows NAL units contained in RTP packets that are stored in the de-jittering buffer at the receiver for session de-packetization. The NAL units are already reordered according to their RTP sequence number order and, if within an aggregation packet, according to the order of their appearance within the aggregation packet. The figure indicates for the received NAL units the decoding order within the sessions, as well as the associated media (NTP) timestamps ("TS[..]"). NAL units of the same access unit within a session are grouped by "(.,.)" and share the same media timestamp TS, which is shown at the bottom of the figure. Note that the timestamps are not in increasing order since, in this example, the decoding order is different from the output/display order.
该图显示了RTP数据包中包含的NAL单元,这些数据包存储在接收器的解抖动缓冲区中,用于会话解分组。NAL单元已经根据其RTP序列号顺序重新排序,如果在聚合分组中,则根据其在聚合分组中的出现顺序重新排序。该图为接收到的NAL单元指示会话内的解码顺序以及相关媒体(NTP)时间戳(“TS[…]”)。会话中相同访问单元的NAL单元按“(,)”分组,并共享相同的媒体时间戳TS,如图底部所示。请注意,时间戳不是按递增顺序排列的,因为在本例中,解码顺序不同于输出/显示顺序。
The process first proceeds to the NAL units associated with the first media timestamp TS[1] present in the highest session C and removes/ignores all preceding (in decoding order) NAL units to NAL units with TS[1] in each of the de-jittering buffers of RTP sessions A, B, and C. Then, starting from session C, the first media timestamp available in decoding order (TS[1]) is selected and NAL units starting from RTP session A, and sessions B and C are placed in order of the RTP session dependency as required by Section 7.2.3 of this memo (in the example for TS[1]: first session B and then session C) into the access unit AU(TS[1]) associated with media timestamp TS[1]. Then the next media timestamp TS[3] in order of appearance in the highest RTP session C is processed and the process described above is repeated. Note that there may be access units with no NAL units present, e.g., in the lowest RTP session A (see, e.g., TS[1]). With TS[8], the first access unit with NAL units present in all the RTP sessions appears in the buffers.
该过程首先进行到与最高会话C中存在的第一媒体时间戳TS[1]相关联的NAL单元,并移除/忽略所有先前的(以解码顺序)NAL单元到RTP会话A、B和C的每个解抖动缓冲器中具有TS[1]的NAL单元。然后,从会话C开始,选择以解码顺序(TS[1])可用的第一个媒体时间戳,并将从RTP会话A开始的NAL单元,以及会话B和C按照本备忘录第7.2.3节要求的RTP会话相关性顺序(在TS[1]的示例中:首先会话B,然后会话C)放入接入单元AU(TS[1])与媒体时间戳TS[1]关联。然后,按照最高RTP会话C中的出现顺序处理下一媒体时间戳TS[3],并重复上述处理。注意,可能存在不存在NAL单元的接入单元,例如,在最低RTP会话A中(例如,参见TS[1])。对于TS[8],所有RTP会话中都存在NAL单元的第一个访问单元出现在缓冲区中。
   C:  ------------(1,2)-(3,4)--(5)---(6)---(7,8)(9,10)-(11)--(12)----
          |     |    |     |     |     |     |    |       |      |
   B:  -(1,2)-(3,4)-(5)---(6)--(7,8)-(9,10)-(11)-(12)--(13,14)(15,15)-
          |    |                 |     |                 |      |
   A:  -------(1)---------------(2)---(3)---------------(4)----(5)----
   ---------------------------------------------------decoding order-->

   TS:   [4]   [2]  [1]   [3]   [8]   [6]   [5]  [7]    [12]   [10]
Key:
A, B, C - RTP sessions
Integer values in "()" - NAL unit decoding order within RTP session
"( )" - groups the NAL units of an access unit in an RTP session
"|" - indicates corresponding NAL units of the same access unit AU(TS[..]) in the RTP sessions
Integer values in "[]" - media timestamp TS, sampling time as derived, e.g., from NTP timestamp associated with the access unit AU(TS[..]), consisting of NAL units in the sessions above each TS value.
键:A、B、C-RTP会话整数值以“()”-RTP会话内的NAL单元解码顺序以“()”-分组RTP会话中访问单元的NAL单元“|”-表示RTP会话中相同访问单元AU(TS[…])的对应NAL单元整数值以“[]”表示-媒体时间戳TS,导出的采样时间,例如。,从与访问单元AU(TS[…])相关联的NTP时间戳开始,包括每个TS值上方会话中的NAL单元。
Figure 6. Example of decoding order recovery in multi-source transmission.
图6。多源传输中解码顺序恢复的示例。
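The walk-through of Figure 6 can be sketched in a few lines of Python. This is a minimal illustration only, not part of the payload format: the function name `assemble_access_units`, the dictionary layout, and labels such as "B5" are assumptions made for the example. The sample data is read off the first five timestamp columns of Figure 6, and the sketch simplifies the process by only emitting access units whose timestamp appears in the highest session.

```python
def assemble_access_units(sessions):
    """Group NAL units into access units by media timestamp.

    `sessions` maps a session name to its de-jitter buffer, a list of
    (timestamp, [NAL unit labels]) groups in session decoding order.
    The dict must list sessions in dependency order, lowest first.
    """
    names = list(sessions)
    highest = sessions[names[-1]]
    access_units = []
    # Timestamps are taken in their order of appearance in the highest session.
    for ts, _ in highest:
        au = []
        # Collect matching NAL units session by session, in dependency order.
        for name in names:
            for group_ts, nal_units in sessions[name]:
                if group_ts == ts:
                    au.extend(nal_units)
        access_units.append((ts, au))
    return access_units


# Hypothetical labels: "B5" is the NAL unit numbered 5 in session B.
figure6 = {
    "A": [(2, ["A1"]), (8, ["A2"])],
    "B": [(4, ["B1", "B2"]), (2, ["B3", "B4"]), (1, ["B5"]),
          (3, ["B6"]), (8, ["B7", "B8"])],
    "C": [(1, ["C1", "C2"]), (3, ["C3", "C4"]), (8, ["C5"])],
}

for ts, au in assemble_access_units(figure6):
    print(f"AU(TS[{ts}]):", au)
```

The NAL units of TS[4] and TS[2] in sessions A and B never match a timestamp of session C in this data, so they are ignored, which mirrors the removal of NAL units preceding TS[1].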
6.2.1.1. Informative Algorithm for NI-T Decoding Order Recovery within an Access Unit
6.2.1.1. 接入单元内NI-T解码顺序恢复的信息算法
Within an access unit, the [H.264] specification (Sections 7.4.1.2.3 and G.7.4.1.2.3) constrains the valid decoding order of NAL units.
在接入单元内,[H.264]规范(第7.4.1.2.3节和G.7.4.1.2.3节)限制NAL单元的有效解码顺序。
These constraints make it possible to reconstruct a valid decoding order for the NAL units of an access unit based only on the order of NAL units in each session, the NAL unit headers, and Supplemental Enhancement Information message headers.
这些约束使得能够仅基于每个会话中的NAL单元的顺序、NAL单元报头和补充增强信息消息报头来重构接入单元的NAL单元的有效解码顺序。
This section specifies an informative algorithm to reconstruct a valid decoding order for NAL units within an access unit. Other NAL unit orderings may also be valid; however, any compliant NAL unit ordering will describe the same video stream and ancillary data as the one produced by this algorithm.
本节规定了一种信息算法,用于为接入单元内的NAL单元重建有效的解码顺序。其他NAL装置订单也可能有效;然而,任何合规的NAL单元顺序将描述与此算法产生的视频流和辅助数据相同的视频流和辅助数据。
An actual implementation, of course, needs only to behave "as if" this reordering is done. In particular, NAL units that are discarded by an implementation's decoding process do not need to be reordered.
当然,一个实际的实现只需要“好像”完成了这种重新排序。特别地,被实现的解码过程丢弃的NAL单元不需要重新排序。
In this algorithm, NAL units within an access unit are first ordered by NAL unit type, in the order specified in Table 12 below, except for NAL unit type 14, which is handled specially as described in the table. NAL units of the same type are then ordered as specified for the type, if necessary.
在该算法中,接入单元内的NAL单元首先按NAL单元类型排序,顺序如下表12所示,但NAL单元类型14除外,该NAL单元类型14按照表中所述进行专门处理。如有必要,同一类型的NAL装置将按照该类型的规定进行订购。
For the purposes of this algorithm, "session order" is the order of NAL units implied by their transmission order within an RTP session. For the non-interleaved and single NAL unit modes, this is the RTP sequence number order coupled with the order of NAL units within an aggregation unit.
在该算法中,“会话顺序”是指NAL单元在RTP会话中的传输顺序所隐含的顺序。对于非交织和单NAL单元模式,这是RTP序列号顺序与聚合单元内NAL单元的顺序耦合。
Table 12. Ordering of NAL unit types within an Access Unit
表12。访问单元内NAL单元类型的排序
Type  Description / Comments
-----------------------------------------------------------
9 Access unit delimiter
7 Sequence parameter set
7序列参数集
13 Sequence parameter set extension
13序列参数集扩展
15 Subset sequence parameter set
15子集序列参数集
8 Picture parameter set
8图像参数集
16-18 Reserved
16-18保留
6 Supplemental enhancement information (SEI)
If an SEI message with a first payload of 0 (Buffering Period) is present, it must be the first SEI message.
6补充增强信息(SEI)如果出现第一个有效负载为0(缓冲期)的SEI消息,则它必须是第一个SEI消息。
If SEI messages with a Scalable Nesting (30) payload and a nested payload of 0 (Buffering Period) are present, these then follow the first SEI message. Such an SEI message with the all_layer_representations_in_au_flag equal to 1 is placed first, followed by any others, sorted in increasing order of DQId.
如果存在具有可伸缩嵌套(30)有效负载和嵌套有效负载为0(缓冲期)的SEI消息,则这些消息将跟随第一条SEI消息。这样一个SEI消息,其所有的层表示形式都等于1,放在第一位,然后是任何其他消息,按DQId的递增顺序排序。
All other SEI messages follow in any order.
所有其他SEI消息以任何顺序跟随。
14 Prefix NAL unit in scalable extension
1 Coded slice of a non-IDR picture
5 Coded slice of an IDR picture
14可扩展扩展扩展中的前缀NAL单元1非IDR图片的编码片段5 IDR图片的编码片段
NAL units of type 1 or 5 will be sent within only a single session for any given access unit. They are placed in session order. (Note: Any given access unit will contain only NAL units of type 1 or type 5, not both.)
对于任何给定的访问单元,类型1或5的NAL单元将仅在单个会话内发送。它们按会话顺序排列。(注意:任何给定的访问单元将仅包含类型1或类型5的NAL单元,而不是两者。)
If NAL units of type 14 are present, every NAL unit of type 1 or 5 is prefixed by a NAL unit of type 14. (Note: Within an access unit, every NAL unit of type 14 is identical, so correlation of type 14 NAL units with the other NAL units is not necessary.)
如果存在类型14的NAL单元,则类型1或5的每个NAL单元都以类型14的NAL单元为前缀。(注意:在一个接入单元内,类型14的每个NAL单元都是相同的,因此不需要将类型14的NAL单元与其他NAL单元进行关联。)
12 Filler data
12填充数据
The only restriction on filler data NAL units within an access unit is that they shall not precede the first VCL NAL unit within the same access unit.
访问单元内填充数据NAL单元的唯一限制是,它们不得位于具有相同访问单元的第一个VCL NAL单元之前。
19 Coded slice of an auxiliary coded picture without partitioning
19无分区的辅助编码图片的编码片段
These NAL units will be sent within only a single session for any given access unit, and are placed in session order.
对于任何给定的访问单元,这些NAL单元将仅在单个会话中发送,并按会话顺序放置。
20 Coded slice in scalable extension
21-23 Reserved
可扩展扩展扩展21-23中的20编码片保留
Type 20 NAL units are placed in increasing order of DQId. Within each DQId value, they are placed in session order.
20型NAL装置按DQId的递增顺序排列。在每个DQId值中,它们按会话顺序排列。
(Note: SVC slices with a given DQId value will be sent within only a single session for any given access unit.)
(注意:对于任何给定的访问单元,具有给定DQId值的SVC片将仅在单个会话内发送。)
Type 21-23 NAL units are placed immediately following the non-reserved-type VCL NAL unit they follow in session order.
21-23型NAL单元按会话顺序紧跟在非保留型VCL NAL单元之后。
10 End of sequence
10序列结束
11 End of stream
11溪流末端
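The ordering in Table 12 can be illustrated with a short Python sketch. It is a simplified reading of the table under stated assumptions, not a complete implementation: the `Nal` record and the function name `order_access_unit` are hypothetical, the input list is assumed to be in session order already, and the SEI sub-ordering rules and reserved types 21-23 are not handled.

```python
from collections import namedtuple

# Hypothetical record: NAL unit type, DQId (only meaningful for type 20), label.
Nal = namedtuple("Nal", "type dqid label")


def order_access_unit(nals):
    """Order the NAL units of one access unit following Table 12."""
    def of_type(*types):
        # Keeps the incoming (session) order among the selected units.
        return [n for n in nals if n.type in types]

    ordered = []
    for t in (9, 7, 13, 15, 8, 16, 17, 18, 6):
        ordered += of_type(t)
    # Each type 1 or 5 slice is preceded by a prefix NAL unit when any exist.
    prefixes = of_type(14)
    for i, vcl in enumerate(of_type(1, 5)):
        if i < len(prefixes):
            ordered.append(prefixes[i])
        ordered.append(vcl)
    ordered += of_type(12)
    ordered += of_type(19)
    # sorted() is stable, so session order is kept within each DQId value.
    ordered += sorted(of_type(20), key=lambda n: n.dqid)
    ordered += of_type(10)
    ordered += of_type(11)
    return ordered


received = [
    Nal(20, 16, "EL1"), Nal(14, 0, "PREFIX"), Nal(1, 0, "BL"),
    Nal(20, 1, "EL0"), Nal(6, 0, "SEI"), Nal(7, 0, "SPS"), Nal(8, 0, "PPS"),
]
print([n.label for n in order_access_unit(received)])
```

Because Python's sort is stable, sorting type 20 units by DQId alone is enough to keep them in session order within each DQId value.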
The following process MUST be used when either the NI-C or I-C MST packetization mode is in use. The following process MAY be applied when the NI-TC MST packetization mode is in use.
当使用NI-C或I-C MST打包模式时,必须使用以下过程。当使用NI-TC MST打包模式时,可采用以下程序。
The RTP packets output from the RTP-level reception processing for each session are placed into a re-multiplexing buffer.
从每个会话的RTP级接收处理输出的RTP分组被放置到重复用缓冲器中。
It is RECOMMENDED to set the size of the re-multiplexing buffer (in bytes) equal to or greater than the value of the sprop-remux-buf-req media type parameter of the highest RTP session the receiver receives.
建议将重新复用缓冲区的大小(字节)设置为等于或大于接收器接收的最高RTP会话的sprop remux buf req media type参数的值。
The CS-DON value is calculated and stored for each NAL unit.
计算并存储每个NAL单元的CS-DON值。
Informative note: The CS-DON value of a NAL unit may rely on information carried in another packet than the packet containing the NAL unit. This happens, e.g., when the CS-DON values need to be derived for non-PACSI NAL units contained in single NAL unit packets, as the single NAL unit packets themselves do not contain CS-DON information. In this case, when no packet containing required CS-DON information is received for a NAL unit, this NAL unit has to be discarded by the receiver as it cannot be fed to the decoder in the correct order. When the optional media type parameter sprop-mst-csdon-always-present is equal to 1, no such dependency exists, i.e., the CS-DON value of any particular NAL unit can be derived solely according to information in the packet containing the NAL unit, and therefore, the receiver does not need to discard any received NAL units.
资料性说明:NAL单元的CS-DON值可能依赖于包含NAL单元的数据包以外的另一个数据包中携带的信息。例如,当需要为单个NAL单元分组中包含的非PACSI NAL单元导出CS-DON值时,就会发生这种情况,因为单个NAL单元分组本身不包含CS-DON信息。在这种情况下,当没有为NAL单元接收到包含所需CS-DON信息的分组时,接收机必须丢弃该NAL单元,因为它不能以正确的顺序馈送到解码器。当可选媒体类型参数sprop mst csdon always present等于1时,不存在这种依赖关系,即,任何特定NAL单元的CS-DON值可以仅根据包含该NAL单元的分组中的信息导出,因此,接收机不需要丢弃任何接收到的NAL单元。
The receiver operation is described below with the help of the following functions and constants:
在以下函数和常数的帮助下,接收器操作如下所述:
o Function AbsDON is specified in Section 8.1 of [RFC6184].
o [RFC6184]第8.1节规定了函数AbsDON。
o Function don_diff is specified in Section 5.5 of [RFC6184].
o [RFC6184]第5.5节规定了函数don_diff。
o Constant N is the value of the OPTIONAL sprop-mst-remux-buf-size media type parameter of the highest RTP session incremented by 1.
o 常数N是最高RTP会话的可选sprop mst remux buf size媒体类型参数的值,递增1。
Initial buffering lasts until one of the following conditions is fulfilled:
初始缓冲持续到满足以下条件之一:
o There are N or more VCL NAL units in the re-multiplexing buffer.
o 重新复用缓冲区中有N个或更多VCL NAL单元。
o If sprop-mst-max-don-diff of the highest RTP session is present, don_diff(m,n) is greater than the value of sprop-mst-max-don-diff of the highest RTP session, where n corresponds to the NAL unit having the greatest value of AbsDON among the received NAL units and m corresponds to the NAL unit having the smallest value of AbsDON among the received NAL units.
o 如果存在最高RTP会话的sprop mst max don diff,则don_diff(m,n)大于最高RTP会话的sprop mst max don diff的值,其中,n对应于接收到的NAL单元中具有最大AbsDON值的NAL单元,m对应于接收到的NAL单元中具有最小AbsDON值的NAL单元。
o Initial buffering has lasted for the duration equal to or greater than the value of the OPTIONAL sprop-remux-init-buf-time media type parameter of the highest RTP session.
o 初始缓冲的持续时间等于或大于最高RTP会话的可选sprop remux init buf time media type参数的值。
The NAL units to be removed from the re-multiplexing buffer are determined as follows:
要从重新复用缓冲器中移除的NAL单元确定如下:
o If the re-multiplexing buffer contains at least N VCL NAL units, NAL units are removed from the re-multiplexing buffer and passed to the decoder in the order specified below until the buffer contains N-1 VCL NAL units.
o 如果重新复用缓冲器包含至少N个VCL NAL单元,则NAL单元将从重新复用缓冲器中移除,并按照下面指定的顺序传递给解码器,直到缓冲器包含N-1个VCL NAL单元。
o If sprop-mst-max-don-diff of the highest RTP session is present, all NAL units m for which don_diff(m,n) is greater than sprop-mst-max-don-diff of the highest RTP session are removed from the re-multiplexing buffer and passed to the decoder in the order specified below. Herein, n corresponds to the NAL unit having the greatest value of AbsDON among the NAL units in the re-multiplexing buffer.
o 如果存在最高RTP会话的sprop mst max don diff,则将don_diff(m,n)大于最高RTP会话的sprop max don diff的所有NAL单元m从重新复用缓冲区中移除,并按照以下指定的顺序传递给解码器。这里,n对应于在重复用缓冲器中的NAL单元中具有最大AbsDON值的NAL单元。
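The second removal rule can be sketched as follows. The `don_diff` function follows the case analysis given for 16-bit DON values in Section 5.5 of [RFC6184]; applying it to CS-DON values, the `(abs_don, don)` pair layout, and the name `select_for_removal` are assumptions made for this illustration.

```python
def don_diff(don_m, don_n):
    """don_diff(m, n) for 16-bit DON values with wraparound."""
    if don_m == don_n:
        return 0
    if don_m < don_n:
        gap = don_n - don_m
        return gap if gap < 32768 else -(don_m + 65536 - don_n)
    gap = don_m - don_n
    return 65536 - don_m + don_n if gap >= 32768 else -gap


def select_for_removal(buffer, max_don_diff):
    """Return the buffered units m with don_diff(m, n) > max_don_diff.

    `buffer` holds (abs_don, don) pairs; n is the unit with the
    greatest AbsDON value in the buffer.
    """
    _, don_n = max(buffer)  # tuples compare on abs_don first
    return [unit for unit in buffer if don_diff(unit[1], don_n) > max_don_diff]


# The 16-bit value wraps from 65535 to 0 while AbsDON keeps growing.
buffered = [(65534, 65534), (65535, 65535), (65538, 2)]
print(select_for_removal(buffered, max_don_diff=3))
```

In this sample, only the unit with value 65534 is more than 3 positions behind the newest unit (value 2 after wraparound), so it is the only one selected.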
The order in which NAL units are passed to the decoder is specified as follows:
NAL单元传递给解码器的顺序规定如下:
o Let PDON be a variable that is initialized to 0 at the beginning of the RTP sessions.
o 设PDON为一个变量,在RTP会话开始时初始化为0。
o For each NAL unit associated with a value of CS-DON, a CS-DON distance is calculated as follows. If the value of CS-DON of the NAL unit is larger than the value of PDON, the CS-DON distance is equal to CS-DON - PDON. Otherwise, the CS-DON distance is equal to 65535 - PDON + CS-DON + 1.
o 对于与CS-DON值相关联的每个NAL单元,CS-DON距离计算如下。如果NAL单元的CS-DON值大于PDON值,则CS-DON距离等于CS-DON-PDON。否则,CS-DON距离等于65535-PDON+CS-DON+1。
o NAL units are delivered to the decoder in increasing order of CS-DON distance. If several NAL units share the same value of CS-DON distance, they can be passed to the decoder in any order.
o NAL单元以CS-DON距离的递增顺序传送到解码器。如果多个NAL单元共享相同的CS-DON距离值,则它们可以按任意顺序传递给解码器。
o When a desired number of NAL units have been passed to the decoder, the value of PDON is set to the value of CS-DON for the last NAL unit passed to the decoder.
o 当已将所需数量的NAL单元传递给解码器时,PDON的值被设置为传递给解码器的最后一个NAL单元的CS-DON值。
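The CS-DON distance rule above translates directly into code. The following Python sketch is illustrative; the function names `cs_don_distance` and `delivery_order` are hypothetical, and the buffer is reduced to a plain list of CS-DON values.

```python
def cs_don_distance(cs_don, pdon):
    """CS-DON distance of a NAL unit relative to PDON (16-bit wraparound)."""
    if cs_don > pdon:
        return cs_don - pdon
    return 65535 - pdon + cs_don + 1


def delivery_order(cs_dons, pdon):
    """Sort CS-DON values in increasing order of CS-DON distance."""
    return sorted(cs_dons, key=lambda value: cs_don_distance(value, pdon))


# PDON is close to the wraparound point, so 0 and 2 come after 65535.
print(delivery_order([2, 65533, 65535, 0], pdon=65530))
```

Note that a NAL unit whose CS-DON equals PDON gets the largest possible distance (65536) under this rule, so it is delivered last.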
This section specifies the parameters that MAY be used to select optional features of the payload format and certain features of the bitstream. The parameters are specified here as part of the media type registration for the SVC codec. A mapping of the parameters into the Session Description Protocol (SDP) [RFC4566] is also provided for applications that use SDP. Equivalent parameters could be defined elsewhere for use with control protocols that do not use SDP.
本节规定了可用于选择有效负载格式的可选特征和比特流的某些特征的参数。此处指定的参数是SVC编解码器的媒体类型注册的一部分。还讨论了参数到会话描述协议(SDP)[RFC4566]的映射
为使用SDP的应用程序提供。可以在其他地方定义等效参数,以便与不使用SDP的控制协议一起使用。
Some parameters provide a receiver with the properties of the stream that will be sent. The names of all these parameters start with "sprop" for stream properties. Some of these "sprop" parameters are limited by other payload or codec configuration parameters. For example, the sprop-parameter-sets parameter is constrained by the profile-level-id parameter. The media sender selects all "sprop" parameters rather than the receiver. This uncommon characteristic of the "sprop" parameters may be incompatible with some signaling protocol concepts, in which case the use of these parameters SHOULD be avoided.
一些参数向接收器提供将要发送的流的属性。对于流属性,所有这些参数的名称都以“sprop”开头。其中一些“sprop”参数受到其他有效负载或编解码器配置参数的限制。例如,“sprop参数集”参数受“纵断面标高id”参数的约束。媒体发送方选择所有“存储”参数,而不是接收方。“sprop”参数的这种不常见特征可能与某些信令协议概念不兼容,在这种情况下,应避免使用这些参数。
The media subtype for the SVC codec has been allocated from the IETF tree.
SVC编解码器的媒体子类型已从IETF树中分配。
The receiver MUST ignore any unspecified parameter.
接收器必须忽略任何未指定的参数。
Informative note: Requiring that the receiver ignore unspecified parameters allows for backward compatibility of future extensions. For example, if a future specification that is backward compatible to this specification specifies some new parameters, then a receiver according to this specification is capable of receiving data per the new payload but ignoring those parameters newly specified in the new payload specification. This provision is also present in [RFC6184].
资料性说明:要求接收器忽略未指定的参数允许将来扩展的向后兼容性。例如,如果与该规范向后兼容的未来规范指定了一些新参数,则根据该规范的接收器能够接收每个新有效载荷的数据,但忽略在新有效载荷规范中新指定的那些参数。[RFC6184]中也有此规定。
Media Type name: video
媒体类型名称:视频
Media subtype name: H264-SVC
媒体子类型名称:H264-SVC
Required parameters: none
所需参数:无
OPTIONAL parameters:
可选参数:
In the following definitions of parameters, "the stream" or "the NAL unit stream" refers to all NAL units conveyed in the current RTP session in SST, and all NAL units conveyed in the current RTP session and all NAL units conveyed in other RTP sessions that the current RTP session depends on in MST.
在以下参数定义中,“流”或“NAL单元流”指SST中当前RTP会话中传输的所有NAL单元,以及MST中当前RTP会话所依赖的当前RTP会话中传输的所有NAL单元和其他RTP会话中传输的所有NAL单元。
profile-level-id: A base16 [RFC4648] (hexadecimal) representation of the following three bytes in the sequence parameter set or subset sequence parameter set NAL unit specified in [H.264]: 1) profile_idc; 2) a byte herein referred to as profile-iop, composed of the values of constraint_set0_flag, constraint_set1_flag, constraint_set2_flag, constraint_set3_flag, constraint_set4_flag, constraint_set5_flag, and reserved_zero_2bits, in bit-significance order, starting from the most-significant bit; and 3) level_idc. Note that reserved_zero_2bits is required to be equal to 0 in [H.264], but other values for it may be specified in the future by ITU-T or ISO/IEC.
配置文件级别id:base16[RFC4648](十六进制)表示[H.264]:1)配置文件中指定的序列参数集或子集序列参数集NAL单元中的以下三个字节;2) 本文中称为配置文件iop的字节,由约束设置0_标志、约束设置1_标志、约束设置2_标志、约束设置3_标志、约束设置4_标志、约束设置5_标志和保留0_2位的值组成,按位重要性顺序,从最高有效位开始,以及3)级别idc。请注意,在[H.264]中,保留的0位必须等于0,但将来可能由ITU-T或ISO/IEC指定其其他值。
The profile-level-id parameter indicates the default sub-profile, i.e., the subset of coding tools that may have been used to generate the stream or that the receiver supports, and the default level of the stream or the one that the receiver supports.
profile level id参数指示默认子概要文件,即可能已用于生成流或接收器支持的编码工具的子集,以及流或接收器支持的流的默认级别。
The default sub-profile is indicated collectively by the profile_idc byte and some fields in the profile-iop byte. Depending on the values of the fields in the profile-iop byte, the default sub-profile may be the same set of coding tools supported by one profile, or a common subset of coding tools of multiple profiles, as specified in Subsection G.7.4.2.1.1 of [H.264]. The default level is indicated by the level_idc byte, and, when profile_idc is equal to 66, 77, or 88 (the Baseline, Main, or Extended profile) and level_idc is equal to 11, additionally by bit 4 (constraint_set3_flag) of the profile-iop byte. When profile_idc is equal to 66, 77, or 88 (the Baseline, Main, or Extended profile) and level_idc is equal to 11, and bit 4 (constraint_set3_flag) of the profile-iop byte is equal to 1, the default level is Level 1b.
默认子配置文件由配置文件\ idc字节和配置文件iop字节中的某些字段共同指示。根据配置文件iop字节中字段的值,默认子配置文件可以是一个配置文件支持的同一组编码工具,或多个配置文件的编码工具的公共子集,如[H.264]第G.7.4.2.1.1小节所述。默认级别由级别_idc字节表示,当配置文件_idc等于66、77或88(基线、主配置文件或扩展配置文件)且级别_idc等于11时,另外由配置文件iop字节的位4(约束设置3标志)表示。当profile_idc等于66、77或88(基线、主或扩展profile)且level_idc等于11,profile iop字节的第4位(constraint_set3_标志)等于1时,默认级别为1b。
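As an illustration of how the three bytes are interpreted, the following Python sketch splits a profile-level-id value and applies the Level 1b rule described above. The function name and the returned dictionary are hypothetical. Mapping level_idc to a level number by dividing by 10 is the usual H.264 convention, and any other Level 1b signaling that [H.264] may define for further profiles is not covered.

```python
def parse_profile_level_id(value):
    """Split a profile-level-id string into its three bytes and a level."""
    raw = bytes.fromhex(value)
    if len(raw) != 3:
        raise ValueError("profile-level-id must be exactly three bytes")
    profile_idc, profile_iop, level_idc = raw
    # constraint_set3_flag is bit 4 of profile-iop (mask 0x10).
    constraint_set3 = bool(profile_iop & 0x10)
    if profile_idc in (66, 77, 88) and level_idc == 11 and constraint_set3:
        level = "1b"
    else:
        level = f"{level_idc // 10}.{level_idc % 10}"
    return {
        "profile_idc": profile_idc,
        "profile_iop": profile_iop,
        "level_idc": level_idc,
        "level": level,
    }


print(parse_profile_level_id("42e00b"))  # Baseline, constraint_set3_flag = 0
print(parse_profile_level_id("42f00b"))  # Baseline, constraint_set3_flag = 1
```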
Table 13 lists all profiles defined in Annexes A and G of [H.264] and, for each of the profiles, the possible combinations of profile_idc and profile-iop that represent the same sub-profile.
表13列出了[H.264]附录A和G中定义的所有配置文件,以及对于每个配置文件,表示相同子配置文件的配置文件和配置文件iop的可能组合。
Table 13. Combinations of profile_idc and profile-iop representing the same sub-profile corresponding to the full set of coding tools supported by one profile. In the following, x may be either 0 or 1, while the profile names are indicated as follows. CB: Constrained Baseline profile, B: Baseline profile, M: Main profile, E: Extended profile, H: High profile, H10: High 10 profile, H42: High 4:2:2 profile, H44: High 4:4:4 Predictive profile, H10I: High 10 Intra profile, H42I: High 4:2:2 Intra profile, H44I: High 4:4:4 Intra profile, C44I: CAVLC 4:4:4 Intra profile, SB: Scalable Baseline profile, SH: Scalable High profile, and SHI: Scalable High Intra profile.
4:2:2内部配置文件、H44I:High 4:4:4内部配置文件、C44I:CAVLC 4:4:4内部配置文件、SB:可扩展基线配置文件、SH:可扩展高配置文件和SHI:可扩展高内部配置文件。
   Profile     profile_idc        profile-iop
               (hexadecimal)      (binary)

   CB          42 (B)             x1xx0000
      same as: 4D (M)             1xxx0000
      same as: 58 (E)             11xx0000
   B           42 (B)             x0xx0000
      same as: 58 (E)             10xx0000
   M           4D (M)             0x0x0000
   E           58                 00xx0000
   H           64                 00000000
   H10         6E                 00000000
   H42         7A                 00000000
   H44         F4                 00000000
   H10I        6E                 00010000
   H42I        7A                 00010000
   H44I        F4                 00010000
   C44I        2C                 00010000
   SB          53                 x0000000
   SH          56                 0x000000
   SHI         56                 0x010000
For example, in the table above, profile_idc equal to 58 (Extended) with profile-iop equal to 11xx0000 indicates the same sub-profile corresponding to profile_idc equal to 42 (Baseline) with profile-iop equal to x1xx0000. Note that other combinations of profile_idc and profile-iop (not listed in Table 13) may represent a sub-profile equivalent to the common subset of coding tools for more than one profile. Note also that a decoder conforming to a certain profile may be able to decode bitstreams conforming to other profiles.
例如,在上表中,profile_idc等于58(扩展),profile iop等于11xx0000,表示profile_idc等于42(基线),profile iop等于x1xx0000的相同子profile。注意,profile_idc和profile iop的其他组合(未在表13中列出)可能表示与多个profile的编码工具的公共子集等效的子profile。还注意,符合特定简档的解码器可以解码符合其他简档的比特流。
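A receiver that wants to recognize equivalent combinations could match a received profile_idc and profile-iop against the patterns of Table 13. The Python sketch below shows one way to do this; the list name `TABLE_13` and the function `sub_profiles` are hypothetical, and combinations not listed in the table simply yield an empty result.

```python
# (profile name, profile_idc, profile-iop pattern) rows from Table 13;
# "x" in a pattern matches either bit value.
TABLE_13 = [
    ("CB", 0x42, "x1xx0000"), ("CB", 0x4D, "1xxx0000"), ("CB", 0x58, "11xx0000"),
    ("B", 0x42, "x0xx0000"), ("B", 0x58, "10xx0000"),
    ("M", 0x4D, "0x0x0000"), ("E", 0x58, "00xx0000"),
    ("H", 0x64, "00000000"), ("H10", 0x6E, "00000000"),
    ("H42", 0x7A, "00000000"), ("H44", 0xF4, "00000000"),
    ("H10I", 0x6E, "00010000"), ("H42I", 0x7A, "00010000"),
    ("H44I", 0xF4, "00010000"), ("C44I", 0x2C, "00010000"),
    ("SB", 0x53, "x0000000"), ("SH", 0x56, "0x000000"), ("SHI", 0x56, "0x010000"),
]


def sub_profiles(profile_idc, profile_iop):
    """Return the Table 13 profile names matching the given two bytes."""
    bits = format(profile_iop, "08b")  # most-significant bit first
    return [
        name
        for name, idc, pattern in TABLE_13
        if idc == profile_idc
        and all(p in ("x", b) for p, b in zip(pattern, bits))
    ]


# Extended with profile-iop 11000000 and Baseline with 11100000 both
# correspond to the Constrained Baseline sub-profile.
print(sub_profiles(0x58, 0xC0), sub_profiles(0x42, 0xE0))
```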
If profile-level-id is used to indicate stream properties, it indicates that, to decode the stream, the minimum subset of coding tools a decoder has to support is the default sub-profile, and the lowest level the decoder has to support is the default level.
如果配置文件级别id用于指示流属性,则它指示,为了解码流,解码器必须支持的编码工具的最小子集是默认子配置文件,并且解码器必须支持的最低级别是默认级别。
If the profile-level-id parameter is used for capability exchange or session setup, it indicates the subset of coding tools, which is equal to the default sub-profile, that the codec supports for both receiving and sending. If max-recv-level is not present, the default level from profile-level-id indicates the highest level the codec wishes to support. If max-recv-level is present, it indicates the highest level the codec supports for receiving. For either receiving or sending, all levels that are lower than the highest level supported MUST also be supported.
如果profile level id参数用于功能交换或会话设置,则它表示编解码器在接收和发送时支持的编码工具子集,该子集等于默认子配置文件。如果不存在max recv level,则配置文件级别id中的默认级别表示编解码器希望支持的最高级别。如果
存在max recv level,表示编解码器支持接收的最高级别。对于接收或发送,还必须支持低于支持的最高级别的所有级别。
Informative note: Capability exchange and session setup procedures should provide means to list the capabilities for each supported sub-profile separately. For example, the one-of-N codec selection procedure of the SDP Offer/Answer model can be used (Section 10.2 of [RFC3264]). The one-of-N codec selection procedure may also be used to provide different combinations of profile_idc and profile-iop that represent the same sub-profile. When there are many different combinations of profile_idc and profile-iop that represent the same sub-profile, using the one-of-N codec selection procedure may result in a fairly large SDP message. Therefore, a receiver should understand the different equivalent combinations of profile_idc and profile-iop that represent the same sub-profile, and be ready to accept an offer using any of the equivalent combinations.
资料性说明:能力交换和会话设置程序应提供单独列出每个受支持子概要文件的能力的方法。例如,可以使用SDP提供/应答模型的N选一编解码器选择过程(RFC3264第10.2节)。N个编解码器中的一个选择过程还可用于提供表示相同子简档的简档和简档iop的不同组合。当存在代表同一子配置文件的多个不同配置文件和配置文件iop组合时,使用N选一编解码器选择过程可能会产生相当大的SDP消息。因此,接收方应了解代表同一子剖面的剖面图和剖面图iop的不同等效组合,并准备接受使用任何等效组合的报价。
If no profile-level-id is present, the Baseline Profile without additional constraints at Level 1 MUST be implied.
如果不存在概要文件级别id,则必须暗示在级别1没有附加约束的基线概要文件。
max-recv-level: This parameter MAY be used to indicate the highest level a receiver supports when the highest level is higher than the default level (the level indicated by profile-level-id). The value of max-recv-level is a base16 (hexadecimal) representation of the two bytes after the syntax element profile_idc in the sequence parameter set NAL unit specified in [H.264]: profile-iop (as defined above) and level_idc. If (the level_idc byte of max-recv-level is equal to 11 and bit 4 of the profile-iop byte of max-recv-level is equal to 1) or (the level_idc byte of max-recv-level is equal to 9 and bit 4 of the profile-iop byte of max-recv-level is equal to 0), the highest level the receiver supports is Level 1b. Otherwise, the highest level the receiver supports is equal to the level_idc byte of max-recv-level divided by 10.
最大recv级别:当最高级别高于默认级别(由配置文件级别id指示的级别)时,此参数可用于指示接收器支持的最高级别。max recv level的值是[H.264]:profile iop(如上定义)和level_idc中指定的序列参数集NAL unit中语法元素profile_idc之后两个字节的base16(十六进制)表示。如果(max recv level的级别_idc字节等于11,max recv level的配置文件iop字节的位4等于1)或(max recv level的级别_idc字节等于9,max recv level的配置文件iop字节的位4等于0),则接收机支持的最高级别为1b。否则,接收器支持的最高电平等于max recv level的level_idc字节除以10。
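The level rule for max-recv-level can be written out as a small function. This Python sketch is illustrative only; the function name `max_recv_level` and the string form of the returned level are assumptions.

```python
def max_recv_level(value):
    """Decode a max-recv-level value (profile-iop byte, then level_idc)."""
    raw = bytes.fromhex(value)
    if len(raw) != 2:
        raise ValueError("max-recv-level must be exactly two bytes")
    profile_iop, level_idc = raw
    bit4 = bool(profile_iop & 0x10)  # bit 4 of the profile-iop byte
    if (level_idc == 11 and bit4) or (level_idc == 9 and not bit4):
        return "1b"
    # Otherwise the level is level_idc divided by 10.
    return f"{level_idc // 10}.{level_idc % 10}"


for sample in ("e01f", "f00b", "e009", "e00b"):
    print(sample, "->", max_recv_level(sample))
```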
max-recv-level MUST NOT be present if the highest level the receiver supports is not higher than the default level.
如果接收器支持的最高电平不高于默认电平,则不得出现最大recv电平。
max-recv-base-level: This parameter MAY be used to indicate the highest level a receiver supports for the base layer when negotiating an SVC stream. The value of max-recv-base-level is a base16 (hexadecimal) representation of the two bytes after the syntax element profile_idc in the sequence parameter set NAL unit specified in [H.264]: profile-iop (as defined above) and level_idc. If (the level_idc byte of max-recv-base-level is equal to 11 and bit 4 of the profile-iop byte of max-recv-base-level is equal to 1) or (the level_idc byte of max-recv-base-level is equal to 9 and bit 4 of the profile-iop byte of max-recv-base-level is equal to 0), the highest level the receiver supports for the base layer is Level 1b. Otherwise, the highest level the receiver supports for the base layer is equal to the level_idc byte of max-recv-base-level divided by 10.
max recv base level:该参数可用于指示协商SVC流时接收器对基础层支持的最高级别。max recv base level的值是base16
(十六进制)表示[H.264]中指定的序列参数集NAL单位中语法元素profile_idc之后的两个字节:profile iop(如上定义)和level_idc。如果(max recv level的level_idc字节等于11,max recv level的profile iop字节的位4等于1)或(max recv level的level_idc字节等于9,max recv level的profile iop字节的位4等于0),则接收机为基础层支持的最高级别为1b。否则,接收器对基础层支持的最高级别等于max recv level的级别idc字节除以10。
max-mbps, max-fs, max-cpb, max-dpb, and max-br: The common properties of these parameters are specified in [RFC6184].
max-mbps、max-fs、max-cpb、max-dpb和max-br:这些参数的公共属性在[RFC6184]中指定。
max-mbps: This parameter is as specified in [RFC6184].
最大mbps:此参数如[RFC6184]中所述。
max-fs: This parameter is as specified in [RFC6184].
max fs:此参数在[RFC6184]中指定。
max-cpb: The value of max-cpb is an integer indicating the maximum coded picture buffer size in units of 1000 bits for the VCL HRD parameters and in units of 1200 bits for the NAL HRD parameters. Note that this parameter does not use units of cpbBrVclFactor and cpbBrNALFactor (see Table A-1 of [H.264]). The max-cpb parameter signals that the receiver has more memory than the minimum amount of coded picture buffer memory required by the signaled highest level conveyed in the value of the profile-level-id parameter or the max-recv-level parameter. When max-cpb is signaled, the receiver MUST be able to decode NAL unit streams that conform to the signaled highest level, with the exception that the MaxCPB value in Table A-1 of [H.264] for the signaled highest level is replaced with the value of max-cpb (after taking cpbBrVclFactor and cpbBrNALFactor into consideration when needed). The value of max-cpb (after taking cpbBrVclFactor and cpbBrNALFactor into consideration when needed) MUST be greater than or equal to the value of MaxCPB given in Table A-1 of [H.264] for the highest level. Senders MAY use this knowledge to construct coded video streams with greater variation of bitrate than can be achieved with the MaxCPB value in Table A-1 of [H.264].
max cpb:max cpb的值是一个整数,表示VCL HRD参数的最大编码图片缓冲区大小,以1000位为单位,NAL HRD参数以1200位为单位。请注意,此参数不使用cpbBrVclFactor和cpbBrNALFactor的单位(参见[H.264]中的表A-1])。max cpb参数表示接收器的内存大于在profile level id参数或max recv level参数的值中传送的信号化最高电平所需的最小编码图片缓冲内存量。当发送最大cpb信号时,接收机必须能够解码符合发送信号的最高电平的NAL单位流,但[H.264]表A-1中发送信号的最高电平的最大cpb值被替换为最大cpb值(在需要时考虑cpbBrVclFactor和cpbBrNALFactor后)。最大cpb值(在需要时考虑cpbBrVclFactor和CpBbBrnalFactor后)必须大于或等于[H.264]表A-1中给出的最高水平的最大cpb值。发送者可以使用此知识来构造比特率变化比[H.264]表A-1中的MaxCPB值更大的编码视频流。
Informative note: The coded picture buffer is used in the Hypothetical Reference Decoder (HRD, Annex C) of [H.264]. The use of the HRD is recommended in SVC encoders to verify that the produced bitstream conforms to the standard and to control the output bitrate. Thus, the coded picture buffer is conceptually independent of any other potential buffers in the receiver, including de-interleaving, re-multiplexing, and de-jitter buffers. The coded picture buffer need not be implemented in decoders as specified in Annex C of [H.264]; standard-compliant decoders can have any buffering arrangements provided that they can decode standard-compliant bitstreams. Thus, in practice, the input buffer for video decoder can be integrated with the de-interleaving, re-multiplexing, and de-jitter buffers of the receiver.
资料性说明:编码图片缓冲器用于[H.264]的假设参考解码器(HRD,附录C)。建议在SVC编码器中使用HRD来验证生成的比特流是否符合标准并控制输出比特率。因此,编码图片缓冲器在概念上独立于接收机中的任何其他潜在缓冲器,包括解交织、重复用和解抖动缓冲器。如[H.264]的附录C所规定,编码图片缓冲器不需要在解码器中实现;标准兼容解码器可以具有任何缓冲安排,只要它们能够解码标准兼容比特流。因此,在实践中,视频解码器的输入缓冲器可以与接收机的解交错、重复用和解抖动缓冲器集成。
max-dpb: This parameter is as specified in [RFC6184].
最大dpb:此参数如[RFC6184]中所述。
max-br: The value of max-br is an integer indicating the maximum video bitrate in units of 1000 bits per second for the VCL HRD parameters and in units of 1200 bits per second for the NAL HRD parameters. Note that this parameter does not use units of cpbBrVclFactor and cpbBrNALFactor (see Table A-1 of [H.264]).
max br:max br的值是一个整数,表示VCL HRD参数的最大视频比特率,单位为1000比特/秒,NAL HRD参数的最大视频比特率单位为1200比特/秒。请注意,此参数不使用cpbBrVclFactor和cpbBrNALFactor的单位(参见[H.264]中的表A-1])。
The max-br parameter signals that the video decoder of the receiver is capable of decoding video at a higher bitrate than is required by the signaled highest level conveyed in the value of the profile-level-id parameter or the max-recv-level parameter.
max-br参数表示接收器的视频解码器能够以高于在简档电平id参数或max-recv电平参数的值中传送的信号化最高电平所要求的比特率解码视频。
When max-br is signaled, the video codec of the receiver MUST be able to decode NAL unit streams that conform to the signaled highest level, with the following exceptions in the limits specified by the highest level:
当发送max br信号时,接收器的视频编解码器必须能够解码符合所发送信号的最高级别的NAL单元流,但在最高级别指定的限制范围内存在以下例外情况:
o The value of max-br (after taking cpbBrVclFactor and cpbBrNALFactor into consideration when needed) replaces the MaxBR value in Table A-1 of [H.264] for the highest level.
o max br的值(在需要时考虑cpbBrVclFactor和cpbBrNALFactor后)替换[H.264]表A-1中最高级别的MaxBR值。
o When the max-cpb parameter is not present, the result of the following formula replaces the value of MaxCPB in Table A-1 of [H.264]: (MaxCPB of the signaled level) * max-br / (MaxBR of the signaled highest level).
o 当max cpb参数不存在时,以下公式的结果将替换[H.264]表A-1中的MaxCPB值:(信号电平的MaxCPB)*max br/(信号最高电平的MaxBR)。
For example, if a receiver signals capability for Main profile Level 1.2 with max-br equal to 1550, this indicates a maximum video bitrate of 1550 kbits/sec for VCL HRD parameters, a maximum video bitrate of 1860 kbits/sec for NAL HRD parameters, and a CPB size of 4036458 bits (1550000 / 384000 * 1000 * 1000).
例如,如果接收机向主配置文件级别1.2发送信号,最大br等于1550,则表示VCL HRD参数的最大视频比特率为1550 kbits/sec,a
NAL HRD参数的最大视频比特率为1860 kbits/sec,CPB大小为4036458位(1550000/384000*1000*1000)。
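The arithmetic of this example can be reproduced with a few lines of Python. The function name `max_br_limits` is hypothetical, and the Level 1.2 limits (MaxBR of 384 and MaxCPB of 1000, both in units of 1000 bits) are the values implied by the calculation in the example; integer division is used to obtain a whole number of bits.

```python
def max_br_limits(max_br, level_max_br, level_max_cpb):
    """Derive bitrate and CPB limits from max-br when max-cpb is absent.

    level_max_br and level_max_cpb are the MaxBR and MaxCPB values of the
    signaled level, in units of 1000 bits/s and 1000 bits respectively.
    """
    vcl_bits_per_second = max_br * 1000
    nal_bits_per_second = max_br * 1200
    # (MaxCPB of the level) * max-br / (MaxBR of the level), in bits.
    cpb_bits = level_max_cpb * 1000 * max_br // level_max_br
    return vcl_bits_per_second, nal_bits_per_second, cpb_bits


print(max_br_limits(1550, level_max_br=384, level_max_cpb=1000))
# prints (1550000, 1860000, 4036458)
```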
The value of max-br (after taking cpbBrVclFactor and cpbBrNALFactor into consideration when needed) MUST be greater than or equal to the value MaxBR given in Table A-1 of [H.264] for the signaled highest level.
max br的值(在需要时考虑到cpbBrVclFactor和cpbBrNALFactor后)必须大于或等于[H.264]表A-1中给出的信号最高电平的MaxBR值。
Senders MAY use this knowledge to send higher-bitrate video as allowed in the level definition of SVC, to achieve improved video quality.
发送方可以使用此知识发送SVC级别定义中允许的更高比特率视频,以提高视频质量。
Informative note: This parameter was added primarily to complement a similar codepoint in the ITU-T Recommendation H.245, so as to facilitate signaling gateway designs. No assumption can be made from the value of this parameter that the network is capable of handling such bitrates at any given time. In particular, no conclusion can be drawn that the signaled bitrate is possible under congestion control constraints.
资料性说明:添加此参数主要是为了补充ITU-T建议H.245中的类似代码点,以便于信令网关设计。根据该参数的值,不能假设网络能够在任何给定时间处理此类比特率。特别地,不能得出在拥塞控制约束下信号比特率是可能的结论。
redundant-pic-cap: This parameter is as specified in [RFC6184].
冗余pic cap:此参数如[RFC6184]所述。
sprop-parameter-sets: This parameter MAY be used to convey any sequence parameter set, subset sequence parameter set, and picture parameter set NAL units (herein referred to as the initial parameter set NAL units) that can be placed in the NAL unit stream to precede any other NAL units in decoding order and that are associated with the default level of profile-level-id. The parameter MUST NOT be used to indicate codec capability in any capability exchange procedure. The value of the parameter is a comma (',') separated list of base64 [RFC4648] representations of the parameter set NAL units as specified in Sections 7.3.2.1, 7.3.2.2, and G.7.3.2.1 of [H.264]. Note that the number of bytes in a parameter set NAL unit is typically less than 10, but a picture parameter set NAL unit can contain several hundreds of bytes.
sprop参数集:该参数可用于传递任何序列参数集、子集序列参数集和图片参数集NAL单元(此处称为初始参数集NAL单元)可放置在NAL单元流中,以在解码顺序上先于任何其他NAL单元,并与profile-level-id的默认级别相关联。该参数不得用于指示任何能力交换过程中的编解码器能力。参数值为[H.264]第7.3.2.1、7.3.2.2和G.7.3.2.1节中规定的参数集NAL单位的base64[RFC4648]表示的逗号(“,”)分隔列表。请注意,参数集NAL单元中的字节数通常小于10,但图片参数集NAL单元可以包含数百个字节。
Informative note: When several payload types are offered in the SDP Offer/Answer model, each with its own sprop-parameter-sets parameter, then the receiver cannot assume that those parameter sets do not use conflicting storage locations (i.e., identical values of parameter set identifiers). Therefore, a receiver should buffer all sprop-parameter-sets and make them available to the decoder instance that decodes a certain payload type.
资料性说明:当SDP提供/应答模型中提供了几种有效负载类型,每种类型都有自己的sprop-parameter-sets参数时,接收方不能假设这些参数集没有使用冲突的存储位置(即参数集标识符的相同值)。因此,接收器应缓冲所有sprop-parameter-sets,并使其可用于解码特定有效负载类型的解码器实例。
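A receiver-side decoding of sprop-parameter-sets might look like the sketch below. The function names are hypothetical, and the classification by nal_unit_type (the low five bits of the first NAL unit byte, with 7, 8, and 15 assumed to denote sequence, picture, and subset sequence parameter sets) is an assumption taken from [H.264], not from this section.

```python
import base64


def decode_sprop_parameter_sets(value):
    """Split a comma-separated base64 list into raw NAL unit byte strings."""
    nal_units = []
    for item in value.split(","):
        item = item.strip()
        if item:
            nal_units.append(base64.b64decode(item))
    return nal_units


def nal_unit_type(nal):
    """Return the nal_unit_type field (low 5 bits of the first byte)."""
    return nal[0] & 0x1F


# Hypothetical value carrying two very short parameter set NAL units.
for nal in decode_sprop_parameter_sets("Z0IACg==,aM48gA=="):
    print(nal_unit_type(nal), nal.hex())
```

Keeping the decoded units per payload type, rather than in one shared table, follows the buffering advice in the informative note above.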
sprop-level-parameter-sets: This parameter MAY be used to convey any sequence, subset sequence, and picture parameter set NAL units (herein referred to as the initial parameter set NAL units) that can be placed in the NAL unit stream to precede any other NAL units in decoding order and that are associated with one or more levels different than the default level of profile-level-id. The parameter MUST NOT be used to indicate codec capability in any capability exchange procedure.
sprop级参数集:此参数可用于传送任何序列、子集序列和图片参数集NAL单元(此处称为初始参数集NAL单元)可放置在NAL单元流中,以在解码顺序上先于任何其他NAL单元,并与一个或多个不同于profile-level-id默认级别的级别相关联。该参数不得用于指示任何能力交换过程中的编解码器能力。
The sprop-level-parameter-sets parameter contains parameter sets for one or more levels that are different than the default level. All parameter sets targeted for use when one level of the default sub-profile is accepted by a receiver are clustered and prefixed with a three-byte field that has the same syntax as profile-level-id. This enables the receiver to install the parameter sets for the accepted level and discard the rest. The three-byte field is named PLId, and all parameter sets associated with one level are named PSL, which has the same syntax as sprop-parameter-sets. Parameter sets for each level are represented in the form of PLId:PSL, i.e., PLId followed by a colon (':') and the base64 [RFC4648] representation of the initial parameter set NAL units for the level. Each pair of PLId:PSL is also separated by a colon. Note that a PSL can contain multiple parameter sets for that level, separated with commas (',').
“存储级别参数集”参数包含一个或多个与默认级别不同的级别的参数集。当接收器接受默认子配置文件的一个级别时,所有目标使用的参数集都是群集的,并以一个三字节字段作为前缀,该字段的语法与profile-level-id相同。这使接收器能够为接受的级别安装参数集,并丢弃其余的参数集。三字节字段命名为PLId,与一个级别关联的所有参数集命名为PSL,其语法与sprop参数集相同。每个级别的参数集以PLId:PSL的形式表示,即PLId后跟冒号(“:”)和级别初始参数集NAL单位的base64[RFC4648]表示。每对PLId:PSL也用冒号分隔。请注意,PSL可以包含该级别的多个参数集,用逗号(“,”)分隔。
The subset of coding tools indicated by each PLId field MUST be equal to the default sub-profile, and the level indicated by each PLId field MUST be different than the default level.
每个PLId字段指示的编码工具子集必须等于默认子配置文件,并且每个PLId字段指示的级别必须不同于默认级别。
Informative note: This parameter allows for efficient level downgrade or upgrade in SDP Offer/Answer and out-of-band transport of parameter sets, simultaneously.
资料性说明:此参数允许在SDP提供/应答和参数集带外传输中同时进行有效的级别降级或升级。
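The PLId:PSL layout described above can be parsed along these lines. This is a sketch under stated assumptions: the function name is hypothetical, and PLId is assumed to be written as six base16 digits, in the same way as profile-level-id.

```python
import base64


def parse_sprop_level_parameter_sets(value):
    """Map each PLId (hex string) to the list of decoded parameter set NAL units.

    The value has the form PLId:PSL:PLId:PSL..., where each PSL is a
    comma-separated list of base64 strings.
    """
    fields = value.split(":")
    if len(fields) % 2 != 0:
        raise ValueError("expected an even number of colon-separated fields")
    result = {}
    for plid, psl in zip(fields[0::2], fields[1::2]):
        if len(plid) != 6:
            raise ValueError("PLId must be three bytes (six hex digits)")
        int(plid, 16)  # raises ValueError if PLId is not valid base16
        result[plid.lower()] = [
            base64.b64decode(item) for item in psl.split(",") if item
        ]
    return result
```

A receiver that accepts one of the levels would then install only the entry for the matching PLId and discard the rest. Neither ':' nor ',' occurs in the base64 alphabet, so plain splitting is sufficient here.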
in-band-parameter-sets: This parameter MAY be used to indicate a receiver capability. The value MAY be equal to either 0 or 1. The value 1 indicates that the receiver discards out-of-band parameter sets in sprop-parameter-sets and sprop-level-parameter-sets; therefore, the sender MUST transmit all parameter sets in-band. The value 0 indicates that the receiver utilizes out-of-band parameter sets included in sprop-parameter-sets and/or sprop-level-parameter-sets. However, in this case, the sender MAY still choose to send parameter sets in-band. When the parameter is not present, this receiver capability is not specified, and therefore the sender MAY send out-of-band parameter sets only, or it MAY send in-band parameter sets only, or it MAY send both.
in-band-parameter-sets:此参数可用于指示接收器能力。该值可以等于0或1。值1表示接收器丢弃sprop-parameter-sets和sprop-level-parameter-sets中的带外参数集,因此发送方必须在带内传输所有参数集。值0表示接收器使用sprop-parameter-sets和/或sprop-level-parameter-sets中包含的带外参数集。但是,在这种情况下,发送方仍然可以选择在带内发送参数集。如果该参数不存在,则未指定此接收器能力,因此发送方可以只发送带外参数集,也可以只发送带内参数集,或者两者都发送。
packetization-mode: This parameter is as specified in [RFC6184]. When the mst-mode parameter is present, the value of this parameter is additionally constrained as follows. If mst-mode is equal to "NI-T", "NI-C", or "NI-TC", packetization-mode MUST NOT be equal to 2. Otherwise, (mst-mode is equal to "I-C"), packetization-mode MUST be equal to 2.
打包模式:此参数在[RFC6184]中指定。当存在mst mode参数时,此参数的值将另外受到如下约束。如果mst模式等于“NI-T”、“NI-C”或“NI-TC”,则打包模式不得等于2。否则,(mst模式等于“I-C”),打包模式必须等于2。
sprop-interleaving-depth: This parameter is as specified in [RFC6184].
sprop交织深度:该参数如[RFC6184]所述。
sprop-deint-buf-req: This parameter is as specified in [RFC6184].
sprop deint buf req:该参数如[RFC6184]所述。
deint-buf-cap: This parameter is as specified in [RFC6184].
deint buf cap:此参数如[RFC6184]中所述。
sprop-init-buf-time: This parameter is as specified in [RFC6184].
sprop init buf time:此参数在[RFC6184]中指定。
sprop-max-don-diff: This parameter is as specified in [RFC6184].
sprop max don diff:此参数在[RFC6184]中指定。
max-rcmd-nalu-size: This parameter is as specified in [RFC6184].
最大rcmd nalu大小:此参数在[RFC6184]中指定。
mst-mode: This parameter MAY be used to signal the properties of a NAL unit stream or the capabilities of a receiver implementation. If this parameter is present, multi-session transmission MUST be used. Otherwise (this parameter is not present), single-session transmission MUST be used. When this parameter is present, the following applies. When the value of mst-mode is equal to "NI-T", the NI-T mode MUST be used. When the value of mst-mode is equal to "NI-C", the NI-C mode MUST be used. When the value of mst-mode is equal to "NI-TC", the NI-TC mode MUST be used. When the value of mst-mode is equal to "I-C", the I-C mode MUST be used. The value of mst-mode MUST have one of the following tokens: "NI-T", "NI-C", "NI-TC", or "I-C".
mst模式:此参数可用于表示NAL单元流的属性或接收器实现的能力。如果存在此参数,则必须使用多会话传输。否则(此参数不存在),必须使用单会话传输。当此参数存在时,以下情况适用。当mst模式的值等于“NI-T”时,必须使用NI-T模式。当mst模式的值等于“NI-C”时,必须使用NI-C模式。当mst模式的值等于“NI-TC”时,必须使用NI-TC模式。当mst模式的值等于“I-C”时,必须使用I-C模式。mst模式的值必须具有以下标记之一:“NI-T”、“NI-C”、“NI-TC”或“I-C”。
All RTP sessions in an MST MUST have the same value of mst-mode.
MST中的所有RTP会话必须具有相同的MST mode值。
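The constraints that tie mst-mode to packetization-mode can be checked mechanically. The following is a minimal sketch, assuming the parameters have already been parsed into a dictionary of strings; the function name is hypothetical, and an absent packetization-mode is treated as 0, which is the default assumed from [RFC6184].

```python
MST_MODES = {"NI-T", "NI-C", "NI-TC", "I-C"}


def check_mst_packetization(params):
    """Return a list of human-readable rule violations (empty if none)."""
    problems = []
    mst_mode = params.get("mst-mode")
    if mst_mode is None:
        # No mst-mode: single-session transmission, no extra constraint.
        return problems
    if mst_mode not in MST_MODES:
        problems.append("mst-mode has unknown token %r" % mst_mode)
        return problems
    packetization_mode = int(params.get("packetization-mode", "0"))
    if mst_mode == "I-C" and packetization_mode != 2:
        problems.append("I-C requires packetization-mode equal to 2")
    if mst_mode != "I-C" and packetization_mode == 2:
        problems.append("%s forbids packetization-mode equal to 2" % mst_mode)
    return problems


print(check_mst_packetization({"mst-mode": "NI-T", "packetization-mode": "1"}))
print(check_mst_packetization({"mst-mode": "I-C", "packetization-mode": "1"}))
```

The check covers one RTP session at a time; verifying that all sessions of an MST carry the same mst-mode value would need a comparison across sessions.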
sprop-mst-csdon-always-present: This parameter MUST NOT be present when mst-mode is not present or the value of mst-mode is equal to "NI-T" or "I-C". This parameter signals the properties of the NAL unit stream. When sprop-mst-csdon-always-present is present and the value is equal to 1, packetization-mode MUST be equal to 1, and all the RTP packets carrying the NAL unit stream MUST be STAP-A packets containing a PACSI NAL unit that further contains the DONC field or NI-MTAP packets with the J field equal to 1. When sprop-mst-csdon-always-present is present and the value is equal to 1, the CS-DON value of any particular NAL unit can be derived solely according to information in the packet containing the NAL unit.
sprop-mst-csdon-always-present:当mst-mode不存在或mst-mode的值等于“NI-T”或“I-C”时,此参数不得存在。此参数表示NAL单元流的属性。当sprop-mst-csdon-always-present存在且其值等于1时,packetization-mode必须等于1,并且所有承载NAL单元流的RTP数据包必须是包含PACSI NAL单元(该单元进一步包含DONC字段)的STAP-A数据包,或者是J字段等于1的NI-MTAP数据包。当sprop-mst-csdon-always-present存在且其值等于1时,任何特定NAL单元的CS-DON值都可以仅根据包含该NAL单元的数据包中的信息导出。
When sprop-mst-csdon-always-present is present in the current RTP session, it MUST be present also in all the RTP sessions the current RTP session depends on and the value of sprop-mst-csdon-always-present is identical for the current RTP session and all the RTP sessions on which the current RTP session depends.
当当前RTP会话中存在sprop-mst-csdon-always-present时,它也必须存在于当前RTP会话所依赖的所有RTP会话中,并且对于当前RTP会话及其所依赖的所有RTP会话,sprop-mst-csdon-always-present的值相同。
sprop-mst-remux-buf-size: This parameter MUST NOT be present when mst-mode is not present or the value of mst-mode is equal to "NI-T". This parameter MUST be present when mst-mode is present and the value of mst-mode is equal to "NI-C", "NI-TC", or "I-C".
sprop mst remux buf size:当mst模式不存在或mst模式的值等于“NI-T”时,此参数不得存在。当mst模式存在且mst模式值等于“NI-C”、“NI-TC”或“I-C”时,此参数必须存在。
This parameter signals the properties of the NAL unit stream. It MUST be set to a value one less than the minimum re-multiplexing buffer size (in NAL units), so that it is guaranteed that receivers can reconstruct NAL unit decoding order as specified in Subsection 6.2.2.
此参数表示NAL单位流的属性。必须将其设置为小于最小再复用缓冲区大小(以NAL单元为单位)1的值,以便保证接收机能够按照第6.2.2小节的规定重建NAL单元解码顺序。
The value of sprop-mst-remux-buf-size MUST be an integer in the range of 0 to 32767, inclusive.
sprop mst remux buf size的值必须是介于0到32767(包括0到32767)之间的整数。
sprop-remux-buf-req: This parameter MUST NOT be present when mst-mode is not present or the value of mst-mode is equal to "NI-T". It MUST be present when mst-mode is present and the value of mst-mode is equal to "NI-C", "NI-TC", or "I-C".
sprop remux buf req:当mst模式不存在或mst模式值等于“NI-T”时,此参数不得存在。当mst模式存在且mst模式值等于“NI-C”、“NI-TC”或“I-C”时,它必须存在。
sprop-remux-buf-req signals the required size of the re-multiplexing buffer for the NAL unit stream. It is guaranteed that receivers can recover the decoding order of the received NAL units from the current RTP session and the RTP sessions the current RTP session depends on, as specified in Section 6.2.2, when the re-multiplexing buffer size is at least the value of sprop-remux-buf-req in units of bytes.
sprop-remux-buf-req表示NAL单元流所需的重新复用缓冲区大小。当重新复用缓冲区大小(以字节为单位)至少为sprop-remux-buf-req的值时,保证接收器能够按照第6.2.2节的规定,恢复从当前RTP会话及其所依赖的RTP会话接收到的NAL单元的解码顺序。
The value of sprop-remux-buf-req MUST be an integer in the range of 0 to 4294967295, inclusive.
sprop remux buf req的值必须是介于0到4294967295(包括0和4294967295)之间的整数。
remux-buf-cap: This parameter MUST NOT be present when mst-mode is not present or the value of mst-mode is equal to "NI-T". This parameter MAY be used to signal the capabilities of a receiver implementation and indicates the amount of re-multiplexing buffer space in units of bytes that the receiver has available for recovering the NAL unit decoding order as specified in Section 6.2.2. A receiver is able to handle any NAL unit stream for which the value of the sprop-remux-buf-req parameter is smaller than or equal to this parameter.
remux buf cap:当mst模式不存在或mst模式的值等于“NI-T”时,此参数不得存在。该参数可用于表示接收器实现的能力,并指示接收器可用于恢复第6.2.2节中规定的NAL单元解码顺序的以字节为单位的重复用缓冲空间量。接收器能够处理sprop remux buf req参数值小于或等于此参数的任何NAL单元流。
If the parameter is not present, then a value of 0 MUST be used for remux-buf-cap. The value of remux-buf-cap MUST be an integer in the range of 0 to 4294967295, inclusive.
如果参数不存在,则remux buf cap必须使用0值。remux buf cap的值必须是介于0到4294967295(包括0和4294967295)之间的整数。
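The relationship between sprop-remux-buf-req and remux-buf-cap reduces to a single comparison, sketched below. The function name is hypothetical; the default of 0 for an absent remux-buf-cap and the value range follow the text above.

```python
MAX_REMUX_BYTES = 4294967295


def receiver_can_handle(sprop_remux_buf_req, remux_buf_cap=None):
    """Return True if the signaled buffer requirement fits the receiver capability.

    Both values are in bytes. A missing remux-buf-cap is treated as 0.
    """
    cap = 0 if remux_buf_cap is None else remux_buf_cap
    for value in (sprop_remux_buf_req, cap):
        if not 0 <= value <= MAX_REMUX_BYTES:
            raise ValueError("value out of range 0..4294967295")
    return sprop_remux_buf_req <= cap
```

Note that with remux-buf-cap absent, only a stream that signals sprop-remux-buf-req equal to 0 passes this check.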
sprop-remux-init-buf-time: This parameter MAY be used to signal the properties of the NAL unit stream. The parameter MUST NOT be present if mst-mode is not present or the value of mst-mode is equal to "NI-T".
sprop remux init buf time:此参数可用于表示NAL单元流的属性。如果mst模式不存在或mst模式的值等于“NI-T”,则该参数不得存在。
The parameter signals the initial buffering time that a receiver MUST wait before starting to recover the NAL unit decoding order as specified in Section 6.2.2 of this memo.
该参数表示接收器在开始恢复本备忘录第6.2.2节规定的NAL单元解码顺序之前必须等待的初始缓冲时间。
The parameter is coded as a non-negative base10 integer representation in clock ticks of a 90-kHz clock. If the parameter is not present, then no initial buffering time value is defined. Otherwise, the value of sprop-remux-init-buf-time MUST be an integer in the range of 0 to 4294967295, inclusive.
该参数以90 kHz时钟的时钟信号为单位编码为非负的base10整数表示。如果参数不存在,则不定义初始缓冲时间值。否则,sprop remux init buf time的值必须是0到4294967295(包括0到4294967295)范围内的整数。
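Since the value is expressed in ticks of a 90-kHz clock, converting it to wall-clock time is a single division. A small sketch follows; the function name is hypothetical.

```python
CLOCK_HZ = 90000  # sprop-remux-init-buf-time is counted in 90-kHz clock ticks


def remux_init_buf_seconds(ticks):
    """Convert a sprop-remux-init-buf-time value to seconds."""
    if not 0 <= ticks <= 4294967295:
        raise ValueError("value out of range 0..4294967295")
    return ticks / CLOCK_HZ
```

For instance, a signaled value of 45000 corresponds to half a second of initial buffering.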
sprop-mst-max-don-diff: This parameter MAY be used to signal the properties of the NAL unit stream. It MUST NOT be used to signal transmitter or receiver or codec capabilities. The parameter MUST NOT be present if mst-mode is not present or the value of mst-mode is equal to "NI-T". sprop-mst-max-don-diff is an integer in the range of 0 to 32767, inclusive. If sprop-mst-max-don-diff is not present, the value of the parameter is unspecified. sprop-mst-max-don-diff is calculated in the same way as sprop-max-don-diff as specified in [RFC6184], with the decoding order number replaced by the cross-session decoding order number.
sprop mst max don diff:此参数可用于表示NAL单位流的属性。不得将其用于信号发送器或接收器或编解码器功能。如果mst模式不存在或mst模式的值等于“NI-T”,则该参数不得存在。sprop mst max don diff是一个介于0到32767(含0到32767)之间的整数。如果sprop mst max don diff不存在,则该参数的值未指定。sprop mst max don diff的计算方法与[RFC6184]中规定的sprop max don diff相同,解码顺序号替换为跨会话解码顺序号。
sprop-scalability-info: This parameter MAY be used to convey the NAL unit containing the scalability information SEI message as specified in Annex G of [H.264]. This parameter MAY be used to signal the contained layers of an SVC bitstream. The parameter MUST NOT be used to indicate codec capability in any capability exchange procedure. The value of the parameter is the base64 [RFC4648] representation of the NAL unit containing the scalability information SEI message. If present, the NAL unit MUST contain only one SEI message that is a scalability information SEI message.
sprop可伸缩性信息:该参数可用于传送包含[H.264]附录G中规定的可伸缩性信息SEI消息的NAL单元。该参数可用于向SVC比特流的包含层发送信号。在任何功能交换过程中,该参数不得用于指示编解码器功能。该参数的值是包含可伸缩性信息SEI消息的NAL单元的base64[RFC4648]表示。如果存在,NAL单元必须仅包含一条SEI消息,该消息是可伸缩性信息SEI消息。
This parameter MAY be used in an offering or declarative SDP message to indicate what layers (operation points) can be provided. A receiver MAY indicate its choice of one layer using the optional media type parameter scalable-layer-id.
此参数可在提供(offer)或声明性SDP消息中使用,以指示可以提供哪些层(操作点)。接收方可以使用可选的媒体类型参数scalable-layer-id指示其对某一层的选择。
scalable-layer-id: This parameter MAY be used to signal a receiver's choice of the offered or declared operation points or layers using sprop-scalability-info or sprop-operation-point-info. The value of scalable-layer-id is a base16 representation of the layer_id[ i ] syntax element in the scalability information SEI message as specified in Annex G of [H.264] or layer-ID contained in sprop-operation-point-info.
可伸缩层id:此参数可用于向接收方发出信号,表明其选择了使用sprop可伸缩性信息或sprop操作点信息的报价或声明的操作点或层。可伸缩层id的值是可伸缩性信息SEI消息中layer_id[i]语法元素的base16表示,如[H.264]的附录G所规定,或sprop操作点信息中包含的layer id。
sprop-operation-point-info: This parameter MAY be used to describe the operation points of an RTP session. The value of this parameter consists of a comma-separated list of operation-point-description vectors. The values given by the operation-point-description vectors are the same as, or are derived from, the values that would be given for a scalable layer in the scalability information SEI message as specified in Annex G of [H.264], where the term scalable layer in the scalability information SEI message refers to all NAL units associated with the same values of temporal_id, dependency_id, and quality_id. In this memo, such a set of NAL units is called an operation point.
sprop操作点信息:此参数可用于描述RTP会话的操作点。此参数的值由逗号分隔的操作点描述向量列表组成。操作点描述向量给出的值与[H.264]附录G中规定的可伸缩性信息SEI消息中可伸缩层给出的值相同,或源自该值,其中,可伸缩性信息SEI消息中的术语可伸缩层指的是与相同的时间、相关性和质量id值相关联的所有NAL单元。在本备忘录中,这样一组NAL单元称为操作点。
Each operation-point-description vector has ten elements, provided as a comma-separated list of values as defined below. The first value of the operation-point-description vector is preceded by a '<', and the last value of the operation-point-description vector is followed by a '>'. If the sprop-operation-point-info is followed by exactly one operation-point-description vector, this describes the highest operation point contained in the RTP session. If there are two or more operation-point-description vectors, the first describes the lowest and the last describes the highest operation point contained in the RTP session.
每个操作点描述向量有十个元素,以逗号分隔的值列表形式提供,定义如下。操作点描述向量的第一个值前面是“<”,最后一个值后面是“>”。如果sprop-operation-point-info后面只有一个操作点描述向量,则该向量描述RTP会话中包含的最高操作点。如果有两个或更多操作点描述向量,则第一个描述RTP会话中包含的最低操作点,最后一个描述最高操作点。
The values given by the operation-point-description vector are as follows, in the order listed:
操作点描述向量给出的值如下所示,顺序如下:
- layer-ID: This value specifies the layer identifier of the operation point, which is identical to the layer_id that would be indicated (for the same values of dependency_id, quality_id, and temporal_id) in the scalability information SEI message. This field MAY be empty, indicating that the value is unspecified. When there are multiple operation-point-description vectors with layer-ID, the values of layer-ID do not need to be consecutive.
- layer ID:该值指定操作点的层标识符,该标识符与可伸缩性信息SEI消息中指示的层ID相同(对于相同的依赖项ID、质量ID和时间ID值)。此字段可能为空,表示该值未指定。当存在多个具有层ID的操作点描述向量时,层ID的值不需要是连续的。
- temporal-ID: This value specifies the temporal_id of the operation point. This field MUST NOT be empty.
- 临时ID:该值指定操作点的临时ID。此字段不能为空。
- dependency-ID: This value specifies the dependency_id of the operation point. This field MUST NOT be empty.
- 依赖项ID:该值指定操作点的依赖项ID。此字段不能为空。
- quality-ID: This value specifies the quality_id of the operation point. This field MUST NOT be empty.
- 质量ID:该值指定操作点的质量ID。此字段不能为空。
- profile-level-ID: This value specifies the profile-level-idc of the operation point in the base16 format. The default sub-profile or default level indicated by the parameter profile-level-ID in the sprop-operation-point-info vector SHALL be equal to or lower than the default sub-profile or default level indicated by profile-level-id, which may be either present or the default value is taken. This field MAY be empty, indicating that the value is unspecified.
- 配置文件级别ID:该值以base16格式指定操作点的配置文件级别idc。sprop操作点信息向量中参数配置文件级别ID指示的默认子配置文件或默认级别应等于或低于配置文件级别ID指示的默认子配置文件或默认级别,该默认子配置文件或默认级别可以存在,也可以取默认值。此字段可能为空,表示该值未指定。
- avg-framerate: This value specifies the average frame rate of the operation point. This value is given as an integer in frames per 256 seconds. The field MAY be empty, indicating that the value is unspecified.
- 平均帧速率:该值指定操作点的平均帧速率。该值以整数形式给出,单位为每256秒帧数。该字段可能为空,表示该值未指定。
- width: This value specifies the width dimension in pixels of decoded frames for the operation point. This parameter is not directly given in the scalability information SEI message. This field MAY be empty, indicating that the value is unspecified.
- 宽度:该值指定操作点解码帧的宽度尺寸(以像素为单位)。可伸缩性信息SEI消息中未直接给出此参数。此字段可能为空,表示该值未指定。
- height: This value gives the height dimension in pixels of decoded frames for the operation point. This parameter is not directly given in the scalability information SEI. This field MAY be empty, indicating that the value is unspecified.
- 高度:该值给出操作点解码帧的高度尺寸(以像素为单位)。可伸缩性信息SEI中未直接给出此参数。此字段可能为空,表示该值未指定。
- avg-bitrate: This value specifies the average bitrate of the operation point. This parameter is given as an integer in kbits per second over the entire stream. Note that this parameter is provided in the scalability information SEI message in bits per second and calculated over a variable time window. This field MAY be empty, indicating that the value is unspecified.
- 平均比特率:该值指定操作点的平均比特率。该参数在整个流中以整数形式给出,单位为kbits/s。请注意,此参数在可伸缩性信息SEI消息中以比特/秒为单位提供,并在可变时间窗口中计算。此字段可能为空,表示该值未指定。
- max-bitrate: This value specifies the maximum bitrate of the operation point. This parameter is given as an integer in kbits per second and describes the maximum bitrate per each one-second window. Note that this parameter is provided in the scalability information SEI message in bits per second and is calculated over a variable time window. This field MAY be empty, indicating that the value is unspecified.
- 最大比特率:该值指定操作点的最大比特率。此参数以整数形式给出,单位为kbits/s,并描述每1秒窗口的最大比特率。请注意,此参数在可伸缩性信息SEI消息中以比特/秒为单位提供,并在可变时间窗口中计算。此字段可能为空,表示该值未指定。
Similarly to sprop-scalability-info, this parameter MAY be used in an offering or declarative SDP message to indicate what layers (operation points) can be provided. A receiver MAY indicate its choice of the highest layer it wants to send and/or receive using the optional media type parameter scalable-layer-id.
与sprop可伸缩性信息类似,此参数可在产品或声明性SDP消息中使用,以指示可以提供哪些层(操作点)。接收机可以使用可选的媒体类型参数scalable-layer-id指示其要发送和/或接收的最高层的选择。
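The ten-element vector layout described above lends itself to a small parser. The sketch below is illustrative only: the class and function names are hypothetical, the numeric fields are assumed to be written as decimal integers, and layer-ID and profile-level-ID are kept as uninterpreted strings.

```python
import re
from typing import List, NamedTuple, Optional


class OperationPoint(NamedTuple):
    layer_id: Optional[str]
    temporal_id: int
    dependency_id: int
    quality_id: int
    profile_level_id: Optional[str]
    avg_fps: Optional[float]   # converted from frames per 256 seconds
    width: Optional[int]
    height: Optional[int]
    avg_kbps: Optional[int]
    max_kbps: Optional[int]


def _opt_int(text):
    """Return int(text), or None for an empty (unspecified) field."""
    return int(text) if text else None


def parse_operation_points(value: str) -> List[OperationPoint]:
    """Parse '<...>,<...>' into a list of OperationPoint tuples."""
    points = []
    for body in re.findall(r"<([^<>]*)>", value):
        f = [x.strip() for x in body.split(",")]
        if len(f) != 10:
            raise ValueError("an operation point vector needs ten elements")
        if not (f[1] and f[2] and f[3]):
            raise ValueError("temporal-ID, dependency-ID, and quality-ID "
                             "MUST NOT be empty")
        frames_per_256s = _opt_int(f[5])
        points.append(OperationPoint(
            layer_id=f[0] or None,
            temporal_id=int(f[1]),
            dependency_id=int(f[2]),
            quality_id=int(f[3]),
            profile_level_id=f[4] or None,
            avg_fps=None if frames_per_256s is None else frames_per_256s / 256,
            width=_opt_int(f[6]),
            height=_opt_int(f[7]),
            avg_kbps=_opt_int(f[8]),
            max_kbps=_opt_int(f[9]),
        ))
    return points


# Hypothetical value with two operation points; empty fields are unspecified.
example = "<1,0,0,0,53000a,3840,176,144,128,256>,<2,1,1,0,,7680,352,288,,>"
for point in parse_operation_points(example):
    print(point)
```

When two or more vectors are present, the first and last entries of the returned list correspond to the lowest and highest operation points of the RTP session.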
sprop-no-NAL-reordering-required: This parameter MAY be used to signal the properties of the NAL unit stream. This parameter MUST NOT be present when mst-mode is not present or the value of mst-mode is not equal to "NI-T". The presence of this parameter indicates that no reordering of non-VCL or VCL NAL units is required for the decoding order recovery process.
sprop无需NAL重新排序:此参数可用于表示NAL单元流的属性。当mst模式不存在或mst模式值不等于“NI-T”时,此参数不得存在。此参数的存在表明解码顺序恢复过程不需要对非VCL或VCL NAL单元进行重新排序。
sprop-avc-ready: This parameter MAY be used to indicate the properties of the NAL unit stream. The presence of this parameter indicates that the RTP session, if used in SST, or used in MST combined with other RTP sessions also with this parameter present, can be processed by a [RFC6184] receiver. This parameter MAY be used with RTP sessions with media subtype H264-SVC.
sprop avc ready:此参数可用于指示NAL单位流的属性。此参数的存在表明,如果在SST中使用RTP会话,或在MST中与其他RTP会话结合使用时也存在此参数,则[RFC6184]接收器可以处理RTP会话。此参数可用于媒体子类型为H264-SVC的RTP会话。
Encoding considerations: This media type is framed and binary; see Section 4.8 of RFC 4288 [RFC4288].
编码注意事项:此媒体类型为框架和二进制;参见RFC 4288[RFC4288]第4.8节。
Security considerations: See Section 8 of RFC 6190.
安全注意事项:见RFC 6190第8节。
Published specification: Please refer to RFC 6190 and its Section 13.
发布规范:请参考RFC 6190及其第13节。
Additional information: none
其他信息:无
File extensions: none
文件扩展名:无
Macintosh file type code: none
Macintosh文件类型代码:无
Object identifier or OID: none
对象标识符或OID:无
Person & email address to contact for further information:
联系人和电子邮件地址,以获取更多信息:
Ye-Kui Wang, yekui.wang@huawei.com
Ye-Kui Wang, yekui.wang@huawei.com
Intended usage: COMMON
预期用途:普通
Restrictions on usage: This media type depends on RTP framing, and hence is only defined for transfer via RTP [RFC3550]. Transport within other framing protocols is not defined at this time.
使用限制:此媒体类型取决于RTP帧,因此仅定义为通过RTP传输[RFC3550]。此时未定义其他帧协议内的传输。
Interoperability considerations: The media subtype name contains "SVC" to avoid potential conflict with RFC 3984 and its potential future replacement RTP payload format for H.264 non-SVC profiles.
互操作性注意事项:媒体子类型名称包含“SVC”,以避免与RFC 3984及其可能的未来替代RTP有效负载格式(用于H.264非SVC配置文件)发生冲突。
Applications that use this media type: Real-time video applications like video streaming, video telephony, and video conferencing.
使用此媒体类型的应用程序:实时视频应用程序,如视频流、视频电话和视频会议。
Author:
作者:
Ye-Kui Wang, yekui.wang@huawei.com
Ye-Kui Wang, yekui.wang@huawei.com
Change controller: IETF Audio/Video Transport working group delegated from the IESG.
变更控制员:IESG授权的IETF音频/视频传输工作组。
The media type video/H264-SVC string is mapped to fields in the Session Description Protocol (SDP) as follows:
媒体类型video/H264-SVC字符串映射到会话描述协议(SDP)中的字段,如下所示:
o The media name in the "m=" line of SDP MUST be video.
o SDP的“m=”行中的媒体名称必须是视频。
o The encoding name in the "a=rtpmap" line of SDP MUST be H264-SVC (the media subtype).
o SDP的“a=rtpmap”行中的编码名称必须是H264-SVC(媒体子类型)。
o The clock rate in the "a=rtpmap" line MUST be 90000.
o “a=rtpmap”行中的时钟频率必须为90000。
o The OPTIONAL parameters profile-level-id, max-recv-level, max-recv-base-level, max-mbps, max-fs, max-cpb, max-dpb, max-br, redundant-pic-cap, in-band-parameter-sets, packetization-mode, sprop-interleaving-depth, deint-buf-cap, sprop-deint-buf-req, sprop-init-buf-time, sprop-max-don-diff, max-rcmd-nalu-size, mst-mode, sprop-mst-csdon-always-present, sprop-mst-remux-buf-size, sprop-remux-buf-req, remux-buf-cap, sprop-remux-init-buf-time, sprop-mst-max-don-diff, and scalable-layer-id, when present, MUST be included in the "a=fmtp" line of SDP. These parameters are expressed as a media type string, in the form of a semicolon-separated list of parameter=value pairs.
o 可选参数配置文件级别id、最大recv级别、最大recv基本级别、最大mbps、最大fs、最大cpb、最大dpb、最大br、冗余pic cap、带内参数集、打包模式、sprop交织深度、deint buf cap、sprop deint buf req、sprop init buf time、sprop max don diff、最大rcmd nalu size、mst mode、sprop mst csdon始终存在,sprop mst remux buf size、sprop remux buf req、remux buf cap、sprop remux init buf time、sprop mst max don diff和可伸缩层id(如果存在)必须包含在SDP的“a=fmtp”行中。这些参数表示为媒体类型字符串,以分号分隔的参数=值对列表的形式。
o The OPTIONAL parameters sprop-parameter-sets, sprop-level-parameter-sets, sprop-scalability-info, sprop-operation-point-info, sprop-no-NAL-reordering-required, and sprop-avc-ready, when present, MUST be included in the "a=fmtp" line of SDP or conveyed using the "fmtp" source attribute as specified in Section 6.3 of [RFC5576]. For a particular media format (i.e., RTP payload type), a sprop-parameter-sets or sprop-level-parameter-sets MUST NOT be both included in the "a=fmtp" line of SDP and conveyed using the "fmtp" source attribute. When included in the "a=fmtp" line of SDP, these parameters are expressed as a media type string, in the form of a semicolon-separated list of parameter=value pairs. When conveyed using the "fmtp" source attribute, these parameters are only associated with the given source and payload type as parts of the "fmtp" source attribute.
o 可选参数sprop参数集、sprop级别参数集、sprop可伸缩性信息、sprop操作点信息、需要的sprop no NAL重新排序和sprop avc ready(如果存在)必须包含在SDP的“a=fmtp”行中,或使用[RFC5576]第6.3节中规定的“fmtp”源属性进行传输。对于特定媒体格式(即RTP有效负载类型),sprop参数集或sprop级别参数集不得同时包含在SDP的“a=fmtp”行中,且不得使用“fmtp”源属性进行传输。当包含在SDP的“a=fmtp”行中时,这些参数表示为媒体类型字符串,以分号分隔的参数=值对列表的形式。当使用“fmtp”源属性进行传输时,这些参数仅与作为“fmtp”源属性一部分的给定源和有效负载类型相关联。
Informative note: Conveyance of sprop-parameter-sets and sprop-level-parameter-sets using the "fmtp" source attribute allows for out-of-band transport of parameter sets in topologies like Topo-Video-switch-MCU [RFC5117].
资料性说明:使用“fmtp”源属性传输sprop参数集和sprop级别参数集允许在拓扑(如拓扑视频开关MCU)[RFC5117]中进行带外传输。
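The mapping to SDP described above implies that a receiver has to split an "a=fmtp" line into parameter=value pairs. A minimal sketch of such a parser follows; the function name and the example line are hypothetical, and parameters that appear without a value are stored with an empty string.

```python
def parse_fmtp(line):
    """Parse 'a=fmtp:<pt> name=value;name=value' into (payload_type, dict)."""
    prefix = "a=fmtp:"
    if not line.startswith(prefix):
        raise ValueError("not an a=fmtp line")
    pt_text, _, rest = line[len(prefix):].partition(" ")
    params = {}
    for pair in rest.split(";"):
        pair = pair.strip()
        if not pair:
            continue
        # Split on the first '=' only: base64 values may end in '=' padding.
        name, sep, value = pair.partition("=")
        params[name.strip()] = value.strip() if sep else ""
    return int(pt_text), params


# Hypothetical fmtp line for an H264-SVC payload type.
pt, params = parse_fmtp(
    "a=fmtp:96 profile-level-id=53000c; packetization-mode=1; "
    "mst-mode=NI-T; sprop-parameter-sets=Z0IACg==,aM48gA==")
print(pt, params)
```

Splitting on ';' is safe for the parameters listed here because commas, colons, and angle brackets are the only separators used inside their values.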
When an SVC stream (with media subtype H264-SVC) is offered over RTP using SDP in an Offer/Answer model [RFC3264] for negotiation for unicast usage, the following limitations and rules apply:
当在提供/应答模型[RFC3264]中使用SDP通过RTP提供SVC流(媒体子类型为H264-SVC)以协商单播使用时,以下限制和规则适用:
o The parameters identifying a media format configuration for SVC are profile-level-id, packetization-mode, and mst-mode. These media configuration parameters (except for the level part of profile-level-id) MUST be used symmetrically when the answerer does not include scalable-layer-id in the answer; i.e., the answerer MUST either maintain all configuration parameters or remove the media format (payload type) completely, if one or more of the parameter values are not supported. Note that the level part of profile-level-id includes level_idc, and, for indication of level 1b when profile_idc is equal to 66, 77, or 88, bit 4 (constraint_set3_flag) of profile-iop. The level part of profile-level-id is changeable.
o 标识SVC媒体格式配置的参数为profile-level-id、packetization-mode和mst-mode。当应答者在应答中不包括scalable-layer-id时,这些媒体配置参数(profile-level-id的级别部分除外)必须对称使用;即,如果不支持一个或多个参数值,应答者必须要么保留所有配置参数,要么完全删除该媒体格式(有效负载类型)。请注意,profile-level-id的级别部分包括level_idc,以及在profile_idc等于66、77或88时用于指示Level 1b的profile-iop第4位(constraint_set3_flag)。profile-level-id的级别部分是可更改的。
Informative note: The requirement for symmetric use does not apply for the level part of profile-level-id, and does not apply for the other stream properties and capability parameters.
资料性说明:对称使用要求不适用于概要文件级别id的级别部分,也不适用于其他流属性和能力参数。
Informative note: In [H.264], all the levels except for Level 1b are equal to the value of level_idc divided by 10. Level 1b is a level higher than Level 1.0 but lower than Level 1.1, and is signaled in an ad hoc manner. For the Baseline, Main, and Extended profiles (with profile_idc equal to 66, 77, and 88, respectively), Level 1b is indicated by level_idc equal to 11 (i.e., the same as level 1.1) and constraint_set3_flag equal to 1. For other profiles, Level 1b is indicated by level_idc equal to 9 (but note that Level 1b for these profiles is still higher than Level 1, which has level_idc equal to 10, and lower than Level 1.1). In SDP Offer/Answer, an answer may indicate a level equal to or lower than the level indicated in the offer. Due to the ad hoc indication of Level 1b, offerers and answerers must check the value of bit 4 (constraint_set3_flag) of the middle octet of the parameter profile-level-id, when profile_idc is equal to 66, 77, or 88 and level_idc is equal to 11.
资料性说明:在[H.264]中,除1b级外的所有级别均等于_级idc值除以10。1b级是高于1.0级但低于1.1级的级别,并以特殊方式发出信号。对于基线、主配置文件和扩展配置文件(配置文件idc分别等于66、77和88),1b级由级别idc等于11(即,与级别1.1相同)和约束设置3_标志等于1表示。对于其他配置文件,1b级由等于9的级别_idc表示(但请注意,这些配置文件的1b级仍然高于级别1,后者的级别_idc等于10,低于级别1.1)。在SDP报价/应答中,应答可能表示等于或低于报价中指示的水平。由于1b级的特殊指示,当profile_idc等于66、77或88且Level_idc等于11时,报价人和应答人必须检查参数profile Level id中间八位字节的第4位(constraint_set3_flag)的值。
To simplify handling and matching of these configurations, the same RTP payload type number used in the offer should also be used in the answer, as specified in [RFC3264]. The same RTP payload type number used in the offer MUST also be used in the answer when the answer includes scalable-layer-id. When the answer does not include scalable-layer-id, the answer MUST NOT contain a payload type number used in the offer unless the configuration is exactly the same as in the offer or the configuration in the answer only differs from that in the offer with a level lower than the default level offered.
为了简化这些配置的处理和匹配,应答中也应使用提供中使用的相同RTP有效负载类型编号,如[RFC3264]所述。当应答包括scalable-layer-id时,应答中也必须使用提供中使用的相同RTP有效负载类型编号。当应答不包括scalable-layer-id时,应答不得包含提供中使用的有效负载类型编号,除非配置与提供中的配置完全相同,或者应答中的配置与提供中的配置的唯一区别是级别低于所提供的默认级别。
Informative note: When an offerer receives an answer that does not include scalable-layer-id it has to compare payload types not declared in the offer based on the media type (i.e., video/H264-SVC) and the above media configuration parameters with any payload types it has already declared. This will enable it to determine whether the configuration in question is new or if it is equivalent to configuration already offered, since a different payload type number may be used in the answer.
资料性说明:当报价人收到不包括可扩展层id的答复时,必须根据媒体类型(即视频/H264-SVC)和上述媒体配置参数,将报价中未声明的有效负载类型与其已声明的任何有效负载类型进行比较。这将使其能够确定所讨论的配置是新的还是与已经提供的配置等效,因为答案中可能会使用不同的有效负载类型编号。
Since an SVC stream may contain multiple operation points, a facility is provided so that an answerer can select a different operation point than the entire SVC stream. Specifically, different operation points MAY be described using the sprop-scalability-info or sprop-operation-point-info parameters. The first one carries the entire scalability information SEI message defined in Annex G of [H.264], whereas the second one may be derived, e.g., as a subset of this SEI message that only contains key information about an operation point. Operation points, in both cases, are associated with a layer identifier.
由于SVC流可能包含多个操作点,因此提供了一种设施,以便应答者可以选择与整个SVC流不同的操作点。具体地,可以使用sprop可伸缩性信息或sprop操作点信息参数来描述不同的操作点。第一个包含[H.264]附录G中定义的整个可伸缩性信息SEI消息,而第二个可导出,例如,作为该SEI消息的子集,该SEI消息仅包含关于操作点的关键信息。在这两种情况下,操作点都与层标识符相关联。
If such information (sprop-operation-point-info or sprop-scalability-info) is provided in an offer, an answerer MAY select from the various operation points offered in the sprop-scalability-info or sprop-operation-point-info parameters by including scalable-layer-id in the answer. By this, the answerer indicates its selection of a particular operation point in the received and/or in the sent stream. When such operation point selection takes place, i.e., the answerer includes scalable-layer-id in the answer, the media configuration parameters MUST NOT be present in the answer. Rather, the media configuration that the answerer will use for receiving and/or sending is the one used for the selected operation point as indicated in the offer.
如果报价中提供了此类信息(sprop操作点信息或sprop可伸缩性信息),则应答者可以通过在应答中包括可伸缩层id,从sprop可伸缩性信息或sprop操作点信息参数中提供的各种操作点中进行选择。由此,应答器指示其在接收和/或发送流中选择的特定操作点。当发生这样的操作点选择时,即应答者在应答中包括可伸缩层id,媒体配置参数不得出现在应答中。相反,应答者将用于接收和/或发送的媒体配置是用于所选操作点的配置,如报价中所示。
Informative note: The ability to perform operation point selection enables a receiver to utilize the scalable nature of an SVC stream.
资料性说明:执行操作点选择的能力使接收器能够利用SVC流的可伸缩性。
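The ad hoc signaling of Level 1b discussed in the informative note on levels can be made concrete with a small comparison helper. This is a sketch under stated assumptions: the function names are hypothetical, profile-level-id is assumed to be six base16 digits (profile_idc, profile-iop, level_idc), and Level 1b is mapped to the value 10.5 purely so that it sorts between Level 1 (10) and Level 1.1 (11).

```python
def level_order(profile_level_id):
    """Return a sortable number for the level part of profile-level-id."""
    raw = bytes.fromhex(profile_level_id)
    if len(raw) != 3:
        raise ValueError("profile-level-id must be three bytes")
    profile_idc, profile_iop, level_idc = raw
    # Bit 4 of profile-iop (counting from the least significant bit).
    constraint_set3_flag = (profile_iop >> 4) & 1
    if profile_idc in (66, 77, 88):
        # Baseline, Main, Extended: Level 1b is level_idc 11 plus the flag.
        if level_idc == 11 and constraint_set3_flag:
            return 10.5
    elif level_idc == 9:
        # Other profiles: Level 1b is signaled by level_idc equal to 9.
        return 10.5
    return float(level_idc)


def answer_level_acceptable(offer_plid, answer_plid):
    """An answer may indicate a level equal to or lower than the offer."""
    return level_order(answer_plid) <= level_order(offer_plid)


print(level_order("42f00b"), level_order("42e00b"))
```

The helper compares levels only; matching the profile part of profile-level-id between offer and answer is a separate check that this sketch does not attempt.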
o The parameter max-recv-level, when present, declares the highest level supported for receiving. In case max-recv-level is not present, the highest level supported for receiving is equal to the default level indicated by the level part of profile-level-id. max-recv-level, when present, MUST be higher than the default level.
o 参数max-recv-level(如果存在)声明接收所支持的最高级别。如果max-recv-level不存在,则接收所支持的最高级别等于profile-level-id的级别部分所指示的默认级别。max-recv-level(如果存在)必须高于默认级别。
o The parameter max-recv-base-level, when present, declares the highest level of the base layer supported for receiving. When max-recv-base-level is not present, the highest level supported for the base layer is not constrained separately from the SVC stream containing the base layer. The endpoint at the other side MUST NOT send a scalable stream for which the base layer is of a level higher than max-recv-base-level. Parameters declaring receiver capabilities above the default level (max-mbps, max-smbps, max-fs, max-cpb, max-dpb, max-br, and max-recv-level) do not apply to the base layer when max-recv-base-level is present.
o The parameters sprop-deint-buf-req, sprop-interleaving-depth, sprop-max-don-diff, sprop-init-buf-time, sprop-mst-csdon-always-present, sprop-remux-buf-req, sprop-mst-remux-buf-size, sprop-remux-init-buf-time, sprop-mst-max-don-diff, sprop-scalability-info, sprop-operation-point-info, sprop-no-NAL-reordering-required, and sprop-avc-ready describe the properties of the NAL unit stream that the offerer or answerer is sending for the media format configuration. This differs from the normal usage of the Offer/Answer parameters: normally such parameters declare the properties of the stream that the offerer or the answerer is able to receive. When dealing with SVC, the offerer assumes that the answerer will be able to receive media encoded using the configuration being offered.
Informative note: The above parameters apply for any stream sent by the declaring entity with the same configuration; i.e., they are dependent on their source. Rather than being bound to the payload type, the values may have to be applied to another payload type when being sent, as they apply for the configuration.
o The capability parameters max-mbps, max-fs, max-cpb, max-dpb, max-br, redundant-pic-cap, and max-rcmd-nalu-size MAY be used to declare further capabilities of the offerer or answerer for receiving. These parameters MUST NOT be present when the direction attribute is sendonly, and the parameters describe the limitations of what the offerer or answerer accepts for receiving streams.
o When mst-mode is not present and packetization-mode is equal to 2, the following applies.
o An offerer has to include the size of the de-interleaving buffer, sprop-deint-buf-req, in the offer. To enable the offerer and answerer to inform each other about their capabilities for de-interleaving buffering, both parties are RECOMMENDED to include deint-buf-cap. It is also RECOMMENDED to consider offering multiple payload types with different buffering requirements when the capabilities of the receiver are unknown.
o When mst-mode is present and equal to "NI-C", "NI-TC", or "I-C", the following applies.
o An offerer has to include sprop-remux-buf-req in the offer. To enable the offerer and answerer to inform each other about their capabilities for re-multiplexing buffering, both parties are RECOMMENDED to include remux-buf-cap. It is also RECOMMENDED to consider offering multiple payload types with different buffering requirements when the capabilities of the receiver are unknown.
o The sprop-parameter-sets or sprop-level-parameter-sets parameter, when present (included in the "a=fmtp" line of SDP or conveyed using the "fmtp" source attribute as specified in Section 6.3 of [RFC5576]), is used for out-of-band transport of parameter sets. However, when out-of-band transport of parameter sets is used, parameter sets MAY still be additionally transported in-band.
The answerer MAY use either out-of-band or in-band transport of parameter sets for the stream it is sending, regardless of whether out-of-band parameter sets transport has been used in the offerer-to-answerer direction. Parameter sets included in an answer are independent of those parameter sets included in the offer, as they are used for decoding two different video streams, one from the answerer to the offerer, and the other in the opposite direction.
The following rules apply to transport of parameter sets in the offerer-to-answerer direction.
o An offer MAY include either or both of sprop-parameter-sets and sprop-level-parameter-sets. If neither sprop-parameter-sets nor sprop-level-parameter-sets is present in the offer, then only in-band transport of parameter sets is used.
o If the answer includes in-band-parameter-sets equal to 1, then the offerer MUST transmit parameter sets in-band. Otherwise, the following applies.
o If the level to use in the offerer-to-answerer direction is equal to the default level in the offer, the following applies.
The answerer MUST be prepared to use the parameter sets included in sprop-parameter-sets, when present, for decoding the incoming NAL unit stream, and ignore sprop-level-parameter-sets, when present.
When sprop-parameter-sets is not present in the offer, in-band transport of parameter sets MUST be used.
o Otherwise (the level to use in the offerer-to-answerer direction is not equal to the default level in the offer), the following applies.
The answerer MUST be prepared to use the parameter sets that are included in sprop-level-parameter-sets for the accepted level (i.e., the default level in the answer, which is also the level to use in the offerer-to-answerer direction), when present, for decoding the incoming NAL unit stream, and ignore all other parameter sets included in sprop-level-parameter-sets and sprop-parameter-sets, when present.
When no parameter sets for the accepted level are present in the sprop-level-parameter-sets, in-band transport of parameter sets MUST be used.
The following rules apply to transport of parameter sets in the answerer-to-offerer direction.
o An answer MAY include either sprop-parameter-sets or sprop-level-parameter-sets, but MUST NOT include both of the two. If neither sprop-parameter-sets nor sprop-level-parameter-sets is present in the answer, then only in-band transport of parameter sets is used.
o If the offer includes in-band-parameter-sets equal to 1, then the answerer MUST NOT include sprop-parameter-sets or sprop-level-parameter-sets in the answer and MUST transmit parameter sets in-band. Otherwise, the following applies.
o If the level to use in the answerer-to-offerer direction is equal to the default level in the answer, the following applies.
The offerer MUST be prepared to use the parameter sets included in sprop-parameter-sets, when present, for decoding the incoming NAL unit stream, and ignore sprop-level-parameter-sets, when present.
When sprop-parameter-sets is not present in the answer, the answerer MUST transmit parameter sets in-band.
o Otherwise (the level to use in the answerer-to-offerer direction is not equal to the default level in the answer), the following applies.
The offerer MUST be prepared to use the parameter sets that are included in sprop-level-parameter-sets for the level to use in the answerer-to-offerer direction, when present in the answer, for decoding the incoming NAL unit stream, and ignore all other parameter sets included in sprop-level-parameter-sets and sprop-parameter-sets, when present in the answer.
When no parameter sets for the level to use in the answerer-to-offerer direction are present in sprop-level-parameter-sets in the answer, the answerer MUST transmit parameter sets in-band.
When sprop-parameter-sets or sprop-level-parameter-sets is conveyed using the "fmtp" source attribute as specified in Section 6.3 of [RFC5576], the receiver of the parameters MUST store the parameter sets included in the sprop-parameter-sets or sprop-level-parameter-sets for the accepted level and associate them to the source given as a part of the "fmtp" source attribute. Parameter sets associated with one source MUST only be used to decode NAL units conveyed in RTP packets from the same source. When this mechanism is in use, SSRC collision detection and resolution MUST be performed as specified in [RFC5576].
Informative note: Conveyance of sprop-parameter-sets and sprop-level-parameter-sets using the "fmtp" source attribute may be used in topologies like Topo-Video-switch-MCU [RFC5117] to enable out-of-band transport of parameter sets.
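The parameter set transport rules above are symmetric in the two directions, so they can be summarized in one small decision function. The following is a minimal sketch rather than a normative procedure. The function name, the use of integer level values, and the pre-parsed dictionary form of sprop-level-parameter-sets (level mapped to its parameter sets) are assumptions made for illustration.

```python
from typing import Dict, List, Optional


def out_of_band_parameter_sets(
    level_in_use: int,
    default_level: int,
    receiver_wants_in_band: bool,
    sprop_parameter_sets: Optional[List[str]] = None,
    sprop_level_parameter_sets: Optional[Dict[int, List[str]]] = None,
) -> Optional[List[str]]:
    """Return the out-of-band parameter sets the receiver must be prepared
    to use, or None when only in-band transport applies.

    level_in_use and default_level refer to the SDP of the sending side;
    receiver_wants_in_band reflects in-band-parameter-sets=1 in the SDP
    of the receiving side. All names here are hypothetical.
    """
    # in-band-parameter-sets equal to 1 forces in-band transport.
    if receiver_wants_in_band:
        return None
    # Level equals the default level: use sprop-parameter-sets and
    # ignore sprop-level-parameter-sets.
    if level_in_use == default_level:
        return sprop_parameter_sets or None
    # Otherwise use only the entry for the level in use, if any.
    return (sprop_level_parameter_sets or {}).get(level_in_use) or None
```

A return value of None means that the sender has to transmit parameter sets in-band for that direction.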
For streams being delivered over multicast, the following rules apply:
o The media format configuration is identified by profile-level-id, including the level part, packetization-mode, and mst-mode. These media format configuration parameters (including the level part of profile-level-id) MUST be used symmetrically; i.e., the answerer MUST either maintain all configuration parameters or remove the media format (payload type) completely. Note that this implies that the level part of profile-level-id for Offer/Answer in multicast is not changeable.
To simplify handling and matching of these configurations, the same RTP payload type number used in the offer should also be used in the answer, as specified in [RFC3264]. An answer MUST NOT contain a payload type number used in the offer unless the configuration is the same as in the offer.
o Parameter sets received MUST be associated with the originating source, and MUST be only used in decoding the incoming NAL unit stream from the same source.
o The rules for other parameters are the same as above for unicast as long as the above rules are obeyed.
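The symmetry requirement for multicast can be checked mechanically. The sketch below assumes that offer and answer have already been parsed into dictionaries mapping payload type numbers to fmtp parameters; the function name and data layout are hypothetical. An absent packetization-mode is treated as 0, following the default in [RFC6184].

```python
from typing import Dict

CONFIG_PARAMS = ("profile-level-id", "packetization-mode", "mst-mode")
DEFAULTS = {"packetization-mode": "0"}  # absent means single NAL unit mode

Fmtp = Dict[str, str]


def _config(params: Fmtp) -> tuple:
    """Extract the media format configuration, normalizing case."""
    return tuple(
        (params.get(name) or DEFAULTS.get(name) or "").lower()
        for name in CONFIG_PARAMS
    )


def multicast_answer_is_valid(offer: Dict[int, Fmtp], answer: Dict[int, Fmtp]) -> bool:
    """True if every offered payload type kept in the answer keeps its
    configuration unchanged; dropping a payload type is always allowed."""
    for payload_type, answered in answer.items():
        offered = offer.get(payload_type)
        if offered is not None and _config(offered) != _config(answered):
            return False
    return True
```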
Table 14 lists the interpretation of all the parameters that MUST be used for the various combinations of offer, answer, and direction attributes. Note that the two columns wherein the scalable-layer-id parameter is used only apply to answers, whereas the other columns apply to both offers and answers.
Table 14. Interpretation of parameters for various combinations of offers, answers, direction attributes, with and without scalable-layer-id. Columns that do not indicate offer or answer apply to both.
                                     sendonly --+
         answer: recvonly,scalable-layer-id --+ |
           recvonly w/o scalable-layer-id --+ | |
    answer: sendrecv, scalable-layer-id --+ | | |
       sendrecv w/o scalable-layer-id --+ | | | |
                                        | | | | |
   profile-level-id                     C X C X P
   max-recv-level                       R R R R -
   max-recv-base-level                  R R R R -
   packetization-mode                   C X C X P
   mst-mode                             C X C X P
   sprop-avc-ready                      P P - - P
   sprop-deint-buf-req                  P P - - P
   sprop-init-buf-time                  P P - - P
   sprop-interleaving-depth             P P - - P
   sprop-max-don-diff                   P P - - P
   sprop-mst-csdon-always-present       P P - - P
   sprop-mst-max-don-diff               P P - - P
   sprop-mst-remux-buf-size             P P - - P
   sprop-no-NAL-reordering-required     P P - - P
   sprop-operation-point-info           P P - - P
   sprop-remux-buf-req                  P P - - P
   sprop-remux-init-buf-time            P P - - P
   sprop-scalability-info               P P - - P
   deint-buf-cap                        R R R R -
   max-br                               R R R R -
   max-cpb                              R R R R -
   max-dpb                              R R R R -
   max-fs                               R R R R -
   max-mbps                             R R R R -
   max-rcmd-nalu-size                   R R R R -
   redundant-pic-cap                    R R R R -
   remux-buf-cap                        R R R R -
   in-band-parameter-sets               R R R R -
   sprop-parameter-sets                 S S - - S
   sprop-level-parameter-sets           S S - - S
   scalable-layer-id                    X O X O -
Legend:

   C: configuration for sending and receiving streams
   P: properties of the stream to be sent
   R: receiver capabilities
   S: out-of-band parameter sets
   O: operation point selection
   X: MUST NOT be present
   -: not usable, when present SHOULD be ignored
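For implementers, Table 14 can be carried as a small lookup structure. The sketch below encodes the table as given above; the column keys (for example "answer-sendrecv-sli") and the function name are hypothetical labels chosen for this illustration, not identifiers defined by this memo.

```python
# Column order follows Table 14, left to right.
COLUMNS = (
    "sendrecv",             # sendrecv w/o scalable-layer-id
    "answer-sendrecv-sli",  # answer: sendrecv, scalable-layer-id
    "recvonly",             # recvonly w/o scalable-layer-id
    "answer-recvonly-sli",  # answer: recvonly, scalable-layer-id
    "sendonly",
)

# Rows grouped by their (identical) cell patterns.
_ROWS = {
    "CXCXP": ("profile-level-id", "packetization-mode", "mst-mode"),
    "RRRR-": (
        "max-recv-level", "max-recv-base-level", "deint-buf-cap", "max-br",
        "max-cpb", "max-dpb", "max-fs", "max-mbps", "max-rcmd-nalu-size",
        "redundant-pic-cap", "remux-buf-cap", "in-band-parameter-sets",
    ),
    "PP--P": (
        "sprop-avc-ready", "sprop-deint-buf-req", "sprop-init-buf-time",
        "sprop-interleaving-depth", "sprop-max-don-diff",
        "sprop-mst-csdon-always-present", "sprop-mst-max-don-diff",
        "sprop-mst-remux-buf-size", "sprop-no-NAL-reordering-required",
        "sprop-operation-point-info", "sprop-remux-buf-req",
        "sprop-remux-init-buf-time", "sprop-scalability-info",
    ),
    "SS--S": ("sprop-parameter-sets", "sprop-level-parameter-sets"),
    "XOXO-": ("scalable-layer-id",),
}

TABLE_14 = {name: cells for cells, names in _ROWS.items() for name in names}


def interpretation(parameter: str, column: str) -> str:
    """Return the legend letter (C, P, R, S, O, X, or -) for one cell."""
    return TABLE_14[parameter][COLUMNS.index(column)]
```

For instance, interpretation("profile-level-id", "answer-sendrecv-sli") yields "X", reflecting that media configuration parameters MUST NOT be present in an answer that carries scalable-layer-id.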
Parameters used for declaring receiver capabilities are in general downgradable; i.e., they express the upper limit for a sender's possible behavior. Thus, a sender MAY choose to configure its encoder using values that are lower than or equal to the declared values of these parameters.
Parameters declaring a configuration point are not changeable, with the exception of the level part of the profile-level-id parameter for unicast usage. These parameters express values that a receiver expects to be used, and they must be used verbatim on the sender side. If level downgrading (for profile-level-id) is used, an answerer MUST NOT include the scalable-layer-id parameter.
When a sender's capabilities are declared, and non-downgradable parameters are used in this declaration, then these parameters express a configuration that is acceptable for the sender to receive streams. In order to achieve high interoperability levels, it is often advisable to offer multiple alternative configurations, e.g., for the packetization mode. It is impossible to offer multiple configurations in a single payload type. Thus, when multiple configuration offers are made, each offer requires its own RTP payload type associated with the offer.
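Because one payload type can carry only one configuration, offering alternatives means allocating one payload type per alternative. A minimal sketch of generating such an offer follows; the function name, the starting payload type of 96, and the port number are assumptions for illustration only.

```python
from typing import List, Sequence


def offer_alternatives(
    profile_level_id: str,
    packetization_modes: Sequence[int],
    first_payload_type: int = 96,
    port: int = 20000,
) -> List[str]:
    """Build SDP lines offering one H264-SVC payload type per
    packetization mode (one configuration per payload type)."""
    payload_types = []
    attribute_lines = []
    for offset, mode in enumerate(packetization_modes):
        payload_type = first_payload_type + offset
        payload_types.append(str(payload_type))
        attribute_lines.append(f"a=rtpmap:{payload_type} H264-SVC/90000")
        attribute_lines.append(
            f"a=fmtp:{payload_type} profile-level-id={profile_level_id}; "
            f"packetization-mode={mode}"
        )
    media_line = f"m=video {port} RTP/AVP " + " ".join(payload_types)
    return [media_line] + attribute_lines
```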
A receiver SHOULD understand all media type parameters, even if it only supports a subset of the payload format's functionality. This ensures that a receiver is capable of understanding when an offer to receive media can be downgraded to what is supported by the receiver of the offer.
An answerer MAY extend the offer with additional media format configurations. However, to enable their usage, in most cases a second offer is required from the offerer to provide the stream property parameters that the media sender will use. This also has the effect that the offerer has to be able to receive this media format configuration, not only to send it.
If an offerer wishes to have non-symmetric capabilities between sending and receiving, the offerer can allow asymmetric levels via level-asymmetry-allowed equal to 1. Alternatively, the offerer can offer different RTP sessions, i.e., different media lines declared as "recvonly" and "sendonly", respectively. This may have further implications on the system, and may require additional external semantics to associate the two media lines.
If MST is used, the rules on signaling media decoding dependency in SDP as defined in [RFC5583] apply. The rules on "hierarchical or layered encoding" with multicast in Section 5.7 of [RFC4566] do not apply, i.e., the notation for Connection Data "c=" SHALL NOT be used with more than one address. Additionally, the order of dependencies of the RTP sessions indicated by the "a=depend" attribute as defined in [RFC5583] MUST represent the decoding order of the VCL NAL units in an access unit, i.e., the order of session dependency is given from the base or the lowest enhancement RTP session (the most important) to the highest enhancement RTP session (the least important).
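The ordering requirement can be illustrated by generating "a=depend" lines from a list of sessions that is already sorted from the base session to the highest enhancement session. This is a simplified sketch: it assumes one payload type per session and that each session depends on all lower sessions, and the line layout is modeled on the examples in this memo rather than derived from the full [RFC5583] grammar. The function name is hypothetical.

```python
from typing import List, Tuple


def depend_lines(sessions: List[Tuple[str, int]]) -> List[str]:
    """sessions: (mid, payload type) pairs ordered from the base session
    to the highest enhancement session. Returns one a=depend line for
    each enhancement session, listing lower sessions in decoding order."""
    lines = []
    for index in range(1, len(sessions)):
        _, payload_type = sessions[index]
        dependencies = " ".join(f"{mid}:{pt}" for mid, pt in sessions[:index])
        lines.append(f"a=depend:{payload_type} lay {dependencies}")
    return lines
```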
When SVC over RTP is offered with SDP in a declarative style, as in Real Time Streaming Protocol (RTSP) [RFC2326] or Session Announcement Protocol (SAP) [RFC2974], the following considerations are necessary.
o All parameters capable of indicating both stream properties and receiver capabilities are used to indicate only stream properties. For example, in this case, the parameter profile-level-id declares the values used by the stream, not the capabilities for receiving streams. As a result, the following interpretation of the parameters MUST be used:
Declaring actual configuration or stream properties:
   - profile-level-id
   - packetization-mode
   - mst-mode
   - sprop-deint-buf-req
   - sprop-interleaving-depth
   - sprop-max-don-diff
   - sprop-init-buf-time
   - sprop-mst-csdon-always-present
   - sprop-mst-remux-buf-size
   - sprop-remux-buf-req
   - sprop-remux-init-buf-time
   - sprop-mst-max-don-diff
   - sprop-scalability-info
   - sprop-operation-point-info
   - sprop-no-NAL-reordering-required
   - sprop-avc-ready
Out-of-band transporting of parameter sets:
   - sprop-parameter-sets
   - sprop-level-parameter-sets
Not usable (when present, they SHOULD be ignored):
   - max-mbps
   - max-fs
   - max-cpb
   - max-dpb
   - max-br
   - max-recv-level
   - max-recv-base-level
   - redundant-pic-cap
   - max-rcmd-nalu-size
   - deint-buf-cap
   - remux-buf-cap
   - scalable-layer-id
o A receiver of the SDP is required to support all parameters and values of the parameters provided; otherwise, the receiver MUST reject (RTSP) or not participate in (SAP) the session. It falls on the creator of the session to use values that are expected to be supported by the receiving application.
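In declarative usage, a receiver can simply drop the parameters classified above as not usable before interpreting the rest. A minimal sketch follows, assuming the fmtp parameters are already available as a dictionary; the function name is hypothetical.

```python
from typing import Dict

# Parameters that are not usable in declarative session descriptions
# and SHOULD be ignored when present.
NOT_USABLE_DECLARATIVE = frozenset({
    "max-mbps", "max-fs", "max-cpb", "max-dpb", "max-br",
    "max-recv-level", "max-recv-base-level", "redundant-pic-cap",
    "max-rcmd-nalu-size", "deint-buf-cap", "remux-buf-cap",
    "scalable-layer-id",
})


def usable_declarative_params(fmtp: Dict[str, str]) -> Dict[str, str]:
    """Return only the parameters that carry meaning in RTSP/SAP usage."""
    return {
        name: value
        for name, value in fmtp.items()
        if name not in NOT_USABLE_DECLARATIVE
    }
```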
In the following examples, "{data}" is used to indicate a data string encoded as base64.
Example 1: The offerer offers one video media description including two RTP payload types. The first payload type offers H264, and the second offers H264-SVC. The two payload types carry different values for fmtp parameters such as profile-level-id, packetization-mode, and sprop-parameter-sets.
Offerer -> Answerer SDP message:

      m=video 20000 RTP/AVP 97 96
      a=rtpmap:96 H264/90000
      a=fmtp:96 profile-level-id=4de00a; packetization-mode=0;
         sprop-parameter-sets={sps0},{pps0};
      a=rtpmap:97 H264-SVC/90000
      a=fmtp:97 profile-level-id=53000c; packetization-mode=1;
         sprop-parameter-sets={sps0},{pps0},{sps1},{pps1};

If the answerer does not support media subtype H264-SVC, it can issue an answer accepting only the base layer offer (payload type 96). In the following example, the receiver supports H264-SVC, so it lists payload type 97 first as the preferred option.
Answerer -> Offerer SDP message:

      m=video 40000 RTP/AVP 97 96
      a=rtpmap:96 H264/90000
      a=fmtp:96 profile-level-id=4de00a; packetization-mode=0;
         sprop-parameter-sets={sps2},{pps2};
      a=rtpmap:97 H264-SVC/90000
      a=fmtp:97 profile-level-id=53000c; packetization-mode=1;
         sprop-parameter-sets={sps2},{pps2},{sps3},{pps3};

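To work with descriptions like the ones in Example 1, an implementation first has to split each "a=fmtp" line into its parameters. The sketch below shows one simple way to do this for an fmtp line that has been unfolded onto a single line; the function name is hypothetical, and only the first "=" in each item is treated as the separator so that base64 padding in values is preserved.

```python
from typing import Dict, Tuple


def parse_fmtp(line: str) -> Tuple[int, Dict[str, str]]:
    """Split 'a=fmtp:<pt> name=value; name=value; ...' into the payload
    type number and a dictionary of parameters."""
    prefix = "a=fmtp:"
    if not line.startswith(prefix):
        raise ValueError("not an a=fmtp line")
    payload_type, _, rest = line[len(prefix):].partition(" ")
    params: Dict[str, str] = {}
    for item in rest.split(";"):
        item = item.strip()
        if not item:
            continue  # tolerate the trailing ';' used in the examples
        name, _, value = item.partition("=")
        params[name.strip()] = value.strip()
    return int(payload_type), params
```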
7.3.2. Example for Offering a Single SVC Session Using scalable-layer-id
Example 2: Offerer offers the same media configurations as shown in the example above for receiving and sending the stream, but using a single RTP payload type and including sprop-operation-point-info.
Offerer -> Answerer SDP message:

      m=video 20000 RTP/AVP 97
      a=rtpmap:97 H264-SVC/90000
      a=fmtp:97 profile-level-id=53000c; packetization-mode=1;
         sprop-parameter-sets={sps0},{sps1},{pps0},{pps1};
         sprop-operation-point-info=<1,0,0,0,4de00a,3200,176,144,128,
         256>,<2,1,1,0,53000c,6400,352,288,256,512>;

In this example, the receiver supports H264-SVC and chooses the lower operation point offered in the RTP payload type for sending and receiving the stream.
Answerer -> Offerer SDP message:

      m=video 40000 RTP/AVP 97
      a=rtpmap:97 H264-SVC/90000
      a=fmtp:97 sprop-parameter-sets={sps2},{sps3},{pps2},{pps3};
         scalable-layer-id=1;

In an equivalent example showing the use of sprop-scalability-info instead of sprop-operation-point-info, the sprop-operation-point-info parameter would be replaced by sprop-scalability-info, followed by the binary (base16) representation of the Scalability Information SEI message.
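The sprop-operation-point-info value used in Example 2 is a list of angle-bracketed vectors, which an answerer has to parse before it can pick a scalable-layer-id. The following sketch parses the value as it appears in the example. The field names are assumptions inferred from the values shown (they are not taken from this section), and the sketch assumes that all ten fields are present in every vector.

```python
import re
from typing import Dict, List, Union

# Assumed field names, in the order the values appear in the examples.
FIELDS = (
    "layer_id", "temporal_id", "dependency_id", "quality_id",
    "profile_level_id", "avg_framerate", "width", "height",
    "avg_bitrate", "max_bitrate",
)

OperationPoint = Dict[str, Union[int, str]]


def parse_operation_points(value: str) -> List[OperationPoint]:
    """Parse '<...>,<...>' into one dictionary per operation point.
    Whitespace inside a vector (e.g. from line folding) is tolerated."""
    points: List[OperationPoint] = []
    for body in re.findall(r"<([^>]*)>", value):
        raw = [field.strip() for field in body.split(",")]
        if len(raw) != len(FIELDS):
            raise ValueError(f"expected {len(FIELDS)} fields, got {len(raw)}")
        point: OperationPoint = {}
        for name, text in zip(FIELDS, raw):
            # profile-level-id is hexadecimal text; the rest are integers.
            point[name] = text.lower() if name == "profile_level_id" else int(text)
        points.append(point)
    return points
```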
Example 3: In this example, the offerer offers a multi-session transmission with up to three sessions. The base session media description includes payload types that are backward compatible with [RFC6184], and three different payload types are offered. The other two media descriptions use payload types with media subtype H264-SVC. In each media description, different values of profile-level-id, packetization-mode, mst-mode, and sprop-parameter-sets are offered.
Offerer -> Answerer SDP message:

      a=group:DDP L1 L2 L3
      m=video 20000 RTP/AVP 96 97 98
      a=rtpmap:96 H264/90000
      a=fmtp:96 profile-level-id=4de00a; packetization-mode=0;
         mst-mode=NI-T; sprop-parameter-sets={sps0},{pps0};
      a=rtpmap:97 H264/90000
      a=fmtp:97 profile-level-id=4de00a; packetization-mode=1;
         mst-mode=NI-TC; sprop-parameter-sets={sps0},{pps0};
      a=rtpmap:98 H264/90000
      a=fmtp:98 profile-level-id=4de00a; packetization-mode=2;
         mst-mode=I-C; sprop-init-buf-time=156320;
         sprop-parameter-sets={sps0},{pps0};
      a=mid:L1
      m=video 20002 RTP/AVP 99 100
      a=rtpmap:99 H264-SVC/90000
      a=fmtp:99 profile-level-id=53000c; packetization-mode=1;
         mst-mode=NI-T; sprop-parameter-sets={sps1},{pps1};
      a=rtpmap:100 H264-SVC/90000
      a=fmtp:100 profile-level-id=53000c; packetization-mode=2;
         mst-mode=I-C; sprop-parameter-sets={sps1},{pps1};
      a=mid:L2
      a=depend:99 lay L1:96,97; 100 lay L1:98
      m=video 20004 RTP/AVP 101
      a=rtpmap:101 H264-SVC/90000
      a=fmtp:101 profile-level-id=53001F; packetization-mode=1;
         mst-mode=NI-T; sprop-parameter-sets={sps2},{pps2};
      a=mid:L3
      a=depend:101 lay L1:96,97 L2:99

It is assumed that in this example the answerer only supports the NI-T mode for multi-session transmission. For this reason, it chooses the corresponding payload type (96) for the base RTP session. For the two enhancement RTP sessions, the answerer also chooses the payload types that use the NI-T mode (99 and 101).
Answerer -> Offerer SDP message:

      a=group:DDP L1 L2 L3
      m=video 40000 RTP/AVP 96
      a=rtpmap:96 H264/90000
      a=fmtp:96 profile-level-id=4de00a; packetization-mode=0;
         mst-mode=NI-T; sprop-parameter-sets={sps3},{pps3};
      a=mid:L1
      m=video 40002 RTP/AVP 99
      a=rtpmap:99 H264-SVC/90000
      a=fmtp:99 profile-level-id=53000c; packetization-mode=1;
         mst-mode=NI-T; sprop-parameter-sets={sps4},{pps4};
      a=mid:L2
      a=depend:99 lay L1:96
      m=video 40004 RTP/AVP 101
      a=rtpmap:101 H264-SVC/90000
      a=fmtp:101 profile-level-id=53001F; packetization-mode=1;
         mst-mode=NI-T; sprop-parameter-sets={sps5},{pps5};
      a=mid:L3
      a=depend:101 lay L1:96 L2:99

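The answerer's choice in Example 3 amounts to filtering the offered payload types of each session by the MST modes it supports. A minimal sketch follows; the data layout (a list of session identifiers paired with a payload-type-to-mst-mode mapping) and the function name are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def select_by_mst_mode(
    offered: List[Tuple[str, Dict[int, str]]],
    supported_modes: Set[str],
) -> Dict[str, List[int]]:
    """For each session (identified by its mid), keep the payload types
    whose mst-mode is supported by the answerer."""
    return {
        mid: [pt for pt, mode in payload_types.items() if mode in supported_modes]
        for mid, payload_types in offered
    }
```

With the offer of Example 3 and only NI-T supported, this keeps payload type 96 in L1, 99 in L2, and 101 in L3, which matches the answer shown above.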
7.3.4. Example for Offering Multiple Sessions in MST Including Operation with Answerer Using scalable-layer-id
Example 4: In this example, the offerer offers a multi-session transmission of three layers with up to two sessions. The base session media description has a payload type that is backward compatible with [RFC6184]. Note that no parameter sets are provided, in which case in-band transport must be used. The other media description contains two enhancement layers and uses the media subtype H264-SVC. It includes two operation point definitions.
Offerer -> Answerer SDP message:

      a=group:DDP L1 L2
      m=video 20000 RTP/AVP 96
      a=rtpmap:96 H264/90000
      a=fmtp:96 profile-level-id=4de00a; packetization-mode=0;
         mst-mode=NI-T;
      a=mid:L1
      m=video 20002 RTP/AVP 97
      a=rtpmap:97 H264-SVC/90000
      a=fmtp:97 profile-level-id=53001F; packetization-mode=1;
         mst-mode=NI-TC; sprop-operation-point-info=<2,0,1,0,53000c,
         3200,352,288,384,512>,<3,1,2,0,53001F,6400,704,576,768,1024>;
      a=mid:L2
      a=depend:97 lay L1:96

It is assumed that the answerer wants to send and receive the base layer (payload type 96), but it only wants to send and receive the lower enhancement layer, i.e., the one with layer id equal to 2. For this reason, the response will include the selection of the desired layer by setting scalable-layer-id equal to 2. Note that the fmtp line for payload type 97 in the answer carries only scalable-layer-id; the answer could additionally include sprop-parameter-sets.
Answerer -> Offerer SDP message:

      a=group:DDP L1 L2
      m=video 40000 RTP/AVP 96
      a=rtpmap:96 H264/90000
      a=fmtp:96 profile-level-id=4de00a; packetization-mode=0;
         mst-mode=NI-T;
      a=mid:L1
      m=video 40002 RTP/AVP 97
      a=rtpmap:97 H264-SVC/90000
      a=fmtp:97 scalable-layer-id=2;
      a=mid:L2
      a=depend:97 lay L1:96

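The answer in Example 4 follows the rule that media configuration parameters are omitted once scalable-layer-id is present. A sketch of building such an fmtp line is shown below; the function name and arguments are hypothetical, and the set of layer ids offered is assumed to have been extracted from the offer beforehand.

```python
from typing import Dict, Iterable, Optional

# Media configuration parameters that MUST NOT appear next to
# scalable-layer-id in an answer.
CONFIG_PARAMS = ("profile-level-id", "packetization-mode", "mst-mode")


def answer_fmtp_with_selection(
    payload_type: int,
    layer_id: int,
    offered_layer_ids: Iterable[int],
    extra: Optional[Dict[str, str]] = None,
) -> str:
    """Build the answer's a=fmtp line selecting one offered operation point."""
    if layer_id not in set(offered_layer_ids):
        raise ValueError("selected layer id was not offered")
    params = dict(extra or {})
    for name in CONFIG_PARAMS:
        params.pop(name, None)  # drop configuration parameters if supplied
    params["scalable-layer-id"] = str(layer_id)
    body = " ".join(f"{name}={value};" for name, value in params.items())
    return f"a=fmtp:{payload_type} {body}"
```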
7.3.5. Example for Negotiating an SVC Stream with a Constrained Base Layer in SST
Example 5: The offerer (Alice) offers one video description including two RTP payload types with differing levels and packetization modes.
Offerer -> Answerer SDP message:
      m=video 20000 RTP/AVP 97 96
      a=rtpmap:96 H264-SVC/90000
      a=fmtp:96 profile-level-id=53001e; packetization-mode=0;
      a=rtpmap:97 H264-SVC/90000
      a=fmtp:97 profile-level-id=53001f; packetization-mode=1;
The answerer (Bridge) chooses packetization mode 1, and indicates that it would receive an SVC stream with the base layer being constrained.
Answerer -> Offerer SDP message:
      m=video 40000 RTP/AVP 97
      a=rtpmap:97 H264-SVC/90000
      a=fmtp:97 profile-level-id=53001f; packetization-mode=1;
         max-recv-base-level=000d
The answering endpoint must send an SVC stream at Level 3.1. Since the offering endpoint did not declare max-recv-base-level, the base layer of the SVC stream the answering endpoint must send is not specifically constrained. The offering endpoint (Alice) must send an SVC stream at Level 3.1, for which the base layer must be of a level not higher than Level 1.3.
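The levels quoted above can be read directly from the hexadecimal parameter values. The following Python sketch assumes that profile-level-id carries three bytes (profile_idc, profile-iop, level_idc) as in [RFC6184], and that max-recv-base-level carries two bytes of which the second is level_idc. The latter is an assumption that is consistent with the value 000d and the stated Level 1.3. The function names are hypothetical, and the special signaling of Level 1b is deliberately not handled.

```python
from typing import Tuple


def decode_profile_level_id(value: str) -> Tuple[int, int, int]:
    """Return (profile_idc, profile_iop, level_idc) from six hex digits."""
    raw = bytes.fromhex(value)
    if len(raw) != 3:
        raise ValueError("profile-level-id must be three bytes")
    return raw[0], raw[1], raw[2]


def decode_max_recv_base_level(value: str) -> Tuple[int, int]:
    """Return (flags_byte, level_idc) from four hex digits (assumed layout)."""
    raw = bytes.fromhex(value)
    if len(raw) != 2:
        raise ValueError("max-recv-base-level must be two bytes")
    return raw[0], raw[1]


def level_label(level_idc: int) -> str:
    """Map level_idc to a level number, e.g., 31 -> '3.1' (Level 1b ignored)."""
    major, minor = divmod(level_idc, 10)
    return f"{major}.{minor}" if minor else str(major)


_, _, stream_level = decode_profile_level_id("53001f")
_, base_level = decode_max_recv_base_level("000d")
print(level_label(stream_level), level_label(base_level))
```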
Section 8.4 of [RFC6184] applies in this memo. The following applies additionally for multi-session transmission (MST).
In MST, regardless of whether out-of-band or in-band transport of parameter sets is in use, parameter sets required for decoding NAL units carried in one particular RTP session SHOULD be carried in the same session, MAY be carried in a session that the particular RTP session depends on, and MUST NOT be carried in a session that the particular RTP session does not depend on.
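The placement rule above can be expressed as a small check. The following Python sketch is illustrative only. It assumes that session dependencies are known as a mapping from a session label to the labels it directly depends on (for example, taken from "a=depend" lines), and that "depends on" is read transitively, which is an interpretation made for this sketch. The function names are hypothetical.

```python
from typing import Dict, Set


def transitive_dependencies(session: str,
                            depends_on: Dict[str, Set[str]]) -> Set[str]:
    """Collect every session that `session` depends on, directly or not."""
    seen: Set[str] = set()
    stack = list(depends_on.get(session, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(depends_on.get(current, ()))
    return seen


def parameter_set_placement(nal_session: str, ps_session: str,
                            depends_on: Dict[str, Set[str]]) -> str:
    """Classify carrying a parameter set in `ps_session` when the NAL units
    that need it are carried in `nal_session`."""
    if ps_session == nal_session:
        return "SHOULD"
    if ps_session in transitive_dependencies(nal_session, depends_on):
        return "MAY"
    return "MUST NOT"


# Hypothetical three-session layout: L3 depends on L2, which depends on L1.
deps = {"L2": {"L1"}, "L3": {"L2"}}
print(parameter_set_placement("L2", "L1", deps))
```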
The security considerations of the RTP Payload Format for H.264 Video specification [RFC6184] apply. Additionally, the following applies.
Decoders MUST exercise caution with respect to the handling of reserved NAL unit types and reserved SEI messages, particularly if they contain active elements, and MUST restrict their domain of applicability to the presentation containing the stream. The safest way is to simply discard these NAL units and SEI messages.
When integrity protection is applied to a stream, care MUST be taken that the stream being transported may be scalable; hence a receiver may be able to access only part of the entire stream.
End-to-end security with either authentication, integrity, or confidentiality protection will prevent a MANE from performing media-aware operations other than discarding complete packets. In the case of confidentiality protection, the MANE will even be prevented from discarding packets in a media-aware way. To allow any MANE to perform its operations, it will be required to be a trusted entity that is included in the security context establishment. This applies both for the media path and for the RTCP path, if RTCP packets need to be rewritten.
Within any given RTP session carrying payload according to this specification, the provisions of Section 10 of [RFC6184] apply. Reducing the session bitrate is possible by one or more of the following means:
a) Within the highest layer identified by the DID field, remove any NAL units with QID higher than a certain value.
b) Remove all NAL units with TID higher than a certain value.
c) Remove all NAL units associated with a DID higher than a certain value.
Informative note: Removal of all coded slice NAL units associated with DIDs higher than a certain value in the entire stream is required in order to preserve conformance of the resulting SVC stream.
d) Utilize the PRID field to indicate the relative importance of NAL units, and remove all NAL units associated with a PRID higher than a certain value. Note that the use of the PRID is application-specific.
e) Remove NAL units or entire packets according to application-specific rules. The result will depend on the particular coding structure used as well as any additional application-specific functionality (e.g., concealment performed at the receiving decoder). In general, this will result in the reception of a non-conforming bitstream and hence the decoder behavior is not specified by [H.264]. Significant artifacts may therefore appear in the decoded output if the particular decoder implementation does not take appropriate action in response to congestion control.
Informative note: The discussion above is centered on NAL units rather than packets, primarily because that is the level where senders can meaningfully manipulate the scalable bitstream. The mapping of NAL units to RTP packets is fairly flexible when using aggregation packets. Depending on the nature of the congestion control algorithm, the "dimension" of congestion measurement (packet count or bitrate) and reaction to it (reducing packet count or bitrate or both) can be adjusted accordingly.
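Means a) through d) can be sketched as a filter over NAL units. The following Python sketch assumes the three-byte SVC NAL unit header extension layout (R, I, PRID; N, DID, QID; TID, U, D, O, RR) and works only on the PRID, DID, QID, and TID values. The names SvcNalInfo, parse_svc_header_extension, and thin_stream are hypothetical. Means e) is application specific and is not covered.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class SvcNalInfo:
    prid: int  # priority id, 6 bits
    did: int   # dependency id, 3 bits
    qid: int   # quality id, 4 bits
    tid: int   # temporal id, 3 bits


def parse_svc_header_extension(ext: bytes) -> SvcNalInfo:
    """Extract PRID, DID, QID, and TID from the three extension bytes."""
    if len(ext) < 3:
        raise ValueError("header extension needs three bytes")
    b0, b1, b2 = ext[0], ext[1], ext[2]
    return SvcNalInfo(prid=b0 & 0x3F,
                      did=(b1 >> 4) & 0x07,
                      qid=b1 & 0x0F,
                      tid=(b2 >> 5) & 0x07)


def thin_stream(nals: Iterable[SvcNalInfo],
                max_did: Optional[int] = None,
                max_qid_in_top_did: Optional[int] = None,
                max_tid: Optional[int] = None,
                max_prid: Optional[int] = None) -> List[SvcNalInfo]:
    """Apply means c), b), d), and a), in that order; None disables a rule."""
    kept = list(nals)
    if max_did is not None:    # c) drop higher dependency layers
        kept = [n for n in kept if n.did <= max_did]
    if max_tid is not None:    # b) drop higher temporal layers
        kept = [n for n in kept if n.tid <= max_tid]
    if max_prid is not None:   # d) drop by application-specific priority
        kept = [n for n in kept if n.prid <= max_prid]
    if max_qid_in_top_did is not None and kept:
        # a) within the highest remaining DID, drop higher quality layers
        top = max(n.did for n in kept)
        kept = [n for n in kept
                if n.did < top or n.qid <= max_qid_in_top_did]
    return kept


stream = [SvcNalInfo(0, 0, 0, 0), SvcNalInfo(1, 1, 0, 1),
          SvcNalInfo(2, 1, 1, 2), SvcNalInfo(3, 2, 0, 0)]
thinned = thin_stream(stream, max_did=1, max_qid_in_top_did=0)
print(thinned)
```

Whether a given combination of limits yields a conforming SVC stream depends on the coding structure; the sketch does not check conformance.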
All aforementioned means are available to the RTP sender, regardless of whether that sender is located in the sending endpoint or in a mixer-based MANE.
When a translator-based MANE is employed, then the MANE MAY manipulate the session only on the MANE's outgoing path, so that the sensed end-to-end congestion falls within the permissible envelope. As with all translators, in this case, the MANE needs to rewrite RTCP RRs to reflect the manipulations it has performed on the session.
Informative note: Applications MAY also implement, in addition or separately, other congestion control mechanisms, e.g., as described in [RFC5775] and [Yan].
A new media type, as specified in Section 7.1 of this memo, has been registered with IANA.
Scalable video coding is a concept that has been around since at least MPEG-2 [MPEG2], which goes back as early as 1993. Nevertheless, it has never gained wide acceptance, perhaps partly because applications didn't materialize in the form envisioned during standardization.
ISO/IEC MPEG and ITU-T VCEG, respectively, performed a requirement analysis for the SVC project. The MPEG and VCEG requirement documents are available in [JVT-N026] and [JVT-N027], respectively.
The following introduces four main application scenarios that the authors consider relevant and that are implementable with this specification.
This well-understood form of the use of layered coding [McCanne] implies that all layers are individually conveyed in their own RTP packet streams, each carried in its own RTP session using the IP (multicast) address and port number as the single demultiplexing point. Receivers "tune" into the layers by subscribing to the IP multicast, normally by using IGMP [IGMP]. Depending on the application scenario, it is also possible to convey a number of layers in one RTP session, when finer operation points within the subset of layers are not needed.
Layered multicast has the great advantage of simplicity and easy implementation. However, it also has the great disadvantage of utilizing many different transport addresses. While the authors consider this not to be a major problem for a professionally maintained content server, receiving client endpoints need to open many ports to IP multicast addresses in their firewalls. This is a practical problem from a firewall and network address translation (NAT) viewpoint. Furthermore, even today IP multicast is not as widely deployed as many wish.
The authors consider layered multicast an important application scenario for the following reasons. First, it is well understood and the implementation constraints are well known. Second, there may well be large-scale IP networks outside the immediate Internet context that may wish to employ layered multicast in the future. One possible example could be a combination of content creation and core-network distribution for the various mobile TV services, e.g., those being developed by 3GPP (MBMS) [MBMS] and DVB (DVB-H) [DVB-H].
In this scenario, a streaming server has a repository of stored SVC coded layers for a given content. At the time of streaming, and according to the capabilities, connectivity, and congestion situation of the client(s), the streaming server generates and serves a scalable stream. Both unicast and multicast serving is possible. At the same time, the streaming server may use the same repository of stored layers to compose different streams (with a different set of layers) intended for other audiences.
As every endpoint receives only a single SVC RTP session, the number of firewall pinholes can be optimized to one.
The main difference between this scenario and straightforward simulcasting lies in the architecture and the requirements of the streaming server, and is therefore out of the scope of IETF standardization. However, compelling arguments can be made why such a streaming server design makes sense. One possible argument is related to storage space and channel bandwidth. Another is bandwidth adaptability without transcoding -- a considerable advantage in a congestion controlled network. When the streaming server learns about congestion, it can reduce the sending bitrate by choosing fewer layers when composing the layered stream; see Section 9. SVC is designed to gracefully support both bandwidth ramp-down and bandwidth ramp-up with a considerable dynamic range. This payload format is designed to allow for bandwidth flexibility in the mentioned sense. While, in theory, a transcoding step could achieve a similar dynamic range, the computational demands are impractically high and video quality is typically lowered -- therefore, few (if any) streaming servers implement full transcoding.
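The bandwidth adaptation described above amounts to choosing, from the stored layers, the highest operation point that fits the currently available rate. The following Python sketch illustrates one way a server might do this. The StoredPoint type, the pick_operation_point function, and the bitrate figures are hypothetical, and how the available rate is estimated is out of scope.

```python
from typing import NamedTuple, Optional, Sequence


class StoredPoint(NamedTuple):
    layer_id: int
    max_bitrate_kbps: int  # hypothetical per-operation-point peak rate


def pick_operation_point(points: Sequence[StoredPoint],
                         available_kbps: int) -> Optional[StoredPoint]:
    """Return the highest-rate point that fits, or None if none fits."""
    fitting = [p for p in points if p.max_bitrate_kbps <= available_kbps]
    return max(fitting, key=lambda p: p.max_bitrate_kbps, default=None)


repository = [StoredPoint(1, 128), StoredPoint(2, 512), StoredPoint(3, 1024)]
print(pick_operation_point(repository, 600))   # ramp-down under congestion
print(pick_operation_point(repository, 2000))  # ramp-up when capacity returns
```

Because the selection only changes which stored layers are sent, the same function serves both ramp-down and ramp-up without any transcoding step.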
Videoconferencing has traditionally relied on Multipoint Control Units (MCUs). These units connect endpoints in a star configuration and operate as follows. Coded video is transmitted from each endpoint to the MCU, where it is decoded, scaled, and composited to construct output frames, which are then re-encoded and transmitted to the endpoint(s). In systems supporting personalized layout (each user is allowed to select the layout of his/her screen), the compositing and encoding process is performed for each of the receiving endpoints. Even without personalized layout, rate matching still requires that the encoding process at the MCU is performed separately for each endpoint. As a result, MCUs have considerable complexity and introduce significant delay. The cascaded encodings also reduce the video quality. Particularly for multipoint connections, interactive communication is cumbersome as the end-to-end delay is very high [G.114]. A simpler architecture is the switching MCU, in which one of the incoming video streams is redirected to the receiving endpoints. Obviously, only one user at a time can be seen and rate matching cannot be performed, thus forcing all transmitting endpoints to transmit at the lowest bit rate available in the MCU-to-endpoint connections.
With scalable video coding, the MCU can be replaced with an application-level router (ALR): this unit simply selects which incoming packets should be transmitted to which of the receiving endpoints [Eleft]. In such a system, each endpoint performs its own composition of the incoming video streams. Assuming, for example, a system that uses spatial scalability with two layers, personalized layout is equivalent to instructing the ALR to only send the required packets for the corresponding resolution to the particular endpoint. Similarly, rate matching at the ALR for a particular endpoint can be performed by selecting an appropriate subset of the incoming video packets to transmit to the particular endpoint. Personalized layout and rate matching thus become routing decisions, and require no signal processing. Note that scalability also allows participants to enjoy the best video quality afforded by their links, i.e., users are no longer forced to operate at the quality supported by the weakest endpoint. Most importantly, the ALR has an insignificant contribution to the end-to-end delay, typically an order of magnitude less than an MCU. This makes it possible to have fully interactive multipoint conferences with even a very large number of participants. There are significant advantages as well in terms of error resilience and, in fact, error tolerance can be increased by nearly an order of magnitude here as well (e.g., using unequal error protection). Finally, the very low delay of an ALR allows these systems to be cascaded, with significant benefits in terms of system design and deployment. Cascading of traditional MCUs is impossible due to the very high delay that even a single MCU introduces.
Scalable video coding enables a very significant paradigm shift in videoconferencing systems, bringing the complexity of video communication systems (particularly the servers residing within the network) in line with other types of network applications.
This scenario is a bit more complex, and designed to optimize the network traffic in a core network, while still requiring only a single pinhole in the endpoint's firewall. One of its key applications is the mobile TV market.
Consider a large private IP network, e.g., the core network of the Third Generation Partnership Project (3GPP). Streaming servers within this core network can be assumed to be professionally maintained. It is assumed that these servers can have many ports open to the network and that layered multicast is a real option. Therefore, the streaming server multicasts SVC scalable layers, instead of simulcasting different representations of the same content at different bitrates.
Also consider many endpoints of different classes. Some of these endpoints may lack the processing power or the display size to meaningfully decode all layers; others may have these capabilities. Users of some endpoints may wish not to pay for high quality and are happy with a base service, which may be cheaper or even free. Other users are willing to pay for high quality. Finally, some connected users may have a bandwidth problem in that they can't receive the bandwidth they would want to receive -- be it through congestion, connectivity, change of service quality, or for whatever other reasons. However, all these users have in common that they don't want to be exposed too much, and therefore the number of firewall pinholes needs to be small.
This situation can be handled best by introducing middleboxes close to the edge of the core network, which receive the layered multicast streams and compose the single SVC scalable bitstream according to the needs of the endpoint connected. These middleboxes are called MANEs throughout this specification. In practice, the authors envision the MANE to be part of (or at least physically and topologically close to) the base station of a mobile network, where all the signaling and media traffic necessarily are multiplexed on the same physical link.
MANEs necessarily need to be fairly complex devices. They certainly need to understand the signaling in order, for example, to associate the payload type octet in the RTP header with the SVC payload type.
A MANE may aggregate multiple RTP streams, possibly from multiple RTP sessions, thereby reducing the number of firewall pinholes required at the endpoints, or may optimize the outgoing RTP stream to the MTU size of the outgoing path by utilizing the aggregation and fragmentation mechanisms of this memo. This type of MANE is conceptually easy to implement and can offer powerful features, primarily because it necessarily can "see" the payload (including the RTP payload headers), utilize the wealth of layering information available therein, and manipulate it.
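The MTU optimization mentioned above can be sketched as greedy aggregation of small NAL units. The following Python sketch uses the STAP-A layout of [RFC6184] (one header octet with type 24, then a 16-bit size before each NAL unit) and assumes a payload budget of at most 65535 octets. The function names are hypothetical. Fragmentation of NAL units larger than the budget and the SVC-specific aggregation rules of this memo, such as PACSI NAL units, are not shown.

```python
import struct
from typing import List, Sequence

STAP_A_TYPE = 24


def build_stap_a(nal_units: Sequence[bytes]) -> bytes:
    """Aggregate NAL units into one STAP-A payload."""
    f_bit = 0
    nri = 0
    for nal in nal_units:
        f_bit |= nal[0] >> 7                  # F is set if any unit sets it
        nri = max(nri, (nal[0] >> 5) & 0x03)  # NRI is the maximum of all units
    out = bytearray([(f_bit << 7) | (nri << 5) | STAP_A_TYPE])
    for nal in nal_units:
        out += struct.pack("!H", len(nal)) + nal
    return bytes(out)


def packetize(nal_units: Sequence[bytes], max_payload: int) -> List[bytes]:
    """Greedily pack NAL units, in order, into payloads of max_payload octets."""
    payloads: List[bytes] = []
    group: List[bytes] = []
    size = 1  # STAP-A header octet

    def flush() -> None:
        nonlocal group, size
        if len(group) == 1:
            payloads.append(group[0])         # single NAL unit packet
        elif group:
            payloads.append(build_stap_a(group))
        group = []
        size = 1

    for nal in nal_units:
        if not nal:
            raise ValueError("empty NAL unit")
        if 1 + 2 + len(nal) > max_payload:
            # Cannot be aggregated; emitted alone. A unit longer than
            # max_payload would need fragmentation, which is not shown.
            flush()
            payloads.append(nal)
            continue
        if size + 2 + len(nal) > max_payload:
            flush()
        group.append(nal)
        size += 2 + len(nal)
    flush()
    return payloads


units = [bytes([0x67, 1, 2]), bytes([0x68, 3]), bytes([0x65]) + bytes(20)]
for payload in packetize(units, max_payload=12):
    print(payload.hex())
```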
A MANE can also perform stream thinning, in order to adhere to congestion control principles as discussed in Section 9. While the implementation of the forward (media) channel of such a MANE appears to be comparatively simple, the need to rewrite RTCP RRs makes even such a MANE a complex device.
While the implementation complexity of either case of a MANE, as discussed above, is fairly high, the computational demands are comparatively low.
Miska Hannuksela contributed significantly to the designs of the PACSI NAL unit and the NI-C mode for decoding order recovery. Roni Even organized and coordinated the design team for the development of this memo, and provided valuable comments. Jonathan Lennox contributed to the NAL unit reordering algorithm for MST and provided input on several parts of this memo. Peter Amon, Sam Ganesan, Mike Nilsson, Colin Perkins, and Thomas Wiegand were members of the design team and provided valuable contributions. Magnus Westerlund has also made valuable comments. Charles Eckel and Stuart Taylor provided valuable comments after the first WGLC for this document. Xiaohui (Joanne) Wei helped improve Table 13 and the SDP examples.
The work of Thomas Schierl has been supported by the European Commission under contract number FP7-ICT-248036, project COAST.
[H.264] ITU-T Recommendation H.264, "Advanced video coding for generic audiovisual services", March 2010.
[RFC6184] Wang, Y.-K., Even, R., Kristensen, T., and R. Jesup, "RTP Payload Format for H.264 Video", RFC 6184, May 2011.
[ISO/IEC14496-10] ISO/IEC International Standard 14496-10:2005.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC3264] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model with Session Description Protocol (SDP)", RFC 3264, June 2002.
[RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V. Jacobson, "RTP: A Transport Protocol for Real-Time Applications", STD 64, RFC 3550, July 2003.
[RFC4288] Freed, N. and J. Klensin, "Media Type Specifications and Registration Procedures", BCP 13, RFC 4288, December 2005.
[RFC4566] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session Description Protocol", RFC 4566, July 2006.
[RFC4648] Josefsson, S., "The Base16, Base32, and Base64 Data Encodings", RFC 4648, October 2006.
[RFC5576] Lennox, J., Ott, J., and T. Schierl, "Source-Specific Media Attributes in the Session Description Protocol (SDP)", RFC 5576, June 2009.
[RFC5583] Schierl, T. and S. Wenger, "Signaling Media Decoding Dependency in the Session Description Protocol (SDP)", RFC 5583, July 2009.
[RFC6051] Perkins, C. and T. Schierl, "Rapid Synchronisation of RTP Flows", RFC 6051, November 2010.
[DVB-H] DVB - Digital Video Broadcasting (DVB); DVB-H Implementation Guidelines, ETSI TR 102 377, 2005.
[Eleft] Eleftheriadis, A., R. Civanlar, and O. Shapiro, "Multipoint Videoconferencing with Scalable Video Coding", Journal of Zhejiang University SCIENCE A, Vol. 7, Nr. 5, April 2006, pp. 696-705. (Proceedings of the Packet Video 2006 Workshop.)
[G.114] ITU-T Rec. G.114, "One-way transmission time", May 2003.
[H.241] ITU-T Rec. H.241, "Extended video procedures and control signals for H.300-series terminals", May 2006.
[IGMP] Cain, B., Deering, S., Kouvelas, I., Fenner, B., and A. Thyagarajan, "Internet Group Management Protocol, Version 3", RFC 3376, October 2002.
[JVT-N026] Ohm J.-R., Koenen, R., and Chiariglione, L. (ed.), "SVC requirements specified by MPEG (ISO/IEC JTC1 SC29 WG11)", JVT-N026, available from http://ftp3.itu.ch/av-arch/jvt-site/2005_01_HongKong/JVT-N026.doc, Hong Kong, China, January 2005.
[JVT-N027] Sullivan, G. and Wiegand, T. (ed.), "SVC requirements specified by VCEG (ITU-T SG16 Q.6)", JVT-N027, available from http://ftp3.itu.int/av-arch/jvt-site/2005_01_HongKong/JVT-N027.doc, Hong Kong, China, January 2005.
[McCanne] McCanne, S., Jacobson, V., and Vetterli, M., "Receiver-driven layered multicast", in Proc. of ACM SIGCOMM'96, pages 117-130, Stanford, CA, August 1996.
[MBMS] 3GPP - Technical Specification Group Services and System Aspects; Multimedia Broadcast/Multicast Service (MBMS); Protocols and codecs (Release 6), December 2005.
[MPEG2] ISO/IEC International Standard 13818-2:1993.
[RFC2326] Schulzrinne, H., Rao, A., and R. Lanphier, "Real Time Streaming Protocol (RTSP)", RFC 2326, April 1998.
[RFC2974] Handley, M., Perkins, C., and E. Whelan, "Session Announcement Protocol", RFC 2974, October 2000.
[RFC5117] Westerlund, M. and S. Wenger, "RTP Topologies", RFC 5117, January 2008.
[RFC5775] Luby, M., Watson, M., and L. Vicisano, "Asynchronous Layered Coding (ALC) Protocol Instantiation", RFC 5775, April 2010.
[Yan] Yan, J., Katrinis, K., May, M., and Plattner, R., "Media- and TCP-friendly congestion control for scalable video streams", in IEEE Trans. Multimedia, pages 196-206, April 2006.
Authors' Addresses
   Stephan Wenger
   2400 Skyfarm Dr.
   Hillsborough, CA 94010
   USA

   Phone: +1-415-713-5473
   EMail: stewe@stewe.org
   Ye-Kui Wang
   Huawei Technologies
   400 Crossing Blvd, 2nd Floor
   Bridgewater, NJ 08807
   USA

   Phone: +1-908-541-3518
   EMail: yekui.wang@huawei.com
   Thomas Schierl
   Fraunhofer HHI
   Einsteinufer 37
   D-10587 Berlin
   Germany

   Phone: +49-30-31002-227
   EMail: ts@thomas-schierl.de
   Alex Eleftheriadis
   Vidyo, Inc.
   433 Hackensack Ave.
   Hackensack, NJ 07601
   USA

   Phone: +1-201-467-5135
   EMail: alex@vidyo.com