Network Working Group                                        M. Hatanaka
Request for Comments: 5584                                  J. Matsumoto
Category: Standards Track                               Sony Corporation
                                                               July 2009
        

RTP Payload Format for the Adaptive TRansform Acoustic Coding (ATRAC) Family

Abstract

This document describes an RTP payload format for efficient and flexible transport of audio data encoded with the Adaptive TRansform Acoustic Coding (ATRAC) family of codecs. Recent enhancements to the ATRAC family of codecs support high-quality audio coding with multiple channels. The RTP payload format as presented in this document also includes support for data fragmentation, elementary redundancy measures, and a variation on scalable streaming.

Status of This Memo

This document specifies an Internet standards track protocol for the Internet community, and requests discussion and suggestions for improvements. Please refer to the current edition of the "Internet Official Protocol Standards" (STD 1) for the standardization state and status of this protocol. Distribution of this memo is unlimited.

Copyright Notice

Copyright (c) 2009 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents in effect on the date of publication of this document (http://trustee.ietf.org/license-info). Please review these documents carefully, as they describe your rights and restrictions with respect to this document.

This document may contain material from IETF Documents or IETF Contributions published or made publicly available before November 10, 2008. The person(s) controlling the copyright in some of this material may not have granted the IETF Trust the right to allow modifications of such material outside the IETF Standards Process. Without obtaining an adequate license from the person(s) controlling the copyright in such materials, this document may not be modified outside the IETF Standards Process, and derivative works of it may not be created outside the IETF Standards Process, except to format it for publication as an RFC or to translate it into languages other than English.

Table of Contents

   1. Introduction
   2. Conventions Used in This Document
   3. Codec-Specific Details
   4. RTP Packetization and Transport of ATRAC-Family Streams
      4.1. ATRAC Frames
      4.2. Concatenation of Frames
      4.3. Frame Fragmentation
      4.4. Transmission of Redundant Frames
      4.5. Scalable Lossless Streaming (High-Speed Transfer Mode)
           4.5.1. Scalable Multiplexed Streaming
           4.5.2. Scalable Multi-Session Streaming
   5. Payload Format
      5.1. Global Structure of Payload Format
      5.2. Usage of RTP Header Fields
      5.3. RTP Payload Structure
           5.3.1. Usage of ATRAC Header Section
           5.3.2. Usage of ATRAC Frames Section
   6. Packetization Examples
      6.1. Example Multi-Frame Packet
      6.2. Example Fragmented ATRAC Frame
   7. Payload Format Parameters
      7.1. ATRAC3 Media Type Registration
      7.2. ATRAC-X Media Type Registration
      7.3. ATRAC Advanced Lossless Media Type Registration
      7.4. Channel Mapping Configuration Table
      7.5. Mapping Media Type Parameters into SDP
           7.5.1. For Media Subtype ATRAC3
           7.5.2. For Media Subtype ATRAC-X
           7.5.3. For Media Subtype ATRAC Advanced Lossless
      7.6. Offer/Answer Model Considerations
           7.6.1. For All Three Media Subtypes
           7.6.2. For Media Subtype ATRAC3
           7.6.3. For Media Subtype ATRAC-X
           7.6.4. For Media Subtype ATRAC Advanced Lossless
      7.7. Usage of Declarative SDP
      7.8. Example SDP Session Descriptions
      7.9. Example Offer/Answer Exchange
   8. IANA Considerations
   9. Security Considerations
   10. Considerations on Correct Decoding
      10.1. Verification of the Packets
      10.2. Validity Checking of the Packets
   11. References
      11.1. Normative References
      11.2. Informative References
        
1. Introduction

The ATRAC family of perceptual audio codecs is designed to address numerous needs for high-quality, low-bit-rate audio transfer. ATRAC technology can be found in many consumer and professional products and applications, including MD players, CD players, voice recorders, and mobile phones.

Recent advances in ATRAC technology allow for multiple channels of audio to be encoded in customizable groupings. This should allow for future expansions in scaled streaming to provide the greatest flexibility in streaming any one of the ATRAC family member codecs; however, this payload format does not distinguish between the codecs on a packet level.

This simplified payload format contains only the basic information needed to disassemble a packet of ATRAC audio in order to decode it. There is also basic support for fragmentation and redundancy.

2. Conventions Used in This Document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [4].

3. Codec-Specific Details

Early versions of the ATRAC codec handled only two channels of audio at 44.1 kHz sampling frequency, with typical bit-rates between 66 kbps and 132 kbps. The latest version allows for a maximum of 8 channels of audio, up to 96 kHz in sampling frequency, and a lossless encoding option that can be transmitted in either a scalable format (also known as High-Speed Transfer mode) or a standard format (also known as Standard mode). The feasible bit-rate range has also expanded and now extends from 8 kbps up to 1400 kbps in lossy encoding modes.

Depending on the version of ATRAC used, the sample-frame size is either 512, 1024, or 2048 samples. While the lossy and Standard mode lossless formats are encoded as sequential single audio frames, High-Speed Transfer mode lossless data comprises two layers -- a lossy base layer and an enhancement layer.

Although streaming of multi-channel audio is supported depending on the ATRAC version used, all encoded audio for a given time period is contained within a single frame. Therefore, audio data is neither interleaved nor split on a per-channel basis.

4. RTP Packetization and Transport of ATRAC-Family Streams
4.1. ATRAC Frames

To transport compressed audio data, ATRAC uses the concept of frames. An ATRAC frame is the smallest data unit to which timing information is attributed. Frames are octet-aligned by definition.

4.2. Concatenation of Frames

It is often possible to carry multiple frames in one RTP packet. This can be useful in audio, where on a LAN with a 1500-byte MTU, an average of 7 complete 64 kbps ATRAC frames could be carried in a single RTP packet, as each ATRAC frame would be approximately 200 bytes. ATRAC frames may be of fixed or variable length. To facilitate parsing in the case of multiple frames in one RTP packet, the size of each frame is made known to the receiver by carrying "in-band" the frame size for each contained frame in an RTP packet. However, to simplify the implementation of RTP receivers, it is required that when multiple frames are carried in an RTP packet, each frame MUST be complete, i.e., the number of frames in an RTP packet MUST be integral.
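
The estimate of 7 frames per packet can be reproduced with a short calculation. The sketch below is illustrative only: it assumes 1024-sample frames at 44.1 kHz, IPv4/UDP/RTP headers of 20, 8, and 12 bytes with no CSRC entries, and the one-byte ATRAC header and two-byte per-frame prefix described in Section 5.3. The function and constant names are hypothetical.

```python
# Illustrative estimate only; every constant below is an assumption.
IP_UDP_RTP_OVERHEAD = 20 + 8 + 12   # IPv4 + UDP + RTP fixed header, no CSRC
ATRAC_HEADER_BYTES = 1              # ATRAC Header Section (Section 5.3.1)
FRAME_PREFIX_BYTES = 2              # E flag + 15-bit Block Length (Section 5.3.2)


def frame_bytes(bit_rate_bps, samples_per_frame, sample_rate_hz):
    """Average encoded size of one frame, in bytes."""
    return bit_rate_bps * samples_per_frame / sample_rate_hz / 8


def frames_per_packet(mtu, frame_size_bytes):
    """Number of complete frames that fit in one RTP packet."""
    room = mtu - IP_UDP_RTP_OVERHEAD - ATRAC_HEADER_BYTES
    return int(room // (frame_size_bytes + FRAME_PREFIX_BYTES))


size = frame_bytes(64_000, 1024, 44_100)   # roughly 186 bytes per frame
count = frames_per_packet(1500, size)      # complete frames per 1500-byte MTU
```

Under these assumptions the result is 7 complete frames, which also stays within the limit of 16 frames per packet imposed by the NFrames field.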

4.3. Frame Fragmentation

The ATRAC codec can handle very large frames. As most IP networks have significantly smaller MTU sizes than the frame sizes ATRAC can handle, this payload format allows for the fragmentation of an ATRAC frame over multiple RTP packets. However, to simplify the implementation of RTP receivers, an RTP packet MUST carry either one or more complete ATRAC frames or a single fragment of one ATRAC frame. In other words, RTP packets MUST NOT contain fragments of multiple ATRAC frames and MUST NOT contain a mix of complete and fragmented frames.

4.4. Transmission of Redundant Frames

As RTP does not guarantee reliable transmission, receipt of data is not assured. Loss of a packet can result in a "decoding gap" at the receiver. One method to remedy this problem is to allow time-shifted copies of ATRAC frames to be sent along with current data. For a modest cost in latency and implementation complexity, error resiliency to packet loss can be achieved. For further details, see Section 5.3.2.1 and [12].

4.5. Scalable Lossless Streaming (High-Speed Transfer Mode)

As ATRAC supports a variation on scalable encoding, this payload format provides a mechanism for transmitting essential data (also referred to as the base layer) with its enhancement data in two ways -- multiplexed through one session or separated over two sessions.

In either method, only the base layer is essential in producing audio data. The enhancement layer carries the remaining audio data needed to decode lossless audio data. So in situations of limited bandwidth, the sender may choose not to transmit enhancement data yet still provide a client with enough data to generate lossily-encoded audio through the base layer.

4.5.1. Scalable Multiplexed Streaming

In multiplexed streaming, the base layer and enhancement layer are coupled together in each packet, utilizing only one session as illustrated in Figure 1.

The packet MUST begin with the base layer, and the two layer types MUST interleave if both layers exist in a packet (a packet includes only the base layer or only the enhancement layer at the beginning of a stream or during fragmentation).

   +----------------+  +----------------+  +----------------+
   |Base|Enhancement|--|Base|Enhancement|--|Base|Enhancement| ...
   +----------------+  +----------------+  +----------------+
           N                   N+1                 N+2        : Packet
        

Figure 1. Multiplexed Structure
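
A receiver could check the ordering rule for multiplexed packets along the following lines. This is a minimal sketch that reads "interleave" as strict base/enhancement alternation, which is an interpretation rather than something the text spells out; the function name is hypothetical, and the input is the sequence of Layer Type Flags (E) defined in Section 5.3.2.

```python
def layers_well_ordered(e_flags):
    """e_flags: Layer Type Flag (E) of each frame in one packet, in order.

    Returns True if the packet holds a single layer type, or if it starts
    with a base frame (E=0) and strictly alternates base/enhancement.
    """
    if len(set(e_flags)) <= 1:
        return True                      # only base or only enhancement
    # With both layers present, even positions must be base (0) and odd
    # positions enhancement (1) under the strict-alternation assumption.
    return all(flag == index % 2 for index, flag in enumerate(e_flags))
```

For example, the flag sequence 0, 1, 0, 1 passes the check, while 1, 0 fails because the packet does not begin with the base layer.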

4.5.2. Scalable Multi-Session Streaming

In multi-session streaming, the base layer and enhancement layer are sent over two separate sessions, allowing clients with certain bandwidth limitations to receive just the base layer for decoding as illustrated in Figure 2.

In this case, the receiver is REQUIRED to determine which sessions are paired together. For paired base and enhancement layer sessions, the CNAME bindings in the RTP Control Protocol (RTCP) session MUST be applied using the same CNAME to ensure correct mapping to the RTP source.

While there may be alternative methods for synchronization of the layers, the timestamp SHOULD be used for synchronizing the base layer with its enhancement. The two sessions MUST be synchronized using the information in RTCP SR packets to align the RTP timestamps.
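
The alignment step can be sketched as follows. An RTCP Sender Report pairs an NTP wall-clock time with the RTP timestamp of the same instant, so each session's RTP timestamps can be mapped onto a common time line. The function name and the sample values are hypothetical, and the NTP time is simplified to floating-point seconds.

```python
def rtp_to_wallclock(rtp_ts, sr_rtp_ts, sr_ntp_seconds, clock_rate):
    """Map an RTP timestamp to sender wall-clock seconds using one SR."""
    diff = (rtp_ts - sr_rtp_ts) & 0xFFFFFFFF   # 32-bit wraparound
    if diff >= 0x80000000:                     # timestamp precedes the SR
        diff -= 0x100000000
    return sr_ntp_seconds + diff / clock_rate


# Hypothetical SR data: each session uses its own random timestamp offset.
base_time = rtp_to_wallclock(45_100, 1_000, 10.0, 44_100)
enh_time = rtp_to_wallclock(544_100, 500_000, 10.0, 44_100)
same_instant = base_time == enh_time
```

Base and enhancement frames that map to the same wall-clock instant belong together, even though their raw RTP timestamps differ.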

If the enhancement layer's session data does not arrive by the presentation time, the decoder MUST decode only the base layer session's data, ignoring the enhancement layer's data.

         Session 1:
         +------+  +------+  +------+  +------+
         | Base |--| Base |--| Base |--| Base | ...
         +------+  +------+  +------+  +------+
            N         N+1       N+2       N+3     : Packet
        
         Session 2:
         +-------------+  +-------------+  +-------------+
         | Enhancement |--| Enhancement |--| Enhancement | ...
         +-------------+  +-------------+  +-------------+
               N                N+1              N+2         : Packet
        

Figure 2. Multi-Session Streaming

5. Payload Format
5.1. Global Structure of Payload Format

The structure of ATRAC Payload is illustrated in Figure 3. The RTP payload following the RTP header contains two octet-aligned data sections.

            +------+--------------+-----------------------------+
            |RTP   | ATRAC Header |   ATRAC Frames Section      |
            |Header| Section      | (including redundant data)  |
            +------+--------------+-----------------------------+
            < ---------------- RTP Packet Payload ------------- >
        

Figure 3. Structure of RTP Payload of ATRAC Family

The first data section is the ATRAC Header, containing just one header with information for the whole packet. The second section is where the encoded ATRAC frames are stored. This may contain either a single fragment of one ATRAC frame or one or more complete ATRAC frames. The ATRAC Frames Section MUST NOT be empty. When using the redundancy mechanism described in Section 5.3.2.1, the redundant frame data can be included in this section, and the timestamp MUST be set to the oldest redundant frame's timestamp.

To benefit from ATRAC's High-Speed Transfer mode lossless encoding capability, the RTP payload can be split across two sessions, with one transmitting an essential base layer and the other transmitting enhancement data. However, in either case, the above structure still applies.

5.2. Usage of RTP Header Fields
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |V=2|P|X|  CC   |M|     PT      |       sequence number         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          timestamp                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |            synchronization source (SSRC) identifier           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           contributing source (CSRC) identifiers              |
   |                             .....                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        

Figure 4. RTP Standard Header Part

The structure of the RTP Standard Header Part is illustrated in Figure 4.

Version (V): 2 bits
   Set to 2.

Padding (P): 1 bit
   If the padding bit is set, the packet contains one or more additional padding octets at the end, which are not part of the payload. The last octet of the padding contains a count of how many padding octets should be ignored, including itself. Padding may be needed by some encryption algorithms with fixed block sizes or for carrying several RTP packets in a lower-layer protocol data unit (see [1]).

Extension (X): 1 bit
   Defined by the RTP profile used.

CSRC count (CC): 4 bits
   See RFC 3550 [1].

Marker (M): 1 bit
   Set to 1 if the packet is the first packet after a silence period; otherwise, it MUST be set to 0.

Payload Type (PT): 7 bits
   The assignment of an RTP payload type for this packet format is outside the scope of this document; it is specified by the RTP profile under which this payload format is used, or signaled dynamically out-of-band (e.g., using the Session Description Protocol (SDP)).

Sequence number: 16 bits
   A sequential number for the RTP packet. It ranges from 0 to 65535 and wraps around to 0 after reaching 65535.

Timestamp: 32 bits
   A timestamp representing the sampling time of the first sample of the first ATRAC frame in the current RTP packet. When using SDP, the clock rate of the RTP timestamp MUST be expressed using the "rtpmap" attribute. For ATRAC3 and ATRAC Advanced Lossless, the RTP timestamp rate MUST be 44100 Hz. For ATRAC-X, the RTP timestamp rate is 44100 Hz or 48000 Hz, and it is selected by out-of-band signaling.

SSRC: 32 bits
   See RFC 3550 [1].

CSRC list: 0 to 15 items, 32 bits each
   See RFC 3550 [1].
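
The field layout in Figure 4 can be read with a few lines of code. The following is a minimal sketch, not a complete RTP implementation: it does not skip a header extension when the X bit is set, and the function name and returned dictionary keys are hypothetical.

```python
import struct


def parse_rtp_header(packet):
    """Split an RTP packet into its header fields and payload bytes."""
    if len(packet) < 12:
        raise ValueError("shorter than the 12-byte fixed RTP header")
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    cc = b0 & 0x0F                          # CSRC count
    header_len = 12 + 4 * cc
    if len(packet) < header_len:
        raise ValueError("truncated CSRC list")
    csrc = list(struct.unpack("!%dI" % cc, packet[12:header_len]))
    payload = packet[header_len:]
    if (b0 >> 5) & 1:                       # P bit: strip trailing padding
        pad = payload[-1] if payload else 0
        if pad == 0 or pad > len(payload):
            raise ValueError("invalid padding count")
        payload = payload[:-pad]
    return {
        "version": b0 >> 6, "padding": (b0 >> 5) & 1,
        "extension": (b0 >> 4) & 1, "cc": cc,
        "marker": b1 >> 7, "payload_type": b1 & 0x7F,
        "sequence_number": seq, "timestamp": timestamp,
        "ssrc": ssrc, "csrc": csrc, "payload": payload,
    }
```

The returned payload is the part of the packet that holds the ATRAC Header Section followed by the ATRAC Frames Section.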

5.3. RTP Payload Structure
5.3.1. Usage of ATRAC Header Section

The ATRAC header section has the fixed length of one byte as illustrated in Figure 5.

                     0 1 2 3 4 5 6 7
                    +-+-+-+-+-+-+-+-+
                    |C|FrgNo|NFrames|
                    +-+-+-+-+-+-+-+-+
        

Figure 5. ATRAC RTP Header

Continuation Flag (C): 1 bit
   The packet that corresponds to the last part of the audio frame data in a fragmentation MUST have this bit set to 0; otherwise, it is set to 1.

Fragment Number (FrgNo): 3 bits
   In the event of data fragmentation, this value is one for the first packet and increases sequentially for the remaining fragmented data packets. This value MUST be zero for an unfragmented frame. (Note: 3 bits is sufficient to avoid Fragment Number rollover given the current maximum supported bit-rate in the ATRAC specification. If that changes, the choice of 3 bits for the Fragment Number should be revisited.)

Number of Frames (NFrames): 4 bits
   The number of audio frames in this packet is the field value + 1. This allows for a maximum of 16 ATRAC-encoded audio frames per packet, with 0 indicating one audio frame. Each audio frame MUST be complete in the packet if fragmentation is not applied. In the case of fragmentation, the data for only one audio frame is allowed to be fragmented, and this value MUST be 0.
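
The one-byte header in Figure 5 can be built and read as sketched below, with bit 0 of the figure taken as the most significant bit. The function names are hypothetical. The fragment helper assumes that every fragment except the last carries C = 1 and that FrgNo runs from 1 to at most 7, which follows from the 3-bit field width.

```python
def pack_atrac_header(continuation, fragment_no, frame_count):
    """Build the one-byte ATRAC header; frame_count is the real count (1-16)."""
    if not 0 <= fragment_no <= 7:
        raise ValueError("FrgNo is a 3-bit field")
    if not 1 <= frame_count <= 16:
        raise ValueError("a packet carries 1 to 16 frames")
    if fragment_no != 0 and frame_count != 1:
        raise ValueError("a fragment packet must signal NFrames = 0")
    octet = ((continuation & 1) << 7) | (fragment_no << 4) | (frame_count - 1)
    return bytes([octet])


def unpack_atrac_header(octet):
    """Return (continuation, fragment_no, frame_count) from the header octet."""
    return octet >> 7, (octet >> 4) & 0x07, (octet & 0x0F) + 1


def fragment_headers(fragment_count):
    """Header bytes for one frame split across fragment_count packets."""
    if not 2 <= fragment_count <= 7:
        raise ValueError("FrgNo can number fragments 1 through 7 only")
    return [pack_atrac_header(int(n < fragment_count), n, 1)
            for n in range(1, fragment_count + 1)]
```

Note that the frame count is stored minus one, so a packet with three complete frames carries the value 2 in the NFrames field.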

5.3.2. Usage of ATRAC Frames Section

The ATRAC Frames Section contains an integer number of complete ATRAC frames or a single fragment of one ATRAC frame, as illustrated in Figure 6. Each ATRAC frame is preceded by a one-bit flag indicating the layer type and a Block Length field indicating the size in bytes of the ATRAC frame. If more than one ATRAC frame is present, then the frames are concatenated into a contiguous string of bit-flag, Block Length, and ATRAC frame in order of their frame number. This section MUST NOT be empty.

    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |E|       Block Length          |         ATRAC frame           |...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        

Figure 6. ATRAC Frame Section Format

Layer Type Flag (E): 1 bit
   Set to 1 if the corresponding ATRAC frame is from an enhancement layer. 0 indicates a base layer encoded frame.

Block Length: 15 bits
   The byte length of encoded audio data for the following frame. This is so that in the case of fragmentation, if only a subsequent packet is received, decoding can still occur. 15 bits allows for a maximum block length of 32,767 bytes.

ATRAC frame:
   The encoded ATRAC audio data.
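
For a packet that carries only complete frames (FrgNo = 0), the Frames Section can be walked as sketched below. The function name is hypothetical, fragment packets are deliberately out of scope, and the check for trailing bytes assumes that any RTP padding has already been removed.

```python
import struct


def parse_frames_section(section, frame_count):
    """Parse frame_count complete frames; returns (is_enhancement, data) pairs."""
    frames = []
    offset = 0
    for _ in range(frame_count):
        if offset + 2 > len(section):
            raise ValueError("truncated E/Block Length prefix")
        (prefix,) = struct.unpack_from("!H", section, offset)
        is_enhancement = bool(prefix >> 15)    # E flag, most significant bit
        block_length = prefix & 0x7FFF         # 15-bit length in bytes
        offset += 2
        if offset + block_length > len(section):
            raise ValueError("frame runs past the end of the payload")
        frames.append((is_enhancement, section[offset:offset + block_length]))
        offset += block_length
    if offset != len(section):
        raise ValueError("unexpected trailing bytes")
    return frames
```

Here frame_count is the NFrames field value plus one, taken from the ATRAC header that precedes this section.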

5.3.2.1. Support of Redundancy

This payload format provides a rudimentary scheme to compensate for occasional packet loss. As every packet's timestamp corresponds to the first audio frame regardless of whether or not it is redundant, and because we know how many frames of audio each packet encapsulates, if two successive packets are successfully transmitted, we can calculate the number of redundant frames being sent. The result gives the client a sense of how the server is responding to RTCP reports and warns it to expand its buffer size if necessary. As an example of using the Redundant Data, refer to Figures 7 and 8.

In this example, the server has determined, based on recent RTCP reports, that for the next few packets it should resend the last two frames from the previous packet. Thus, between packets N and N+1, there is a redundancy of two frames (which the client may discard). The benefit arises when packets N+2 and N+3 do not arrive at all and packet N+4 eventually arrives carrying the next audio frames needed.

[Sender]

   |-Fr0-|-Fr1-|-Fr2-|                         Packet: N,   TS=0
         |-Fr1-|-Fr2-|-Fr3-|                   Packet: N+1, TS=1024
               |-Fr2-|-Fr3-|-Fr4-|             Packet: N+2, TS=2048
                     |-Fr3-|-Fr4-|-Fr5-|       Packet: N+3, TS=3072
                           |-Fr4-|-Fr5-|-Fr6-| Packet: N+4, TS=4096
        
   -----------> Packet "N+2" and "N+3" not arrived  ------------->
        

[Receiver]

   |-Fr0-|-Fr1-|-Fr2-|                         Packet: N,   TS=0
         |-Fr1-|-Fr2-|-Fr3-|                   Packet: N+1, TS=1024
                           |-Fr4-|-Fr5-|-Fr6-| Packet: N+4, TS=4096
        

The receiver can decode Fr4 through Fr6 from the data in Packet "N+4" even though Packets "N+2" and "N+3" were lost.

Figure 7. Redundant Example
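
The receiver side of Figure 7 can be sketched as follows. This is an illustrative sketch under simplifying assumptions: every frame occupies one block (no enhancement layer), each frame spans 1024 timestamp units, and the function name and the string placeholders for frame data are hypothetical.

```python
def collect_frames(packets, samples_per_frame=1024):
    """Return frame payloads in playout order, dropping redundant copies.

    packets: iterable of (timestamp, [frame, ...]) in arrival order.
    """
    frames = {}
    for timestamp, payloads in packets:
        for index, payload in enumerate(payloads):
            # The packet timestamp belongs to its first frame; later
            # frames follow at fixed intervals.
            start = timestamp + index * samples_per_frame
            # Keep the first copy; later redundant copies are discarded.
            frames.setdefault(start, payload)
    return [frames[start] for start in sorted(frames)]

received = [
    (0,    ["Fr0", "Fr1", "Fr2"]),   # Packet N
    (1024, ["Fr1", "Fr2", "Fr3"]),   # Packet N+1
    (4096, ["Fr4", "Fr5", "Fr6"]),   # Packet N+4 (N+2 and N+3 lost)
]
print(collect_frames(received))
```

With the three packets that arrive in Figure 7, every frame from Fr0 to Fr6 is recovered exactly once.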

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |V=2|P|X|  CC   |M|     PT      |       sequence number         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |        timestamp (= start sample time of Fr1)                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |            synchronization source (SSRC) identifier           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           contributing source (CSRC) identifiers              |
   |                             .....                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0|  0  |   3   |0|         Block Length        |               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         (redundant)  ATRAC frame (Fr1) data  ...              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0|       Block Length          |(redundant) ATRAC frame (Fr2)  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    (cont.)  |0|   Block Length          |  ATRAC frame (Fr3)  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       (cont.)                                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        

Figure 8. Packet Structure Example with Redundant Data (Case of Packet "N+1")
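
The bit layout drawn in Figure 8 can be read with a few shifts and masks. The sketch below assumes the field widths shown in the figure: a first octet holding the Continuation bit (1 bit), the Fragment Number (3 bits), and the Number of Frames field (4 bits), followed for each frame by a 1-bit flag and a 15-bit Block Length. The function names and the sample Block Length of 200 are hypothetical.

```python
import struct

def parse_payload_header(payload):
    """Split the first payload octet into (C, FragNo, NFrames field)."""
    first = payload[0]
    continuation = first >> 7          # 1 bit
    frag_no = (first >> 4) & 0x07      # 3 bits
    n_frames_field = first & 0x0F      # 4 bits, raw field value
    return continuation, frag_no, n_frames_field

def parse_block_header(payload, offset):
    """Read the 16-bit block header at offset: (flag bit, Block Length)."""
    (word,) = struct.unpack_from("!H", payload, offset)
    return word >> 15, word & 0x7FFF

# First octets of a packet laid out as in Figure 8: C=0, FragNo=0, field 3,
# then a block header with flag 0 and a made-up Block Length of 200.
sample = bytes([0x03]) + struct.pack("!H", 200)
print(parse_payload_header(sample), parse_block_header(sample, 1))
```

The raw Number of Frames field is returned unchanged; how it maps to a frame count is defined by the header description elsewhere in this document.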

5.3.2.2. Frame Fragmentation

Each RTP packet MUST contain either an integer number of ATRAC-encoded audio frames (with a maximum of 16) or one ATRAC frame fragment. In the former case, as many complete ATRAC frames as can fit in a single path-MTU SHOULD be placed in an RTP packet. However, if even a single ATRAC frame will not fit into a complete RTP packet, the ATRAC frame MUST be fragmented.
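
One way a sender might apply this rule is a greedy grouping of complete frames, sketched below. The overhead constants are assumptions (a 12-octet RTP header without CSRC entries, a 1-octet payload header, and a 2-octet block header per frame, as drawn in the figures), and max_packet_size stands for whatever RTP packet size the path MTU allows after lower-layer headers; the function name is hypothetical.

```python
MAX_FRAMES_PER_PACKET = 16   # limit stated in the text
RTP_HEADER = 12              # fixed RTP header, no CSRC list (assumption)
PAYLOAD_HEADER = 1           # C / FragNo / NFrames octet
BLOCK_HEADER = 2             # flag bit + 15-bit Block Length per frame

def group_frames(frame_sizes, max_packet_size):
    """Greedily group frame sizes (in octets) into packets.

    Returns a list of lists of frame indices. A frame that cannot fit
    even on its own is returned alone and must be fragmented instead.
    """
    packets, current = [], []
    used = RTP_HEADER + PAYLOAD_HEADER
    for index, size in enumerate(frame_sizes):
        need = BLOCK_HEADER + size
        full = len(current) == MAX_FRAMES_PER_PACKET
        if current and (full or used + need > max_packet_size):
            # Close the current packet and start a new one.
            packets.append(current)
            current, used = [], RTP_HEADER + PAYLOAD_HEADER
        current.append(index)
        used += need
    if current:
        packets.append(current)
    return packets

print(group_frames([400, 400, 400, 400], 1000))   # prints [[0, 1], [2, 3]]
```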

The start of a fragmented frame gets placed in its own RTP packet with its Continuation bit (C) set to one, and its Fragment Number (FragNo) set to one. As the frame must be the only one in the packet, the Number of Frames field is zero. Subsequent packets are to contain the remaining fragmented frame data, with the Fragment Number increasing sequentially and the Continuation bit (C) consistently set to one. As subsequent packets do not contain any new frames, the Number of Frames field MUST be ignored. The last packet of fragmented data MUST have the Continuation bit (C) set to zero.
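
The sender-side rules above can be sketched as follows. The function name and chunk size are hypothetical, the header octet layout (C in the top bit, then a 3-bit Fragment Number, then a 4-bit Number of Frames field) is taken from the figures, and the block header that follows the first octet in each packet is omitted for brevity.

```python
def fragment_frame(frame, chunk_size):
    """Split one frame into (header_octet, chunk) pairs.

    header_octet packs C (1 bit), FragNo (3 bits), NFrames (4 bits).
    """
    chunks = [frame[i:i + chunk_size] for i in range(0, len(frame), chunk_size)]
    if not 2 <= len(chunks) <= 7:
        raise ValueError("FragNo is 3 bits wide: expected 2 to 7 fragments")
    fragments = []
    for frag_no, chunk in enumerate(chunks, start=1):
        # C stays at one until the final fragment, where it drops to zero.
        continuation = 0 if frag_no == len(chunks) else 1
        header = (continuation << 7) | (frag_no << 4) | 0   # NFrames = 0
        fragments.append((header, chunk))
    return fragments

parts = fragment_frame(bytes(2500), 1000)
print([hex(header) for header, _ in parts])   # prints ['0x90', '0xa0', '0x30']
```

The three header octets correspond to the C and Fragment Number values shown for the three packets in Figure 10.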

Packets containing related fragmented frames MUST have identical timestamps. Thus, while the Continuation bit and Fragment Number fields indicate fragmentation and provide a means to reorder the packets, the timestamp can be used to determine which packets belong together.
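
A receiver could combine the three fields as in the following sketch. It is a simplified illustration: fragments are assumed to be already parsed into tuples, the function name is hypothetical, and no timeout or loss handling is shown.

```python
from collections import defaultdict

def reassemble(fragments):
    """Rebuild fragmented frames.

    fragments: iterable of (timestamp, continuation, frag_no, data),
    in any order. Returns {timestamp: frame_bytes} for every frame
    whose fragments all arrived.
    """
    groups = defaultdict(dict)
    for timestamp, continuation, frag_no, data in fragments:
        # Identical timestamps identify fragments of the same frame.
        groups[timestamp][frag_no] = (continuation, data)
    frames = {}
    for timestamp, parts in groups.items():
        last = max(parts)
        complete = (
            sorted(parts) == list(range(1, last + 1))   # no gaps in FragNo
            and parts[last][0] == 0                     # final fragment seen
        )
        if complete:
            frames[timestamp] = b"".join(parts[n][1] for n in range(1, last + 1))
    return frames

arrived = [(9000, 1, 2, b"bb"), (9000, 0, 3, b"c"), (9000, 1, 1, b"aa")]
print(reassemble(arrived))   # prints {9000: b'aabbc'}
```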

6. Packetization Examples
6.1. Example Multi-Frame Packet

Multiple encoded audio frames are combined into one packet. Note how, for this example, only base layer frames are sent redundantly, but are followed by interleaved base layer and enhancement layer frames as illustrated in Figure 9.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |V=2|P|X|  CC   |M|     PT      |       sequence number         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          timestamp                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |            synchronization source (SSRC) identifier           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           contributing source (CSRC) identifiers              |
   |                             .....                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0|  0  |   5   |0|         Block Length        |               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         (redundant)  base layer frame 1 data...               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0|       Block Length          |(redundant) base layer frame 2 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    (cont.)  |0|   Block Length          |  base layer frame 3 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | (cont.) |1|       Block Length          | enhancement frame 3 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | (cont.) |0|       Block Length          |  base layer frame 4 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | (cont.) |1|       Block Length          | enhancement frame 4 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        

Figure 9. Example Multi-Frame Packet

6.2. Example Fragmented ATRAC Frame

The encoded audio data frame is split over three RTP packets as illustrated in Figure 10. The following points are highlighted in the example below:

o transition from one to zero of the Continuation bit (C)

o sequential increase in the Fragment Number

   Packet 1:
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |V=2|P|X|  CC   |M|     PT      |       sequence number         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          timestamp                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |            synchronization source (SSRC) identifier           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           contributing source (CSRC) identifiers              |
   |                             .....                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |1|  1  |   0   |1|        Block Length         |               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     enhancement data...                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        
   Packet 2:
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |V=2|P|X|  CC   |M|     PT      |       sequence number         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          timestamp                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |            synchronization source (SSRC) identifier           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           contributing source (CSRC) identifiers              |
   |                             .....                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |1|  2  |   0   |1|        Block Length         |               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  ...more enhancement data...                  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        
   Packet 3:
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |V=2|P|X|  CC   |M|     PT      |       sequence number         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          timestamp                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |            synchronization source (SSRC) identifier           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           contributing source (CSRC) identifiers              |
   |                             .....                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0|  3  |   0   |1|        Block Length         |               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |            ...the last of the enhancement data                |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        

Figure 10. Example Fragmented ATRAC Frame

7. Payload Format Parameters

Certain parameters will need to be defined before ATRAC-family-encoded content can be streamed. Other optional parameters may also be defined to take advantage of specific features relevant to certain ATRAC versions. Parameters for ATRAC3, ATRAC-X, and ATRAC Advanced Lossless are defined here as part of the media subtype registration process. A mapping of these parameters into the Session Description Protocol (SDP) (RFC 4566) [2] is also provided for applications that utilize SDP. These registrations use the template defined in RFC 4288 [5] and follow RFC 4855 [6].

The data format and parameters are specified for real-time transport in RTP.

7.1. ATRAC3 Media Type Registration

The media subtype for the Adaptive TRansform Codec version 3 (ATRAC3) uses the template defined in RFC 4855 [6].

Note, any unknown parameter MUST be ignored by the receiver.

Type name: audio

Subtype name: ATRAC3

Required parameters:

rate: Represents the sampling frequency in Hz of the original audio data. Permissible value is 44100 only.

baseLayer: Indicates the encoded bit-rate in kbps for the audio data to be streamed. Permissible values are 66, 105, and 132.

Optional parameters:

ptime: See RFC 4566 [2].

maxptime: See RFC 4566 [2]. The frame length of ATRAC3 is 1024/44100 = 23.22... ms, but a fractional value may not be applicable in the SDP definition. The value of this parameter MUST therefore be a multiple of 24 ms, the frame length rounded up to a whole millisecond, to allow safe transmission.

If this parameter is not present, the sender MAY encapsulate a maximum of 6 encoded frames into one RTP packet, in streaming of ATRAC3.

maxRedundantFrames: The maximum number of redundant frames that may be sent during a session in any given packet under the redundant framing mechanism detailed in the document. Allowed values are integers in the range of 0 to 15, inclusive. If this parameter is not used, a default of 15 MUST be assumed.

Encoding considerations: This media type is framed and contains binary data.

Security considerations: This media type does not carry active content. See Section 9 of this document.

Interoperability considerations: none

Published specification: ATRAC3 Standard Specification [9]

Applications that use this media type: Audio and video streaming and conferencing tools.

   Additional information:  none
   Magic number(s):  none
   File extension(s):  'at3', 'aa3', and 'omg'
   Macintosh file type code(s):  none
        

Person and email address to contact for further information: Mitsuyuki Hatanaka, Jun Matsumoto, actech@jp.sony.com

Intended usage: COMMON

Restrictions on usage: This media type depends on RTP framing, and hence is only defined for transfer via RTP.

Author: Mitsuyuki Hatanaka, Jun Matsumoto, actech@jp.sony.com

Change controller: IETF AVT WG delegated from the IESG
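
The millisecond values quoted for maxptime in the registrations of this section follow from rounding the frame duration up to a whole millisecond. The arithmetic is sketched below; the function name is hypothetical.

```python
import math

def frame_duration_ms(samples_per_frame, rate):
    """Frame duration rounded up to a whole number of milliseconds."""
    return math.ceil(1000 * samples_per_frame / rate)

print(frame_duration_ms(1024, 44100))   # ATRAC3 frame: prints 24
print(frame_duration_ms(2048, 44100))   # 2048-sample frame at 44.1 kHz: prints 47
print(frame_duration_ms(2048, 48000))   # 2048-sample frame at 48 kHz: prints 43
```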

7.2. ATRAC-X Media Type Registration

The media subtype for the Adaptive TRansform Codec version X (ATRAC-X) uses the template defined in RFC 4855 [6].

Note, any unknown parameter MUST be ignored by the receiver.

Type name: audio

Subtype name: ATRAC-X

Required parameters:

rate: Represents the sampling frequency in Hz of the original audio data. Permissible values are 44100 and 48000.

baseLayer: Indicates the encoded bit-rate in kbps for the audio data to be streamed. Permissible values are 32, 48, 64, 96, 128, 160, 192, 256, 320, and 352.

channelID: Indicates the number of channels and the channel layout according to Table 1 in Section 7.4. Note that this layout is different from that proposed in RFC 3551 [3]. However, as channelID = 0 defines an ambiguous channel layout, the channel mapping defined in Section 4.1 of [3] could be used in that case. Permissible values are 0, 1, 2, 3, 4, 5, 6, and 7.

Optional parameters:

ptime: See RFC 4566 [2].

maxptime: See RFC 4566 [2]. The frame length of ATRAC-X is 2048/44100 = 46.44... ms or 2048/48000 = 42.67... ms, but a fractional value may not be applicable in the SDP definition. The value of this parameter MUST therefore be a multiple of 47 ms or 43 ms, respectively, to allow safe transmission.

If this parameter is not present, the sender MAY encapsulate a maximum of 16 encoded frames into one RTP packet, in streaming of ATRAC-X.

maxRedundantFrames: The maximum number of redundant frames that may be sent during a session in any given packet under the redundant framing mechanism detailed in the document. Allowed values are integers in the range 0 to 15, inclusive. If this parameter is not used, a default of 15 MUST be assumed.

delayMode: Indicates a desire to use low-delay features, in which case the decoder will process received data accordingly based on this value. Permissible values are 2 and 4.

Encoding considerations: This media type is framed and contains binary data.

Security considerations: This media type does not carry active content. See Section 9 of this document.

Interoperability considerations: none

Published specification: ATRAC-X Standard Specification [10]

Applications that use this media type: Audio and video streaming and conferencing tools.

Additional information: none

   Magic number(s):  none
   File extension(s):  'atx', 'aa3', and 'omg'
   Macintosh file type code(s):  none
        

Person and email address to contact for further information: Mitsuyuki Hatanaka, Jun Matsumoto, actech@jp.sony.com

Intended usage: COMMON

Restrictions on usage: This media type depends on RTP framing, and hence is only defined for transfer via RTP.

Author: Mitsuyuki Hatanaka, Jun Matsumoto, actech@jp.sony.com

Change controller: IETF AVT WG delegated from the IESG
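
The permissible values listed in the ATRAC-X registration above lend themselves to a simple check on a set of received parameters. The sketch below is illustrative only: the function name, the dictionary representation, and the wording of the messages are assumptions. Unknown parameters are ignored, as the registration requires.

```python
ATRACX_RATES = {44100, 48000}
ATRACX_BASE_LAYERS = {32, 48, 64, 96, 128, 160, 192, 256, 320, 352}
ATRACX_DELAY_MODES = {2, 4}

def check_atracx_params(params):
    """Return a list of problems found in a dict of ATRAC-X parameters."""
    problems = []
    for name in ("rate", "baseLayer", "channelID"):
        if name not in params:
            problems.append(f"missing required parameter: {name}")
    # The fallbacks below are valid values, so a missing parameter is
    # reported once (above) rather than twice.
    if params.get("rate", 44100) not in ATRACX_RATES:
        problems.append("rate must be 44100 or 48000")
    if params.get("baseLayer", 32) not in ATRACX_BASE_LAYERS:
        problems.append("baseLayer is not a permitted bit-rate")
    if params.get("channelID", 0) not in range(8):
        problems.append("channelID must be in 0..7")
    # maxRedundantFrames defaults to 15 when absent.
    if params.get("maxRedundantFrames", 15) not in range(16):
        problems.append("maxRedundantFrames must be in 0..15")
    if "delayMode" in params and params["delayMode"] not in ATRACX_DELAY_MODES:
        problems.append("delayMode must be 2 or 4")
    return problems

print(check_atracx_params({"rate": 48000, "baseLayer": 128, "channelID": 2}))
print(check_atracx_params({"rate": 32000, "baseLayer": 128, "channelID": 2}))
```

The first call reports no problems; the second reports only the unsupported sampling rate.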

7.3. ATRAC Advanced Lossless Media Type Registration

The media subtype for the Adaptive TRansform Codec Lossless version (ATRAC Advanced Lossless) uses the template defined in RFC 4855 [6].

Note, any unknown parameter MUST be ignored by the receiver.

Type name: audio

Subtype name: ATRAC-ADVANCED-LOSSLESS

Required parameters:

rate: Represents the sampling frequency in Hz of the original audio data. For High-Speed Transfer mode, the only permissible value is 44100. For Standard mode, any of 24000, 32000, 44100, 48000, 64000, 88200, 96000, 176400, and 192000 can be used.

baseLayer: Indicates the encoded bit-rate in kbps for the base layer in High-Speed Transfer mode lossless encodings.

For Standard lossless mode, this value MUST be 0.

The permissible values for an ATRAC3 base layer are 66, 105, and 132. For an ATRAC-X base layer, they are 32, 48, 64, 96, 128, 160, 192, 256, 320, and 352.

blockLength: Indicates the block length. In High-Speed Transfer mode, the values 1024 and 2048 are used for ATRAC3-based and ATRAC-X-based ATRAC Advanced Lossless streaming, respectively.

Any of 512, 1024, and 2048 can be used for Standard mode.

channelID: Indicates the number of channels and the channel layout according to Table 1 in Section 7.4. Note that this layout is different from that proposed in RFC 3551 [3]. However, as channelID = 0 defines an ambiguous channel layout, the channel mapping defined in Section 4.1 of [3] could be used in this case. Permissible values are 0, 1, 2, 3, 4, 5, 6, and 7.

ptime: See RFC 4566 [2].

maxptime: See RFC 4566 [2]. In streaming of ATRAC Advanced Lossless, multiple frames cannot be transmitted in a single RTP packet because the frame size is large, so this parameter SHOULD be regarded as the time of one encoded frame on both the sender and the receiver side. The frame length of ATRAC Advanced Lossless is 512/44100 = 11.6... ms, 1024/44100 = 23.22... ms, or 2048/44100 = 46.44... ms, but a fractional value may not be applicable in the SDP definition. The value of this parameter MUST therefore be 12 ms, 24 ms, or 47 ms, respectively, to allow safe transmission.

Encoding considerations: This media type is framed and contains binary data.

Security considerations: This media type does not carry active content. See Section 9 of this document.

Interoperability considerations: none

Published specification: ATRAC Advanced Lossless Standard Specification [11]

Applications that use this media type: Audio and video streaming and conferencing tools.

Additional information: none

   Magic number(s):  none
   File extension(s):  'aal', 'aa3', and 'omg'
   Macintosh file type code(s):  none
        

Person and email address to contact for further information: Mitsuyuki Hatanaka, Jun Matsumoto, actech@jp.sony.com

Intended usage: COMMON

Restrictions on usage: This media type depends on RTP framing, and hence is only defined for transfer via RTP.

Author: Mitsuyuki Hatanaka, Jun Matsumoto, actech@jp.sony.com

Change controller: IETF AVT WG delegated from the IESG
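
The relationships among rate, baseLayer, and blockLength in the ATRAC Advanced Lossless registration above can be summarized in a small consistency check. This is a sketch under one notable assumption: Standard mode is recognized by baseLayer being 0, which the registration requires for Standard mode but does not state as the way to detect it. The function name is hypothetical.

```python
ATRAC3_LAYERS = {66, 105, 132}
ATRACX_LAYERS = {32, 48, 64, 96, 128, 160, 192, 256, 320, 352}
STANDARD_RATES = {24000, 32000, 44100, 48000, 64000,
                  88200, 96000, 176400, 192000}

def lossless_params_ok(rate, base_layer, block_length):
    """Check rate/baseLayer/blockLength consistency for ATRAC Advanced Lossless."""
    if base_layer == 0:                      # Standard mode (assumed marker)
        return rate in STANDARD_RATES and block_length in {512, 1024, 2048}
    if rate != 44100:                        # High-Speed Transfer mode
        return False
    if base_layer in ATRAC3_LAYERS:          # ATRAC3-based base layer
        return block_length == 1024
    if base_layer in ATRACX_LAYERS:          # ATRAC-X-based base layer
        return block_length == 2048
    return False

print(lossless_params_ok(44100, 132, 1024))   # prints True
print(lossless_params_ok(48000, 64, 2048))    # prints False
```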

7.4. Channel Mapping Configuration Table

Table 1 gives the mapping between the channelID passed during SDP negotiation and the speaker layout that the value represents.

            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            | channelID | Number of |  Default Speaker    |
            |           | Channels  |      Mapping        |
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |     0     |  max 64   |     undefined       |
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |     1     |     1     | front: center       |
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |     2     |     2     | front: left, right  |
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |     3     |     3     | front: left, right  |
            |           |           | front: center       |
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |     4     |     4     | front: left, right  |
            |           |           | front: center       |
            |           |           | rear: surround      |
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |     5     |    5+1    | front: left, right  |
            |           |           | front: center       |
            |           |           | rear: left, right   |
            |           |           | LFE                 |
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |     6     |    6+1    | front: left, right  |
            |           |           | front: center       |
            |           |           | rear: left, right   |
            |           |           | rear: center        |
            |           |           | LFE                 |
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |     7     |    7+1    | front: left, right  |
            |           |           | front: center       |
            |           |           | rear: left, right   |
            |           |           | side: left, right   |
            |           |           | LFE                 |
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        

Table 1. Channel Configuration
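
As an illustration only, Table 1 can be expressed as a lookup table. The following Python sketch is not part of the payload format; the names CHANNEL_CONFIG and channel_count are hypothetical.

```python
# Table 1 as a lookup: channelID -> (total channels, default speaker mapping).
# channelID 0 is omitted because its mapping is undefined (up to 64 channels).
CHANNEL_CONFIG = {
    1: (1, ["front: center"]),
    2: (2, ["front: left, right"]),
    3: (3, ["front: left, right", "front: center"]),
    4: (4, ["front: left, right", "front: center", "rear: surround"]),
    5: (6, ["front: left, right", "front: center", "rear: left, right", "LFE"]),
    6: (7, ["front: left, right", "front: center", "rear: left, right",
            "rear: center", "LFE"]),
    7: (8, ["front: left, right", "front: center", "rear: left, right",
            "side: left, right", "LFE"]),
}


def channel_count(channel_id):
    """Return the total number of channels (LFE included) for channelID 1..7."""
    if channel_id not in CHANNEL_CONFIG:
        raise ValueError(f"no default mapping for channelID {channel_id}")
    return CHANNEL_CONFIG[channel_id][0]
```

For example, channel_count(5) gives 6, which matches the "ATRAC-X/48000/6" rtpmap line used together with "channelID=5" in Section 7.8.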

7.5. Mapping Media Type Parameters into SDP

The information carried in the Media type specification has a specific mapping to fields in the Session Description Protocol (SDP) [2], which is commonly used to describe RTP sessions. When SDP is used to specify sessions employing the ATRAC family of codecs, the following mapping rules apply, depending on the ATRAC codec in use.

7.5.1. For Media Subtype ATRAC3

o The Media type ("audio") goes in SDP "m=" as the media name.

o The Media subtype (payload format name) goes in SDP "a=rtpmap" as the encoding name. ATRAC3 supports only mono or stereo signals, so the corresponding number of channels (1 or 2) MUST also be specified in this attribute.

o The "baseLayer" parameter goes in SDP "a=fmtp". This parameter MUST be present. "maxRedundantFrames" may follow, but if no value is transmitted, the receiver SHOULD assume a default value of "15".

o The parameters "ptime" and "maxptime" go in the SDP "a=ptime" and "a=maxptime" attributes, respectively.

7.5.2. For Media Subtype ATRAC-X

o The Media type ("audio") goes in SDP "m=" as the media name.

o The Media subtype (payload format name) goes in SDP "a=rtpmap" as the encoding name. This SHOULD be followed by the "sampleRate" (as the RTP clock rate), and then the actual number of channels regardless of the channelID parameter.

o The parameters "ptime" and "maxptime" go in the SDP "a=ptime" and "a=maxptime" attributes, respectively.

o Any remaining parameters go in the SDP "a=fmtp" attribute by copying them directly from the Media type string as a semicolon-separated list of parameter=value pairs. The "baseLayer" parameter MUST be the first entry on this line. The "channelID" parameter MUST be the next entry. The receiver MUST assume a default value of "15" for "maxRedundantFrames".
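
The ATRAC-X mapping rules above can be sketched as a small helper that emits the attribute lines in the required order. This is an illustration under assumptions, not a reference implementation: the function name atracx_sdp_lines and its arguments are hypothetical, and the "; " separator simply follows the examples in Section 7.8.

```python
def atracx_sdp_lines(pt, sample_rate, channels, base_layer, channel_id,
                     extra=None, maxptime=None):
    """Return the SDP a= lines for one ATRAC-X payload type."""
    # rtpmap: encoding name, then sampleRate as clock rate, then channel count.
    lines = [f"a=rtpmap:{pt} ATRAC-X/{sample_rate}/{channels}"]
    # fmtp: baseLayer first, channelID second, then any remaining parameters.
    params = [("baseLayer", base_layer), ("channelID", channel_id)]
    params += list((extra or {}).items())
    fmtp = "; ".join(f"{name}={value}" for name, value in params)
    lines.append(f"a=fmtp:{pt} {fmtp}")
    # maxptime is carried in its own attribute, not in fmtp.
    if maxptime is not None:
        lines.append(f"a=maxptime:{maxptime}")
    return lines
```

Calling atracx_sdp_lines(99, 44100, 2, 128, 2, {"delayMode": 2}, 47) reproduces the three attribute lines of the stereo example in Section 7.8.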

7.5.3. For Media Subtype ATRAC Advanced Lossless

o The Media type ("audio") goes in SDP "m=" as the media name.

o The Media subtype (payload format name) goes in SDP "a=rtpmap" as the encoding name. This MUST be followed by the "sampleRate" (as the RTP clock rate), and then the actual number of channels regardless of the channelID parameter.

o The parameters "ptime" and "maxptime" go in the SDP "a=ptime" and "a=maxptime" attributes, respectively.

o Any remaining parameters go in the SDP "a=fmtp" attribute by copying them directly from the Media type string as a semicolon-separated list of parameter=value pairs.

On this line, the parameters "baseLayer" and "blockLength" MUST be present in this order.

The value of "blockLength" MUST be 1024 when ATRAC3 is used as the base layer and 2048 when ATRAC-X is used as the base layer. If "baseLayer=0" (standard mode), "blockLength" MUST be one of 512, 1024, or 2048. The "channelID" parameter MUST be the next entry. The receiver MUST assume a default value of "15" for "maxRedundantFrames".
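
The "blockLength" constraints can be checked with a few lines of code. The sketch below is illustrative only; it assumes the caller already knows which codec forms the base layer and passes it as the hypothetical argument base_codec, since deriving the codec from the "baseLayer" value is outside this example.

```python
def check_block_length(base_layer, block_length, base_codec=None):
    """Return True if blockLength is allowed for the given base layer.

    base_codec is an assumed argument, "ATRAC3" or "ATRAC-X"; it is only
    needed when baseLayer is not 0.
    """
    if base_layer == 0:
        # Standard mode: any of the three block lengths is allowed.
        return block_length in (512, 1024, 2048)
    required = {"ATRAC3": 1024, "ATRAC-X": 2048}
    if base_codec not in required:
        raise ValueError("base_codec must be 'ATRAC3' or 'ATRAC-X'")
    return block_length == required[base_codec]
```

With this helper, check_block_length(128, 2048, "ATRAC-X") is accepted, while check_block_length(132, 2048, "ATRAC3") is rejected.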

7.6. Offer/Answer Model Considerations

Some options for encoding and decoding ATRAC audio data require the sender, the receiver, or both to comply with certain specifications. In order to establish an interoperable transmission framework, an Offer/Answer negotiation in SDP MUST observe the following considerations. (See [14].)

7.6.1. For All Three Media Subtypes

o Each combination of the RTP payload transport format configuration parameters (baseLayer and blockLength, sampleRate, channelID) is unique in its bit-pattern and not compatible with any other combination. When creating an offer in an application desiring to use the more advanced features (sample rates above 44100 Hz, more than two channels), the offerer SHOULD also offer a payload type containing only the lowest set of necessary requirements. If multiple configurations are of interest to the application, they may all be offered.

o The parameters "maxptime" and "ptime" will in most cases not affect interoperability; however, the setting of the parameters can affect the performance of the application. The SDP Offer/Answer handling of the "ptime" parameter is described in RFC 3264. The "maxptime" parameter MUST be handled in the same way.

7.6.2. For Media Subtype ATRAC3

o In response to an offer, downgraded subsets of "baseLayer" are possible. However, for best performance, we suggest the answer contain the highest possible values offered.

7.6.3. For Media Subtype ATRAC-X

o In response to an offer, downgraded subsets of "sampleRate", "baseLayer", and "channelID" are possible. For best performance, an answer MUST NOT contain any values requiring further capabilities than the offer contains, but it SHOULD provide values as close as possible to those in the offer.

o The "maxRedundantFrames" is a suggested minimum. This value MAY be increased in an answer (with a maximum of 15), but MUST NOT be reduced.

o The optional parameter "delayMode" is non-negotiable. If the Answerer cannot comply with the offered value, the session MUST be deemed inoperable.
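
The answer rules for "maxRedundantFrames" (not reduced, at most 15) and "delayMode" (non-negotiable) can be checked mechanically. The following sketch is one possible reading, not a normative procedure: the dict representation of the fmtp parameters and the name check_answer are hypothetical, and treating a missing "delayMode" in the answer as non-compliance is an assumption of this example.

```python
MAX_REDUNDANT_FRAMES_LIMIT = 15


def check_answer(offer, answer):
    """Return a list of rule violations found in an answer (empty if none).

    offer and answer are dicts of fmtp parameters, for example
    {"maxRedundantFrames": 2, "delayMode": 2}.
    """
    problems = []
    if "maxRedundantFrames" in offer and "maxRedundantFrames" in answer:
        if answer["maxRedundantFrames"] < offer["maxRedundantFrames"]:
            problems.append("maxRedundantFrames reduced")
        if answer["maxRedundantFrames"] > MAX_REDUNDANT_FRAMES_LIMIT:
            problems.append("maxRedundantFrames above 15")
    # Assumption: an absent or different delayMode means the answerer
    # cannot comply, so the session is treated as inoperable.
    if "delayMode" in offer and answer.get("delayMode") != offer["delayMode"]:
        problems.append("delayMode differs from the offer")
    return problems
```

An empty result means that none of these rules is violated; it does not validate the remaining parameters.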

7.6.4. For Media Subtype ATRAC Advanced Lossless

o In response to an offer, downgraded subsets of "sampleRate", "baseLayer", and "channelID" are possible. For best performance, an answer MUST NOT contain any values requiring further capabilities than the offer contains, but it SHOULD provide values as close as possible to those in the offer.

o There are no requirements when negotiating "blockLength", other than that both parties must be in agreement.

o The "maxRedundantFrames" is a suggested minimum. This value MAY be increased in an answer (with a maximum of 15), but MUST NOT be reduced.

o For transmission of scalable multi-session streaming of ATRAC Advanced Lossless content, the attributes of media stream identification, group information, and decoding dependency between the base layer stream and the enhancement layer stream MUST be signaled in SDP by the Offer/Answer model. In this case, the attributes "group", "mid", and "depend", each followed by the appropriate parameter, MUST be used in SDP [7] [8] in order to indicate layered coding dependency. The "group" attribute followed by the "DDP" parameter indicates the relationship between the base layer stream and the enhancement layer stream with decoding dependency. Each stream is identified by the "mid" attribute, and the dependency of the enhancement layer stream is defined by the "depend" attribute, as the enhancement layer is only useful when the base layer is available. Examples for signaling ATRAC Advanced Lossless decoding dependency are described in Sections 7.8 and 7.9.

7.7. Usage of Declarative SDP

In declarative usage, like SDP in Real-Time Streaming Protocol (RTSP) [15] or Session Announcement Protocol (SAP) [16], the parameters MUST be interpreted as follows:

o The payload format configuration parameters (baseLayer, sampleRate, channelID) are all declarative and a participant MUST use the configuration(s) provided for the session. More than one configuration may be provided if necessary by declaring multiple RTP payload types; however, the number of types SHOULD be kept small.

o Any "maxptime" and "ptime" values SHOULD be selected with care to ensure that the session's participants can achieve reasonable performance.

o The attributes "mid", "group", and "depend" MUST be used to indicate the relationship and dependency of the base layer and the enhancement layer in scalable multi-session streaming of ATRAC Advanced Lossless content, as described in Sections 7.6, 7.8, and 7.9.

7.8. Example SDP Session Descriptions

Example usage of ATRAC-X with stereo at 44100 Hz:

   v=0
   o=atrac 2465317890 2465317890 IN IP4 service.example.com
   s=ATRAC-X Streaming
   c=IN IP4 192.0.2.1/127
   t=3409539540 3409543140
   m=audio 49120 RTP/AVP 99
   a=rtpmap:99 ATRAC-X/44100/2
   a=fmtp:99 baseLayer=128; channelID=2; delayMode=2
   a=maxptime:47
        

Example usage of ATRAC-X with 5.1 setup at 48000 Hz:

   v=0
   o=atrac 2465317890 2465317890 IN IP4 service.example.com
   s=ATRAC-X 5.1ch Streaming
   c=IN IP4 192.0.2.1/127
   t=3409539540 3409543140
   m=audio 49120 RTP/AVP 99
   a=rtpmap:99 ATRAC-X/48000/6
   a=fmtp:99 baseLayer=320; channelID=5
   a=maxptime:43
        

Example usage of ATRAC-Advanced-Lossless in multiplexed High-Speed Transfer mode:

   v=0
   o=atrac 2465317890 2465317890 IN IP4 service.example.com
   s=AAL Multiplexed Streaming
   c=IN IP4 192.0.2.1/127
   t=3409539540 3409543140
   m=audio 49200 RTP/AVP 96
   a=rtpmap:96 ATRAC-ADVANCED-LOSSLESS/44100/2
   a=fmtp:96 baseLayer=128; blockLength=2048; channelID=2
   a=maxptime:47
        

Example usage of ATRAC-Advanced-Lossless in multi-session High-Speed Transfer mode. In this case, the base layer and the enhancement layer stream are identified by L1 and L2, respectively, and L2 depends on L1 in decoding.

   v=0
   o=atrac 2465317890 2465317890 IN IP4 service.example.com
   s=AAL Multi Session Streaming
   c=IN IP4 192.0.2.1/127
   t=3409539540 3409543140
   a=group:DDP L1 L2
   m=audio 49200 RTP/AVP 96
   a=rtpmap:96 ATRAC-ADVANCED-LOSSLESS/44100/2
   a=fmtp:96 baseLayer=128; blockLength=2048; channelID=2
   a=maxptime:47
   a=mid:L1
   m=audio 49202 RTP/AVP 97
   a=rtpmap:97 ATRAC-ADVANCED-LOSSLESS/44100/2
   a=fmtp:97 baseLayer=0; blockLength=2048; channelID=2
   a=maxptime:47
   a=mid:L2
   a=depend:97 lay L1:96
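
The "a=depend" line above can be read mechanically. The following Python sketch handles only the simple form used in this document (one payload type, one dependency type, and space-separated "mid:pt" targets); the full grammar is defined in [8], and the function name parse_depend is hypothetical.

```python
def parse_depend(line):
    """Parse 'a=depend:<pt> <type> <mid>:<pt>[,<pt>...]' (simple form only).

    Returns (payload_type, dependency_type, [(mid, [payload types]), ...]).
    """
    if not line.startswith("a=depend:"):
        raise ValueError("not an a=depend line")
    value = line[len("a=depend:"):]
    pt, dep_type, targets = value.split(" ", 2)
    deps = []
    for target in targets.split():
        # Each target names a media stream (mid) and the payload types
        # in that stream on which this payload type depends.
        mid, _, pts = target.partition(":")
        deps.append((mid, [int(p) for p in pts.split(",")]))
    return int(pt), dep_type, deps
```

Applied to "a=depend:97 lay L1:96", the function returns (97, "lay", [("L1", [96])]): payload type 97 has a layered dependency on payload type 96 of the stream identified as L1.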
        

Example usage of ATRAC-Advanced-Lossless in Standard mode:

   m=audio 49200 RTP/AVP 99
   a=rtpmap:99 ATRAC-ADVANCED-LOSSLESS/44100/2
   a=fmtp:99 baseLayer=0; blockLength=1024; channelID=2
   a=maxptime:24
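
The "maxptime" values in the examples of this section are consistent with the duration of one frame rounded up to a whole millisecond. The sketch below assumes, for illustration only, that a frame spans 2048 samples for ATRAC-X and "blockLength" samples for ATRAC Advanced Lossless; these assumptions and the helper name frame_duration_ms are not taken from the specification text.

```python
def frame_duration_ms(samples_per_frame, sample_rate):
    """Duration of one frame in milliseconds, rounded up.

    Uses integer arithmetic (negated floor division) to avoid float rounding.
    """
    return -(-samples_per_frame * 1000 // sample_rate)
```

Under these assumptions, frame_duration_ms(2048, 44100) is 47, frame_duration_ms(2048, 48000) is 43, and frame_duration_ms(1024, 44100) is 24, which are the values that appear in the "a=maxptime" lines above.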
        
7.9. Example Offer/Answer Exchange

The following Offer/Answer example shows how a desire to stream multi-channel content is turned down by the receiver, who answers with only the ability to receive stereo content:

Offer:

   m=audio 49170 RTP/AVP 98 99
   a=rtpmap:98 ATRAC-X/44100/6
   a=fmtp:98 baseLayer=320; channelID=5
   a=rtpmap:99 ATRAC-X/44100/2
   a=fmtp:99 baseLayer=160; channelID=2
        

Answer:

   m=audio 49170 RTP/AVP 99
   a=rtpmap:99 ATRAC-X/44100/2
   a=fmtp:99 baseLayer=160; channelID=2
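
The selection made by the answerer in this exchange can be sketched as a filter over the offered "a=rtpmap" lines. The receiver capability max_channels and both function names are hypothetical; a real answerer would also have to consider the sample rate and the "a=fmtp" parameters.

```python
def parse_rtpmap(line):
    """Split 'a=rtpmap:<pt> <name>/<rate>[/<channels>]' into its fields."""
    head, encoding = line.split(" ", 1)
    pt = int(head[len("a=rtpmap:"):])
    parts = encoding.split("/")
    # SDP allows the channel count to be omitted for single-channel audio.
    channels = int(parts[2]) if len(parts) > 2 else 1
    return pt, parts[0], int(parts[1]), channels


def acceptable_payload_types(offer_lines, max_channels):
    """Return the offered payload types the receiver is able to play."""
    accepted = []
    for line in offer_lines:
        if line.startswith("a=rtpmap:"):
            pt, _name, _rate, channels = parse_rtpmap(line)
            if channels <= max_channels:
                accepted.append(pt)
    return accepted
```

For the offer above and a stereo-only receiver (max_channels of 2), only payload type 99 remains, which is what the answer contains.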
        

The following Offer/Answer example shows the receiver answering with a selection of supported parameters:

Offer:

   m=audio 49170 RTP/AVP 97 98 99
   a=rtpmap:97 ATRAC-X/44100/2
   a=fmtp:97 baseLayer=128; channelID=2
   a=rtpmap:98 ATRAC-X/44100/6
   a=fmtp:98 baseLayer=128; channelID=5
   a=rtpmap:99 ATRAC-X/48000/6
   a=fmtp:99 baseLayer=320; channelID=5
        

Answer:

   m=audio 49170 RTP/AVP 97 98
   a=rtpmap:97 ATRAC-X/44100/2
   a=fmtp:97 baseLayer=128; channelID=2
   a=rtpmap:98 ATRAC-X/44100/6
   a=fmtp:98 baseLayer=128; channelID=5
        

The following Offer/Answer example shows an exchange that negotiates the use of ATRAC-Advanced-Lossless. The offer contains three options: multi-session High-Speed Transfer mode, multiplexed High-Speed Transfer mode, and Standard mode.

Offer:

// Multi-session High-Speed Transfer mode, L1 and L2 correspond to the base layer and the enhancement layer, respectively, and L2 depends on L1 in decoding.

   a=group:DDP L1 L2
   m=audio 49200 RTP/AVP 96
   a=rtpmap:96 ATRAC-ADVANCED-LOSSLESS/44100/2
   a=fmtp:96 baseLayer=132; blockLength=1024; channelID=2
   a=maxptime:24
   a=mid:L1
        
   m=audio 49202 RTP/AVP 97
   a=rtpmap:97 ATRAC-ADVANCED-LOSSLESS/44100/2
   a=fmtp:97 baseLayer=0; blockLength=2048; channelID=2
   a=maxptime:24
   a=mid:L2
   a=depend:97 lay L1:96
        
// Multiplexed High-Speed Transfer mode
   m=audio 49200 RTP/AVP 98
   a=rtpmap:98 ATRAC-ADVANCED-LOSSLESS/44100/2
   a=fmtp:98 baseLayer=256; blockLength=2048; channelID=2
   a=maxptime:47
        
// Standard mode
   m=audio 49200 RTP/AVP 99
   a=rtpmap:99 ATRAC-ADVANCED-LOSSLESS/44100/2
   a=fmtp:99 baseLayer=0; blockLength=2048; channelID=2
   a=maxptime:47
        

Answer:

   a=group:DDP L1 L2
   m=audio 49200 RTP/AVP 94
   a=rtpmap:94 ATRAC-ADVANCED-LOSSLESS/44100/2
   a=fmtp:94 baseLayer=132; blockLength=1024; channelID=2
   a=maxptime:24
   a=mid:L1
        
   m=audio 49202 RTP/AVP 95
   a=rtpmap:95 ATRAC-ADVANCED-LOSSLESS/44100/2
   a=fmtp:95 baseLayer=0; blockLength=2048; channelID=2
   a=maxptime:24
   a=mid:L2
   a=depend:95 lay L1:94
        

Note that the names of payload format (encoding) and Media subtypes are case-insensitive in both places. Similarly, parameter names are case-insensitive both in Media types and in the default mapping to the SDP a=fmtp attribute.
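
Because parameter names are case-insensitive, a parser for the "a=fmtp" attribute can normalize them before lookup. The following is a minimal sketch; the function name parse_fmtp and the choice to lower-case the names are assumptions of this example, and values are returned as strings without further validation.

```python
def parse_fmtp(line):
    """Parse 'a=fmtp:<pt> name=value; name=value' into (pt, params).

    Parameter names are lower-cased so that lookups are case-insensitive.
    """
    head, _, rest = line.partition(" ")
    pt = int(head.split(":", 1)[1])
    params = {}
    for item in rest.split(";"):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing semicolon
        name, _, value = item.partition("=")
        params[name.strip().lower()] = value.strip()
    return pt, params
```

With this normalization, "baseLayer=128", "BASELAYER=128", and "baselayer=128" all yield the same entry under the key "baselayer".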

8. IANA Considerations

Three new Media subtypes, audio/ATRAC3, audio/ATRAC-X, and audio/ATRAC-ADVANCED-LOSSLESS, have been registered (see Section 7).

9. Security Considerations

The payload format as described in this document is subject to the security considerations defined in RFC 3550 [1] and any applicable profile, for example, RFC 3551 [3]. Also, the security of Media type registration MUST be taken into account as described in Section 5 of RFC 4855 [6].

The payload for the ATRAC family consists solely of compressed audio data to be decoded and presented as sound, and the standard specifications of ATRAC3, ATRAC-X, and ATRAC Advanced Lossless [9] [10] [11] strictly define the bit stream syntax and the buffer model on the decoder side for each codec. Therefore, the payload cannot carry "active content" that could impose malicious side effects upon the receiver, and it does not cause illegal resource consumption on the receiver side, as long as the bit streams conform to their standard specifications.

This payload format does not implement any security mechanisms of its own. Confidentiality, integrity protection, and authentication have to be provided by a mechanism external to this payload format, e.g., SRTP (RFC 3711 [13]).

10. Considerations on Correct Decoding
10.1. Verification of the Packets

Verification of the received encoded audio packets MUST be performed so as to ensure correct decoding of the packets. As the most basic check, an implementation can compare the packet size with the payload length. If the UDP packet length is longer than the RTP packet length, the packet can be accepted, but the extra bytes MUST be ignored. If a shorter UDP packet or an improperly encoded packet is received, the packet MUST be discarded.
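
The length comparison described above can be sketched as follows. The example assumes that the receiver already knows the expected RTP packet length (for instance, from the payload header and the frame size); how that length is obtained is outside this sketch, and the function name verify_payload is hypothetical.

```python
def verify_payload(udp_payload, rtp_length):
    """Return the bytes to hand to the decoder, or None to discard the packet.

    udp_payload: bytes received in the UDP datagram.
    rtp_length:  expected RTP packet length in bytes (assumed to be known).
    """
    if len(udp_payload) < rtp_length:
        return None                      # shorter than expected: discard
    return udp_payload[:rtp_length]      # accept, ignoring any extra bytes
```

A packet of exactly the expected length is passed through unchanged; improperly encoded packets still have to be detected separately.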

10.2. Validity Checking of the Packets

Also, validity checking of the received audio packets MUST be performed. It can be carried out by the decoding process, as the ATRAC format is designed so that the validity of data frames can be determined by the decoding algorithm. The required decoder response to a malformed frame is to discard the malformed data and conceal the errors in the audio output until a valid frame is detected and decoded. This is expected to prevent crashes and other abnormal decoder behavior in response to errors or attacks.
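
The required response to malformed frames can be illustrated with a decoding loop. Everything in this sketch is hypothetical: the decode callback stands in for a real ATRAC decoder, MalformedFrame is an invented exception type, and silence is used as the simplest possible concealment, which a real decoder may replace with a better method.

```python
class MalformedFrame(Exception):
    """Raised by the (hypothetical) decoder when a frame fails validation."""


def decode_with_concealment(frames, decode, samples_per_frame):
    """Decode frames in order, replacing each malformed frame with silence.

    decode is a caller-supplied function that returns a list of PCM samples
    or raises MalformedFrame.
    """
    output = []
    for frame in frames:
        try:
            pcm = decode(frame)
        except MalformedFrame:
            # Discard the malformed data and conceal the gap.
            pcm = [0] * samples_per_frame
        output.append(pcm)
    return output
```

The loop keeps running after a malformed frame, so normal output resumes as soon as the next valid frame is decoded.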

11. References
11.1. Normative References

[1] Schulzrinne, H., Casner, S., Frederick, R., and V. Jacobson, "RTP: A Transport Protocol for Real-Time Applications", STD 64, RFC 3550, July 2003.

[2] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session Description Protocol", RFC 4566, July 2006.

[3] Schulzrinne, H. and S. Casner, "RTP Profile for Audio and Video Conferences with Minimal Control", STD 65, RFC 3551, July 2003.

[4] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[5] Freed, N. and J. Klensin, "Media Type Specifications and Registration Procedures", BCP 13, RFC 4288, December 2005.

[6] Casner, S., "Media Type Registration of RTP Payload Formats", RFC 4855, February 2007.

[7] Camarillo, G., Eriksson, G., Holler, J., and H. Schulzrinne, "Grouping of Media Lines in the Session Description Protocol (SDP)", RFC 3388, December 2002.

[8] Schierl, T., and S. Wenger, "Signaling Media Decoding Dependency in the Session Description Protocol (SDP)", RFC 5583, July 2009.

[9] ATRAC3 Standard Specification ver.1.1, Sony Corporation, 2003.

[10] ATRAC-X Standard Specification ver.1.2, Sony Corporation, 2004.

[11] ATRAC Advanced Lossless Standard Specification ver.1.1, Sony Corporation, 2007.

11.2. Informative References

[12] Perkins, C., Kouvelas, I., Hodson, O., Hardman, V., Handley, M., Bolot, J., Vega-Garcia, A., and S. Fosse-Parisis, "RTP Payload for Redundant Audio Data", RFC 2198, September 1997.

[13] Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K. Norrman, "The Secure Real-time Transport Protocol (SRTP)", RFC 3711, March 2004.

[14] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model with Session Description Protocol (SDP)", RFC 3264, June 2002.

[15] Schulzrinne, H., Rao, A., and R. Lanphier, "Real Time Streaming Protocol (RTSP)", RFC 2326, April 1998.

[16] Handley, M., Perkins, C., and E. Whelan, "Session Announcement Protocol", RFC 2974, October 2000.

Authors' Addresses

Mitsuyuki Hatanaka Sony Corporation, Japan 1-7-1 Konan Minato-ku Tokyo 108-0075 Japan

   EMail: actech@jp.sony.com
        

Jun Matsumoto Sony Corporation, Japan 1-7-1 Konan Minato-ku Tokyo 108-0075 Japan

   EMail: actech@jp.sony.com
        