Network Working Group                                        S. Pfeiffer
Request for Comments: 3533                                         CSIRO
Category: Informational                                         May 2003
        

The Ogg Encapsulation Format Version 0

Status of this Memo

This memo provides information for the Internet community. It does not specify an Internet standard of any kind. Distribution of this memo is unlimited.

Copyright Notice

Copyright (C) The Internet Society (2003). All Rights Reserved.

Abstract

This document describes the Ogg bitstream format version 0, which is a general, freely-available encapsulation format for media streams. It is able to encapsulate any kind and number of video and audio encoding formats as well as other data streams in a single bitstream.

Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14, RFC 2119 [2].

Table of Contents

   1. Introduction
   2. Definitions
   3. Requirements for a generic encapsulation format
   4. The Ogg bitstream format
   5. The encapsulation process
   6. The Ogg page format
   7. Security Considerations
   8. References
   A. Glossary of terms and abbreviations
   B. Acknowledgements
      Author's Address
      Full Copyright Statement
        
1. Introduction

The Ogg bitstream format has been developed as a part of a larger project aimed at creating a set of components for the coding and decoding of multimedia content (codecs) which are to be freely available and freely re-implementable, both in software and in hardware for the computing community at large, including the Internet community. It is the intention of the Ogg developers represented by Xiph.Org that it be usable without intellectual property concerns.

This document describes the Ogg bitstream format and how to use it to encapsulate one or several media bitstreams created by one or several encoders. The Ogg transport bitstream is designed to provide framing, error protection and seeking structure for higher-level codec streams that consist of raw, unencapsulated data packets, such as the Vorbis audio codec or the upcoming Tarkin and Theora video codecs. It is capable of interleaving different binary media and other time-continuous data streams that are prepared by an encoder as a sequence of data packets. Ogg provides enough information to properly separate data back into such encoder created data packets at the original packet boundaries without relying on decoding to find packet boundaries.

Please note that the MIME type application/ogg has been registered with the IANA [1].

2. Definitions

For describing the Ogg encapsulation process, a set of terms will be used whose meaning needs to be well understood. Therefore, some of the most fundamental terms are defined now before we start with the description of the requirements for a generic media stream encapsulation format, the process of encapsulation, and the concrete format of the Ogg bitstream. See the Appendix for a more complete glossary.

The result of an Ogg encapsulation is called the "Physical (Ogg) Bitstream". It encapsulates one or several encoder-created bitstreams, which are called "Logical Bitstreams". A logical bitstream, provided to the Ogg encapsulation process, has a structure, i.e., it is split up into a sequence of so-called "Packets". The packets are created by the encoder of that logical bitstream and represent meaningful entities for that encoder only (e.g., an uncompressed stream may use video frames as packets). They do not contain boundary information - strung together they appear to be streams of random bytes with no landmarks.

Please note that the term "packet" is not used in this document to signify entities for transport over a network.

3. Requirements for a generic encapsulation format

The design idea behind Ogg was to provide a generic, linear media transport format to enable both file-based storage and stream-based transmission of one or several interleaved media streams independent of the encoding format of the media data. Such an encapsulation format needs to provide:

o framing for logical bitstreams.

o interleaving of different logical bitstreams.

o detection of corruption.

o recapture after a parsing error.

o position landmarks for direct random access of arbitrary positions in the bitstream.

o streaming capability (i.e., no seeking is needed to build a 100% complete bitstream).

o small overhead (i.e., use no more than approximately 1-2% of bitstream bandwidth for packet boundary marking, high-level framing, sync and seeking).

o simplicity to enable fast parsing.

o simple concatenation mechanism of several physical bitstreams.

Ogg was designed to meet all of these requirements. Ogg supports framing and interleaving of logical bitstreams, seeking landmarks, detection of corruption, and stream resynchronisation after a parsing error with no more than approximately 1-2% overhead. It is a generic framework to perform encapsulation of time-continuous bitstreams. It does not know any specifics about the codec data that it encapsulates and is thus independent of any media codec.

4. The Ogg bitstream format

A physical Ogg bitstream consists of multiple logical bitstreams interleaved in so-called "Pages". Whole pages are taken in order from multiple logical bitstreams multiplexed at the page level. The logical bitstreams are identified by a unique serial number in the header of each page of the physical bitstream. This unique serial number is created randomly and does not have any connection to the content or encoder of the logical bitstream it represents. Pages of all logical bitstreams are concurrently interleaved, but they need not be in a regular order - they are only required to be consecutive within the logical bitstream. Ogg demultiplexing reconstructs the original logical bitstreams from the physical bitstream by taking the pages in order from the physical bitstream and redirecting them into the appropriate logical decoding entity.

Each Ogg page contains only one type of data as it belongs to one logical bitstream only. Pages are of variable size and have a page header containing encapsulation and error recovery information. Each logical bitstream in a physical Ogg bitstream starts with a special start page (bos=beginning of stream) and ends with a special page (eos=end of stream).
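
The demultiplexing step described above can be sketched roughly as follows. This is only an illustration: the `(serial_number, payload)` tuples are a hypothetical in-memory stand-in for parsed pages, and `demultiplex` is not a name taken from any Ogg library.

```python
from collections import defaultdict

def demultiplex(pages):
    """Group pages by bitstream serial number, keeping physical order.

    `pages` is an iterable of (serial_number, payload) tuples, a
    hypothetical stand-in for pages already parsed from the stream.
    """
    streams = defaultdict(list)
    for serial, payload in pages:
        # Pages of one logical bitstream stay in the order they arrived.
        streams[serial].append(payload)
    return dict(streams)

# Two logical bitstreams (serials 0x1111 and 0x2222) interleaved irregularly.
physical = [
    (0x1111, b"a0"), (0x2222, b"b0"), (0x1111, b"a1"),
    (0x1111, b"a2"), (0x2222, b"b1"),
]
logical = demultiplex(physical)
```

Because a page belongs to exactly one logical bitstream, the serial number alone is enough to route it; no codec knowledge is needed at this layer.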

The bos page contains information to uniquely identify the codec type and MAY contain information to set up the decoding process. The bos page SHOULD also contain information about the encoded media - for example, for audio, it should contain the sample rate and number of channels. By convention, the first bytes of the bos page contain magic data that uniquely identifies the required codec. It is the responsibility of anyone fielding a new codec to make sure it is possible to reliably distinguish his/her codec from all other codecs in use. There is no fixed way to detect the end of the codec-identifying marker. The format of the bos page is dependent on the codec and therefore MUST be given in the encapsulation specification of that logical bitstream type. Ogg also allows but does not require secondary header packets after the bos page for logical bitstreams and these must also precede any data packets in any logical bitstream. These subsequent header packets are framed into an integral number of pages, which will not contain any data packets. So, a physical bitstream begins with the bos pages of all logical bitstreams containing one initial header packet per page, followed by the subsidiary header packets of all streams, followed by pages containing data packets.

The encapsulation specification for one or more logical bitstreams is called a "media mapping". An example for a media mapping is "Ogg Vorbis", which uses the Ogg framework to encapsulate Vorbis-encoded audio data for stream-based storage (such as files) and transport (such as TCP streams or pipes). Ogg Vorbis provides the name and revision of the Vorbis codec, the audio rate and the audio quality on the Ogg Vorbis bos page. It also uses two additional header pages per logical bitstream. The Ogg Vorbis bos page starts with the byte 0x01, followed by "vorbis" (a total of 7 bytes of identifier).
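
As a small illustration of codec identification by magic bytes, the check below tests for the 7-byte Ogg Vorbis identifier given above. It assumes `bos_packet` holds the packet data carried by the bos page (that is, the bytes after the page header); the function name is hypothetical.

```python
# Byte 0x01 followed by the ASCII string "vorbis": 7 bytes in total.
VORBIS_MAGIC = b"\x01vorbis"

def looks_like_vorbis(bos_packet: bytes) -> bool:
    """Return True if the first packet starts with the Vorbis identifier."""
    return bos_packet.startswith(VORBIS_MAGIC)
```

A real demultiplexer would try the identifiers of every codec it supports in turn, since there is no fixed length for the codec-identifying marker.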

Ogg knows two types of multiplexing: concurrent multiplexing (so-called "Grouping") and sequential multiplexing (so-called "Chaining"). Grouping defines how to interleave several logical bitstreams page-wise in the same physical bitstream. Grouping is for example needed for interleaving a video stream with several synchronised audio tracks using different codecs in different logical bitstreams. Chaining on the other hand, is defined to provide a simple mechanism to concatenate physical Ogg bitstreams, as is often needed for streaming applications.

In grouping, all bos pages of all logical bitstreams MUST appear together at the beginning of the Ogg bitstream. The media mapping specifies the order of the initial pages. For example, the grouping of a specific Ogg video and Ogg audio bitstream may specify that the physical bitstream MUST begin with the bos page of the logical video bitstream, followed by the bos page of the audio bitstream. Unlike bos pages, eos pages for the logical bitstreams need not all occur contiguously. Eos pages may be 'nil' pages, that is, pages containing no content but simply a page header with position information and the eos flag set in the page header. Each grouped logical bitstream MUST have a unique serial number within the scope of the physical bitstream.
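
The two MUST rules for grouping can be checked mechanically. The sketch below assumes each page of one grouped bitstream has been reduced to a hypothetical `(serial_number, is_bos)` tuple; it does not cover any ordering that a particular media mapping may add.

```python
def check_group(pages):
    """Check the grouping rules on a list of (serial, is_bos) tuples.

    Returns True if every bos page precedes every non-bos page and no
    serial number is used by more than one bos page.
    """
    seen_data_page = False
    started = set()
    for serial, is_bos in pages:
        if is_bos:
            # A bos page after data, or a reused serial, breaks the rules.
            if seen_data_page or serial in started:
                return False
            started.add(serial)
        else:
            # A page of a bitstream that never had a bos page is invalid.
            if serial not in started:
                return False
            seen_data_page = True
    return True
```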

In chaining, complete logical bitstreams are concatenated. The bitstreams do not overlap, i.e., the eos page of a given logical bitstream is immediately followed by the bos page of the next. Each chained logical bitstream MUST have a unique serial number within the scope of the physical bitstream.

It is possible to consecutively chain groups of concurrently multiplexed bitstreams. The groups, when unchained, MUST stand on their own as a valid concurrently multiplexed bitstream. The following diagram shows a schematic example of such a physical bitstream that obeys all the rules of both grouped and chained multiplexed bitstreams.

               physical bitstream with pages of
          different logical bitstreams grouped and chained
      -------------------------------------------------------------
      |*A*|*B*|*C*|A|A|C|B|A|B|#A#|C|...|B|C|#B#|#C#|*D*|D|...|#D#|
      -------------------------------------------------------------
       bos bos bos             eos           eos eos bos       eos
        

In this example, there are two chained physical bitstreams, the first of which is a grouped stream of three logical bitstreams A, B, and C. The second physical bitstream is chained after the end of the grouped bitstream, which ends after the last eos page of all its grouped logical bitstreams. As can be seen, grouped bitstreams begin together - all of the bos pages MUST appear before any data pages. It can also be seen that pages of concurrently multiplexed bitstreams need not conform to a regular order. And it can be seen that a grouped bitstream can end long before the other bitstreams in the group end.

Ogg does not know any specifics about the codec data except that each logical bitstream belongs to a different codec, the data from the codec comes in order and has position markers (so-called "Granule positions"). Ogg does not have a concept of 'time': it only knows about sequentially increasing, unitless position markers. An application can only get temporal information through higher layers which have access to the codec APIs to assign and convert granule positions or time.
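
To make the layering concrete, here is a minimal sketch of the kind of conversion a higher layer might perform. It assumes a hypothetical audio mapping in which the granule position counts PCM samples; the sample rate would come from the codec headers, not from Ogg itself.

```python
def granule_to_seconds(granule_position: int, sample_rate: int) -> float:
    """Convert a sample-count granule position to seconds.

    Assumption: the media mapping defines the granule position as the
    number of PCM samples decoded so far. Other mappings differ.
    """
    if granule_position < 0:
        # A negative value carries no position information.
        raise ValueError("granule position carries no position")
    return granule_position / sample_rate
```

The Ogg layer only stores and orders the integer; the division by `sample_rate` is knowledge that belongs to the codec mapping.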

A specific definition of a media mapping using Ogg may put further constraints on its specific use of the Ogg bitstream format. For example, a specific media mapping may require that all the eos pages for all grouped bitstreams need to appear in direct sequence. An example for a media mapping is the specification of "Ogg Vorbis". Another example is the upcoming "Ogg Theora" specification which encapsulates Theora-encoded video data and usually comes multiplexed with a Vorbis stream for an Ogg containing synchronised audio and video. As Ogg does not specify temporal relationships between the encapsulated concurrently multiplexed bitstreams, the temporal synchronisation between the audio and video stream will be specified in this media mapping. To enable streaming, pages from various logical bitstreams will typically be interleaved in chronological order.

5. The encapsulation process

The process of multiplexing different logical bitstreams happens at the level of pages as described above. The bitstreams provided by encoders are however handed over to Ogg as so-called "Packets" with packet boundaries dependent on the encoding format. The process of encapsulating packets into pages will be described now.

From Ogg's perspective, packets can be of any arbitrary size. A specific media mapping will define how to group or break up packets from a specific media encoder. As Ogg pages have a maximum size of about 64 kBytes, sometimes a packet has to be distributed over several pages. To simplify that process, Ogg divides each packet into 255 byte long chunks plus a final shorter chunk. These chunks are called "Ogg Segments". They are only a logical construct and do not have a header for themselves.

A group of contiguous segments is wrapped into a variable length page preceded by a header. A segment table in the page header tells about the "Lacing values" (sizes) of the segments included in the page. A flag in the page header tells whether a page contains a packet continued from a previous page. Note that a lacing value of 255 implies that a second lacing value follows in the packet, and a value of less than 255 marks the end of the packet after that many additional bytes. A packet of 255 bytes (or a multiple of 255 bytes) is terminated by a lacing value of 0. Note also that a 'nil' (zero length) packet is not an error; it consists of nothing more than a lacing value of zero in the header.
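
The lacing rule can be sketched in a few lines. Both function names are illustrative, and `packet_lengths` assumes the segment table covers only whole packets (no packet continued from or onto another page).

```python
def lacing_values(packet_length: int):
    """Lacing values for one packet: 255 per full segment, then the rest.

    A length that is a multiple of 255 (including 0) ends with a 0.
    """
    full_segments, remainder = divmod(packet_length, 255)
    return [255] * full_segments + [remainder]

def packet_lengths(segment_table):
    """Recover packet lengths from lacing values of complete packets."""
    lengths, current = [], 0
    for value in segment_table:
        current += value
        if value < 255:  # a value below 255 terminates the packet
            lengths.append(current)
            current = 0
    return lengths
```

For instance, a 600-byte packet yields `[255, 255, 90]`, a 255-byte packet yields `[255, 0]`, and a zero-length packet yields `[0]`.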

The encoding is optimized for speed and the expected case of the majority of packets being between 50 and 200 bytes large. This is a design justification rather than a recommendation. This encoding avoids imposing a maximum packet size while also imposing only minimum overhead on small packets. In contrast, e.g., simply using two bytes at the head of every packet and having a max packet size of 32 kBytes would always penalize small packets (< 255 bytes, the typical case) with twice the segmentation overhead. Using the lacing values as suggested, small packets see the minimum possible byte-aligned overhead (1 byte) and large packets (>512 bytes) see a fairly constant ~0.5% overhead on encoding space.
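
The overhead figures can be reproduced with simple arithmetic. The sketch below counts only the lacing bytes of a packet; it deliberately ignores the fixed part of the page header, which is shared by all packets on a page, so the totals it gives are a lower bound.

```python
def lacing_overhead(packet_length: int):
    """Return (lacing bytes, lacing bytes / payload bytes) for one packet."""
    lacing_bytes = packet_length // 255 + 1
    return lacing_bytes, lacing_bytes / packet_length

for size in (50, 200, 1000, 10000):
    count, ratio = lacing_overhead(size)
    print(f"{size:>6} bytes: {count} lacing byte(s), {ratio:.2%}")
# Prints 1 byte (2.00%), 1 byte (0.50%), 4 bytes (0.40%), 40 bytes (0.40%).
```

A packet below 255 bytes always costs exactly one lacing byte, and for large packets the ratio settles near 1/255, roughly 0.4%, before the page header is added.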

The following diagram shows a schematic example of a media mapping using Ogg and grouped logical bitstreams:

          logical bitstream with packet boundaries
 -----------------------------------------------------------------
 > |       packet_1             | packet_2         | packet_3 |  <
 -----------------------------------------------------------------
        

                    |
                    |segmentation (logically only)
                    v

      packet_1 (5 segments)          packet_2 (4 segs)    p_3 (2 segs)
     ------------------------------ -------------------- ------------
 ..  |seg_1|seg_2|seg_3|seg_4|s_5 | |seg_1|seg_2|seg_3|| |seg_1|s_2 | ..
     ------------------------------ -------------------- ------------
        

                    |
                    | page encapsulation
                    v

 page_1 (packet_1 data)   page_2 (pket_1 data)   page_3 (packet_2 data)
------------------------  ----------------  ------------------------
|H|------------------- |  |H|----------- |  |H|------------------- |
|D||seg_1|seg_2|seg_3| |  |D|seg_4|s_5 | |  |D||seg_1|seg_2|seg_3| | ...
|R|------------------- |  |R|----------- |  |R|------------------- |
------------------------  ----------------  ------------------------
        
                    |
pages of            |
other    --------|  |
logical         -------
bitstreams      | MUX |
                -------
                   |
                   v
        
              page_1  page_2          page_3
      ------  ------  -------  -----  -------
 ...  ||   |  ||   |  ||    |  ||  |  ||    |  ...
      ------  ------  -------  -----  -------
              physical Ogg bitstream
        

In this example we take a snapshot of the encapsulation process of one logical bitstream. We can see part of that bitstream's subdivision into packets as provided by the codec. The Ogg encapsulation process chops up the packets into segments. The packets in this example are rather large such that packet_1 is split into 5 segments - 4 segments with 255 bytes and a final smaller one. Packet_2 is split into 4 segments - 3 segments with 255 bytes and a final very small one - and packet_3 is split into two segments. The encapsulation process then creates pages, which are quite small in this example. Page_1 consists of the first three segments of packet_1, page_2 contains the remaining 2 segments from packet_1, and page_3 contains the first three segments of packet_2. Finally, this logical bitstream is multiplexed into a physical Ogg bitstream with pages of other logical bitstreams.

6. The Ogg page format

A physical Ogg bitstream consists of a sequence of concatenated pages. Pages are of variable size, usually 4-8 kB, maximum 65307 bytes. A page header contains all the information needed to demultiplex the logical bitstreams out of the physical bitstream and to perform basic error recovery and landmarks for seeking. Each page is a self-contained entity such that the page decode mechanism can recognize, verify, and handle single pages at a time without requiring the overall bitstream.

The Ogg page header has the following format:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| capture_pattern: Magic number for page start "OggS"           | 0-3
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| version       | header_type   | granule_position              | 4-7
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               | 8-11
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                               | bitstream_serial_number       | 12-15
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                               | page_sequence_number          | 16-19
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                               | CRC_checksum                  | 20-23
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                               |page_segments  | segment_table | 24-27
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| ...                                                           | 28-
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        

The LSb (least significant bit) comes first in the Bytes. Fields with more than one byte length are encoded LSB (least significant byte) first.
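
As a quick illustration of the byte order, the snippet below decodes a multi-byte field that arrives least significant byte first. The sample bytes are made up for the example.

```python
# Four bytes in the order they appear in the stream.
raw = bytes([0x78, 0x56, 0x34, 0x12])

# Least significant byte first means little-endian decoding.
value = int.from_bytes(raw, byteorder="little")
assert value == 0x12345678

# An 8-byte field of all ones reads as -1 when decoded as two's complement.
minus_one = int.from_bytes(b"\xff" * 8, byteorder="little", signed=True)
assert minus_one == -1
```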

The fields in the page header have the following meaning:

1. capture_pattern: a 4 Byte field that signifies the beginning of a page. It contains the magic numbers:

0x4f 'O'

0x67 'g'

0x67 'g'

0x53 'S'

It helps a decoder to find the page boundaries and regain synchronisation after parsing a corrupted stream. Once the capture pattern is found, the decoder verifies page sync and integrity by computing and comparing the checksum.

2. stream_structure_version: 1 Byte signifying the version number of the Ogg file format used in this stream (this document specifies version 0).

3. header_type_flag: the bits in this 1 Byte field identify the specific type of this page.

* bit 0x01

  set: page contains data of a packet continued from the previous page

  unset: page contains a fresh packet

* bit 0x02

  set: this is the first page of a logical bitstream (bos)

  unset: this page is not a first page

* bit 0x04

  set: this is the last page of a logical bitstream (eos)

  unset: this page is not a last page

4. granule_position: an 8 Byte field containing position information. For example, for an audio stream, it MAY contain the total number of PCM samples encoded after including all frames finished on this page. For a video stream it MAY contain the total number of video frames encoded after this page. This is a hint for the decoder and gives it some timing and position information. Its meaning is dependent on the codec for that logical bitstream and specified in a specific media mapping. A special value of -1 (in two's complement) indicates that no packets finish on this page.

5. bitstream_serial_number: a 4 Byte field containing the unique serial number by which the logical bitstream is identified.

6. page_sequence_number: a 4 Byte field containing the sequence number of the page so the decoder can identify page loss. This sequence number is increasing on each logical bitstream separately.

7. CRC_checksum: a 4 Byte field containing a 32 bit CRC checksum of the page (including header with zero CRC field and page content). The generator polynomial is 0x04c11db7.

8. number_page_segments: 1 Byte giving the number of segment entries encoded in the segment table.

9. segment_table: number_page_segments Bytes containing the lacing values of all segments in this page. Each Byte contains one lacing value.

   The total header size in bytes is given by:
   header_size = number_page_segments + 27 [Byte]
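
The header fields listed above can be decoded in a few lines. The following Python sketch is illustrative only: the names PageHeader and parse_page_header are hypothetical, and it assumes the first three header fields defined earlier in this section (the 4 Byte capture pattern "OggS", the 1 Byte stream_structure_version and the 1 Byte header_type_flag) as well as the LSB-first byte order of multi-byte fields.

```python
import struct
from dataclasses import dataclass

# Fixed 27-byte part of the page header; "<" selects LSB-first fields.
# 4s capture pattern, B version, B flags, q granule position (signed),
# I serial number, I sequence number, I CRC, B number_page_segments.
_FIXED = struct.Struct("<4sBBqIIIB")


@dataclass
class PageHeader:  # hypothetical container, not defined by this document
    version: int
    continued: bool          # header_type_flag bit 0x01
    bos: bool                # header_type_flag bit 0x02
    eos: bool                # header_type_flag bit 0x04
    granule_position: int    # -1 means no packet finishes on this page
    serial_number: int
    sequence_number: int
    crc: int
    lacing_values: bytes     # the segment table, one value per Byte

    @property
    def header_size(self) -> int:
        # header_size = number_page_segments + 27 [Byte]
        return 27 + len(self.lacing_values)


def parse_page_header(buf: bytes) -> PageHeader:
    """Decode the page header found at the start of buf."""
    if len(buf) < _FIXED.size:
        raise ValueError("buffer shorter than the fixed header")
    magic, version, flags, granule, serial, seq, crc, nsegs = \
        _FIXED.unpack_from(buf)
    if magic != b"OggS":
        raise ValueError("capture pattern not found")
    table = buf[27:27 + nsegs]
    if len(table) != nsegs:
        raise ValueError("segment table truncated")
    return PageHeader(version, bool(flags & 0x01), bool(flags & 0x02),
                      bool(flags & 0x04), granule, serial, seq, crc, table)
```

Reading granule_position as a signed 64 bit integer lets the special value -1 be compared directly; a real decoder would hand any other value to the media mapping for interpretation.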

   The total page size in Bytes is given by:
   page_size = header_size + sum(lacing_values: 1..number_page_segments)
   [Byte]
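
The two size formulas and the CRC_checksum field can be combined to check a page for corruption. The sketch below is a minimal, unoptimised illustration with hypothetical function names. This document only states the generator polynomial; the remaining CRC parameters used here (register initialised to zero, bits processed most significant first, no final inversion, checksum stored LSB first) are assumptions taken from the behaviour of the libogg reference implementation.

```python
def ogg_crc32(data: bytes) -> int:
    """Bitwise CRC with generator polynomial 0x04c11db7 (assumed: init 0,
    MSB-first, no final XOR)."""
    crc = 0
    for byte in data:
        crc ^= byte << 24
        for _ in range(8):
            if crc & 0x80000000:
                crc = ((crc << 1) ^ 0x04C11DB7) & 0xFFFFFFFF
            else:
                crc = (crc << 1) & 0xFFFFFFFF
    return crc


def page_size(page: bytes) -> int:
    """Total page size; page must start at a capture pattern and hold
    at least the complete header."""
    number_page_segments = page[26]
    header_size = number_page_segments + 27
    lacing_values = page[27:header_size]
    return header_size + sum(lacing_values)


def crc_matches(page: bytes) -> bool:
    """Recompute the checksum over the page with the CRC field zeroed."""
    size = page_size(page)
    stored = int.from_bytes(page[22:26], "little")
    zeroed = page[:22] + b"\x00\x00\x00\x00" + page[26:size]
    return ogg_crc32(zeroed) == stored
```

A table-driven CRC would be faster; the bitwise loop is used here only because it makes the role of the polynomial visible.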
        
7. Security Considerations

The Ogg encapsulation format is a container format and only encapsulates content (such as Vorbis-encoded audio). It does not provide for any generic encryption or signing of itself or its contained content bitstreams. However, it encapsulates any kind of content bitstream as long as there is a codec for it, and is thus able to contain encrypted and signed content data. It is also possible to add an external security mechanism that encrypts or signs an Ogg physical bitstream and thus provides content confidentiality and authenticity.

As Ogg encapsulates binary data, it is possible to include executable content in an Ogg bitstream. This can be an issue with applications that are implemented using the Ogg format, especially when Ogg is used for streaming or file transfer in a networking scenario. The Ogg format as such does not pose a threat there. However, an application decoding Ogg and its encapsulated content bitstreams has to ensure correct handling of manipulated bitstreams, of buffer overflows and the like.
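
One way a decoder might guard against manipulated or truncated input is to validate every length taken from a page header before using it. The following Python sketch is only an illustration under stated assumptions: iter_pages is a hypothetical helper, the whole physical bitstream is assumed to be in memory, and the checks shown (capture pattern, version 0, header and body inside the buffer) are a minimum rather than a complete defence.

```python
from typing import Iterator, Tuple


def iter_pages(buf: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (offset, page) for each complete page in buf.

    Raises ValueError instead of reading past the end of the buffer.
    """
    pos = 0
    while pos < len(buf):
        if buf[pos:pos + 4] != b"OggS":
            raise ValueError(f"no capture pattern at offset {pos}")
        if pos + 27 > len(buf):
            raise ValueError("truncated page header")
        if buf[pos + 4] != 0:
            raise ValueError("unsupported stream structure version")
        number_page_segments = buf[pos + 26]
        table_end = pos + 27 + number_page_segments
        if table_end > len(buf):
            raise ValueError("truncated segment table")
        # The body length comes from untrusted lacing values, so it is
        # compared against the buffer length before slicing.
        end = table_end + sum(buf[pos + 27:table_end])
        if end > len(buf):
            raise ValueError("page body runs past end of buffer")
        yield pos, buf[pos:end]
        pos = end
```

A production decoder would normally also verify the CRC_checksum of each page and resynchronise on the next capture pattern rather than abort.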

8. References

[1] Walleij, L., "The application/ogg Media Type", RFC 3534, May 2003.

[2] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

Appendix A. Glossary of terms and abbreviations

bos page: The initial page (beginning of stream) of a logical bitstream which contains information to identify the codec type and other decoding-relevant information.

chaining (or sequential multiplexing): Concatenation of two or more complete physical Ogg bitstreams.

eos page: The final page (end of stream) of a logical bitstream.

granule position: An increasing position number for a specific logical bitstream stored in the page header. Its meaning is dependent on the codec for that logical bitstream and specified in a specific media mapping.

grouping (or concurrent multiplexing): Interleaving of pages of several logical bitstreams into one complete physical Ogg bitstream under the restriction that all bos pages of all grouped logical bitstreams MUST appear before any data pages.

lacing value: An entry in the segment table of a page header representing the size of the related segment.

logical bitstream: A sequence of bits being the result of an encoded media stream.

media mapping: A specific use of the Ogg encapsulation format together with a specific (set of) codec(s).

(Ogg) packet: A subpart of a logical bitstream that is created by the encoder for that bitstream and represents a meaningful entity for the encoder, but only a sequence of bits to the Ogg encapsulation.

(Ogg) page: A physical bitstream consists of a sequence of Ogg pages containing data of one logical bitstream only. It usually contains a group of contiguous segments of one packet only, but sometimes packets are too large and need to be split over several pages.

physical (Ogg) bitstream: The sequence of bits resulting from an Ogg encapsulation of one or several logical bitstreams. It consists of a sequence of pages from the logical bitstreams with the restriction that the pages of one logical bitstream MUST come in their correct temporal order.

(Ogg) segment: The Ogg encapsulation process splits each packet into chunks of 255 bytes plus a last fractional chunk of less than 255 bytes. These chunks are called segments.
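
The relation between a packet's size and its lacing values can be sketched as follows. This Python example uses hypothetical helper names and relies on the rule from Section 5 that the last segment of a packet always has a lacing value below 255, so a packet whose size is an exact multiple of 255 ends with a lacing value of 0.

```python
from typing import List, Tuple


def lacing_values(packet_length: int) -> List[int]:
    """Lacing values for one packet: full 255-byte segments followed by
    a final segment of 0..254 bytes."""
    full, rest = divmod(packet_length, 255)
    return [255] * full + [rest]


def packet_lengths(table: List[int]) -> Tuple[List[int], int]:
    """Split a segment table into completed packet sizes.

    Returns (sizes, pending), where pending is the number of bytes of a
    packet that has not finished on this page.
    """
    sizes, current = [], 0
    for value in table:
        current += value
        if value < 255:      # a value below 255 terminates the packet
            sizes.append(current)
            current = 0
    return sizes, current
```

For example, a 600 byte packet yields the lacing values 255, 255 and 90, while a 255 byte packet yields 255 followed by 0.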

Appendix B. Acknowledgements

The author gratefully acknowledges the work that Christopher Montgomery and the Xiph.Org foundation have done in defining the Ogg multimedia project and as part of it the open file format described in this document. The author hopes that providing this document to the Internet community will help in promoting the Ogg multimedia project at http://www.xiph.org/. Many thanks also for the many technical and typo corrections that C. Montgomery and the Ogg community provided as feedback to this RFC.

Author's Address

   Silvia Pfeiffer
   CSIRO, Australia
   Locked Bag 17
   North Ryde, NSW  2113
   Australia

   Phone: +61 2 9325 3141
   EMail: Silvia.Pfeiffer@csiro.au
   URI:   http://www.cmis.csiro.au/Silvia.Pfeiffer/
        

Full Copyright Statement

Copyright (C) The Internet Society (2003). All Rights Reserved.

This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English.

The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Acknowledgement

Funding for the RFC Editor function is currently provided by the Internet Society.
