Network Working Group                                          M. Murata
Request for Comments: 3023                 IBM Tokyo Research Laboratory
Obsoletes: 2376                                            S. St.Laurent
Updates: 2048                                               simonstl.com
Category: Standards Track                                        D. Kohn
                                                        Skymoon Ventures
                                                            January 2001
XML Media Types
Status of this Memo
This document specifies an Internet standards track protocol for the Internet community, and requests discussion and suggestions for improvements. Please refer to the current edition of the "Internet Official Protocol Standards" (STD 1) for the standardization state and status of this protocol. Distribution of this memo is unlimited.
Copyright Notice
Copyright (C) The Internet Society (2001). All Rights Reserved.
Abstract
This document standardizes five new media types -- text/xml, application/xml, text/xml-external-parsed-entity, application/xml-external-parsed-entity, and application/xml-dtd -- for use in exchanging network entities that are related to the Extensible Markup Language (XML). This document also standardizes a convention (using the suffix '+xml') for naming media types outside of these five types when those media types represent XML MIME (Multipurpose Internet Mail Extensions) entities. XML MIME entities are currently exchanged via the HyperText Transfer Protocol on the World Wide Web, are an integral part of the WebDAV protocol for remote web authoring, and are expected to have utility in many domains.
Major differences from RFC 2376 are (1) the addition of text/xml-external-parsed-entity, application/xml-external-parsed-entity, and application/xml-dtd, (2) the '+xml' suffix convention (which also updates the RFC 2048 registration process), and (3) the discussion of "utf-16le" and "utf-16be".
Table of Contents
1. Introduction
2. Notational Conventions
3. XML Media Types
   3.1 Text/xml Registration
   3.2 Application/xml Registration
   3.3 Text/xml-external-parsed-entity Registration
   3.4 Application/xml-external-parsed-entity Registration
   3.5 Application/xml-dtd Registration
   3.6 Summary
4. The Byte Order Mark (BOM) and Conversions to/from the UTF-16 Charset
5. Fragment Identifiers
6. The Base URI
7. A Naming Convention for XML-Based Media Types
   7.1 Referencing
8. Examples
   8.1 Text/xml with UTF-8 Charset
   8.2 Text/xml with UTF-16 Charset
   8.3 Text/xml with UTF-16BE Charset
   8.4 Text/xml with ISO-2022-KR Charset
   8.5 Text/xml with Omitted Charset
   8.6 Application/xml with UTF-16 Charset
   8.7 Application/xml with UTF-16BE Charset
   8.8 Application/xml with ISO-2022-KR Charset
   8.9 Application/xml with Omitted Charset and UTF-16 XML MIME Entity
   8.10 Application/xml with Omitted Charset and UTF-8 Entity
   8.11 Application/xml with Omitted Charset and Internal Encoding Declaration
   8.12 Text/xml-external-parsed-entity with UTF-8 Charset
   8.13 Application/xml-external-parsed-entity with UTF-16 Charset
   8.14 Application/xml-external-parsed-entity with UTF-16BE Charset
   8.15 Application/xml-dtd
   8.16 Application/mathml+xml
   8.17 Application/xslt+xml
   8.18 Application/rdf+xml
   8.19 Image/svg+xml
   8.20 INCONSISTENT EXAMPLE: Text/xml with UTF-8 Charset
9. IANA Considerations
10. Security Considerations
References
Authors' Addresses
A. Why Use the '+xml' Suffix for XML-Based MIME Types?
   A.1 Why not just use text/xml or application/xml and let the XML processor dispatch to the correct application based on the referenced DTD?
   A.2 Why not create a new subtree (e.g., image/xml.svg) to represent XML MIME types?
   A.3 Why not create a new top-level MIME type for XML-based media types?
   A.4 Why not just have the MIME processor 'sniff' the content to determine whether it is XML?
   A.5 Why not use a MIME parameter to specify that a media type uses XML syntax?
   A.6 How about labeling with parameters in the other direction (e.g., application/xml; Content-Feature=iotp)?
   A.7 How about a new superclass MIME parameter that is defined to apply to all MIME types (e.g., Content-Type: application/iotp; $superclass=xml)?
   A.8 What about adding a new parameter to the Content-Disposition header or creating a new Content-Structure header to indicate XML syntax?
   A.9 How about a new Alternative-Content-Type header?
   A.10 How about using a conneg tag instead (e.g., accept-features: (syntax=xml))?
   A.11 How about a third-level content-type, such as text/xml/rdf?
   A.12 Why use the plus ('+') character for the suffix '+xml'?
   A.13 What is the semantic difference between application/foo and application/foo+xml?
   A.14 What happens when an even better markup language (e.g., EBML) is defined, or a new category of data?
   A.15 Why must I use the '+xml' suffix for my new XML-based media type?
B. Changes from RFC 2376
C. Acknowledgements
Full Copyright Statement
1. Introduction
The World Wide Web Consortium has issued Extensible Markup Language (XML) 1.0 (Second Edition)[XML]. To enable the exchange of XML network entities, this document standardizes five new media types -- text/xml, application/xml, text/xml-external-parsed-entity, application/xml-external-parsed-entity, and application/xml-dtd -- as well as a naming convention for identifying XML-based MIME media types.
XML entities are currently exchanged on the World Wide Web, and XML is also used for property values and parameter marshalling by the WebDAV[RFC2518] protocol for remote web authoring. Thus, there is a need for a media type to properly label the exchange of XML network entities.
Although XML is a subset of the Standard Generalized Markup Language (SGML) ISO 8879[SGML], which has been assigned the media types text/sgml and application/sgml, there are several reasons why use of text/sgml or application/sgml to label XML is inappropriate. First, there exist many applications that can process XML, but that cannot process SGML, due to SGML's larger feature set. Second, SGML applications cannot always process XML entities, because XML uses features of recent technical corrigenda to SGML. Third, the definition of text/sgml and application/sgml in [RFC1874] includes parameters for SGML bit combination transformation format (SGML-bctf), and SGML boot attribute (SGML-boot). Since XML does not use these parameters, it would be ambiguous if such parameters were given for an XML MIME entity. For these reasons, the best approach for labeling XML network entities is to provide new media types for XML.
Since XML is an integral part of the WebDAV Distributed Authoring Protocol, and since World Wide Web Consortium Recommendations have conventionally been assigned IETF tree media types, and since similar media types (HTML, SGML) have been assigned IETF tree media types, the XML media types also belong in the IETF media types tree.
Similarly, XML will be used as a foundation for other media types, including types in every branch of the IETF media types tree. To facilitate the processing of such types, media types based on XML, but that are not identified using text/xml or application/xml, SHOULD be named using a suffix of '+xml' as described in Section 7. This will allow XML-based tools -- browsers, editors, search engines, and other processors -- to work with all XML-based media types.
2. Notational Conventions
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].
As defined in [RFC2781], the three charsets "utf-16", "utf-16le", and "utf-16be" are used to label UTF-16 text. In this document, "the UTF-16 family" refers to those three charsets. By contrast, the phrases "utf-16" or UTF-16 in this document refer specifically to the single charset "utf-16".
As sometimes happens between two communities, both MIME and XML have defined the term entity, with different meanings. Section 2.4 of [RFC2045] says:
"The term 'entity' refers specifically to the MIME-defined header fields and contents of either a message or one of the parts in the body of a multipart entity."
Section 4 of [XML] says:
"An XML document may consist of one or many storage units" called entities that "have content" and are normally "identified by name".
In this document, "XML MIME entity" is defined as the latter (an XML entity) encapsulated in the former (a MIME entity).
3. XML Media Types
This document standardizes five media types related to XML MIME entities: text/xml, application/xml, text/xml-external-parsed-entity, application/xml-external-parsed-entity, and application/xml-dtd. Registration information for these media types is described in the sections below.
Within the XML specification, XML MIME entities can be classified into four types. In the XML terminology, they are called "document entities", "external DTD subsets", "external parsed entities", and "external parameter entities". The media types text/xml and application/xml MAY be used for "document entities", while text/xml-external-parsed-entity or application/xml-external-parsed-entity SHOULD be used for "external parsed entities". The media type application/xml-dtd SHOULD be used for "external DTD subsets" or "external parameter entities". application/xml and text/xml MUST NOT be used for "external parameter entities" or "external DTD subsets", and MUST NOT be used for "external parsed entities" unless they are also well-formed "document entities" and are referenced as such. Note that [RFC2376] (which this document obsoletes) allowed such usage, although in practice it is likely to have been rare.
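The classification above can be summarized as a lookup table. The following Python sketch is illustrative only; the names EntityKind and suggested_media_types are hypothetical and do not appear in this specification.

```python
from enum import Enum


class EntityKind(Enum):
    """The four kinds of XML MIME entity named in the XML terminology."""
    DOCUMENT = "document entity"
    EXTERNAL_PARSED = "external parsed entity"
    EXTERNAL_DTD_SUBSET = "external DTD subset"
    EXTERNAL_PARAMETER = "external parameter entity"


# Media types that this document permits (MAY) or recommends (SHOULD)
# for each kind of entity.
_MEDIA_TYPES = {
    EntityKind.DOCUMENT: ("text/xml", "application/xml"),
    EntityKind.EXTERNAL_PARSED: (
        "text/xml-external-parsed-entity",
        "application/xml-external-parsed-entity",
    ),
    EntityKind.EXTERNAL_DTD_SUBSET: ("application/xml-dtd",),
    EntityKind.EXTERNAL_PARAMETER: ("application/xml-dtd",),
}


def suggested_media_types(kind):
    """Return the media types suggested for the given entity kind."""
    return _MEDIA_TYPES[kind]
```

The table deliberately omits the exception for external parsed entities that are also well-formed document entities and are referenced as such; a caller would have to handle that case separately.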
Neither external DTD subsets nor external parameter entities parse as XML documents, and while some XML document entities may be used as external parsed entities and vice versa, there are many cases where the two are not interchangeable. XML also has unparsed entities, internal parsed entities, and internal parameter entities, but they are not XML MIME entities.
If an XML document -- that is, the unprocessed, source XML document -- is readable by casual users, text/xml is preferable to application/xml. MIME user agents (and web user agents) that do not have explicit support for text/xml will treat it as text/plain, for example, by displaying the XML MIME entity as plain text. Application/xml is preferable when the XML MIME entity is unreadable by casual users. Similarly, text/xml-external-parsed-entity is preferable when an external parsed entity is readable by casual users, but application/xml-external-parsed-entity is preferable when a plain text display is inappropriate.
NOTE: Users are in general not used to text containing tags such as <price>, and often find such tags quite disorienting or annoying. If one is not sure, the conservative principle would suggest using application/* instead of text/* so as not to put information in front of users that they will quite likely not understand.
The top-level media type "text" has some restrictions on MIME entities and they are described in [RFC2045] and [RFC2046]. In particular, the UTF-16 family, UCS-4, and UTF-32 are not allowed (except over HTTP[RFC2616], which uses a MIME-like mechanism). Thus, if an XML document or external parsed entity is encoded in such character encoding schemes, it cannot be labeled as text/xml or text/xml-external-parsed-entity (except for HTTP).
Text/xml and application/xml behave differently when the charset parameter is not explicitly specified. If the default charset (i.e., US-ASCII) for text/xml is inconvenient for some reason (e.g., bad web servers), application/xml provides an alternative (see "Optional parameters" of application/xml registration in Section 3.2). The same rules apply to the distinction between text/xml-external-parsed-entity and application/xml-external-parsed-entity.
XML provides a general framework for defining sequences of structured data. In some cases, it may be desirable to define new media types that use XML but define a specific application of XML, perhaps due to domain-specific security considerations or runtime information. Furthermore, such media types may allow UTF-8 or UTF-16 only and prohibit other charsets. This document does not prohibit such media types and in fact expects them to proliferate. However, developers of such media types are STRONGLY RECOMMENDED to use this document as a basis for their registration. In particular, the charset parameter SHOULD be used in the same manner, as described in Section 7.1, in order to enhance interoperability.
An XML document labeled as text/xml or application/xml might contain namespace declarations, stylesheet-linking processing instructions (PIs), schema information, or other declarations that might be used to suggest how the document is to be processed. For example, a document might have the XHTML namespace and a reference to a CSS stylesheet. Such a document might be handled by applications that would use this information to dispatch the document for appropriate processing.
3.1 Text/xml Registration
MIME media type name: text
MIME subtype name: xml
Mandatory parameters: none
Optional parameters: charset
Although listed as an optional parameter, the use of the charset parameter is STRONGLY RECOMMENDED, since this information can be used by XML processors to determine authoritatively the character encoding of the XML MIME entity. The charset parameter can also be used to provide protocol-specific operations, such as charset-based content negotiation in HTTP. "utf-8" [RFC2279] is the recommended value, representing the UTF-8 charset. UTF-8 is supported by all conforming processors of [XML].
If the XML MIME entity is transmitted via HTTP, which uses a MIME-like mechanism that is exempt from the restrictions on the text top-level type (see section 19.4.1 of [RFC2616]), "utf-16" [RFC2781] is also recommended. UTF-16 is supported by all conforming processors of [XML]. Since the handling of CR, LF and NUL for text types in most MIME applications would cause undesired transformations of individual octets in UTF-16 multi-octet characters, gateways from HTTP to these MIME applications MUST transform the XML MIME entity from text/xml; charset="utf-16" to application/xml; charset="utf-16".
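The gateway requirement can be sketched as a relabeling function. This is a minimal illustration, assuming the Content-Type value carries no parameters other than charset; relabel_for_mime_gateway is a hypothetical name, and the header parsing relies on Python's standard email package.

```python
from email.message import Message


def relabel_for_mime_gateway(content_type):
    """Relabel text/xml in utf-16 as application/xml when leaving HTTP.

    Any other Content-Type value is returned unchanged.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    is_text_xml = msg.get_content_type() == "text/xml"
    # get_content_charset() returns the charset in lower case, or None.
    if is_text_xml and msg.get_content_charset() == "utf-16":
        return 'application/xml; charset="utf-16"'
    return content_type
```

Only the label changes; the octets of the XML MIME entity are passed through untouched.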
Conformant with [RFC2046], if a text/xml entity is received with the charset parameter omitted, MIME processors and XML processors MUST use the default charset value of "us-ascii"[ASCII]. In cases where the XML MIME entity is transmitted via HTTP, the default charset value is still "us-ascii". (Note: There is an inconsistency between this specification and HTTP/1.1, which uses ISO-8859-1[ISO8859] as the default for a historical reason. Since XML is a new format, a new default should be chosen for better I18N. US-ASCII was chosen, since it is the intersection of UTF-8 and ISO-8859-1 and since it is already used by MIME.)
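The default rule for text/xml might be implemented along the following lines. The function name text_xml_charset is hypothetical; the sketch uses Python's standard email package to parse the header value and is not a complete MIME processor.

```python
from email.message import Message


def text_xml_charset(content_type):
    """Return the authoritative charset for a text/xml entity.

    If the charset parameter is omitted, the default is "us-ascii",
    regardless of whether the entity was received over HTTP.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    if msg.get_content_type() != "text/xml":
        raise ValueError("not a text/xml entity: %r" % content_type)
    # get_content_charset() returns None when no charset parameter exists.
    return msg.get_content_charset() or "us-ascii"
```

Note that the function never inspects the entity body: for text/xml, an encoding declaration inside the XML does not override the header.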
There are several reasons that the charset parameter is authoritative. First, some MIME processing engines do transcoding of MIME bodies of the top-level media type "text" without reference to any of the internal content. Thus, it is possible that some agent might change text/xml; charset="iso-2022-jp" to text/xml; charset="utf-8" without modifying the encoding declaration of an XML document. Second, text/xml must be compatible with text/plain, since MIME agents that do not understand text/xml will fall back to handling it as text/plain. If the charset parameter for text/xml were not authoritative, such fallback would cause data corruption. Third, recent web servers have been improved so that users can specify the charset parameter. Fourth, [RFC2130] specifies that the recommended specification scheme is the "charset" parameter.
Since the charset parameter is authoritative, the charset is not always declared within an XML encoding declaration. Thus, special care is needed when the recipient strips the MIME header and provides persistent storage of the received XML MIME entity (e.g., in a file system). Unless the charset is UTF-8 or UTF-16, the recipient SHOULD also persistently store information about the charset, perhaps by embedding a correct XML encoding declaration within the XML MIME entity.
Encoding considerations: This media type MAY be encoded as appropriate for the charset and the capabilities of the underlying MIME transport. For 7-bit transports, data in UTF-8 MUST be encoded in quoted-printable or base64. For 8-bit clean transport (e.g., 8BITMIME[RFC1652] ESMTP or NNTP[RFC0977]), UTF-8 does not need to be encoded. Over HTTP[RFC2616], no content-transfer-encoding is necessary and UTF-16 may also be used.
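As an illustration of the 7-bit case, the sketch below uses Python's standard email package, which by default selects base64 as the content-transfer-encoding when a text part is created with the utf-8 charset. That default is a property of the library, not a requirement of this document (quoted-printable would be equally acceptable), and the sample document is hypothetical.

```python
from email.mime.text import MIMEText

# A small UTF-8 document containing one non-ASCII character (yen sign).
xml_source = '<?xml version="1.0" encoding="utf-8"?>\n<price>\u00a5100</price>\n'

# Build a text/xml MIME entity with an explicit charset parameter.
part = MIMEText(xml_source, _subtype="xml", _charset="utf-8")

print(part["Content-Type"])
print(part["Content-Transfer-Encoding"])

# Undoing the transfer encoding recovers the original UTF-8 octets.
decoded_octets = part.get_payload(decode=True)
```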
Security considerations: See Section 10.
Interoperability considerations: XML has proven to be interoperable across WebDAV clients and servers, and for import and export from multiple XML authoring tools. For maximum interoperability, validating processors are recommended. Although non-validating processors may be more efficient, they are not required to handle all features of XML. For further information, see sub-section 2.9 "Standalone Document Declaration" and section 5 "Conformance" of [XML].
Published specification: Extensible Markup Language (XML) 1.0 (Second Edition)[XML].
Applications which use this media type: XML is device-, platform-, and vendor-neutral and is supported by a wide range of Web user agents, WebDAV[RFC2518] clients and servers, as well as XML authoring tools.
Additional information:
Magic number(s): None.
Although no byte sequences can be counted on to always be present, XML MIME entities in ASCII-compatible charsets (including UTF-8) often begin with hexadecimal 3C 3F 78 6D 6C ("<?xml"), and those in UTF-16 often begin with hexadecimal FE FF 00 3C 00 3F 00 78 00 6D 00 6C or FF FE 3C 00 3F 00 78 00 6D 00 6C 00 (the Byte Order Mark (BOM) followed by "<?xml"). For more information, see Appendix F of [XML].
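The octet sequences listed above can be checked with a few lines of Python. This is a heuristic sketch only; has_common_xml_prefix is a hypothetical name, and a False result does not mean the entity is not XML, since none of these sequences is guaranteed to be present.

```python
# Octet sequences that XML MIME entities often (but not always) begin with.
COMMON_XML_PREFIXES = (
    bytes.fromhex("3C3F786D6C"),                # "<?xml", ASCII-compatible
    bytes.fromhex("FEFF003C003F0078006D006C"),  # BOM + "<?xml", big-endian
    bytes.fromhex("FFFE3C003F0078006D006C00"),  # BOM + "<?xml", little-endian
)


def has_common_xml_prefix(data):
    """Return True if data starts with one of the common XML prefixes."""
    return data.startswith(COMMON_XML_PREFIXES)
```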
File extension(s): .xml
Macintosh File Type Code(s): "TEXT"
Person and email address for further information:
MURATA Makoto (FAMILY Given) <mmurata@trl.ibm.co.jp>
Simon St.Laurent <simonstl@simonstl.com>
Daniel Kohn <dan@dankohn.com>
Intended usage: COMMON
Author/Change controller: The XML specification is a work product of the World Wide Web Consortium's XML Working Group, and was edited by:
Tim Bray <tbray@textuality.com>
Jean Paoli <jeanpa@microsoft.com>
C. M. Sperberg-McQueen <cmsmcq@uic.edu>
Eve Maler <eve.maler@east.sun.com>
The W3C, and the W3C XML Core Working Group, have change control over the XML specification.
3.2 Application/xml Registration
MIME media type name: application
MIME subtype name: xml
Mandatory parameters: none
Optional parameters: charset
Although listed as an optional parameter, the use of the charset parameter is STRONGLY RECOMMENDED, since this information can be used by XML processors to determine authoritatively the charset of the XML MIME entity. The charset parameter can also be used to provide protocol-specific operations, such as charset-based content negotiation in HTTP.
"utf-8" [RFC2279] and "utf-16" [RFC2781] are the recommended values, representing the UTF-8 and UTF-16 charsets, respectively. These charsets are preferred since they are supported by all conforming processors of [XML].
If an application/xml entity is received where the charset parameter is omitted, no information is being provided about the charset by the MIME Content-Type header. Conforming XML processors MUST follow the requirements in section 4.3.3 of [XML] that directly address this contingency. However, MIME processors that are not XML processors SHOULD NOT assume a default charset if the charset parameter is omitted from an application/xml entity.
There are several reasons that the charset parameter is authoritative. First, recent web servers have been improved so that users can specify the charset parameter. Second, [RFC2130] specifies that the recommended specification scheme is the "charset" parameter.
On the other hand, it has been argued that the charset parameter should be omitted and the mechanism described in Appendix F of [XML] (which is non-normative) should be solely relied on. This approach would allow users to avoid configuration of the charset parameter; an XML document stored in a file is likely to contain a correct encoding declaration or BOM (if necessary), since the operating system does not typically provide charset information for files. If users would like to rely on the encoding declaration or BOM and to hide charset information from protocols, they may choose to omit the parameter.
Since the charset parameter is authoritative, the charset is not always declared within an XML encoding declaration. Thus, special care is needed when the recipient strips the MIME header and provides persistent storage of the received XML MIME entity (e.g., in a file system). Unless the charset is UTF-8 or UTF-16, the recipient SHOULD also persistently store information about the charset, perhaps by embedding a correct XML encoding declaration within the XML MIME entity.
Encoding considerations: This media type MAY be encoded as appropriate for the charset and the capabilities of the underlying MIME transport. For 7-bit transports, data in either UTF-8 or UTF-16 MUST be encoded in quoted-printable or base64. For 8-bit clean transport (e.g., 8BITMIME[RFC1652] ESMTP or NNTP[RFC0977]), UTF-8 is not encoded, but the UTF-16 family MUST be encoded in base64. For binary clean transports (e.g., HTTP[RFC2616]), no content-transfer-encoding is necessary.
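The transport rules above can be summarized as a small decision function. This is only a sketch under stated assumptions: the helper name choose_transfer_encoding and the transport labels "7bit", "8bit", and "binary" are hypothetical, and only UTF-8 and the UTF-16 family are covered. Where the text permits either quoted-printable or base64, the choice made here is one option, not a requirement.

```python
from typing import Optional


def choose_transfer_encoding(transport: str, charset: str) -> Optional[str]:
    """Pick a content-transfer-encoding for an application/xml entity.

    Hypothetical helper. Returns None when no content-transfer-encoding
    is necessary (binary clean transports such as HTTP).
    """
    charset = charset.lower()
    if charset == "utf-8":
        is_utf16 = False
    elif charset in ("utf-16", "utf-16le", "utf-16be"):
        is_utf16 = True
    else:
        raise ValueError("this sketch only covers UTF-8 and the UTF-16 family")

    if transport == "binary":
        return None
    if transport == "8bit":
        # UTF-8 travels unencoded; the UTF-16 family MUST use base64.
        return "base64" if is_utf16 else "8bit"
    if transport == "7bit":
        # Either quoted-printable or base64 is allowed; base64 is chosen
        # for UTF-16 here because of its many NUL octets.
        return "base64" if is_utf16 else "quoted-printable"
    raise ValueError(f"unknown transport class: {transport}")
```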
Security considerations: See Section 10.
Interoperability considerations: Same as Section 3.1.
Published specification: Same as Section 3.1.
Applications which use this media type: Same as Section 3.1.
Additional information: Same as Section 3.1.
Person and email address for further information: Same as Section 3.1.
Intended usage: COMMON
Author/Change controller: Same as Section 3.1.
MIME media type name: text
MIME subtype name: xml-external-parsed-entity
Mandatory parameters: none
Optional parameters: charset
The charset parameter of text/xml-external-parsed-entity is handled the same as that of text/xml as described in Section 3.1.
Encoding considerations: Same as Section 3.1.
Security considerations: See Section 10.
Interoperability considerations: XML external parsed entities are as interoperable as XML documents, though they have a less tightly constrained structure and therefore need to be referenced by XML documents for proper handling by XML processors. Similarly, XML documents cannot be reliably used as external parsed entities because external parsed entities are prohibited from having standalone document declarations or DTDs. Identifying XML external parsed entities with their own content type should enhance interoperability of both XML documents and XML external parsed entities.
Published specification: Same as Section 3.1.
Applications which use this media type: Same as Section 3.1.
Additional information:
Magic number(s): Same as Section 3.1.
File extension(s): .xml or .ent
Macintosh File Type Code(s): "TEXT"
Person and email address for further information: Same as Section 3.1.
Intended usage: COMMON
Author/Change controller: Same as Section 3.1.
MIME media type name: application
MIME subtype name: xml-external-parsed-entity
Mandatory parameters: none
Optional parameters: charset
The charset parameter of application/xml-external-parsed-entity is handled the same as that of application/xml as described in Section 3.2.
Encoding considerations: Same as Section 3.2.
Security considerations: See Section 10.
Interoperability considerations: Same as those for text/xml-external-parsed-entity as described in Section 3.3.
Published specification: Same as text/xml as described in Section 3.1.
Applications which use this media type: Same as Section 3.1.
Additional information:
Magic number(s): Same as Section 3.1.
File extension(s): .xml or .ent
Macintosh File Type Code(s): "TEXT"
Person and email address for further information: Same as Section 3.1.
Intended usage: COMMON
Author/Change controller: Same as Section 3.1.
MIME media type name: application
MIME subtype name: xml-dtd
Mandatory parameters: none
Optional parameters: charset
The charset parameter of application/xml-dtd is handled the same as that of application/xml as described in Section 3.2.
Encoding considerations: Same as Section 3.2.
Security considerations: See Section 10.
Interoperability considerations: XML DTDs have proven to be interoperable by DTD authoring tools and XML browsers, among others.
Published specification: Same as text/xml as described in Section 3.1.
Applications which use this media type: DTD authoring tools handle external DTD subsets as well as external parameter entities. XML browsers may also access external DTD subsets and external parameter entities.
Additional information:
Magic number(s): Same as Section 3.1.
File extension(s): .dtd or .mod
Macintosh File Type Code(s): "TEXT"
Person and email address for further information: Same as Section 3.1.
Intended usage: COMMON
Author/Change controller: Same as Section 3.1.
The following list applies to text/xml, text/xml-external-parsed-entity, and XML-based media types under the top-level type "text" that define the charset parameter according to this specification:
o Charset parameter is strongly recommended.
o If the charset parameter is not specified, the default is "us-ascii". The default of "iso-8859-1" in HTTP is explicitly overridden.
o No error handling provisions.
o An encoding declaration, if present, is irrelevant, but when saving a received resource as a file, the correct encoding declaration SHOULD be inserted.
The next list applies to application/xml, application/xml-external-parsed-entity, application/xml-dtd, and XML-based media types under top-level types other than "text" that define the charset parameter according to this specification:
o Charset parameter is strongly recommended, and if present, it takes precedence.
o If the charset parameter is omitted, conforming XML processors MUST follow the requirements in section 4.3.3 of [XML].
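The two lists above differ mainly in what happens when the charset parameter is absent. A minimal sketch of that difference follows; the function name effective_charset is hypothetical, and the sketch assumes the media type is one that defines the charset parameter according to this specification.

```python
from typing import Optional


def effective_charset(top_level_type: str, charset_param: Optional[str]) -> Optional[str]:
    """Charset implied by the MIME headers alone (hypothetical helper).

    Returns None when the headers carry no charset information and a
    conforming XML processor has to fall back on section 4.3.3 of [XML].
    """
    if charset_param is not None:
        # When present, the charset parameter is authoritative.
        return charset_param.lower()
    if top_level_type.lower() == "text":
        # Default for text/*, overriding HTTP's "iso-8859-1" default.
        return "us-ascii"
    # application/* and other top-level types: no default is assumed.
    return None
```

For example, effective_charset("text", None) yields "us-ascii" regardless of any BOM or encoding declaration inside the entity, while effective_charset("application", None) yields None and leaves the decision to the XML processor.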
Section 4.3.3 of [XML] specifies that XML MIME entities in the charset "utf-16" MUST begin with a byte order mark (BOM), which is the hexadecimal octet sequence 0xFE 0xFF (or 0xFF 0xFE, depending on byte order). The XML Recommendation further states that the BOM is an encoding signature, and is not part of either the markup or the character data of the XML document.
Due to the presence of the BOM, applications that convert XML from "utf-16" to a non-Unicode encoding MUST strip the BOM before conversion. Similarly, when converting from another encoding into "utf-16", the BOM MUST be added after conversion is complete.
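The following Python sketch illustrates the BOM handling described above. The function names are hypothetical and "iso-8859-1" is only an example of a non-Unicode encoding. The sketch relies on Python's "utf-16" codec consuming a leading BOM on decode, and it does not update any encoding declaration inside the entity, which a real converter would also need to do.

```python
import codecs


def utf16_to_legacy(data: bytes, target: str = "iso-8859-1") -> bytes:
    """Convert a "utf-16" entity to a non-Unicode encoding, dropping the BOM."""
    text = data.decode("utf-16")  # the "utf-16" codec consumes the BOM
    return text.encode(target)


def legacy_to_utf16(data: bytes, source: str = "iso-8859-1") -> bytes:
    """Convert to "utf-16" (big endian in this sketch), then prepend the BOM."""
    text = data.decode(source)
    return codecs.BOM_UTF16_BE + text.encode("utf-16-be")
```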
In addition to the charset "utf-16", [RFC2781] introduces "utf-16le" (little endian) and "utf-16be" (big endian) as well. The BOM is prohibited for these charsets. When an XML MIME entity is encoded in "utf-16le" or "utf-16be", it MUST NOT begin with the BOM but SHOULD contain an encoding declaration. Conversion from "utf-16" to "utf-16be" or "utf-16le" and conversion in the other direction MUST strip or add the BOM, respectively.
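Conversion between "utf-16" and "utf-16be" can be sketched the same way. The helper names below are hypothetical; "utf-16le" would be handled analogously with the little-endian codec and BOM. As before, any encoding declaration inside the entity is left untouched by this sketch.

```python
import codecs


def utf16_to_utf16be(data: bytes) -> bytes:
    """Convert "utf-16" to "utf-16be": the BOM is stripped."""
    # Decoding with "utf-16" consumes the BOM; "utf-16-be" never writes one.
    return data.decode("utf-16").encode("utf-16-be")


def utf16be_to_utf16(data: bytes) -> bytes:
    """Convert "utf-16be" to "utf-16": the BOM is added."""
    if data.startswith(codecs.BOM_UTF16_BE):
        raise ValueError('a "utf-16be" entity must not begin with a BOM')
    return codecs.BOM_UTF16_BE + data
```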
Section 4.1 of [RFC2396] notes that the semantics of a fragment identifier (the part of a URI after a "#") is a property of the data resulting from a retrieval action, and that the format and interpretation of fragment identifiers is dependent on the media type of the retrieval result.
As of today, no established specifications define identifiers for XML media types. However, a working draft published by W3C, namely "XML Pointer Language (XPointer)", attempts to define fragment identifiers for text/xml and application/xml. The current specification for XPointer is available at http://www.w3.org/TR/xptr.
Section 5.1 of [RFC2396] specifies that the semantics of a relative URI reference embedded in a MIME entity is dependent on the base URI. The base URI is either (1) the base URI embedded in the MIME entity, (2) the base URI of the encapsulating MIME entity, (3) the URI used to retrieve the MIME entity, or (4) the application-dependent default base URI, where (1) has the highest precedence. [RFC2396] further specifies that the mechanism for embedding the base URI is dependent on the media type.
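The precedence order above amounts to taking the first available candidate. The sketch below is illustrative only: pick_base_uri and the example URIs are hypothetical, and Python's urllib.parse.urljoin is used for resolution, which follows the later URI syntax rules rather than [RFC2396] itself (the two agree for simple cases such as this one).

```python
from typing import Optional
from urllib.parse import urljoin


def pick_base_uri(embedded: Optional[str],
                  encapsulating: Optional[str],
                  retrieval: Optional[str],
                  app_default: str) -> str:
    """Return the base URI with the highest precedence that is available."""
    for candidate in (embedded, encapsulating, retrieval):
        if candidate:
            return candidate
    return app_default


# No embedded or encapsulating base URI: the retrieval URI is used.
base = pick_base_uri(None, None, "http://example.org/docs/a.xml", "http://localhost/")
resolved = urljoin(base, "img/fig1.png")
```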
As of today, no established specifications define mechanisms for embedding the base URI in XML MIME entities. However, a Proposed Recommendation published by W3C, namely "XML Base", attempts to define such a mechanism for text/xml, application/xml, text/xml-external-parsed-entity, and application/xml-external-parsed-entity. The current specification for XML Base is available at http://www.w3.org/TR/xmlbase.
This document recommends the use of a naming convention (a suffix of '+xml') for identifying XML-based MIME media types, whatever their particular content may represent. This allows the use of generic XML processors and technologies on a wide variety of different XML document types at a minimum cost, using existing frameworks for media type registration.
Although the use of a suffix was not considered as part of the original MIME architecture, this choice is considered to provide the most functionality with the least potential for interoperability problems or lack of future extensibility. The alternatives to the '+xml' suffix and the reason for its selection are described in Appendix A.
As XML development continues, new XML document types are appearing rapidly. Many of these XML document types would benefit from the identification possibilities of a more specific MIME media type than text/xml or application/xml can provide, and it is likely that many new media types for XML-based document types will be registered in the near and ongoing future.
While the benefits of specific MIME types for particular types of XML documents are significant, all XML documents share common structures and syntax that make possible common processing.
Some areas where 'generic' processing is useful include:
o Browsing - An XML browser can display any XML document with a provided [CSS] or [XSLT] style sheet, whatever the vocabulary of that document.
o Editing - Any XML editor can read, modify, and save any XML document.
o Fragment identification - XPointers (work in progress) can work with any XML document, whatever vocabulary it uses and whether or not it uses XPointer for its own fragment identification.
o Hypertext linking - XLink (work in progress) hypertext linking is designed to connect any XML documents, regardless of vocabulary.
o Searching - XML-oriented search engines, web crawlers, agents, and query tools should be able to read XML documents and extract the names and content of elements and attributes even if the tools are ignorant of the particular vocabulary used for elements and attributes.
o Storage - XML-oriented storage systems, which keep XML documents internally in a parsed form, should similarly be able to process, store, and recreate any XML document.
o Well-formedness and validity checking - An XML processor can confirm that any XML document is well-formed and that it is valid (i.e., conforms to its declared DTD or Schema).
When a new media type is introduced for an XML-based format, the name of the media type SHOULD end with '+xml'. This convention will allow applications that can process XML generically to detect that the MIME entity is supposed to be an XML document, verify this assumption by invoking some XML processor, and then process the XML document accordingly. Applications may match for types that represent XML MIME entities by comparing the subtype to the pattern '*/*+xml'. (Of course, 4 of the 5 media types defined in this document -- text/xml, application/xml, text/xml-external-parsed-entity, and application/xml-external-parsed-entity -- also represent XML MIME entities while not conforming to the '*/*+xml' pattern.)
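A generic application might apply the matching rule above roughly as follows. This is a sketch, not a normative algorithm; the name is_xml_media_type is hypothetical. It accepts the four media types named above plus any subtype ending in '+xml', and it deliberately excludes application/xml-dtd, which is not one of the four.

```python
# The four types from this document that represent XML MIME entities
# without matching the '*/*+xml' pattern.
XML_ENTITY_TYPES = {
    "text/xml",
    "application/xml",
    "text/xml-external-parsed-entity",
    "application/xml-external-parsed-entity",
}


def is_xml_media_type(content_type: str) -> bool:
    """True if the Content-type value names an XML MIME entity type."""
    # Drop parameters such as charset; media type names are case-insensitive.
    essence = content_type.split(";", 1)[0].strip().lower()
    if essence in XML_ENTITY_TYPES:
        return True
    main, _, sub = essence.partition("/")
    return bool(main) and sub.endswith("+xml") and len(sub) > len("+xml")
```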
NOTE: Section 14.1 of HTTP[RFC2616] does not support Accept headers of the form "Accept: */*+xml" and so this header MUST NOT be used in this way. Instead, content negotiation[RFC2703] could potentially be used if an XML-based MIME type were needed.
XML generic processing is not always appropriate for XML-based media types. For example, authors of some such media types may wish that the types remain entirely opaque except to applications that are specifically designed to deal with that media type. By NOT following the naming convention '+xml', such media types can avoid XML-generic processing. Since generic processing will be useful in many cases, however -- including in some situations that are difficult to predict ahead of time -- those registering media types SHOULD use the '+xml' convention unless they have a particularly compelling reason not to.
The registration process for these media types is described in [RFC2048]. The registrar for the IETF tree will encourage new XML-based media type registrations in the IETF tree to follow this guideline. Registrars for other trees SHOULD follow this convention in order to ensure maximum interoperability of their XML-based documents. Similarly, media subtypes that do not represent XML MIME entities MUST NOT be allowed to register with a '+xml' suffix.
Registrations for new XML-based media types under the top-level type "text" SHOULD, in specifying the charset parameter and encoding considerations, define them as: "Same as [charset parameter / encoding considerations] of text/xml as specified in RFC 3023."
Registrations for new XML-based media types under top-level types other than "text" SHOULD, in specifying the charset parameter and encoding considerations, define them as: "Same as [charset parameter / encoding considerations] of application/xml as specified in RFC 3023."
The use of the charset parameter is STRONGLY RECOMMENDED, since this information can be used by XML processors to determine authoritatively the charset of the XML MIME entity.
These registrations SHOULD specify that the XML-based media type being registered has all of the security considerations described in RFC 3023 plus any additional considerations specific to that media type.
These registrations SHOULD also make reference to RFC 3023 in specifying magic numbers, fragment identifiers, base URIs, and use of the BOM.
These registrations MAY reference the text/xml registration in RFC 3023 in specifying interoperability considerations, if these considerations are not overridden by issues specific to that media type.
The examples below give the value of the MIME Content-type header and the XML declaration (which includes the encoding declaration) inside the XML MIME entity. For UTF-16 examples, the Byte Order Mark character is denoted as "{BOM}", and the XML declaration is assumed to come at the beginning of the XML MIME entity, immediately following the BOM. Note that other MIME headers may be present, and the XML MIME entity may contain other data in addition to the XML declaration; the examples focus on the Content-type header and the encoding declaration for clarity.
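When experimenting with the examples, the media type and charset parameter can be pulled out of a Content-type value with Python's standard email package. This is merely one convenient way to do so; the wrapper name parse_content_type is hypothetical.

```python
from email.message import Message


def parse_content_type(value: str):
    """Return (media type, charset or None) for a Content-type header value."""
    msg = Message()
    msg["Content-Type"] = value
    # Both accessors return lowercased values; the charset is None when
    # the parameter is absent.
    return msg.get_content_type(), msg.get_content_charset()
```

For instance, parse_content_type('text/xml; charset="utf-16be"') gives ("text/xml", "utf-16be"), and parse_content_type("application/xml") gives ("application/xml", None).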
Content-type: text/xml; charset="utf-8"
<?xml version="1.0" encoding="utf-8"?>
This is the recommended charset value for use with text/xml. Since the charset parameter is provided, MIME and XML processors MUST treat the enclosed entity as UTF-8 encoded.
If sent using a 7-bit transport (e.g., SMTP[RFC0821]), the XML MIME entity MUST use a content-transfer-encoding of either quoted-printable or base64. For an 8-bit clean transport (e.g., 8BITMIME ESMTP or NNTP), or a binary clean transport (e.g., HTTP), no content-transfer-encoding is necessary.
Content-type: text/xml; charset="utf-16"
{BOM}<?xml version='1.0' encoding='utf-16'?>
or
{BOM}<?xml version='1.0'?>
This is possible only when the XML MIME entity is transmitted via HTTP, which uses a MIME-like mechanism and is a binary-clean protocol, hence does not perform CR and LF transformations and allows NUL octets. As described in [RFC2781], the UTF-16 family MUST NOT be used with media types under the top-level type "text" except over HTTP (see section 19.4.1 of [RFC2616] for details).
Since HTTP is binary clean, no content-transfer-encoding is necessary.
Content-type: text/xml; charset="utf-16be"
<?xml version='1.0' encoding='utf-16be'?>
Observe that no BOM is present. This is again possible only when the XML MIME entity is transmitted via HTTP.
Content-type: text/xml; charset="iso-2022-kr"
<?xml version="1.0" encoding='iso-2022-kr'?>
This example shows text/xml with a Korean charset (e.g., Hangul) encoded following the specification in [RFC1557]. Since the charset parameter is provided, MIME and XML processors MUST treat the enclosed entity as encoded per RFC 1557.
Since ISO-2022-KR has been defined to use only 7 bits of data, no content-transfer-encoding is necessary with any transport.
Content-type: text/xml
{BOM}<?xml version="1.0" encoding="utf-16"?>
or
{BOM}<?xml version="1.0"?>
This example shows text/xml with the charset parameter omitted. In this case, MIME and XML processors MUST assume the charset is "us-ascii", the default charset value for text media types specified in [RFC2046]. The default of "us-ascii" holds even if the text/xml entity is transported using HTTP.
Omitting the charset parameter is NOT RECOMMENDED for text/xml. For example, even if the contents of the XML MIME entity are UTF-16 or UTF-8, or the XML MIME entity has an explicit encoding declaration, XML and MIME processors MUST assume the charset is "us-ascii".
Content-type: application/xml; charset="utf-16"
{BOM}<?xml version="1.0" encoding="utf-16"?>
or
{BOM}<?xml version="1.0"?>
This is a recommended charset value for use with application/xml. Since the charset parameter is provided, MIME and XML processors MUST treat the enclosed entity as UTF-16 encoded.
If sent using a 7-bit transport (e.g., SMTP) or an 8-bit clean transport (e.g., 8BITMIME ESMTP or NNTP), the XML MIME entity MUST be encoded in quoted-printable or base64. For a binary clean transport (e.g., HTTP), no content-transfer-encoding is necessary.
Content-type: application/xml; charset="utf-16be"
<?xml version='1.0' encoding='utf-16be'?>
Observe that no BOM is present. Since the charset parameter is provided, MIME and XML processors MUST treat the enclosed entity as UTF-16BE encoded.
Content-type: application/xml; charset="iso-2022-kr"
<?xml version="1.0" encoding="iso-2022-kr"?>
This example shows application/xml with a Korean charset (e.g., Hangul) encoded following the specification in [RFC1557]. Since the charset parameter is provided, MIME and XML processors MUST treat the enclosed entity as encoded per RFC 1557, independent of whether the XML MIME entity has an internal encoding declaration (this example does show such a declaration, which agrees with the charset parameter).
Since ISO-2022-KR has been defined to use only 7 bits of data, no content-transfer-encoding is necessary with any transport.
Content-type: application/xml
{BOM}<?xml version='1.0' encoding="utf-16"?>
or
{BOM}<?xml version='1.0'?>
For this example, the XML MIME entity begins with a BOM. Since the charset has been omitted, a conforming XML processor follows the requirements of [XML], section 4.3.3. Specifically, the XML processor reads the BOM, and thus knows deterministically that the charset is UTF-16.
An XML-unaware MIME processor SHOULD make no assumptions about the charset of the XML MIME entity.
Content-type: application/xml
<?xml version='1.0'?>
In this example, the charset parameter has been omitted, and there is no BOM. Since there is no BOM, the XML processor follows the requirements in section 4.3.3 of [XML], and optionally applies the mechanism described in Appendix F (which is non-normative) of [XML] to determine that the charset is UTF-8. The XML MIME entity does not contain an encoding declaration, but since the encoding is UTF-8, this is still a conforming XML MIME entity.
An XML-unaware MIME processor SHOULD make no assumptions about the charset of the XML MIME entity.
8.11 Application/xml with Omitted Charset and Internal Encoding Declaration
Content-type: application/xml
<?xml version='1.0' encoding="iso-10646-ucs-4"?>
In this example, the charset parameter has been omitted, and there is no BOM. However, the XML MIME entity does contain an encoding declaration that specifies the entity's charset. Following the requirements in section 4.3.3 of [XML], and optionally applying the mechanism described in Appendix F (non-normative) of [XML], the XML processor determines the charset of the XML MIME entity (in this example, UCS-4).
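The detection steps for an application/xml entity with no charset parameter can be approximated as below. This sketch covers only a subset of the cases listed in Appendix F of [XML] (EBCDIC, for one, is omitted). The function names are hypothetical, and the values returned by guess_family are Python codec names chosen for this sketch, not registered charset names.

```python
import codecs
import re
from typing import Optional

# Encoding declaration inside an XML or text declaration.
_DECL = re.compile(r"""^<\?xml\s[^>]*?encoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']""")


def guess_family(data: bytes) -> str:
    """Guess a Python codec that can decode the start of the entity."""
    # 4-byte BOMs are tested first: the UCS-4 little-endian BOM begins
    # with the same two octets as the UTF-16 little-endian BOM.
    if data.startswith((codecs.BOM_UTF32_BE, codecs.BOM_UTF32_LE)):
        return "utf-32"
    if data.startswith((codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE)):
        return "utf-16"
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if data.startswith(b"\x00\x00\x00<"):
        return "utf-32-be"
    if data.startswith(b"<\x00\x00\x00"):
        return "utf-32-le"
    if data.startswith(b"\x00<\x00?"):
        return "utf-16-be"
    if data.startswith(b"<\x00?\x00"):
        return "utf-16-le"
    return "ascii"  # ASCII-compatible guess, enough to read the declaration


def declared_encoding(data: bytes) -> Optional[str]:
    """Return the name in the encoding declaration, lowercased, or None."""
    head = data[:400].decode(guess_family(data), errors="replace")
    match = _DECL.match(head)
    return match.group(1).lower() if match else None
```

Applied to the entity in this example, stored as big-endian UCS-4 without a BOM, guess_family reports "utf-32-be" and declared_encoding reports "iso-10646-ucs-4". An entity with neither BOM nor encoding declaration yields None, in which case UTF-8 is assumed.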
An XML-unaware MIME processor SHOULD make no assumptions about the charset of the XML MIME entity.
Content-type: text/xml-external-parsed-entity; charset="utf-8"
<?xml encoding="utf-8"?>
This is the recommended charset value for use with text/xml-external-parsed-entity. Since the charset parameter is provided, MIME and XML processors MUST treat the enclosed entity as UTF-8 encoded.
If sent using a 7-bit transport (e.g., SMTP), the XML MIME entity MUST use a content-transfer-encoding of either quoted-printable or base64. For an 8-bit clean transport (e.g., 8BITMIME ESMTP or NNTP), or a binary clean transport (e.g., HTTP) no content-transfer-encoding is necessary.
Content-type: application/xml-external-parsed-entity; charset="utf-16"
{BOM}<?xml encoding="utf-16"?>
or
{BOM}<?xml?>
This is a recommended charset value for use with application/xml-external-parsed-entity. Since the charset parameter is provided, MIME and XML processors MUST treat the enclosed entity as UTF-16 encoded.
If sent using a 7-bit transport (e.g., SMTP) or an 8-bit clean transport (e.g., 8BITMIME ESMTP or NNTP), the XML MIME entity MUST be encoded in quoted-printable or base64. For a binary clean transport (e.g., HTTP), no content-transfer-encoding is necessary.
Content-type: application/xml-external-parsed-entity; charset="utf-16be"
<?xml encoding="utf-16be"?>
Since the charset parameter is provided, MIME and XML processors MUST treat the enclosed entity as UTF-16BE encoded.
Content-type: application/xml-dtd; charset="utf-8"
<?xml encoding="utf-8"?>
Charset "utf-8" is a recommended charset value for use with application/xml-dtd. Since the charset parameter is provided, MIME and XML processors MUST treat the enclosed entity as UTF-8 encoded.
Content-type: application/mathml+xml
<?xml version="1.0" ?>
MathML documents are XML documents whose content describes mathematical information, as defined by [MathML]. As a format based on XML, MathML documents SHOULD use the '+xml' suffix convention in their MIME content-type identifier. However, no content type has yet been registered for MathML and so this media type should not be used until such registration has been completed.
Content-type: application/xslt+xml
<?xml version="1.0" ?>
Extensible Stylesheet Language (XSLT) documents are XML documents whose content describes stylesheets for other XML documents, as defined by [XSLT]. As a format based on XML, XSLT documents SHOULD use the '+xml' suffix convention in their MIME content-type identifier. However, no content type has yet been registered for XSLT and so this media type should not be used until such registration has been completed.
Content-type: application/rdf+xml
<?xml version="1.0" ?>
RDF documents identified using this MIME type are XML documents whose content describes metadata, as defined by [RDF]. As a format based on XML, RDF documents SHOULD use the '+xml' suffix convention in their MIME content-type identifier. However, no content type has yet been registered for RDF and so this media type should not be used until such registration has been completed.
Content-type: image/svg+xml
<?xml version="1.0" ?>
Scalable Vector Graphics (SVG) documents are XML documents whose content describes graphical information, as defined by [SVG]. As a format based on XML, SVG documents SHOULD use the '+xml' suffix convention in their MIME content-type identifier. However, no content type has yet been registered for SVG and so this media type should not be used until such registration has been completed.
Content-type: text/xml; charset="utf-8"
<?xml version="1.0" encoding="iso-8859-1"?>
Since the charset parameter is provided in the Content-Type header, MIME and XML processors MUST treat the enclosed entity as UTF-8 encoded. That is, the "iso-8859-1" encoding MUST be ignored.
Processors generating XML MIME entities MUST NOT label conflicting charset information between the MIME Content-Type and the XML declaration.
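The precedence rule in this example, and the corresponding check a generating processor might run before sending, can be sketched as follows. This is an illustrative sketch only; the function names are assumptions, and the case where no charset parameter is present is deliberately left unmodeled because it is governed by rules stated elsewhere in this document.

```python
from email.message import Message
from typing import Optional


def charset_from_header(content_type: str) -> Optional[str]:
    """Return the lowercased charset parameter of a Content-Type value.

    Returns None when the header carries no charset parameter.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def labels_conflict(content_type: str, xml_decl_encoding: Optional[str]) -> bool:
    """True if the header charset and the XML encoding declaration differ.

    A receiver treats the header charset as authoritative; a generator
    should avoid producing entities for which this returns True.
    """
    header_charset = charset_from_header(content_type)
    if header_charset is None or xml_decl_encoding is None:
        return False
    return header_charset != xml_decl_encoding.lower()
```

With the entity shown above, `charset_from_header('text/xml; charset="utf-8"')` yields "utf-8", and `labels_conflict` reports the "iso-8859-1" declaration as conflicting.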
As described in Section 7, this document updates the [RFC2048] registration process for XML-based MIME types.
XML, as a subset of SGML, has all of the same security considerations as specified in [RFC1874], and likely more, due to its expected ubiquitous deployment.
To paraphrase section 3 of RFC 1874, XML MIME entities contain information to be parsed and processed by the recipient's XML system. These entities may contain, and such systems may permit, explicit system level commands to be executed while processing the data. To the extent that an XML system will execute arbitrary command strings, recipients of XML MIME entities may be at risk. In general, it may be possible to specify commands that perform unauthorized file operations or make changes to the display processor's environment that affect subsequent operations.
In general, any information stored outside of the direct control of the user -- including CSS style sheets, XSL transformations, entity declarations, and DTDs -- can be a source of insecurity, by either obvious or subtle means. For example, a tiny "whiteout attack" modification made to a "master" style sheet could make words in critical locations disappear in user documents, without directly modifying the user document or the stylesheet it references. Thus, the security of any XML document is vitally dependent on all of the documents recursively referenced by that document.
The entity lists and DTDs for XHTML 1.0[XHTML], for instance, are likely to be a commonly used set of information. Many developers will use and trust them, few of whom will know much about the level of security on the W3C's servers, or on any similarly trusted repository.
The simplest attack involves adding declarations that break validation. Adding extraneous declarations to a list of character entities can effectively "break the contract" used by documents. A tiny change that produces a fatal error in a DTD could halt XML processing on a large scale. Extraneous declarations are fairly obvious, but more sophisticated tricks, like changing attributes from being optional to required, can be difficult to track down. Perhaps the most dangerous option available to crackers is redefining default values for attributes: e.g., if developers have relied on defaulted attributes for security, a relatively small change might expose enormous quantities of information.
Apart from the structural possibilities, another option, "entity spoofing," can be used to insert text into documents, vandalizing and perhaps conveying an unintended message. Because XML 1.0 permits multiple entity declarations, and the first declaration takes precedence, it's possible to insert malicious content where an entity is used, such as by inserting the full text of Winnie the Pooh in every occurrence of &mdash;.
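The first-declaration-wins behavior behind entity spoofing can be observed with a small experiment. The sketch below is illustrative only: it uses Python's standard xml.etree.ElementTree parser, and the document, the element name "note", and the replacement text are invented for the example.

```python
import xml.etree.ElementTree as ET


def expand(document: str) -> str:
    """Parse a document and return the text of its root element."""
    return ET.fromstring(document).text


# Two declarations of the same entity in the internal subset.  The
# first one stands in for a declaration injected by an attacker; the
# second is the declaration the author intended.
SPOOFED = (
    "<!DOCTYPE note ["
    '<!ENTITY mdash "spoofed text">'
    '<!ENTITY mdash "&#8212;">'
    "]>"
    "<note>before&mdash;after</note>"
)

# The first declaration is the one that takes effect.
result = expand(SPOOFED)
```

Here `result` is "beforespoofed textafter": the later, intended declaration is silently ignored.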
Use of the digital signatures work currently underway by the xmldsig working group may eventually ameliorate the dangers of referencing external documents not under one's own control.
Use of XML is expected to be varied, and widespread. XML is under scrutiny by a wide range of communities for use as a common syntax for community-specific metadata. For example, the Dublin Core[RFC2413] group is using XML for document metadata, and a new effort has begun that is considering use of XML for medical information. Other groups view XML as a mechanism for marshalling parameters for remote procedure calls. More uses of XML will undoubtedly arise.
Security considerations will vary by domain of use. For example, XML medical records will have much more stringent privacy and security considerations than XML library metadata. Similarly, use of XML as a parameter marshalling syntax necessitates a case by case security review.
XML may also have some of the same security concerns as plain text. Like plain text, XML can contain escape sequences that, when displayed, have the potential to change the display processor environment in ways that adversely affect subsequent operations. Possible effects include, but are not limited to, locking the keyboard, changing display parameters so subsequent displayed text is unreadable, or even changing display parameters to deliberately obscure or distort subsequent displayed material so that its meaning is lost or altered. Display processors SHOULD either filter such material from displayed text or else make sure to reset all important settings after a given display operation is complete.
Some terminal devices have keys whose output, when pressed, can be changed by sending the display processor a character sequence. If this is possible the display of a text object containing such character sequences could reprogram keys to perform some illicit or dangerous action when the key is subsequently pressed by the user. In some cases not only can keys be programmed, they can be triggered remotely, making it possible for a text display operation to directly perform some unwanted action. As such, the ability to program keys SHOULD be blocked either by filtering or by disabling the ability to program keys entirely.
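One way a display processor might apply the filtering recommended above is to drop control characters before text reaches the terminal. This is a minimal sketch under stated assumptions: the function name is hypothetical, and the choice to keep only tab, line feed, and carriage return among the control characters is an illustrative policy, not a requirement of this document.

```python
import re

# C0 controls except TAB (0x09), LF (0x0A), and CR (0x0D), plus DEL
# and the C1 control range.  ESC (0x1B), which introduces most
# terminal escape sequences, is among the characters removed.
_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f-\x9f]")


def strip_control_chars(text: str) -> str:
    """Remove control characters that could alter terminal state."""
    return _CONTROL_CHARS.sub("", text)
```

Removing the ESC character leaves the rest of a sequence such as "[2J" visible as ordinary text, which is harmless to the terminal state.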
Note that it is also possible to construct XML documents that make use of what XML terms "entity references" (using the XML meaning of the term "entity" as described in Section 2), to construct repeated expansions of text. Recursive expansions are prohibited by [XML] and XML processors are required to detect them. However, even non-recursive expansions may cause problems with the finite computing resources of computers, if they are performed many times.
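The resource problem with non-recursive expansion is easiest to see as arithmetic. In the sketch below, which is illustrative only, an entity at level 0 holds some literal text and each entity at the next level consists of a fixed number of references to the entity one level down; the function name and parameters are assumptions made for the example.

```python
def expanded_length(base_len: int, refs_per_level: int, levels: int) -> int:
    """Length of the fully expanded top-level entity.

    base_len       -- characters in the level-0 entity's literal text
    refs_per_level -- references each entity makes to the one below it
    levels         -- number of entity levels stacked above level 0
    """
    return base_len * refs_per_level ** levels
```

Ten characters of text, ten references per level, and nine levels give `expanded_length(10, 10, 9)`, which is 10,000,000,000 characters from a document of only a few hundred bytes, with no recursion involved.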
References
[ASCII] "US-ASCII. Coded Character Set -- 7-Bit American Standard Code for Information Interchange", ANSI X3.4-1986, 1986.
[CSS] Bos, B., Lie, H.W., Lilley, C. and I. Jacobs, "Cascading Style Sheets, level 2 (CSS2) Specification", World Wide Web Consortium Recommendation REC-CSS2, May 1998, <http://www.w3.org/TR/REC-CSS2/>.
[ISO8859] "ISO-8859. International Standard -- Information Processing -- 8-bit Single-Byte Coded Graphic Character Sets -- Part 1: Latin alphabet No. 1, ISO-8859-1:1987", 1987.
[MathML] Ion, P. and R. Miner, "Mathematical Markup Language (MathML) 1.01", World Wide Web Consortium Recommendation REC-MathML, July 1999, <http://www.w3.org/TR/REC-MathML/>.
[PNG] Boutell, T., "PNG (Portable Network Graphics) Specification", World Wide Web Consortium Recommendation REC-png, October 1996, <http://www.w3.org/TR/REC-png>.
[RDF] Lassila, O. and R.R. Swick, "Resource Description Framework (RDF) Model and Syntax Specification", World Wide Web Consortium Recommendation REC-rdf-syntax, February 1999, <http://www.w3.org/TR/REC-rdf-syntax/>.
[RFC0821] Postel, J., "Simple Mail Transfer Protocol", STD 10, RFC 821, August 1982.
[RFC0977] Kantor, B. and P. Lapsley, "Network News Transfer Protocol", RFC 977, February 1986.
[RFC1557] Choi, U., Chon, K. and H. Park, "Korean Character Encoding for Internet Messages", RFC 1557, December 1993.
[RFC1652] Klensin, J., Freed, N., Rose, M., Stefferud, E. and D. Crocker, "SMTP Service Extension for 8bit-MIMEtransport", RFC 1652, July 1994.
[RFC1874] Levinson, E., "SGML Media Types", RFC 1874, December 1995.
[RFC2045] Freed, N. and N. Borenstein, "Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies", RFC 2045, November 1996.
[RFC2046] Freed, N. and N. Borenstein, "Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types", RFC 2046, November 1996.
[RFC2048] Freed, N., Klensin, J. and J. Postel, "Multipurpose Internet Mail Extensions (MIME) Part Four: Registration Procedures", RFC 2048, November 1996.
[RFC2060] Crispin, M., "Internet Message Access Protocol - Version 4rev1", RFC 2060, December 1996.
[RFC2077] Nelson, S., Parks, C. and Mitra, "The Model Primary Content Type for Multipurpose Internet Mail Extensions", RFC 2077, January 1997.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC2130] Weider, C., Preston, C., Simonsen, K., Alvestrand, H., Atkinson, R., Crispin, M. and P. Svanberg, "The Report of the IAB Character Set Workshop held 29 February - 1 March, 1996", RFC 2130, April 1997.
[RFC2279] Yergeau, F., "UTF-8, a transformation format of ISO 10646", RFC 2279, January 1998.
[RFC2376] Whitehead, E. and M. Murata, "XML Media Types", RFC 2376, July 1998.
[RFC2396] Berners-Lee, T., Fielding, R. and L. Masinter, "Uniform Resource Identifiers (URI): Generic Syntax.", RFC 2396, August 1998.
[RFC2413] Weibel, S., Kunze, J., Lagoze, C. and M. Wolf, "Dublin Core Metadata for Resource Discovery", RFC 2413, September 1998.
[RFC2445] Dawson, F. and D. Stenerson, "Internet Calendaring and Scheduling Core Object Specification (iCalendar)", RFC 2445, November 1998.
[RFC2518] Goland, Y., Whitehead, E., Faizi, A., Carter, S. and D. Jensen, "HTTP Extensions for Distributed Authoring -- WEBDAV", RFC 2518, February 1999.
[RFC2616] Fielding, R., Gettys, J., Mogul, J., Nielsen, H., Masinter, L., Leach, P. and T. Berners-Lee, "Hypertext Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999.
[RFC2629] Rose, M., "Writing I-Ds and RFCs using XML", RFC 2629, June 1999.
[RFC2703] Klyne, G., "Protocol-independent Content Negotiation Framework", RFC 2703, September 1999.
[RFC2781] Hoffman, P. and F. Yergeau, "UTF-16, an encoding of ISO 10646", RFC 2781, February 2000.
[RFC2801] Burdett, D., "Internet Open Trading Protocol - IOTP Version 1.0", RFC 2801, April 2000.
[SGML] International Standard Organization, "Information Processing -- Text and Office Systems -- Standard Generalized Markup Language (SGML)", ISO 8879, October 1986.
[SVG] Ferraiolo, J., "Scalable Vector Graphics (SVG)", World Wide Web Consortium Candidate Recommendation SVG, November 2000, <http://www.w3.org/TR/SVG>.
[XHTML] Pemberton, S. and et al, "XHTML 1.0: The Extensible HyperText Markup Language", World Wide Web Consortium Recommendation xhtml1, January 2000, <http://www.w3.org/TR/xhtml1>.
[XML] Bray, T., Paoli, J., Sperberg-McQueen, C.M. and E. Maler, "Extensible Markup Language (XML) 1.0 (Second Edition)", World Wide Web Consortium Recommendation REC-xml, October 2000, <http://www.w3.org/TR/REC-xml>.
[XSLT] Clark, J., "XSL Transformations (XSLT) Version 1.0", World Wide Web Consortium Recommendation xslt, November 1999, <http://www.w3.org/TR/xslt>.
Authors' Addresses
MURATA Makoto (FAMILY Given) IBM Tokyo Research Laboratory 1623-14, Shimotsuruma Yamato-shi, Kanagawa-ken 242-8502 Japan
Phone: +81-46-215-4678 EMail: mmurata@trl.ibm.co.jp
Simon St.Laurent simonstl.com 1259 Dryden Road Ithaca, New York 14850 USA
EMail: simonstl@simonstl.com URI: http://www.simonstl.com/
Dan Kohn Skymoon Ventures 3045 Park Boulevard Palo Alto, California 94306 USA
Phone: +1-650-327-2600 EMail: dan@dankohn.com URI: http://www.dankohn.com/
Appendix A. Why Use the '+xml' Suffix for XML-Based MIME Types?
Although the use of a suffix was not considered as part of the original MIME architecture, this choice is considered to provide the most functionality with the least potential for interoperability problems or lack of future extensibility. The alternatives to the '+xml' suffix and the reason for its selection are described below.
A.1 Why not just use text/xml or application/xml and let the XML processor dispatch to the correct application based on the referenced DTD?
text/xml and application/xml remain useful in many situations, especially for document-oriented applications that involve combining XML with a stylesheet in order to present the data. However, XML is also used to define entirely new data types, and an XML-based format such as image/svg+xml fits the definition of a MIME media type exactly as well as image/png[PNG] does. (Note that image/svg+xml is not yet registered.) Although extra functionality is available for MIME processors that are also XML processors, XML-based media types -- even when treated as opaque, non-XML media types -- are just as useful as any other media type and should be treated as such.
Since MIME dispatchers work off of the MIME type, use of text/xml or application/xml to label discrete media types will hinder correct dispatching and general interoperability. Finally, many XML documents use neither DTDs nor namespaces, yet are perfectly legal XML.
A.2 Why not create a new subtree (e.g., image/xml.svg) to represent XML MIME types?
The subtree under which a media type is registered -- IETF, vendor (*/vnd.*), or personal (*/prs.*); see [RFC2048] for details -- is completely orthogonal from whether the media type uses XML syntax or not. The suffix approach allows XML document types to be identified within any subtree. The vendor subtree, for example, is likely to include a large number of XML-based document types. By using a suffix, rather than setting up a separate subtree, those types may remain in the same location in the tree of MIME types that they would have occupied had they not been based on XML.
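Because the suffix is independent of the registration tree, a single check suffices to recognize XML-based types in any subtree. The following is a minimal sketch; the function name and the vendor and personal type names used in the usage note are hypothetical.

```python
def uses_xml_suffix(content_type: str) -> bool:
    """True if the media type follows the '+xml' suffix convention.

    Parameters after ';' are ignored, and the comparison is
    case-insensitive, as media type names are.
    """
    media_type = content_type.split(";", 1)[0].strip().lower()
    if "/" not in media_type:
        return False
    return media_type.endswith("+xml")
```

The same test accepts "image/svg+xml", a vendor tree name such as "application/vnd.example+xml", and a personal tree name such as "application/prs.example+xml", while rejecting "image/png".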
A.3 Why not create a new top-level MIME type for XML-based media types?
The top-level MIME type (e.g., model/*[RFC2077]) determines what kind of content the type is, not what syntax it uses. For example, agents using image/* to signal acceptance of any image format should certainly be given access to media type image/svg+xml, which is in all respects a standard image subtype. It just happens to use XML to describe its syntax. The two aspects of the media type are completely orthogonal.
XML-based data types will most likely be registered in ALL top-level categories. Potential, though currently unregistered, examples could include application/mathml+xml[MathML] and image/svg+xml[SVG].
A.4 Why not just have the MIME processor 'sniff' the content to determine whether it is XML?
Rather than explicitly labeling XML-based media types, the processor could look inside each type and see whether or not it is XML. The processor could also cache a list of XML-based media types.
Although this method might work acceptably for some mail applications, it would fail completely in many other uses of MIME. For instance, an XML-based web crawler would have no way of determining whether a file is XML except to fetch it and check. The same issue applies in some IMAP4[RFC2060] mail applications, where the client first fetches the MIME type as part of the message structure and then decides whether to fetch the MIME entity. Requiring these fetches just to determine whether the MIME type is XML could have significant bandwidth and latency disadvantages in many situations.
Sniffing XML also isn't as simple as it might seem. DOCTYPE declarations aren't required, and they can appear fairly deep into a document under certain unpreventable circumstances. (E.g., the XML declaration, comments, and processing instructions can occupy space before the DOCTYPE declaration.) Even sniffing the DOCTYPE isn't completely reliable, thanks to a variety of issues involving default values for namespaces within external DTDs and overrides inside the internal DTD. Finally, the variety in potential character encodings (something XML provides tools to deal with), also makes reliable sniffing less likely.
A.5 Why not use a MIME parameter to specify that a media type uses XML syntax?
For example, one could use "Content-Type: application/iotp; alternate-type=text/xml" or "Content-Type: application/iotp; syntax=xml".
Section 5 of [RFC2045] says that "Parameters are modifiers of the media subtype, and as such do not fundamentally affect the nature of the content". However, all XML-based media types are by their nature always XML. Parameters, as they have been defined in the MIME architecture, are never invariant across all instantiations of a media type.
More practically, very few if any MIME dispatchers and other MIME agents support dispatching off of a parameter. While MIME agents on the receiving side will need to be updated in either case to support (or fall back to) generic XML processing, it has been suggested that it is easier to implement this functionality when acting off of the media type rather than a parameter. More important, sending agents require no update to properly tag an image as "image/svg+xml", but few if any sending agents currently support always tagging certain content types with a parameter.
A.6 How about labeling with parameters in the other direction (e.g., application/xml; Content-Feature=iotp)?
This proposal fails under the simplest case, of a user with neither knowledge of XML nor an XML-capable MIME dispatcher. In that case, the user's MIME dispatcher is likely to dispatch the content to an XML processing application when the correct default behavior should be to dispatch the content to the application responsible for the content type (e.g., an ecommerce engine for application/iotp+xml[RFC2801], once this media type is registered).
Note that even if the user had already installed the appropriate application (e.g., the ecommerce engine), and that installation had updated the MIME registry, many operating system level MIME registries such as .mailcap in Unix and HKEY_CLASSES_ROOT in Windows do not currently support dispatching off a parameter, and cannot easily be upgraded to do so. And, even if the operating system were upgraded to support this, each MIME dispatcher would also separately need to be upgraded.
A.7 How about a new superclass MIME parameter that is defined to apply to all MIME types (e.g., Content-Type: application/iotp; $superclass=xml)?
This combines the problems of Appendix A.5 and Appendix A.6.
If the sender attaches an image/svg+xml file to a message and includes the instructions "Please copy the French text on the road sign", someone with an XML-aware MIME client and an XML browser but no support for SVG can still probably open the file and copy the text. By contrast, with superclasses, the sender must add superclass support to her existing mailer AND the receiver must add superclass support to his before this transaction can work correctly.
If the receiver comes to rely on the superclass tag being present and applications are deployed relying on that tag (as always seems to happen), then only upgraded senders will be able to interoperate with those receiving applications.
A.8 What about adding a new parameter to the Content-Disposition header or creating a new Content-Structure header to indicate XML syntax?
This has nearly identical problems to Appendix A.7, in that it requires both senders and receivers to be upgraded, and few if any operating systems and MIME dispatchers support working off of anything other than the MIME type.
A.9 How about a new Alternative-Content-Type header?
This is better than Appendix A.8, in that no extra functionality needs to be added to a MIME registry to support dispatching of information other than standard content types. However, it still requires both sender and receiver to be upgraded, and it will also fail in many cases (e.g., web hosting to an outsourced server), where the user can set MIME types (often through implicit mapping to file extensions), but has no way of adding arbitrary HTTP headers.
A.10 How about using a conneg tag instead (e.g., accept-features: (syntax=xml))?
When the conneg protocol is fully defined, this may potentially be a reasonable thing to do. But given the limited current state of conneg[RFC2703] development, it is not a credible replacement for a MIME-based solution.
A.11 How about a third-level content-type, such as text/xml/rdf?
MIME explicitly defines two levels of content type, the top-level for the kind of content and the second-level for the specific media type. [RFC2048] extends this in an interoperable way by using prefixes to specify separate trees for IETF, vendor, and personal registrations. This specification also extends the two-level type by using the '+xml' suffix. In both cases, processors that are unaware of these later specifications treat them as opaque and continue to interoperate. By contrast, adding a third-level type would break the current MIME architecture and cause numerous interoperability failures.
A.12 Why use the plus ('+') character for the suffix '+xml'?
As specified in Section 5.1 of [RFC2045], a tspecial can't be used:
   tspecials :=  "(" / ")" / "<" / ">" / "@" /
                 "," / ";" / ":" / "\" / <">
                 "/" / "[" / "]" / "?" / "="
It was thought that "." would not be a good choice since it is already used as an additional hierarchy delimiter. Also, "*" has a common wildcard meaning, and "-" and "_" are common word separators and easily confused. The characters %'`#& are frequently used for quoting or comments and so are not ideal.
That leaves: ~!$^+{}|
Note that "-" is used heavily in the current registry. "$" and "_" are used once each. The others are currently unused.
It was thought that '+' expressed the semantics that a MIME type can be treated (for example) as both scalable vector graphics AND ALSO as XML; it is both simultaneously.
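The elimination described in this answer can be reproduced with a short Python sketch. It starts from the ASCII punctuation characters and removes each rejected group in turn. The variable and function names are illustrative only, and the starting set (Python's string.punctuation) is an assumption chosen because it covers the printable non-alphanumeric ASCII characters.

```python
import string

# Characters excluded by the tspecials rule of [RFC2045], Section 5.1.
TSPECIALS = set('()<>@,;:\\"/[]?=')
# "." is already a hierarchy delimiter and "*" is a common wildcard.
HIERARCHY_AND_WILDCARD = set(".*")
# "-" and "_" are common word separators and easily confused.
WORD_SEPARATORS = set("-_")
# Characters frequently used for quoting or comments.
QUOTING_OR_COMMENT = set("%'`#&")


def candidate_suffix_delimiters():
    """Return the punctuation characters that survive every exclusion."""
    candidates = set(string.punctuation)
    for rejected in (TSPECIALS, HIERARCHY_AND_WILDCARD,
                     WORD_SEPARATORS, QUOTING_OR_COMMENT):
        candidates -= rejected
    return candidates


print("".join(sorted(candidate_suffix_delimiters())))  # prints !$+^{|}~
```

The eight surviving characters are the same ones listed after "That leaves:", in a different order.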
A.13 What is the semantic difference between application/foo and application/foo+xml?
MIME processors that are unaware of XML will treat the '+xml' suffix as completely opaque, so it is essential that no extra semantics be assigned to its presence. Therefore, application/foo and application/foo+xml SHOULD be treated as completely independent media types. Although, for example, text/calendar+xml could be an XML version of text/calendar [RFC2445], it is possible that this (hypothetical) new media type would include new semantics as well as new syntax, and in any case, there would be many applications that support text/calendar but have not yet been upgraded to support text/calendar+xml.
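One way a receiver might honor this independence is sketched below in Python. The function select_handler, the handler table, and the handler names are hypothetical. The sketch assumes that parameters have already been stripped from the media type. The key point is that the '+xml' suffix is never removed in order to find a handler registered for the unsuffixed type.

```python
def select_handler(media_type, handlers, generic_xml_handler=None):
    """Pick a handler by exact media type match.

    If no exact match exists and a generic XML handler is available,
    a type ending in '+xml' may be sent to generic XML processing.
    The handler for application/foo is never used for
    application/foo+xml, because the two types are independent.
    """
    media_type = media_type.strip().lower()
    if media_type in handlers:
        return handlers[media_type]
    if generic_xml_handler is not None and media_type.endswith("+xml"):
        return generic_xml_handler
    return None


handlers = {"text/calendar": "calendar-handler"}
print(select_handler("text/calendar", handlers))          # calendar-handler
print(select_handler("text/calendar+xml", handlers))      # None
print(select_handler("text/calendar+xml", handlers,
                     generic_xml_handler="generic-xml"))  # generic-xml
```

An application that supports only text/calendar therefore declines text/calendar+xml rather than misinterpreting it, while an XML-aware application can still fall back to generic XML processing.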
A.14 What happens when an even better markup language (e.g., EBML) is defined, or a new category of data?
In the ten years that MIME has existed, XML is the first generic data format that has seemed to justify special treatment, so it is hoped that no further suffixes will be necessary. However, if some are later defined, and these documents were also XML, they would need to specify that the '+xml' suffix is always the outermost suffix (e.g., application/foo+ebml+xml not application/foo+xml+ebml). If they were not XML, then they would use a regular suffix (e.g., application/foo+ebml).
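The "outermost suffix" rule can be checked mechanically, as in the following Python sketch. The function names are hypothetical, and '+ebml' is the same illustrative suffix used in the answer above, not a registered one.

```python
def subtype_suffixes(media_type):
    """Return the list of '+' suffixes of the subtype, in order."""
    subtype = media_type.strip().lower().partition("/")[2]
    return subtype.split("+")[1:]


def has_outermost_xml_suffix(media_type):
    """True only when the last (outermost) suffix is 'xml'."""
    suffixes = subtype_suffixes(media_type)
    return bool(suffixes) and suffixes[-1] == "xml"


print(has_outermost_xml_suffix("application/foo+ebml+xml"))  # True
print(has_outermost_xml_suffix("application/foo+xml+ebml"))  # False
print(has_outermost_xml_suffix("application/foo+ebml"))      # False
```

Keeping '+xml' outermost means that a generic XML processor only ever needs to inspect the end of the subtype, whatever other suffixes may be defined later.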
A.15 Why must I use the '+xml' suffix for my new XML-based media type?
You don't have to, but unless you have a good reason to explicitly disallow generic XML processing, you should use the suffix so as not to curtail the options of future users and developers.
Whether the inventors of a media type, today, design it for dispatch to generic XML processing machinery (and most won't) is not the critical issue. The core notion is that the knowledge that some media type happens to use XML syntax opens the door to unanticipated kinds of processing beyond those envisioned by its inventors, and on this basis identifying such encoding is a good and useful thing.
Developers of new media types are often tightly focused on a particular type of processing that meets current needs. But there is no need to rule out generic processing as well, which could make your media type more valuable over time. It is believed that registering with the '+xml' suffix will cause no interoperability problems whatsoever, while it may enable significant new functionality and interoperability now and in the future. So, the conservative approach is to include the '+xml' suffix.
Appendix B. Changes from RFC 2376
There are numerous and significant differences between this specification and [RFC2376], which it obsoletes. This appendix summarizes the major differences only.
First, text/xml-external-parsed-entity and application/xml-external-parsed-entity are added as media types for external parsed entities, and text/xml and application/xml are now prohibited for that purpose.
Second, application/xml-dtd is added as a media type for external DTD subsets and external parameter entities, and text/xml and application/xml are now prohibited for that purpose.
Third, the charset values "utf-16le" and "utf-16be" are added. [RFC2781] introduced these BOM-less variations of the UTF-16 family.
Fourth, a naming convention ('+xml') for XML-based media types has been added, which also updates [RFC2048] as described in Section 7. By following this convention, an XML-based media type can be easily recognized as such.
Appendix C. Acknowledgements
This document reflects the input of numerous participants to the ietf-xml-mime@imc.org mailing list, though any errors are the responsibility of the authors. Special thanks to:
Mark Baker, James Clark, Dan Connolly, Martin Duerst, Ned Freed, Yaron Goland, Rick Jelliffe, Larry Masinter, David Megginson, Keith Moore, Chris Newman, Gavin Nicol, Marshall Rose, Jim Whitehead and participants of the XML activity at the W3C.
Full Copyright Statement
Copyright (C) The Internet Society (2001). All Rights Reserved.
This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English.
The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assigns.
This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Acknowledgement
Funding for the RFC Editor function is currently provided by the Internet Society.