Network Working Group D. Connolly Request for Comments: 2854 World Wide Web Consortium (W3C) Obsoletes: 2070, 1980, 1942, 1867, 1866 L. Masinter Category: Informational AT&T June 2000
Network Working Group D. Connolly Request for Comments: 2854 World Wide Web Consortium (W3C) Obsoletes: 2070, 1980, 1942, 1867, 1866 L. Masinter Category: Informational AT&T June 2000
The 'text/html' Media Type
“text/html”媒体类型
Status of this Memo
本备忘录的状况
This memo provides information for the Internet community. It does not specify an Internet standard of any kind. Distribution of this memo is unlimited.
本备忘录为互联网社区提供信息。它没有规定任何类型的互联网标准。本备忘录的分发不受限制。
Copyright Notice
版权公告
Copyright (C) The Internet Society (2000). All Rights Reserved.
版权所有(C)互联网协会(2000年)。版权所有。
Abstract
摘要
This document summarizes the history of HTML development, and defines the "text/html" MIME type by pointing to the relevant W3C recommendations; it is intended to obsolete the previous IETF documents defining HTML, including RFC 1866, RFC 1867, RFC 1980, RFC 1942 and RFC 2070, and to remove HTML from IETF Standards Track.
本文档总结了HTML开发的历史,并通过指向相关的W3C建议定义了“text/HTML”MIME类型;它旨在淘汰以前定义HTML的IETF文档,包括RFC 1866、RFC 1867、RFC 1980、RFC 1942和RFC 2070,并从IETF标准轨道中删除HTML。
This document was prepared at the request of the W3C HTML working group. Please send comments to www-html@w3.org, a public mailing list with archive at <http://lists.w3.org/Archives/Public/www-html/>.
本文件是应W3C HTML工作组的要求编写的。请将意见发送至www-html@w3.org,一个公共邮件列表,存档地址为<http://lists.w3.org/Archives/Public/www-html/>.
HTML has been in use in the World Wide Web information infrastructure since 1990, and specified in various informal documents. The text/html media type was first officially defined by the IETF HTML working group in 1995 in [HTML20]. Extensions to HTML were proposed in [HTML30], [UPLOAD], [TABLES], [CLIMAPS], and [I18N].
自1990年以来,HTML已在万维网信息基础设施中使用,并在各种非正式文件中指定。1995年,IETF html工作组在[HTML20]中首次正式定义了文本/html媒体类型。在[HTML30]、[UPLOAD]、[TABLES]、[CLIMAPS]和[I18N]中提出了对HTML的扩展。
The IETF HTML working group closed Sep 1996, and work on defining HTML moved to the World Wide Web Consortium (W3C). The proposed extensions were incorporated to some extent in [HTML32], and to a larger extent in [HTML40]. The definition of multipart/form-data from [UPLOAD] was described in [FORMDATA]. In addition, a reformulation of HTML 4.0 in XML 1.0[XHTML1] was developed.
IETF HTML工作组于1996年9月结束,定义HTML的工作转移到了万维网联盟(W3C)。建议的扩展在某种程度上包含在[HTML32]中,在更大程度上包含在[HTML40]中。[UPLOAD]中的多部分/表单数据定义如[FORMDATA]所述。此外,还开发了XML1.0[XHTML1]中HTML4.0的重新格式。
[HTML32] notes "This specification defines HTML version 3.2. HTML 3.2 aims to capture recommended practice as of early '96 and as such to be used as a replacement for HTML 2.0 (RFC 1866)." Subsequent specifications for HTML describe the differences in each version.
[HTML32]注释“本规范定义了HTML版本3.2。HTML3.2旨在获取自1996年初以来的推荐实践,并以此作为HTML2.0(RFC1866)的替代品。”随后的HTML规范描述了每个版本的差异。
In addition to the development of standards, a wide variety of additional extensions, restrictions, and modifications to HTML were popularized by NCSA's Mosaic system and subsequently by the competitive implementations of Netscape Navigator and Microsoft Internet Explorer; these extensions are documented in numerous books and online guides.
除了制定标准外,NCSA的Mosaic系统以及随后的Netscape Navigator和Microsoft Internet Explorer的竞争性实施推广了对HTML的各种附加扩展、限制和修改;这些扩展被记录在许多书籍和在线指南中。
MIME media type name: text MIME subtype name: html Required parameters: none Optional parameters:
MIME媒体类型名称:文本MIME子类型名称:html必需参数:无可选参数:
charset The optional parameter "charset" refers to the character encoding used to represent the HTML document as a sequence of bytes. Any registered IANA charset may be used, but UTF-8 is preferred. Although this parameter is optional, it is strongly recommended that it always be present. See Section 6 below for a discussion of charset default rules.
charset可选参数“charset”是指用于将HTML文档表示为字节序列的字符编码。可以使用任何注册的IANA字符集,但最好使用UTF-8。尽管此参数是可选的,但强烈建议它始终存在。有关字符集默认规则的讨论,请参见下面的第6节。
Note that [HTML20] included an optional "level" parameter; in practice, this parameter was never used and has been removed from this specification. [HTML30] also suggested a "version" parameter; in practice, this parameter also was never used and has been removed from this specification.
注意,[HTML20]包括一个可选的“级别”参数;实际上,该参数从未使用过,并且已从本规范中删除。[HTML30]还建议了一个“版本”参数;实际上,该参数也从未使用过,并且已从本规范中删除。
Encoding considerations: See Section 4 of this document.
编码注意事项:参见本文档第4节。
Security considerations: See Section 7 of this document.
安全注意事项:见本文件第7节。
Interoperability considerations: HTML is designed to be interoperable across the widest possible range of platforms and devices of varying capabilities. However, there are contexts (platforms of limited display capability, for example) where not all of the capabilities of the full HTML definition are feasible. There is ongoing work to develop both a modularization of HTML and a set of profiling capabilities to identify and negotiate restricted (and extended) capabilities.
互操作性注意事项:HTML被设计为可在尽可能广泛的平台和具有不同功能的设备之间进行互操作。但是,在某些上下文(例如,显示能力有限的平台)中,完整HTML定义的所有功能都不可行。目前正在进行开发HTML模块化和一组分析功能的工作,以识别和协商受限(和扩展)功能。
Due to the long and distributed development of HTML, current practice on the Internet includes a wide variety of HTML variants. Implementors of text/html interpreters must be prepared to be "bug-compatible" with popular browsers in order to work with many HTML documents available the Internet.
由于HTML的长期和分布式发展,目前互联网上的实践包括各种各样的HTML变体。文本/html解释器的实现者必须准备好与流行浏览器“bug兼容”,以便处理互联网上可用的许多html文档。
Typically, different versions are distinguishable by the DOCTYPE declaration contained within them, although the DOCTYPE declaration itself is sometimes omitted or incorrect.
通常,不同的版本可以通过其中包含的DOCTYPE声明来区分,尽管DOCTYPE声明本身有时会被忽略或不正确。
Published specification: The text/html media type is now defined by W3C Recommendations; the latest published version is [HTML401]. In addition, [XHTML1] defines a profile of use of XHTML which is compatible with HTML 4.01 and which may also be labeled as text/html.
已发布的规范:文本/html媒体类型现在由W3C建议定义;最新发布的版本是[HTML401]。此外,[XHTML1]定义了一个与HTML4.01兼容的XHTML使用配置文件,该配置文件也可以标记为text/HTML。
Applications which use this media type: The first and most common application of HTML is the World Wide Web; commonly, HTML documents contain URI references [URI] to other documents and media to be retrieved using the HTTP protocol [HTTP]. Many gateway applications provide HTML-based interfaces to other underlying complex services. Numerous other applications now also use HTML as a convenient platform-independent multimedia document representation.
使用这种媒体类型的应用程序:HTML的第一个也是最常见的应用程序是万维网;通常,HTML文档包含使用HTTP协议[HTTP]检索的其他文档和媒体的URI引用[URI]。许多网关应用程序向其他底层复杂服务提供基于HTML的接口。许多其他应用程序现在也使用HTML作为一种方便的独立于平台的多媒体文档表示。
Additional information:
其他信息:
Magic number: There is no single initial string that is always present for HTML files. However, Section 5 below gives some guidelines for recognizing HTML files.
神奇数字:HTML文件中没有始终存在的单个初始字符串。但是,下面的第5节给出了识别HTML文件的一些指导原则。
File extension: The file extensions 'html' or 'htm' are commonly used, but other extensions denoting file formats for preprocessing are also common.
文件扩展名:文件扩展名“html”或“htm”常用,但表示预处理文件格式的其他扩展名也很常见。
Macintosh File Type code: TEXT
Macintosh文件类型代码:文本
Person & email address to contact for further information: Dan Connolly <connolly@w3.org> Larry Masinter <lmm@acm.org>
Person & email address to contact for further information: Dan Connolly <connolly@w3.org> Larry Masinter <lmm@acm.org>
Intended usage: COMMON
预期用途:普通
Author/Change controller: The HTML specification is a work product of the World Wide Web Consortium's HTML Working Group. The W3C has change control over the HTML specification.
作者/变更控制者:HTML规范是万维网联盟HTML工作组的工作成果。W3C对HTML规范具有变更控制权。
Further information: HTML has a means of including, by reference via URI, additional resources (image, video clip, applet) within the base document. In order to transfer a complete HTML object and the included resources in a single MIME object, the mechanisms of [MHTML] may be used.
进一步信息:HTML有一种方法,通过URI引用,在基础文档中包含其他资源(图像、视频剪辑、小程序)。为了在单个MIME对象中传输完整的HTML对象和包含的资源,可以使用[MHTML]的机制。
The URI specification [URI] notes that the semantics of a fragment identifier (part of a URI after a "#") is a property of the data resulting from a retrieval action, and that the format and interpretation of fragment identifiers is dependent on the media type of the retrieval result.
URI规范[URI]指出,片段标识符(URI中“#”)的语义是检索操作产生的数据的属性,片段标识符的格式和解释取决于检索结果的媒体类型。
For documents labeled as text/html, the fragment identifier designates the correspondingly named element; any element may be named with the "id" attribute, and A, APPLET, FRAME, IFRAME, IMG and MAP elements may be named with a "name" attribute. This is described in detail in [HTML40] section 12.
对于标记为text/html的文档,片段标识符指定相应命名的元素;任何元素都可以用“id”属性命名,而A、APPLET、FRAME、IFRAME、IMG和MAP元素可以用“name”属性命名。[HTML40]第12节对此进行了详细描述。
Because of the availability within HTML itself for using character entity references, documents that use a wide repertoire of characters may still be represented using the US-ASCII charset and transported without encoding. However, transport of text/html using a charset other than US-ASCII may require base64 or quoted-printable encoding for 7-bit channels.
由于HTML本身可以使用字符实体引用,因此使用大量字符的文档仍然可以使用US-ASCII字符集表示,并且在传输时不进行编码。但是,使用US-ASCII以外的字符集传输文本/html可能需要对7位通道使用base64或带引号的可打印编码。
As with all MIME text subtypes, the canonical form of "text/html" must always represent a line break as a sequence of a CR byte value (0x0D) followed by an LF (0x0A) byte value. Similarly, any occurrence of such a CRLF sequence in "text/html" must represent a line break. Use of CR byte values and LF byte values outside of line break sequences is also forbidden. This rule applies regardless of the character encoding ('charset') involved.
与所有MIME文本子类型一样,“text/html”的规范形式必须始终将换行表示为CR字节值(0x0D)后跟LF(0x0A)字节值的序列。类似地,在“text/html”中出现这样的CRLF序列必须表示换行符。也禁止在换行序列之外使用CR字节值和LF字节值。无论涉及何种字符编码(“字符集”),此规则均适用。
Note, however, that the HTTP protocol allows the transport of data not in canonical form, and, in particular, with other end-of-line conventions; see [HTTP] section 3.7.1. This exception is commonly used for HTML.
但是,请注意,HTTP协议允许传输非规范形式的数据,特别是使用其他端到端约定;参见[HTTP]第3.7.1节。此异常通常用于HTML。
HTML sent via email is still subject to the MIME restrictions; this is discussed fully in [MHTML] Section 10.
通过电子邮件发送的HTML仍受MIME限制;[MHTML]第10节对此进行了详细讨论。
Almost all HTML files have the string "<html" or "<HTML" near the beginning of the file.
几乎所有HTML文件的开头都有字符串“<HTML”或“<HTML”。
Documents conformant to HTML 2.0, HTML 3.2 and HTML 4.0 will start with a DOCTYPE declaration "<!DOCTYPE HTML" near the beginning, before the "<html". These dialects are case insensitive. Files may start with white space, comments (introduced by "<!--" ), or processing instructions (introduced by "<?") prior to the DOCTYPE declaration.
符合HTML2.0、HTML3.2和HTML4.0的文档将以DOCTYPE声明“<!DOCTYPE HTML”开头,靠近开头,在“<HTML”之前。这些方言不区分大小写。文件可以在DOCTYPE声明之前以空格、注释(由“<!-->”引入)或处理说明(由“<?”)开头。
XHTML documents (optionally) start with an XML declaration which begins with "<?xml" and are required to have a DOCTYPE declaration "<!DOCTYPE html".
XHTML文档(可选)以XML声明开头,该声明以“<?XML”开头,并且需要有DOCTYPE声明“<!DOCTYPE html”。
The use of an explicit charset parameter is strongly recommended. While [MIME] specifies "The default character set, which must be assumed in the absence of a charset parameter, is US-ASCII." [HTTP] Section 3.7.1, defines that "media subtypes of the 'text' type are defined to have a default charset value of 'ISO-8859-1'". Section 19.3 of [HTTP] gives additional guidelines. Using an explicit charset parameter will help avoid confusion.
强烈建议使用显式字符集参数。[MIME]指定“在没有字符集参数的情况下,必须假定默认字符集为US-ASCII。”[HTTP]第3.7.1节定义“文本”类型的媒体子类型定义为具有默认字符集值“ISO-8859-1”。[HTTP]第19.3节给出了额外的指南。使用显式字符集参数将有助于避免混淆。
Using an explicit charset parameter also takes into account that the overwhelming majority of deployed browsers are set to use something else than 'ISO-8859-1' as the default; the actual default is either a corporate character encoding or character encodings widely deployed in a certain national or regional community. For further considerations, please also see Section 5.2 of [HTML40].
使用显式字符集参数还考虑到绝大多数已部署的浏览器设置为使用“ISO-8859-1”以外的内容作为默认值;实际默认为公司字符编码或在某个国家或地区社区广泛部署的字符编码。有关进一步的考虑,请参见[HTML40]的第5.2节。
[HTML401], section B.10, notes various security issues with interpreting anchors and forms in HTML documents.
[HTML401]第B.10节指出了解释HTML文档中的锚和表单时的各种安全问题。
In addition, the introduction of scripting languages and interactive capabilities in HTML 4.0 introduced a number of security risks associated with the automatic execution of programs written by the sender but interpreted by the recipient. User agents executing such scripts or programs must be extremely careful to insure that untrusted software is executed in a protected environment.
此外,HTML 4.0中脚本语言和交互功能的引入带来了许多与自动执行由发送方编写但由接收方解释的程序相关的安全风险。执行此类脚本或程序的用户代理必须非常小心,以确保在受保护的环境中执行不受信任的软件。
Daniel W. Connolly World Wide Web Consortium (W3C) MIT Laboratory for Computer Science 545 Technology Square Cambridge, MA 02139, U.S.A.
Daniel W.Connolly万维网联盟(W3C)麻省理工学院计算机科学实验室美国马萨诸塞州剑桥技术广场545号,邮编:02139。
EMail: connolly@w3.org http://www.w3.org/People/Connolly/
EMail: connolly@w3.org http://www.w3.org/People/Connolly/
Larry Masinter AT&T 75 Willow Road Menlo Park, CA 94025
加利福尼亚州门罗公园柳树路75号Larry Masinter AT&T 94025
EMail: LM@att.com http://larry.masinter.net
EMail: LM@att.com http://larry.masinter.net
[CLIMAPS] Seidman, J., "A Proposed Extension to HTML: Client-Side Image Maps", RFC 1980, August 1996.
[Climps]Seidman,J.,“HTML的拟议扩展:客户端图像映射”,RFC 1980,1996年8月。
[FORMDATA] Masinter, L., "Returning Values from Forms: multipart/form-data", RFC 2388, August 1998.
[FORMDATA]Masinter,L.,“从表单返回值:多部分/表单数据”,RFC 23881998年8月。
[HTML20] Berners-Lee, T. and D. Connolly, "Hypertext Markup Language - 2.0", RFC 1866, November 1995.
[HTML20]Berners Lee,T.和D.Connolly,“超文本标记语言-2.0”,RFC 18661995年11月。
[HTML30] Raggett, D., "HyperText Markup Language Specification Version 3.0", September 1995. (Available at <http://www.w3.org/MarkUp/html3/CoverPage>).
[HTML30]Raggett,D.,“超文本标记语言规范3.0版”,1995年9月。(可于<http://www.w3.org/MarkUp/html3/CoverPage>).
[HTML32] Raggett, D., "HTML 3.2 Reference Specification", W3C Recomendation, January 1997. Available at <http://www.w3.org/TR/REC-html32>.
[HTML32]Raggett,D.,“HTML3.2参考规范”,W3C推荐,1997年1月。可在<http://www.w3.org/TR/REC-html32>.
[HTML40] Raggett, D., et al., "HTML 4.0 Specification", W3C Recommendation, December 1997. Available at <http://www.w3.org/TR/1998/REC-html40- 19980424>
[HTML40] Raggett, D., et al., "HTML 4.0 Specification", W3C Recommendation, December 1997. Available at <http://www.w3.org/TR/1998/REC-html40- 19980424>
[HTML401] Raggett, D., et al., "HTML 4.01 Specification", W3C Recommendation, December 1999. Available at <http://www.w3.org/TR/html401>.
[HTML401]Raggett,D.,等,“HTML4.01规范”,W3C建议,1999年12月。可在<http://www.w3.org/TR/html401>.
[HTTP] Gettys, J., Fielding, R., Mogul, J., Frystyk, H., Masinter, L., Leach, P. and T. Berners-Lee, "Hypertext Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999.
[HTTP]Gettys,J.,Fielding,R.,Mogul,J.,Frystyk,H.,Masinter,L.,Leach,P.和T.Berners Lee,“超文本传输协议——HTTP/1.1”,RFC 26161999年6月。
[I18N] Yergeau, F., Nicol, G. and M. Duerst, "Internationalization of the Hypertext Markup Language", RFC 2070, January 1997.
[I18N]Yergeau,F.,Nicol,G.和M.Duerst,“超文本标记语言的国际化”,RFC 2070,1997年1月。
[MHTML] Palme, J., Hotmann, A. and N. Shelness, "MIME Encapsulation of Aggregate Documents, such as HTML (MHTML)", RFC 2557, March 1999.
[MHTML]Palme,J.,Hotmann,A.和N.Shelness,“聚合文档的MIME封装,如HTML(MHTML)”,RFC 2557,1999年3月。
[MIME] Freed, N. and N. Borenstein, "Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types", RFC 2046, November 1996.
[MIME]Freed,N.和N.Borenstein,“多用途互联网邮件扩展(MIME)第二部分:媒体类型”,RFC 20461996年11月。
[TABLES] Raggett, D., "HTML Tables", RFC 1942, May 1996.
[表格]Raggett,D.,“HTML表格”,RFC 1942,1996年5月。
[UPLOAD] Nebel, E. and L. Masinter, "Form-based File Upload in HTML", RFC 1867, November 1995.
[上传]Nebel,E.和L.Masinter,“基于表单的HTML文件上传”,RFC 18671995年11月。
[URI] Berners-Lee, T., Fielding, R. and L. Masinter, "Uniform Resource Identifiers (URI): Generic Syntax", RFC 2396, August 1998.
[URI]Berners Lee,T.,Fielding,R.和L.Masinter,“统一资源标识符(URI):通用语法”,RFC 2396,1998年8月。
[XHTML1] "XHTML 1.0: The Extensible HyperText Markup Language: A Reformulation of HTML 4 in XML 1.0", W3C Recommendation, January 2000. Available at <http://www.w3.org/TR/xhtml1>.
[XHTML1]“XHTML1.0:可扩展超文本标记语言:用XML1.0重新表述HTML4”,W3C建议,2000年1月。可在<http://www.w3.org/TR/xhtml1>.
Copyright (C) The Internet Society (2000). All Rights Reserved.
版权所有(C)互联网协会(2000年)。版权所有。
This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English.
本文件及其译本可复制并提供给他人,对其进行评论或解释或协助其实施的衍生作品可全部或部分编制、复制、出版和分发,不受任何限制,前提是上述版权声明和本段包含在所有此类副本和衍生作品中。但是,不得以任何方式修改本文件本身,例如删除版权通知或对互联网协会或其他互联网组织的引用,除非出于制定互联网标准的需要,在这种情况下,必须遵循互联网标准过程中定义的版权程序,或根据需要将其翻译成英语以外的其他语言。
The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assigns.
上述授予的有限许可是永久性的,互联网协会或其继承人或受让人不会撤销。
This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
本文件和其中包含的信息是按“原样”提供的,互联网协会和互联网工程任务组否认所有明示或暗示的保证,包括但不限于任何保证,即使用本文中的信息不会侵犯任何权利,或对适销性或特定用途适用性的任何默示保证。
Acknowledgement
确认
Funding for the RFC Editor function is currently provided by the Internet Society.
RFC编辑功能的资金目前由互联网协会提供。