Internet Engineering Task Force (IETF)                         L. Romary
Request for Comments: 6129                      TEI Consortium and INRIA
Category: Informational                                      S. Lundberg
ISSN: 2070-1721                            The Royal Library, Copenhagen
                                                           February 2011
The 'application/tei+xml' Media Type
Abstract
This document defines the 'application/tei+xml' media type for markup languages defined in accordance with the Text Encoding and Interchange guidelines.
Status of This Memo
This document is not an Internet Standards Track specification; it is published for informational purposes.
This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Not all documents approved by the IESG are a candidate for any level of Internet Standard; see Section 2 of RFC 5741.
Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at http://www.rfc-editor.org/info/rfc6129.
Copyright Notice
Copyright (c) 2011 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
Table of Contents
   1. Introduction
   2. Recognizing TEI Files
   3. Fragment Identifier
   4. Security Considerations
     4.1. Harmful Content
     4.2. Intellectual Property Rights
     4.3. Authenticity and confidentiality
   5. IANA Considerations
     5.1. Registration of MIME Type 'application/tei+xml'
   6. References
     6.1. Normative References
     6.2. Informative References
1. Introduction
Text Encoding and Interchange (TEI) is an international and interdisciplinary standard that is widely used by libraries, museums, publishers, and individual scholars to represent all kinds of textual material for online research and teaching [TEI].
This document defines the 'application/tei+xml' media type in accordance with [RFC3023] in order to enable generic processing of such documents on the Internet using eXtensible Markup Language (XML) [W3C.REC-xml-20081126] technologies.
2. Recognizing TEI Files
TEI files are XML documents or fragments whose root element (as defined in [W3C.REC-xml-20081126]) is in a TEI namespace. TEI namespace names are defined as Uniform Resource Identifiers (URIs) [RFC3986] in accordance with [W3C.REC-xml-names-20091208] and begin with http://www.tei-c.org/ns/ followed by the version number of the namespace. The current namespace is http://www.tei-c.org/ns/1.0
The most common root element names for TEI documents are
<TEI>
<teiCorpus>
The teiCorpus documents provide the ability to bundle multiple documents into a single file.
Examples:
A document having <TEI> root element
   <?xml version="1.0" encoding="UTF-8" ?>
   <TEI xmlns="http://www.tei-c.org/ns/1.0">
     <teiHeader> ... </teiHeader>
     <text> ... </text>
   </TEI>
A document having <teiCorpus> root element
   <?xml version="1.0" encoding="UTF-8" ?>
   <teiCorpus xmlns="http://www.tei-c.org/ns/1.0">
     <teiHeader> ... </teiHeader>
     <TEI>
       <teiHeader> ... </teiHeader>
       <text> ... </text>
     </TEI>
     <TEI> ... second document ... </TEI>
     <TEI> ... third document ... </TEI>
   </teiCorpus>
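The namespace test described in this section could be sketched as follows. This is a minimal illustration in Python, not part of the registration; the function name `tei_root` and the constant `TEI_NS_PREFIX` are assumptions introduced here, and the sketch relies only on the standard `xml.etree.ElementTree` module.

```python
import xml.etree.ElementTree as ET
from io import BytesIO

# Prefix shared by all TEI namespace names, as stated in this section.
TEI_NS_PREFIX = "http://www.tei-c.org/ns/"


def tei_root(data: bytes):
    """Return (namespace, local_name) if the root element is in a TEI
    namespace, otherwise None. Hypothetical helper for illustration."""
    # The first "start" event reported by iterparse is the root element,
    # so the loop body runs at most once.
    for _event, elem in ET.iterparse(BytesIO(data), events=("start",)):
        tag = elem.tag
        if tag.startswith("{"):
            # ElementTree writes qualified names as "{namespace}local".
            namespace, local_name = tag[1:].split("}", 1)
            if namespace.startswith(TEI_NS_PREFIX):
                return namespace, local_name
        return None
    return None
```

For the first example above, `tei_root` would return the pair `("http://www.tei-c.org/ns/1.0", "TEI")`; a root element in no namespace, or in a non-TEI namespace, yields `None`.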
TEI and teiCorpus files are often given the extensions .tei and .teiCorpus, respectively. There is a third type of file, which often is given the suffix .odd. ODD ("One Document Does it All") is a TEI XML document that includes schema fragments, prose documentation, and reference documentation. It is used for the definition and documentation of XML-based languages, and primarily for the TEI Guidelines [ODD]. In other words, ODD files do not differ from other TEI files in syntax, only in function.
3. Fragment Identifier
Documents having the media type 'application/tei+xml' use the fragment identifier notation as specified in [RFC3023] for the media type 'application/xml'.
4. Security Considerations
An XML resource does not in itself compromise data security. When a resource is made available on a network simply through the dereferencing of an Internationalized Resource Identifier (IRI) [RFC3987] or a URI, care must be taken to properly interpret the data to prevent unintended access. Hence the security issues of [RFC3986], Section 7, apply. In addition, as this media type uses the "+xml" convention, it shares the same security considerations as described in RFC 3023 [RFC3023], Section 10. In general, security issues related to the use of XML in IETF protocols are treated in RFC 3470 [RFC3470], Section 7. We will not try to duplicate this material, but will review some aspects that are important for document-centric XML as applied to text encoding.
4.1. Harmful Content
Any application that accepts submitted TEI XML, or retrieves TEI XML, for processing has to be aware of the risks connected with the injection of harmful scripts and executable XML. XML inclusion [W3C.REC-xinclude-20061115] and the use of external entities are vulnerable to various forms of spoofing, and can also reveal aspects of a service in a way that may compromise its security. Any vulnerabilities of these kinds are, however, application specific. The TEI namespaces do not contain such elements.
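One possible precaution, sketched here under stated assumptions, is to refuse submitted documents that contain a DOCTYPE declaration (and therefore any entity declarations) or XInclude elements before handing them to further processing. The policy itself, the function name `check_tei_input`, and the exception `UnsafeXML` are illustrative choices made for this sketch and are not prescribed by this document; the code uses Python's standard `xml.parsers.expat` module.

```python
import xml.parsers.expat

# Namespace name of XInclude 1.0 elements.
XINCLUDE_NS = "http://www.w3.org/2001/XInclude"


class UnsafeXML(ValueError):
    """Raised when a document uses a construct this sketch refuses."""


def check_tei_input(data: bytes) -> None:
    """Raise UnsafeXML if the document has a DOCTYPE or XInclude elements.

    Hypothetical pre-check; malformed XML raises expat's own ExpatError.
    """
    # With a namespace separator, expat reports names as "uri local".
    parser = xml.parsers.expat.ParserCreate(namespace_separator=" ")

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # No DOCTYPE means no internal or external entity declarations.
        raise UnsafeXML("DOCTYPE declarations are refused")

    def on_start_element(name, attributes):
        if name.startswith(XINCLUDE_NS + " "):
            raise UnsafeXML("XInclude elements are refused")

    parser.StartDoctypeDeclHandler = on_doctype
    parser.StartElementHandler = on_start_element
    parser.Parse(data, True)
```

An application that legitimately needs entities or inclusion would need a different policy; this sketch simply shows where such a check could sit.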
4.2. Intellectual Property Rights
TEI documents often arise in digitization of cultural heritage materials. Texts made accessible in TEI format may be unrestricted in the sense that their distribution may be unlimited by Digital Rights Management [DRM] or Intellectual Property Rights [IPR] constraints. However, TEI documents are heterogeneous. Some parts of a document may be unrestricted, whereas others, such as editorial text and annotations, may be subject to DRM restrictions.
The TEI format provides means for highly granular attribution, down to the content of individual XML elements. Software agents participating in the exchange or processing of TEI documents may be required to honour markup of this kind. Even when there are no IPR constraints, intellectual property attribution alone requires that document users be able to tell the difference between content from different sources.
4.3. Authenticity and confidentiality
Historical archival records are often encoded in TEI, and legal documents may be binding centuries after they were written. Digitization and encoding of legal texts may require technologies for assuring authenticity, such as cryptographic checksums and electronic signatures.
Similarly, historical documents may in part or in their entirety be confidential. This may be required by law or by the terms and conditions, such as in the case of donated or deposited text from private sources. A text archive may need content filtering or cryptographic technologies to meet such requirements.
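As a small illustration of the cryptographic checksums mentioned above, the following sketch computes a SHA-256 digest of a TEI file's bytes and compares it with a previously recorded value. The choice of SHA-256 and the function names are assumptions made for this example; a checksum alone only detects changes, whereas an electronic signature additionally requires key management that is out of scope here.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the serialized document as hex."""
    return hashlib.sha256(data).hexdigest()


def matches_recorded_digest(data: bytes, recorded_hex: str) -> bool:
    """Compare the document's digest with a recorded hex digest.

    compare_digest is used so the comparison takes constant time.
    """
    return hmac.compare_digest(sha256_hex(data), recorded_hex.lower())
```

Note that the digest is taken over the exact byte serialization, so re-indenting or re-encoding a document changes the checksum even when the XML content is equivalent.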
5. IANA Considerations
5.1. Registration of MIME Type 'application/tei+xml'
MIME media type name: application
MIME subtype name: tei+xml
Required parameters: None
Optional parameters: charset
The parameter has identical semantics to the charset parameter of the "application/xml" media type as specified in RFC 3023 [RFC3023].
Encoding considerations:
Identical to those for 'application/xml'. See RFC 3023 [RFC3023], Section 3.2.
Security considerations:
See Security Considerations (Section 4) in this specification.
Interoperability considerations:
TEI documents are often given the extension '.xml', which is not uncommon for other XML document formats.
Published specification:
This media type registration is for TEI documents [TEI] as described here. TEI syntax is defined in a schema [TEIschema].
Applications which use this media type:
There are currently no known applications using the media type 'application/tei+xml'.
Additional information:
Magic number(s):
There is no single initial octet sequence that is always present in TEI documents.
File extension(s):
Common extensions are '.tei', '.teiCorpus' and '.odd'. See Recognizing TEI files (Section 2) in this specification.
Macintosh File Type Code(s):
TEXT
Object Identifier(s) or OID(s):
Not applicable
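To illustrate how the registered type, its optional charset parameter, and the common file extensions might be used together, here is a minimal sketch based on Python's standard `mimetypes` module. The helper names `tei_aware_types` and `content_type_for`, and the default charset value, are assumptions made for this example. Files carrying the generic '.xml' extension are deliberately not mapped, since the extension alone cannot distinguish TEI from other XML formats.

```python
import mimetypes

TEI_MEDIA_TYPE = "application/tei+xml"


def tei_aware_types() -> mimetypes.MimeTypes:
    """Return a private type map that knows the common TEI extensions."""
    types = mimetypes.MimeTypes()
    # Extensions are stored in lower case; lookups fold case to match.
    for extension in (".tei", ".teicorpus", ".odd"):
        types.add_type(TEI_MEDIA_TYPE, extension)
    return types


def content_type_for(filename: str, charset: str = "utf-8"):
    """Build a Content-Type value for a file name, or None if unknown."""
    media_type, _encoding = tei_aware_types().guess_type(filename)
    if media_type == TEI_MEDIA_TYPE:
        # charset is the only (optional) parameter of this media type.
        return f"{media_type}; charset={charset}"
    return media_type
```

A server using such a helper would send, for instance, `application/tei+xml; charset=utf-8` for a file named `hamlet.tei`.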
6. References
6.1. Normative References
[RFC3023] Murata, M., St. Laurent, S., and D. Kohn, "XML Media Types", RFC 3023, January 2001.
[RFC3470] Hollenbeck, S., Rose, M., and L. Masinter, "Guidelines for the Use of Extensible Markup Language (XML) within IETF Protocols", BCP 70, RFC 3470, January 2003.
[RFC3986] Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform Resource Identifier (URI): Generic Syntax", STD 66, RFC 3986, January 2005.
[RFC3987] Duerst, M. and M. Suignard, "Internationalized Resource Identifiers (IRIs)", RFC 3987, January 2005.
[TEI] "TEI Guidelines", <http://www.tei-c.org/Vault/P5/1.8.0/doc/tei-p5-doc/en/html/>.
[TEIschema] "Schema generated from ODD source", <http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng>.
[W3C.REC-xml-20081126] Paoli, J., Yergeau, F., Sperberg-McQueen, C., Maler, E., and T. Bray, "Extensible Markup Language (XML) 1.0 (Fifth Edition)", World Wide Web Consortium Recommendation REC-xml-20081126, November 2008, <http://www.w3.org/TR/2008/REC-xml-20081126>.
[W3C.REC-xml-names-20091208] Bray, T., Hollander, D., Layman, A., Tobin, R., and H. Thompson, "Namespaces in XML 1.0 (Third Edition)", World Wide Web Consortium Recommendation REC-xml-names-20091208, December 2009, <http://www.w3.org/TR/2009/REC-xml-names-20091208>.
6.2. Informative References
[DRM] "Digital rights management", <http://en.wikipedia.org/w/index.php?title=Digital_rights_management&oldid=412653591>.
[IPR] "Intellectual property", <http://en.wikipedia.org/w/index.php?title=Intellectual_property&oldid=411690322>.
[ODD] "Getting Started with P5 ODDs", <http://www.tei-c.org/Guidelines/Customization/odds.xml>.
[W3C.REC-xinclude-20061115] Marsh, J., Orchard, D., and D. Veillard, "XML Inclusions (XInclude) Version 1.0 (Second Edition)", World Wide Web Consortium Recommendation REC-xinclude-20061115, November 2006, <http://www.w3.org/TR/2006/REC-xinclude-20061115>.
Authors' Addresses
Laurent Romary
TEI Consortium and INRIA
EMail: laurent.romary@inria.fr
URI: http://www.tei-c.org/

Sigfrid Lundberg
The Royal Library, Copenhagen
Postbox 2149
1016 Koebenhavn K
Denmark
EMail: slu@kb.dk
URI: http://sigfrid-lundberg.se/