Network Working Group                                     T. Berners-Lee
Request for Comments: 3986                                       W3C/MIT
STD: 66                                                      R. Fielding
Updates: 1738                                               Day Software
Obsoletes: 2732, 2396, 1808                                  L. Masinter
Category: Standards Track                                  Adobe Systems
                                                            January 2005
        

Uniform Resource Identifier (URI): Generic Syntax

Status of This Memo

This document specifies an Internet standards track protocol for the Internet community, and requests discussion and suggestions for improvements. Please refer to the current edition of the "Internet Official Protocol Standards" (STD 1) for the standardization state and status of this protocol. Distribution of this memo is unlimited.

Copyright Notice

Copyright (C) The Internet Society (2005).

Abstract

A Uniform Resource Identifier (URI) is a compact sequence of characters that identifies an abstract or physical resource. This specification defines the generic URI syntax and a process for resolving URI references that might be in relative form, along with guidelines and security considerations for the use of URIs on the Internet. The URI syntax defines a grammar that is a superset of all valid URIs, allowing an implementation to parse the common components of a URI reference without knowing the scheme-specific requirements of every possible identifier. This specification does not define a generative grammar for URIs; that task is performed by the individual specifications of each URI scheme.

Table of Contents

   1.  Introduction
       1.1.  Overview of URIs
             1.1.1.  Generic Syntax
             1.1.2.  Examples
             1.1.3.  URI, URL, and URN
       1.2.  Design Considerations
             1.2.1.  Transcription
             1.2.2.  Separating Identification from Interaction
             1.2.3.  Hierarchical Identifiers
       1.3.  Syntax Notation
   2.  Characters
       2.1.  Percent-Encoding
       2.2.  Reserved Characters
       2.3.  Unreserved Characters
       2.4.  When to Encode or Decode
       2.5.  Identifying Data
   3.  Syntax Components
       3.1.  Scheme
       3.2.  Authority
             3.2.1.  User Information
             3.2.2.  Host
             3.2.3.  Port
       3.3.  Path
       3.4.  Query
       3.5.  Fragment
   4.  Usage
       4.1.  URI Reference
       4.2.  Relative Reference
       4.3.  Absolute URI
       4.4.  Same-Document Reference
       4.5.  Suffix Reference
   5.  Reference Resolution
       5.1.  Establishing a Base URI
             5.1.1.  Base URI Embedded in Content
             5.1.2.  Base URI from the Encapsulating Entity
             5.1.3.  Base URI from the Retrieval URI
             5.1.4.  Default Base URI
       5.2.  Relative Resolution
             5.2.1.  Pre-parse the Base URI
             5.2.2.  Transform References
             5.2.3.  Merge Paths
             5.2.4.  Remove Dot Segments
       5.3.  Component Recomposition
       5.4.  Reference Resolution Examples
             5.4.1.  Normal Examples
             5.4.2.  Abnormal Examples
   6.  Normalization and Comparison
       6.1.  Equivalence
       6.2.  Comparison Ladder
             6.2.1.  Simple String Comparison
             6.2.2.  Syntax-Based Normalization
             6.2.3.  Scheme-Based Normalization
             6.2.4.  Protocol-Based Normalization
   7.  Security Considerations
       7.1.  Reliability and Consistency
       7.2.  Malicious Construction
       7.3.  Back-End Transcoding
       7.4.  Rare IP Address Formats
       7.5.  Sensitive Information
       7.6.  Semantic Attacks
   8.  IANA Considerations
   9.  Acknowledgements
   10. References
       10.1. Normative References
       10.2. Informative References
   A.  Collected ABNF for URI
   B.  Parsing a URI Reference with a Regular Expression
   C.  Delimiting a URI in Context
   D.  Changes from RFC 2396
       D.1.  Additions
       D.2.  Modifications
   Index
   Authors' Addresses
   Full Copyright Statement
        
1. Introduction

A Uniform Resource Identifier (URI) provides a simple and extensible means for identifying a resource. This specification of URI syntax and semantics is derived from concepts introduced by the World Wide Web global information initiative, whose use of these identifiers dates from 1990 and is described in "Universal Resource Identifiers in WWW" [RFC1630]. The syntax is designed to meet the recommendations laid out in "Functional Recommendations for Internet Resource Locators" [RFC1736] and "Functional Requirements for Uniform Resource Names" [RFC1737].

This document obsoletes [RFC2396], which merged "Uniform Resource Locators" [RFC1738] and "Relative Uniform Resource Locators" [RFC1808] in order to define a single, generic syntax for all URIs. It obsoletes [RFC2732], which introduced syntax for an IPv6 address. It excludes portions of RFC 1738 that defined the specific syntax of individual URI schemes; those portions will be updated as separate documents. The process for registration of new URI schemes is defined separately by [BCP35]. Advice for designers of new URI schemes can be found in [RFC2718]. All significant changes from RFC 2396 are noted in Appendix D.

This specification uses the terms "character" and "coded character set" in accordance with the definitions provided in [BCP19], and "character encoding" in place of what [BCP19] refers to as a "charset".

1.1. Overview of URIs

URIs are characterized as follows:

Uniform

Uniformity provides several benefits. It allows different types of resource identifiers to be used in the same context, even when the mechanisms used to access those resources may differ. It allows uniform semantic interpretation of common syntactic conventions across different types of resource identifiers. It allows introduction of new types of resource identifiers without interfering with the way that existing identifiers are used. It allows the identifiers to be reused in many different contexts, thus permitting new applications or protocols to leverage a pre-existing, large, and widely used set of resource identifiers.

Resource

This specification does not limit the scope of what might be a resource; rather, the term "resource" is used in a general sense for whatever might be identified by a URI. Familiar examples include an electronic document, an image, a source of information with a consistent purpose (e.g., "today's weather report for Los Angeles"), a service (e.g., an HTTP-to-SMS gateway), and a collection of other resources. A resource is not necessarily accessible via the Internet; e.g., human beings, corporations, and bound books in a library can also be resources. Likewise, abstract concepts can be resources, such as the operators and operands of a mathematical equation, the types of a relationship (e.g., "parent" or "employee"), or numeric values (e.g., zero, one, and infinity).

Identifier

An identifier embodies the information required to distinguish what is being identified from all other things within its scope of identification. Our use of the terms "identify" and "identifying" refers to this purpose of distinguishing one resource from all other resources, regardless of how that purpose is accomplished (e.g., by name, address, or context). These terms should not be mistaken as an assumption that an identifier defines or embodies the identity of what is referenced, though that may be the case for some identifiers. Nor should it be assumed that a system using URIs will access the resource identified: in many cases, URIs are used to denote resources without any intention that they be accessed. Likewise, the "one" resource identified might not be singular in nature (e.g., a resource might be a named set or a mapping that varies over time).

A URI is an identifier consisting of a sequence of characters matching the syntax rule named <URI> in Section 3. It enables uniform identification of resources via a separately defined extensible set of naming schemes (Section 3.1). How that identification is accomplished, assigned, or enabled is delegated to each scheme specification.

This specification does not place any limits on the nature of a resource, the reasons why an application might seek to refer to a resource, or the kinds of systems that might use URIs for the sake of identifying resources. This specification does not require that a URI persists in identifying the same resource over time, though that is a common goal of all URI schemes. Nevertheless, nothing in this specification prevents an application from limiting itself to particular types of resources, or to a subset of URIs that maintains characteristics desired by that application.

URIs have a global scope and are interpreted consistently regardless of context, though the result of that interpretation may be in relation to the end-user's context. For example, "http://localhost/" has the same interpretation for every user of that reference, even though the network interface corresponding to "localhost" may be different for each end-user: interpretation is independent of access. However, an action made on the basis of that reference will take place in relation to the end-user's context, which implies that an action intended to refer to a globally unique thing must use a URI that distinguishes that resource from all other things. URIs that identify in relation to the end-user's local context should only be used when the context itself is a defining aspect of the resource, such as when an on-line help manual refers to a file on the end-user's file system (e.g., "file:///etc/hosts").

1.1.1. Generic Syntax

Each URI begins with a scheme name, as defined in Section 3.1, that refers to a specification for assigning identifiers within that scheme. As such, the URI syntax is a federated and extensible naming system wherein each scheme's specification may further restrict the syntax and semantics of identifiers using that scheme.

This specification defines those elements of the URI syntax that are required of all URI schemes or are common to many URI schemes. It thus defines the syntax and semantics needed to implement a scheme-independent parsing mechanism for URI references, by which the scheme-dependent handling of a URI can be postponed until the scheme-dependent semantics are needed. Likewise, protocols and data formats that make use of URI references can refer to this specification as a definition for the range of syntax allowed for all URIs, including those schemes that have yet to be defined. This decouples the evolution of identification schemes from the evolution of protocols, data formats, and implementations that make use of URIs.

A parser of the generic URI syntax can parse any URI reference into its major components. Once the scheme is determined, further scheme-specific parsing can be performed on the components. In other words, the URI generic syntax is a superset of the syntax of all URI schemes.
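
As a minimal sketch of such a scheme-independent parser, the following Python fragment applies the regular expression given in Appendix B of this specification. The function name split_uri_reference and the dictionary layout are illustrative assumptions, not part of the specification, and the expression only separates components; it does not validate them.

```python
import re

# Regular expression from Appendix B ("Parsing a URI Reference with a
# Regular Expression"). It splits a reference into components without
# checking that each component is well-formed.
URI_REFERENCE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


def split_uri_reference(reference):
    """Split a URI reference into its five generic components.

    Hypothetical helper: a component that is absent is reported as None;
    the path is always present but may be the empty string.
    """
    match = URI_REFERENCE.match(reference)  # every part is optional, so this always matches
    return {
        "scheme": match.group(2),
        "authority": match.group(4),
        "path": match.group(5),
        "query": match.group(7),
        "fragment": match.group(9),
    }


# Sample URIs of different schemes, parsed without any scheme-specific knowledge.
for example in (
    "ldap://[2001:db8::7]/c=GB?objectClass?one",
    "mailto:John.Doe@example.com",
    "urn:oasis:names:specification:docbook:dtd:xml:4.1.2",
):
    print(example, split_uri_reference(example))
```

For the "mailto" and "urn" samples the authority is absent and everything after the first colon lands in the path; any further interpretation of that path is left to scheme-specific parsing.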

1.1.2. Examples

The following example URIs illustrate several URI schemes and variations in their common syntax components:

      ftp://ftp.is.co.za/rfc/rfc1808.txt

      http://www.ietf.org/rfc/rfc2396.txt

      ldap://[2001:db8::7]/c=GB?objectClass?one

      mailto:John.Doe@example.com

      news:comp.infosystems.www.servers.unix

      tel:+1-816-555-1212

      telnet://192.0.2.16:80/

      urn:oasis:names:specification:docbook:dtd:xml:4.1.2
        
1.1.3. URI, URL, and URN

A URI can be further classified as a locator, a name, or both. The term "Uniform Resource Locator" (URL) refers to the subset of URIs that, in addition to identifying a resource, provide a means of locating the resource by describing its primary access mechanism (e.g., its network "location"). The term "Uniform Resource Name" (URN) has been used historically to refer to both URIs under the "urn" scheme [RFC2141], which are required to remain globally unique and persistent even when the resource ceases to exist or becomes unavailable, and to any other URI with the properties of a name.

An individual scheme does not have to be classified as being just one of "name" or "locator". Instances of URIs from any given scheme may have the characteristics of names or locators or both, often depending on the persistence and care in the assignment of identifiers by the naming authority, rather than on any quality of the scheme. Future specifications and related documentation should use the general term "URI" rather than the more restrictive terms "URL" and "URN" [RFC3305].

1.2. Design Considerations
1.2.1. Transcription

The URI syntax has been designed with global transcription as one of its main considerations. A URI is a sequence of characters from a very limited set: the letters of the basic Latin alphabet, digits, and a few special characters. A URI may be represented in a variety of ways; e.g., ink on paper, pixels on a screen, or a sequence of character encoding octets. The interpretation of a URI depends only on the characters used and not on how those characters are represented in a network protocol.

The goal of transcription can be described by a simple scenario. Imagine two colleagues, Sam and Kim, sitting in a pub at an international conference and exchanging research ideas. Sam asks Kim for a location to get more information, so Kim writes the URI for the research site on a napkin. Upon returning home, Sam takes out the napkin and types the URI into a computer, which then retrieves the information to which Kim referred.

There are several design considerations revealed by the scenario:

o A URI is a sequence of characters that is not always represented as a sequence of octets.

o A URI might be transcribed from a non-network source and thus should consist of characters that are most likely able to be entered into a computer, within the constraints imposed by keyboards (and related input devices) across languages and locales.

o A URI often has to be remembered by people, and it is easier for people to remember a URI when it consists of meaningful or familiar components.

These design considerations are not always in alignment. For example, it is often the case that the most meaningful name for a URI component would require characters that cannot be typed into some systems. The ability to transcribe a resource identifier from one medium to another has been considered more important than having a URI consist of the most meaningful of components.

In local or regional contexts and with improving technology, users might benefit from being able to use a wider range of characters; such use is not defined by this specification. Percent-encoded octets (Section 2.1) may be used within a URI to represent characters outside the range of the US-ASCII coded character set if this representation is allowed by the scheme or by the protocol element in which the URI is referenced. Such a definition should specify the character encoding used to map those characters to octets prior to being percent-encoded for the URI.
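
The need to name the character encoding can be illustrated with a short Python sketch that uses the standard library's urllib.parse module. The helper name percent_encode and the sample word are assumptions chosen for illustration; the point is only that the same character yields different percent-encoded octets under different encodings.

```python
from urllib.parse import quote, unquote


def percent_encode(text, encoding):
    """Map characters to octets with the given character encoding, then
    percent-encode every octet that is not an unreserved character.

    Hypothetical helper built on urllib.parse.quote; safe="" means no
    reserved character is left unencoded.
    """
    return quote(text, safe="", encoding=encoding)


word = "caf\u00e9"  # "cafe" with an acute accent on the final letter

as_utf8 = percent_encode(word, "utf-8")         # two octets for the accented letter
as_latin1 = percent_encode(word, "iso-8859-1")  # one octet for the accented letter

print(as_utf8)    # caf%C3%A9
print(as_latin1)  # caf%E9

# Decoding with the encoding that was used for encoding recovers the word.
print(unquote(as_utf8, encoding="utf-8") == word)        # True
# Decoding with a different encoding does not.
print(unquote(as_latin1, encoding="utf-8") == word)      # False
```

A receiver that is not told which encoding was applied cannot reliably recover the original characters, which is why the defining scheme or protocol element is expected to state it.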

1.2.2. Separating Identification from Interaction

A common misunderstanding of URIs is that they are only used to refer to accessible resources. The URI itself only provides identification; access to the resource is neither guaranteed nor implied by the presence of a URI. Instead, any operation associated with a URI reference is defined by the protocol element, data format attribute, or natural language text in which it appears.

Given a URI, a system may attempt to perform a variety of operations on the resource, as might be characterized by words such as "access", "update", "replace", or "find attributes". Such operations are defined by the protocols that make use of URIs, not by this specification. However, we do use a few general terms for describing common operations on URIs. URI "resolution" is the process of determining an access mechanism and the appropriate parameters necessary to dereference a URI; this resolution may require several iterations. To use that access mechanism to perform an action on the URI's resource is to "dereference" the URI.

When URIs are used within information retrieval systems to identify sources of information, the most common form of URI dereference is "retrieval": making use of a URI in order to retrieve a representation of its associated resource. A "representation" is a sequence of octets, along with representation metadata describing those octets, that constitutes a record of the state of the resource at the time when the representation is generated. Retrieval is achieved by a process that might include using the URI as a cache key to check for a locally cached representation, resolution of the URI to determine an appropriate access mechanism (if any), and dereference of the URI for the sake of applying a retrieval operation. Depending on the protocols used to perform the retrieval, additional information might be supplied about the resource (resource metadata) and its relation to other resources.
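
The retrieval process described above might be sketched as follows. Everything in this fragment is a hypothetical illustration: the retrieve function, the in-memory cache, the table of access mechanisms keyed by scheme, and the stand-in fake_http_get are assumptions, and a representation is simplified to a bare octet string without its metadata.

```python
from typing import Callable, Dict, List, Optional
from urllib.parse import urlsplit

# Simplification: a representation is modeled as octets only.
AccessMechanism = Callable[[str], bytes]


def retrieve(
    uri: str,
    cache: Dict[str, bytes],
    mechanisms: Dict[str, AccessMechanism],
) -> Optional[bytes]:
    """Hypothetical retrieval: cache lookup, resolution, then dereference."""
    # Step 1: use the URI itself as the cache key.
    if uri in cache:
        return cache[uri]
    # Step 2: resolution, reduced here to choosing an access mechanism
    # from the scheme name; there may be none for this URI.
    mechanism = mechanisms.get(urlsplit(uri).scheme.lower())
    if mechanism is None:
        return None
    # Step 3: dereference the URI by applying the retrieval operation.
    representation = mechanism(uri)
    cache[uri] = representation
    return representation


calls: List[str] = []


def fake_http_get(uri: str) -> bytes:
    """Stand-in for a real network fetch, so the sketch runs offline."""
    calls.append(uri)
    return b"state of the resource at generation time"


cache: Dict[str, bytes] = {}
mechanisms: Dict[str, AccessMechanism] = {"http": fake_http_get}

first = retrieve("http://www.example.com/report", cache, mechanisms)
second = retrieve("http://www.example.com/report", cache, mechanisms)  # served from the cache
unknown = retrieve("urn:example:report", cache, mechanisms)            # no access mechanism known

print(len(calls), first == second, unknown)
```

The "urn" case returns nothing without being an error: the URI still identifies a resource even though this sketch knows no way to access it.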

URI references in information retrieval systems are designed to be late-binding: the result of an access is generally determined when it is accessed and may vary over time or due to other aspects of the interaction. These references are created in order to be used in the future: what is being identified is not some specific result that was obtained in the past, but rather some characteristic that is expected to be true for future results. In such cases, the resource referred to by the URI is actually a sameness of characteristics as observed over time, perhaps elucidated by additional comments or assertions made by the resource provider.

Although many URI schemes are named after protocols, this does not imply that use of these URIs will result in access to the resource via the named protocol. URIs are often used simply for the sake of identification. Even when a URI is used to retrieve a representation of a resource, that access might be through gateways, proxies, caches, and name resolution services that are independent of the protocol associated with the scheme name. The resolution of some URIs may require the use of more than one protocol (e.g., both DNS and HTTP are typically used to access an "http" URI's origin server when a representation isn't found in a local cache).

1.2.3. Hierarchical Identifiers

The URI syntax is organized hierarchically, with components listed in order of decreasing significance from left to right. For some URI schemes, the visible hierarchy is limited to the scheme itself: everything after the scheme component delimiter (":") is considered opaque to URI processing. Other URI schemes make the hierarchy explicit and visible to generic parsing algorithms.

The generic syntax uses the slash ("/"), question mark ("?"), and number sign ("#") characters to delimit components that are significant to the generic parser's hierarchical interpretation of an identifier. In addition to aiding the readability of such identifiers through the consistent use of familiar syntax, this uniform representation of hierarchy across naming schemes allows scheme-independent references to be made relative to that hierarchy.

It is often the case that a group or "tree" of documents has been constructed to serve a common purpose, wherein the vast majority of URI references in these documents point to resources within the tree rather than outside it. Similarly, documents located at a particular site are much more likely to refer to other resources at that site than to resources at remote sites. Relative referencing of URIs allows document trees to be partially independent of their location and access scheme. For instance, it is possible for a single set of hypertext documents to be simultaneously accessible and traversable via each of the "file", "http", and "ftp" schemes if the documents refer to each other with relative references. Furthermore, such document trees can be moved, as a whole, without changing any of the relative references.
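
The location independence described above can be sketched as follows. This is only an illustration: it assumes Python's standard urllib.parse.urljoin as the resolver, and the example.com base URIs are hypothetical. The same relative reference is resolved against one document tree exposed under two different schemes.

```python
from urllib.parse import urljoin

# Hypothetical base URIs: one document tree reachable under two schemes.
BASES = [
    "http://example.com/docs/a/index.html",
    "ftp://example.com/docs/a/index.html",
]


def resolve_all(reference):
    """Resolve one relative reference against every base URI."""
    return [urljoin(base, reference) for base in BASES]


# The reference itself never mentions a scheme or a host.
resolved = resolve_all("../b/page.html")
for target in resolved:
    print(target)
```

Because the reference carries neither a scheme nor an authority, the documents would not need to be edited if the tree were moved as a whole.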

A relative reference (Section 4.2) refers to a resource by describing the difference within a hierarchical name space between the reference context and the target URI. The reference resolution algorithm, presented in Section 5, defines how such a reference is transformed to the target URI. As relative references can only be used within the context of a hierarchical URI, designers of new URI schemes should use a syntax consistent with the generic syntax's hierarchical components unless there are compelling reasons to forbid relative referencing within that scheme.

NOTE: Previous specifications used the terms "partial URI" and "relative URI" to denote a relative reference to a URI. As some readers misunderstood those terms to mean that relative URIs are a subset of URIs rather than a method of referencing URIs, this specification simply refers to them as relative references.

All URI references are parsed by generic syntax parsers when used. However, because hierarchical processing has no effect on an absolute URI used in a reference unless it contains one or more dot-segments (complete path segments of "." or "..", as described in Section 3.3), URI scheme specifications can define opaque identifiers by disallowing use of slash characters, question mark characters, and the URIs "scheme:." and "scheme:..".

1.3. Syntax Notation

This specification uses the Augmented Backus-Naur Form (ABNF) notation of [RFC2234], including the following core ABNF syntax rules defined by that specification: ALPHA (letters), CR (carriage return), DIGIT (decimal digits), DQUOTE (double quote), HEXDIG (hexadecimal digits), LF (line feed), and SP (space). The complete URI syntax is collected in Appendix A.

2. Characters

The URI syntax provides a method of encoding data, presumably for the sake of identifying a resource, as a sequence of characters. The URI characters are, in turn, frequently encoded as octets for transport or presentation. This specification does not mandate any particular character encoding for mapping between URI characters and the octets used to store or transmit those characters. When a URI appears in a protocol element, the character encoding is defined by that protocol; without such a definition, a URI is assumed to be in the same character encoding as the surrounding text.

The ABNF notation defines its terminal values to be non-negative integers (codepoints) based on the US-ASCII coded character set [ASCII]. Because a URI is a sequence of characters, we must invert that relation in order to understand the URI syntax. Therefore, the integer values used by the ABNF must be mapped back to their corresponding characters via US-ASCII in order to complete the syntax rules.

A URI is composed from a limited set of characters consisting of digits, letters, and a few graphic symbols. A reserved subset of those characters may be used to delimit syntax components within a URI while the remaining characters, including both the unreserved set and those reserved characters not acting as delimiters, define each component's identifying data.

2.1. Percent-Encoding

A percent-encoding mechanism is used to represent a data octet in a component when that octet's corresponding character is outside the allowed set or is being used as a delimiter of, or within, the component. A percent-encoded octet is encoded as a character triplet, consisting of the percent character "%" followed by the two hexadecimal digits representing that octet's numeric value. For example, "%20" is the percent-encoding for the binary octet "00100000" (ABNF: %x20), which in US-ASCII corresponds to the space character (SP). Section 2.4 describes when percent-encoding and decoding is applied.

      pct-encoded = "%" HEXDIG HEXDIG

The uppercase hexadecimal digits 'A' through 'F' are equivalent to the lowercase digits 'a' through 'f', respectively. If two URIs differ only in the case of hexadecimal digits used in percent-encoded octets, they are equivalent. For consistency, URI producers and normalizers should use uppercase hexadecimal digits for all percent-encodings.
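
As a minimal sketch of the two points above (the triplet form and the case-insensitivity of its hexadecimal digits), the following Python helpers may be useful. The function names are illustrative and not part of this specification.

```python
import re


def pct_encode_octet(octet):
    """Return the "%" HEXDIG HEXDIG triplet for one octet, in uppercase hex."""
    if not 0 <= octet <= 0xFF:
        raise ValueError("an octet must be in the range 0-255")
    return "%{:02X}".format(octet)


# Matches one percent-encoded triplet with hex digits in either case.
_PCT_TRIPLET = re.compile(r"%[0-9A-Fa-f]{2}")


def uppercase_pct_hex(uri):
    """Uppercase the hex digits of each triplet; all other text is untouched."""
    return _PCT_TRIPLET.sub(lambda m: m.group(0).upper(), uri)


print(pct_encode_octet(0x20))
print(uppercase_pct_hex("a%2fb%3A"))
```

Only the triplets are rewritten; letters elsewhere in the URI keep their case, since case may be significant in other components.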

2.2. Reserved Characters

URIs include components and subcomponents that are delimited by characters in the "reserved" set. These characters are called "reserved" because they may (or may not) be defined as delimiters by the generic syntax, by each scheme-specific syntax, or by the implementation-specific syntax of a URI's dereferencing algorithm. If data for a URI component would conflict with a reserved character's purpose as a delimiter, then the conflicting data must be percent-encoded before the URI is formed.

      reserved    = gen-delims / sub-delims

      gen-delims  = ":" / "/" / "?" / "#" / "[" / "]" / "@"

      sub-delims  = "!" / "$" / "&" / "'" / "(" / ")"
                  / "*" / "+" / "," / ";" / "="
        

The purpose of reserved characters is to provide a set of delimiting characters that are distinguishable from other data within a URI. URIs that differ in the replacement of a reserved character with its corresponding percent-encoded octet are not equivalent. Percent-encoding a reserved character, or decoding a percent-encoded octet that corresponds to a reserved character, will change how the URI is interpreted by most applications. Thus, characters in the reserved set are protected from normalization and are therefore safe to be used by scheme-specific and producer-specific algorithms for delimiting data subcomponents within a URI.

A subset of the reserved characters (gen-delims) is used as delimiters of the generic URI components described in Section 3. A component's ABNF syntax rule will not use the reserved or gen-delims rule names directly; instead, each syntax rule lists the characters allowed within that component (i.e., not delimiting it), and any of those characters that are also in the reserved set are "reserved" for use as subcomponent delimiters within the component. Only the most common subcomponents are defined by this specification; other subcomponents may be defined by a URI scheme's specification, or by the implementation-specific syntax of a URI's dereferencing algorithm, provided that such subcomponents are delimited by characters in the reserved set allowed within that component.

URI producing applications should percent-encode data octets that correspond to characters in the reserved set unless these characters are specifically allowed by the URI scheme to represent data in that component. If a reserved character is found in a URI component and no delimiting role is known for that character, then it must be interpreted as representing the data octet corresponding to that character's encoding in US-ASCII.
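
One way a URI producer might apply this rule is sketched below, assuming Python's standard urllib.parse.quote. The helper name encode_data and its allowed_reserved parameter are hypothetical; which reserved characters a scheme actually permits as data in a component is scheme-specific and is not decided here.

```python
from urllib.parse import quote

GEN_DELIMS = ":/?#[]@"
SUB_DELIMS = "!$&'()*+,;="
RESERVED = GEN_DELIMS + SUB_DELIMS


def encode_data(value, allowed_reserved=""):
    """Percent-encode value as component data.

    Reserved characters are encoded unless listed in allowed_reserved,
    which stands in for whatever the scheme allows in that component.
    """
    unknown = set(allowed_reserved) - set(RESERVED)
    if unknown:
        raise ValueError("not reserved characters: %r" % sorted(unknown))
    return quote(value, safe=allowed_reserved)


print(encode_data("a/b&c"))
print(encode_data("a/b&c", allowed_reserved="/"))
```

The two results are not equivalent URIs fragments: in the first the slash is data, in the second it remains available as a delimiter.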

2.3. Unreserved Characters

Characters that are allowed in a URI but do not have a reserved purpose are called unreserved. These include uppercase and lowercase letters, decimal digits, hyphen, period, underscore, and tilde.

      unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"
        

URIs that differ in the replacement of an unreserved character with its corresponding percent-encoded US-ASCII octet are equivalent: they identify the same resource. However, URI comparison implementations do not always perform normalization prior to comparison (see Section 6). For consistency, percent-encoded octets in the ranges of ALPHA (%41-%5A and %61-%7A), DIGIT (%30-%39), hyphen (%2D), period (%2E), underscore (%5F), or tilde (%7E) should not be created by URI producers and, when found in a URI, should be decoded to their corresponding unreserved characters by URI normalizers.
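
The normalization step described above might look like the following sketch. The function name decode_unreserved is illustrative; the code decodes only triplets whose octet corresponds to an unreserved character and leaves every other triplet as it was found.

```python
import re

UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)

_PCT_TRIPLET = re.compile(r"%([0-9A-Fa-f]{2})")


def decode_unreserved(uri):
    """Decode percent-encoded octets that map to unreserved characters."""
    def replace(match):
        char = chr(int(match.group(1), 16))
        # Reserved and non-ASCII octets must stay percent-encoded.
        return char if char in UNRESERVED else match.group(0)
    return _PCT_TRIPLET.sub(replace, uri)


print(decode_unreserved("%7Euser/%41%2Fb"))
```

In the example, "%7E" and "%41" are decoded, while "%2F" is kept because decoding it would introduce a slash delimiter.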

2.4. When to Encode or Decode

Under normal circumstances, the only time when octets within a URI are percent-encoded is during the process of producing the URI from its component parts. This is when an implementation determines which of the reserved characters are to be used as subcomponent delimiters and which can be safely used as data. Once produced, a URI is always in its percent-encoded form.

When a URI is dereferenced, the components and subcomponents significant to the scheme-specific dereferencing process (if any) must be parsed and separated before the percent-encoded octets within those components can be safely decoded, as otherwise the data may be mistaken for component delimiters. The only exception is for percent-encoded octets corresponding to characters in the unreserved set, which can be decoded at any time. For example, the octet corresponding to the tilde ("~") character is often encoded as "%7E" by older URI processing implementations; the "%7E" can be replaced by "~" without changing its interpretation.

Because the percent ("%") character serves as the indicator for percent-encoded octets, it must be percent-encoded as "%25" for that octet to be used as data within a URI. Implementations must not percent-encode or decode the same string more than once, as decoding an already decoded string might lead to misinterpreting a percent data octet as the beginning of a percent-encoding, or vice versa in the case of percent-encoding an already percent-encoded string.
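
The hazard of repeated encoding or decoding can be seen in a short sketch, assuming Python's standard urllib.parse functions. The sample strings are made up for illustration.

```python
from urllib.parse import quote, unquote

# Encoding: the data "100%" must become "100%25" exactly once.
data = "100%"
encoded_once = quote(data, safe="")
encoded_twice = quote(encoded_once, safe="")  # wrong: "%" of "%25" re-encoded

# Decoding: "%2541" carries the three data characters "%41".
wire = "%2541"
decoded_once = unquote(wire)
decoded_twice = unquote(decoded_once)  # wrong: data "%41" read as a triplet

print(encoded_once, encoded_twice)
print(decoded_once, decoded_twice)
```

The second pass in each direction silently changes the identified data, which is why each string should pass through the encoder or decoder only once.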

2.5. Identifying Data

URI characters provide identifying data for each of the URI components, serving as an external interface for identification between systems. Although the presence and nature of the URI production interface is hidden from clients that use its URIs (and is thus beyond the scope of the interoperability requirements defined by this specification), it is a frequent source of confusion and errors in the interpretation of URI character issues. Implementers have to be aware that there are multiple character encodings involved in the production and transmission of URIs: local name and data encoding, public interface encoding, URI character encoding, data format encoding, and protocol encoding.

Local names, such as file system names, are stored with a local character encoding. URI producing applications (e.g., origin servers) will typically use the local encoding as the basis for producing meaningful names. The URI producer will transform the local encoding to one that is suitable for a public interface and then transform the public interface encoding into the restricted set of URI characters (reserved, unreserved, and percent-encodings). Those characters are, in turn, encoded as octets to be used as a reference within a data format (e.g., a document charset), and such data formats are often subsequently encoded for transmission over Internet protocols.

For most systems, an unreserved character appearing within a URI component is interpreted as representing the data octet corresponding to that character's encoding in US-ASCII. Consumers of URIs assume that the letter "X" corresponds to the octet "01011000", and even when that assumption is incorrect, there is no harm in making it. A system that internally provides identifiers in the form of a different character encoding, such as EBCDIC, will generally perform character translation of textual identifiers to UTF-8 [STD63] (or some other superset of the US-ASCII character encoding) at an internal interface, thereby providing more meaningful identifiers than those resulting from simply percent-encoding the original octets.

For example, consider an information service that provides data, stored locally using an EBCDIC-based file system, to clients on the Internet through an HTTP server. When an author creates a file with the name "Laguna Beach" on that file system, the "http" URI corresponding to that resource is expected to contain the meaningful string "Laguna%20Beach". If, however, that server produces URIs by using an overly simplistic raw octet mapping, then the result would be a URI containing "%D3%81%87%A4%95%81@%C2%85%81%83%88". An internal transcoding interface fixes this problem by transcoding the local name to a superset of US-ASCII prior to producing the URI. Naturally, proper interpretation of an incoming URI on such an interface requires that percent-encoded octets be decoded (e.g., "%20" to SP) before the reverse transcoding is applied to obtain the local name.
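
The "Laguna Beach" example can be reproduced with a short sketch. Two assumptions are made here that the text does not state: the local EBCDIC variant is taken to be Python's cp037 codec, and the naive mapping is modeled by leaving the octet that happens to equal US-ASCII "@" unencoded.

```python
from urllib.parse import quote

local_name = "Laguna Beach"

# The name as stored locally (cp037 is an assumed EBCDIC code page).
ebcdic_octets = local_name.encode("cp037")

# Overly simplistic: percent-encode the raw local octets directly.
raw_mapping = quote(ebcdic_octets, safe="@")

# Transcoding interface: convert to a US-ASCII superset (UTF-8) first.
transcoded = quote(ebcdic_octets.decode("cp037").encode("utf-8"), safe="")

print(raw_mapping)
print(transcoded)
```

The EBCDIC space (octet 0x40) coincides with "@" in US-ASCII, which is why the raw mapping yields a literal "@" where the meaningful form has "%20".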

In some cases, the internal interface between a URI component and the identifying data that it has been crafted to represent is much less direct than a character encoding translation. For example, portions of a URI might reflect a query on non-ASCII data, or numeric coordinates on a map. Likewise, a URI scheme may define components with additional encoding requirements that are applied prior to forming the component and producing the URI.

When a new URI scheme defines a component that represents textual data consisting of characters from the Universal Character Set [UCS], the data should first be encoded as octets according to the UTF-8 character encoding [STD63]; then only those octets that do not correspond to characters in the unreserved set should be percent-encoded. For example, the character A would be represented as "A", the character LATIN CAPITAL LETTER A WITH GRAVE would be represented as "%C3%80", and the character KATAKANA LETTER A would be represented as "%E3%82%A2".
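
The three representations above can be checked with a small sketch, assuming Python's standard urllib.parse.quote. The helper name encode_text is illustrative.

```python
from urllib.parse import quote


def encode_text(text):
    """Encode text as UTF-8 octets, then percent-encode every octet that
    does not correspond to an unreserved character."""
    return quote(text.encode("utf-8"), safe="")


examples = {
    "LATIN CAPITAL LETTER A": encode_text("A"),
    "LATIN CAPITAL LETTER A WITH GRAVE": encode_text("\u00C0"),
    "KATAKANA LETTER A": encode_text("\u30A2"),
}
for name, encoded in examples.items():
    print(name, encoded)
```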

3. Syntax Components

The generic URI syntax consists of a hierarchical sequence of components referred to as the scheme, authority, path, query, and fragment.

      URI         = scheme ":" hier-part [ "?" query ] [ "#" fragment ]

      hier-part   = "//" authority path-abempty
                  / path-absolute
                  / path-rootless
                  / path-empty
        

The scheme and path components are required, though the path may be empty (no characters). When authority is present, the path must either be empty or begin with a slash ("/") character. When authority is not present, the path cannot begin with two slash characters ("//"). These restrictions result in five different ABNF rules for a path (Section 3.3), only one of which will match any given URI reference.

The following are two example URIs and their component parts:

         foo://example.com:8042/over/there?name=ferret#nose
         \_/   \______________/\_________/ \_________/ \__/
          |           |            |            |        |
       scheme     authority       path        query   fragment
          |   _____________________|__
         / \ /                        \
         urn:example:animal:ferret:nose
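
The decomposition shown in the diagram can be reproduced with a non-validating sketch in Python. It uses the regular expression that this specification gives in Appendix B for breaking a URI reference into components; the function name split_components is illustrative. Components that are absent are returned as None, which keeps them distinct from components that are present but empty.

```python
import re

# Regular expression from Appendix B; it separates but does not validate.
_URI_PARTS = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


def split_components(uri):
    """Return (scheme, authority, path, query, fragment)."""
    match = _URI_PARTS.match(uri)
    return (match.group(2), match.group(4), match.group(5),
            match.group(7), match.group(9))


print(split_components("foo://example.com:8042/over/there?name=ferret#nose"))
print(split_components("urn:example:animal:ferret:nose"))
```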
        
3.1. Scheme

Each URI begins with a scheme name that refers to a specification for assigning identifiers within that scheme. As such, the URI syntax is a federated and extensible naming system wherein each scheme's specification may further restrict the syntax and semantics of identifiers using that scheme.

Scheme names consist of a sequence of characters beginning with a letter and followed by any combination of letters, digits, plus ("+"), period ("."), or hyphen ("-"). Although schemes are case-insensitive, the canonical form is lowercase and documents that specify schemes must do so with lowercase letters. An implementation should accept uppercase letters as equivalent to lowercase in scheme names (e.g., allow "HTTP" as well as "http") for the sake of robustness but should only produce lowercase scheme names for consistency.

      scheme      = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
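
A minimal sketch of the scheme rule and the recommended handling of case, written in Python: uppercase input is accepted and lowercase output is produced. The function name normalize_scheme is illustrative.

```python
import re

# scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
_SCHEME = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*")


def normalize_scheme(name):
    """Validate a scheme name and return its canonical lowercase form."""
    if not _SCHEME.fullmatch(name):
        raise ValueError("not a valid scheme name: %r" % name)
    return name.lower()


print(normalize_scheme("HTTP"))
```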
        

Individual schemes are not specified by this document. The process for registration of new URI schemes is defined separately by [BCP35]. The scheme registry maintains the mapping between scheme names and their specifications. Advice for designers of new URI schemes can be found in [RFC2718]. URI scheme specifications must define their own syntax so that all strings matching their scheme-specific syntax will also match the <absolute-URI> grammar, as described in Section 4.3.

When presented with a URI that violates one or more scheme-specific restrictions, the scheme-specific resolution process should flag the reference as an error rather than ignore the unused parts; doing so reduces the number of equivalent URIs and helps detect abuses of the generic syntax, which might indicate that the URI has been constructed to mislead the user (Section 7.6).

3.2. Authority

Many URI schemes include a hierarchical element for a naming authority so that governance of the name space defined by the remainder of the URI is delegated to that authority (which may, in turn, delegate it further). The generic syntax provides a common means for distinguishing an authority based on a registered name or server address, along with optional port and user information.

The authority component is preceded by a double slash ("//") and is terminated by the next slash ("/"), question mark ("?"), or number sign ("#") character, or by the end of the URI.

      authority   = [ userinfo "@" ] host [ ":" port ]
        

URI producers and normalizers should omit the ":" delimiter that separates host from port if the port component is empty. Some schemes do not allow the userinfo and/or port subcomponents.

If a URI contains an authority component, then the path component must either be empty or begin with a slash ("/") character. Non-validating parsers (those that merely separate a URI reference into its major components) will often ignore the subcomponent structure of authority, treating it as an opaque string from the double-slash to the first terminating delimiter, until such time as the URI is dereferenced.
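
When the subcomponent structure of authority is eventually needed, it might be separated as in the following Python sketch. The function names are illustrative, the subcomponents are separated but not validated, and absent subcomponents are reported as None.

```python
def split_authority(authority):
    """Split an authority into (userinfo, host, port)."""
    userinfo, at_sign, hostport = authority.rpartition("@")
    if not at_sign:
        userinfo = None
    if hostport.startswith("["):
        # IP-literal: colons inside the brackets are not port delimiters.
        end = hostport.find("]")
        if end == -1:
            raise ValueError("unterminated IP-literal")
        host, rest = hostport[:end + 1], hostport[end + 1:]
        if rest and not rest.startswith(":"):
            raise ValueError("unexpected data after IP-literal")
        port = rest[1:] if rest else None
    else:
        host, colon, port = hostport.rpartition(":")
        if not colon:
            host, port = hostport, None
    return userinfo, host, port


def join_authority(userinfo, host, port):
    """Rebuild an authority, omitting ":" when the port is empty or absent."""
    prefix = userinfo + "@" if userinfo is not None else ""
    suffix = ":" + port if port else ""
    return prefix + host + suffix


print(split_authority("user@[2001:db8::7]:80"))
print(join_authority(*split_authority("example.com:")))
```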

3.2.1. User Information

The userinfo subcomponent may consist of a user name and, optionally, scheme-specific information about how to gain authorization to access the resource. The user information, if present, is followed by a commercial at-sign ("@") that delimits it from the host.

      userinfo    = *( unreserved / pct-encoded / sub-delims / ":" )
        

Use of the format "user:password" in the userinfo field is deprecated. Applications should not render as clear text any data after the first colon (":") character found within a userinfo subcomponent unless the data after the colon is the empty string (indicating no password). Applications may choose to ignore or reject such data when it is received as part of a reference and should reject the storage of such data in unencrypted form. The passing of authentication information in clear text has proven to be a security risk in almost every case where it has been used.
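
An application that displays URIs might apply the rendering rule above along the lines of this Python sketch. The function name display_userinfo and the "****" placeholder are assumptions made for illustration; an application could equally drop the data after the colon altogether.

```python
MASK = "****"  # placeholder chosen for illustration only


def display_userinfo(userinfo):
    """Return a form of userinfo that is safe to render as clear text."""
    user, colon, secret = userinfo.partition(":")
    if colon and secret:
        # Data after the first colon is never shown.
        return user + ":" + MASK
    # No colon, or an empty string after it (no password): show as is.
    return userinfo


print(display_userinfo("alice:s3cret:more"))
```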

Applications that render a URI for the sake of user feedback, such as in graphical hypertext browsing, should render userinfo in a way that is distinguished from the rest of a URI, when feasible. Such rendering will assist the user in cases where the userinfo has been misleadingly crafted to look like a trusted domain name (Section 7.6).

3.2.2. Host

The host subcomponent of authority is identified by an IP literal encapsulated within square brackets, an IPv4 address in dotted-decimal form, or a registered name. The host subcomponent is case-insensitive. The presence of a host subcomponent within a URI does not imply that the scheme requires access to the given host on the Internet. In many cases, the host syntax is used only for the sake of reusing the existing registration process created and deployed for DNS, thus obtaining a globally unique name without the cost of deploying another registry. However, such use comes with its own costs: domain name ownership may change over time for reasons not anticipated by the URI producer. In other cases, the data within the host component identifies a registered name that has nothing to do with an Internet host. We use the name "host" for the ABNF rule because that is its most common purpose, not its only purpose.

      host        = IP-literal / IPv4address / reg-name
        

The syntax rule for host is ambiguous because it does not completely distinguish between an IPv4address and a reg-name. In order to disambiguate the syntax, we apply the "first-match-wins" algorithm: If host matches the rule for IPv4address, then it should be considered an IPv4 address literal and not a reg-name. Although host is case-insensitive, producers and normalizers should use lowercase for registered names and hexadecimal addresses for the sake of uniformity, while only using uppercase letters for percent-encodings.
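
The "first-match-wins" rule can be sketched in Python as follows. The function name classify_host is illustrative, and the sketch only decides which alternative applies: it does not validate the contents of an IP-literal or the characters of a reg-name.

```python
import re

# One decimal octet in the range 0-255, without leading zeros.
_DEC_OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])"
_IPV4 = re.compile(r"(?:%s\.){3}%s" % (_DEC_OCTET, _DEC_OCTET))


def classify_host(host):
    """Return which alternative of the host rule applies."""
    if host.startswith("[") and host.endswith("]"):
        return "IP-literal"
    if _IPV4.fullmatch(host):
        # First match wins: never reinterpret this as a reg-name.
        return "IPv4address"
    return "reg-name"


for sample in ("192.0.2.1", "192.0.2.256", "example.com", "[::1]"):
    print(sample, classify_host(sample))
```

Note that "192.0.2.256" does not match the dotted-decimal form and therefore falls through to reg-name, which is the ambiguity the rule is meant to resolve deterministically.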

A host identified by an Internet Protocol literal address, version 6 [RFC3513] or later, is distinguished by enclosing the IP literal within square brackets ("[" and "]"). This is the only place where square bracket characters are allowed in the URI syntax. In anticipation of future, as-yet-undefined IP literal address formats, an implementation may use an optional version flag to indicate such a format explicitly rather than rely on heuristic determination.

      IP-literal = "[" ( IPv6address / IPvFuture ) "]"

      IPvFuture  = "v" 1*HEXDIG "." 1*( unreserved / sub-delims / ":" )
        

The version flag does not indicate the IP version; rather, it indicates future versions of the literal format. As such, implementations must not provide the version flag for the existing IPv4 and IPv6 literal address forms described below. If a URI containing an IP-literal that starts with "v" (case-insensitive), indicating that the version flag is present, is dereferenced by an application that does not know the meaning of that version flag, then the application should return an appropriate error for "address mechanism not supported".
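
Handling of the version flag might be sketched as below. The exception class, the function name check_ip_literal, and the known_versions parameter are all hypothetical; no version flags are defined by this specification, so the default set of known versions is empty.

```python
import re

# IPvFuture = "v" 1*HEXDIG "." 1*( unreserved / sub-delims / ":" )
_IPVFUTURE = re.compile(r"[vV][0-9A-Fa-f]+\.[A-Za-z0-9\-._~!$&'()*+,;=:]+")


class UnsupportedAddressMechanism(ValueError):
    """Raised for "address mechanism not supported"."""


def check_ip_literal(host, known_versions=frozenset()):
    """Return the text inside the brackets, rejecting unknown version flags."""
    if not (host.startswith("[") and host.endswith("]")):
        raise ValueError("not an IP-literal: %r" % host)
    inner = host[1:-1]
    if inner[:1] not in ("v", "V"):
        return inner  # treated as an IPv6address; not validated here
    if not _IPVFUTURE.fullmatch(inner):
        raise ValueError("malformed IPvFuture: %r" % inner)
    version = inner[1:inner.index(".")].lower()
    if version not in known_versions:
        raise UnsupportedAddressMechanism(inner)
    return inner


print(check_ip_literal("[2001:db8::7]"))
```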

A host identified by an IPv6 literal address is represented inside the square brackets without a preceding version flag. The ABNF provided here is a translation of the text definition of an IPv6 literal address provided in [RFC3513]. This syntax does not support IPv6 scoped addressing zone identifiers.

A 128-bit IPv6 address is divided into eight 16-bit pieces. Each piece is represented numerically in case-insensitive hexadecimal, using one to four hexadecimal digits (leading zeroes are permitted). The eight encoded pieces are given most-significant first, separated by colon characters. Optionally, the least-significant two pieces may instead be represented in IPv4 address textual format. A sequence of one or more consecutive zero-valued 16-bit pieces within the address may be elided, omitting all their digits and leaving exactly two consecutive colons in their place to mark the elision.

      IPv6address =                            6( h16 ":" ) ls32
                  /                       "::" 5( h16 ":" ) ls32
                  / [               h16 ] "::" 4( h16 ":" ) ls32
                  / [ *1( h16 ":" ) h16 ] "::" 3( h16 ":" ) ls32
                  / [ *2( h16 ":" ) h16 ] "::" 2( h16 ":" ) ls32
                  / [ *3( h16 ":" ) h16 ] "::"    h16 ":"   ls32
                  / [ *4( h16 ":" ) h16 ] "::"              ls32
                  / [ *5( h16 ":" ) h16 ] "::"              h16
                  / [ *6( h16 ":" ) h16 ] "::"
        
      ls32        = ( h16 ":" h16 ) / IPv4address
                  ; least-significant 32 bits of address
        
      h16         = 1*4HEXDIG
                  ; 16 bits of address represented in hexadecimal
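
To make the textual form concrete, the following Python sketch expands an IPv6 literal into its eight 16-bit pieces, including the "::" elision and the IPv4 form of the least-significant 32 bits. It relies on the standard ipaddress module rather than on a translation of the ABNF above, so it should be read as an approximation: that module may accept forms this grammar does not (for example, zone identifiers in recent Python versions, which the sketch rejects explicitly). The function name ipv6_pieces is an assumption of this example.

```python
import ipaddress


def ipv6_pieces(text):
    """Return the eight 16-bit pieces of an IPv6 literal as 4-digit hex strings."""
    if "%" in text:
        # The grammar in this section does not support zone identifiers.
        raise ValueError("zone identifiers are not supported")
    value = int(ipaddress.IPv6Address(text))
    # Most-significant piece first, as in the textual representation.
    return [format((value >> shift) & 0xFFFF, "04x") for shift in range(112, -1, -16)]
```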
        

A host identified by an IPv4 literal address is represented in dotted-decimal notation (a sequence of four decimal numbers in the range 0 to 255, separated by "."), as described in [RFC1123] by reference to [RFC0952]. Note that other forms of dotted notation may be interpreted on some platforms, as described in Section 7.4, but only the dotted-decimal form of four octets is allowed by this grammar.

      IPv4address = dec-octet "." dec-octet "." dec-octet "." dec-octet

      dec-octet   = DIGIT                 ; 0-9
                  / %x31-39 DIGIT         ; 10-99
                  / "1" 2DIGIT            ; 100-199
                  / "2" %x30-34 DIGIT     ; 200-249
                  / "25" %x30-35          ; 250-255
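
The dec-octet and IPv4address rules can be transcribed into a regular expression. The Python sketch below is one such transcription; the names DEC_OCTET, IPV4ADDRESS, and is_ipv4address are assumptions of this example.

```python
import re

# One alternative per dec-octet line: 250-255, 200-249, 100-199, 10-99, 0-9.
DEC_OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])"
IPV4ADDRESS = re.compile(r"{0}\.{0}\.{0}\.{0}".format(DEC_OCTET))


def is_ipv4address(text):
    """True if text matches the IPv4address rule (four dec-octets)."""
    return IPV4ADDRESS.fullmatch(text) is not None
```

Note that a leading zero, as in "01.2.3.4", does not match, because no dec-octet alternative permits it.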
        

A host identified by a registered name is a sequence of characters usually intended for lookup within a locally defined host or service name registry, though the URI's scheme-specific semantics may require that a specific registry (or fixed name table) be used instead. The most common name registry mechanism is the Domain Name System (DNS). A registered name intended for lookup in the DNS uses the syntax defined in Section 3.5 of [RFC1034] and Section 2.1 of [RFC1123]. Such a name consists of a sequence of domain labels separated by ".", each domain label starting and ending with an alphanumeric character and possibly also containing "-" characters. The rightmost domain label of a fully qualified domain name in DNS may be followed by a single "." and should be if it is necessary to distinguish between the complete domain name and some local domain.

      reg-name    = *( unreserved / pct-encoded / sub-delims )
        

If the URI scheme defines a default for host, then that default applies when the host subcomponent is undefined or when the registered name is empty (zero length). For example, the "file" URI scheme is defined so that no authority, an empty host, and "localhost" all mean the end-user's machine, whereas the "http" scheme considers a missing authority or empty host invalid.

This specification does not mandate a particular registered name lookup technology and therefore does not restrict the syntax of reg-name beyond what is necessary for interoperability. Instead, it delegates the issue of registered name syntax conformance to the operating system of each application performing URI resolution, and that operating system decides what it will allow for the purpose of host identification. A URI resolution implementation might use DNS, host tables, yellow pages, NetInfo, WINS, or any other system for lookup of registered names. However, a globally scoped naming system, such as DNS fully qualified domain names, is necessary for URIs intended to have global scope. URI producers should use names that conform to the DNS syntax, even when use of DNS is not immediately apparent, and should limit these names to no more than 255 characters in length.

The reg-name syntax allows percent-encoded octets in order to represent non-ASCII registered names in a uniform way that is independent of the underlying name resolution technology. Non-ASCII characters must first be encoded according to UTF-8 [STD63], and then each octet of the corresponding UTF-8 sequence must be percent-encoded to be represented as URI characters. URI producing applications must not use percent-encoding in host unless it is used to represent a UTF-8 character sequence. When a non-ASCII registered name represents an internationalized domain name intended for resolution via the DNS, the name must be transformed to the IDNA encoding [RFC3490] prior to name lookup. URI producers should provide these registered names in the IDNA encoding, rather than a percent-encoding, if they wish to maximize interoperability with legacy URI resolvers.
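
The two representations of a non-ASCII registered name described above can be compared side by side. The Python sketch below uses urllib.parse.quote for the UTF-8 percent-encoded form and Python's built-in "idna" codec for the IDNA form; the function names and the sample name "bücher.example" are assumptions of this example.

```python
from urllib.parse import quote


def percent_encode_host(name):
    """Percent-encode the UTF-8 octets of every character outside unreserved/sub-delims."""
    # Unreserved characters are never quoted; sub-delims are kept via 'safe'.
    return quote(name, safe="!$&'()*+,;=")


def idna_encode_host(name):
    """Convert a registered name to its IDNA ASCII form using Python's 'idna' codec."""
    return name.encode("idna").decode("ascii")
```

For the sample name, the first function yields "b%C3%BCcher.example" and the second yields "xn--bcher-kva.example"; the latter is the form that legacy resolvers are more likely to handle.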

3.2.3. Port

The port subcomponent of authority is designated by an optional port number in decimal following the host and delimited from it by a single colon (":") character.

      port        = *DIGIT
        

A scheme may define a default port. For example, the "http" scheme defines a default port of "80", corresponding to its reserved TCP port number. The type of port designated by the port number (e.g., TCP, UDP, SCTP) is defined by the URI scheme. URI producers and normalizers should omit the port component and its ":" delimiter if port is empty or if its value would be the same as that of the scheme's default.
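
The recommendation to omit an empty or default port might be sketched as follows in Python. The DEFAULT_PORTS table is a deliberately partial assumption for illustration, as is the function name normalize_authority; a real implementation would draw defaults from each scheme's specification.

```python
# Assumption: partial table for this example only.
DEFAULT_PORTS = {"http": "80"}


def normalize_authority(scheme, host, port):
    """Join host and port, omitting an empty or scheme-default port and its ':'."""
    if port is None or port == "" or port == DEFAULT_PORTS.get(scheme.lower()):
        return host
    return host + ":" + port
```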

3.3. Path

The path component contains data, usually organized in hierarchical form, that, along with data in the non-hierarchical query component (Section 3.4), serves to identify a resource within the scope of the URI's scheme and naming authority (if any). The path is terminated by the first question mark ("?") or number sign ("#") character, or by the end of the URI.

If a URI contains an authority component, then the path component must either be empty or begin with a slash ("/") character. If a URI does not contain an authority component, then the path cannot begin with two slash characters ("//"). In addition, a URI reference (Section 4.1) may be a relative-path reference, in which case the first path segment cannot contain a colon (":") character. The ABNF requires five separate rules to disambiguate these cases, only one of which will match the path substring within a given URI reference. We use the generic term "path component" to describe the URI substring matched by the parser to one of these rules.

      path          = path-abempty    ; begins with "/" or is empty
                    / path-absolute   ; begins with "/" but not "//"
                    / path-noscheme   ; begins with a non-colon segment
                    / path-rootless   ; begins with a segment
                    / path-empty      ; zero characters
        
      path-abempty  = *( "/" segment )
      path-absolute = "/" [ segment-nz *( "/" segment ) ]
      path-noscheme = segment-nz-nc *( "/" segment )
      path-rootless = segment-nz *( "/" segment )
      path-empty    = 0<pchar>
        
      segment       = *pchar
      segment-nz    = 1*pchar
      segment-nz-nc = 1*( unreserved / pct-encoded / sub-delims / "@" )
                    ; non-zero-length segment without any colon ":"
        
      pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"
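
One way to see how the five path rules partition the cases is to transcribe them into regular expressions and try only the rules that the surrounding context allows. The Python sketch below does this; the function name classify_path and its parameters are assumptions of this example, and the choice of candidate rules follows the hier-part and relative-part productions.

```python
import re

PCHAR = r"(?:[A-Za-z0-9\-._~!$&'()*+,;=:@]|%[0-9A-Fa-f]{2})"
PCHAR_NC = r"(?:[A-Za-z0-9\-._~!$&'()*+,;=@]|%[0-9A-Fa-f]{2})"  # pchar without ":"
TAIL = r"(?:/{0}*)*".format(PCHAR)  # *( "/" segment )

RULES = {
    "path-abempty": re.compile(TAIL),
    "path-absolute": re.compile(r"/(?:{0}+{1})?".format(PCHAR, TAIL)),
    "path-noscheme": re.compile(r"{0}+{1}".format(PCHAR_NC, TAIL)),
    "path-rootless": re.compile(r"{0}+{1}".format(PCHAR, TAIL)),
    "path-empty": re.compile(r""),
}


def classify_path(path, has_authority, relative_ref=False):
    """Return the name of the path rule that matches, or None if none does."""
    if has_authority:
        candidates = ["path-abempty"]
    elif relative_ref:
        candidates = ["path-absolute", "path-noscheme", "path-empty"]
    else:
        candidates = ["path-absolute", "path-rootless", "path-empty"]
    for name in candidates:
        if RULES[name].fullmatch(path):
            return name
    return None
```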
        

A path consists of a sequence of path segments separated by a slash ("/") character. A path is always defined for a URI, though the defined path may be empty (zero length). Use of the slash character to indicate hierarchy is only required when a URI will be used as the context for relative references. For example, the URI <mailto:fred@example.com> has a path of "fred@example.com", whereas the URI <foo://info.example.com?fred> has an empty path.
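
The two examples above can be checked with a general-purpose splitter. The Python sketch below uses urllib.parse.urlsplit, which is Python's own parser and not a validating implementation of this grammar; the helper name path_and_query is an assumption of this example.

```python
from urllib.parse import urlsplit


def path_and_query(uri):
    """Return the (path, query) pair that urlsplit extracts from a URI."""
    parts = urlsplit(uri)
    return parts.path, parts.query
```

For <mailto:fred@example.com> the path is "fred@example.com" and the query is empty; for <foo://info.example.com?fred> the path is empty and the query is "fred".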

The path segments "." and "..", also known as dot-segments, are defined for relative reference within the path name hierarchy. They are intended for use at the beginning of a relative-path reference (Section 4.2) to indicate relative position within the hierarchical tree of names. This is similar to their role within some operating systems' file directory structures to indicate the current directory and parent directory, respectively. However, unlike in a file system, these dot-segments are only interpreted within the URI path hierarchy and are removed as part of the resolution process (Section 5.2).

Aside from dot-segments in hierarchical paths, a path segment is considered opaque by the generic syntax. URI producing applications often use the reserved characters allowed in a segment to delimit scheme-specific or dereference-handler-specific subcomponents. For example, the semicolon (";") and equals ("=") reserved characters are often used to delimit parameters and parameter values applicable to that segment. The comma (",") reserved character is often used for similar purposes. For example, one URI producer might use a segment such as "name;v=1.1" to indicate a reference to version 1.1 of "name", whereas another might use a segment such as "name,1.1" to indicate the same. Parameter types may be defined by scheme-specific semantics, but in most cases the syntax of a parameter is specific to the implementation of the URI's dereferencing algorithm.
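
As an illustration of the semicolon and equals convention mentioned above, the following Python sketch splits a segment such as "name;v=1.1" into a name and its parameters. This is only one producer's convention; the generic syntax treats the segment as opaque, and the function name split_segment_params is an assumption of this example.

```python
def split_segment_params(segment):
    """Split 'name;k=v;flag' into ('name', {'k': 'v', 'flag': None})."""
    name, *raw_params = segment.split(";")
    params = {}
    for item in raw_params:
        key, sep, value = item.partition("=")
        # A parameter without "=" is recorded with no value.
        params[key] = value if sep else None
    return name, params
```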

3.4. Query

The query component contains non-hierarchical data that, along with data in the path component (Section 3.3), serves to identify a resource within the scope of the URI's scheme and naming authority (if any). The query component is indicated by the first question mark ("?") character and terminated by a number sign ("#") character or by the end of the URI.

      query       = *( pchar / "/" / "?" )
        

The characters slash ("/") and question mark ("?") may represent data within the query component. Beware that some older, erroneous implementations may not handle such data correctly when it is used as the base URI for relative references (Section 5.1), apparently because they fail to distinguish query data from path data when looking for hierarchical separators. However, as query components are often used to carry identifying information in the form of "key=value" pairs and one frequently used value is a reference to another URI, it is sometimes better for usability to avoid percent-encoding those characters.
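
A short Python sketch shows a query that carries another URI reference with its slash and question mark left unencoded. The sample URI and the helper name query_of are assumptions of this example; urllib.parse.urlsplit is used as a convenient, non-validating splitter.

```python
from urllib.parse import urlsplit


def query_of(uri):
    """Return the query component: from the first '?' up to '#' or the end."""
    return urlsplit(uri).query
```

Applied to "http://example.com/find?ref=http://example.org/a?b#top", the query is "ref=http://example.org/a?b": only the first question mark delimits the component, and the later "/" and "?" are data.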

3.5. Fragment

The fragment identifier component of a URI allows indirect identification of a secondary resource by reference to a primary resource and additional identifying information. The identified secondary resource may be some portion or subset of the primary resource, some view on representations of the primary resource, or some other resource defined or described by those representations. A fragment identifier component is indicated by the presence of a number sign ("#") character and terminated by the end of the URI.

      fragment    = *( pchar / "/" / "?" )
        

The semantics of a fragment identifier are defined by the set of representations that might result from a retrieval action on the primary resource. The fragment's format and resolution is therefore dependent on the media type [RFC2046] of a potentially retrieved representation, even though such a retrieval is only performed if the URI is dereferenced. If no such representation exists, then the semantics of the fragment are considered unknown and are effectively unconstrained. Fragment identifier semantics are independent of the URI scheme and thus cannot be redefined by scheme specifications.

Individual media types may define their own restrictions on or structures within the fragment identifier syntax for specifying different types of subsets, views, or external references that are identifiable as secondary resources by that media type. If the primary resource has multiple representations, as is often the case for resources whose representation is selected based on attributes of the retrieval request (a.k.a., content negotiation), then whatever is identified by the fragment should be consistent across all of those representations. Each representation should either define the fragment so that it corresponds to the same secondary resource, regardless of how it is represented, or should leave the fragment undefined (i.e., not found).

As with any URI, use of a fragment identifier component does not imply that a retrieval action will take place. A URI with a fragment identifier may be used to refer to the secondary resource without any implication that the primary resource is accessible or will ever be accessed.

Fragment identifiers have a special role in information retrieval systems as the primary form of client-side indirect referencing, allowing an author to specifically identify aspects of an existing resource that are only indirectly provided by the resource owner. As such, the fragment identifier is not used in the scheme-specific processing of a URI; instead, the fragment identifier is separated from the rest of the URI prior to a dereference, and thus the identifying information within the fragment itself is dereferenced solely by the user agent, regardless of the URI scheme. Although this separate handling is often perceived to be a loss of information, particularly for accurate redirection of references as resources move over time, it also serves to prevent information providers from denying reference authors the right to refer to information within a resource selectively. Indirect referencing also provides additional flexibility and extensibility to systems that use URIs, as new media types are easier to define and deploy than new schemes of identification.
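
The separation of the fragment prior to a dereference might look as follows in a user agent written in Python. The sketch uses urllib.parse.urldefrag; the function name split_for_dereference and the sample URI are assumptions of this example.

```python
from urllib.parse import urldefrag


def split_for_dereference(uri):
    """Return (URI handed to scheme-specific processing, fragment kept by the user agent)."""
    url, fragment = urldefrag(uri)
    return url, fragment
```

Only the first element would be passed to the retrieval mechanism; the second is interpreted locally against whatever representation is obtained.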

The characters slash ("/") and question mark ("?") are allowed to represent data within the fragment identifier. Beware that some older, erroneous implementations may not handle this data correctly when it is used as the base URI for relative references (Section 5.1).

4. Usage

When applications make reference to a URI, they do not always use the full form of reference defined by the "URI" syntax rule. To save space and take advantage of hierarchical locality, many Internet protocol elements and media type formats allow an abbreviation of a URI, whereas others restrict the syntax to a particular form of URI. We define the most common forms of reference syntax in this specification because they impact and depend upon the design of the generic syntax, requiring a uniform parsing algorithm in order to be interpreted consistently.

4.1. URI Reference

URI-reference is used to denote the most common usage of a resource identifier.

      URI-reference = URI / relative-ref
        

A URI-reference is either a URI or a relative reference. If the URI-reference's prefix does not match the syntax of a scheme followed by its colon separator, then the URI-reference is a relative reference.

A URI-reference is typically parsed first into the five URI components, in order to determine what components are present and whether the reference is relative. Then, each component is parsed for its subparts and their validation. The ABNF of URI-reference, along with the "first-match-wins" disambiguation rule, is sufficient to define a validating parser for the generic syntax. Readers familiar with regular expressions should see Appendix B for an example of a non-validating URI-reference parser that will take any given string and extract the URI components.
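
For convenience, the non-validating approach can be sketched in Python using the regular expression given in Appendix B. The wrapper function split_uri_reference and the dictionary keys are assumptions of this example; a component that is absent is reported as None, which keeps "undefined" distinct from "empty".

```python
import re

# Regular expression from Appendix B of this specification.
URI_REFERENCE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


def split_uri_reference(ref):
    """Extract the five components of a URI reference without validating them."""
    match = URI_REFERENCE.match(ref)
    return {
        "scheme": match.group(2),
        "authority": match.group(4),
        "path": match.group(5),
        "query": match.group(7),
        "fragment": match.group(9),
    }
```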

4.2. Relative Reference

A relative reference takes advantage of the hierarchical syntax (Section 1.2.3) to express a URI reference relative to the name space of another hierarchical URI.

      relative-ref  = relative-part [ "?" query ] [ "#" fragment ]

      relative-part = "//" authority path-abempty
                    / path-absolute
                    / path-noscheme
                    / path-empty
        

The URI referred to by a relative reference, also known as the target URI, is obtained by applying the reference resolution algorithm of Section 5.

A relative reference that begins with two slash characters is termed a network-path reference; such references are rarely used. A relative reference that begins with a single slash character is termed an absolute-path reference. A relative reference that does not begin with a slash character is termed a relative-path reference.
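
The three terms defined above depend only on the leading characters of the reference, as the following Python sketch shows. It assumes the input has already been determined to be a relative reference (that is, it has no scheme); the function name relative_reference_kind is an assumption of this example.

```python
def relative_reference_kind(ref):
    """Name the kind of a relative reference from its leading slash characters."""
    if ref.startswith("//"):
        return "network-path"
    if ref.startswith("/"):
        return "absolute-path"
    return "relative-path"
```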

A path segment that contains a colon character (e.g., "this:that") cannot be used as the first segment of a relative-path reference, as it would be mistaken for a scheme name. Such a segment must be preceded by a dot-segment (e.g., "./this:that") to make a relative-path reference.
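
A producer might apply this rule mechanically, as in the Python sketch below. The function name make_relative_path_reference is an assumption of this example, and the input is taken to be a rootless path that is to be emitted as a relative-path reference.

```python
def make_relative_path_reference(path):
    """Prefix './' when the first segment contains ':' so it is not read as a scheme."""
    first_segment = path.split("/", 1)[0]
    if ":" in first_segment:
        return "./" + path
    return path
```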

4.3. Absolute URI

Some protocol elements allow only the absolute form of a URI without a fragment identifier. For example, defining a base URI for later use by relative references calls for an absolute-URI syntax rule that does not allow a fragment.

      absolute-URI  = scheme ":" hier-part [ "?" query ]

URI scheme specifications must define their own syntax so that all strings matching their scheme-specific syntax will also match the <absolute-URI> grammar. Scheme specifications will not define fragment identifier syntax or usage, regardless of its applicability to resources identifiable via that scheme, as fragment identification is orthogonal to scheme definition. However, scheme specifications are encouraged to include a wide range of examples, including examples that show use of the scheme's URIs with fragment identifiers when such usage is appropriate.

4.4. Same-Document Reference

When a URI reference refers to a URI that is, aside from its fragment component (if any), identical to the base URI (Section 5.1), that reference is called a "same-document" reference. The most frequent examples of same-document references are relative references that are empty or include only the number sign ("#") separator followed by a fragment identifier.
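
A same-document test might be sketched in Python as follows: resolve the reference against the base, then compare the two with fragments removed. The sketch uses urllib.parse.urljoin as a stand-in for the resolution algorithm of Section 5 and performs no normalization, so equivalent but differently spelled URIs are not recognized. The function name is_same_document is an assumption of this example.

```python
from urllib.parse import urldefrag, urljoin


def is_same_document(base, reference):
    """True if the resolved reference equals the base URI, ignoring fragments."""
    target = urljoin(base, reference)
    return urldefrag(target)[0] == urldefrag(base)[0]
```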

When a same-document reference is dereferenced for a retrieval action, the target of that reference is defined to be within the same entity (representation, document, or message) as the reference; therefore, a dereference should not result in a new retrieval action.

Normalization of the base and target URIs prior to their comparison, as described in Sections 6.2.2 and 6.2.3, is allowed but rarely performed in practice. Normalization may increase the set of same-document references, which may be of benefit to some caching applications. As such, reference authors should not assume that a slightly different, though equivalent, reference URI will (or will not) be interpreted as a same-document reference by any given application.

4.5. Suffix Reference

The URI syntax is designed for unambiguous reference to resources and extensibility via the URI scheme. However, as URI identification and usage have become commonplace, traditional media (television, radio, newspapers, billboards, etc.) have increasingly used a suffix of the URI as a reference, consisting of only the authority and path portions of the URI, such as

www.w3.org/Addressing/

or simply a DNS registered name on its own. Such references are primarily intended for human interpretation rather than for machines, with the assumption that context-based heuristics are sufficient to complete the URI (e.g., most registered names beginning with "www" are likely to have a URI prefix of "http://"). Although there is no standard set of heuristics for disambiguating a URI suffix, many client implementations allow them to be entered by the user and heuristically resolved.

Although this practice of using suffix references is common, it should be avoided whenever possible and should never be used in situations where long-term references are expected. The heuristics noted above will change over time, particularly when a new URI scheme becomes popular, and are often incorrect when used out of context. Furthermore, they can lead to security issues along the lines of those described in [RFC1535].

As a URI suffix has the same syntax as a relative-path reference, a suffix reference cannot be used in contexts where a relative reference is expected. As a result, suffix references are limited to places where there is no defined base URI, such as dialog boxes and off-line advertisements.

5. Reference Resolution

This section defines the process of resolving a URI reference within a context that allows relative references so that the result is a string matching the <URI> syntax rule of Section 3.

5.1. Establishing a Base URI

The term "relative" implies that a "base URI" exists against which the relative reference is applied. Aside from fragment-only references (Section 4.4), relative references are only usable when a base URI is known. A base URI must be established by the parser prior to parsing URI references that might be relative. A base URI must conform to the <absolute-URI> syntax rule (Section 4.3). If the base URI is obtained from a URI reference, then that reference must be converted to absolute form and stripped of any fragment component prior to its use as a base URI.
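
Preparing a URI for use as a base might be sketched in Python as below. The sketch strips any fragment and merely rejects input that has no scheme; converting a relative reference to absolute form is left to the resolution algorithm and is not attempted here. The function name as_base_uri is an assumption of this example.

```python
from urllib.parse import urldefrag, urlsplit


def as_base_uri(uri):
    """Strip any fragment and require a scheme before use as a base URI."""
    stripped = urldefrag(uri)[0]
    if not urlsplit(stripped).scheme:
        raise ValueError("base URI must be in absolute form")
    return stripped
```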

The base URI of a reference can be established in one of four ways, discussed below in order of precedence. The order of precedence can be thought of in terms of layers, where the innermost defined base URI has the highest precedence. This can be visualized graphically as follows:

         .----------------------------------------------------------.
         |  .----------------------------------------------------.  |
         |  |  .----------------------------------------------.  |  |
         |  |  |  .----------------------------------------.  |  |  |
         |  |  |  |  .----------------------------------.  |  |  |  |
         |  |  |  |  |       <relative-reference>       |  |  |  |  |
         |  |  |  |  `----------------------------------'  |  |  |  |
         |  |  |  | (5.1.1) Base URI embedded in content   |  |  |  |
         |  |  |  `----------------------------------------'  |  |  |
         |  |  | (5.1.2) Base URI of the encapsulating entity |  |  |
         |  |  |         (message, representation, or none)   |  |  |
         |  |  `----------------------------------------------'  |  |
         |  | (5.1.3) URI used to retrieve the entity            |  |
         |  `----------------------------------------------------'  |
         | (5.1.4) Default Base URI (application-dependent)         |
         `----------------------------------------------------------'
        
5.1.1. Base URI Embedded in Content

Within certain media types, a base URI for relative references can be embedded within the content itself so that it can be readily obtained by a parser. This can be useful for descriptive documents, such as tables of contents, which may be transmitted to others through protocols other than their usual retrieval context (e.g., email or USENET news).

It is beyond the scope of this specification to specify how, for each media type, a base URI can be embedded. The appropriate syntax, when available, is described by the data format specification associated with each media type.

5.1.2. Base URI from the Encapsulating Entity

If no base URI is embedded, the base URI is defined by the representation's retrieval context. For a document that is enclosed within another entity, such as a message or archive, the retrieval context is that entity. Thus, the default base URI of a representation is the base URI of the entity in which the representation is encapsulated.

A mechanism for embedding a base URI within MIME container types (e.g., the message and multipart types) is defined by MHTML [RFC2557]. Protocols that do not use the MIME message header syntax, but that do allow some form of tagged metadata to be included within messages, may define their own syntax for defining a base URI as part of a message.

5.1.3. Base URI from the Retrieval URI

If no base URI is embedded and the representation is not encapsulated within some other entity, then, if a URI was used to retrieve the representation, that URI shall be considered the base URI. Note that if the retrieval was the result of a redirected request, the last URI used (i.e., the URI that resulted in the actual retrieval of the representation) is the base URI.

5.1.4. Default Base URI

If none of the conditions described above apply, then the base URI is defined by the context of the application. As this definition is necessarily application-dependent, failing to define a base URI by using one of the other methods may result in the same content being interpreted differently by different types of applications.

A sender of a representation containing relative references is responsible for ensuring that a base URI for those references can be established. Aside from fragment-only references, relative references can only be used reliably in situations where the base URI is well defined.
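
The precedence order of Sections 5.1.1 through 5.1.4 can be sketched in Python as follows. This is only an illustration: the function name establish_base_uri and its four parameters are hypothetical, and how each candidate is actually obtained depends on the media type, the protocol, and the application.

```python
from typing import Optional


def establish_base_uri(
    embedded: Optional[str] = None,       # 5.1.1: base URI embedded in content
    encapsulating: Optional[str] = None,  # 5.1.2: base URI of the encapsulating entity
    retrieval: Optional[str] = None,      # 5.1.3: URI used to retrieve the entity
    app_default: Optional[str] = None,    # 5.1.4: application-dependent default
) -> Optional[str]:
    """Return the first available base URI, innermost source first."""
    for candidate in (embedded, encapsulating, retrieval, app_default):
        if candidate is not None:
            return candidate
    return None  # no base URI could be established


# The retrieval URI wins here because nothing more specific is available.
print(establish_base_uri(retrieval="http://a/b/c/d;p?q", app_default="file:///"))
```

A result of None corresponds to the situation in which only fragment-only references can be used reliably.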

5.2. Relative Resolution

This section describes an algorithm for converting a URI reference that might be relative to a given base URI into the parsed components of the reference's target. The components can then be recomposed, as described in Section 5.3, to form the target URI. This algorithm provides definitive results that can be used to test the output of other implementations. Applications may implement relative reference resolution by using some other algorithm, provided that the results match what would be given by this one.

5.2.1. Pre-parse the Base URI

The base URI (Base) is established according to the procedure of Section 5.1 and parsed into the five main components described in Section 3. Note that only the scheme component is required to be present in a base URI; the other components may be empty or undefined. A component is undefined if its associated delimiter does not appear in the URI reference; the path component is never undefined, though it may be empty.

Normalization of the base URI, as described in Sections 6.2.2 and 6.2.3, is optional. A URI reference must be transformed to its target URI before it can be normalized.
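
As a rough illustration of the pre-parse step, the following Python sketch splits a URI reference into the five components using the regular expression given in Appendix B of RFC 3986. The names Parts and parse are assumptions made for this example; None is used to represent an undefined component, while the empty string represents a component that is present but empty.

```python
import re
from typing import NamedTuple, Optional

# Regular expression from Appendix B of RFC 3986.
_URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")


class Parts(NamedTuple):
    scheme: Optional[str]     # None means undefined
    authority: Optional[str]  # None means undefined
    path: str                 # never undefined, may be empty
    query: Optional[str]      # None means undefined
    fragment: Optional[str]   # None means undefined


def parse(reference: str) -> Parts:
    """Split a URI reference into its five components."""
    m = _URI_RE.match(reference)  # every group is optional, so this always matches
    return Parts(m.group(2), m.group(4), m.group(5), m.group(7), m.group(9))


print(parse("http://a/b/c/d;p?q"))
print(parse("http://example.com/?"))  # query is empty, not undefined
```

A group that did not take part in the match is reported by the re module as None, which is what allows the delimiter-based notion of "undefined" to be carried through.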

5.2.2. Transform References

For each URI reference (R), the following pseudocode describes an algorithm for transforming R into its target URI (T):

      -- The URI reference is parsed into the five URI components
      --
      (R.scheme, R.authority, R.path, R.query, R.fragment) = parse(R);

      -- A non-strict parser may ignore a scheme in the reference
      -- if it is identical to the base URI's scheme.
      --
      if ((not strict) and (R.scheme == Base.scheme)) then
         undefine(R.scheme);
      endif;

      if defined(R.scheme) then
         T.scheme    = R.scheme;
         T.authority = R.authority;
         T.path      = remove_dot_segments(R.path);
         T.query     = R.query;
      else
         if defined(R.authority) then
            T.authority = R.authority;
            T.path      = remove_dot_segments(R.path);
            T.query     = R.query;
         else
            if (R.path == "") then
               T.path = Base.path;
               if defined(R.query) then
                  T.query = R.query;
               else
                  T.query = Base.query;
               endif;
            else
               if (R.path starts-with "/") then
                  T.path = remove_dot_segments(R.path);
               else
                  T.path = merge(Base.path, R.path);
                  T.path = remove_dot_segments(T.path);
               endif;
               T.query = R.query;
            endif;
            T.authority = Base.authority;
         endif;
         T.scheme = Base.scheme;
      endif;

      T.fragment = R.fragment;
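
For readers who prefer something executable, the pseudocode can be approximated in Python as sketched below. This is not a reference implementation: the Parts tuple, the use of None for an undefined component, and the function names are assumptions made for this example, and the merge and remove_dot_segments helpers follow the routines of Sections 5.2.3 and 5.2.4 in compact form.

```python
from typing import NamedTuple, Optional


class Parts(NamedTuple):
    scheme: Optional[str]     # None means undefined
    authority: Optional[str]
    path: str
    query: Optional[str]
    fragment: Optional[str]


def remove_dot_segments(path: str) -> str:
    """Two-buffer method of Section 5.2.4."""
    inp, out = path, ""
    while inp:
        if inp.startswith("../"):
            inp = inp[3:]
        elif inp.startswith("./"):
            inp = inp[2:]
        elif inp.startswith("/./"):
            inp = inp[2:]
        elif inp == "/.":
            inp = "/"
        elif inp.startswith("/../") or inp == "/..":
            inp = "/" + inp[4:]
            out = out[: out.rfind("/")] if "/" in out else ""
        elif inp in (".", ".."):
            inp = ""
        else:
            cut = inp.find("/", 1)
            cut = len(inp) if cut == -1 else cut
            out, inp = out + inp[:cut], inp[cut:]
    return out


def merge(base: Parts, ref_path: str) -> str:
    """Merge routine of Section 5.2.3."""
    if base.authority is not None and base.path == "":
        return "/" + ref_path
    # Keep everything up to and including the right-most "/" of the base path.
    return base.path[: base.path.rfind("/") + 1] + ref_path


def transform(r: Parts, base: Parts, strict: bool = True) -> Parts:
    """Transform reference R into target T against the pre-parsed base."""
    scheme = r.scheme
    if not strict and scheme == base.scheme:
        scheme = None  # non-strict parsers may ignore a matching scheme
    if scheme is not None:
        return Parts(scheme, r.authority, remove_dot_segments(r.path), r.query, r.fragment)
    if r.authority is not None:
        return Parts(base.scheme, r.authority, remove_dot_segments(r.path), r.query, r.fragment)
    if r.path == "":
        path = base.path
        query = r.query if r.query is not None else base.query
    else:
        if r.path.startswith("/"):
            path = remove_dot_segments(r.path)
        else:
            path = remove_dot_segments(merge(base, r.path))
        query = r.query
    return Parts(base.scheme, base.authority, path, query, r.fragment)


base = Parts("http", "a", "/b/c/d;p", "q", None)         # http://a/b/c/d;p?q
print(transform(Parts(None, None, "../g", None, None), base))
```

The early returns replace the nested if/else structure of the pseudocode but are intended to assign the same values to T in every branch.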

5.2.3. Merge Paths

The pseudocode above refers to a "merge" routine for merging a relative-path reference with the path of the base URI. This is accomplished as follows:

o If the base URI has a defined authority component and an empty path, then return a string consisting of "/" concatenated with the reference's path; otherwise,

o return a string consisting of the reference's path component appended to all but the last segment of the base URI's path (i.e., excluding any characters after the right-most "/" in the base URI path, or excluding the entire base URI path if it does not contain any "/" characters).

5.2.4. Remove Dot Segments

The pseudocode also refers to a "remove_dot_segments" routine for interpreting and removing the special "." and ".." complete path segments from a referenced path. This is done after the path is extracted from a reference, whether or not the path was relative, in order to remove any invalid or extraneous dot-segments prior to forming the target URI. Although there are many ways to accomplish this removal process, we describe a simple method using two string buffers.

1. The input buffer is initialized with the now-appended path components and the output buffer is initialized to the empty string.

2. While the input buffer is not empty, loop as follows:

A. If the input buffer begins with a prefix of "../" or "./", then remove that prefix from the input buffer; otherwise,

B. if the input buffer begins with a prefix of "/./" or "/.", where "." is a complete path segment, then replace that prefix with "/" in the input buffer; otherwise,

C. if the input buffer begins with a prefix of "/../" or "/..", where ".." is a complete path segment, then replace that prefix with "/" in the input buffer and remove the last segment and its preceding "/" (if any) from the output buffer; otherwise,

D. if the input buffer consists only of "." or "..", then remove that from the input buffer; otherwise,

E. move the first path segment in the input buffer to the end of the output buffer, including the initial "/" character (if any) and any subsequent characters up to, but not including, the next "/" character or the end of the input buffer.

3. Finally, the output buffer is returned as the result of remove_dot_segments.

Note that dot-segments are intended for use in URI references to express an identifier relative to the hierarchy of names in the base URI. The remove_dot_segments algorithm respects that hierarchy by removing extra dot-segments rather than treating them as an error or leaving them to be misinterpreted by dereference implementations.

The following illustrates how the above steps are applied for two examples of merged paths, showing the state of the two buffers after each step.

       STEP  OUTPUT BUFFER         INPUT BUFFER

       1 :                         /a/b/c/./../../g
       2E:   /a                    /b/c/./../../g
       2E:   /a/b                  /c/./../../g
       2E:   /a/b/c                /./../../g
       2B:   /a/b/c                /../../g
       2C:   /a/b                  /../g
       2C:   /a                    /g
       2E:   /a/g

       STEP  OUTPUT BUFFER         INPUT BUFFER

       1 :                         mid/content=5/../6
       2E:   mid                   /content=5/../6
       2E:   mid/content=5         /../6
       2C:   mid                   /6
       2E:   mid/6
        

Some applications may find it more efficient to implement the remove_dot_segments algorithm by using two segment stacks rather than strings.
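
The two-buffer steps and the traces above can be reproduced with a short Python generator, sketched here for illustration only. The name trace_remove_dot_segments and the step labels it yields are choices made for this example; the string-based approach is kept on purpose so that each branch maps onto one of the lettered steps.

```python
from typing import Iterator, Tuple


def trace_remove_dot_segments(path: str) -> Iterator[Tuple[str, str, str]]:
    """Yield (step, output buffer, input buffer) after each step."""
    inp, out = path, ""                      # step 1
    yield ("1", out, inp)
    while inp:                               # step 2
        if inp.startswith("../"):            # 2A
            step, inp = "2A", inp[3:]
        elif inp.startswith("./"):           # 2A
            step, inp = "2A", inp[2:]
        elif inp.startswith("/./"):          # 2B: "/./" becomes "/"
            step, inp = "2B", inp[2:]
        elif inp == "/.":                    # 2B: "/." becomes "/"
            step, inp = "2B", "/"
        elif inp.startswith("/../") or inp == "/..":   # 2C
            step, inp = "2C", "/" + inp[4:]
            # Drop the last output segment and its preceding "/" (if any).
            out = out[: out.rfind("/")] if "/" in out else ""
        elif inp in (".", ".."):             # 2D
            step, inp = "2D", ""
        else:                                # 2E: move one segment across
            cut = inp.find("/", 1)
            cut = len(inp) if cut == -1 else cut
            step, out, inp = "2E", out + inp[:cut], inp[cut:]
        yield (step, out, inp)


def remove_dot_segments(path: str) -> str:
    """Step 3: the final output buffer is the result."""
    return list(trace_remove_dot_segments(path))[-1][1]


for step, out, inp in trace_remove_dot_segments("/a/b/c/./../../g"):
    print(f"{step:<2}:   {out:<22}{inp}")
```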

Note: Beware that some older, erroneous implementations will fail to separate a reference's query component from its path component prior to merging the base and reference paths, resulting in an interoperability failure if the query component contains the strings "/../" or "/./".

5.3. Component Recomposition

Parsed URI components can be recomposed to obtain the corresponding URI reference string. Using pseudocode, this would be:

      result = ""

      if defined(scheme) then
         append scheme to result;
         append ":" to result;
      endif;

      if defined(authority) then
         append "//" to result;
         append authority to result;
      endif;

      append path to result;

      if defined(query) then
         append "?" to result;
         append query to result;
      endif;

      if defined(fragment) then
         append "#" to result;
         append fragment to result;
      endif;

      return result;

Note that we are careful to preserve the distinction between a component that is undefined, meaning that its separator was not present in the reference, and a component that is empty, meaning that the separator was present and was immediately followed by the next component separator or the end of the reference.
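
A small Python sketch may help make the undefined-versus-empty distinction concrete. The function name recompose is an assumption of this example, as is the convention that None stands for an undefined component and the empty string for a component that is present but empty.

```python
from typing import Optional


def recompose(
    scheme: Optional[str],
    authority: Optional[str],
    path: str,
    query: Optional[str],
    fragment: Optional[str],
) -> str:
    """Rebuild a URI reference string from its parsed components."""
    result = ""
    if scheme is not None:
        result += scheme + ":"
    if authority is not None:
        result += "//" + authority
    result += path
    if query is not None:       # an empty query still emits its "?"
        result += "?" + query
    if fragment is not None:    # an empty fragment still emits its "#"
        result += "#" + fragment
    return result


print(recompose("http", "example.com", "/", None, None))  # undefined query
print(recompose("http", "example.com", "/", "", None))    # empty query
```

Testing against None rather than for truthiness is the point of the sketch: a check such as "if query:" would silently drop the "?" of an empty query.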

5.4. Reference Resolution Examples

Within a representation with a well defined base URI of

      http://a/b/c/d;p?q

a relative reference is transformed to its target URI as follows.

5.4.1. Normal Examples
      "g:h"           =  "g:h"
      "g"             =  "http://a/b/c/g"
      "./g"           =  "http://a/b/c/g"
      "g/"            =  "http://a/b/c/g/"
      "/g"            =  "http://a/g"
      "//g"           =  "http://g"
      "?y"            =  "http://a/b/c/d;p?y"
      "g?y"           =  "http://a/b/c/g?y"
      "#s"            =  "http://a/b/c/d;p?q#s"
      "g#s"           =  "http://a/b/c/g#s"
      "g?y#s"         =  "http://a/b/c/g?y#s"
      ";x"            =  "http://a/b/c/;x"
      "g;x"           =  "http://a/b/c/g;x"
      "g;x?y#s"       =  "http://a/b/c/g;x?y#s"
      ""              =  "http://a/b/c/d;p?q"
      "."             =  "http://a/b/c/"
      "./"            =  "http://a/b/c/"
      ".."            =  "http://a/b/"
      "../"           =  "http://a/b/"
      "../g"          =  "http://a/b/g"
      "../.."         =  "http://a/"
      "../../"        =  "http://a/"
      "../../g"       =  "http://a/g"
        
5.4.2. Abnormal Examples

Although the following abnormal examples are unlikely to occur in normal practice, all URI parsers should be capable of resolving them consistently. Each example uses the same base as that above.

Parsers must be careful in handling cases where there are more ".." segments in a relative-path reference than there are hierarchical levels in the base URI's path. Note that the ".." syntax cannot be used to change the authority component of a URI.

      "../../../g"    =  "http://a/g"
      "../../../../g" =  "http://a/g"
        

Similarly, parsers must remove the dot-segments "." and ".." when they are complete components of a path, but not when they are only part of a segment.

      "/./g"          =  "http://a/g"
      "/../g"         =  "http://a/g"
      "g."            =  "http://a/b/c/g."
      ".g"            =  "http://a/b/c/.g"
      "g.."           =  "http://a/b/c/g.."
      "..g"           =  "http://a/b/c/..g"
        

Less likely are cases where the relative reference uses unnecessary or nonsensical forms of the "." and ".." complete path segments.

      "./../g"        =  "http://a/b/g"
      "./g/."         =  "http://a/b/c/g/"
      "g/./h"         =  "http://a/b/c/g/h"
      "g/../h"        =  "http://a/b/c/h"
      "g;x=1/./y"     =  "http://a/b/c/g;x=1/y"
      "g;x=1/../y"    =  "http://a/b/c/y"
        

Some applications fail to separate the reference's query and/or fragment components from the path component before merging it with the base path and removing dot-segments. This error is rarely noticed, as typical usage of a fragment never includes the hierarchy ("/") character and the query component is not normally used within relative references.

      "g?y/./x"       =  "http://a/b/c/g?y/./x"
      "g?y/../x"      =  "http://a/b/c/g?y/../x"
      "g#s/./x"       =  "http://a/b/c/g#s/./x"
      "g#s/../x"      =  "http://a/b/c/g#s/../x"
        

Some parsers allow the scheme name to be present in a relative reference if it is the same as the base URI scheme. This is considered to be a loophole in prior specifications of partial URI [RFC1630]. Its use should be avoided but is allowed for backward compatibility.

      "http:g"        =  "http:g"         ; for strict parsers
                      /  "http://a/b/c/g" ; for backward compatibility
        
6. Normalization and Comparison

One of the most common operations on URIs is simple comparison: determining whether two URIs are equivalent without using the URIs to access their respective resource(s). A comparison is performed every time a response cache is accessed, a browser checks its history to color a link, or an XML parser processes tags within a namespace. Extensive normalization prior to comparison of URIs is often used by spiders and indexing engines to prune a search space or to reduce duplication of request actions and response storage.

URI comparison is performed for some particular purpose. Protocols or implementations that compare URIs for different purposes will often be subject to differing design trade-offs with regard to how much effort should be spent in reducing aliased identifiers. This section describes various methods that may be used to compare URIs, the trade-offs between them, and the types of applications that might use them.

6.1. Equivalence

Because URIs exist to identify resources, presumably they should be considered equivalent when they identify the same resource. However, this definition of equivalence is not of much practical use, as there is no way for an implementation to compare two resources unless it has full knowledge or control of them. For this reason, determination of equivalence or difference of URIs is based on string comparison, perhaps augmented by reference to additional rules provided by URI scheme definitions. We use the terms "different" and "equivalent" to describe the possible outcomes of such comparisons, but there are many application-dependent versions of equivalence.

Even though it is possible to determine that two URIs are equivalent, URI comparison is not sufficient to determine whether two URIs identify different resources. For example, an owner of two different domain names could decide to serve the same resource from both, resulting in two different URIs. Therefore, comparison methods are designed to minimize false negatives while strictly avoiding false positives.

In testing for equivalence, applications should not directly compare relative references; the references should be converted to their respective target URIs before comparison. When URIs are compared to select (or avoid) a network action, such as retrieval of a representation, fragment components (if any) should be excluded from the comparison.
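
As an illustration of these two precautions, the sketch below resolves each reference against its base and drops the fragment before comparing. It relies on urljoin and urldefrag from Python's standard urllib.parse module as a stand-in for the resolution algorithm of Section 5.2; that library's behavior may differ from this specification in edge cases, and the helper name retrieval_key is hypothetical.

```python
from urllib.parse import urldefrag, urljoin


def retrieval_key(base: str, reference: str) -> str:
    """Resolve a reference to its target URI, then strip any fragment."""
    target = urljoin(base, reference)   # convert to the target URI first
    return urldefrag(target).url        # fragment is irrelevant to retrieval


# Two textually different relative references that select the same
# network action once resolved against their respective bases.
a = retrieval_key("http://a/b/c/d", "g#s")
b = retrieval_key("http://a/b/x/y", "../c/g#t")
print(a == b)
```

Comparing the raw strings "g#s" and "../c/g#t" would report a difference even though both lead to the same retrieval.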

6.2. Comparison Ladder

A variety of methods are used in practice to test URI equivalence. These methods fall into a range, distinguished by the amount of processing required and the degree to which the probability of false negatives is reduced. As noted above, false negatives cannot be eliminated. In practice, their probability can be reduced, but this reduction requires more processing and is not cost-effective for all applications.

If this range of comparison practices is considered as a ladder, the following discussion will climb the ladder, starting with practices that are cheap but have a relatively higher chance of producing false negatives, and proceeding to those that have higher computational cost and lower risk of false negatives.

6.2.1. Simple String Comparison

If two URIs, when considered as character strings, are identical, then it is safe to conclude that they are equivalent. This type of equivalence test has very low computational cost and is in wide use in a variety of applications, particularly in the domain of parsing.

Testing strings for equivalence requires some basic precautions. This procedure is often referred to as "bit-for-bit" or "byte-for-byte" comparison, which is potentially misleading. Testing strings for equality is normally based on pair comparison of the characters that make up the strings, starting from the first and proceeding until both strings are exhausted and all characters are found to be equal, until a pair of characters compares unequal, or until one of the strings is exhausted before the other.

This character comparison requires that each pair of characters be put in comparable form. For example, should one URI be stored in a byte array in EBCDIC encoding and the second in a Java String object (UTF-16), bit-for-bit comparisons applied naively will produce errors. It is better to speak of equality on a character-for-character basis rather than on a byte-for-byte or bit-for-bit basis. In practical terms, character-by-character comparisons should be done codepoint-by-codepoint after conversion to a common character encoding.
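
The EBCDIC and UTF-16 case can be demonstrated with a few lines of Python. The sketch assumes the "cp037" codec as one representative EBCDIC code page and "utf-16-le" for the UTF-16 form; the helper name same_characters is invented for this example.

```python
def same_characters(a: bytes, a_encoding: str, b: bytes, b_encoding: str) -> bool:
    """Compare two stored URIs character by character.

    Both byte sequences are first decoded to Python strings, which
    compare code point by code point.
    """
    return a.decode(a_encoding) == b.decode(b_encoding)


uri = "http://example.com/a"
stored_ebcdic = uri.encode("cp037")      # one URI held as EBCDIC bytes
stored_utf16 = uri.encode("utf-16-le")   # the other held as UTF-16 code units

print(stored_ebcdic == stored_utf16)     # naive byte comparison
print(same_characters(stored_ebcdic, "cp037", stored_utf16, "utf-16-le"))
```

The naive comparison fails although the two values hold the same characters, which is why the common encoding has to be established before the strings are compared.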

False negatives are caused by the production and use of URI aliases. Unnecessary aliases can be reduced, regardless of the comparison method, by consistently providing URI references in an already-normalized form (i.e., a form identical to what would be produced after normalization is applied, as described below).

Protocols and data formats often limit some URI comparisons to simple string comparison, based on the theory that people and implementations will, in their own best interest, be consistent in providing URI references, or at least consistent enough to negate any efficiency that might be obtained from further normalization.

6.2.2. Syntax-Based Normalization

Implementations may use logic based on the definitions provided by this specification to reduce the probability of false negatives. This processing is moderately higher in cost than character-for-character string comparison. For example, an application using this approach could reasonably consider the following two URIs equivalent:

      example://a/b/c/%7Bfoo%7D
      eXAMPLE://a/./b/../b/%63/%7bfoo%7d
        

Web user agents, such as browsers, typically apply this type of URI normalization when determining whether a cached response is available. Syntax-based normalization includes such techniques as case normalization, percent-encoding normalization, and removal of dot-segments.

6.2.2.1. Case Normalization

For all URIs, the hexadecimal digits within a percent-encoding triplet (e.g., "%3a" versus "%3A") are case-insensitive and therefore should be normalized to use uppercase letters for the digits A-F.

When a URI uses components of the generic syntax, the component syntax equivalence rules always apply; namely, that the scheme and host are case-insensitive and therefore should be normalized to lowercase. For example, the URI <HTTP://www.EXAMPLE.com/> is equivalent to <http://www.example.com/>. The other generic syntax components are assumed to be case-sensitive unless specifically defined otherwise by the scheme (see Section 6.2.3).

6.2.2.2. Percent-Encoding Normalization

The percent-encoding mechanism (Section 2.1) is a frequent source of variance among otherwise identical URIs. In addition to the case normalization issue noted above, some URI producers percent-encode octets that do not require percent-encoding, resulting in URIs that are equivalent to their non-encoded counterparts. These URIs should be normalized by decoding any percent-encoded octet that corresponds to an unreserved character, as described in Section 2.3.
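
Case normalization and percent-encoding normalization can be combined in a short Python sketch such as the one below. It is a simplified illustration rather than a complete normalizer: the function and pattern names are assumptions of this example, the input is assumed to be a syntactically valid URI, and dot-segment removal (Section 6.2.2.3) is deliberately left out.

```python
import re

_UNRESERVED = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)
_PCT = re.compile(r"%([0-9A-Fa-f]{2})")
_PREFIX = re.compile(r"^([A-Za-z][A-Za-z0-9+.-]*):(//([^/?#]*))?")


def _fix_triplet(m: "re.Match[str]") -> str:
    """Decode a triplet for an unreserved character, else uppercase its hex."""
    ch = chr(int(m.group(1), 16))
    return ch if ch in _UNRESERVED else "%" + m.group(1).upper()


def normalize_case_and_pct(uri: str) -> str:
    uri = _PCT.sub(_fix_triplet, uri)
    m = _PREFIX.match(uri)
    if m is None:
        return uri                       # no scheme: nothing to lowercase
    scheme, authority = m.group(1), m.group(3)
    result = scheme.lower() + ":"
    if authority is not None:
        # Only the host (and port) is lowercased; userinfo is left alone.
        userinfo, at, hostport = authority.rpartition("@")
        # Lowercasing also hits hex digits of triplets, so fix them again.
        result += "//" + userinfo + at + _PCT.sub(_fix_triplet, hostport.lower())
    return result + uri[m.end():]


print(normalize_case_and_pct("eXAMPLE://a/b/%63/%7bfoo%7d"))
print(normalize_case_and_pct("HTTP://www.EXAMPLE.com/"))
```

The second pass over the host exists only because lowercasing the host would otherwise undo the uppercase hexadecimal digits of any triplet it contains.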

6.2.2.3. Path Segment Normalization

The complete path segments "." and ".." are intended only for use within relative references (Section 4.1) and are removed as part of the reference resolution process (Section 5.2). However, some deployed implementations incorrectly assume that reference resolution is not necessary when the reference is already a URI and thus fail to remove dot-segments when they occur in non-relative paths. URI normalizers should remove dot-segments by applying the remove_dot_segments algorithm to the path, as described in Section 5.2.4.

6.2.3. Scheme-Based Normalization

The syntax and semantics of URIs vary from scheme to scheme, as described by the defining specification for each scheme. Implementations may use scheme-specific rules, at further processing cost, to reduce the probability of false negatives. For example, because the "http" scheme makes use of an authority component, has a default port of "80", and defines an empty path to be equivalent to "/", the following four URIs are equivalent:

      http://example.com
      http://example.com/
      http://example.com:/
      http://example.com:80/
        

In general, a URI that uses the generic syntax for authority with an empty path should be normalized to a path of "/". Likewise, an explicit ":port", for which the port is empty or the default for the scheme, is equivalent to one where the port and its ":" delimiter are elided and thus should be removed by scheme-based normalization. For example, the second URI above is the normal form for the "http" scheme.
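
For the "http" scheme, the two rules just described (an empty path becomes "/", and an empty or default port is dropped together with its ":") might be sketched in Python as follows. The function name normalize_http is hypothetical, and the sketch assumes that syntax-based normalization has already lowercased the scheme; it does not attempt any other http-specific rule.

```python
import re

# authority, path, and the remainder (query and/or fragment, delimiters kept)
_HTTP = re.compile(r"^http://([^/?#]*)([^?#]*)(.*)$", re.DOTALL)


def normalize_http(uri: str) -> str:
    m = _HTTP.match(uri)
    if m is None:
        return uri                        # not an "http" URI with an authority
    authority, path, rest = m.groups()
    if authority.endswith(":"):           # explicit but empty port
        authority = authority[:-1]
    elif authority.endswith(":80"):       # default port for "http"
        authority = authority[:-3]
    if path == "":                        # empty path is equivalent to "/"
        path = "/"
    return "http://" + authority + path + rest


for candidate in (
    "http://example.com",
    "http://example.com/",
    "http://example.com:/",
    "http://example.com:80/",
):
    print(normalize_http(candidate))
```

Delimiters whose component is empty, such as a trailing "?", are carried over unchanged in the remainder, since the sketch has no license from the scheme to remove them.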

Another case where normalization varies by scheme is in the handling of an empty authority component or empty host subcomponent. For many scheme specifications, an empty authority or host is considered an error; for others, it is considered equivalent to "localhost" or the end-user's host. When a scheme defines a default for authority and a URI reference to that default is desired, the reference should be normalized to an empty authority for the sake of uniformity, brevity, and internationalization. If, however, either the userinfo or port subcomponents are non-empty, then the host should be given explicitly even if it matches the default.

Normalization should not remove delimiters when their associated component is empty unless licensed to do so by the scheme specification. For example, the URI "http://example.com/?" cannot be assumed to be equivalent to any of the examples above. Likewise, the presence or absence of delimiters within a userinfo subcomponent is usually significant to its interpretation. The fragment component is not subject to any scheme-based normalization; thus, two URIs that differ only by the suffix "#" are considered different regardless of the scheme.

Some schemes define additional subcomponents that consist of case-insensitive data, giving an implicit license to normalizers to convert this data to a common case (e.g., all lowercase). For example, URI schemes that define a subcomponent of path to contain an Internet hostname, such as the "mailto" URI scheme, cause that subcomponent to be case-insensitive and thus subject to case normalization (e.g., "mailto:Joe@Example.COM" is equivalent to "mailto:Joe@example.com", even though the generic syntax considers the path component to be case-sensitive).
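
The "mailto" case can be sketched as follows. This is a minimal Python illustration, not a complete "mailto" implementation: the function name normalize_mailto is hypothetical, and the sketch assumes a single address whose host part contains no percent-encoded octets.

```python
def normalize_mailto(uri):
    """Lowercase only the host part of a single-address mailto URI."""
    scheme, colon, rest = uri.partition(":")
    if not colon or scheme.lower() != "mailto":
        return uri  # other schemes are outside this sketch
    # Set the fragment and query aside so that their case is preserved.
    body, hash_mark, fragment = rest.partition("#")
    address, qmark, query = body.partition("?")
    local, at, domain = address.rpartition("@")
    if not at:
        return uri
    # The local part stays case-sensitive; only the hostname is folded.
    return (scheme + ":" + local + "@" + domain.lower()
            + qmark + query + hash_mark + fragment)
```

The local part ("Joe") is left untouched because only the hostname subcomponent is defined to be case-insensitive.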

Other scheme-specific normalizations are possible.

6.2.4. Protocol-Based Normalization

Substantial effort to reduce the incidence of false negatives is often cost-effective for web spiders. Therefore, they implement even more aggressive techniques in URI comparison. For example, if they observe that a URI such as

      http://example.com/data
        

redirects to a URI differing only in the trailing slash

      http://example.com/data/
        

they will likely regard the two as equivalent in the future. This kind of technique is only appropriate when equivalence is clearly indicated by both the result of accessing the resources and the common conventions of their scheme's dereference algorithm (in this case, use of redirection by HTTP origin servers to avoid problems with relative references).

7. Security Considerations

A URI does not in itself pose a security threat. However, as URIs are often used to provide a compact set of instructions for access to network resources, care must be taken to properly interpret the data within a URI, to prevent that data from causing unintended access, and to avoid including data that should not be revealed in plain text.

7.1. Reliability and Consistency

There is no guarantee that once a URI has been used to retrieve information, the same information will be retrievable by that URI in the future. Nor is there any guarantee that the information retrievable via that URI in the future will be observably similar to that retrieved in the past. The URI syntax does not constrain how a given scheme or authority apportions its namespace or maintains it over time. Such guarantees can only be obtained from the person(s) controlling that namespace and the resource in question. A specific URI scheme may define additional semantics, such as name persistence, if those semantics are required of all naming authorities for that scheme.

7.2. Malicious Construction

It is sometimes possible to construct a URI so that an attempt to perform a seemingly harmless, idempotent operation, such as the retrieval of a representation, will in fact cause a possibly damaging remote operation. The unsafe URI is typically constructed by specifying a port number other than that reserved for the network protocol in question. The client unwittingly contacts a site running a different protocol service, and data within the URI contains instructions that, when interpreted according to this other protocol, cause an unexpected operation. A frequent example of such abuse has been the use of a protocol-based scheme with a port component of "25", thereby fooling user agent software into sending an unintended or impersonating message via an SMTP server.

Applications should prevent dereference of a URI that specifies a TCP port number within the "well-known port" range (0 - 1023) unless the protocol being used to dereference that URI is compatible with the protocol expected on that well-known port. Although IANA maintains a registry of well-known ports, applications should make such restrictions user-configurable to avoid preventing the deployment of new services.
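
One possible shape for such a check is sketched below in Python. The table ALLOWED_WELL_KNOWN_PORTS and the function may_dereference are hypothetical names chosen for this illustration; the table stands in for whatever user-configurable policy an application provides.

```python
# Hypothetical, user-configurable policy: for each scheme, the well-known
# ports on which a compatible protocol is expected to be listening.
ALLOWED_WELL_KNOWN_PORTS = {
    "http": {80},
    "https": {443},
    "ftp": {21},
}


def may_dereference(scheme, port, allowed=ALLOWED_WELL_KNOWN_PORTS):
    """Return False when an explicit well-known port does not fit the scheme."""
    if port is None or port > 1023:
        return True  # no explicit port, or outside the well-known range
    return port in allowed.get(scheme.lower(), set())
```

With this policy, an "http" URI naming port 25 is refused, whereas ports 80 and 8080 are accepted. Passing a different table as the third argument models a user who has relaxed or tightened the restriction.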

When a URI contains percent-encoded octets that match the delimiters for a given resolution or dereference protocol (for example, CR and LF characters for the TELNET protocol), these percent-encodings must not be decoded before transmission across that protocol. Transfer of the percent-encoding, which might violate the protocol, is less harmful than allowing decoded octets to be interpreted as additional operations or parameters, perhaps triggering an unexpected and possibly harmful remote operation.

7.3. Back-End Transcoding

When a URI is dereferenced, the data within it is often parsed by both the user agent and one or more servers. In HTTP, for example, a typical user agent will parse a URI into its five major components, access the authority's server, and send it the data within the authority, path, and query components. A typical server will take that information, parse the path into segments and the query into key/value pairs, and then invoke implementation-specific handlers to respond to the request. As a result, a common security concern for server implementations that handle a URI, either as a whole or split into separate components, is proper interpretation of the octet data represented by the characters and percent-encodings within that URI.

Percent-encoded octets must be decoded at some point during the dereference process. Applications must split the URI into its components and subcomponents prior to decoding the octets, as otherwise the decoded octets might be mistaken for delimiters. Security checks of the data within a URI should be applied after decoding the octets. Note, however, that the "%00" percent-encoding (NUL) may require special handling and should be rejected if the application is not expecting to receive raw data within a component.
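
The ordering requirement can be illustrated with a short Python sketch. The function name decode_path_segments is an assumption for this example; unquote is the percent-decoding routine from Python's standard urllib.parse module, which here interprets the decoded octets as UTF-8.

```python
from urllib.parse import unquote


def decode_path_segments(path):
    """Split a path on "/" first, then percent-decode each segment."""
    segments = []
    for raw in path.split("/"):
        # This sketch assumes the application does not expect raw data,
        # so an encoded NUL is rejected before any decoding takes place.
        if "%00" in raw:
            raise ValueError("NUL octet not expected in a path segment")
        segments.append(unquote(raw))
    return segments
```

For the path "/a%2Fb/c", this yields the segments "", "a/b", and "c". Decoding before splitting would instead produce "/a/b/c" and a spurious extra segment. Any security checks on segment values would then be applied to the decoded strings returned here.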

Special care should be taken when the URI path interpretation process involves the use of a back-end file system or related system functions. File systems typically assign an operational meaning to special characters, such as the "/", "\", ":", "[", and "]" characters, and to special device names like ".", "..", "...", "aux", "lpt", etc. In some cases, merely testing for the existence of such a name will cause the operating system to pause or invoke unrelated system calls, leading to significant security concerns regarding denial of service and unintended data transfer. It would be impossible for this specification to list all such significant characters and device names. Implementers should research the reserved names and characters for the types of storage device that may be attached to their applications and restrict the use of data obtained from URI components accordingly.

7.4. Rare IP Address Formats

Although the URI syntax for IPv4address only allows the common dotted-decimal form of IPv4 address literal, many implementations that process URIs make use of platform-dependent system routines, such as gethostbyname() and inet_aton(), to translate the string literal to an actual IP address. Unfortunately, such system routines often allow and process a much larger set of formats than those described in Section 3.2.2.

For example, many implementations allow dotted forms of three numbers, wherein the last part is interpreted as a 16-bit quantity and placed in the right-most two bytes of the network address (e.g., a Class B network). Likewise, a dotted form of two numbers means that the last part is interpreted as a 24-bit quantity and placed in the right-most three bytes of the network address (Class A), and a single number (without dots) is interpreted as a 32-bit quantity and stored directly in the network address. Adding further to the confusion, some implementations allow each dotted part to be interpreted as decimal, octal, or hexadecimal, as specified in the C language (i.e., a leading 0x or 0X implies hexadecimal; a leading 0 implies octal; otherwise, the number is interpreted as decimal).

These additional IP address formats are not allowed in the URI syntax due to differences between platform implementations. However, they can become a security concern if an application attempts to filter access to resources based on the IP address in string literal format. If this filtering is performed, literals should be converted to numeric form and filtered based on the numeric value, and not on a prefix or suffix of the string form.
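
The conversion to numeric form might look like the following Python sketch. It mimics only the formats described in this section (one to four dotted parts, each decimal, octal, or hexadecimal); the name legacy_ipv4_to_int is hypothetical, and actual platform routines may accept more or fewer forms.

```python
import re

_HEX = re.compile(r"0[xX][0-9a-fA-F]+")
_OCT = re.compile(r"0[0-7]*")
_DEC = re.compile(r"[1-9][0-9]*")


def _parse_part(text):
    """Parse one dotted part using the C-language radix conventions."""
    if _HEX.fullmatch(text):
        return int(text, 16)
    if _OCT.fullmatch(text):
        return int(text, 8)
    if _DEC.fullmatch(text):
        return int(text, 10)
    raise ValueError("not a numeric address part: %r" % text)


def legacy_ipv4_to_int(literal):
    """Return the 32-bit value of a 1- to 4-part IPv4 address literal."""
    parts = literal.split(".")
    if not 1 <= len(parts) <= 4:
        raise ValueError("expected one to four dotted parts")
    *leading, last = [_parse_part(p) for p in parts]
    if any(n > 0xFF for n in leading):
        raise ValueError("leading part does not fit in one byte")
    remaining_bytes = 4 - len(leading)  # the last part fills all of these
    if last >= 1 << (8 * remaining_bytes):
        raise ValueError("last part is too large")
    value = 0
    for n in leading:
        value = (value << 8) | n
    return (value << (8 * remaining_bytes)) | last
```

Under this sketch, "10.0.0.1", "10.1", "012.0.0.1", "0x0a.0.0.1", and "167772161" all yield the same number, so a filter written as (legacy_ipv4_to_int(literal) >> 24) == 10 catches every spelling of an address in network 10, where a string-prefix test on "10." would not.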

7.5. Sensitive Information

URI producers should not provide a URI that contains a username or password that is intended to be secret. URIs are frequently displayed by browsers, stored in clear text bookmarks, and logged by user agent history and intermediary applications (proxies). A password appearing within the userinfo component is deprecated and should be considered an error (or simply ignored) except in those rare cases where the 'password' parameter is intended to be public.

7.6. Semantic Attacks

Because the userinfo subcomponent is rarely used and appears before the host in the authority component, it can be used to construct a URI intended to mislead a human user by appearing to identify one (trusted) naming authority while actually identifying a different authority hidden behind the noise. For example

      ftp://cnn.example.com&story=breaking_news@10.0.0.1/top_story.htm
        

might lead a human user to assume that the host is 'cnn.example.com', whereas it is actually '10.0.0.1'. Note that a misleading userinfo subcomponent could be much longer than the example above.

A misleading URI, such as that above, is an attack on the user's preconceived notions about the meaning of a URI rather than an attack on the software itself. User agents may be able to reduce the impact of such attacks by distinguishing the various components of the URI when they are rendered, such as by using a different color or tone to render userinfo if any is present, though there is no panacea. More information on URI-based semantic attacks can be found in [Siedzik].
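
A user agent that wishes to render the components distinctly first has to separate them. The Python sketch below shows one way to do so, using the expression from Appendix B; the function name split_authority is hypothetical, and how the parts are then displayed is left to the user agent.

```python
import re

# Component-splitting expression from Appendix B.
URI_RE = re.compile(r'^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?')


def split_authority(uri):
    """Return (userinfo, hostport), or None if the URI has no authority."""
    authority = URI_RE.match(uri).group(4)
    if authority is None:
        return None
    # Everything before the last "@" is userinfo, never part of the host.
    userinfo, at, hostport = authority.rpartition("@")
    return (userinfo if at else None), hostport
```

Applied to the example above, this returns "cnn.example.com&story=breaking_news" as the userinfo and "10.0.0.1" as the host, which makes the actual naming authority available for separate rendering.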

8. IANA Considerations

URI scheme names, as defined by <scheme> in Section 3.1, form a registered namespace that is managed by IANA according to the procedures defined in [BCP35]. No IANA actions are required by this document.

9. Acknowledgements

This specification is derived from RFC 2396 [RFC2396], RFC 1808 [RFC1808], and RFC 1738 [RFC1738]; the acknowledgements in those documents still apply. It also incorporates the update (with corrections) for IPv6 literals in the host syntax, as defined by Robert M. Hinden, Brian E. Carpenter, and Larry Masinter in [RFC2732]. In addition, contributions by Gisle Aas, Reese Anschultz, Daniel Barclay, Tim Bray, Mike Brown, Rob Cameron, Jeremy Carroll, Dan Connolly, Adam M. Costello, John Cowan, Jason Diamond, Martin Duerst, Stefan Eissing, Clive D.W. Feather, Al Gilman, Tony Hammond, Elliotte Harold, Pat Hayes, Henry Holtzman, Ian B. Jacobs, Michael Kay, John C. Klensin, Graham Klyne, Dan Kohn, Bruce Lilly, Andrew Main, Dave McAlpin, Ira McDonald, Michael Mealling, Ray Merkert, Stephen Pollei, Julian Reschke, Tomas Rokicki, Miles Sabin, Kai Schaetzl, Mark Thomson, Ronald Tschalaer, Norm Walsh, Marc Warne, Stuart Williams, and Henry Zongaro are gratefully acknowledged.

10. References
10.1. Normative References

[ASCII] American National Standards Institute, "Coded Character Set -- 7-bit American Standard Code for Information Interchange", ANSI X3.4, 1986.

[RFC2234] Crocker, D. and P. Overell, "Augmented BNF for Syntax Specifications: ABNF", RFC 2234, November 1997.

[STD63] Yergeau, F., "UTF-8, a transformation format of ISO 10646", STD 63, RFC 3629, November 2003.

[UCS] International Organization for Standardization, "Information Technology - Universal Multiple-Octet Coded Character Set (UCS)", ISO/IEC 10646:2003, December 2003.

10.2. Informative References

[BCP19] Freed, N. and J. Postel, "IANA Charset Registration Procedures", BCP 19, RFC 2978, October 2000.

[BCP35] Petke, R. and I. King, "Registration Procedures for URL Scheme Names", BCP 35, RFC 2717, November 1999.

[RFC0952] Harrenstien, K., Stahl, M., and E. Feinler, "DoD Internet host table specification", RFC 952, October 1985.

[RFC1034] Mockapetris, P., "Domain names - concepts and facilities", STD 13, RFC 1034, November 1987.

[RFC1123] Braden, R., "Requirements for Internet Hosts - Application and Support", STD 3, RFC 1123, October 1989.

[RFC1535] Gavron, E., "A Security Problem and Proposed Correction With Widely Deployed DNS Software", RFC 1535, October 1993.

[RFC1630] Berners-Lee, T., "Universal Resource Identifiers in WWW: A Unifying Syntax for the Expression of Names and Addresses of Objects on the Network as used in the World-Wide Web", RFC 1630, June 1994.

[RFC1736] Kunze, J., "Functional Recommendations for Internet Resource Locators", RFC 1736, February 1995.

[RFC1737] Sollins, K. and L. Masinter, "Functional Requirements for Uniform Resource Names", RFC 1737, December 1994.

[RFC1738] Berners-Lee, T., Masinter, L., and M. McCahill, "Uniform Resource Locators (URL)", RFC 1738, December 1994.

[RFC1808] Fielding, R., "Relative Uniform Resource Locators", RFC 1808, June 1995.

[RFC2046] Freed, N. and N. Borenstein, "Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types", RFC 2046, November 1996.

[RFC2141] Moats, R., "URN Syntax", RFC 2141, May 1997.

[RFC2396] Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform Resource Identifiers (URI): Generic Syntax", RFC 2396, August 1998.

[RFC2518] Goland, Y., Whitehead, E., Faizi, A., Carter, S., and D. Jensen, "HTTP Extensions for Distributed Authoring -- WEBDAV", RFC 2518, February 1999.

[RFC2557] Palme, J., Hopmann, A., and N. Shelness, "MIME Encapsulation of Aggregate Documents, such as HTML (MHTML)", RFC 2557, March 1999.

[RFC2718] Masinter, L., Alvestrand, H., Zigmond, D., and R. Petke, "Guidelines for new URL Schemes", RFC 2718, November 1999.

[RFC2732] Hinden, R., Carpenter, B., and L. Masinter, "Format for Literal IPv6 Addresses in URL's", RFC 2732, December 1999.

[RFC3305] Mealling, M. and R. Denenberg, "Report from the Joint W3C/IETF URI Planning Interest Group: Uniform Resource Identifiers (URIs), URLs, and Uniform Resource Names (URNs): Clarifications and Recommendations", RFC 3305, August 2002.

[RFC3490] Faltstrom, P., Hoffman, P., and A. Costello, "Internationalizing Domain Names in Applications (IDNA)", RFC 3490, March 2003.

[RFC3513] Hinden, R. and S. Deering, "Internet Protocol Version 6 (IPv6) Addressing Architecture", RFC 3513, April 2003.

[Siedzik] Siedzik, R., "Semantic Attacks: What's in a URL?", April 2001, <http://www.giac.org/practical/gsec/Richard_Siedzik_GSEC.pdf>.

Appendix A. Collected ABNF for URI
   URI           = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
        
   hier-part     = "//" authority path-abempty
                 / path-absolute
                 / path-rootless
                 / path-empty
        
   URI-reference = URI / relative-ref
        

   absolute-URI  = scheme ":" hier-part [ "?" query ]

   relative-ref  = relative-part [ "?" query ] [ "#" fragment ]

   relative-part = "//" authority path-abempty
                 / path-absolute
                 / path-noscheme
                 / path-empty
        
   scheme        = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
        
   authority     = [ userinfo "@" ] host [ ":" port ]
   userinfo      = *( unreserved / pct-encoded / sub-delims / ":" )
   host          = IP-literal / IPv4address / reg-name
   port          = *DIGIT
        

   IP-literal    = "[" ( IPv6address / IPvFuture ) "]"

   IPvFuture     = "v" 1*HEXDIG "." 1*( unreserved / sub-delims / ":" )
        
   IPv6address   =                            6( h16 ":" ) ls32
                 /                       "::" 5( h16 ":" ) ls32
                 / [               h16 ] "::" 4( h16 ":" ) ls32
                 / [ *1( h16 ":" ) h16 ] "::" 3( h16 ":" ) ls32
                 / [ *2( h16 ":" ) h16 ] "::" 2( h16 ":" ) ls32
                 / [ *3( h16 ":" ) h16 ] "::"    h16 ":"   ls32
                 / [ *4( h16 ":" ) h16 ] "::"              ls32
                 / [ *5( h16 ":" ) h16 ] "::"              h16
                 / [ *6( h16 ":" ) h16 ] "::"
        
   h16           = 1*4HEXDIG
   ls32          = ( h16 ":" h16 ) / IPv4address
   IPv4address   = dec-octet "." dec-octet "." dec-octet "." dec-octet
        
   dec-octet     = DIGIT                 ; 0-9
                 / %x31-39 DIGIT         ; 10-99
                 / "1" 2DIGIT            ; 100-199
                 / "2" %x30-34 DIGIT     ; 200-249
                 / "25" %x30-35          ; 250-255
        
   reg-name      = *( unreserved / pct-encoded / sub-delims )
        
   path          = path-abempty    ; begins with "/" or is empty
                 / path-absolute   ; begins with "/" but not "//"
                 / path-noscheme   ; begins with a non-colon segment
                 / path-rootless   ; begins with a segment
                 / path-empty      ; zero characters
        
   path-abempty  = *( "/" segment )
   path-absolute = "/" [ segment-nz *( "/" segment ) ]
   path-noscheme = segment-nz-nc *( "/" segment )
   path-rootless = segment-nz *( "/" segment )
   path-empty    = 0<pchar>
        
   segment       = *pchar
   segment-nz    = 1*pchar
   segment-nz-nc = 1*( unreserved / pct-encoded / sub-delims / "@" )
                 ; non-zero-length segment without any colon ":"
        
   pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"
        
   query         = *( pchar / "/" / "?" )
        
   fragment      = *( pchar / "/" / "?" )
        

   pct-encoded   = "%" HEXDIG HEXDIG

   unreserved    = ALPHA / DIGIT / "-" / "." / "_" / "~"
   reserved      = gen-delims / sub-delims
   gen-delims    = ":" / "/" / "?" / "#" / "[" / "]" / "@"
   sub-delims    = "!" / "$" / "&" / "'" / "(" / ")"
                 / "*" / "+" / "," / ";" / "="
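
As an informal companion to the grammar above, the dec-octet and IPv4address rules can be transcribed into a regular expression. The following Python sketch is one such transcription; the names DEC_OCTET, IPV4ADDRESS, and is_ipv4address are assumptions for this example. The alternatives are listed longest first, which does not change the set of strings accepted when the whole input must match.

```python
import re

# dec-octet: 250-255 / 200-249 / 100-199 / 10-99 / 0-9
DEC_OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])"

# IPv4address = dec-octet "." dec-octet "." dec-octet "." dec-octet
IPV4ADDRESS = re.compile(DEC_OCTET + r"(?:\." + DEC_OCTET + r"){3}")


def is_ipv4address(text):
    """True if the whole string matches the IPv4address rule."""
    return IPV4ADDRESS.fullmatch(text) is not None
```

Strings such as "256.0.0.1" or "01.2.3.4" are rejected by this expression, since no dec-octet alternative allows a value above 255 or a leading zero.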
        
Appendix B. Parsing a URI Reference with a Regular Expression

As the "first-match-wins" algorithm is identical to the "greedy" disambiguation method used by POSIX regular expressions, it is natural and commonplace to use a regular expression for parsing the potential five components of a URI reference.

The following line is the regular expression for breaking down a well-formed URI reference into its components.

      ^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?
       12            3  4          5       6  7        8 9
        

The numbers in the second line above are only to assist readability; they indicate the reference points for each subexpression (i.e., each paired parenthesis). We refer to the value matched for subexpression <n> as $<n>. For example, matching the above expression to

      http://www.ics.uci.edu/pub/ietf/uri/#Related
        

results in the following subexpression matches:

      $1 = http:
      $2 = http
      $3 = //www.ics.uci.edu
      $4 = www.ics.uci.edu
      $5 = /pub/ietf/uri/
      $6 = <undefined>
      $7 = <undefined>
      $8 = #Related
      $9 = Related
        

where <undefined> indicates that the component is not present, as is the case for the query component in the above example. Therefore, we can determine the value of the five components as

      scheme    = $2
      authority = $4
      path      = $5
      query     = $7
      fragment  = $9

Going in the opposite direction, we can recreate a URI reference from its components by using the algorithm of Section 5.3.
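
Both directions can be tried out with a few lines of Python. In the sketch below, the function names split_uri and recompose are assumptions for this example. Python's re module reports an unmatched subexpression as None, which plays the role of <undefined>, and recompose follows the component recomposition steps of Section 5.3.

```python
import re

# The expression given above, unchanged.
URI_RE = re.compile(r'^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?')


def split_uri(reference):
    """Return (scheme, authority, path, query, fragment)."""
    m = URI_RE.match(reference)
    return m.group(2), m.group(4), m.group(5), m.group(7), m.group(9)


def recompose(scheme, authority, path, query, fragment):
    """Rebuild a URI reference; None marks an undefined component."""
    result = ""
    if scheme is not None:
        result += scheme + ":"
    if authority is not None:
        result += "//" + authority
    result += path
    if query is not None:
        result += "?" + query
    if fragment is not None:
        result += "#" + fragment
    return result
```

For the example reference above, split_uri returns "http", "www.ics.uci.edu", "/pub/ietf/uri/", None, and "Related", and passing those five values to recompose reproduces the original string.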

Appendix C. Delimiting a URI in Context

URIs are often transmitted through formats that do not provide a clear context for their interpretation. For example, there are many occasions when a URI is included in plain text; examples include text sent in email, USENET news, and on printed paper. In such cases, it is important to be able to delimit the URI from the rest of the text, and in particular from punctuation marks that might be mistaken for part of the URI.

In practice, URIs are delimited in a variety of ways, but usually within double-quotes "http://example.com/", angle brackets <http://example.com/>, or just by using whitespace:

      http://example.com/
        

These wrappers do not form part of the URI.

In some cases, extra whitespace (spaces, line-breaks, tabs, etc.) may have to be added to break a long URI across lines. The whitespace should be ignored when the URI is extracted.

No whitespace should be introduced after a hyphen ("-") character. Because some typesetters and printers may (erroneously) introduce a hyphen at the end of line when breaking it, the interpreter of a URI containing a line break immediately after a hyphen should ignore all whitespace around the line break and should be aware that the hyphen may or may not actually be part of the URI.

Using <> angle brackets around each URI is especially recommended as a delimiting style for a reference that contains embedded whitespace.

The prefix "URL:" (with or without a trailing space) was formerly recommended as a way to help distinguish a URI from other bracketed designators, though it is not commonly used in practice and is no longer recommended.

For robustness, software that accepts user-typed URIs should attempt to recognize and strip both delimiters and embedded whitespace.

For example, the text

      Yes, Jim, I found it under "http://www.w3.org/Addressing/",
      but you can probably pick it up from <ftp://foo.example.
      com/rfc/>.  Note the warning in <http://www.ics.uci.edu/pub/
      ietf/uri/historical.html#WARNING>.

contains the URI references

      http://www.w3.org/Addressing/
      ftp://foo.example.com/rfc/
      http://www.ics.uci.edu/pub/ietf/uri/historical.html#WARNING
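
The extraction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name extract_delimited_uris is hypothetical, the sketch recognizes only the angle-bracket and double-quote styles, it assumes every delimited span in the text is a URI reference, and it makes no attempt to handle the hyphenation problem discussed earlier.

```python
import re

# Delimited forms recognized by this sketch: <...> and "...".
_DELIMITED = re.compile(r'<([^<>]*)>|"([^"]*)"')
# The formerly recommended "URL:" prefix (whitespace is stripped before this runs).
_URL_PREFIX = re.compile(r'^URL:', re.IGNORECASE)


def extract_delimited_uris(text):
    """Return the URI references found inside <> or "" delimiters."""
    found = []
    for match in _DELIMITED.finditer(text):
        if match.group(1) is not None:
            candidate = match.group(1)
        else:
            candidate = match.group(2)
        # Ignore whitespace that was added to break a long URI across lines.
        candidate = re.sub(r'\s+', '', candidate)
        # Drop the "URL:" prefix if one is present.
        candidate = _URL_PREFIX.sub('', candidate)
        if candidate:
            found.append(candidate)
    return found


sample = (
    'Yes, Jim, I found it under "http://www.w3.org/Addressing/",\n'
    'but you can probably pick it up from <ftp://foo.example.\n'
    'com/rfc/>.  Note the warning in <http://www.ics.uci.edu/pub/\n'
    'ietf/uri/historical.html#WARNING>.'
)
for uri in extract_delimited_uris(sample):
    print(uri)
```

Applied to the sample text, the sketch yields the three references listed above, in order.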
        
Appendix D. Changes from RFC 2396

D.1. Additions

An ABNF rule for URI has been introduced to correspond to one common usage of the term: an absolute URI with optional fragment.

IPv6 (and later) literals have been added to the list of possible identifiers for the host portion of an authority component, as described by [RFC2732], with the addition of "[" and "]" to the reserved set and a version flag to anticipate future versions of IP literals. Square brackets are now specified as reserved within the authority component and are not allowed outside their use as delimiters for an IP literal within host. In order to make this change without changing the technical definition of the path, query, and fragment components, those rules were redefined to directly specify the characters allowed.

As [RFC2732] defers to [RFC3513] for definition of an IPv6 literal address, which, unfortunately, lacks an ABNF description of IPv6address, we created a new ABNF rule for IPv6address that matches the text representations defined by Section 2.2 of [RFC3513]. Likewise, the definition of IPv4address has been improved in order to limit each decimal octet to the range 0-255.
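
As an illustration of the tightened IPv4address rule, the following Python sketch transcribes the dec-octet alternatives (0-9, 10-99, 100-199, 200-249, 250-255) into a regular expression. The function name is_ipv4address is hypothetical, and the sketch checks only the dotted-decimal syntax, not whether the address is usable.

```python
import re

# One decimal octet limited to 0-255, with no leading zeros,
# following the alternatives of the dec-octet rule.
_DEC_OCTET = r'(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])'
_IPV4ADDRESS = re.compile(r'(?:%s\.){3}%s' % (_DEC_OCTET, _DEC_OCTET))


def is_ipv4address(text):
    """Return True if text is four dot-separated octets in the range 0-255."""
    return _IPV4ADDRESS.fullmatch(text) is not None


print(is_ipv4address("192.0.2.16"))   # True
print(is_ipv4address("256.0.0.1"))    # False: 256 is out of range
print(is_ipv4address("01.2.3.4"))     # False: leading zero is not allowed
```

A host that fails this check is not necessarily invalid; under the generic syntax it may still match reg-name.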

Section 6, on URI normalization and comparison, has been completely rewritten and extended by using input from Tim Bray and discussion within the W3C Technical Architecture Group.

D.2. Modifications

The ad-hoc BNF syntax of RFC 2396 has been replaced with the ABNF of [RFC2234]. This change required all rule names that formerly included underscore characters to be renamed with a dash instead. In addition, a number of syntax rules have been eliminated or simplified to make the overall grammar more comprehensible. Specifications that refer to the obsolete grammar rules may be understood by replacing those rules according to the following table:

   +----------------+--------------------------------------------------+
   | obsolete rule  | translation                                      |
   +----------------+--------------------------------------------------+
   | absoluteURI    | absolute-URI                                     |
   | relativeURI    | relative-part [ "?" query ]                      |
   | hier_part      | ( "//" authority path-abempty /                  |
   |                | path-absolute ) [ "?" query ]                    |
   |                |                                                  |
   | opaque_part    | path-rootless [ "?" query ]                      |
   | net_path       | "//" authority path-abempty                      |
   | abs_path       | path-absolute                                    |
   | rel_path       | path-rootless                                    |
   | rel_segment    | segment-nz-nc                                    |
   | reg_name       | reg-name                                         |
   | server         | authority                                        |
   | hostport       | host [ ":" port ]                                |
   | hostname       | reg-name                                         |
   | path_segments  | path-abempty                                     |
   | param          | *<pchar excluding ";">                           |
   |                |                                                  |
   | uric           | unreserved / pct-encoded / ";" / "?" / ":"       |
   |                |  / "@" / "&" / "=" / "+" / "$" / "," / "/"       |
   |                |                                                  |
   | uric_no_slash  | unreserved / pct-encoded / ";" / "?" / ":"       |
   |                |  / "@" / "&" / "=" / "+" / "$" / ","             |
   |                |                                                  |
   | mark           | "-" / "_" / "." / "!" / "~" / "*" / "'"          |
   |                |  / "(" / ")"                                     |
   |                |                                                  |
   | escaped        | pct-encoded                                      |
   | hex            | HEXDIG                                           |
   | alphanum       | ALPHA / DIGIT                                    |
   +----------------+--------------------------------------------------+
        

Use of the above obsolete rules for the definition of scheme-specific syntax is deprecated.

Section 2, on characters, has been rewritten to explain what characters are reserved, when they are reserved, and why they are reserved, even when they are not used as delimiters by the generic syntax. The mark characters that are typically unsafe to decode, including the exclamation mark ("!"), asterisk ("*"), single-quote ("'"), and open and close parentheses ("(" and ")"), have been moved to the reserved set in order to clarify the distinction between reserved and unreserved and, hopefully, to answer the most common question of scheme designers. Likewise, the section on percent-encoded characters has been rewritten, and URI normalizers are now given license to decode any percent-encoded octets corresponding to unreserved characters. In general, the terms "escaped" and "unescaped" have been replaced with "percent-encoded" and "decoded", respectively, to reduce confusion with other forms of escape mechanisms.
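
The license given to normalizers can be illustrated with a short Python sketch. It decodes only those percent-encoded octets that correspond to unreserved characters (ALPHA, DIGIT, "-", ".", "_", "~") and uppercases the hexadecimal digits of every other percent-encoding. The function name normalize_pct_encoding is hypothetical, and the sketch assumes its input is already a syntactically valid URI.

```python
import re

_UNRESERVED = frozenset(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)
_PCT_ENCODED = re.compile(r'%([0-9A-Fa-f]{2})')


def normalize_pct_encoding(uri):
    """Decode percent-encoded unreserved characters; uppercase the rest."""
    def replace(match):
        char = chr(int(match.group(1), 16))
        if char in _UNRESERVED:
            # Safe to decode: the octet stands for an unreserved character.
            return char
        # Reserved or non-ASCII octet: keep it encoded, with uppercase hex.
        return '%' + match.group(1).upper()
    return _PCT_ENCODED.sub(replace, uri)


print(normalize_pct_encoding("http://example.com/%7Euser/%41%2fb"))
# http://example.com/~user/A%2Fb
```

Note that "%2f" stays encoded: a slash is reserved, so decoding it could change how the path is interpreted.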

The ABNF for URI and URI-reference has been redesigned to make them more friendly to LALR parsers and to reduce complexity. As a result, the layout form of syntax description has been removed, along with the uric, uric_no_slash, opaque_part, net_path, abs_path, rel_path, path_segments, rel_segment, and mark rules. All references to "opaque" URIs have been replaced with a better description of how the path component may be opaque to hierarchy. The relativeURI rule has been replaced with relative-ref to avoid unnecessary confusion over whether relative references are a subset of URI. The ambiguity regarding the parsing of URI-reference as a URI or a relative-ref with a colon in the first segment has been eliminated through the use of five separate path matching rules.

The fragment identifier has been moved back into the section on generic syntax components and within the URI and relative-ref rules, though it remains excluded from absolute-URI. The number sign ("#") character has been moved back to the reserved set as a result of reintegrating the fragment syntax.

The ABNF has been corrected to allow the path component to be empty. This also allows an absolute-URI to consist of nothing after the "scheme:", as is present in practice with the "dav:" namespace [RFC2518] and with the "about:" scheme used internally by many WWW browser implementations. The ambiguity regarding the boundary between authority and path has been eliminated through the use of five separate path matching rules.
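
The effect of allowing an empty path can be seen by splitting a reference with the regular expression given in Appendix B of this specification. The Python sketch below is an illustration only; the function name split_uri_reference and the dictionary layout are hypothetical. An undefined component is reported as None, while a defined but empty component is reported as the empty string.

```python
import re

# Regular expression from Appendix B; groups 2, 4, 5, 7, and 9 hold
# the scheme, authority, path, query, and fragment.
_URI_REFERENCE = re.compile(
    r'^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?'
)


def split_uri_reference(reference):
    """Split a URI reference into its five generic components."""
    match = _URI_REFERENCE.match(reference)
    return {
        "scheme": match.group(2),
        "authority": match.group(4),
        "path": match.group(5),
        "query": match.group(7),
        "fragment": match.group(9),
    }


print(split_uri_reference("dav:"))
print(split_uri_reference("http://example.com"))
```

For "dav:" the scheme is "dav", the path is the empty string, and the authority, query, and fragment are undefined. For "http://example.com" the authority is "example.com" and the path is again empty.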

Registry-based naming authorities that use the generic syntax are now defined within the host rule. This change allows current implementations, where whatever name is provided is simply fed to the local name resolution mechanism, to be consistent with the specification. It also removes the need to re-specify DNS name formats here. Furthermore, it allows the host component to contain percent-encoded octets, which is necessary to enable internationalized domain names to be provided in URIs, processed in their native character encodings at the application layers above URI processing, and passed to an IDNA library as a registered name in the UTF-8 character encoding. The server, hostport, hostname, domainlabel, toplabel, and alphanum rules have been removed.
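
One way an application layer might handle a percent-encoded registered name is sketched below in Python. The function name reg_name_to_ascii is hypothetical. The sketch assumes the octets are UTF-8 and uses Python's built-in "idna" codec, which implements the older IDNA 2003 rules; it stands in here for whatever IDNA library an implementation actually uses.

```python
from urllib.parse import unquote


def reg_name_to_ascii(reg_name):
    """Decode a percent-encoded reg-name as UTF-8 and apply IDNA ToASCII."""
    # errors="strict" makes invalid UTF-8 raise instead of being replaced.
    unicode_name = unquote(reg_name, encoding="utf-8", errors="strict")
    # The built-in codec converts each non-ASCII label to its "xn--" form.
    return unicode_name.encode("idna").decode("ascii")


# "%E4%BE%8B%E3%81%88" is the UTF-8 percent-encoding of a two-character label.
print(reg_name_to_ascii("%E4%BE%8B%E3%81%88.jp"))
print(reg_name_to_ascii("example.com"))
```

A name that is already ASCII passes through unchanged, so the same code path can serve both kinds of host.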

The resolving relative references algorithm of [RFC2396] has been rewritten with pseudocode for this revision to improve clarity and fix the following issues:

o [RFC2396] section 5.2, step 6a, failed to account for a base URI with no path.

o Restored the behavior of [RFC1808] where, if the reference contains an empty path and a defined query component, the target URI inherits the base URI's path component.

o The determination of whether a URI reference is a same-document reference has been decoupled from the URI parser, simplifying the URI processing interface within applications in a way consistent with the internal architecture of deployed URI processing implementations. The determination is now based on comparison to the base URI after transforming a reference to absolute form, rather than on the format of the reference itself. This change may result in more references being considered "same-document" under this specification than there would be under the rules given in RFC 2396, especially when normalization is used to reduce aliases. However, it does not change the status of existing same-document references.

o Separated the path merge routine into two routines: merge, for describing combination of the base URI path with a relative-path reference, and remove_dot_segments, for describing how to remove the special "." and ".." segments from a composed path. The remove_dot_segments algorithm is now applied to all URI reference paths in order to match common implementations and to improve the normalization of URIs in practice. This change only impacts the parsing of abnormal references and same-scheme references wherein the base URI has a non-hierarchical path.
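
The two routines named in the last item can be sketched in Python as follows. This is an illustrative transcription of the merge and remove_dot_segments steps described in the reference resolution section, not normative text. The parameter names are assumptions: base_has_authority is a boolean saying whether the base URI has a defined authority component.

```python
def merge(base_has_authority, base_path, reference_path):
    """Combine a base URI path with a relative-path reference."""
    if base_has_authority and base_path == "":
        return "/" + reference_path
    # Keep the base path up to and including its last "/", if any.
    return base_path[: base_path.rfind("/") + 1] + reference_path


def remove_dot_segments(path):
    """Remove the special "." and ".." segments from a path."""
    output = []
    while path:
        if path.startswith("../"):
            path = path[3:]
        elif path.startswith("./"):
            path = path[2:]
        elif path.startswith("/./"):
            path = path[2:]           # "/./" becomes "/"
        elif path == "/.":
            path = "/"
        elif path.startswith("/../"):
            path = path[3:]           # "/../" becomes "/"
            if output:
                output.pop()          # drop the last output segment
        elif path == "/..":
            path = "/"
            if output:
                output.pop()
        elif path in (".", ".."):
            path = ""
        else:
            # Move the first segment (with its leading "/", if any)
            # from the input to the output.
            start = 1 if path.startswith("/") else 0
            end = path.find("/", start)
            if end == -1:
                end = len(path)
            output.append(path[:end])
            path = path[end:]
    return "".join(output)


merged = merge(True, "/b/c/d;p", "../g")
print(merged)                                     # /b/c/../g
print(remove_dot_segments(merged))                # /b/g
print(remove_dot_segments("/a/b/c/./../../g"))    # /a/g
```

Keeping the two steps separate means remove_dot_segments can also be applied to paths that were never merged, such as the path of a reference that carries its own scheme or authority.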

Authors' Addresses

   Tim Berners-Lee
   World Wide Web Consortium
   Massachusetts Institute of Technology
   77 Massachusetts Avenue
   Cambridge, MA 02139
   USA

   Phone: +1-617-253-5702
   Fax:   +1-617-258-5999
   EMail: timbl@w3.org
   URI:   http://www.w3.org/People/Berners-Lee/

   Roy T. Fielding
   Day Software
   5251 California Ave., Suite 110
   Irvine, CA 92617
   USA

   Phone: +1-949-679-2960
   Fax:   +1-949-679-2972
   EMail: fielding@gbiv.com
   URI:   http://roy.gbiv.com/

   Larry Masinter
   Adobe Systems Incorporated
   345 Park Ave
   San Jose, CA 95110
   USA

   Phone: +1-408-536-3024
   EMail: LMM@acm.org
   URI:   http://larry.masinter.net/

Full Copyright Statement

Copyright (C) The Internet Society (2005).

This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights.

This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the IETF's procedures with respect to rights in IETF Documents can be found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf-ipr@ietf.org.

Acknowledgement

Funding for the RFC Editor function is currently provided by the Internet Society.
