Network Working Group                                          I. Cooper
Request for Comments: 3143                                 Equinix, Inc.
Category: Informational                                        J. Dilley
                                               Akamai Technologies, Inc.
                                                               June 2001
Known HTTP Proxy/Caching Problems
Status of this Memo
This memo provides information for the Internet community. It does not specify an Internet standard of any kind. Distribution of this memo is unlimited.
Copyright Notice
Copyright (C) The Internet Society (2001). All Rights Reserved.
Abstract
This document catalogs a number of known problems with World Wide Web (WWW) (caching) proxies and cache servers. The goal of the document is to provide a discussion of the problems and proposed workarounds, and ultimately to improve conditions by illustrating problems. The construction of this document is a joint effort of the Web caching community.
Table of Contents
1.    Introduction
1.1   Problem Template
2.    Known Problems
2.1   Known Specification Problems
2.1.1 Vary header is underspecified and/or misleading
2.1.2 Client Chaining Loses Valuable Length Meta-Data
2.2   Known Architectural Problems
2.2.1 Interception proxies break client cache directives
2.2.2 Interception proxies prevent introduction of new HTTP methods
2.2.3 Interception proxies break IP address-based authentication
2.2.4 Caching proxy peer selection in heterogeneous networks
2.2.5 ICP Performance
2.2.6 Caching proxy meshes can break HTTP serialization of content
2.3   Known Implementation Problems
2.3.1 User agent/proxy failover
2.3.2 Some servers send bad Content-Length headers for files that contain CR
3.    Security Considerations
      References
      Authors' Addresses
A.    Archived Known Problems
A.1   Architectural
A.1.1 Cannot specify multiple URIs for replicated resources
A.1.2 Replica distance is unknown
A.1.3 Proxy resource location
A.2   Implementation
A.2.1 Use of Cache-Control headers
A.2.2 Lack of HTTP/1.1 compliance for caching proxies
A.2.3 ETag support
A.2.4 Servers and content should be optimized for caching
A.3   Administration
A.3.1 Lack of fine-grained, standardized hierarchy controls
A.3.2 Proxy/Server exhaustive log format standard for analysis
A.3.3 Trace log timestamps
A.3.4 Exchange format for log summaries
      Full Copyright Statement
1. Introduction
This memo discusses problems with proxies, which act as application-level intermediaries for Web requests, and more specifically with caching proxies, which retain copies of previously requested resources in the hope of improving overall quality of service by serving the content locally. Commonly used terminology in this memo can be found in the "Internet Web Replication and Caching Taxonomy" [2].
No individual or organization has complete knowledge of the known problems in Web caching, and the editors are grateful to the contributors to this document.
1.1 Problem Template
A common problem template is used within the following sections. We gratefully acknowledge RFC 2525 [1], which helped define an initial format for this known problems list. The template format is summarized in the following table and described in more detail below.
    Name:           short, descriptive name of the problem (3-5 words)
    Classification: classifies the problem: performance, security, etc.
    Description:    describes the problem succinctly
    Significance:   magnitude of problem, environments where it exists
    Implications:   the impact of the problem on systems and networks
    See Also:       a reference to a related known problem
    Indications:    states how to detect the presence of this problem
    Solution(s):    describe the solution(s) to this problem, if any
    Workaround:     practical workaround for the problem
    References:     information about the problem or solution
    Contact:        contact name and email address for this section
Name: A short, descriptive name (3-5 words) associated with the problem.
Classification: Problems are grouped into categories of similar problems for ease of reading. Choose the category that best describes the problem. The suggested categories include three general categories and several more specific categories.
* Architecture: the fundamental design is incomplete or incorrect.
* Specification: the spec is ambiguous, incomplete, or incorrect.
* Implementation: the implementation of the spec is incorrect.
* Performance: perceived page response time at the client is excessive; network bandwidth consumption is excessive; demand on origin or proxy servers exceeds reasonable bounds.
* Administration: care and feeding of caches is, or causes, a problem.
* Security: privacy, integrity, or authentication concerns.
Description: A definition of the problem, succinct but including necessary background information.
Significance: (High, Medium, Low) May include a brief summary of the environments for which the problem is significant.
Implications: Why the problem is viewed as a problem. What inappropriate behavior results from it? This section should substantiate the magnitude of any problem indicated with High significance.
See Also: Optional. List of other known problems that are related to this one.
Indications: How to detect the presence of the problem. This may include references to one or more substantiating documents that demonstrate the problem. This should include the network configuration that led to the problem such that it can be reproduced. Problems that are not reproducible will not appear in this memo.
Solution(s): Solutions that permanently fix the problem, if such are known. For example, what version of the software does not exhibit the problem? Indicate if the solution is accepted by the community, one of several solutions pending agreement, or still open, possibly with experimental solutions.
Workaround: Practical workaround if no solution is available or usable. The workaround should have sufficient detail for someone experiencing the problem to get around it.
References: References to related information in technical publications or on the web. Where can someone interested in learning more go to find out more about this problem, its solution, or workarounds?
Contact: Contact name and email address of the person who supplied the information for this section. The editors are listed as contacts for anonymous submissions.
2. Known Problems
The remaining sections of this document present the currently documented known problems. The problems are ordered by classification and significance. Issues with protocol specification or architecture are first, followed by implementation issues. Issues of high significance are first, followed by those of lower significance.
Some of the problems initially identified in the previous versions of this document have been moved to Appendix A since they discuss issues where resolution primarily involves education rather than protocol work.
A full list of the problems is available in the table of contents.
2.1 Known Specification Problems
2.1.1 Vary header is underspecified and/or misleading
Name: The "Vary" header is underspecified and/or misleading
Classification: Specification
Description: The Vary header in HTTP/1.1 was designed to allow a caching proxy to safely cache responses even if the server's choice of variants is not entirely understood. As RFC 2616 says:
The Vary header field can be used to express the parameters the server uses to select a representation that is subject to server-driven negotiation.
One might expect that this mechanism is useful in general for extensions that change the response message based on some aspects of the request. However, that is not true.
During the design of the HTTP delta encoding specification[9] it was realized that an HTTP/1.1 proxy that does not understand delta encoding might cache a delta-encoded response and then later deliver it to a non-delta-capable client, unless the extension included some mechanism to prevent this. Initially, it was thought that Vary would suffice, but the following scenario proves this wrong.
NOTE: It is likely that other scenarios exhibiting the same basic problem with "Vary" could be devised, without reference to delta encoding. This is simply a concrete scenario used to explain the problem.
A complete description of the IM and A-IM headers may be found in the "Delta encoding in HTTP" specification. For the purpose of this problem description, the relevant details are:
1. The concept of an "instance manipulation" is introduced. In some ways, this is similar to a content-coding, but there are differences. One example of an instance manipulation name is "vcdiff".
2. A client signals its willingness to accept one or more instance-manipulations using the A-IM header.
3. A server indicates which instance-manipulations are used to encode the body of a response using the IM header.
4. Existing implementations will ignore the A-IM and IM headers, following the usual HTTP rules for handling unknown headers.
5. Responses encoded with an instance-manipulation are sent using the (proposed) 226 status code, "IM Used".
6. In response to a conditional request that carries an A-IM header, if the resource identified by the Request-URI has been modified, a server may transmit a compact encoding of the modifications using a delta-encoding instead of a status-200 response. The encoded response cannot be understood by an implementation that does not support delta encodings.
This summary omits many details.
Suppose client A sends this request via proxy P:
    GET http://example.com/foo.html HTTP/1.1
    Host: example.com
    If-None-Match: "abc"
    A-IM: vcdiff
and the origin server returns, via P, this response:
    HTTP/1.1 226 IM Used
    Etag: "def"
    Date: Wed, 19 Apr 2000 18:46:13 GMT
    IM: vcdiff
    Cache-Control: max-age=60
    Vary: A-IM, If-None-Match
the body of which is a delta-encoded response (it encodes the difference between the Etag "abc" instance of foo.html, and the "def" instance). Assume that P stores this response in its cache, and that P does not understand the vcdiff encoding.
Later, client B, also ignorant of delta-encoding, sends this request via P:
    GET http://example.com/foo.html HTTP/1.1
    Host: example.com
What can P do now? According to the specification for the Vary header in RFC2616,
The Vary field value indicates the set of request-header fields that fully determines, while the response is fresh, whether a cache is permitted to use the response to reply to a subsequent request without revalidation.
Implicitly, however, the cache would be allowed to use the stored response to reply to client B after revalidation. This is the potential bug.
An obvious implementation of the proxy would send this request to test whether its cache entry is fresh (i.e., to revalidate the entry):
    GET /foo.html HTTP/1.1
    Host: example.com
    If-None-Match: "def"
That is, the proxy simply forwards the new request, after doing the usual transformation on the URL and tacking on the "obvious" If-None-Match header.
If the origin server's Etag for the current instance is still "def", it would naturally respond:
    HTTP/1.1 304 Not Modified
    Etag: "def"
    Date: Wed, 19 Apr 2000 18:46:14 GMT
thus telling the proxy P that it can use its stored response. But this cached response is a delta-encoding that is unintelligible to client B, and it is signaled by a header field (IM) that B ignores, so the client displays garbage.
The problem here is that the original request (from client A) generated a response that is unintelligible to client B, not merely one that is not "the appropriate representation" (as the result of server-driven negotiation).
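The exchange can also be hand-simulated in code. The following is a minimal Python sketch, not a real proxy: the `origin` function, the `NaiveCache` class, and the dictionary-based messages are all hypothetical names introduced for illustration, and the cache deliberately implements only the "obvious" behavior described in this section.

```python
# Hand-simulation of the scenario above. Every name here is illustrative.

CURRENT_ETAG = '"def"'


def origin(headers):
    """Toy origin server for /foo.html whose current instance is "def"."""
    if headers.get("If-None-Match") == CURRENT_ETAG:
        return {"status": 304, "headers": {"Etag": CURRENT_ETAG}, "body": b""}
    if "vcdiff" in headers.get("A-IM", ""):
        # Delta against the instance named in If-None-Match.
        return {
            "status": 226,
            "headers": {"Etag": CURRENT_ETAG, "IM": "vcdiff",
                        "Cache-Control": "max-age=60",
                        "Vary": "A-IM, If-None-Match"},
            "body": b"<vcdiff delta abc->def>",
        }
    return {"status": 200, "headers": {"Etag": CURRENT_ETAG},
            "body": b"<html>full def instance</html>"}


class NaiveCache:
    """Proxy P: ignores IM/A-IM and stores any response carrying max-age."""

    def __init__(self):
        self.entry = None  # (request headers, stored response)

    def handle(self, request_headers):
        if self.entry is None:
            response = origin(request_headers)
            if "max-age" in response["headers"].get("Cache-Control", ""):
                self.entry = (dict(request_headers), response)
            return response
        stored_request, stored = self.entry
        vary = [h.strip() for h in stored["headers"].get("Vary", "").split(",")
                if h.strip()]
        if all(request_headers.get(h) == stored_request.get(h) for h in vary):
            return stored  # selecting headers match: no revalidation needed
        # Selecting headers differ, so revalidate with the stored entity tag.
        check = origin({"If-None-Match": stored["headers"]["Etag"]})
        if check["status"] == 304:
            return stored  # the bug: a delta is replayed to a non-delta client
        return check


proxy = NaiveCache()
reply_a = proxy.handle({"If-None-Match": '"abc"', "A-IM": "vcdiff"})
reply_b = proxy.handle({})  # client B sends no A-IM header
```

In this model client B, which never sent an A-IM header, receives the stored status-226 response and its delta-encoded body.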
One might argue that the proxy P shouldn't be storing status-226 responses in the first place. True in theory, perhaps, but unfortunately RFC2616, section 13.4, says:
A response received with any [status code other than 200, 203, 206, 300, 301 or 410] MUST NOT be returned in a reply to a subsequent request unless there are cache-control directives or another header(s) that explicitly allow it. For example, these include the following: an Expires header (section 14.21); a "max-age", "s-maxage", "must-revalidate", "proxy-revalidate", "public" or "private" cache-control directive (section 14.9).
In other words, the specification allows caching of responses with yet-to-be-defined status codes if the response carries a plausible Cache-Control directive. So unless we ban servers implementing this kind of extension from using these Cache-Control directives at all, the Vary header just won't work.
Significance: Medium
Implications: Certain plausible extensions to the HTTP/1.1 protocol might not interoperate correctly with older HTTP/1.1 caches, if the extensions depend on an interpretation of Vary that differs from the one used by the cache implementer.
This would have the effect either of causing hard-to-debug cache transparency failures, or of discouraging the deployment of such extensions, or of encouraging the implementers of such extensions to disable caching entirely.
Indications: The problem is visible when hand-simulating plausible message exchanges, especially when using the proposed delta encoding extension. It probably has not been visible in practice yet.
Solution(s):
1. Section 13.4 of the HTTP/1.1 specification should probably be changed to prohibit caching of responses with status codes that the cache doesn't understand, whether or not they include Expires headers and the like. (It might require some care to define what "understands" means, leaving room for future extensions with new status codes.) The behavior in this case needs to be defined as equivalent to "Cache-Control: no-store" rather than "no-cache", since the latter allows revalidation.
Possibly the specification of Vary should require that it be treated as "Cache-Control: no-store" whenever the status code is unknown - that should solve the problem in the scenario given here.
2. Designers of HTTP/1.1 extensions should consider using mechanisms other than Vary to prevent false caching.
It is not clear whether the Vary mechanism is widely implemented in caches; if not, this favors solution #1.
Workaround: A cache could treat the presence of a Vary header in a response as an implicit "Cache-Control: no-store", except for "known" status codes, even though this is not required by RFC 2616. This would avoid any transparency failures. "Known status codes" for basic HTTP/1.1 caches probably include: 200, 203, 206, 300, 301, 410 (although this list should be re-evaluated in light of the problem discussed here).
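As an illustration of this workaround, the check can be written in a few lines. This is a sketch under stated assumptions: `may_store` and `KNOWN_STATUS_CODES` are hypothetical names, and the function covers only the Vary rule and an explicit no-store directive, not the rest of the HTTP/1.1 storability rules.

```python
# Hypothetical helper illustrating the workaround; not taken from a real cache.

KNOWN_STATUS_CODES = frozenset({200, 203, 206, 300, 301, 410})


def may_store(status, headers):
    """Return False when the response must be treated as no-store."""
    names = {name.lower() for name in headers}
    if "vary" in names and status not in KNOWN_STATUS_CODES:
        return False  # implicit "Cache-Control: no-store"
    cache_control = ""
    for name, value in headers.items():
        if name.lower() == "cache-control":
            cache_control = value.lower()
    # Keep only directive names, dropping any "=value" part.
    directives = {d.strip().split("=")[0] for d in cache_control.split(",")}
    return "no-store" not in directives
```

With this check, the status-226 response from the scenario above is never stored, so it can never be replayed to client B.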
References: See [9] for the specification of the delta encoding extension, as well as for an example of the use of a Cache-Control extension instead of "Vary".
Contact: Jeff Mogul <mogul@pa.dec.com>
2.1.2 Client Chaining Loses Valuable Length Meta-Data
Name: Client Chaining Loses Valuable Length Meta-Data
Classification: Performance
Description: HTTP/1.1 [3] implementations are prohibited from sending Content-Length headers with any message whose body has been Transfer-Encoded. Because 1.0 clients cannot accept chunked Transfer-Encodings, 1.1 implementations that must forward the body to 1.0 clients must do so without the benefit of length information that was discarded earlier in the chain.
Significance: Low
Implications: The lack of either a chunked transfer encoding or a Content-Length indication has negative performance implications for how the proxy must forward the message body.
In the case of response bodies, the forwarding server must either forward the response and then close the connection to indicate the end of the response, or use store-and-forward semantics to buffer the entire response in order to calculate a Content-Length. The former option defeats the performance benefits of persistent connections in HTTP/1.1 (and their Keep-Alive cousin in HTTP/1.0) and creates some responses of ambiguous length. The latter store-and-forward option may not even be feasible given the size of the resource, and it always increases latency.
Request bodies must go through the store-and-forward process, because 1.0 request bodies must be delimited by Content-Length headers. As with response bodies, this may place unacceptable resource demands on the proxy, and the request may not be able to be satisfied.
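The store-and-forward option can be sketched as follows. This is a simplified Python illustration, assuming the whole chunked body is already in memory and ignoring trailers; `decode_chunked` and `store_and_forward` are hypothetical helper names. It shows why the entire body must be buffered before the first byte can be forwarded.

```python
def decode_chunked(raw):
    """Decode a chunked message body (trailers ignored) into identity bytes."""
    body = bytearray()
    pos = 0
    while True:
        line_end = raw.index(b"\r\n", pos)
        # Chunk size is hexadecimal; drop any chunk extensions after ';'.
        size_field = raw[pos:line_end].split(b";")[0].strip().decode("ascii")
        size = int(size_field, 16)
        if size == 0:
            break  # last-chunk reached
        start = line_end + 2
        body += raw[start:start + size]
        pos = start + size + 2  # skip the CRLF that ends the chunk data
    return bytes(body)


def store_and_forward(raw_chunked_body):
    """Buffer the whole body so a Content-Length can be sent to a 1.0 peer."""
    body = decode_chunked(raw_chunked_body)
    headers = {"Content-Length": str(len(body))}
    return headers, body


example_headers, example_body = store_and_forward(
    b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n")
```

The Content-Length is only known after the last chunk has arrived, which is the source of the added latency and memory demand described above.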
Indications: The lack of HTTP/1.0 style persistent connections between 1.0 clients and 1.1 proxies, only when accessing 1.1 servers, is a strong indication of this problem.
Solution(s): An HTTP specification clarification that would allow the Content-Length of an identity document, as known to the origin, to be carried end to end would alleviate this issue.
Workaround: None.
Contact: Patrick McManus <mcmanus@AppliedTheory.com>
2.2 Known Architectural Problems
2.2.1 Interception proxies break client cache directives
Name: Interception proxies break client cache directives
Classification: Architecture
Description: HTTP [3] is designed so that the user agent is aware whether it is connected to an origin server or to a proxy. User agents that believe they are transacting with an origin server, but are really connected to an interception proxy, may fail to send critical cache-control information they would otherwise have included in their request.
Significance: High
Implications: Clients may receive data that is not synchronized with the origin even when they request an end-to-end refresh, because the request lacks a "Cache-control: no-cache" or "must-revalidate" header. These headers have no impact on origin server behavior, so the browser may omit them if it believes it is connected directly to that resource. Other related data implications are possible as well. For instance, data security may be compromised by the omission of the "private" or "no-store" clauses of the Cache-control header under similar conditions.
Indications: Easily detected by placing fresh (unexpired) content on a caching proxy while changing the authoritative copy, then requesting an end-to-end reload of the data through a proxy in both interception and explicit modes.
Solution(s): Eliminate the need for interception proxies and IP spoofing, which will return correct context awareness to the client.
Workaround: Include relevant Cache-Control directives in every request, at the cost of increased bandwidth and CPU requirements.
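On the client side, this workaround amounts to attaching the directives unconditionally, whether or not the user agent believes a proxy is in the path. A minimal sketch using Python's standard `urllib.request` follows; the `build_request` helper and its default directive are assumptions made for illustration, and the request is only built, never sent.

```python
import urllib.request


def build_request(url, directives=("no-cache",)):
    """Build (but do not send) a request that always states its directives."""
    request = urllib.request.Request(url)
    # Sent even if the client thinks it is talking to the origin server,
    # because an interception proxy may be in the path unannounced.
    request.add_header("Cache-Control", ", ".join(directives))
    return request


reload_request = build_request("http://example.com/foo.html")
```

Note that `urllib.request` normalizes the stored header name to "Cache-control"; HTTP header names are case-insensitive, so this does not change the meaning of the request.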
Contact: Patrick McManus <mcmanus@AppliedTheory.com>
2.2.2 Interception proxies prevent introduction of new HTTP methods
Name: Interception proxies prevent introduction of new HTTP methods
Classification: Architecture
Description: A proxy that receives a request with a method unknown to it is required to generate an HTTP 501 Error as a response. HTTP methods are designed to be extensible, so there may be applications deployed with initial support just for the user agent and origin server. An interception proxy that hijacks requests which include new methods destined for servers that have implemented those methods creates a de facto firewall where none may be intended.
Significance: Medium within interception proxy environments.
Implications: Renders new compliant applications useless unless modifications are made to proxy software. Because new methods are not required to be globally standardized, it is impossible for proxy software to keep up to date in the general case.
Solution(s): Eliminate the need for interception proxies. A client receiving a 501 in a traditional HTTP environment may either choose to repeat the request to the origin server directly, or perhaps be configured to use a different proxy.
Workaround: Level 5 switches (sometimes called Level 7 or application layer switches) can be used to keep HTTP traffic with unknown methods out of the proxy. However, these devices have heavy buffering responsibilities, still require TCP sequence number spoofing, and do not interact well with persistent connections.
The HTTP/1.1 specification allows a proxy to switch over to tunnel mode when it receives a request with a method or HTTP version it does not understand how to handle.
Contact: Patrick McManus <mcmanus@AppliedTheory.com>
Henrik Nordstrom <hno@hem.passagen.se> (HTTP/1.1 clarification)
2.2.3 Interception proxies break IP address-based authentication
Name: Interception proxies break IP address-based authentication
Classification: Architecture
Description: Some web servers are not open for public access, but for security reasons accept requests only from certain IP address ranges. Interception proxies change the source (client) IP address to that of the proxy itself, without the knowledge of the client/user. This breaks such authentication mechanisms and denies otherwise authorized clients access to the servers.
Significance: Medium
Implications: Creates end user confusion and frustration.
Indications: Users may start to see refused connections to servers after interception proxies are deployed.
Solution(s): Use user-based authentication instead of (IP) address-based authentication.
Workaround: Use IP filters at the intercepting device (L4 switch) so that all requests to the servers concerned bypass the proxy.
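The filtering decision itself is simple; the operational cost lies in maintaining the list. The sketch below uses Python's standard `ipaddress` module. The `BYPASS_NETWORKS` list and the `should_intercept` function are hypothetical, and the prefixes are documentation addresses rather than real servers.

```python
import ipaddress

# Hypothetical list of servers that authenticate by client IP address.
BYPASS_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.16/28"),
]


def should_intercept(destination_ip):
    """Return False when traffic to this server must bypass the proxy."""
    address = ipaddress.ip_address(destination_ip)
    return not any(address in network for network in BYPASS_NETWORKS)
```

Requests whose destination falls in a listed prefix reach the server with the client's own source address, so address-based authentication keeps working for those servers only.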
Contact: Keith K. Chau <keithc@unitechnetworks.com>
2.2.4 Caching proxy peer selection in heterogeneous networks
Name: Caching proxy peer selection in heterogeneous networks
Classification: Architecture
Description: ICP [4] based caching proxy peer selection in networks with large variance in latency and bandwidth between peers can lead to non-optimal peer selection. For example, take Proxy C with two siblings, Sib1 and Sib2, and the following network topology (summarized).
* Cache C's link to Sib1: 2 Mbit/sec with 300 msec latency.
* Cache C's link to Sib2: 64 Kbit/sec with 10 msec latency.
ICP [4] does not work well in this context. If a user submits a request to Proxy C for page P that results in a miss, C will send an ICP request to Sib1 and Sib2. Assume both siblings have the requested object P. The ICP_HIT reply will always come from Sib2 before Sib1. However, it is clear that the retrieval of large objects will be faster from Sib1 than from Sib2.
The problem is more complex because Sib1 and Sib2 can't have a 100% hit ratio. With a hit rate of 10%, it is more efficient to use Sib1 with resources larger than 48K. The best choice depends on at least the hit rate and link characteristics, and possibly on other parameters as well.
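The trade-off between latency and bandwidth can be made concrete with a rough model. The Python sketch below assumes that fetch time is one latency term plus transfer time at the link rate and that both siblings hold the object; the function names are illustrative. It does not model the hit rate, so it is not an attempt to reproduce the 48K figure above.

```python
def fetch_time(size_bytes, latency_s, bandwidth_bps):
    """Estimated seconds to fetch an object: latency plus transfer time."""
    return latency_s + (size_bytes * 8) / bandwidth_bps


SIB1 = {"latency_s": 0.300, "bandwidth_bps": 2_000_000}  # 2 Mbit/sec, 300 msec
SIB2 = {"latency_s": 0.010, "bandwidth_bps": 64_000}     # 64 Kbit/sec, 10 msec


def faster_sibling(size_bytes):
    """Name of the sibling with the lower estimated fetch time."""
    t1 = fetch_time(size_bytes, **SIB1)
    t2 = fetch_time(size_bytes, **SIB2)
    return "Sib1" if t1 < t2 else "Sib2"


def crossover_size():
    """Object size in bytes at which both siblings take equally long."""
    latency_gap = SIB1["latency_s"] - SIB2["latency_s"]
    per_byte_gap = 8 / SIB2["bandwidth_bps"] - 8 / SIB1["bandwidth_bps"]
    return latency_gap / per_byte_gap
```

Under these assumptions the crossover falls at roughly 2.4 kilobytes, so Sib1 is the better source for anything larger, even though its ICP reply always arrives second.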
Significance: Medium
Implications: By using the first peer to respond, peer selection algorithms are not optimizing retrieval latency to end users. Furthermore, they cause more work for the high-latency peer, since it must respond to such requests but will never be chosen to serve content if the lower-latency peer has a copy.
Indications: Inherent in design of ICP v1, ICP v2, and any cache mesh protocol that selects peers based upon first response.
This problem is not exhibited by cache digest or other protocols which (attempt to) maintain knowledge of peer contents and only hit peers that are believed to have a copy of the requested page.
缓存摘要或其他协议(试图)维护对等内容的知识,并且只命中被认为具有请求页面副本的对等方,而这些协议不会显示此问题。
Solution(s) This problem is architectural; it is inherent in the peer selection protocols.
解决方案这个问题是对等选择协议的体系结构问题。
Workaround Cache mesh design when using such a protocol should be done in such a way that there is not a high latency variance among peers. In the example presented in the above description the high latency high bandwidth peer could be used as a parent, but should not be used as a sibling.
在使用这种协议时,解决缓存网格设计的方法应确保对等方之间没有高延迟差异。在上面描述的示例中,高延迟高带宽对等点可以用作父节点,但不应用作兄弟节点。
Contact Ivan Lovric <ivan.lovric@cnet.francetelecom.fr> John Dilley <jad@akamai.com>
Name ICP performance
名称ICP绩效
Classification Architecture(ICP), Performance
分类体系结构(ICP)、性能
Description ICP[4] exhibits O(n^2) scaling properties, where n is the number of participating peer proxies. This can lead ICP traffic to dominate HTTP traffic within a network.
说明ICP[4]展示了O(n^2)缩放特性,其中n是参与对等代理的数量。这可能导致ICP流量在网络中主导HTTP流量。
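The quadratic growth can be illustrated by counting messages. The sketch below assumes a full mesh in which every proxy queries all of its peers on each local miss, every query is answered, and all proxies see the same number of misses. These are simplifying assumptions for illustration and are not taken from the ICP specification.

```python
def icp_messages(n_peers, misses_per_proxy):
    """ICP messages in a full mesh where every local miss queries all peers."""
    queries_per_miss = n_peers - 1
    # One query plus one reply per peer, for every miss at every proxy.
    return n_peers * misses_per_proxy * queries_per_miss * 2


for n in (2, 4, 8, 16):
    print(n, icp_messages(n, misses_per_proxy=1000))
```

Doubling the number of peers roughly quadruples the ICP message count, while the HTTP traffic generated by the same misses only doubles.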
Significance Medium
重要性媒介
Implications If a proxy has many ICP peers the bandwidth demand of ICP can be excessive. System managers must carefully regulate ICP peering. ICP also leads proxies to become homogeneous in what they serve; if your proxy does not have a document it is unlikely your peers will have it either. Therefore, ICP traffic requests are largely unable to locate a local copy of an object (see [6]).
影响如果代理具有多个ICP对等点,则ICP的带宽需求可能会过高。系统管理员必须仔细管理ICP对等。ICP还导致代理在其服务内容上变得同质化;如果您的代理没有文档,您的同事也不太可能有文档。因此,ICP流量请求基本上无法定位对象的本地副本(参见[6])。
Indications Inherent in design of ICP v1, ICP v2.
ICP v1、ICP v2设计中固有的指示。
Solution(s) This problem is architectural - protocol redesign or replacement is required to solve it if ICP is to continue to be used.
解决方案这是一个架构问题——如果要继续使用ICP,则需要重新设计或更换协议来解决此问题。
Workaround Implementation workarounds exist, for example to turn off use of ICP, to carefully regulate peering, or to use another mechanism if available, such as cache digests. A cache digest protocol shares a summary of cache contents using a Bloom Filter technique. This allows a cache to estimate whether a peer has a document. Filters are updated regularly but are not always up-to-date so cannot help when a spike in popularity occurs. They also increase traffic but not as much as ICP.
存在解决方案实现解决方案,例如关闭ICP的使用,仔细调节对等,或者使用另一种机制(如果可用),如缓存摘要。缓存摘要协议使用Bloom筛选器技术共享缓存内容的摘要。这允许缓存估计对等方是否有文档。过滤器会定期更新,但并不总是最新的,因此当流行度出现峰值时,就无能为力了。它们也增加了流量,但没有ICP那么多。
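A minimal sketch of the Bloom Filter idea follows. It is not the digest format of any particular product; the class name, the filter size, the number of hash functions and the use of SHA-256 are all assumptions chosen for illustration.

```python
import hashlib


class BloomDigest:
    """Compact, lossy summary of the URLs held by a cache (illustrative only)."""

    def __init__(self, size_bits=8192, num_hashes=4):
        # size_bits is assumed to be a multiple of 8.
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, url):
        # Derive num_hashes bit positions by salting the URL with an index.
        for i in range(self.num_hashes):
            h = hashlib.sha256(f"{i}:{url}".encode("utf-8")).digest()
            yield int.from_bytes(h[:8], "big") % self.size_bits

    def add(self, url):
        for pos in self._positions(url):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, url):
        # False means "definitely absent"; True means "probably present".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(url))


digest = BloomDigest()
digest.add("http://example.com/index.html")
print(digest.might_contain("http://example.com/index.html"))
```

A peer that receives the bit array can test URLs locally without sending a query. False positives are possible, and a digest that has not been refreshed says nothing about objects cached since it was built, which is the staleness limitation noted above.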
Proxy clustering protocols, which organize proxies into a mesh, provide another alternative solution. There is ongoing research on this topic.
代理群集协议将代理组织到网格中,提供了另一种替代解决方案。目前正在对这一主题进行研究。
Contact John Dilley <jad@akamai.com>
联系约翰·迪利<jad@akamai.com>
Name Caching proxy meshes can break HTTP serialization of content
名称缓存代理网格可能会破坏内容的HTTP序列化
Classification Architecture (HTTP protocol)
分类体系结构(HTTP协议)
Description A caching proxy mesh where a request may travel different paths, depending on the state of the mesh and associated caches, can break HTTP content serialization, possibly causing the end user to receive older content than seen on an earlier request, where the request traversed another path in the mesh.
说明缓存代理网格中的请求可能会通过不同的路径,具体取决于网格和相关缓存的状态。缓存代理网格可能会中断HTTP内容序列化,可能会导致最终用户接收到比先前请求中的内容更旧的内容,其中请求会通过网格中的另一条路径。
Significance Medium
重要性媒介
Implications Can cause end user confusion. In some situations (sibling cache hit, object has changed state from cacheable to uncacheable) it may be close to impossible to get the caches properly updated with the new content.
暗示可能会导致最终用户混淆。在某些情况下(同级缓存命中,对象状态已从可缓存更改为不可缓存),几乎不可能使用新内容正确更新缓存。
Indications Older content is unexpectedly returned from a caching proxy mesh after some time.
指示一段时间后,缓存代理网格意外返回旧内容。
Solution(s) Work with caching proxy vendors and researchers to find a suitable protocol for maintaining proxy relations and object state in a mesh.
解决方案与缓存代理供应商和研究人员合作,找到一个合适的协议来维护网格中的代理关系和对象状态。
Workaround When designing a hierarchy/mesh, make sure that for each end-user/URL combination there is only one single path in the mesh during normal operation.
解决方法在设计层次结构/网格时,请确保在正常操作期间,对于每个最终用户/URL组合,网格中只有一条路径。
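One way to obtain a single path per end-user/URL combination is to choose the next hop deterministically from those two values. The sketch below is an assumption about how this could be done, not a mechanism described in this memo; the function name and the example host names are hypothetical.

```python
import hashlib


def pick_parent(client_id, url, parents):
    """Map one end-user/URL combination to exactly one parent."""
    key = f"{client_id} {url}".encode("utf-8")
    index = int.from_bytes(hashlib.sha256(key).digest()[:8], "big") % len(parents)
    return parents[index]


# Hypothetical parent caches.
parents = ["parent-a.example.net", "parent-b.example.net", "parent-c.example.net"]
print(pick_parent("10.0.0.7", "http://example.com/news.html", parents))
```

The mapping is stable only while the list of parents is unchanged. When a parent is added or removed, some combinations move to a different path, so the guarantee holds for normal operation only.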
Contact Henrik Nordstrom <hno@hem.passagen.se>
联系Henrik Nordstrom<hno@hem.passagen.se>
Name User agent/proxy failover
名称用户代理/代理故障切换
Classification Implementation
分类实施
Description Failover between proxies at the user agent (using a proxy.pac[8] file) is erratic and no standard behavior is defined. Additionally, behavior is hard-coded into the browser, so that proxy administrators cannot use failover at the user agent effectively.
说明用户代理上代理之间的故障切换(使用proxy.pac[8]文件)不稳定,未定义标准行为。此外,行为硬编码到浏览器中,因此代理管理员无法有效地在用户代理上使用故障切换。
Significance Medium
重要性媒介
Implications Architects are forced to implement failover at the proxy itself, when it may be more appropriate and economical to do it within the user agent.
当在用户代理中执行故障切换可能更合适、更经济时,架构师被迫在代理本身实现故障切换。
Indications If a browser detects that its primary proxy is down, it will wait n minutes before trying the next one it is configured to use. It will then wait y minutes before asking the user if they'd like to try the original proxy again. This is very confusing for end users.
指示如果浏览器检测到其主代理已关闭,它将等待n分钟,然后再尝试配置为使用的下一个代理。然后,它将等待y分钟,然后询问用户是否愿意再次尝试原始代理。这对最终用户来说非常混乱。
Solution(s) Work with browser vendors to establish standard extensions to JavaScript proxy.pac libraries that will allow configuration of these timeouts.
解决方案与浏览器供应商合作,建立JavaScript proxy.pac库的标准扩展,以允许配置这些超时。
Workaround User education; redundancy at the proxy level.
解决办法用户教育;代理级别的冗余。
Contact Mark Nottingham <mnot@mnot.net>
联系马克·诺丁汉<mnot@mnot.net>
2.3.2 Some servers send bad Content-Length headers for files that contain CR
2.3.2 某些服务器为包含CR的文件发送错误的内容长度头
Name Some servers send bad Content-Length headers for files that contain CR
名称某些服务器为包含CR的文件发送错误的内容长度头
Classification Implementation
分类实施
Description Certain web servers send a Content-Length value that is larger than the number of bytes in the HTTP message body. This happens when the server strips off CR characters from text files with lines terminated with CRLF as the file is written to the client. The server probably uses the stat() system call to get the file size for the Content-Length header. Servers that exhibit this behavior include the GN Web server (version 2.14 at least).
说明某些web服务器发送的内容长度值大于HTTP消息正文中的字节数。当文件写入客户机时,服务器从文本文件中删除以CRLF结尾的行中的CR字符时,就会发生这种情况。服务器可能使用stat()系统调用来获取内容长度头的文件大小。表现出这种行为的服务器包括GN Web服务器(至少2.14版)。
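The mismatch can be reproduced with a few lines of code. The sketch below imitates the suspected behavior (size taken from stat(), CR characters removed while sending); it is an illustration under that assumption and is not taken from any server's source code.

```python
import os
import tempfile


def buggy_response(path):
    """Return (declared_length, body) the way the faulty server would."""
    declared_length = os.stat(path).st_size       # size on disk, CRs included
    with open(path, "rb") as f:
        body = f.read().replace(b"\r\n", b"\n")   # CRs stripped while sending
    return declared_length, body


# A two-line text file with CRLF line endings.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"line one\r\nline two\r\n")
declared, body = buggy_response(tmp.name)
os.unlink(tmp.name)
print(declared, len(body))
```

The declared length exceeds the body length by one byte per line, so the discrepancy grows with the number of lines in the file.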
Significance Low. Surveys indicate only a small number of sites run faulty servers.
显著性低。调查表明,只有少数站点运行有故障的服务器。
Implications In this case, an HTTP client (e.g., user agent or proxy) may believe it received a partial response. HTTP/1.1 [3] advises that caches MAY store partial responses.
在这种情况下,HTTP客户端(例如,用户代理或代理)可能认为它收到了部分响应。HTTP/1.1[3]建议缓存可以存储部分响应。
Indications Count the number of bytes in the message body and compare to the Content-Length value. If they differ, the server exhibits this problem.
指示统计消息正文中的字节数,并与内容长度值进行比较。如果它们不同,服务器就会出现此问题。
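The check can be sketched as follows. The function assumes that the caller has read one complete response until the server closed the connection and that no transfer coding was applied; the function name is hypothetical.

```python
def content_length_mismatch(raw_response):
    """Return (declared, actual) if they differ, else None.

    Assumes `raw_response` holds one complete HTTP response read until the
    server closed the connection, with no transfer coding applied.
    """
    head, _, body = raw_response.partition(b"\r\n\r\n")
    declared = None
    for line in head.split(b"\r\n")[1:]:          # skip the status line
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            declared = int(value.strip())
    if declared is None or declared == len(body):
        return None
    return declared, len(body)


raw = b"HTTP/1.0 200 OK\r\nContent-Length: 20\r\n\r\nline one\nline two\n"
print(content_length_mismatch(raw))
```

A result other than None against a server that closed the connection normally suggests the problem described here rather than a truncated transfer.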
Solution(s) Upgrade or replace the buggy server.
解决方案升级或更换有缺陷的服务器。
Workaround Some browsers and proxies use one TCP connection per object and ignore the Content-Length. The end of the document is identified by the close of the TCP socket.
一些浏览器和代理对每个对象使用一个TCP连接,而忽略内容长度。通过关闭TCP套接字来标识文件结尾。
Contact Duane Wessels <wessels@measurement-factory.com>
联系Duane Wessels<wessels@measurement-factory.com>
This memo does not raise security considerations in itself. See the individual submissions for details of security concerns and issues.
本备忘录本身并不涉及安全问题。有关安全顾虑和问题的详细信息,请参阅个人提交的资料。
References
工具书类
[1] Paxson, V., Allman, M., Dawson, S., Fenner, W., Griner, J., Heavens, I., Lahey, K., Semke, J. and B. Volz, "Known TCP Implementation Problems", RFC 2525, March 1999.
[1] Paxson,V.,Allman,M.,Dawson,S.,Fenner,W.,Griner,J.,天堂,I.,Lahey,K.,Semke,J.和B.Volz,“已知的TCP实现问题”,RFC 25251999年3月。
[2] Cooper, I., Melve, I. and G. Tomlinson, "Internet Web Replication and Caching Taxonomy", RFC 3040, January 2001.
[2] Cooper,I.,Melve,I.和G.Tomlinson,“互联网Web复制和缓存分类”,RFC3040,2001年1月。
[3] Fielding, R., Gettys, J., Mogul, J., Frystyk, H., Masinter, L., Leach, P. and T. Berners-Lee, "Hypertext Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999.
[3] 菲尔丁,R.,盖蒂斯,J.,莫卧儿,J.,弗莱斯蒂克,H.,马斯特,L.,利奇,P.和T.伯纳斯李,“超文本传输协议——HTTP/1.1”,RFC2616,1999年6月。
[4] Wessels, D. and K. Claffy, "Internet Cache Protocol (ICP), Version 2", RFC 2186, September 1997.
[4] Wessels,D.和K.Claffy,“互联网缓存协议(ICP),第2版”,RFC2186,1997年9月。
[5] Davison, B., "Web Traffic Logs: An Imperfect Resource for Evaluation", in Proceedings of the Ninth Annual Conference of the Internet Society (INET'99), July 1999.
[5] Davidson,B.,“网络流量日志:一种不完善的评估资源”,载于互联网协会第九届年会论文集(INET'99),1999年7月。
[6] Melve, I., "Relation Analysis, Cache Meshes", in Proceedings of the 3rd International WWW Caching Workshop, June 1998, <http://wwwcache.ja.net/events/workshop/29/magicnumber.html>.
[6] Melve,I.,“关系分析,缓存网格”,第三届国际WWW缓存研讨会论文集,1998年6月<http://wwwcache.ja.net/events/workshop/29/magicnumber.html>.
[7] Krishnamurthy, B. and M. Arlitt, "PRO-COW: Protocol Compliance on the Web", AT&T Labs Technical Report #990803-05-TM, August 1999, <http://www.research.att.com/~bala/papers/procow-1.ps.gz>.
[7] Krishnamurthy,B.和M.Arlett,“PRO-COW:网络协议合规性”,AT&T实验室技术报告#990803-05-TM,1999年8月<http://www.research.att.com/~bala/papers/procow-1.ps.gz>。
[8] Netscape, Inc., "Navigator Proxy Auto-Config File Format", March 1996, <http://home.netscape.com/eng/mozilla/2.0/relnotes/demo/proxy-live.html>.
[9] Mogul, J., Krishnamurthy, B., Douglis, F., Feldmann, A., Goland, Y., van Hoff, A. and D. Hellerstein, "HTTP Delta in HTTP", Work in Progress.
[9] 莫卧儿,J.,克里希那穆尔西,B.,杜利斯,F.,费尔德曼,A.,戈兰,Y.,范霍夫,A.和D.海勒斯坦,“HTTP中的HTTP增量”,正在进行中。
Authors' Addresses
作者地址
Ian Cooper Equinix, Inc. 2450 Bayshore Parkway Mountain View, CA 94043 USA
Ian Cooper Equinix,Inc.美国加利福尼亚州海岸公园山景城2450号,邮编94043
Phone: +1 650 316 6065 EMail: icooper@equinix.com
John Dilley Akamai Technologies, Inc. 1400 Fashion Island Blvd Suite 703 San Mateo, CA 94404 USA
John Dilley Akamai Technologies,Inc.美国加利福尼亚州圣马特奥市时尚岛大道1400号703套房,邮编94404
Phone: +1 650 627 5244 EMail: jad@akamai.com
The following sub-sections are an archive of problems identified in the initial production of this memo. These are typically problems requiring further work/research, or user education. They are included here for reference purposes only.
以下小节是本备忘录初始制作过程中发现的问题的存档。这些问题通常需要进一步的工作/研究或用户教育。此处所列内容仅供参考。
Name Cannot specify multiple URIs for replicated resources
名称不能为复制的资源指定多个URI
Classification Architecture
分类体系结构
Description There is no way to specify that multiple URIs may be used for a single resource, one for each replica of the resource. Similarly, there is no way to say that some set of proxies (each identified by a URI) may be used to resolve a URI.
说明无法指定单个资源可以使用多个URI,资源的每个副本可以使用一个URI。类似地,也没有办法说可以使用某组代理(每个代理由URI标识)来解析URI。
Significance Medium
重要性媒介
Implications Forces users to understand the replication model and mechanism. Makes it difficult to create a replication framework without protocol support for replication and naming.
含义迫使用户理解复制模型和机制。如果没有对复制和命名的协议支持,则很难创建复制框架。
Indications Inherent in HTTP/1.0, HTTP/1.1.
HTTP/1.0、HTTP/1.1中固有的指示。
Solution(s) Architectural - protocol design is necessary.
解决方案架构-协议设计是必要的。
Workaround Replication mechanisms force users to locate a replica or mirror site for replicated content.
变通复制机制强制用户定位复制内容的副本或镜像站点。
Contact Daniel LaLiberte <liberte@w3.org>
联系Daniel Lalibert<liberte@w3.org>
Name Replica distance is unknown
名称副本距离未知
Classification Architecture
分类体系结构
Description There is no recommended way to find out which of several servers or proxies is closer either to the requesting client or to another machine, either geographically or in the network topology.
说明:没有推荐的方法可以确定几个服务器或代理中的哪一个在地理位置或网络拓扑中更靠近请求客户端或另一台机器。
Significance Medium
重要性媒介
Implications Clients must guess which replica is closer to them when requesting a copy of a document that may be served from multiple locations. Users must know the set of servers that can serve a particular object. This in general is hard to determine and maintain. Users must understand network topology in order to choose the closest copy. Note that the closest copy is not always the one that will result in quickest service. A nearby but heavily loaded server may be slower than a more distant but lightly loaded server.
暗示客户在请求可以从多个位置提供的文档副本时,必须猜测哪个副本离他们更近。用户必须知道可以为特定对象提供服务的服务器集。这通常很难确定和维持。用户必须了解网络拓扑,才能选择最接近的副本。请注意,最接近的副本并不总是能够提供最快服务的副本。邻近但负载较重的服务器可能比较远但负载较轻的服务器慢。
Indications Inherent in HTTP/1.0, HTTP/1.1.
HTTP/1.0、HTTP/1.1中固有的指示。
Solution(s) Architectural - protocol work is necessary. This is a specific instance of a general problem in widely distributed systems. A general solution is unlikely; however, a specific solution in the web context is possible.
解决方案体系结构-协议工作是必要的。这是广泛分布系统中一般问题的一个具体实例。一般的解决方案是不可能的,但是在web环境中有一个特定的解决方案是可能的。
Workaround Servers can (many do) provide location hints in a replica selection web page. Users choose one based upon their location. Users can learn which replica server gives them best performance. Note that the closest replica geographically is not necessarily the closest in terms of network topology. Expecting users to understand network topology is unreasonable.
变通服务器可以(很多)在副本选择网页中提供位置提示。用户根据其位置选择一个。用户可以了解哪个副本服务器提供了最佳性能。请注意,就网络拓扑而言,地理位置最近的副本不一定是最近的副本。期望用户了解网络拓扑是不合理的。
Contact Daniel LaLiberte <liberte@w3.org>
联系Daniel Lalibert<liberte@w3.org>
Name Proxy resource location
名称代理资源位置
Classification Architecture
分类体系结构
Description There is no way for a client or server (including another proxy) to inform a proxy of an alternate address (perhaps including the proxy to use to reach that address) to use to fetch a resource. If the client does not trust where the redirected resource came from, it may need to validate it or validate where it came from.
说明客户端或服务器(包括另一个代理)无法通知代理用于获取资源的备用地址(可能包括用于访问该地址的代理)。如果客户端不信任重定向资源的来源,则可能需要验证它或验证它的来源。
Significance Medium
重要性媒介
Implications Proxies have no systematic way to locate resources within other proxies or origin servers. This makes it more difficult to share information among proxies. Information sharing would improve global efficiency.
暗示代理没有系统的方法来定位其他代理或源服务器中的资源。这使得在代理之间共享信息变得更加困难。信息共享将提高全球效率。
Indications Inherent in HTTP/1.0, HTTP/1.1.
HTTP/1.0、HTTP/1.1中固有的指示。
Solution(s) Architectural - protocol design is necessary.
解决方案架构-协议设计是必要的。
Workaround Certain proxies share location hints in the form of summary digests of their contents (e.g., Squid). Certain proxy protocols enable a proxy to query another for its contents (e.g., ICP). (See however the "ICP Performance" issue (Section 2.2.5).)
解决方法某些代理以其内容摘要的形式共享位置提示(例如Squid)。某些代理协议允许代理查询另一个代理的内容(例如ICP)。(但请参见“ICP绩效”问题(第2.2.5节)。)
Contact Daniel LaLiberte <liberte@w3.org>
联系Daniel Lalibert<liberte@w3.org>
Name Use of Cache-Control headers
缓存控制头的名称使用
Classification Implementation
分类实施
Description Many (if not most) implementations incorrectly interpret Cache-Control response headers.
说明许多(如果不是大多数)实现错误地解释缓存控制响应头。
Significance High
显著性高
Implications Cache-Control headers will be spurned by end users if there are conflicting or non-standard implementations.
如果存在冲突或非标准实现,最终用户将拒绝使用缓存控制头。
Indications -
迹象-
Solution(s) Work with vendors and others to ensure proper application.
解决方案与供应商和其他人合作,以确保正确应用
Workaround None.
无解决方法。
Contact Mark Nottingham <mnot@mnot.net>
联系马克·诺丁汉<mnot@mnot.net>
Name Lack of HTTP/1.1 compliance for caching proxies
名称缓存代理缺少HTTP/1.1遵从性
Classification Implementation
分类实施
Description Although performance benchmarking of caches is starting to be explored, protocol compliance is just as important.
描述虽然缓存的性能基准测试已经开始探索,但协议遵从性也同样重要。
Significance High
显著性高
Implications Caching proxy vendors implement their interpretation of the specification; because the specification is very large, sometimes vague and ambiguous, this can lead to inconsistent behavior between caching proxies.
代理供应商实施其对规范的解释;由于规范非常大,有时含糊不清,这可能导致缓存代理之间的行为不一致。
Caching proxies need to comply with the specification (or the specification needs to change).
缓存代理需要符合规范(或者规范需要更改)。
Indications There is no currently known compliance test being used.
指示目前未使用已知的符合性测试。
There is work underway to quantify how closely servers comply with the current specification. A joint technical report between AT&T and HP Labs [7] describes the compliance testing. This report examines how well each of a set of top traffic-producing sites supports certain HTTP/1.1 features.
目前正在对服务器与当前规范的符合程度进行量化。AT&T和HP实验室之间的联合技术报告[7]描述了合规性测试。本报告检查了一组顶级流量生成站点对某些HTTP/1.1功能的支持程度。
The Measurement Factory (formerly IRCache) is working to develop protocol compliance testing software. Running such a conformance test suite against caching proxy products would measure compliance and ultimately would help assure they comply with the specification.
测量工厂(前身为IRCache)正在开发协议符合性测试软件。对缓存代理产品运行这样一个一致性测试套件将度量一致性,并最终帮助确保它们符合规范。
Solution(s) Testing should commence and be reported in an open industry forum. Proxy implementations should conform to the specification.
解决方案测试应开始,并在开放式行业论坛上报告。代理实现应该符合规范。
Workaround There is no workaround for non-compliance.
解决办法不存在不合规问题的解决办法。
Contact Mark Nottingham <mnot@mnot.net> Duane Wessels <wessels@measurement-factory.com>
Name ETag support
名称ETag支持
Classification Implementation
分类实施
Description Available caching proxies appear not to support ETag (strong) validation.
说明可用缓存代理似乎不支持ETag(强)验证。
Significance Medium
重要性媒介
Implications Last-Modified/If-Modified-Since validation is inappropriate for many requirements, both because of its weakness and its use of dates. Lack of a usable, strong coherency protocol leads developers and end users not to trust caches.
影响上次修改/如果修改,因为验证不适合于许多需求,因为它的弱点和日期的使用。缺乏可用的、强一致性的协议导致开发人员和最终用户不信任缓存。
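The weakness of date-based validation can be shown with two versions of an object written within the same second. The sketch below is a simplification: it compares a single entity tag and ignores weak tags and lists in If-None-Match. The function names and tag values are hypothetical.

```python
from email.utils import formatdate, parsedate_to_datetime


def unchanged_by_date(last_modified, if_modified_since):
    """Date validator: HTTP dates carry only one-second resolution."""
    return (parsedate_to_datetime(last_modified)
            <= parsedate_to_datetime(if_modified_since))


def unchanged_by_etag(current_etag, if_none_match):
    """Strong validator: opaque tags must match exactly."""
    return current_etag == if_none_match


# Two versions written within the same second share one Last-Modified value
# (an arbitrary timestamp is used here) but carry different entity tags.
stamp = formatdate(993945600, usegmt=True)
cached_etag, current_etag = '"v1"', '"v2"'

print(unchanged_by_date(stamp, stamp))               # stale copy is accepted
print(unchanged_by_etag(current_etag, cached_etag))  # change is detected
```

A cache that validates only by date would keep serving the first version in this case, whereas a cache that validates by entity tag would fetch the second.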
Indications -
迹象-
Solution(s) Work with vendors to implement ETags; work for better validation protocols.
解决方案与供应商合作实施ETAG;为更好的验证协议而工作。
Workaround Use Last-Modified/If-Modified-Since validation.
解决方法使用上次修改/如果自验证以来已修改。
Contact Mark Nottingham <mnot@mnot.net>
联系马克·诺丁汉<mnot@mnot.net>
Name Servers and content should be optimized for caching
应针对缓存优化名称服务器和内容
Classification Implementation (Performance)
分类实施(绩效)
Description Many web servers and much web content could be implemented to be more conducive to caching, reducing bandwidth demand and page load delay.
说明可以实现许多web服务器和许多web内容,以更有利于缓存,减少带宽需求和页面加载延迟。
Significance Medium
重要性媒介
Implications By making poor use of caches, origin servers encourage longer load times, greater load on caching proxies, and increased network demand.
由于缓存使用不当,源服务器鼓励更长的加载时间、缓存代理上更大的负载以及更大的网络需求。
Indications The problem is most apparent for pages that have low or zero expires time, yet do not change.
指示对于过期时间较短或为零但未更改的页面,问题最为明显。
Solution(s) -
解决方案-
Workaround Servers could start using unique object identifiers for write-once content: if an object changes it gets a new name, otherwise it is considered to be immutable and therefore has an infinite expiry age. Certain hosting providers do this already.
解决方案服务器可以开始为只写内容使用唯一的对象标识符:如果对象发生更改,它将获得一个新名称,否则它将被视为不可变的,因此具有无限的过期期限。某些主机提供商已经这样做了。
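One common way to give changed objects a new name is to embed a hash of the content in the name. The memo does not prescribe a naming scheme, so the sketch below is only one possible approach; the function name, the choice of SHA-256 and the 12-character truncation are assumptions.

```python
import hashlib


def versioned_name(path, content):
    """Derive an object name that changes whenever the content changes."""
    digest = hashlib.sha256(content).hexdigest()[:12]
    stem, dot, ext = path.rpartition(".")
    # Keep the extension last so that media type mapping still works.
    return f"{stem}.{digest}{dot}{ext}" if dot else f"{path}.{digest}"


print(versioned_name("logo.gif", b"first version of the image"))
print(versioned_name("logo.gif", b"second version of the image"))
```

Because a given name never refers to different content, responses for such names can be served with a very long expiry time, and pages that reference the object pick up the new name when the object changes.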
Contact Peter Danzig
联系Peter Danzig
Name Lack of fine-grained, standardized hierarchy controls
名称缺少细粒度、标准化的层次结构控件
Classification Administration
分类管理
Description There is no standard for instructing a proxy as to how it should select the parent from which to fetch a given object. Implementations therefore vary greatly, and it can be difficult to make them interoperate correctly in a complex environment.
说明没有标准指示代理如何解析父对象以从中获取给定对象。因此,实现差异很大,很难使它们在复杂环境中正确地互操作。
Significance Medium
重要性媒介
Implications Complications in deployment of caches in a complex network (especially corporate networks)
在复杂网络(尤其是公司网络)中部署缓存的复杂性
Indications Inability of some proxies to be configured to direct traffic based on domain name, reverse lookup IP address, raw IP address, in normal operation and in failover mode. Inability in some proxies to set a preferred parent / backup parent configuration.
指示某些代理无法配置为在正常操作和故障切换模式下基于域名、反向查找IP地址、原始IP地址定向流量。某些代理无法设置首选父/备份父配置。
Solution(s) -
解决方案-
Workaround Work with vendors to establish an acceptable configuration within the limits of their product; standardize on one product.
与供应商合作,在其产品范围内建立可接受的配置;对一种产品进行标准化。
Contact Mark Nottingham <mnot@mnot.net>
联系马克·诺丁汉<mnot@mnot.net>
Name Proxy/Server exhaustive log format standard for analysis
用于分析的名称代理/服务器详尽日志格式标准
Classification Administration
分类管理
Description Most proxy or origin server logs used for characterization or evaluation do not provide sufficient detail to determine cacheability of responses.
描述用于表征或评估的大多数代理或源服务器日志没有提供足够的详细信息来确定响应的可缓存性。
Significance Low (for operationality; high significance for research efforts)
重要性低(对于操作性;对于研究工作的重要性高)
Implications Characterizations and simulations are based on non-representative workloads.
含义表征和模拟基于非代表性工作负载。
See Also W3C Web Characterization Activity, since they are also concerned with collecting high quality logs and building characterizations from them.
另请参见W3C Web角色化活动,因为他们还关注收集高质量日志并从中构建角色化。
Indications -
迹象-
Solution(s) To properly clean the log and to accurately determine cacheability of responses, a complete log is required (including all request headers, such as "User-Agent" [for removal of spiders], as well as all response headers, such as "Expires", "Set-Cookie", and "Cache-Control" with directives like "max-age" and "no-cache").
解决方案为了正确清理和准确确定响应的可缓存性,需要一个完整的日志(包括所有请求头以及所有响应头,如“用户代理”[用于删除爬虫]和“过期”、“最大年龄”、“设置cookie”、“无缓存”等)
Workaround -
变通办法-
References See "Web Traffic Logs: An Imperfect Resource for Evaluation"[5] for some discussion of this.
参考文献请参阅“Web流量日志:不完善的评估资源”[5]以了解对此的一些讨论。
Contact Brian D. Davison <davison@acm.org> Terence Kelly <tpkelly@eecs.umich.edu>
Name Trace log timestamps
名称跟踪日志时间戳
Classification Administration
分类管理
Description Some proxies/servers log requests without sufficient timing detail. Millisecond resolution is often too coarse to preserve request ordering. Servers should record either request reception time in addition to completion time, or elapsed time plus one of the two.
说明一些代理/服务器记录请求,但没有足够的时间细节。毫秒分辨率通常太小,无法保存请求顺序,服务器除了记录完成时间外,还应记录请求接收时间,或者记录经过的时间加上其中一个。
Significance Low (for operationality; medium significance for research efforts)
低显著性(对于操作性;对于研究工作中等显著性)
Implications Characterization and simulation fidelity is improved with accurate timing and ordering information. Since logs are generally written in order of request completion, these logs cannot be re-played without knowing request generation times and reordering accordingly.
通过精确的定时和排序信息,提高了表征和模拟保真度。由于日志通常是按照请求完成的顺序编写的,因此在不知道请求生成时间并相应地重新排序的情况下,无法重新播放这些日志。
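When a log records completion time and elapsed time, the reception time can be reconstructed and the entries reordered for replay. The sketch below assumes a hypothetical log record with those two fields expressed in seconds; real log formats differ in field names and units.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    completed_at: float   # seconds since the epoch, request completion
    elapsed: float        # seconds between reception and completion
    url: str


def replay_order(entries):
    """Sort log entries by reconstructed request reception time."""
    return sorted(entries, key=lambda e: e.completed_at - e.elapsed)


# Entries as written to the log, in order of completion.
log = [
    Entry(100.120, 0.020, "/logo.gif"),    # embedded object, finishes first
    Entry(100.300, 0.250, "/index.html"),  # container, received earlier
]
print([e.url for e in replay_order(log)])
```

In this example the container object was requested before the embedded object, although it appears later in the log.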
See Also -
另见-
Indications Timestamps can be identical for multiple entries (when only millisecond resolution is used). Request orderings can be jumbled when clients open additional connections for embedded objects while still receiving the container object.
对于多个条目,指示时间戳可以相同(当仅使用毫秒分辨率时)。当客户端在仍然接收容器对象的情况下为嵌入对象打开其他连接时,请求排序可能会混乱。
Solution(s) Since request completion time is common (e.g., Squid), recommend continuing to use it (with microsecond resolution if possible) plus recording elapsed time since request reception.
解决方案由于请求完成时间很常见(例如Squid),建议继续使用它(如果可能,分辨率为微秒),并记录自收到请求以来经过的时间。
Workaround -
变通办法-
References See "Web Traffic Logs: An Imperfect Resource for Evaluation"[5] for some discussion of this.
参考文献请参阅“Web流量日志:不完善的评估资源”[5]以了解对此的一些讨论。
Contact Brian D. Davison <davison@acm.org>
联系Brian D.Davison<davison@acm.org>
Name Exchange format for log summaries
日志摘要的名称交换格式
Classification Administration/Analysis?
分类管理/分析?
Description Although we have (more or less) a standard log file format for proxies (plain vanilla Common Logfile and Squid), there isn't a commonly accepted format for summaries of those log files. Summaries could be generated by the cache itself, or by post-processing existing log file formats such as Squid's.
描述尽管我们有(或多或少)一个标准的代理日志文件格式(普通的普通日志文件和Squid),但对于这些日志文件的摘要,还没有一个普遍接受的格式。摘要可以由缓存本身生成,也可以通过对现有日志文件格式(如Squid)进行后处理生成。
Significance High, since it means that each log file summarizing/analysis tool is essentially reinventing the wheel (unnecessary repetition of code), and the cost of processing a large number of large log files through a variety of analysis tools is (again for no good reason) excessive.
重要程度很高,因为这意味着每个日志文件汇总/分析工具基本上都在重新发明轮子(不必要的代码重复),并且通过各种分析工具处理大量大型日志文件的成本(同样没有充分的理由)过高。
Implications In order to perform a meaningful analysis (e.g., to measure performance in relation to loading/configuration over time) of the access logs from multiple busy caches, it's often necessary to run first one tool then another, each against the entire log file (or a significantly large subset of the log). With log files running into hundreds of MB even after compression (for a cache dealing with millions of transactions per day) this is a non-trivial task.
含义为了对多个繁忙缓存中的访问日志执行有意义的分析(例如,测量与随时间加载/配置相关的性能),通常需要先运行一个工具,然后再运行另一个工具,每个工具都针对整个日志文件(或日志的很大子集)。即使在压缩之后,日志文件也会达到数百MB(对于每天处理数百万事务的缓存而言),这是一项非常重要的任务。
See Also IP packet/header sniffing - it may be that recording individual transactions is a level of granularity that simply isn't sensible to attempt on extremely busy caches. There may also be legal implications in some countries, e.g., if this analysis identifies individuals.
另请参见IP数据包/报头嗅探-可能是单个事务处于粒度级别,而在非常繁忙的缓存上进行尝试是不明智的。在某些国家也可能存在法律影响,例如,如果该分析确定了个人。
Indications Disks/memory full(!) Stats (using multiple programs) take too long to run. Stats crunching must be distributed out to multiple machines because of its high computational cost.
指示磁盘/内存已满(!)状态(使用多个程序)运行时间过长。统计数据处理必须分发到多台机器,因为它的计算成本很高。
Solution(s) Have the proxy produce a standardized summary of its activity either automatically or via an external (e.g., third party) tool, in a commonly agreed format. The format could be something like XML or the Extended Common Logfile, but the format and contents are subjects for discussion. Ideally this approach would permit individual cache server products to supply subsets of the possible summary info, since it may not be feasible for all servers to provide all of the information which people would like to see.
解决方案让代理自动或通过外部(如第三方)工具以共同商定的格式生成其活动的标准化摘要。格式可以类似于XML或扩展的通用日志文件,但格式和内容是讨论的主题。理想情况下,这种方法将允许单个缓存服务器产品提供可能的摘要信息子集,因为并非所有服务器都能提供人们希望看到的所有信息。
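As an illustration of what a machine-readable summary might look like, the sketch below aggregates access log records by result code and emits XML. No summary format has been agreed, so the element and attribute names are hypothetical, and the input is assumed to have been reduced to (result code, bytes sent) pairs by an earlier parsing step.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def summarize(records):
    """records: iterable of (result_code, bytes_sent) pairs from an access log."""
    requests, volume = Counter(), Counter()
    for result, size in records:
        requests[result] += 1
        volume[result] += size
    # Element and attribute names are illustrative, not an agreed format.
    root = ET.Element("summary", requests=str(sum(requests.values())))
    for result in sorted(requests):
        ET.SubElement(root, "result", code=result,
                      requests=str(requests[result]), bytes=str(volume[result]))
    return ET.tostring(root, encoding="unicode")


print(summarize([("TCP_HIT", 100), ("TCP_MISS", 500), ("TCP_HIT", 50)]))
```

A summary of this kind is a small fraction of the size of the log it describes, and a server that cannot supply every statistic can simply omit the corresponding elements.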
Workaround Devise a private summary format for your own personal use - but this complicates or even precludes the exchange of summary info with other interested parties.
解决方法:设计一种私人摘要格式供您个人使用,但这会使摘要信息与其他相关方的交流变得复杂,甚至无法进行。
References See the web pages for the commonly used cache stats analysis programs, e.g., Calamaris, squidtimes, squidclients, etc.
参考资料请参阅常用缓存统计分析程序的网页,例如Calamaris、squidtimes、SquidClient等。
Contact Martin Hamilton <martin@wwwcache.ja.net>
联系马丁·汉密尔顿<martin@wwwcache.ja.net>
Full Copyright Statement
完整版权声明
Copyright (C) The Internet Society (2001). All Rights Reserved.
版权所有(C)互联网协会(2001年)。版权所有。
This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English.
本文件及其译本可复制并提供给他人,对其进行评论或解释或协助其实施的衍生作品可全部或部分编制、复制、出版和分发,不受任何限制,前提是上述版权声明和本段包含在所有此类副本和衍生作品中。但是,不得以任何方式修改本文件本身,例如删除版权通知或对互联网协会或其他互联网组织的引用,除非出于制定互联网标准的需要,在这种情况下,必须遵循互联网标准过程中定义的版权程序,或根据需要将其翻译成英语以外的其他语言。
The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assigns.
上述授予的有限许可是永久性的,互联网协会或其继承人或受让人不会撤销。
This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
本文件和其中包含的信息是按“原样”提供的,互联网协会和互联网工程任务组否认所有明示或暗示的保证,包括但不限于任何保证,即使用本文中的信息不会侵犯任何权利,或对适销性或特定用途适用性的任何默示保证。
Acknowledgement
确认
Funding for the RFC Editor function is currently provided by the Internet Society.
RFC编辑功能的资金目前由互联网协会提供。