Internet Engineering Task Force (IETF)                         T. Talpey
Request for Comments: 5667                                  Unaffiliated
Category: Standards Track                                   B. Callaghan
ISSN: 2070-1721                                                    Apple
                                                            January 2010
        
Internet Engineering Task Force (IETF)                         T. Talpey
Request for Comments: 5667                                  Unaffiliated
Category: Standards Track                                   B. Callaghan
ISSN: 2070-1721                                                    Apple
                                                            January 2010
        

Network File System (NFS) Direct Data Placement

网络文件系统(NFS)直接数据放置

Abstract

摘要

This document defines the bindings of the various Network File System (NFS) versions to the Remote Direct Memory Access (RDMA) operations supported by the RPC/RDMA transport protocol. It describes the use of direct data placement by means of server-initiated RDMA operations into client-supplied buffers for implementations of NFS versions 2, 3, 4, and 4.1 over such an RDMA transport.

本文档定义了各种网络文件系统(NFS)版本与RPC/RDMA传输协议支持的远程直接内存访问(RDMA)操作的绑定。它描述了通过服务器启动的RDMA操作将数据直接放置到客户机提供的缓冲区中,以便通过这种RDMA传输实现NFS版本2、3、4和4.1。

Status of This Memo

关于下段备忘

This is an Internet Standards Track document.

这是一份互联网标准跟踪文件。

This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Further information on Internet Standards is available in Section 2 of RFC 5741.

本文件是互联网工程任务组(IETF)的产品。它代表了IETF社区的共识。它已经接受了公众审查,并已被互联网工程指导小组(IESG)批准出版。有关互联网标准的更多信息,请参见RFC 5741第2节。

Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at http://www.rfc-editor.org/info/rfc5667.

有关本文件当前状态、任何勘误表以及如何提供反馈的信息,请访问http://www.rfc-editor.org/info/rfc5667.

Copyright Notice

版权公告

Copyright (c) 2010 IETF Trust and the persons identified as the document authors. All rights reserved.

版权所有(c)2010 IETF信托基金和确定为文件作者的人员。版权所有。

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

本文件受BCP 78和IETF信托有关IETF文件的法律规定的约束(http://trustee.ietf.org/license-info)自本文件出版之日起生效。请仔细阅读这些文件,因为它们描述了您对本文件的权利和限制。从本文件中提取的代码组件必须包括信托法律条款第4.e节中所述的简化BSD许可证文本,并提供简化BSD许可证中所述的无担保。

This document may contain material from IETF Documents or IETF Contributions published or made publicly available before November 10, 2008. The person(s) controlling the copyright in some of this material may not have granted the IETF Trust the right to allow modifications of such material outside the IETF Standards Process. Without obtaining an adequate license from the person(s) controlling the copyright in such materials, this document may not be modified outside the IETF Standards Process, and derivative works of it may not be created outside the IETF Standards Process, except to format it for publication as an RFC or to translate it into languages other than English.

本文件可能包含2008年11月10日之前发布或公开的IETF文件或IETF贡献中的材料。控制某些材料版权的人员可能未授予IETF信托允许在IETF标准流程之外修改此类材料的权利。在未从控制此类材料版权的人员处获得充分许可的情况下,不得在IETF标准流程之外修改本文件,也不得在IETF标准流程之外创建其衍生作品,除了将其格式化以RFC形式发布或将其翻译成英语以外的其他语言。

Table of Contents

目录

   1. Introduction ....................................................2
      1.1. Requirements Language ......................................2
   2. Transfers from NFS Client to NFS Server .........................3
   3. Transfers from NFS Server to NFS Client .........................3
   4. NFS Versions 2 and 3 Mapping ....................................4
   5. NFS Version 4 Mapping ...........................................6
      5.1. NFS Version 4 Callbacks ....................................7
   6. Port Usage Considerations .......................................8
   7. Security Considerations .........................................9
   8. Acknowledgments .................................................9
   9. References ......................................................9
      9.1. Normative References .......................................9
      9.2. Informative References ....................................10
        
   1. Introduction ....................................................2
      1.1. Requirements Language ......................................2
   2. Transfers from NFS Client to NFS Server .........................3
   3. Transfers from NFS Server to NFS Client .........................3
   4. NFS Versions 2 and 3 Mapping ....................................4
   5. NFS Version 4 Mapping ...........................................6
      5.1. NFS Version 4 Callbacks ....................................7
   6. Port Usage Considerations .......................................8
   7. Security Considerations .........................................9
   8. Acknowledgments .................................................9
   9. References ......................................................9
      9.1. Normative References .......................................9
      9.2. Informative References ....................................10
        
1. Introduction
1. 介绍

The Remote Direct Memory Access (RDMA) Transport for Remote Procedure Call (RPC) [RFC5666] allows an RPC client application to post buffers in a Chunk list for specific arguments and results from an RPC call. The RDMA transport header conveys this list of client buffer addresses to the server where the application can associate them with client data and use RDMA operations to transfer the results directly to and from the posted buffers on the client. The client and server must agree on a consistent mapping of posted buffers to RPC. This document details the mapping for each version of the NFS protocol [RFC1094] [RFC1813] [RFC3530] [RFC5661].

远程过程调用(RPC)的远程直接内存访问(RDMA)传输[RFC5666]允许RPC客户端应用程序在区块列表中为RPC调用的特定参数和结果发布缓冲区。RDMA传输头将此客户机缓冲区地址列表传送到服务器,应用程序可将其与客户机数据关联,并使用RDMA操作将结果直接传输到客户机上的已发布缓冲区,或从中传输。客户端和服务器必须就已发布缓冲区到RPC的一致映射达成一致。本文档详细介绍了NFS协议[RFC1094][RFC1813][RFC3530][RFC5661]每个版本的映射。

1.1. Requirements Language
1.1. 需求语言

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

本文件中的关键词“必须”、“不得”、“必需”、“应”、“不应”、“应”、“不应”、“建议”、“可”和“可选”应按照[RFC2119]中所述进行解释。

2. Transfers from NFS Client to NFS Server
2. 从NFS客户端到NFS服务器的传输

The RDMA Read list, in the RDMA transport header, allows an RPC client to marshal RPC call data selectively. Large chunks of data, such as the file data of an NFS WRITE request, MAY be referenced by an RDMA Read list and be moved efficiently and directly placed by an RDMA Read operation initiated by the server.

RDMA传输头中的RDMA读取列表允许RPC客户端有选择地封送RPC调用数据。大数据块(如NFS写入请求的文件数据)可由RDMA读取列表引用,并可由服务器启动的RDMA读取操作高效、直接地移动和放置。

The process of identifying these chunks for the RDMA Read list can be implemented entirely within the RPC layer. It is transparent to the upper-level protocol, such as NFS. For instance, the file data portion of an NFS WRITE request can be selected as an RDMA "chunk" within the eXternal Data Representation (XDR) marshaling code of RPC based on a size criterion, independently of the NFS protocol layer. The XDR unmarshaling on the receiving system can identify the correspondence between Read chunks and protocol elements via the XDR position value encoded in the Read chunk entry.

为RDMA读取列表标识这些块的过程可以完全在RPC层中实现。它对上层协议(如NFS)是透明的。例如,NFS写入请求的文件数据部分可以根据大小标准在RPC的外部数据表示(XDR)封送代码中选择为RDMA“块”,与NFS协议层无关。接收系统上的XDR解组可以通过在读块条目中编码的XDR位置值来识别读块和协议元素之间的对应关系。

RPC RDMA Read chunks are employed by this NFS mapping to convey specific NFS data to the server in a manner that may be directly placed. The following sections describe this mapping for versions of the NFS protocol.

此NFS映射使用RPC RDMA读块,以直接放置的方式将特定NFS数据传送到服务器。以下各节介绍NFS协议版本的映射。

3. Transfers from NFS Server to NFS Client
3. 从NFS服务器到NFS客户端的传输

The RDMA Write list, in the RDMA transport header, allows the client to post one or more buffers into which the server will RDMA Write designated result chunks directly. If the client sends a null Write list, then results from the RPC call will be returned either as an inline reply, as chunks in an RDMA Read list of server-posted buffers, or in a client-posted reply buffer.

RDMA传输头中的RDMA写入列表允许客户端发布一个或多个缓冲区,服务器将直接向其中写入指定的结果块。如果客户端发送空写列表,则RPC调用的结果将作为内联应答、服务器发布缓冲区的RDMA读取列表中的块或客户端发布的应答缓冲区中的块返回。

Each posted buffer in a Write list is represented as an array of memory segments. This allows the client some flexibility in submitting discontiguous memory segments into which the server will scatter the result. Each segment is described by a triplet consisting of the segment handle or steering tag (STag), segment length, and memory address or offset.

写入列表中每个已发布的缓冲区都表示为一个内存段数组。这允许客户端在提交不连续的内存段时具有一定的灵活性,服务器将结果分散到这些内存段中。每个段由一个三元组描述,三元组由段句柄或转向标记(STag)、段长度和内存地址或偏移量组成。

      struct xdr_rdma_segment {
         uint32 handle;    /* Registered memory handle */
         uint32 length;    /* Length of the chunk in bytes */
         uint64 offset;    /* Chunk virtual address or offset */
      };
        
      struct xdr_rdma_segment {
         uint32 handle;    /* Registered memory handle */
         uint32 length;    /* Length of the chunk in bytes */
         uint64 offset;    /* Chunk virtual address or offset */
      };
        
      struct xdr_write_chunk {
         struct xdr_rdma_segment target<>;
      };
        
      struct xdr_write_chunk {
         struct xdr_rdma_segment target<>;
      };
        
      struct xdr_write_list {
         struct xdr_write_chunk entry;
         struct xdr_write_list  *next;
      };
        
      struct xdr_write_list {
         struct xdr_write_chunk entry;
         struct xdr_write_list  *next;
      };
        

The sum of the segment lengths yields the total size of the buffer, which MUST be large enough to accept the result. If the buffer is too small, the server MUST return an XDR encode error. The server MUST return the result data for a posted buffer by progressively filling its segments, perhaps leaving some trailing segments unfilled or partially full if the size of the result is less than the total size of the buffer segments.

段长度之和产生缓冲区的总大小,缓冲区必须足够大才能接受结果。如果缓冲区太小,服务器必须返回XDR编码错误。服务器必须通过逐步填充其段来返回已发布缓冲区的结果数据,如果结果的大小小于缓冲区段的总大小,则可能会留下一些未填充的尾随段或部分已满的尾随段。

The server returns the RDMA Write list to the client with the segment length fields overwritten to indicate the amount of data RDMA written to each segment. Results returned by direct placement MUST NOT be returned by other methods, e.g., by Read chunk list or inline. If no result data at all is returned for the element, the server places no data in the buffer(s), but does return zeros in the segment length fields corresponding to the result.

服务器将RDMA写入列表返回给客户端,并覆盖段长度字段,以指示RDMA写入每个段的数据量。通过直接放置返回的结果不能通过其他方法返回,例如,通过读取区块列表或内联。如果没有为元素返回任何结果数据,则服务器不会在缓冲区中放置任何数据,但会在与结果相对应的段长度字段中返回零。

The RDMA Write list allows the client to provide multiple result buffers -- each buffer maps to a specific result in the reply. The NFS client and server implementations agree by specifying the mapping of results to buffers for each RPC procedure. The following sections describe this mapping for versions of the NFS protocol.

RDMA写列表允许客户端提供多个结果缓冲区——每个缓冲区映射到应答中的特定结果。NFS客户端和服务器实现通过为每个RPC过程指定结果到缓冲区的映射来达成一致。以下各节介绍NFS协议版本的映射。

Through the use of RDMA Write lists in NFS requests, it is not necessary to employ the RDMA Read lists in the NFS replies, as described in the RPC/RDMA protocol. This enables more efficient operation, by avoiding the need for the server to expose buffers for RDMA, and also avoiding "RDMA_DONE" exchanges. Clients MAY additionally employ RDMA Reply chunks to receive entire messages, as described in [RFC5666].

通过在NFS请求中使用RDMA写列表,无需在NFS应答中使用RDMA读列表,如RPC/RDMA协议中所述。通过避免服务器需要为RDMA公开缓冲区,以及避免“RDMA_DONE”交换,这可以实现更高效的操作。如[RFC5666]中所述,客户端还可以使用RDMA应答块来接收整个消息。

4. NFS Versions 2 and 3 Mapping
4. NFS版本2和3映射

A single RDMA Write list entry MAY be posted by the client to receive either the opaque file data from a READ request or the pathname from a READLINK request. The server MUST ignore a Write list for any other NFS procedure, as well as any Write list entries beyond the first in the list.

客户端可以发布单个RDMA写列表条目,以从读取请求接收不透明文件数据或从读取链接请求接收路径名。服务器必须忽略任何其他NFS过程的写入列表,以及列表中第一个之外的任何写入列表项。

Similarly, a single RDMA Read list entry MAY be posted by the client to supply the opaque file data for a WRITE request or the pathname for a SYMLINK request. The server MUST ignore any Read list for other NFS procedures, as well as additional Read list entries beyond the first in the list.

类似地,客户端可以发布单个RDMA读取列表条目,为写入请求提供不透明文件数据,或为符号链接请求提供路径名。服务器必须忽略其他NFS过程的任何读取列表,以及列表中第一个之外的其他读取列表项。

Because there are no NFS version 2 or 3 requests that transfer bulk data in both directions, it is not necessary to post requests containing both Write and Read lists. Any unneeded Read or Write lists are ignored by the server.

由于不存在双向传输大容量数据的NFS版本2或3请求,因此无需发布同时包含写列表和读列表的请求。服务器将忽略任何不需要的读或写列表。

In the case where the outgoing request or expected incoming reply is larger than the maximum size supported on the connection, it is possible for the RPC layer to post the entire message or result in a special "RDMA_NOMSG" message type that is transferred entirely by RDMA. This is implemented in RPC, below NFS, and therefore has no effect on the message contents.

在传出请求或预期传入应答大于连接上支持的最大大小的情况下,RPC层可以发布整个消息,或生成完全由RDMA传输的特殊“RDMA_NOMSG”消息类型。这是在NFS下面的RPC中实现的,因此对消息内容没有影响。

Non-RDMA (inline) WRITE transfers MAY OPTIONALLY employ the "RDMA_MSGP" padding method described in the RPC/RDMA protocol, if the appropriate value for the server is known to the client. Padding allows the opaque file data to arrive at the server in an aligned fashion, which may improve server performance.

如果客户端知道服务器的适当值,则非RDMA(内联)写传输可以选择性地使用RPC/RDMA协议中描述的“RDMA_MSGP”填充方法。填充允许不透明文件数据以对齐方式到达服务器,这可能会提高服务器性能。

The NFS version 2 and 3 protocols are frequently limited in practice to requests containing less than or equal to 8 kilobytes and 32 kilobytes of data, respectively. In these cases, it is often practical to support basic operation without employing a configuration exchange as discussed in [RFC5666]. The server MUST post buffers large enough to receive the largest possible incoming message (approximately 12 KB for NFS version 2, or 36 KB for NFS version 3, would be vastly sufficient), and the client can post buffers large enough to receive replies based on the "rsize" it is using to the server, plus a fixed overhead for the RPC and NFS headers. Because the server MUST NOT return data in excess of this size, the client can be assured of the adequacy of its posted buffer sizes.

NFS版本2和3协议在实践中通常仅限于分别包含小于或等于8KB和32KB数据的请求。在这些情况下,支持基本操作而不采用[RFC5666]中讨论的配置交换通常是可行的。服务器必须发布足够大的缓冲区,以接收可能的最大传入消息(对于NFS版本2,大约12 KB,对于NFS版本3,大约36 KB就足够了),并且客户端可以发布足够大的缓冲区,以接收基于其对服务器使用的“rsize”的回复,加上RPC和NFS头的固定开销。由于服务器返回的数据不得超过此大小,因此可以确保客户端的已发布缓冲区大小足够。

Flow control is handled dynamically by the RPC RDMA protocol, and write padding is OPTIONAL and therefore MAY remain unused.

流控制由RPC RDMA协议动态处理,写填充是可选的,因此可能会保持未使用状态。

Alternatively, if the server is administratively configured to values appropriate for all its clients, the same assurance of interoperability within the domain can be made.

或者,如果服务器在管理上配置为适合其所有客户机的值,则可以在域内对互操作性做出相同的保证。

The use of a configuration protocol with NFS v2 and v3 is therefore OPTIONAL. Employing a configuration exchange may allow some advantage to server resource management through accurately sizing buffers, enabling the server to know exactly how many RDMA Reads may be in progress at once on the client connection, and enabling client write padding, which may be desirable for certain servers when RDMA Read is impractical.

因此,在NFS v2和v3中使用配置协议是可选的。通过精确调整缓冲区大小、使服务器能够准确地知道在客户端连接上一次有多少RDMA读取正在进行,以及启用客户端写填充(当RDMA读取不切实际时,这对于某些服务器可能是可取的),采用配置交换可以为服务器资源管理带来一些优势。

5. NFS Version 4 Mapping
5. NFS版本4映射

This specification applies to the first minor version of NFS version 4 (NFSv4.0) and any subsequent minor versions that do not override this mapping.

本规范适用于NFS第4版(NFSv4.0)的第一个次要版本以及不覆盖此映射的任何后续次要版本。

The Write list MUST be considered only for the COMPOUND procedure. This procedure returns results from a sequence of operations. Only the opaque file data from an NFS READ operation and the pathname from a READLINK operation MUST utilize entries from the Write list.

必须仅针对复合过程考虑写入列表。此过程返回一系列操作的结果。只有来自NFS读取操作的不透明文件数据和来自READLINK操作的路径名必须使用写入列表中的条目。

If there is no Write list, i.e., the list is null, then any READ or READLINK operations in the COMPOUND MUST return their data inline. The NFSv4.0 client MUST ensure in this case that any result of its READ and READLINK requests will fit within its receive buffers, in order to avoid a resulting RDMA transport error upon transfer. The server is not required to detect this.

如果没有写列表,即列表为空,则化合物中的任何读或读链接操作都必须以内联方式返回其数据。在这种情况下,NFSv4.0客户机必须确保其读取和读取链接请求的任何结果都适合其接收缓冲区,以避免传输时产生RDMA传输错误。不需要服务器来检测此情况。

The first entry in the Write list MUST be used by the first READ or READLINK in the COMPOUND request. The next Write list entry is used by the next READ or READLINK, and so on. If there are more READ or READLINK operations than Write list entries, then any remaining operations MUST return their results inline.

复合请求中的第一个读取或读取链接必须使用写入列表中的第一个条目。下一个写入列表条目由下一个读取或读取链接使用,依此类推。如果读或读链接操作多于写列表项,则任何剩余操作都必须内联返回其结果。

If a Write list entry is presented, then the corresponding READ or READLINK MUST return its data via an RDMA Write to the buffer indicated by the Write list entry. If the Write list entry has zero RDMA segments, or if the total size of the segments is zero, then the corresponding READ or READLINK operation MUST return its result inline.

如果出现写列表项,则相应的读或读链接必须通过RDMA写入将其数据返回到写列表项指示的缓冲区。如果写列表项具有零个RDMA段,或者如果段的总大小为零,则相应的读或读链接操作必须内联返回其结果。

The following example shows an RDMA Write list with three posted buffers A, B, and C. The designated operations in the compound request, READ and READLINK, consume the posted buffers by writing their results back to each buffer.

下面的示例显示了一个RDMA写列表,其中包含三个已发布的缓冲区A、B和C。复合请求中的指定操作READ和READLINK通过将结果写回每个缓冲区来消耗已发布的缓冲区。

RDMA Write list:

RDMA写入列表:

         A --> B --> C
        
         A --> B --> C
        

Compound request:

复合请求:

         PUTFH LOOKUP READ PUTFH LOOKUP READLINK PUTFH LOOKUP READ
                       |                   |                   |
                       v                   v                   v
                       A                   B                   C
        
         PUTFH LOOKUP READ PUTFH LOOKUP READLINK PUTFH LOOKUP READ
                       |                   |                   |
                       v                   v                   v
                       A                   B                   C
        

If the client does not want to have the READLINK result returned directly, then it provides a zero-length array of segment triplets for buffer B or sets the values in the segment triplet for buffer B to zeros so that the READLINK result MUST be returned inline.

如果客户端不希望直接返回READLINK结果,那么它将为缓冲区B提供一个长度为零的段三元组数组,或者将缓冲区B的段三元组中的值设置为零,以便必须以内联方式返回READLINK结果。

The situation is similar for RDMA Read lists sent by the client and applies to the NFSv4.0 WRITE and SYMLINK procedures as for v3. Additionally, inline segments too large to fit in posted buffers MAY be transferred in special "RDMA_NOMSG" messages.

客户端发送的RDMA读取列表的情况类似,适用于NFSv4.0写入和符号链接过程,如v3。此外,过大的内联段无法放入已发布的缓冲区,可能会在特殊的“RDMA_NOMSG”消息中传输。

Non-RDMA (inline) WRITE transfers MAY OPTIONALLY employ the "RDMA_MSGP" padding method described in the RPC/RDMA protocol, if the appropriate value for the server is known to the client. Padding allows the opaque file data to arrive at the server in an aligned fashion, which may improve server performance. In order to ensure accurate alignment for all data, it is likely that the client will restrict its use of OPTIONAL padding to COMPOUND requests containing only a single WRITE operation.

如果客户端知道服务器的适当值,则非RDMA(内联)写传输可以选择性地使用RPC/RDMA协议中描述的“RDMA_MSGP”填充方法。填充允许不透明文件数据以对齐方式到达服务器,这可能会提高服务器性能。为了确保所有数据的精确对齐,客户机可能会将可选填充的使用限制为只包含单个写入操作的复合请求。

Unlike NFS versions 2 and 3, the maximum size of an NFS version 4 COMPOUND is not bounded, even when RDMA chunks are in use. While it might appear that a configuration protocol exchange (such as the one described in [RFC5666]) would help, in fact the layering issues involved in building COMPOUNDs by NFS make such a mechanism unworkable.

与NFS版本2和3不同,NFS版本4化合物的最大大小不受限制,即使在使用RDMA块时也是如此。虽然看起来配置协议交换(如[RFC5666]中所述)可能会有所帮助,但事实上,通过NFS构建化合物所涉及的分层问题使得这种机制不可行。

However, typical NFS version 4 clients rarely issue such problematic requests. In practice, they behave in much more predictable ways, in fact most still support the traditional rsize/wsize mount parameters. Therefore, most NFS version 4 clients function over RPC/RDMA in the same way as NFS versions 2 and 3, operationally.

但是,典型的NFS版本4客户端很少发出此类有问题的请求。实际上,它们的行为更加可预测,事实上大多数仍然支持传统的rsize/wsize挂载参数。因此,大多数NFS版本4客户端在RPC/RDMA上的运行方式与NFS版本2和3的运行方式相同。

There are however advantages to allowing both client and server to operate with prearranged size constraints, for example, use of the sizes to better manage the server's response cache. An extension to NFS version 4 supporting a more comprehensive exchange of upper-layer parameters is part of [RFC5661].

然而,允许客户端和服务器在预先安排的大小限制下运行有一些优势,例如,使用大小更好地管理服务器的响应缓存。[RFC5661]的一部分是对NFS版本4的扩展,支持更全面的上层参数交换。

5.1. NFS Version 4 Callbacks
5.1. NFS版本4回调

The NFS version 4 protocols support server-initiated callbacks to selected clients, in order to notify them of events such as recalled delegations, etc. These callbacks present no particular issue to being framed over RPC/RDMA, since such callbacks do not carry bulk data such as NFS READ or NFS WRITE. They MAY be transmitted inline via RDMA_MSG, or if the callback message or its reply overflow the

NFS版本4协议支持服务器启动的对选定客户机的回调,以便向它们通知事件,如调用的委托等。这些回调对于通过RPC/RDMA构建框架没有特殊问题,因为此类回调不携带大量数据,如NFS读或NFS写。它们可以通过RDMA_MSG内联传输,或者如果回调消息或其回复溢出

negotiated buffer sizes for a callback connection, they MAY be transferred via the RDMA_NOMSG method as described above for other exchanges.

对于回调连接的协商缓冲区大小,它们可以通过RDMA_NOMSG方法传输,如上所述,用于其他交换。

One special case is noteworthy: in NFS version 4.1, the callback channel is optionally negotiated to be on the same connection as one used for client requests. In this case, and because the transaction ID (XID) is present in the RPC/RDMA header, the client MUST ascertain whether the message is in fact an RPC REPLY, and therefore a reply to a prior request and carrying its XID, before processing it as such. By the same token, the server MUST ascertain whether an incoming message on such a callback-eligible connection is an RPC CALL, before optionally processing the XID.

有一种特殊情况值得注意:在NFS版本4.1中,回调通道可以选择协商为与客户端请求使用的连接处于同一连接上。在这种情况下,由于事务ID(XID)存在于RPC/RDMA报头中,因此客户机必须在将消息作为RPC应答进行处理之前,确定消息是否实际上是一个RPC应答,从而是对先前请求的应答并携带其XID。出于同样的原因,在可选地处理XID之前,服务器必须确定此类回调合格连接上的传入消息是否为RPC调用。

In the callback case, the XID present in the RPC/RDMA header will potentially have any value, which may (or may not) collide with an XID used by the client for a previous or future request. The client and server MUST inspect the RPC component of the message to determine its potential disposition as either an RPC CALL or RPC REPLY, prior to processing this XID, and MUST NOT reject or accept it without also determining the proper context.

在回调情况下,RPC/RDMA头中存在的XID可能具有任何值,这些值可能(也可能不)与客户端用于以前或将来请求的XID冲突。在处理此XID之前,客户端和服务器必须检查消息的RPC组件,以确定其作为RPC调用或RPC应答的潜在处置,并且在未确定正确上下文的情况下,不得拒绝或接受消息。

6. Port Usage Considerations
6. 港口使用注意事项

NFS use of direct data placement introduces a need for an additional NFS port number assignment for networks that share traditional UDP and TCP port spaces with RDMA services. The iWARP [RFC5041] [RFC5040] protocol is such an example (InfiniBand is not).

NFS对直接数据放置的使用为与RDMA服务共享传统UDP和TCP端口空间的网络引入了额外的NFS端口号分配需求。iWARP[RFC5041][RFC5040]协议就是这样一个例子(InfiniBand不是)。

NFS servers for versions 2 and 3 [RFC1094] [RFC1813] traditionally listen for clients on UDP and TCP port 2049, and additionally, they register these with the portmapper and/or rpcbind [RFC1833] service. However, [RFC3530] requires NFS servers for version 4 to listen on TCP port 2049, and they are not required to register.

版本2和版本3[RFC1094][RFC1813]的NFS服务器传统上侦听UDP和TCP端口2049上的客户端,此外,它们还向portmapper和/或rpcbind[RFC1833]服务注册这些客户端。但是,[RFC3530]要求版本4的NFS服务器在TCP端口2049上侦听,并且不需要注册。

An NFS version 2 or version 3 server supporting RPC/RDMA on such a network and registering itself with the RPC portmapper MAY choose an arbitrary port, or MAY use the alternative well-known port number for its RPC/RDMA service. The chosen port MAY be registered with the RPC portmapper under the netid assigned by the requirement in [RFC5666].

在这样的网络上支持RPC/RDMA并向RPC端口映射器注册的NFS版本2或版本3服务器可以选择任意端口,或者可以为其RPC/RDMA服务使用其他已知端口号。所选端口可在[RFC5666]中要求分配的netid下向RPC端口映射器注册。

An NFS version 4 server supporting RPC/RDMA on such a network MUST use the alternative well-known port number for its RPC/RDMA service. Clients SHOULD connect to this well-known port without consulting the RPC portmapper (as for NFSv4/TCP).

在这样的网络上支持RPC/RDMA的NFS版本4服务器必须为其RPC/RDMA服务使用备用的已知端口号。客户端应连接到此众所周知的端口,而无需咨询RPC端口映射器(对于NFSv4/TCP)。

The port number assigned to an NFS service over an RPC/RDMA transport is available from the IANA port registry [RFC3232].

通过RPC/RDMA传输分配给NFS服务的端口号可从IANA端口注册表[RFC3232]获得。

7. Security Considerations
7. 安全考虑

The RDMA transport for RPC [RFC5666] supports all RPC [RFC5531] security models, including RPCSEC_GSS [RFC2203] security and link-level security. The choice of RDMA Read and RDMA Write to return RPC argument and results, respectively, does not affect this, since it only changes the method of data transfer. Specifically, the requirements of [RFC5666] ensure that this choice does not introduce new vulnerabilities.

RPC[RFC5666]的RDMA传输支持所有RPC[RFC5531]安全模型,包括RPCSEC_GSS[RFC2203]安全和链路级安全。选择RDMA Read和RDMA Write分别返回RPC参数和结果不会影响这一点,因为它只会更改数据传输的方法。具体而言,[RFC5666]的要求确保此选择不会引入新的漏洞。

Because this document defines only the binding of the NFS protocols atop [RFC5666], all relevant security considerations are therefore to be described at that layer.

由于本文档仅定义了[RFC5666]上NFS协议的绑定,因此所有相关的安全注意事项都将在该层进行描述。

8. Acknowledgments
8. 致谢

The authors would like to thank Dave Noveck and Chet Juszczak for their contributions to this document.

作者要感谢Dave Noveck和Chet Juszczak对本文件的贡献。

9. References
9. 工具书类
9.1. Normative References
9.1. 规范性引用文件

[RFC1094] Sun Microsystems, "NFS: Network File System Protocol specification", RFC 1094, March 1989.

[RFC1094]Sun Microsystems,“NFS:网络文件系统协议规范”,RFC10941989年3月。

[RFC1813] Callaghan, B., Pawlowski, B., and P. Staubach, "NFS Version 3 Protocol Specification", RFC 1813, June 1995.

[RFC1813]Callaghan,B.,Pawlowski,B.,和P.Staubach,“NFS版本3协议规范”,RFC 1813,1995年6月。

[RFC1833] Srinivasan, R., "Binding Protocols for ONC RPC Version 2", RFC 1833, August 1995.

[RFC1833]Srinivasan,R.,“ONC RPC版本2的绑定协议”,RFC 1833,1995年8月。

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC2119]Bradner,S.,“RFC中用于表示需求水平的关键词”,BCP 14,RFC 2119,1997年3月。

[RFC2203] Eisler, M., Chiu, A., and L. Ling, "RPCSEC_GSS Protocol Specification", RFC 2203, September 1997.

[RFC2203]Eisler,M.,Chiu,A.,和L.Ling,“RPCSEC_GSS协议规范”,RFC 2203,1997年9月。

[RFC3530] Shepler, S., Callaghan, B., Robinson, D., Thurlow, R., Beame, C., Eisler, M., and D. Noveck, "Network File System (NFS) version 4 Protocol", RFC 3530, April 2003.

[RFC3530]Shepler,S.,Callaghan,B.,Robinson,D.,Thurlow,R.,Beame,C.,Eisler,M.,和D.Noveck,“网络文件系统(NFS)版本4协议”,RFC 3530,2003年4月。

[RFC5531] Thurlow, R., "RPC: Remote Procedure Call Protocol Specification Version 2", RFC 5531, May 2009.

[RFC5531]Thurlow,R.,“RPC:远程过程调用协议规范版本2”,RFC 55312009年5月。

[RFC5661] Shepler, S., Ed., Eisler, M., Ed., and D. Noveck, Ed., "Network File System (NFS) Version 4 Minor Version 1 Protocol", RFC 5661, January 2010.

[RFC5661]Shepler,S.,Ed.,Eisler,M.,Ed.,和D.Noveck,Ed.,“网络文件系统(NFS)版本4次要版本1协议”,RFC 56612010年1月。

9.2. Informative References
9.2. 资料性引用

[RFC3232] Reynolds, J., Ed., "Assigned Numbers: RFC 1700 is Replaced by an On-line Database", RFC 3232, January 2002.

[RFC3232]Reynolds,J.,Ed.“分配的数字:RFC 1700被在线数据库取代”,RFC 3232,2002年1月。

[RFC5040] Recio, R., Metzler, B., Culley, P., Hilland, J., and D. Garcia, "A Remote Direct Memory Access Protocol Specification", RFC 5040, October 2007.

[RFC5040]Recio,R.,Metzler,B.,Culley,P.,Hilland,J.,和D.Garcia,“远程直接内存访问协议规范”,RFC 50402007年10月。

[RFC5041] Shah, H., Pinkerton, J., Recio, R., and P. Culley, "Direct Data Placement over Reliable Transports", RFC 5041, October 2007.

[RFC5041]Shah,H.,Pinkerton,J.,Recio,R.,和P.Culley,“可靠传输上的直接数据放置”,RFC 50412007年10月。

[RFC5666] Talpey, T. and B. Callaghan, "Remote Direct Memory Access Transport for Remote Procedure Call", RFC 5666, January 2010.

[RFC5666]Talpey,T.和B.Callaghan,“远程过程调用的远程直接内存访问传输”,RFC 5666,2010年1月。

Authors' Addresses

作者地址

Tom Talpey 170 Whitman St. Stow, MA 01775 USA

美国马萨诸塞州惠特曼街170号汤姆·塔尔佩01775

   EMail: tmtalpey@gmail.com
        
   EMail: tmtalpey@gmail.com
        

Brent Callaghan Apple Computer, Inc. MS: 302-4K 2 Infinite Loop Cupertino, CA 95014 USA

Brent Callaghan Apple Computer,Inc.MS:302-4K 2无限循环库珀蒂诺,加利福尼亚州95014

   EMail: brentc@apple.com
        
   EMail: brentc@apple.com