Network Working Group                                    Y. Shafranovich
Request for Comments: 4180                SolidMatrix Technologies, Inc.
Category: Informational                                     October 2005
        
Network Working Group                                    Y. Shafranovich
Request for Comments: 4180                SolidMatrix Technologies, Inc.
Category: Informational                                     October 2005
        

Common Format and MIME Type for Comma-Separated Values (CSV) Files

逗号分隔值(CSV)文件的通用格式和MIME类型

Status of This Memo

关于下段备忘

This memo provides information for the Internet community. It does not specify an Internet standard of any kind. Distribution of this memo is unlimited.

本备忘录为互联网社区提供信息。它没有规定任何类型的互联网标准。本备忘录的分发不受限制。

Copyright Notice

版权公告

Copyright (C) The Internet Society (2005).

版权所有(C)互联网协会(2005年)。

Abstract

摘要

This RFC documents the format used for Comma-Separated Values (CSV) files and registers the associated MIME type "text/csv".

此RFC记录逗号分隔值(CSV)文件所用的格式,并注册相关的MIME类型“text/CSV”。

Table of Contents

目录

   1. Introduction ....................................................2
   2. Definition of the CSV Format ....................................2
   3. MIME Type Registration of text/csv ..............................4
   4. IANA Considerations .............................................5
   5. Security Considerations .........................................5
   6. Acknowledgments .................................................6
   7. References ......................................................6
      7.1. Normative References .......................................6
      7.2. Informative References .....................................6
        
   1. Introduction ....................................................2
   2. Definition of the CSV Format ....................................2
   3. MIME Type Registration of text/csv ..............................4
   4. IANA Considerations .............................................5
   5. Security Considerations .........................................5
   6. Acknowledgments .................................................6
   7. References ......................................................6
      7.1. Normative References .......................................6
      7.2. Informative References .....................................6
        
1. Introduction
1. 介绍

The comma separated values format (CSV) has been used for exchanging and converting data between various spreadsheet programs for quite some time. Surprisingly, while this format is very common, it has never been formally documented. Additionally, while the IANA MIME registration tree includes a registration for "text/tab-separated-values" type, no MIME types have ever been registered with IANA for CSV. At the same time, various programs and operating systems have begun to use different MIME types for this format. This RFC documents the format of comma separated values (CSV) files and formally registers the "text/csv" MIME type for CSV in accordance with RFC 2048 [1].

逗号分隔值格式(CSV)用于在各种电子表格程序之间交换和转换数据已有相当一段时间了。令人惊讶的是,虽然这种格式非常常见,但从未正式记录过。此外,虽然IANA MIME注册树包含“文本/制表符分隔值”类型的注册,但从未向IANA注册过CSV的MIME类型。同时,各种程序和操作系统已经开始为这种格式使用不同的MIME类型。本RFC记录逗号分隔值(CSV)文件的格式,并根据RFC 2048[1]正式注册CSV的“文本/CSV”MIME类型。

2. Definition of the CSV Format
2. CSV格式的定义

While there are various specifications and implementations for the CSV format (for ex. [4], [5], [6] and [7]), there is no formal specification in existence, which allows for a wide variety of interpretations of CSV files. This section documents the format that seems to be followed by most implementations:

虽然CSV格式有各种规范和实现(例如[4]、[5]、[6]和[7]),但目前还没有正式规范,允许对CSV文件进行各种解释。本节记录了大多数实现似乎遵循的格式:

1. Each record is located on a separate line, delimited by a line break (CRLF). For example:

1. 每条记录位于单独的行上,由换行符(CRLF)分隔。例如:

aaa,bbb,ccc CRLF zzz,yyy,xxx CRLF

aaa、bbb、ccc CRLF zzz、yyy、xxx CRLF

2. The last record in the file may or may not have an ending line break. For example:

2. 文件中的最后一条记录可能有也可能没有结束换行符。例如:

aaa,bbb,ccc CRLF zzz,yyy,xxx

aaa、bbb、ccc CRLF zzz、yyy、xxx

3. There maybe an optional header line appearing as the first line of the file with the same format as normal record lines. This header will contain names corresponding to the fields in the file and should contain the same number of fields as the records in the rest of the file (the presence or absence of the header line should be indicated via the optional "header" parameter of this MIME type). For example:

3. 可能有一个可选的标题行显示为文件的第一行,格式与正常记录行相同。此标头将包含与文件中的字段对应的名称,并且应包含与文件其余部分中的记录相同数量的字段(应通过此MIME类型的可选“header”参数指示是否存在标头行)。例如:

field_name,field_name,field_name CRLF aaa,bbb,ccc CRLF zzz,yyy,xxx CRLF

字段名称、字段名称、字段名称CRLF aaa、bbb、ccc CRLF zzz、yyy、xxx CRLF

4. Within the header and each record, there may be one or more fields, separated by commas. Each line should contain the same number of fields throughout the file. Spaces are considered part of a field and should not be ignored. The last field in the record must not be followed by a comma. For example:

4. 在标题和每条记录中,可能有一个或多个字段,用逗号分隔。在整个文件中,每行应包含相同数量的字段。空格被视为字段的一部分,不应忽略。记录中的最后一个字段不得后跟逗号。例如:

aaa,bbb,ccc

aaa、bbb、ccc

5. Each field may or may not be enclosed in double quotes (however some programs, such as Microsoft Excel, do not use double quotes at all). If fields are not enclosed with double quotes, then double quotes may not appear inside the fields. For example:

5. 每个字段可以用双引号括起来,也可以不用双引号括起来(但是有些程序,如Microsoft Excel,根本不使用双引号)。如果字段没有用双引号括起来,则双引号可能不会出现在字段内。例如:

"aaa","bbb","ccc" CRLF zzz,yyy,xxx

“aaa”,“bbb”,“ccc”CRLF zzz,yyy,xxx

6. Fields containing line breaks (CRLF), double quotes, and commas should be enclosed in double-quotes. For example:

6. 包含换行符(CRLF)、双引号和逗号的字段应包含在双引号中。例如:

"aaa","b CRLF bb","ccc" CRLF zzz,yyy,xxx

“aaa”,“b CRLF bb”,“ccc”CRLF zzz,yyy,xxx

7. If double-quotes are used to enclose fields, then a double-quote appearing inside a field must be escaped by preceding it with another double quote. For example:

7. 如果使用双引号括起字段,则必须通过在字段前面加上另一个双引号来转义字段中出现的双引号。例如:

"aaa","b""bb","ccc"

“aaa”、“b”、“bb”、“ccc”

The ABNF grammar [2] appears as follows:

ABNF语法[2]如下所示:

   file = [header CRLF] record *(CRLF record) [CRLF]
        
   file = [header CRLF] record *(CRLF record) [CRLF]
        

header = name *(COMMA name)

标题=名称*(逗号名称)

record = field *(COMMA field)

记录=字段*(逗号字段)

   name = field
        
   name = field
        
   field = (escaped / non-escaped)
        
   field = (escaped / non-escaped)
        
   escaped = DQUOTE *(TEXTDATA / COMMA / CR / LF / 2DQUOTE) DQUOTE
        
   escaped = DQUOTE *(TEXTDATA / COMMA / CR / LF / 2DQUOTE) DQUOTE
        
   non-escaped = *TEXTDATA
        
   non-escaped = *TEXTDATA
        
   COMMA = %x2C
        
   COMMA = %x2C
        
   CR = %x0D ;as per section 6.1 of RFC 2234 [2]
        
   CR = %x0D ;as per section 6.1 of RFC 2234 [2]
        
   DQUOTE =  %x22 ;as per section 6.1 of RFC 2234 [2]
        
   DQUOTE =  %x22 ;as per section 6.1 of RFC 2234 [2]
        
   LF = %x0A ;as per section 6.1 of RFC 2234 [2]
        
   LF = %x0A ;as per section 6.1 of RFC 2234 [2]
        

CRLF = CR LF ;as per section 6.1 of RFC 2234 [2]

CRLF=CRLF;根据RFC 2234[2]第6.1节

   TEXTDATA =  %x20-21 / %x23-2B / %x2D-7E
        
   TEXTDATA =  %x20-21 / %x23-2B / %x2D-7E
        
3. MIME Type Registration of text/csv
3. 文本/csv的MIME类型注册

This section provides the media-type registration application (as per RFC 2048 [1].

本节提供媒体类型注册应用程序(根据RFC 2048[1])。

To: ietf-types@iana.org

致:ietf-types@iana.org

Subject: Registration of MIME media type text/csv

主题:注册MIME媒体类型text/csv

MIME media type name: text

MIME媒体类型名称:text

MIME subtype name: csv

MIME子类型名称:csv

Required parameters: none

所需参数:无

Optional parameters: charset, header

可选参数:字符集、标头

Common usage of CSV is US-ASCII, but other character sets defined by IANA for the "text" tree may be used in conjunction with the "charset" parameter.

CSV的常见用法是US-ASCII,但IANA为“文本”树定义的其他字符集可与“字符集”参数结合使用。

The "header" parameter indicates the presence or absence of the header line. Valid values are "present" or "absent". Implementors choosing not to use this parameter must make their own decisions as to whether the header line is present or absent.

“header”参数表示是否存在标题行。有效值为“存在”或“不存在”。选择不使用此参数的实现者必须自行决定是否存在标题行。

Encoding considerations:

编码注意事项:

As per section 4.1.1. of RFC 2046 [3], this media type uses CRLF to denote line breaks. However, implementors should be aware that some implementations may use other values.

根据第4.1.1节。在RFC 2046[3]中,此媒体类型使用CRLF表示换行符。但是,实现者应该知道,一些实现可能使用其他值。

Security considerations:

安全考虑:

CSV files contain passive text data that should not pose any risks. However, it is possible in theory that malicious binary data may be included in order to exploit potential buffer overruns in the program processing CSV data. Additionally, private data may be shared via this format (which of course applies to any text data).

CSV文件包含不应构成任何风险的被动文本数据。然而,从理论上讲,可能包含恶意二进制数据,以利用处理CSV数据的程序中潜在的缓冲区溢出。此外,私有数据可以通过该格式共享(当然,该格式适用于任何文本数据)。

Interoperability considerations:

互操作性注意事项:

Due to lack of a single specification, there are considerable differences among implementations. Implementors should "be conservative in what you do, be liberal in what you accept from others" (RFC 793 [8]) when processing CSV files. An attempt at a common definition can be found in Section 2.

由于缺乏单一的规范,实现之间存在很大的差异。在处理CSV文件时,实现者应该“在你做的事情上保持保守,在你接受他人的东西上保持自由”(RFC 793[8])。在第2节中可以找到一个通用定义的尝试。

Implementations deciding not to use the optional "header" parameter must make their own decision as to whether the header is absent or present.

决定不使用可选“header”参数的实现必须自行决定是否存在该头。

Published specification:

已发布的规范:

While numerous private specifications exist for various programs and systems, there is no single "master" specification for this format. An attempt at a common definition can be found in Section 2.

虽然各种程序和系统都有许多专用规范,但这种格式没有单一的“主”规范。在第2节中可以找到一个通用定义的尝试。

Applications that use this media type:

使用此媒体类型的应用程序:

Spreadsheet programs and various data conversion utilities

电子表格程序和各种数据转换实用程序

Additional information:

其他信息:

Magic number(s): none

幻数:无

File extension(s): CSV

文件扩展名:CSV

Macintosh File Type Code(s): TEXT

Macintosh文件类型代码:文本

Person & email address to contact for further information:

联系人和电子邮件地址,以获取更多信息:

      Yakov Shafranovich <ietf@shaftek.org>
        
      Yakov Shafranovich <ietf@shaftek.org>
        

Intended usage: COMMON

预期用途:普通

Author/Change controller: IESG

作者/变更控制员:IESG

4. IANA Considerations
4. IANA考虑

The IANA has registered the MIME type "text/csv" using the application provided in Section 3 of this document.

IANA已使用本文件第3节中提供的应用程序注册MIME类型“text/csv”。

5. Security Considerations
5. 安全考虑

See discussion above in section 3.

见上文第3节的讨论。

6. Acknowledgments
6. 致谢

The author would like to thank Dave Crocker, Martin Duerst, Joel M. Halpern, Clyde Ingram, Graham Klyne, Bruce Lilly, Chris Lilley, and members of the IESG for their helpful suggestions. A special word of thanks goes to Dave for helping with the ABNF grammar.

作者要感谢戴夫·克罗克、马丁·杜尔斯特、乔尔·M·哈尔彭、克莱德·英格拉姆、格雷厄姆·克莱恩、布鲁斯·莉莉、克里斯·莉莉和IESG成员提出的有益建议。特别感谢Dave帮助学习ABNF语法。

The author would also like to thank Henrik Lefkowetz, Marshall Rose, and the folks at xml.resource.org for providing many of the tools used for preparing RFCs and Internet drafts.

作者还要感谢Henrik Lefkowetz、Marshall Rose和xml.resource.org上的人员提供了许多用于准备RFC和互联网草案的工具。

A special thank you goes to L.T.S.

向L.T.S.表示特别的感谢。

7. References
7. 工具书类
7.1. Normative References
7.1. 规范性引用文件

[1] Freed, N., Klensin, J., and J. Postel, "Multipurpose Internet Mail Extensions (MIME) Part Four: Registration Procedures", BCP 13, RFC 2048, November 1996.

[1] Freed,N.,Klensin,J.,和J.Postel,“多用途互联网邮件扩展(MIME)第四部分:注册程序”,BCP 13,RFC 2048,1996年11月。

[2] Crocker, D. and P. Overell, "Augmented BNF for Syntax Specifications: ABNF", RFC 2234, November 1997.

[2] Crocker,D.和P.Overell,“语法规范的扩充BNF:ABNF”,RFC 2234,1997年11月。

[3] Freed, N. and N. Borenstein, "Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types", RFC 2046, November 1996.

[3] Freed,N.和N.Borenstein,“多用途互联网邮件扩展(MIME)第二部分:媒体类型”,RFC 20461996年11月。

7.2. Informative References
7.2. 资料性引用

[4] Repici, J., "HOW-TO: The Comma Separated Value (CSV) File Format", 2004, <http://www.creativyst.com/Doc/Articles/CSV/CSV01.htm>.

[4] Repici,J.,“操作:逗号分隔值(CSV)文件格式”,2004年<http://www.creativyst.com/Doc/Articles/CSV/CSV01.htm>.

[5] Edoceo, Inc., "CSV Standard File Format", 2004, <http://www.edoceo.com/utilis/csv-file-format.php>.

[5] Edoceo,Inc.,“CSV标准文件格式”,2004年<http://www.edoceo.com/utilis/csv-file-format.php>.

[6] Rodger, R. and O. Shanaghy, "Documentation for Ricebridge CSV Manager", February 2005, <http://www.ricebridge.com/products/csvman/reference.htm>.

[6] Rodger,R.和O.Shanaghy,“Ricebridge CSV经理的文件”,2005年2月<http://www.ricebridge.com/products/csvman/reference.htm>.

[7] Raymond, E., "The Art of Unix Programming, Chapter 5", September 2003, <http://www.catb.org/~esr/writings/taoup/html/ch05s02.html>.

[7] Raymond,E.,“Unix编程的艺术,第5章”,2003年9月<http://www.catb.org/~esr/writings/taoup/html/ch05s02.html>。

[8] Postel, J., "Transmission Control Protocol", STD 7, RFC 793, September 1981.

[8] 《传输控制协议》,标准7,RFC 793,1981年9月。

Author's Address

作者地址

Yakov Shafranovich SolidMatrix Technologies, Inc.

雅科夫·沙夫拉诺维奇SolidMatrix技术有限公司。

   EMail: ietf@shaftek.org
   URI:   http://www.shaftek.org
        
   EMail: ietf@shaftek.org
   URI:   http://www.shaftek.org
        

Full Copyright Statement

完整版权声明

Copyright (C) The Internet Society (2005).

版权所有(C)互联网协会(2005年)。

This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights.

本文件受BCP 78中包含的权利、许可和限制的约束,除其中规定外,作者保留其所有权利。

This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

本文件及其包含的信息是按“原样”提供的,贡献者、他/她所代表或赞助的组织(如有)、互联网协会和互联网工程任务组不承担任何明示或暗示的担保,包括但不限于任何保证,即使用本文中的信息不会侵犯任何权利,或对适销性或特定用途适用性的任何默示保证。

Intellectual Property

知识产权

The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79.

IETF对可能声称与本文件所述技术的实施或使用有关的任何知识产权或其他权利的有效性或范围,或此类权利下的任何许可可能或可能不可用的程度,不采取任何立场;它也不表示它已作出任何独立努力来确定任何此类权利。有关RFC文件中权利的程序信息,请参见BCP 78和BCP 79。

Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr.

向IETF秘书处披露的知识产权副本和任何许可证保证,或本规范实施者或用户试图获得使用此类专有权利的一般许可证或许可的结果,可从IETF在线知识产权存储库获取,网址为http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf-ipr@ietf.org.

IETF邀请任何相关方提请其注意任何版权、专利或专利申请,或其他可能涵盖实施本标准所需技术的专有权利。请将信息发送至IETF的IETF-ipr@ietf.org.

Acknowledgement

确认

Funding for the RFC Editor function is currently provided by the Internet Society.

RFC编辑功能的资金目前由互联网协会提供。