Network Working Group                                         J. Klensin
Request for Comments: 4185                                  October 2005
Category: Informational
National and Local Characters for DNS Top Level Domain (TLD) Names
Status of This Memo
This memo provides information for the Internet community. It does not specify an Internet standard of any kind. Distribution of this memo is unlimited.
Copyright Notice
Copyright (C) The Internet Society (2005).
IESG Note
This RFC is not a candidate for any level of Internet Standard. The IETF disclaims any knowledge of the fitness of this RFC for any purpose and notes that the decision to publish is not based on IETF review apart from IESG review for conflict with IETF work. The RFC Editor has chosen to publish this document at its discretion. See RFC 3932 [RFC3932] for more information.
Abstract
In the context of work on internationalizing the Domain Name System (DNS), there have been extensive discussions about "multilingual" or "internationalized" top level domain names (TLDs), especially for countries whose predominant language is not written in a Roman-based script. This document reviews some of the motivations for such domains, several suggestions that have been made to provide needed functionality, and the constraints that the DNS imposes. It then suggests an alternative, local translation, that may solve a superset of the problem while avoiding protocol changes, serious deployment delays, and other difficulties. The suggestion utilizes a localization technique in applications to permit any TLD to be accessed using the vocabulary and characters of any language. It is not restricted to language- or country-specific "multilingual" TLDs in the language(s) and script(s) of that country.
Table of Contents
1. Introduction
   1.1. Terminology
   1.2. Background on the "Multilingual Name" Problem
      1.2.1. Approaches to the Requirement
      1.2.2. Writing the Name of One's Country in its Own Characters
      1.2.3. Countries with Multiple Languages and Countries with Multiple Names
      1.2.4. Availability of Non-ASCII Characters in Programs
   1.3. Domain Name System Constraints
      1.3.1. Administrative Hierarchy
      1.3.2. Aliases
   1.4. Internationalization and Localization
2. Client-Side Solutions
   2.1. IDNA and the Client
   2.2. Local Translation Tables for TLD Names
3. Advantages and Disadvantages of Local Translation
   3.1. Every TLD Appears in the Local Language and Character Set
   3.2. Unification of Country Code Domains
   3.3. User Understanding of Local and Global References
   3.4. Limits on Expansion of the Number of TLDs
   3.5. Standardization of the Translations
   3.6. Implications for Future New Domain Names
   3.7. Mapping for TLDs, Not Domain Names or Keywords
4. Information Interchange, IDNs, Comparisons, and Translations
5. Internationalization Considerations
6. Security Considerations
7. Acknowledgements
8. Informative References
1. Introduction
1.1. Terminology
This document assumes the conventional terminology used to discuss the domain name system (DNS) and its hierarchical arrangements. Terms such as "top level domain" (or just "TLD"), "subdomain", "subtree", and "zone file" are used without further explanation. In addition, the term "ccTLD" is used to denote a "country code top level domain" and "gTLD" is used to denote a "generic top level domain" as described in [RFC1591] and in common usage.
1.2. Background on the "Multilingual Name" Problem
People who share a language usually prefer to communicate in it, using whatever characters are normally used to write that language, rather than in some "foreign" one. There have been standards for using mutually-agreed characters and languages in electronic mail message bodies and selected headers since the introduction of MIME in 1992 [MIME] and the Web has permitted multilingual text since its inception, also using MIME. Actual use of non-Roman-character content came even earlier, using private conventions. However, domain names are exposed to users in email addresses and URLs. Corresponding arrangements, typically also exposing domain names, are made for other application protocols. The combination of exposed domain names with internationalization requirements led rapidly to demands to permit domain names in applications that used characters other than those of the very restrictive, ASCII-subset, "hostname" (or "letter-digit-hyphen" ("LDH")) conventions recommended in the DNS specifications [RFC1035]. The effort to do this soon became known as "multilingual domain names". That was actually a misnomer, since the DNS deals only with characters and identifier strings, and not, except by accident or local registration conventions, with what people usually think of as "names". There has also been little interest in what would actually be a "multilingual name", i.e., a name that contains components from more than one language. Instead, interest has focused on the use, in the context of the DNS, of strings that conform to specific individual languages.
1.2.1. Approaches to the Requirement
When the requirement was seen, not as "modifying the DNS", but as "providing users with access to the DNS from a variety of languages and character sets", three sets of proposals emerged in the IETF and elsewhere. They were:
1. Perform processing in client software that recodes a user-visible string into an ASCII-compatible form that can safely be passed through the DNS protocols and stored in the DNS. This is the approach used, for example, in the IETF's "IDNA" protocol [RFC3490].
2. Modify the DNS to be more hospitable to non-ASCII names and strings. There have been a variety of proposals to do this, using several different techniques. Some of these have been implemented on a proprietary basis by various vendors. None of them have gained acceptance in the IETF community, primarily because they would take a long time to deploy, would leave many problems unsolved, and have been shown to cause problems with deployed approaches that had not yet been upgraded.
3. Move the problem out of the DNS entirely, relying instead on a "directory" or "presentation" layer to handle internationalization. The rationale for this approach is discussed in [RFC3467].
This document proposes a fourth approach, applicable to the top level domains (TLDs) only (see Section 1.3.1 for a discussion of the special issues that make TLDs both problematic and a special opportunity). That approach involves having the user interface of applications map non-ASCII names for TLDs to existing TLDs and could be used as an alternate or supplement to the strategies summarized above.
1.2.2. Writing the Name of One's Country in its Own Characters
An early focus of the "multilingual domain name" efforts was expressed in statements such as "users in my country, in which ASCII is rarely used, should be able to write an entire domain name in their own character set". In particular, since all top-level domain names, at present, follow the LDH rules, the modified naming rules discussed in [RFC1123], and the coding conventions specified in [RFC1591], all fully-qualified DNS names were effectively required to contain at least one ASCII label (the TLD name). Some advocates for internationalized names have considered the presence of any ASCII labels inappropriate. One should, instead, be able to write the name of the ccTLD for China in Chinese, the name of the ccTLD for Saudi Arabia in Arabic, the name for Spain in Spanish, and so on.
That much could be accomplished, given updated applications, by using a new TLD name with IDNA encoding. Of course, adding such a TLD would raise new questions: what to do about gTLDs, how to handle countries with several official languages (perhaps even using different scripts), how name strings should be chosen, and whether there should be an attempt to coordinate the contents of the local-language TLD zone and the traditional ISO 3166-coded one. A few of these issues are addressed below. But, if one examines (or even thinks about) user behavior and preferences, it is almost as important that one be able to write the name of the ccTLD for China in Arabic and that of Saudi Arabia in Chinese: true internationalization implies that, at least to the extent to which ambiguity and conflicts can be avoided, people should be able to use the languages and character sets they prefer. For the same reasons that one would like to have all-Chinese domain names available in China, it is important to have the capability to have an apparent Chinese-language TLD for a domain whose second level and beyond are Chinese characters, even when the TLD itself serves predominantly non-Chinese-speaking registrants and users.
1.2.3. Countries with Multiple Languages and Countries with Multiple Names
From a user interface standpoint, writing ccTLD names in local characters is a problem. As discussed below in Section 1.3.2, the DNS itself does not easily permit a domain to be referred to by more than one name (or spelling or translation of a name). Countries with more than one official language would require that the country name be represented in each of those languages. And, just as it is important that a user in China be able to represent the name of the Chinese ccTLD in Chinese characters, she should be able to access a Chinese-language site in France using Chinese characters. That would require that she be able to write the name of the French ccTLD in Chinese characters rather than in a form based on a Roman character set.
1.2.4. Availability of Non-ASCII Characters in Programs
Over the years, computer users have gotten used to the fact that not every computer has a full set of characters available to every program. An extreme example is an Arabic speaker using a public kiosk computer in an airport in the United States: there is only a small chance that the web browser there will be able to input and render Arabic correctly. This has a direct effect on the multilingual TLD problem: one cannot simply change the name of a ccTLD in the DNS to one of the country's non-ASCII names without the risk that people elsewhere in the world will be unable to enter that name.
1.3. Domain Name System Constraints
1.3.1. Administrative Hierarchy
The domain name system is firmly rooted in the idea of an "administrative hierarchy", with the entity responsible for a given node of the hierarchy responsible for policies applicable to its subhierarchies (Cf. [RFC1034], [RFC1035], and [RFC1591]). The model works quite well for the domain and subdomains of a particular enterprise. In an enterprise situation, the hierarchy can be organized to match the organizational structure; there are established ways to set policies; and there are, at least presumably, shared assumptions about overall goals and objectives among all registrants in the domain. It is more problematic when a domain is shared by unrelated entities that lack common policy assumptions because it is difficult to reach agreement on rules that should apply to all of the entities and subdomains of such a domain. In general, the unrelated entities situation always prevails for the labels registered in a TLD (second-level names). Exceptions occur in those TLDs for which the second level is structural (e.g., the .CO, .AC, .GOV conventions in many ccTLDs or in the historical geographical organization of .US [RFC1480]). In those cases, the unrelated entities situation exists instead for the labels registered within that structural level.
TLDs may, but need not, have consistent registration policies for those second (or third) level names. Countries (or ccTLD administrators) have often adopted rules about what entities may register in their ccTLDs, and what forms the names may take. RFC 1591 outlined registration norms for most of the then-extant gTLDs; however, those norms have been largely ignored in recent years. Some recent "sponsored" and purpose-specific domains are based on quite specific rules about appropriate registrations. Homogeneous registration rules for the root are, by contrast, impossible: almost by definition, the subdomains registered in the root (TLDs) are diverse, and no single policy about types and formats of names applying to all root subdomains is feasible.
1.3.2. Aliases
In an environment different from the DNS, a rational way to permit assigning local-language names to a country code (or other) domain would be to set up an alias for the name, or to use some sort of "see instead" reference. But the DNS does not have facilities for either. Instead, it supports a "CNAME" record, whose label can refer only to a particular label and not to a subtree. For example, if A.B.C is a fully-qualified name, then a CNAME reference in B.C from X to A would make X.B.C appear to have the same values as A.B.C. However, a CNAME reference from Y to C in the root would not make A.B.Y referenceable (or even defined) at all. A second record type, DNAME [RFC2672], can provide an alias for a portion of the tree. But many believe that it is problematic technically. At a minimum, it can cause synchronization issues when references across zones occur, and its use has been discouraged within the IETF, except as a means of enabling a transition from one domain to another. Even if the design of yet another alias-type record type were contemplated, DNS technical constraints of query-response integrity and DNSSEC zone signing (cf. [RFC4033], [RFC4034], and [RFC4035]) make it extremely unlikely that one could be defined that would meet the desired requirements for "see instead" or true synonym references.
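The CNAME limitation can be illustrated with a toy lookup table. The sketch below is not a resolver and models nothing about the DNS beyond the one point at issue: a CNAME is attached to a single owner name, so it aliases that name and nothing beneath it. The record set, the address, and the `resolve` helper are assumptions made for illustration; the names A.B.C, X, and Y are those of the example above.

```python
# Toy record table: owner name -> (record type, value).
# Everything here is hypothetical and exists only to mirror the A.B.C example.
RECORDS = {
    "a.b.c": ("A", "192.0.2.1"),    # documentation address (RFC 5737 range)
    "x.b.c": ("CNAME", "a.b.c"),    # alias for one name inside B.C
    "y": ("CNAME", "c"),            # attempted alias for the TLD "c" in the root
}


def resolve(name, depth=0):
    """Look up an exact owner name, following CNAMEs; return a value or None."""
    if depth > 8:
        raise RuntimeError("CNAME chain too long")
    record = RECORDS.get(name.lower())
    if record is None:
        return None                 # no record exists for this exact name
    rtype, value = record
    if rtype == "CNAME":
        return resolve(value, depth + 1)
    return value


print(resolve("X.B.C"))  # the alias resolves to the address held by a.b.c
print(resolve("A.B.Y"))  # None: the CNAME at "y" says nothing about names below it
```

Making the second lookup succeed would require a record at every name under Y, which is the replication problem discussed later in this document.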
1.4. Internationalization and Localization
It has often been observed that, while many people talk about "internationalization", they often really mean, and want, "localization". "Internationalization", in this context, suggests making something globally accessible while incorporating a broad-range "universal" character set and conventions appropriate to all languages and cultures. "Localization", by contrast, involves having things work well in a particular locality or for a broad range of localities, although aspects of the style of operation might differ for each locality. Anything that actually involves the DNS must be global, and hence internationalized, since the DNS cannot meaningfully support different responses or query and matching models based, e.g., on the location of the user making a query. While the DNS cannot support localization internally, many of the features discussed earlier in this section are much more easily thought about in local terms -- whether localized to a geographical area, users of a language, or using some other criteria -- than in global ones.
2. Client-Side Solutions
Traditionally, the IETF avoided becoming involved in standardization for actions that take place strictly on individual hosts on the network, instead confining itself to behavior that is observable "on the wire", i.e., in protocols between network hosts. Exceptions to this general principle have been made when different clients were required to utilize data or interpret values in compatible ways to preserve interoperability: the standards for email and web body formats, and IDNA itself, are examples of these exceptions. Regardless of what is required to be standardized, it is almost never required, and often unwise, that a user interface present "on the wire" formats to the user, at least by default (debugging options that show the wire formats are common and often quite useful). However, in most cases when the presentation format and the wire format differ, the client program must take precautions to ensure that the wire format can be reconstructed from user input, or to keep the wire format, while hidden, bound to the presentation mechanism so that it can be reconstructed. While it is rarely a goal in itself, it is often necessary that the user be at least vaguely aware that the wire ("real") format is different from the presentation one and that the wire format be available for debugging.
In fact, the DNS itself is an excellent example of the difference between the wire format and the user presentation format. Most Internet users do not realize that the wire format for DNS queries and responses does not include the "." character. Instead, each label is represented by a length in bytes of the label, followed by the label itself.
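The length-prefixed label encoding can be sketched in a few lines. The function name `encode_qname` is an assumption made for illustration, and the sketch handles only ASCII names with no compression pointers and no special case for the root name itself.

```python
def encode_qname(name: str) -> bytes:
    """Encode a presentation-format ASCII domain name as DNS wire-format labels."""
    out = bytearray()
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 0 < len(raw) <= 63:
            raise ValueError(f"label length out of range: {label!r}")
        out.append(len(raw))  # one length octet precedes each label
        out += raw            # then the label octets; no "." is ever stored
    out.append(0)             # a zero-length label (the root) ends the name
    return bytes(out)


print(encode_qname("www.example.com"))
# b'\x03www\x07example\x03com\x00'
```

The "." that users type is purely a presentation-format separator; on the wire its role is played by the length octets.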
2.1. IDNA and the Client
As mentioned above, IDNA itself is entirely a client-side protocol. It works by performing some mappings and then encoding labels to be placed into the DNS in a special format called "punycode" [RFC3492]. When labels in that format are encountered, they are transformed, by the client, back into internationalized (normally Unicode [ISO10646]) characters. In the context of this document, the important observation about IDNA is that any application program that supports it is already doing considerable transformation work in the client; it is not simply presenting the "on the wire" formats to the user. It is also the case that, if an application implementation makes different mappings than those called for by IDNA, it is likely to be detected only when, and if, users complain about unexpected behavior. As long as the punycode strings sent to it are valid, the server cannot tell what mappings were applied to develop those strings.
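As a rough illustration of that client-side transformation, the sketch below uses Python's built-in "idna" codec, which implements the RFC 3490 conversions including the Nameprep mapping step. The helper names `to_ace` and `to_unicode` and the sample name are assumptions chosen for the example; a production client would also need to handle conversion errors and display policy.

```python
def to_ace(name: str) -> str:
    """Map and encode each label into the ASCII-compatible form stored in the DNS."""
    return name.encode("idna").decode("ascii")


def to_unicode(name: str) -> str:
    """Transform ASCII-compatible labels back into Unicode for presentation."""
    return name.encode("ascii").decode("idna")


ace = to_ace("bücher.example")
print(ace)              # xn--bcher-kva.example
print(to_unicode(ace))  # bücher.example
```

Only the ASCII-compatible string reaches the DNS; a server receiving it cannot tell which mappings the client applied to produce it.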
2.2. Local Translation Tables for TLD Names
We suggest that, in addition to maintaining the code and tables required to support IDNA, authors of application programs may want to maintain a table that contains a list of TLDs and locally-desirable names for each one. For ccTLDs, these might be the names (or locally-standard abbreviations) by which the relevant countries are known locally (whether in ASCII characters or others). With some care on the part of the application designer (e.g., to ensure that local forms do not conflict with the actual TLD names), a particular TLD name input from the user could be either in local or standard form without special tagging or problems. When DNS names are received by these client programs, the TLD labels would be mapped to local form before IDNA is applied to the rest of the name; when names are received from users, local TLD names would be mapped to the global ones before applying IDNA or being used in other DNS processing.
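A minimal sketch of such a table and the two mappings follows. The table contents, the illustrative subset of real TLD names, and the function names `to_global` and `to_local` are all assumptions made for the example. The local forms shown are plausible Chinese names for the corresponding domains, not agreed translations. IDNA conversion uses Python's built-in "idna" codec.

```python
# Hypothetical translation table for one locale: local form -> global TLD.
LOCAL_TO_GLOBAL = {
    "中国": "cn",
    "法国": "fr",
    "公司": "com",
}
REAL_TLDS = {"cn", "fr", "com", "org"}  # illustrative subset, not the root zone

# Design check from the text: no local form may collide with an actual TLD name.
assert not (set(LOCAL_TO_GLOBAL) & REAL_TLDS)

GLOBAL_TO_LOCAL = {g: l for l, g in LOCAL_TO_GLOBAL.items()}


def to_global(name: str) -> str:
    """User input -> DNS form: translate the TLD first, then apply IDNA."""
    head, _, tld = name.rstrip(".").rpartition(".")
    tld = LOCAL_TO_GLOBAL.get(tld, tld)  # local or standard form both accepted
    full = f"{head}.{tld}" if head else tld
    return full.encode("idna").decode("ascii")


def to_local(name: str) -> str:
    """DNS form -> presentation: localize the TLD, IDNA-decode the rest."""
    head, _, tld = name.rstrip(".").rpartition(".")
    tld = GLOBAL_TO_LOCAL.get(tld.lower(), tld)
    if not head:
        return tld
    return f"{head.encode('ascii').decode('idna')}.{tld}"


wire = to_global("例子.中国")
print(wire)            # ASCII-compatible name ending in ".cn"
print(to_local(wire))  # 例子.中国
```

Because input is accepted in either local or standard form, a user who types "example.com" gets the same result as before; only the presentation changes.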
3. Advantages and Disadvantages of Local Translation
3.1. Every TLD Appears in the Local Language and Character Set
The notion of a top-level domain whose name matches, e.g., the name that is used for a country in that country or the name of a language in that language is, as mentioned above, immediately appealing. But most of the reasons for it argue equally strongly for other TLDs being accessible from that language. A user in Korea who can access the national ccTLD in the Korean language and character set has every reason to expect that both generic top level domains and domains associated with other countries would be similarly accessible, especially if the second-level domains bear Korean names. A user native to Spain or Portugal, or in Latin America, would presumably have similar expectations, but would expect to use Spanish or Portuguese names, not Korean ones.
That level of local optimization is not realistic -- some would argue not possible -- with the DNS since it would ultimately require that every top level domain be replicated for each of the world's languages. That replication process would involve not just the top level domain itself; in principle, all of its subtrees would need to be completely replicated as well. Perhaps in practice, not all subtrees would require replication, but only those for which a language variation or translation was significant. But, while that restriction would change the scale of the problem, it would not alter its basic nature. The administrative hierarchy characteristics of the DNS (see Section 1.3.1) turn the replication process into an administrative nightmare: every administrator of a second-level domain in the world would be forced to maintain dozens, probably hundreds, of similar zone files for the replicates of the domain. Even if only the zones relevant to a particular country or language were replicated, the administrative and tracking problems to bind these to the appropriate top-level domain and keep all of the replicas synchronized would be extremely difficult at best. And many administrators of third- and fourth-level domains, and beyond, would be faced with similar problems.
By contrast, dealing with the names of TLDs as a localization problem, using local translation, is fairly simple, although it places some burden of understanding on the user (see Section 4). Each function represented by a TLD -- a country, generic registrations, or purpose-specific registrations -- could be represented in the local language and character set as needed. And, for countries with many languages -- or users living, working in, or visiting countries where their language is not dominant -- "local" could be defined in terms of the needs or wishes of each particular user.
An additional benefit is that, if two countries called themselves by the same name in their local languages -- if, e.g., Western Slobbovia and Eastern Slobbovia both called themselves "Slobland" -- local conventions could be followed as long as users understood that only internal forms (in this case, the ISO 3166-based ccTLD name) could be exported outside the country (see Section 3.3).
Note that this proposal is to allow mapping of native-language strings to existing TLDs. It would almost certainly be ill-advised to stretch this idea too far and try to map strings that local users would be unlikely to guess into TLDs. For example, there are probably no languages in which the country known in English as "Finland" is called "FI". Thus, one would not want to create a mapping from two characters that look or sound like a Roman "F" and a Roman "I" to the ccTLD ".fi".
3.2. Unification of Country Code Domains
It follows from some of the comments above that, while there appears to be some immediate appeal from having (at least) two domains for each country, one using the ISO 3166-1 code [ISO3166] and another one using a name based on the national name in the national language, such a situation would create considerable problems for registrants in both domains. For registrants maintaining enterprise or organizational subdomains, ease of administration of a single family of zone files will usually make a registration in a single top-level domain preferable to replicated sets of them, at least as long as their functional requirements (such as local-language access) are met by the unified structure. For those registrants with no interest in any Internet function or protocols other than use of the HTTP/HTTPS-based web, this problem can be dealt with at the applications level by the use of redirects but, in the general case, that is not a feasible solution.
For countries with multiple national languages that are considered equal and legally equivalent, the advantages of a translation-based approach, rather than multiple registrations and replicated trees, would be even more significant. Actually installing and maintaining a separate TLD for each language would be an administrative nightmare, especially if it was intended that the associated zones be kept synchronized. The oft-suggested proposal to adopt an "exactly one extra domain for each country" rule would essentially require some of the multiple-official-language countries to violate their own constitutions. Conversely, having multiple domains for a given country, based on the number of official languages and without any expectation of synchronization, would give some countries an additional allocation of TLDs that others would certainly consider unfair.
Of course, having replicated domains might be popular with some registries and registrars, since replication would almost inevitably increase the total number of domains to be registered. Helping that group of registries and registrars, while hurting Internet users by adding administrative overhead and confusion, is not a goal of this document.
While the IDNA tables (actually Nameprep [RFC3491] and Stringprep [RFC3454]) must be identical globally for IDNA to work reliably, the tables for mapping between local names and TLD names could be locally determined, and differ from one locale to another, as long as users understood that international interchange of names required using the standard forms. That understanding puts some additional burden of learning on users, although part of it could be assisted by software (see Section 4).
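A locally determined mapping table of the kind described above can be sketched as follows. This is only an illustration under stated assumptions: the table contents (a French-locale table named FRENCH_TLD_NAMES) and the function name to_standard_form are hypothetical, and this document does not define any particular table or API.

```python
# Hypothetical locale-specific table: local-language name -> TLD label that
# actually appears in the DNS. The entries are illustrative assumptions only.
FRENCH_TLD_NAMES = {
    "allemagne": "de",
    "japon": "jp",
}


def to_standard_form(name: str, table: dict[str, str]) -> str:
    """Replace a locally named TLD with the standard label from the table.

    Only the rightmost label (the TLD) is examined; all other labels are
    left untouched, and unknown TLDs pass through unchanged.
    """
    labels = name.split(".")
    labels[-1] = table.get(labels[-1].casefold(), labels[-1])
    return ".".join(labels)
```

A different locale would simply supply a different table; the standard form produced is the same everywhere, which is what makes international interchange of the result safe.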
In any event, at least in the foreseeable future, it is likely that DNS names being passed among users in different countries, or using different languages, will be forced to be in punycode form to guarantee compatibility, since those users would not, in general, have the ability to read each other's scripts or have appropriate input facilities (keyboards, etc.) for them. So the marginal knowledge or effort needed to put TLD names into standard form and transmit them in that way would actually be fairly small.
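As an illustration of the punycode form mentioned above, the following sketch uses the "idna" codec in the Python standard library, which implements the IDNA of [RFC3490]. The function names to_wire_form and to_native_form are assumptions made for this example, not terms defined by this document.

```python
def to_wire_form(name: str) -> str:
    """Convert a native-script domain name to its ASCII ("punycode") form."""
    return name.encode("idna").decode("ascii")


def to_native_form(name: str) -> str:
    """Convert an ASCII ("punycode") domain name back to native script."""
    return name.encode("ascii").decode("idna")
```

For example, to_wire_form("bücher.example") yields "xn--bcher-kva.example", which can be typed and read by users who have no facilities for the original script.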
The concept of using local translation does have one side effect that some portions of the Internet community might consider undesirable. The size and complexity of translation tables, and maintaining those tables, will be, to a considerable extent, a function of the number of top-level domains of interest, the frequency with which new domains are added, and the number of domains added at a time. A country or other locale that wished to maintain a complete set of translations (i.e., so that every TLD had a representation in the local language) would presumably find setting up a table for the current collection of a few hundred domains to be a task that would take some days. If the number of TLDs were relatively stable, with a relatively small number being added at infrequent intervals, the updates could probably be dealt with on an ad hoc basis. But, if large numbers of domains were added frequently, or if the total number of TLDs became very large, maintaining the table might require dedicated staff if each new TLD is to be accommodated. Worse, updating the tables stored on client machines might require update and synchronization protocols and all of the complexities that tend to go with them (see [RFC3696] for a discussion of some related issues in applications).
In practice, there will be little requirement to translate every TLD into a local language. There are existing TLDs for which there are no obvious translations in many languages (most notably, ".arpa") or where the translation will be far from obvious to typical users (for example, ".int" and ".aero"). Of course, these could be translated by function: ".arpa" to the local term for "infrastructure", ".int" with "international" or "international organization", ".aero" with "aeronautical" or "airlines", and so on; but it is not clear whether doing so would have significant value. For almost every language, there are dozens of ccTLDs for which there are no translations of the country names into the local language that would be known by anyone other than geographers. If new TLDs are added, there might not be a strong need (or even capability) to have language-specific equivalents for each.
An immediate question when proposals such as this one are considered is whether the names for the various TLDs that do not match the strings that are actually in the DNS should be standardized and, if so, by what mechanism. Standardization would promote communication within a country or among people sharing a language. However, it is likely to be very difficult to reach appropriate international agreements to which wide conformance could be expected. Exceptions might arise within particular countries or language groups but, even then, there might be advantages to users being able to specify additional synonymous names that are easy for them to remember. As with IDNA-based IDNs, users who wish to transmit information about domain names to people whose exact capabilities and software are unknown, and to do so with minimal risk of confusion, will probably confine themselves to the names that actually appear in the DNS, i.e., the "punycode" representations.
In any event, neither standardization nor uniform use of either the system outlined here or of a specific collection of names is required to make the system work for those who would find it useful. Similarly, mechanisms for country-wide coordination, and examination of the appropriateness or inappropriateness of such mechanisms, are beyond the scope of this document.
Applications that implement the proposal in this document are likely to make the subsequent creation and acceptance of new IDNA-based TLDs significantly more difficult. If this proposal becomes widely adopted, local language names mapped as it suggests will be generally expected by users of those languages to mean the same as a current TLD. Creating a new, stand-alone IDNA-based TLD will then require more deliberation and care to avoid conflicts and, when executed, will require all the application software that maps the name to the existing TLD to change the mapping tables.
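The kind of conflict described above could be detected mechanically before a new IDNA-based TLD is introduced. The sketch below is one possible check, not a procedure defined by this document: the function names and the example table are hypothetical, and labels are compared in their ASCII form using the "idna" codec from the Python standard library.

```python
def wire_form(label: str) -> str:
    """Return the ASCII form of a single label, for comparison purposes."""
    return label.casefold().encode("idna").decode("ascii")


def conflicting_local_names(new_tld: str, table: dict[str, str]) -> list[str]:
    """List local names in a translation table that collide with a new TLD.

    A collision means the table already maps that local name to an existing
    TLD, so the table would have to change if the new TLD were created.
    """
    target = wire_form(new_tld)
    return sorted(local for local in table if wire_form(local) == target)
```

A non-empty result for a given locale's table indicates that applications using that table would need an updated mapping before the new TLD could be reached by its native-script name.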
For several reasons, this problem may not be as serious in practice as it might first appear. For ccTLDs allocated according to the ISO 3166-1 list, there will presumably be no problem at all: not only are the 3166-1 alpha-2 codes strictly in ASCII, but general trends, such as those embodied in ICANN's "GAC Recommendations" against using country names or codes for any purpose not associated with those specific countries, make conflicts with internationalized names extremely unlikely. Because the DNS does not currently have a usable aliasing function (see Section 1.3.2), it is likely that new IDNA-based TLDs will be allocated only after there is considerable opportunity for countries and other individual entities to identify any problems they see with proposed new names.
It should be clear to anyone who has read this far that the mapping described in this document is limited to TLDs, not full domain names or keywords. In particular, nothing here should be construed as applying to anything other than TLDs, due at least in part to the limitations described in Section 3.1. Further, this document is only about the domain name system (DNS), not about any keyword system. The interactions between particular keyword systems and the proposals here are left as a (possibly very difficult) exercise for the reader or implementer of such systems. However, for the subset of such systems whose intent is to entirely hide DNS names or URIs from the user, their output would presumably be the LDH names that actually appeared in the DNS, i.e., in punycode form for IDNA names and without any application processing of the type contemplated here.
This specification is based on a pair of fairly explicit assumptions. The first is that the greatest and most important impact and value of any internationalization or localization technique is to permit users who share a language or culture to communicate with others who also share that language or culture. Communication among users from different cultures, using different languages or different scripts is inherently more difficult, and still more difficult if they cannot easily identify languages and scripts in common. The reasons for those difficulties are age-old issues in language translation and differences among languages and scripts, not problems associated with the DNS or IDNs, however they are represented. That is the second assumption: when communication across language or cultural groups is required, the users who need to do it -- typically a much smaller number than those communicating within the same language and culture -- are going to need to rely on commonly-understood languages and scripts and will need to exert somewhat more care and effort than within their own groups.
As outlined in the sections above, the suggestions made in this document could clearly be turned into major problems by misuse or misunderstanding. For example, if two applications on the same host used different translation tables, a situation could easily result that would be very confusing to the user. However, in some cases, this would be only slightly worse than some of the alternatives. For example, if, on a given system, IDNs are expressed in native script, but ASCII TLD names are used, cutting and pasting from one application to another may not work as expected, unless both applications and the underlying operating system are all Unicode-based and use the same encoding model for Unicode. Some applications writers have already discovered, even without significant use of IDNs, that they need to support separate "copy string" and "copy link location", and the corresponding "paste" operations. Any use of IDNs or Internationalized Resource Identifiers (IRIs, see [RFC3987]) may require similar operations, or extensions to those operations, to force strings into internal ("punycode" or URI) form on the copy operation and to translate them back on paste. Were that done, the appropriate translations could be performed as part of the same process. If this author's hypothesis is correct -- that these operations are likely to be required on many systems whether this proposal is adopted or not -- then the additional translation operations are likely to be invisible to the user.
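The copy and paste operations suggested above might look roughly like the following sketch. Everything named here is an assumption made for illustration: the table LOCAL_TLDS, the two function names, and the choice to handle only the rightmost label. The ASCII conversion uses the "idna" codec from the Python standard library.

```python
# Hypothetical local table (local name -> DNS label) and its inverse.
LOCAL_TLDS = {"allemagne": "de"}
TLD_TO_LOCAL = {tld: local for local, tld in LOCAL_TLDS.items()}


def copy_as_internal(displayed: str) -> str:
    """On copy: map the local TLD name to the DNS label, then force the
    whole name into its internal ("punycode") form."""
    labels = displayed.split(".")
    labels[-1] = LOCAL_TLDS.get(labels[-1].casefold(), labels[-1])
    return ".".join(labels).encode("idna").decode("ascii")


def paste_as_local(internal: str) -> str:
    """On paste: decode the internal form, then present the TLD using the
    receiving application's own local table."""
    labels = internal.encode("ascii").decode("idna").split(".")
    labels[-1] = TLD_TO_LOCAL.get(labels[-1], labels[-1])
    return ".".join(labels)
```

Because only the internal form crosses the application boundary, two applications with different tables would still exchange the same name; each would merely display it differently.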
In particular, precisely because the translated names proposed here are part of a presentation form, rather than the internal form names, they are inappropriate in a number of circumstances in which a globally-unique, internal-form name is actually required. It would be a poor, indeed dangerous, idea to use these names in security contexts such as names in certificates, access lists, or other contexts in which accurate comparisons are necessary.
A more general issue exists when DNS or IRI references are transferred among users whose systems may be localized for different languages or conventions. In general, a user in one part of the world will not actually know how another user's systems are set up, precisely what software is being used, etc., nor should users be expected or forced to learn that information. But, if the user transmitting an internationalized reference doesn't know that the receiving system supports the same characters and fonts, and that the receiving user is prepared to deal with them, the prudent user will transmit the internal form of the reference in addition to, or even instead of, the native-character form. And, of course, if the reference is transmitted on paper, on a sign, in some coded character set other than Unicode, or even as an image, rather than as a Unicode string, supplementing it with the internal form becomes even more important. The addition of a translation requirement for TLD labels makes availability of internal forms in interchange significantly more important, but does not actually change the requirement to do so.
It may be helpful to note that, in a different networking model than that used in the Internet, both this proposal and IDNA itself are essentially "presentation layer" approaches rather than constructions that can be expected to work well in interchange.
This entire specification addresses issues in internationalization and especially the boundaries between internationalization and localization and between network protocols and client/user interface actions.
IDNA provides a client-based mechanism for presenting Unicode names in applications while passing only ASCII-based names on the wire. As such, it constitutes a major step along the path of introducing a client-based presentation layer into the Internet. Client-based presentation layer transformations introduce risks from non-conforming tables that can change meaning without external protection. For example, if a mapping table normally maps A onto C, and that table is altered by an attacker so that A maps onto D instead, much mischief can be committed. On the other hand, these are not the usual sort of network attacks: they may be thought of as falling into the "users can always cause harm to themselves" category. The local translation model outlined here does not significantly increase the risks over those associated with IDNA, but may provide some new avenues for exploiting them.
Both this approach and IDNA rely on having updated programs present information to the user in a very different form than the one in which it is transmitted on the wire. Unless the internal (wire) form is always used in interchange, or at least made available when DNS names are exchanged, there are possibilities for ambiguity and confusion about references. As with IDNA itself, if only the "wire" form is presented, the user will perceive that nothing of value has been done, i.e., that no internationalization or localization has occurred. So presentation of the "wire" form to eliminate the potential ambiguities is unlikely to be considered an acceptable solution, regardless of its security advantages.
If the translation tables associated with the technique suggested here are obtained from a server, or translations are obtained from a remote machine using some protocol, the mechanisms used should ensure that the values received are authentic, i.e., that neither they, nor the query for them, have been intercepted and tampered with in any way.
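This document does not prescribe a mechanism for ensuring authenticity. As one possible sketch, assuming the client and the table server share a secret key and that the table is delivered as UTF-8 JSON alongside an HMAC-SHA-256 tag, a client could refuse any table whose tag does not verify. The function name and the delivery format are assumptions for this example only, and this covers tampering with the response rather than protection of the query itself.

```python
import hashlib
import hmac
import json


def load_verified_table(payload: bytes, tag: bytes, key: bytes) -> dict:
    """Parse a translation table only if its HMAC-SHA-256 tag is valid.

    payload: the table as UTF-8 JSON (assumed format)
    tag:     the HMAC the server sent with it
    key:     a secret shared with the server (assumed to exist)
    """
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    # compare_digest avoids leaking where the two values first differ.
    if not hmac.compare_digest(expected, tag):
        raise ValueError("translation table failed authentication")
    return json.loads(payload.decode("utf-8"))
```

A deployment without a shared key would need a different technique, such as a digital signature or an authenticated transport.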
This document was inspired by a number of conversations in ICANN, IETF, MINC, and private contexts about the future evolution and internationalization of top level domains. Unknown to the author, but unsurprisingly (the general concept should be obvious to anyone even slightly skilled in the relevant technologies), the concept has been apparently developed independently in other groups but, as far as this author knows, not written up for general comment. Discussions within, and about, the ICANN IDN Committee were particularly helpful, although several of the participants in that committee may be surprised about where those discussions led. Email correspondence with several people after the first version of this document was posted, notably Richard Hill, Paul Hoffman, Lee XiaoDong, and Soobok Lee, led to considerable clarification in the subsequent versions. The author is particularly grateful to Paul Hoffman for extensive comments and additional text for the third version and to Patrik Faltstrom, Joel Halpern, Sam Hartman, and Russ Housley for suggestions incorporated into the final one.
The first version of this document was posted on October 21, 2002.
[ISO10646] International Organization for Standardization, "Information Technology - Universal Multiple-octet coded Character Set (UCS) - Part 1: Architecture and Basic Multilingual Plane", ISO Standard 10646-1, May 1993.
[ISO3166] International Organization for Standardization, "Codes for the representation of names of countries and their subdivisions -- Part 1: Country codes", ISO Standard 3166-1:1977, 1997.
[MIME] Borenstein, N. and N. Freed, "MIME (Multipurpose Internet Mail Extensions): Mechanisms for Specifying and Describing the Format of Internet Message Bodies", RFC 1341, June 1992. Updated and replaced by Freed, N. and N. Borenstein, "Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies", RFC 2045, November 1996. Also, Moore, K., "Representation of Non-ASCII Text in Internet Message Headers", RFC 1342, June 1992. Updated and replaced by Moore, K., "MIME (Multipurpose Internet Mail Extensions) Part Three: Message Header Extensions for Non-ASCII Text", RFC 2047, November 1996.
[RFC1034] Mockapetris, P., "Domain names - concepts and facilities", STD 13, RFC 1034, November 1987.
[RFC1035] Mockapetris, P., "Domain names - implementation and specification", STD 13, RFC 1035, November 1987.
[RFC1123] Braden, R., "Requirements for Internet Hosts - Application and Support", STD 3, RFC 1123, October 1989.
[RFC1480] Cooper, A. and J. Postel, "The US Domain", RFC 1480, June 1993.
[RFC1591] Postel, J., "Domain Name System Structure and Delegation", RFC 1591, March 1994.
[RFC2672] Crawford, M., "Non-Terminal DNS Name Redirection", RFC 2672, August 1999.
[RFC3454] Hoffman, P. and M. Blanchet, "Preparation of Internationalized Strings ("stringprep")", RFC 3454, December 2002.
[RFC3467] Klensin, J., "Role of the Domain Name System (DNS)", RFC 3467, February 2003.
[RFC3490] Faltstrom, P., Hoffman, P., and A. Costello, "Internationalizing Domain Names in Applications (IDNA)", RFC 3490, March 2003.
[RFC3491] Hoffman, P. and M. Blanchet, "Nameprep: A Stringprep Profile for Internationalized Domain Names (IDN)", RFC 3491, March 2003.
[RFC3492] Costello, A., "Punycode: A Bootstring encoding of Unicode for Internationalized Domain Names in Applications (IDNA)", RFC 3492, March 2003.
[RFC3696] Klensin, J., "Application Techniques for Checking and Transformation of Names", RFC 3696, February 2004.
[RFC3932] Alvestrand, H., "The IESG and RFC Editor Documents: Procedures", BCP 92, RFC 3932, October 2004.
[RFC3987] Duerst, M. and M. Suignard, "Internationalized Resource Identifiers (IRIs)", RFC 3987, January 2005.
[RFC4033] Arends, R., Austein, R., Larson, M., Massey, D., and S. Rose, "DNS Security Introduction and Requirements", RFC 4033, March 2005.
[RFC4034] Arends, R., Austein, R., Larson, M., Massey, D., and S. Rose, "Resource Records for the DNS Security Extensions", RFC 4034, March 2005.
[RFC4035] Arends, R., Austein, R., Larson, M., Massey, D., and S. Rose, "Protocol Modifications for the DNS Security Extensions", RFC 4035, March 2005.
Author's Address
John C Klensin 1770 Massachusetts Ave, #322 Cambridge, MA 02140 USA
Phone: +1 617 491 5735 EMail: john-ietf@jck.com
Full Copyright Statement
Copyright (C) The Internet Society (2005).
This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights.
This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Intellectual Property
The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79.
Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf-ipr@ietf.org.
Acknowledgement
Funding for the RFC Editor function is currently provided by the Internet Society.