Internet Architecture Board (IAB)                            A. Sullivan
Request for Comments: 6912                                     Dyn, Inc.
Category: Informational                                        D. Thaler
ISSN: 2070-1721                                                Microsoft
                                                              J. Klensin
        
Internet Architecture Board (IAB)                            A. Sullivan
Request for Comments: 6912                                     Dyn, Inc.
Category: Informational                                        D. Thaler
ISSN: 2070-1721                                                Microsoft
                                                              J. Klensin
        

O. Kolkman NLnet Labs April 2013

O.Kolkman NLnet实验室2013年4月

Principles for Unicode Code Point Inclusion in Labels in the DNS

DNS中标签中包含Unicode代码点的原则

Abstract

摘要

Internationalized Domain Names in Applications (IDNA) makes available to DNS zone administrators a very wide range of Unicode code points. Most operators of zones should probably not permit registration of U-labels using the entire range. This is especially true of zones that accept registrations across organizational boundaries, such as top-level domains and, most importantly, the root. It is unfortunately not possible to generate algorithms to determine whether permitting a code point presents a low risk. This memo presents a set of principles that can be used to guide the decision of whether a Unicode code point may be wisely included in the repertoire of permissible code points in a U-label in a zone.

应用程序中的国际化域名(IDNA)为DNS区域管理员提供了非常广泛的Unicode代码点。区域的大多数运营商可能不允许使用整个范围注册U型标签。对于跨组织边界接受注册的区域尤其如此,例如顶级域,最重要的是根域。不幸的是,无法生成算法来确定允许代码点是否具有低风险。本备忘录提出了一套原则,可用于指导决定是否将Unicode代码点明智地包括在分区U型标签中允许的代码点清单中。

Status of This Memo

关于下段备忘

This document is not an Internet Standards Track specification; it is published for informational purposes.

本文件不是互联网标准跟踪规范;它是为了提供信息而发布的。

This document is a product of the Internet Architecture Board (IAB) and represents information that the IAB has deemed valuable to provide for permanent record. It represents the consensus of the Internet Architecture Board (IAB). Documents approved for publication by the IAB are not a candidate for any level of Internet Standard; see Section 2 of RFC 5741.

本文件是互联网体系结构委员会(IAB)的产品,代表IAB认为有价值提供永久记录的信息。它代表了互联网体系结构委员会(IAB)的共识。IAB批准发布的文件不适用于任何级别的互联网标准;见RFC 5741第2节。

Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at http://www.rfc-editor.org/info/rfc6912.

有关本文件当前状态、任何勘误表以及如何提供反馈的信息,请访问http://www.rfc-editor.org/info/rfc6912.

Copyright Notice

版权公告

Copyright (c) 2013 IETF Trust and the persons identified as the document authors. All rights reserved.

版权所有(c)2013 IETF信托基金和确定为文件作者的人员。版权所有。

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document.

本文件受BCP 78和IETF信托有关IETF文件的法律规定的约束(http://trustee.ietf.org/license-info)自本文件出版之日起生效。请仔细阅读这些文件,因为它们描述了您对本文件的权利和限制。

Table of Contents

目录

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   3
     1.1.  Terminology . . . . . . . . . . . . . . . . . . . . . . .   3
   2.  Background  . . . . . . . . . . . . . . . . . . . . . . . . .   4
     2.1.  More-Restrictive Rules Going Up the DNS Tree  . . . . . .   6
   3.  Principles Applicable to All Zones  . . . . . . . . . . . . .   6
     3.1.  Longevity Principle . . . . . . . . . . . . . . . . . . .   6
     3.2.  Least Astonishment Principle  . . . . . . . . . . . . . .   6
     3.3.  Contextual Safety Principle . . . . . . . . . . . . . . .   7
   4.  Principles Applicable to All Public Zones . . . . . . . . . .   7
     4.1.  Conservatism Principle  . . . . . . . . . . . . . . . . .   7
     4.2.  Inclusion Principle . . . . . . . . . . . . . . . . . . .   7
     4.3.  Simplicity Principle  . . . . . . . . . . . . . . . . . .   7
     4.4.  Predictability Principle  . . . . . . . . . . . . . . . .   8
     4.5.  Stability Principle . . . . . . . . . . . . . . . . . . .   8
   5.  Principle Specific to the Root Zone . . . . . . . . . . . . .   8
     5.1.  Letter Principle  . . . . . . . . . . . . . . . . . . . .   8
   6.  Confusion and Context . . . . . . . . . . . . . . . . . . . .   9
   7.  Conclusion  . . . . . . . . . . . . . . . . . . . . . . . . .   9
   8.  Security Considerations . . . . . . . . . . . . . . . . . . .  10
   9.  Acknowledgements  . . . . . . . . . . . . . . . . . . . . . .  10
   10. IAB Members at the Time of Approval . . . . . . . . . . . . .  10
   11. Informative References  . . . . . . . . . . . . . . . . . . .  10
        
   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   3
     1.1.  Terminology . . . . . . . . . . . . . . . . . . . . . . .   3
   2.  Background  . . . . . . . . . . . . . . . . . . . . . . . . .   4
     2.1.  More-Restrictive Rules Going Up the DNS Tree  . . . . . .   6
   3.  Principles Applicable to All Zones  . . . . . . . . . . . . .   6
     3.1.  Longevity Principle . . . . . . . . . . . . . . . . . . .   6
     3.2.  Least Astonishment Principle  . . . . . . . . . . . . . .   6
     3.3.  Contextual Safety Principle . . . . . . . . . . . . . . .   7
   4.  Principles Applicable to All Public Zones . . . . . . . . . .   7
     4.1.  Conservatism Principle  . . . . . . . . . . . . . . . . .   7
     4.2.  Inclusion Principle . . . . . . . . . . . . . . . . . . .   7
     4.3.  Simplicity Principle  . . . . . . . . . . . . . . . . . .   7
     4.4.  Predictability Principle  . . . . . . . . . . . . . . . .   8
     4.5.  Stability Principle . . . . . . . . . . . . . . . . . . .   8
   5.  Principle Specific to the Root Zone . . . . . . . . . . . . .   8
     5.1.  Letter Principle  . . . . . . . . . . . . . . . . . . . .   8
   6.  Confusion and Context . . . . . . . . . . . . . . . . . . . .   9
   7.  Conclusion  . . . . . . . . . . . . . . . . . . . . . . . . .   9
   8.  Security Considerations . . . . . . . . . . . . . . . . . . .  10
   9.  Acknowledgements  . . . . . . . . . . . . . . . . . . . . . .  10
   10. IAB Members at the Time of Approval . . . . . . . . . . . . .  10
   11. Informative References  . . . . . . . . . . . . . . . . . . .  10
        
1. Introduction
1. 介绍

Operators of a DNS zone need to set policies around what Unicode code points are allowed in labels in that zone. Typically there are a number of important goals to consider when constructing such policies. These include, for instance, avoiding possible visual confusability between two labels, avoiding possible confusion between Fully Qualified Domain Names (FQDNs) and IP address literals, accessibility to the disabled (see "Web Content Accessibility Guidelines (WCAG) 2.0" [WCAG20] for some discussion in a web context), and other usability issues.

DNS区域的操作员需要围绕该区域的标签中允许的Unicode代码点设置策略。通常,在构建这样的政策时,有许多重要的目标要考虑。这些包括,例如,避免两个标签之间可能的视觉混淆,避免完全限定域名(FQDN)和IP地址文字之间可能的混淆,残疾人的可访问性(有关Web上下文中的一些讨论,请参阅“Web内容可访问性指南(WCAG)2.0”[WCAG20]),以及其他可用性问题。

This document provides a set of principles that zone operators can use to construct their code point policies in order to improve usability and clarity and thereby reduce confusion.

本文档提供了一组原则,区域操作员可以使用这些原则来构建其代码点策略,以提高可用性和清晰度,从而减少混淆。

1.1. Terminology
1.1. 术语

This document uses the following terms.

本文件使用以下术语。

A-label: an LDH label that starts with "xn--" and meets all the IDNA requirements, with additional restrictions as explained in Section 2.3.2.1 of the IDNA Definitions document [RFC5890].

A-标签:一种LDH标签,以“xn-”开头,满足所有IDNA要求,附加限制如IDNA定义文件[RFC5890]第2.3.2.1节所述。

Character: a member of a set of elements used for the organization, control, or representation of data. See Section 2 of the Internationalization Terminology document [RFC6365] for more details.

字符:用于组织、控制或表示数据的一组元素的成员。有关更多详细信息,请参阅国际化术语文档[RFC6365]的第2节。

Language: a way that humans communicate. The use of language occurs in many forms, the most common of which are speech, writing, and signing. See Section 2 of RFC 6365 for more details.

语言:人类交流的一种方式。语言的使用有多种形式,其中最常见的是演讲、写作和签名。详见RFC 6365第2节。

LDH label: a string consisting of ASCII letters, digits, and the hyphen, with additional restrictions as explained in Section 2.3.1 of RFC 5890.

LDH标签:由ASCII字母、数字和连字符组成的字符串,附加限制如RFC 5890第2.3.1节所述。

Public zone: in this document, a DNS zone that accepts registration requests from organizations outside the zone administrator's own organization. (Whether the zone performs delegation is a separate question. What is important is the diversity of the registration-requesting community.) Note that under this definition, the root zone is a public zone, though one that has a unique function in the DNS.

公共区域:在本文档中,一个DNS区域,它接受来自区域管理员自己组织之外的组织的注册请求。(区域是否执行委派是一个单独的问题。重要的是注册请求社区的多样性。)请注意,根据此定义,根区域是一个公共区域,尽管它在DNS中具有唯一的功能。

Rendering: the display of a string of text. See Section 5 of RFC 6365 for more details.

呈现:文本字符串的显示。详见RFC 6365第5节。

Script: a set of graphic characters used for the written form of one or more languages. See Section 2 of RFC 6365 for more details.

脚本:一组图形字符,用于一种或多种语言的书面形式。详见RFC 6365第2节。

U-label: a string of Unicode characters that meets all the IDNA requirements and includes at least one non-ASCII character, with additional restrictions as explained in Section 2.3.2.1 of RFC 5890.

U-label:符合所有IDNA要求的Unicode字符字符串,至少包括一个非ASCII字符,附加限制如RFC 5890第2.3.2.1节所述。

Writing system: a set of rules for using one or more scripts to write a particular language. See Section 2 of RFC 6365 for more details.

书写系统:使用一个或多个脚本编写特定语言的一组规则。详见RFC 6365第2节。

This memo does not propose a protocol standard, and the use of words such as "should" follow the ordinary English meaning, and not that laid out in [RFC2119].

本备忘录未提出协议标准,诸如“应”等词语的使用遵循普通英语含义,而不是[RFC2119]中规定的含义。

2. Background
2. 出身背景

In recent communications [IABCOMM1] [IABCOMM2], the IAB has emphasized the importance of conservatism in allocating labels conforming to IDNA2008 [RFC5890] [RFC5891] [RFC5892] [RFC5893] [RFC5894] [RFC5895] in DNS zones, and especially in the root zone. Traditional LDH labels in the root zone used only alphabetic characters (i.e., ASCII a-z, which under the DNS also match A-Z). Matters are more complicated with U-labels, however. The IAB communications recommended that U-labels permit only code points with a General_Category (gc) of Ll (Lowercase_Letter), Lo (Other_Letter), or Lm (Modifier_Letter), but noted that for practical considerations other code points might be permitted on a case-by-case basis.

在最近的通信[IABCOM1][IABCOM2]中,IAB强调了在DNS区域,特别是根区域分配符合IDNA2008[RFC5890][RFC5891][RFC5892][RFC5893][RFC5894][RFC5895]的标签时保守主义的重要性。根区域中的传统LDH标签仅使用字母字符(即ASCII a-z,在DNS下也与a-z匹配)。然而,U型标签的问题更为复杂。IAB通信建议U型标签仅允许通用字母类别(gc)为Ll(小写字母)、Lo(其他字母)或Lm(修饰字母)的代码点,但注意到出于实际考虑,可能会根据具体情况允许其他代码点。

The IAB recommendations do, however, leave some issues open that need to be addressed. It is not clear that all code points permitted under IDNA2008 that have a General_Category of Lo or Lm are appropriate for a zone such as the root zone. To take but one example, the code point U+02BC (MODIFIER LETTER APOSTROPHE) has a General_Category of Lm. In practically every rendering (and we are unaware of an exception), U+02BC is indistinguishable from U+2019 (RIGHT SINGLE QUOTATION MARK), which has a General_Category of Pf (Final_Punctuation). U+02BC will also be read by large numbers of people as being the same character as U+0027 (APOSTROPHE), which has a General_Category of Po (Other_Punctuation), and some computer systems may treat U+02BC as U+0027. U+02BC is PROTOCOL VALID (PVALID) under IDNA2008 (see the IDNA Code Points document [RFC5892]), whereas both other code points are DISALLOWED. So, to begin with, it is plain that not every code point with a

然而,IAB的建议确实留下了一些有待解决的问题。目前尚不清楚IDNA2008允许的所有代码点是否适用于根区域等区域,这些代码点具有Lo或Lm的一般_类别。仅举一个例子,代码点U+02BC(修饰字母撇号)的一般类别为Lm。在几乎所有渲染中(我们不知道有例外),U+02BC与U+2019(右单引号)是无法区分的,U+2019的一般类别为Pf(最终标点符号)。大量人也会将U+02BC理解为与U+0027(撇号)相同的字符,U+0027具有Po(其他标点符号)的一般类别,一些计算机系统可能将U+02BC视为U+0027。根据IDNA2008(参见IDNA代码点文档[RFC5892]),U+02BC是协议有效(PVALID),而其他两个代码点都是不允许的。因此,首先,很明显,不是每个代码点都有

General_Category of Ll, Lo, or Lm is consistent with the type of conservatism principle discussed in Section 4.1 below or the previous IAB recommendations.

Ll、Lo或Lm的一般类别与下文第4.1节讨论的保守主义原则类型或之前的IAB建议一致。

To make matters worse, some languages are dependent on code points with General_Category Mc (Spacing_Mark) or General_Category Mn (Nonspacing_Mark). This dependency is particularly common in Indic languages, though not exclusive to them. (At the risk of vastly oversimplifying, the overarching issue is mostly the interaction of complex writing systems and the way Unicode works.) To restrict users of those languages to only code points with General_Category of Ll, Lo, or Lm would be extremely limiting. While DNS labels are not words, or sentences, or phrases (as noted in the next steps for IDN [RFC4690]), they are intended to support useful mnemonics. Mnemonics that diverge wildly from the usual conventions are poor ones, because in not following the usual conventions they are not easy to remember. Also, wide divergence from usual conventions, if not well-justified (and especially in a shared namespace like the root), invites political controversy.

更糟糕的是,有些语言依赖于带有General_Category Mc(间距标记)或General_Category Mn(非间距标记)的代码点。这种依赖性在印度语中尤其常见,尽管不是它们独有的。(冒着过于简单化的风险,首要问题主要是复杂书写系统的交互和Unicode的工作方式。)将这些语言的用户限制为仅使用一般类别为Ll、Lo或Lm的代码点将是极其有限的。虽然DNS标签不是单词、句子或短语(如IDN[RFC4690]下一步所述),但它们旨在支持有用的助记符。偏离常规的记忆法很差,因为不遵循常规的记忆法不容易记住。此外,与常规惯例的巨大分歧,如果没有充分的理由(尤其是在根这样的共享名称空间中),也会引发政治争议。

Many of the issues above turn out to be relevant to all public zones. Moreover, the overall issue of developing a policy for code point permission is common to all zones that accept A-labels or U-labels for registration. As Section 4.3 of the IDNA Protocol document [RFC5891] says, every registry at every level of the DNS is "expected to establish policies about label registrations".

上述许多问题都与所有公共区域有关。此外,制定代码点许可政策的总体问题对于所有接受a标签或U标签注册的区域来说都是共同的。正如IDNA协议文件[RFC5891]第4.3节所述,DNS每一级的每个注册中心“都应该建立关于标签注册的政策”。

For reasons of sound management, it is not desirable to decide whether to permit a given code point only when an application containing that code point is pending. That approach reduces predictability and is bound to appear subject to special pleas. It is better instead to produce the rules governing acceptance of code points in advance.

出于可靠管理的原因,不希望仅在包含给定代码点的应用程序挂起时才决定是否允许该代码点。这种方法降低了可预测性,必然会受到特别请求的约束。相反,最好提前制定规范代码点接受的规则。

As is evident from the foregoing discussion about the Letter and Mark categories, it is simply not possible to make code point decisions algorithmically. If it were possible to develop such an algorithm, it would already exist: the DNS is hardly unique in needing to impose restrictions on code points while accommodating many different linguistic communities. Nevertheless, new guidelines can be made by starting from overarching principles. These guidelines act more as meta-rules, leading to the establishment of other rules about the inclusion and exclusion of particular code points in labels in a given zone, always based on the list of code points permitted by IDNA.

从前面关于字母和标记类别的讨论中可以明显看出,根本不可能通过算法做出代码点决策。如果有可能开发出这样一种算法,那么它已经存在了:DNS在需要对代码点施加限制的同时,也适应了许多不同的语言社区,这并不是独一无二的。然而,可以从总体原则出发制定新的指导方针。这些准则更像元规则,导致建立关于在给定区域的标签中包含和排除特定代码点的其他规则,始终基于IDNA允许的代码点列表。

2.1. More-Restrictive Rules Going Up the DNS Tree
2.1. DNS树上的限制性规则更多

A set of principles derived from the above ideas follows in Sections 3 through 5 below. Such principles fall into three categories. Some principles apply to every DNS zone. Some additional principles apply to all public zones, including the root zone. Finally, other principles apply only to the root zone. This means that zones higher in the DNS tree tend to have more restrictive rules (since additional principles apply), and zones lower in the DNS tree tend to have less restrictive rules, since they are used within a more narrow context. In general, the relevant context for a principle is that of the zone, not that of a given subset of the user community; for the root zone, for example, the context is "the entire Internet population".

以下第3节至第5节介绍了从上述思想中得出的一套原则。这些原则分为三类。有些原则适用于每个DNS区域。一些附加原则适用于所有公共区域,包括根区域。最后,其他原则仅适用于根区域。这意味着DNS树中较高的区域往往具有更多的限制性规则(因为应用了其他原则),而DNS树中较低的区域往往具有更少的限制性规则,因为它们在更狭窄的上下文中使用。一般来说,原则的相关上下文是区域上下文,而不是用户社区的给定子集上下文;例如,对于根区域,上下文是“整个互联网人口”。

3. Principles Applicable to All Zones
3. 适用于所有区域的原则
3.1. Longevity Principle
3.1. 长寿原则

Unicode properties of a code point ought to be stable across the versions of Unicode that users of the zone are likely to have installed. Because it is possible for the properties of a code point to change between Unicode versions, a good way to predict such stability is to ensure that a code point has in fact been stable for multiple successive versions of Unicode. This principle is related to the Stability Principle in Section 4.5.

代码点的Unicode属性应该在区域用户可能安装的Unicode版本中保持稳定。由于代码点的属性可能在Unicode版本之间发生变化,因此预测这种稳定性的一个好方法是确保一个代码点实际上对于多个连续的Unicode版本是稳定的。该原则与第4.5节中的稳定性原则有关。

The more diverse the community using the zone, the greater the importance of following this principle. The policy for a leaf zone in the DNS might only require stability across two Unicode versions, whereas a more public zone might require stability across four or more releases before the code point's properties are considered long-lived and stable.

使用区域的社区越多样化,遵循这一原则的重要性就越大。DNS中叶区域的策略可能只需要跨两个Unicode版本的稳定性,而更公共的区域可能需要跨四个或更多版本的稳定性,然后才能将代码点的属性视为长期和稳定的。

3.2. Least Astonishment Principle
3.2. 最小惊奇原则

Every zone administrator should be sensitive to the likely use of a code point to be permitted, particularly taking into account the population likely to use the zone. Zone administrators should especially consider whether a candidate code point could present difficulty if the code point is encountered outside the usual linguistic circumstances. By the same token, the failure to support a code point that is normal in some linguistic circumstances could be very surprising for users likely to encounter the names in that circumstance.

每个区域管理员都应该对可能允许使用的代码点保持敏感,特别是考虑到可能使用该区域的人口。区域管理员应该特别考虑如果在普通语言环境之外遇到代码点,候选代码点是否会出现困难。出于同样的原因,在某些语言环境中无法支持正常的代码点,这对于在这种情况下可能遇到名称的用户来说可能是非常令人惊讶的。

3.3. Contextual Safety Principle
3.3. 上下文安全原则

Every zone administrator should be sensitive to ways in which a code point that is permitted could be used in support of malicious activity. This is not a completely new problem: the digit 1 and the lowercase letter l are, for instance, easily confused in many contexts. The very large repertoire of code points in Unicode (even just the subset permitted for IDNs) makes the problem somewhat worse, just because of the scale.

每个区域管理员都应该对允许的代码点用于支持恶意活动的方式非常敏感。这不是一个全新的问题:例如,数字1和小写字母l在许多上下文中很容易混淆。Unicode中的大量代码点(甚至只是IDN允许的子集)使问题变得更糟,这仅仅是因为规模。

4. Principles Applicable to All Public Zones
4. 适用于所有公共区域的原则
4.1. Conservatism Principle
4.1. 保守主义原则

Public zones are, by definition, zones that are shared by different groups of people. Therefore, any decision to permit a code point in a public zone (including the root) should be as conservative as practicable. Doubts should always be resolved in favor of rejecting a code point for inclusion rather than in favor of including it, in order to minimize risk.

根据定义,公共区域是由不同人群共享的区域。因此,任何允许在公共区域(包括根)中设置代码点的决定都应尽可能保守。为了将风险降至最低,应始终通过拒绝包含代码点而不是包含代码点来解决疑问。

4.2. Inclusion Principle
4.2. 包容原则

Just as IDNA2008 starts from the principle that the Unicode range is excluded, and then adds code points according to derived properties of the code points, so a public zone should only permit inclusion of a code point if it is known to be "safe" in terms of usability and confusability within the context of that zone. The default treatment of a code point should be that it is excluded.

正如IDNA2008从排除Unicode范围的原则开始,然后根据代码点的派生属性添加代码点一样,公共区域应该只允许包含在该区域上下文中可用性和易混淆性已知“安全”的代码点。代码点的默认处理方式应该是将其排除在外。

4.3. Simplicity Principle
4.3. 简单性原则

The rules for determining whether a code point is to be included should be simple enough that they are readily understood by someone with a moderate background in the DNS and Unicode issues. This principle does not mean that a completely naive person needs to be able to understand the rationale for including a code point, but it does mean that if the reason for inclusion of a very peculiar code point, even a safe one, is too difficult to understand, the code point would not be permitted.

确定是否包含代码点的规则应该足够简单,以便具有DNS和Unicode问题中等背景的人容易理解。这一原则并不意味着一个完全天真的人需要能够理解包含代码点的理由,但它确实意味着,如果包含非常特殊的代码点(即使是安全的代码点)的原因太难理解,则不允许包含代码点。

The meaning of "simple" or "readily understood" is context-dependent. For instance, the root zone has to serve everyone in the world; for practical purposes, this means that the reasons for including a code point need to be comprehensible even to people who cannot use the script where the code point is found. In a zone that permits a constrained subset of Unicode characters (for instance, only those needed to write a single alphabetic language) and that supports a

“简单”或“易于理解”的含义取决于上下文。例如,根区必须服务于世界上的每一个人;出于实际目的,这意味着包含代码点的原因需要让那些无法使用找到代码点的脚本的人也能理解。在允许Unicode字符的受限子集(例如,仅允许编写单一字母语言所需的字符)且支持

clearly delineated linguistic community (for instance, the speakers of a single language with well-understood written conventions), more complicated rules might be acceptable. Compare this principle with the Least Astonishment Principle in Section 3.2.

明确界定的语言群体(例如,使用一种语言的人,他们的书面约定很好理解),更复杂的规则可能是可以接受的。将该原则与第3.2节中的最小惊奇原则进行比较。

4.4. Predictability Principle
4.4. 可预测性原则

The rules for determining whether a code point is to be included should be predictable enough that those with the requisite understanding of DNS, IDNA, and Unicode will usually reach the same conclusion. This is not a requirement for algorithmic treatment of code points; as previously noted, that is not possible. Rather, it is to say that the consistent application of professional judgment is likely to yield the same results; combined with the principle in Section 4.1, when results are not predictable, the anomalous code point would not be permitted.

确定是否包含代码点的规则应该足够可预测,以便那些对DNS、IDNA和Unicode有必要了解的人通常会得出相同的结论。这不是代码点算法处理的要求;如前所述,这是不可能的。相反,这是说,专业判断的一致应用可能会产生相同的结果;结合第4.1节中的原则,当结果不可预测时,不允许出现异常代码点。

Just as in Section 4.3, this principle tends to cause more restriction the more diverse the community using the zone; it is most restrictive for the root zone. This is because what is predictable within a given language community is possibly very surprising across languages.

正如第4.3节所述,这一原则往往会造成更多的限制,社区使用区域的多样性越高;它对根区域的限制最大。这是因为在一个给定的语言社区中可以预测的东西在不同的语言中可能是非常令人惊讶的。

4.5. Stability Principle
4.5. 稳定性原理

Once a code point is permitted, it is at least very hard to stop permitting that code point. In public zones (including the root), the list of code points to be permitted should change very slowly, if at all, and usually only in the direction of permitting an addition as time and experience indicate that inclusion of such a code point is both safe and consistent with these principles.

一旦一个代码点被允许,至少很难停止对该代码点的允许。在公共区域(包括根区域),允许的代码点列表应非常缓慢地更改(如果有),并且通常仅在允许添加的方向上更改,因为时间和经验表明,包含此类代码点既安全又符合这些原则。

5. Principle Specific to the Root Zone
5. 特定于根区域的原则
5.1. Letter Principle
5.1. 字母原则

"Requirements for Internet Hosts - Application and Support" [RFC1123] notes that top-level labels "will be alphabetic". In the absence of widespread agreement about the force of that note, prudence suggests that U-labels in the root zone should exclude code points that are not normally used to write words, or that are in some cases normally used for purposes other than writing words. This is not the same as using Unicode's General_Category to include only letters. It is a restriction that expands the possible class of included code points beyond the Unicode letters, but only expands so far as to include the things that are normally used to create words. Under this principle, code points with (for example) General_Category Mn (Nonspacing_Mark) might be included -- but only those that are used to write words and

“互联网主机要求-应用程序和支持”[RFC1123]指出顶级标签“将按字母顺序排列”。在没有就该注释的效力达成广泛一致的情况下,prudence建议根区域的U形标签应排除通常不用于书写文字的代码点,或者在某些情况下通常用于书写文字以外的目的的代码点。这与使用Unicode的General_类别仅包含字母不同。这是一个限制,它将可能包含的代码点类扩展到Unicode字母之外,但仅扩展到包含通常用于创建单词的内容。根据这一原则,可能会包括(例如)带有General_Category Mn(非空格标记)的代码点——但只包括那些用于编写单词和字符的代码点

not (for instance) musical symbols. In addition, such marks should only be used within a label in ways that they would be used when making a word: combinations that would be nonsense when used in a word should also be rejected when tried in DNS labels. This principle should be applied as narrowly as possible; as the next steps for IDN document [RFC4690] says, "While DNS labels may conveniently be used to express words in many circumstances, the goal is not to express words (or sentences or phrases), but to permit the creation of unambiguous labels with good mnemonic value".

不是(例如)音乐符号。此外,此类标记只能在标签中使用,其使用方式应与制作单词时相同:在DNS标签中尝试时,也应拒绝在单词中使用无意义的组合。这一原则的适用范围应尽可能狭窄;正如IDN文档[RFC4690]的下一步所述,“虽然DNS标签可以方便地用于在许多情况下表达单词,但目标不是表达单词(或句子或短语),而是允许创建具有良好记忆价值的明确标签”。

6. Confusion and Context
6. 混淆与语境

While many discussions of confusion have focused on characters, e.g., whether two characters are confusable with each other (and under what circumstances), a focus on characters alone could lead to the prohibition of very large numbers of labels, including many that present little risk. Instead, the focus should be on whether one label is confusable with another. For example, if a label contains several characters that are distinct to a particular script, and all of its characters are from that script, it is inherently not confusable with a label from any other script no matter what other characters might appear in it. Another label that lacks those distinguishing characters might be a problem. The notion extends from labels to domain names, in the sense that distinguishing characters used in a higher-level label may set expectations with respect to the characters in the lower-level labels. This expectation might be regarded as a benefit, but it is also a problem, since there is no technical way to require consistent policies in delegated namespaces.

虽然许多关于混淆的讨论都集中在字符上,例如,两个字符是否可相互混淆(以及在什么情况下),但仅关注字符就可能导致禁止大量标签,包括许多风险很小的标签。相反,重点应该放在一个标签与另一个标签是否容易混淆上。例如,如果标签包含多个与特定脚本不同的字符,并且其所有字符都来自该脚本,则无论其他字符可能出现在标签中,标签本质上都不会与任何其他脚本的标签混淆。另一个缺少这些区别字符的标签可能是一个问题。这一概念从标签延伸到域名,从某种意义上说,在更高级别标签中使用的区别字符可能会设定对较低级别标签中字符的期望。这种期望可能被认为是一种好处,但也是一个问题,因为在委派的名称空间中没有技术方法要求一致的策略。

7. Conclusion
7. 结论

The principles outlined in this document can be applied when considering any range of Unicode code points for possible inclusion in a DNS zone. It is worth observing that doing anything (especially in light of Section 4.5) implicitly disadvantages communities with a writing system not yet well understood and not represented in the technical and policy communities involved in the discussion. That disadvantage is to be guarded against as much as practical, but is effectively impossible to prevent (while still taking action) in light of imperfect human knowledge.

当考虑可能包含在DNS区域中的任何Unicode代码点范围时,可以应用本文档中概述的原则。值得注意的是,做任何事情(特别是根据第4.5节)都会暗含着对写作系统的不利影响,因为该系统尚未得到充分理解,也没有在参与讨论的技术和政策团体中得到体现。这一劣势需要尽可能地加以防范,但鉴于人类知识的不完善,这一劣势实际上是不可能预防的(尽管仍在采取行动)。

8. Security Considerations
8. 安全考虑

The principles outlined in this memo are intended to improve usability and clarity and thereby reduce confusion among different labels. While these principles may contribute to reduction of risk, they are not sufficient to provide a comprehensive internationalization policy for zone management.

本备忘录中概述的原则旨在提高可用性和清晰度,从而减少不同标签之间的混淆。虽然这些原则可能有助于降低风险,但不足以为区域管理提供全面的国际化政策。

Additional discussion of security considerations can be found in the Unicode Security Considerations [UTR36].

有关安全注意事项的更多讨论,请参见Unicode安全注意事项[UTR36]。

9. Acknowledgements
9. 致谢

The authors thank the participants in the IAB Internationalization program for the discussion of the ideas in this memo, particularly Marc Blanchet. In addition, Stephane Bortzmeyer, Paul Hoffman, Daniel Kalchev, Panagiotis Papaspiliopoulos, and Vaggelis Segredakis made specific comments.

作者感谢IAB国际化项目的参与者讨论了本备忘录中的想法,特别是Marc Blanchet。此外,Stephane Bortzmeyer、Paul Hoffman、Daniel Kalchev、Panagiotis Papaspiliopoulos和Vaggelis Segredakis发表了具体评论。

10. IAB Members at the Time of Approval
10. 批准时的IAB成员

Bernard Aboba Jari Arkko Marc Blanchet Ross Callon Alissa Cooper Spencer Dawkins Joel Halpern Russ Housley David Kessens Danny McPherson Jon Peterson Dave Thaler Hannes Tschofenig

伯纳德·阿博巴·贾里·阿尔科·马克·布兰切特·罗斯·卡隆·艾莉莎·库珀·斯宾塞·道金斯·乔尔·哈尔本·罗斯·霍斯利·大卫·凯森斯·丹尼·麦克弗森·乔恩·彼得森·戴夫·泰勒·汉内斯·茨霍芬尼

11. Informative References
11. 资料性引用

[IABCOMM1] Internet Architecture Board, "IAB Statement: 'The interpretation of rules in the ICANN gTLD Applicant Guidebook'", February 2012, <http://www.iab.org/ documents/correspondence-reports-documents/201/>.

[IABCOMM1]互联网架构委员会,“IAB声明:ICANN gTLD申请人指南中的规则解释”,2012年2月<http://www.iab.org/ 文件/通信报告文件/201/>。

[IABCOMM2] Internet Architecture Board, "Response to ICANN questions concerning 'The interpretation of rules in the ICANN gTLD Applicant Guidebook'", March 2012, <http://www.iab.org/ documents/correspondence-reports-documents/201/>.

[IABCOMM2]互联网架构委员会,“对ICANN关于“ICANN gTLD申请人指南中规则的解释”的问题的回应”,2012年3月<http://www.iab.org/ 文件/通信报告文件/201/>。

[RFC1123] Braden, R., "Requirements for Internet Hosts - Application and Support", STD 3, RFC 1123, October 1989.

[RFC1123]Braden,R.,“互联网主机的要求-应用和支持”,STD 3,RFC 1123,1989年10月。

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC2119]Bradner,S.,“RFC中用于表示需求水平的关键词”,BCP 14,RFC 2119,1997年3月。

[RFC4690] Klensin, J., Faltstrom, P., Karp, C., and IAB, "Review and Recommendations for Internationalized Domain Names (IDNs)", RFC 4690, September 2006.

[RFC4690]Klensin,J.,Faltstrom,P.,Karp,C.,和IAB,“国际化域名(IDN)的审查和建议”,RFC 46902006年9月。

[RFC5890] Klensin, J., "Internationalized Domain Names for Applications (IDNA): Definitions and Document Framework", RFC 5890, August 2010.

[RFC5890]Klensin,J.,“应用程序的国际化域名(IDNA):定义和文档框架”,RFC 58902010年8月。

[RFC5891] Klensin, J., "Internationalized Domain Names in Applications (IDNA): Protocol", RFC 5891, August 2010.

[RFC5891]Klensin,J.,“应用程序中的国际化域名(IDNA):协议”,RFC 58912010年8月。

[RFC5892] Faltstrom, P., "The Unicode Code Points and Internationalized Domain Names for Applications (IDNA)", RFC 5892, August 2010.

[RFC5892]Faltstrom,P.,“Unicode代码点和应用程序的国际化域名(IDNA)”,RFC 58922010年8月。

[RFC5893] Alvestrand, H. and C. Karp, "Right-to-Left Scripts for Internationalized Domain Names for Applications (IDNA)", RFC 5893, August 2010.

[RFC5893]Alvestrand,H.和C.Karp,“应用程序国际化域名(IDNA)的从右到左脚本”,RFC 58932010年8月。

[RFC5894] Klensin, J., "Internationalized Domain Names for Applications (IDNA): Background, Explanation, and Rationale", RFC 5894, August 2010.

[RFC5894]Klensin,J.“应用程序的国际化域名(IDNA):背景、解释和基本原理”,RFC 58942010年8月。

[RFC5895] Resnick, P. and P. Hoffman, "Mapping Characters for Internationalized Domain Names in Applications (IDNA) 2008", RFC 5895, September 2010.

[RFC5895]Resnick,P.和P.Hoffman,“应用程序中国际化域名的映射字符(IDNA)2008”,RFC 58952010年9月。

[RFC6365] Hoffman, P. and J. Klensin, "Terminology Used in Internationalization in the IETF", BCP 166, RFC 6365, September 2011.

[RFC6365]Hoffman,P.和J.Klensin,“IETF国际化中使用的术语”,BCP 166,RFC 6365,2011年9月。

[UTR36] Davis, M. and M. Suignard, "Unicode Security Considerations", Unicode Technical Report #36, July 2012.

[UTR36]Davis,M.和M.Suignard,“Unicode安全注意事项”,Unicode技术报告#36,2012年7月。

[WCAG20] W3C, "Web Content Accessibility Guidelines (WCAG) 2.0", W3C Recommendation, December 2008, <http://www.w3.org/TR/2008/REC-WCAG20-20081211/>.

[WCAG20]W3C,“Web内容可访问性指南(WCAG)2.0”,W3C建议,2008年12月<http://www.w3.org/TR/2008/REC-WCAG20-20081211/>.

Authors' Addresses

作者地址

Andrew Sullivan Dyn, Inc. 150 Dow St Manchester, NH 03101 USA

安德鲁·沙利文·戴恩公司,美国新罕布什尔州曼彻斯特道150号,邮编:03101

   EMail: asullivan@dyn.com
        
   EMail: asullivan@dyn.com
        

Dave Thaler Microsoft One Microsoft Way Redmond, WA 98052 USA

Dave Thaler微软一路微软雷德蒙德,华盛顿州98052美国

   EMail: dthaler@microsoft.com
        
   EMail: dthaler@microsoft.com
        

John C Klensin 1770 Massachusetts Ave, Ste 322 Cambridge, MA 02140 USA

美国马萨诸塞州剑桥322号马萨诸塞大道1770号约翰·C·克伦辛邮编:02140

   Phone: +1 617 491 5735
   EMail: john-ietf@jck.com
        
   Phone: +1 617 491 5735
   EMail: john-ietf@jck.com
        

Olaf Kolkman NLnet Labs Science Park 400 Amsterdam 1098 XH The Netherlands

Olaf Kolkman NLnet实验室科技园400阿姆斯特丹1098 XH荷兰

   EMail: olaf@NLnetLabs.nl
        
   EMail: olaf@NLnetLabs.nl