Independent Submission                                       S. Sharikov
Request for Comments: 5992                                   Regtime Ltd
Category: Informational                                    D. Miloshevic
ISSN: 2070-1721                                                  Afilias
                                                              J. Klensin
                                                            October 2010
Internationalized Domain Names Registration and Administration Guidelines for European Languages Using Cyrillic
Abstract
This document is a guideline for registries and registrars on registering internationalized domain names (IDNs) based on (in alphabetical order) Bosnian, Bulgarian, Byelorussian, Kildin Sami, Macedonian, Montenegrin, Russian, Serbian, and Ukrainian languages in a DNS zone. It describes appropriate characters for registration and variant considerations for characters from Greek and Latin scripts with similar appearances and/or derivations.
Status of This Memo
This document is not an Internet Standards Track specification; it is published for informational purposes.
This is a contribution to the RFC Series, independently of any other RFC stream. The RFC Editor has chosen to publish this document at its discretion and makes no statement about its value for implementation or deployment. Documents approved for publication by the RFC Editor are not a candidate for any level of Internet Standard; see Section 2 of RFC 5741.
Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at http://www.rfc-editor.org/info/rfc5992.
Copyright Notice
Copyright (c) 2010 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document.
Table of Contents
   1. Introduction
      1.1. Similar Characters and Variants
      1.2. Terminology
   2. Languages and Characters
      2.1. Bosnian and Serbian
      2.2. Bulgarian
      2.3. Byelorussian (Belarusian, Belarusan)
      2.4. Kildin Sami
      2.5. Macedonian
      2.6. Montenegrin
      2.7. Russian
      2.8. Serbian
      2.9. Ukrainian
   3. Language-Based Tables
   4. Table Processing Rules
   5. Table Format
   6. Steps after Registering an Input Label
   7. Security Considerations
   8. Acknowledgments
   9. References
      9.1. Normative References
      9.2. Informative References
   Appendix A. European Cyrillic Character Tables
1. Introduction
Cyrillic is one of a fairly small number of scripts that are used, with different subsets of characters, to write a large number of languages, some of which are not closely related to the others. When those languages might be used together in a zone (typical of generic TLDs (gTLDs) but likely in other zones both at and below the root), special considerations for intermixing characters may apply. Cyrillic also has the property that, while it is usually considered a separate script from the Latin (Roman) and Greek ones, it shares many characters with them, creating opportunities for visual confusion. Those difficulties are especially pronounced when "all of Cyrillic" is used rather than only the characters associated with a particular language.
This specification provides guidelines for the use of Cyrillic, as encoded in Unicode [Unicode52], with internationalized domain name (IDN) labels derived from most "European" languages that use the script (use of the term "European" is a convenience, since there is disagreement about the relevant boundaries for different purposes and, of course, much of Russia lies within geographical Asia). Specifically, it covers (in alphabetical order) Bosnian, Bulgarian, Byelorussian, the Kildin member of the Sami (often written "Saami") language family, Macedonian, Montenegrin, Russian, Serbian, and Ukrainian. Supplemental tables, based on information in the Unicode Standard and a recently completed Montenegrin government standard [MontenegrinChars], are provided for use with Montenegrin. Moldovan is no longer in official use with Cyrillic script: no registrations are considered likely in Cyrillic, at least within the relevant ccTLD, and it is not further discussed in this document. Languages of Asia that use Cyrillic are not considered here and should be the subject of separate specifications.
While Cyrillic script is the primary one used for many of the relevant languages and countries, Latin script is often used instead of, or in combination with, it. Standard keyboards used in most of the countries have both Cyrillic and Latin characters. Therefore, some registries could use Latin script for domain name registration in their zones. From time to time, some registries and users have claimed that there is a requirement for mixing Cyrillic and Latin characters in the same label. We strongly recommend against such mixing, as user confusion is almost certain to result. In addition, registries that support many scripts will probably encounter the need to support labels in Greek or Latin scripts as well as Cyrillic, and a large number of character forms are shared among those three scripts.
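The recommendation against mixing scripts within one label can be checked mechanically. The following Python sketch is illustrative only. Python's standard library does not expose the Unicode Script property, so the hypothetical helper scripts_in_label approximates it from the first word of each character's Unicode name. That approximation is adequate for the Cyrillic, Latin, and Greek letters discussed here but is not a general script test.

```python
import unicodedata


def scripts_in_label(label):
    """Return the (approximate) set of scripts used by the letters of a label.

    Assumption: the first word of the Unicode character name stands in for
    the Script property. Digits, hyphen-minus, and combining marks are
    treated as script-neutral and skipped.
    """
    scripts = set()
    for ch in label:
        if ch.isdigit() or ch == "-" or unicodedata.combining(ch):
            continue
        scripts.add(unicodedata.name(ch).split()[0])
    return scripts


def is_single_script(label):
    """True if every letter of the label comes from the same script."""
    return len(scripts_in_label(label)) <= 1


# U+0441 U+043E U+0440 is all Cyrillic; the second label swaps in Latin "c".
print(is_single_script("\u0441\u043e\u0440"))
print(is_single_script("c\u043e\u0440"))
```

A registry that wanted to enforce the recommendation could reject any input label for which is_single_script returns False before consulting its character table.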
Because the DNS has no way for the end user to distinguish among the languages that might have been used to inspire a particular label, it seems useful to treat the characters of a large number of languages that use Cyrillic in their writing systems together, rather than trying to differentiate them. The discussion and tables in this specification should provide a foundation for developing more restrictive rules for zones in which only a single language is likely to be used, but this specification does not itself define those language-specific rules.
Readers of this document should be aware that its recommendations are about use in DNS labels. The orthography for some of the languages involved, especially Kildin Sami, is not completely standardized and local usage sometimes permits substitution of Latin-based characters for their Cyrillic equivalents. Unless they are required by official orthographies, those substitutions should generally be avoided in DNS labels because of the risk of additional user confusion with the Latin characters that are visually similar.
1.1. Similar Characters and Variants
For some human languages, there are characters and/or strings that have equivalent or near-equivalent meanings. If someone is allowed to register a name with such a character or string, the registry might want to automatically register all the names that have the same meaning in that language. Further, some registries might want to restrict the set of characters to be registered for language-based reasons.
So-called "variant techniques", introduced in the JET specification for the CJK script [RFC3743] and its generalization [RFC4290], describe ways of registering IDNs to decrease the risk of misunderstandings, cybersquatting, and other forms of confusion.
The tables below (Appendix A) identify characters in Latin and Greek scripts that might be easily confused with Cyrillic ones.
As with variant approaches for other scripts (e.g., see RFC 4713 [RFC4713] for the Chinese language or RFC 5564 [RFC5564] for the Arabic language), this document identifies sets of characters that need special consideration and provides information about them. A registry that handles names using these characters can then make a policy decision about how to actually handle them. The options for those policy decisions would include automatically registering all look-alike strings to the same registrant, registering one such string and blocking the others, and so on.
1.2. Terminology
The terminology that follows is derived from the JET specification for the CJK script [RFC3743] and its generalization [RFC4290], but this specification does not depend on them. All characters listed here have been verified to be "PVALID" under the IDNA2008 specification [RFC5890] [RFC5892].
A "string" is a sequence of one or more characters.
This document discusses characters that have equivalent or near-equivalent characters or strings. The "base character" is the character that has one or more equivalents; the "variant(s)" are the character(s) and/or string(s) that are equivalent to the base character.
A "registration bundle" is the set of all labels that comes from expanding all base characters for a single name into their variants.
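As an illustration of this definition, the following Python sketch expands a label into its registration bundle. The names VARIANTS and registration_bundle are hypothetical, and the two table entries are illustrative look-alike pairs chosen for the example rather than a copy of the table in Appendix A.

```python
from itertools import product

# Hypothetical variant table: base character -> list of its variants.
VARIANTS = {
    "\u0430": ["\u0061"],  # CYRILLIC SMALL LETTER A -> LATIN SMALL LETTER A
    "\u043e": ["\u006f"],  # CYRILLIC SMALL LETTER O -> LATIN SMALL LETTER O
}


def registration_bundle(label, variants=VARIANTS):
    """Expand each base character into itself plus its variants.

    The result is the set of every label obtained by choosing, at each
    position, either the original character or one of its variants.
    """
    choices = [[ch] + variants.get(ch, []) for ch in label]
    return {"".join(combo) for combo in product(*choices)}


# U+0434 has no variants here; U+0430 and U+043E have one each.
bundle = registration_bundle("\u0434\u0430\u043e")
print(len(bundle))
```

The bundle size is the product of the number of choices at each position, so labels containing many base characters with variants produce large bundles.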
A registry is the administrative authority for a DNS zone. That is, the registry is the body that makes and enforces policies that are used in a particular zone in the DNS. The term "registry" applies to all zones in the DNS, not only those that exist at the top level.
2. Languages and Characters
In the interest of clarity and balance, this document describes a "Base Cyrillic" set of 23 characters for use in comparing the character usage for Russian and Central European languages that use Cyrillic. The balance of this section compares the character usage of the individual languages in that group.
"Base Cyrillic" consists of the following Unicode code points (names associated with these code points and those below appear in Appendix A): U+0430, U+0431, U+0432, U+0433, U+0434, U+0435, U+0436, U+0437, U+043A, U+043B, U+043C, U+043D, U+043E, U+043F, U+0440, U+0441, U+0442, U+0443, U+0444, U+0445, U+0446, U+0447, U+0448.
In addition, modern writing systems that use Cyrillic do not have digits separate from the "European" ones used with Latin characters. For registries that permit digits to appear in domain name labels, the "Base Cyrillic" code points listed above should be considered to include U+0030, U+0031, U+0032, U+0033, U+0034, U+0035, U+0036, U+0037, U+0038, and U+0039 (Digit Zero through Digit Nine). The Hyphen-Minus character (U+002D) may also be used.
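The "Base Cyrillic" repertoire, with the optional digits and the Hyphen-Minus, can be written down as a set of code points. The following Python sketch does so; the names BASE_CYRILLIC and in_base_repertoire are hypothetical, and the function checks repertoire membership only (it does not check hyphen placement or any other label rule).

```python
# Base Cyrillic as listed above: U+0430..U+0437, U+043A..U+043F, U+0440..U+0448.
BASE_CYRILLIC = (
    set(range(0x0430, 0x0438))
    | set(range(0x043A, 0x0440))
    | set(range(0x0440, 0x0449))
)
DIGITS = set(range(0x0030, 0x003A))  # Digit Zero through Digit Nine
HYPHEN_MINUS = {0x002D}


def in_base_repertoire(label, allow_digits=True):
    """True if every character of the label is Base Cyrillic, a hyphen-minus,
    or (when the registry permits digits) a European digit."""
    allowed = BASE_CYRILLIC | HYPHEN_MINUS
    if allow_digits:
        allowed = allowed | DIGITS
    return all(ord(ch) in allowed for ch in label)


print(len(BASE_CYRILLIC))
print(in_base_repertoire("\u0434\u043e\u043c-2"))   # all characters permitted
print(in_base_repertoire("\u043c\u0438\u0440"))     # contains U+0438, not in the base set
```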
It is worth noting that the EU top-level domain registry allows Cyrillic registrations using 32 code points [EU-registry]. That list is sufficient for some of the languages listed here but not for others.
The individual languages that are the focus of this specification are discussed below (in English alphabetical order).
2.1. Bosnian and Serbian
The Bosnian and Serbian alphabets have 30 letters, i.e., seven characters in addition to the Base Cyrillic set of 23 characters: U+0438, U+0458, U+0452, U+0459, U+045A, U+045B, U+045F.
2.2. Bulgarian
The Bulgarian alphabet has 30 characters, seven in addition to the basic 23: U+0438, U+0439, U+0449, U+044A, U+044C, U+044E, U+044F.
2.3. Byelorussian (Belarusian, Belarusan)
The Byelorussian (now often spelled Belarusian or Belarusan) alphabet has 32 characters, i.e., nine characters in addition to the Base Cyrillic set of 23 characters: U+0451, U+0456, U+0439, U+044B, U+044C, U+045E, U+044D, U+044E, U+044F.
2.4. Kildin Sami
The phonetics of Kildin Sami are quite complex and not easily represented in Cyrillic (see, e.g., Kertom's work [Kert]). The orthography is not standardized and the writing system may best be thought of as an attempt to transcribe the language phonetically (primarily in Latin script in the 1930s but in Cyrillic more recently). Different scholars have reported different numbers of phonemes, further complicating the transcription process. Kertom identifies 53 consonants with long-short distinctions and, in many cases, hard-soft ones. He also identifies ascending and descending diphthongs and one triphthong as well as more common short and long vowels.
The primary reference for Kildin Sami, widely circulated for some time but only in draft, is apparently used by Sami language(s) experts in Scandinavian countries [Riessl07]. It, and the references it cites, uses 56 characters, 33 of which do not appear in the basic set. Eight* of these characters have no precomposed forms in Unicode and hence must be written as a sequence of two code points with the second one being COMBINING MACRON (U+0304). Using parentheses to make the two-code-point sequences more obvious, the additional characters are: (U+0430 U+0304)*, (U+0435 U+0304)*, U+0438, U+0439, (U+043E U+0304), U+044A, U+044B, (U+044B U+0304), U+044C, U+044D, (U+044D U+0304), U+044E, (U+044E U+0304), U+044F, (U+044F U+0304), U+0451, (U+0451 U+0304), U+0458, U+048B, U+048D, U+048F, U+04BB, U+04C6, U+04C8, U+04CA, U+04CE, U+04D3, U+04E3, U+04E7, U+04ED, U+04EF, U+04F1, U+04F9.
* These characters, CYRILLIC SMALL LETTER A (U+0430) with a COMBINING MACRON (U+0304) and CYRILLIC SMALL LETTER IE (U+0435) with a COMBINING MACRON (U+0304), respectively, have the same visual appearance as LATIN SMALL LETTER A WITH MACRON (U+0101) and LATIN SMALL LETTER E WITH MACRON (U+0113). There are no known keyboards designed specifically for Kildin Sami. If an extended Latin-based keyboard and associated software are used, these characters might appear with the code point based on Latin (e.g., U+0113 for the second case). By contrast, keyboards and input software that are designed to be more Cyrillic-friendly are more likely to produce code points for the Cyrillic base characters. The use of a Latin character base for that second case occurs in some Western European sources including Riessler's work [Riessl07]. While we have not found explicit substitutions for A with Macron, we believe they might be found in practice. These alternatives are not mapped together by Unicode Normalization Form C (NFC) (or Normalization Form KC (NFKC)), so registries, and possibly applications software, should exercise some care about these coding variations. However, U+0101 and U+0113 are Latin Script characters so, if either is used, any tests on homogeneity of the script within a label need to be made with care.
Similar issues may apply to other Kildin Sami characters constructed with combining sequences.
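The normalization behavior described in the note above can be observed with Python's standard unicodedata module. The following is a small illustrative check, not part of any registry procedure; the helper name same_after is hypothetical. It also shows, for contrast, that CYRILLIC SMALL LETTER I followed by COMBINING MACRON does compose under NFC, because a precomposed form (U+04E3) exists for it.

```python
import unicodedata

CYR_A_MACRON = "\u0430\u0304"  # CYRILLIC SMALL LETTER A + COMBINING MACRON
LAT_A_MACRON = "\u0101"        # LATIN SMALL LETTER A WITH MACRON
CYR_IE_MACRON = "\u0435\u0304"  # CYRILLIC SMALL LETTER IE + COMBINING MACRON
LAT_E_MACRON = "\u0113"        # LATIN SMALL LETTER E WITH MACRON


def same_after(form, first, second):
    """True if two strings are identical after the given normalization."""
    return unicodedata.normalize(form, first) == unicodedata.normalize(form, second)


for form in ("NFC", "NFKC"):
    # Neither form maps the Cyrillic sequences onto the Latin characters.
    print(form, same_after(form, CYR_A_MACRON, LAT_A_MACRON),
          same_after(form, CYR_IE_MACRON, LAT_E_MACRON))

# The Cyrillic A sequence remains two code points under NFC ...
print(len(unicodedata.normalize("NFC", CYR_A_MACRON)))
# ... whereas U+0438 U+0304 composes to the single code point U+04E3.
print(unicodedata.normalize("NFC", "\u0438\u0304") == "\u04e3")
```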
The key references in Russian ([Anto90], [Kert86], [Kuru85]) all propose slightly different character tables relative to each other and to Riessler's list. Because the latter list appears to be more comprehensive and to represent more recent scholarship, we have based the tables in this document on it. We recommend, however, that registries review these recommendations and the relevant papers should registration requests for Kildin Sami actually appear.
Additional perspectives on Kildin Sami can be found on the Omniglot Sami pages [OmniglotSaami].
2.5. Macedonian
Macedonian has 31 characters in the alphabet. This is eight in addition to the basic set: U+0438, U+0453, U+0455, U+0458, U+0459, U+045A, U+045C, U+045F.
2.6. Montenegrin
According to the most recent, and now final, government specification [MontenegrinChars], Montenegrin has 32 characters in its alphabet, including two that have no precomposed forms in Unicode. This is nine in addition to the basic set and two in addition to Bosnian and Serbian. Using parentheses to make the two-code-point sequences more obvious, the additional characters are: (U+0437 U+0301), U+0438, (U+0441 U+0301), U+0452, U+0458, U+0459, U+045A, U+045B, U+045F.
See Bosnian, Section 2.1, above.
2.7. Russian
The current Russian alphabet has 33 characters, consisting of the Base Cyrillic set plus an additional ten characters: U+0451, U+0438, U+0439, U+0449, U+044A, U+044B, U+044C, U+044D, U+044E, U+044F.
2.8. Serbian
See Bosnian, Section 2.1, above.
2.9. Ukrainian
The character list for modern Ukrainian has apparently not completely stabilized. Some references claim 31 characters and therefore an additional 8 characters to the Base Cyrillic set of 23. Others claim 33, adding U+0438 and U+0439 and replacing U+044A (Hard Sign) with U+044C (Soft Sign), for a total of an additional 11 characters as compared to the Base Cyrillic set. Unless better information is available, the prudent registry should probably assume that all 34 characters are in use, i.e., the Base Cyrillic set plus U+0438, U+0439, U+0454, U+0456, U+0457, U+0491, U+0449, U+044A, U+044C, U+044E, U+044F.
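The per-language counts given in this section can be cross-checked by building each repertoire as the Base Cyrillic set plus the additional code points listed for that language. The following Python sketch does this for the languages whose additions are all single code points; the names BASE, ADDITIONS, and REPERTOIRE are hypothetical. For Ukrainian it uses the prudent 34-character assumption described above.

```python
# Base Cyrillic set of 23 code points.
BASE = (
    set(range(0x0430, 0x0438))
    | set(range(0x043A, 0x0440))
    | set(range(0x0440, 0x0449))
)

# Additional code points per language, copied from the lists in this section.
ADDITIONS = {
    "Bosnian/Serbian": {0x0438, 0x0458, 0x0452, 0x0459, 0x045A, 0x045B, 0x045F},
    "Bulgarian": {0x0438, 0x0439, 0x0449, 0x044A, 0x044C, 0x044E, 0x044F},
    "Byelorussian": {0x0451, 0x0456, 0x0439, 0x044B, 0x044C, 0x045E,
                     0x044D, 0x044E, 0x044F},
    "Russian": {0x0451, 0x0438, 0x0439, 0x0449, 0x044A, 0x044B, 0x044C,
                0x044D, 0x044E, 0x044F},
    "Ukrainian": {0x0438, 0x0439, 0x0454, 0x0456, 0x0457, 0x0491, 0x0449,
                  0x044A, 0x044C, 0x044E, 0x044F},
}

# Full repertoire for each language: base set plus its additions.
REPERTOIRE = {lang: BASE | extra for lang, extra in ADDITIONS.items()}

for lang, chars in REPERTOIRE.items():
    print(lang, len(chars))
```

A registry restricting a zone to a single language could use the corresponding repertoire, rather than the union of all of them, as its permitted character set.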
3. Language-Based Tables
The registration strategy described in this document uses a table that lists all characters allowed for input and any variants of those characters. Note that the table lists all characters allowed, not only the ones that have variants.
4. Table Processing Rules
The input to the process is called the "input label". The output of the process is either failure (the input label cannot be registered at all), or a registration bundle that contains one or more labels in A-label form.
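The process just described can be sketched in Python as follows. This is a minimal illustration under stated assumptions: TABLE is a hypothetical three-entry table (not the table in Appendix A), the function names are invented for the example, and the conversion to "xn--" form uses Python's built-in "punycode" codec without performing any of the IDNA2008 validity checks that a real registry would apply.

```python
from itertools import product

# Hypothetical table: every permitted input character, mapped to its
# (possibly empty) list of variants.
TABLE = {
    "\u0434": [],           # CYRILLIC SMALL LETTER DE, no variants here
    "\u0430": ["\u0061"],   # CYRILLIC SMALL LETTER A -> LATIN SMALL LETTER A
    "\u043e": ["\u006f"],   # CYRILLIC SMALL LETTER O -> LATIN SMALL LETTER O
}


def to_a_label(label):
    """All-ASCII labels pass through; others get the 'xn--' Punycode form."""
    if label.isascii():
        return label
    return "xn--" + label.encode("punycode").decode("ascii")


def process(input_label, table=TABLE):
    """Return None on failure, else the registration bundle as encoded labels."""
    if not input_label or any(ch not in table for ch in input_label):
        return None  # the input label cannot be registered at all
    choices = [[ch] + table[ch] for ch in input_label]
    return {to_a_label("".join(combo)) for combo in product(*choices)}


print(process("\u0434\u0431"))            # U+0431 is not in the table: failure
print(sorted(process("\u0434\u0430")))    # bundle of two encoded labels
```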
5. Table Format
The table in Appendix A consists of four columns. The first and second identify the Cyrillic character, and the third and fourth identify Latin or Greek characters that might be easily confused with them visually. If both a Latin and Greek character are present, the Greek one appears in the third and fourth columns on the subsequent line (with "..." in the first column to indicate more information about the character specified on the previous line). Variants needed only because of case folding are shown with "+++" in the first column, as noted in the table.
Each character in the table is given in the "U+" notation for Unicode characters followed, in the next column, by its name as shown in the Unicode Standard. For easy reference, the characters are listed in the order in which they appear in the Unicode Standard.
The table does not, and any future revision MUST NOT, have more than one entry for a particular base character.
6. Steps after Registering an Input Label
A registry has at least three policy options for handling the cases where the registration bundle has more than one label. These options, and their key implications, are:
o Allocate all labels to the same registrant, making the zone information identical to that of the input label.
This option will cause end users to be able to find names with variants more easily, but will result in larger zone files. In principle, the zone file could become so large that it could negatively affect the ability of the registry to perform name resolution.
o Block all labels so they cannot be registered in the future.
This option does not increase the size of the zone file, but it may cause end users to not be able to find names with variants that they would expect.
o Allocate some labels and block some other labels.
This option is likely to cause the most confusion with users because including some variants will cause a name to be found, but using other variants will cause the name to be not found.
With any of these three options, the registry MUST keep a database that links each label in the registration bundle to the input label. This link needs to be maintained so that changes in the non-DNS registration information (such as the label's owner name and address) are reflected in every member of the registration bundle as well.
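One possible shape for such a database is sketched below in Python. All class and method names are hypothetical. The sketch also assumes that under the "block" option the input label itself is still allocated and only the other members of the bundle are blocked; a registry could choose differently. Every label in a bundle points at one shared record, so a change to the registration information is visible through every member of the bundle.

```python
from dataclasses import dataclass, field
from enum import Enum


class Policy(Enum):
    ALLOCATE_ALL = "allocate every label in the bundle"
    BLOCK_ALL = "block every variant label"
    MIXED = "allocate some variant labels and block the rest"


@dataclass
class BundleRecord:
    input_label: str
    registrant: str
    allocated: set = field(default_factory=set)
    blocked: set = field(default_factory=set)


class Registry:
    def __init__(self):
        # Maps every label of every bundle to the record of its bundle.
        self.by_label = {}

    def register(self, input_label, bundle, registrant, policy, allocate=()):
        labels = set(bundle) | {input_label}
        if any(label in self.by_label for label in labels):
            raise ValueError("a label in this bundle is already allocated or blocked")
        variants = labels - {input_label}
        if policy is Policy.ALLOCATE_ALL:
            allocated = set(variants)
        elif policy is Policy.BLOCK_ALL:
            allocated = set()
        else:  # Policy.MIXED: allocate only the variants the registry selects
            allocated = variants & set(allocate)
        record = BundleRecord(input_label, registrant,
                              allocated=allocated | {input_label},
                              blocked=variants - allocated)
        for label in labels:
            self.by_label[label] = record
        return record

    def update_registrant(self, label, registrant):
        # One shared record per bundle, so the change reaches every member.
        self.by_label[label].registrant = registrant


registry = Registry()
registry.register("a", {"a", "b", "c"}, "Example Registrant",
                  Policy.MIXED, allocate=["b"])
registry.update_registrant("c", "New Registrant")
print(registry.by_label["a"].registrant)
```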
7. Security Considerations
The information provided in this document may assist DNS zone administrators and registrants in selecting names that are less likely to be confused with others and in adopting policies that help avoid confusion. It may also assist user-interface designers in identifying possible areas of confusion so that they can better protect users. The document otherwise has no consequences for the security of the Internet.
8. Acknowledgments
Support from Afilias for a major portion of this work is appreciated.
The material on Kildin Sami would not have been possible without the direct help of Cary Karp and his pointer to Riessler's work [Riessl07], or without the analyses, references ([Anto90], [Kert86], and [Kuru85]), and partial translations provided by Vladimir Shadrunov and Sergey Nikolaevich Teryoshkin. We are grateful for their efforts, which made it possible to treat Kildin Sami in nearly the same way as the other actively used European languages that use Cyrillic script.
Careful reading of late drafts of this document by Bill McQuillan, Alexey Melnikov, and Peter Saint-Andre identified a number of editorial problems, some of which might not have been caught otherwise.
9. References

9.1. Normative References
[RFC5895] Resnick, P. and P. Hoffman, "Mapping Characters for Internationalized Domain Names in Applications (IDNA) 2008", RFC 5895, September 2010.
[Unicode52] The Unicode Consortium. The Unicode Standard, Version 5.2.0, defined by: "The Unicode Standard, Version 5.2.0", (Mountain View, CA: The Unicode Consortium, 2009. ISBN 978-1-936213-00-9). <http://www.unicode.org/versions/Unicode5.2.0/>.
9.2. Informative References
[Anto90] Antonova, A., "Primer for Sami schools first grade: Sami language, 2nd edition", Leningrad: Prosveshchenie, Leningrad department, 1990. Published in Russian, no authoritative translation is known.
[EU-registry] European Registry of Internet Domain Names (EURid), ".eu Supported Characters", January 2010, <http://www.eurid.eu/en/eu-domain-names/technical-limitations/supported-characters>.
[Kert] Kertom, G., "Kildin dialect of the Sami language". Published in Russian, no authoritative translation is known.
[Kert86] Kertom, G., "Sami-Russian and Russian-Sami dictionary: textbook for primary school pupils", Leningrad: Prosveshchenie Leningrad Department, 1986. Published in Russian, no authoritative translation is known.
[Kuru85] Kuruch, R., "Sami-Russian dictionary: eight thousand words", Moscow: Russkiy yazyk, 1985. Published in Russian, no authoritative translation is known.
[MontenegrinChars] Crna Gora Ministarstvo prosvjete i nauke (Ministry of Science and Education, Montenegro), "Pravopis Crnogorskoga Jezika I", 2009, <http://www.gov.me/files/1248442673.pdf>. In Montenegrin, no known English translation. See especially the table on page 8.
[OmniglotSaami] Ager, S., "Sami (Saami)", 2009, <http://www.omniglot.com/writing/saami.htm>.
[RFC3743] Konishi, K., Huang, K., Qian, H., and Y. Ko, "Joint Engineering Team (JET) Guidelines for Internationalized Domain Names (IDN) Registration and Administration for Chinese, Japanese, and Korean", RFC 3743, April 2004.
[RFC4290] Klensin, J., "Suggested Practices for Registration of Internationalized Domain Names (IDN)", RFC 4290, December 2005.
[RFC4713] Lee, X., Mao, W., Chen, E., Hsu, N., and J. Klensin, "Registration and Administration Recommendations for Chinese Domain Names", RFC 4713, October 2006.
[RFC5564] El-Sherbiny, A., Farah, M., Oueichek, I., and A. Al-Zoman, "Linguistic Guidelines for the Use of the Arabic Language in Internet Domains", RFC 5564, February 2010.
[RFC5890] Klensin, J., "Internationalized Domain Names for Applications (IDNA): Definitions and Document Framework", RFC 5890, August 2010.
[RFC5892] Faltstrom, P., "The Unicode Code Points and Internationalized Domain Names for Applications (IDNA)", RFC 5892, August 2010.
[Riessl07] Riessler, M., "Kola Saami character chart (draft)", November 2007.
These tables are constructed on the basis of the characters that can actually occur in the DNS, i.e., those that are valid in U-labels as defined in RFC 5890. If the characters that can be mapped into those characters are to be considered instead, then the number of variants would increase considerably. For example, while CYRILLIC SMALL LETTER A (U+0430) and GREEK SMALL LETTER ALPHA (U+03B1) are readily distinguished visually, their capital letter equivalents are not, so, if case mappings such as those discussed in the IDNA2008 Mapping document [RFC5895] are considered, the two small letters must be considered variants of each other. Some of the variants have been selected on the assumption that unusual fonts may be used and that users will see what they expect to see; others, involving subtle decorations but considered more far-fetched out of context, have not been listed.
These additional, possibly required, variants are shown below with "+++" in the first column of the table.
"..." in the first column is used to indicate more information about the character specified on the previous line.
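The case-mapping example given above (CYRILLIC SMALL LETTER A and GREEK SMALL LETTER ALPHA) can be illustrated with a minimal sketch. It is not part of any IDNA specification; it only uses Python's standard unicodedata module to show which capital letters the two small letters correspond to. The helper name upper_name is an assumption made for this illustration.

```python
import unicodedata

CYRILLIC_A = "\u0430"   # CYRILLIC SMALL LETTER A
GREEK_ALPHA = "\u03b1"  # GREEK SMALL LETTER ALPHA


def upper_name(ch: str) -> str:
    """Return the Unicode name of the uppercase form of a single character."""
    return unicodedata.name(ch.upper())


for ch in (CYRILLIC_A, GREEK_ALPHA):
    # Print the small letter and the capital letter it corresponds to.
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)} -> "
          f"U+{ord(ch.upper()):04X} {upper_name(ch)}")
```

The two capitals, U+0410 and U+0391, are the forms that are hard to tell apart visually, which is why the small letters are marked as "+++" variants when input that is case-mapped before lookup is taken into account.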
Characters needed for European languages, other than Montenegrin and Sami, written in Cyrillic.
+----------+--------------------------+---------+-------------------+
| Cyrillic | Unicode Name             | Variant | Unicode Name      |
| Char     |                          |         |                   |
+----------+--------------------------+---------+-------------------+
| U+0430   | CYRILLIC SMALL LETTER A  | U+0061  | LATIN SMALL       |
|          |                          |         | LETTER A          |
|          |                          |         |                   |
| +++      |                          | U+03B1  | GREEK SMALL       |
|          |                          |         | LETTER ALPHA      |
|          |                          |         |                   |
| U+0431   | CYRILLIC SMALL LETTER BE |         |                   |
|          |                          |         |                   |
| U+0432   | CYRILLIC SMALL LETTER VE | U+0062  | LATIN SMALL       |
|          |                          |         | LETTER B          |
|          |                          |         |                   |
| +++      |                          | U+03B2  | GREEK SMALL       |
|          |                          |         | LETTER BETA       |
|          |                          |         |                   |
| U+0433   | CYRILLIC SMALL LETTER    | U+0072  | LATIN SMALL       |
|          | GHE                      |         | LETTER R          |
|          |                          |         |                   |
| +++      |                          | U+03B3  | GREEK SMALL       |
|          |                          |         | LETTER GAMMA      |
|          |                          |         |                   |
| U+0434   | CYRILLIC SMALL LETTER DE |         |                   |
|          |                          |         |                   |
| +++      |                          | U+03B4  | GREEK SMALL       |
|          |                          |         | LETTER DELTA      |
|          |                          |         |                   |
| U+0435   | CYRILLIC SMALL LETTER IE | U+0065  | LATIN SMALL       |
|          |                          |         | LETTER E          |
|          |                          |         |                   |
| +++      |                          | U+03B5  | GREEK SMALL       |
|          |                          |         | LETTER EPSILON    |
|          |                          |         |                   |
| U+0436   | CYRILLIC SMALL LETTER    |         |                   |
|          | ZHE                      |         |                   |
|          |                          |         |                   |
| U+0437   | CYRILLIC SMALL LETTER ZE |         |                   |
|          |                          |         |                   |
| U+0438   | CYRILLIC SMALL LETTER I  | U+0075  | LATIN SMALL       |
|          |                          |         | LETTER U          |
|          |                          |         |                   |
| U+0439   | CYRILLIC SMALL LETTER    |         |                   |
|          | SHORT I                  |         |                   |
|          |                          |         |                   |
| U+043A   | CYRILLIC SMALL LETTER KA | U+006B  | LATIN SMALL       |
|          |                          |         | LETTER K          |
|          |                          |         |                   |
| ...      |                          | U+03BA  | GREEK SMALL       |
|          |                          |         | LETTER KAPPA      |
|          |                          |         |                   |
| U+043B   | CYRILLIC SMALL LETTER EL |         |                   |
|          |                          |         |                   |
| +++      |                          | U+03BB  | GREEK SMALL       |
|          |                          |         | LETTER LAMBDA     |
|          |                          |         |                   |
| U+043C   | CYRILLIC SMALL LETTER EM | U+006D  | LATIN SMALL       |
|          |                          |         | LETTER M          |
|          |                          |         |                   |
| +++      |                          | U+03BC  | GREEK SMALL       |
|          |                          |         | LETTER MU         |
|          |                          |         |                   |
| U+043D   | CYRILLIC SMALL LETTER EN | U+0048  | LATIN CAPITAL     |
|          |                          |         | LETTER H          |
|          |                          |         |                   |
| +++      |                          | U+0068  | LATIN SMALL       |
|          |                          |         | LETTER H (in some |
|          |                          |         | fonts)            |
|          |                          |         |                   |
| +++      |                          | U+03B7  | GREEK SMALL       |
|          |                          |         | LETTER ETA        |
|          |                          |         |                   |
| U+043E   | CYRILLIC SMALL LETTER O  | U+006F  | LATIN SMALL       |
|          |                          |         | LETTER O          |
|          |                          |         |                   |
| ...      |                          | U+03BF  | GREEK SMALL       |
|          |                          |         | LETTER OMICRON    |
|          |                          |         |                   |
| U+043F   | CYRILLIC SMALL LETTER PE | U+006E  | LATIN SMALL       |
|          |                          |         | LETTER N          |
|          |                          |         |                   |
| ...      |                          | U+03C0  | GREEK SMALL       |
|          |                          |         | LETTER PI         |
|          |                          |         |                   |
| U+0440   | CYRILLIC SMALL LETTER ER | U+0070  | LATIN SMALL       |
|          |                          |         | LETTER P          |
|          |                          |         |                   |
| ...      |                          | U+03C1  | GREEK SMALL       |
|          |                          |         | LETTER RHO        |
|          |                          |         |                   |
| U+0441   | CYRILLIC SMALL LETTER ES | U+0063  | LATIN SMALL       |
|          |                          |         | LETTER C          |
|          |                          |         |                   |
| U+0442   | CYRILLIC SMALL LETTER TE | U+0074  | LATIN SMALL       |
|          |                          |         | LETTER T          |
|          |                          |         |                   |
| +++      |                          | U+03C4  | GREEK SMALL       |
|          |                          |         | LETTER TAU        |
|          |                          |         |                   |
| U+0443   | CYRILLIC SMALL LETTER U  | U+0079  | LATIN SMALL       |
|          |                          |         | LETTER Y          |
|          |                          |         |                   |
| +++      |                          | U+03C5  | GREEK SMALL       |
|          |                          |         | LETTER UPSILON    |
|          |                          |         |                   |
| U+0444   | CYRILLIC SMALL LETTER EF | U+03D5  | GREEK PHI SYMBOL  |
|          |                          |         |                   |
| +++      |                          | U+03C6  | GREEK SMALL       |
|          |                          |         | LETTER PHI        |
|          |                          |         |                   |
| U+0445   | CYRILLIC SMALL LETTER HA | U+0078  | LATIN SMALL       |
|          |                          |         | LETTER X          |
|          |                          |         |                   |
| ...      |                          | U+03C7  | GREEK SMALL       |
|          |                          |         | LETTER CHI        |
|          |                          |         |                   |
| U+0446   | CYRILLIC SMALL LETTER    |         |                   |
|          | TSE                      |         |                   |
|          |                          |         |                   |
| U+0447   | CYRILLIC SMALL LETTER    |         |                   |
|          | CHE                      |         |                   |
|          |                          |         |                   |
| U+0448   | CYRILLIC SMALL LETTER    |         |                   |
|          | SHA                      |         |                   |
|          |                          |         |                   |
| U+0449   | CYRILLIC SMALL LETTER    |         |                   |
|          | SHCHA                    |         |                   |
|          |                          |         |                   |
| U+044A   | CYRILLIC SMALL LETTER    | U+0062  | LATIN SMALL       |
|          | HARD SIGN                |         | LETTER B          |
|          |                          |         |                   |
| U+044B   | CYRILLIC SMALL LETTER    |         |                   |
|          | YERU                     |         |                   |
|          |                          |         |                   |
| U+044C   | CYRILLIC SMALL LETTER    | U+0062  | LATIN SMALL       |
|          | SOFT SIGN                |         | LETTER B          |
|          |                          |         |                   |
| U+044D   | CYRILLIC SMALL LETTER E  |         |                   |
|          |                          |         |                   |
| U+044E   | CYRILLIC SMALL LETTER YU |         |                   |
|          |                          |         |                   |
| U+044F   | CYRILLIC SMALL LETTER YA |         |                   |
|          |                          |         |                   |
| U+0451   | CYRILLIC SMALL LETTER IO | U+00EB  | LATIN SMALL       |
|          |                          |         | LETTER E WITH     |
|          |                          |         | DIAERESIS         |
|          |                          |         |                   |
| U+0452   | CYRILLIC SMALL LETTER    |         |                   |
|          | DJE                      |         |                   |
|          |                          |         |                   |
| U+0453   | CYRILLIC SMALL LETTER    |         |                   |
|          | GJE                      |         |                   |
|          |                          |         |                   |
| U+0454   | CYRILLIC SMALL LETTER    | U+03B5  | GREEK SMALL       |
|          | UKRAINIAN IE             |         | LETTER EPSILON    |
|          |                          |         |                   |
| U+0455   | CYRILLIC SMALL LETTER    | U+0073  | LATIN SMALL       |
|          | DZE                      |         | LETTER S          |
|          |                          |         |                   |
| U+0456   | CYRILLIC SMALL LETTER    | U+0069  | LATIN SMALL       |
|          | BYELORUSSIAN-UKRAINIAN I |         | LETTER I          |
|          |                          |         |                   |
| +++      |                          | U+03B9  | GREEK SMALL       |
|          |                          |         | LETTER IOTA       |
|          |                          |         |                   |
| U+0457   | CYRILLIC SMALL LETTER    | U+03CA  | GREEK SMALL       |
|          | UKRAINIAN YI             |         | LETTER IOTA WITH  |
|          |                          |         | DIALYTIKA         |
|          |                          |         |                   |
| +++      |                          | U+00EF  | LATIN SMALL       |
|          |                          |         | LETTER I WITH     |
|          |                          |         | DIAERESIS         |
|          |                          |         |                   |
| U+0458   | CYRILLIC SMALL LETTER JE | U+006A  | LATIN SMALL       |
|          |                          |         | LETTER J          |
|          |                          |         |                   |
| ...      |                          | U+03F3  | GREEK LETTER YOT  |
|          |                          |         |                   |
| U+0459   | CYRILLIC SMALL LETTER    |         |                   |
|          | LJE                      |         |                   |
|          |                          |         |                   |
| U+045A   | CYRILLIC SMALL LETTER    |         |                   |
|          | NJE                      |         |                   |
|          |                          |         |                   |
| U+045B   | CYRILLIC SMALL LETTER    |         |                   |
|          | TSHE                     |         |                   |
|          |                          |         |                   |
| U+045C   | CYRILLIC SMALL LETTER    |         |                   |
|          | KJE                      |         |                   |
|          |                          |         |                   |
| U+045D   | CYRILLIC SMALL LETTER I  |         |                   |
|          | WITH GRAVE               |         |                   |
|          |                          |         |                   |
| U+045E   | CYRILLIC SMALL LETTER    |         |                   |
|          | SHORT U                  |         |                   |
|          |                          |         |                   |
| U+045F   | CYRILLIC SMALL LETTER    |         |                   |
|          | DZHE                     |         |                   |
|          |                          |         |                   |
| U+0491   | CYRILLIC SMALL LETTER    | U+0072  | LATIN SMALL       |
|          | GHE WITH UPTURN          |         | LETTER R          |
|          |                          |         |                   |
| U+04C2   | CYRILLIC SMALL LETTER    |         |                   |
|          | ZHE WITH BREVE           |         |                   |
+----------+--------------------------+---------+-------------------+
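As a rough illustration of how a registry might consult such a table, the following sketch enumerates the variant labels of a Cyrillic label from a small excerpt of the rows above. It is a sketch under stated assumptions, not a prescribed procedure: the names VARIANTS, candidates, and variant_labels are hypothetical, only five rows are copied in, and treating "..." rows as always active and "+++" rows as optional is one reading of the markers described earlier.

```python
from itertools import product

# Excerpt of the table: Cyrillic code point -> list of
# (variant code point, first-column marker of that row).
# "" marks the primary row; "..." and "+++" are as printed in the table.
VARIANTS = {
    0x0430: [(0x0061, ""), (0x03B1, "+++")],   # A: Latin a, Greek alpha
    0x0431: [],                                # BE: no variants listed
    0x043E: [(0x006F, ""), (0x03BF, "...")],   # O: Latin o, Greek omicron
    0x0440: [(0x0070, ""), (0x03C1, "...")],   # ER: Latin p, Greek rho
    0x0441: [(0x0063, "")],                    # ES: Latin c
}


def candidates(cp: int, include_plus: bool) -> list:
    """Return the character itself followed by its listed variants."""
    out = [chr(cp)]
    for variant, marker in VARIANTS.get(cp, []):
        # "+++" rows are only used when case-mapped input is considered.
        if marker != "+++" or include_plus:
            out.append(chr(variant))
    return out


def variant_labels(label: str, include_plus: bool = False) -> set:
    """Enumerate every label obtainable by substituting listed variants."""
    choices = [candidates(ord(ch), include_plus) for ch in label]
    return {"".join(combo) for combo in product(*choices)}


# U+0441 U+043E U+0440 looks like the Latin string "cop".
labels = variant_labels("\u0441\u043e\u0440")
print(len(labels), "cop" in labels)
```

The enumeration includes mixed-script strings; whether a registry blocks, bundles, or ignores those is a policy decision outside the scope of this sketch.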
Additional characters needed for Montenegrin written in Cyrillic.
+--------------+-----------------------------+---------+------------+
| Cyrillic     | Unicode Name                | Variant | Unicode    |
| Char         |                             |         | Name       |
+--------------+-----------------------------+---------+------------+
| U+0437 +     | CYRILLIC SMALL LETTER ZE    |         |            |
| U+0301       | WITH ACUTE                  |         |            |
|              |                             |         |            |
| U+0441 +     | CYRILLIC SMALL LETTER ES    |         |            |
| U+0301       | WITH ACUTE                  |         |            |
+--------------+-----------------------------+---------+------------+
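The two Montenegrin entries are written as a base letter followed by U+0301 (COMBINING ACUTE ACCENT). A minimal sketch, using only Python's standard unicodedata module, suggests why: under NFC these sequences remain two code points, unlike, for comparison, GHE plus acute, which composes to U+0453. The helper name nfc_code_points and the constant names are assumptions made for this illustration.

```python
import unicodedata

ZE_ACUTE = "\u0437\u0301"   # CYRILLIC SMALL LETTER ZE + COMBINING ACUTE ACCENT
ES_ACUTE = "\u0441\u0301"   # CYRILLIC SMALL LETTER ES + COMBINING ACUTE ACCENT
GHE_ACUTE = "\u0433\u0301"  # for contrast only: composes to U+0453 (GJE)


def nfc_code_points(text: str) -> list:
    """Return the code points of the NFC form of text as U+XXXX strings."""
    return [f"U+{ord(ch):04X}" for ch in unicodedata.normalize("NFC", text)]


for seq in (ZE_ACUTE, ES_ACUTE, GHE_ACUTE):
    print(nfc_code_points(seq))
```

A table lookup for Montenegrin therefore has to match two-code-point sequences, not single characters.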
Additional characters needed for Kildin Sami written in Cyrillic.
+----------+---------------------+----------+-----------------------+
| Cyrillic | Unicode Name        | Variant  | Unicode Name          |
| Char     |                     |          |                       |
+----------+---------------------+----------+-----------------------+
| U+0430 + | CYRILLIC SMALL      | U+0101   | LATIN SMALL LETTER A  |
| U+0304   | LETTER A WITH       |          | WITH MACRON           |
|          | MACRON              |          |                       |
|          |                     |          |                       |
| ...      |                     | U+03B1 + | GREEK SMALL LETTER    |
|          |                     | U+0304   | ALPHA WITH MACRON     |
|          |                     |          |                       |
| U+0435 + | CYRILLIC SMALL      | U+0113   | LATIN SMALL LETTER E  |
| U+0304   | LETTER IE WITH      |          | WITH MACRON           |
|          | MACRON              |          |                       |
|          |                     |          |                       |
| U+043E + | CYRILLIC SMALL      | U+014D   | LATIN SMALL LETTER O  |
| U+0304   | LETTER O WITH       |          | WITH MACRON           |
|          | MACRON              |          |                       |
|          |                     |          |                       |
| ...      |                     | U+03BF + | GREEK SMALL LETTER    |
|          |                     | U+0304   | OMICRON WITH MACRON   |
|          |                     |          |                       |
| U+044B + | CYRILLIC SMALL      |          |                       |
| U+0304   | LETTER YERU WITH    |          |                       |
|          | MACRON              |          |                       |
|          |                     |          |                       |
| U+044D + | CYRILLIC SMALL      |          |                       |
| U+0304   | LETTER E WITH       |          |                       |
|          | MACRON              |          |                       |
|          |                     |          |                       |
| U+044E + | CYRILLIC SMALL      |          |                       |
| U+0304   | LETTER YU WITH      |          |                       |
|          | MACRON              |          |                       |
|          |                     |          |                       |
| U+044F + | CYRILLIC SMALL      |          |                       |
| U+0304   | LETTER YA WITH      |          |                       |
|          | MACRON              |          |                       |
|          |                     |          |                       |
| U+0451 + | CYRILLIC SMALL      | U+00EB + | LATIN SMALL LETTER E  |
| U+0304   | LETTER IO WITH      | U+0304   | WITH DIAERESIS AND    |
|          | MACRON              |          | MACRON                |
|          |                     |          |                       |
| U+048B   | CYRILLIC SMALL      |          |                       |
|          | LETTER SHORT I WITH |          |                       |
|          | TAIL                |          |                       |
|          |                     |          |                       |
| U+048D   | CYRILLIC SMALL      |          |                       |
|          | LETTER SEMISOFT     |          |                       |
|          | SIGN                |          |                       |
|          |                     |          |                       |
| U+048F   | CYRILLIC SMALL      |          |                       |
|          | LETTER ER WITH TICK |          |                       |
|          |                     |          |                       |
| U+04BB   | CYRILLIC SMALL      | U+0068   | LATIN SMALL LETTER H  |
|          | LETTER SHHA         |          |                       |
|          |                     |          |                       |
| U+04C6   | CYRILLIC SMALL      |          |                       |
|          | LETTER EL WITH TAIL |          |                       |
|          |                     |          |                       |
| U+04C8   | CYRILLIC SMALL      |          |                       |
|          | LETTER EN WITH HOOK |          |                       |
|          |                     |          |                       |
| U+04CA   | CYRILLIC SMALL      |          |                       |
|          | LETTER EN WITH TAIL |          |                       |
|          |                     |          |                       |
| U+04CE   | CYRILLIC SMALL      |          |                       |
|          | LETTER EM WITH TAIL |          |                       |
|          |                     |          |                       |
| U+04D3   | CYRILLIC SMALL      | U+00E4   | LATIN SMALL LETTER A  |
|          | LETTER A WITH       |          | WITH DIAERESIS        |
|          | DIAERESIS           |          |                       |
|          |                     |          |                       |
| U+04E3   | CYRILLIC SMALL      | U+016B   | LATIN SMALL LETTER U  |
|          | LETTER I WITH       |          | WITH MACRON           |
|          | MACRON              |          |                       |
|          |                     |          |                       |
| U+04E7   | CYRILLIC SMALL      | U+00F6   | LATIN SMALL LETTER O  |
|          | LETTER O WITH       |          | WITH DIAERESIS        |
|          | DIAERESIS           |          |                       |
|          |                     |          |                       |
| U+04ED   | CYRILLIC SMALL      |          |                       |
|          | LETTER E WITH       |          |                       |
|          | DIAERESIS           |          |                       |
|          |                     |          |                       |
| U+04EF   | CYRILLIC SMALL      |          |                       |
|          | LETTER U WITH       |          |                       |
|          | MACRON              |          |                       |
|          |                     |          |                       |
| U+04F1   | CYRILLIC SMALL      |          |                       |
|          | LETTER U WITH       |          |                       |
|          | DIAERESIS           |          |                       |
|          |                     |          |                       |
| U+04F9   | CYRILLIC SMALL      |          |                       |
|          | LETTER YERU WITH    |          |                       |
|          | DIAERESIS           |          |                       |
+----------+---------------------+----------+-----------------------+
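Because the Kildin Sami table mixes single code points (for example U+04E3) with base-plus-macron sequences (for example U+0430 + U+0304), a label has to be split into table units before any lookup. The following is one possible sketch, assuming the label is first normalized to NFC and that every combining mark belongs to the base character before it; the function name table_units is hypothetical.

```python
import unicodedata


def table_units(label: str) -> list:
    """Split a label into base characters with their combining marks attached."""
    units = []
    for ch in unicodedata.normalize("NFC", label):
        # A nonzero combining class means ch is a combining mark
        # that attaches to the unit collected just before it.
        if units and unicodedata.combining(ch):
            units[-1] += ch
        else:
            units.append(ch)
    return units


# A + macron has no precomposed Cyrillic form and stays a two-code-point
# unit; I + macron composes to the single code point U+04E3 under NFC.
sample = "\u0430\u0304\u0438\u0304"
print([[f"U+{ord(c):04X}" for c in unit] for unit in table_units(sample)])
```

Each resulting unit can then be matched against the first column of the table, whether that column holds one code point or two.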
Authors' Addresses
Sergey Sharikov
Regtime Ltd
Kalinina str., 14
Samara 443008
Russia

Phone: +7(846) 979-9039
Fax: +7(846) 979-9038
EMail: s.shar@regtime.net
Desiree Miloshevic
Afilias
Oxford Internet Institute, 1 St. Giles
Oxford OX1 3JS
United Kingdom

Phone: +44 7973 987 147
EMail: dmiloshevic@afilias.info
John C Klensin
1770 Massachusetts Ave, #322
Cambridge, MA 02140
USA

Phone: +1 617 491 5735
EMail: john-ietf@jck.com