Network Working Group                                   A. Phillips, Ed.
Request for Comments: 4647                                   Yahoo! Inc.
BCP: 47                                                    M. Davis, Ed.
Obsoletes: 3066                                                   Google
Category: Best Current Practice                           September 2006
        

Matching of Language Tags

Status of This Memo

This document specifies an Internet Best Current Practices for the Internet Community, and requests discussion and suggestions for improvements. Distribution of this memo is unlimited.

Copyright Notice

Copyright (C) The Internet Society (2006).

Abstract

This document describes a syntax, called a "language-range", for specifying items in a user's list of language preferences. It also describes different mechanisms for comparing and matching these to language tags. Two kinds of matching mechanisms, filtering and lookup, are defined. Filtering produces a (potentially empty) set of language tags, whereas lookup produces a single language tag. Possible applications include language negotiation or content selection. This document, in combination with RFC 4646, replaces RFC 3066, which replaced RFC 1766.

Table of Contents

   1. Introduction
   2. The Language Range
      2.1. Basic Language Range
      2.2. Extended Language Range
      2.3. The Language Priority List
   3. Types of Matching
      3.1. Choosing a Matching Scheme
      3.2. Implementation Considerations
      3.3. Filtering
           3.3.1. Basic Filtering
           3.3.2. Extended Filtering
      3.4. Lookup
           3.4.1. Default Values
   4. Other Considerations
      4.1. Choosing Language Ranges
      4.2. Meaning of Language Tags and Ranges
      4.3. Considerations for Private-Use Subtags
      4.4. Length Considerations for Language Ranges
   5. Security Considerations
   6. Character Set Considerations
   7. References
      7.1. Normative References
      7.2. Informative References
   Appendix A. Acknowledgements
        
1. Introduction

Human beings on our planet have, past and present, used a number of languages. There are many reasons why one would want to identify the language used when presenting or requesting information.

Applications, protocols, or specifications that use language identifiers, such as the language tags defined in [RFC4646], sometimes need to match language tags to a user's language preferences.

This document defines a syntax (called a language range (Section 2)) for specifying items in the user's list of language preferences (called a language priority list (Section 2.3)), as well as several schemes for selecting or filtering sets of language tags by comparing the language tags to the user's preferences. Applications, protocols, or specifications will have varying needs and requirements that affect the choice of a suitable matching scheme.

This document describes how to indicate a user's preferences using language ranges, three schemes for matching these ranges to a set of language tags, and the various practical considerations that apply to implementing and using these schemes.

This document, in combination with [RFC4646], replaces [RFC3066], which replaced [RFC1766].

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

2. The Language Range

Language tags [RFC4646] are used to help identify languages, whether spoken, written, signed, or otherwise signaled, for the purpose of communication. Applications, protocols, or specifications that use language tags are often faced with the problem of identifying sets of content that share certain language attributes. For example, HTTP/1.1 [RFC2616] describes one such mechanism in its discussion of the Accept-Language header (Section 14.4), which is used when selecting content from servers based on the language of that content.

It is, thus, useful to have a mechanism for identifying sets of language tags that share specific attributes. This allows users to select or filter the language tags based on specific requirements. Such an identifier is called a "language range".

There are different types of language range, whose specific attributes vary according to their application. Language ranges are similar to language tags: they consist of a sequence of subtags separated by hyphens. In a language range, each subtag MUST either be a sequence of ASCII alphanumeric characters or the single character '*' (%x2A, ASTERISK). The character '*' is a "wildcard" that matches any sequence of subtags. The meaning and uses of wildcards vary according to the type of language range.

Language tags and thus language ranges are to be treated as case-insensitive: there exist conventions for the capitalization of some of the subtags, but these MUST NOT be taken to carry meaning. Matching of language tags to language ranges MUST be done in a case-insensitive manner.

2.1. Basic Language Range

A "basic language range" has the same syntax as an [RFC3066] language tag or is the single character "*". The basic language range was originally described by HTTP/1.1 [RFC2616] and later [RFC3066]. It is defined by the following ABNF [RFC4234]:

   language-range   = (1*8ALPHA *("-" 1*8alphanum)) / "*"
   alphanum         = ALPHA / DIGIT
        

A basic language range differs from the language tags defined in [RFC4646] only in that there is no requirement that it be "well-formed" or be validated against the IANA Language Subtag Registry. Such ill-formed ranges will probably not match anything. Note that the ABNF [RFC4234] in [RFC2616] is incorrect, since it disallows the use of digits anywhere in the 'language-range' (see [RFC2616errata]).
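
The ABNF above can be checked mechanically. The following is a minimal sketch in Python, assuming a regular expression transcribed by hand from the 'language-range' production; the function name is_basic_language_range is illustrative and not defined by this document. It tests syntax only and does not consult the IANA Language Subtag Registry.

```python
import re

# Hand transcription of: (1*8ALPHA *("-" 1*8alphanum)) / "*"
_BASIC_RANGE = re.compile(r"(?:[A-Za-z]{1,8}(?:-[A-Za-z0-9]{1,8})*|\*)")


def is_basic_language_range(text: str) -> bool:
    """Return True if text fits the basic language range syntax."""
    return _BASIC_RANGE.fullmatch(text) is not None
```

Under this sketch, "de-CH-1996" and "*" are accepted, while "en-*" is rejected because a wildcard subtag is only permitted in an extended language range.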

2.2. Extended Language Range

Occasionally, users will wish to select a set of language tags based on the presence of specific subtags. An "extended language range" describes a user's language preference as an ordered sequence of subtags. For example, a user might wish to select all language tags that contain the region subtag 'CH' (Switzerland). Extended language ranges are useful for specifying a particular sequence of subtags that appear in the set of matching tags without having to specify all of the intervening subtags.

An extended language range can be represented by the following ABNF:

   extended-language-range = (1*8ALPHA / "*")
                             *("-" (1*8alphanum / "*"))
        

The wildcard subtag '*' can occur in any position in the extended language range, where it matches any sequence of subtags that might occur in that position in a language tag. However, wildcards outside the first position are ignored by Extended Filtering (see Section 3.3.2). The use or absence of one or more wildcards cannot be taken to imply that a certain number of subtags will appear in the matching set of language tags.
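
As an illustration, the 'extended-language-range' production can be transcribed into a regular expression. This is a sketch only; the name is_extended_language_range is an assumption made for the example and is not defined by this document.

```python
import re

# Hand transcription of: (1*8ALPHA / "*") *("-" (1*8alphanum / "*"))
_EXTENDED_RANGE = re.compile(
    r"(?:[A-Za-z]{1,8}|\*)(?:-(?:[A-Za-z0-9]{1,8}|\*))*"
)


def is_extended_language_range(text: str) -> bool:
    """Return True if text fits the extended language range syntax."""
    return _EXTENDED_RANGE.fullmatch(text) is not None
```

For instance, "*-CH" and "de-*-DE" fit the syntax, while "de--DE" does not because it contains an empty subtag.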

2.3. The Language Priority List

A user's language preferences will often need to specify more than one language range, and thus users often need to specify a prioritized list of language ranges in order to best reflect their language preferences. This is especially true for speakers of minority languages. A speaker of Breton in France, for example, can specify "br" followed by "fr", meaning that if Breton is available, it is preferred, but otherwise French is the best alternative. It can get more complex: a different user might want to fall back from Skolt Sami to Northern Sami to Finnish.

A "language priority list" is a prioritized or weighted list of language ranges. One well-known example of such a list is the "Accept-Language" header defined in RFC 2616 [RFC2616] (see Section 14.4) and RFC 3282 [RFC3282].

The various matching operations described in this document include considerations for using a language priority list. This document does not define the syntax for a language priority list; defining such a syntax is the responsibility of the protocol, application, or specification that uses it. When given as examples in this document, language priority lists will be shown as a quoted sequence of ranges separated by commas, like this: "en, fr, zh-Hant" (which is read "English before French before Chinese as written in the Traditional script").

A simple list of ranges is considered to be in descending order of priority. Other language priority lists provide "quality weights" for the language ranges in order to specify the relative priority of the user's language preferences. An example of this is the use of "q" values in the syntax of the "Accept-Language" header (defined in [RFC2616], Section 14.4, and [RFC3282]).
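
This document does not define a syntax for language priority lists, so the following is only a sketch of how a protocol-specific weighted list might be reduced to a simple list in descending order of priority. It assumes a comma-separated list with optional ";q=" weights in the style of the "Accept-Language" header; the function name parse_priority_list and the handling of malformed or zero weights (such items are dropped) are assumptions of the example.

```python
def parse_priority_list(header: str) -> list[str]:
    """Return language ranges ordered from most to least preferred."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        language_range, _, parameter = item.strip().partition(";")
        language_range = language_range.strip()
        parameter = parameter.strip()
        if not language_range:
            continue
        quality = 1.0  # assumed default weight when no "q" is given
        if parameter.lower().startswith("q="):
            try:
                quality = float(parameter[2:])
            except ValueError:
                continue  # assumption: skip items with a malformed weight
        if quality > 0:  # assumption: a weight of 0 means "not acceptable"
            # Negate the weight so an ascending sort puts higher weights
            # first; position keeps the original order for equal weights.
            weighted.append((-quality, position, language_range))
    return [language_range for _, _, language_range in sorted(weighted)]
```

With this sketch, "fr;q=0.5, en, zh-Hant;q=0.8" is read as English before Chinese in the Traditional script before French.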

3. Types of Matching

Matching language ranges to language tags can be done in many different ways. This section describes three such matching schemes, as well as the considerations for choosing between them. Protocols and specifications requiring conformance to this specification MUST clearly indicate the particular mechanism used in selecting or matching language tags.

There are two types of matching scheme in this document. A matching scheme that produces zero or more matching language tags is called "filtering". A matching scheme that produces exactly one match for a given request is called "lookup".

3.1. Choosing a Matching Scheme

Applications, protocols, and specifications are faced with the decision of what type of matching to use. Sometimes, different styles of matching are suited to different kinds of processing within a particular application or protocol.

This document describes three matching schemes:

1. Basic Filtering (Section 3.3.1) matches a language priority list consisting of basic language ranges (Section 2.1) to sets of language tags.

2. Extended Filtering (Section 3.3.2) matches a language priority list consisting of extended language ranges (Section 2.2) to sets of language tags.

3. Lookup (Section 3.4) matches a language priority list consisting of basic language ranges to sets of language tags to find the one exact language tag that best matches the range.

Filtering can be used to produce a set of results (such as a collection of documents) by comparing the user's preferences to a set of language tags. For example, when performing a search, filtering can be used to limit the results to items tagged as being in the French language. Filtering can also be used when deciding whether to perform a language-sensitive process on some content. For example, a process might cause paragraphs whose language tag matched the language range "nl" (Dutch) to be displayed in italics within a document.

Lookup produces the single result that best matches the user's preferences from the list of available tags, so it is useful in cases in which a single item is required (and for which only a single item can be returned). For example, if a process were to insert a human-readable error message into a protocol header, it might select the text based on the user's language priority list. Since the process can return only one item, it is forced to choose a single item and it has to return some item, even if none of the content's language tags match the language priority list supplied by the user.

3.2. Implementation Considerations

Language tag matching is a tool, and does not by itself specify a complete procedure for the use of language tags. Such procedures are intimately tied to the application protocol in which they occur. When specifying a protocol operation using matching, the protocol MUST specify:

o Which type(s) of language tag matching it uses

o Whether the operation returns a single result (lookup) or a possibly empty set of results (filtering)

o For lookup, what the default item is (or the sequence of operations or configuration information used to determine the default) when no matching tag is found. For instance, a protocol might define the result as failure of the operation, an empty value, returning some protocol defined or implementation defined default, or returning i-default [RFC2277].

Applications, protocols, and specifications are not required to validate or understand any of the semantics of the language tags or ranges or of the subtags in them, nor do they require access to the IANA Language Subtag Registry (see Section 3 in [RFC4646]). This simplifies implementation.

However, designers of applications, protocols, or specifications are encouraged to use the information from the IANA Language Subtag Registry to support canonicalizing language tags and ranges in order to map grandfathered and obsolete tags or subtags into modern equivalents.

Applications, protocols, or specifications that canonicalize ranges MUST either perform matching operations with both the canonical and original (unmodified) form of the range or MUST also canonicalize each tag for the purposes of comparison.

Note that canonicalizing language ranges makes certain operations impossible. For example, an implementation that canonicalizes the language range "art-lojban" (artificial language, lojban variant) to use the more modern "jbo" (Lojban) cannot be used to select just the items with the older tag.
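
One way to keep the older tag selectable is to match with both the canonical and the original form of a range, which is the first of the two options required above. The sketch below assumes a hypothetical, hand-written mapping table (CANONICAL_FORMS) containing only the "art-lojban" example from this section; a real table would be derived from the IANA Language Subtag Registry. Ranges are lowercased, which is harmless because matching is case-insensitive.

```python
# Hypothetical mapping for illustration only; not taken from the registry.
CANONICAL_FORMS = {"art-lojban": "jbo"}


def forms_to_match(language_range: str) -> list[str]:
    """Return the range forms to try: canonical first, then the original."""
    original = language_range.lower()
    canonical = CANONICAL_FORMS.get(original, original)
    if canonical == original:
        return [original]
    return [canonical, original]
```

A matching operation would then be run once for each returned form, so that items tagged "art-lojban" and items tagged "jbo" can both be selected.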

Applications, protocols, or specifications that use basic ranges might sometimes receive extended language ranges instead. An application, protocol, or specification MUST choose to a) map extended language ranges to basic ranges using the algorithm below, b) reject any extended language ranges in the language priority list that are not valid basic language ranges, or c) treat each extended language range as if it were a basic language range, which will have the same result as ignoring them, since these ranges will not match any valid language tags.

An extended language range is mapped to a basic language range as follows: if the first subtag is a '*' then the entire range is treated as "*", otherwise each wildcard subtag is removed. For example, the extended language range "en-*-US" maps to "en-US" (English, United States).
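
The mapping just described is small enough to sketch directly. The function name extended_to_basic is illustrative only.

```python
def extended_to_basic(extended_range: str) -> str:
    """Map an extended language range to a basic language range."""
    subtags = extended_range.split("-")
    if subtags[0] == "*":
        # A leading wildcard turns the entire range into "*".
        return "*"
    # Otherwise every wildcard subtag is removed.
    return "-".join(subtag for subtag in subtags if subtag != "*")
```

For example, extended_to_basic("en-*-US") yields "en-US", and extended_to_basic("*-CH") yields "*".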

Applications, protocols, or specifications, in addressing their particular requirements, can offer pre-processing or configuration options. For example, an implementation could allow a user to associate or map a particular language range to a different value. Such a user might wish to associate the language range subtags 'nn' (Nynorsk Norwegian) and 'nb' (Bokmal Norwegian) with the more general subtag 'no' (Norwegian). Or perhaps a user would want to associate requests for the range "zh-Hans" (Chinese as written in the Simplified script) with content bearing the language tag "zh-CN" (Chinese as used in China, where the Simplified script is predominant). Documentation on how the ranges or tags are altered, prioritized, or compared in the subsequent match in such an implementation will assist users in making these types of configuration choices.

3.3. Filtering

Filtering is used to select the set of language tags that matches a given language priority list. It is called "filtering" because this set might contain no items at all or it might return an arbitrarily large number of matching items: as many items as match the language priority list, thus "filtering out" the non-matching items.

In filtering, each language range represents the least specific language tag (that is, the language tag with fewest number of subtags) that is an acceptable match. All of the language tags in the matching set of tags will have an equal or greater number of subtags than the language range. Every non-wildcard subtag in the language range will appear in every one of the matching language tags. For example, if the language priority list consists of the range "de-CH" (German as used in Switzerland), one might see tags such as "de-CH-1996" (German as used in Switzerland, orthography of 1996) but one will never see a tag such as "de" (because the 'CH' subtag is missing).

If the language priority list (see Section 2.3) contains more than one range, the content returned is typically ordered in descending level of preference, but it MAY be unordered, according to the needs of the application or protocol.

Some examples of applications where filtering might be appropriate include:

o Applying a style to sections of a document in a particular set of languages.

o Displaying the set of documents containing a particular set of keywords written in a specific set of languages.

o Selecting all email items written in a specific set of languages.

o Selecting audio files spoken in a particular language.

Filtering seems to imply that there is a semantic relationship between language tags that share the same prefix. While this is often the case, it is not always true: the language tags that match a specific language range do not necessarily represent mutually intelligible languages.

3.3.1. Basic Filtering

Basic filtering compares basic language ranges to language tags. Each basic language range in the language priority list is considered in turn, according to priority. A language range matches a particular language tag if, in a case-insensitive comparison, it exactly equals the tag, or if it exactly equals a prefix of the tag such that the first character following the prefix is "-". For example, the language-range "de-de" (German as used in Germany) matches the language tag "de-DE-1996" (German as used in Germany, orthography of 1996), but not the language tags "de-Deva" (German as written in the Devanagari script) or "de-Latn-DE" (German, Latin script, as used in Germany).

The special range "*" in a language priority list matches any tag. A protocol that uses language ranges MAY specify additional rules about the semantics of "*"; for instance, HTTP/1.1 [RFC2616] specifies that the range "*" matches only languages not matched by any other range within an "Accept-Language" header.

Basic filtering is identical to the type of matching described in [RFC3066], Section 2.5 (Language-range).
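
The basic filtering rule in this section can be sketched as follows. The names basic_match and basic_filter are illustrative, and the choice to return each matching tag once, ordered by the priority of the first range that matched it, is an assumption of the example; as noted in Section 3.3, the result may also be unordered. Protocol-specific rules for "*", such as those of HTTP/1.1, are not modeled.

```python
def basic_match(language_range: str, tag: str) -> bool:
    """Return True if a basic language range matches a language tag."""
    if language_range == "*":
        return True
    range_lower = language_range.lower()
    tag_lower = tag.lower()
    # Exact match, or a prefix match that ends on a subtag boundary.
    return tag_lower == range_lower or tag_lower.startswith(range_lower + "-")


def basic_filter(priority_list: list[str], tags: list[str]) -> list[str]:
    """Return the tags matched by any range, in priority order."""
    matches = []
    for language_range in priority_list:
        for tag in tags:
            if tag not in matches and basic_match(language_range, tag):
                matches.append(tag)
    return matches
```

With this sketch, the range "de-de" matches "de-DE-1996" but neither "de-Deva" nor "de-Latn-DE", in line with the example in this section.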

3.3.2. Extended Filtering

Extended filtering compares extended language ranges to language tags. Each extended language range in the language priority list is considered in turn, according to priority. A language range matches a particular language tag if each respective list of subtags matches. To determine a match:

1. Split both the extended language range and the language tag being compared into a list of subtags by dividing on the hyphen (%x2D) character. Two subtags match if either they are the same when compared case-insensitively or the language range's subtag is the wildcard '*'.

2. Begin with the first subtag in each list. If the first subtag in the range does not match the first subtag in the tag, the overall match fails. Otherwise, move to the next subtag in both the range and the tag.

3. While there are more subtags left in the language range's list:

A. If the subtag currently being examined in the range is the wildcard ('*'), move to the next subtag in the range and continue with the loop.

B. Else, if there are no more subtags in the language tag's list, the match fails.

C. Else, if the current subtag in the range's list matches the current subtag in the language tag's list, move to the next subtag in both lists and continue with the loop.

D. Else, if the language tag's subtag is a "singleton" (a single letter or digit, which includes the private-use subtag 'x') the match fails.

E. Else, move to the next subtag in the language tag's list and continue with the loop.

4. When the language range's list has no more subtags, the match succeeds.

Subtags not specified, including those at the end of the language range, are thus treated as if assigned the wildcard value '*'. Much like basic filtering, extended filtering selects content with arbitrarily long tags that share the same initial subtags as the language range. In addition, extended filtering selects language tags that contain any intermediate subtags not specified in the language range. For example, the extended language range "de-*-DE" (or its synonym "de-DE") matches all of the following tags:

   de-DE (German, as used in Germany)
   de-de (German, as used in Germany)
   de-Latn-DE (Latin script)
   de-Latf-DE (Fraktur variant of Latin script)
   de-DE-x-goethe (private-use subtag)
   de-Latn-DE-1996 (orthography of 1996)
   de-Deva-DE (Devanagari script)

The same range does not match any of the following tags for the reasons shown:

   de (missing 'DE')
   de-x-DE (singleton 'x' occurs before 'DE')
   de-Deva ('Deva' not equal to 'DE')
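
The numbered steps of the extended filtering algorithm, and the examples above, can be sketched as a single function. The name extended_match is illustrative; the comments refer to the step labels used in this section.

```python
def extended_match(language_range: str, tag: str) -> bool:
    """Return True if an extended language range matches a language tag."""
    # Step 1: split on hyphens; lowercasing gives case-insensitive matching.
    range_subtags = language_range.lower().split("-")
    tag_subtags = tag.lower().split("-")

    # Step 2: the first subtags must match (a leading '*' matches anything).
    if range_subtags[0] != "*" and range_subtags[0] != tag_subtags[0]:
        return False
    r, t = 1, 1

    # Step 3: walk the remaining subtags of the range.
    while r < len(range_subtags):
        if range_subtags[r] == "*":
            r += 1                      # 3.A: skip wildcards in the range
        elif t >= len(tag_subtags):
            return False                # 3.B: the tag ran out of subtags
        elif range_subtags[r] == tag_subtags[t]:
            r += 1                      # 3.C: match, advance both lists
            t += 1
        elif len(tag_subtags[t]) == 1:
            return False                # 3.D: a singleton blocks the match
        else:
            t += 1                      # 3.E: skip this subtag of the tag

    # Step 4: every subtag of the range was accounted for.
    return True
```

Under this sketch, "de-*-DE" and "de-DE" give the same result for every tag, since wildcards after the first position are skipped.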

Note: [RFC4646] defines each type of subtag (language, script, region, and so forth) according to position, size, and content. This means that subtags in a language range can only match specific types of subtags in a language tag. For example, a subtag such as 'Latn' is always a script subtag (unless it follows a singleton) while a subtag such as 'nedis' can only match the equivalent variant subtag. Two-letter subtags in the initial position have a different type (language) than two-letter subtags in later positions (region). This is the reason why a wildcard in the extended language range is significant in the first position but is ignored in all other positions.

3.4. Lookup

Lookup is used to select the single language tag that best matches the language priority list for a given request. When performing lookup, each language range in the language priority list is considered in turn, according to priority. By contrast with filtering, each language range represents the most specific tag that is an acceptable match. The first matching tag found, according to the user's priority, is considered the closest match and is the item returned. For example, if the language range is "de-ch", a lookup operation can produce content with the tags "de" or "de-CH" but never content with the tag "de-CH-1996". If no language tag matches the request, the "default" value is returned.

For example, if an application inserts some dynamic content into a document, returning an empty string if there is no exact match is not an option. Instead, the application "falls back" until it finds a matching language tag associated with a suitable piece of content to insert. Some applications of lookup include:

o Selection of a template containing the text for an automated email response.

o Selection of an item containing some text for inclusion in a particular Web page.

o Selection of a string of text for inclusion in an error log.

o Selection of an audio file to play as a prompt in a phone system.

In the lookup scheme, the language range is progressively truncated from the end until a matching language tag is located. Single letter or digit subtags (including both the letter 'x', which introduces private-use sequences, and the subtags that introduce extensions) are removed at the same time as their closest trailing subtag. For example, starting with the range "zh-Hant-CN-x-private1-private2" (Chinese, Traditional script, China, two private-use tags) the lookup progressively searches for content as shown below:

Example of a Lookup Fallback Pattern

Range to match: zh-Hant-CN-x-private1-private2
1. zh-Hant-CN-x-private1-private2
2. zh-Hant-CN-x-private1
3. zh-Hant-CN
4. zh-Hant
5. zh
6. (default)
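
The fallback pattern can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation; the function name fallback_sequence is an assumption made for this example, and the sketch assumes a basic language range without wildcards.

```python
def fallback_sequence(language_range):
    """Yield the range, then each progressively truncated form of it.

    Hypothetical helper for illustration; assumes a basic language range.
    """
    subtags = language_range.split("-")
    while subtags:
        yield "-".join(subtags)
        subtags.pop()  # truncate from the end
        # A single letter or digit subtag left at the end (such as 'x')
        # is removed together with the subtag that followed it.
        while subtags and len(subtags[-1]) == 1:
            subtags.pop()


for candidate in fallback_sequence("zh-Hant-CN-x-private1-private2"):
    print(candidate)
```

The caller is expected to try each candidate in order and to fall through to the default only when the sequence is exhausted.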

This fallback behavior allows some flexibility in finding a match. Without fallback, the default content would be returned immediately if exactly matching content is unavailable. With fallback, a result more closely matching the user request can be provided.

Extensions and unrecognized private-use subtags might be unrelated to a particular application of lookup. Since these subtags come at the end of the subtag sequence, they are removed first during the fallback process and usually pose no barrier to interoperability. However, an implementation MAY remove these from ranges prior to performing the lookup (provided the implementation also removes them from the tags being compared). Such modification is internal to the implementation and applications, protocols, or specifications SHOULD NOT remove or modify subtags in content that they return or forward, because this removes information that can be used elsewhere.
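
One way an implementation might remove these subtags for comparison purposes only is sketched below. The names strip_extensions and comparison_key are assumptions made for this example. The sketch treats any single-character subtag after the first position as the start of an extension or private-use sequence, and it leaves the stored tags themselves unmodified.

```python
def strip_extensions(tag):
    """Return the tag or range without extension and private-use sequences.

    Hypothetical helper; tags that begin with a singleton (for example
    "x-..." or "i-...") are returned unchanged.
    """
    subtags = tag.split("-")
    for index, subtag in enumerate(subtags):
        # The wildcard "*" is one character long but is not a singleton.
        if index > 0 and len(subtag) == 1 and subtag != "*":
            return "-".join(subtags[:index])
    return tag


def comparison_key(tag):
    """Key used only inside the matcher; comparison is case-insensitive."""
    return strip_extensions(tag).lower()


# The same normalization is applied to the tags and to the range, and the
# original, unmodified tag is what gets returned to the caller.
available = {comparison_key(t): t for t in ["de-CH-x-internal", "fr"]}
print(available.get(comparison_key("de-CH-a-ext1")))
```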

The special language range "*" matches any language tag. In the lookup scheme, this range does not convey enough information by itself to determine which language tag is most appropriate, since it matches everything. If the language range "*" is followed by other language ranges, it is skipped. If the language range "*" is the only one in the language priority list or if no other language range follows, the default value is computed and returned.

In some cases, the language priority list can contain one or more extended language ranges (as, for example, when the same language priority list is used as input for both lookup and filtering operations). Wildcard values in an extended language range normally match any value that can occur in that position in a language tag. Since only one item can be returned for any given lookup request, wildcards in a language range have to be processed in a consistent manner or the same request will produce widely varying results. Applications, protocols, or specifications that accept extended language ranges MUST define which item is returned when more than one item matches the extended language range.

For example, an implementation could map the extended language ranges to basic ranges. Another possibility would be for an implementation to return the matching tag that is first in ASCII-order. If the language range were "*-CH" ('CH' represents Switzerland) and the set of tags included "de-CH" (German as used in Switzerland), "fr-CH" (French, Switzerland), and "it-CH" (Italian, Switzerland), then the tag "de-CH" would be returned.
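
The ASCII-order policy can be illustrated as follows. This is only a sketch of one possible policy: the function names are assumptions made for this example, and the matching function follows the extended filtering rules (wildcard significant only in the first position, unmatched subtags skipped unless they are singletons).

```python
def extended_match(language_range, tag):
    """Case-insensitive extended filtering match (illustrative sketch)."""
    r = language_range.lower().split("-")
    t = tag.lower().split("-")
    if r[0] != "*" and r[0] != t[0]:
        return False
    ri, ti = 1, 1
    while ri < len(r):
        if r[ri] == "*":          # wildcard after the first position: ignored
            ri += 1
        elif ti >= len(t):        # tag exhausted before the range
            return False
        elif r[ri] == t[ti]:      # subtags match; advance both
            ri += 1
            ti += 1
        elif len(t[ti]) == 1:     # a singleton in the tag blocks the match
            return False
        else:                     # skip an unmatched subtag in the tag
            ti += 1
    return True


def first_in_ascii_order(language_range, tags):
    """Return the matching tag that sorts first, or None if none match."""
    matches = sorted(tag for tag in tags if extended_match(language_range, tag))
    return matches[0] if matches else None


print(first_in_ascii_order("*-CH", ["it-CH", "fr-CH", "en-US", "de-CH"]))
```

Whatever policy is chosen, the important property is that the same request against the same set of tags always yields the same item.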

3.4.1. Default Values

Each application, protocol, or specification that uses lookup MUST define the defaulting behavior when no tag matches the language priority list. What this action consists of strongly depends on how lookup is being applied. Some examples of defaulting behavior include:

o return an item with no language tag or an item of a non-linguistic nature, such as an image or sound

o return a null string as the language tag value, in cases where the protocol permits the empty value (see, for example, "xml:lang" in [XML10])

o return a particular language tag designated for the operation

o return the language tag "i-default" (see [RFC2277])

o return an error condition or error message

o return a list of available languages for the user to select from

When performing lookup using a language priority list, the progressive search MUST process each language range in the list before seeking or calculating the default.

The default value MAY be calculated or include additional searching or matching. Applications, protocols, or specifications can specify different ways in which users can specify or override the defaults.

One common way to provide for a default is to allow a specific language range to be set as the default for a specific type of request. If this approach is chosen, this language range MUST be treated as if it were appended to the end of the language priority list as a whole, rather than after each item in the language priority list. The application, protocol, or specification MUST also define the defaulting behavior if that search fails to find a matching tag or item.

For example, if a particular user's language priority list is "fr-FR, zh-Hant" (French as used in France followed by Chinese as written in the Traditional script) and the program doing the matching had a default language range of "ja-JP" (Japanese as used in Japan), then the program searches as follows:

1. fr-FR
2. fr
3. zh-Hant // next language
4. zh
5. ja-JP // now searching for the default content
6. ja
7. (implementation defined default)
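
The search order above can be reproduced with a short Python sketch. The names search_order and lookup, and the parameters default_range and default, are assumptions made for this example. The sketch appends the default range to the end of the whole priority list, skips the range "*", and compares tags case-insensitively.

```python
def fallback_sequence(language_range):
    """Yield the range, then each progressively truncated form of it."""
    subtags = language_range.split("-")
    while subtags:
        yield "-".join(subtags)
        subtags.pop()
        # Remove a trailing single letter or digit subtag as well.
        while subtags and len(subtags[-1]) == 1:
            subtags.pop()


def search_order(priority_list, default_range=None):
    """List every candidate in the order in which lookup tries it."""
    # "*" carries no information for lookup, so it is skipped.
    ranges = [r for r in priority_list if r != "*"]
    if default_range is not None:
        # Appended once, after the entire list, not after each item.
        ranges.append(default_range)
    return [c for r in ranges for c in fallback_sequence(r)]


def lookup(priority_list, available_tags, default_range=None, default=None):
    """Return the first available tag found, else the final default."""
    available = {tag.lower(): tag for tag in available_tags}
    for candidate in search_order(priority_list, default_range):
        match = available.get(candidate.lower())
        if match is not None:
            return match
    return default  # the implementation-defined default


print(search_order(["fr-FR", "zh-Hant"], "ja-JP"))
print(lookup(["fr-FR", "zh-Hant"], ["en", "ja"], "ja-JP", default="en"))
```

Here the final default is a plain value supplied by the caller; an application could equally raise an error or return a list of available languages at that point.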

4. Other Considerations

When working with language ranges and matching schemes, there are some additional considerations that can influence which language ranges and which matching scheme to choose.

4.1. Choosing Language Ranges

Users indicate their language preferences via the choice of a language range or the list of language ranges in a language priority list. The type of matching affects what the best choice is for a user.

Most matching schemes make no attempt to process the semantic meaning of the subtags. The language range is compared, in a case-insensitive manner, to each language tag being matched, using basic string processing. Users SHOULD select language ranges that are well-formed, valid language tags according to [RFC4646] (substituting wildcards as appropriate in extended language ranges).

Applications are encouraged to canonicalize language tags and ranges by using the Preferred-Value from the IANA Language Subtag Registry for tags or subtags that have been deprecated. If the user is working with content that might use the older form, the user might want to include both the new and old forms in a language priority list. For example, the tag "art-lojban" is deprecated. The subtag 'jbo' is supposed to be used instead, so the user might use it to form the language range. Or the user might include both in a language priority list: "jbo, art-lojban".
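
A minimal sketch of this canonicalization follows. The PREFERRED_VALUE table is a small hand-written excerpt used only for illustration; a real application would load the mappings from the IANA Language Subtag Registry. The function names are assumptions, and the sketch handles whole tags only, not deprecated subtags inside longer tags.

```python
# Illustrative excerpt of deprecated tag -> Preferred-Value mappings.
PREFERRED_VALUE = {
    "art-lojban": "jbo",
    "iw": "he",
}


def canonicalize(tag):
    """Replace a deprecated whole tag by its preferred value."""
    return PREFERRED_VALUE.get(tag.lower(), tag)


def with_older_forms(priority_list):
    """Canonicalize each range and add its deprecated forms right after it."""
    older = {}
    for old, new in PREFERRED_VALUE.items():
        older.setdefault(new, []).append(old)
    result = []
    for language_range in priority_list:
        preferred = canonicalize(language_range)
        for item in [preferred] + older.get(preferred.lower(), []):
            if item not in result:
                result.append(item)
    return result


print(with_older_forms(["jbo"]))
```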

Users SHOULD avoid subtags that add no distinguishing value to a language range. When filtering, the fewer subtags that appear in the language range, the more content the range will probably match, while in lookup unnecessary subtags can cause "better", more-specific content to be skipped in favor of less specific content. For example, in lookup the range "de-Latn-DE" falls back to "de-Latn" and then to "de", so it returns content tagged "de" instead of content tagged "de-DE", even though the latter is probably a better match.
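
The effect of an unnecessary script subtag on lookup can be demonstrated with a small sketch. The function lookup_one is an assumption made for this example; it applies the progressive truncation of Section 3.4 to a single basic language range.

```python
def lookup_one(language_range, available_tags):
    """Look up a single basic range by progressive truncation (sketch)."""
    available = {tag.lower(): tag for tag in available_tags}
    subtags = language_range.lower().split("-")
    while subtags:
        match = available.get("-".join(subtags))
        if match is not None:
            return match
        subtags.pop()
        # Remove a trailing single letter or digit subtag as well.
        while subtags and len(subtags[-1]) == 1:
            subtags.pop()
    return None


content = ["de", "de-DE"]
print(lookup_one("de-Latn-DE", content))  # tries de-Latn-DE, de-Latn, de
print(lookup_one("de-DE", content))       # exact match on the first attempt
```

With the script subtag present, the candidate "de-DE" is never generated, which is why the less specific content is selected.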

Whether a subtag adds distinguishing value can depend on the context of the request. For example, a user who reads both Simplified and Traditional Chinese, but who prefers Simplified, might use the range "zh" for filtering (matching all items that user can read) but "zh-Hans" for lookup (making sure that user gets the preferred form if it's available, but the fallback to "zh" will still work). On the other hand, content in this case ought to be labeled as "zh-Hans" (or "zh-Hant" if that applies) for filtering, while for lookup, if there is either "zh-Hans" content or "zh-Hant" content, one of them (the one considered 'default') also ought to be made available with the simple "zh". Note that the user can create a language priority list "zh-Hans, zh" that delivers the best possible results for both schemes. If the user cannot be sure which scheme is being used (or if more than one might be applied to a given request), the user SHOULD specify the most specific (largest number of subtags) range first and then supply shorter prefixes later in the list to ensure that filtering returns a complete set of tags.

Many languages are written predominantly in a single script. This is usually recorded in the Suppress-Script field in that language subtag's registry entry. For these languages, script subtags SHOULD NOT be used to form a language range. Thus, the language range "en-Latn" is inappropriate in most cases (because the vast majority of English documents are written in the Latin script and thus the 'en' language subtag has a Suppress-Script field for 'Latn' in the registry).

When working with tags and ranges, note that extensions and most private-use subtags are orthogonal to language tag matching, in that they specify additional attributes of the text not related to the goals of most matching schemes. Users SHOULD avoid using these subtags in language ranges, since they interfere with the selection of available content. When used in language tags (as opposed to ranges), these subtags normally do not interfere with filtering (Section 3), since they appear at the end of the tag and will match all prefixes. Lookup (Section 3.4) implementations are advised to ignore unrecognized private-use and extension subtags when performing language tag fallback.

4.2. Meaning of Language Tags and Ranges

Selecting language tags using language ranges requires some understanding by users of what they are selecting. The meanings of the various subtags in a language range are identical to their meanings in a language tag (see Section 4.2 in [RFC4646]), with the addition that the wildcard "*" represents any matching sequence of values.

4.3. Considerations for Private-Use Subtags

Private agreement is necessary between the parties that intend to use or exchange language tags that contain private-use subtags. Great caution SHOULD be used in employing private-use subtags in content or protocols intended for general use. Private-use subtags are simply useless for information exchange without prior arrangement.

The value and semantic meaning of private-use tags and of the subtags used within such a language tag are not defined. Matching private-use tags using language ranges or extended language ranges can result in unpredictable content being returned.

4.4. Length Considerations for Language Ranges

Language ranges are very similar to language tags in terms of content and usage. The same types of restrictions on length that can be applied to language tags can also be applied to language ranges. See [RFC4646] Section 4.3 (Length Considerations).

5. Security Considerations

Language ranges used in content negotiation might be used to infer the nationality of the sender, and thus identify potential targets for surveillance. In addition, unique or highly unusual language ranges or combinations of language ranges might be used to track a specific individual's activities.

This is a special case of the general problem that anything you send is visible to the receiving party. It is useful to be aware that such concerns can exist in some cases.

The evaluation of the exact magnitude of the threat, and any possible countermeasures, is left to each application or protocol.

6. Character Set Considerations

Language tags permit only the characters A-Z, a-z, 0-9, and HYPHEN-MINUS (%x2D). Language ranges also use the character ASTERISK (%x2A). These characters are present in most character sets, so presentation or exchange of language tags or ranges should not be constrained by character set issues.
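
As an illustration, the permitted characters can be checked with regular expressions that follow the "language-range" and "extended-language-range" grammars given in Section 2 of this document. The pattern and function names below are assumptions made for this example; the check covers syntax only and says nothing about whether the subtags are registered.

```python
import re

# language-range = (1*8ALPHA *("-" 1*8alphanum)) / "*"
BASIC_RANGE = re.compile(r"(?:[A-Za-z]{1,8}(?:-[A-Za-z0-9]{1,8})*|\*)")

# extended-language-range = (1*8ALPHA / "*") *("-" (1*8alphanum / "*"))
EXTENDED_RANGE = re.compile(r"(?:[A-Za-z]{1,8}|\*)(?:-(?:[A-Za-z0-9]{1,8}|\*))*")


def is_basic_range(text):
    """True if the whole string is a syntactically valid basic range."""
    return BASIC_RANGE.fullmatch(text) is not None


def is_extended_range(text):
    """True if the whole string is a syntactically valid extended range."""
    return EXTENDED_RANGE.fullmatch(text) is not None


for candidate in ["de-CH", "*", "*-CH", "de_CH"]:
    print(candidate, is_basic_range(candidate), is_extended_range(candidate))
```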

7. References
7.1. Normative References

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC2277] Alvestrand, H., "IETF Policy on Character Sets and Languages", BCP 18, RFC 2277, January 1998.

[RFC4234] Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax Specifications: ABNF", RFC 4234, October 2005.

[RFC4646] Phillips, A., Ed., and M. Davis, Ed., "Tags for Identifying Languages", BCP 47, RFC 4646, September 2006.

7.2. Informative References

[RFC1766] Alvestrand, H., "Tags for the Identification of Languages", RFC 1766, March 1995.

[RFC2616] Fielding, R., Gettys, J., Mogul, J., Frystyk, H., Masinter, L., Leach, P., and T. Berners-Lee, "Hypertext Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999.

[RFC2616errata] IETF, "HTTP/1.1 Specification Errata", October 2004, <http://purl.org/NET/http-errata>.

[RFC3066] Alvestrand, H., "Tags for the Identification of Languages", BCP 47, RFC 3066, January 2001.

[RFC3282] Alvestrand, H., "Content Language Headers", RFC 3282, May 2002.

[XML10] Bray, T., Paoli, J., Sperberg-McQueen, C., Maler, E., and F. Yergeau, "Extensible Markup Language (XML) 1.0 (Third Edition)", World Wide Web Consortium Recommendation, February 2004, <http://www.w3.org/TR/REC-xml>.

Appendix A. Acknowledgements

Any list of contributors is bound to be incomplete; please regard the following as only a selection from the group of people who have contributed to make this document what it is today.

The contributors to [RFC1766] and [RFC3066], each of which was a precursor to this document, contributed greatly to the development of language tag matching, and, in particular, the basic language range and the basic matching scheme. This document was originally part of [RFC4646], but was split off before that document's completion. Thus, directly or indirectly, those acknowledged in [RFC4646] also had a hand in the development of this document, and work done prior to the split is acknowledged in that document.

The following people (in alphabetical order by family name) contributed to this document:

Harald Alvestrand, Stephane Bortzmeyer, Jeremy Carroll, Peter Constable, John Cowan, Mark Crispin, Martin Duerst, Frank Ellermann, Doug Ewell, Debbie Garside, Marion Gunn, Jon Hanna, Kent Karlsson, Erkki Kolehmainen, Jukka Korpela, Ira McDonald, M. Patton, Randy Presuhn, Eric van der Poel, Markus Scherer, Misha Wolf, and many, many others.

Very special thanks must go to Harald Tveit Alvestrand, who originated RFCs 1766 and 3066, and without whom this document would not have been possible.

Authors' Addresses

Addison Phillips (Editor) Yahoo! Inc.

   EMail: addison@inter-locale.com
        

Mark Davis (Editor) Google

   EMail: mark.davis@macchiato.com or mark.davis@google.com
        

Full Copyright Statement

Copyright (C) The Internet Society (2006).

This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights.

This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf-ipr@ietf.org.

Acknowledgement

Funding for the RFC Editor function is provided by the IETF Administrative Support Activity (IASA).
