Network Working Group                                       J. Rosenberg
Request for Comments: 5629                                 Cisco Systems
Category: Standards Track                                   October 2009
A Framework for Application Interaction in the Session Initiation Protocol (SIP)
Abstract
This document describes a framework for the interaction between users and Session Initiation Protocol (SIP) based applications. By interacting with applications, users can guide the way in which they operate. The focus of this framework is stimulus signaling, which allows a user agent (UA) to interact with an application without knowledge of the semantics of that application. Stimulus signaling can occur to a user interface running locally with the client, or to a remote user interface, through media streams. Stimulus signaling encompasses a wide range of mechanisms, ranging from clicking on hyperlinks, to pressing buttons, to traditional Dual-Tone Multi-Frequency (DTMF) input. In all cases, stimulus signaling is supported through the use of markup languages, which play a key role in this framework.
Status of This Memo
This document specifies an Internet standards track protocol for the Internet community, and requests discussion and suggestions for improvements. Please refer to the current edition of the "Internet Official Protocol Standards" (STD 1) for the standardization state and status of this protocol. Distribution of this memo is unlimited.
Copyright Notice
Copyright (c) 2009 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the BSD License.
This document may contain material from IETF Documents or IETF Contributions published or made publicly available before November 10, 2008. The person(s) controlling the copyright in some of this material may not have granted the IETF Trust the right to allow modifications of such material outside the IETF Standards Process. Without obtaining an adequate license from the person(s) controlling the copyright in such materials, this document may not be modified outside the IETF Standards Process, and derivative works of it may not be created outside the IETF Standards Process, except to format it for publication as an RFC or to translate it into languages other than English.
Table of Contents
1. Introduction
2. Conventions Used in This Document
3. Definitions
4. A Model for Application Interaction
   4.1. Functional vs. Stimulus
   4.2. Real-Time vs. Non-Real-Time
   4.3. Client-Local vs. Client-Remote
   4.4. Presentation-Capable vs. Presentation-Free
5. Interaction Scenarios on Telephones
   5.1. Client Remote
   5.2. Client Local
   5.3. Flip-Flop
6. Framework Overview
7. Deployment Topologies
   7.1. Third-Party Application
   7.2. Co-Resident Application
   7.3. Third-Party Application and User Device Proxy
   7.4. Proxy Application
8. Application Behavior
   8.1. Client-Local Interfaces
        8.1.1. Discovering Capabilities
        8.1.2. Pushing an Initial Interface Component
        8.1.3. Updating an Interface Component
        8.1.4. Terminating an Interface Component
   8.2. Client-Remote Interfaces
        8.2.1. Originating and Terminating Applications
        8.2.2. Intermediary Applications
9. User Agent Behavior
   9.1. Advertising Capabilities
   9.2. Receiving User Interface Components
   9.3. Mapping User Input to User Interface Components
   9.4. Receiving Updates to User Interface Components
   9.5. Terminating a User Interface Component
10. Inter-Application Feature Interaction
    10.1. Client-Local UI
    10.2. Client-Remote UI
11. Intra Application Feature Interaction
12. Example Call Flow
13. Security Considerations
14. Contributors
15. Acknowledgements
16. References
    16.1. Normative References
    16.2. Informative References
1. Introduction
The Session Initiation Protocol (SIP) [2] provides the ability for users to initiate, manage, and terminate communications sessions. Frequently, these sessions will involve a SIP application. A SIP application is defined as a program running on a SIP-based element (such as a proxy or user agent) that provides some value-added function to a user or system administrator. Examples of SIP applications include prepaid calling card calls, conferencing, and presence-based [12] call routing.
In order for most applications to properly function, they need input from the user to guide their operation. As an example, a prepaid calling card application requires the user to input their calling card number, their PIN code, and the destination number they wish to reach. The process by which a user provides input to an application is called "application interaction".
Application interaction can be either functional or stimulus. Functional interaction requires the user device to understand the semantics of the application, whereas stimulus interaction does not. Stimulus signaling allows for applications to be built without requiring modifications to the user device. Stimulus interaction is the subject of this framework. The framework provides a model for how users interact with applications through user interfaces, and how user interfaces and applications can be distributed throughout a network. This model is then used to describe how applications can instantiate and manage user interfaces.
2. Conventions Used in This Document
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [1].
3. Definitions
SIP Application: A SIP application is defined as a program running on a SIP-based element (such as a proxy or user agent) that provides some value-added function to a user or system administrator. Examples of SIP applications include prepaid calling card calls, conferencing, and presence-based [12] call routing.
Application Interaction: The process by which a user provides input to an application.
Real-Time Application Interaction: Application interaction that takes place while an application instance is executing. For example, when a user enters their PIN into a prepaid calling card application, this is real-time application interaction.
Non-Real-Time Application Interaction: Application interaction that takes place asynchronously with the execution of the application. Generally, non-real-time application interaction is accomplished through provisioning.
Functional Application Interaction: Application interaction is functional when the user device has an understanding of the semantics of the interaction with the application.
Stimulus Application Interaction: Application interaction is stimulus when the user device has no understanding of the semantics of the interaction with the application.
User Interface (UI): The user interface provides the user with context to make decisions about what they want. The user interacts with the device, which conveys the user input to the user interface. The user interface interprets the information and passes it to the application.
User Interface Component: A piece of user interface that operates independently of other pieces of the user interface. For example, a user might have two separate web interfaces to a prepaid calling card application: one for hanging up and making another call, and another for entering the username and PIN.
User Device: The software or hardware system that the user directly interacts with to communicate with the application. An example of a user device is a telephone. Another example is a PC with a web browser.
User Device Proxy: A software or hardware system that a user indirectly interacts through to communicate with the application. This indirection can be through a network. An example is a gateway from IP to the Public Switched Telephone Network (PSTN). It acts as a user device proxy, acting on behalf of the user on the circuit network.
User Input: The "raw" information passed from a user to a user interface. Examples of user input include a spoken word or a click on a hyperlink.
Client-Local User Interface: A user interface that is co-resident with the user device.
Client-Remote User Interface: A user interface that executes remotely from the user device. In this case, a standardized interface is needed between the user device and the user interface. Typically, this is done through media sessions: audio, video, or application sharing.
Markup Language: A markup language describes a logical flow of presentation of information to the user, collection of information from the user, and transmission of that information to an application.
Media Interaction: A means of separating a user and a user interface by connecting them with media streams.
Interactive Voice Response (IVR): An IVR is a type of user interface that allows users to speak commands to the application, and hear responses to those commands prompting for more information.
Prompt-and-Collect: The basic primitive of an IVR user interface. The user is presented with a voice option, and the user speaks their choice.
Barge-In: The act of entering information into an IVR user interface prior to the completion of a prompt requesting that information.
Focus: A user interface component has focus when user input is provided to it, as opposed to any other user interface components. This is not to be confused with the term "focus" within the SIP conferencing framework, which refers to the center user agent in a conference [14].
Focus Determination: The process by which the user device determines which user interface component will receive the user input.
Focusless Device: A user device that has no ability to perform focus determination. An example of a focusless device is a telephone with a keypad.
Presentation-Capable UI: A user interface that can prompt the user with input, collect results, and then prompt the user with new information based on those results.
Presentation-Free UI: A user interface that cannot prompt the user with information.
Feature Interaction: A class of problems that result when multiple applications or application components are trying to provide services to a user at the same time.
Inter-Application Feature Interaction: Feature interactions that occur between applications.
DTMF: Dual-Tone Multi-Frequency. DTMF refers to a class of tones generated by circuit-switched telephony devices when the user presses a key on the keypad. As a result, DTMF and keypad input are often used synonymously, when in fact one of them (DTMF) is merely a means of conveying the other (the keypad input) to a client-remote user interface (the switch, for example).
Application Instance: A single execution path of a SIP application.
Originating Application: A SIP application that acts as a User Agent Client (UAC), making a call on behalf of the user.
Terminating Application: A SIP application that acts as a User Agent Server (UAS), answering a call generated by a user. IVR applications are terminating applications.
Intermediary Application: A SIP application that is neither the caller nor the callee, but rather a third party involved in a call.
4. A Model for Application Interaction
   +---+            +---+            +---+             +---+
   |   |            |   |            |   |             |   |
   |   |            | U |            | U |             | A |
   |   |   Input    | s |   Input    | s |   Results   | p |
   |   | ---------> | e | ---------> | e | ----------> | p |
   | U |            | r |            | r |             | l |
   | s |            |   |            |   |             | i |
   | e |            | D |            | I |             | c |
   | r |   Output   | e |   Output   | f |   Update    | a |
   |   | <--------- | v | <--------- | a | <.......... | t |
   |   |            | i |            | c |             | i |
   |   |            | c |            | e |             | o |
   |   |            | e |            |   |             | n |
   |   |            |   |            |   |             |   |
   +---+            +---+            +---+             +---+
Figure 1: Model for Real-Time Interactions
Figure 1 presents a general model for how users interact with applications. Generally, users interact with a user interface through a user device. A user device can be a telephone, or it can be a PC with a web browser. Its role is to pass the user input from the user to the user interface. The user interface provides the user with context in order to make decisions about what they want. The user interacts with the device, causing information to be passed from the device to the user interface. The user interface interprets the information, and passes it as a user interface event to the application. The application may be able to modify the user interface based on this event. Whether or not this is possible depends on the type of user interface.
User interfaces are fundamentally about rendering and interpretation. Rendering refers to the way in which the user is provided context. This can be through hyperlinks, images, sounds, videos, text, and so on. Interpretation refers to the way in which the user interface takes the "raw" data provided by the user, and returns the result to the application as a meaningful event, abstracted from the particulars of the user interface. As an example, consider a prepaid calling card application. The user interface worries about details such as what prompt the user is provided, whether the voice is male or female, and so on. It is concerned with recognizing the speech that the user provides, in order to obtain the desired information. In this case, the desired information is the calling card number, the PIN code, and the destination number. The application needs that data, and it doesn't matter to the application whether it was collected using a male prompt or a female one.
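The split between rendering and interpretation can be illustrated with a minimal Python sketch. It is not part of the framework: the names `KeypadUI`, `CardCallRequest`, and `application`, and the convention that each field ends with the '#' key, are assumptions made only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CardCallRequest:
    """Abstract UI event: the data the application needs, and nothing else."""
    card_number: str
    pin: str
    destination: str


class KeypadUI:
    """Hypothetical user interface for a prepaid calling card application."""

    def __init__(self, voice):
        self.voice = voice      # rendering detail, never sent to the application
        self.rendered = []      # prompts presented to the user so far

    def render(self, prompt):
        # Rendering: give the user context for the next input.
        self.rendered.append(f"[{self.voice} voice] {prompt}")

    def interpret(self, raw_keys):
        # Interpretation: turn "raw" key presses into a meaningful event.
        # Assumption: each field is terminated by the '#' key.
        fields = [field for field in raw_keys.split("#") if field]
        if len(fields) != 3:
            raise ValueError("expected card number, PIN, and destination")
        return CardCallRequest(*fields)


def application(event):
    """Hypothetical application logic; it sees only the abstract event."""
    return f"calling {event.destination} with card {event.card_number}"
```

Two `KeypadUI` instances created with different voices produce equal `CardCallRequest` events for the same key presses, which mirrors the point that the application does not care how the data was collected.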
User interfaces generally have real-time requirements towards the user. That is, when a user interacts with the user interface, the user interface needs to react quickly, and that change needs to be propagated to the user right away. However, the interface between the user interface and the application need not be that fast. Faster is better, but the user interface itself can frequently compensate for long latencies between the user interface and the application. In the case of a prepaid calling card application, when the user is prompted to enter their PIN, the prompt should generally stop immediately once the first digit of the PIN is entered. This is referred to as "barge-in". After the user interface collects the rest of the PIN, it can tell the user to "please wait while processing". The PIN can then be gradually transmitted to the application. In this example, the user interface has compensated for a slow UI to application interface by asking the user to wait.
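The barge-in behavior described above can be sketched as a small simulation. This is only an illustration under stated assumptions: time advances in discrete ticks, the prompt is "played" one word per tick, and the function name `prompt_and_collect` and its parameters are hypothetical.

```python
def prompt_and_collect(prompt_words, key_events, pin_length=4):
    """Simulate one prompt-and-collect step with barge-in.

    prompt_words: the prompt, "played" one word per tick.
    key_events: dict mapping a tick number to the key pressed at that tick.
    Returns (words actually played, collected digits, messages to the user).
    """
    played, digits, messages = [], [], []
    last_key_tick = max(key_events, default=-1)
    tick = 0
    while len(digits) < pin_length and (
        tick < len(prompt_words) or tick <= last_key_tick
    ):
        key = key_events.get(tick)
        if key is not None:
            # First key press interrupts the prompt (barge-in).
            digits.append(key)
        elif not digits and tick < len(prompt_words):
            # Keep playing only while no input has been received.
            played.append(prompt_words[tick])
        tick += 1
    if len(digits) == pin_length:
        # The UI reacts immediately, masking a slow link to the application.
        messages.append("please wait while processing")
    return played, "".join(digits), messages
```

For instance, with the prompt "Please enter your PIN now" and the first key arriving at tick 2, only the first two words are played before collection takes over.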
The separation between user interface and application is absolutely fundamental to the entire framework provided in this document. Its importance cannot be overstated.
With this basic model, we can begin to taxonomize the types of systems that can be built.
4.1. Functional vs. Stimulus
The first way to taxonomize the system is to consider the interface between the UI and the application. There are two fundamentally different models for this interface. In a functional interface, the user interface has detailed knowledge about the application and is, in fact, specific to the application. The interface between the two components is through a functional protocol, capable of representing the semantics that can be exposed through the user interface. Because the user interface has knowledge of the application, it can be optimally designed for that application. As a result, functional user interfaces are almost always the most user friendly, the fastest, and the most responsive. However, in order to allow interoperability between user devices and applications, the details of the functional protocols need to be specified in standards. This slows down innovation and limits the scope of applications that can be built.
An alternative is a stimulus interface. In a stimulus interface, the user interface is generic -- that is, totally ignorant of the details of the application. Indeed, the application may pass instructions to the user interface describing how it should operate. The user interface translates user input into "stimulus", which are data understood only by the application, and not by the user interface. Because they are generic, and because they require communications with the application in order to change the way in which they render information to the user, stimulus user interfaces are usually slower, less user friendly, and less responsive than a functional counterpart. However, they allow for substantial innovation in applications, since no standardization activity is needed to build a new application, as long as it can interact with the user within the confines of the user interface mechanism. The web is an example of a stimulus user interface to applications.
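A stimulus interface can be sketched in a few lines of Python. The sketch assumes the application's instructions are a simple mapping from a displayed label to an opaque string, loosely analogous to hyperlinks; `GenericUI`, `CallingCardApp`, and the `op=...` strings are hypothetical and not defined by this framework.

```python
class GenericUI:
    """Stimulus UI: knows how to show choices, not what they mean."""

    def __init__(self):
        self.choices = {}

    def load(self, instructions):
        # Instructions from the application: label to show -> opaque stimulus.
        self.choices = dict(instructions)

    def labels(self):
        return list(self.choices)

    def select(self, label):
        # Return the stimulus unchanged; the UI never inspects it.
        return self.choices[label]


class CallingCardApp:
    """Hypothetical application; only it understands the stimulus strings."""

    def instructions(self):
        return {"Hang up": "op=end", "Make another call": "op=redial"}

    def handle(self, stimulus):
        actions = {
            "op=end": "call terminated",
            "op=redial": "collecting new number",
        }
        return actions[stimulus]
```

Because `GenericUI` contains nothing specific to calling cards, a different application could supply different instructions without any change to the user interface, which is the property the paragraph above attributes to stimulus interfaces.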
In SIP systems, functional interfaces are provided by extending the SIP protocol to provide the needed functionality. For example, the SIP caller preferences specification [15] provides a functional interface that allows a user to request applications to route the call to specific types of user agents. Functional interfaces are important, but are not the subject of this framework. The primary goal of this framework is to address the role of stimulus interfaces to SIP applications.
4.2. Real-Time vs. Non-Real-Time
Application interaction systems can also be real-time or non-real-time. Non-real-time interaction allows the user to enter information about application operation asynchronously with its invocation. Frequently, this is done through provisioning systems. As an example, a user can set up the forwarding number for a call-forward on no-answer application using a web page. Real-time interaction requires the user to interact with the application at the time of its invocation.
4.3. Client-Local vs. Client-Remote
Another axis in the taxonomization is whether the user interface is co-resident with the user device (which we refer to as a client-local user interface), or the user interface runs in a host separated from the client (which we refer to as a client-remote user interface). In a client-remote user interface, there exists some kind of protocol between the client device and the UI that allows the client to interact with the user interface over a network.
The most important way to separate the UI and the client device is through media interaction. In media interaction, the interface between the user and the user interface is through media: audio, video, messaging, and so on. This is the classic mode of operation for VoiceXML [5], where the user interface (also referred to as the voice browser) runs on a platform in the network. Users communicate with the voice browser through the telephone network (or using a SIP session). The voice browser interacts with the application using HTTP to convey the information collected from the user.
In the case of a client-local user interface, the user interface runs co-located with the user device. The interface between them is through the software that interprets the user's input and passes it to the user interface. The classic example of this is the Web. In the Web, the user interface is a web browser, and the interface is defined by the HTML document that it's rendering. The user interacts directly with the user interface running in the browser. The results of that user interface are sent to the application (running on the web server) using HTTP.
It is important to note that whether or not the user interface is local or remote (in the case of media interaction) is not a property of the modality of the interface, but rather a property of the system. As an example, it is possible for a Web-based user interface to be provided with a client-remote user interface. In such a scenario, video- and application-sharing media sessions can be used between the user and the user interface. The user interface, still guided by HTML, now runs "in the network", remote from the client. Similarly, a VoiceXML document can be interpreted locally by a client device, with no media streams at all. Indeed, the VoiceXML document can be rendered using text, rather than media, with no impact on the interface between the user interface and the application.
It is also important to note that systems can be hybrid. In a hybrid user interface, some aspects of it (usually those associated with a particular modality) run locally, and others run remotely.
4.4. Presentation-Capable vs. Presentation-Free
A user interface can be capable of presenting information to the user (a presentation-capable UI), or it can be capable only of collecting user input (a presentation-free UI). These are very different types of user interfaces. A presentation-capable UI can provide the user with feedback after every input, providing the context for collecting the next input. As a result, presentation-capable user interfaces require an update to the information provided to the user after each input. The Web is a classic example of this. After every input (i.e., a click), the browser provides the input to the application and fetches the next page to render. In a presentation-free user interface, this is not the case. Since the user is not provided with feedback, these user interfaces tend to merely collect information as it's entered, and pass it to the application.
Another difference is that a presentation-free user interface cannot easily support the concept of a focus. Selection of a focus usually requires a means for informing the user of the available applications, allowing the user to choose, and then informing them about which one they have chosen. Without the first and third steps (which a presentation-free UI cannot provide), focus selection is very difficult. Without a selected focus, the input provided to applications through presentation-free user interfaces is more of a broadcast or notification operation.
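The difference between focused delivery and the broadcast-like delivery described above can be sketched as follows. The function `route_input` and the component names used with it are hypothetical, chosen only to illustrate the idea.

```python
def route_input(user_input, components, focus=None):
    """Return a mapping of component name -> input delivered to it.

    components: names of the active user interface components.
    focus: the component selected by the user, or None when no focus
    could be selected (as with a presentation-free UI).
    """
    if focus is not None:
        if focus not in components:
            raise ValueError(f"unknown component: {focus}")
        return {focus: user_input}
    # No focus: every component receives the input, like a broadcast.
    return {name: user_input for name in components}
```

With a focus, a key press reaches exactly one component; without one, every active component sees the same input and must decide for itself whether the input is relevant.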
另一个区别是,无演示文稿的用户界面无法轻松支持焦点的概念。焦点的选择通常需要一种方法来通知用户可用的应用程序,允许用户选择,然后通知他们选择了哪个应用程序。如果没有第一步和第三步(无演示的UI无法提供),焦点选择是非常困难的。如果没有选定的焦点,则通过无表示的用户界面向应用程序提供的输入更多地是广播或通知操作。
In this section, we apply the model of Section 4 to telephones.
在本节中,我们将第4节的模型应用于电话。
In a traditional telephone, the user interface consists of a 12-key keypad, a speaker, and a microphone. Indeed, from here forward, the term "telephone" is used to represent any device that meets, at a minimum, the characteristics described in the previous sentence. Circuit-switched telephony applications are almost universally client-remote user interfaces. In the Public Switched Telephone Network (PSTN), there is usually a circuit interface between the user and the user interface. The user input from the keypad is conveyed using Dual-Tone Multi-Frequency (DTMF), and the microphone input as Pulse Code Modulated (PCM) encoded voice.
在传统电话中,用户界面由一个12键键盘、一个扬声器和一个麦克风组成。事实上,从这里开始,术语“电话”用于表示至少满足上一句所述特征的任何设备。电路交换电话应用程序几乎都是客户端远程用户界面。在公共交换电话网(PSTN)中,用户和用户界面之间通常有一个电路接口。来自键盘的用户输入使用双音多频(DTMF)传送,麦克风输入则作为脉冲编码调制(PCM)编码语音传送。
In an IP-based system, there is more variability in how the system can be instantiated. Both client-remote and client-local user interfaces to a telephone can be provided.
在基于IP的系统中,系统的实例化方式具有更多的可变性。可以为电话提供客户端远程和客户端本地用户界面。
In this framework, a PSTN gateway can be considered a User Device Proxy. It is a proxy for the user because it can provide, to a user interface on an IP network, input taken from a user on a circuit-switched telephone. The gateway may be able to run a client-local user interface, just as an IP telephone might.
在此框架中,PSTN网关可以被视为用户设备代理。它是用户的代理,因为它可以向IP网络上的用户界面提供从电路交换电话上的用户获取的输入。网关可以像IP电话一样运行客户端本地用户界面。
The most obvious instantiation is the "classic" circuit-switched telephony model. In that model, the user interface runs remotely from the client. The interface between the user and the user interface is through media, which is set up by SIP and carried over the Real-time Transport Protocol (RTP) [18]. The microphone input can be carried using any suitable voice-encoding algorithm. The keypad input can be conveyed in one of two ways. The first is to convert the keypad input to DTMF tones, and then convey those tones in the audio stream using a suitable encoding algorithm (such as PCMU). An alternative, and generally the preferred approach, is to transmit the keypad input using RFC 4733 [19], which provides an encoding mechanism for carrying keypad input within RTP.
最明显的实例是“经典”电路交换电话模型。在该模型中,用户界面从客户端远程运行。用户和用户界面之间的接口是通过媒体实现的,媒体由SIP建立,并通过实时传输协议(RTP)传输[18]。可以使用任何合适的语音编码算法携带麦克风输入。键盘输入可以通过两种方式之一传送。第一种方法是将键盘输入转换为DTMF,然后使用合适的编码算法(如PCMU)传输DTMF。另一种替代方法,通常是首选方法,是使用RFC 4733[19]传输键盘输入,它提供了一种编码机制,用于在RTP内携带键盘输入。
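As an illustration of the second option, the following minimal sketch packs a keypad press into the 4-byte telephone-event payload layout of RFC 4733 (event code, end bit, reserved bit, volume, duration). The function names and the default volume are assumptions made for this example; the RTP header, marker bit handling, and retransmission of the final packet are deliberately left out.

```python
import struct

# DTMF event codes assigned by RFC 4733: 0-9, then * = 10, # = 11, A-D = 12-15.
DTMF_EVENTS = {**{str(d): d for d in range(10)},
               "*": 10, "#": 11, "A": 12, "B": 13, "C": 14, "D": 15}


def encode_telephone_event(key, duration, volume=10, end=False):
    """Pack one telephone-event payload (4 bytes, network byte order).

    duration is expressed in RTP timestamp units; volume is 0..63.
    Hypothetical helper for illustration only.
    """
    if not 0 <= volume <= 63:
        raise ValueError("volume must fit in 6 bits")
    if not 0 <= duration <= 0xFFFF:
        raise ValueError("duration must fit in 16 bits")
    # Bit 7 is the E (end) bit, bit 6 is reserved (zero), bits 5..0 carry volume.
    flags = (0x80 if end else 0) | volume
    return struct.pack("!BBH", DTMF_EVENTS[key], flags, duration)


def decode_telephone_event(payload):
    """Return (key, end, volume, duration) for a 4-byte payload."""
    event, flags, duration = struct.unpack("!BBH", payload)
    key = {code: k for k, code in DTMF_EVENTS.items()}[event]
    return key, bool(flags & 0x80), flags & 0x3F, duration
```

A remote user interface that receives such payloads can recover the key directly, without running tone detection on the audio.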
In this classic model, the user interface would run on a server in the IP network. It would perform speech recognition and DTMF recognition to derive the user intent, feed that intent through the user interface, and provide the result to an application.
在这个经典模型中,用户界面将在IP网络中的服务器上运行。它将执行语音识别和双音多频识别,以获得用户意图,通过用户界面提供给用户,并将结果提供给应用程序。
An alternative model is for the entire user interface to reside on the telephone. The user interface can be a VoiceXML browser, running speech recognition on the microphone input, and feeding the keypad input directly into the script. As discussed above, the VoiceXML script could be rendered using text instead of voice, if the telephone has a textual display.
另一种模式是将整个用户界面驻留在电话上。用户界面可以是VoiceXML浏览器,在麦克风输入上运行语音识别,并将键盘输入直接输入到脚本中。如上所述,如果电话具有文本显示,则可以使用文本而不是语音来呈现VoiceXML脚本。
For simpler phones without a display, the user interface can be described by a Keypad Markup Language request document [8]. As the user enters digits in the keypad, they are passed to the user interface, which generates user interface events that can be transported to the application.
对于没有显示器的简单手机,用户界面可以通过键盘标记语言请求文档[8]来描述。当用户在键盘中输入数字时,它们被传递到用户界面,用户界面生成可以传输到应用程序的用户界面事件。
A middle-ground approach is to flip back and forth between a client-local and client-remote user interface. Many voice applications are of the type that listen to the media stream and wait for some specific trigger that kicks off a more complex user interaction. The "long pound" (holding down the pound key) in a prepaid calling card application is one example. Another example is a conference recording application, where the user can press a key at some point in the call to begin recording. When the key is pressed, the user hears a whisper to inform them that recording has started.
一种中间方法是在客户端本地用户界面和客户端远程用户界面之间来回切换。许多语音应用程序都是监听媒体流并等待启动更复杂用户交互的特定触发器的类型。预付费电话卡应用程序中的长磅就是一个例子。另一个例子是会议录制应用程序,用户可以在通话中的某个点按一个键开始录制。按下该键时,用户会听到一声耳语,告知他们录制已开始。
The ideal way to support such an application is to install a client-local user interface component that waits for the trigger to kick off the real interaction. Once the trigger is received, the application connects the user to a client-remote user interface that can play announcements, collect more information, and so on.
支持此类应用程序的理想方法是安装一个客户端本地用户界面组件,该组件等待触发器启动真正的交互。收到触发器后,应用程序将用户连接到客户端远程用户界面,该界面可以播放公告、收集更多信息等。
The benefit of flip-flopping between a client-local and client-remote user interface is cost. The client-local user interface will eliminate the need to send media streams into the network just to wait for the user to press the pound key on the keypad.
在客户端本地和客户端远程用户界面之间切换的好处是成本。客户端本地用户界面将消除只需等待用户按下键盘上的井号键即可将媒体流发送到网络的需要。
The Keypad Markup Language (KPML) was designed to support exactly this kind of need [8]. It models the keypad on a phone and allows an application to be informed when any sequence of keys has been pressed. However, KPML has no presentation component. Since user interfaces generally require a response to user input, the presentation will need to be done using a client-remote user interface that gets instantiated as a result of the trigger.
键盘标记语言(KPML)的设计正是为了支持这种需求[8]。它为手机上的键盘建模,并允许在按下任何键序列时通知应用程序。但是,KPML没有表示组件。由于用户界面通常需要对用户输入做出响应,因此需要使用作为触发器结果实例化的客户端远程用户界面来完成演示。
It is tempting to use a hybrid model, where a prompt-and-collect application is implemented by using a client-remote user interface that plays the prompts, and a client-local user interface, described by KPML, that collects digits. However, this only complicates the application. Firstly, the keypad input will be sent to both the media stream and the KPML user interface. This requires the application to sort out which user inputs are duplicates, a process that is very complicated. Secondly, the primary benefit of KPML is to avoid having a media stream towards a user interface. However, there is already a media stream for the prompting, so there is no real savings.
很容易使用混合模型,其中通过使用播放提示的客户端远程用户界面和收集数字的客户端本地用户界面(由KPML描述)来实现提示和收集应用程序。然而,这只会使应用程序复杂化。首先,键盘输入将同时发送到媒体流和KPML用户界面。这需要应用程序对哪些用户输入是重复的进行分类,这是一个非常复杂的过程。第二,KPML的主要好处是避免媒体流流向用户界面。然而,已经有了一个用于提示的媒体流,因此没有真正的节省。
In this framework, we use the term "SIP application" to refer to a broad set of functionality. A SIP application is a program running on a SIP-based element (such as a proxy or user agent) that provides some value-added function to a user or system administrator. SIP applications can execute on behalf of a caller, a called party, or a multitude of users at once.
在这个框架中,我们使用术语“SIP应用程序”来指代一系列广泛的功能。SIP应用程序是在基于SIP的元素(如代理或用户代理)上运行的程序,为用户或系统管理员提供一些增值功能。SIP应用程序可以代表呼叫者、被叫方或同时代表多个用户执行。
Each application has a number of instances that are executing at any given time. An instance represents a single execution path for an application. It is established as a result of some event. That event can be a SIP event, such as the reception of a SIP INVITE request, or it can be a non-SIP event, such as a web form post or even a timer. Application instances also have an end time. Some instances have a lifetime that is coupled with a SIP transaction or dialog. For example, a proxy application might begin when an INVITE arrives, and terminate when the call is answered. Other applications have a lifetime that spans multiple dialogs or transactions. For example, a conferencing application instance may exist so long as there are dialogs connected to it. When the last dialog terminates, the application instance terminates. Other applications have a lifetime that is completely decoupled from SIP events.
每个应用程序都有许多在任何给定时间执行的实例。实例表示应用程序的单个执行路径。它是由于某些事件而建立的。该事件可以是SIP事件,例如接收SIP INVITE请求,也可以是非SIP事件,例如web表单post,甚至是计时器。应用程序实例也有一个结束时间。某些实例的生存期与SIP事务或对话相耦合。例如,代理应用程序可能在INVITE到达时开始,在呼叫应答时终止。其他应用程序的生命周期跨越多个对话框或事务。例如,会议应用程序实例可能存在,只要有对话框连接到它。当最后一个对话框终止时,应用程序实例终止。其他应用程序的生存期与SIP事件完全解耦。
It is fundamental to the framework described here that multiple application instances may interact with a user during a single SIP transaction or dialog. Each instance may be for the same application, or different applications. Each of the applications may be completely independent, in that each may be owned by a different provider, and may not be aware of each other's existence. Similarly, there may be application instances interacting with the caller, and instances interacting with the callee, both within the same transaction or dialog.
对于本文描述的框架来说,多个应用程序实例可以在单个SIP事务或对话期间与用户交互是非常重要的。每个实例可以用于相同的应用程序,也可以用于不同的应用程序。每个应用程序都可能是完全独立的,因为每个应用程序可能由不同的提供商拥有,并且可能不知道彼此的存在。类似地,在同一事务或对话框中,可能存在与调用者交互的应用程序实例和与被调用者交互的实例。
The first step in the interaction with the user is to instantiate one or more user interface components for the application instance. A user interface component is a single piece of the user interface that is defined by a logical flow that is not synchronously coupled with any other component. In other words, each component runs independently.
与用户交互的第一步是为应用程序实例实例化一个或多个用户界面组件。用户界面组件是由不与任何其他组件同步耦合的逻辑流定义的单个用户界面。换句话说,每个组件都独立运行。
A user interface component can be instantiated in one of the user agents in a dialog (for a client-local user interface), or within a network element (for a client-remote user interface). If a client-local user interface is to be used, the application needs to determine whether or not the user agent is capable of supporting a client-local user interface, and in what format. In this framework, all client-local user interface components are described by a markup language. A markup language describes a logical flow of presentation of information to the user, a collection of information from the user, and a transmission of that information to an application. Examples of markup languages include HTML, Wireless Markup Language (WML), VoiceXML, and the Keypad Markup Language (KPML) [8].
用户界面组件可以在对话框中的一个用户代理中实例化(对于客户端本地用户界面),也可以在网元中实例化(对于客户端远程用户界面)。如果要使用客户端本地用户界面,应用程序需要确定用户代理是否能够支持客户端本地用户界面,以及支持的格式。在此框架中,所有客户端本地用户界面组件都由标记语言描述。标记语言描述了向用户呈现信息的逻辑流、来自用户的信息集合以及将该信息传输到应用程序的过程。标记语言的示例包括HTML、无线标记语言(WML)、VoiceXML和键盘标记语言(KPML)[8]。
Unlike an application instance, which has a very flexible lifetime, a user interface component has a very fixed lifetime. A user interface component is always associated with a dialog. The user interface component can be created at any point after the dialog (or early dialog) is created. However, the user interface component terminates when the dialog terminates. The user interface component can be terminated earlier by the user agent, and possibly by the application, but its lifetime never exceeds that of its associated dialog.
与具有非常灵活的生命周期的应用程序实例不同,用户界面组件具有非常固定的生命周期。用户界面组件始终与对话框关联。用户界面组件可以在创建对话框(或早期对话框)后的任何时候创建。但是,当对话框终止时,用户界面组件终止。用户界面组件可以由用户代理提前终止,也可以由应用程序提前终止,但其生存期永远不会超过其关联对话框的生存期。
There are two ways to create a client-local interface component. For interface components that are presentation capable, the application sends a REFER [7] request to the user agent. The Refer-To header field contains an HTTP URI that points to the markup for the user interface, and the REFER contains a Target-Dialog header field [10] which identifies the dialog associated with the user interface component. For user interface components that are presentation free (such as those defined by KPML), the application sends a SUBSCRIBE request to the user agent. The body of the SUBSCRIBE request contains a filter, which, in this case, is the markup that defines when information is to be sent to the application in a NOTIFY. The SUBSCRIBE does not contain the Target-Dialog header field, since equivalent information is conveyed in the Event header field.
有两种方法可以创建客户端本地接口组件。对于支持表示的接口组件,应用程序向用户代理发送REFER[7]请求。Refer-To标头字段包含指向用户界面标记的HTTP URI,REFER还包含Target-Dialog标头字段[10],用于标识与用户界面组件关联的对话框。对于无表示的用户界面组件(如KPML定义的组件),应用程序向用户代理发送SUBSCRIBE请求。SUBSCRIBE请求的主体包含一个过滤器,在本例中,该过滤器是一个标记,用于定义何时在NOTIFY中向应用程序发送信息。SUBSCRIBE不包含Target-Dialog标头字段,因为等效信息在Event标头字段中传递。
If a user interface component is to be instantiated in the network, there is no need to determine the capabilities of the device on which the user interface is instantiated. Presumably, it is on a device on which the application knows a UI can be created. However, the application does need to connect the user device to the user interface. This will require manipulation of media streams in order to establish that connection.
如果要在网络中实例化用户界面组件,则无需确定实例化用户界面的设备的功能。据推测,它位于应用程序知道可以在其上创建UI的设备上。但是,应用程序确实需要将用户设备连接到用户界面。这将需要操纵媒体流以建立连接。
The interface between the user interface component and the application depends on the type of user interface. For presentation-capable user interfaces, such as those described by HTML and VoiceXML, HTTP form POST operations are used. For presentation-free user interfaces, a SIP NOTIFY is used. The differing needs and capabilities of these two user interfaces, as described in Section 4.4, are what drives the different choices for the interactions. Since presentation-capable user interfaces require an update to the presentation every time user data is entered, they are a good match for HTTP. Since presentation-free user interfaces merely transmit user input to the application, a NOTIFY is more appropriate.
用户界面组件和应用程序之间的接口取决于用户界面的类型。对于支持表示的用户界面,如HTML和VoiceXML描述的用户界面,使用HTTP表单POST操作。对于无表示的用户界面,使用SIP NOTIFY。如第4.4节所述,这两个用户界面的不同需求和功能决定了交互的不同选择。由于支持表示的用户界面需要在每次输入用户数据时更新表示,因此它们与HTTP非常匹配。由于无表示的用户界面仅将用户输入传输到应用程序,因此通知更合适。
Indeed, for presentation-free user interfaces, there are two different modalities of operation. The first is called "one shot". In the one-shot modality, the markup waits for a user to enter some information and, when they do, reports this event to the application. The application then does something, and the markup is no longer used. In the other modality, called "monitor", the markup stays permanently resident, and reports information back to an application until termination of the associated dialog.
实际上,对于无演示的用户界面,有两种不同的操作模式。第一种被称为“一次性”。在一次性模式中,标记等待用户输入一些信息,并在用户输入时向应用程序报告此事件。然后,应用程序会执行一些操作,并且不再使用标记。在另一种称为“监视器”的模式中,标记保持永久驻留,并向应用程序报告信息,直到相关对话框终止。
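The difference between the two modalities can be sketched with a small key watcher. This is an illustrative model only, not KPML itself: the class name, the exact-suffix matching rule, and the in-memory report list are assumptions made for the example.

```python
class KeypadWatcher:
    """Toy model of a presentation-free UI component (hypothetical)."""

    def __init__(self, trigger, mode="one-shot"):
        if mode not in ("one-shot", "monitor"):
            raise ValueError("mode must be 'one-shot' or 'monitor'")
        self.trigger = trigger
        self.mode = mode
        self.buffer = ""
        self.active = True
        self.reports = []  # stands in for NOTIFY requests sent to the application

    def key(self, digit):
        """Feed one key press; record a report when the trigger is matched."""
        if not self.active:
            return
        # Keep only as many trailing keys as the trigger is long.
        self.buffer = (self.buffer + digit)[-len(self.trigger):]
        if self.buffer == self.trigger:
            self.reports.append(self.trigger)
            self.buffer = ""
            if self.mode == "one-shot":
                self.active = False  # markup is no longer used after one report

    def dialog_terminated(self):
        """A monitor stays resident only until the associated dialog ends."""
        self.active = False
```

With the key sequence 1, #, 2, # and a trigger of "#", a one-shot watcher reports once and then goes quiet, while a monitor reports on both presses.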
This section presents some of the network topologies in which this framework can be instantiated.
本节介绍了可以在其中实例化此框架的一些网络拓扑。
                 +-------------+
             /---| Application |
            /    +-------------+
           /
   SUB/   /  REFER/
   NOT   /   HTTP
        /
+--------+    SIP (INVITE)    +-----+
|   UI   A--------------------X     |
|........|                    | SIP |
|  User  |        RTP         | UA  |
| Device B--------------------Y     |
+--------+                    +-----+
Figure 2: Third-Party Topology
图2:第三方拓扑
In this topology, the application that is interested in interacting with the users exists outside of the SIP dialog between the user agents. In that case, the application learns about the initiation and termination of the dialog, along with the dialog identifiers, through some out-of-band means. One such possibility is the dialog event package [16]. Dialog information is only revealed to trusted parties, so the application would need to be trusted by one of the users in order to obtain this information.
在此拓扑中,感兴趣与用户交互的应用程序存在于用户代理之间的SIP对话框之外。在这种情况下,应用程序通过一些带外方式了解对话框的启动和终止以及对话框标识符。这种可能性之一是对话框事件包[16]。对话框信息仅向受信任方公开,因此应用程序需要得到其中一个用户的信任才能获得此信息。
At any point during the dialog, the application can instantiate user interface components on the user device of the caller or callee. It can do this using either SUBSCRIBE or REFER, depending on the type of user interface (presentation capable or presentation free).
在对话框中的任何时候,应用程序都可以在调用者或被调用者的用户设备上实例化用户界面组件。根据用户界面的类型(支持演示或无演示),它可以使用订阅或引用来完成此操作。
+--------+    SIP (INVITE)    +-----+
|  User  A--------------------X SIP |
| Device |        RTP         | UA  |
|........B--------------------Y     |
|        |      SUB/NOT       | App)|
|   UI   A'-------------------X'    |
+--------+     REFER/HTTP     +-----+
Figure 3: Co-Resident Topology
图3:共驻拓扑
In this deployment topology, the application is co-resident with one of the user agents (the one on the right in the picture above). This application can install client-local user interface components on the other user agent, which is acting as the user device. These components can be installed using either SUBSCRIBE, for presentation-free user interfaces, or REFER, for presentation-capable ones. This situation typically arises when the application wishes to install UI components on a presentation-capable user interface. If the only user input is via keypad input, the framework is not needed per se, because the UA/application will receive the input via RFC 4733 in the RTP stream.
在此部署拓扑中,应用程序与一个用户代理(上图中右侧的那个)共存。此应用程序可以在作为用户设备的其他用户代理上安装客户端本地用户界面组件。这些组件可以使用SUBSCRIBE(用于无演示文稿的用户界面)或REFER(用于支持演示文稿的用户界面)安装。当应用程序希望在支持演示的用户界面上安装UI组件时,通常会出现这种情况。如果唯一的用户输入是通过键盘输入,则框架本身不需要,因为UA/应用程序将通过RTP流中的RFC 4733接收输入。
If the application resides in the called party, it is called a "terminating application". If it resides in the calling party, it is called an "originating application".
如果应用程序驻留在被叫方,则称为“终止应用程序”。如果它位于调用方,则称为“原始应用程序”。
This kind of topology is common in protocol converter and gateway applications.
这种拓扑结构在协议转换器和网关应用中很常见。
                                       +-------------+
                                   /---| Application |
                                  /    +-------------+
                                 /
                         SUB/   /  REFER/
                         NOT   /   HTTP
                              /
+-----+        SIP         +---M----+        SIP         +-----+
|     V--------------------C        A--------------------X     |
| SIP |                    |   UI   |                    | SIP |
| UAa |        RTP         |        |        RTP         | UAb |
|     W--------------------D        B--------------------Y     |
+-----+                    +--------+                    +-----+
 User                         User
Device                       Device
                              Proxy
Figure 4: User Device Proxy Topology
图4:用户设备代理拓扑
In this deployment topology, there is a third-party application as in Section 7.1. However, instead of installing a user interface component on the end user device, the component is installed in an intermediate device, known as a User Device Proxy. From the perspective of the actual user device (on the left), the User Device Proxy is a client-remote user interface. As such, media, typically transported using RTP (including RFC 4733 for carrying user input), is sent from the user device to the client-remote user interface on the User Device Proxy. As far as the application is concerned, it is installing what it thinks is a client-local user interface on the user device, but it happens to be on a User Device Proxy that looks like the user device to the application.
在此部署拓扑中,有一个第三方应用程序,如第7.1节所示。但是,不是在最终用户设备上安装用户界面组件,而是将该组件安装在称为用户设备代理的中间设备中。从实际用户设备(左侧)的角度来看,用户设备代理是一个客户端远程用户界面。因此,通常使用RTP(包括用于承载用户输入的RFC 4733)传输的媒体从用户设备发送到用户设备代理上的客户端远程用户接口。就应用程序而言,它正在用户设备上安装它认为是客户端本地用户界面的东西,但它恰好位于一个用户设备代理上,该代理看起来像应用程序的用户设备。
The User Device Proxy will need to terminate and re-originate both signaling (SIP) and media traffic towards the actual peer in the conversation. The User Device Proxy is a media relay in the terminology of RFC 3550 [18]. The User Device Proxy will need to monitor the media streams associated with each dialog, in order to convert user input received in the media stream to events reported to the user interface. This can pose a challenge in multi-media systems, where it may be unclear on which media stream the user input is being sent. As discussed in RFC 3264 [20], if a user agent has a single media source and is supporting multiple streams, it is supposed to send that source to all streams. In cases where there are multiple sources, the mapping is a matter of local policy. In the absence of a way to explicitly identify or request which sources map to which streams, the User Device Proxy will need to do the best job it can. This specification RECOMMENDS that the User Device Proxy monitor the first stream (defined in terms of ordering of media sessions within a session description). As such, user agents SHOULD send their user input on the first stream, absent a policy to direct it otherwise.
用户设备代理将需要终止并重新发起信令(SIP)和媒体流量,以到达会话中的实际对等方。用户设备代理是RFC 3550[18]术语中的媒体中继。用户设备代理将需要监视与每个对话框相关联的媒体流,以便将媒体流中接收到的用户输入转换为报告给用户界面的事件。这在多媒体系统中可能会带来挑战,在多媒体系统中,可能不清楚用户输入是在哪个媒体流上发送的。如RFC 3264[20]中所述,如果用户代理具有单个媒体源并且支持多个流,则应将该源发送到所有流。在有多个源的情况下,映射是本地策略的问题。在没有办法明确识别或请求哪些源映射到哪些流的情况下,用户设备代理将需要尽其所能。本规范建议(RECOMMENDS)用户设备代理监控第一个流(根据会话描述中媒体会话的顺序定义)。因此,在没有策略另行指示的情况下,用户代理应该(SHOULD)在第一个流上发送其用户输入。
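The recommendation to monitor the first stream can be sketched as follows: a User Device Proxy scans the session description and selects the first "m=" line it finds. The function name and the returned dictionary shape are assumptions of this example, and a real implementation would also need to handle rejected streams (port zero) and session updates.

```python
def first_media_stream(sdp):
    """Return a description of the first m= line in an SDP body, or None.

    Hypothetical helper: "first" means first in document order, which is
    how ordering of media sessions is defined in a session description.
    """
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            media, port, proto, *formats = line[2:].split()
            return {
                "media": media,
                # The port field may carry a "/count" suffix; keep the base port.
                "port": int(port.split("/")[0]),
                "proto": proto,
                "formats": formats,
            }
    return None
```

Given an offer with an audio stream followed by a video stream, this selects the audio stream, which is where user agents following the SHOULD above would send keypad input.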
                      +----------+
         SUB/NOT      |   App    |     SUB/NOT
     +--------------->|          |<-----------------+
     |  REFER/HTTP    |..........|    REFER/HTTP    |
     |                |   SIP    |                  |
     |                |  Proxy   |                  |
     |                +----------+                  |
     V                  ^       |                   V
+----------+            |       |             +----------+
|    UI    |   INVITE   |       |   INVITE    |    UI    |
|          |------------+       +------------>|          |
|..........|                                  |..........|
|   SIP    |..................................|   SIP    |
|    UA    |                                  |    UA    |
+----------+               RTP                +----------+
User Device                                   User Device
Figure 5: Proxy Application Topology
图5:代理应用程序拓扑
In this topology, the application is co-resident with a transaction stateful, record-routing proxy server on the call path between two user devices. The application uses SUBSCRIBE or REFER to install user interface components on one or both user devices.
在此拓扑中,应用程序与两个用户设备之间的调用路径上的事务状态、记录路由代理服务器共存。应用程序使用SUBSCRIBE或REFER在一个或两个用户设备上安装用户界面组件。
This topology is common in routing applications, such as a web-assisted call-routing application.
这种拓扑结构在路由应用程序中很常见,例如web辅助呼叫路由应用程序。
The behavior of an application within this framework depends on whether it seeks to use a client-local or client-remote user interface.
此框架中应用程序的行为取决于它是寻求使用客户端本地用户界面还是客户端远程用户界面。
One key component of this framework is support for client-local user interfaces.
该框架的一个关键组件是对客户端本地用户界面的支持。
A client-local user interface can only be instantiated on a user agent if the user agent supports that type of user interface component. Support for client-local user interface components is declared by both the UAC and UAS in the Allow, Accept, Supported, and Allow-Events header fields of dialog-initiating requests and responses. If the Allow header field indicates support for the SIP SUBSCRIBE method, and the Allow-Events header field indicates support for the KPML package [8], and the Supported header field indicates support for the Globally Routable UA URI (GRUU) [9] specification (which, in turn, means that the Contact header field contains a GRUU), it means that the UA can instantiate presentation-free user interface components. In this case, the application can push presentation-free user interface components according to the rules of Section 8.1.2. The specific markup languages that can be supported are indicated in the Accept header field.
如果用户代理支持该类型的用户界面组件,则只能在用户代理上实例化客户端本地用户界面。UAC和UAS在对话框发起请求和响应的Allow、Accept、Supported和Allow-Events头字段中声明对客户端本地用户界面组件的支持。如果Allow头字段表示支持SIP SUBSCRIBE方法,Allow-Events头字段表示支持KPML包[8],Supported头字段表示支持全局可路由UA URI(GRUU)[9]规范(这反过来意味着Contact头字段包含GRUU),这意味着UA可以实例化无表示的用户界面组件。在这种情况下,应用程序可以根据第8.1.2节的规则推送无演示的用户界面组件。可支持的特定标记语言在Accept头字段中指明。
If the Allow header field indicates support for the SIP REFER method, and the Supported header field indicates support for the Target-Dialog header field [10], and the Contact header field contains UA capabilities [6] that indicate support for the HTTP URI scheme, it means that the UA supports presentation-capable user interface components. In this case, the application can push presentation-capable user interface components to the client according to the rules of Section 8.1.2. The specific markups that are supported are indicated in the Accept header field.
如果Allow header字段表示支持SIP REFER方法,Supported header字段表示支持Target Dialog header字段[10],Contact header字段包含表示支持HTTP URI方案的UA功能[6],则表示UA支持支持支持演示功能的用户界面组件。在这种情况下,应用程序可以根据第8.1.2节的规则将支持演示的用户界面组件推送到客户端。支持的特定标记在Accept标头字段中指示。
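The two capability checks described above can be sketched as a single function over the header fields of a dialog-initiating request or response. Several details are assumptions of this example rather than statements of the framework: headers are supplied as a plain dictionary keyed by canonical name, the event header is looked up as "Allow-Events" (the name SIP event notification uses), GRUU and Target-Dialog support are detected through the "gruu" and "tdialog" option tags, and HTTP support is read from a schemes="..." Contact parameter in the style of the UA capabilities specification [6].

```python
import re


def _tokens(value):
    """Split a comma-separated header value into lowercase tokens."""
    return {t.strip().lower() for t in value.split(",") if t.strip()}


def ui_capabilities(headers):
    """Classify which client-local UI component types a UA appears to support.

    headers: dict of header name -> raw value (hypothetical input shape).
    """
    allow = _tokens(headers.get("Allow", ""))
    events = _tokens(headers.get("Allow-Events", ""))
    supported = _tokens(headers.get("Supported", ""))

    # Look for a schemes="..." feature parameter on the Contact header field.
    match = re.search(r';\s*schemes="([^"]*)"', headers.get("Contact", ""),
                      re.IGNORECASE)
    schemes = _tokens(match.group(1)) if match else set()

    return {
        "presentation_free": ("subscribe" in allow and "kpml" in events
                              and "gruu" in supported),
        "presentation_capable": ("refer" in allow and "tdialog" in supported
                                 and "http" in schemes),
    }
```

The Accept header field would then be consulted separately to learn which markup languages the UA can actually process.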
A third-party application that is not present on the call path will not be privy to these header fields in the dialog-initiating requests that pass by. As such, it will need to obtain this capability information in other ways. One way is through the registration event package [21], which can contain user agent capability information provided in REGISTER requests [6].
调用路径上不存在的第三方应用程序将不了解发起经过的请求的对话框中的这些头字段。因此,它将需要以其他方式获得该能力信息。一种方法是通过注册事件包[21],它可以包含注册请求[6]中提供的用户代理功能信息。
Generally, we anticipate that interface components will need to be created at various different points in a SIP session. Clearly, they will need to be pushed during session setup, or after the session is established. A user interface component is always associated with a specific dialog, however.
通常,我们预期在SIP会话中的不同点将需要创建接口组件。显然,需要在会话设置期间或会话建立之后推送它们。但是,用户界面组件始终与特定对话框相关联。
An application MUST NOT attempt to push a user interface component to a user agent until it has determined that the user agent has the necessary capabilities and a dialog has been created. In the case of a UAC, this means that an application MUST NOT push a user interface component for an INVITE-initiated dialog until the application has seen a request confirming the receipt of a dialog-creating response. This could be an ACK for a 200 OK, or a PRACK for a provisional response [3]. For SUBSCRIBE-initiated dialogs, the application MUST NOT push a user interface component until the application has seen a 200 OK to the NOTIFY request. For a user interface component on a UAS, the application MUST NOT push a user interface component for an INVITE-initiated dialog until it has seen a dialog-creating response from the UAS. For a SUBSCRIBE-initiated dialog, it MUST NOT push a user interface component until it has seen a NOTIFY request from the notifier.
在确定用户代理具有必要的功能并且对话框已经创建之前,应用程序不得尝试将用户界面组件推送到用户代理。在UAC的情况下,这意味着在应用程序看到确认收到对话框创建响应的请求之前,应用程序不得为INVITE启动的对话框推送用户界面组件。这可能是针对200 OK的ACK,或者是针对临时响应的PRACK[3]。对于SUBSCRIBE启动的对话框,在应用程序看到对NOTIFY请求的200 OK之前,应用程序不得推送用户界面组件。对于UAS上的用户界面组件,应用程序在看到来自UAS的对话框创建响应之前,不得为INVITE启动的对话框推送用户界面组件。对于SUBSCRIBE启动的对话框,在看到来自通知方的NOTIFY请求之前,它不得推送用户界面组件。
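The MUST NOT rules above amount to a small decision table keyed by the role of the target user agent and the method that created the dialog. The sketch below encodes that table; the event labels such as "ack_for_2xx" are names invented for this example, and how an application actually observes these messages depends on the topology.

```python
# Evidence that must have been observed before a push is permitted.
# Labels are hypothetical names for the messages named in the text.
REQUIRED_EVIDENCE = {
    ("uac", "invite"): {"ack_for_2xx", "prack"},
    ("uac", "subscribe"): {"200_for_notify"},
    ("uas", "invite"): {"dialog_creating_response"},
    ("uas", "subscribe"): {"notify_request"},
}


def may_push(role, dialog_method, seen, capabilities_confirmed):
    """Return True if a UI component may be pushed to the user agent.

    role: "uac" or "uas" (the user agent that would host the component).
    dialog_method: "invite" or "subscribe" (what created the dialog).
    seen: iterable of evidence labels observed so far.
    """
    if not capabilities_confirmed:
        return False
    # Any one of the listed messages is enough to confirm the dialog.
    return bool(REQUIRED_EVIDENCE[(role, dialog_method)] & set(seen))
```

For instance, a push towards a UAC on an INVITE-initiated dialog is allowed once either the ACK for a 200 OK or a PRACK has been seen, and not before.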
To create a presentation-capable UI component on the UA, the application sends a REFER request to the UA. This REFER MUST be sent to the GRUU [9] advertised by that UA in the Contact header field of the dialog-initiating request or response sent by that UA. Note that this REFER request creates a separate dialog between the application and the UA. The Refer-To header field of the REFER request MUST contain an HTTP URI that references the markup document to be fetched.
要在UA上创建支持演示的UI组件,应用程序将向UA发送REFER请求。此REFER必须发送到该UA在其发送的对话框发起请求或响应的Contact标头字段中公布的GRUU[9]。请注意,此REFER请求在应用程序和UA之间创建了一个单独的对话框。REFER请求的Refer-To标头字段必须包含引用要获取的标记文档的HTTP URI。
Furthermore, it is essential for the REFER request to be correlated with the dialog to which the user interface component will be associated. This is necessary for authorization and for terminating the user interface components when the dialog terminates. To provide this context, the REFER request MUST contain a Target-Dialog header field identifying the dialog with which the user interface component is associated. As discussed in [10], this request will also contain a Require header field with the tdialog option tag.
此外,REFER请求必须与用户界面组件将要关联的对话框相关联。这对于授权和在对话框终止时终止用户界面组件是必需的。要提供此上下文,REFER请求必须包含一个目标对话框标题字段,标识与用户界面组件关联的对话框。如[10]所述,该请求还将包含一个带有tdialog选项标签的Require头字段。
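The header fields that matter for a presentation-capable push can be sketched as below. This is an incomplete message built for illustration: Via, Call-ID, CSeq, Max-Forwards, and Contact are omitted, the From tag is a placeholder, and every URI is hypothetical. The Target-Dialog value follows the call-id;local-tag=...;remote-tag=... form defined in [10]; that specification should be consulted for which side's tag is "local".

```python
def build_refer(ua_gruu, app_uri, app_name, markup_url,
                call_id, local_tag, remote_tag):
    """Assemble the framework-relevant headers of a REFER (illustrative only)."""
    lines = [
        f"REFER {ua_gruu} SIP/2.0",
        # Display name identifies the application to the user.
        f'From: "{app_name}" <{app_uri}>;tag=placeholder-tag',
        f"To: <{ua_gruu}>",
        # HTTP URI of the markup document the UA should fetch.
        f"Refer-To: <{markup_url}>",
        # Ties the UI component to the dialog it belongs to.
        f"Target-Dialog: {call_id};local-tag={local_tag};remote-tag={remote_tag}",
        "Require: tdialog",
        "Content-Length: 0",
    ]
    return "\r\n".join(lines) + "\r\n\r\n"
```

Because the request is addressed to the GRUU rather than sent inside the existing dialog, the Target-Dialog header field is the only thing that lets the UA authorize the request and bind the component's lifetime to the right dialog.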
To create a presentation-free user interface component, the application sends a SUBSCRIBE request to the UA. The SUBSCRIBE MUST be sent to the GRUU advertised by the UA. This SUBSCRIBE request creates a separate dialog. The SUBSCRIBE request MUST use the KPML [8] event package. The body of the SUBSCRIBE request contains the markup document that defines the conditions under which the application wishes to be notified of user input.
要创建无演示文稿的用户界面组件,应用程序将向UA发送订阅请求。订阅必须发送到UA公布的GRUU。此订阅请求创建一个单独的对话框。订阅请求必须使用KPML[8]事件包。SUBSCRIBE请求的主体包含标记文档,该文档定义了应用程序希望收到用户输入通知的条件。
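A presentation-free push can be sketched in the same spirit: a KPML request document is built and placed in the body of a SUBSCRIBE for the kpml event package. The namespace, the persist attribute, and the Event parameter names are taken from the KPML specification [8] as best understood here, and the exact quoting of the Event parameters is an assumption that should be checked against that specification. As before, mandatory SIP headers such as Via, Call-ID, and CSeq are omitted and all URIs are hypothetical.

```python
import xml.etree.ElementTree as ET

KPML_NS = "urn:ietf:params:xml:ns:kpml-request"


def build_kpml_request(regex, persist="one-shot"):
    """Build a minimal KPML request document with a single pattern."""
    root = ET.Element("kpml-request", {"xmlns": KPML_NS, "version": "1.0"})
    pattern = ET.SubElement(root, "pattern", {"persist": persist})
    ET.SubElement(pattern, "regex").text = regex
    return ET.tostring(root, encoding="unicode")


def build_subscribe(ua_gruu, app_uri, call_id, local_tag, remote_tag,
                    body, expires=3600):
    """Assemble the framework-relevant headers of a SUBSCRIBE (illustrative)."""
    lines = [
        f"SUBSCRIBE {ua_gruu} SIP/2.0",
        f"From: <{app_uri}>;tag=placeholder-tag",
        f"To: <{ua_gruu}>",
        # The Event parameters identify the monitored dialog, which is why
        # no Target-Dialog header field is needed here.
        f'Event: kpml;call-id="{call_id}";remote-tag="{remote_tag}"'
        f';local-tag="{local_tag}"',
        f"Expires: {expires}",
        "Content-Type: application/kpml-request+xml",
        f"Content-Length: {len(body.encode('utf-8'))}",
    ]
    return "\r\n".join(lines) + "\r\n\r\n" + body
```

Calling build_subscribe with expires=0 yields the kind of request an application would use to end the subscription and thereby remove the component early.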
In both cases, the REFER or SUBSCRIBE request SHOULD include a display name in the From header field that identifies the name of the application. For example, a prepaid calling card application might include a From header field that looks like:
在这两种情况下,REFER或SUBSCRIBE请求都应该在From头字段中包含一个显示名称,用于标识应用程序的名称。例如,预付费电话卡应用程序可能包含一个From头字段,如下所示:
From: "Prepaid Calling Card" <sip:prepaid@example.com>
Any of the SIP identity assertion mechanisms that have been defined, such as [11] and [13], are applicable to these requests as well.
已定义的任何SIP标识断言机制(如[11]和[13])也适用于这些请求。
Once a user interface component has been created on a client, it can be updated. The means for updating it depends on the type of UI component.
一旦在客户端上创建了用户界面组件,就可以对其进行更新。更新它的方法取决于UI组件的类型。
Presentation-capable UI components are updated using techniques already in place for those markups. In particular, user input will cause an HTTP POST operation to push the user input to the application. The result of the POST operation is a new markup that the UI is supposed to use. This allows the UI to be updated in response to user action. Some markups, such as HTML, provide the ability to force a refresh after a certain period of time, so that the UI can be updated without user input. Those mechanisms can be used here as well. However, there is no support for an asynchronous push of an updated UI component from the application to the user agent. A new REFER request to the same GRUU would create a new UI component rather than update any components already in place.
支持演示文稿的UI组件可以使用这些标记已有的技术进行更新。特别是,用户输入将导致HTTP POST操作将用户输入推送到应用程序。POST操作的结果是UI应该使用的新标记。这允许用户界面根据用户操作进行更新。一些标记(如HTML)提供了在一段时间后强制刷新的能力,因此用户界面可以在无需用户输入的情况下进行更新。这些机制也可以在这里使用。但是,不支持将更新的UI组件从应用程序异步推送到用户代理。对同一GRUU的新REFER请求将创建一个新的UI组件,而不是更新任何已经就位的组件。
For presentation-free UI, the story is different. The application MAY update the filter at any time by generating a SUBSCRIBE refresh with the new filter. The UA will immediately begin using this new filter.
对于无演示的UI,情况就不同了。应用程序可以随时通过使用新过滤器生成订阅刷新来更新过滤器。UA将立即开始使用此新过滤器。
User interface components have a well-defined lifetime. They are created when the component is first pushed to the client. User interface components are always associated with the SIP dialog on which they were pushed. As such, their lifetime is bound by the lifetime of the dialog. When the dialog ends, so does the interface component.
用户界面组件具有定义良好的生命周期。它们是在组件首次推送到客户端时创建的。用户界面组件始终与推送它们的SIP对话框相关联。因此,它们的生存期受对话框的生存期约束。对话框结束时,接口组件也会结束。
However, there are some cases where the application would like to terminate the user interface component before its natural termination point. For presentation-capable user interfaces, this is not possible. For presentation-free user interfaces, the application MAY terminate the component by sending a SUBSCRIBE with Expires equal to zero. This terminates the subscription, which removes the UI component.
但是,在某些情况下,应用程序希望在其自然终止点之前终止用户界面组件。对于支持演示的用户界面,这是不可能的。对于无表示的用户界面,应用程序可以通过发送过期时间等于零的订阅来终止组件。这将终止订阅,从而删除UI组件。
A client can remove a UI component at any time. For presentation-capable UI, this is analogous to the user dismissing the web form window. There is no mechanism provided for reporting this kind of event to the application. The application MUST be prepared to time out and never receive input from a user. The duration of this timeout is application dependent. For presentation-free user interfaces, the UA can explicitly terminate the subscription. This will result in the generation of a NOTIFY with a Subscription-State header field equal to "terminated".
客户端可以随时删除UI组件。对于支持表示的UI,这类似于用户关闭web表单窗口。没有提供向应用程序报告此类事件的机制。应用程序必须准备好超时,并且永远不会收到用户的输入。此超时的持续时间取决于应用程序。对于无表示的用户界面,UA可以显式终止订阅。这将导致生成Subscription-State标头字段等于“terminated”的NOTIFY。
As an alternative to, or in conjunction with client-local user interfaces, an application can make use of client-remote user interfaces. These user interfaces can execute co-resident with the application itself (in which case no standardized interfaces between the UI and the application need to be used), or they can run separately. This framework assumes that the user interface runs on a host that has a sufficient trust relationship with the application. As such, the means for instantiating the user interface is not considered here.
作为客户端本地用户界面的替代方案,或与客户端本地用户界面结合使用,应用程序可以使用客户端远程用户界面。这些用户界面可以与应用程序本身共存执行(在这种情况下,不需要使用UI和应用程序之间的标准化界面),也可以单独运行。此框架假设用户界面在与应用程序具有足够信任关系的主机上运行。因此,这里不考虑用于实例化用户界面的方法。
The primary issue is to connect the user device to the remote user interface. Doing so requires the manipulation of media streams between the client and the user interface. Such manipulation can only be done by user agents. There are two types of user agent applications within this framework: originating/terminating applications, and intermediary applications.
主要问题是将用户设备连接到远程用户界面。这样做需要在客户端和用户界面之间操纵媒体流。这种操作只能由用户代理完成。此框架中有两种类型的用户代理应用程序:发起/终止应用程序和中间应用程序。
Originating and terminating applications are applications that are themselves the originator or the final recipient of a SIP invitation. They are "pure" user agent applications, not back-to-back user agents. The classic example of such an application is an interactive voice response (IVR) application, which is typically a terminating application. It is a terminating application because the user explicitly calls it; i.e., it is the actual called party. An example of an originating application is a wakeup call application, which calls a user at a specified time in order to wake them up.
发起和终止应用程序是本身就是SIP邀请的发起方或最终接收方的应用程序。它们是“纯”用户代理应用程序,而不是背靠背用户代理。此类应用程序的经典示例是交互式语音应答(IVR)应用程序,它通常是终止应用程序。它之所以是终止应用程序,是因为用户显式地呼叫它;即,它是实际的被叫方。发起应用程序的一个示例是叫醒电话应用程序,它在指定时间呼叫用户以将其叫醒。
Because originating and terminating applications are a natural termination point of the dialog, manipulation of the media session by the application is trivial. Traditional SIP techniques for adding and removing media streams, modifying codecs, and changing the address of the recipient of the media streams can be applied.
由于发起和终止应用程序是对话框的自然终止点,因此应用程序对媒体会话的操作非常简单。可以应用用于添加和删除媒体流、修改编解码器和更改媒体流接收者地址的传统SIP技术。
Intermediary applications are, at the same time, more common than originating/terminating applications and more complex. Intermediary applications are applications that are neither the actual caller nor the called party. Rather, they represent a "third party" that wishes to interact with the user. The classic example is the ubiquitous prepaid calling card application.
同时,中间应用程序比原始/终止应用程序更常见,也更复杂。中间应用程序是既不是实际调用方也不是被调用方的应用程序。相反,它们代表希望与用户交互的“第三方”。典型的例子是无处不在的预付费电话卡应用程序。
In order for the intermediary application to add a client-remote user interface, it needs to manipulate the media streams of the user agent to terminate on that user interface. This also introduces a fundamental feature interaction issue. Since the intermediary application is not an actual participant in the call, the user will need to interact with both the intermediary application and its peer in the dialog. Doing both at the same time is complicated and is discussed in more detail in Section 10.
为了让中间应用程序添加客户端远程用户界面,它需要操纵用户代理的媒体流以在该用户界面上终止。这还引入了一个基本的特性交互问题。由于中间应用程序不是调用的实际参与者,因此用户需要在对话框中与中间应用程序及其对等方进行交互。同时做这两件事很复杂,第10节将对此进行更详细的讨论。
In order to participate in applications that make use of stimulus interfaces, a user agent needs to advertise its interaction capabilities.
为了参与使用刺激界面的应用程序,用户代理需要宣传其交互功能。
If a user agent supports presentation-capable user interfaces, it MUST support the REFER method. It MUST include, in all dialog-initiating requests and responses, an Allow header field that includes the REFER method. The user agent MUST support the target dialog specification [10], and MUST include the "tdialog" option tag in the Supported header field of dialog-forming requests and responses. Furthermore, the UA MUST support the SIP user agent capabilities specification [6]. The UA MUST be capable of being REFERed to an HTTP URI. It MUST include, in the Contact header field of its dialog-initiating requests and responses, a "schemes" Contact header field parameter that includes the HTTP URI scheme. The UA MUST include, in all dialog-initiating requests and responses, an Accept header field listing all of those markups supported by the UA. It is RECOMMENDED that all user agents that support presentation-capable user interfaces support HTML.
如果用户代理支持具有演示功能的用户界面,则它必须支持REFER方法。在所有发起对话的请求和响应中,它必须包含一个列出REFER方法的Allow头字段。用户代理必须支持目标对话规范[10],并且必须在形成对话的请求和响应的Supported头字段中包含“tdialog”选项标签。此外,UA必须支持SIP用户代理能力规范[6]。UA必须能够被REFER到HTTP URI。它必须在其发起对话的请求和响应的Contact头字段中包含一个“schemes”Contact头字段参数,其中包含HTTP URI方案。在所有发起对话的请求和响应中,UA必须包含一个Accept头字段,列出UA支持的所有标记。建议所有支持具有演示功能的用户界面的用户代理都支持HTML。
If a user agent supports presentation-free user interfaces, it MUST support the SUBSCRIBE [4] method. It MUST support the KPML [8] event package. It MUST include, in all dialog-initiating requests and responses, an Allow header field that includes the SUBSCRIBE method. It MUST include, in all dialog-initiating requests and responses, an Allow-Events header field that lists the KPML event package. The UA MUST include, in all dialog-initiating requests and responses, an Accept header field listing those event filters it supports. At a minimum, a UA MUST support the "application/kpml-request+xml" MIME type.
如果用户代理支持无演示的用户界面,则它必须支持SUBSCRIBE[4]方法。它必须支持KPML[8]事件包。在所有发起对话的请求和响应中,它必须包含一个列出SUBSCRIBE方法的Allow头字段。在所有发起对话的请求和响应中,它必须包含一个列出KPML事件包的Allow-Events头字段。在所有发起对话的请求和响应中,UA必须包含一个Accept头字段,列出它支持的事件过滤器。UA至少必须支持“application/kpml-request+xml”MIME类型。
For either presentation-free or presentation-capable user interfaces, the user agent MUST support the GRUU [9] specification. The Contact header field in all dialog-initiating requests and responses MUST contain a GRUU. The UA MUST include a Supported header field that contains the "gruu" option tag and the "tdialog" option tag.
对于无演示或支持演示的用户界面,用户代理必须支持GRUU[9]规范。发起请求和响应的所有对话框中的联系人标头字段必须包含GRUU。UA必须包含一个支持的标题字段,该字段包含“gruu”选项标签和“tdialog”选项标签。
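The advertisement rules above can be sketched in Python as follows. This is only an illustration: the function name `capability_headers`, its parameters, and the base Allow list (borrowed from the example INVITE later in this document) are assumptions, and a real UA would merge these values with the header fields it already sends.

```python
def capability_headers(gruu, presentation_capable, presentation_free,
                       markups=("text/html",)):
    """Return header fields for dialog-initiating requests and responses.

    Hypothetical helper: maps header field name to value. `gruu` is the
    UA's GRUU; `markups` lists the MIME types of supported markups.
    """
    if not (presentation_capable or presentation_free):
        return {}  # UA does not take part in the framework

    allow = ["INVITE", "OPTIONS", "BYE", "CANCEL", "ACK"]  # assumed base set
    accept = ["application/sdp"]
    # Both UI types require GRUU and target-dialog support.
    headers = {"Supported": "gruu, tdialog"}
    contact = f"<{gruu}>"

    if presentation_capable:
        allow.append("REFER")
        accept.extend(markups)
        # "schemes" Contact parameter advertising the HTTP URI scheme.
        contact += ';schemes="http,sip"'
    if presentation_free:
        allow.append("SUBSCRIBE")
        accept.append("application/kpml-request+xml")
        headers["Allow-Events"] = "kpml"

    headers["Allow"] = ", ".join(allow)
    headers["Accept"] = ", ".join(accept)
    headers["Contact"] = contact
    return headers
```

Calling `capability_headers(gruu, True, False)` yields the Supported, Allow, Accept, and Contact values that a presentation-capable UA would place in its INVITE.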
Because these headers are examined by proxies that may be executing applications, a UA that wishes to support client-local user interfaces should not encrypt them.
因为这些头由可能正在执行应用程序的代理进行检查,所以希望支持客户端本地用户界面的UA不应该对它们进行加密。
Once the UA has created a dialog (in either the early or confirmed states), it MUST be prepared to receive a SUBSCRIBE or REFER request against its GRUU. If the UA receives such a request prior to the establishment of a dialog, the UA MUST reject the request.
一旦UA创建了一个对话框(处于早期或已确认状态),它就必须准备接收针对其GRUU的订阅或引用请求。如果UA在建立对话之前收到此类请求,则UA必须拒绝该请求。
A user agent SHOULD attempt to authenticate the sender of the request. The sender will generally be an application; therefore, the user agent is unlikely to ever have a shared secret with it, making digest authentication useless. However, authenticated identities can be obtained through other means, such as the Identity mechanism [11].
用户代理应尝试验证请求的发件人。发送方通常是一个应用程序;因此,用户代理不太可能拥有与之共享的秘密,这使得摘要身份验证毫无用处。然而,可通过其他方式获得认证身份,例如身份机制[11]。
A user agent MAY have pre-defined authorization policies that permit applications which have authenticated themselves with a particular identity to push user interface components. If such a set of policies is present, it is checked first. If the application is authorized, processing proceeds.
用户代理可以具有预定义的授权策略,该策略允许使用特定身份对自己进行身份验证的应用程序推送用户界面组件。如果存在这样一组策略,则首先检查它。如果申请获得授权,则继续处理。
If the application has authenticated itself but is not explicitly authorized or blocked, this specification RECOMMENDS that the application be automatically authorized if it can prove that it was either on the call path, or is trusted by one of the elements on the call path. An application proves this to the user agent by demonstrating that it knows the dialog identifiers. That occurs by including them in a Target-Dialog header field for REFER requests, or in the Event header field parameters of the KPML SUBSCRIBE request.
如果应用程序已进行自身身份验证,但未明确授权或阻止,则本规范建议,如果应用程序能够证明其位于调用路径上,或者被调用路径上的某个元素信任,则应自动授权应用程序。应用程序通过证明它知道对话框标识符向用户代理证明了这一点。这是通过将它们包含在REFER请求的目标对话框标题字段中,或KPML SUBSCRIBE请求的事件标题字段参数中来实现的。
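To compare the presented identifiers against its own dialogs, a UA first has to extract them from the request. The following Python sketch does this for a Target-Dialog header field value. The function name `parse_target_dialog` is an assumption, the sample value is taken from the REFER in the call-flow example of this document, and the parsing is deliberately simplified (no quoted strings or escaping).

```python
def parse_target_dialog(value):
    """Split a Target-Dialog value into (call_id, local_tag, remote_tag).

    Simplified, hypothetical parser: the Call-ID comes first, followed by
    ';name=value' parameters. Missing tags are returned as None.
    """
    parts = [p.strip() for p in value.split(";")]
    call_id = parts[0]
    params = {}
    for part in parts[1:]:
        if not part:
            continue
        name, _, val = part.partition("=")
        params[name.strip().lower()] = val.strip()
    return call_id, params.get("local-tag"), params.get("remote-tag")
```

The tags are expressed from the point of view of the UA receiving the request, so `local-tag` is compared with the UA's own tag in the dialog.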
Because the dialog identifiers serve as a tool for authorization, a user agent compliant to this framework SHOULD use dialog identifiers that are cryptographically random, with at least 128 bits of randomness. It is recommended that this randomness be split between the Call-ID and From header field tags in the case of a UAC.
由于对话框标识符用作授权工具,因此符合此框架的用户代理应使用加密随机的对话框标识符,至少具有128位随机性。在UAC的情况下,建议将这种随机性在呼叫ID和来自报头字段标记之间进行划分。
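One way a UAC could meet the 128-bit recommendation is sketched below in Python. The even 64/64 split between the Call-ID and the From tag, the function name `new_dialog_identifiers`, and the `host` parameter are illustrative assumptions; the text only asks that the randomness be split between the two fields.

```python
import secrets


def new_dialog_identifiers(host):
    """Return (call_id, from_tag) with 64 random bits each, 128 in total.

    secrets.token_hex(8) draws 8 bytes from the operating system's
    cryptographically secure generator and encodes them as 16 hex digits.
    """
    call_id = f"{secrets.token_hex(8)}@{host}"
    from_tag = secrets.token_hex(8)
    return call_id, from_tag
```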
Furthermore, to ensure that only applications resident in or trusted by on-path elements can instantiate a user interface component, a user agent compliant to this specification SHOULD use the Session Initiation Protocol Secure (SIPS) URI scheme for all dialogs it initiates. This will guarantee secure links between all the elements on the signaling path.
此外,为了确保只有驻留在路径元素中或受路径元素信任的应用程序才能实例化用户界面组件,符合本规范的用户代理应为其启动的所有对话框使用会话启动协议安全(SIPS)URI方案。这将保证信令路径上所有元素之间的安全链路。
If the dialog was not established with a SIPS URI, or the user agent did not choose cryptographically random dialog identifiers, then the application MUST NOT automatically be authorized, even if it presented valid dialog identifiers. A user agent MAY apply any other policies in addition to (but not instead of) the ones specified here in order to authorize the creation of the user interface component. One such mechanism would be to prompt the user, informing them of the identity of the application and the dialog it is associated with. If an authorization policy requires user interaction, the user agent SHOULD respond to the SUBSCRIBE or REFER request with a 202. In the case of SUBSCRIBE, if authorization is not granted, the user agent SHOULD generate a NOTIFY to terminate the subscription. In the case of REFER, the user agent MUST NOT act upon the URI in the Refer-To header field until user authorization is obtained.
如果对话框未使用SIPS URI建立,或者用户代理未选择加密随机对话框标识符,则应用程序不得自动授权,即使它提供了有效的对话框标识符。用户代理可以应用除此处指定的策略之外的任何其他策略(但不能代替),以授权创建用户界面组件。一种这样的机制是提示用户,通知他们应用程序的标识及其关联的对话框。如果授权策略需要用户交互,则用户代理应使用202响应订阅或引用请求。对于订阅,如果未授予授权,则用户代理应生成通知以终止订阅。在引用的情况下,在获得用户授权之前,用户代理不得对引用头字段中的URI进行操作。
If an application does not present a valid dialog identifier in its REFER or SUBSCRIBE request, the user agent MUST reject the request with a 403 response.
如果应用程序在其引用或订阅请求中没有提供有效的对话框标识符,则用户代理必须使用403响应拒绝该请求。
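The authorization rules of this section can be condensed into a small decision function. The Python sketch below is one possible reading, not a normative algorithm: all names (`Decision`, `DialogIds`, `Dialog`, `authorize_push`) are hypothetical, rejecting a blocked identity with 403 is an assumption (the text does not name a response code for that case), and falling back to prompting the user is only one of the additional policies a UA may apply.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    AUTHORIZE = "authorize"
    ASK_USER = "answer 202, then prompt the user"
    REJECT_403 = "reject with 403"


@dataclass(frozen=True)
class DialogIds:
    call_id: str
    local_tag: str   # from the UA's point of view
    remote_tag: str


@dataclass
class Dialog:
    ids: DialogIds
    uses_sips: bool    # dialog was established with a SIPS URI
    random_ids: bool   # UA chose cryptographically random identifiers


def authorize_push(dialogs, presented, identity=None, allowed=(), blocked=()):
    """Decide how to treat a REFER or SUBSCRIBE sent to the UA's GRUU.

    `presented` holds the dialog identifiers from the request (or None),
    `identity` the authenticated identity of the sender (or None).
    """
    dialog = next((d for d in dialogs if d.ids == presented), None)
    if dialog is None:
        return Decision.REJECT_403        # no valid dialog identifier
    if identity is not None:
        if identity in blocked:
            return Decision.REJECT_403    # assumption: blocked means 403
        if identity in allowed:
            return Decision.AUTHORIZE     # pre-defined policy is checked first
        if dialog.uses_sips and dialog.random_ids:
            return Decision.AUTHORIZE     # knowledge of the ids proves on-path trust
    return Decision.ASK_USER              # never authorize automatically here
```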
If a REFER request to an HTTP URI is authorized, the UA executes the URI and fetches the content to be rendered to the user. This instantiates a presentation-capable user interface component. If a SUBSCRIBE was authorized, a presentation-free user interface component is instantiated.
如果对HTTP URI的引用请求被授权,UA将执行该URI并获取要呈现给用户的内容。这将实例化一个支持表示的用户界面组件。如果已授权订阅,则将实例化无演示文稿的用户界面组件。
Once the user interface components are instantiated, the user agent must direct user input to the appropriate component. In the case of presentation-capable user interfaces, this process is known as focus selection. It is done by means that are specific to the user interface on the device. In the case of a PC, for example, the window manager would allow the user to select the appropriate user interface component to which their input is directed.
用户界面组件实例化后,用户代理必须将用户输入定向到适当的组件。对于支持演示的用户界面,此过程称为焦点选择。这是通过特定于设备上用户界面的方式完成的。例如,在PC的情况下,窗口管理器将允许用户选择其输入指向的适当用户界面组件。
For presentation-free user interfaces, the situation is more complicated. In some cases, the device may support a mechanism that allows the user to select a "line", and thus the associated dialog. Any user input on the keypad while this line is selected is fed to the user interface components associated with that dialog.
对于无表示的用户界面,情况更加复杂。在某些情况下,设备可能支持一种机制,允许用户选择一条“线”,从而选择相关的对话框。当选择此行时,键盘上的任何用户输入都会被馈送到与该对话框关联的用户界面组件。
Otherwise, for client-local user interfaces, the user input is assumed to be associated with all user interface components. For client-remote user interfaces, the user device converts the user input to media, typically conveyed using RFC 4733, and sends this to the client-remote user interface. This user interface then needs to map user input from potentially many media streams into user interface events. The process for doing this is described in Section 7.3.
否则,对于客户端本地用户界面,假定用户输入与所有用户界面组件关联。对于客户端远程用户接口,用户设备将用户输入转换为通常使用RFC 4733传送的媒体,并将其发送到客户端远程用户接口。然后,此用户界面需要将用户输入从潜在的多个媒体流映射到用户界面事件。第7.3节描述了执行此操作的过程。
For presentation-capable user interfaces, updates to the user interface occur in ways specific to that user interface component. In the case of HTML, for example, the document can tell the client to fetch a new document periodically. However, this framework does not provide any additional machinery to asynchronously push a new user interface component to the client.
对于支持表示的用户界面,用户界面的更新以特定于该用户界面组件的方式进行。例如,对于HTML,文档可以告诉客户机定期获取新文档。但是,该框架不提供任何额外的机制来将新的用户界面组件异步推送到客户端。
For presentation-free user interfaces, an application can push an update to a component by sending a SUBSCRIBE refresh with a new filter. The user agent will process these according to the rules of the event package.
对于无表示的用户界面,应用程序可以通过发送带有新过滤器的订阅刷新,将更新推送到组件。用户代理将根据事件包的规则处理这些事件。
Termination of a presentation-capable user interface component is a trivial procedure. The user agent merely dismisses the window (or its equivalent). The fact that the component is dismissed is not communicated to the application. As such, it is purely a local matter.
终止具有演示功能的用户界面组件是一个简单的过程。用户代理只会取消窗口(或其等效项)。组件被驳回的事实未告知应用程序。因此,这纯粹是一个地方问题。
In the case of a presentation-free user interface, the user might wish to cease interacting with the application. However, most presentation-free user interfaces will not have a way for the user to signal this through the device. If such a mechanism did exist, the UA SHOULD generate a NOTIFY request with a Subscription-State header field equal to "terminated" and a reason of "rejected". This tells the application that the component has been removed and that it should not attempt to re-subscribe.
对于无表示的用户界面,用户可能希望停止与应用程序的交互。然而,大多数无演示的用户界面将无法让用户通过设备发出信号。如果确实存在这种机制,UA应生成一个通知请求,其中订阅状态标头字段等于“已终止”,原因为“已拒绝”。这会告诉应用程序该组件已被删除,不应尝试重新订阅。
The inter-application feature interaction problem is inherent to stimulus signaling. Whenever there are multiple applications, there are multiple user interfaces. The system has to determine to which user interface any particular input is destined. That question is the essence of the inter-application feature interaction problem.
应用程序间特征交互问题是刺激信号固有的。只要有多个应用程序,就有多个用户界面。系统必须确定任何特定输入的目标用户界面。这个问题是应用程序间特性交互问题的本质。
Inter-application feature interaction is not an easy problem to resolve. For now, we consider separately the issues for client-local and client-remote user interface components.
应用程序间功能交互不是一个容易解决的问题。现在,我们分别考虑客户端本地和客户端远程用户接口组件的问题。
When the user interface itself resides locally on the client device, the feature interaction problem is actually much simpler. The end device knows explicitly about each application, and therefore can present the user with each one separately. When the user provides input, the client device can determine to which user interface the input is destined. The user interface to which input is destined is referred to as the "application in focus", and the means by which the focused application is selected is called "focus determination".
当用户界面本身驻留在客户端设备上时,功能交互问题实际上要简单得多。终端设备明确了解每个应用程序,因此可以分别向用户展示每个应用程序。当用户提供输入时,客户端设备可以确定输入的目标用户界面。输入目的地的用户界面称为“焦点中的应用程序”,选择焦点应用程序的方法称为“焦点确定”。
Generally speaking, focus determination is purely a local operation. In the PC universe, focus determination is provided by window managers. Each application does not know about focus; it merely receives the user input that has been targeted to it when it's in focus. This basic concept applies to SIP-based applications as well.
一般来说,焦点确定纯粹是一种局部操作。在PC世界中,焦点确定由窗口管理器提供。每个应用程序都不知道焦点;当它处于焦点时,它只接收用户输入。这一基本概念也适用于基于SIP的应用程序。
Focus determination will frequently be trivial, depending on the user interface type. Consider a user that makes a call from a PC. The call passes through a prepaid calling card application and a call-recording application. Both of these wish to interact with the user. Both push an HTML-based user interface to the user. On the PC, each user interface would appear as a separate window. The user interacts with the call-recording application by selecting its window, and with the prepaid calling card application by selecting its window. Focus determination is literally provided by the PC window manager. It is clear to which application the user input is targeted.
根据用户界面类型的不同,焦点的确定通常是琐碎的。考虑从PC发出呼叫的用户。呼叫通过预付费呼叫卡应用程序和呼叫记录应用程序。两者都希望与用户交互。两者都向用户推送基于HTML的用户界面。在PC上,每个用户界面都将显示为一个单独的窗口。用户通过选择其窗口与呼叫记录应用程序交互,并通过选择其窗口与预付费电话卡应用程序交互。焦点确定实际上是由PC窗口管理器提供的。很明显,用户输入的目标是哪个应用程序。
As another example, consider the same two applications, but on a "smart phone" that has a set of buttons, and next to each button, there is an LCD display that can provide the user with an option. This user interface can be represented using the Wireless Markup Language (WML), for example.
作为另一个例子,考虑相同的两个应用程序,但是在具有一组按钮的“智能电话”上,并且在每个按钮的旁边,有一个LCD显示器,可以为用户提供一个选项。例如,可以使用无线标记语言(WML)表示此用户界面。
The phone would allocate some number of buttons to each application. The prepaid calling card would get one button for its "hangup" command, and the recording application would get one for its "start/ stop" command. The user can easily determine which application to interact with by pressing the appropriate button. Pressing a button determines focus and provides user input, both at the same time.
手机会为每个应用程序分配一些按钮。预付费电话卡的“挂断”命令有一个按钮,录音应用程序的“启动/停止”命令有一个按钮。用户可以通过按下相应的按钮轻松确定要与哪个应用程序交互。按下按钮确定焦点并同时提供用户输入。
Unfortunately, not all devices will have these advanced displays. A PSTN gateway, or a basic IP telephone, may only have a 12-key keypad. The user interfaces for these devices are provided through the Keypad Markup Language (KPML). Considering once again the feature interaction case above, the prepaid calling card application and the call-recording application would both pass a KPML document to the device. When the user presses a button on the keypad, to which document does the input apply? The device does not allow the user to select. A device where the user cannot provide focus is called a "focusless device". This is quite a hard problem to solve. This framework does not make any explicit normative recommendation, but it concludes that the best option is to send the input to both user interfaces unless the markup in one interface has indicated that it should be suppressed from others. This is a sensible choice by analogy -- it's exactly what the existing circuit-switched telephone network will do. It is an explicit non-goal to provide a better mechanism for feature interaction resolution than the PSTN on devices that have the same user interface as they do on the PSTN. Devices with better displays, such as PCs or screen phones, can benefit from the capabilities of this framework, allowing the user to determine which application they are interacting with.
不幸的是,并不是所有的设备都有这些高级显示器。PSTN网关或基本IP电话可能只有一个12键键盘。这些设备的用户界面通过键盘标记语言(KPML)提供。再次考虑上述功能交互情况,预付费电话卡应用程序和通话记录应用程序都将向设备传递KPML文档。当用户按下键盘上的按钮时,输入应用于哪个文档?设备不允许用户选择。用户无法提供焦点的设备称为“无焦点设备”。这是一个很难解决的问题。该框架没有提出任何明确的规范性建议,但它得出结论,最好的选择是将输入发送到两个用户界面,除非一个界面中的标记表明应该禁止其他界面使用它。通过类比,这是一个明智的选择:这正是现有电路交换电话网络所做的。在具有与PSTN相同用户界面的设备上,提供比PSTN更好的功能交互解决机制是一个明确的非目标。具有更好显示的设备(如PC或屏幕电话)可以受益于此框架的功能,使用户可以确定与哪个应用程序交互。
Indeed, when a user provides input on a focusless device, the input must be passed to all client-local user interfaces AND all client-remote user interfaces, unless the markup tells the UI to suppress the media. In the case of KPML, key events are passed to remote user interfaces by encoding them as described in RFC 4733 [19]. Of course, since a client cannot determine whether or not a media stream terminates in a remote user interface, these key events are passed in all audio media streams unless the KPML request document is used to suppress them.
事实上,当用户在无焦点设备上提供输入时,输入必须传递到所有客户端本地用户界面和所有客户端远程用户界面,除非标记告诉UI禁止媒体。在KPML的情况下,通过按照RFC 4733[19]中所述对关键事件进行编码,将关键事件传递给远程用户界面。当然,由于客户端无法确定媒体流是否在远程用户界面中终止,因此这些关键事件将在所有音频媒体流中传递,除非使用KPML请求文档来抑制它们。
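The behavior of a focusless device can be sketched as follows in Python. This is a simplified model under stated assumptions: the classes `LocalUI` and `AudioStream`, the boolean `suppress_media` flag (standing in for whatever suppression request the markup carries), and the function `dispatch_key` are all hypothetical, and appending to a list stands in for sending an RFC 4733 event.

```python
from dataclasses import dataclass, field


@dataclass
class LocalUI:
    name: str
    suppress_media: bool = False            # set when the markup asks for suppression
    received: list = field(default_factory=list)


@dataclass
class AudioStream:
    name: str
    sent: list = field(default_factory=list)


def dispatch_key(key, local_uis, audio_streams):
    """Deliver one key press on a device that cannot determine focus."""
    # Every client-local user interface sees the input.
    for ui in local_uis:
        ui.received.append(key)
    # If any local markup requested suppression, keep the key out of the media.
    if any(ui.suppress_media for ui in local_uis):
        return
    # The client cannot tell which streams end in a remote UI, so the key
    # goes into all audio streams (placeholder for an RFC 4733 event).
    for stream in audio_streams:
        stream.sent.append(key)
```

With two local components and one audio stream, a key press reaches all three unless one component has set `suppress_media`, in which case only the local components receive it.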
When the user interfaces run remotely, the determination of focus can be much, much harder. There are many architectures that can be deployed to handle the interaction. None are ideal. However, all are beyond the scope of this specification.
当用户界面远程运行时,确定焦点会变得越来越困难。可以部署许多体系结构来处理交互。没有一个是理想的。然而,所有这些都超出了本规范的范围。
An application can instantiate a multiplicity of user interface components. For example, a single application can instantiate two separate HTML components and one WML component. Furthermore, an application can instantiate both client-local and client-remote user interfaces.
应用程序可以实例化多个用户界面组件。例如,一个应用程序可以实例化两个单独的HTML组件和一个WML组件。此外,应用程序可以实例化客户端本地和客户端远程用户界面。
The feature interaction issues between these components within the same application are less severe. If an application has multiple client user interface components, their interaction is resolved identically to the inter-application case -- through focus determination. However, the problems in focusless user devices (such as a keypad on a telephone) generally won't exist, since the application can generate user interfaces that do not overlap in their usage of an input.
同一应用程序中这些组件之间的功能交互问题不那么严重。如果一个应用程序有多个客户端用户界面组件,那么它们的交互将以与应用程序间情况相同的方式解决,即通过焦点确定。然而,无焦点用户设备(例如电话上的键盘)中的问题通常不会存在,因为应用程序可以生成在使用输入时不会重叠的用户界面。
The real issue is that the optimal user experience frequently requires some kind of coupling between the differing user interface components. This is a classic problem in multi-modal user interfaces, such as those described by Speech Application Language Tags (SALT). As an example, consider a user interface where a user can either press a labeled button to make a selection, or listen to a prompt, and speak the desired selection. Ideally, when the user presses the button, the prompt should cease immediately, since both of them were targeted at collecting the same information in parallel. Such interactions are best handled by markups that natively support such interactions, such as SALT, and thus require no explicit support from this framework.
真正的问题是,最佳用户体验通常需要不同用户界面组件之间的某种耦合。这是多模式用户界面中的一个经典问题,例如语音应用程序语言标记(SALT)所描述的用户界面。举个例子,考虑一个用户界面,用户可以按一个标记的按钮进行选择,或者听一个提示,并说出所需的选择。理想情况下,当用户按下按钮时,提示应该立即停止,因为这两个按钮的目标都是并行收集相同的信息。此类交互最好由本机支持此类交互的标记(如SALT)处理,因此不需要此框架的明确支持。
This section shows the operation of a call-recording application. This application allows a user to record the media in their call by clicking on a button in a web form. The application uses a presentation-capable user interface component that is pushed to the caller. The conventions of [17] are used to describe representation of long message lines.
本节显示呼叫记录应用程序的操作。此应用程序允许用户通过单击web表单中的按钮来录制通话中的媒体。应用程序使用推送到调用方的支持表示的用户界面组件。[17]中的约定用于描述长消息行的表示。
             A                  Recording App                  B
             |(1) INVITE              |                        |
             |----------------------->|                        |
             |                        |(2) INVITE              |
             |                        |----------------------->|
             |                        |(3) 200 OK              |
             |                        |<-----------------------|
             |(4) 200 OK              |                        |
             |<-----------------------|                        |
             |(5) ACK                 |                        |
             |----------------------->|                        |
             |                        |(6) ACK                 |
             |                        |----------------------->|
             |(7) REFER               |                        |
             |<-----------------------|                        |
             |(8) 200 OK              |                        |
             |----------------------->|                        |
             |(9) NOTIFY              |                        |
             |----------------------->|                        |
             |(10) 200 OK             |                        |
             |<-----------------------|                        |
             |(11) HTTP GET           |                        |
             |----------------------->|                        |
             |(12) 200 OK             |                        |
             |<-----------------------|                        |
             |(13) NOTIFY             |                        |
             |----------------------->|                        |
             |(14) 200 OK             |                        |
             |<-----------------------|                        |
             |(15) HTTP POST          |                        |
             |----------------------->|                        |
             |(16) 200 OK             |                        |
             |<-----------------------|                        |
Figure 6
图6
First, the caller, A, sends an INVITE to set up a call (message 1). Since the caller supports the framework and can handle presentation-capable user interface components, it includes the Supported header field indicating that the GRUU extension and the Target-Dialog header field are understood, the Allow header field indicating that REFER is understood, and the Contact header field that includes the "schemes" header field parameter.
首先,呼叫者A发送一个邀请以建立呼叫(消息1)。由于调用者支持框架并可以处理支持演示的用户界面组件,因此它包括表示GRUU扩展和目标对话框标题字段已被理解的受支持标题字段、表示REFERE已被理解的允许标题字段以及包括“方案”的联系人标题字段标题字段参数。
INVITE sip:B@example.com SIP/2.0
Via: SIP/2.0/TLS host.example.com;branch=z9hG4bK9zz8
From: Caller <sip:A@example.com>;tag=kkaz-
To: Callee <sip:B@example.org>
Call-ID: fa77as7dad8-sd98ajzz@host.example.com
CSeq: 1 INVITE
Max-Forwards: 70
Supported: gruu, tdialog
Allow: INVITE, OPTIONS, BYE, CANCEL, ACK, REFER
Accept: application/sdp, text/html
<allOneLine>
Contact: <sip:A@example.com;gr=urn:uuid:f81d4fae
-7dec-11d0-a765-00a0c91e6bf6>;schemes="http,sip"
</allOneLine>
Content-Length: ...
Content-Type: application/sdp
--SDP not shown--
--未显示SDP--
The proxy acts as a recording server, and forwards the INVITE to the called party (message 2). It strips the Record-Route it would normally insert due to the presence of the GRUU in the INVITE:
代理充当记录服务器,并将邀请转发给被叫方(消息2)。由于邀请中存在GRUU,它会剥离通常插入的记录路由:
INVITE sip:B@pc.example.com SIP/2.0
Via: SIP/2.0/TLS app.example.com;branch=z9hG4bK97sh
Via: SIP/2.0/TLS host.example.com;branch=z9hG4bK9zz8
From: Caller <sip:A@example.com>;tag=kkaz-
To: Callee <sip:B@example.org>
Call-ID: fa77as7dad8-sd98ajzz@host.example.com
CSeq: 1 INVITE
Max-Forwards: 70
Supported: gruu, tdialog
Allow: INVITE, OPTIONS, BYE, CANCEL, ACK, REFER
Accept: application/sdp, text/html
<allOneLine>
Contact: <sip:A@example.com;gr=urn:uuid:f81d4fae
-7dec-11d0-a765-00a0c91e6bf6>;schemes="http,sip"
</allOneLine>
Content-Length: ...
Content-Type: application/sdp
--SDP not shown--
--未显示SDP--
B accepts the call with a 200 OK (message 3). It does not support the framework, so the various header fields are not present.
B以200 OK(信息3)接受呼叫。它不支持框架,因此各种标题字段不存在。
SIP/2.0 200 OK
Via: SIP/2.0/TLS app.example.com;branch=z9hG4bK97sh
Via: SIP/2.0/TLS host.example.com;branch=z9hG4bK9zz8
From: Caller <sip:A@example.com>;tag=kkaz-
To: Callee <sip:B@example.com>;tag=7777
Call-ID: fa77as7dad8-sd98ajzz@host.example.com
CSeq: 1 INVITE
Contact: <sip:B@pc.example.com>
Content-Length: ...
Content-Type: application/sdp
--SDP not shown--
--未显示SDP--
This 200 OK is passed back to the caller (message 4):
此200 OK被传回调用方(消息4):
SIP/2.0 200 OK
Record-Route: <sip:app.example.com;lr>
Via: SIP/2.0/TLS host.example.com;branch=z9hG4bK9zz8
From: Caller <sip:A@example.com>;tag=kkaz-
To: Callee <sip:B@example.com>;tag=7777
Call-ID: fa77as7dad8-sd98ajzz@host.example.com
CSeq: 1 INVITE
Contact: <sip:B@pc.example.com>
Content-Length: ...
Content-Type: application/sdp
--SDP not shown--
--未显示SDP--
The caller generates an ACK (message 5).
调用者生成一个ACK(消息5)。
ACK sip:B@pc.example.com
Route: <sip:app.example.com;lr>
Via: SIP/2.0/TLS host.example.com;branch=z9hG4bK9zz9
From: Caller <sip:A@example.com>;tag=kkaz-
To: Callee <sip:B@example.com>;tag=7777
Call-ID: fa77as7dad8-sd98ajzz@host.example.com
CSeq: 1 ACK
The ACK is forwarded to the called party (message 6).
ACK被转发给被叫方(消息6)。
ACK sip:B@pc.example.com
Via: SIP/2.0/TLS app.example.com;branch=z9hG4bKh7s
Via: SIP/2.0/TLS host.example.com;branch=z9hG4bK9zz9
From: Caller <sip:A@example.com>;tag=kkaz-
To: Callee <sip:B@example.com>;tag=7777
Call-ID: fa77as7dad8-sd98ajzz@host.example.com
CSeq: 1 ACK
Now, the application decides to push a user interface component to user A. So, it sends it a REFER request (message 7):
现在,应用程序决定将用户界面组件推送到用户a。因此,它向用户a发送一个引用请求(消息7):
<allOneLine>
REFER sip:A@example.com;gr=urn:uuid:f81d4fae
-7dec-11d0-a765-00a0c91e6bf6 SIP/2.0
</allOneLine>
Refer-To: https://app.example.com/script.pl
Target-Dialog: fa77as7dad8-sd98ajzz@host.example.com
  ;remote-tag=7777;local-tag=kkaz-
Require: tdialog
Via: SIP/2.0/TLS app.example.com;branch=z9hG4bK9zh6
Max-Forwards: 70
From: Recorder Application <sip:app.example.com>;tag=jhgf
<allOneLine>
To: Caller <sip:A@example.com;gr=urn:uuid:f81d4fae
-7dec-11d0-a765-00a0c91e6bf6>
</allOneLine>
Require: tdialog
Allow: INVITE, OPTIONS, BYE, CANCEL, ACK, REFER
Call-ID: 66676776767@app.example.com
CSeq: 1 REFER
Event: refer
Contact: <sip:app.example.com>
Since the recording application is the same as the authoritative proxy for the domain, it resolves the Request URI to the registered contact of A, and the request is then sent there. The REFER is answered by a 200 OK (message 8).
由于录制应用程序与该域的权威代理相同,因此它将请求URI解析为A的已注册联系人,然后将请求发送到该联系人。REFER以200 OK应答(消息8)。
SIP/2.0 200 OK
Via: SIP/2.0/TLS app.example.com;branch=z9hG4bK9zh6
From: Recorder Application <sip:app.example.com>;tag=jhgf
To: Caller <sip:A@example.com>;tag=pqoew
Call-ID: 66676776767@app.example.com
Supported: gruu, tdialog
Allow: INVITE, OPTIONS, BYE, CANCEL, ACK, REFER
<allOneLine>
Contact: <sip:A@example.com;gr=urn:uuid:f81d4fae
-7dec-11d0-a765-00a0c91e6bf6>;schemes="http,sip"
</allOneLine>
CSeq: 1 REFER
User A sends a NOTIFY (message 9):
NOTIFY sip:app.example.com SIP/2.0
Via: SIP/2.0/TLS host.example.com;branch=z9hG4bK9320394238995
To: Recorder Application <sip:app.example.com>;tag=jhgf
From: Caller <sip:A@example.com>;tag=pqoew
Call-ID: 66676776767@app.example.com
CSeq: 1 NOTIFY
Max-Forwards: 70
<allOneLine>
Contact: <sip:A@example.com;gr=urn:uuid:f81d4fae
-7dec-11d0-a765-00a0c91e6bf6>;schemes="http,sip"
</allOneLine>
Event: refer;id=93809824
Subscription-State: active;expires=3600
Content-Type: message/sipfrag;version=2.0
Content-Length: 20
SIP/2.0 100 Trying
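As a quick check on message 9, the Content-Length of 20 is consistent with the 18-character status line "SIP/2.0 100 Trying" followed by a CRLF. The short Python sketch below shows the arithmetic. The helper name sipfrag_body is hypothetical, and the trailing CRLF is an assumption inferred from the byte count rather than something stated in the text.

```python
def sipfrag_body(status_line: str) -> bytes:
    """Return a message/sipfrag body holding one status line.

    Assumption: the line is terminated by CRLF, which is what makes the
    length agree with the Content-Length shown in message 9.
    """
    return (status_line + "\r\n").encode("ascii")


body = sipfrag_body("SIP/2.0 100 Trying")
content_length = len(body)
print(f"Content-Length: {content_length}")  # prints "Content-Length: 20"
```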
And the recording server responds with a 200 OK (message 10).
SIP/2.0 200 OK
Via: SIP/2.0/TLS host.example.com;branch=z9hG4bK9320394238995
To: Recorder Application <sip:app.example.com>;tag=jhgf
From: Caller <sip:A@example.com>;tag=pqoew
Call-ID: 66676776767@app.example.com
CSeq: 1 NOTIFY
The REFER request contained a Target-Dialog header field with a valid dialog identifier. Furthermore, all of the signaling was over TLS and the dialog identifiers contain sufficient randomness. As such, the caller, A, automatically authorizes the application. It then acts on the Refer-To URI, fetching the script from app.example.com (message 11). The response, message 12, contains a web application that the user can click on to enable recording. Because the client acted on the URL in the Refer-To, it generates another NOTIFY to the application, informing it of the successful response (message 13). This is answered with a 200 OK (message 14). When the user clicks on the link (message 15), the results are posted to the server, and an updated display is provided (message 16).
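The automatic authorization decision made by A can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a normative algorithm. The names parse_target_dialog and auto_authorize, the in-memory set of dialogs, and the single transport string are all hypothetical. The sketch checks only the transport of the REFER itself and does not attempt to judge whether the identifiers are sufficiently random. In the example flow, the tags in Target-Dialog match the dialog as A sees it (local tag kkaz-, remote tag 7777).

```python
from typing import Optional, Set, Tuple

# (Call-ID, local tag, remote tag) as seen by the UA receiving the REFER.
DialogKey = Tuple[str, str, str]


def parse_target_dialog(value: str) -> DialogKey:
    """Split a Target-Dialog header field value into a dialog key."""
    call_id, *params = [part.strip() for part in value.split(";")]
    fields = dict(param.split("=", 1) for param in params)
    return call_id, fields["local-tag"], fields["remote-tag"]


def auto_authorize(
    target_dialog: Optional[str], transport: str, dialogs: Set[DialogKey]
) -> bool:
    """Authorize only if the REFER came over TLS and names a known dialog."""
    if target_dialog is None or transport.upper() != "TLS":
        return False
    try:
        key = parse_target_dialog(target_dialog)
    except (KeyError, ValueError):
        # Missing or malformed tag parameters: do not authorize automatically.
        return False
    return key in dialogs


# A's view of the dialog set up by messages 1 through 6.
dialogs_at_a = {("fa77as7dad8-sd98ajzz@host.example.com", "kkaz-", "7777")}
header = (
    "fa77as7dad8-sd98ajzz@host.example.com ;remote-tag=7777;local-tag=kkaz-"
)
print(auto_authorize(header, "TLS", dialogs_at_a))
```

A request that fails this check is not necessarily rejected; it simply does not qualify for automatic authorization in this sketch.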
There are many security considerations associated with this framework. It allows applications in the network to instantiate user interface components on a client device. Such instantiations need to come from authenticated applications, and those applications also need to be authorized to place a UI into the client. Indeed, the stronger requirement is authorization. It is not as important to know the name of the provider of the application as it is to know that the provider is authorized to instantiate components.
This specification defines specific authorization techniques and requirements. Automatic authorization is granted if the application can prove that it is on the call path, or is trusted by an element on the call path. As documented above, this can be accomplished by the use of cryptographically random dialog identifiers and the usage of SIPS for message confidentiality. It is RECOMMENDED that SIPS be implemented by user agents compliant with this specification. This does not represent a change from the requirements in RFC 3261.
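As an illustration of cryptographically random dialog identifiers, the following Python sketch draws the Call-ID and tag values from the operating system's secure random source through the standard secrets module. The function names new_tag and new_call_id and the choice of 16 random bytes are assumptions made for this example; this document does not prescribe a length or a format.

```python
import secrets


def new_tag(nbytes: int = 16) -> str:
    """Return a random tag as a hex string (two characters per byte)."""
    return secrets.token_hex(nbytes)


def new_call_id(host: str, nbytes: int = 16) -> str:
    """Return a Call-ID made of a random token followed by @host."""
    return f"{secrets.token_hex(nbytes)}@{host}"


call_id = new_call_id("host.example.com")
local_tag = new_tag()
print(call_id, local_tag)
```

An element that is not on the call path would have to guess all of these values to construct a matching Target-Dialog, which is why predictable identifiers would weaken the automatic authorization described above.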
This document was produced as a result of discussions amongst the application interaction design team. All members of this team contributed significantly to the ideas embodied in this document. The members of this team were:
Eric Burger
Cullen Jennings
Robert Fairlie-Cuninghame
The authors would like to thank Martin Dolly and Rohan Mahy for their input and comments. Thanks to Allison Mankin for her support of this work.
[1] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.
[2] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A., Peterson, J., Sparks, R., Handley, M., and E. Schooler, "SIP: Session Initiation Protocol", RFC 3261, June 2002.
[3] Rosenberg, J. and H. Schulzrinne, "Reliability of Provisional Responses in Session Initiation Protocol (SIP)", RFC 3262, June 2002.
[4] Roach, A., "Session Initiation Protocol (SIP)-Specific Event Notification", RFC 3265, June 2002.
[5] McGlashan, S., Lucas, B., Porter, B., Rehor, K., Burnett, D., Carter, J., Ferrans, J., and A. Hunt, "Voice Extensible Markup Language (VoiceXML) Version 2.0", W3C CR CR-voicexml20-20030220, February 2003.
[6] Rosenberg, J., Schulzrinne, H., and P. Kyzivat, "Indicating User Agent Capabilities in the Session Initiation Protocol (SIP)", RFC 3840, August 2004.
[7] Sparks, R., "The Session Initiation Protocol (SIP) Refer Method", RFC 3515, April 2003.
[8] Burger, E. and M. Dolly, "A Session Initiation Protocol (SIP) Event Package for Key Press Stimulus (KPML)", RFC 4730, November 2006.
[9] Rosenberg, J., "Obtaining and Using Globally Routable User Agent URIs (GRUUs) in the Session Initiation Protocol (SIP)", RFC 5627, October 2009.
[10] Rosenberg, J., "Request Authorization through Dialog Identification in the Session Initiation Protocol (SIP)", RFC 4538, June 2006.
[11] Peterson, J. and C. Jennings, "Enhancements for Authenticated Identity Management in the Session Initiation Protocol (SIP)", RFC 4474, August 2006.
[12] Day, M., Rosenberg, J., and H. Sugano, "A Model for Presence and Instant Messaging", RFC 2778, February 2000.
[13] Jennings, C., Peterson, J., and M. Watson, "Private Extensions to the Session Initiation Protocol (SIP) for Asserted Identity within Trusted Networks", RFC 3325, November 2002.
[14] Rosenberg, J., "A Framework for Conferencing with the Session Initiation Protocol (SIP)", RFC 4353, February 2006.
[15] Rosenberg, J., Schulzrinne, H., and P. Kyzivat, "Caller Preferences for the Session Initiation Protocol (SIP)", RFC 3841, August 2004.
[16] Rosenberg, J., Schulzrinne, H., and R. Mahy, "An INVITE-Initiated Dialog Event Package for the Session Initiation Protocol (SIP)", RFC 4235, November 2005.
[17] Sparks, R., Hawrylyshen, A., Johnston, A., Rosenberg, J., and H. Schulzrinne, "Session Initiation Protocol (SIP) Torture Test Messages", RFC 4475, May 2006.
[18] Schulzrinne, H., Casner, S., Frederick, R., and V. Jacobson, "RTP: A Transport Protocol for Real-Time Applications", STD 64, RFC 3550, July 2003.
[19] Schulzrinne, H. and T. Taylor, "RTP Payload for DTMF Digits, Telephony Tones, and Telephony Signals", RFC 4733, December 2006.
[20] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model with Session Description Protocol (SDP)", RFC 3264, June 2002.
[21] Rosenberg, J., "A Session Initiation Protocol (SIP) Event Package for Registrations", RFC 3680, March 2004.
Author's Address
Jonathan Rosenberg
Cisco Systems
600 Lanidex Plaza
Parsippany, NJ 07054
US
Phone: +1 973 952-5000
EMail: jdrosen@cisco.com
URI: http://www.jdrosen.net