RFC 7205: Use Cases for Telepresence Multistreams

URL : https://datatracker.ietf.org/doc/html/rfc7205
Title : RFC 7205
Translation type : automatically generated

Internet Engineering Task Force (IETF)                        A. Romanow
Request for Comments: 7205                                         Cisco
Category: Informational                                        S. Botzko
ISSN: 2070-1721                                             M. Duckworth
                                                                 Polycom
                                                            R. Even, Ed.
                                                     Huawei Technologies
                                                              April 2014

Use Cases for Telepresence Multistreams

Abstract

Telepresence conferencing systems seek to create an environment that gives users (or user groups) that are not co-located a feeling of co-located presence through multimedia communication that includes at least audio and video signals of high fidelity. A number of techniques for handling audio and video streams are used to create this experience. When these techniques are not similar, interoperability between different systems is difficult at best, and often not possible. Conveying information about the relationships between multiple streams of media would enable senders and receivers to make choices to allow telepresence systems to interwork. This memo describes the most typical and important use cases for sending multiple streams in a telepresence conference.

Status of This Memo

This document is not an Internet Standards Track specification; it is published for informational purposes.

This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Not all documents approved by the IESG are a candidate for any level of Internet Standard; see Section 2 of RFC 5741.

Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at http://www.rfc-editor.org/info/rfc7205.

Copyright Notice

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Overview of Telepresence Scenarios
   3.  Use Cases
     3.1.  Point-to-Point Meeting: Symmetric
     3.2.  Point-to-Point Meeting: Asymmetric
     3.3.  Multipoint Meeting
     3.4.  Presentation
     3.5.  Heterogeneous Systems
     3.6.  Multipoint Education Usage
     3.7.  Multipoint Multiview (Virtual Space)
     3.8.  Multiple Presentation Streams - Telemedicine
   4.  Acknowledgements
   5.  Security Considerations
   6.  Informative References

1. Introduction

Telepresence applications try to provide a "being there" experience for conversational video conferencing. Often, this telepresence application is described as "immersive telepresence" in order to distinguish it from traditional video conferencing and from other forms of remote presence not related to conversational video conferencing, such as avatars and robots. The salient characteristics of telepresence are often described as: being actual sized, providing immersive video, preserving interpersonal interaction, and allowing non-verbal communication.

Although telepresence systems are based on open standards such as RTP [RFC3550], SIP [RFC3261], H.264 [ITU.H264], and the H.323 [ITU.H323] suite of protocols, they cannot easily interoperate with each other without operator assistance and expensive additional equipment that translates from one vendor's protocol to another.

The basic features that give telepresence its distinctive characteristics are implemented in disparate ways in different systems. Currently, telepresence systems from diverse vendors interoperate to some extent, but this is not supported in a standards-based fashion. Interworking requires that translation and transcoding devices be included in the architecture. Such devices increase latency, reducing the quality of interpersonal interaction. Use of these devices is often not automatic; it frequently requires substantial manual configuration and a detailed understanding of the nature of underlying audio and video streams. This state of affairs is not acceptable for the continued growth of telepresence -- these systems should have the same ease of interoperability as do telephones. Thus, a standard way of describing the multiple streams constituting the media flows and the fundamental aspects of their behavior would allow telepresence systems to interwork.

This document presents a set of use cases describing typical scenarios. Requirements will be derived from these use cases in a separate document. The use cases are described from the viewpoint of the users. They are illustrative of the user experience that needs to be supported. It is possible to implement these use cases in a variety of different ways.

Many different scenarios need to be supported. This document describes in detail the most common and basic use cases. These will cover most of the requirements. There may be additional scenarios that bring new features and requirements that can be used to extend the initial work.

Point-to-point and multipoint telepresence conferences are considered. In some use cases, the number of screens is the same at all sites; in others, the number of screens differs at different sites. Both use cases are considered. Also included is a use case describing display of presentation material or content.

The multipoint use cases may include a variety of systems from conference room systems to handheld devices, and such a use case is described in the document.

This document's structure is as follows: Section 2 gives an overview of scenarios, and Section 3 describes use cases.

2. Overview of Telepresence Scenarios

This section describes the general characteristics of the use cases and what the scenarios are intended to show. The typical setting is a business conference, which was the initial focus of telepresence. Recently, consumer products are also being developed. We specifically do not include in our scenarios the physical infrastructure aspects of telepresence, such as room construction, layout, and decoration. Furthermore, these use cases do not describe all the aspects needed to create the best user experience (for example, the human factors).

We also specifically do not attempt to precisely define the boundaries between telepresence systems and other systems, nor do we attempt to identify the "best" solution for each presented scenario.

Telepresence systems are typically composed of one or more video cameras and encoders and one or more display screens of large size (diagonal around 60 inches). Microphones pick up sound, and audio codec(s) produce one or more audio streams. The cameras used to capture the telepresence users are referred to as "participant cameras" (and likewise for screens). There may also be other cameras, such as for document display. These will be referred to as "presentation cameras" or "content cameras", which generally have different formats, aspect ratios, and frame rates from the participant cameras. The presentation streams may be shown on participant screens or on auxiliary display screens. A user's computer may also serve as a virtual content camera, generating an animation or playing a video for display to the remote participants.

We describe such a telepresence system as sending one or more video streams, audio streams, and presentation streams to the remote system(s).

The fundamental parameters describing today's typical telepresence scenarios include:

1. The number of participating sites

2. The number of visible seats at a site

3. The number of cameras

4. The number and type of microphones

5. The number of audio channels

6. The screen size

7. The screen capabilities -- such as resolution, frame rate, aspect ratio

8. The arrangement of the screens in relation to each other

9. The number of primary screens at each site

10. Type and number of presentation screens

11. Multipoint conference display strategies -- for example, the camera-to-screen mappings may be static or dynamic

12. The camera point of capture

13. The cameras' fields of view and how they spatially relate to each other
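
As a rough illustration, the parameters listed above could be gathered into one record per site. The sketch below is only an assumption made for illustration: the class and field names (Camera, Screen, SiteDescription, three_screen_room) and all numeric values other than the 60-inch diagonal are hypothetical and are not defined by this document.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Camera:
    capture_index: int          # point of capture, reduced to a left-to-right index (item 12)
    field_of_view_deg: float    # horizontal field of view (item 13)


@dataclass
class Screen:
    diagonal_inches: float      # item 6
    width_px: int               # item 7 (resolution)
    height_px: int
    frame_rate: float           # item 7
    is_presentation: bool = False  # item 10

    @property
    def aspect_ratio(self) -> float:
        return self.width_px / self.height_px


@dataclass
class SiteDescription:
    name: str
    visible_seats: int          # item 2
    cameras: List[Camera]       # items 3, 12, 13
    microphone_count: int       # item 4
    microphone_type: str        # item 4
    audio_channels: int         # item 5
    screens: List[Screen]       # items 6-10, ordered left to right (item 8)
    static_mapping: bool = True  # item 11

    def primary_screens(self) -> List[Screen]:
        """Screens that show participants rather than presentations (item 9)."""
        return [s for s in self.screens if not s.is_presentation]


def three_screen_room(name: str) -> SiteDescription:
    """Build a hypothetical 3-camera, 3-screen room with stereo audio."""
    cameras = [Camera(capture_index=i, field_of_view_deg=30.0) for i in range(3)]
    screens = [Screen(60.0, 1920, 1080, 30.0) for _ in range(3)]
    screens.append(Screen(24.0, 1280, 720, 5.0, is_presentation=True))
    return SiteDescription(
        name=name,
        visible_seats=6,
        cameras=cameras,
        microphone_count=3,
        microphone_type="directional",
        audio_channels=2,
        screens=screens,
    )
```

A record like this captures only the individual capabilities of a site; it does not yet express how the streams relate to each other.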

As discussed in the introduction, the basic features that give telepresence its distinctive characteristics are implemented in disparate ways in different systems.

There is no agreed-upon way to adequately describe the semantics of how streams of various media types relate to each other. Without a standard for stream semantics to describe the particular roles and activities of each stream in the conference, interoperability is cumbersome at best.

In a multiple-screen conference, the video and audio streams sent from remote participants must be understood by receivers so that they can be presented in a coherent and life-like manner. This includes the ability to present remote participants at their actual size for their apparent distance while maintaining correct eye contact and gesticular cues, and simultaneously providing a spatial audio sound stage that is consistent with the displayed video.

The receiving device that decides how to render incoming information needs to understand a number of variables such as the spatial position of the speaker, the field of view of the cameras, the camera zoom, which media stream is related to each of the screens, etc. It is not simply that individual streams must be adequately described (to a large extent, this already exists), but rather that the semantics of the relationships between the streams must be communicated. Note that all of this is still required even if the basic aspects of the streams, such as the bit rate, frame rate, and aspect ratio, are known. Thus, this problem has aspects considerably beyond those encountered in interoperation of video conferencing systems that have a single camera/screen.

3. Use Cases

The use cases focus on typical implementations. There are a number of possible variants for these use cases; for example, the audio supported may differ at the end points (such as mono or stereo versus surround sound), etc.

Many of these systems offer a "full conference room" solution, where local participants sit at one side of a table and remote participants are displayed as if they are sitting on the other side of the table. The cameras and screens are typically arranged to provide a panoramic view of the remote room (left to right from the local user's viewpoint).

The sense of immersion and non-verbal communication is fostered by a number of technical features, such as:

1. Good eye contact, which is achieved by careful placement of participants, cameras, and screens.

2. Camera field of view and screen sizes are matched so that the images of the remote room appear to be full size.

3. The left side of each room is presented on the right screen at the far end; similarly, the right side of the room is presented on the left screen. The effect of this is that participants of each site appear to be sitting across the table from each other. If 2 participants at the same site glance at each other, all participants can observe it. Likewise, if a participant at one site gestures to a participant at the other site, all participants observe the gesture itself and the participants it includes.
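
The left-to-right reversal in item 3 can be sketched as a simple index mapping. This is a minimal sketch under two assumptions that the text does not state: both rooms have the same number of segments, and each room numbers its segments from 0 at the left, as seen by the people seated in that room. The function name far_end_screen is hypothetical.

```python
def far_end_screen(segment_index: int, segment_count: int) -> int:
    """Return the far-end screen that should show a given local segment.

    Segments and screens are both numbered 0..segment_count-1 from left to
    right, each from the viewpoint of the people seated in that room.
    """
    if not 0 <= segment_index < segment_count:
        raise ValueError("segment_index out of range")
    # The left side of one room appears on the right screen of the other.
    return segment_count - 1 - segment_index


# With 3 segments, the left camera (0) feeds the right screen (2) at the far
# end, the centre stays in the centre, and the right camera feeds the left.
mapping = [far_end_screen(i, 3) for i in range(3)]
```

Applying the mapping twice returns the original index, which is what makes the two rooms appear to face each other across one table.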

3.1. Point-to-Point Meeting: Symmetric

In this case, each of the 2 sites has an identical number of screens, with cameras having fixed fields of view, and 1 camera for each screen. The sound type is the same at each end. As an example, there could be 3 cameras and 3 screens in each room, with stereo sound being sent and received at each end.

Each screen is paired with a corresponding camera. Each camera/screen pair is typically connected to a separate codec, producing an encoded stream of video for transmission to the remote site, and receiving a similarly encoded stream from the remote site.

Each system has one or multiple microphones for capturing audio. In some cases, stereophonic microphones are employed. In other systems, a microphone may be placed in front of each participant (or pair of participants). In typical systems, all the microphones are connected to a single codec that sends and receives the audio streams as either stereo or surround sound. The number of microphones and the number of audio channels are often not the same as the number of cameras. Also, the number of microphones is often not the same as the number of loudspeakers.

The audio may be transmitted as multi-channel (stereo/surround sound) or as distinct and separate monophonic streams. Audio levels should be matched, so the sound levels at both sites are identical. Loudspeaker and microphone placements are chosen so that the sound "stage" (orientation of apparent audio sources) is coordinated with the video. That is, if a participant at one site speaks, the participants at the remote site perceive her voice as originating from her visual image. In order to accomplish this, the audio needs to be mapped at the receiving site in the same fashion as the video. That is, audio received from the right side of the room needs to be output from loudspeaker(s) on the left side at the remote site, and vice versa.

3.2. Point-to-Point Meeting: Asymmetric

In this case, each site has a different number of screens and cameras than the other site. The important characteristic of this scenario is that the number of screens is different between the 2 sites. This creates challenges that are handled differently by different telepresence systems.

This use case builds on the basic scenario of 3 screens to 3 screens. Here, we use the common case of 3 screens and 3 cameras at one site, and 1 screen and 1 camera at the other site, connected by a point-to-point call. The screen sizes and camera fields of view at both sites are basically similar, such that each camera view is designed to show 2 people sitting side by side. Thus, the 1-screen room has up to 2 people seated at the table, while the 3-screen room may have up to 6 people at the table.

The basic considerations of defining left and right and indicating relative placement of the multiple audio and video streams are the same as in the 3-3 use case. However, handling the mismatch between the 2 sites of the number of screens and cameras requires more complicated maneuvers.

For the video sent from the 1-camera room to the 3-screen room, usually what is done is to simply use 1 of the 3 screens and keep the second and third screens inactive or, for example, put up the current date. This would maintain the "full-size" image of the remote side.

For the other direction, the 3-camera room sending video to the 1-screen room, there are more complicated variations to consider. Here are several possible ways in which the video streams can be handled.

1. The 1-screen system might simply show only 1 of the 3 camera images, since the receiving side has only 1 screen. 2 people are seen at full size, but 4 people are not seen at all. The choice of which one of the 3 streams to display could be fixed, or could be selected by the users. It could also be made automatically based on who is speaking in the 3-screen room, such that the people in the 1-screen room always see the person who is speaking. If the automatic selection is done at the sender, the transmission of streams that are not displayed could be suppressed, which would avoid wasting bandwidth.

2. The 1-screen system might be capable of receiving and decoding all 3 streams from all 3 cameras. The 1-screen system could then compose the 3 streams into 1 local image for display on the single screen. All 6 people would be seen, but smaller than full size. This could be done in conjunction with reducing the image resolution of the streams, such that encode/decode resources and bandwidth are not wasted on streams that will be downsized for display anyway.

3. The 3-screen system might be capable of including all 6 people in a single stream to send to the 1-screen system. For example, it could use PTZ (Pan Tilt Zoom) cameras to physically adjust the cameras such that 1 camera captures the whole room of 6 people. Or, it could recompose the 3 camera images into 1 encoded stream to send to the remote site. These variations also show all 6 people but at a reduced size.

4. Or, there could be a combination of these approaches, such as simultaneously showing the speaker in full size with a composite of all 6 participants in a smaller size.
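
To make the size reduction in options 2 and 3 concrete, the sketch below computes where each camera image would land when several equally sized images are placed side by side on one screen. It is an illustration only: the function name compose_side_by_side, the side-by-side layout, and the 1920x1080 resolution in the example are assumptions, not something this document specifies.

```python
from typing import List, Tuple

Tile = Tuple[int, int, int, int]  # (x, y, width, height) in screen pixels


def compose_side_by_side(stream_w: int, stream_h: int, stream_count: int,
                         screen_w: int, screen_h: int) -> List[Tile]:
    """Place stream_count equally sized images in one row, keeping aspect ratio."""
    if stream_count < 1:
        raise ValueError("need at least one stream")
    total_w = stream_w * stream_count
    # Integer comparison of the row's aspect ratio with the screen's.
    if total_w * screen_h >= screen_w * stream_h:
        # The row is wider than the screen: width is the limiting dimension.
        tile_w = screen_w // stream_count
        tile_h = stream_h * tile_w // stream_w
    else:
        # The row is narrower than the screen: height is the limit.
        tile_h = screen_h
        tile_w = stream_w * tile_h // stream_h
    x0 = (screen_w - tile_w * stream_count) // 2   # centre the row horizontally
    y0 = (screen_h - tile_h) // 2                  # and vertically
    return [(x0 + i * tile_w, y0, tile_w, tile_h) for i in range(stream_count)]


# Three 1920x1080 camera images shown on a single 1920x1080 screen.
tiles = compose_side_by_side(1920, 1080, 3, 1920, 1080)
```

In this example each image shrinks to one third of its width and height, which is why all 6 people are visible but smaller than full size. It also suggests why a sender might lower the encoded resolution when it knows the receiver will composite.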

The receiving telepresence system needs to have information about the content of the streams it receives to make any of these decisions. If the systems are capable of supporting more than one strategy, there needs to be some negotiation between the 2 sites to figure out which of the possible variations they will use in a specific point-to-point call.
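
This document does not define how such a negotiation would work. Purely as an illustration, one simple possibility is for the receiver to state its strategies in order of preference and for the two sites to settle on the first one the sender also supports. The strategy labels and the function name negotiate_strategy below are hypothetical.

```python
from typing import Iterable, Optional, Sequence

# Hypothetical labels for the four variations listed above.
SELECT_ONE = "select-one-stream"
RECEIVER_COMPOSITE = "receiver-composes"
SENDER_COMPOSITE = "sender-composes"
SPEAKER_PLUS_COMPOSITE = "speaker-plus-composite"


def negotiate_strategy(receiver_preferences: Sequence[str],
                       sender_supported: Iterable[str]) -> Optional[str]:
    """Return the receiver's most preferred strategy that the sender supports."""
    supported = set(sender_supported)
    for strategy in receiver_preferences:
        if strategy in supported:
            return strategy
    return None  # no common strategy; the call would need some fallback


# A 1-screen receiver that prefers to compose locally, talking to a 3-screen
# sender that can only select one stream or compose on its own side.
chosen = negotiate_strategy(
    [RECEIVER_COMPOSITE, SENDER_COMPOSITE, SELECT_ONE],
    [SELECT_ONE, SENDER_COMPOSITE],
)
```

Here the result is the sender-side composition, because it is the highest-ranked receiver preference that both sites share.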

3.3. Multipoint Meeting

In a multipoint telepresence conference, there are more than 2 sites participating. Additional complexity is required to enable media streams from each participant to show up on the screens of the other participants.

Clearly, there are a great number of topologies that can be used to display the streams from multiple sites participating in a conference.

One major objective for telepresence is to be able to preserve the "being there" user experience. However, in multi-site conferences, it is often (in fact, usually) not possible to simultaneously provide full-size video, eye contact, and common perception of gestures and gaze by all participants. Several policies can be used for stream distribution and display: all provide good results, but they all make different compromises.

One common policy is called site switching. Let's say the speaker is at site A and the other participants are at various "remote" sites. When the room at site A is shown, all the camera images from site A are forwarded to the remote sites. Therefore, at each receiving remote site, all the screens display camera images from site A. This can be used to preserve full-size image display, and also provide full visual context of the displayed far end, site A. In site switching, there is a fixed relation between the cameras in each room and the screens in remote rooms. The room or participants being shown are switched from time to time based on who is speaking or by manual control, e.g., from site A to site B.

Segment switching is another policy choice. In segment switching (assuming still that site A is where the speaker is, and "remote" refers to all the other sites), rather than sending all the images from site A, only the speaker at site A is shown. The camera images of the current speaker and previous speakers (if any) are forwarded to the other sites in the conference. Therefore, the screens in each site are usually displaying images from different remote sites -- the current speaker at site A and the previous ones. This strategy can be used to preserve full-size image display and also capture the non-verbal communication between the speakers. In segment switching, the display depends on the activity in the remote rooms (generally, but not necessarily based on audio/speech detection).
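
The difference between site switching and segment switching can be sketched in a few lines. This is a simplified illustration under stated assumptions: a segment is identified by a hypothetical (site, camera) pair, speaker detection is assumed to happen elsewhere, and the segment switcher orders screens by recency, whereas a real system would more likely keep images on the same screen for as long as possible.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple

Segment = Tuple[str, int]  # (site name, camera index), a hypothetical identifier


def site_switching(active_site: str, cameras_per_site: Dict[str, int]) -> List[Segment]:
    """All screens show the active site; camera i always maps to screen i."""
    return [(active_site, cam) for cam in range(cameras_per_site[active_site])]


class SegmentSwitcher:
    """Keep the current speaker and the most recent previous speakers on screen."""

    def __init__(self, screen_count: int) -> None:
        # A bounded deque drops the oldest segment when a new one arrives.
        self.recent: Deque[Segment] = deque(maxlen=screen_count)

    def on_speaker(self, site: str, camera: int) -> List[Segment]:
        segment = (site, camera)
        if segment in self.recent:
            self.recent.remove(segment)   # avoid showing one segment twice
        self.recent.appendleft(segment)   # current speaker first
        return list(self.recent)


# Site switching: every screen follows the room of the active speaker.
shown = site_switching("A", {"A": 3, "B": 1})

# Segment switching: screens mix segments from different sites.
switcher = SegmentSwitcher(screen_count=3)
switcher.on_speaker("A", 1)
switcher.on_speaker("B", 0)
layout = switcher.on_speaker("C", 2)
```

With site switching the receiver sees one whole room at a time; with segment switching the three screens here end up showing segments from sites C, B, and A together.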

A third possibility is to reduce the image size so that multiple camera views can be composited onto one or more screens. This does not preserve full-size image display, but it provides the most visual context (since more sites or segments can be seen). Typically in this case, the display mapping is static, i.e., each part of each room is shown in the same location on the display screens throughout the conference.

Other policies and combinations are also possible. For example, there can be a static display of all screens from all remote rooms, with part or all of one screen being used to show the current speaker at full size.

3.4. Presentation

In addition to the video and audio streams showing the participants, additional streams are used for presentations.

In systems available today, generally only one additional video stream is available for presentations. Often, this presentation stream is half-duplex in nature, with presenters taking turns. The presentation stream may be captured from a PC screen, or it may come from a multimedia source such as a document camera, camcorder, or a DVD. In a multipoint meeting, the presentation streams for the currently active presentation are always distributed to all sites in the meeting, so that the presentations are viewed by all.

Some systems display the presentation streams on a screen that is mounted either above or below the 3 participant screens. Other systems provide screens on the conference table for observing presentations. If multiple presentation screens are used, they generally display identical content. There is considerable variation in the placement, number, and size of presentation screens.

In some systems, presentation audio is pre-mixed with the room audio. In others, a separate presentation audio stream is provided (if the presentation includes audio).

In H.323 [ITU.H323] systems, H.239 [ITU.H239] is typically used to control the video presentation stream. In SIP systems, similar control mechanisms can be provided using the Binary Floor Control Protocol (BFCP) [RFC4582] for the presentation token. These mechanisms are suitable for managing a single presentation stream.
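
The idea of a single presentation token, with presenters taking turns, can be illustrated with a small in-memory model. This sketch does not implement H.239 or BFCP messages; the class name PresentationFloor and the first-come, first-served waiting queue are assumptions chosen only to show the half-duplex behaviour.

```python
from collections import deque
from typing import Deque, Optional


class PresentationFloor:
    """One token for one presentation stream; at most one holder at a time."""

    def __init__(self) -> None:
        self.holder: Optional[str] = None
        self.waiting: Deque[str] = deque()

    def request(self, site: str) -> bool:
        """Ask for the token; return True if the site now holds it."""
        if self.holder is None:
            self.holder = site
        elif site != self.holder and site not in self.waiting:
            self.waiting.append(site)     # queue behind the current presenter
        return self.holder == site

    def release(self, site: str) -> Optional[str]:
        """Give up the token; return the new holder, if any."""
        if site == self.holder:
            self.holder = self.waiting.popleft() if self.waiting else None
        return self.holder


floor = PresentationFloor()
first = floor.request("site-A")    # granted
second = floor.request("site-B")   # must wait for site-A to finish
next_holder = floor.release("site-A")
```

A model like this handles exactly one stream, which is the limitation the multiple-presentation use cases are meant to highlight.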

Although today's systems remain limited to a single video presentation stream, there are obvious uses for multiple presentation streams:

1. Frequently, the meeting convener is following a meeting agenda, and it is useful for her to be able to show that agenda to all participants during the meeting. Other participants at various remote sites are able to make presentations during the meeting, with the presenters taking turns. The presentations and the agenda are both shown, either on separate screens, or perhaps rescaled and shown on a single screen.

2. A single multimedia presentation can itself include multiple video streams that should be shown together. For instance, a presenter may be discussing the fairness of media coverage. In addition to slides that support the presenter's conclusions, she also has video excerpts from various news programs that she shows to illustrate her findings. She uses a DVD player for the video excerpts so that she can pause and reposition the video as needed.

3. An educator is presenting a multiscreen slide show. This show requires that the placement of the images on the multiple screens at each site be consistent.

There are many other examples where multiple presentation streams are useful.

3.5. Heterogeneous Systems

It is common in meeting scenarios for people to join the conference from a variety of environments, using different types of endpoint devices. A multiscreen immersive telepresence conference may include someone on a PC-based video conferencing system, a participant calling in by phone, and (soon) someone on a handheld device.

What experience/view will each of these devices have?

Some may be able to handle multiple streams, and others can handle only a single stream. (Here, we are not talking about legacy systems, but rather systems built to participate in such a conference, although they are single stream only.) In a single video stream, the stream may contain one or more compositions depending on the available screen space on the device. In most cases, an intermediate transcoding device will be relied upon to produce a single stream, perhaps with some kind of continuous presence.

Bit rates will vary -- the handheld device and phone having lower bit rates than PC and multiscreen systems.

Layout is accomplished according to different policies. For example, a handheld device and PC may receive the active speaker stream. The decision can be made either explicitly by the receiver or by the sender, if the sender can receive some kind of rendering hint. The same is true for audio: the device receives either a mixed stream or, if mixing is not available in the network, the streams of a number of the loudest speakers.

For the PC-based conferencing participant, the user's experience depends on the application. It could be single stream, similar to a handheld device but with a bigger screen. Or, it could be multiple streams, similar to an immersive telepresence system but with a smaller screen. Control for manipulation of streams can be local in the software application, or in another location and sent to the application over the network.

The handheld device is the most extreme. How will that participant be viewed and heard? It should be an equal participant, though the bandwidth will be significantly less than that of an immersive system. A receiver may choose to display output coming from a handheld device differently based on the resolution, but that would be the case with any low-resolution video stream, e.g., from a powerful PC on a bad network.

The handheld device will send and receive a single video stream, which could be a composite or a subset of the conference. The handheld device could say what it wants or could accept whatever the sender (conference server or sending endpoint) thinks is best. The handheld device will have to signal any actions it wants to take the same way that an immersive system signals actions.

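The delivery choices described in this section can be illustrated with a minimal sketch. It is not part of the RFC and does not define any protocol: the `Endpoint` type, the `plan_delivery` function, and the `loudest_n` parameter are hypothetical names chosen for illustration, and the policy shown is only one of the possible policies the text mentions.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    """Hypothetical description of a conference endpoint."""
    name: str
    max_video_streams: int  # 0 for an audio-only phone


def plan_delivery(endpoint, network_can_mix_audio, loudest_n=3):
    """Pick what a sender (or conference server) might deliver.

    Assumption: single-stream devices get one composed or
    active-speaker stream; multistream devices get several streams.
    """
    if endpoint.max_video_streams == 0:
        video = "none"
    elif endpoint.max_video_streams == 1:
        video = "single composed or active-speaker stream"
    else:
        video = "multiple streams"

    # Audio: a mixed stream when the network can mix, otherwise
    # the streams of a number of the loudest speakers.
    if network_can_mix_audio:
        audio = "mixed stream"
    else:
        audio = f"{loudest_n} loudest speakers"
    return {"video": video, "audio": audio}
```

For example, `plan_delivery(Endpoint("handheld", 1), True)` yields a single video stream and a mixed audio stream, while an immersive room with `max_video_streams=3` is offered multiple streams.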

3.6. Multipoint Education Usage

The importance of this example is that the multiple video streams are not used to create an immersive conferencing experience with panoramic views at all the sites. Instead, the multiple streams are dynamically used to enable full participation of remote students in a university class. In some instances, the same video stream is displayed on multiple screens in the room; in other instances, an available stream is not displayed at all.

The main site is a university auditorium that is equipped with 3 cameras. One camera is focused on the professor at the podium. A second camera is mounted on the wall behind the professor and captures the class in its entirety. The third camera is co-located with the second and is designed to capture a close-up view of a questioner in the audience. It automatically zooms in on that student using sound localization.

Although the auditorium is equipped with 3 cameras, it is only equipped with 2 screens. One is a large screen located at the front so that the class can see it. The other is located at the rear so the professor can see it. When someone asks a question, the front screen shows the questioner. Otherwise, it shows the professor (ensuring everyone can easily see her).

The remote sites are typical immersive telepresence rooms, each with 3 camera/screen pairs.

All remote sites display the professor on the center screen at full size. A second screen shows the entire classroom view when the professor is speaking. However, when a student asks a question, the second screen shows the close-up view of the student at full size. Sometimes the student is in the auditorium; sometimes the speaking student is at another remote site. The remote systems never display the students that are actually in that room.

If someone at a remote site asks a question, then the screen in the auditorium will show the remote student at full size (as if they were present in the auditorium itself). The screen in the rear also shows this questioner, allowing the professor to see and respond to the student without needing to turn her back on the main class.

When no one is asking a question, the screen in the rear briefly shows a full-room view of each remote site in turn, allowing the professor to monitor the entire class (remote and local students). The professor can also use a control on the podium to see a particular site -- she can choose either a full-room view or a single-camera view.

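The display rules described above for the auditorium and the remote rooms can be summarized in a small sketch. This is an illustration only, not part of the RFC: the function names, the `(view, site)` tuples, and the `tick` counter that drives the rotation are hypothetical. Two behaviors that the text leaves open are filled in as labeled assumptions: the rear screen keeps cycling when a local student asks a question, and a remote room keeps the classroom view when the questioner is in that same room. The professor's podium override is omitted.

```python
AUDITORIUM = "auditorium"


def auditorium_screens(questioner_site, remote_sites, tick=0):
    """Return what the front and rear auditorium screens show.

    questioner_site is None when nobody is asking a question.
    """
    # Rear screen default: full-room view of each remote site in turn.
    cycle = None
    if remote_sites:
        cycle = ("full_room", remote_sites[tick % len(remote_sites)])

    if questioner_site is None:
        return {"front": ("professor", AUDITORIUM), "rear": cycle}

    closeup = ("questioner", questioner_site)
    # A remote questioner is also shown on the rear screen.
    # Assumption: for a local questioner the rear screen keeps cycling.
    rear = closeup if questioner_site != AUDITORIUM else cycle
    return {"front": closeup, "rear": rear}


def remote_screens(this_site, questioner_site):
    """Return what the center and second screens of a remote room show."""
    second = ("classroom", AUDITORIUM)
    # A remote room never displays students who are in that room.
    if questioner_site is not None and questioner_site != this_site:
        second = ("questioner", questioner_site)
    return {"center": ("professor", AUDITORIUM), "second": second}
```

With remote sites `["r1", "r2"]`, a question from `"r1"` puts that student on both auditorium screens and on the second screen at `"r2"`, while the room `"r1"` itself continues to show the classroom view.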

Realization of this use case does not require any negotiation between the participating sites. Endpoint devices (and a Multipoint Control Unit (MCU), if present) need to know who is speaking and what video stream includes the view of that speaker. The remote systems need some knowledge of which stream should be placed in the center. The ability of the professor to see specific sites (or for the system to show all the sites in turn) would also require the auditorium system to know what sites are available and to be able to request a particular view of any site. Bandwidth is optimized if video that is not being shown at a particular site is not distributed to that site.

3.7. Multipoint Multiview (Virtual Space)

This use case describes a virtual space multipoint meeting with good eye contact and spatial layout of participants. The use case was proposed very early in the development of video conferencing systems as described in 1983 by Allardyce and Randall [virtualspace]. The use case is illustrated in Figure 2-5 of their report. The virtual space expands the point-to-point case by having all multipoint conference participants "seated" in a virtual room. In this case, each participant has a fixed "seat" in the virtual room, so each participant expects to see a different view having a different participant on his left and right side. Today, the use case is implemented in multiple telepresence-type video conferencing systems on the market. The term "virtual space" was used in their report. The main difference between the results obtained with modern systems and those from 1983 is the larger screen sizes.

Virtual space multipoint as defined here assumes endpoints with multiple cameras and screens. Usually, there is the same number of cameras and screens at a given endpoint. A camera is positioned above each screen. A key aspect of virtual space multipoint is the details of how the cameras are aimed. The cameras are each aimed at the same area of view of the participants at the site. Thus, each camera takes a picture of the same set of people but from a different angle. Each endpoint sender in the virtual space multipoint meeting therefore offers a choice of video streams to remote receivers, each stream representing a different viewpoint. For example, a camera positioned above a screen to a participant's left may take video pictures of the participant's left ear; while at the same time, a camera positioned above a screen to the participant's right may take video pictures of the participant's right ear.

Since a sending endpoint has a camera associated with each screen, an association is made between the receiving stream output on a particular screen and the corresponding sending stream from the camera associated with that screen. These associations are repeated for each screen/camera pair in a meeting. The result of this system is a horizontal arrangement of video images from remote sites, one per screen. The image from each screen is paired with the camera output from the camera above that screen, resulting in excellent eye contact.

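The fixed-seat arrangement and the screen/camera association described in this section can be sketched as follows. This is only an illustration under stated assumptions, none of which come from the RFC: the sites are seated clockwise (as seen from above) around a virtual round table, each site has one screen and one camera per remote site, screens are indexed from left to right, and the function names are hypothetical. A viewer facing the center of the table sees the next clockwise seat on their left.

```python
def screens_left_to_right(seats, viewer):
    """List the remote sites shown on the viewer's screens, left to right.

    seats is the clockwise seating order of all sites in the virtual room.
    """
    n = len(seats)
    i = seats.index(viewer)
    # The next clockwise neighbor appears on the left-most screen.
    return [seats[(i + 1 + s) % n] for s in range(n - 1)]


def camera_for(seats, sender, receiver):
    """Index of the sender's camera whose stream goes to the receiver.

    The camera sits above the screen on which the sender sees the
    receiver, which is what gives the eye-contact pairing.
    """
    return screens_left_to_right(seats, sender).index(receiver)
```

With `seats = ["A", "B", "C", "D"]`, site A sees B, C, and D from left to right, and the stream that B sends to A comes from the camera above B's right-most screen (index 2), because that is where A appears at B's site.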

3.8. Multiple Presentation Streams - Telemedicine

This use case describes a scenario where multiple presentation streams are used. In this use case, the local site is a surgery room connected to one or more remote sites that may have different capabilities. At the local site, 3 main cameras capture the whole room (the typical 3-camera telepresence case). Also, multiple presentation inputs are available: a surgery camera that is used to provide a zoomed view of the operation, an endoscopic monitor, a fluoroscope (X-ray imaging), an ultrasound diagnostic device, an electrocardiogram (ECG) monitor, etc. These devices are used to provide multiple local video presentation streams to help the surgeon monitor the status of the patient and assist in the surgical process.

The local site may have 3 main screens and one (or more) presentation screen(s). The main screens can be used to display the remote experts. The presentation screen(s) can be used to display multiple presentation streams from local and remote sites simultaneously. The 3 main cameras capture different parts of the surgery room. The surgeon can decide the number, the size, and the placement of the presentations displayed on the local presentation screen(s). He can also indicate which local presentation captures are provided for the remote sites. The local site can send multiple presentation captures to remote sites, and it can receive from them multiple presentations related to the patient or the procedure.

One type of remote site is a single- or dual-screen and one-camera system used by a consulting expert. In the general case, the remote sites can be part of a multipoint telepresence conference. The presentation screens at the remote sites allow the experts to see the details of the operation and related data. As at the main site, the experts can decide the number, the size, and the placement of the presentations displayed on the presentation screens. The presentation screens can display presentation streams from the surgery room, from other remote sites, or from local presentation streams. Thus, the experts can also start sending presentation streams that can carry medical records, pathology data, or their references and analysis, etc.

Another type of remote site is a typical immersive telepresence room with 3 camera/screen pairs, allowing more experts to join the consultation. These sites can also be used for education. The teacher, who is not necessarily the surgeon, and the students are in different remote sites. Students can observe and learn the details of the whole procedure, while the teacher can explain and answer questions during the operation.

All remote education sites can display the surgery room. Another option is to display the surgery room on the center screen, and the rest of the screens can show the teacher and the student who is asking a question. For all the above sites, multiple presentation screens can be used to enhance visibility: one screen for the zoomed surgery stream and the others for medical image streams, such as MRI images, cardiograms, ultrasonic images, and pathology data.

4. Acknowledgements

The document has benefitted from input from a number of people including Alex Eleftheriadis, Marshall Eubanks, Tommy Andre Nyquist, Mark Gorzynski, Charles Eckel, Nermeen Ismail, Mary Barnes, Pascal Buhler, and Jim Cole.

Special acknowledgement to Lennard Xiao, who contributed the text for the telemedicine use case, and to Claudio Allocchio for his detailed review of the document.

5. Security Considerations

While there are likely to be security considerations for any solution for telepresence interoperability, this document has no security considerations.

6. Informative References

[ITU.H239] ITU-T, "Role management and additional media channels for H.300-series terminals", ITU-T Recommendation H.239, September 2005.

[ITU.H264] ITU-T, "Advanced video coding for generic audiovisual services", ITU-T Recommendation H.264, April 2013.

[ITU.H323] ITU-T, "Packet-based Multimedia Communications Systems", ITU-T Recommendation H.323, December 2009.

[RFC3261] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A., Peterson, J., Sparks, R., Handley, M., and E. Schooler, "SIP: Session Initiation Protocol", RFC 3261, June 2002.

[RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V. Jacobson, "RTP: A Transport Protocol for Real-Time Applications", STD 64, RFC 3550, July 2003.

[RFC4582] Camarillo, G., Ott, J., and K. Drage, "The Binary Floor Control Protocol (BFCP)", RFC 4582, November 2006.

[virtualspace] Allardyce, L. and L. Randall, "Development of Teleconferencing Methodologies with Emphasis on Virtual Space Video and Interactive Graphics", April 1983, <http://www.dtic.mil/docs/citations/ADA127738>.

Authors' Addresses

   Allyn Romanow
   Cisco
   San Jose, CA 95134
   US

   EMail: allyn@cisco.com

   Stephen Botzko
   Polycom
   Andover, MA 01810
   US

   EMail: stephen.botzko@polycom.com

   Mark Duckworth
   Polycom
   Andover, MA 01810
   US

   EMail: mark.duckworth@polycom.com

   Roni Even (editor)
   Huawei Technologies
   Tel Aviv
   Israel

   EMail: roni.even@mail01.huawei.com
