Network Working Group                                   A. van Wijk, Ed.
Request for Comments: 5194                                G. Gybels, Ed.
Category: Informational                                        June 2008

Framework for Real-Time Text over IP Using the Session Initiation Protocol (SIP)


Status of This Memo


This memo provides information for the Internet community. It does not specify an Internet standard of any kind. Distribution of this memo is unlimited.




Abstract

This document lists the essential requirements for real-time Text-over-IP (ToIP) and defines a framework for implementation of all required functions based on the Session Initiation Protocol (SIP) and the Real-Time Transport Protocol (RTP). This includes interworking between Text-over-IP and existing text telephony on the Public Switched Telephone Network (PSTN) and other networks.


Table of Contents


   1. Introduction
   2. Scope
   3. Terminology
   4. Definitions
   5. Requirements
      5.1. General Requirements for ToIP
      5.2. Detailed Requirements for ToIP
           5.2.1. Session Setup and Control Requirements
           5.2.2. Transport Requirements
           5.2.3. Transcoding Service Requirements
           5.2.4. Presentation and User Control Requirements
           5.2.5. Interworking Requirements
                  PSTN Interworking Requirements
                  Cellular Interworking Requirements
                  Instant Messaging Interworking Requirements
   6. Implementation Framework
      6.1. General Implementation Framework
      6.2. Detailed Implementation Framework
           6.2.1. Session Control and Setup
                  Pre-Session Setup
                  Session Negotiations
           6.2.2. Transport
           6.2.3. Transcoding Services
           6.2.4. Presentation and User Control Functions
                  Progress and Status Information
                  Alerting
                  Text Presentation
                  File Storage
           6.2.5. Interworking Functions
                  PSTN Interworking
                  Mobile Interworking
                         Cellular "No-gain"
                         Cellular Text Telephone Modem (CTM)
                         Cellular "Baudot mode"
                         Mobile Data Channel Mode
                         Mobile ToIP
                  Instant Messaging Interworking
                  Multi-Functional Combination Gateways
                  Character Set Transcoding
   7. Further Recommendations for Implementers and Service Providers
      7.1. Access to Emergency Services
      7.2. Home Gateways or Analog Terminal Adapters
      7.3. User Mobility
      7.4. Firewalls and NATs
      7.5. Quality of Service
   8. Security Considerations
   9. Contributors
   10. References
      10.1. Normative References
      10.2. Informative References

1. Introduction

For many years, real-time text has been in use as a medium for conversational, interactive dialogue between users in a similar way to how voice telephony is used. Such interactive text is different from messaging and semi-interactive solutions like Instant Messaging in that it offers an equivalent conversational experience to users who cannot, or do not wish to, use voice. It therefore meets a different set of requirements from other text-based solutions already available on IP networks.


Traditionally, deaf, hard-of-hearing, and speech-impaired people are amongst the most prolific users of real-time conversational text, but because of its interactivity, it is becoming popular amongst mainstream users as well. Real-time text conversation can be combined with other conversational media like video or voice.


This document describes how existing IETF protocols can be used to implement a Text-over-IP (ToIP) solution. It shows how to use a set of existing components and protocols and provides the requirements and rules for the resulting structure, which is why it is called a "framework", in line with commonly accepted dictionary definitions of that term.


This ToIP framework is specifically designed to be compatible with Voice-over-IP (VoIP), Video-over-IP, and Multimedia-over-IP (MoIP) environments. It also builds upon, and is compatible with, the high-level user requirements of deaf, hard-of-hearing, and speech-impaired users as described in RFC 3351 [22]. It also meets the real-time text requirements of mainstream users.


ToIP also offers an IP equivalent of analog text telephony services as used by deaf, hard-of-hearing, speech-impaired, and mainstream users.


The Session Initiation Protocol (SIP) [2] is the protocol of choice for control of Multimedia communications and Voice-over-IP (VoIP) in particular. It offers all the necessary control and signalling required for the ToIP framework.


The Real-Time Transport Protocol (RTP) [3] is the protocol of choice for real-time data transmission, and its use for real-time text payloads is described in RFC 4103 [4].

This document defines a framework for ToIP to be used either by itself or as part of integrated, multi-media services, including Total Conversation [5].

2. Scope

This document defines a framework for the implementation of real-time ToIP, either stand-alone or as a part of multimedia services, including Total Conversation [5]. It provides the:


a. requirements for real-time text;

b. requirements for ToIP interworking;

c. description of ToIP implementation using SIP and RTP;

d. description of ToIP interworking with other text services.

3. Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [6] and indicate requirement levels for compliant implementations.

4. Definitions

Audio bridging: a function of an audio media bridge server, gateway, or relay service that sends to each destination the combination of audio from all participants in a conference, excluding the participant(s) at that destination. At the RTP level, this is an instance of the mixer function as defined in RFC 3550 [3].

Cellular: a telecommunication network that has wireless access and can support voice and data services over very large geographical areas. Also called Mobile.


Full duplex: media is sent independently in both directions.


Half duplex: media can only be sent in one direction at a time, or if an attempt to send information in both directions is made, errors may be introduced into the presented media.


Interactive text: another term for real-time text, as defined below.


Real-time text: a term for real-time transmission of text in a character-by-character fashion for use in conversational services, often as a text equivalent to voice-based conversational services. Conversational text is defined in the ITU-T Framework for multimedia services, Recommendation F.700 [21].


Text gateway: a function that transcodes between different forms of text transport methods, e.g., between ToIP in IP networks and Baudot or ITU-T V.21 text telephony in the PSTN.

Textphone: also "text telephone". A terminal device that allows end-to-end real-time text communication using analog transmission. A variety of PSTN textphone protocols exists world-wide. A textphone can often be combined with a voice telephone, or include voice communication functions for simultaneous or alternating use of text and voice in a call.


Text bridging: a function of the text media bridge server, gateway (including transcoding gateways), or relay service analogous to that of audio bridging as defined above, except that text is the medium of conversation.


Text relay service: a third-party or intermediary that enables communications between deaf, hard-of-hearing, and speech-impaired people and voice telephone users by translating between voice and real-time text in a call.


Text telephony: analog textphone service.


Total Conversation: a multimedia service offering real-time conversation in video, real-time text and voice according to interoperable standards. All media streams flow in real time. (See ITU-T F.703, "Multimedia conversational services" [5].)

Transcoding service: a service provided by a third-party User Agent that transcodes one stream into another. Transcoding can be done by human operators, in an automated manner, or by a combination of both methods. Within this document, the term particularly applies to conversion between different types of media. A text relay service is an example of a transcoding service that converts between real-time text and audio.


TTY: originally, an abbreviation for "teletype". Often used in North America as an alternative designation for a text telephone or textphone. Also called TDD, Telecommunication Device for the Deaf.


Video relay service: a service that enables communications between deaf and hard-of-hearing people and hearing persons with voice telephones by translating between sign language and spoken language in a call.




   2G        Second generation cellular (mobile)
   2.5G      Enhanced second generation cellular (mobile)
   3G        Third generation cellular (mobile)
   ATA       Analog Telephone Adaptor
   CDMA      Code Division Multiple Access
   CLI       Calling Line Identification
   CTM       Cellular Text Telephone Modem
   ENUM      E.164 number storage in DNS (see RFC 3761)
   GSM       Global System for Mobile Communications
   ISDN      Integrated Services Digital Network
   ITU-T     International Telecommunications Union-Telecommunications
             Standardisation Sector
   NAT       Network Address Translation
   PSTN      Public Switched Telephone Network
   RTP       Real-Time Transport Protocol
   SDP       Session Description Protocol
   SIP       Session Initiation Protocol
   SRTP      Secure Real Time Transport Protocol
   TDD       Telecommunication Device for the Deaf
   TDMA      Time Division Multiple Access
   TTY       Analog textphone (Teletypewriter)
   ToIP      Real-time Text over Internet Protocol
   URI       Uniform Resource Identifier
   UTF-8     UCS/Unicode Transformation Format-8
   VCO/HCO   Voice Carry Over/Hearing Carry Over
   VoIP      Voice over Internet Protocol

5. Requirements

The framework described in Section 6 defines a real-time text-based conversational service that is the text equivalent of voice-based telephony. This section describes the requirements that the framework is designed to meet and the functionality it should offer.


5.1. General Requirements for ToIP

Any framework for ToIP must be derived from the requirements of RFC 3351 [22]. A basic requirement is that it must provide a standardized way for offering real-time text-based conversational services that can be used as an equivalent to voice telephony by deaf, hard-of-hearing, speech-impaired, and mainstream users.

It is important to understand that real-time text conversations are significantly different from other text-based communications like email or Instant Messaging. Real-time text conversations deliver an equivalent mode to voice conversations by transmitting text character by character as it is entered, so that the conversation can be followed closely and immediate interaction can take place.


Store-and-forward systems like email or messaging on mobile networks, or non-streaming systems like instant messaging, are unable to provide that functionality. In particular, they do not allow for smooth communication through a Text Relay Service.


In order to make ToIP the text equivalent of voice services, ToIP needs to offer equivalent features in terms of conversationality to those provided by voice. To achieve that, ToIP needs to:


a. offer real-time transport and presentation of the conversation;

b. provide simultaneous transmission in both directions;

c. support both point-to-point and multipoint communication;

d. allow other media, like audio and video, to be used in conjunction with ToIP;

e. ensure that the real-time text service is always available.

Real-time text is a useful subset of Total Conversation as defined in ITU-T F.703 [5]. Total Conversation allows participants to use multiple modes of communication during the conversation, either at the same time or by switching between modes, e.g., between real-time text and audio.

Deaf, hard-of-hearing, and mainstream users may invoke ToIP services for many different reasons:


- because they are in a noisy environment, e.g., in a machine room of a factory where listening is difficult;

- because they are busy with another call and want to participate in two calls at the same time;

- for implementing text and/or speech recording services (e.g., text documentation/audio recording) for legal purposes, for clarity, or for flexibility;

- to overcome language barriers through speech translation and/or transcoding services;

- because of hearing loss, deafness, or tinnitus as a result of the aging process or for any other reason, creating a need to replace or complement voice with real-time text in conversational sessions.

In many of the above examples, real-time text may accompany speech. The text could be displayed side by side, or in a manner similar to subtitling in broadcasting environments, or in any other suitable manner. This could occur with users who are hard of hearing and also for mixed media calls with both hearing and deaf people participating in the call.


A ToIP user may wish to call another ToIP user, join a conference session involving several users, or initiate or join a multimedia session, such as a Total Conversation session.


A common scenario for multipoint real-time text is conference calling with many participants. Implementers could, for example, use different colours to render different participants' text, or could create separate windows or rendering areas for each participant.
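
The second rendering option mentioned above, a separate rendering area per participant, can be sketched as follows. This is only an illustration: the class name `ConferenceTextView`, its methods, and the participant labels are hypothetical and are not defined by this framework.

```python
from collections import OrderedDict


class ConferenceTextView:
    """Keeps one rendering area (a text buffer) per participant."""

    def __init__(self):
        # Insertion order doubles as the on-screen order of the areas.
        self.areas = OrderedDict()

    def on_text(self, participant, text):
        # Append newly received characters to that participant's own area.
        self.areas[participant] = self.areas.get(participant, "") + text

    def render(self):
        # One labelled line per participant, so the origin of text is clear.
        return [f"[{name}] {buf}" for name, buf in self.areas.items()]


view = ConferenceTextView()
view.on_text("alice", "Hel")   # characters arrive interleaved in real time
view.on_text("bob", "Hi ")
view.on_text("alice", "lo")
print("\n".join(view.render()))
```

Keeping a buffer per participant means interleaved arrival of characters from several typists never mixes their text on screen.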


5.2. Detailed Requirements for ToIP

The following sections list individual requirements for ToIP. Each requirement has been given a unique identifier (R1, R2, etc.). Section 6 (Implementation Framework) describes how to implement ToIP based on these requirements by using existing protocols and techniques.


The requirements are organized under the following headings:


- session setup and session control;

- transport;

- use of transcoding services;

- presentation and user control;

- interworking.

5.2.1. Session Setup and Control Requirements

Conversations could be started using a mode other than real-time text. Simultaneous or alternating voice and real-time text is used by a large number of people who can send voice but must receive text (due to a hearing impairment), or who can hear but must send text (due to a speech impairment).


R1: It SHOULD be possible to start conversations in any mode (real-time text, voice, video) or combination of modes.


R2: It MUST be possible for the users to switch to real-time text, or add real-time text as an additional modality, during the conversation.


R3: Systems supporting ToIP MUST allow users to select any of the supported conversation modes at any time, including in mid-conversation.


R4: Systems SHOULD allow the user to specify a preferred mode of communication in each direction, with the ability to fall back to alternatives that the user has indicated are acceptable.


R5: If the user requests simultaneous use of real-time text and audio, and this is not possible because of constraints in the network, the system SHOULD try to establish text-only communication if that is what the user has specified as his/her preference.


R6: If the user has expressed a preference for real-time text, establishment of a connection including real-time text MUST have priority over other outcomes of the session setup.


R7: It MUST be possible to use real-time text in conferences both as a medium of discussion between individual participants (for example, for sidebar discussions in real-time text while listening to the main conference audio) and for central support of the conference with real-time text interpretation of speech.


R8: Session setup and negotiation of modalities MUST allow users to specify the language of the real-time text to be used. (It is RECOMMENDED that similar functionality be provided for the video part of the conversation, i.e., to specify the sign language being used).


R9: Where certain session services are available for the audio media part of a session, these functions MUST also be supported for the real-time text media part of the same session. For example, call transfer must act on all media in the session.
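
One possible reading of R4 through R6 is a selection routine that tries the user's preferred set of media first, then the fallbacks the user has approved, and gives priority to outcomes that keep real-time text when the user has asked for it. The sketch below assumes that reading; the function `choose_modes` and the media labels "text", "audio", and "video" are hypothetical names used only for illustration, and real negotiation would be carried out in session signalling rather than in a local function.

```python
def choose_modes(preferred, acceptable, available):
    """Pick the set of media for a session, or None if nothing fits.

    preferred  -- set of media the user wants most, e.g. {"text", "audio"}
    acceptable -- ordered list of fallback sets the user has approved (R4)
    available  -- set of media the network and remote party can support
    """
    candidates = [set(preferred)] + [set(a) for a in acceptable]
    if "text" in preferred:
        # R6: outcomes that include real-time text are tried first.
        # sorted() is stable, so the user's own ordering is otherwise kept.
        candidates = sorted(candidates, key=lambda c: "text" not in c)
    for candidate in candidates:
        if candidate <= available:   # every requested medium is supported
            return candidate
    return None


# R5: text plus audio was requested, but only text can be established.
print(choose_modes({"text", "audio"}, [{"audio"}, {"text"}], {"text"}))
```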


5.2.2. Transport Requirements

ToIP will often be used to access a relay service [24], allowing real-time text users to communicate with voice users. With relay services, as well as in direct user-to-user conversation, it is crucial that text characters are sent as soon as possible after they are entered. While buffering may be done to improve efficiency, the delays SHOULD be kept minimal. In particular, buffering of whole lines of text will not meet character delay requirements.


R10: Characters must be transmitted soon after each character is entered so that the maximum delay requirement can be met. An end-to-end delay of one second is regarded as good, although users notice and appreciate shorter delays, down to 300 ms. A delay of up to two seconds is still usable.


R11: Real-time text transmission from a terminal SHALL be performed character by character as entered, or in small groups of characters, so that no character is delayed from entry to transmission by more than 300 milliseconds.


R12: It MUST be possible to transmit characters at a rate sufficient to support fast human typing as well as speech-to-text methods of generating real-time text. A rate of 30 characters per second is regarded as sufficient.


R13: A ToIP service MUST be able to deal with international character sets.


R14: Where it is possible, loss or corruption of real-time text during transport SHOULD be detected and the user should be informed.


R15: Transport of real-time text SHOULD be as robust as possible, so as to minimize loss of characters.


R16: It SHOULD be possible to send and receive real-time text simultaneously.
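
The buffering bound in R11 can be illustrated with a small send buffer that groups characters but releases them once the oldest one has waited 300 milliseconds. This is a minimal sketch under stated assumptions: the class `TextSendBuffer` and its methods are hypothetical, timestamps are passed in explicitly so the logic is easy to follow, and the text is encoded as UTF-8 to cover the international character sets of R13. A real sender would poll more often than the bound, or use a shorter interval, to leave a margin.

```python
class TextSendBuffer:
    """Collects typed characters and releases them within a maximum delay."""

    def __init__(self, max_delay=0.3):
        self.max_delay = max_delay   # seconds; the 300 ms bound from R11
        self.pending = []            # characters not yet transmitted
        self.oldest = None           # entry time of the oldest pending character

    def add(self, char, now):
        # Remember when the first character of the current group was entered.
        if not self.pending:
            self.oldest = now
        self.pending.append(char)

    def poll(self, now):
        """Return UTF-8 bytes that are due for transmission, else None."""
        if self.pending and now - self.oldest >= self.max_delay:
            data = "".join(self.pending).encode("utf-8")
            self.pending.clear()
            self.oldest = None
            return data
        return None


buf = TextSendBuffer()
buf.add("a", now=0.0)
buf.add("é", now=0.1)
print(buf.poll(now=0.2))   # nothing is due yet
print(buf.poll(now=0.3))   # both characters are released together
```

Buffering whole lines instead of small groups would break this bound, which is the case the section warns against.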


5.2.3. Transcoding Service Requirements

If the User Agents of different participants indicate that there is an incompatibility between their capabilities to support certain media types, e.g., one User Agent only offering T.140 over IP, as described in RFC 4103 [4], and the other one only supporting audio, the user might want to invoke a transcoding service.

Some users may indicate their preferred modality to be audio while others may indicate real-time text. In this case, transcoding services might be needed for text-to-speech (TTS) and speech-to-text (STT). Other examples of possible scenarios for including a relay service in the conversation are: text bridging after conversion from speech, audio bridging after conversion from real-time text, etc.


A number of requirements, motivations, and implementation guidelines for relay service invocation can be found in RFC 3351 [22].


R17: It MUST be possible for users to invoke a transcoding service where such service is available.


R18: It MUST be possible for users to indicate their preferred modality (e.g., ToIP).


R19: It MUST be possible to negotiate the requirements for transcoding services in real time in the process of setting up a call.


R20: It MUST be possible to negotiate the requirements for transcoding services in mid-call, for the immediate addition of those services to the call.


R21: Communication between the end participants SHOULD continue after the addition or removal of a text relay service, and the effect of the change should be limited in the users' perception to the direct effect of having or not having the transcoding service in the connection.


R22: When setting up a session, it MUST be possible for a user to specify the type of relay service requested (e.g., speech to text or text to speech). The specification of a type of relay SHOULD include a language specifier.


R23: It SHOULD be possible to route the session to a preferred relay service even if the user invokes the session from another region or network than that usually used.


R24: It is RECOMMENDED that ToIP implementations make the invocation and use of relay services as easy as possible.


5.2.4. Presentation and User Control Requirements

A user should never be in doubt about the status of the session, even if the user is unable to make use of the audio or visual indication. For example, tactile indications could be used by deaf-blind individuals.


R25: User Agents for ToIP services MUST have alerting methods (e.g., for incoming sessions) that can be used by deaf and hard-of-hearing people or provide a range of alternative, but equivalent, alerting methods that can be selected by all users, regardless of their abilities.


R26: Where real-time text is used in conjunction with other media, exposure of user control functions through the User Interface needs to be done in an equivalent manner for all supported media. For example, it must be possible for the user to select between audio, visual, or tactile prompts, or all must be supplied.


R27: If available, identification of the originating party (e.g., in the form of a URI or a Calling Line Identification (CLI)) MUST be clearly presented to the user in a form suitable for the user BEFORE the session invitation is answered.


R28: When a session invitation involving ToIP originates from a Public Switched Telephone Network (PSTN) text telephone (e.g., transcoded via a text gateway), this SHOULD be indicated to the user. The ToIP client MAY adjust the presentation of the real-time text to the user as a consequence.


R29: An indication SHOULD be given to the user when real-time text is available during the call, even if it is not invoked at call setup (e.g., when only voice and/or video is used initially).


R30: The user MUST be informed of any change in modalities.


R31: Users MUST be presented with appropriate session progress information at all times.


R32: Systems for ToIP SHOULD support an answering machine function, equivalent to answering machines on telephony networks.


R33: If an answering machine function is supported, it MUST support at least 160 characters for the greeting message. It MUST support incoming text message storage of a minimum of 4096 characters, although systems MAY support much larger storage. It is RECOMMENDED that systems support storage of at least 20 incoming messages of up to 16000 characters per message.


R34: When the answering machine is activated, user alerting SHOULD still take place. The user SHOULD be allowed to monitor the auto-answer progress, and where this is provided, the user SHOULD be allowed to intervene during any stage of the answering machine procedure and take control of the session.


R35: It SHOULD be possible to save the text portion of a conversation.


R36: The presentation of the conversation SHOULD be done in such a way that users can easily identify which party generated any given portion of text.


R37: ToIP SHOULD handle characters such as new line, erasure, and alerting during a session as specified in ITU-T T.140 [8].
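
As an illustration of R37, the sketch below applies received characters to a simple line-based display. The code points used (BEL U+0007 for alerting, BS U+0008 for erasure, LINE SEPARATOR U+2028 for new line) reflect a common reading of ITU-T T.140 and should be checked against that Recommendation; the function `apply_received_text` is hypothetical. The sketch is simplified: it does not erase across a line boundary and treats each Unicode code point as one character.

```python
BEL = "\u0007"             # alert in session (assumed T.140 code point)
BS = "\u0008"              # erase last character (assumed T.140 code point)
LINE_SEPARATOR = "\u2028"  # new line (assumed T.140 code point)


def apply_received_text(display, incoming):
    """Apply received characters to the display lines.

    display  -- list of lines, starting as [""]; modified in place
    incoming -- string of received characters
    Returns the number of alert characters seen.
    """
    alerts = 0
    for ch in incoming:
        if ch == LINE_SEPARATOR:
            display.append("")                  # start a new line
        elif ch == BS:
            if display[-1]:
                display[-1] = display[-1][:-1]  # erase the last character
        elif ch == BEL:
            alerts += 1                         # caller decides how to alert
        else:
            display[-1] += ch
    return alerts


lines = [""]
n = apply_received_text(lines, "Helo" + BS + "lo" + LINE_SEPARATOR + "ok" + BEL)
print(lines, n)
```

Returning the alert count rather than beeping leaves the choice of alerting method (audio, visual, or tactile) to the User Agent, in the spirit of R25.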

5.2.5. Interworking Requirements

There is a range of existing real-time text services. There is also a range of network technologies that could support real-time text services.


Real-time/interactive texting facilities exist already in various forms and on various networks. In the PSTN, they are commonly referred to as text telephony.


Text gateways are used for converting between different protocols for text conversation. They can be used between networks or within networks where different transport technologies are used.


R38: ToIP SHOULD provide interoperability with text conversation features in other networks, for instance the PSTN.


R39: When communicating via a gateway to other networks and protocols, the ToIP service SHOULD support the functionality for alternating or simultaneous use of modalities as offered by the interworking network.


R40: Calling party identification information, such as CLI, MUST be passed by gateways and converted to an appropriate form, if required.


R41: When interworking with other networks and services, the ToIP service SHOULD provide buffering mechanisms to deal with delays in call setup and with differences in transmission speeds, and/or to interwork with half-duplex services.
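
The buffering described in R41 can be sketched as a queue that holds text arriving from the IP side while the half-duplex side is busy, and that releases it in limited amounts to match a slower legacy line. This is only an assumption about how such a mechanism might look; the class `HalfDuplexGatewayBuffer`, its methods, and the per-call character budget are hypothetical.

```python
from collections import deque


class HalfDuplexGatewayBuffer:
    """Holds text from the IP side until the half-duplex side can take it."""

    def __init__(self):
        self.queue = deque()          # text waiting to be sent to the legacy side
        self.remote_sending = False   # True while the legacy side is transmitting

    def from_ip_side(self, text):
        self.queue.append(text)

    def set_remote_sending(self, sending):
        self.remote_sending = sending

    def drain(self, max_chars):
        """Return up to max_chars of queued text if the line is free."""
        if self.remote_sending:
            return ""                 # sending now could corrupt the received text
        out = []
        budget = max_chars
        while self.queue and budget > 0:
            chunk = self.queue.popleft()
            head, rest = chunk[:budget], chunk[budget:]
            out.append(head)
            if rest:
                self.queue.appendleft(rest)   # keep the remainder for later
            budget -= len(head)
        return "".join(out)


gw = HalfDuplexGatewayBuffer()
gw.from_ip_side("hello")
gw.from_ip_side(" world")
gw.set_remote_sending(True)
print(repr(gw.drain(4)))   # nothing is sent while the legacy side transmits
gw.set_remote_sending(False)
print(repr(gw.drain(4)))
```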

PSTN Interworking Requirements

Analog text telephony is used in many countries, mainly by deaf, hard-of-hearing and speech-impaired individuals.


R42: ToIP services MUST provide interworking with PSTN legacy text telephony devices.


R43: When interworking with PSTN legacy text telephony services, alternating text and voice function MAY be supported (called "voice carry over" (VCO) and "hearing carry over" (HCO)).

Cellular Interworking Requirements

As mobile communications have been adopted widely, various solutions for real-time texting while on the move were developed. ToIP services should provide interworking with such services as well.


Alternative means of transferring the text telephony data were developed when TTY services over cellular were mandated by the FCC in the USA. They are a) the "No-gain" codec solution and b) the Cellular Text Telephone Modem (CTM) solution [7], collectively called the "Baudot mode" solution in the USA.


The GSM and 3G standards from 3GPP make use of the CTM modem in the voice channel for text telephony. However, implementations also exist that use the data channel to provide such functionality. Interworking with these solutions should be done using text gateways that set up the data channel connection at the GSM side and provide ToIP at the other side.


R44: A ToIP service SHOULD provide interworking with mobile text conversation services.

Instant Messaging Interworking Requirements

Many people use Instant Messaging to communicate via the Internet using text. Instant Messaging usually transfers blocks of text rather than streaming as is used by ToIP. Usually a specific action is required by the user to activate transmission, such as pressing the ENTER key or a send button. As such, it is not a replacement for ToIP; in particular, it does not meet the needs for real-time conversations including those of deaf, hard-of-hearing, and speech-impaired users as defined in RFC 3351 [22]. It is less suitable for communications through a relay service [24].


The streaming nature of ToIP provides a more direct conversational user experience and, when given the choice, users may prefer ToIP.


R45: A ToIP service MAY provide interworking with Instant Messaging services.


6. Implementation Framework

This section describes an implementation framework for ToIP that meets the requirements and offers the functionality as set out in Section 5. The framework presented here uses existing standards that are already commonly used for voice-based conversational services on IP networks.


6.1. General Implementation Framework

This framework specifies the use of the Session Initiation Protocol (SIP) [2] to set up, control, and tear down the connections between ToIP users whilst the media is transported using the Real-Time Transport Protocol (RTP) [3] as described in RFC 4103 [4].


RFC 4504 describes how to implement support for real-time text in SIP telephony devices [23].


6.2. Detailed Implementation Framework
6.2.1. Session Control and Setup

ToIP services MUST use the Session Initiation Protocol (SIP) [2] for setting up, controlling, and terminating sessions for real-time text conversation with one or more participants and possibly including other media like video or audio. The Session Description Protocol (SDP) used in SIP to describe the session is used to express the attributes of the session and to negotiate a set of compatible media types.


SIP [2] allows participants to negotiate all media, including real-time text conversation [4]. ToIP services can provide the ability to set up conversation sessions from any location as well as provision for privacy and security through the application of standard SIP techniques.

Pre-Session Setup

The requirements of the user to be reached at a consistent address and to store preferences for evaluation at session setup are met by pre-session setup actions. That includes storing of registration information in the SIP registrar to provide information about how a user can be contacted. This will allow sessions to be set up rapidly and with proper routing and addressing.


The need to use real-time text as a medium of communications can be expressed by users during registration time. Two situations need to be considered in the pre-session setup environment:


a. User Preferences: It MUST be possible for a user to indicate a preference for real-time text by registering that preference with a SIP server that is part of the ToIP service.


b. Server Support of User Preferences: SIP servers that support ToIP services MUST have the capability to act on calling user preferences for real-time text in order to accept or reject the session. The actions taken can be based on the called user's preferences defined as part of the pre-session setup registration. For example, if the user is called by another party, and it is determined that a transcoding server is needed, the session should be redirected or otherwise handled accordingly.


The ability to include a transcoding service MUST NOT require user registration in any specific SIP registrar, but MAY require authorisation of the SIP registrar to invoke the service.


A point-to-point session takes place between two parties. For ToIP, one or both of the communicating parties will indicate real-time text as a possible or preferred medium for conversation using SIP in the session setup.


The following features MAY be implemented to facilitate the session establishment using ToIP:


a. Caller Preferences: SIP headers (e.g., Contact) [10] can be used to show that real-time text is the medium of choice for communications.


b. Called Party Preferences [11]: The called party, being passive, can formulate a clear rule indicating how a session should be handled, either using real-time text as a preferred medium or not, and whether this session needs to be handled by a designated SIP proxy or the SIP User Agent.


c. SIP Server Support for User Preferences: It is RECOMMENDED that SIP servers also handle the incoming sessions in accordance with preferences expressed for real-time text. The SIP server can also enforce ToIP policy rules for communications (e.g., use of the transcoding server for ToIP).
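
As a rough illustration of item a, a User Agent could advertise real-time text by attaching the "text" media feature tag of RFC 3840 [10] to its Contact header field. The sketch below only formats the header line; the function name contact_with_text and the example address are hypothetical and not part of this framework.

```python
def contact_with_text(sip_uri: str, audio: bool = False) -> str:
    """Format a Contact header field carrying RFC 3840 media feature tags.

    Hypothetical helper for illustration only; a real User Agent would
    add the tags through its SIP stack rather than by string formatting.
    """
    params = ["text"]  # capability: streaming (real-time) text
    if audio:
        params.append("audio")
    return "Contact: <{}>;{}".format(sip_uri, ";".join(params))


if __name__ == "__main__":
    print(contact_with_text("sip:alice@example.com"))
    print(contact_with_text("sip:alice@example.com", audio=True))
```

A registrar or proxy that evaluates called party preferences [11] could then match incoming requests against these tags.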

Session Negotiations

The Session Description Protocol (SDP) used in SIP [2] provides the capabilities to indicate real-time text as a medium in the session setup. RFC 4103 [4] uses the RTP payload types "text/red" and "text/t140" for support of ToIP, which can be indicated in the SDP as a part of the SIP INVITE, OK, and SIP/200/ACK media negotiations. In addition, SIP's offer/answer model [12] can also be used in conjunction with other capabilities, including the use of a transcoding server for enhanced session negotiations [28,29,13].


6.2.2. Transport

ToIP services MUST support the Real-Time Transport Protocol (RTP) [3] according to the specification of RFC 4103 [4] for the transport of real-time text between participants.


RFC 4103 describes the transmission of T.140 [8] real-time text on IP networks.


In order to enable the use of international character sets, the transmission format for real-time text conversation SHALL be UTF-8 [14], in accordance with ITU-T T.140.


If real-time text is detected to be missing after transmission, there SHOULD be a "text loss" indication in the real-time text as specified in T.140 Addendum 1 [8].


The redundancy method of RFC 4103 [4] SHOULD be used to significantly increase the reliability of the real-time text transmission. A redundancy level using 2 generations gives very reliable results and is therefore strongly RECOMMENDED.
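
To make the idea of "2 generations" concrete, the following sketch assembles a redundant payload in which the two most recently sent text blocks travel again ahead of the new (primary) block. It assumes the generic RTP redundancy header layout (RFC 2198) on which RFC 4103 builds, which is not spelled out in this document, and the dynamic payload type 98 for t140. The function name is hypothetical, and RTP header construction is left out.

```python
import struct
from typing import List, Tuple

T140_PT = 98  # assumed dynamic payload type for t140


def build_red_payload(primary: str, redundant: List[Tuple[int, str]]) -> bytes:
    """Build a redundancy payload: block headers, redundant data, primary data.

    `redundant` holds (timestamp offset, text) pairs, oldest generation first.
    Each redundant header is 4 octets: F=1, 7-bit payload type, 14-bit
    timestamp offset, 10-bit block length. The final header is 1 octet:
    F=0 and the payload type of the primary block.
    """
    headers = b""
    bodies = b""
    for offset, text in redundant:
        data = text.encode("utf-8")
        if offset >= (1 << 14) or len(data) >= (1 << 10):
            raise ValueError("offset or block length does not fit the header")
        word = (1 << 31) | (T140_PT << 24) | (offset << 10) | len(data)
        headers += struct.pack("!I", word)
        bodies += data
    headers += bytes([T140_PT])  # F bit clear marks the last (primary) header
    return headers + bodies + primary.encode("utf-8")


if __name__ == "__main__":
    # "a" was first sent 600 ms ago, "b" 300 ms ago, "c" is new.
    print(build_red_payload("c", [(600, "a"), (300, "b")]).hex())
```

With this arrangement a receiver can recover a block from either of the next two packets if the packet that first carried it is lost.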


In order to avoid exceeding the capabilities of the sender, receiver, or network (congestion), the transmission rate SHOULD be kept at or below 30 characters per second, which is the default maximum rate specified in RFC 4103 [4]. Lower rates MAY be negotiated when needed through the "cps" parameter as specified in RFC 4103 [4].
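
A sender needs to know which rate limit applies before it paces its output. The sketch below reads a "cps" value from the fmtp attribute of the t140 payload type and falls back to the default of 30 when none is present. The attribute form "a=fmtp:98 cps=20" and the function name are assumptions made for illustration.

```python
import re

DEFAULT_CPS = 30  # default maximum rate quoted above from RFC 4103


def max_cps(sdp: str, t140_pt: int = 98) -> int:
    """Return the cps value declared for the t140 payload type, else the default."""
    pattern = r"^a=fmtp:{}\s+(?:.*;)?\s*cps=(\d+)".format(t140_pt)
    match = re.search(pattern, sdp, flags=re.MULTILINE)
    return int(match.group(1)) if match else DEFAULT_CPS


if __name__ == "__main__":
    offer = (
        "m=text 11000 RTP/AVP 98\r\n"
        "a=rtpmap:98 t140/1000\r\n"
        "a=fmtp:98 cps=20\r\n"
    )
    print(max_cps(offer))
```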


Real-time text capability is announced in SDP by a declaration similar to this example:


   m=text 11000 RTP/AVP 100 98
   a=rtpmap:98 t140/1000
   a=rtpmap:100 red/1000
   a=fmtp:100 98/98/98
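
The declaration above can be generated from a few parameters. The following sketch, with a hypothetical function name and defaults taken from the example, shows how the fmtp line lists the t140 payload type once for the primary data and once for each redundant generation.

```python
def text_media_section(port: int, t140_pt: int = 98, red_pt: int = 100,
                       redundant_generations: int = 2) -> str:
    """Return the SDP media section announcing real-time text with redundancy."""
    # One entry for the primary block plus one per redundant generation.
    generations = "/".join([str(t140_pt)] * (redundant_generations + 1))
    lines = [
        "m=text {} RTP/AVP {} {}".format(port, red_pt, t140_pt),
        "a=rtpmap:{} t140/1000".format(t140_pt),
        "a=rtpmap:{} red/1000".format(red_pt),
        "a=fmtp:{} {}".format(red_pt, generations),
    ]
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    print(text_media_section(11000), end="")
```

Listing the redundant format first on the m= line expresses a preference for it while still offering plain t140.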

By having this single coding and transmission scheme for real-time text defined in the SIP session control environment, the opportunity for interoperability is optimized. However, if good reasons exist, other transport mechanisms MAY be offered and used for the T.140-coded text, provided that proper negotiation is introduced, but the RFC 4103 [4] transport MUST be used as both the default and the fallback transport.


6.2.3. Transcoding Services

Invocation of a transcoding service MAY happen automatically when the session is being set up based on any valid indication or negotiation of supported or preferred media types. A transcoding framework document using SIP [28] describes invoking relay services, where the relay acts as a conference bridge or uses the third-party control mechanism. ToIP implementations SHOULD support this transcoding framework.


6.2.4. Presentation and User Control Functions
Progress and Status Information

Session progress information SHOULD use simple language so that as many users as possible can understand it. The use of jargon or ambiguous terminology SHOULD be avoided. It is RECOMMENDED that text information be used together with icons to symbolise the session progress information.


In summary, it SHOULD be possible to observe indicators about:


- Incoming session

- Availability of real-time text, voice, and video channels

- Session progress

- Incoming real-time text

- Any loss in incoming real-time text

- Typed and transmitted real-time text

Alerting

For users who cannot use the audible alerter for incoming sessions, it is RECOMMENDED to include a tactile, as well as a visual, indicator.


Among the alerting options are alerting by the User Agent's User Interface and specific alerting User Agents registered to the same registrar as the main User Agent.


It should be noted that external alerting systems exist and one common interface for triggering the alerting action is a contact closure between two conductors.

Text Presentation

Requirement R32 states that, in the display of text conversations, users must be able to distinguish easily between different speakers. This could be done using color, positioning of the text (i.e., incoming real-time text and outgoing real-time text in different display areas), in-band identifiers of the parties, or a combination of any of these techniques.

File Storage

Requirement R31 recommends that ToIP systems allow the user to save text conversations. This SHOULD be done using a standard file format. For example: a UTF-8 text file in XHTML format [15], including timestamps, party names (or addresses), and the conversation text.


6.2.5. Interworking Functions

A number of systems for real-time text conversation already exist as well as a number of message-oriented text communication systems. Interoperability is of interest between ToIP and some of these systems.


Interoperation of half-duplex and full-duplex protocols, and between protocols that have different data rates, may require text buffering. Some intelligence will be needed to determine when to change direction when operating in half-duplex mode. Identification may be required of half-duplex operation either at the "user" level (i.e., users must inform each other) or at the "protocol" level (where an indication must be sent back to the gateway). However, special care needs to be taken to provide the best possible real-time performance.


Buffering schemes SHOULD be dimensioned to adjust for receiving at 30 characters per second and transmitting at 6 characters per second for up to 4 minutes (i.e., less than 3000 characters).
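
A gateway buffer of this kind can be pictured as a bounded queue that accepts text as it arrives and releases it at the slower outgoing rate. The sketch below uses the 6 characters per second and 3000-character figures from the paragraph above as defaults. The class name and the policy of dropping text on overflow are assumptions; a real gateway would also need to signal such loss to the user.

```python
from collections import deque


class PacedBuffer:
    """FIFO that accepts text at any rate and releases it at a fixed rate."""

    def __init__(self, out_cps: int = 6, capacity: int = 3000):
        self.out_cps = out_cps
        self.capacity = capacity
        self._queue = deque()

    def push(self, text: str) -> int:
        """Queue characters; return how many were dropped for lack of room."""
        room = self.capacity - len(self._queue)
        self._queue.extend(text[:room])
        return max(0, len(text) - room)

    def release(self, seconds: int) -> str:
        """Return the characters that may be sent during `seconds` seconds."""
        count = min(len(self._queue), self.out_cps * seconds)
        return "".join(self._queue.popleft() for _ in range(count))


if __name__ == "__main__":
    buf = PacedBuffer()
    buf.push("Hello from the IP side")
    print(repr(buf.release(1)))  # first second of output toward the textphone
```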


When converting between simultaneous voice and text on the IP side, and alternating voice and text on the other side of a gateway, a conflict can occur if the IP user transmits both audio and text at the same time. In such situations, text transmission SHOULD have precedence, so that while text is transmitted, audio is lost.


Transcoding of text to and from other coding formats may need to take place in gateways between ToIP and other forms of text conversation, for example, to connect to a PSTN text telephone.


Session setup through gateways to other networks may require the use of specially formatted addresses or other mechanisms for invoking those gateways.


ToIP interworking requires a method to invoke a text gateway. These text gateways act as User Agents at the IP side. The capabilities of the gateway during the call will be determined by the call capabilities of the terminal that is using the gateway. For example, a PSTN textphone is generally only able to receive voice and real-time text, so the gateway will only allow ToIP and audio.


Examples of possible scenarios for invocation of the text gateway are:


a. PSTN textphone users dial a prefix number before dialing out.

b. Separate real-time text subscriptions, linked to the phone number or terminal identifier/IP address.

c. Real-time text capability indicators.

d. Real-time text preference indicators.

e. Listen for V.18 modem modulation text activity in all PSTN calls and routing of the call to an appropriate gateway.

f. Call transfer request by the called user.

g. Placing a call via the Web, and using one of the methods described here.

h. A text gateway with its own telephone number and/or SIP address (this requires user interaction with the gateway to place a call).

i. ENUM address analysis and number plan.

j. Number or address analysis leads to a gateway for all PSTN calls.

PSTN Interworking

Analog text telephony is cumbersome because of incompatible national implementations where interworking was never considered. A large number of these implementations have been documented in ITU-T V.18 [16], which also defines the modem detection sequences for the different text protocols. In rare cases, the modem type identification may take considerable time, depending on user actions.


To resolve analog textphone incompatibilities, text telephone gateways are needed to transcode incoming analog signals into T.140 and vice versa. The modem capability exchange time can be reduced by the text telephone gateways initially assuming the analog text telephone protocol used in the region where the gateway is located. For example, in the USA, Baudot [25] might be tried as the initial protocol. If negotiation for Baudot fails, the full V.18 modem capability exchange will take place. In the UK, ITU-T V.21 [26] might be the first choice.


In particular, transmission of real-time text on PSTN networks takes place using a variety of codings and modulations, including ITU-T V.21 [26], Baudot [25], dual-tone multi-frequency (DTMF), V.23 [27], and others. Many difficulties have arisen as a result of this variety in text telephony protocols and the ITU-T V.18 [16] standard was developed to address some of these issues.


ITU-T V.18 [16] offers a native text telephony method, plus it defines interworking with current protocols. In the interworking mode, it will recognise one of the older protocols and fall back to that transmission method when required.


Text gateways MUST use the ITU-T V.18 [16] standard at the PSTN side. A text gateway MUST act as a SIP User Agent on the IP side and support RFC 4103 real-time text transport.


While ToIP allows real-time text to be sent and received simultaneously and displayed on a split screen, many analog text telephones require users to take turns typing. This is because many text telephones operate strictly half duplex: only one party can transmit text at a time. The users apply strict turn-taking rules.


There are several text telephones which communicate in full duplex, but merge transmitted text and received text in the same line in the same display window. Here too the users apply strict turn taking rules.


Native V.18 text telephones support full duplex and separate display of received and transmitted text so that the full duplex capability can be used fully. Such devices could use the ToIP split screen as well, but almost all text telephones use a restricted character set and many use low text transmission speeds (4 to 7 characters per second).


That is why it is important for the ToIP user to know that he or she is connected with an analog text telephone. The session description [9] SHOULD contain an indication that the other endpoint for the call is a PSTN textphone (e.g., connected via an ATA or through a text gateway). This means that the textphone user may be used to formal turn taking during the call.

Mobile Interworking

Mobile wireless (or cellular) circuit switched connections provide a digital real-time transport service for voice or data. The access technologies include GSM, CDMA, TDMA, iDen, and various 3G technologies, as well as WiFi or WiMAX.


ToIP may be supported over the cellular wireless packet-switched service. It interfaces to the Internet.


The following sections describe how mobile text telephony is supported.

Cellular "No-gain"

The "No-gain" text telephone transporting technology uses specially modified Enhanced Full Rate (EFR) [17] and Enhanced Variable Rate (EVR) [18] speech vocoders in mobile terminals used to provide a text telephony call. It provides full duplex operation and supports alternating between voice and text ("VCO/HCO"). It is dedicated to CDMA and TDMA mobile technologies and the US Baudot (i.e., 45 bit/s) type of text telephones.

Cellular Text Telephone Modem (CTM)

CTM [7] is a technology-independent modem technology that provides the transport of text telephone characters at up to 10 characters/sec using modem signals that can be carried by many voice codecs and uses a highly redundant encoding technique to overcome the fading and cell changing losses.

Cellular "Baudot mode"

This term is often used by cellular terminal suppliers for a cellular phone mode that allows TTYs to operate into a cellular phone and to communicate with a fixed-line TTY. Thus it is a common name for the "No-Gain" and the CTM solutions when applied to the Baudot-type textphones.

Mobile Data Channel Mode

Many mobile terminals allow the use of the circuit-switched data channel to transfer data in real time. Data rates of 9600 bit/s are usually supported on the 2G mobile network. Gateways provide interoperability with PSTN textphones.

Mobile ToIP

ToIP could be supported over mobile wireless packet-switched services that interface to the Internet. For 3GPP 3G services, ToIP support is described in 3G TS 26.235 [19].

Instant Messaging Interworking

Text gateways MAY be used to allow interworking between Instant Messaging systems and ToIP solutions. Because Instant Messaging is based on blocks of text, rather than on a continuous stream of characters like ToIP, gateways MUST transcode between the two formats. Text gateways for interworking between Instant Messaging and ToIP MUST apply a procedure for bridging the different conversational formats of real-time text versus text messaging. The following advice may improve user experience for both parties in a call through a messaging gateway.


a. Concatenate individual characters originating at the ToIP side into blocks of text.

b. When the length of the concatenated message becomes longer than 50 characters, the buffered text SHOULD be transmitted to the Instant Messaging side as soon as any non-alphanumerical character is received from the ToIP side.

c. When a new line indicator is received from the ToIP side, the buffered characters up to that point, including the carriage return and/or line-feed characters, SHOULD be transmitted to the Instant Messaging side.

d. When the ToIP side has been idle for at least 5 seconds, all buffered text up to that point SHOULD be transmitted to the Instant Messaging side.

e. Text Gateways must be capable of maintaining the real-time performance for ToIP while providing the interworking services.
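
Rules a through d can be read as a small state machine on the ToIP-to-messaging path. The sketch below is one possible reading, not a definitive implementation: the class and method names are hypothetical, the caller supplies timestamps in seconds, and line feed and the Unicode line separator are assumed to be the new line indicators (a lone carriage return is simply buffered until the next trigger).

```python
class ImBlockAssembler:
    """Collect ToIP characters into blocks for an Instant Messaging side."""

    MAX_LEN = 50       # rule b threshold, in characters
    IDLE_SECONDS = 5   # rule d threshold

    def __init__(self):
        self._buffer = ""
        self._last_input = None

    def feed(self, char, now):
        """Add one character received at time `now`; return a block or None."""
        self._buffer += char              # rule a: concatenate
        self._last_input = now
        if char in "\n\u2028":
            return self._flush()          # rule c: new line indicator
        if len(self._buffer) > self.MAX_LEN and not char.isalnum():
            return self._flush()          # rule b: long text, break at non-alphanumeric
        return None

    def poll(self, now):
        """Call periodically; flush once the ToIP side has been idle (rule d)."""
        if self._buffer and now - self._last_input >= self.IDLE_SECONDS:
            return self._flush()
        return None

    def _flush(self):
        block, self._buffer = self._buffer, ""
        return block


if __name__ == "__main__":
    assembler = ImBlockAssembler()
    for position, character in enumerate("Hello\n"):
        block = assembler.feed(character, now=position * 0.2)
        if block is not None:
            print(repr(block))
```

Passing the time in explicitly keeps the sketch deterministic; a gateway would typically drive poll() from a timer.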

It is RECOMMENDED that during the session, both users be constantly updated on the progress of the text input. Many Instant Messaging protocols signal that a user is typing to the other party in the conversation. Text gateways between such Instant Messaging protocols and ToIP MUST provide this signalling to the Instant Messaging side when characters start being received, or at the beginning of the conversation.


At the ToIP side, an indicator of writing the Instant Message MUST be present where the Instant Messaging protocol provides one. For example, the real-time text user MAY see ". . . waiting for replying IM. . . " and when 5 seconds have passed another . (dot) can be shown.


Those solutions will reduce the difficulties between streaming and blocked text services.


Even though the text gateway can connect Instant Messaging and ToIP, the best solution is to take advantage of the fact that the user interfaces and the user communities for instant messaging and ToIP telephony are very similar. After all, the character input, character display, Internet connectivity, and SIP stack can be the same for Instant Messaging (SIMPLE) and ToIP. Thus, the user may simply use different applications for ToIP and text messaging in the same terminal.


Devices that implement Instant Messaging SHOULD implement ToIP as described in this document so that a more complete text communication service can be provided.

Multi-Functional Combination Gateways

In practice, many interworking gateways will be implemented as gateways that combine different functions. As such, a text gateway could be built to have modems to interwork with the PSTN and support both Instant Messaging as well as ToIP. Such interworking functions are called combination gateways.


Combination gateways could provide interworking between all of their supported text-based functions. For example, a text gateway that has modems to interwork with the PSTN and that supports both Instant Messaging and ToIP could support the following interworking functions:


- PSTN text telephony to ToIP

- PSTN text telephony to Instant Messaging

- Instant Messaging to ToIP

Character Set Transcoding

Gateways between the ToIP network and other networks MAY need to transcode text streams. ToIP makes use of the ISO 10646 character set. Most PSTN textphones use a 7-bit character set, or a character set that is converted to a 7-bit character set by the V.18 modem.


When transcoding between character sets and T.140 in gateways, special consideration MUST be given to the national variants of the 7-bit codes, with national characters mapping into different codes in the ISO 10646 code space. The national variant to be used could be selectable by the user on a per-call basis, or be configured as a national default for the gateway.


The indicator of missing text in T.140, specified in T.140 Addendum 1 [8], cannot be represented in the 7-bit character codes. Therefore the indicator of missing text SHOULD be transcoded to the ' (apostrophe) character in legacy text telephone systems, where this character exists. For legacy systems where the ' character does not exist, the . (full stop) character SHOULD be used instead.
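
The substitution described above is a one-character mapping. The sketch below assumes that the missing-text indicator is the Unicode replacement character U+FFFD, as used with RFC 4103; this document does not state the code point. The function name and the has_apostrophe flag are hypothetical, and mapping of national 7-bit variants is out of scope here.

```python
MISSING_TEXT = "\ufffd"  # assumed code point of the T.140 missing-text indicator


def to_legacy(text: str, has_apostrophe: bool = True) -> str:
    """Replace the missing-text indicator for a legacy 7-bit text telephone."""
    substitute = "'" if has_apostrophe else "."
    return text.replace(MISSING_TEXT, substitute)


if __name__ == "__main__":
    print(to_legacy("he\ufffdlo"))
    print(to_legacy("he\ufffdlo", has_apostrophe=False))
```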


7. Further Recommendations for Implementers and Service Providers
7.1. Access to Emergency Services

It must be possible to place an emergency call using ToIP and it must be possible to use a relay service in such a call. The emergency service provided to users utilising the real-time text medium must be equivalent to the emergency service provided to users utilising speech or other media.


A text gateway must be able to route real-time text calls to emergency service providers when any of the recognised emergency numbers that support text communications for the country or region are called, e.g., "911" in the USA and "112" in Europe. Routing real-time text calls to emergency services may require the use of a transcoding service.


A text gateway with cellular wireless packet-switched services must be able to route real-time text calls to emergency service providers when any of the recognized emergency numbers that support real-time text communication for the country is called.


7.2. Home Gateways or Analog Terminal Adapters

Analog terminal adapters (ATA) using SIP-based IP communication and RJ-11 connectors for connecting traditional PSTN devices SHOULD enable connection of legacy PSTN text telephones [23].


These adapters SHOULD contain V.18 modem functionality, voice handling functionality, and conversion functions to/from SIP-based ToIP with T.140 transported according to RFC 4103 [4], in a similar way as it provides interoperability for voice sessions.


If a session is set up and text/t140 capability is not declared by the destination endpoint (by the endpoint terminal or the text gateway in the network at the endpoint), a method for invoking a transcoding server SHALL be used. If no such server is available, the signals from the textphone MAY be transmitted in the voice channel as audio with a high quality of service.
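
The order of preference in this paragraph can be summarised as a three-way decision. The sketch below is only illustrative: the function name and the returned labels are hypothetical, and the check for t140 support is reduced to looking for an rtpmap line in the answer.

```python
import re


def choose_text_path(answer_sdp: str, transcoder_available: bool) -> str:
    """Pick how an adapter carries textphone traffic for one session."""
    declares_t140 = re.search(
        r"^a=rtpmap:\d+ t140/1000", answer_sdp,
        flags=re.MULTILINE | re.IGNORECASE,
    ) is not None
    if declares_t140:
        return "rfc4103"              # destination declares text/t140
    if transcoder_available:
        return "transcoding-server"   # invoke a transcoding server
    return "audio-passthrough"        # last resort: modem tones as audio


if __name__ == "__main__":
    answer = "m=text 11000 RTP/AVP 98\r\na=rtpmap:98 t140/1000\r\n"
    print(choose_text_path(answer, transcoder_available=False))
```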


NOTE: It is preferred that such analog terminal adaptors do use RFC 4103 [4] on board and thus act as a text gateway. Sending textphone signals over the voice channel is undesirable due to possible filtering and compression and packet loss between the endpoints. This can result in character loss in the textphone conversation or even not allowing the textphones to connect to each other.


7.3. User Mobility

ToIP User Agents SHOULD use the same mechanisms as other SIP User Agents to resolve mobility issues. It is RECOMMENDED that users use a SIP address, resolved by a SIP registrar, to enable basic user mobility. Further mechanisms are defined for all session types for 3G IP multimedia systems.


7.4. Firewalls and NATs

ToIP uses the same signalling and transport protocols as VoIP. Hence, the same firewall and NAT solutions and network functionality that apply to VoIP MUST also apply to ToIP.


7.5. Quality of Service

Where Quality of Service (QoS) mechanisms are used, the real-time text streams should be assigned appropriate QoS characteristics, so that the performance requirements can be met and the real-time text stream is not degraded unfavourably in comparison to voice performance in congested situations.


8. Security Considerations

User confidentiality and privacy need to be met as described in SIP [2]. For example, nothing should reveal in an obvious way the fact that the ToIP user might be a person with a hearing or speech impairment. It is up to the ToIP user to make his or her hearing or speech impairment public. If a transcoding server is being used, this SHOULD be as transparent as possible. However, it might still be possible to discern that a user might be hearing or speech impaired based on the attributes present in SDP, although the intention is that mainstream users might also choose to use ToIP. Encryption SHOULD be used on an end-to-end or hop-by-hop basis as described in SIP [2] and SRTP [20].


Authentication MUST be provided for users in addition to message integrity and access control.


Protection against Denial-of-Service (DoS) attacks needs to be provided, considering the case that the ToIP users might need transcoding servers.


9. Contributors

The following people contributed to this document: Willem Dijkstra, Barry Dingle, Gunnar Hellstrom, Radhika R. Roy, Henry Sinnreich, and Gregg C. Vanderheiden.


The content and concepts within are a product of the SIPPING Working Group. Tom Taylor (Nortel) acted as independent reviewer and contributed significantly to the structure and content of this document.


10. References
10.1. Normative References

[1] Bradner, S., Ed., "Intellectual Property Rights in IETF Technology", BCP 79, RFC 3979, March 2005.

[1] Bradner,S.,编辑,“IETF技术中的知识产权”,BCP 79,RFC 3979,2005年3月。

[2] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A., Peterson, J., Sparks, R., Handley, M., and E. Schooler, "SIP: Session Initiation Protocol", RFC 3261, June 2002.

[2] Rosenberg,J.,Schulzrinne,H.,Camarillo,G.,Johnston,A.,Peterson,J.,Sparks,R.,Handley,M.,和E.Schooler,“SIP:会话启动协议”,RFC 3261,2002年6月。

[3] Schulzrinne, H., Casner, S., Frederick, R., and V. Jacobson, "RTP: A Transport Protocol for Real-Time Applications", STD 64, RFC 3550, July 2003.

[4] Hellstrom, G. and P. Jones, "RTP Payload for Text Conversation", RFC 4103, June 2005.

[5] ITU-T Recommendation F.703, "Multimedia Conversational Services", November 2000.

[6] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[7] 3GPP TS 26.226, "Cellular Text Telephone Modem Description" (CTM).

[8] ITU-T Recommendation T.140, "Protocol for Multimedia Application Text Conversation" (February 1998) and Addendum 1 (February 2000).

[9] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session Description Protocol", RFC 4566, July 2006.

[10] Rosenberg, J., Schulzrinne, H., and P. Kyzivat, "Indicating User Agent Capabilities in the Session Initiation Protocol (SIP)", RFC 3840, August 2004.

[11] Rosenberg, J., Schulzrinne, H., and P. Kyzivat, "Caller Preferences for the Session Initiation Protocol (SIP)", RFC 3841, August 2004.

[12] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model with Session Description Protocol (SDP)", RFC 3264, June 2002.

[13] Camarillo, G., Burger, E., Schulzrinne, H., and A. van Wijk, "Transcoding Services Invocation in the Session Initiation Protocol (SIP) Using Third Party Call Control (3pcc)", RFC 4117, June 2005.

[14] Yergeau, F., "UTF-8, a transformation format of ISO 10646", STD 63, RFC 3629, November 2003.

[15] "XHTML 1.0: The Extensible HyperText Markup Language: A Reformulation of HTML 4 in XML 1.0", W3C Recommendation, Available at

[16] ITU-T Recommendation V.18, "Operational and Interworking Requirements for DCEs operating in Text Telephone Mode", November 2000.

   [17]  TIA/EIA/IS-823-A, "TTY/TDD Extension to TIA/EIA-136-410
         Enhanced Full Rate Speech Codec (must used in conjunction with

[18] TIA/EIA/IS-127-2, "Enhanced Variable Rate Codec, Speech Service Option 3 for Wideband Spread Spectrum Digital Systems, Addendum 2."

[19] "IP Multimedia default codecs", 3GPP TS 26.235

[20] Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K. Norrman, "The Secure Real-time Transport Protocol (SRTP)", RFC 3711, March 2004.

[21] ITU-T Recommendation F.700, "Framework Recommendation for Multimedia Services", November 2000.

10.2. Informative References

[22] Charlton, N., Gasson, M., Gybels, G., Spanner, M., and A. van Wijk, "User Requirements for the Session Initiation Protocol (SIP) in Support of Deaf, Hard of Hearing and Speech-impaired Individuals", RFC 3351, August 2002.

[23] Sinnreich, H., Ed., Lass, S., and C. Stredicke, "SIP Telephony Device Requirements and Configuration", RFC 4504, May 2006.

[24] European Telecommunications Standards Institute (ETSI), "Human Factors (HF); Guidelines for Telecommunication Relay Services for Text Telephones", TR 101 806, June 2000.

[25] TIA/EIA/825, "A Frequency Shift Keyed Modem for Use on the Public Switched Telephone Network." (The specification for 45.45 and 50 bit/s TTY modems.)

[26] International Telecommunication Union (ITU), "300 bits per second duplex modem standardized for use in the general switched telephone network", ITU-T Recommendation V.21, November 1988.

[27] International Telecommunication Union (ITU), "600/1200-baud modem standardized for use in the general switched telephone network", ITU-T Recommendation V.23, November 1988.

[28] Camarillo, G., "Framework for Transcoding with the Session Initiation Protocol", Work in Progress, May 2006.

[29] Camarillo, G., "The SIP Conference Bridge Transcoding Model", Work in Progress, January 2006.

Authors' Addresses


   Guido Gybels
   Department of New Technologies
   RNID, 19-23 Featherstone Street
   London EC1Y 8SL, UK

   Tel +44-20-7294 3713
   Txt +44-20-7296 8001 Ext 3713
   Fax +44-20-7296 8069

   Arnoud A. T. van Wijk
   Real-Time Text Taskforce (R3TF)


Full Copyright Statement


Copyright (C) The IETF Trust (2008).


This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights.



Intellectual Property


The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at


The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at