by Bob Andreasen
The sipXtapi SDK is a C application programming interface for voice communications over IP. Specifically, sipXtapi provides a generalized telephony interface on top of the Session Initiation Protocol (SIP), RFC 3261, and the Real-time Transport Protocol (RTP), RFC 1889. While the SIP and RTP protocols provide signaling and media transport infrastructure, sipXtapi also includes many other protocol and standards implementations needed for voice communications.
sipXtapi is developed under open source (LGPL) and hosted as part of the sipX line of projects available from SIPfoundry. For more information on open source licensing or SIPfoundry, please see www.opensource.org and www.sipfoundry.org.
Some of the technology used underneath sipXtapi was donated by Pingtel Corp (www.pingtel.com) in March of 2004 when they took their IP PBX and soft phone solutions open source. The technology base was and still is used for their proxy servers, media server, soft phone, and hard phone appliance. The technology is considered well tested and very interoperable with other SIP devices. Since then, SIPez has supported sipXtapi and has rewritten a majority of the media subsystem.
The primary objective for sipXtapi is to provide a simple programming interface for application developers. As standards-based protocols, such as SIP, become more popular, enterprises and independent software vendors will provide value-added solutions on top of a now-commodity voice infrastructure. sipXtapi is designed to enable this class of developers by providing a simple solution that abstracts many of the intricate details of SIP. Developers using sipXtapi do not need to understand the syntax and semantics of the underlying protocols and can focus on a more familiar call model.
Another objective was to build an API that is familiar to application developers in telephony. A number of call models exist today; however, Microsoft TAPI 2.2 was chosen as a conceptual base because of its popularity and its good separation of line or end point specific features from call center features. Porting a TAPI 2.2 application to sipXtapi requires changes, but is considered straightforward.
While building a full TAPI service provider model under Windows is an eventual goal, cross-platform usage is also a key objective. The sipXtapi SDK builds and runs under Windows, Linux, and MacOS. The underlying technologies have also been run on Solaris, embedded Linux, WinCE, and vxWorks. However, the OS portability layer (sipXportLib) has changed since it last ran in those environments, so they require additional porting work.
Easy, familiar, event-driven call control API
Rich call control feature set
Industry leading SIP support
Comprehensive configuration support
Commercial distributions are also available. SIPez LLC offers a commercial distribution of sipXtapi that integrates SIPez Media Engine for enhanced audio fidelity, video support, and greater codec selection.
Using sipXtapi as the base for a soft phone client is the most obvious and straightforward use. This API was developed to facilitate SIPfoundry’s next generation soft phone (sipXezPhone) and is heavily tested for that purpose. Basic telephony features are supported along with more advanced features such as client-side conference and transfer.
Developers are also using sipXtapi to add soft phone-like features to their existing products. This is a slightly different application than a traditional soft phone; however, the basics are identical. For example, a number of companies and at least one other open source community have used sipXtapi to add voice communications to their Instant Messaging clients.
sipXtapi is also targeted to server-based user agents. The API can be configured to avoid the use of local audio inputs and outputs. For example, the sipXtapi SDK is positioned to become the call processing engine behind SIPfoundry’s sipXvxml project. The sipXvxml project is a VoiceXML driven engine that provides SIP IVR functionality. Additionally, sipXtapi has been used to build a flexible ACD.
sipXtapi provides a quick method to add SIP to a legacy telephony product. The
API provides hooks for sourcing and consuming audio data. When combined in a
back-to-back user agent (B2BUA) configuration, developers have successfully
bridged legacy systems to SIP communication.
Virtually all of the sipXtapi API methods require one or more handles as function arguments. SIPX_ handles represent all of the data associated with a logical call, conference, line identity, or user agent instance. A brief description of each handle type is provided in Table 1 (below).
Table 1: sipXtapi Handles
Handle | Description |
---|---|
SIPX_INST | The SIPX_INST handle represents an instance of a user agent. A user agent includes a SIP stack and media processing framework. sipXtapi does support multiple instances of user agents in the same process space; however, certain media processing features become limited or ambiguous. For example, only one user agent should control the local system's input and output audio devices. |
SIPX_LINE | The SIPX_LINE handle represents an inbound or outbound identity. When placing outbound calls, the application programmer must define the outbound line. When receiving inbound calls, the application can query the line. |
SIPX_CALL | The SIPX_CALL handle represents a call or connection between the user agent and another party. All call operations require the call handle as a parameter. |
SIPX_CONF | The SIPX_CONF handle represents a collection of SIPX_CALLs that have bridged (mixed) audio. Application developers can manipulate each leg of the conference through various conference functions. |
SIPX_INFO | The SIPX_INFO handle represents a handle to an INFO message sent by a sipXtapi instance. INFO messages are useful for communicating information between user agents within a logical call. The SIPX_INFO handle is returned when sending an INFO message via sipxCallSendInfo(...). The handle is referenced as part of the EVENT_CATEGORY_INFO_STATUS event callback/observer. sipXtapi will automatically deallocate this handle immediately after the status callback. |
SIPX_PUB | The SIPX_PUB handle represents a publisher context. Publishers are used to publish application data to interested parties (Subscribers). This maps directly to the SIP SUBSCRIBE and NOTIFY methods. The handle is used to manage the life cycle of the publisher. |
SIPX_SUB | A SIPX_SUB handle represents a subscription to a remote publisher. This maps directly to the SIP SUBSCRIBE and NOTIFY methods. The handle is used to manage the life cycle of the subscription. |
Handle life cycles are managed both by the framework and the application developer. Most handles are explicitly created and destroyed by application developers. When receiving a new inbound call, a SIPX_CALL handle is implicitly created by the framework, but must be destroyed using sipxCallDestroy. In conferences, SIPX_CALLs are destroyed automatically if the remote end terminates (a special case).
All of the API functions in sipXtapi can be categorized in functional groups.
The grouping is derivable from the method names. For example,
“sipxCallAccept(…)”, “sipxCallReject(…)”, and “sipxCallRedirect(…)” are all
call related functions, while “sipxConferenceGetCalls(…)” is a conference
function. A brief summary of each functional area is provided in Table 2
(below).
Table 2: Functional Method Groups
Functional Area | Description |
---|---|
Config | sipXtapi includes a number of configuration settings that allow application developers to set the SIP proxy server, adjust timeout settings, and enable or disable specific SIP features such as symmetric signaling. Settings can be changed at any point; however, not all settings affect calls already in progress. |
Call | Call features include accepting, rejecting, and redirecting new inbound calls; answering, holding, muting, playing tones, playing audio files, and transferring active calls; and accessing called and caller ID. |
Line | sipXtapi provides methods to define lines (SIP identities). Lines are modeled after the LEDs found on key-system telephone handsets. Generally, lines represent both outside PSTN lines and inbound queues such as the sales and support queues. In sipXtapi, lines are defined in terms of SIP identities. Each identity is optionally configured to register with a SIP registrar. Authentication credentials are configured on a per line basis. This mechanism allows for both peer-to-peer environments where users set up calls using IP addresses or host names and central directory-oriented environments with authenticated clients registered with a well-known registrar. |
Audio | sipXtapi provides methods to enumerate audio devices, select the in-call speaker device, select the ringer device, and set the input device. Additionally, APIs are available for setting speaker volume and microphone gain levels. For servers, application developers may disable microphone and speakers. |
Conference | Ad hoc client-mixed conferences are set up and manipulated through a series of conferencing APIs. Application developers may add and remove conference participants and place individual conference participants on hold. |
Events | A callback mechanism (observer pattern) is used to communicate call state transitions to the application layer. Events are grouped into categories, and each category defines its own event types. A small number of major events allow for simple application state machines and streamlined processing. Minor events provide additional information and causes for major event transitions. |
Hooks | Application developers can "hook" audio sources and targets to consume or manipulate audio. This mechanism enables audio logging, audio injection, and audio capture. Additionally, this mechanism has been used to bridge non-SIP voice clients to SIP voice clients using a back-to-back user agent (B2BUA) approach. |
The sipXtapi API uses events to communicate state transitions to the application layer. Since many of the API calls are asynchronous, event notifications must be monitored both for application-initiated operations, such as placing a call, and for externally generated events, such as the remote party disconnecting. Descriptions of the major event categories are listed in Table 3.
Table 3: Event Categories
Category | Description |
---|---|
EVENT_CATEGORY_CALLSTATE | CALLSTATE events signify a change in the state of a call. States range from the notification of a new call, to ringing, to connection established, to changes in audio state (start sending, stop sending), to termination of a call. |
EVENT_CATEGORY_LINESTATE | LINESTATE events indicate changes in the status of a line appearance. Lines identify inbound and outbound identities and can be either provisioned (hardcoded) or configured to automatically register with a registrar. Lines also encapsulate the authentication criteria needed for dynamic registrations. |
EVENT_CATEGORY_INFO_STATUS | INFO_STATUS events are sent when the application requests sipXtapi to send an INFO message to another user agent. The status event includes the response for the INFO method. Application developers should look at this event to determine the outcome of the INFO message. |
EVENT_CATEGORY_INFO | INFO events are sent to the application whenever an INFO message is received by the sipXtapi user agent. INFO messages are sent to a specific call. sipXtapi will automatically acknowledge the INFO message at the protocol layer. |
EVENT_CATEGORY_SUB_STATUS | SUB_STATUS events are sent to the application layer for information on the subscription state (e.g. OK, Expired). |
EVENT_CATEGORY_NOTIFY | NOTIFY events are sent to the application layer after a remote publisher has sent data to the application. The application layer can retrieve the data from this event. |
EVENT_CATEGORY_CONFIG | CONFIG events signify changes in configuration. For example, when requesting STUN support, a notification is sent with the STUN outcome (either SUCCESS or FAILURE). |
EVENT_CATEGORY_SECURITY | SECURITY events signify occurrences in call security processing. These events are only sent when using S/MIME or TLS. |
EVENT_CATEGORY_MEDIA | MEDIA events signify changes in the audio state for sipXtapi or a particular call. |
Event handling is performed through a callback mechanism. The callback signature is shown in Figure 5.
Table 4: Major CALLSTATE events
Major Event | Description |
---|---|
CALLSTATE_NEWCALL | The NEWCALL event indicates that a new call has been created automatically by sipXtapi. This event is most frequently generated in response to an inbound call request. |
CALLSTATE_DIALTONE | The DIALTONE event indicates that a new call has been created for the purpose of placing an outbound call. The application layer should determine if it needs to simulate dial tone for the end user. |
CALLSTATE_REMOTE_OFFERING | The REMOTE_OFFERING event indicates that a call setup invitation has been sent to the remote party. The invitation may or may not ever receive a response. If a response is not received in a timely manner, sipXtapi will move the call into a disconnected state. If calling another sipXtapi user agent, the reciprocal state is OFFERING. |
CALLSTATE_REMOTE_ALERTING | The REMOTE_ALERTING event indicates that a call setup invitation has been accepted and the end user is in the alerting state (ringing). Depending on the SIP configuration, end points, and proxy servers involved, this state should last no longer than 3 minutes. Afterwards, the state will automatically move to DISCONNECTED. If calling another sipXtapi user agent, the reciprocal state is ALERTING. |
CALLSTATE_CONNECTED | The CONNECTED state indicates that a call has been set up between the local and remote party. Network audio should be flowing, and the microphone and speakers should be engaged. |
CALLSTATE_BRIDGED | The BRIDGED state indicates that a call is active; however, the local microphone/speaker are not engaged. If this call is part of a conference, the party will be able to talk with other BRIDGED conference parties. Application developers can still play and record media. |
CALLSTATE_HELD | The HELD state indicates that a call is both locally and remotely held. No network audio is flowing and the local microphone and speaker are not engaged. |
CALLSTATE_REMOTE_HELD | The REMOTE_HELD state indicates that the remote party is on hold. Locally, the microphone and speaker are still engaged; however, no network audio is flowing. |
CALLSTATE_DISCONNECTED | The DISCONNECTED state indicates that a call was disconnected or failed to connect. A call may move into the DISCONNECTED state from almost every other state. Please review the DISCONNECTED minor events to understand the cause. |
CALLSTATE_OFFERING | An OFFERING state indicates that a new call invitation has been extended to this user agent. Application developers should invoke sipxCallAccept(), sipxCallReject() or sipxCallRedirect() in response. Not responding will result in an implicit call to sipxCallReject(). |
CALLSTATE_ALERTING | An ALERTING state indicates that an inbound call has been accepted and the application layer should alert the end user. The alerting state is limited to 3 minutes in most configurations; afterwards the call will be canceled. Applications will generally play some sort of ringing tone in response to this event. |
CALLSTATE_DESTROYED | The DESTROYED event indicates the underlying resources have been removed for a call. This is the last event that the application will receive for any call. The call handle is invalid after this event is received. |
CALLSTATE_TRANSFER_EVENT | The transfer state indicates a state change in a transfer attempt. Please see the CALLSTATE_TRANSFER_EVENT cause codes for details on each state transition. |
CALLSTATE_UNKNOWN | An UNKNOWN event is generated when the state for a call is no longer known. This is generally an error condition; see the minor event for specific causes. |
In Figure 1, the state diagram depicts the typical life cycle for an outbound call. An event is sent to the application on each state transition. Details on each event can be found in Table 4 and the sipXtapi API documentation.
Figure 1: Events for an outbound call
In Figure 2, the state diagram describes the typical life cycle for an inbound call. The “OFFERING” event signals a request for a connection and the application developer can choose to accept the call, reject the call, or redirect the call. Note: Accepting the call is a precursor to alerting (or ringing) the user.
Figure 2: Events for an inbound call
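The typical event order for outbound and inbound calls can also be sketched as a small transition check. This is only an illustration: the enum and function names are hypothetical stand-ins rather than sipXtapi identifiers, and the allowed paths are an assumption drawn from the Table 4 descriptions. Real calls may skip states, for example when the remote party answers without a ringing indication.

```c
#include <stddef.h>

/* Hypothetical state names for illustration only; not the sipXtapi enum. */
typedef enum {
    DEMO_NEWCALL, DEMO_DIALTONE, DEMO_REMOTE_OFFERING, DEMO_REMOTE_ALERTING,
    DEMO_OFFERING, DEMO_ALERTING, DEMO_CONNECTED, DEMO_DISCONNECTED,
    DEMO_DESTROYED
} DemoCallState;

typedef struct { DemoCallState from; DemoCallState to; } DemoTransition;

/* Assumed "typical" forward paths, inferred from the Table 4 descriptions. */
static const DemoTransition kTypicalTransitions[] = {
    /* outbound call */
    { DEMO_DIALTONE,        DEMO_REMOTE_OFFERING },
    { DEMO_REMOTE_OFFERING, DEMO_REMOTE_ALERTING },
    { DEMO_REMOTE_ALERTING, DEMO_CONNECTED },
    /* inbound call */
    { DEMO_NEWCALL,         DEMO_OFFERING },
    { DEMO_OFFERING,        DEMO_ALERTING },
    { DEMO_ALERTING,        DEMO_CONNECTED },
    /* tear-down */
    { DEMO_DISCONNECTED,    DEMO_DESTROYED }
};

/* Returns 1 if moving from 'from' to 'to' follows the assumed typical path. */
int demoIsTypicalTransition(DemoCallState from, DemoCallState to)
{
    size_t i;

    /* Assumption: every live state may fall through to DISCONNECTED. */
    if (to == DEMO_DISCONNECTED)
        return from != DEMO_DISCONNECTED && from != DEMO_DESTROYED;

    for (i = 0; i < sizeof(kTypicalTransitions) / sizeof(kTypicalTransitions[0]); i++) {
        if (kTypicalTransitions[i].from == from && kTypicalTransitions[i].to == to)
            return 1;
    }
    return 0;
}
```

An application could use a check like this to log unexpected event sequences during development rather than to reject events outright.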
This example demonstrates how to set up a basic call, monitor states, and then clean up the call. The example assumes that the remote party will initiate the hang up.
Figure 3: sipXtapi Setup
1: SIPX_INST g_hInst; |
In Figure 3, lines 1 to 3 define global variables that are used
throughout the example: user agent instance, default line identity, and call
handle.
Line 5 initializes the user agent and specifies the default port settings. SIP_PORT
and TCP_PORT are traditionally 5060 and define the SIP signaling ports. RTP_START_PORT
defines the starting port for RTP audio traffic. sipXtapi will allocate two
adjacent audio ports (RTP & RTCP) for each call.
Line 6 adds a callback procedure for event notifications.
Lines 8 to 10 define a line identity, add authentication credentials for that line, and start the registration process.
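As a worked example of the port note for line 5: if each call takes one adjacent RTP/RTCP pair, the pairs can be laid out upward from RTP_START_PORT as sketched below. The helper and struct names are hypothetical, and the strictly sequential, in-order allocation is an assumption made for illustration; sipXtapi may reuse freed ports or skip ports that are already in use.

```c
/* Hypothetical helper for illustration; not part of sipXtapi. */
typedef struct {
    int rtpPort;   /* media (RTP) port for the call */
    int rtcpPort;  /* control (RTCP) port, adjacent to the RTP port */
} DemoMediaPorts;

/* Assumes the Nth call (callIndex counted from 0) takes the Nth port pair. */
DemoMediaPorts demoPortsForCall(int rtpStartPort, int callIndex)
{
    DemoMediaPorts ports;
    ports.rtpPort  = rtpStartPort + 2 * callIndex;
    ports.rtcpPort = ports.rtpPort + 1;
    return ports;
}
```

Under these assumptions, a start port of 9000 gives the third concurrent call (index 2) ports 9004 and 9005, which is useful when sizing a firewall port range for a given number of simultaneous calls.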
Figure 4: Placing outbound call
1: sipxCallCreate(g_hInst, g_hLine, &g_hCall); |
In Figure 4, a basic call is created and initiated to “sip:myfriend@example.com”. The line created in Figure 3 was specified and is used for the outbound call identity. Results from the connection attempt are delivered asynchronously through event callbacks. However, sipxCallConnect(…) may yield a non-successful return code if the address is malformed or if the domain name is invalid.
Figure 5: Call back signature
1: bool EventCallbackProc( SIPX_EVENT_CATEGORY category, |
Figure 5 provides a skeleton for an event callback. See Table 3 for descriptions of the major event categories. Application developers should not block the event callback thread; doing so can cause deadlocks and will slow down call processing. Instead, re-post these events to your own thread context for handling. The sipxDuplicateEvent and sipxFreeDuplicatedEvent methods are available to copy the event callback data, which is otherwise only available for the duration of the callback. For example, upon receiving the callback, copy the data using sipxDuplicateEvent(...), post the copy to your own thread, process it there, and lastly free the event data using sipxFreeDuplicatedEvent(...).
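The copy, post, process, and free sequence can be sketched with a simple queue. This is a minimal illustration under stated assumptions: DemoEvent is a hypothetical stand-in for the callback payload, the plain struct copy stands in for sipxDuplicateEvent(...), and free() stands in for sipxFreeDuplicatedEvent(...). Locking is deliberately omitted; a real application would need a mutex or similar around the queue, because the callback and the consumer run on different threads.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the callback payload; not a sipXtapi type. */
typedef struct {
    int category;
    int major;
    int minor;
} DemoEvent;

typedef struct DemoNode {
    DemoEvent event;
    struct DemoNode* next;
} DemoNode;

static DemoNode* g_queueHead = NULL;  /* oldest queued event */
static DemoNode* g_queueTail = NULL;  /* newest queued event */

/* Called from the event callback: copy the data, enqueue it, return at once.
 * Returns 1 on success, 0 if memory could not be allocated. */
int demoPostEvent(const DemoEvent* pEvent)
{
    DemoNode* node = (DemoNode*) malloc(sizeof(DemoNode));
    if (node == NULL)
        return 0;
    node->event = *pEvent;  /* copy: the original is only valid during the callback */
    node->next = NULL;
    if (g_queueTail != NULL)
        g_queueTail->next = node;
    else
        g_queueHead = node;
    g_queueTail = node;
    return 1;
}

/* Called from the application's own thread: pop the oldest event into *pOut
 * and free the queued copy. Returns 1 if an event was returned, 0 if empty. */
int demoPopEvent(DemoEvent* pOut)
{
    DemoNode* node = g_queueHead;
    if (node == NULL)
        return 0;
    *pOut = node->event;
    g_queueHead = node->next;
    if (g_queueHead == NULL)
        g_queueTail = NULL;
    free(node);
    return 1;
}
```

Keeping the callback limited to a copy and an enqueue means slow work, such as UI updates or file I/O, never runs on the sipXtapi event thread.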
Figure 6: Example callstate skeleton
1: void handleCallStateEvent(SIPX_CALLSTATE_INFO* pCallInfo) |
Figure 6 provides a skeleton for processing a callstate event (partial implementation of Figure 5, line 9).
Figure 6: Handling remote offered
1: handleRemoteOffered(SIPX_CALL hCall, SIPX_CALLSTATE_CAUSE cause) |
The REMOTE_OFFERING event does not require any action. Generally, application
developers will display status indicating the progress of the call.
Figure 7: Handling remote alerting
1: handleRemoteAlerting(SIPX_CALL hCall, SIPX_CALLSTATE_CAUSE cause) |
Like REMOTE_OFFERING, the REMOTE_ALERTING event is used to provide
feedback to the end user. The code snippet in Figure 7 will play a ring back
tone to the end user if “early media”, audio sent along with the alerting indication,
is not present. Early media is detectable by looking at the minor call state event.
Early media is often provided by PSTN gateways to provide audible call status.
Figure 8: Handling connected
1: handleConnected(SIPX_CALL hCall, SIPX_CALLSTATE_CAUSE cause) |
The CONNECTED state is significant for user feedback, but does not require any action from the application developer. The application layer should pay attention to the minor state events for changes within the connected state. For example, the call may be placed on or off hold.
Figure 9: Handling disconnected
1: handleDisconnected(SIPX_CALL hCall, SIPX_CALLSTATE_CAUSE cause) |
The DISCONNECTED event is generated in many different scenarios.
Examples include a local hang-up, the remote party hanging up, a busy end
point, and a network outage. It is important to look at the minor call state
code to determine the reason for the disconnection and take an appropriate action.
In Figure 9, the code snippet blindly destroys the call; however, if the minor
code was DISCONNECTED_BUSY, one might want to play a busy tone as audible feedback.
Once the end user acknowledges the failure, the application developer would
then destroy the call.
Figure 10: Handling offering event
1: handleOffering(SIPX_CALL hCall, SIPX_CALLSTATE_MINOR eMinor) |
Upon receiving an OFFERING event, the application developer must accept, reject, or redirect the call. In this example, the call is accepted; however, one should consider rejecting the call if resources are limited or the end user has decided to hold all calls. Depending on the SIP environment, the user agent may redirect calls to another user agent (e.g. voicemail) when the phone is busy. In many architectures that decision is pushed into the network and the end point is expected to only reject calls.
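The accept, reject, or redirect choice described above can be sketched as a small policy function. All names and parameters here are hypothetical and chosen for illustration; the application would map the result onto sipxCallAccept(), sipxCallReject(), or sipxCallRedirect() itself.

```c
/* Hypothetical policy result; not a sipXtapi type. */
typedef enum { DEMO_ACCEPT, DEMO_REJECT, DEMO_REDIRECT } DemoOfferAction;

/* Decide how to respond to an OFFERING event.
 *   activeCalls      - calls the application is currently handling
 *   maxCalls         - resource limit chosen by the application
 *   holdAllCalls     - nonzero if the end user asked to hold all calls
 *   redirectWhenBusy - nonzero if this end point, rather than the network,
 *                      is expected to redirect (e.g. to voicemail) when busy */
DemoOfferAction demoDecideOffer(int activeCalls, int maxCalls,
                                int holdAllCalls, int redirectWhenBusy)
{
    if (holdAllCalls)
        return DEMO_REJECT;
    if (activeCalls >= maxCalls)
        return redirectWhenBusy ? DEMO_REDIRECT : DEMO_REJECT;
    return DEMO_ACCEPT;
}
```

Passing redirectWhenBusy as 0 models the architectures in which the redirect decision is pushed into the network and the end point only rejects.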
Figure 11: Handling alerting event
1: handleAlerting(SIPX_CALL hCall, SIPX_CALLSTATE_MINOR eMinor) |
The ALERTING event signifies that a call has been accepted and the end user should be alerted. In a soft phone, one would alert the user by playing a ring tone or a custom ring file. This example automatically answers the call. The clearLoopback() call is described later.
Figure 12: Loopback routines
1: #define SAMPLES_PER_FRAME 80 |
A very simple loopback ring buffer is defined and initialized in Figure 12. For this example, the samples per frame and loopback delay are fixed at 80 samples/frame (8000Hz) and 200 frames (2 seconds).
initLoopback() and clearLoopback() are helper functions. The
initLoopback() method allocates enough memory to hold samples during the delay
period. The clearLoopback() routine is called between calls to clear all of the
samples. Samples are formatted as mono, 16-bit signed, little endian PCM.
Figure 13: Hook implementation
1: void SpkrAudioHook(const int nSamples, const short* pSamples) |
The sipXtapi SDK allows application developers to hook audio sources and targets to inject or consume audio. For this example, data heading for the speaker is stored in a ring buffer and later injected as microphone source data. With the 200 frame delay, the remote calling party will hear their voice 2 seconds later.
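The delayed loopback idea can be sketched as follows. This is not the code from Figures 12 and 13: the function bodies are assumptions, the MicAudioHook name and signature are hypothetical, LOOPBACK_FRAMES is an illustrative name, and a static array is used in place of the allocation that initLoopback() is described as performing. The microphone hook reads the oldest stored frame, which is the slot the speaker hook will overwrite next.

```c
#include <string.h>

#define SAMPLES_PER_FRAME 80   /* 10 ms of audio at 8000 Hz */
#define LOOPBACK_FRAMES   200  /* 200 frames x 10 ms = 2 seconds of delay */

/* Ring buffer of mono, 16-bit signed PCM frames. */
static short g_loopback[LOOPBACK_FRAMES][SAMPLES_PER_FRAME];
static int   g_loopbackHead = 0;  /* next slot to write, i.e. the oldest frame */

/* Zero all stored samples; intended to be called between calls. */
void clearLoopback(void)
{
    memset(g_loopback, 0, sizeof(g_loopback));
    g_loopbackHead = 0;
}

/* Prepare the buffer before the first call. */
void initLoopback(void)
{
    clearLoopback();
}

/* Limit a sample count to the range 0..SAMPLES_PER_FRAME. */
static int clampSamples(int nSamples)
{
    if (nSamples < 0)
        return 0;
    return nSamples > SAMPLES_PER_FRAME ? SAMPLES_PER_FRAME : nSamples;
}

/* Speaker hook: store the frame heading for the speaker, then advance. */
void SpkrAudioHook(const int nSamples, const short* pSamples)
{
    int n = clampSamples(nSamples);
    short* slot = g_loopback[g_loopbackHead];

    memset(slot, 0, sizeof(short) * SAMPLES_PER_FRAME);
    memcpy(slot, pSamples, sizeof(short) * (size_t) n);
    g_loopbackHead = (g_loopbackHead + 1) % LOOPBACK_FRAMES;
}

/* Microphone hook (hypothetical signature): replace the outgoing frame
 * with the oldest stored speaker frame. */
void MicAudioHook(const int nSamples, short* pSamples)
{
    int n = clampSamples(nSamples);

    if (nSamples > 0)
        memset(pSamples, 0, sizeof(short) * (size_t) nSamples);
    memcpy(pSamples, g_loopback[g_loopbackHead], sizeof(short) * (size_t) n);
}
```

Until 200 frames have been stored, the microphone hook injects silence, so the remote party hears nothing for the first 2 seconds and then hears their own audio delayed by the buffer length.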
Figure 14: Additional call setup
1: initLoopback() ; |
Plugging in the loopback code is fairly easy. One needs to initialize the loopback data structure and set the speaker and microphone audio hooks as demonstrated in Figure 14.
Method | Description / Instructions |
---|---|
Review API Definition | API documentation is automatically generated from the source code by doxygen. Please click the "Files" link at the top of this page to access detailed documentation. |
Latest Docs | The latest overview and API documentation is found in the SIPfoundry source code repository. This information is easily reviewed online: |
SIPfoundry Web Site | The sipXtapi project is included as part of the sipXcallLib project. Please review the sipXcallLib project page for additional information: |
Example source code | Example source code is provided with the sipXcallLib project. You will need to fetch the sipXcallLib, sipXtackLib, sipXmediaLib, and sipXportLib projects to build sipXtapi; however, source examples are easily reviewed online: |
sipX developer mailing list | The sipX-dev mailing list is useful for finding answers to questions not covered by any of the other sources. Please search the archive for answers before posting your question. |
Problems with these docs? Please email bob AT sipfoundry.org.