Sunday, 20 September 2015

SIP SDP's explained and picked apart


Session description protocol; SIP's little sidekick


SDP is the companion protocol of SIP. It is used to describe and negotiate media characteristics between two endpoints, such as the voice codec, video bit rate, DTMF method and T.38 fax support.


SDP is described in RFC 4566 (RFC 3261 describes SIP itself). It is used with the Offer/Answer model of RFC 3264, which means that one endpoint sends an SDP with a 'media Offer', and the endpoint it wants to communicate with sends an Answer back. This SDP Answer describes the media characteristics of the endpoint that received the Offer and the voice/video codec selected by it for two-way media communications.


SDP is carried in the body of SIP messages, and comes in two flavors: Delayed Offer and Early Offer.


Delayed Offer:






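With Delayed Offer, the initial INVITE carries no SDP. The called side sends the SDP Offer in its response (typically the 200 OK), halfway through the signalling, and the calling side returns the SDP Answer in the ACK.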




Early Offer:





With Early Offer, the SDP is included in the initial INVITE, and is formed by the calling device. Early Offer is almost always used by IP PSTN providers, as it allows one-way media to be established to the calling device on receipt of the SDP Offer in the initial INVITE.

Cisco SIP trunks on CUCM use Delayed Offer by default, and juggling between Delayed and Early Offer can be challenging, but that is a post for another day. Or perhaps read one that I prepared earlier:


http://ciscoshizzle.blogspot.com.au/2012/12/sip-early-offer-cucm-trunk.html


Either way, it is fairly easy to find out whether you are working with Early or Delayed Offer SDP when you debug ccsip or pull up trace files on your CUCM.
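As a rough sketch of that check: an INVITE that carries an application/sdp body is Early Offer, and one without a body is Delayed Offer. The function name and the two abbreviated sample INVITEs here are made up for illustration; real traces carry many more headers.

```python
def invite_offer_type(sip_message: str) -> str:
    """Classify an INVITE as 'early' (carries SDP) or 'delayed' (no SDP)."""
    normalized = sip_message.replace("\r\n", "\n")
    # Headers and body are separated by the first empty line.
    head, _, body = normalized.partition("\n\n")
    lines = head.split("\n")
    if not lines[0].startswith("INVITE "):
        raise ValueError("not an INVITE request")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    # "c" is the compact form of the Content-Type header.
    content_type = headers.get("content-type", headers.get("c", ""))
    has_sdp = content_type.lower().startswith("application/sdp") and body.strip() != ""
    return "early" if has_sdp else "delayed"


# Hypothetical, heavily abbreviated messages (most mandatory headers left out).
EARLY_INVITE = (
    "INVITE sip:1000@10.4.4.1 SIP/2.0\r\n"
    "Content-Type: application/sdp\r\n"
    "\r\n"
    "v=0\r\ns=SIP Call\r\nt=0 0\r\n"
)
DELAYED_INVITE = (
    "INVITE sip:1000@10.4.4.1 SIP/2.0\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)

print(invite_offer_type(EARLY_INVITE))
print(invite_offer_type(DELAYED_INVITE))
```

This only looks at a single INVITE; it does not handle multipart bodies, which some devices use.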

So, now let's dig into the bowels of the actual SDP contents.


Example SDP Message, containing Audio and Video:


v=0
o=Cisco-SIPUA 28123 0 IN IP4 10.4.4.112
s=SIP Call
t=0 0
m=audio 16736 RTP/AVP 0 8 18 102 9 116 101
c=IN IP4 10.4.4.112
a=trafficclass:conversational.audio.avconf.aq:admitted
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:18 G729/8000
a=fmtp:18 annexb=no          
a=rtpmap:102 L16/16000
a=rtpmap:9 G722/8000
a=rtpmap:116 iLBC/8000
a=fmtp:116 mode=20
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-15              
a=sendrecv
m=video 16738 RTP/AVP 126 97
c=IN IP4 10.4.4.112
b=TIAS:2000000
a=trafficclass:conversational.video.avconf.aq:admitted
a=rtpmap:126 H264/90000
a=fmtp:126 profile-level-id=428014;packetization-mode=1;level-asymmetry-allowed=1;max-mbps=36000;max-fs=1200;max-rcmd-nalu-size=1300
a=imageattr:126 send * recv [x=640,y=480]
a=rtpmap:97 H264/90000
a=fmtp:97 profile-level-id=428014;packetization-mode=0;level-asymmetry-allowed=1;max-mbps=36000;max-fs=1200
a=imageattr:97 send * recv [x=640,y=480]
a=rtcp-fb:* ccm tmmbr
a=sendrecv


Any SDP is split up into 3 parts:


  1. Session description, which contains the session details
  2. Time description, which contains the timing details
  3. Media description, which contains the details of each media stream in the capability set of the sending endpoint
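To make the three parts a bit more tangible, here is a minimal Python sketch that sorts the lines of an SDP into those buckets. The function name split_sdp is my own, and the SDP is a shortened copy of the example above; it is an illustration, not a full parser.

```python
SDP = """v=0
o=Cisco-SIPUA 28123 0 IN IP4 10.4.4.112
s=SIP Call
t=0 0
m=audio 16736 RTP/AVP 0 8 18 102 9 116 101
c=IN IP4 10.4.4.112
a=sendrecv
m=video 16738 RTP/AVP 126 97
c=IN IP4 10.4.4.112
a=sendrecv
"""


def split_sdp(sdp):
    """Return (session_lines, time_lines, list_of_media_sections)."""
    session, timing, media = [], [], []
    for raw in sdp.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("m="):
            media.append([line])        # every m= line opens a new media description
        elif media:
            media[-1].append(line)      # lines after an m= belong to that media
        elif line[:2] in ("t=", "r="):
            timing.append(line)
        else:
            session.append(line)
    return session, timing, media


session, timing, media = split_sdp(SDP)
print(session)
print(timing)
for section in media:
    print(section[0], "with", len(section) - 1, "further lines")
```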

Session Description

v= (protocol version)
o= (owner/creator and session identification).
s= (session name)
i= (session information)*
u= (URI of description)*
e= (email address – contact detail)*
p= (phone number – contact detail)*
c= (connection information – not required if included in media description)*
b= (session bandwidth information)*
z= (time zone adjustments)*
k= (encryption key)*
a= (zero or more session attribute lines)*

Time description

t= (time the session is active)
r= (repeat times)*

Media description

m= (media name/ transport address)
i= (media title)*
c= (connection information – not required if included in session description)*
b= (bandwidth information)*
k= (encryption key)*
a= (zero or more media attribute lines)*
* Field is optional

Description of each field

Session Description

  • v=<version> – Specifies the version of Session Description Protocol. As specified in RFC 4566, up to now there is only one version, which is Version 0. No minor versions exist.
  • o=<username> <sess-id> <sess-version> <nettype> <addrtype> <unicast-address> – Details about the originator and identification of the session.
    • <username> –  The user’s login. It MUST NOT contain spaces
    • <sess-id> – A numeric string used as unique identifier for the session
    • <sess-version> – A numeric string used as version number for this session description
    • <nettype> –  Text string, specifying the network type, e.g. IN for internet
    • <addrtype> – Text string specifying the type of the address of the originator, e.g. IP4 or IP6
    • <unicast-address>  –  The address of the machine from where the session is originating, which can be either an FQDN or an IP address.
  • s=<session name> – Only one session name per session description can be specified. It must not be empty; therefore if no name is assigned to the session, a single empty space should be used as session name.
  • i=<session description> – Only one session-level “i” field can be specified in the session description. The “i” field can be used in the session or media description. When used in a media description, it is primarily intended for labeling media streams. It can be a human-readable description.
  • u=<uri> – The URI (Uniform Resource Identifier) specified in the “u” field is a pointer to additional information about the session.
  • e=<email address> – Specifies an email contact for the person responsible for the conference.
  • p=<phone-number> – Specifies a phone contact for the person responsible for the conference.
  • c=<nettype> <addrtype> <connection-address> – Connection information can be included in the session description or in a media description. A session description MUST contain either at least one “c=” field in each media description or a single “c=” field at the session level.
    • <nettype> A text string describing the network type, e.g. IN for internet.
    • <addrtype> A text string describing the type of the address used in connection-address, e.g. IP4 or IP6.
    • <connection-address> The address media should be sent to. For unicast this is a plain IP address or FQDN, e.g. 10.4.4.112 as in the example above. For an IPv4 multicast address the TTL is appended, e.g. 224.2.36.42/127
  • b=<bwtype>:<bandwidth> – Bandwidth field can be used both in the session description, specifying the total bandwidth of the whole session and can also be used in media description, per media session.
    • <bwtype> Bandwidth type can be CT (conference total, an upper limit on the bandwidth to be used) or AS (application specific, the application’s concept of maximum bandwidth). The TIAS type seen in the example above is defined separately, in RFC 3890.
    • <bandwidth> is interpreted as kilobits per second by default. TIAS is an exception and is given in bits per second.
  • z=<adjustment time> <offset> <adjustment time> <offset> – To schedule a repeated session that specifies a change from daylight saving time to standard time or vice versa, it is necessary to specify difference from the originating time.
  • k=<method>:<encryption key> – If the channel is secure and trusted, SDP can be used to convey encryption keys. A key can be specified for the whole session or for each media description.
    • <method> Indicates the mechanism which is used to obtain the encryption key from external sources or from encoding the given key. Several different methods exist, such as prompt and URI.
    • <encryption key> The encryption key, or if URI is used as method, the URI from where the key can be retrieved.
  • a=<attribute>:<value> – Attributes may be defined at “session-level” or at “media-level” or both. Session level attributes are used to advertise additional information that applies to the call/session as a whole. Media level attributes are specific to the media, i.e. advertising information about the media stream. 
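As a small illustration of the field layouts above, this Python sketch pulls apart an “o=” and a “c=” line. The names Origin, parse_origin and parse_connection are my own, and the TTL handling assumes the simple IPv4 form shown in the list.

```python
from typing import NamedTuple


class Origin(NamedTuple):
    username: str
    sess_id: str
    sess_version: str
    nettype: str
    addrtype: str
    unicast_address: str


def parse_origin(line: str) -> Origin:
    """Split an o= line into its six space-separated fields."""
    if not line.startswith("o="):
        raise ValueError("not an o= line")
    fields = line[2:].split()
    if len(fields) != 6:
        raise ValueError("o= needs exactly 6 fields, got %d" % len(fields))
    return Origin(*fields)


def parse_connection(line: str) -> dict:
    """Split a c= line; the TTL is only present for IPv4 multicast addresses."""
    if not line.startswith("c="):
        raise ValueError("not a c= line")
    nettype, addrtype, conn = line[2:].split()
    parts = conn.split("/")
    ttl = int(parts[1]) if addrtype == "IP4" and len(parts) > 1 else None
    return {"nettype": nettype, "addrtype": addrtype, "address": parts[0], "ttl": ttl}


origin = parse_origin("o=Cisco-SIPUA 28123 0 IN IP4 10.4.4.112")
print(origin.username, origin.sess_id, origin.unicast_address)
print(parse_connection("c=IN IP4 10.4.4.112"))
print(parse_connection("c=IN IP4 224.2.36.42/127"))
```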

Time Description

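  • t=<start-time> <stop-time> – Specifies the start and stop times for a session. If a session is active at irregular intervals, multiple time entries can be used. The t=0 0 in the example above describes an unbounded session, which is what you normally see on SIP calls.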
  • r=<repeat interval> <active duration> <offsets from start-time> – If a session is to be repeated at fixed intervals, the “r” field is used. By default all values should be specified in seconds, but to make description more compact, time can also be given in different units, such as days, hours or minutes; e.g. r=6d 2h 14m.
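The compact units are easy to expand back into seconds. A minimal sketch, assuming only the d, h, m and s suffixes are used (the helper names are my own):

```python
UNIT_SECONDS = {"d": 86400, "h": 3600, "m": 60, "s": 1}


def to_seconds(value: str) -> int:
    """Convert '6d', '2h', '14m', '30s' or a plain number of seconds to an int."""
    if value[-1] in UNIT_SECONDS:
        return int(value[:-1]) * UNIT_SECONDS[value[-1]]
    return int(value)


def parse_repeat(line: str) -> dict:
    """Split an r= line into interval, active duration and offsets, all in seconds."""
    if not line.startswith("r="):
        raise ValueError("not an r= line")
    interval, duration, *offsets = [to_seconds(f) for f in line[2:].split()]
    return {"interval": interval, "duration": duration, "offsets": offsets}


print(parse_repeat("r=6d 2h 14m"))
```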

Media Description

  • m=<media> <port>/<number of ports> <proto> <fmt> – This field is used in the media description section to advertise properties of the media stream, such as the port it will be using for transmitting, the protocol used for streaming and the format or codec.
    • <media> Used to specify media type, generally this can be audio, video, text etc.
    • <port> The port to which the media stream will be sent. Multiple ports can also be specified if more than 1 port is being used.
    • <proto> The transport protocol used for streaming, e.g. RTP/AVP (Real-time Transport Protocol with the audio/video profile).
    • <fmt> The format of the media being sent. For RTP/AVP this is a list of RTP payload type numbers, each identifying a codec, e.g. 0 for PCMU or 3 for GSM.
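Here is how the “m=” line breaks down in practice, as a short Python sketch. The function name parse_media_line is my own; the first line parsed is the audio line from the example SDP, and the second is a made-up line that only exists to show the <port>/<number of ports> form.

```python
def parse_media_line(line: str) -> dict:
    """Split an m= line into media type, port, port count, transport and formats."""
    if not line.startswith("m="):
        raise ValueError("not an m= line")
    media, port_spec, proto, *fmts = line[2:].split()
    port, _, count = port_spec.partition("/")
    return {
        "media": media,
        "port": int(port),
        "num_ports": int(count) if count else 1,  # "/<number of ports>" is optional
        "proto": proto,
        "fmt": fmts,  # for RTP/AVP these are payload type numbers, in preference order
    }


audio = parse_media_line("m=audio 16736 RTP/AVP 0 8 18 102 9 116 101")
print(audio)
print(parse_media_line("m=video 49170/2 RTP/AVP 31"))
```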

Of course, when a call is being set up, SDPs are exchanged in both directions. In general, the answering side picks from the codecs that both parties support, usually the most preferred common one, and that codec is used to establish the call. Evidently, if the two SDPs have no codec/attributes in common, the call will fail.
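A very simplified way to picture the codec selection is "first codec in the offer that the other side also supports". The sketch here does exactly that and nothing more. The answerer's codec set is invented for the example, and real devices (CUCM regions, for instance) apply their own policy on top.

```python
def pick_codec(offered, supported):
    """Return the first offered codec the answerer supports, or None if none match."""
    for codec in offered:
        if codec in supported:
            return codec
    return None


# Audio codecs from the example SDP, in the order they are offered.
OFFERED = ["PCMU", "PCMA", "G729", "L16", "G722", "iLBC"]

# Hypothetical capability sets for the answering endpoint.
print(pick_codec(OFFERED, {"G722", "G729"}))   # a common codec exists
print(pick_codec(OFFERED, {"OPUS"}))           # nothing in common, the call would fail
```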


Let me quickly go back to the initial SDP example. That particular endpoint supports the following audio codecs, in preferential order (the order of the payload types on the m=audio line).


  • a=rtpmap:0 PCMU/8000
  • a=rtpmap:8 PCMA/8000
  • a=rtpmap:18 G729/8000    
  • a=rtpmap:102 L16/16000
  • a=rtpmap:9 G722/8000
  • a=rtpmap:116 iLBC/8000
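If you want to pull that list out of a trace yourself, a small Python sketch like this one maps the payload type numbers on the m= line to their rtpmap names. The function name is my own, and the input is the audio part of the example SDP.

```python
AUDIO_SDP = """m=audio 16736 RTP/AVP 0 8 18 102 9 116 101
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:18 G729/8000
a=fmtp:18 annexb=no
a=rtpmap:102 L16/16000
a=rtpmap:9 G722/8000
a=rtpmap:116 iLBC/8000
a=fmtp:116 mode=20
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-15"""


def codecs_in_preference_order(media_section: str):
    """Return (payload_type, encoding) pairs in the order listed on the m= line."""
    lines = media_section.splitlines()
    payload_types = lines[0].split()[3:]      # everything after <proto>
    rtpmap = {}
    for line in lines[1:]:
        if line.startswith("a=rtpmap:"):
            pt, encoding = line[len("a=rtpmap:"):].split(None, 1)
            rtpmap[pt] = encoding.strip()
    return [(pt, rtpmap.get(pt, "unknown")) for pt in payload_types]


codecs = codecs_in_preference_order(AUDIO_SDP)
for pt, name in codecs:
    print(pt, name)
```

Note that payload type 101 shows up in this list too, although it is not a voice codec.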


After that the SDP contains:


  • a=rtpmap:101 telephone-event/8000
  • a=fmtp:101 0-15              


The above fields describe the DTMF the phone supports (telephone-events). This phone uses RTP payload type number 101 for DTMF, and supports DTMF events 0 to 15 with a clock rate of 8000 Hertz. Note that as a DTMF standard, all SIP entities should at least support DTMF events 0 to 15, which are 0-9 (digits), 10 = *, 11 = # and 12-15 = A-D.

The next attribute in our example is:
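To illustrate the event numbering, this sketch turns the event range of a telephone-event fmtp line into the keys it covers. The function name is my own, and it only knows about events 0 to 15.

```python
DTMF_KEYS = "0123456789*#ABCD"  # position in the string = event number 0..15


def supported_dtmf_keys(fmtp_line: str) -> str:
    """Map e.g. 'a=fmtp:101 0-15' to the keypad characters it advertises."""
    _, _, events = fmtp_line.partition(" ")
    keys = []
    for part in events.strip().split(","):
        first, _, last = part.partition("-")
        low, high = int(first), int(last or first)
        # Events above 15 exist but are not keypad digits, so they are skipped here.
        keys.extend(DTMF_KEYS[n] for n in range(low, high + 1) if n < len(DTMF_KEYS))
    return "".join(keys)


print(supported_dtmf_keys("a=fmtp:101 0-15"))
print(supported_dtmf_keys("a=fmtp:101 0-9,11"))
```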

  • a=sendrecv

This means that the session is aimed at both sending and receiving media. If you see anything else, such as a=recvonly, a=sendonly or a=inactive, that could indicate an issue, or perhaps a phone being put on hold.

After this, the second media description of the SDP contains:
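A quick way to spot this in a trace is to look for the direction attribute of a media description. The sketch here uses function names of my own and only inspects media-level lines; treating sendonly and inactive as a hold is a hint at best, not proof.

```python
DIRECTIONS = ("sendrecv", "sendonly", "recvonly", "inactive")


def media_direction(media_lines) -> str:
    """Return the direction attribute of a media description."""
    for line in media_lines:
        attr = line.strip()
        if attr.startswith("a=") and attr[2:] in DIRECTIONS:
            return attr[2:]
    return "sendrecv"  # assumed when no direction attribute is present


def looks_like_hold(direction: str) -> bool:
    """Heuristic only: the holding side typically offers sendonly or inactive."""
    return direction in ("sendonly", "inactive")


normal = ["m=audio 16736 RTP/AVP 0 8", "c=IN IP4 10.4.4.112", "a=sendrecv"]
held = ["m=audio 16736 RTP/AVP 0 8", "c=IN IP4 10.4.4.112", "a=sendonly"]

print(media_direction(normal), looks_like_hold(media_direction(normal)))
print(media_direction(held), looks_like_hold(media_direction(held)))
```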

m=video 16738 RTP/AVP 126 97
c=IN IP4 10.4.4.112
b=TIAS:2000000
a=trafficclass:conversational.video.avconf.aq:admitted
a=rtpmap:126 H264/90000
a=fmtp:126 profile-level-id=428014;packetization-mode=1;level-asymmetry-allowed=1;max-mbps=36000;max-fs=1200;max-rcmd-nalu-size=1300
a=imageattr:126 send * recv [x=640,y=480]
a=rtpmap:97 H264/90000
a=fmtp:97 profile-level-id=428014;packetization-mode=0;level-asymmetry-allowed=1;max-mbps=36000;max-fs=1200
a=imageattr:97 send * recv [x=640,y=480]
a=rtcp-fb:* ccm tmmbr
a=sendrecv

This part is aimed at adding a video stream component to the 'call'.


  •  b=TIAS:2000000


This line is the result of the CUCM region setting for video being set to 2 Mbps. TIAS (defined in RFC 3890) is expressed in bits per second, so 2000000 equals 2 Mbps.
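Because the unit depends on the bandwidth type, it helps to normalise everything to bits per second when comparing values. A minimal sketch, with a function name of my own, covering only the TIAS, AS and CT types:

```python
def bandwidth_bps(line: str) -> int:
    """Convert a b= line to bits per second."""
    if not line.startswith("b="):
        raise ValueError("not a b= line")
    bwtype, _, value = line[2:].partition(":")
    if bwtype == "TIAS":
        return int(value)            # TIAS is already in bits per second
    if bwtype in ("AS", "CT"):
        return int(value) * 1000     # AS and CT are in kilobits per second
    raise ValueError("unhandled bandwidth type: " + bwtype)


print(bandwidth_bps("b=TIAS:2000000"))
print(bandwidth_bps("b=AS:2000"))
```

Keep in mind that TIAS excludes transport overhead while AS includes it, so the two numbers are not strictly interchangeable.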

The video attributes advertise H264 twice, with different packetization modes: payload type 126 uses packetization-mode=1 and payload type 97 uses packetization-mode=0.

If you feel anything should be added to or changed in this post, I am, as always, open to suggestions, but please quote your source for cross-referencing.
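The long fmtp lines are just semicolon-separated key=value pairs, and the profile-level-id is three hex bytes (profile_idc, profile-iop, level_idc). The Python sketch here, with function names of my own, splits the fmtp line for payload type 126 and decodes its profile-level-id.

```python
FMTP_126 = (
    "a=fmtp:126 profile-level-id=428014;packetization-mode=1;"
    "level-asymmetry-allowed=1;max-mbps=36000;max-fs=1200;max-rcmd-nalu-size=1300"
)


def parse_fmtp(line: str):
    """Return (payload_type, {parameter: value}) for an a=fmtp line."""
    head, _, params = line.partition(" ")
    payload_type = head.split(":", 1)[1]
    result = {}
    for item in params.split(";"):
        key, _, value = item.strip().partition("=")
        if key:
            result[key] = value
    return payload_type, result


def decode_profile_level_id(hex_id: str):
    """Split the 6 hex digits into (profile_idc, profile_iop, level_idc)."""
    return int(hex_id[0:2], 16), int(hex_id[2:4], 16), int(hex_id[4:6], 16)


payload_type, params = parse_fmtp(FMTP_126)
profile_idc, profile_iop, level_idc = decode_profile_level_id(params["profile-level-id"])
print(payload_type, params["packetization-mode"])
print(profile_idc, profile_iop, level_idc / 10)
```

For 428014 this gives profile_idc 66, which is the Baseline profile, and level_idc 20, which is level 2.0.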

Namaste!


Sources:


SIP SDP traffic classes:

http://hive2.hive.packetizer.com/users/paulej/internet-drafts/draft-ietf-mmusic-traffic-class-for-sdp-03.txt
