ViDe // www.ViDe.net
Videoconferencing Cookbook
Version 4.1
Video Development Initiative

Network Matters


Network Fundamentals for the Videoconference

Videoconferencing was originally deployed over networks that could provide some guarantees about the level of service that would be delivered to the application. The ISDN and/or dedicated T1 circuits of the H.320 standards-based world provided predictable delays over dedicated paths. This allowed videoconferencing vendors to create products to work within these parameters. However, dedicated circuits are also expensive circuits.

IP standards-based videoconferencing was engineered for videoconferencing that takes place on a data network without any quality-of-service standard, such as the Internet. Such networks are not intended for delivery of sensitive near real-time applications. The data network is used for multiple purposes: e-mail, web browsing, and other activities are inter-mixed with IP videoconferencing.

The audio/video information within a videoconference is segmented into chunks by the application, encoded and compressed, put into a series of data packets and sent over the network to the remote end at roughly constant intervals. The packets arrive at their destination at varying times, often out of order, and sometimes not at all. To preserve the "real time" impression of an interactive videoconference, the packets must arrive soon enough to be re-ordered and played out through the videoconferencing terminal.
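The re-ordering step described above can be sketched as a small buffer that holds early packets until the gap before them is filled. This is a minimal illustration, not how any particular terminal does it: the class name `ReorderBuffer`, its methods, and the use of plain integer sequence numbers that never wrap around are all assumptions made for the example.

```python
class ReorderBuffer:
    """Hold out-of-order packets and release them in sequence order.

    Hypothetical sketch: sequence numbers are plain integers that never wrap.
    """

    def __init__(self, first_seq=0):
        self._pending = {}          # seq -> payload, waiting for playout
        self._next_seq = first_seq  # next sequence number due for playout

    def push(self, seq, payload):
        # Packets older than the playout point arrived too late to be useful.
        # Duplicates of a packet already waiting are ignored.
        if seq >= self._next_seq:
            self._pending.setdefault(seq, payload)

    def pop_ready(self):
        """Return every payload that is now contiguous with the playout point."""
        ready = []
        while self._next_seq in self._pending:
            ready.append(self._pending.pop(self._next_seq))
            self._next_seq += 1
        return ready

    def skip_missing(self):
        """Give up on the packet currently awaited (its deadline has passed)."""
        self._pending.pop(self._next_seq, None)
        self._next_seq += 1


buf = ReorderBuffer()
buf.push(1, "frame-1")      # arrives early, must wait for packet 0
print(buf.pop_ready())      # nothing can be played yet
buf.push(0, "frame-0")
print(buf.pop_ready())      # both packets are released, in order
```

A real terminal would call something like `skip_missing` when the playout deadline for a packet expires, which is where a lost packet turns into a visible or audible glitch.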

There are five fundamental network problems for videoconferencing over networks such as the Internet. They are bandwidth, packet loss, latency, jitter and policies.

Bandwidth is the fundamental requirement that there be enough capacity in a network path for all of your packets to get through unimpeded. For a rough idea of scale, a typical ISDN videoconference uses around 128-384Kbps (kilobits per second). IP-based H.323 video systems can use the same bandwidth, although in general they tend to go higher since the network is cheaper, so bandwidth of around 384-768Kbps is very common. The bandwidth required for a given videoconferencing speed is higher on IP networks than on ISDN networks because of the packetizing overhead of IP. The overhead is about 20%, so for example a 384Kbps IP conference actually uses about 460Kbps of bandwidth. Higher-quality H.323 and SIP videoconferences can go to 1.5-3.0Mbps, and if you want to go to broadcast quality with alternate codecs including MPEG 1/2/4 and MJPEG, the sky is the limit: 6-20Mbps for NTSC/PAL transmission, 20-50Mbps for prerecorded HDTV and higher for "live" content.
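As a quick way to apply the overhead rule of thumb, the arithmetic can be written as a short helper. The function name `ip_bandwidth_kbps` is made up for this example, and the flat 20% default is only the approximate figure quoted above, so treat the results as rough planning numbers rather than measurements.

```python
def ip_bandwidth_kbps(call_rate_kbps, overhead=0.20):
    """Estimate bandwidth on the wire for an IP call of the given nominal rate.

    overhead is the assumed packetizing overhead as a fraction (0.20 = 20%).
    """
    return call_rate_kbps * (1.0 + overhead)


# Nominal call rates mentioned in the text, in Kbps.
for rate in (128, 384, 768):
    print(f"{rate}Kbps call: about {ip_bandwidth_kbps(rate):.0f}Kbps including overhead")
```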

This bandwidth should be symmetric, meaning each end should be able to both send and receive the same amount of data for the call. The connection speed, 384Kbps plus overhead for example, is the maximum bandwidth the call will use in either direction. The actual rate varies depending on how active or static the video image is. When the call is initiated, there will be a spike in bandwidth use as the entire video frame is sent; then usage will fall off as only image updates are sent. A graph showing bandwidth usage during the initialization of a 384Kbps call is below. This only shows the outbound traffic, with basically still video after initiation, thus the significant drop in bandwidth use after the call was started.

Different clients are sensitive to discrepancies in bandwidth symmetry in different ways, depending on whether the bandwidth restriction is incoming or outgoing. In most cases, only the video frame rate is affected, though some clients may drop the video or even the call altogether.

If you are in a multi-point videoconference, keep in mind that the MCU/bridge sees all of the streams at the same time, even if it is not forwarding all of them on. So if you have an 8-site videoconference running at 384Kbps, every site sends and receives up to 384Kbps to and from the MCU, and the MCU receives and forwards up to 8*384Kbps = 3072Kbps, or roughly 3Mbps.
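The multi-point arithmetic generalizes to any number of sites. The sketch below uses a hypothetical helper, `mcu_load_kbps`, and assumes every site connects at the same nominal rate; the optional `overhead` parameter is an addition for IP calls and defaults to zero so that the numbers match the example in the text.

```python
def mcu_load_kbps(sites, call_rate_kbps, overhead=0.0):
    """Worst-case traffic at an MCU when every site runs at the same rate.

    Returns the per-site rate (each way) and the MCU totals (each way), in Kbps.
    """
    per_site = call_rate_kbps * (1.0 + overhead)
    total = sites * per_site
    return {
        "per_site_each_way": per_site,
        "mcu_inbound": total,
        "mcu_outbound": total,
    }


load = mcu_load_kbps(sites=8, call_rate_kbps=384)
print(load["mcu_inbound"])   # 3072.0 Kbps, i.e. roughly 3Mbps
```

These are upper bounds, since the actual rate of each stream varies with how active the video is.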

Packet loss is when packets fail to arrive correctly. This can be due to insufficient bandwidth along the path (when congestion occurs, routers will drop packets), or perhaps errors in transmission. Errors occur most commonly on wireless links such as microwave, satellite or local wireless Ethernet. They can however also occur on copper and even fiber links. Packet loss results in effects such as "tiling" within the video window, missing pieces or blank areas within the video window, and/or disruptions in audio.
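One simple way to put a number on packet loss is to compare the sequence numbers that actually arrived with the range that should have arrived. The helper below is a sketch under stated assumptions: `loss_fraction` is a made-up name, sequence numbers are plain integers that do not wrap, and packets lost before the first or after the last received packet cannot be detected this way.

```python
def loss_fraction(received_seqs):
    """Fraction of packets missing between the lowest and highest sequence seen.

    Duplicates and out-of-order arrivals are tolerated.
    """
    unique = set(received_seqs)
    if not unique:
        return 0.0
    expected = max(unique) - min(unique) + 1
    return (expected - len(unique)) / expected


# Packets 3 and 6 never arrived out of the ten sent (0..9).
print(loss_fraction([0, 1, 2, 4, 5, 7, 8, 9]))
```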

Latency is the time delay between an event occurring and the remote end seeing it. Latency is introduced both by the encoding/decoding process, and hence depends on the equipment used, and by the time it takes packets to traverse the network. There is usually little you can do to change network latency on any large scale, short of getting directly involved with a carrier or a research network. At best, a connection from the United States to Europe on a fiber optic network will have roughly 90ms (milliseconds) of latency. That same connection going through a satellite can be 200ms.

Excessive latency increases the chances of people "talking over one another" because they don't realize that the person at the other end has started speaking too. This is less significant in calls with less than 50ms of network latency. It can become very troublesome in calls with more than 150ms. Another problem is that the latency for the audio and video may be different, and hence lip movements don’t appear synchronized with the audio. This is a function of both the terminal and the network, and can vary dramatically — some products try to compensate for it. You should experiment to see if it is an issue for your applications.

Jitter is the random variation in latency due to things like competing processes running on the terminal (for example on your desktop PC), other traffic temporarily blocking the path through routers along the way, or even the network path changing during a videoconference. This random variation is one of several things that cause packets to arrive in a different order from the one in which they were sent. Jitter results in uneven and unpredictable quality within a videoconference. The end-station client will try to compensate for it by buffering the traffic for some finite time before playing it out to you, which increases the latency even further.
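To make "variation in latency" concrete, the sketch below computes a running jitter estimate from send and receive timestamps. It borrows the smoothing formula that RFC 3550 defines for RTP (each new sample moves the estimate by one sixteenth of the difference). Using that estimator here is an assumption for illustration, as are the function name and the choice of milliseconds. Clock offset between sender and receiver cancels out because only differences between consecutive packets are used.

```python
def interarrival_jitter(send_times_ms, recv_times_ms):
    """Running jitter estimate in the style of RFC 3550, in milliseconds.

    For each consecutive pair of packets, d is the change in transit time;
    the estimate is smoothed with a gain of 1/16.
    """
    jitter = 0.0
    for i in range(1, min(len(send_times_ms), len(recv_times_ms))):
        d = (recv_times_ms[i] - recv_times_ms[i - 1]) - (
            send_times_ms[i] - send_times_ms[i - 1]
        )
        jitter += (abs(d) - jitter) / 16.0
    return jitter


# Packets sent every 20ms; constant 50ms transit gives zero jitter.
print(interarrival_jitter([0, 20, 40, 60], [50, 70, 90, 110]))
# The same stream with uneven arrival times gives a non-zero estimate.
print(interarrival_jitter([0, 20, 40, 60], [50, 80, 85, 115]))
```

A client could use an estimate like this to decide how long its playout buffer should hold packets, trading added latency for smoother playback.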

Policies are introduced by things like firewalls and network address translation (NAT) devices that are generally used to try to hide or protect network elements from the wider Internet. H.323 uses dynamically allocated ports and is thus not very firewall-friendly.

 
© 2004-6, Video Development Initiative.
Updated March, 2005.