Digital Video for the Next Millennium


This publication is copyright 1999 by the Video Development Initiative (ViDe). The document may not be reproduced, in whole or in part, without written permission from ViDe, except that a single copy for personal use may be printed by the reader. Please direct all comments to the author of this white paper.

   


Section Two: Video Encoding Standards
MPEG-4 (ISO/IEC 14496 MPEG-4)

MPEG-4 (ISO/IEC 14496), the latest encoding standard from MPEG, was finalized in October 1998 and should be ratified as a standard in the first half of 1999. MPEG-4 arose from the need for a scalable standard supporting a wide bandwidth range, from streaming video at less than 64 Kbit/sec, suitable for Internet applications, to approximately 4 Mbit/sec for higher-bandwidth video needs. It also arose from a desire, as digital encoding matures, to advance beyond simple conversion and compression to object recognition and encoding, and to provide synchronized text and metadata tracks, creating a digital file that carries a meaning greater than the sum of its individual parts.

MPEG-4 supports both progressive and interlaced video encoding. The standard is object-based, coding multiple video object planes (VOPs) into images of arbitrary shape. Successive VOPs belonging to the same object in the same scene are encoded as video objects. MPEG-4 supports both natural ("analog") and synthetic ("computer-generated") data coding. Some VRML technology is incorporated to encode dimensionality.
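The relationship between video object planes and video objects can be pictured with a small data-structure sketch. This is an illustration only, not the MPEG-4 bitstream syntax: the class names, the field names (object_id, time_index, shape_mask), and the grouping function are all hypothetical, chosen to show successive planes of the same object being collected into one video object.

```python
from dataclasses import dataclass, field


@dataclass
class VideoObjectPlane:
    object_id: str    # hypothetical label for the object this plane belongs to
    time_index: int   # position of the plane in the scene's timeline
    shape_mask: list  # rows of 0/1 marking which pixels belong to the object


@dataclass
class VideoObject:
    object_id: str
    planes: list = field(default_factory=list)


def group_into_video_objects(vops):
    """Collect planes sharing an object_id into one VideoObject, in time order."""
    objects = {}
    for vop in sorted(vops, key=lambda v: v.time_index):
        if vop.object_id not in objects:
            objects[vop.object_id] = VideoObject(vop.object_id)
        objects[vop.object_id].planes.append(vop)
    return objects


# A toy scene: two planes of a speaker and one of the background, out of order.
scene = [
    VideoObjectPlane("speaker", 1, [[0, 1], [1, 1]]),
    VideoObjectPlane("background", 0, [[1, 1], [1, 1]]),
    VideoObjectPlane("speaker", 0, [[1, 1], [0, 1]]),
]
video_objects = group_into_video_objects(scene)
```

Here the shape mask stands in for the arbitrary shape of each plane; an actual encoder would carry shape, motion, and texture information in coded form.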

MPEG-4 compression provides temporal scalability utilizing object recognition, applying higher compression to background objects, such as trees and scenery, and lower compression to foreground objects, such as an actor or speaker. This is much as the human eye filters information by focusing on the most significant object in view, such as the other party in a conversation. Object encoding provides great potential for object or visual recognition indexing, based on discrete objects within a frame rather than requiring a separate text-based or storyboard indexing database. In addition, MPEG-4 provides a synchronized text track for courseware development and a synchronized metadata track for indexing and access at the frame level.
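The idea of compressing background objects more heavily than foreground objects can be illustrated with a minimal sketch. It is not the MPEG-4 algorithm: it assumes a simple uniform quantizer applied to raw sample values, and the step sizes and the "background"/"foreground" roles are hypothetical values picked for the example. A larger step discards more detail, which corresponds to higher compression.

```python
# Hypothetical quantizer step sizes: a larger step discards more detail.
QUANT_STEP = {"background": 16, "foreground": 4}


def quantize(samples, step):
    """Uniform quantization: snap each sample to the nearest multiple of step."""
    return [step * round(s / step) for s in samples]


def encode_scene(objects):
    """objects maps an object name to a (role, samples) pair."""
    return {
        name: quantize(samples, QUANT_STEP[role])
        for name, (role, samples) in objects.items()
    }


# The same sample values, treated once as background and once as foreground.
scene = {
    "trees": ("background", [13, 23, 37, 41]),
    "speaker": ("foreground", [13, 23, 37, 41]),
}
encoded = encode_scene(scene)
# encoded["trees"]   -> [16, 16, 32, 48]  (coarse: errors up to 7)
# encoded["speaker"] -> [12, 24, 36, 40]  (fine: errors of 1)
```

The foreground object keeps values close to the originals, while the background object collapses onto far fewer distinct values, which is what makes it cheaper to store.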