Section Two: Video Encoding Standards
MPEG-1 (ISO/IEC 11172)
The first digital
video and audio encoding standard, MPEG-1, was adopted as an international
standard in 1992 to provide digital video at bit rates up to about 1.5 Mbit/sec.
(The standard actually scales higher than 1.5 Mbit/sec, but 1.5 Mbit/sec is the
accepted "sweet spot" for MPEG-1.) The impetus for the standard was
to provide VHS-quality digital video that could be encoded for, and played
back from, CD-ROM. MPEG-1 encodes progressive (non-interlaced) video sequences.
The standard implementation for MPEG-1 (known as the "constrained parameters
bitstream") supports 352 pixels x 240 lines at 30 frames/sec and
requires about 1.5 Mbit/sec of bandwidth for transport. MPEG-1 compression
relies on the considerable redundancy of information within and between
frames to compress a video object without significantly compromising
the integrity of the information it contains.
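To give a sense of how much redundancy has to be removed, the figures above can be turned into a rough compression ratio. This is a back-of-the-envelope sketch only: it assumes the source is uncompressed 24-bit RGB and that the whole 1.5 Mbit/sec is available to video, and the function name `raw_bitrate` is made up for the illustration.

```python
def raw_bitrate(width, height, fps, bits_per_pixel):
    """Bits per second needed to send every pixel of every frame uncompressed."""
    return width * height * fps * bits_per_pixel

# Figures quoted in the text: 352 x 240 at 30 frames/sec.
# Assumption: the uncompressed source is 24-bit RGB.
raw = raw_bitrate(352, 240, 30, 24)

# Assumption: the full 1.5 Mbit/sec budget goes to video.
target = 1_500_000

ratio = raw / target
print(f"raw: {raw / 1e6:.1f} Mbit/sec, ratio: about {ratio:.0f}:1")
# prints "raw: 60.8 Mbit/sec, ratio: about 41:1"
```

In practice part of the 1.5 Mbit/sec carries audio and system data, so the ratio the video encoder has to achieve is somewhat higher than this estimate.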
Video contains
spatial, spectral and temporal redundancies, which may be compressed
without significant sacrifice in meaning. The encoding techniques
in MPEG-1 involve compression based on statistical redundancies in
temporal and spatial directions. Spatial redundancy is based on the
similarity in color values shared by adjacent pixels. A red sweater
in a video frame will generally possess a uniform color value, with
little or no perceptual variation from one pixel to the next. MPEG-1
employs intraframe spatial compression on redundant color values using
the DCT (discrete cosine transform), applied to 8 x 8 blocks of pixels.
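The red-sweater example can be made concrete with a small sketch. The code below applies a textbook orthonormal 2-D DCT to one perfectly uniform 8 x 8 block and counts how many coefficients are non-zero. It is a slow, direct illustration of the transform, not how a real encoder computes it, and the names `dct_2d`, `flat` and `nonzero` are invented for the example.

```python
import math

N = 8  # block size used for the transform

def dct_2d(block):
    """Direct (unoptimized) orthonormal 2-D DCT-II of an N x N block."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out

# A block with one uniform value, like a patch of the red sweater.
flat = [[200] * N for _ in range(N)]
coeffs = dct_2d(flat)

# Count coefficients that are not (numerically) zero.
nonzero = sum(1 for row in coeffs for value in row if abs(value) > 1e-6)
print(nonzero, round(coeffs[0][0]))  # prints "1 1600"
```

All 64 pixel values collapse into a single coefficient, which is why uniform regions are cheap to code. Real image blocks are not perfectly flat, but most of their energy still lands in a few low-frequency coefficients.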
Spectral redundancy
in video is the similarity between the color components of a pixel: much of
what the red, green and blue values carry is shared "brightness."
MPEG-1 operates in the YUV color space. RGB data is converted to YCrCb,
where Y = luminance (brightness) and CrCb = chrominance (color difference),
and the chrominance is subsampled at 4:2:0, so that each 2 x 2 group of
pixels shares one Cr and one Cb value. This works because the human eye
distinguishes differences in brightness more readily than differences in
pure color value.
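The conversion and subsampling step can be sketched as follows. This is a simplified illustration: it assumes the commonly used ITU-R BT.601 luminance weights with full-range (0-255) values, it averages each 2 x 2 group to get the shared chrominance sample, and it requires even frame dimensions. The function names are hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one RGB pixel to (Y, Cb, Cr), assuming BT.601 weights, full range."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 + 0.564 * (b - y)   # blue color difference
    cr = 128 + 0.713 * (r - y)   # red color difference
    return y, cb, cr

def subsample_420(rgb_rows):
    """Return (Y, Cb, Cr) planes; chroma planes have half the width and height."""
    height, width = len(rgb_rows), len(rgb_rows[0])
    ycc = [[rgb_to_ycbcr(*pixel) for pixel in row] for row in rgb_rows]
    y_plane = [[pixel[0] for pixel in row] for row in ycc]
    cb_plane, cr_plane = [], []
    for top in range(0, height, 2):
        cb_row, cr_row = [], []
        for left in range(0, width, 2):
            # The four pixels that will share one Cb and one Cr sample.
            quad = [ycc[top + dy][left + dx] for dy in (0, 1) for dx in (0, 1)]
            cb_row.append(sum(p[1] for p in quad) / 4)
            cr_row.append(sum(p[2] for p in quad) / 4)
        cb_plane.append(cb_row)
        cr_plane.append(cr_row)
    return y_plane, cb_plane, cr_plane

# A 4 x 4 frame of one sweater-red color.
frame = [[(200, 30, 30)] * 4 for _ in range(4)]
y_plane, cb_plane, cr_plane = subsample_420(frame)

samples_rgb = 4 * 4 * 3
samples_420 = sum(len(row) for plane in (y_plane, cb_plane, cr_plane) for row in plane)
print(samples_rgb, samples_420)  # prints "48 24"
```

Before any transform coding is applied, 4:2:0 subsampling alone halves the number of samples, at a cost in color resolution that viewers rarely notice.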
Temporal redundancy
is the sameness in temporal motion between video frames. If frames
were not redundant, there would be no perception of smooth, realistic
motion in video. MPEG-1 relies on prediction--more precisely, motion-compensated
prediction--for temporal compression between frames. MPEG-1 utilizes
three frame types to create temporal compression: I-frames, P-frames and
B-frames. An I-frame is an intra-coded frame, a single image heading
a sequence. It is compressed only within the frame, with no reference
to previous or subsequent frames. P-frames are forward-predicted frames,
encoded with reference to a past I- or P-frame, with motion vectors
pointing to matching information in that past frame.
B-frames are encoded with reference to a past reference frame, a future
reference frame or both. The motion vectors employed may be forward,
backward, or both. B-frames are also sometimes known as digital video
"spackle."
The MPEG-1 coding
standard is a generic standard, intended to be independent of a specific
application, serving as a toolbox to be adapted to different applications
and their associated hardware and software.
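One consequence of this toolbox approach is that the way an encoder finds the motion vectors used for P- and B-frames is left to the implementer. The sketch below shows one simple possibility, an exhaustive block-matching search that picks the offset with the smallest sum of absolute differences (SAD). It is an illustrative assumption, not a method prescribed by the standard, and the helper names, the 8 x 8 toy frames and the 2 x 2 block size are chosen only to keep the example small.

```python
def sad(cur, ref, cx, cy, rx, ry, size):
    """Sum of absolute differences between a block in cur and a block in ref."""
    return sum(abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
               for j in range(size) for i in range(size))

def best_motion_vector(cur, ref, cx, cy, size, search):
    """Exhaustively search +/- search pixels; return (cost, dx, dy) of the best match."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = cx + dx, cy + dy
            # Skip candidate blocks that fall outside the reference frame.
            if 0 <= rx <= width - size and 0 <= ry <= height - size:
                cost = sad(cur, ref, cx, cy, rx, ry, size)
                if best is None or cost < best[0]:
                    best = (cost, dx, dy)
    return best

def make_frame(px, py):
    """8 x 8 black frame with a small textured 2 x 2 patch at (px, py)."""
    frame = [[0] * 8 for _ in range(8)]
    frame[py][px], frame[py][px + 1] = 10, 20
    frame[py + 1][px], frame[py + 1][px + 1] = 30, 40
    return frame

reference = make_frame(2, 3)  # past frame: patch at x=2, y=3
current = make_frame(4, 3)    # current frame: patch has moved 2 pixels right

cost, dx, dy = best_motion_vector(current, reference, 4, 3, size=2, search=3)
print(cost, dx, dy)  # prints "0 -2 0"
```

The vector (-2, 0) tells a decoder to copy the block from two pixels to the left in the past frame. Because the match here is exact, no residual would need to be coded for this block.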