MPEG-4 Part 2

MPEG-4 Part 2, or MPEG-4 Visual (formally ISO/IEC 14496-2[1]), is a video compression format developed by the Moving Picture Experts Group (MPEG). It belongs to the MPEG-4 family of ISO/IEC standards. It is a discrete cosine transform (DCT) compression standard, similar to previous standards such as MPEG-1 Part 2 and H.262/MPEG-2 Part 2. Several popular codecs, including DivX, Xvid and Nero Digital, implement this standard.

Note that MPEG-4 Part 10 defines a different format from MPEG-4 Part 2 and should not be confused with it. MPEG-4 Part 10 is commonly referred to as H.264 or AVC, and was jointly developed by ITU-T and MPEG.

MPEG-4 Part 2 is H.263 compatible in the sense that a basic H.263 bitstream is correctly decoded by an MPEG-4 Video decoder. (MPEG-4 Video decoder is natively capable of decoding a basic form of H.263.)[2][3][4] In MPEG-4 Visual, there are two types of video object layers: the video object layer that provides full MPEG-4 functionality, and a reduced functionality video object layer, the video object layer with short headers (which provides bitstream compatibility with base-line H.263).[5] MPEG-4 Part 2 is partially based on ITU-T H.263.[6] The first MPEG-4 Video Verification Model (simulation and test model) used ITU-T H.263 coding tools together with shape coding.[7]
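As a rough illustration of the two layer types, a parser can often tell them apart from the first bytes of a stream. The sketch below assumes the commonly documented start-code layout: full MPEG-4 Visual headers begin with the byte-aligned prefix 0x000001, while the short header begins with a 22-bit marker (sixteen zero bits followed by 100000) that matches the baseline H.263 picture start code. The function name `classify_stream` is hypothetical, and the check is a heuristic, not a conformance test.

```python
def classify_stream(data: bytes) -> str:
    """Guess the header style of an MPEG-4 Part 2 elementary stream.

    Hypothetical helper for illustration only; it inspects just the
    first bytes and does not validate the rest of the bitstream.
    """
    if len(data) < 4:
        return "unknown"
    # Full MPEG-4 Visual syntax: 32-bit start codes of the form 0x000001xx.
    if data[0:3] == b"\x00\x00\x01":
        return "mpeg4-full"
    # Short header: 22-bit marker 0000 0000 0000 0000 1000 00,
    # i.e. two zero bytes and the top six bits of the third byte = 100000.
    if data[0] == 0 and data[1] == 0 and (data[2] & 0xFC) == 0x80:
        return "short-header"
    return "unknown"
```

A decoder that finds the short marker would switch to the reduced-functionality layer; anything else beginning with 0x000001 is handled by the full syntax.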


MPEG-4 Visual editions[8]
Edition         Release date  Latest amendment  Standard
First edition   1999          2000              ISO/IEC 14496-2:1999[9]
Second edition  2001          2003              ISO/IEC 14496-2:2001[10]
Third edition   2004          2009[1]           ISO/IEC 14496-2:2004[1]


To address various applications ranging from low-quality, low-resolution surveillance cameras to high definition TV broadcasting and DVDs, many video standards group features into profiles and levels. MPEG-4 Part 2 has approximately 21 profiles, including profiles called Simple, Advanced Simple, Main, Core, Advanced Coding Efficiency, Advanced Real Time Simple, etc. The most commonly deployed profiles are Advanced Simple and Simple, which is a subset of Advanced Simple.

Most video compression schemes standardize the bitstream (and thus the decoder), leaving the encoder design to individual implementations. Therefore, implementations of a particular profile (such as DivX and Nero Digital, which implement the Advanced Simple Profile, and Xvid, which implements both profiles) are all technically identical on the decoder side. By comparison, an MP3 file can be played in any MP3 player, whether it was created through iTunes, Windows Media Player, LAME or the common Fraunhofer encoder.

Simple Profile (SP)

Simple Profile is mostly aimed at situations where low bit rate and low resolution are mandated by other conditions of the application, such as network bandwidth or device size. Examples include mobile phones, some low-end video conferencing systems and electronic surveillance systems.

Advanced Simple Profile (ASP)

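Advanced Simple Profile's notable technical features relative to the Simple Profile, which is roughly similar to H.263, include MPEG-style quantization, interlace support, B-picture support, quarter-pixel motion compensation, and global motion compensation.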

The MPEG quantization and interlace support are designed in basically the same way as in MPEG-2 Part 2. The B-picture support is designed in basically the same way as in MPEG-2 Part 2 and H.263v2.

The quarter-pixel motion compensation feature of ASP was innovative, and was later also included (in somewhat different forms) in MPEG-4 Part 10 and VC-1. Some implementations omit support for this feature, because it has a significantly harmful effect on speed and it is not always beneficial for quality.
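To make the idea concrete, the sketch below shows how a motion vector component stored in quarter-pixel units splits into a whole-sample offset and a fractional phase, and how a sample at that position can be interpolated. It is a deliberately simplified illustration: it works on a single row and uses plain linear interpolation, which is an assumption made here for brevity and not the interpolation filter the standard specifies. The function names are hypothetical.

```python
from __future__ import annotations


def split_qpel(mv: int) -> tuple[int, int]:
    """Split a quarter-pel motion vector component into (whole, fraction).

    `whole` is the offset in full samples (floor of mv / 4) and
    `fraction` is the remaining phase in quarter samples (0..3).
    """
    return mv >> 2, mv & 3


def sample_qpel(row: list[int], x: int, mv: int) -> float:
    """Read `row` at position x displaced by mv quarter-pels.

    Linear interpolation between the two nearest samples stands in for
    the real sub-pixel filter (an assumption of this sketch).
    """
    whole, frac = split_qpel(mv)
    left = row[x + whole]
    if frac == 0:
        return float(left)  # integer position: no interpolation needed
    right = row[x + whole + 1]
    return left + (right - left) * frac / 4


row = [0, 40, 80, 120]
print(split_qpel(5))           # (1, 1): one full sample plus a quarter
print(sample_qpel(row, 0, 5))  # 50.0
```

Every fractional position needs extra arithmetic per predicted sample, which gives a sense of why the feature costs decoding and encoding speed.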

The global motion compensation feature is not actually supported in most implementations although the standard officially requires decoders to support it. Most encoders do not support it either, and some experts say that it does not ordinarily provide any benefit in compression. When used, ASP's global motion compensation has a large unfavorable impact on speed and adds considerable complexity to the implementation.

Simple Studio Profile (SStP)

The MPEG-4 Simple Studio Profile (SStP), defined in ISO/IEC 14496-2, has six levels ranging from SDTV to 4K resolution.[11] MPEG-4 SStP allows for up to 12-bit bit depth and up to 4:4:4 chroma subsampling,[11] using intra-frame coding only.[12] MPEG-4 SStP is used by HDCAM SR.[11]

Levels with maximum property values[11]
Level  Max bit depth and chroma subsampling  Max resolution and frame rate  Max data rate (Mbit/s)
1      10-bit 4:2:2                          SDTV                           180
2      10-bit 4:2:2                          1920×1080 30p/30i              600
3      12-bit 4:4:4                          1920×1080 30p/30i              900
4      12-bit 4:4:4                          2K×2K 30p                      1,350
5      12-bit 4:4:4                          4K×2K 30p                      1,800
6      12-bit 4:4:4                          4K×2K 60p                      3,600
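A short calculation helps put the level limits in perspective by comparing them with the uncompressed rate of the same picture format. The sketch below assumes the data rates in the table are in Mbit/s, counts active picture samples only (no blanking or audio), and treats 4:4:4 as three samples per pixel and 4:2:2 as two on average. The function name is hypothetical.

```python
def raw_bitrate_mbps(width, height, fps, bit_depth, samples_per_pixel):
    """Uncompressed video rate in Mbit/s for the active picture only."""
    return width * height * fps * bit_depth * samples_per_pixel / 1_000_000


# Level 3: 1920x1080 at 30 frames/s, 12-bit 4:4:4 (3 samples per pixel).
level3_raw = raw_bitrate_mbps(1920, 1080, 30, 12, 3)
# Level 2: 1920x1080 at 30 frames/s, 10-bit 4:2:2 (2 samples per pixel).
level2_raw = raw_bitrate_mbps(1920, 1080, 30, 10, 2)

print(f"{level3_raw:.1f} Mbit/s raw, ratio to 900: {level3_raw / 900:.1f}")
print(f"{level2_raw:.1f} Mbit/s raw, ratio to 600: {level2_raw / 600:.1f}")
```

Under these assumptions the level limits sit within a small factor of the raw rate, which is consistent with a lightly compressed, intra-only studio format.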


MPEG-4 Part 2 has drawn some industry criticism. FFmpeg's maintainer Michael Niedermayer has criticised MPEG-4 for lacking an in-loop deblocking filter, GMC being too computationally intensive, and OBMC being defined but not allowed in any profiles among other things.[13] Microsoft's Ben Waggoner states "Microsoft (well before my time) went down the codec standard route before with MPEG-4 part 2, which turns out to be a profound disappointment across the industry - it didn't offer that much of a compression advantage over MPEG-2, and the protracted license agreement discussions scared off a lot of adoption. I was involved in many digital media projects that wouldn't even touch MPEG-4 in the late 1990s to early 2000s because there was going to be a 'content fee' that hadn't been fully defined yet."[14]


  1. ^ a b c ISO. "ISO/IEC 14496-2:2004 - Information technology -- Coding of audio-visual objects -- Part 2: Visual". ISO. Retrieved 2009-11-01.
  2. ^ (2006-08-10). "Riding the Media Bits, End of the Ride?". Archived from the original on 2011-11-01. Retrieved 2010-03-10.
  3. ^ (2003-10-25). "Riding the Media Bits, Inside MPEG-4 - Part B". Archived from the original on 2011-11-01. Retrieved 2010-03-10.
  4. ^ ISO/IEC JTC1/SC29/WG11 (March 2000). "MPEG-4 Video - Frequently Asked Questions". Retrieved 2010-03-10.
  5. ^ Touradj Ebrahimi and Caspar Horne. "MPEG-4 Natural Video Coding - An overview". Archived from the original on 2010-03-22. Retrieved 2010-03-10.
  6. ^ (2009-09-06). "Riding the Media Bits, The development of MPEG-1 - Part A". Archived from the original on 2011-01-22. Retrieved 2010-03-10.
  7. ^ Fernando Pereira. "MPEG-4: Why, What, How and When?". Archived from the original on 2011-10-18. Retrieved 2010-03-10.
  8. ^ MPEG. "MPEG standards - Full list of standards developed or under development". Archived from the original on 2010-04-20. Retrieved 2009-10-31.
  9. ^ ISO. "ISO/IEC 14496-2:1999 - Information technology -- Coding of audio-visual objects -- Part 2: Visual". ISO. Retrieved 2009-11-01.
  10. ^ ISO. "ISO/IEC 14496-2:2001 - Information technology -- Coding of audio-visual objects -- Part 2: Visual". ISO. Retrieved 2009-11-01.
  11. ^ a b c d Yasuhiko Mikami; Hugo Gaggioni. "4K End-to-End HPA Technology Retreat 2010" (PDF). Sony. Retrieved 2012-11-28.
  12. ^ Caroline R. Arms; Carl Fleischhauer; Kate Murray. "MPEG-4, Visual Coding, Simple Studio Profile". Sustainability of Digital Formats. Library of Congress. Retrieved 9 March 2015.
  13. ^ Lair Of The Multimedia Guru » 15 reasons why MPEG4 sucks
  14. ^ VC-1 and H264 - Page 2 - Doom9's Forum

3GP and 3G2

3GP (3GPP file format) is a multimedia container format defined by the Third Generation Partnership Project (3GPP) for 3G UMTS multimedia services. It is used on 3G mobile phones but can also be played on some 2G and 4G phones.

3G2 (3GPP2 file format) is a multimedia container format defined by the 3GPP2 for 3G CDMA2000 multimedia services. It is very similar to the 3GP file format but consumes less space and bandwidth, and it has some extensions and limitations in comparison to 3GP.


3ivx

3ivx (pronounced THRIV-eks) is a video codec suite, created by 3ivx Technologies, based in Sydney, Australia, that allows the creation of MPEG-4 compliant data streams. It has been designed around a need for decreased processing power, mainly for use in embedded systems. The first versions were published in 2001. 3ivx provides plugins and filters that allow the MPEG-4 data stream to be wrapped by the Microsoft ASF and AVI transports, as well as Apple's QuickTime transport. It also allows the creation of elementary MP4 data streams and provides an audio codec for the creation of AAC audio streams. It does not support H.264 video (MPEG-4 Part 10); only MPEG-4 Part 2 video is supported.

Official decoders and encoders are provided for Microsoft Windows, Mac OS and BeOS, with unmaintained older releases for the Amiga and Linux. In addition, FFmpeg can decode 3ivx encoded video.

The company is notable for its support of the Haiku OS, providing a port of the 3ivx codec. The 3ivx port maintainer has also produced a QuickTime MOV extractor and an MPEG-4 extractor for Haiku. As of 2005, it was the only company to support Haiku in any way.

On Microsoft Windows, the 30-day trial version of 3ivx is associated with media players such as QuickTime and iTunes. Uninstalling 3ivx does not reset the 3IVX.dll file if the media players have been updated. Uninstalling and restoring the media players does not solve the problem, as Windows still associates them with 3IVX.dll, which has expired or been uninstalled. As a result, .mp4 files cannot be played in QuickTime or iTunes while Windows is looking for the absent 3IVX.dll. Installing 3ivx MPEG-4 5.0 or later corrects the issue (it can then be uninstalled if desired).

The 3ivx software is also available (in Mac OS X and Windows versions) with the Flip Video series of camcorders from Pure Digital.

3ivx has recently developed an HTTP Live Streaming Client SDK for Windows 8 and Windows Phone 8 for the playback of HLS content in Windows 8 Modern UI apps.


Avidemux

Avidemux is a free and open-source program designed for video editing and video processing. It is written in C++, and uses either GTK+ or Qt for its user interface.


Cinepak

Cinepak is a lossy video codec developed by Peter Barrett at SuperMac Technologies, and released in 1991 with the Video Spigot, and then in 1992 as part of Apple Computer's QuickTime video suite. One of the first video compression tools to achieve full motion video on CD-ROM, it was designed to encode 320×240 resolution video at 1× (150 kbyte/s) CD-ROM transfer rates. The original name of this codec was Compact Video, which is why its FourCC identifier is CVID. The codec was ported to the Microsoft Windows platform in 1993. It was also used on first-generation and some second-generation CD-ROM game consoles, such as the Atari Jaguar CD, Sega CD, Sega Saturn, and 3DO. libavcodec includes a Cinepak decoder and an encoder, both licensed under the terms of the LGPL.

Comparison of video container formats

This table compares features of container formats (video file formats). To see which multimedia players support which container format, look at comparison of media players.

Darwin Streaming Server

Darwin Streaming Server (DSS) was the first open sourced RTP/RTSP streaming server. It was released March 16, 1999 and is a fully featured RTSP/RTP media streaming server capable of streaming a variety of media types including H.264/MPEG-4 AVC, MPEG-4 Part 2 and 3GP.


DivX

DivX is a brand of video codec products developed by DivX, LLC. The DivX codec gained fame for its ability to compress lengthy video segments into small sizes while maintaining relatively high visual quality.

There are three DivX codecs: the original MPEG-4 Part 2 DivX codec, the H.264/MPEG-4 AVC DivX Plus HD codec, and the High Efficiency Video Coding DivX HEVC Ultra HD codec.

The most recent version of the codec itself is version 6.9.2, which is several years old. New version numbers on the packages now reflect updates to the media player, converter, etc.


HDCAM

HDCAM, introduced in 1997, is a high-definition video digital recording videocassette version of digital Betacam, using an 8-bit discrete cosine transform (DCT) compressed 3:1:1 recording, in 1080i-compatible down-sampled resolution of 1440×1080, and adding 24p and 23.976 progressive segmented frame (PsF) modes to later models. The HDCAM codec uses rectangular pixels and as such the recorded 1440×1080 content is upsampled to 1920×1080 on playback. The recorded video bit rate is 144 Mbit/s. Audio is also similar, with four channels of AES3 20-bit, 48 kHz digital audio.
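The relationship between the recorded and displayed frame sizes can be worked out directly. The sketch below derives the pixel aspect ratio implied by stretching 1440 stored samples across 1920 displayed ones, and the resulting display aspect ratio. The function names are hypothetical and the calculation uses only the resolutions quoted above.

```python
from fractions import Fraction


def pixel_aspect_ratio(stored_width: int, display_width: int) -> Fraction:
    """Width-to-height ratio of one stored pixel when shown on a square-pixel display."""
    return Fraction(display_width, stored_width)


def display_aspect_ratio(stored_width: int, stored_height: int, par: Fraction) -> Fraction:
    """Aspect ratio of the picture once rectangular pixels are stretched."""
    return Fraction(stored_width, stored_height) * par


par = pixel_aspect_ratio(1440, 1920)
dar = display_aspect_ratio(1440, 1080, par)
print(par, dar)  # 4/3 16/9
```

Each stored pixel is therefore one third wider than it is tall, and the stretched picture fills a 16:9 frame.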

Like Betacam, HDCAM tapes are produced in small and large cassette sizes; the small cassette uses the same form factor as the original Betamax.

The main competitor to HDCAM is the DVCPRO HD format offered by Panasonic. It uses a similar compression scheme and bit rates ranging from 40 Mbit/s to 100 Mbit/s depending on frame rate.

HDCAM is standardized as SMPTE 367M, also known as SMPTE D-11.

List of open-source codecs

This is a listing of open-source implementations of media formats—usually called codecs. Many of the codecs listed implement media formats that are restricted by patents and are hence not open formats. For example, x264 is a widely used open source implementation of the heavily patent encumbered MPEG-4 AVC media format.


Macroblock

A macroblock is a processing unit in image and video compression formats based on linear block transforms, such as the discrete cosine transform (DCT). A macroblock typically consists of 16×16 samples, and is further subdivided into transform blocks, and may be further subdivided into prediction blocks. Formats which are based on macroblocks include JPEG, where they are called MCU blocks, H.261, MPEG-1 Part 2, H.262/MPEG-2 Part 2, H.263, MPEG-4 Part 2, and H.264/MPEG-4 AVC. In H.265/HEVC, the macroblock as a basic processing unit has been replaced by the coding tree unit.
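As a small worked example, the number of 16×16 macroblocks needed to cover a frame follows from rounding each dimension up to a multiple of the macroblock size. The sketch below assumes the frame is padded up to whole macroblocks, which is the usual approach when a dimension is not a multiple of 16. The function name is hypothetical.

```python
def macroblock_grid(width: int, height: int, mb_size: int = 16):
    """Return (columns, rows, total) of macroblocks covering a frame.

    Dimensions that are not multiples of mb_size are rounded up,
    i.e. the frame is assumed to be padded to whole macroblocks.
    """
    cols = -(-width // mb_size)   # ceiling division
    rows = -(-height // mb_size)
    return cols, rows, cols * rows


print(macroblock_grid(352, 288))    # (22, 18, 396)
print(macroblock_grid(1920, 1080))  # (120, 68, 8160)
```

Note that 1080 is not a multiple of 16, so a 1920×1080 frame is covered by 68 macroblock rows, the last of which extends past the visible picture.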


MediaCoder

MediaCoder is a proprietary transcoding program for Microsoft Windows. It has been developed by Stanley Huang since 2005.

MediaCoder uses various open-source (and several proprietary) audio and video codecs to transcode media files to different audio/video formats. Common uses for the program include compression, file type conversion, remuxing and extraction of audio from video files. Many formats are supported, including MP3, Vorbis, Opus, Advanced Audio Coding (AAC), Windows Media Audio (WMA), RealAudio, WAV, H.264/MPEG-4 AVC, MPEG-4 Part 2, MPEG-2, Audio Video Interleave (AVI), Video CD and DVD-Video.

While MediaCoder is supplied free of charge and is fully functional as long as an internet connection is available, it is supported by bundling the OpenCandy software recommendation service in its installer. During batch conversions, once a certain number of conversions have been performed, a window appears that requires human interaction (a simple CAPTCHA) to dismiss; alternatively, the user can pay a donation to remove the periodic nagging ($15 minimum in 2011, $20 in 2012, $25 as of August 2013). This qualifies the application as nagware. After donating, the user is supplied with an ID which must be entered and verified in the software. If the software is subsequently used without an internet connection, the nags reappear, because the software randomly checks online whether the user has a valid ID, unless a USB license dongle is purchased.

Prior to 2008, MediaCoder was a free and open-source software application and was available on SourceForge. MediaCoder was a nominee for the SourceForge.NET 2007 Community Choice Award for Best Project for Multimedia, along with Audacity, InkScape and FFDShow. In December 2009, however, Stanley Huang announced that the project was no longer hosted on SourceForge and no longer open-source.

Reconfigurable video coding

Reconfigurable Video Coding (RVC) is an MPEG initiative to provide an innovative framework for video coding development. This framework offers a way to overcome the lack of interoperability between the many video codecs deployed in the market. An RVC codec is described using the dataflow programming paradigm, which permits flexibility and reusability. Two standards have been produced by the RVC working group:

The codec configuration representation (ISO/IEC 23001-4 or MPEG-B pt. 4) describes the format with which an RVC decoder can be defined as a network of computational blocks, as well as a textual language for the definition of video coding blocks.

A video tool library (ISO/IEC 23002-4 or MPEG-C pt. 4) that standardizes actors needed to describe existing video coding standards (currently MPEG-4 part 2 and MPEG-4 part 10).


Theora

Theora is a free lossy video compression format. It is developed by the Xiph.Org Foundation and distributed without licensing fees alongside their other free and open media projects, including the Vorbis audio format and the Ogg container.

The libtheora video codec is the reference implementation of the Theora video compression format developed by the Xiph.Org Foundation. Theora is derived from the formerly proprietary VP3 codec, released into the public domain by On2 Technologies. It is broadly comparable in design and bitrate efficiency to MPEG-4 Part 2, early versions of Windows Media Video, and RealVideo, while lacking some of the features present in some of these other codecs. It is comparable in open standards philosophy to the BBC's Dirac codec.

Theora is named after Theora Jones, Edison Carter's Controller on the Max Headroom television program. In 2014, a bug requesting Theora support on Android was closed "Won't Fix (Obsolete)".

Wikipedia stopped preferring Ogg Theora and now prefers WebM.

Variable bitrate

Variable bitrate (VBR) is a term used in telecommunications and computing that relates to the bitrate used in sound or video encoding. As opposed to constant bitrate (CBR), VBR files vary the amount of output data per time segment. VBR allows a higher bitrate (and therefore more storage space) to be allocated to the more complex segments of media files while less space is allocated to less complex segments. The average of these rates can be calculated to produce an average bitrate for the file.
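The average mentioned above is a duration-weighted mean rather than a simple mean of the segment rates. The sketch below illustrates this with made-up segment values; the function name and the numbers are hypothetical.

```python
def average_bitrate(segments):
    """Duration-weighted average bitrate.

    `segments` is a list of (duration_seconds, bitrate_kbps) pairs.
    Returns the average in kbit/s.
    """
    total_time = sum(duration for duration, _ in segments)
    if total_time <= 0:
        raise ValueError("total duration must be positive")
    total_kbits = sum(duration * rate for duration, rate in segments)
    return total_kbits / total_time


# Hypothetical file: a quiet scene, a complex scene, and a moderate scene.
segments = [(10, 500), (5, 2000), (15, 800)]
print(average_bitrate(segments))  # 900.0
```

The complex five-second segment uses four times the rate of the quiet one, yet the file as a whole averages 900 kbit/s because the weighting follows segment duration.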

MP3, WMA and AAC audio files can optionally be encoded in VBR, while Opus and Vorbis are always encoded in VBR. Variable bit rate encoding is also commonly used on MPEG-2 video, MPEG-4 Part 2 video (Xvid, DivX, etc.), MPEG-4 Part 10/H.264 video, Theora, Dirac and other video compression formats. Additionally, variable rate encoding is inherent in lossless compression schemes such as FLAC and Apple Lossless.

Video codec

A video codec is an electronic circuit or software that compresses or decompresses digital video. It converts uncompressed video to a compressed format or vice versa. In the context of video compression, "codec" is a concatenation of "encoder" and "decoder"—a device that only compresses is typically called an encoder, and one that only decompresses is a decoder.

The compressed data format usually conforms to a standard video compression specification. The compression is typically lossy, meaning that the compressed video lacks some information present in the original video. A consequence of this is that decompressed video has lower quality than the original, uncompressed video because there is insufficient information to accurately reconstruct the original video.

There are complex relationships between the video quality, the amount of data used to represent the video (determined by the bit rate), the complexity of the encoding and decoding algorithms, sensitivity to data losses and errors, ease of editing, random access, and end-to-end delay (latency).

Video coding format

A video coding format (or sometimes video compression format) is a content representation format for storage or transmission of digital video content (such as in a data file or bitstream). Examples of video coding formats include MPEG-2 Part 2, MPEG-4 Part 2, H.264 (MPEG-4 Part 10), HEVC, Theora, RealVideo RV40, VP9, and AV1. A specific software or hardware implementation capable of video compression and/or decompression to/from a specific video coding format is called a video codec; an example of a video codec is Xvid, which is one of several different codecs which implements encoding and decoding videos in the MPEG-4 Part 2 video coding format in software.

Some video coding formats are documented by a detailed technical specification document known as a video coding specification. Some such specifications are written and approved by standardization organizations as technical standards, and are thus known as a video coding standard. The term 'standard' is also sometimes used for de facto standards as well as formal standards.

Video content encoded using a particular video coding format is normally bundled with an audio stream (encoded using an audio coding format) inside a multimedia container format such as AVI, MP4, FLV, RealMedia, or Matroska. As such, the user normally doesn't have a H.264 file, but instead has a .mp4 video file, which is an MP4 container containing H.264-encoded video, normally alongside AAC-encoded audio. Multimedia container formats can contain any one of a number of different video coding formats; for example the MP4 container format can contain video in either the MPEG-2 Part 2 or the H.264 video coding format, among others. Another example is the initial specification for the file type WebM, which specified the container format (Matroska), but also exactly which video (VP8) and audio (Vorbis) compression format is used inside the Matroska container, even though the Matroska container format itself is capable of containing other video coding formats (VP9 video and Opus audio support was later added to the WebM specification).

Windows Media Video

Windows Media Video (WMV) is a series of video codecs and their corresponding video coding formats developed by Microsoft. It is part of the Windows Media framework. WMV consists of three distinct codecs. The original video compression technology, known as WMV, was originally designed for Internet streaming applications as a competitor to RealVideo. The other compression technologies, WMV Screen and WMV Image, cater for specialized content. After standardization by the Society of Motion Picture and Television Engineers (SMPTE), WMV version 9 was adapted for physical-delivery formats such as HD DVD and Blu-ray Disc and became known as VC-1. Microsoft also developed a digital container format called Advanced Systems Format to store video encoded by Windows Media Video.

X-Video Bitstream Acceleration

X-Video Bitstream Acceleration (XvBA), designed by AMD Graphics for its Radeon GPU and Fusion APU, is an arbitrary extension of the X video extension (Xv) for the X Window System on Linux operating systems. The XvBA API allows video programs to offload portions of the video decoding process to the GPU video hardware. The portions currently designed to be offloaded by XvBA onto the GPU are motion compensation (MC), inverse discrete cosine transform (IDCT), and variable-length decoding (VLD) for MPEG-2, MPEG-4 ASP (MPEG-4 Part 2, including Xvid, and older DivX and Nero Digital), MPEG-4 AVC (H.264), WMV3, and VC-1 encoded video. XvBA is a direct competitor to NVIDIA's Video Decode and Presentation API for Unix (VDPAU) and Intel's Video Acceleration API (VA API). In November 2009 an XvBA backend for VA API was released, which means any software that supports VA API will also support XvBA. On 24 February 2011, an official XvBA SDK (Software Development Kit) was publicly released by AMD alongside a suite of open source tools.


Xvid

Xvid (formerly "XviD") is a video codec library following the MPEG-4 video coding standard, specifically MPEG-4 Part 2 Advanced Simple Profile (ASP). It uses ASP features such as B-frames, global and quarter-pixel motion compensation, lumi masking, trellis quantization, and H.263, MPEG and custom quantization matrices.

Xvid is a primary competitor of the DivX Pro Codec. In contrast with the DivX codec, which is proprietary software developed by DivX, Inc., Xvid is free software distributed under the terms of the GNU General Public License. This also means that unlike the DivX codec, which is only available for a limited number of platforms, Xvid can be used on all platforms and operating systems for which the source code can be compiled.

This page is based on a Wikipedia article written by its contributors.
Text is available under the CC BY-SA 3.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.