MPEG-4 is a method of defining compression of audio and visual (AV) digital data. It was introduced in late 1998 and designated a standard for a group of audio and video coding formats and related technology agreed upon by the ISO/IEC Moving Picture Experts Group (MPEG) (ISO/IEC JTC1/SC29/WG11) under the formal standard ISO/IEC 14496 – Coding of audio-visual objects. Uses of MPEG-4 include compression of AV data for web (streaming media) and CD distribution, voice (telephone, videophone) and broadcast television applications.


MPEG-4 absorbs many of the features of MPEG-1 and MPEG-2 and other related standards, adding new features such as (extended) VRML support for 3D rendering, object-oriented composite files (including audio, video and VRML objects), support for externally specified Digital Rights Management and various types of interactivity. AAC (Advanced Audio Coding) was standardized as an adjunct to MPEG-2 (as Part 7) before MPEG-4 was issued.

MPEG-4 is still an evolving standard and is divided into a number of parts. Companies promoting MPEG-4 compatibility do not always clearly state which "part" level compatibility they are referring to. The key parts to be aware of are MPEG-4 Part 2 (including Advanced Simple Profile, used by codecs such as DivX, Xvid, Nero Digital and 3ivx and by QuickTime 6) and MPEG-4 Part 10 (MPEG-4 AVC/H.264 or Advanced Video Coding, used by the x264 encoder, Nero Digital AVC, QuickTime 7, and high-definition video media like Blu-ray Disc).

Most of the features included in MPEG-4 are left to individual developers to decide whether or not to implement. This means that there are probably no complete implementations of the entire MPEG-4 set of standards. To deal with this, the standard includes the concept of "profiles" and "levels", allowing a specific set of capabilities to be defined in a manner appropriate for a subset of applications.

Initially, MPEG-4 was aimed primarily at low bit-rate video communications; however, its scope as a multimedia coding standard was later expanded. MPEG-4 is efficient across a variety of bit-rates ranging from a few kilobits per second to tens of megabits per second. MPEG-4 provides the following functions:

  • Improved coding efficiency over MPEG-2[1]
  • Ability to encode mixed media data (video, audio, speech)
  • Error resilience to enable robust transmission
  • Ability to interact with the audio-visual scene generated at the receiver


MPEG-4 provides a series of technologies for developers, for various service-providers and for end users:

  • MPEG-4 enables different software and hardware developers to create multimedia objects possessing better abilities of adaptability and flexibility to improve the quality of such services and technologies as digital television, animation graphics, the World Wide Web and their extensions.
  • Data network providers can use MPEG-4 for data transparency. With the help of standard procedures, MPEG-4 data can be interpreted and transformed into other signal types compatible with any available network.
  • The MPEG-4 format provides end users with a wide range of interaction with various animated objects.
  • Standardized Digital Rights Management signaling, otherwise known in the MPEG community as Intellectual Property Management and Protection (IPMP).

The MPEG-4 format can perform various functions, including the following:

  • Multiplexing and synchronizing the data associated with media objects so that it can be transported efficiently over network channels.
  • Interaction with the audio-visual scene formed at the receiver.
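
As a rough illustration of the multiplexing and synchronization function listed above, the following Python sketch interleaves timestamped units from two streams into a single sequence ordered by time. The `AccessUnit` type, the stream names and the millisecond timestamps are assumptions made for this example; it does not reproduce the actual MPEG-4 Systems syntax.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class AccessUnit(NamedTuple):
    """Hypothetical unit of coded data carrying a presentation timestamp."""
    timestamp_ms: int
    stream: str
    payload: bytes


def multiplex(*streams: Iterable[AccessUnit]) -> Iterator[AccessUnit]:
    """Merge streams that are each already sorted by timestamp."""
    return heapq.merge(*streams, key=lambda unit: unit.timestamp_ms)


# Illustrative input: audio units every 20 ms, video units every 40 ms.
audio = [AccessUnit(t, "audio", b"a") for t in (0, 20, 40, 60)]
video = [AccessUnit(t, "video", b"v") for t in (0, 40)]

muxed = list(multiplex(audio, video))
timestamps = [unit.timestamp_ms for unit in muxed]
```

A receiver reading `muxed` in order sees audio and video units that belong to the same instant next to each other, which is the property a synchronized multiplex needs.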

Profiles and Levels

MPEG-4 provides a large and rich set of tools for encoding. Subsets of the MPEG-4 tool sets have been provided for use in specific applications. These subsets, called 'Profiles', limit the size of the tool set a decoder is required to implement.[2] In order to restrict computational complexity, one or more 'Levels' are set for each Profile.[2] A Profile and Level combination allows:[2]

  • A codec builder to implement only the subset of the standard needed, while maintaining interworking with other MPEG-4 devices that implement the same combination.[2]
  • Checking whether MPEG-4 devices comply with the standard, referred to as conformance testing.[2]
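
A minimal sketch of how a Profile and Level combination can act as a capability check. The limits are copied from the Simple Profile rows of the level table later in this article that list all three values, reading the columns by their units (kbit/s, macroblocks, macroblocks/s). The `LevelLimits` type and the function name are hypothetical, and real conformance testing covers many more constraints.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LevelLimits:
    max_bitrate_kbps: int
    max_macroblocks_per_frame: int
    max_macroblocks_per_second: int


# Simple Profile rows from the level table, in ascending order.
SIMPLE_PROFILE_LEVELS = {
    "L0": LevelLimits(64, 99, 1_485),
    "L2": LevelLimits(128, 396, 5_940),
    "L4a": LevelLimits(4_000, 1_200, 36_000),
    "L6": LevelLimits(12_000, 3_600, 108_000),
}


def lowest_sufficient_level(
    bitrate_kbps: int, mb_per_frame: int, mb_per_second: int
) -> Optional[str]:
    """Return the first listed level whose limits cover the stream, if any."""
    for name, limits in SIMPLE_PROFILE_LEVELS.items():
        if (
            bitrate_kbps <= limits.max_bitrate_kbps
            and mb_per_frame <= limits.max_macroblocks_per_frame
            and mb_per_second <= limits.max_macroblocks_per_second
        ):
            return name
    return None
```

A decoder that advertises Simple Profile at a given level would, under this reading, accept any stream for which the function returns that level or a lower one.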

MPEG-4 Parts

MPEG-4 consists of several standards—termed "parts"—including the following (each part covers a certain aspect of the whole specification):

MPEG-4 parts[3][4]
Part | Number | First public release date (first edition) | Latest public release date (last edition) | Latest amendment | Title | Description
Part 1 | ISO/IEC 14496-1 | 1999 | 2010[5] | 2014[6] | Systems | Describes synchronization and multiplexing of video and audio. For example, the MPEG-4 file format version 1 (obsoleted by version 2 defined in MPEG-4 Part 14). The functionality of a transport protocol stack for transmitting and/or storing content complying with ISO/IEC 14496 is not within the scope of 14496-1 and only the interface to this layer is considered (DMIF). Information about transport of MPEG-4 content is defined e.g. in MPEG-2 Transport Stream, RTP Audio Video Profiles and others.[7][8][9][10][11]
Part 2 | ISO/IEC 14496-2 | 1999 | 2004[12] | 2009 | Visual | A compression format for visual data (video, still textures, synthetic images, etc.). One of the many "profiles" in Part 2 is the Advanced Simple Profile (ASP).
Part 3 | ISO/IEC 14496-3 | 1999 | 2009[13] | 2017[14] | Audio | A set of compression formats for perceptual coding of audio signals, including some variations of Advanced Audio Coding (AAC) as well as other audio/speech coding formats and tools (such as Audio Lossless Coding (ALS), Scalable Lossless Coding (SLS), Structured Audio, Text-To-Speech Interface (TTSI), HVXC, CELP and others)
Part 4 | ISO/IEC 14496-4 | 2000 | 2004[15] | 2016 | Conformance testing | Describes procedures for testing conformance to other parts of the standard.
Part 5 | ISO/IEC 14496-5 | 2000 | 2001[16] | 2017 | Reference software | Provides reference software for demonstrating and clarifying the other parts of the standard.
Part 6 | ISO/IEC 14496-6 | 1999 | 2000[17] | | Delivery Multimedia Integration Framework (DMIF) |
Part 7 | ISO/IEC TR 14496-7 | 2002 | 2004[18] | | Optimized reference software for coding of audio-visual objects | Provides examples of how to make improved implementations (e.g., in relation to Part 5).
Part 8 | ISO/IEC 14496-8 | 2004 | 2004[19] | | Carriage of ISO/IEC 14496 contents over IP networks | Specifies a method to carry MPEG-4 content on IP networks. It also includes guidelines to design RTP payload formats, usage rules of SDP to transport ISO/IEC 14496-1-related information, MIME type definitions, analysis on RTP security and multicasting.
Part 9 | ISO/IEC TR 14496-9 | 2004 | 2009[20] | | Reference hardware description | Provides hardware designs for demonstrating how to implement the other parts of the standard.
Part 10 | ISO/IEC 14496-10 | 2003 | 2014[21] | 2016[22] | Advanced Video Coding (AVC) | A compression format for video signals which is technically identical to the ITU-T H.264 standard.
Part 11 | ISO/IEC 14496-11 | 2005 | 2015[23] | | Scene description and application engine | Can be used for rich, interactive content with multiple profiles, including 2D and 3D versions. MPEG-4 Part 11 revised MPEG-4 Part 1 – ISO/IEC 14496-1:2001 and two amendments to MPEG-4 Part 1. It describes a system level description of an application engine (delivery, lifecycle, format and behaviour of downloadable Java byte code applications) and the Binary Format for Scene (BIFS) and the Extensible MPEG-4 Textual (XMT) format – a textual representation of the MPEG-4 multimedia content using XML, etc.[23] (It is also known as BIFS, XMT, MPEG-J.[24] MPEG-J was defined in MPEG-4 Part 21)
Part 12 | ISO/IEC 14496-12 | 2004 | 2015[25] | 2017[26] | ISO base media file format | A file format for storing time-based media content. It is a general format forming the basis for a number of other more specific file formats (e.g. 3GP, Motion JPEG 2000, MPEG-4 Part 14). It is technically identical to ISO/IEC 15444-12 (JPEG 2000 image coding system – Part 12).
Part 13 | ISO/IEC 14496-13 | 2004 | 2004[27] | | Intellectual Property Management and Protection (IPMP) Extensions | MPEG-4 Part 13 revised an amendment to MPEG-4 Part 1 – ISO/IEC 14496-1:2001/Amd 3:2004. It specifies common Intellectual Property Management and Protection (IPMP) processing, syntax and semantics for the carriage of IPMP tools in the bit stream, IPMP information carriage, mutual authentication for IPMP tools, a list of registration authorities required for the support of the amended specifications (e.g. CISAC), etc. It was defined due to the lack of interoperability of different protection mechanisms (different DRM systems) for protecting and distributing copyrighted digital content such as music or video.[28][29][30][31][32][33][34][35][36]
Part 14 | ISO/IEC 14496-14 | 2003 | 2003[37] | 2010[38] | MP4 file format | It is also known as "MPEG-4 file format version 2". The designated container file format for MPEG-4 content, which is based on Part 12. It revises and completely replaces Clause 13 of ISO/IEC 14496-1 (MPEG-4 Part 1: Systems), in which the MPEG-4 file format was previously specified.
Part 15 | ISO/IEC 14496-15 | 2004 | 2017[39] | | Carriage of network abstraction layer (NAL) unit structured video in the ISO base media file format | For storage of Part 10 video. File format is based on Part 12, but also allows storage in other file formats.
Part 16 | ISO/IEC 14496-16 | 2004 | 2011[40] | 2016[41] | Animation Framework eXtension (AFX) | It specifies MPEG-4 Animation Framework eXtension (AFX) model for representing 3D Graphics content. MPEG-4 is extended with higher-level synthetic objects for specifying geometry, texture, animation and dedicated compression algorithms.
Part 17 | ISO/IEC 14496-17 | 2006 | 2006[42] | | Streaming text format | Timed Text subtitle format
Part 18 | ISO/IEC 14496-18 | 2004 | 2004[43] | 2014 | Font compression and streaming | For Open Font Format defined in Part 22.
Part 19 | ISO/IEC 14496-19 | 2004 | 2004[44] | | Synthesized texture stream | Synthesized texture streams are used for creation of very low bitrate synthetic video clips.
Part 20 | ISO/IEC 14496-20 | 2006 | 2008[45] | 2010 | Lightweight Application Scene Representation (LASeR) and Simple Aggregation Format (SAF) | LASeR requirements (compression efficiency, code and memory footprint) are fulfilled by building upon the existing Scalable Vector Graphics (SVG) format defined by the World Wide Web Consortium.[46]
Part 21 | ISO/IEC 14496-21 | 2006 | 2006[47] | | MPEG-J Graphics Framework eXtensions (GFX) | Describes a lightweight programmatic environment for advanced interactive multimedia applications – a framework that marries a subset of the MPEG standard Java application environment (MPEG-J) with a Java API.[24][47][48][49] (at "FCD" stage in July 2005, FDIS January 2006, published as ISO standard on 2006-11-22).
Part 22 | ISO/IEC 14496-22 | 2007 | 2015[50] | 2017 | Open Font Format | OFFS is based on the OpenType version 1.4 font format specification, and is technically equivalent to that specification.[51][52] Reached "CD" stage in July 2005, published as ISO standard in 2007
Part 23 | ISO/IEC 14496-23 | 2008 | 2008[53] | | Symbolic Music Representation (SMR) | Reached "FCD" stage in October 2006, published as ISO standard on 2008-01-28
Part 24 | ISO/IEC TR 14496-24 | 2008 | 2008[54] | | Audio and systems interaction | Describes the desired joint behavior of MPEG-4 File Format and MPEG-4 Audio.
Part 25 | ISO/IEC 14496-25 | 2009 | 2011[55] | | 3D Graphics Compression Model | Defines a model for connecting 3D Graphics Compression tools defined in MPEG-4 standards to graphics primitives defined in any other standard or specification.
Part 26 | ISO/IEC 14496-26 | 2010 | 2010[56] | 2016 | Audio Conformance |
Part 27 | ISO/IEC 14496-27 | 2009 | 2009[57] | 2015[58] | 3D Graphics conformance | 3D Graphics Conformance summarizes the requirements, cross references them to characteristics, and defines how conformance with them can be tested. Guidelines are given on constructing tests to verify decoder conformance.
Part 28 | ISO/IEC 14496-28 | 2012 | 2012[59] | | Composite font representation |
Part 29 | ISO/IEC 14496-29 | 2014 | 2015 | | Web video coding | Text of Part 29 is derived from Part 10 - ISO/IEC 14496-10. Web video coding is a technology that is compatible with the Constrained Baseline Profile of ISO/IEC 14496-10 (the subset that is specified in Annex A for Constrained Baseline is a normative specification, while all remaining parts are informative).
Part 30 | ISO/IEC 14496-30 | 2014 | 2014 | | Timed text and other visual overlays in ISO base media file format | It describes the carriage of some forms of timed text and subtitle streams in files based on ISO/IEC 14496-12 - W3C Timed Text Markup Language 1.0, W3C WebVTT (Web Video Text Tracks). The documentation of these forms does not preclude other definition of carriage of timed text or subtitles; see, for example, 3GPP Timed Text (3GPP TS 26.245).
Part 31 | ISO/IEC 14496-31 | Under development (2018-05) | | | Video Coding for Browsers | Video Coding for Browsers (VCB) - a video compression technology that is intended for use within World Wide Web browsers
Part 32 | ISO/IEC CD 14496-32 | Under development | | | Conformance and reference software |
Part 33 | ISO/IEC FDIS 14496-33 | Under development | | | Internet video coding |

Profiles are also defined within the individual "parts", so an implementation of a part is ordinarily not an implementation of an entire part.

MPEG-1, MPEG-2, MPEG-7 and MPEG-21 are other suites of MPEG standards.

MPEG-4 Levels

The lower levels are part of the MPEG-4 video encoding/decoding constraints and are compatible with the older ITU H.261 standard, as well as with former analog TV standards for broadcast and recording (such as NTSC or PAL video). The ASP at its highest level is suitable for most DVD media and players and for many online video sites, but not for Blu-ray Discs or online HD video content.

Profile Level Max.
Max. framesize
@ max.
@ 30 Hz @ 25 Hz @ 24 Hz @ 15 Hz @ 12.5 Hz
SP L0 160 64 2.50 2,048 99 1,485 QCIF (176×144)
L0b 320 128
L1 160 64 128×96 144×96 160×96
L2 640 128 5.00 4,096 396 5,940 256×192 304×192, 288×208 304×208 CIF (352×288)
L3 384 1.66 8,192 11,880 CIF (352×288)
L4a 1,280 4,000 0.32 16,384 1,200 36,000 VGA (640×480)
L5 1,792 8,000 0.22 1,620 40,500 D1 NTSC (720×480) D1 PAL (720×576)
L6 3,968 12,000 0.33 3,600 108,000 720p (1280x720)
ASP L0 160 128 1.25 2,048 99 2,970 QCIF (176×144)
L2 640 384 1.66 4,096 396 5,940 256×192 304×192, 288×208 304×208 CIF (352×288)
L3 768 0.86 11,880 CIF (352×288)
L3b 1,040 1,500 0.69
L4 1,280 3,000 0.43 8,192 792 23,760 352×576, 704×288
L5 1,792 8,000 0.22 16,384 1,620 48,600 720×576
Units kbits kbits/s seconds bits macroblocks macroblocks/s pixels
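
Several columns of the table are related by simple arithmetic, which the following Python sketch checks for a few rows. It assumes 16×16-pixel macroblocks and reads the "seconds" column as buffer size divided by bit rate; that reading of the column headers and all function names are assumptions made for illustration.

```python
import math

MACROBLOCK_SIZE = 16  # assumed luma macroblock edge length in pixels


def macroblocks_per_frame(width: int, height: int) -> int:
    """Number of 16x16 macroblocks needed to cover one frame."""
    return math.ceil(width / MACROBLOCK_SIZE) * math.ceil(height / MACROBLOCK_SIZE)


def macroblocks_per_second(width: int, height: int, fps: int) -> int:
    """Macroblock throughput required at a given frame rate."""
    return macroblocks_per_frame(width, height) * fps


def buffer_delay_seconds(buffer_kbits: float, bitrate_kbps: float) -> float:
    """Time needed to fill the buffer at the maximum bit rate."""
    return buffer_kbits / bitrate_kbps


qcif_blocks = macroblocks_per_frame(176, 144)        # SP L0 frame size
qcif_rate = macroblocks_per_second(176, 144, 15)     # QCIF at 15 Hz
hd_rate = macroblocks_per_second(1280, 720, 30)      # SP L6 frame size at 30 Hz
sp_l0_delay = buffer_delay_seconds(160, 64)          # SP L0 buffer and bit rate
```

With these inputs the results match the table entries for SP L0 (99 macroblocks, 1,485 macroblocks/s, 2.50 seconds) and SP L6 (108,000 macroblocks/s).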

More advanced profiles for HD media were defined later in AVC, which is technically identical to the ITU-T H.264 standard and is integrated into MPEG-4 as Part 10 (see H.264/MPEG-4 AVC for the list of levels defined for AVC).


MPEG-4 contains patented technologies, the use of which requires licensing in countries that acknowledge software algorithm patents. Over two dozen companies claim to have patents covering MPEG-4. MPEG LA[60] licenses patents required for MPEG-4 Part 2 Visual from a wide range of companies (audio is licensed separately) and lists all of its licensors and licensees on its website. New licenses for MPEG-4 Systems patents are under development,[61] and no new licenses are being offered in the meantime; holders of the old MPEG-4 Systems license remain covered under the terms of that license for the patents listed (MPEG LA – Patent List).

  1. ^ Wiegand, T; Sullivan, G J; Bjontegaard, G; Luthra, A. "Overview of the H.264/AVC video coding standard". Retrieved 19 October 2018.
  2. ^ a b c d e RFC 3640, IETF, p. 31.
  3. ^ MPEG. "MPEG standards – Full list of standards developed or under development". Chiariglione. Archived from the original on 2010-04-20. Retrieved 2010-02-09.
  4. ^ ISO/IEC JTC 1/SC 29 (2009-11-09). "Programme of Work – MPEG-4 (Coding of audio-visual objects)". Archived from the original on 2013-12-31. Retrieved 2009-11-10.
  5. ^ "ISO/IEC 14496-1:2010 – Information technology — Coding of audio-visual objects — Part 1: Systems". Retrieved 2017-08-30.
  6. ^ ISO. "ISO/IEC 14496-1:2010/Amd 2:2014 – Support for raw audio-visual data". Retrieved 2017-08-30.
  7. ^ ISO/IEC (2004-11-15), ISO/IEC 14496-1:2004 – Third edition 2004-11-15 – Information technology — Coding of audio-visual objects — Part 1: Systems (PDF), archived from the original (PDF) on 2017-08-31, retrieved 2010-04-11
  8. ^ WG11 (MPEG) (March 2002). "Overview of the MPEG-4 Standard". Retrieved 2010-04-11.
  9. ^ WG11 (1997-11-21), Text for CD 14496-1 Systems (MS Word .doc), retrieved 2010-04-11
  10. ^ "MPEG-4 Systems Elementary Stream Management (ESM)". July 2001. Retrieved 2010-04-11.
  11. ^ "MPEG Systems (1-2-4-7) FAQ, Version 17.0". July 2001. Retrieved 2010-04-11.
  12. ^ "ISO/IEC 14496-2:2004 – Information technology — Coding of audio-visual objects — Part 2: Visual". ISO. Retrieved 2017-08-30.
  13. ^ "ISO/IEC 14496-3:2009 – Information technology — Coding of audio-visual objects — Part 3: Audio". ISO. Retrieved 2017-08-30.
  14. ^ "ISO/IEC 14496-3:2009/Amd 6:2017, Profiles, levels and downmixing method for 22.2 channel programs". ISO. 2017. Retrieved 2017-08-30.
  15. ^ "ISO/IEC 14496-4:2004 – Information technology — Coding of audio-visual objects — Part 4: Conformance testing". ISO. Retrieved 2017-08-30.
  16. ^ "ISO/IEC 14496-5:2001 – Information technology — Coding of audio-visual objects — Part 5: Reference software". ISO. Retrieved 2017-08-30.
  17. ^ "ISO/IEC 14496-6:2000 – Information technology — Coding of audio-visual objects — Part 6: Delivery Multimedia Integration Framework (DMIF)". ISO. Retrieved 2017-08-30.
  18. ^ "ISO/IEC TR 14496-7:2004 – Information technology — Coding of audio-visual objects — Part 7: Optimized reference software for coding of audio-visual objects". ISO. Retrieved 2017-08-30.
  19. ^ "ISO/IEC 14496-8:2004 – Information technology — Coding of audio-visual objects — Part 8: Carriage of ISO/IEC 14496 contents over IP networks". ISO. Retrieved 2017-08-30.
  20. ^ "ISO/IEC TR 14496-9:2009 – Information technology — Coding of audio-visual objects — Part 9: Reference hardware description". ISO. Retrieved 2017-08-30.
  21. ^ "ISO/IEC 14496-10:2014 – Information technology — Coding of audio-visual objects — Part 10: Advanced Video Coding". ISO. Retrieved 2017-08-30.
  22. ^ "ISO/IEC 14496-10:2014/Amd 3:2016 – Constrained Additional supplemental enhancement information". ISO. Retrieved 2017-08-30.
  23. ^ a b "ISO/IEC 14496-11:2015 – Information technology — Coding of audio-visual objects — Part 11: Scene description and application engine". ISO. Retrieved 2017-08-30.
  24. ^ a b "MPEG-J White Paper". July 2005. Retrieved 2010-04-11.
  25. ^ "ISO/IEC 14496-12:2015 – Information technology — Coding of audio-visual objects — Part 12: ISO base media file format". ISO. Retrieved 2014-01-19.
  26. ^ ISO. "ISO/IEC 14496-12:2015/Amd 1:2017 – DRC Extensions". Retrieved 2017-08-30.
  27. ^ "ISO/IEC 14496-13:2004 – Information technology — Coding of audio-visual objects — Part 13: Intellectual Property Management and Protection (IPMP) extensions". ISO. Retrieved 2017-08-30.
  28. ^ MPEG (March 2002), FPDAM ISO/IEC 14496-1:2001 / AMD3 (Final Proposed Draft Amendment), archived from the original (MS Word .doc) on 2014-05-12, retrieved 2010-08-01
  29. ^ "MPEG-4 IPMPX white paper". MPEG. July 2005. Retrieved 2010-08-01.
  30. ^ "MPEG Intellectual Property Management and Protection". MPEG. April 2009. Retrieved 2010-08-01.
  31. ^ MPEG-4 IPMP Extension – For Interoperable Protection of Multimedia Content (PDF), 2004, archived from the original (PDF) on 2010-06-18, retrieved 2010-08-01
  32. ^ "MPEG Registration Authority – IPMP". MPEG RA International Agency (CISAC). Archived from the original on 2007-06-16. Retrieved 2010-08-01.
  33. ^ "MPEG RA – FAQ IPMP". MPEG RA International Agency (CISAC). Retrieved 2010-08-01.
  34. ^ "Intellectual Property Management and Protection Registration Authority". CISAC. 2004-12-05. Archived from the original on 2004-12-05. Retrieved 2010-08-01.
  35. ^ Chiariglione, Leonardo (2003), Digital media: Can content, business and users coexist?, Torino, IT: Telecom Italia Lab, archived from the original on 2011-07-25, retrieved 2010-08-01
  36. ^ IPMP in MPEG – W3C DRM workshop 22/23 January 2001 (PPT), retrieved 2010-08-01
  37. ^ ISO. "ISO/IEC 14496-14:2003 – Information technology — Coding of audio-visual objects — Part 14: MP4 file format". Retrieved 2017-08-30.
  38. ^ "ISO/IEC 14496-14:2003/Amd 1:2010 – Handling of MPEG-4 audio enhancement layers". ISO. Retrieved 2017-08-30.
  39. ^ "ISO/IEC 14496-15:2017 – Information technology — Coding of audio-visual objects — Part 15: Carriage of network abstraction layer (NAL) unit structured video in the ISO base media file format". ISO. Retrieved 2017-08-30.
  40. ^ "ISO/IEC 14496-16:2011 – Information technology — Coding of audio-visual objects — Part 16: Animation Framework eXtension (AFX)". ISO. Retrieved 2017-08-30.
  41. ^ "ISO/IEC 14496-16:2011/Amd 3:2016 – Printing material and 3D graphics coding for browsers". Retrieved 2017-08-30.
  42. ^ "ISO/IEC 14496-17:2006 – Information technology — Coding of audio-visual objects — Part 17: Streaming text format". ISO. Retrieved 2017-08-30.
  43. ^ "ISO/IEC 14496-18:2004 – Information technology — Coding of audio-visual objects — Part 18: Font compression and streaming". ISO. Retrieved 2017-08-30.
  44. ^ "ISO/IEC 14496-19:2004 – Information technology – Coding of audio-visual objects — Part 19: Synthesized texture stream". ISO. Retrieved 2017-08-30.
  45. ^ "ISO/IEC 14496-20:2008 – Information technology — Coding of audio-visual objects — Part 20: Lightweight Application Scene Representation (LASeR) and Simple Aggregation Format (SAF)". ISO. Retrieved 2017-08-30.
  46. ^ "MPEG-4 LASeR white paper". July 2005. Retrieved 2010-04-11.
  47. ^ a b "ISO/IEC 14496-21:2006 – Information technology — Coding of audio-visual objects — Part 21: MPEG-J Graphics Framework eXtensions (GFX)". ISO. Retrieved 2017-08-30.
  48. ^ "MPEG-4 Systems MPEG-J". July 2001. Retrieved 2010-04-11.
  49. ^ "MPEG-J GFX white paper". July 2005. Retrieved 2010-04-11.
  50. ^ "ISO/IEC 14496-22:2009 – Information technology — Coding of audio-visual objects — Part 22: Open Font Format". ISO. Retrieved 2017-08-30.
  51. ^ ISO/IEC JTC 1/SC 29/WG 11 (July 2008). "ISO/IEC 14496-22 "Open Font Format"". Chiariglione. Retrieved 2010-02-09.
  52. ^ "ISO/IEC 14496-22 Information technology — Coding of audio-visual objects — Part 22: Open Font Format" (Zip) (first ed.). 2007-03-15. Retrieved 2010-01-28.
  53. ^ "ISO/IEC 14496-23:2008 – Information technology — Coding of audio-visual objects — Part 23: Symbolic Music Representation". ISO. Retrieved 2017-08-30.
  54. ^ "ISO/IEC TR 14496-24:2008 – Information technology — Coding of audio-visual objects — Part 24: Audio and systems interaction". ISO. Retrieved 2017-08-30.
  55. ^ "ISO/IEC 14496-25:2011 – Information technology — Coding of audio-visual objects — Part 25: 3D Graphics Compression Model". ISO. Retrieved 2017-08-30.
  56. ^ "ISO/IEC 14496-26:2010 – Information technology — Coding of audio-visual objects — Part 26: Audio conformance". ISO. Retrieved 2017-08-30.
  57. ^ "ISO/IEC 14496-27:2009 – Information technology — Coding of audio-visual objects — Part 27: 3D Graphics conformance". ISO. Retrieved 2017-08-30.
  58. ^ ISO. "ISO/IEC 14496-27:2009/Amd 6:2015 – Pattern-based 3D mesh coding conformance". Retrieved 2017-08-30.
  59. ^ "ISO/IEC CD 14496-28 – Information technology — Coding of audio-visual objects — Part 28: Composite font representation". ISO. Retrieved 2017-08-30.
  60. ^ MPEG Licensing Authority – MPEG-4 Visual: Introduction
  61. ^ MPEG Licensing Authority – MPEG-4 Systems: Introduction

The MPEG-4 Low Delay Audio Coder (a.k.a. AAC Low Delay, or AAC-LD) is an audio compression standard designed to combine the advantages of perceptual audio coding with the low delay necessary for two-way communication. It is closely derived from the MPEG-2 Advanced Audio Coding (AAC) standard. It was published in MPEG-4 Audio Version 2 (ISO/IEC 14496-3:1999/Amd 1:2000) and in its later revisions.

Advanced Audio Coding

Advanced Audio Coding (AAC) is an audio coding standard for lossy digital audio compression. Designed to be the successor of the MP3 format, AAC generally achieves better sound quality than MP3 at the same bit rate. The confusingly named AAC+ (HE-AAC) does so only at low bit rates and less so at high ones.

AAC has been standardized by ISO and IEC, as part of the MPEG-2 and MPEG-4 specifications. Part of AAC, HE-AAC (AAC+), is part of MPEG-4 Audio and also adopted into digital radio standards DAB+ and Digital Radio Mondiale, as well as mobile television standards DVB-H and ATSC-M/H.

AAC supports inclusion of 48 full-bandwidth (up to 96 kHz) audio channels in one stream plus 16 low frequency effects (LFE, limited to 120 Hz) channels, up to 16 "coupling" or dialog channels, and up to 16 data streams. Stereo quality is adequate for modest requirements at 96 kbit/s in joint stereo mode; however, hi-fi transparency demands data rates of at least 128 kbit/s (VBR). Tests of MPEG-4 audio have shown that AAC meets the requirements referred to as "transparent" for the ITU at 128 kbit/s for stereo, and 320 kbit/s for 5.1 audio.

AAC is the default or standard audio format for YouTube, iPhone, iPod, iPad, Nintendo DSi, Nintendo 3DS, iTunes, DivX Plus Web Player, PlayStation 3 and various Nokia Series 40 phones. It is supported on PlayStation Vita, Wii (with the Photo Channel 1.1 update installed), Sony Walkman MP3 series and later, Android and BlackBerry. AAC is also supported by manufacturers of in-dash car audio systems.

Audio Lossless Coding

MPEG-4 Audio Lossless Coding, also known as MPEG-4 ALS, is an extension to the MPEG-4 Part 3 audio standard to allow lossless audio compression. The extension was finalized in December 2005 and published as ISO/IEC 14496-3:2005/Amd 2:2006 in 2006. The latest description of MPEG-4 ALS was published as subpart 11 of the MPEG-4 Audio standard (ISO/IEC 14496-3:2009) (4th edition) in August 2009.

MPEG-4 ALS combines a short-term predictor and a long-term predictor. The short-term predictor is similar to FLAC in its operation: it is a quantized LPC predictor with a losslessly coded residual using Golomb-Rice coding or Block Gilbert-Moore Coding (BGMC). The long-term predictor is modeled by 5 long-term weighted residues, each with its own lag (delay). The lag can be hundreds of samples. This predictor improves the compression for sounds with rich harmonics (containing multiples of a single fundamental frequency, locked in phase) present in many musical instruments and the human voice.
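
The idea of a short-term predictor followed by Rice coding of the residual can be sketched in Python as below. This is a simplified stand-in and not the ALS bitstream: it assumes a fixed integer second-order predictor instead of adaptive quantized LPC, omits the long-term predictor and BGMC, and writes bits as a text string. All function names and the sample values are made up for illustration.

```python
from typing import List, Sequence


def predict(history: Sequence[int], coeffs: Sequence[int]) -> int:
    """Weighted sum of the most recent samples, newest first."""
    return sum(c * h for c, h in zip(coeffs, reversed(history)))


def encode_residuals(samples: Sequence[int], coeffs: Sequence[int]) -> List[int]:
    order = len(coeffs)
    return [x - predict(samples[max(0, n - order):n], coeffs)
            for n, x in enumerate(samples)]


def decode_samples(residuals: Sequence[int], coeffs: Sequence[int]) -> List[int]:
    order, out = len(coeffs), []
    for r in residuals:
        out.append(r + predict(out[max(0, len(out) - order):], coeffs))
    return out


def zigzag(v: int) -> int:
    """Map signed to unsigned: 0, -1, 1, -2, ... -> 0, 1, 2, 3, ..."""
    return 2 * v if v >= 0 else -2 * v - 1


def unzigzag(u: int) -> int:
    return u // 2 if u % 2 == 0 else -((u + 1) // 2)


def rice_encode(values: Sequence[int], k: int) -> str:
    """Unary-coded quotient, a '0' terminator, then a k-bit remainder."""
    bits = []
    for v in values:
        u = zigzag(v)
        quotient, remainder = u >> k, u & ((1 << k) - 1)
        bits.append("1" * quotient + "0" + (format(remainder, f"0{k}b") if k else ""))
    return "".join(bits)


def rice_decode(bits: str, k: int, count: int) -> List[int]:
    out, pos = [], 0
    for _ in range(count):
        quotient = 0
        while bits[pos] == "1":
            quotient, pos = quotient + 1, pos + 1
        pos += 1  # skip the '0' terminator
        remainder = int(bits[pos:pos + k], 2) if k else 0
        pos += k
        out.append(unzigzag((quotient << k) | remainder))
    return out


COEFFS = [2, -1]  # predicts x[n] as 2*x[n-1] - x[n-2] (linear extrapolation)
samples = [0, 3, 6, 9, 12, 14, 16, 17]
residuals = encode_residuals(samples, COEFFS)
bitstring = rice_encode(residuals, k=1)
decoded = decode_samples(rice_decode(bitstring, 1, len(samples)), COEFFS)
```

Because the decoder repeats the same prediction from already reconstructed samples, adding the decoded residual recovers each sample exactly, which is what makes the scheme lossless. Smooth signals give small residuals, and small residuals give short Rice codes.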

DD Free Dish

DD Free Dish (previously known as DD Direct Plus) is an Indian free-to-air digital direct-broadcast satellite television service. It is owned and operated by the public service broadcaster Doordarshan (Prasar Bharati). It has a reach of approximately 30 million households, which is 15% of the total TV households in the country. It is the only free-to-air satellite television service in India. After an upgrade, the INSAT-4B satellite at 93.5° was used to broadcast 64 FTA MPEG-2 channels and 29 radio channels.

With the latest upgrade, to GSAT-15 at 93.5° on 1 February 2016, the present capacity is likely to be enhanced to 104 SDTV channels and 40 radio channels in the near future with the introduction of a new MPEG-4, DVB-S2 stream. DD Free Dish DTH viewers can watch 80 SD MPEG-2 TV channels and 18 SD MPEG-4 channels in addition to 40 radio channels. A new logo for DD Free Dish was added on its transponders on 29 October 2014. DD Free Dish also offers its slots to private channels.

DD Free Dish TV channels can currently be received using simple free-to-air set-top boxes with DVB-S and DVB-S2 technology. Many such set-top boxes are available, both online and in offline markets. Prasar Bharati recently selected a few Indian manufacturers to produce iCAS-enabled set-top boxes with MPEG-4 technology.

There are no monthly charges for access to the DD Free Dish DTH service. It is India's only free DTH (direct-to-home) service.


DVB-T is an abbreviation for "Digital Video Broadcasting — Terrestrial"; it is the DVB European-based consortium standard for the broadcast transmission of digital terrestrial television that was first published in 1997 and first broadcast in the UK in 1998. This system transmits compressed digital audio, digital video and other data in an MPEG transport stream, using coded orthogonal frequency-division multiplexing (COFDM or OFDM) modulation. It is also the format widely used worldwide (including North America) for Electronic News Gathering for transmission of video and audio from a mobile newsgathering vehicle to a central receive point.

Delivery Multimedia Integration Framework

DMIF, or Delivery Multimedia Integration Framework, is a uniform interface between the application and the transport, that allows the MPEG-4 application developer to stop worrying about that transport. DMIF was defined in MPEG-4 Part 6 (ISO/IEC 14496-6) in 1999. DMIF defines two interfaces: the DAI (DMIF/Application Interface) and the DNI (DMIF-Network Interface). A single application can run on different transport layers when supported by the right DMIF instantiation.
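
The transport independence described above can be pictured with a small Python sketch in which the application code depends only on an abstract delivery interface. The class and method names here are hypothetical and are not the primitives defined by the DAI; the two implementations merely stand in for a local-storage and a remote instantiation.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, List


class DeliveryLayer(ABC):
    """Hypothetical stand-in for the application-facing side of DMIF."""

    @abstractmethod
    def open_channel(self, name: str) -> int:
        """Open a channel for a named stream and return its handle."""

    @abstractmethod
    def read(self, channel: int) -> bytes:
        """Return the data available on an open channel."""


class LocalStorageDelivery(DeliveryLayer):
    """Serves streams from an in-memory mapping standing in for a file."""

    def __init__(self, files: Dict[str, bytes]) -> None:
        self._files = dict(files)
        self._channels: List[bytes] = []

    def open_channel(self, name: str) -> int:
        self._channels.append(self._files[name])
        return len(self._channels) - 1

    def read(self, channel: int) -> bytes:
        return self._channels[channel]


class RemoteDelivery(DeliveryLayer):
    """Delegates to a caller-supplied fetch function standing in for a network."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self._fetch = fetch
        self._names: List[str] = []

    def open_channel(self, name: str) -> int:
        self._names.append(name)
        return len(self._names) - 1

    def read(self, channel: int) -> bytes:
        return self._fetch(self._names[channel])


def play(delivery: DeliveryLayer, stream_name: str) -> bytes:
    """Application code: identical for every delivery implementation."""
    channel = delivery.open_channel(stream_name)
    return delivery.read(channel)
```

The `play` function never learns which transport it is using, which is the separation of concerns the paragraph attributes to DMIF.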

MPEG-4 DMIF supports the following functionalities:

  • A transparent MPEG-4 DMIF-application interface irrespective of whether the peer is a remote interactive peer, broadcast or local storage media.
  • Control of the establishment of FlexMux channels.
  • Use of homogeneous networks between interactive peers: IP, ATM, mobile, PSTN, narrowband ISDN.
  • Support for mobile networks, developed together with ITU-T.
  • UserCommands with acknowledgment messages.
  • Management of MPEG-4 Sync Layer information.

DMIF expands upon the MPEG-2 DSM-CC standard (ISO/IEC 13818-6:1998) to enable the convergence of interactive, broadcast and conversational multimedia into one specification applicable to set-top boxes, desktops and mobile stations. The DSM-CC work was extended as part of ISO/IEC 14496-6, with the DSM-CC Multimedia Integration Framework (DMIF). DSM-CC stands for Digital Storage Media - Command and Control. DMIF was also the name of a working group within the Moving Picture Experts Group. The acronym "DSM-CC" was replaced by "Delivery" (Delivery Multimedia Integration Framework) in 1997.

H.264/MPEG-4 AVC

H.264 or MPEG-4 Part 10, Advanced Video Coding (MPEG-4 AVC) is a block-oriented motion-compensation-based video compression standard. As of 2014, it is one of the most commonly used formats for the recording, compression, and distribution of video content. It supports resolutions up to 8192×4320, including 8K UHD.

The intent of the H.264/AVC project was to create a standard capable of providing good video quality at substantially lower bit rates than previous standards (i.e., half or less the bit rate of MPEG-2, H.263, or MPEG-4 Part 2), without increasing the complexity of design so much that it would be impractical or excessively expensive to implement. An additional goal was to provide enough flexibility to allow the standard to be applied to a wide variety of applications on a wide variety of networks and systems, including low and high bit rates, low and high resolution video, broadcast, DVD storage, RTP/IP packet networks, and ITU-T multimedia telephony systems. The H.264 standard can be viewed as a "family of standards" composed of a number of different profiles. A specific decoder decodes at least one, but not necessarily all profiles. The decoder specification describes which profiles can be decoded. H.264 is typically used for lossy compression, although it is also possible to create truly lossless-coded regions within lossy-coded pictures or to support rare use cases for which the entire encoding is lossless.

H.264 was developed by the ITU-T Video Coding Experts Group (VCEG) together with the ISO/IEC JTC1 Moving Picture Experts Group (MPEG). The project partnership effort is known as the Joint Video Team (JVT). The ITU-T H.264 standard and the ISO/IEC MPEG-4 AVC standard (formally, ISO/IEC 14496-10 – MPEG-4 Part 10, Advanced Video Coding) are jointly maintained so that they have identical technical content. The final drafting work on the first version of the standard was completed in May 2003, and various extensions of its capabilities have been added in subsequent editions. High Efficiency Video Coding (HEVC), also known as H.265 and MPEG-H Part 2, is a successor to H.264/MPEG-4 AVC developed by the same organizations, while earlier standards are still in common use.

H.264 is perhaps best known as being one of the video encoding standards for Blu-ray Discs; all Blu-ray Disc players must be able to decode H.264. It is also widely used by streaming Internet sources, such as videos from Vimeo, YouTube, and the iTunes Store, by Web software such as the Adobe Flash Player and Microsoft Silverlight, and by various HDTV broadcasts over terrestrial (ATSC, ISDB-T, DVB-T or DVB-T2), cable (DVB-C), and satellite (DVB-S and DVB-S2) systems.

H.264 is protected by patents owned by various parties. A license covering most (but not all) patents essential to H.264 is administered by patent pool MPEG LA. Commercial use of patented H.264 technologies requires the payment of royalties to MPEG LA and other patent owners. MPEG LA has allowed the free use of H.264 technologies for streaming Internet video that is free to end users, and Cisco Systems pays royalties to MPEG LA on behalf of the users of binaries for its open source H.264 encoder.

Harmonic Vector Excitation Coding

Harmonic Vector Excitation Coding, abbreviated as HVXC, is a speech coding algorithm specified in the MPEG-4 Part 3 (MPEG-4 Audio) standard for very low bit rate speech coding. HVXC supports bit rates of 2 and 4 kbit/s in the fixed and variable bit rate modes at a sampling frequency of 8 kHz. It also operates at lower bit rates, such as 1.2 to 1.7 kbit/s, using a variable bit rate technique. The total algorithmic delay for the encoder and decoder is 36 ms.

It was published as subpart 2 of ISO/IEC 14496-3:1999 (MPEG-4 Audio) in 1999. An extended version of HVXC was published in MPEG-4 Audio Version 2 (ISO/IEC 14496-3:1999/Amd 1:2000).

The MPEG-4 Natural Speech Coding Tool Set uses two algorithms: HVXC and CELP (Code Excited Linear Prediction). HVXC is used at a low bit rate of 2 or 4 kbit/s. Bit rates above 4 kbit/s, as well as 3.85 kbit/s, are covered by CELP.

High Efficiency Video Coding

High Efficiency Video Coding (HEVC), also known as H.265 and MPEG-H Part 2, is a video compression standard, designed as a successor to the widely used AVC (H.264 or MPEG-4 Part 10). In comparison to AVC, HEVC offers from 25% to 50% better data compression at the same level of video quality, or substantially improved video quality at the same bit rate. It supports resolutions up to 8192×4320, including 8K UHD, and unlike the primarily 8-bit AVC, HEVC's higher fidelity Main10 profile has been incorporated into nearly all supporting hardware. HEVC is competing with the AV1 coding format for standardization by the video standard working group NetVC of the Internet Engineering Task Force (IETF).

ISO base media file format

ISO base media file format (ISO/IEC 14496-12 – MPEG-4 Part 12) defines a general structure for time-based multimedia files such as video and audio.

The identical text is published as ISO/IEC 15444-12 (JPEG 2000, Part 12). It is designed as a flexible, extensible format that facilitates interchange, management, editing and presentation of the media. The presentation may be local, or via a network or other stream delivery mechanism. The file format is designed to be independent of any particular network protocol while enabling support for them in general. It is used as the basis for other media file formats (e.g. container formats MP4 and 3GP).
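As a rough illustration of the "general structure" this format defines, the Python sketch below walks the top-level units of such a file. It relies on a detail not spelled out above: files in this format are organised as a sequence of "boxes", each starting with a 32-bit big-endian size and a four-character type. A size of 1 signals a 64-bit size field, and a size of 0 means the box runs to the end of the file. The function names iter_boxes and make_box are hypothetical, and the sample bytes are synthetic rather than taken from a real file.

```python
import struct
from io import BytesIO


def iter_boxes(stream):
    """Yield (box_type, payload_size, payload_offset) for each top-level box."""
    while True:
        start = stream.tell()
        header = stream.read(8)
        if len(header) < 8:
            return  # end of data (or trailing bytes too short for a header)
        size, box_type = struct.unpack(">I4s", header)
        header_len = 8
        if size == 1:
            # A 64-bit "largesize" field follows the type.
            large = stream.read(8)
            if len(large) < 8:
                raise ValueError("truncated largesize field")
            (size,) = struct.unpack(">Q", large)
            header_len = 16
        elif size == 0:
            # The box extends to the end of the stream.
            size = stream.seek(0, 2) - start
        if size < header_len:
            raise ValueError("box size smaller than its own header")
        yield box_type.decode("latin-1"), size - header_len, start + header_len
        stream.seek(start + size)  # jump to the next sibling box


def make_box(box_type, payload):
    """Build a synthetic box for demonstration purposes."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


# Two synthetic boxes: a file-type box followed by an empty "free" box.
sample = make_box(b"ftyp", b"isom\x00\x00\x02\x00isommp41") + make_box(b"free", b"")
for name, payload_size, offset in iter_boxes(BytesIO(sample)):
    print(name, payload_size, offset)
```

Because every box carries its own length, a reader can skip box types it does not understand, which is one way a format of this kind stays extensible.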

MPEG-4 Part 11

MPEG-4 Part 11, Scene description and application engine, was published as ISO/IEC 14496-11 in 2005. MPEG-4 Part 11 is also known as BIFS, XMT and MPEG-J. It defines:

the coded representation of the spatio-temporal positioning of audio-visual objects as well as their behaviour in response to interaction (scene description);

the coded representation of synthetic two-dimensional (2D) or three-dimensional (3D) objects that can be manifested audibly or visually;

the Extensible MPEG-4 Textual (XMT) format - a textual representation of the multimedia content described in MPEG-4 using the Extensible Markup Language (XML);

and a system level description of an application engine (format, delivery, lifecycle, and behaviour of downloadable Java byte code applications). (The MPEG-J Graphics Framework eXtensions (GFX) is defined in MPEG-4 Part 21 - ISO/IEC 14496-21.)

Binary Format for Scenes (BIFS) is a binary format for two- or three-dimensional audiovisual content. It is based on VRML and part 11 of the MPEG-4 standard.

BIFS is the MPEG-4 scene description protocol used to compose MPEG-4 objects, to describe interaction with them, and to animate them.

MPEG-4 Binary Format for Scenes (BIFS) is used in Digital Multimedia Broadcasting (DMB).

The XMT framework accommodates substantial portions of SMIL, W3C Scalable Vector Graphics (SVG) and X3D (the new name of VRML). Such a representation can be directly played back by a SMIL or VRML player, but can also be binarised to become a native MPEG-4 representation that can be played by an MPEG-4 player. Another bridge has been created with BiM (Binary MPEG format for XML).

MPEG-4 Part 14

MPEG-4 Part 14 or MP4 is a digital multimedia container format most commonly used to store video and audio, but it can also be used to store other data such as subtitles and still images. Like most modern container formats, it allows streaming over the Internet. The only official filename extension for MPEG-4 Part 14 files is .mp4. MPEG-4 Part 14 (formally ISO/IEC 14496-14:2003) is a standard specified as a part of MPEG-4.
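Since the filename extension alone says little about what a file contains, a reader can inspect the file-type information near the start of the file instead. The Python sketch below assumes the file begins with an "ftyp" box laid out as a 32-bit size, the type, a four-character major brand, a 32-bit minor version and a list of four-character compatible brands. That layout comes from the underlying ISO base media file format and is not described in the text above. The function name read_brands is hypothetical, the sample bytes are synthetic, and the sketch does not handle 64-bit box sizes or files whose first box is not "ftyp".

```python
import struct


def read_brands(data):
    """Return (major_brand, minor_version, compatible_brands), or None.

    `data` is the leading bytes of a file. None is returned when the data
    does not start with a complete, plausible "ftyp" box.
    """
    if len(data) < 16:
        return None
    size, box_type = struct.unpack(">I4s", data[:8])
    if box_type != b"ftyp" or size < 16 or size > len(data):
        return None
    major, minor = struct.unpack(">4sI", data[8:16])
    rest = data[16:size]
    usable = len(rest) - len(rest) % 4  # ignore any trailing partial brand
    compatible = [rest[i:i + 4].decode("latin-1") for i in range(0, usable, 4)]
    return major.decode("latin-1"), minor, compatible


# Synthetic header: major brand "mp42", compatible with "mp42" and "isom".
header = struct.pack(">I4s", 24, b"ftyp") + b"mp42" + struct.pack(">I", 0) + b"mp42isom"
print(read_brands(header))
```

To try it on a real file, read the first few hundred bytes in binary mode and pass them to read_brands; a result of None only means that this simplified check did not find a leading "ftyp" box.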

Portable media players are sometimes advertised as "MP4 Players", although some are simply MP3 Players that also play AMV video or some other video format, and do not necessarily play the MPEG-4 Part 14 format.

MPEG-4 Part 2

MPEG-4 Part 2, MPEG-4 Visual (formally ISO/IEC 14496-2) is a video compression format developed by MPEG. It belongs to the MPEG-4 ISO/IEC standards. It is a discrete cosine transform compression standard, similar to previous standards such as MPEG-1 Part 2 and H.262/MPEG-2 Part 2. Several popular codecs including DivX, Xvid and Nero Digital implement this standard.

Note that MPEG-4 Part 10 defines a different format from MPEG-4 Part 2 and should not be confused with it. MPEG-4 Part 10 is commonly referred to as H.264 or AVC, and was jointly developed by ITU-T and MPEG.

MPEG-4 Part 2 is H.263 compatible in the sense that a basic H.263 bitstream is correctly decoded by an MPEG-4 Video decoder. (MPEG-4 Video decoder is natively capable of decoding a basic form of H.263.) In MPEG-4 Visual, there are two types of video object layers: the video object layer that provides full MPEG-4 functionality, and a reduced functionality video object layer, the video object layer with short headers (which provides bitstream compatibility with base-line H.263). MPEG-4 Part 2 is partially based on ITU-T H.263. The first MPEG-4 Video Verification Model (simulation and test model) used ITU-T H.263 coding tools together with shape coding.

MPEG-4 Part 3

MPEG-4 Part 3 or MPEG-4 Audio (formally ISO/IEC 14496-3) is the third part of the ISO/IEC MPEG-4 international standard developed by the Moving Picture Experts Group. It specifies audio coding methods. The first version of ISO/IEC 14496-3 was published in 1999.

MPEG-4 Part 3 consists of a variety of audio coding technologies, ranging from lossy speech coding (HVXC, CELP), general audio coding (AAC, TwinVQ, BSAC) and lossless audio compression (MPEG-4 SLS, Audio Lossless Coding, MPEG-4 DST) to a Text-To-Speech Interface (TTSI), Structured Audio (using SAOL, SASL, MIDI) and many additional audio synthesis and coding techniques.

MPEG-4 Audio does not target a single application such as real-time telephony or high-quality audio compression. It applies to every application which requires the use of advanced sound compression, synthesis, manipulation, or playback.

MPEG-4 Audio is a new type of audio standard that integrates numerous different types of audio coding: natural sound and synthetic sound, low bitrate delivery and high-quality delivery, speech and music, complex soundtracks and simple ones, traditional content and interactive content.


MPEG-4 SLS

MPEG-4 SLS, or MPEG-4 Scalable to Lossless as per ISO/IEC 14496-3:2005/Amd 3:2006 (Scalable Lossless Coding), is an extension to the MPEG-4 Part 3 (MPEG-4 Audio) standard to allow lossless audio compression scalable to lossy MPEG-4 General Audio coding methods (e.g., variations of AAC). It was developed jointly by the Institute for Infocomm Research (I2R) and Fraunhofer, which commercializes its implementation of a limited subset of the standard under the name of HD-AAC. Standardization of the HD-AAC profile for MPEG-4 Audio is under development (as of September 2009).

MPEG-4 SLS allows having both a lossy layer and a lossless correction layer, similar to Wavpack Hybrid, OptimFROG DualStream and DTS-HD Master Audio, providing backwards compatibility with MPEG AAC-compliant bitstreams. MPEG-4 SLS can also work without a lossy layer (a.k.a. "SLS Non-Core"), in which case it will not be backwards compatible. Lossy compression of files is necessary for files that need to be streamed over the Internet or played on devices with limited storage.

With DRM, ripping of the lossless data or playback on non DRM-enabled devices could be disabled.

MPEG-4 SLS is not related in any way to MPEG-4 ALS (Audio Lossless Coding).

Neuros Technology

Neuros Technology was a Chicago, Illinois–based company that produced a number of audio and video devices under the brand name Neuros. Founded by Joe Born in 2001 as a division of Digital Innovations, it previously operated under the name Neuros Audio. Like Digital Innovations, Neuros distinguished itself by its use of open-innovation and crowdsourcing techniques to bring products to market, as well as by its prominent use of open-source software and open-source hardware. In its development model, end users were involved throughout the product development process from reviewing initial concepts to beta testing initial product releases.

QuickTime File Format

QuickTime File Format (QTFF) is a computer file format used natively by the QuickTime framework.

Scalable Video Coding

Scalable Video Coding (SVC) is the name for the Annex G extension of the H.264/MPEG-4 AVC video compression standard. SVC standardizes the encoding of a high-quality video bitstream that also contains one or more subset bitstreams. A subset video bitstream is derived by dropping packets from the larger video bitstream to reduce the bandwidth required for the subset bitstream. The subset bitstream can represent a lower spatial resolution (smaller screen), lower temporal resolution (lower frame rate), or lower quality video signal. H.264/MPEG-4 AVC was developed jointly by ITU-T and ISO/IEC JTC 1. These two groups created the Joint Video Team (JVT) to develop the H.264/MPEG-4 AVC standard.
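Deriving a subset bitstream by dropping packets can be sketched in a few lines of Python. The Packet class and its three layer fields are hypothetical stand-ins for the layer identifiers carried in a real SVC bitstream, whose exact syntax is not reproduced here. The point is only that extraction is a filter over packets and needs no re-encoding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """Hypothetical packet model; field names are illustrative assumptions."""
    spatial_layer: int   # 0 = base (lowest) resolution
    temporal_layer: int  # 0 = lowest frame rate
    quality_layer: int   # 0 = base quality
    payload: bytes


def extract_substream(packets, max_spatial, max_temporal, max_quality):
    """Keep only packets at or below the requested layers, preserving order."""
    return [
        p for p in packets
        if p.spatial_layer <= max_spatial
        and p.temporal_layer <= max_temporal
        and p.quality_layer <= max_quality
    ]


# A toy stream: base layer packets plus temporal and spatial enhancements.
stream = [
    Packet(0, 0, 0, b"base frame 0"),
    Packet(1, 0, 0, b"higher resolution for frame 0"),
    Packet(0, 1, 0, b"extra frame for higher frame rate"),
    Packet(1, 1, 0, b"higher resolution for extra frame"),
    Packet(0, 0, 1, b"quality refinement for frame 0"),
]

# Lowest resolution, lowest frame rate, base quality: only the base packet.
base_only = extract_substream(stream, 0, 0, 0)

# Full frame rate at the low resolution, still at base quality.
low_res_full_rate = extract_substream(stream, 0, 1, 0)

print(len(stream), len(base_only), len(low_res_full_rate))
```

Each of the three axes in the sketch corresponds to one of the reductions named above: spatial resolution, temporal resolution, and quality.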


Xvid

Xvid (formerly "XviD") is a video codec library following the MPEG-4 video coding standard, specifically MPEG-4 Part 2 Advanced Simple Profile (ASP). It uses ASP features such as B-frames, global and quarter pixel motion compensation, lumi masking, trellis quantization, and H.263, MPEG and custom quantization matrices.

Xvid is a primary competitor of the DivX Pro Codec. In contrast with the DivX codec, which is proprietary software developed by DivX, Inc., Xvid is free software distributed under the terms of the GNU General Public License. This also means that unlike the DivX codec, which is only available for a limited number of platforms, Xvid can be used on all platforms and operating systems for which the source code can be compiled.

