VP9

VP9 is an open and royalty-free video coding format developed by Google.

VP9 is a successor to VP8 and competes mainly with MPEG's High Efficiency Video Coding (HEVC/H.265). At first, VP9 was mainly used on Google's popular video platform YouTube. The emergence of the Alliance for Open Media, and its support for the ongoing development of the successor AV1, led to growing interest in the format.

In contrast to HEVC, VP9 support is common among web browsers (see HTML5 video § Browser support). The combination of VP9 video and Opus audio in the WebM container, as served by YouTube, is supported by roughly ¾ of the browser market (mobile included) as of early 2017. Only two significantly popular video-capable browsers lack VP9 support: the discontinued Internet Explorer (unlike its successor, Edge) and Safari (in both its desktop and mobile editions), which remains the last H.264 holdout among web browsers. Android has supported VP9 since version 4.4 KitKat, though hardware acceleration varies.

Parts of the format are covered by patents held by Google. The company grants free usage of its own related patents based on reciprocity, i.e. as long as the user does not engage in patent litigation.

Features

VP9 is tailored to video resolutions beyond high definition (UHD) and also supports lossless compression.

The VP9 format supports the following color spaces: Rec. 601, Rec. 709, Rec. 2020, SMPTE-170, SMPTE-240, and sRGB.

VP9 supports HDR video using Hybrid Log-Gamma (HLG) and Perceptual Quantizer (PQ).

Efficiency

An offline encoder comparison between libvpx, two HEVC encoders and x264 in May 2017 by Jan Ozer of Streaming Media Magazine, with encoding parameters supplied or reviewed by each encoder vendor (Google, MulticoreWare and MainConcept respectively), and using Netflix's VMAF objective metric, concluded that "VP9 and both HEVC codecs produce very similar performance" and "Particularly at lower bitrates, both HEVC codecs and VP9 deliver substantially better performance than H.264".

Netflix concluded after a large test in August 2016 that libvpx was 20% less efficient than x265, but by October the same year, also found that tweaking encoding parameters could "reduce or even reverse gap between VP9 and HEVC". At NAB 2017, Netflix shared that they had switched to the Eve encoder, which had working two-pass rate control and was 8% more efficient than libvpx.

An early comparison that took varying encoding speed into account showed x265 to narrowly beat libvpx at the very highest quality (slowest encoding) whereas libvpx was superior at any other encoding speed, by SSIM.

Figure: Comparison of encoding artefacts

In a subjective quality comparison conducted in 2014 featuring the reference encoders for HEVC (HM 15.0), MPEG-4 AVC/H.264 (JM 18.6), and VP9 (libvpx 1.2.0 with preliminary VP9 support), VP9, like H.264, required about two times the bitrate to reach video quality comparable to HEVC, while with synthetic imagery VP9 was close to HEVC. By contrast, another subjective comparison from 2014 concluded that at higher quality settings HEVC and VP9 were tied at a 40 to 45% bitrate advantage over H.264.

Performance

An encoding speed versus efficiency comparison of the reference implementation in libvpx, x264 and x265 was made by an FFmpeg developer in September 2015. By SSIM index, libvpx was mostly superior to x264 across the range of comparable encoding speeds, but the main benefit was at the slower end of x264@veryslow (reaching a sweet spot of 30–40% bitrate improvement within twice as slow as this), whereas x265 only became competitive with libvpx around 10 times as slow as x264@veryslow. It was concluded that libvpx and x265 were both capable of the claimed 50% bitrate improvement over H.264, but only at 10–20 times the encoding time of x264. Judged by the objective quality metric VQM in early 2015, the VP9 reference encoder delivered video quality on par with the best HEVC implementations.

A decoder comparison by the same developer showed 10% faster decoding for ffvp9 than ffh264 for same-quality video, or "identical" at same bitrate. It also showed that the implementation can make a difference, concluding that "ffvp9 beats libvpx consistently by 25–50%".

Another decoder comparison indicated 10–40 percent higher CPU load than H.264 (but does not say whether this was with ffvp9 or libvpx), and that on mobile, the Ittiam demo player was about 40 percent faster than the Chrome browser at playing VP9.

Profiles

There are several variants of the VP9 format, called coding profiles, which successively allow more features, from the basic profile 0 (the minimum for hardware implementations) up to profile 3:

  • profile 0: color depth 8 bit/sample, chroma subsampling 4:2:0
  • profile 1: color depth 8 bit/sample, chroma subsampling 4:2:0, 4:2:2, 4:4:4
  • profile 2: color depth 10–12 bit/sample, chroma subsampling 4:2:0
  • profile 3: color depth 10–12 bit/sample, chroma subsampling 4:2:0, 4:2:2, 4:4:4
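The profile list above can be read as a small lookup: high bit depth adds 2 to the profile number, and chroma subsampling other than 4:2:0 adds 1. The following Python sketch encodes that reading. The function name `minimum_vp9_profile` and its argument format are hypothetical, chosen for illustration, and the sketch relies only on the features listed above.

```python
def minimum_vp9_profile(bit_depth: int, chroma_subsampling: str) -> int:
    """Return the lowest profile number whose listed features cover the input.

    Illustrative only: the mapping follows the profile list in this article.
    """
    if chroma_subsampling not in ("4:2:0", "4:2:2", "4:4:4"):
        raise ValueError(f"unsupported chroma subsampling: {chroma_subsampling}")
    if bit_depth == 8:
        high_bit_depth = False
    elif 10 <= bit_depth <= 12:
        high_bit_depth = True
    else:
        raise ValueError(f"unsupported bit depth: {bit_depth}")

    # Profiles 2 and 3 carry 10-12 bit video; profiles 1 and 3 add
    # 4:2:2 and 4:4:4 on top of 4:2:0.
    profile = 2 if high_bit_depth else 0
    if chroma_subsampling != "4:2:0":
        profile += 1
    return profile
```

For example, 10-bit 4:2:0 video (typical for HDR delivery) maps to profile 2 under this reading.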

Levels

VP9 offers the following 14 levels:

Level | Luma samples/s | Luma picture size | Max bitrate (kbit/s) | Max CPB size for visual layer (kbit) | Min compression ratio | Max tiles | Min alt-ref distance | Max reference frames | Example resolution @ frame rate
1 | 829440 | 36864 | 200 | 400 | 2 | 1 | 4 | 8 | 256×144@15
1.1 | 2764800 | 73728 | 800 | 1000 | 2 | 1 | 4 | 8 | 384×192@30
2 | 4608000 | 122880 | 1800 | 1500 | 2 | 1 | 4 | 8 | 480×256@30
2.1 | 9216000 | 245760 | 3600 | 2800 | 2 | 2 | 4 | 8 | 640×384@30
3 | 20736000 | 552960 | 7200 | 6000 | 2 | 4 | 4 | 8 | 1080×512@30
3.1 | 36864000 | 983040 | 12000 | 10000 | 2 | 4 | 4 | 8 | 1280×768@30
4 | 83558400 | 2228224 | 18000 | 16000 | 4 | 4 | 4 | 8 | 2048×1088@30
4.1 | 160432128 | 2228224 | 30000 | 18000 | 4 | 4 | 5 | 6 | 2048×1088@60
5 | 311951360 | 8912896 | 60000 | 36000 | 6 | 8 | 6 | 4 | 4096×2176@30
5.1 | 588251136 | 8912896 | 120000 | 46000 | 8 | 8 | 10 | 4 | 4096×2176@60
5.2 | 1176502272 | 8912896 | 180000 | TBD | 8 | 8 | 10 | 4 | 4096×2176@120
6 | 1176502272 | 35651584 | 180000 | TBD | 8 | 16 | 10 | 4 | 8192×4352@30
6.1 | 2353004544 | 35651584 | 240000 | TBD | 8 | 16 | 10 | 4 | 8192×4352@60
6.2 | 4706009088 | 35651584 | 480000 | TBD | 8 | 16 | 10 | 4 | 8192×4352@120
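As an illustration of how the level limits might be applied, the following Python sketch picks the lowest level whose luma sample rate, luma picture size and maximum bitrate accommodate a given stream. The names `Level`, `LEVELS` and `lowest_level` are hypothetical. The sketch checks only three of the columns above and ignores the remaining constraints (CPB size, compression ratio, tiles, alt-ref distance, reference frames), so it is not a conformance check.

```python
from typing import NamedTuple, Optional


class Level(NamedTuple):
    name: str
    max_luma_sample_rate: int   # luma samples per second
    max_luma_picture_size: int  # luma samples per picture
    max_bitrate_kbps: int


# Values copied from the level table in this article.
LEVELS = [
    Level("1", 829440, 36864, 200),
    Level("1.1", 2764800, 73728, 800),
    Level("2", 4608000, 122880, 1800),
    Level("2.1", 9216000, 245760, 3600),
    Level("3", 20736000, 552960, 7200),
    Level("3.1", 36864000, 983040, 12000),
    Level("4", 83558400, 2228224, 18000),
    Level("4.1", 160432128, 2228224, 30000),
    Level("5", 311951360, 8912896, 60000),
    Level("5.1", 588251136, 8912896, 120000),
    Level("5.2", 1176502272, 8912896, 180000),
    Level("6", 1176502272, 35651584, 180000),
    Level("6.1", 2353004544, 35651584, 240000),
    Level("6.2", 4706009088, 35651584, 480000),
]


def lowest_level(width: int, height: int, fps: float,
                 bitrate_kbps: float) -> Optional[str]:
    """Return the first level that fits, or None if no level does."""
    picture_size = width * height
    sample_rate = picture_size * fps
    for level in LEVELS:
        if (picture_size <= level.max_luma_picture_size
                and sample_rate <= level.max_luma_sample_rate
                and bitrate_kbps <= level.max_bitrate_kbps):
            return level.name
    return None
```

Under these three checks, a 1920×1080 stream at 30 frames per second and 8000 kbit/s fits level 4, while the same stream at 60 frames per second needs level 4.1 because of the higher luma sample rate.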

Technology

Figure: Example partitioning and internal coding order of a coding unit.
Figure: Transform coefficients are scanned in a round pattern (increasing distance from the corner). This matches the expected order of importance of the coefficients better than the traditional zig-zag pattern, and so increases their compressibility by entropy coding. A skewed variant of the pattern is used when the horizontal or vertical edge is more important.
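The idea of a round scan can be sketched in Python by sorting coefficient positions by their distance from the top-left corner and comparing the result with a classic zig-zag order. This is an illustration of the principle only: the function names are hypothetical, the tie-breaking rule is an assumption, and the output is not the normative VP9 scan table.

```python
from typing import List, Tuple

Position = Tuple[int, int]  # (row, column) inside a square transform block


def radial_scan_order(size: int) -> List[Position]:
    """Order positions by squared distance from the top-left corner.

    Ties are broken by row index, which is an arbitrary choice made here.
    """
    positions = [(r, c) for r in range(size) for c in range(size)]
    return sorted(positions, key=lambda p: (p[0] ** 2 + p[1] ** 2, p[0]))


def zigzag_scan_order(size: int) -> List[Position]:
    """Classic zig-zag: walk the anti-diagonals, alternating direction."""
    order: List[Position] = []
    for s in range(2 * size - 1):
        diagonal = [(r, s - r) for r in range(size) if 0 <= s - r < size]
        order.extend(diagonal if s % 2 else reversed(diagonal))
    return order
```

For a 4×4 block the two orders agree on the first three positions and then diverge: the radial order visits (1, 1) before any position in the third row or column, whereas the zig-zag order visits (2, 0) first.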

VP9 is a traditional block-based transform coding format. The bitstream format is relatively simple compared to formats that offer similar bitrate efficiency like HEVC.

VP9 has many design improvements compared to VP8. Its biggest improvement is support for coding units of 64×64 pixels, which is especially useful with high-resolution video. The prediction of motion vectors was also improved. In addition to VP8's four modes (average/"DC", "true motion", horizontal, vertical), VP9 supports six oblique directions for linear extrapolation of pixels in intra-frame prediction.

New coding tools also include:

  • eighth-pixel precision for motion vectors,
  • three different switchable 8-tap subpixel interpolation filters,
  • improved selection of reference motion vectors,
  • improved coding of offsets of motion vectors to their reference,
  • improved entropy coding,
  • improved and adapted (to new block sizes) loop filtering,
  • the asymmetric discrete sine transform (ADST),
  • larger discrete cosine transforms (DCT, 16×16 and 32×32), and
  • improved segmentation of frames into areas with specific similarities (e.g. fore-/background)

To enable some parallel processing of frames, video frames can be split along coding unit boundaries into up to four rows of evenly spaced tiles, each 256 to 4096 pixels wide, with each tile column coded independently. Tiling is mandatory for video resolutions in excess of 4096 pixels in width. A tile header contains the tile size in bytes so decoders can skip ahead and decode each tile row in a separate thread. The image is then divided into coding units called superblocks of 64×64 pixels, which are adaptively subpartitioned in a quadtree coding structure. They can be subdivided horizontally, vertically, or both; square (sub)units can be subdivided recursively down to 4×4 pixel blocks. Subunits are coded in raster scan order: left to right, top to bottom.
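The recursive subdivision of a superblock can be sketched in Python as follows. The function `partition`, the `decide` callback and the choice labels ("none", "horizontal", "vertical", "split") are hypothetical names introduced for illustration. A real encoder would make the decision by rate-distortion search, and a decoder would read it from the bitstream.

```python
from typing import Callable, List, Tuple

Block = Tuple[int, int, int, int]  # (x, y, width, height) in pixels


def partition(x: int, y: int, size: int,
              decide: Callable[[int, int, int], str],
              min_size: int = 4) -> List[Block]:
    """Partition a square unit and return its blocks in raster order."""
    choice = decide(x, y, size) if size > min_size else "none"
    half = size // 2
    if choice == "none":
        return [(x, y, size, size)]
    if choice == "horizontal":  # one cut across: top half, then bottom half
        return [(x, y, size, half), (x, y + half, size, half)]
    if choice == "vertical":    # one cut down: left half, then right half
        return [(x, y, half, size), (x + half, y, half, size)]
    if choice == "split":       # both cuts: four squares, each may recurse
        blocks: List[Block] = []
        for dy in (0, half):        # top row first
            for dx in (0, half):    # left to right within a row
                blocks.extend(partition(x + dx, y + dy, half, decide, min_size))
        return blocks
    raise ValueError(f"unknown partition choice: {choice}")
```

Calling `partition(0, 0, 64, decide)` with a callback that always answers "split" yields 256 blocks of 4×4 pixels, the finest subdivision described above. Only the "split" case recurses, matching the statement that square subunits are the ones subdivided further.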

Starting from each key frame, decoders keep 8 frames buffered to be used as reference frames or to be shown later. Transmitted frames signal which buffer to overwrite and can optionally be decoded into one of the buffers without being shown. The encoder can send a minimal frame that just triggers one of the buffers to be displayed ("skip frame"). Each inter frame can reference up to three of the buffered frames for temporal prediction. Up to two of those reference frames can be used in each coding block to calculate a sample data prediction, using spatially displaced (motion compensation) content from a reference frame or an average of content from two reference frames ("compound prediction mode"). The (ideally small) remaining difference (delta encoding) from the computed prediction to the actual image content is transformed using a DCT or ADST (for edge blocks) and quantized.
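The buffer handling described above can be modelled with a small Python class. This is a toy model under stated assumptions: the class `ReferenceBufferPool` and its method names are hypothetical, frames are represented by plain strings, and no actual prediction or decoding takes place.

```python
from typing import Iterable, List, Optional


class ReferenceBufferPool:
    """Toy model of the eight reference buffers kept by a decoder."""

    def __init__(self, slots: int = 8) -> None:
        self.slots: List[Optional[str]] = [None] * slots
        self.shown: List[str] = []  # frames in display order

    def decode_frame(self, name: str, references: Iterable[int] = (),
                     overwrite: Iterable[int] = (), show: bool = True) -> None:
        references = list(references)
        if len(references) > 3:
            raise ValueError("an inter frame can reference at most three buffers")
        for index in references:
            if self.slots[index] is None:
                raise ValueError(f"buffer {index} is empty")
        for index in overwrite:  # the frame signals which buffers it replaces
            self.slots[index] = name
        if show:                 # hidden frames are stored but not displayed
            self.shown.append(name)

    def show_buffered(self, index: int) -> None:
        """Model of a minimal frame that only triggers display of a buffer."""
        frame = self.slots[index]
        if frame is None:
            raise ValueError(f"buffer {index} is empty")
        self.shown.append(frame)
```

A hidden frame is modelled by calling `decode_frame` with `show=False` and later displaying it with `show_buffered`, which corresponds to the "skip frame" mechanism described above.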

Something like a B-frame can be coded while preserving the original frame order in the bitstream using a structure named superframes. Hidden alternate reference frames can be packed together with an ordinary inter frame and a skip frame that triggers display of previous hidden altref content from its reference frame buffer right after the accompanying P-frame.

VP9 enables lossless encoding by transmitting at the lowest quantization level (q index 0) an additional 4×4-block encoded Walsh–Hadamard transformed (WHT) residue signal.

In container formats, VP9 streams are marked with the FourCC VP90 (or in the future possibly VP91, ...) or VP09. To be seekable, raw VP9 bitstreams have to be contained either in Google's Matroska-derived WebM format (.webm) or in the older, minimalistic Indeo video file (IVF) format, which is traditionally supported by libvpx.
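As an illustration of how the FourCC appears in practice, the following Python sketch reads the file header of an IVF file. The field layout used here (a 32-byte little-endian header starting with the signature "DKIF") is the commonly documented IVF layout and is an assumption not taken from this article. The names `IvfHeader`, `IVF_HEADER` and `parse_ivf_header` are hypothetical.

```python
import struct
from typing import NamedTuple


class IvfHeader(NamedTuple):
    fourcc: str
    width: int
    height: int
    timebase_denominator: int
    timebase_numerator: int
    frame_count: int


# Assumed layout: signature, version, header size, FourCC, width, height,
# timebase denominator, timebase numerator, frame count, 4 unused bytes.
IVF_HEADER = struct.Struct("<4sHH4sHHIII4x")


def parse_ivf_header(data: bytes) -> IvfHeader:
    """Parse the fixed-size IVF file header from the start of `data`."""
    if len(data) < IVF_HEADER.size:
        raise ValueError("data is shorter than an IVF header")
    (signature, _version, _header_size, fourcc, width, height,
     denominator, numerator, frame_count) = IVF_HEADER.unpack_from(data)
    if signature != b"DKIF":
        raise ValueError("not an IVF file")
    return IvfHeader(fourcc.decode("ascii"), width, height,
                     denominator, numerator, frame_count)
```

For a VP9 stream, the `fourcc` field of the returned header would be expected to read `VP90`.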

Adoption

Adobe Flash, which traditionally used VPx formats up to VP7, was never upgraded to VP8 or VP9, but instead to H.264. VP9 therefore often reached the corresponding web applications only with the gradual shift from Flash to HTML5 technology, which was still somewhat immature when VP9 was introduced. Trends towards UHD resolutions, higher color depth and wider gamuts are driving a shift towards new, specialized video formats. Given the clear development perspective and industry support demonstrated by the founding of the Alliance for Open Media, as well as the pricey and complex licensing situation of HEVC, it is expected that users of the hitherto leading MPEG formats will often switch to the royalty-free alternative formats of the VPx/AVx series instead of upgrading to HEVC.

Content providers

A main user of VP9 is Google's popular video platform YouTube, which offers VP9 video at all resolutions along with Opus audio in the WebM file format, through DASH streaming.

Another early adopter is Wikipedia (specifically Wikimedia Commons, which hosts multimedia files across Wikipedia's subpages and languages). Wikipedia endorses open and royalty-free multimedia formats. As of 2016, the three accepted video formats are VP9, VP8 and Theora.

As of December 2016, Netflix is in the process of encoding its catalog to VP9, alongside AVC High, for bitrates aimed at mobile users.

Google Play Movies & TV uses (at least in part) VP9 profile 2 with Widevine DRM.

Encoding services

Since 2016, a number of cloud encoding services (Amazon, Brightcove, castLabs, JW Player, Telestream, Wowza) have offered VP9 encoding.

Encoding.com has offered VP9 encoding since Q4 2016, which amounted to a yearly average of 11% popularity for VP9 among its customers that year.

Web middleware

JW Player supports VP9 in its widely used software-as-a-service HTML5 video player.

Browser support

VP9 is implemented in web browsers including Google Chrome, Mozilla Firefox and Microsoft Edge.

Internet Explorer and Apple Safari lack VP9 support entirely. In March 2016, an estimated 65 to 75% of web browsers in use on desktop and notebook systems were able to play VP9 videos in HTML5 webpages, based on data from StatCounter.

Media player software support

VP9 is supported in all major open-source media player software, including VLC, MPlayer/MPlayer2/MPV, Kodi, MythTV and FFplay.

Hardware device support

Android has had VP9 software decoding since version 4.4 "KitKat". For a list of consumer electronics with hardware support, including TVs, smartphones, set-top boxes and game consoles, see webmproject.org's list.

Hardware implementations

The following chips, architectures, CPUs, GPUs and SoCs provide hardware acceleration of VP9. Some of these are known to have fixed-function hardware, but this list also incorporates GPU- or DSP-based implementations, that is, software implementations on non-CPU hardware. The latter category also serves the purpose of offloading the CPU, but its power efficiency is not as good as that of fixed-function hardware (it is more comparable to well-optimized SIMD-aware software).

The Intel Kaby Lake and Apollo Lake CPU families and the Nvidia Maxwell GM206 and Pascal GPU families have full fixed-function VP9 hardware decoding for the highest decoding performance and power efficiency.

Company | Chip/Architecture | Notable uses | Encoding | Decoding
AMD | Polaris | RX 480/470/460 | No | Yes
AMD | Bristol Ridge | FX 9800P/A12-9700P | No | Yes
AMD | Stoney Ridge | A9-9410/A6-9210/E-9010 | No | Yes
ARM | Mali-V61 ("Egil") VPU | | Yes | Yes
AllWinner | A80 | | No | Yes
Amlogic | S9 family | | No | Yes
HiSilicon | HI3798C | | No | Yes
Imagination | PowerVR Series6 | Apple iPhone 6/6s | No | Yes
Intel | Bay Trail | | No | Yes
Intel | Merrifield | | No | Yes
Intel | Moorefield | | No | Yes
Intel | Skylake | Intel Core i7 6700 | Yes | Yes
Intel | Kaby Lake | Intel Core i7 7700 | Yes | Yes
MediaTek | MT6595 | | No | Yes
MediaTek | MT8135 | | No | Yes
MediaTek | Helio X20/X25 | | No | Yes
MediaTek | Helio X30 | | Yes | Yes
NVIDIA | Maxwell GM206 | GTX 950/960/750v2 | No | Yes
NVIDIA | Pascal | GTX 1080/1070/1060/1050 | No | Yes
NVIDIA | Tegra X1 | Nvidia Shield Android TV, Nintendo Switch | No | Yes
Qualcomm | SnapDragon 820/821 | OnePlus 3, Samsung Galaxy S7, LG G5, Google Pixel | No | Yes
Qualcomm | SnapDragon 835 | Samsung Galaxy S8 | Yes | Yes
Realtek | RTD1295 | | No | Yes
Samsung | Exynos 7 Octa 7420 | Samsung Galaxy S6, Samsung Galaxy Note 5 | No | Yes
Samsung | Exynos 8 Octa 8890 | Samsung Galaxy S7 | Yes | Yes
Samsung | Exynos 9 Octa 8895 | Samsung Galaxy S8 | Yes | Yes

This is not a complete list. Further SoCs, as well as hardware IP vendors can be found at webmproject.org.

Software implementations

The reference implementation from Google is found in the free software programming library libvpx. It has a single-pass and a two-pass encoding mode, but the single-pass mode is considered broken and does not offer effective control over the target bitrate.

Encoding

  • libvpx
  • Eve – a commercial encoder
  • Ittiam's encoder products (OTT, broadcast, consumer)

Decoding

  • libvpx
  • ffvp9 (FFmpeg)
  • Ittiam's consumer decoder

FFmpeg's VP9 decoder takes advantage of a corpus of SIMD optimizations shared with other codecs to make it fast. A comparison made by an FFmpeg developer indicated that it was faster than libvpx and, compared to FFmpeg's H.264 decoder, showed "identical" performance for same-bitrate video, or about 10% faster decoding for same-quality video.

History

VP9 is the last official iteration of the TrueMotion series of video formats that Google bought in 2010 for $134 million together with the company On2 Technologies that created it. The development of VP9 started in the second half of 2011 under the development names of Next Gen Open Video (NGOV) and VP-Next. The design goals for VP9 included reducing the bit rate by 50% compared to VP8 while maintaining the same video quality, and aiming for better compression efficiency than the MPEG High Efficiency Video Coding (HEVC) standard. In June 2013 the "profile 0" of VP9 was finalized, and two months later Google's Chrome browser was released with support for VP9 video playback. In October of that year a native VP9 decoder was added to FFmpeg, and to Libav six weeks later. Mozilla added VP9 support to Firefox in March 2014. In 2014 Google added two high bit depth profiles: profile 2 and profile 3.

In 2013 an updated version of the WebM format was published, featuring support for VP9 together with Opus audio.

In March 2013, the MPEG Licensing Administration dropped an announced assertion of disputed patent claims against VP8 and its successors after the United States Department of Justice started to investigate whether it was acting to unfairly stifle competition.

Throughout, Google has worked with hardware vendors to get VP9 support into silicon. In January 2014, Ittiam, in collaboration with ARM and Google, demonstrated its VP9 decoder for ARM Cortex devices. Using GPGPU techniques, the decoder was capable of 1080p at 30 fps on an Arndale Board. In early 2015, Nvidia announced VP9 support in its Tegra X1 SoC, and VeriSilicon announced VP9 Profile 2 support in its Hantro G2v2 decoder IP.

In April 2015 Google released a significant update to its libvpx library, with version 1.4.0 adding support for 10-bit and 12-bit bit depth, 4:2:2 and 4:4:4 chroma subsampling, and VP9 multithreaded decoding/encoding.

In December 2015, Netflix published a draft proposal for including VP9 video in an MP4 container with MPEG Common Encryption.

In January 2016, Ittiam demonstrated an OpenCL based VP9 encoder. The encoder is targeting ARM Mali mobile GPUs and was demonstrated on a Samsung Galaxy S6.

VP9 support was added to Microsoft's web browser Edge. It is present in development releases starting with EdgeHTML 14.14291 and is due to be officially released in summer 2016.

In March 2017, Ittiam announced the completion of a project to enhance the encoding speed of libvpx. The speed improvement was said to be 50–70%, and the code was said to be "publicly available as part of libvpx".

Successor: from VP10 to AV1

On September 12, 2014, Google announced that development on VP10 had begun and that, after the release of VP10, it planned to have an 18-month gap between releases of video formats. In August 2015, Google began to publish code for VP10.

However, Google decided to incorporate VP10 into AOMedia Video 1 (AV1). The AV1 codec will use elements of VP10 as well as the experimental formats Daala (Xiph/Mozilla) and Thor (Cisco). Accordingly, Google has stated that they will not deploy VP10 internally or officially release it, making VP9 the last of the VPx-based codecs to be released by Google.

Content from Wikipedia