AES47 is a standard which describes a method for transporting AES3 professional digital audio streams over Asynchronous Transfer Mode (ATM) networks.

The Audio Engineering Society (AES) published AES47 in 2002. The method described by AES47 is also published by the International Electrotechnical Commission as IEC 62365.[1]


Many professional audio systems are now combined with telecommunication and IT technologies to provide new functionality, flexibility and connectivity over both local and wide area networks. AES47 was developed to provide a standardised method of transporting AES3 digital audio over telecommunications networks that provide the quality of service required by many professional low-latency live audio applications. AES47 may be used directly between specialist audio devices or in combination with telecommunication and computer equipment with suitable network interfaces. In both cases, AES47 uses the same physical structured cabling used as standard by telecommunications networks.

Common network protocols like Ethernet use large packet sizes, which produce a larger minimum latency. Asynchronous Transfer Mode divides data into small fixed-size cells, each carrying a 48-byte payload, which provides lower latency.
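As a rough illustration of the latency argument, the sketch below computes how long an audio stream takes to fill one payload before it can be sent. The 48 kHz, two-channel, 4-bytes-per-sample stream and the 1500-byte Ethernet payload are assumptions chosen for the example, not figures taken from AES47, and Ethernet-based audio systems may choose to send shorter frames.

```python
def fill_time_us(payload_bytes, sample_rate_hz, channels, bytes_per_sample):
    """Microseconds of audio needed to fill one payload of the given size."""
    bytes_per_second = sample_rate_hz * channels * bytes_per_sample
    return payload_bytes * 1_000_000 / bytes_per_second


# Assumed example stream: 48 kHz, 2 channels, 4 bytes per sample.
atm_cell_us = fill_time_us(48, 48_000, 2, 4)          # one ATM cell payload
ethernet_frame_us = fill_time_us(1500, 48_000, 2, 4)  # one full-size Ethernet payload

print(f"ATM cell payload filled in {atm_cell_us} microseconds")
print(f"1500-byte Ethernet payload filled in {ethernet_frame_us} microseconds")
```

Under these assumptions the cell payload fills in a fraction of a millisecond, while a full-size Ethernet payload takes several milliseconds of audio to fill.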


The original work was carried out at the British Broadcasting Corporation’s R&D department and published as "White Paper 074",[2] which established that this approach provides the necessary performance for professional media production. AES47 was originally published in 2002 and was republished with minor revisions in February 2006. Amendment 1 to AES47 was published in February 2009, adding code points in the ATM Adaptation Layer Parameters Information Element to signal that the time to which each audio sample relates can be identified as specified in AES53.[3]

Unlike traditional ATM network design, the aim is not necessarily to use ATM to carry IP traffic (apart from management traffic) but to use AES47 in parallel with standard Ethernet structures to handle secure media streams that demand extremely high performance.

AES47 has been developed to allow the simultaneous transport and switched distribution of a large number of AES3 linear audio streams at different sample frequencies. AES47 can support any of the standard AES3 sample rates and word size. AES11 Annex D (the November 2005 printing or version of AES11-2003) shows an example method to provide isochronous timing relationships for distributed AES3 structures over asynchronous networks such as AES47 where reference signals may be locked to common timing sources such as GPS. AES53 specifies how timing markers within AES47 can be used to associate an absolute time stamp with individual audio samples as described in AES47 Amendment 1.

The Audio Engineering Society has published an additional standard, AES51-2006, which allows AES3 digital audio carried as AES47 streams to be transported over standard physical Ethernet hardware.

AES47 details

For minimum latency, AES47 uses "raw" ATM cells, ATM adaptation layer 0. Each ATM virtual circuit negotiates the parameters of a stream at connection time. In addition to the sample rate and number of channels (which may be more than the 2 supported by AES3), the negotiation covers the number of bits per sample and the presence of an optional data byte. The total must be 1, 2, 3, 4 or 6 bytes per sample, so that it evenly divides the 48-byte ATM cell payload.[4] AES3 uses 4 bytes per sample (24 bits of sample plus the optional data byte), but AES47 supports additional formats.
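The divisibility constraint can be checked with a few lines of arithmetic. This is a minimal sketch; the names ATM_PAYLOAD_BYTES, ALLOWED_BYTES_PER_SAMPLE and samples_per_cell are hypothetical and are not defined by AES47.

```python
ATM_PAYLOAD_BYTES = 48                      # payload carried by one ATM cell
ALLOWED_BYTES_PER_SAMPLE = (1, 2, 3, 4, 6)  # totals listed in the text above


def samples_per_cell(bytes_per_sample):
    """Return how many whole samples fit in one cell payload."""
    if bytes_per_sample not in ALLOWED_BYTES_PER_SAMPLE:
        raise ValueError(f"{bytes_per_sample} bytes per sample is not an allowed total")
    return ATM_PAYLOAD_BYTES // bytes_per_sample


for size in ALLOWED_BYTES_PER_SAMPLE:
    print(size, "bytes per sample ->", samples_per_cell(size), "samples per cell")
```

Every allowed total divides 48 without remainder, so no sample ever straddles two cells. A total of 5 bytes, for example, would leave 3 bytes unused in each cell and is rejected here.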

The optional data byte contains four "ancillary" bits corresponding to the AES3 VUCP bits. However, the P (parity) bit is replaced by a B bit which is set on the first sample of each audio block, and clear at all other times. This serves the same function as the B (or Z) synchronization preamble.

The other half of the data byte contains three "data protection" bits for error control and a sequencing bit. The concatenation of the sequencing bits from all samples in a cell (combined little-endian) forms a sequencing word of 8, 12, 16, or 24 bits. Only the first 12 bits are defined.

The first four bits of the sequencing word are a sequencing number, used to detect dropped cells. This increments by 1 for each cell transmitted.

The second four bits are for error detection, with bit 7 being an even parity bit for the first byte.

The third four bits, if present, are a second sequencing number which can be used to align multiple virtual circuits.
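The field layout described above can be sketched as follows. This is an illustration only: it assumes that "little-endian" means the sequencing bit from the first sample in the cell is bit 0 of the word, it leaves bits 4 to 6 uninterpreted because the text does not define them, and all function names are hypothetical.

```python
def assemble_word(bits):
    """Combine per-sample sequencing bits, first sample as bit 0 (assumed order)."""
    word = 0
    for position, bit in enumerate(bits):
        word |= (bit & 1) << position
    return word


def parse_sequencing_word(bits):
    """Split a sequencing word into the fields described in the text."""
    word = assemble_word(bits)
    first_byte = word & 0xFF
    return {
        # Bits 0-3: cell sequence number, incremented by 1 per cell.
        "sequence": word & 0xF,
        # Bit 7 makes the number of set bits in the first byte even.
        "parity_ok": bin(first_byte).count("1") % 2 == 0,
        # Bits 8-11: second sequence number, only in words of 12 bits or more.
        "second_sequence": (word >> 8) & 0xF if len(bits) >= 12 else None,
    }


def cells_lost(previous_sequence, sequence):
    """Number of cells missing between two received 4-bit sequence numbers."""
    return (sequence - previous_sequence - 1) % 16


fields = parse_sequencing_word([1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0])
print(fields)
print(cells_lost(14, 1))
```

Because the sequence number is only four bits wide, this check cannot distinguish a loss of 16 consecutive cells from no loss at all.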

References

  1. IEC 62365 preview
  2. Chambers, C.J. (September 2003). The development of ATM network technology for live production infrastructure (Technical report). BBC Research & Development. WHP 074.
  3. Audio Engineering Society standards web site
  4. Rumsey, Francis; Watkinson, John (September 2003). "5.4 AES47: Audio Over ATM". Digital Interface Handbook (3rd ed.). Focal Press. ISBN 0-240-51909-4.

The AES11 standard published by the Audio Engineering Society provides a systematic approach to the synchronization of digital audio signals. Recommendations are made concerning the accuracy of sample clocks as embodied in the interface signal and the use of this format as a convenient synchronization reference where signals must be rendered co-timed for digital processing. Synchronism is defined, and limits are given which take account of relevant timing uncertainties encountered in an audio studio.

AES11 recommends using an AES3 signal to distribute audio clocks within a facility. In this application, the connection is referred to as a Digital Audio Reference Signal (DARS).

AES11 Annex D (in the November 2005 or later printing or version) shows an example method to provide isochronous timing relationships for distributed AES3 structures over asynchronous networks such as AES47 where reference signals may be locked to common timing sources such as GPS.

In addition, the Audio Engineering Society has now published a related standard called AES53, that specifies how the timing markers already specified in AES47 may be used to associate an absolute time-stamp with individual audio samples. This may be closely associated with AES11 and used to provide a way of aligning streams from disparate sources, including synchronizing audio to video in networked structures.

The media profile defined in annex A of AES67 provides a means of using AES11 synchronization via the Precision Time Protocol.


AES3 (also known as AES/EBU) is a standard for the exchange of digital audio signals between professional audio devices. An AES3 signal can carry two channels of PCM audio over several transmission media including balanced lines, unbalanced lines, and optical fiber. AES3 was jointly developed by the Audio Engineering Society (AES) and the European Broadcasting Union (EBU). The standard was first published in 1985 and was revised in 1992 and 2003. AES3 has been incorporated into the International Electrotechnical Commission's standard IEC 60958, and is available in a consumer-grade variant known as S/PDIF.


AES51 is a standard first published by the Audio Engineering Society in June 2006 that specifies a method of carrying ATM (Asynchronous Transfer Mode) cells over Ethernet physical structure. It is intended in particular for use with AES47 to carry AES3 digital audio. Its purpose is to provide an open-standard, Ethernet-based approach to the networking of linear (uncompressed) digital audio with extremely high quality of service alongside standard Internet Protocol connections.

This standard specifies a method, also known as "ATM-E", of carrying asynchronous transfer mode (ATM) cells over hardware specified for IEEE 802.3 (Ethernet). It is intended as a companion standard to AES47 (Transmission of digital audio over ATM networks), to provide a standard method of carrying ATM cells and real-time clock over hardware specified for Ethernet.


AES53 is a standard first published in October 2006 by the Audio Engineering Society that specifies how the timing markers specified in AES47 may be used to associate an absolute time-stamp with individual audio samples. AES47 specifies a format for the transmission of digital audio over asynchronous transfer mode (ATM) networks. A recommendation is made to refer these timestamps to the SMPTE epoch which in turn provides a reference to UTC and GPS time. It thus provides a way of aligning streams from disparate sources, including synchronizing audio to video, and also allows the total delay across a network to be controlled when the transit time of individual cells is unknown. This is most effective in systems where the audio is aligned with an absolute time reference such as GPS, but can also be used with a local reference.

This standard may be studied by downloading a copy of the latest version from the AES standards web site as AES53-2006.

Audio Engineering Society

Established in 1948, the Audio Engineering Society (AES) draws its membership from engineers, scientists, and other individuals with an interest or involvement in the professional audio industry. The membership largely comprises engineers developing devices or products for audio, and persons working in audio content production. It also includes acousticians, audiologists, academics, and those in other disciplines related to audio. The AES is the only worldwide professional society devoted exclusively to audio technology.

The Society develops, reviews and publishes engineering standards for the audio and related media industries, and produces the AES Conventions, which are held twice a year alternating between Europe and the US. The AES and individual regional or national sections also hold AES Conferences on different topics during the year.

Audio over Ethernet

In audio and broadcast engineering, Audio over Ethernet (sometimes AoE—not to be confused with ATA over Ethernet) is the use of an Ethernet-based network to distribute real-time digital audio. AoE replaces bulky snake cables or audio-specific installed low-voltage wiring with standard network structured cabling in a facility. AoE provides a reliable backbone for any audio application, such as for large-scale sound reinforcement in stadiums, airports and convention centers, multiple studios or stages.

While AoE bears a resemblance to voice over IP (VoIP) and audio over IP (AoIP), AoE is intended for high-fidelity, low-latency professional audio. Because of the fidelity and latency constraints, AoE systems generally do not utilize audio data compression. AoE systems use a much higher bit rate (typically 1 Mbit/s per channel) and much lower latency (typically less than 10 milliseconds) than VoIP. AoE requires a high-performance network. Performance requirements may be met through use of a dedicated local area network (LAN) or virtual LAN (VLAN), overprovisioning or quality of service features.
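The per-channel figure of roughly 1 Mbit/s follows from uncompressed PCM arithmetic. The sketch below assumes 48 kHz sampling with 24-bit samples as a representative professional format; these values and the function name pcm_bit_rate are illustrative assumptions, and network framing overhead is ignored.

```python
def pcm_bit_rate(sample_rate_hz, bit_depth, channels=1):
    """Raw bit rate of uncompressed PCM audio in bits per second."""
    return sample_rate_hz * bit_depth * channels


one_channel = pcm_bit_rate(48_000, 24)                 # assumed professional format
sixty_four_channels = pcm_bit_rate(48_000, 24, channels=64)

print(f"{one_channel / 1e6} Mbit/s for one channel")
print(f"{sixty_four_channels / 1e6} Mbit/s for 64 channels")
```

At this rate a few dozen channels already occupy a large share of a 100 Mbit/s link, which is one reason such systems need a high-performance or dedicated network.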

Some AoE systems use proprietary protocols (at the higher OSI layers) which create Ethernet frames that are transmitted directly onto the Ethernet (layer 2) for efficiency and reduced overhead. The word clock may be provided by broadcast packets.

Digital audio

Digital audio is sound that has been recorded in, or converted into, digital form. In digital audio, the sound wave of the audio signal is encoded as numerical samples in continuous sequence. For example, in CD audio, samples are taken 44,100 times per second, each with a 16-bit sample depth. Digital audio is also the name for the entire technology of sound recording and reproduction using audio signals that have been encoded in digital form. Following significant advances in digital audio technology during the 1970s, it gradually replaced analog audio technology in many areas of audio engineering and telecommunications in the 1990s and 2000s.

In a digital audio system, an analog electrical signal representing the sound is converted with an analog-to-digital converter (ADC) into a digital signal, typically using pulse-code modulation. This digital signal can then be recorded, edited, modified, and copied using computers, audio playback machines, and other digital tools. When the sound engineer wishes to listen to the recording on headphones or loudspeakers (or when a consumer wishes to listen to a digital sound file), a digital-to-analog converter (DAC) performs the reverse process, converting a digital signal back into an analog signal, which is then sent through an audio power amplifier and ultimately to a loudspeaker.
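The sampling and quantization steps of pulse-code modulation can be sketched in a few lines. This is a simplified model, not a description of any particular converter: it samples an ideal full-scale sine wave and rounds each value to a signed integer, with the function name pcm_encode and its defaults chosen for illustration.

```python
import math


def pcm_encode(freq_hz, num_samples, sample_rate_hz=44_100, bit_depth=16):
    """Sample a full-scale sine wave and quantize it to signed integers."""
    max_amplitude = 2 ** (bit_depth - 1) - 1   # 32767 for 16-bit audio
    samples = []
    for n in range(num_samples):
        t = n / sample_rate_hz                 # time of the n-th sample in seconds
        value = math.sin(2 * math.pi * freq_hz * t)
        samples.append(round(max_amplitude * value))
    return samples


# A tone at a quarter of the sample rate, so each cycle spans four samples.
samples = pcm_encode(11_025, 8)
print(samples)
```

A digital-to-analog converter performs the reverse mapping, turning each integer back into a voltage and smoothing the result into a continuous signal.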

Digital audio systems may include compression, storage, processing, and transmission components. Conversion to a digital format allows convenient manipulation, storage, transmission, and retrieval of an audio signal. Unlike analog audio, in which making copies of a recording results in generation loss and degradation of signal quality, digital audio allows an infinite number of copies to be made without any degradation of signal quality.

Real-time multimedia over ATM

In the words of Pazos, Kotelba and Malis, "ATM is becoming increasingly ubiquitous in the core of service provider networks" for multimedia communication, whereas IP remains popular for multimedia communications within an organizational LAN. The QoS guarantees of ATM surpass the best-effort guarantee offered by Internet Protocol. Pazos et al. note that RMOA was developed by the ATM Forum as "an efficient and scalable means to transport native H.323 VoIP traffic over ATM that is not possible with the existing IP over ATM solutions." The RMOA working group has "defined a new type of gateway called H.323-H.323 Gateway", according to the trio.

Real-time, high-quality linear audio over ATM has been standardized by the AES as AES47, which has been used by contractors providing the wide-area broadcast contribution and distribution links between BBC production centers in the UK.


Voice over Asynchronous Transfer Mode (VoATM) is a data protocol used to transport packetized voice signals over an Asynchronous Transfer Mode (ATM) network. In ATM, the voice traffic is encapsulated using AAL1/AAL2 ATM packets. VoATM over DSL is a similar service, which is used to carry packetized voice signals over a DSL connection.

