If you’ve recently shopped around for home audio systems, you’ve likely skimmed over the term “3D audio,” or some variant of it. Although the term may seem a bit gimmicky, the end result is far from it. The benefit is a more immersive listening experience that mimics how we perceive the real world by interpreting sounds as movable objects rather than static channels.
While a properly arranged 7.1 surround sound setup remains impressive, needing eight physical speakers to produce realistic, enveloping sound seems akin to a Rube Goldberg machine when compared to something like the upcoming Sennheiser Ambeo Soundbar. Before going into why 3D sound is the way of the future, let’s dig into how it works.
How 3D sound currently works
The predominant way that we hear immersive audio now is by way of hardware, like your buddy’s home theater system. If they’re a die-hard enthusiast, they may have a complete 5.1 or 7.1 setup with a few speakers descending from the ceiling. This is called channel-based immersive audio because the sound is mixed into a fixed set of channels, each feeding a speaker in a specific position; the overhead speakers are what add the sense of height.
A portable example of hardware adoption for 3D audio is Creative’s Super X-Fi Amp. This combines the miniaturized amp and DAC hardware with the company’s Super X-Fi processing to account for variables like age, sex, and—on occasion—bone construction to emulate the most realistic sound possible.
While both forms of 3D sound work and are impressive to even the most discriminating listeners, they’ll soon look anachronistic compared to what’s to come with the adoption of Fraunhofer’s MPEG-H format.
MPEG-H and 3D audio are changing media consumption
One of the more underreported stories birthed from CES 2019 surrounds Fraunhofer IIS and the first consumer appearances of MPEG-H enabled 3D sound products. You may recognize that company’s name if you’re an MP3 history buff, or have followed the development of AAC.
Unlike channel-based immersive audio, MPEG-H enabled products truly engage the listener by accurately portraying audio to reflect a scene’s environmental changes. According to Robert Bliedt of Fraunhofer, “With MPEG-H, we have immersive sound, which means not only sound from around you but also sound from above you.”
Let’s take a football stadium scene. The crowd’s cheering will be audible all around you, across a full 360 degrees. As the camera pans around the field, absorbing the moment in 120fps cinematic style, your perception of the crowd’s cheers will rotate with it.
3D sound enabled by MPEG-H can track on-screen movement for a hyper-realistic and engaging audio experience.
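To make the stadium example concrete, here is a minimal Python sketch of the idea that sound directions can follow the camera. It is only an illustration, not how an MPEG-H decoder works internally: the function name `rotate_scene`, the stand names, and the angle convention (0 degrees straight ahead, positive to the right) are all assumptions made for this example.

```python
def rotate_scene(azimuths_deg, camera_pan_deg):
    """Return each sound's direction relative to the camera after it pans.

    azimuths_deg: dict mapping a (hypothetical) sound name to its azimuth
    in degrees, where 0 is straight ahead and positive is to the right.
    Results are wrapped into the range [-180, 180).
    """
    rotated = {}
    for name, azimuth in azimuths_deg.items():
        relative = azimuth - camera_pan_deg
        rotated[name] = (relative + 180.0) % 360.0 - 180.0
    return rotated


# Hypothetical crowd sections around a stadium.
stadium = {"north_stand": 0.0, "east_stand": 90.0, "south_stand": 180.0}

# The camera pans 90 degrees to the right, so the east stand is now dead ahead.
after_pan = rotate_scene(stadium, 90.0)
```

After the pan, the stand that was on your right sits in front of you, and the one that was in front has moved to your left, which is the rotation effect described above.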
Perhaps action movies are more your speed. If there’s a helicopter flying over a character as kerosene-filled barrels explode, you’ll perceive the helicopter’s rotors whooshing from above as you hear the off-axis explosion. Unlike today’s home setups, which rely on an array of speakers at varying heights and angles, this can all be experienced from a single soundbar, for a much smaller footprint with the same big sound.
MPEG-H in short
MPEG-H is able to do this because it’s an audio coding standard that identifies a sound as an “audio object.” This is essentially a channel that can be placed anywhere within a 3D space. A given object carries a lot of metadata with it, which allows the sound to be edited in any way the producer (or consumer) sees fit. Each object is afforded a full range of movement and can track on-screen action like an overhead helicopter, something channel-based immersive audio cannot emulate.
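For the curious, the audio object idea can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the MPEG-H renderer: the `AudioObject` class, its fields, and the two-speaker constant-power panning are all hypothetical stand-ins chosen to show how a sound plus position metadata can be turned into speaker signals at playback time.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class AudioObject:
    """Hypothetical audio object: a mono signal plus position metadata."""
    name: str
    samples: List[float]
    azimuth_deg: float  # -90 = hard left, 0 = center, +90 = hard right
    gain: float = 1.0


def pan_gains(azimuth_deg: float) -> Tuple[float, float]:
    """Constant-power stereo panning: left**2 + right**2 is always 1."""
    clamped = max(-90.0, min(90.0, azimuth_deg))
    angle = (clamped + 90.0) / 180.0 * (math.pi / 2.0)  # 0 .. pi/2
    return math.cos(angle), math.sin(angle)


def render_stereo(objects: List[AudioObject]) -> Tuple[List[float], List[float]]:
    """Mix every object into left/right signals based on its metadata."""
    length = max(len(obj.samples) for obj in objects)
    left = [0.0] * length
    right = [0.0] * length
    for obj in objects:
        gain_left, gain_right = pan_gains(obj.azimuth_deg)
        for i, sample in enumerate(obj.samples):
            left[i] += sample * obj.gain * gain_left
            right[i] += sample * obj.gain * gain_right
    return left, right


# Hypothetical scene: a helicopter to the right, an explosion to the left.
scene = [
    AudioObject("helicopter", [0.4, 0.4, 0.4], azimuth_deg=60.0),
    AudioObject("explosion", [0.0, 0.9, 0.3], azimuth_deg=-45.0),
]
left, right = render_stereo(scene)
```

The point of the sketch is that the position lives in metadata rather than being baked into a speaker channel, so changing `azimuth_deg` moves the sound without touching the recording itself. A real renderer would handle height and many more speakers.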
MPEG-H Audio allows for the compression of multiple streams of audio and is efficient enough for television broadcasts.
Not only can audio follow characters and objects on screen, MPEG-H also empowers the viewer to adjust a broadcast to their liking, because each audio object can be controlled independently of the rest. (The standard also supports Higher Order Ambisonics, which captures a whole sound field rather than individual objects.) Admittedly, this is slightly outside the realm of 3D audio; however, the experience is changed nonetheless. Listeners can adjust the volume of a certain object, say a sportscaster, or mute it completely to focus strictly on the game being played. This is possible because MPEG-H can transmit 25-plus audio objects in a compressed signal small and efficient enough for TV or wireless broadcast.
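Here is a rough Python sketch of what that kind of personalization amounts to. It assumes a broadcast arrives as separately named streams that the listener can scale before they are mixed. The function `mix_objects`, the object names, and the sample values are made up for illustration and are not part of any MPEG-H API.

```python
def mix_objects(objects, user_gains):
    """Sum named audio streams into one signal, applying listener gains.

    objects: dict mapping a (hypothetical) object name to a list of samples.
    user_gains: dict mapping an object name to a gain; missing names use 1.0.
    """
    length = max(len(samples) for samples in objects.values())
    mixed = [0.0] * length
    for name, samples in objects.items():
        gain = user_gains.get(name, 1.0)
        for i, sample in enumerate(samples):
            mixed[i] += gain * sample
    return mixed


# Hypothetical sports broadcast made of three objects.
broadcast = {
    "commentary": [0.5, 0.5, 0.5],
    "crowd": [0.2, 0.3, 0.2],
    "field": [0.1, 0.0, 0.1],
}

default_mix = mix_objects(broadcast, {})                      # as broadcast
no_commentary = mix_objects(broadcast, {"commentary": 0.0})   # sportscaster muted
```

Because the commentary is delivered as its own object instead of being baked into the mix, silencing it leaves the crowd and field sounds untouched.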
Sony’s implementation of 3D sound is also interesting, because it opens up a brand new frontier of music mixing. By taking each individual track of a master song recording and converting it to an audio object, the sound can come from anywhere around you, even creating the illusion of distance between the listener and each note. Humans have many instinctive responses to sound, and this kind of mix can heighten a song’s emotional impact by playing on them.
An exciting future for all
It’s hard to convince impassioned home theater geeks to forgo their extravagant, multi-speaker setups, but the inevitable adoption of MPEG-H Audio support in upcoming soundbars will likely render those complicated systems obsolete, or even quaint. One of the best parts about MPEG-H from an interior design perspective is that you needn’t equip a room with 10-plus speakers anymore: no more cluttered living room.
Ultimately, we recognize Sennheiser as a paragon of high-end audio, yet it’s Fraunhofer’s work that makes the 13-driver Ambeo Soundbar truly a generational leap in audio quality. We expect to see the application of 3D sound become increasingly prevalent in the next few years and hope that it yields more affordable, yet equally immersive products.