How service providers can offer more immersive audio from standard stereo speakers

Netflix recently tapped Sennheiser to create a more immersive audio experience for viewers with standard stereo speakers by rendering the audio through its AMBEO 2-channel Spatial Audio technology. Renato Pellegrini, Project Leader at AMBEO, argues that a significant majority of streaming service users use standard stereo equipment for their audio output, and that 2-channel Spatial Audio can make audio sound better “wherever and whatever you’re listening to”, regardless of hardware. He makes the case that the technology avoids the flaws of Binaural and Transaural technologies, which also aim to make stereo speakers sound more immersive.

Netflix recently chose Sennheiser to create a more immersive audio experience for viewers who use standard stereo equipment – who make up a significant majority of its users, according to the streaming service. By leveraging Sennheiser’s AMBEO 2-channel Spatial Audio technology, Netflix sought to improve the audio experience for users across standard TV sets, stereo systems, headphones, tablets, and laptops.

Renato Pellegrini, Project Leader at AMBEO, says the power of the technology is its applicability regardless of hardware.

“You could tune almost any TV set to sound great – well almost any TV set – but making sure it works on the vast majority of different devices without needing to know anything about them is the beauty of the technology. You don’t need to know if someone is using headphones or speakers – you can’t know all of that.”

He explains that Netflix approached Sennheiser looking to improve audio across standard stereo speakers without jeopardising the tonal balance of the original mix, or its output on headphones. Sennheiser then created a technology that takes Dolby Atmos mixes as an input, in the form of an ADM file or IAB – the typical object-based immersive sound formats available in whatever production tool audio producers are using (Pro Tools, Logic, etc.).

He remarks, “It felt like so much work is put into a good Dolby Atmos mix – with all that information where sound should be – so can’t we find a way to create something that’s more immersive even in a stereo speaker use-case?”

Pellegrini argues that a significant advantage of the solution is that it avoids the flaws of Transaural and Binaural technologies, which also aim to make stereo speakers sound more immersive. Transaural technology uses crosstalk cancellation: each speaker also emits an anti-signal that cancels the audio from the left speaker before it reaches a user’s right ear, and conversely cancels the audio from the right speaker before it reaches a user’s left ear.

While this creates a more immersive, spatial sound for users, Pellegrini believes a defect of the technology is that users experience a “phasi-ness” at high frequencies and when they move. Binaural solutions can also introduce colouration into the audio, causing it to sound different to the original sound, something “no producer would ever accept.”
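The crosstalk cancellation described above, and its sensitivity to listener movement, can be illustrated with a minimal sketch. It is not Sennheiser's or any vendor's implementation. It assumes a symmetric, echo-free setup at a single frequency, in which the far-ear path is simply a quieter, slightly delayed copy of the near-ear path. The function names, the gain of 0.7 and the 200 microsecond delay are all hypothetical values chosen for illustration.

```python
import cmath


def crosstalk(freq_hz, gain, delay_s):
    """Far-ear path relative to the near-ear path: quieter and slightly later."""
    return gain * cmath.exp(-2j * cmath.pi * freq_hz * delay_s)


def apply(matrix, vec):
    """Multiply a 2x2 matrix by a 2-element vector."""
    return [matrix[0][0] * vec[0] + matrix[0][1] * vec[1],
            matrix[1][0] * vec[0] + matrix[1][1] * vec[1]]


def speaker_feeds(intended, freq_hz, gain, delay_s):
    """Pre-filter the intended ear signals with the inverse of [[1, x], [x, 1]]."""
    x = crosstalk(freq_hz, gain, delay_s)
    det = 1 - x * x
    inverse = [[1 / det, -x / det], [-x / det, 1 / det]]
    return apply(inverse, intended)


def at_ears(feeds, freq_hz, gain, delay_s):
    """Signals reaching [left ear, right ear] for the given speaker feeds."""
    x = crosstalk(freq_hz, gain, delay_s)
    return apply([[1, x], [x, 1]], feeds)


def leakage(freq_hz, design_delay_s, actual_delay_s, gain=0.7):
    """Right-ear level of a left-only signal when the real path delay
    differs from the delay the cancellation filter was designed for."""
    feeds = speaker_feeds([1 + 0j, 0j], freq_hz, gain, design_delay_s)
    return abs(at_ears(feeds, freq_hz, gain, actual_delay_s)[1])


if __name__ == "__main__":
    # Listener exactly where the filter assumes: the crosstalk cancels.
    print(leakage(500, 200e-6, 200e-6))
    # Listener shifted slightly (assumed 20 microsecond change in path delay):
    # compare the leakage at a low and a high frequency.
    print(leakage(500, 200e-6, 220e-6))
    print(leakage(8000, 200e-6, 220e-6))
```

In this toy model the same small change in path delay leaves far more residual crosstalk at 8 kHz than at 500 Hz, which is consistent with the high-frequency and movement-related artefacts Pellegrini describes.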

Another advantage of the AMBEO technology, Pellegrini argues, is that it does not require an additional mix beyond the original Dolby Atmos file it takes as an input. He describes how the technology allows producers to not just turn Spatial Audio on and off, but also define the amount of the effect “from full AMBEO processing…to stereo dynamics with no difference from the original.” When Sennheiser provided producers on several Netflix titles with the ability to control the amount of AMBEO processing, producers converged on the same settings.

Pellegrini elaborates: “The interesting outcome was that – without telling them, without enforcing anything – they came up with the same settings within 1% of accuracy, so really accurate. That’s what they like, and that’s across the board, from documentaries to feature movies, series, Hollywood productions, etc.

“It’s clear the settings are different for dialogue than for effects and music, but producers always chose the same settings. For dialogue it’s always the same settings and for effects, the same settings. By calibrating the technology to those settings, you can just let the tool do its thing.”
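The article does not say how the amount control works internally. One common way to implement such a control is a linear blend between the unprocessed and the processed signal, applied per stem so that dialogue can use a different amount from effects and music. The sketch below assumes that approach. The names `blend` and `render`, the `spatialise` callback and the mono sample lists are hypothetical, and no actual producer settings are implied.

```python
def blend(dry, wet, amount):
    """Linear mix: amount 0.0 returns the original, 1.0 the fully processed signal."""
    if not 0.0 <= amount <= 1.0:
        raise ValueError("amount must be between 0.0 and 1.0")
    if len(dry) != len(wet):
        raise ValueError("dry and wet signals must have the same length")
    return [(1.0 - amount) * d + amount * w for d, w in zip(dry, wet)]


def render(stems, amounts, spatialise):
    """Blend each stem with its own amount, then sum the stems sample by sample.

    stems      -- dict mapping a stem name to a list of samples
    amounts    -- dict mapping the same stem names to a blend amount
    spatialise -- stand-in for the spatial processor (hypothetical callback)
    """
    blended = [blend(samples, spatialise(samples), amounts[name])
               for name, samples in stems.items()]
    return [sum(values) for values in zip(*blended)]
```

With every amount set to 0.0, `render` returns the plain sum of the stems, matching the "no difference from the original" end of the range described in the article.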

Since producers and directors have begun leveraging the tool, Pellegrini remarks that there have been very few instances where they have felt the audio has deviated from their expectations of the sound, and Netflix now mostly renders the audio in the cloud and sends it to producers for final approval.
