Machine learning and parallel processing will dramatically accelerate VOD encoding

Bitmovin, an innovator in online video technology whose customers include fuboTV, RTL, Sling TV and Bouygues Telecom, has announced a concept it calls AI-powered encoding for use with file-based (i.e. VOD rather than live) content. This is said to accelerate video processing and also improve video quality significantly. The first encoding pass is accelerated thanks partly to machine learning, while the second pass splits the video into multiple pieces and parallel processes them in the cloud. The solution was demonstrated at NAB Show in Las Vegas this week.

In the first pass [of the content through the encoding system], a rapid high-level analysis uses machine learning to identify appropriate encoding settings as well as pre/post processing steps for each part of a video. The AI is trained using library content and will continue to learn over time.

The first pass runs through the entire video, but does so much faster than the first pass of a typical two-pass system, according to Bitmovin. This is partly because the machine learning model accelerates the process, and partly because the new AI-enabled solution performs a much higher-level analysis than is usual.

A spokesperson explains: “The higher-level analysis only defines the bitrate for each chunk [i.e. the typical short chunk of video used in ABR, which may be four seconds long]. This allows the bitrate information to be passed to each chunk so that a frame-by-frame analysis can be made at the container (chunk) level, and that bitrate can be redistributed throughout the chunk, based on the content of each frame.

“So most of the hard work is done in the container. This creates the potential for most of the analysis to be done in parallel.”
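To make the idea of redistributing a chunk's bitrate concrete, here is a minimal sketch. Bitmovin has not published its allocation method, so the proportional rule, the function name `redistribute_bitrate` and the per-frame "complexity" scores are all illustrative assumptions, not the company's algorithm.

```python
def redistribute_bitrate(chunk_bits, frame_complexities):
    """Split a chunk's bit budget across its frames.

    Assumption: bits are shared in proportion to a per-frame
    complexity score. This is an illustrative rule only.
    """
    frame_count = len(frame_complexities)
    if frame_count == 0:
        return []
    total_complexity = sum(frame_complexities)
    if total_complexity <= 0:
        # No usable complexity signal: fall back to an even split.
        return [chunk_bits / frame_count] * frame_count
    return [chunk_bits * c / total_complexity for c in frame_complexities]


# A hypothetical two-frame chunk where the second frame is three times as complex.
print(redistribute_bitrate(1000, [1, 3]))
```

The budget set in the first pass is preserved: the per-frame allocations always sum to the chunk total, which is what lets each chunk be handled independently of the others.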

The second pass of the content through the encode system then makes use of this potential for parallel processing. The video, split into its chunks (which could be four seconds or any length you want to configure), is analysed in more detail to fine-tune the encoding parameters, thus ensuring the best visual quality outcome. Because you are analysing each chunk in parallel, on separate cloud instances, you can process a video file much faster than if each chunk were analysed after the previous one.
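The fan-out described here can be sketched as follows. This is a simplified stand-in, not Bitmovin's system: a local thread pool plays the role of the separate cloud instances, and `split_into_chunks`, `analyse_chunk` and `second_pass` are hypothetical names. The per-chunk analysis is a placeholder that simply records the bitrate assigned in the first pass.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(duration_s, chunk_s=4.0):
    """Return (start, end) times in seconds for each chunk of the file."""
    chunks = []
    start = 0.0
    while start < duration_s:
        end = min(start + chunk_s, duration_s)
        chunks.append((start, end))
        start = end
    return chunks


def analyse_chunk(chunk, target_kbps):
    """Placeholder for the detailed per-chunk analysis (assumed, not real)."""
    start, end = chunk
    return {"start": start, "end": end, "kbps": target_kbps}


def second_pass(duration_s, chunk_bitrates, chunk_s=4.0, max_workers=8):
    """Analyse every chunk concurrently; results keep chunk order."""
    chunks = split_into_chunks(duration_s, chunk_s)
    if len(chunks) != len(chunk_bitrates):
        raise ValueError("need exactly one first-pass bitrate per chunk")
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(analyse_chunk, chunks, chunk_bitrates))


# A hypothetical 10-second clip: two full chunks and one 2-second remainder.
for result in second_pass(10, [1000, 2000, 1500]):
    print(result)
```

Because each chunk needs only its own frames and its first-pass bitrate, no worker has to wait for another, which is the property the article attributes to the chunk-level design.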

There are practical limits to how parallel the processing can become, based on how many cloud instances you want to use and the cost of using additional instances. But in simple theoretical terms, you could take a 60-minute file for an on-demand programme, split it into four-second chunks (900 of them) and spread them across 901 cloud instances (one controller and 900 workers).

In this scenario you could process an entire one-hour programme in four seconds if working in real-time. A spokesperson stresses that this would be very expensive, so not practical, but the example does illustrate how Bitmovin’s containerized solution works. (One minute of content, split into four-second chunks and parallel processed in this way across 15 worker instances, would be processed within four seconds, if processing in real-time.)
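The arithmetic in this example can be checked with a short sketch. The function `plan_parallel_encode` and its `speed_factor` parameter are hypothetical; the model assumes every chunk takes the same time, that one controller instance is always added, and that workers process chunks in equal-sized rounds.

```python
import math


def plan_parallel_encode(duration_s, chunk_s=4.0, workers=None, speed_factor=1.0):
    """Return (chunk_count, instance_count, wall_clock_seconds).

    speed_factor=1.0 means each worker processes at real-time speed.
    """
    chunks = math.ceil(duration_s / chunk_s)
    if workers is None:
        workers = chunks  # one worker per chunk, as in the article's example
    workers = min(workers, chunks)
    rounds = math.ceil(chunks / workers)  # batches each worker must handle
    wall_clock_s = rounds * chunk_s / speed_factor
    return chunks, workers + 1, wall_clock_s  # +1 for the controller


print(plan_parallel_encode(3600))               # one hour, fully parallel
print(plan_parallel_encode(60))                 # one minute, 15 workers
print(plan_parallel_encode(3600, workers=100))  # one hour on a smaller pool
```

Under these assumptions the one-hour file yields 900 chunks on 901 instances and a four-second wall-clock time, matching the figures quoted. Capping the pool at 100 workers raises the time to nine rounds of four seconds, which illustrates the trade-off between cost and speed.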

There is then a third pass for the video that is being encoded. This applies the analysis, using an optimized bitrate throughout the whole file.

Bitmovin says: “A standard encoding process involves performing an in-depth analysis of the entire video before encoding is started. Our AI-powered encoding technology works by continuously learning the parameters used in previous encodes, so that it can apply AI-optimised settings to every new video file.”

The company promises faster processing times and significantly higher quality with no increase in bandwidth, using this new solution. “Artificial intelligence is a step-change in encoding, allowing operators to significantly improve the visual quality of streams, eliminate buffering and improve consumer satisfaction,” declares Stefan Lederer, CEO and Co-Founder at Bitmovin. “Bandwidth should never hold back operators from delivering the best possible quality experiences.”

Bitmovin says it deployed the world’s first commercial adaptive streaming (MPEG-DASH/HLS) HTML5 Player and the company claims it was also first to achieve 100x real-time encoding speeds in the cloud. The company has just raised $30 million in Series B funding to scale its product R&D, field engineering and worldwide sales teams. Iflix became its latest customer this month, using Bitmovin encoding to deliver TV content in HD over low-bandwidth mobile networks across the Middle East, Africa and Asia Pacific.
