According to Gartner, 30% of searches will be performed without a screen by 2020. In fact, predictions suggest that voice interactions will eventually overtake typing a search query. From Google, with its increasingly intelligent search engine, to Apple’s Siri with its third-party app integrations for everyday tasks, to sending a voice recording over WhatsApp, voice-based systems are making increasing noise in all aspects of our digital lives. Voice capabilities are coming to the fore and people are becoming more accustomed to them. TV and on-demand viewing is no different. Applications from high-profile companies have raised the profile of voice capabilities, and consumers are now beginning to expect the platforms they watch content on to support voice functionality, too.
A personal experience
In today’s crowded OTT market, content discovery has undoubtedly taken centre stage when it comes to enhancing the user experience and personalising offerings to individual subscribers. Voice is becoming an integral part of this equation. The obvious advantage of voice is that it removes the need to immediately understand the UI of a platform. The user doesn’t need to look for a search field or find a filter button before they can find something to watch. Instead, voice offers a convenient and easy way to locate what the user wants to view, rather than navigating manually through traditional programme guides or menus.
We are all natural conversationalists, and voice enables operators and platform providers to create a much more ‘human’ feel with their product offerings. But can voice help deliver more personalised content experiences? It’s a useful step towards improving the overall user experience; however, it’s arguably not particularly relevant for personalising access to content. After all, voice requires the user to have an indication of what they are looking to watch in the first place, even if it’s as simple as an actor’s name.
The underlying algorithms that power content discovery are not tied to any one input method either. Recommendations and curated content suggestions are driven by viewing history and engagement with content, rather than input specifically. With all this in mind, voice integration into any service is less about further personalising a content offering, and is much more about personalising the overall experience.
That said, the implementation of voice UI is not without its challenges. Voice command is still in its early years and its potential is yet to be fully realised. It’s currently only offered by a few service providers and is largely confined to private settings, such as the home. Part of the problem is that using voice as a means of search and discovery is very much data-driven.
Keyboard or remote search is relatively simple: the supporting technology is advanced enough to help a user find what they want to watch quickly. The very nature of voice search, however, opens up a far wider range of discrepancies. Different users may interact with the voice function in different ways, using dissimilar words or phrases while looking for the same film or TV series. This makes voice search much more complicated and open to misunderstanding, and it therefore requires a more complex set of metadata.
To provide a sophisticated response to potential user interactions, the data set will need to integrate things beyond simply recognising a film or actor’s name and be able to make links between content. More advanced voice functions will even take into account trends on social media. This is much more intricate so it is likely we will see operators start to deploy basic voice capabilities first, before taking the plunge with advanced functions.
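To make the metadata challenge concrete, the sketch below shows one simple way differently phrased voice queries can be resolved to the same canonical title: each title carries alias and linked-people metadata, and queries are normalised before matching. The catalogue, aliases, filler-word list and function names are all hypothetical illustrations, not any operator's actual implementation.

```python
# Illustrative sketch only: resolving differently phrased voice queries
# to the same canonical title via alias metadata. The titles, aliases
# and linked names below are hypothetical, not a real catalogue.

CATALOGUE = {
    "The Office (UK)": {
        "aliases": {"the office", "the office uk"},
        "people": {"ricky gervais", "stephen merchant"},
    },
    "Blade Runner 2049": {
        "aliases": {"blade runner", "the new blade runner"},
        "people": {"ryan gosling", "denis villeneuve"},
    },
}

def normalise(utterance):
    """Lower-case and strip filler words so varied phrasings compare equally."""
    words = utterance.lower().replace("?", "").split()
    fillers = {"play", "show", "me", "find", "watch", "the", "please"}
    return " ".join(w for w in words if w not in fillers)

def resolve(utterance):
    """Match the cleaned query against title aliases and linked people."""
    q = normalise(utterance)
    for title, meta in CATALOGUE.items():
        # Compare against filler-stripped forms of each alias/person too.
        if any(q == normalise(a) for a in meta["aliases"] | meta["people"]):
            return title
    return None
```

With this approach, “Play The Office UK please” and “the office uk” both land on the same entry, and a query for an actor linked in the metadata surfaces their film — a toy version of the content linking described above. A production system would of course need fuzzy matching, ranking and much richer metadata.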
Not only this, but having access to relevant user data remains a significant limitation. Operators may introduce single sign-on integrations in the future to help improve this. Encouraging a consumer to sign into their Google or Amazon account would certainly help to make this aspect of content discovery infinitely more relevant. Voice search and content discovery overall could be greatly enhanced thanks to how much data operators or platform owners would then have access to. However, not all consumers would cooperate, given the privacy concerns. The challenge is not only about encouraging users to provide this data, but also encouraging them to interact with the voice capability in the first place.
Conversational search also depends almost entirely on natural-language speech recognition, which is becoming the main driver behind implementing this technology. Advances in AI and machine learning have certainly improved the way systems recognise and decipher regional accents or foreign languages and deal with more complex searches, yet the technology is still some way off maturity.
Creating an operating environment where a cloud-based system can quickly and cost-effectively process this information on the fly, understand what a user’s speech actually means, and then act upon that data is still a big ask. Once that happens, and once it is paired with contextual awareness, be that geography, time of day, type of device or position within an app, this will further drive the overall user experience.
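The contextual signals mentioned above can be sketched as a simple re-ranking step: the same search results are reordered using time of day and device type. The scoring weights, field names and thresholds here are illustrative assumptions, not a real recommendation model.

```python
# Hypothetical sketch of contextual re-ranking: resolved results are
# reordered using signals such as time of day and device type.
# The weighting scheme and field names are illustrative assumptions.

from datetime import datetime

def context_score(item, now, device):
    score = item["base_relevance"]
    # Favour short-form content on mobile, long-form on the big screen.
    if device == "mobile" and item["duration_min"] <= 30:
        score += 0.2
    if device == "tv" and item["duration_min"] > 60:
        score += 0.2
    # Favour family content earlier in the day.
    if now.hour < 20 and item.get("family_friendly"):
        score += 0.1
    return score

def rerank(items, now, device):
    """Sort items by base relevance adjusted for the viewing context."""
    return sorted(items, key=lambda it: context_score(it, now, device),
                  reverse=True)

results = [
    {"title": "Epic Film", "base_relevance": 0.50,
     "duration_min": 120, "family_friendly": False},
    {"title": "Kids Short", "base_relevance": 0.45,
     "duration_min": 20, "family_friendly": True},
]
```

On a mobile device in the morning the shorter, family-friendly item would rise to the top, while the same query on a living-room TV in the evening would favour the long-form film — a toy version of the context pairing the paragraph describes.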
The future of voice
Voice has the potential to open up new avenues for service providers to deliver enhanced content discovery and navigation. But it is important to remember that, as with all technology, voice will only truly take off if its implementation solves a problem that users are facing. Operators and service providers should consider whether the value added is sufficiently high. Poor voice features can do more harm than good, leading to failed search results and frustrated users who become disengaged and stop using the function or service completely. Nevertheless, the popularity of voice is growing and voice-enabled applications are being introduced to the market by major players all the time. What is certain is that the industry will be talking about voice well into 2018.