Amazon’s Alexa voice assistant has been laughing, but consumers don’t see the joke. With users posting their experiences on social media in large numbers, this unintended “add on” has not reflected well on Amazon.
But there continues to be a strong market for these devices. While Alexa is arguably the best-known Voice User Interface (VUI), many millions more are in use around the world in the form of mobile devices. Apple alone has sold over half a billion devices with Siri integrated in multiple languages.
With so many devices around the world potentially listening to our conversations, there is growing concern amongst consumers over privacy, with lots of public debate and controversy in the media. Microsoft, for example, faced issues in 2014 when TV advertising using the phrase “Xbox On” turned on users’ devices without their permission.
While there is novelty and some potential convenience in an “always on” or ambient setup in VUIs, where devices listen passively for keywords from users until they are activated, such a setup is not appropriate in certain situations. It can even be dangerous if a device is allowed to activate on its own, for example while a user is driving.
An alternative that can prove effective, especially when interacting with an output device such as the TV in the living room, is to integrate speech search within another device to enable the user to choose how they converse with technology. Users are always looking for the easiest way to interact with their TV, and voice should be a key part of this communication. But to deliver an optimized user experience, we believe it needs to be part of a holistic and multimodal input strategy.
ruwido has conducted research into this area and found that, while a standalone ambient VUI can bring a feeling of freedom, users indicated that button-based voice interaction is their preferred usage method. The majority stated that it gives them control over when they speak to the device, rather than having it listen to them.
While voice will continue to gain prominence in how consumers interact with their TVs, it is also important that users aren’t forced to talk if they don’t want to or aren’t able to do so. They need to feel empowered to interact with the TV using a modality that works best for them. Combining speech search with other interaction mechanisms, such as high-quality button-based input or more dynamic ways of interaction like ruwido’s patented organic haptic interaction mechanism for stepless scrolling through content, gives users the freedom to choose.
Voice is a hugely important mechanism that the industry can now offer to consumers. It is an especially useful tool for advanced search functions, particularly as the content landscape becomes increasingly fragmented across different platforms. However, it needs to be integrated properly. For TV navigation, offering only an ambient “always on” setup doesn’t seem to be an ideal approach in terms of privacy and usability. By integrating button-based speech search that enables the activation of the microphone on a multimodal input device, manufacturers of TVs, set-top boxes and streaming devices give consumers a choice in how they interact with their system while always remaining in control.