The song becomes annoying. Paige yells out, “OK Googoo, stop!” Despite the pronunciation, Google Home does as she commands. She exclaims something indecipherable and Google responds, “I am sorry, I do not understand.” Paige is cross but happy that the song has stopped, so she says “OK” and moves on. Paige is 18 months old. She, like many kids of her generation, is an early adopter of new technology such as voice speakers and digital assistants.
The adoption of new technology has traditionally been influenced by social status, education, peers, risk profile and money. I believe that these factors have little influence on whether a child adopts new technology (though they still influence the child’s family, and so dictate whether a digital assistant is available in the household in the first place). The fact that children are asking Alexa to read them a bedtime story highlights both their interest and the low barriers to adoption.
There is a range of digital assistants available on the market, including some you might not have heard of, such as Samsung’s Bixby. At this year’s SXSW (South by Southwest, a major interactive media conference held annually in Austin, Texas), there was a lot of talk about machine learning and its impact on the way computers can interact with us. One of the most immediate effects of machine learning on our daily lives has been the improvement of digital assistants and voice-enabled speakers. Voice-enabled speakers are the hardware that runs the AI-powered software.
When Siri came on to the market with iOS 5 in 2011, technology commentators went wild with proclamations of how Siri would solve all our problems. However, she didn’t, and for many the technology seemed both useless and frustrating. Apple was absolutely on the right path, but possibly too early to market. So why is the market suddenly responding now? Did Amazon and Google simply do better marketing than Apple? Well yes, but the answer is simpler. The software got better at understanding us. With accuracy came adoption.
The challenge with voice recognition is how much misunderstanding we will tolerate from the device. At SXSW, Christopher Ferrel, Digital Strategist at The Richards Group, presented research showing that the accuracy of devices’ understanding had increased to 95 percent. (This may be biased towards American accents, as it is safe to say that many of us cannot understand a Scotsman to a level of 95 percent accuracy.) Because the software understands more of what we say, we are more likely to use it: being misunderstood in 1 in 20 interactions or questions is a palatable level of frustration. Ferrel suggests that once we get to 99 percent comprehension, we will be using it all the time. A device that misunderstands us only 1 in 100 times creates little friction, and adoption by the late majority will rise sharply.
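The arithmetic behind those figures can be sketched in a few lines of Python. This is only an illustration: the function names and the figure of 40 requests per day are hypothetical and do not come from Ferrel's research.

```python
def one_in_n(accuracy_percent: float) -> int:
    """Return N such that roughly 1 in N requests is misunderstood."""
    error_percent = 100 - accuracy_percent
    if error_percent <= 0:
        raise ValueError("accuracy must be below 100 percent")
    return round(100 / error_percent)


def expected_misses(requests: int, accuracy_percent: float) -> float:
    """Expected number of misunderstood requests out of `requests`."""
    return requests * (100 - accuracy_percent) / 100


# Hypothetical household making 40 voice requests a day.
REQUESTS_PER_DAY = 40

for accuracy in (95, 99):
    print(
        f"{accuracy}% accuracy: about 1 in {one_in_n(accuracy)} requests "
        f"misunderstood, or {expected_misses(REQUESTS_PER_DAY, accuracy)} "
        f"a day at {REQUESTS_PER_DAY} requests"
    )
```

Under that assumed usage, moving from 95 to 99 percent accuracy takes a household from a couple of failed requests every day to one every few days.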
Reward vs frustration
So why are children among the early adopters if the device cannot understand them? I suggest that children tolerate a failure rate higher than 1 in 20, as they are used to being misunderstood and have patient parents who keep encouraging them to repeat themselves. Add to this the ability to turn off a light (most kids can’t reach the light switch without a chair) by asking Google, Siri or Alexa, and you have a child who is both empowered and engaged. The tolerance is high because the reward outweighs the annoyance it took to get the result.
Conversation with an 18-month-old yields at best 20 percent understanding (unless you are the parent, who has comprehension superpowers), and as such interaction can be stunted. However, our tolerance is high because our expectations of a free-flowing conversation are low. With digital assistants, the situation is reversed: we expect a free-flowing conversation, so our tolerance for error is low. Accuracy allows for conversation. When it gets to the point of feeling human, we will want it to act human, which will present its own challenges for the software and brands.
We have been having limited conversations with banks, phone, internet and insurance companies for several years, often resulting in more frustration and a longing to speak to a real person. Some companies have employed actors to be the automated voice of their brand, in part to soothe you and increase your willingness to stick at it.
The challenge for digital assistants is that their repertoire needs to be able to go beyond the banal interactions around paying a bill to literally any question an 8-year-old or an 80-year-old can come up with. It’s a big ask.
Speech recognition and digital assistants have improved substantially in recent years. This is due in part to machine learning, which allowed recognition rates to improve by up to 30 percent in 2012, making digital assistants more likely to be accepted as a natural human interface.
The remarkable thing about this video is that the speaker’s voice is converted to text and displayed on screen as he talks. The software then translates what he is saying into Chinese.
Speech recognition systems still make a lot of mistakes, but machine learning is allowing the software to understand the individuals in the home more accurately over time. Once speech recognition goes from 95 to 99 percent accuracy, we will be using it all the time.
So what does this mean for your brand?
Think of the current situation as analogous to that of the early smartphone. When smartphones took off, marketers pondered whether they should quickly build an app. Many did, but the relevance of many of those apps remains to be seen. Most users only use 6 to 10 apps per day (stats vary wildly between reports), and up to 30 in a month.
Should you build an Alexa Skill or an app for the Google Home? Whether your business has a smartphone app already will be the starting point. But whether it is currently used will be your guide. In most cases you do not need to develop a specific voice app, provided your website is readable by digital assistants and it deals with the specific questions that users may have. What time do you open? Do you provide [...] service? What is the cost of a particular product?
My recommendation is to have a page (semantic content) that deals with the questions that someone may ask a digital assistant, so that digital assistants can find the answer quickly. Ask your team and customers what are the top things that customers want to know about your business and then answer those questions, making the question and answer visible to digital assistants. In some cases such as product ordering (“OK Google, order a case of Brown Brothers Prosecco”) or donations (“Alexa, donate $70 to Fred Hollows”) or quoting (“Siri, get a quote for car insurance”), building a specific Skill or voice app will be helpful to your customers. But be wary of ending up with an expensive interface that is not used (promotion and demand will be the key).
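One way to make a question-and-answer page machine-readable is to embed structured data in it. The Python sketch below builds a schema.org `FAQPage` description as JSON-LD. Treat it as an illustration only: the article does not name a format, whether a particular digital assistant reads this markup is an assumption, and the business questions and answers are hypothetical.

```python
import json


def build_faq_jsonld(faqs):
    """Build a schema.org FAQPage structure from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in faqs
        ],
    }


def to_script_tag(data):
    """Wrap the JSON-LD in a script tag for pasting into the page's HTML."""
    # Escape "</" so answer text can never close the script tag early.
    body = json.dumps(data, indent=2).replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Hypothetical answers to the top questions customers ask.
faqs = [
    ("What time do you open?", "We open at 9am, Monday to Saturday."),
    ("Do you deliver?", "Yes, we deliver within the metropolitan area."),
]

print(to_script_tag(build_faq_jsonld(faqs)))
```

The same list of questions and answers should also appear as visible text on the page, so that people and machines read the same content.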
How will you use it in the office? Can you hear yourself ordering an Uber via your Alexa? What if your office manager could use it to order a car for you, with a card linked to the company account, reducing the need for reimbursements and ensuring you are on time for your meeting? What if you could simply ask Alexa to order more Post-it notes when you run out? This will become a natural way to solve office problems and you will love it.
The challenge we now face is to think like a screenless marketer. In recent years marketing has largely focused on digital, where the screen is king. So much so that ‘digital marketing’ has emerged as its own separate discipline. It is likely that this function will revert to simply ‘marketing’ as the word ‘digital’ becomes redundant. What we now call digital marketing will become 30 percent screenless, forcing a shift from mobile-first to voice-first. Voice search on your website will need to be part of your strategy for 2018.
As we shift from the visual web to the audio web, for many the car will be the main place we use digital assistants. Three years ago I predicted that all cars would at some point be WiFi enabled, with digital assistants. While Teslas are already connected to the internet, I now think I was wrong. Your mobile phone is more likely to be the main device with the connection, and the car will simply connect to it via inbuilt software (e.g. Android Auto or Apple CarPlay) rather than inbuilt hardware (audio and screen tools will improve, though they will still connect to your device to access the internet). I can see some people plugging a physical Alexa device into the car, or Amazon providing software built into the car, but I no longer think this will be mainstream, particularly as Amazon now allows Alexa to be used in the app on any mobile device.
As a marketer, can you see a situation where someone would interact with your brand using a digital assistant, while driving to work? Booking a flight, an anniversary dinner at a restaurant, or even booking their car in for a service? Would you interact with your brand via voice? If so, what response would you expect? The bigger question is what do your customers want to know or be able to do? Could a digital assistant complement your call centre, or assist with support out of hours?
In time it is likely that responses from digital assistants will be sponsored by companies. Imagine you said “OK Google, add toothpaste to my shopping list” and Google responded with “I have added Colgate toothpaste to your list”. You might not use Colgate and you may decide to purchase something else, but the name Colgate was just announced in your house and the chance of influencing you is high.
On the street, we will move from face-down to chin-up as we access the device using voice through our headphones, rather than interacting with the screen.
Do I believe that we will become completely screenless? No, but I do believe that we will start our web journey via voice rather than keyboard, with that journey likely to end up on our TV, phone, wearable or computer. Voice becomes the first filter of content or search and is likely to filter us to the screen once we have made our selection. This is already happening with requests of the device to play a movie on the TV.
If an 18-month-old (or an Italian grandmother) is utilising the web via a digital assistant, the reality is that digital assistants will soon be mainstream. This will impact the way people interact with your brand. You now need a voice strategy and you need it fast.