Three Ways to Make Alexa Interactions More Natural

Macadamian Technologies | August 10, 2016 | 4 Min Read

Since I brought an Amazon Echo home, timers and alarms have been its most common uses at my place, other than playing music.

Based on some recent experiences with Alexa, the voice service that powers the Echo, here are 3 ideas that could help make interactions feel more natural.

  1. Postponing alarms
  2. Voice formatting
  3. Pre-timer announcements

Recovering From a Bad Request

‘Alexa, postpone the alarm by 30 minutes’

One morning my interaction with the alarms didn’t go very smoothly.

Alexa, wake up Annie at 7am.

Annie looked at me funny.

‘Right, you are not working today, are you?’

Alexa, postpone alarm by 30 minutes

Alexa didn’t reply and just played the termination tone. I had to think of another voice command to recover.

Alexa, cancel alarm
Alexa, set alarm for 7:30pm

Annie looked at me funny again.

‘I said pm, didn’t I?’

Now I had to recover from another user error:

Alexa, cancel alarm
Alexa, wake me up at 7:30am

The first time I tried postponing an alarm, Alexa set another alarm instead, which was even harder to recover from. It shows that while natural language interfaces can be a good time saver, recovery can be a pain when a request goes bad, and the user experience collapses. Allowing users to postpone an alarm would make timer and alarm management easier.
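As a rough sketch of what "postpone" could mean, here is a small Python function that shifts the next pending alarm instead of creating a second one. The alarm store (a plain list of datetimes) and the function name are assumptions made for illustration; nothing here reflects how Alexa actually manages alarms.

```python
from datetime import datetime, timedelta


def postpone_alarm(alarms, minutes, now):
    """Shift the next pending alarm later by `minutes`, in place.

    `alarms` is a list of datetime objects (a hypothetical alarm store).
    Returns the new alarm time, or None if no alarm is pending.
    """
    pending = [a for a in alarms if a > now]
    if not pending:
        return None  # nothing to postpone; the assistant should say so aloud
    target = min(pending)
    new_time = target + timedelta(minutes=minutes)
    # Replace the existing alarm rather than adding a second one.
    alarms[alarms.index(target)] = new_time
    return new_time


now = datetime(2016, 8, 10, 6, 45)
alarms = [datetime(2016, 8, 10, 7, 0)]
print(postpone_alarm(alarms, 30, now))  # 2016-08-10 07:30:00
```

Returning None when there is nothing to postpone gives the assistant a chance to say so, rather than staying silent.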

Voice Formatting: When Less Exact Is More Natural

As I’ve mentioned before, voice-specific considerations are needed to control how Alexa says some things, for example by controlling the pauses when saying a phone number or the pronunciation of special words like brand names.
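One way to picture the pause control is with SSML, the speech markup standard, whose `say-as` and `break` elements cover this case. The Python helper below is only a sketch: the function name, the 3-3-4 grouping, the 300 ms pause, and the sample number are all assumptions, and whether a given voice service renders the markup exactly this way should be checked against its documentation.

```python
def phone_number_ssml(digits, groups=(3, 3, 4), pause_ms=300):
    """Build SSML that reads a phone number digit by digit, pausing between groups."""
    parts = []
    start = 0
    for size in groups:
        chunk = digits[start:start + size]
        if chunk:
            # Read each group as individual digits, not as one large number.
            parts.append(f'<say-as interpret-as="digits">{chunk}</say-as>')
        start += size
    pause = f'<break time="{pause_ms}ms"/>'
    return "<speak>" + pause.join(parts) + "</speak>"


# A made-up number, used only for illustration.
print(phone_number_ssml("6135550123"))
```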

Another case is saying numbers. For example, asking Alexa for 250/3 gets this answer: 250 divided by 3 is 83.3333333333.

Spoken as:

Two hundred and fifty divided by three is eighty-three dot three three three three three three three three three three

I doubt that anyone would say a number like that. Perhaps something like this would be better:

Eighty-three dot three repeating

Alternatively, saying only one or two decimals would have been plenty.
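The "one or two decimals" option is easy to sketch. The Python helper below (a hypothetical name, not part of any Alexa API) rounds a value before it is handed to text-to-speech and drops trailing zeros so that whole numbers are not read with a decimal part. It does not attempt to detect repeating decimals.

```python
def speakable_number(value, max_decimals=2):
    """Round `value` for speech and drop trailing zeros."""
    text = f"{value:.{max_decimals}f}"
    if "." in text:
        # Only strip zeros after a decimal point, so "80" stays "80".
        text = text.rstrip("0").rstrip(".")
    return text


print(speakable_number(250 / 3))     # 83.33
print(speakable_number(250 / 3, 1))  # 83.3
print(speakable_number(9 / 3))       # 3
```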

As another example, when asking Alexa for the time, you also get these overly precise answers:

The time is eleven eleven am.

Perhaps “ten past eleven” would have done just fine. I usually know whether it is am or pm, so that could be left out. While you do expect the exact time from the display of a clock, when interacting with a voice assistant, I think a more human way of expressing the answer is better than precision.
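To make the idea concrete, here is a minimal Python sketch that rounds a clock time to the nearest five minutes and phrases it the way a person might, leaving out am and pm. The function name and word tables are assumptions for illustration, and the phrasing rules cover only English "past" and "to" forms.

```python
HOURS = ["twelve", "one", "two", "three", "four", "five",
         "six", "seven", "eight", "nine", "ten", "eleven"]
MINUTE_WORDS = {5: "five", 10: "ten", 15: "quarter",
                20: "twenty", 25: "twenty-five", 30: "half"}


def casual_time(hour, minute):
    """Phrase a 24-hour clock time loosely, rounded to the nearest five minutes."""
    rounded = 5 * round(minute / 5)
    if rounded == 60:
        # e.g. 11:58 rounds up to the next hour
        hour, rounded = hour + 1, 0
    if rounded == 0:
        return f"{HOURS[hour % 12]} o'clock"
    if rounded <= 30:
        return f"{MINUTE_WORDS[rounded]} past {HOURS[hour % 12]}"
    return f"{MINUTE_WORDS[60 - rounded]} to {HOURS[(hour + 1) % 12]}"


print(casual_time(11, 11))  # ten past eleven
print(casual_time(23, 40))  # twenty to twelve
```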

Pre-Timer Announcements

‘Alexa, it’s bedtime in 15 minutes’

As a parent, I quickly found that getting a kid to do something is way easier with advance notice and reminders. Asking a kid to do something immediately always turns out to take longer and require more effort than giving my daughter a 15-minute notice, with a couple of reminders along the way.

One way the Alexa timer could be better is to perform those reminders. (Obviously, enabling that would require voice reminders/notifications, which is another enhancement I’ve mentioned before.) The interaction could be triggered by sentences such as “it’s bedtime in 15 minutes”. The same could apply to taking a shower, going to school, or doing homework, each with slightly different wording.

Then Alexa could select two times to remind users along the way, perhaps 5 and 2 minutes before. Again, precision is not important in this case. Saying “it’s bedtime in 5 minutes” while there are actually 7 minutes left does not really matter to the kid, nor does it affect the ease of getting her to bed. It’s just a matter of showing progress towards the deadline. With clear expectations, there is always less resistance.
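The schedule described above could be computed along these lines. This is a Python sketch under stated assumptions: the function name, the default reminder offsets of 5 and 2 minutes, and the message wording are all illustrative, and actually speaking the messages would depend on the voice notifications that do not exist yet.

```python
from datetime import datetime, timedelta


def reminder_schedule(now, minutes_until, activity, remind_before=(5, 2)):
    """Return (time, message) pairs: the reminders, then the final announcement."""
    deadline = now + timedelta(minutes=minutes_until)
    schedule = []
    for left in sorted(remind_before, reverse=True):
        # Skip reminders that would fall at or before the original request.
        if left < minutes_until:
            when = deadline - timedelta(minutes=left)
            schedule.append((when, f"It's {activity} in {left} minutes"))
    schedule.append((deadline, f"It's {activity} now"))
    return schedule


start = datetime(2016, 8, 10, 19, 45)
for when, message in reminder_schedule(start, 15, "bedtime"):
    print(when.strftime("%H:%M"), message)
```

With a 15-minute notice given at 19:45, this yields reminders at 19:55 and 19:58 and the final announcement at 20:00.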

To further complicate scenarios that are not currently supported, Alexa is inconsistent when it faces a request it can’t handle. Currently, it either:

  • Does not answer at all
  • Answers with “I was not able to answer the question I heard”
  • Does something unexpected, such as setting a second timer instead of postponing the first one

As is to be expected, the volume of these interactions is so large that it will take a while for Amazon to cover them all. It would be nice if the Alexa APIs allowed third parties to handle requests that Alexa does not know what to do with.
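The kind of hand-off wished for here might look like the following Python sketch. It is entirely hypothetical: the handler lists, the function names, and the fallback wording are invented for illustration and do not correspond to any existing Alexa API. The point is only that unrecognized requests pass through third-party handlers first and otherwise always get the same spoken fallback.

```python
def handle_utterance(text, builtin_handlers, third_party_handlers):
    """Try built-in handlers, then third-party ones, then one consistent fallback."""
    for handler in list(builtin_handlers) + list(third_party_handlers):
        reply = handler(text)
        if reply is not None:
            return reply
    # A single, predictable response instead of silence or a surprise action.
    return "Sorry, I don't know how to do that yet."


def postpone_handler(text):
    # Hypothetical third-party handler: only claims "postpone" requests.
    if text.lower().startswith("postpone"):
        return "OK, alarm postponed."
    return None


print(handle_utterance("postpone alarm by 30 minutes", [], [postpone_handler]))
print(handle_utterance("sing me a sea shanty", [], [postpone_handler]))
```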

As the language model of Alexa expands, the interactions will be more and more natural. Along with that, keeping in mind that less precision can, at times, be more useful will also help. With these and other capabilities, richer experiences will be possible, which will bring us closer to seamless ambient computing.
