How to Design Voice Applications in a Multi-Channel and Multi-Modal World

Macadamian Technologies | December 15, 2017 | 6 Min Read

In a webinar hosted by Orbita Chief Product Officer Bill Cava and Macadamian User Experience Architect Ed Sarfeld, the speakers discussed the benefits that voice assistants like Amazon Echo and Google Home offer end users like you and me. The presentation focused on what sets voice assistants apart from other input modalities, and on some of the design challenges that come with them.

It’s evident that voice is becoming a prevalent choice for inquiry and action. This year, 35.6 million Americans will use a voice-activated assistant device at least once a month. Even before the webinar began, a quick poll revealed that 46% of attendees had completed, or were in the process of developing, a voice application.

Designing an effective voice application that people actually use is no easy feat, and design iteration can be lengthy. So why develop a voice application? What makes voice an appealing way to input information?

Benefits of Voice Applications

In the webinar, Bill explains that speaking comes naturally to people. Language is something we begin learning even in the womb, through the sound of a mother’s voice. Touchscreens, in contrast, have a steeper learning curve and might not come as naturally to those who haven’t been exposed to the technology early on.

Humans can also speak much faster than they can type, so where voice can be used, it’s a more efficient way to communicate. Bill notes that the average business professional types 40 words per minute, while the average person speaks 150 words per minute.

Finally, the last key benefit mentioned is that voice is often more accessible (e.g., when you’re in the shower, in the car, or if you have a visual limitation).

Without a doubt, voice technology has its benefits, but Bill and Ed also touch on some of the complications you may come across during the design process and what to consider while defining your solution.

Challenges and Considerations

One of the primary challenges when building voice applications is designing support that won’t be visible in an interface. Can the user distinguish what your skill or app is capable of doing? Are they able to identify features, correct errors and then prevent errors?

There’s also the challenge of creating a balance between the amount of information you present to the user and the user’s cognitive load capacity. Will the user easily be able to remember how to navigate and handle prompts? You want to be careful not to overload the user with information.

Understanding how people think, so that the conversation is as efficient and cooperative as possible, can also be tricky. Ed provides a good example: if a user tells a voice assistant that they’re running out of gas, you wouldn’t ask them how much gas they have left, because that’s irrelevant to the user’s need. Instead, you would infer that the user needs to stop somewhere close to get gas, and give them the location of the closest gas station.

Lastly, Bill mentions that your design should be able to effectively convey to the user what is happening in the system. For example, Alexa does this visually through different light ring states which signify that a response is being articulated or searched for.

One particularly interesting consideration raised during the presentation was who is the most appropriate person to design a conversational UX. Bill mentions a conversation he had with a Facebook employee, who explained that the best person to design a conversational UX is actually not a writer. A writer is good at writing text that is meant to be read. The best person for designing conversational experiences is an empathetic communicator who is skilled in human-to-human communication.

The last thing to consider is that the interface will always need refinement. Ed notes that “a voice interface is never really complete”. There are ways to analyze a conversational UX once it is in use that help flag where users may be struggling in the experience, and also identify which aspects of the experience they enjoy.

Best Practices When Designing Voice Interfaces

Ed describes the concept of “placeonas” – an enhanced persona that encompasses a user’s specific environment and situation when they engage with your app. For example, someone on a bike in loud traffic with their eyes and hands busy, someone in a library whose voice is restricted but their hands are free, or someone who has dirty hands while cooking in the kitchen. Ed explains that considering user placeonas can help us define some of the following best practices when designing for Voice-enabled technology.

  1. When to use a voice interface:
    In cases where the context allows for easy use, such as when a user’s hands are busy or when they don’t want to write or use a phone.
  2. Prepare and practice:
    Write sample dialogues based on expected use cases and test the flow by reading them aloud.
  3. Define a personality for the application:
    This will help you focus the language and feel of conversation by being less or more formal.
  4. Be brief and natural:
    Be informative, brief, and stay relevant to the topic. Don’t use jargon or talk down to the user.
  5. Limit choices and be concise:
    Provide the fewest number of options necessary to support flow. Provide instructions only when needed.
  6. Validate the user’s input:
    Be sure to communicate that the system has understood; use confirmations and acknowledgments to reassure users.
  7. Consider the user’s attention span:
    If the user is busy or active, they will want information or guidance to be brief or in chunks.
  8. Prompt for responses clearly:
    Use explicit confirmations for important requests (e.g., money transfers) and implicit confirmations for less impactful requests (e.g., repeating what a user has said).
  9. Support different paths:
    For example, being able to register a user’s request to set an alarm for 8 AM, and a request to set an alarm at 8, then prompt for AM or PM.
  10. Follow elements of a good conversation:
    Turn-taking, conversation flow, efficiency, and support for users’ different experience levels, with a variety of words and styles to convey the same message.
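
As a rough illustration of practices 8 and 9 above, here is a minimal Python sketch. It is not taken from the webinar and does not use any real voice platform SDK: the intent names, the `Reply` type, and the `confirm` and `handle_alarm` helpers are all hypothetical, and the time parsing is deliberately simplistic.

```python
import re
from dataclasses import dataclass

# Hypothetical set of intents treated as high impact (an assumption, not from the webinar).
HIGH_IMPACT_INTENTS = {"transfer_money"}


@dataclass
class Reply:
    speech: str         # what the assistant says
    needs_answer: bool  # True when the assistant must wait for the user


def confirm(intent: str, summary: str) -> Reply:
    """Explicit confirmation for high-impact requests, implicit otherwise."""
    if intent in HIGH_IMPACT_INTENTS:
        # Explicit: ask the user to approve before acting.
        return Reply(f"You want to {summary}. Is that right?", True)
    # Implicit: repeat what was understood and carry on.
    return Reply(f"OK, {summary}.", False)


# Matches "8", "8:30", "8 AM", "8:30 pm"; a simplistic pattern for illustration only.
TIME_PATTERN = re.compile(r"\b(\d{1,2})(?::(\d{2}))?\s*(am|pm)?\b", re.IGNORECASE)


def handle_alarm(utterance: str) -> Reply:
    """Support both 'alarm for 8 AM' and 'alarm at 8' (then prompt for AM or PM)."""
    match = TIME_PATTERN.search(utterance)
    if match is None:
        return Reply("What time should I set the alarm for?", True)
    hour = match.group(1)
    minute = match.group(2) or "00"
    period = match.group(3)
    if period is None:
        # Second path: the time is ambiguous, so ask a short follow-up question.
        return Reply(f"Is that {hour}:{minute} AM or PM?", True)
    return confirm("set_alarm", f"alarm set for {hour}:{minute} {period.upper()}")
```

With this sketch, "set an alarm for 8 AM" is acknowledged implicitly, "set an alarm at 8" triggers a single follow-up question, and a money transfer passed to `confirm` would require an explicit yes before anything happens.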

The Key: Consistency, Seamlessness, and Context Optimization

“Designing a voice interface really is as much an art as a skill,” says Ed.

It’s a complex process, and it takes time to get it right. The main takeaway, as Ed emphasizes in the webinar, is to design with consistency, seamlessness, and context optimization in mind. The user should have a similar experience from one channel to another, and should be able to start a task in one channel and complete it in another. Using your interface should feel effortless across them all.

Head to Orbita’s website to watch the recorded session of the webinar for more insight.
