Macadamian Blog

Applying User Experience Principles to Alexa Skills

Ed Sarfeld



What Would the Bard Say?

Last month we posted an article detailing the challenges and lessons learned from developing a new Alexa skill that can send a text hands-free. This post is a companion piece to that article focussing on the design of the user experience.

An honest tale speeds best, being plainly told.

William Shakespeare – King Richard III (Act IV, Scene IV)

Crafting a voice-controlled interface for our text message skill posed some interesting design hurdles. Conventional interfaces have screens from which new users can derive guidance or hints. A voice-based interface has none, so the flow of the interactions must be properly structured and the commands must use accurate and familiar wording.

Current technology allows a user to speak and be understood without having to train the system to recognize their voice. This is a great advancement from the early days of speech recognition, as readers who remember or have used dedicated dictation software will know, but it can also lead to unfulfilled expectations. Users who don’t go through a training process tend not to learn that the system has specific rules of engagement.

Today, users are accustomed to an “OK Google” or “Siri” style of interaction that allows natural language rather than structured phrasing, and they expect intelligently curated results in return. This puts greater emphasis, during a project’s design phase, on ease of use and on phrasing that has the beauty and cadence of skillfully crafted prose. To have a truly successful skill, expect to commit some time to the wordsmithing that likely kept William Shakespeare up at night.

What’s in a name? That which we call a rose
By any other name would smell as sweet.

Romeo and Juliet (Act II, Scene II)

In the context of the play, the above quotation implies that what matters is what something is, and not necessarily what it is called. This may be true for Juliet, but not for Alexa.

The built-in skills for Alexa work exceptionally well and allow you to ask questions such as, “Alexa, what time is it?” and quickly be told the correct time. It’s a very straightforward question and response experience.

When using an additional skill built for Alexa, the user must invoke it by name and then articulate the requested action, i.e.:

Alexa, tell [skill/invocation name] [connecting word “to” or “that”] [action or text content].
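One way to test candidate names against this structure is to pull sample sentences apart in code. The following is a minimal sketch only: Alexa does this parsing itself, and the pattern, the `parse_invocation` function, and the set of known names are all hypothetical names introduced here for illustration.

```python
import re

# Hypothetical pattern for the structure above:
# "Alexa, tell <invocation name> [to|that] <action or text content>"
INVOCATION = re.compile(
    r"^alexa,?\s+tell\s+(?P<name>.+?),?\s+(?P<connector>to|that)\s+(?P<content>.+?)\.?$",
    re.IGNORECASE,
)


def parse_invocation(utterance, known_names):
    """Split an utterance into invocation name, connecting word, and content.

    known_names is a set of lowercase invocation names (an assumption of
    this sketch). Returns None when the utterance does not fit the structure.
    """
    match = INVOCATION.match(utterance.strip())
    if match is None:
        return None
    name = match.group("name").lower()
    if name not in known_names:
        return None
    return {
        "name": name,
        "connector": match.group("connector").lower(),
        "content": match.group("content"),
    }
```

Running a list of draft sentences through a helper like this makes it easy to spot names that collide with the connecting words or that read awkwardly once the rest of the sentence is attached.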

This structure places particular emphasis on finding a suitable skill name. Guided by the Amazon best practices, we worked through a long list of potential names following the recommendations that they be memorable, easily recognizable to Alexa, and somehow relate to the functionality of the skill.

Wisely and slow; they stumble that run fast.

Romeo and Juliet (Act II, Scene III)

We kept multiple user scenarios in mind while developing the skill. For instance, we wanted a child to be able to use Alexa to easily contact a parent, an adult to communicate with a spouse or partner, or an elderly person to provide an update to a child or caregiver. These requirements led us to work through a number of generic names such as “My Friend” or “the Messenger” to see if they would be easy to remember and have the right sound.

Alexa, tell My Friend, that I am home.
Alexa, tell the Messenger, that dinner is almost ready.

Our texting skill sends a text to a single phone number, so we didn’t concern ourselves with how a user selects from a list of recipients.

Brevity is the soul of wit.

Hamlet (Act II, Scene II)

As we progressed through this exercise, we came to the conclusion that it would be ideal to invoke the skill using a single word. We considered terms of endearment, e.g., “Honey,” but shied away because the sender and recipient may not have a personal relationship. Further deliberation led us to the insight that, for the invocation sentence to sound natural, the name should ideally also be a verb.

Ultimately, we chose “Dash” as the name. It is short, has an active tone, and flows well in a sentence. Saying “Alexa, tell Dash that I am going to the gym” would send the text recipient the message “I am going to the gym.”

We have seen better days.

Timon of Athens (Act IV, Scene II)

Our initial enthusiasm for our single-word name deflated quickly after the skill was submitted and subsequently rejected, in part because of the name “Dash.” We hadn’t realized that “Dash” conflicted with Amazon’s own Dash Buttons for ordering as well as the Dash Product Scanner. This may have been an artifact of living in Canada, where these tools are not yet available. After a little head scratching, we decided to proceed with another single-word skill name, “Scribe.” It is simple, describes the skill’s function, and fits well into an invocation string.

“Alexa, tell Scribe, that we will be rejected again.”

These words are razors to my wounded heart.

Titus Andronicus (Act I, Scene I)

The skill was resubmitted and rejected again, this time because the word “Scribe” was trademarked. We had searched the web for a product called Scribe and had not found one. We had not found an active registered domain either, so we had assumed that the name was available.

As we discussed alternate names, it became clear that we were indeed best served by a word such as “scribe.” To work around potential trademark issues, we chose to alter the spelling while keeping the pronunciation intact. This resulted in the skill name “Scryb.” The change in spelling avoided trademark issues and stayed within the Amazon review guidelines, under which the title and the example phrases could remain “Scryb” but the invocation name must be an English word to ensure accurate speech recognition. To ensure that Alexa would have 100% recognition confidence, we retained the invocation name “Scribe.”

So oft have I invoked thee for my Muse
And found such fair assistance in my verse

Sonnets (LXXVIII)

Huzzah, the name was accepted! Let’s try the skill:

Alexa, tell Scribe that all the little things matter too.

As we shaped the balance of the user experience, our goal was to ensure that the user would not struggle to remember the correct phrasing to accomplish a task. We supported this on the skill description page by providing sample phrases, tips on how best to use the skill, and notes on its limitations. Since the initial skill setup and changing the recipient’s phone number are both completed entirely by voice, without the aid of an app or web page, it’s important to provide clear feedback throughout the process and to ensure that, when there is a problem, the skill fails gracefully. This means providing clear feedback from Alexa on how to resolve the issue without falling into a frustrating exchange loop.
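To make “failing gracefully” concrete, here is a minimal sketch of one way to cap reprompts while a phone number is set by voice. It is not the skill’s actual code: the `handle_phone_number` function, the two-attempt limit, the ten-digit number format, and the wording of each response are all assumptions made for illustration.

```python
MAX_ATTEMPTS = 2  # assumed limit; after this many failed tries the skill stops asking


def handle_phone_number(spoken_text, attempt):
    """Decide what to say after the user speaks a phone number.

    spoken_text is the recognized text; attempt is the 1-based try count.
    Assumes a ten-digit number, which is an illustration only.
    """
    digits = "".join(ch for ch in spoken_text if ch.isdigit())

    if len(digits) == 10:
        # Success: confirm what was heard so the user can catch mistakes.
        return {
            "speech": f"Thanks. I saved the number ending in {digits[-4:]}.",
            "end_session": True,
            "saved": digits,
        }

    if attempt < MAX_ATTEMPTS:
        # Recoverable: say what went wrong and how to fix it, then listen again.
        return {
            "speech": (
                f"I heard {len(digits)} digits, but I need ten. "
                "Please say the number again, one digit at a time."
            ),
            "end_session": False,
            "saved": None,
        }

    # Out of attempts: exit with guidance instead of looping.
    return {
        "speech": (
            "I still could not get the number, so nothing was changed. "
            "You can try again later by asking to change the phone number."
        ),
        "end_session": True,
        "saved": None,
    }
```

The point of the cap is that every failure path either tells the user exactly what to say next or ends the exchange cleanly, so the conversation never circles back to the same unhelpful prompt.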

This is the short and the long of it.

The Merry Wives of Windsor (Act II, Scene II)

Creating this first Alexa skill was an excellent opportunity to apply user experience principles to a voice interface. Putting oneself in the role of the user and attending to even the minute details is critical to identifying and eliminating frustrations that could quickly lead to the skill being abandoned. This means working through every facet of the experience, from the voice command phrasing used during setup, to creating a message, and even the footer text of the recipient’s message.

Without the assistance of a graphical user interface, every word and interaction contributes to the success or failure of the Alexa skill.

