Macadamian Blog

Alexa Skills 101: Before the First Line of Code

Christian Nadeau & Martin Larochelle

With the explosion of Alexa-enabled devices entering consumers' homes, developers are flocking to the platform. This getting-started guide focuses on Alexa Skills 101: the things you need to know before you write the first line of code.

[Image: Amazon Alexa family]

With the explosion of Alexa-enabled devices entering homes across the United States, the UK, and Germany, it’s no wonder developers are flocking to the platform to create apps, or “skills” as Amazon calls them, that give users the ability to interact with a product, service, or app by voice. This summer Amazon announced that the number of skills for the platform had reached 1,000, and as of November Alexa can tap into more than 5,000 custom skills.

[Figure: Number of Alexa skills, November 2016. Image from voicebot.ai]

If you’re reading this article, chances are you’ve caught wind of the Alexa phenomenon and are now looking at how to get started developing Alexa skills of your own. If that’s the case, you’re in the right place. While there are a lot of getting-started guides out there, this one focuses on Alexa Skills 101: the things you need to know before you write the first line of code.

The Devices

The list of devices that support voice interaction with Alexa is constantly growing. Many Amazon devices already support it, including the Echo, Tap, and Dot, as well as Fire tablets and Fire TV devices. Amazon is likely to keep growing this product line and to include Alexa in all of its future hardware, striving to make it available everywhere. Alexa isn’t limited to Amazon devices, though. The Alexa API allows third parties to create Alexa-enabled devices that harness much of its power and capabilities, like the Triby, a smart portable speaker, and the LifePod, a virtual caregiver for seniors.

The APIs

To let developers build on top of Alexa, there are two distinct and fully isolated API sets:

  • The Alexa Skills Kit
  • The Alexa Voice Service

Let’s look at both of these APIs and examine what you can build with each.

The Alexa Skills Kit

The Alexa Skills Kit (ASK) allows developers to build skills that extend the capabilities of Alexa. End users can enable these skills on any Alexa-enabled device.

There are three types of skills you can build with the Alexa Skills Kit:

Smart Home Skills

Smart Home skills control cloud-connected devices in your home. The voice interaction model is predefined by Amazon, which makes these skills easier to create. If you are a smart home device manufacturer, you can create Alexa skills that allow your users to control your devices with their voice. Currently, Smart Home skills can control Wemo switches, Philips Hue lights, ecobee thermostats, and GE appliances, to name just a few.

Flash Briefing Skills

Flash Briefing skills enable publishers to include their content in a user’s “Flash Briefing”. This can be triggered by asking:

  • “Alexa, what’s in the news?”
  • “Alexa, play my Flash Briefing”

The Flash Briefing comes preloaded with a few networks like NPR or the BBC. However, any network can create its own Flash Briefing skill that users can enable, like any other skill, to have its content or news included. You can develop a Flash Briefing skill as long as you own the rights to distribute the text or audio and you update your content frequently.
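To make this more concrete, here is a rough sketch of how a publisher might generate one text item for a Flash Briefing feed. The field names follow our understanding of Amazon's JSON feed format and should be checked against the current documentation; the helper name, the uid, and the example.com URL are hypothetical.

```python
import json
from datetime import datetime, timezone


def make_feed_item(uid, title, text, link, when):
    """Build one text item for a Flash Briefing JSON feed.

    `when` is expected to be a timezone-aware datetime. The field names are
    an assumption based on Amazon's feed format and should be verified.
    """
    return {
        "uid": uid,
        "updateDate": when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.0Z"),
        "titleText": title,
        "mainText": text,
        "redirectionUrl": link,
    }


# Hypothetical content; a real feed would be regenerated whenever the news changes.
item = make_feed_item(
    uid="urn:uuid:example-0001",
    title="Alexa passes 5000 skills",
    text="As of November, Alexa can tap into over 5000 custom skills.",
    link="https://example.com/news/alexa-skills",
    when=datetime(2016, 11, 30, 12, 0, 0, tzinfo=timezone.utc),
)
feed_json = json.dumps([item], indent=2)
```

The resulting JSON would be served from an HTTPS endpoint that you register when configuring the skill.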

Custom Skills

Custom skills are the category that most other skills fall under. They are the most complex to create because you must also define the skill's voice interface yourself; Amazon does not predefine it as it does for Flash Briefing and Smart Home skills. In exchange, you can create a skill that fits the context of your product or service. Developers have to define the actions a user can take (the intents), the words users can speak to take those actions (the utterances), and the invocation name that helps Alexa identify the skill.
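As an illustration, the three pieces a custom skill defines could look like the sketch below. It assumes the intent schema plus sample utterances format used by the developer portal at the time of writing. The skill ("pizza palace"), the OrderPizzaIntent intent, the Size slot, and the PIZZA_SIZE slot type are all made up for this example, and the small checker is our own helper, not part of any Amazon tooling.

```python
import re

# Invocation name: "Alexa, ask pizza palace to order a large pizza" (hypothetical)
INVOCATION_NAME = "pizza palace"

# Actions the user can take. AMAZON.* entries are Amazon's built-in intents.
INTENT_SCHEMA = {
    "intents": [
        {
            "intent": "OrderPizzaIntent",
            "slots": [{"name": "Size", "type": "PIZZA_SIZE"}],
        },
        {"intent": "AMAZON.HelpIntent"},
        {"intent": "AMAZON.StopIntent"},
    ]
}

# Words users can speak, one "<IntentName> <phrase>" per line.
SAMPLE_UTTERANCES = [
    "OrderPizzaIntent order a {Size} pizza",
    "OrderPizzaIntent I want a {Size} pizza",
]


def undeclared_slots(schema, utterances):
    """Return (intent, slot) pairs used in utterances but missing from the schema."""
    declared = {
        intent["intent"]: {slot["name"] for slot in intent.get("slots", [])}
        for intent in schema["intents"]
    }
    problems = []
    for line in utterances:
        intent_name, _, phrase = line.partition(" ")
        for slot in re.findall(r"\{(\w+)\}", phrase):
            if slot not in declared.get(intent_name, set()):
                problems.append((intent_name, slot))
    return problems
```

Keeping these definitions in files like this, rather than only in the developer portal, makes it possible to review and validate them before uploading.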

Getting Started with the Alexa Skills Kit

To write a skill, you can either use AWS Lambda or your own web service. If you want the easy path, write your code in AWS Lambda.
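To give a feel for the Lambda path, here is a minimal sketch of a handler written in Python. It assumes the JSON request and response envelope that the Alexa Skills Kit sends to custom skills; WhoBuiltYouIntent is a hypothetical custom intent and all the spoken text is placeholder wording.

```python
def build_response(text, end_session=True):
    """Wrap plain text in the response envelope Alexa expects."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": text},
            "shouldEndSession": end_session,
        },
    }


def lambda_handler(event, context):
    """Entry point that AWS Lambda calls with the Alexa request as `event`."""
    request = event["request"]

    if request["type"] == "LaunchRequest":
        # The user opened the skill without asking for anything specific.
        return build_response("Welcome. What would you like to do?", end_session=False)

    if request["type"] == "IntentRequest":
        name = request["intent"]["name"]
        if name == "AMAZON.HelpIntent":
            return build_response("You can ask me who built me.", end_session=False)
        if name in ("AMAZON.StopIntent", "AMAZON.CancelIntent"):
            return build_response("Goodbye.")
        if name == "WhoBuiltYouIntent":  # hypothetical custom intent
            return build_response("I was built by a small team of developers.")
        return build_response("Sorry, I did not understand that.")

    # SessionEndedRequest and anything else: no speech is delivered to the user,
    # so this sketch returns a minimal body.
    return {"version": "1.0", "response": {}}
```

You can exercise the handler locally by calling it with a hand-written event such as {"request": {"type": "LaunchRequest"}} before deploying it.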

It is also possible to use another cloud service. Most developers use AWS Lambda as a passthrough to deal with the security requirements, although that is not necessary if you have the time to meet those requirements yourself. We have published skills on Azure, using this Alexa Verifier for signature verification and a custom domain name with a trusted HTTPS certificate.
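To show what those security requirements involve, here is a partial sketch of two checks a self-hosted endpoint performs before trusting a request. It is based on our reading of Amazon's rules: the URL in the SignatureCertChainUrl header must point at Amazon's certificate location, and the request timestamp must be recent. The function names and the 150-second tolerance constant are ours, and the timestamp format is an assumption. The sketch deliberately stops short of downloading the certificate chain and verifying the Signature header against the request body, which is the part a verifier library takes care of.

```python
import posixpath
from datetime import datetime, timezone
from urllib.parse import urlparse

MAX_SKEW_SECONDS = 150  # tolerance we understand Amazon to require


def is_valid_cert_url(url):
    """Check that a SignatureCertChainUrl value points at Amazon's cert location."""
    parts = urlparse(url)
    try:
        port = parts.port
    except ValueError:  # malformed port
        return False
    # Normalize the path so "/echo.api/../echo.api/x" is judged by where it really points.
    path = posixpath.normpath(parts.path)
    return (
        parts.scheme.lower() == "https"
        and (parts.hostname or "").lower() == "s3.amazonaws.com"
        and path.startswith("/echo.api/")  # case-sensitive on purpose
        and port in (None, 443)
    )


def is_fresh(timestamp, now=None):
    """Reject requests whose timestamp is too far from the current time."""
    # Assumes timestamps shaped like "2016-11-30T12:00:00Z".
    sent = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return abs((now - sent).total_seconds()) <= MAX_SKEW_SECONDS
```

A request that fails either check should be rejected before any skill logic runs.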

What you need to get started

A lot of general functionality is shared between skills. If you are building your first skill, you can start from one of the many available open source starter projects. The general functionality to look for includes:

  • Project structure. One good practice we recommend is to keep your Interaction Model under source control instead of just configuring it in the developer portal.
  • Build and deployment scripts
  • Base intent handlers: for common ones such as Launch, Leave, and Help. Custom ones for handling sentences like “who built you” can also be useful.
  • Signature validation: if you are not using AWS Lambda
  • Account linking: if you want the skill to be aware of user-specific context that already exists in your system, or that is to be configured from a UI.
  • SSML: if you want to control the pronunciation of some words, like your brand name.
  • MP3 playback: a way to host media files that are served over HTTPS
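The SSML and MP3 items in the list above can be sketched together, since both end up in the same part of the response. This example assumes the "SSML" output speech type in the Alexa response format and the phoneme and audio SSML tags; the helper names and the example.com URL are hypothetical, and the IPA string is only an illustration.

```python
from xml.sax.saxutils import escape, quoteattr


def phoneme(word, ipa):
    """Spell out how a word (for example a brand name) should be pronounced."""
    return "<phoneme alphabet=\"ipa\" ph=%s>%s</phoneme>" % (quoteattr(ipa), escape(word))


def audio(url):
    """Reference a short MP3 clip; the file has to be served over HTTPS."""
    if not url.startswith("https://"):
        raise ValueError("audio clips must be served over HTTPS")
    return "<audio src=%s/>" % quoteattr(url)


def ssml_response(*fragments, end_session=True):
    """Join SSML fragments into a response that uses the SSML speech type."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {
                "type": "SSML",
                "ssml": "<speak>%s</speak>" % " ".join(fragments),
            },
            "shouldEndSession": end_session,
        },
    }


reply = ssml_response(
    escape("Thanks for ordering the"),
    phoneme("pecan", "pɪˈkɑːn"),
    escape("pie."),
    audio("https://example.com/chime.mp3"),
)
```

Plain text is passed through escape so that characters like "&" or "<" in your content do not break the markup.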

Here is a popular Node.js project that provides this. It also simplifies how the intent schema and list of utterances are generated and managed.

The Alexa Voice Service

While the ASK allows developers to extend the cloud capabilities of Alexa, the Alexa Voice Service (AVS) allows third-party product companies to create Alexa-enabled devices, like the Triby and the LifePod virtual caregiver, a project we recently delivered for a client.

The Alexa Voice Service works by sending an audio stream to the cloud for Alexa to process, then playing the audio answers that Alexa sends back. The AVS can be baked into a wide variety of devices, like remotes, wearables, mobile apps, or home audio speakers, and an interaction can be initiated by push-to-talk, by tap-to-talk, or by voice using a wake word.

Before creating an Alexa Voice Service enabled device, you should consult the AVS developer documentation to learn about the requirements for your device and to get your hands on some sample projects.

Getting Started with the Alexa Voice Service

Much like the Alexa Skills Kit, there are also some good open source projects to start from for the Alexa Voice Service.

Developer Resources

There are several resources available to developers and designers looking to get started with Alexa.

Amazon Developer Documentation

Your first stop for Alexa-related resources should be the Amazon Developer documentation.

Alexa Forum

If you have questions, the developer forum is a good place to ask. It provides better support than technology-agnostic channels like Stack Overflow.

Developer Hours

If you want to ask an Alexa expert a question live, you can attend the Alexa Skills Kit: Developer Office Hours, which are hosted every Tuesday.

It’s a great way to get answers to basic questions, but if you have more advanced, fringe-case issues, you may have a harder time getting an answer here. Also, if you are looking for insight into upcoming features, don't expect answers about anything that has not already been announced.

The Slack Channel

If you want a more open form of communication, there is a community-driven Slack channel where you can discuss Alexa questions with other developers.

Alexa is just one of the many natural language platforms available for users to interact with products and services using their voice alone. While it may be the early days of Alexa, voice interaction platforms like this are only going to become more and more commonplace as a way of controlling things within your smart home, ordering a pizza or hailing an Uber. Now that you have an overview of the Alexa ecosystem and its possibilities, it’s time to get started building skills of your own.

Author Overview

Christian Nadeau

Christian is a veteran software developer at Macadamian with specialties in .NET (WPF, Silverlight, Windows Phone 8), Java (J2EE, JBoss), and C++ (Qt, BB10). He holds a bachelor's degree in computer engineering from the University of Sherbrooke.

Martin Larochelle

Martin Larochelle has been with Macadamian since 2005. In his ten years with the company, he has tackled projects both big and small as Chief Architect. An expert in C++ and VoIP, he has focused on mobile platforms. Martin was instrumental in all things BlackBerry, providing technical leadership and project oversight. Martin now leads the Macadamian Innovation Lab, a team focused on developing concepts to solve the needs of small and medium businesses and key verticals such as healthcare. While we're all a little nuts at Macadamian, Martin counts himself as the biggest HeadBlade fan in Canada.