Key takeaways
In many scenarios, voice is the best way for a customer to complete a task; in others, it isn’t. When voice is the best modality for a task, skills that respect the five principles of conversation design have the best chance of acquiring long-term, repeat customers.
Need quick advice?
View the conversation design principles checklist to see how your skill design matches up to the five conversation design principles.
Before you dive into creating your skill, ask yourself whether the customer would prefer to do the task by talking to Alexa or take care of it on their phone. Because humans have been using speech for thousands of years, verbal communication is fast-paced and can accomplish a lot in a short time compared to other modalities, such as writing or typing, filling out a worksheet or form, clicking around a webpage, sending a carrier pigeon, and other ways we’ve communicated in the past and present. Conversation experiences can also be hands-free and eyes-free to operate: the customer is likely multitasking, with Alexa in the background. Their hands may not be free, and they are probably not sitting down and focused on the conversation with Alexa.
Signs your task or feature would make a great conversation experience:
▢ Speech (both the customer’s and Alexa’s) includes low information density
▢ The task would be faster and/or easier with Alexa than the available alternatives
▢ The task would be more fun with Alexa than the alternative
▢ The answers to a customer’s questions would be specific and brief
▢ The task is commonly repeated
▢ Choices are simple and restricted in number OR complex but known ahead of time to the customer
▢ The conversation to complete the task would follow predictable patterns with low variability
The hallmarks of an experience that is appropriate for a conversation include tasks that are commonly repeated, follow predictable patterns with low variability, and involve simple choices.
For example, checking your credit card balance and paying the bill is a task that credit card holders repeat at least once a month, that follows a similar pattern of conversation each time with few variations, and that doesn’t involve making a lot of decisions at once. See how quick and simple a conversation like that can be with Alexa:
Customer: Alexa, ask MyCapital to check my balance
Alexa: Your MyCapital balance is $479.98. Your minimum payment of $25.50 is due in 5 days, on December 8. Want to pay the minimum?
Customer: Yes, please
Alexa: What’s your voice code?
Customer: 1-2-3-4
Alexa: OK, that’s $25.50, using your account ending in 8755. Ready to make your payment?
Customer: Yes
Alexa: Great. You’ve paid your MyCapital bill of $25.50, before the due date. Right on!
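A conversation this predictable can be modeled as a small state machine, which is one way to check that a task really does follow a fixed pattern with few branches. The Python sketch below is illustrative only and is not how any particular skill or the Alexa Skills Kit implements dialogs. The `Account` class, the `take_turn` function, and the state names are hypothetical, and the data mirrors the sample dialogue.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Account:
    """Hypothetical account record; the values used below mirror the sample dialogue."""
    balance: float
    minimum_payment: float
    days_until_due: int
    due_date: str
    voice_code: str
    funding_suffix: str


def take_turn(state: str, reply: str, account: Account) -> Tuple[str, str]:
    """Return (next_state, Alexa's response) for one customer reply."""
    said_yes = reply.strip().lower().startswith("yes")
    pay = f"${account.minimum_payment:.2f}"
    if state == "start":
        return "offer_payment", (
            f"Your MyCapital balance is ${account.balance:.2f}. "
            f"Your minimum payment of {pay} is due in "
            f"{account.days_until_due} days, on {account.due_date}. "
            "Want to pay the minimum?"
        )
    if state == "offer_payment":
        if said_yes:
            return "ask_code", "What's your voice code?"
        return "done", "OK, no payment made."
    if state == "ask_code":
        # Accept "1-2-3-4", "1 2 3 4", or "1234" by keeping only the digits.
        digits = "".join(ch for ch in reply if ch.isdigit())
        if digits == account.voice_code:
            return "confirm_payment", (
                f"OK, that's {pay}, using your account ending in "
                f"{account.funding_suffix}. Ready to make your payment?"
            )
        return "ask_code", "That code didn't match. What's your voice code?"
    if state == "confirm_payment":
        if said_yes:
            return "done", f"Great. You've paid your MyCapital bill of {pay}."
        return "done", "OK, no payment made."
    raise ValueError(f"unknown state: {state}")


# Replay the customer's side of the sample conversation.
account = Account(479.98, 25.50, 5, "December 8", "1234", "8755")
state = "start"
transcript = []
for reply in ["check my balance", "Yes, please", "1-2-3-4", "Yes"]:
    state, response = take_turn(state, reply, account)
    transcript.append(response)
```

Only four states are needed and each one asks a single question, which is the kind of low variability that suits a voice experience.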
Conversely, an experience may not be appropriate for a conversation when the information involved is dense or requires verbose explanation, or when lots of imagery or an in-person experience is necessary, such as when making a large purchase. A customer probably won’t prefer to speak to an Alexa skill to accomplish a task like that.
Create more natural conversations (Be natural):
▢ Alexa asks a leading question so the customer knows that it's their turn to speak, and what information the skill needs next
▢ Don’t ask multiple questions in the same prompt
▢ Don’t tell the customer explicitly what to say
▢ Don’t tell the customer how to use global controls (“help,” “repeat,” or “exit”)
▢ Support a wide range of utterances a customer can use to tell the skill what they want
▢ Listen to the prompts in the skill read in Alexa's voice (or another Amazon Polly voice) to ensure that they don't sound unintentionally awkward or confusing
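A few of the checks above can be roughly approximated while drafting prompts. The Python sketch below is a simple heuristic and not part of any Alexa tooling. The function `lint_prompt` and the pattern it uses are hypothetical, and the sketch assumes the prompt is one that expects a reply from the customer. It does not replace listening to the prompt read aloud.

```python
import re
from typing import List

# Assumed phrasings that explain a global control, such as: say "help".
GLOBAL_CONTROL_PATTERN = re.compile(
    r'\bsay\s+\W?(help|repeat|exit|stop)\b', re.IGNORECASE
)


def lint_prompt(prompt: str) -> List[str]:
    """Return warnings for a prompt that expects a reply from the customer."""
    warnings = []
    question_count = prompt.count("?")
    if question_count == 0:
        warnings.append("no question: the customer may not know it is their turn")
    elif question_count > 1:
        warnings.append("multiple questions in one prompt")
    if GLOBAL_CONTROL_PATTERN.search(prompt):
        warnings.append("explains a global control")
    return warnings
```

For example, `lint_prompt("Where to? And when do you want to leave?")` flags the double question, while `lint_prompt("Which city are you flying to?")` returns an empty list.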
Create brief conversations (Be brief):
▢ Take the minimum number of steps to complete a task; use the fewest simple statements possible to communicate accurately and clearly
▢ Speak in active voice, not passive voice
▢ Responses can be read out by a human in one or two breaths
▢ Omit unnecessary asides and redundant information
▢ Give only one to a few options at a time, depending on the length or complexity of the options
▢ Send a card to the customer’s Alexa app when it needs to share information that’s easy to forget if it’s only read out verbally
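One way to keep option lists short is to read them out in small batches. The Python sketch below illustrates that idea under stated assumptions: the batch size of three is an arbitrary default rather than a documented limit, and `option_batches` and `batch_prompt` are hypothetical helper names.

```python
from typing import List


def option_batches(options: List[str], batch_size: int = 3) -> List[List[str]]:
    """Split options into small groups; batch_size=3 is an assumed default."""
    return [options[i:i + batch_size] for i in range(0, len(options), batch_size)]


def batch_prompt(batch: List[str], has_more: bool) -> str:
    """Read out one batch, ending with a single leading question."""
    if len(batch) == 1:
        listed = batch[0]
    elif len(batch) == 2:
        listed = f"{batch[0]} or {batch[1]}"
    else:
        listed = ", ".join(batch[:-1]) + f", or {batch[-1]}"
    more = ", plus more options" if has_more else ""
    return f"I found {listed}{more}. Which would you like?"
```

Longer or more complex options would call for a smaller batch size, since each response should still fit in one or two breaths.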
Create more contextually relevant conversations (Be contextual):
▢ Help new customers understand the skill’s functions; adapt your messaging to the needs of customers along their journey
▢ Taper messaging down over time as the customer gains more experience
▢ Present options to the customer in order of most relevant
▢ Don’t present options that aren’t relevant to the customer
▢ Don’t bait and switch: Don’t offer options that will result in an error
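Tapering and relevance filtering can both be driven by a small amount of stored context. The Python sketch below is one possible approach, not a prescribed implementation. The visit thresholds (1 and 5) are arbitrary assumptions, the option fields `available` and `relevance` are hypothetical, and the greetings reuse the MyCapital example purely for illustration.

```python
from typing import Dict, List


def welcome_message(visit_count: int) -> str:
    """Pick a greeting that shrinks as the customer gains experience."""
    if visit_count <= 1:
        return ("Welcome to MyCapital. I can check your balance or pay "
                "your bill. Which would you like?")
    if visit_count <= 5:
        return "Welcome back to MyCapital. Balance or bill pay?"
    return "Welcome back. What would you like to do?"


def relevant_options(options: List[Dict]) -> List[str]:
    """Drop options that would end in an error, then order by relevance."""
    usable = [o for o in options if o["available"]]
    usable.sort(key=lambda o: o["relevance"], reverse=True)
    return [o["name"] for o in usable]
```

Filtering on `available` before presenting anything is what avoids the bait and switch: an option the customer cannot complete is never offered.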
Create multimodal experiences (Be multimodal):
▢ A customer can use the core functions of the skill without need of a screen
▢ Each touch (or remote) target on screen has a voice command analog (not every voice command needs a target)
▢ Leverage touch interactions where voice may not be ideal (such as displaying long lists to scroll)
▢ Differentiate the voice responses surfaced on devices with screens
▢ The core voice functions of the skill have corresponding touch interactions
▢ Offer on-screen shortcuts or hints to expedite the experience for tasks the customer will complete most often
▢ The most significant “moments” in the skill experience are punctuated with useful, delightful on-screen interactivity
▢ Clearly display important, contextually relevant data (such as the score of a game, answer to a question, etc.) when the customer most needs it
▢ Offer a variety of imagery based on customer history, seasonality, or other dynamic context on static screens the customer will see often
Design trustworthy skills (Be trustworthy):
▢ Where there are potentially multiple customers or accounts, confirm the intended listener, and/or that there aren’t other people present, before revealing potentially sensitive information
▢ Follow the requirements for protecting sensitive information with a PIN code
▢ Surface only relevant messages to the customer
▢ Don’t surface options that will result in an error if selected
▢ Don’t interrupt an experience with an offer to take an unrelated action
▢ State a clear value proposition for actions your skill asks the customer to perform
▢ Ask customers to confirm actions that have a high consequence for error
▢ Use implicit and contextual confirmations
▢ Don’t reveal sensitive customer information on the screen
▢ Display information on-screen that is relevant to the context of the verbal responses
▢ Test your skill on a range of devices with a screen, without a screen, and on the go
▢ Your skill name should be an accurate representation of its functions
▢ Your skill’s description, in the skill store and actual experience, should be an accurate representation of its functions
▢ Don’t explicitly or implicitly promise future functionality
▢ Don’t require additional confirmation from the customer to exit when they use the stop/exit command unless significant data or progress will be lost
▢ The skill does not change Alexa’s name or personality, and is not treated as a brand representative
▢ Alexa uses “we”/“us” pronouns to refer to herself and the customer, not herself and a third party