Editor's Note: The device software update mentioned below has been completed and APL skill certification and device testing are now available. Please refer to the APL skill availability announcement for more information.
Today we’re excited to announce a preview of a new design language and tools that make it easy to create visually rich Alexa skills for tens of millions of Alexa devices with screens. The Alexa Presentation Language (APL) enables you to build interactive voice experiences that include graphics, images, slideshows, and video, and to customize them for different device types. Echo Show, Echo Spot, Fire TV, and select Fire Tablet devices will support skills built using APL next month, as will third-party devices built using the Alexa Smart Screen and TV Device SDK in the coming months. To learn more about building Alexa skills, visit: https://developer.amazon.com/alexa-skills-kit.
Alexa Presentation Language: Purpose-Built for Voice
Customers embrace voice because it’s simple, natural, and conversational. When you build a multimodal experience, you combine voice, touch, text, images, graphics, audio, and video in a single user interface. Combining voice and visual experiences can make skills even more delightful, engaging, and simple for customers. You can provide customers with complementary information that’s easily glanceable from across the room. Or you can use the screen to enrich your voice-first experience and reduce friction for customers. For example, you can offer visual cues for confirmation or show lists of items during a content search experience.
APL is designed from the ground up for creating voice-first, visual Alexa skills that adapt to different device types. Included in the Alexa Skills Kit, APL gives you flexible tools and resources to translate voice-first experiences to the screen. With APL, you can build skills that are:
- Rich: You can build interactive voice-first experiences that include text, graphics, slideshows, and, soon, video content. You can synchronize on-screen text and images with the associated spoken voice. You can support voice commands as well as touch and remote controls, when available, and take advantage of automatic entity resolution for voice-based selection of on-screen elements.
- Flexible: You can control your user experience by defining where visual elements are placed on-screen, and match the visual expression to your brand guidelines. You can reuse designs to deliver a familiar visual experience across multiple skills, and share your designs with others.
- Adaptable: You can customize experiences to reach customers anywhere, through an expanding range of Alexa devices with screens. The experiences you create with APL can be tailored to the unique characteristics of the Alexa device they are rendered on, and can target devices with a broad range of memory and processing capabilities.
- Easy: To get started quickly, you can take advantage of Amazon-supplied sample APL documents that are designed to work well across a broad range of different device types. You can use these samples as-is, modify them, or build your own from scratch. Although APL is a new language, it adheres to universally understood styling practices, and the syntax will be familiar to anyone with front-end development experience.
“This year alone, customers have interacted with visual skills hundreds of millions of times. You told us you want more design flexibility – in both content and layout – and the ability to optimize experiences for the growing family of Alexa devices with screens,” said Nedim Fresko, Vice President, Alexa Devices and Developer Technologies. “With the Alexa Presentation Language, you can unleash your creativity and build interactive skills that adapt to the unique characteristics of Alexa Smart Screen devices. We can’t wait to see what you create.”
APL Components You Can Use to Build
When you design with APL, you create APL documents, which are JSON files sent from your skill to a device. The device evaluates the APL document, imports images and other data as needed, and renders the experience. In your APL documents, you can use:
- Images, Text, and ScrollViews: You can deliver images and text on-screen, and specify text color, size, and weight for available fonts. You can use ScrollViews to display text that extends beyond the bounds of its container. You can also make both text and images respond to touch using TouchWrappers.
- Pagers and Sequences: You can use Pagers to show a time-ordered sequence of items that typically advance automatically, such as slideshows. Or you can use Sequences to show a continuous list of choices, such as local restaurants, and allow customers to navigate the list via voice or by touch / remote control.
- Layouts and Conditional Expressions: You can use Layouts to group components such as images, text, ScrollViews, pagers, and sequences, as well as describe their placement on screen. You can nest layouts, and take advantage of header, footer, and hint layouts provided by Amazon. You can customize by device type using the when property in your layouts; for example, you might conditionally select one nested layout when the device shape is round, and another when the shape is rectangular.
- Speech Synchronization and Other Commands: You can send commands that change the audio or visual presentation of on-screen content, or generate those commands automatically within your APL documents. For example, you can highlight the line or block of text currently being read using the SpeakItem command and highlightMode. You can use SetPage and AutoPage commands to control the pages displayed in a Pager component, and the Idle command to insert pauses.
- Video, Audio, and HTML5 (Coming Soon): In the coming months, you’ll be able to include video and audio content within your APL layouts and continue your skill experience when media playback is done. You’ll also be able to use HTML5 in your skills. We’ll share more about these capabilities soon.
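To make the components above more concrete, here is a minimal sketch in Python that builds an APL document as a plain dictionary and serializes it to JSON, along with a short command list. The property names (`mainTemplate`, `componentId`, `highlightMode`, `delay`) follow the public APL reference as best understood and may differ in the preview. The component id, text, and image URL are placeholders invented for illustration.

```python
import json

# Hypothetical APL document: a Container that stacks an Image above a Text.
# Field names are assumptions based on the public APL reference.
document = {
    "type": "APL",
    "version": "1.0",
    "mainTemplate": {
        "parameters": ["payload"],
        "items": [
            {
                "type": "Container",
                "direction": "column",
                "items": [
                    {
                        "type": "Image",
                        "source": "https://example.com/forecast.png",  # placeholder URL
                        "width": "100vw",
                        "height": "50vh",
                    },
                    {
                        "type": "Text",
                        "id": "forecastText",  # hypothetical component id
                        "text": "Sunny with a high of 75 degrees.",
                        "fontSize": "40dp",
                        "fontWeight": "bold",
                        "color": "#FFFFFF",
                    },
                ],
            }
        ],
    },
}

# Hypothetical command list: read the text aloud with line-by-line
# highlighting, then pause for one second. A real skill would also need to
# attach speech data to the Text component; that step is omitted here.
commands = [
    {"type": "SpeakItem", "componentId": "forecastText", "highlightMode": "line"},
    {"type": "Idle", "delay": 1000},
]


def to_json(obj):
    """Serialize a document or command list to the JSON a skill would send."""
    return json.dumps(obj, indent=2)


if __name__ == "__main__":
    print(to_json(document))
    print(to_json(commands))
```

Building the document as ordinary data keeps it easy to generate per request and to check before sending, for example confirming that every command targets a component id that actually exists in the document.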
You can use the new APL authoring tool and test simulator in the Alexa Developer Console to iterate on your designs, visualize how they’ll render, and test interactions.
Combine APL Components to Create Unique Experiences
APL provides you with the tools you need to combine these components to build rich, flexible, adaptable, and easy-to-use skills for customers. If you have already enhanced your Alexa skill with visuals, then you are familiar with display templates. APL is more powerful than display templates and gives you greater flexibility. With APL, you can:
- Complement voice-first experiences with visuals. You can display images, text, and overlays to communicate requested information at a glance, and provide additional, relevant details. For example, when a customer requests a stock quote, CNBC shows a graph of the stock’s performance, and adds more information based on display size, including a table of stock fundamentals such as market capitalization and dividend. When a customer asks for the forecast, Big Sky displays images for sun, clouds, snow, and other conditions, and overlays projected temperature ranges. Both skills use the when property to adjust layout and information density based on display size.
- Simplify navigation of large data sets. You can use on-screen lists and sequences to help customers quickly review options, and enable them to choose using either touch or voice. For example, Demaecan uses custom layouts and pagers to enable customers to view and order from the menus of thousands of restaurants in Japan. NextThere enables customers to view public transit schedule information across all of Australia and New Zealand, and uses custom APL layouts with images for route maps and text for scheduled trip times. The German skill kicker uses APL layouts and TouchWrappers to provide up-to-date Bundesliga soccer scores, share news about the top leagues in Europe, and enable easy navigation.
- Create immersive and delightful experiences. You can build highly engaging experiences that your customers can sit back and watch, or lean into to get things done. For example, Kayak uses an APL pager to deliver slideshows of iconic images for potential travel destinations, and provides additional information using a multi-row list custom layout on larger displays, such as Fire TV. Food Network created a three-column layout with recipe information including difficulty and total time, and complemented this with buttons that provide verbal suggestions and can be touched to take action. Best Buy uses APL styles to create an on-brand experience aimed at bringing the expertise of Best Buy into your home, and takes advantage of text highlighting and Idle commands in a new step-by-step tech tips feature.
- Adapt skills to work on different devices. Take advantage of conditional expressions and layouts to deliver the right experience on the device a customer is using, from focused personal information delivery on Echo Spot to communal, 7’ experiences on Fire TV. For example, the FX experience on Echo Spot focuses primarily on the ability to play a trailer, while the Echo Show experience provides additional content description and the TV experience is optimized for large-screen viewing and more content and menu options. Who Wants to be a Millionaire brings the well-known game show to Alexa, and uses APL styles and layouts to deliver a familiar game experience that is optimized for Fire TV, Echo Show, and Echo Spot.
Customers can use the skills described above starting next month.
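The device adaptation described in this section relies on conditional expressions such as the when property. As a rough sketch, the Python below attaches a when guard to two alternative layouts so that only one applies for a given device shape. The `${viewport.shape == ...}` expression syntax and the shape names are assumptions based on the public APL reference, and the helper name and text content are hypothetical.

```python
import json


def guard_by_shape(round_item, rectangle_item):
    """Return both alternatives, each guarded by an explicit `when` expression.

    Assumes the device exposes `viewport.shape` with the values 'round'
    and 'rectangle'; only one guard can be true on a given device.
    """
    return [
        {**round_item, "when": "${viewport.shape == 'round'}"},
        {**rectangle_item, "when": "${viewport.shape == 'rectangle'}"},
    ]


# Hypothetical content: a single centered prompt on a round screen,
# and a two-item row on a rectangular screen.
round_layout = {"type": "Text", "text": "Play trailer", "textAlign": "center"}
rectangle_layout = {
    "type": "Container",
    "direction": "row",
    "items": [
        {"type": "Text", "text": "Play trailer"},
        {"type": "Text", "text": "Episode guide"},
    ],
}

document = {
    "type": "APL",
    "version": "1.0",
    "mainTemplate": {"items": guard_by_shape(round_layout, rectangle_layout)},
}

if __name__ == "__main__":
    print(json.dumps(document, indent=2))
```

Guarding every alternative explicitly, rather than relying on ordering, keeps the intent readable when more device shapes or sizes are added later.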
Join Us for a Demonstration and Apply for the APL Preview
Please join us tomorrow, September 21, at 10 am PST on Twitch for a brief APL demonstration. The demonstration will also be available to watch later. You can apply to participate in the APL preview, whether you’ve built a multimodal skill before or have never designed for Alexa. Tell us about your use case via the short survey. We’ll notify you if your application is selected. Either way, you’ll be able to use APL soon when we release the public beta next month.