Andrew Muir - Accessibility in Voice Assistant Design

Kick-off

After identifying gaps in bot behaviour, I asked the design team to log calls that had poor user experience due to accessibility issues, tagging these with appropriate metadata to gain quick insights. This action confirmed that our assistants could be doing more to support users with accessibility needs, with some systems having no built-in behaviour for this user population.

For this, I pushed the team to follow the double diamond methodology. This involved an initial discovery and subsequent discussion to define the problem, before developing and improving a vulnerable user detector that was rolled out to all voice assistants.

Discovery

*images of internal PolyAI work blurred due to IP constraints.

A second designer and I mapped out real user journeys and highlighted accessibility failings. At this early stage, we looked for high-level similarities and differences between these failed journeys to help shape our understanding of the problem space*.

Before jumping to solutions, we researched how accessibility requirements are formally categorised. We collated the ways they are grouped and documented by government organisations and charities. From this research, we split accessibility requirements into four flexible categories: vision, hearing, motor and cognitive. For our own discovery, we separated out speech impairments as a fifth category due to the disproportionate impact these can have when interacting with voice systems. We then created new user journeys to explore how each category of requirement would affect a user's experience with voice assistants.

To further tie this discovery to our product, we explored the different ways in which users could physically interact with our systems. We sourced telephony hardware and support systems used to aid people with accessibility requirements and created problem statements about how these could potentially impact user experience.

Definition

From this work, we created two tiers of voice assistant deployments — 'low' and 'high' risk.

'Low-risk' deployments had user populations in which most users did not have accessibility requirements, so the risk of failure due to accessibility requirements was lower.

'High-risk' deployments were considered to be projects that could have a greater number of accessibility issues and would require custom support. These included healthcare deployments and projects with elderly user bases.

For low-risk projects, we wanted to offer a simple, repeatable and easy-to-build feature that would scale with new deployments, dubbed 'base coverage'. We created a prototype vulnerable user detector that flagged when a user had accessibility issues or the call was failing in some way. When triggered, the detector would either log a metric or, where the option was enabled, hand the call off to a human. This meant users would be able to receive custom support depending on their needs.
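To illustrate the flag-then-act behaviour described above, here is a minimal sketch in Python. It is not the production detector: the phrase list, the failed-turn threshold and every name in it (CallState, detect, NEED_PHRASES, max_failed_turns, handoff_enabled) are hypothetical, and simple phrase matching stands in for whatever detection logic a real deployment would use.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Action(Enum):
    CONTINUE = "continue"      # nothing detected, carry on with the call
    LOG_METRIC = "log_metric"  # record the flag for later analysis
    HANDOFF = "handoff"        # transfer the caller to a human


# Hypothetical phrases a caller might use to state an accessibility need.
NEED_PHRASES = ("hard of hearing", "can't hear", "speech impairment", "visually impaired")


@dataclass
class CallState:
    failed_turns: int = 0  # consecutive turns the assistant did not understand
    flags: List[str] = field(default_factory=list)


def detect(state: CallState, utterance: str, understood: bool,
           max_failed_turns: int = 3, handoff_enabled: bool = True) -> Action:
    """Flag a stated need or a failing call, then pick an action."""
    stated_need = any(phrase in utterance.lower() for phrase in NEED_PHRASES)
    state.failed_turns = 0 if understood else state.failed_turns + 1
    failing = state.failed_turns >= max_failed_turns

    if not (stated_need or failing):
        return Action.CONTINUE

    state.flags.append("stated_need" if stated_need else "call_failing")
    # Assumed per-project setting: hand off where available, otherwise only log.
    return Action.HANDOFF if handoff_enabled else Action.LOG_METRIC
```

In this sketch, a project without human agents would set handoff_enabled to False so that flagged calls are only logged, which mirrors the "log a metric or hand off" choice described above.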

For high-risk projects, we explored custom, complex voice assistant behaviours. We created prototypes of eight new solutions to account for the most common problematic user journeys previously identified. These included navigating the bot using only the keypad, increasing the volume and speed of the audio, introducing simplified versions of complicated utterances, and alternate user flows. These solutions required resources beyond the design team, with multiple builds required for different projects and needs. As such, this support was deprioritised but documented, in order to quickly spin up builds once a specific need or client request surfaced.

Delivery and outcomes

After testing and iterating on the base coverage detector in a single project, a production-ready detector was rolled out to all projects. This detector could be modified to account for custom project requirements without impacting its accuracy.

Moreover, due to the modular nature of the detector, the feature is now included by default in all new voice assistants. This modularity meant we were able to deploy base coverage quickly, with minimal engineering effort and no negative impact on project KPIs.

The design team is now working to find new ways to detect accessibility issues, integrating new speech recognition models that are trained to recognise speech impairments and rolling out greater natural language understanding support to all our projects for when users state they have accessibility needs.