Voice input

Overview – Voice input

Voice input, also known as speech recognition, translates spoken language into text and computer commands. It is especially useful for people who have difficulty using their hands to operate a mouse or keyboard, including people with mild repetitive stress injuries; people with limited dexterity or muscular control limitations such as tremors, poor coordination, or paralysis; people with arthritis; and people missing limbs.

Voice input allows users to dictate text into any application, such as MS Word, email, and web forms. Rather than typing, users talk into a microphone, and the words appear wherever the keyboard focus is. When browsing web pages using voice input, users can activate links and buttons, fill out forms, copy and paste text, scroll the screen, and perform other functions.

Content must be properly designed and coded so that it can be controlled by speech. Generally, pages that allow keyboard interaction and screen reader use are also accessible through voice input.

Challenges that people who use speech recognition software face on the web:

  • Lack of keyboard control for all functionality (e.g., a carousel that moves without a pause button, a custom button that lacks a keyboard event)
  • Invisible focus indicators on interactive elements
  • Mismatched visual order and tab order
  • Linked images whose visible text and alternative text don’t match
  • Duplicate link text (e.g., Read More) that leads to different places
  • Visible link or button text that's missing from its ARIA label (see Label in name, below)
  • Form controls without programmatically associated labels
  • Hover-only menus
  • Small click targets
  • Clickable controls that don’t look clickable

Label in name

As explained in the description of Dragon NaturallySpeaking, below, voice input users activate links and buttons by saying their names. When a control has an accessible name set via an invisible ARIA label as well as a visible label, the visible label text must match or be contained in the ARIA label. The ARIA label takes precedence over any visible text. If no part of the visible label appears in the accessible name, the user won’t be able to target the control by name.
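
The rule above can be sketched as a simple string check. This is a minimal illustration, not a full audit tool: the function names `normalize` and `labelInName` are hypothetical, and both strings are assumed to be supplied by the caller rather than computed from a live page.

```typescript
// Simplified label-in-name check (hypothetical helper names).
// Assumption: the caller already knows the visible label text and the
// accessible name; computing a real accessible name is out of scope here.

// Lowercase the text and collapse runs of whitespace so that cosmetic
// differences do not cause a false mismatch.
function normalize(text: string): string {
  return text.toLowerCase().replace(/\s+/g, " ").trim();
}

// Returns true when the visible label is contained in the accessible name.
function labelInName(visibleLabel: string, accessibleName: string): boolean {
  const label = normalize(visibleLabel);
  // No visible text means there is nothing the user would read aloud,
  // so there is nothing to check.
  if (label === "") return true;
  return normalize(accessibleName).includes(label);
}
```

For example, a button showing "Read more" with an ARIA label of "Read more about voice input" passes this check, while a button showing "Search" with an ARIA label of "Find" does not, so a user saying "Click Search" would likely fail to activate it.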

Dragon NaturallySpeaking

Dragon NaturallySpeaking (Dragon) is the industry-leading voice input software.

Example Dragon commands

Dragon commands are numerous and allow the user to control the whole computer interface. A subset concerns accessibility.

To move focus from one focusable control to the next, the user says “Press Tab”.

To activate a control, the user says “Click” followed by the name of the control. Users can access any of the following controls by saying their name:

  • Links
  • Buttons
  • Checkboxes
  • Images (actionable)
  • Radio buttons
  • Text fields
  • List boxes

Dragon will click the control once it has a unique match, so users often don’t need to say the entire name.
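
The idea of a unique match can be illustrated with a small sketch. This is a simplified model for explanation only and does not describe how Dragon is implemented: the names `MatchResult` and `resolveSpokenName`, the prefix-matching rule, and the fallback to numbering ambiguous candidates are all assumptions.

```typescript
// Illustrative model of name matching (NOT Dragon's actual algorithm).
// Assumption: a spoken phrase matches a control when the control's name
// starts with that phrase, ignoring case.
type MatchResult =
  | { kind: "click"; name: string }            // exactly one control matched
  | { kind: "number"; candidates: string[] }   // several matched; user must pick
  | { kind: "none" };                          // nothing matched

function resolveSpokenName(spoken: string, controlNames: string[]): MatchResult {
  const phrase = spoken.toLowerCase().trim();
  if (phrase === "") return { kind: "none" };

  const candidates = controlNames.filter((name) =>
    name.toLowerCase().startsWith(phrase)
  );

  if (candidates.length === 0) return { kind: "none" };
  if (candidates.length === 1) return { kind: "click", name: candidates[0] };
  // More than one control shares the spoken text, so no unique match exists.
  return { kind: "number", candidates };
}
```

Under this model, saying "Click con" on a page with the controls "Contact us", "About", and "Read More" resolves to "Contact us" without the full name. Two links that are both named "Read More" never produce a unique match, which is why duplicate link text slows voice input users down.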

Alternatively, users can number all controls and choose a number:

  1. Users say one of the following:
    • Show links or Click links
    • Click button
    • Click checkbox
    • Click image
    • Click radio button
    • Click text field
    • Click list box

    Numbers appear next to all controls of that type.

    If there is only one control of that type, that control will be clicked.

  2. Users say “Choose [number]” to select from the numbered controls.

Other common commands enable the user to scroll the page, reload, navigate browser history, undo text entered in input fields, etc.

Dragon MouseGrid

The Dragon NaturallySpeaking MouseGrid allows the user to move the mouse pointer to a specific area of the screen.

Users summon the MouseGrid by saying “Mouse Grid”, which displays a transparent three-by-three grid over the screen with the sections numbered one through nine.

With the MouseGrid on the screen, the user says the number of the section they want to target. A new three-by-three grid appears in that section, with the mouse pointer at its center.

The user can repeat the process as many times as needed. When the mouse pointer is finally over the control the user wishes to activate, they say “Click”.
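
The repeated subdivision described above can be sketched as follows. This is an illustrative model, not Dragon's implementation: the `Rect` type and the `mouseGridTarget` function are hypothetical names, and the sketch assumes sections are numbered left to right, top to bottom, with 1 at the top left and 9 at the bottom right.

```typescript
// Illustrative model of MouseGrid refinement (hypothetical names).
// Assumption: cells are numbered 1..9, left to right, top to bottom.
interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Narrows the target area once per spoken number and returns the center of
// the final area, which is where the pointer would rest.
function mouseGridTarget(screen: Rect, spokenCells: number[]): { x: number; y: number } {
  let area: Rect = screen;
  for (const cell of spokenCells) {
    if (!Number.isInteger(cell) || cell < 1 || cell > 9) {
      throw new RangeError(`Grid cell must be an integer from 1 to 9, got ${cell}`);
    }
    const col = (cell - 1) % 3;             // 0 = left, 2 = right
    const row = Math.floor((cell - 1) / 3); // 0 = top, 2 = bottom
    const width = area.width / 3;
    const height = area.height / 3;
    area = { x: area.x + col * width, y: area.y + row * height, width, height };
  }
  return { x: area.x + area.width / 2, y: area.y + area.height / 2 };
}
```

Each spoken number shrinks the target area to one ninth of its previous size, so a few numbers are enough to reach a small control. On a 900 by 900 pixel screen, saying "9" and then "1" leaves a 100 by 100 pixel area centered at (650, 650). Small click targets require more steps, which is one reason they appear in the list of challenges above.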

This section is adapted from MouseGrid, from the Nuance Dragon NaturallySpeaking documentation.

Related WCAG resources

Success criteria

Techniques

Failures