Best practice
Captions
- Provide captions for pre-recorded and live video with audio
- Use the <track> element to specify timed text tracks for <audio> or <video> elements.
- Captions are synchronized with the audio.
- Captions are typed in mixed case letters.
- Captions use no more than three lines at a time.
- Put a new sentence on a new line.
- Use a maximum of 32 characters per line.
- Insert caption line breaks at logical points rather than in the middle of a phrase.
- Default colors are white text on a black background.
- Default color contrast ratio between font color and background color is a minimum of 3:1 (font size at least 18 points).
- Default font size is at least 22pt.
- Position captions to not obscure on-screen text, people’s faces and other important visual information.
- Ensure a gap of at least 1.5 seconds between captions.
- Remove captions from long silent intervals. Captions have a maximum duration of 6 seconds.
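Several of the caption rules above are numeric and can be checked automatically. The following is a minimal sketch, assuming each caption is available as a start time, an end time, and text with line breaks; the Cue type and the checkCue function are hypothetical names used only for illustration.

```typescript
// One caption cue: start and end in seconds, text with "\n" between lines.
// (Hypothetical shape, not tied to any particular caption format.)
interface Cue {
  start: number;
  end: number;
  text: string;
}

// Limits taken from the checklist above.
const MAX_LINES = 3;
const MAX_CHARS_PER_LINE = 32;
const MAX_DURATION_SECONDS = 6;

// Returns a list of problems; an empty list means the cue passes these checks.
function checkCue(cue: Cue): string[] {
  const problems: string[] = [];
  const lines = cue.text.split("\n");

  if (lines.length > MAX_LINES) {
    problems.push(`uses ${lines.length} lines (maximum ${MAX_LINES})`);
  }
  lines.forEach((line, index) => {
    if (line.length > MAX_CHARS_PER_LINE) {
      problems.push(
        `line ${index + 1} has ${line.length} characters (maximum ${MAX_CHARS_PER_LINE})`
      );
    }
  });
  if (cue.end - cue.start > MAX_DURATION_SECONDS) {
    problems.push(`lasts longer than ${MAX_DURATION_SECONDS} seconds`);
  }
  return problems;
}
```

A check like this only covers the measurable rules. Line breaks at logical points, caption position, and synchronization still need a human review.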
Transcripts
- Basic transcripts are a text version of the speech and non-speech audio information.
- Descriptive transcripts also include text description of the visual information.
- Provide descriptive transcripts for pre-recorded video with audio
- Provide descriptive transcripts or audio description for pre-recorded video-only
- Provide basic transcript for pre-recorded audio-only
- Provide basic transcript or captions for live audio-only
- Interactive transcripts enable a user to click a phrase anywhere in the transcript to navigate to that exact point in the video (or audio). Interactive transcripts are built from timed text files specified in the <track> element.
- Position the transcript or a link to it directly below or adjacent to the media player.
- If the transcript is on another page, provide a link back to the audio or video file.
- Provide the transcript in HTML for maximum accessibility to people and to search engines.
- If working with a captions file, combine several lines into sensible paragraphs.
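Combining caption lines into paragraphs can be partly automated. The sketch below assumes that a pause between captions longer than a chosen threshold marks a paragraph break; that heuristic, the 2-second default, and the names TimedLine and cuesToParagraphs are illustrative assumptions, and the result still needs editing by hand to produce sensible paragraphs.

```typescript
// A caption line with timing in seconds (hypothetical shape).
interface TimedLine {
  start: number;
  end: number;
  text: string;
}

// Joins caption text into paragraphs, starting a new paragraph whenever
// the pause since the previous caption exceeds gapSeconds.
function cuesToParagraphs(cues: TimedLine[], gapSeconds: number = 2): string[] {
  const paragraphs: string[] = [];
  let current: string[] = [];
  let previousEnd: number | null = null;

  for (const cue of cues) {
    const longPause = previousEnd !== null && cue.start - previousEnd > gapSeconds;
    if (longPause && current.length > 0) {
      paragraphs.push(current.join(" "));
      current = [];
    }
    // Caption line breaks are a display detail, so collapse them to spaces.
    current.push(cue.text.replace(/\s*\n\s*/g, " ").trim());
    previousEnd = cue.end;
  }
  if (current.length > 0) {
    paragraphs.push(current.join(" "));
  }
  return paragraphs;
}
```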
Transcribing audio to text
- Transcription best practice is nearly identical for captions and transcripts.
- When transcribing, the goal is accuracy:
- Never paraphrase or omit words (and do not censor).
- Never substitute words.
- Never rearrange the order of speech.
- Never correct or edit a speaker’s grammar.
- Never provide clarifying information in the captions (you may in the transcript).
- Transcribe all speech and non-speech sounds (laughs, groans, sighs, screams, car backfiring, footsteps approaching, distant roaring)
- Identify the speakers. Use the full name the first time and a single name otherwise. If a speaker is not identified, use Speaker + number (e.g., Speaker 1, Speaker 2) or use a role/title without a number (e.g., Interviewer, Doctor).
- Exclude non-relevant speech and non-relevant background noise.
- Do not reveal intentionally held information before the appropriate time.
- Include relevant information about the speech, e.g., (whispering), (mouthing).
- Put non-speech sounds in parentheses, italics, lowercase, and with a space before and after, e.g., ( chatter in distance )
- Use punctuation to convey emphasis.
- For interrupted speech, use a dash at the end of the line.
- Use all capital letters only to indicate yelling.
- When the speech is unintelligible or inaudible, transcribe [inaudible]
- Indicate large silences as (silence).
- Include background music if it's important to understand the content:
- Identify music with the uppercase label MUSIC (or a verb implying music), followed by a colon and the title in quotation marks followed by the artist.
- Transcribe important lyrics with musical notes to either side, e.g.,
♪ A long, long time ago ♪
- Describe music that’s not part of the action but sets the mood, e.g.,
♪ scary music ♪
- Best practices unique to transcripts:
- For descriptive transcripts, include all relevant audio information as well as description of all relevant visual information.
- If your transcript is generated from timed text files, ensure descriptions fit into gaps in the main audio, or use a player that can pause the video during the description.
- Transcripts include onscreen text in videos. Captions do not include onscreen text.
- Ensure transcripts identify the source of sounds, rather than just describing them.
- In some cases, such as legal depositions, the transcript must be verbatim, including ums and ahs, and must indicate pauses.
- Headings, topics and links can make the transcript more usable.
- Include timestamps only when useful.
- Add a timestamp to inaudible audio.
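The notation conventions above (non-speech sounds, lyrics, music labels, unidentified speakers, inaudible speech) are easier to apply consistently with a few small helpers. This is a minimal sketch with hypothetical function names. Plain strings cannot carry italics, so that part of the convention is left to the publishing format, and the single space between the quoted title and the artist is an assumption, since the checklist does not specify a separator.

```typescript
// Marker for speech that cannot be made out.
const INAUDIBLE = "[inaudible]";

// Non-speech sound: lowercase, in parentheses, with a space inside each bracket.
function formatSound(description: string): string {
  return `( ${description.trim().toLowerCase()} )`;
}

// Lyrics or mood music: a musical note on either side.
function formatMusic(text: string): string {
  return `♪ ${text.trim()} ♪`;
}

// Music identification: uppercase label, colon, quoted title, then the artist.
// The separator before the artist is an assumption.
function formatMusicLabel(title: string, artist: string): string {
  return `MUSIC: "${title.trim()}" ${artist.trim()}`;
}

// Label for a speaker who is never identified by name.
function unidentifiedSpeaker(index: number): string {
  return `Speaker ${index}`;
}
```

For example, formatSound("Chatter in distance") gives "( chatter in distance )", and formatMusic("scary music") gives "♪ scary music ♪".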
Description of visual information
- Provide audio description for pre-recorded video with audio.
- Provide audio description or a descriptive transcript for video-only
- Design new videos with integrated descriptions (script includes all relevant visual information) to avoid the need for audio description
- Make sure the important visual elements are described appropriately and objectively to understand what the video is communicating.
- Write description of visual information in present tense, using an active voice and a third-person narrative style.
- Make sure to include all text, e.g., title text at the beginning, links and email addresses, speakers’ names, and text in a presentation.
Media player accessibility
The ideal media player provides built-in support for captions, audio descriptions, and transcripts.
Keyboard accessibility:
- All controls can receive focus via the tab key.
- Controls have a visible keyboard focus indicator.
- The tab order of controls matches the visual order, left to right.
- All controls are operable by keyboard.
- Text, controls, and backgrounds have sufficient contrast between colors.
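Whether two colors have sufficient contrast can be computed rather than judged by eye. The sketch below assumes that the contrast ratios in this checklist (such as the 3:1 default for captions) refer to the WCAG 2.x contrast ratio, which is derived from the relative luminance of each sRGB color. The names Rgb, relativeLuminance, and contrastRatio are illustrative.

```typescript
// An sRGB color as [red, green, blue], each channel 0 to 255.
type Rgb = [number, number, number];

// Relative luminance as defined in WCAG 2.x.
function relativeLuminance(color: Rgb): number {
  const linear = (channel: number): number => {
    const s = channel / 255;
    return s <= 0.03928 ? s / 12.92 : Math.pow((s + 0.055) / 1.055, 2.4);
  };
  const [r, g, b] = color;
  return 0.2126 * linear(r) + 0.7152 * linear(g) + 0.0722 * linear(b);
}

// Contrast ratio between two colors, from 1 (identical) to 21 (black on white).
function contrastRatio(a: Rgb, b: Rgb): number {
  const la = relativeLuminance(a);
  const lb = relativeLuminance(b);
  const lighter = Math.max(la, lb);
  const darker = Math.min(la, lb);
  return (lighter + 0.05) / (darker + 0.05);
}

// True when the pair meets the given minimum ratio, e.g. 3 for 3:1.
function meetsContrast(a: Rgb, b: Rgb, minimumRatio: number): boolean {
  return contrastRatio(a, b) >= minimumRatio;
}
```

White text on a black background, the default suggested for captions, yields the maximum ratio of 21:1.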
Screen reader accessibility:
- Each control exposes its name and role to screen readers, and its value where one is set.
Flashing content
Ensure flashing content:
- Does not flash more than 3 times per second.
- Is not larger than 21,824 square pixels.
- Does not have high contrast.
Assess flashing content using a tool such as the Photosensitive Epilepsy Analysis Tool (PEAT).
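The rate and size limits above can be expressed as simple checks. This is a rough sketch, not a substitute for a dedicated analysis tool: it assumes the times at which flashes occur (in seconds) and the dimensions of the flashing region have already been measured by other means, and the function names are hypothetical.

```typescript
// True if any one-second window contains more than maxPerSecond flashes.
// flashTimes holds the time of each flash in seconds, in any order.
function exceedsFlashRate(flashTimes: number[], maxPerSecond: number = 3): boolean {
  const times = [...flashTimes].sort((a, b) => a - b);
  let windowStart = 0;
  for (let i = 0; i < times.length; i++) {
    // Advance the window start until the window spans less than one second.
    while (times[i] - times[windowStart] >= 1) {
      windowStart++;
    }
    if (i - windowStart + 1 > maxPerSecond) {
      return true;
    }
  }
  return false;
}

// True if the flashing region is larger than the limit given in the checklist.
function exceedsFlashArea(widthPx: number, heightPx: number): boolean {
  const MAX_AREA_SQUARE_PIXELS = 21824;
  return widthPx * heightPx > MAX_AREA_SQUARE_PIXELS;
}
```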
Animation and motion
- Allow users to turn off motion animations.
- Avoid using unnecessary animations.
Pause, stop or hide
- For any moving, blinking or scrolling information that starts automatically, lasts more than five seconds, and is presented in parallel with other content, provide the user a way to pause, stop or hide it.
- For auto-updating information, provide a way for the user to pause, stop or hide the content. Or, provide a way for the user to control the frequency of the update.
- A keyboard accessible “pause button” or other mechanisms can be used to pause the content.
- Avoid unnecessary moving, blinking, scrolling or auto-updating content.
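For auto-updating information, the pause control and the frequency control can share one small wrapper around a timer. The following is a minimal sketch using the standard setInterval and clearInterval timer functions. The PausableUpdater class and its method names are hypothetical, and wiring it to a keyboard accessible button is left to the page.

```typescript
// Runs an update callback on a timer that the user can pause, resume,
// or slow down.
class PausableUpdater {
  private timer: ReturnType<typeof setInterval> | null = null;
  private update: () => void;
  private intervalMs: number;

  constructor(update: () => void, intervalMs: number) {
    this.update = update;
    this.intervalMs = intervalMs;
  }

  // Whether updates are currently scheduled.
  get running(): boolean {
    return this.timer !== null;
  }

  // Starts (or resumes) updates; does nothing if already running.
  start(): void {
    if (this.timer === null) {
      this.timer = setInterval(this.update, this.intervalMs);
    }
  }

  // Pauses updates; does nothing if already paused.
  pause(): void {
    if (this.timer !== null) {
      clearInterval(this.timer);
      this.timer = null;
    }
  }

  // Changes how often updates happen, restarting the timer if it is running.
  setFrequency(intervalMs: number): void {
    this.intervalMs = intervalMs;
    if (this.running) {
      this.pause();
      this.start();
    }
  }
}
```

A pause button would call pause() and start(), while a frequency setting would call setFrequency() with the interval the user chooses.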
Audio control
If audio plays automatically on page load for more than 3 seconds, enable the user to:
- pause or stop the audio, or
- control the volume independent of the system volume.
Alternatively, play sounds only on user request.