Why do I need to include alternatives for “time-based media” (multimedia)?
If a website includes a multimedia presentation like a movie or video, some people will be unable to perceive (or will have difficulty perceiving) the visuals and/or the audio. By providing alternatives, web content authors improve accessibility for a wide range of individuals with and without disabilities:
- Individuals who have trouble understanding the visual track include people with visual impairments, certain cognitive disabilities, language disabilities, and learning disabilities. Text alternatives make the multimedia presentation accessible to these people.
- Individuals who have trouble understanding the audio track include people who are hard of hearing or deaf or who have certain cognitive, language, or learning disabilities. Text alternatives and captioning are the two most common ways to enhance accessibility for people who have trouble processing the audio portion of a multimedia presentation.
- People who are deaf and blind may have trouble understanding (or be unable to understand) the visual track, the audio track, or both. A special type of text alternative, the alternative for time-based media, can make multimedia accessible to people who are deaf and blind.
- Second-language learners benefit from text alternatives, especially learners who understand a second language better when they read it than when they listen to it.
- Some people are visual learners — they are “wired” to understand best by looking rather than by hearing or doing. Others are auditory learners — they learn best by listening rather than seeing or doing. Text alternatives help them all get the most from a multimedia presentation.
- Text alternatives and captions are convenient. For example, a captioned video can be understood in a noisy environment (like a bar) or in a quiet environment (like a bedroom where one person is sleeping while the other is watching a movie).
- Text alternatives make multimedia searchable. When search engines “crawl” websites to index their content, they cannot “read” movies and videos. Search engines only understand text (though image recognition technology is currently emerging).
Level A Success Criteria for Time-Based Media
1.2.1 Audio-Only and Video-Only (Prerecorded) Explained
When audio-only content is presented, people who are deaf or hard of hearing may not be able to access it. When video-only content (video without audio) is presented, people who are blind may not be able to access it. In both cases an equivalent alternative is required, and a text equivalent can serve this purpose by reproducing the content of the audio track as text or by describing in text what is going on in the video.
1.2.2 Captions (Prerecorded) Explained
Any video that contains meaningful spoken dialogue requires closed captions in order to make the audio portion of the video accessible to people who are deaf or hard of hearing. Captions should identify who is speaking and describe other meaningful audio elements that are relevant to comprehending the video.
Closed vs. Open Captions
Closed captions should be used instead of open captions. Open captions are burned right into the video, cannot be hidden, and are not typically editable. In contrast, closed captions are contained in a text file and are presented over a video. They can be hidden if they are not needed, and they are easily edited in a simple text editor.
Though captions are a Level A requirement in WCAG, they are not required when multimedia is being used as an alternative for text and is clearly identified as such. For example, earlier on in the book you saw an ordered list used to describe the steps required to install ChromeVox. Some people may prefer to see it done rather than read instructions. The video that was included with the instructions is an example of a media alternative for text. That video was captioned, but it did not need to be.
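To make the idea of "captions in a text file" concrete, here is a minimal sketch in Python that writes out captions in WebVTT, one common closed caption format for web video. The `Cue` class, the function names, and the sample dialogue are made up for illustration; only the WebVTT layout (a `WEBVTT` header, timing lines, and `<v Name>` voice spans to identify the speaker) comes from the format itself.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    """One caption cue. This class is hypothetical, for illustration only."""
    start: float       # seconds from the start of the video
    end: float         # seconds from the start of the video
    text: str          # dialogue, or a bracketed description of a sound
    speaker: str = ""  # optional; identifies who is speaking


def format_timestamp(seconds: float) -> str:
    """Format seconds as a WebVTT timestamp (hh:mm:ss.ttt)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def build_webvtt(cues: list[Cue]) -> str:
    """Return the contents of a WebVTT caption file for the given cues."""
    blocks = ["WEBVTT"]
    for cue in cues:
        timing = f"{format_timestamp(cue.start)} --> {format_timestamp(cue.end)}"
        # A WebVTT voice span (<v Name>) marks who is speaking.
        text = f"<v {cue.speaker}>{cue.text}" if cue.speaker else cue.text
        blocks.append(f"{timing}\n{text}")
    # Blocks in a WebVTT file are separated by blank lines.
    return "\n\n".join(blocks) + "\n"


# Sample cues (invented): one line of dialogue and one meaningful sound.
cues = [
    Cue(0.0, 2.5, "Welcome to the installation walkthrough.", speaker="Narrator"),
    Cue(2.5, 4.0, "[keyboard clicking]"),
]
vtt_text = build_webvtt(cues)
print(vtt_text)
```

Because the result is plain text, a typo in a caption can be corrected in any text editor without touching the video itself, which is not possible with open captions.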
Depending on the quality of the dialogue in a video, errors are likely to occur when captions are automatically generated. However, auto-generated captions can be used as an initial set of captions, which a human being can edit to improve accuracy, provided the error rate is less than about 25%. With error rates higher than that, one would be better off captioning manually from scratch.
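One way to put a number on the "error rate" is word error rate: the number of word insertions, deletions, and substitutions needed to turn the automatic captions into a correct transcript, divided by the number of words in the correct transcript. The text above does not say which measure is meant, so treat this choice as an assumption. The sketch below, with hypothetical function names, compares a short human-checked sample against the automatic captions for the same passage and applies the 25% rule of thumb.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")
    # Classic dynamic-programming edit distance, computed over words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


# Assumption: the "about 25%" guideline is applied as a hard cutoff.
EDIT_THRESHOLD = 0.25


def captioning_strategy(reference: str, hypothesis: str) -> str:
    """Suggest whether to edit the automatic captions or start over."""
    rate = word_error_rate(reference, hypothesis)
    return "edit auto captions" if rate < EDIT_THRESHOLD else "caption from scratch"


# Invented sample: one wrong word out of ten.
checked = "closed captions can be hidden if they are not needed"
automatic = "closed captions can be hidden if they are not kneaded"
print(word_error_rate(checked, automatic))
print(captioning_strategy(checked, automatic))
```

In practice, a person would transcribe only a minute or so of the video by hand to get the reference sample, then use the result to decide how to caption the rest.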
Automatically generated captions can be used as a temporary measure if, for example, a video release is time-sensitive and captions cannot be created in time for that release. Video producers should still have those videos captioned by human captioners as soon as it is feasible.
1.2.3 Audio Description or Media Alternative (Prerecorded) Explained
There are two approaches to this success criterion. Both describe elements of a video, such as actions, characters, scene changes, and on-screen text, that carry important information not spoken in the dialogue.
- Audio Description: A secondary audio track is added to a video. During breaks in dialogue of the video, this second audio track describes visual information not referred to in the dialogue.
- Media Alternative: Important visual information not spoken in the dialogue of the video is integrated into the captions, typically enclosed in square brackets and inserted during breaks in dialogue.
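The media alternative approach described above can be sketched as a small Python routine: find the breaks in dialogue, then place each visual description, wrapped in square brackets, into a break. This is only an illustration under stated assumptions. The function names, the one-second minimum break, the rule for choosing a break, and the sample cues are all made up, and cues are represented as simple `(start, end, text)` tuples in seconds.

```python
def find_gaps(dialogue, min_gap=1.0):
    """Return (start, end) pairs for breaks in dialogue of at least min_gap seconds.

    dialogue is a list of (start, end, text) tuples sorted by start time.
    """
    gaps = []
    for (_, prev_end, _), (next_start, _, _) in zip(dialogue, dialogue[1:]):
        if next_start - prev_end >= min_gap:
            gaps.append((prev_end, next_start))
    return gaps


def add_descriptions(dialogue, descriptions, min_gap=1.0):
    """Merge bracketed visual descriptions into the breaks between dialogue cues.

    descriptions is a list of (time, text) pairs, where time is the moment
    in the video that the description refers to.
    """
    cues = list(dialogue)
    gaps = find_gaps(dialogue, min_gap)
    for time, text in descriptions:
        # Assumed rule: use the first unused break that ends after the
        # moment being described.
        gap = next((g for g in gaps if g[1] > time), None)
        if gap is None:
            raise ValueError(f"no break in dialogue available for: {text}")
        cues.append((gap[0], gap[1], f"[{text}]"))
        gaps.remove(gap)  # one description per break in this sketch
    return sorted(cues)


# Invented sample cues for an installation video.
dialogue = [
    (0.0, 3.0, "Open the Chrome Web Store."),
    (6.0, 9.0, "Then select Add to Chrome."),
]
descriptions = [(3.5, "Cursor moves to the search box")]

result = add_descriptions(dialogue, descriptions)
for start, end, text in result:
    print(f"{start:>4} - {end:>4}  {text}")
```

A human still has to write the descriptions and check that each one fits comfortably in its break; the code only handles the placement.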