Accessible Streaming Content

Streaming media doesn't have to be inaccessible. Follow these tips and open your site to the largest possible audience.


July 02, 2002

Few content providers intentionally ignore the 12 million blind or visually impaired and the 24 million deaf or hard-of-hearing individuals in the United States. However, developers' lack of knowledge, combined with the technical obstacles of creating and delivering accessible streaming content, commonly results in disappointing experiences for disabled users.

Accessibility of streaming content for people with disabilities is often not part of the spec for multimedia projects, but it certainly affects your quality of service. Most of the resources available on Web accessibility deal with HTML. Fortunately, rich media and streaming content developers have a growing number of experts to turn to for information and assistance.

The essentials of providing accessible streaming content are simple: Blind and visually impaired people need audio to discern important visual detail and interface elements, while deaf and hard-of-hearing people need text to access sound effects and dialog. Actually implementing these principles is quite a challenge, though.

Now, thanks to a relatively new U.S. law known as Section 508, dealing with accessibility issues is becoming an essential part of publishing on the Web.

Motivating Factors

The Section 508 amendments to the United States Rehabilitation Act of 1973 took effect in June 2001. The legislation outlines requirements for including equivalent alternatives to the non-text components of multimedia presentations (such as captions and audio descriptions), and it requires authors to enable content display on accessible media players. These guidelines apply to the Federal government and put pressure both on developers who create and provide media and on software vendors that supply media players.

The W3C's own Web Content Accessibility and User Agent Accessibility guidelines make similar recommendations. Several countries, such as Canada, are adopting W3C guidelines as official legal requirements.

Making Content Accessible

Captions are a standard way to tackle accessibility issues for video or audio clips. They aren't the same as subtitles, which translate only a program's dialog or narration into a different language. Captions are designed specifically for people with hearing disabilities: They display dialog, identify speakers, and note on- and off-screen sound effects, music, and laughter. Subtitles, designed for a hearing audience, don't contain sound-effect cues or the other useful information that's normally found in captions.

Few sites that offer streaming media provide captions or subtitles. There are notable exceptions, such as many of the video offerings of the NOVA Web site (www.pbs.org/wgbh/nova/), presidential speeches at the White House Web site (www.whitehouse.gov), and a selection of videos at movieflix.com.

Audio descriptions and a navigable interface make video accessible to blind or visually impaired people. Audio descriptions provide descriptive narration of key visual elements, including actions, gestures, and scene changes. For example, a joke in a comedy sketch that turns on physical expressions or movement gives a blind viewer no audible context for understanding the humor. And if a user must interact with interface elements in the presentation, there may be significant access barriers.

Audio descriptions are even less commonly used than captions. Described videos are so rare that if a new video with audio descriptions were published to the Web today, it would likely be discussed on accessibility listservs.

Many blind or visually impaired users employ screen-reading software to access content, including all interface elements in streamed presentations. Screen readers use a text-to-speech engine that voices content. Multimedia that's fully accessible to screen readers must meet two requirements: First, a streamed presentation should provide text for each interface element; second, the screen reader must have access to that text. While most media players can interact with screen readers to allow basic control of the player, rarely are both requirements met.

Currently, only version 6 of the Macromedia Flash player has the built-in ability to make information available to screen readers. Although improvements to the player make even legacy Flash content somewhat accessible, not all types of Flash content can be made accessible.

Captions

You can incorporate caption display for video in two ways: The captions can be encoded as part of the video image, or you can supply a separate text track in a format the player understands, and the player then displays the captions as text.

Encoding captions as part of the video may be easier in some cases, such as when the video is already captioned for television broadcast. However, the text quality suffers during compression, the captions don't scale well, and you'll need to encode a separate version without captions unless captions are desirable for all viewers. When delivering audio-only clips instead of video, a text track is the only way to provide captions.

Creating a text track will help you appreciate its benefits. Text tracks display clearly whether the presentation is small or full screen, the captions can be closed (displayed only when the user selects that option), and they don't require a separate version of the video for users who don't want captions. If a provider wants to offer multiple caption and subtitle tracks, the advantages of text tracks far outweigh the convenience of encoded captions.

Caption Tools And Strategies

A text file for captions contains, at a minimum, the text and time codes for when the text should appear. The most important challenge in writing a caption file is accurately entering time codes, because synchronizing text with visual content is often crucial for users to comprehend the content. QuickTime, Real, and Windows Media each employ a proprietary text format for their own players. So, if you're offering media in more than one format, you must create multiple caption files.
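For example, QuickTime's text format (often called QTtext) pairs bracketed time codes with lines of caption text. The fragment below is a minimal sketch with illustrative descriptor values and times, not output from any particular tool:

{QTtext}{font:Arial}{size:12}{textColor:65535,65535,65535}
{backColor:0,0,0}{width:320}{timeScale:30}{timeStamps:absolute}
[00:00:05.00]
[narrator] The sun rises over the bay.
[00:00:09.15]
(birds calling in the distance)
[00:00:13.00]

Each time code marks when the text that follows it appears, and a trailing time code with no text clears the last caption. The Real and Windows Media formats express the same information with different syntax.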

A small selection of tools is available to help create the files:

CCaption from Leapfrog Productions is a captioning application for video and DVDs, but it also lets you create text files for QuickTime and Real media. Visit www.ccaption.com for more information and pricing.

Macaw is a free Macintosh application that creates QuickTime text tracks only. Visit www.whitanderson.com/macaw/ for more information about Macaw.

MAGpie, for which I'm a developer, is from the CPB/WGBH National Center for Accessible Media. It's a free tool that helps you write and synchronize captions. For more information about MAGpie, visit ncam.wgbh.org/webaccess/magpie/.

Other tools, such as LiveStage Professional from TotallyHip Software, can facilitate text track creation, but any text editor will do. The disadvantage of the text-editor approach is that your team must obtain the exact time code for each caption by hand, which is time consuming.

Non-professionals can caption media using the tools mentioned above. However, unless your company is willing to provide adequate resources (time, support, and training opportunities) for the person doing the captioning, seeking professional services may be the best route. For large volumes of captioning and for real-time captioning, consider including professional services in the project budget.

Each player has its own mechanism for delivering captions with the main media. QuickTime lets you add the track to the movie using authoring tools or QuickTime SMIL. In both cases, the provider controls the placement of the track in the movie. If the project calls for closed captions, the provider must include an additional interface element that lets users turn them on and off.
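If you take the SMIL route with QuickTime, the presentation might look something like the following sketch, which assumes a movie named mymovie.mov and a caption file named captions.txt, and stacks the text region below the video:

<smil>
  <head>
    <layout>
      <root-layout width="320" height="300"/>
      <region id="videoregion" top="0" left="0" width="320" height="240"/>
      <region id="textregion" top="240" left="0" width="320" height="60"/>
    </layout>
  </head>
  <body>
    <par>
      <video src="mymovie.mov" region="videoregion"/>
      <textstream src="captions.txt" region="textregion"/>
    </par>
  </body>
</smil>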

Example 1
<asx version="3.0">
  <title>SAMI Captions Demo</title>
  <entry>
    <ref href="mms://mydomain.com/mymovie.asf?sami=http://mydomain.com/mymovie.sami"/>
  </entry>
</asx>
Making a connection between the SAMI file and the main media file for Windows Media Player.

Windows Media Player uses a different approach. When delivering captioned Windows Media, the author specifies that a SAMI file should accompany the main media, and the text will be displayed in the caption area of the player window. An ASX file configured like Example 1 will establish the connection between the SAMI file and the main media file for the Windows Media Player.
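The SAMI file itself is an HTML-like document in which time-stamped <SYNC> blocks carry the caption text; Start values are in milliseconds. The fragment below is a minimal sketch, with an illustrative style class and times:

<SAMI>
  <HEAD>
    <STYLE TYPE="text/css"><!--
      P {font-family: Arial; color: white; background-color: black;}
      .ENUSCC {Name: "English Captions"; lang: en-US; SAMIType: CC;}
    --></STYLE>
  </HEAD>
  <BODY>
    <SYNC Start=5000><P Class=ENUSCC>[narrator] The sun rises over the bay.</P></SYNC>
    <SYNC Start=9500><P Class=ENUSCC>(birds calling in the distance)</P></SYNC>
  </BODY>
</SAMI>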

RealPlayer uses SMIL to include captions with a presentation. SMIL provides information about layout, timing, and display of different media types for the Real player to interpret. In addition, SMIL provides test attributes that let the player act on the viewer's preferences for text display and language choice. For instance, the SMIL code in Example 2 causes the RealPlayer to omit captions if the player preferences don't request them, to display German captions if the player requests captions and German-language content, and to display English captions if captions are requested but the language preference isn't German.

A significant limitation of text captions is that display depends on the fonts installed on the user's system. RealText sidesteps this problem by allowing only a limited set of common fonts. QuickTime text tracks and SAMI impose no such restriction, which occasionally results in unexpected font substitutions. One way to resolve the issue is to use Scalable Vector Graphics (SVG) for caption display.

Example 2
<switch>
  <textstream src="capsde.rt" 
   region="textregion" systemLanguage="de" 
   systemCaptions="on"/>
  <textstream src="capsen.rt" 
   region="textregion" systemCaptions="on"/>
</switch>
This SMIL code forces the RealPlayer to omit captions if player preferences don't ask for them.
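The RealText (.rt) files that Example 2 references are themselves simple timed-text documents. A minimal sketch of what capsen.rt might contain, with illustrative window dimensions and times:

<window type="generic" duration="0:01:46" width="320" height="60" bgcolor="black">
  <font color="white">
    <time begin="0:00:05"/><clear/>[narrator] The sun rises over the bay.
    <time begin="0:00:09.5"/><clear/>(birds calling in the distance)
  </font>
</window>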

As in Flash, fonts can be embedded in the SVG file and rendered in the SVG viewer. SVG captions can be created by transforming a MAGpie 2.01 XML file into SVG, but the only major player with SVG support is the RealPlayer for Windows. Until a more universal solution is available, exercise care when using all but the most common fonts.
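The general idea is that each caption becomes an SVG text element that SMIL animation reveals and hides at the right moments. The fragment below is a rough, hypothetical sketch of that technique, not actual MAGpie output:

<svg xmlns="http://www.w3.org/2000/svg" width="320" height="60">
  <text x="10" y="40" font-family="Arial" font-size="14"
   fill="white" visibility="hidden">
    [narrator] The sun rises over the bay.
    <set attributeName="visibility" to="visible" begin="5s" dur="4.5s"/>
  </text>
</svg>

An embedded font (omitted here for brevity) would make the rendering independent of what's installed on the viewer's system.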

Additional Benefits of Captions

Captions can benefit more than just those who are deaf or hard of hearing. People working in noisy environments, for example, also need text. From a technical perspective, caption data can be used to index and search a video collection: Each caption has a time code that indicates when specific words were spoken, so searching video is only as difficult as building or configuring a search tool that responds to a query with a link to the appropriate point in the timeline. Caption data in XML can also be repurposed to give a text version of the audio content to users with small devices or low bandwidth. There are educational benefits as well: Studies show that students who are learning to read and those who are learning a new language benefit from reading text on the screen.
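For instance, a search tool could respond to a query by generating a SMIL file that begins playback at the matching caption's time code. In SMIL for the RealPlayer, the clip-begin attribute provides the hook; this hypothetical sketch assumes a match at 55 seconds into mymovie.rm:

<par>
  <video src="mymovie.rm" region="videoregion" clip-begin="npt=55.0s"/>
</par>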

Audio Descriptions

Audio descriptions are much easier to implement than they are to create. Descriptions are carefully written to narrate key visual elements, then recorded and inserted during pauses in the dialog. They are more challenging to create than captions: While captions are written based on sounds and speech, audio descriptions are an interpretation of visual events, and they must be worded so that the recorded version fits into the available pauses. Descriptions must also be objective, providing enough information that users can form their own interpretations. Professional description services are available and should be considered as a budget item for large, real-time, and high-visibility projects.

Audio Description Tools And Strategies

Apart from hiring professional services, few options exist for creating audio descriptions. Because audio descriptions fit into pauses in dialog, it's crucial to know when to start the description and how long it can last. You can quickly record audio descriptions at a workstation using basic sound-editing software, but results improve significantly with a more sophisticated setup, time, and training.

Example 3
<par dur="0:01:46.27">
   <video dur="0:01:46.27" 
    region="videoregion" src="mymovie.mov"/>
   <audio begin="0:00:14.30" src="ad1.wav" 
    systemAudioDesc="on"/>
   <audio begin="0:00:28.16" src="ad2.wav" 
    systemAudioDesc="on"/>
	.
	.
</par>
This SMIL file contains references to the audio description sound files.

MAGpie 2.01 can record and synchronize audio descriptions. When exporting a described presentation for RealPlayer and QuickTime, MAGpie creates a SMIL file that contains references to the audio description sound files. See Example 3 for sample code. The systemAudioDesc test attribute is used to determine whether the player is set to play an audio description. (QuickTime ignores this test attribute.)

Some video clips have no pauses in the dialog or breaks in the action where descriptions might be placed. A new feature in SMIL 2.0 lets you add audio descriptions to a video clip where pauses are nonexistent or insufficient for adequate description. The exclusive tag (<excl>) informs the player that the main media (and other tracks, if applicable) should be paused until the interrupting element has finished playing. This has the effect of lengthening the overall duration of the presentation, but provides a possible avenue for adding description when it is most needed. Example 4 shows sample code for extended audio description, where the video is temporarily paused while ad1.rm and then ad2.rm play.

You can add audio descriptions to a presentation in the RealPlayer using SMIL, as shown in Example 4. QuickTime lets you add audio descriptions in the same way as caption files, by adding the description files to the movie with QuickTime's authoring features or by creating a SMIL file. However, QuickTime doesn't yet support the <excl> tag, so extended audio description in QuickTime must be implemented using the QuickTime authoring features. Windows Media has no official support for audio descriptions, but you can add them by mixing the description audio with the video and program audio while encoding to a Windows Media file type.

Example 4
<excl>
   <priorityClass peers="pause">
     <video src="movie.rm" region="video" 
      title="video"/>
     <audio src="ad1.rm" begin="12.85s" 
      systemAudioDesc="on"/>
     <audio src="ad2.rm" begin="33.71s" 
      systemAudioDesc="on"/>
   </priorityClass>
</excl>
Sample code for extended audio description. The video pauses temporarily while a description catches up.

When delivering high-bandwidth media for blind and visually impaired users, providers can offer options that save bandwidth on both ends. Users who are blind have no use for the video, only the program audio and audio descriptions. Users who are visually impaired may be able to use a screen magnifier to view some aspects of the video, but they still benefit from audio descriptions. Sighted users have no need for the audio descriptions. To deliver the smallest amount of necessary material, your presentation can offer separate links representing these three possibilities. When using a RealServer, the systemAudioDesc test attribute controls audio description streaming, but a separate link to a "no video" version is still necessary.
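The "no video" version can be as simple as a second SMIL file that omits the <video> element. This sketch assumes the program audio has been encoded separately as myaudio.rm and reuses the description clips from Example 3:

<par dur="0:01:46.27">
   <audio src="myaudio.rm"/>
   <audio begin="0:00:14.30" src="ad1.wav"/>
   <audio begin="0:00:28.16" src="ad2.wav"/>
</par>

Here the systemAudioDesc test attribute is omitted so that the descriptions always play for anyone who chooses this link.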

Additional Benefits of Audio Descriptions

As with captions, audio descriptions have additional benefits that increase their appeal. Because descriptions usually exist as text before being recorded, you can use the data for indexing video. Caption and description data can be held in XML documents and repurposed to create collated caption-and-description text files as an alternative for people with low-bandwidth connections or those viewing content on a small device like a cell phone. This approach also helps people who are both deaf and blind and must convert all information into Braille. Audio descriptions offer benefits for users who aren't blind, too: Descriptions can serve as an additional mode of accessing information, helping people with cognitive and learning disabilities reinforce concepts introduced in video. They can also be used in settings where only audio is available.

What's Next?

Support for audio description and extended audio description isn't as good as it should be in many players. Screen readers must have better access to player content before people who are blind or visually impaired will be able to fully access streaming content. Support for captions is adequate for basic text display, but support for captions that overlap video and for sophisticated text, such as mathematical equations, needs work.

A W3C task force recently began to examine the need for a caption format that all players can support. Improvements in speech-to-text technology are also on the wish list, with automated captioning as the ultimate goal. Speech-to-text conversion tools will undoubtedly become commonplace before fully automated captioning arrives; in the meantime, they'll assist caption writers and help streamline the work process.

Captions and audio descriptions are workable solutions for providing access to streaming content. While governmental content providers must add these enhancements, there is no such legislative mandate for the private sector yet. Even without a legal requirement, though, streaming companies can benefit from providing accessible content: Accessible sites potentially engage millions of new users, and additional textual information adds value to services, which certainly never hurts the bottom line.


Andrew ([email protected]) manages the Access to Rich Media project at the CPB/WGBH National Center for Accessible Media.
