Accessible Streaming Content

Few content providers intentionally ignore the 12 million blind or visually impaired and the 24 million deaf or hard-of-hearing individuals in the United States. However, developers' lack of knowledge, combined with the technical obstacles of creating and delivering accessible streaming content, commonly results in disappointing experiences for disabled users.

Accessibility of streaming content for people with disabilities is often not part of the spec for multimedia projects, but it certainly affects your quality of service. Most of the resources available on Web accessibility deal with HTML. Fortunately, rich media and streaming content developers have a growing number of experts to turn to for information and assistance.

The essentials of providing accessible streaming content are simple: Blind and visually impaired people need audio to discern important visual detail and interface elements, while deaf and hard-of-hearing people need text to access sound effects and dialog. Actually implementing these principles is quite a challenge, though.

Now, because of a relatively new U.S. law known as Section 508, dealing with accessibility issues is becoming an essential part of publishing on the Web.

Motivating Factors

The Section 508 amendments to the United States Rehabilitation Act of 1973 took effect in June 2001. The amendments outline requirements for including equivalent alternatives to non-text components of multimedia presentations (such as captions and audio descriptions). They also require authors to enable content display on accessible media players. These requirements apply to the Federal government and put pressure both on developers who create and provide media and on software vendors that provide media players.

The W3C's own Web Content Accessibility and User Agent Accessibility guidelines make similar recommendations. Many countries, such as Canada, are adopting several W3C guidelines as official legal guidelines.

Making Content Accessible

Captions are a standard way to tackle accessibility issues for video or audio clips. They aren't the same as subtitles, which only translate a program's dialog or narration into a different language. Captions are designed specifically for people with hearing disabilities. They display dialog; identify speakers; and note on- and off-screen sound effects, music, and laughter. Subtitles, designed for a hearing audience, don't contain the sound-effect cues or other useful information normally found in captions.

Few sites that offer streaming media provide captions or subtitles. There are notable exceptions, such as many of the video offerings of the NOVA Web site (, presidential speeches at the White House Web site (, and a selection of videos at

Audio descriptions and a navigable interface make video accessible to blind or visually impaired people. Audio descriptions provide descriptive narration of key visual elements, including actions, gestures, and scene changes. For example, when a joke in a comedy sketch depends on physical expressions or movement, the soundtrack alone won't give a blind viewer enough context to understand the humor. If a user must interact with interface elements in the presentation, there may be significant access barriers.

Audio descriptions are even less commonly used than captions. Described videos are so rare that if a new video with audio descriptions were published to the Web today, it would likely be discussed on accessibility listservs.

Many blind or visually impaired users employ screen-reading software to access content, including all interface elements in streamed presentations. Screen readers use a text-to-speech engine that voices content. Multimedia that's fully accessible to screen readers must meet two requirements: First, a streamed presentation should provide text for each interface element; second, the screen reader must have access to that text. While most media players can interact with screen readers to allow basic control of the player, rarely are both requirements met.

Currently, only version 6 of the Macromedia Flash player has the built-in ability to make information available to screen readers. Although improvements to the player make even legacy Flash content somewhat accessible, not all types of Flash content can be made accessible.


You can incorporate caption display for video in two ways. You can encode the captions as part of the video, or you can supply a separate text track in a format the player understands, which the player then displays as text.

Encoding captions as part of the video may be easier in some cases, such as when the video is already captioned for television broadcast. However, the text quality suffers during compression, the captions don't scale well, and you'll need to encode a separate version without captions unless they are desirable for all viewers. When delivering audio-only clips instead of video, you can only add captions by adding a text track.

Creating a text track will help you appreciate its benefits. Text tracks display clearly whether the presentation is small or full screen, the captions can be closed (displayed only when the user selects that option), and they don't require a separate version of the video for users who don't want captions. If a provider wants to offer multiple caption and subtitle tracks, the advantages of text tracks far outweigh the convenience of encoded captions.

Caption Tools And Strategies

A text file for captions contains, at a minimum, the text and time codes for when the text should appear. The most important challenge in writing a caption file is accurately entering time codes, because synchronizing text with visual content is often crucial for users to comprehend the content. QuickTime, Real, and Windows Media each employ a proprietary text format for their own players. So, if you're offering media in more than one format, you must create multiple caption files.
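To make the "text plus time codes" idea concrete, here is a minimal Python sketch that keeps one list of captions and writes it out in two of the formats mentioned above. It is an illustration under stated assumptions, not a complete exporter: the `Caption` class, the function names, and the `ENUSCC` style class are hypothetical, and the SAMI and RealText output is deliberately stripped down (no layout, fonts, or window sizes), so check each vendor's format documentation before relying on it.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Caption:
    start_ms: int  # when the caption should appear, in milliseconds
    text: str      # what the viewer reads


def to_sami(captions, css_class="ENUSCC"):
    """Render captions as a bare-bones SAMI document (Windows Media)."""
    lines = [
        "<SAMI>",
        "<HEAD>",
        '<STYLE TYPE="text/css"><!--',
        f".{css_class} {{ Name: English; lang: en-US; SAMIType: CC; }}",
        "--></STYLE>",
        "</HEAD>",
        "<BODY>",
    ]
    for cap in captions:
        # SAMI expresses each time code as a millisecond offset.
        text = escape(cap.text, quote=False)
        lines.append(f"<SYNC Start={cap.start_ms}><P Class={css_class}>{text}")
    lines += ["</BODY>", "</SAMI>"]
    return "\n".join(lines)


def clock_time(ms):
    """Format milliseconds as hh:mm:ss.xyz."""
    seconds, millis = divmod(ms, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


def to_realtext(captions):
    """Render the same captions as a bare-bones RealText file."""
    lines = ['<window type="generic">']
    for cap in captions:
        # RealText expresses each time code as a clock value on a <time/> tag.
        text = escape(cap.text, quote=False)
        lines.append(f'<time begin="{clock_time(cap.start_ms)}"/><clear/>{text}')
    lines.append("</window>")
    return "\n".join(lines)


captions = [
    Caption(1500, "[door slams]"),
    Caption(4000, "ANNA: Who's there?"),
]
print(to_sami(captions))
print(to_realtext(captions))
```

Keeping the captions in one neutral structure and generating each player's file from it means the time codes are entered only once, even when the media is offered in several formats.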

A small selection of tools is available to help create the files:

CCaption from Leapfrog Productions is a captioning application for video and DVDs, but it also lets you create text files for QuickTime and Real media. Visit for more information and pricing.

Macaw is a free Macintosh application that creates QuickTime text tracks only. Visit for more information about Macaw.

MAGpie, for which I'm a developer, is from the CPB/WGBH National Center for Accessible Media. It's a free tool that helps you write and synchronize captions. For more information about MAGpie, visit

Other tools, such as LiveStage Professional from TotallyHip Software, can facilitate text track creation, and any text editor can be used to write a text track by hand. The disadvantage of the text-editor approach is that your team will have to obtain the exact time code for each caption, which is time-consuming.
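When time codes are typed by hand, a small checking script can catch the most common slips before the file reaches a player. The following Python sketch assumes captions are kept as (time code, text) pairs with time codes written as hh:mm:ss or hh:mm:ss.fff; that notation and the function names are assumptions for illustration, not the syntax of any particular player format.

```python
import re

# hh:mm:ss with an optional fraction of up to three digits.
TIME_CODE = re.compile(r"^(\d{1,2}):([0-5]\d):([0-5]\d)(?:\.(\d{1,3}))?$")


def parse_time_code(code):
    """Convert 'hh:mm:ss' or 'hh:mm:ss.fff' to milliseconds."""
    match = TIME_CODE.match(code.strip())
    if match is None:
        raise ValueError(f"malformed time code: {code!r}")
    hours, minutes, seconds, fraction = match.groups()
    # Pad the fraction on the right so '.5' means 500 ms, not 5 ms.
    millis = int((fraction or "0").ljust(3, "0"))
    return ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + millis


def find_problems(entries):
    """Return a list of problems found in (time_code, text) pairs."""
    problems = []
    previous = -1
    for number, (code, _text) in enumerate(entries, start=1):
        try:
            current = parse_time_code(code)
        except ValueError as error:
            problems.append(f"caption {number}: {error}")
            continue
        if current <= previous:
            problems.append(
                f"caption {number}: {code} does not come after the previous caption"
            )
        previous = current
    return problems


entries = [
    ("00:00:01.5", "[door slams]"),
    ("00:00:00.5", "ANNA: Who's there?"),   # out of order
    ("0:61:00", "MARK: It's only the door."),  # 61 minutes is not valid
]
for problem in find_problems(entries):
    print(problem)
```

A check like this does not replace watching the captioned clip, but it removes one class of errors from the manual workflow.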

Non-professionals can caption media using the tools mentioned above. However, unless your company is willing to provide adequate resources (time, support, and training opportunities) for the person doing the captioning, seeking professional services may be the best route. For large volumes of captioning and for real-time captioning, consider including professional services in the project budget.

Each player has its own mechanism for delivering captions with the main media. QuickTime lets authors add the text track to the movie using authoring tools or QuickTime SMIL. In both cases, the provider controls the placement of the track in the movie. If the project calls for closed captions, the provider must include an additional interface element that lets viewers turn them on and off.

Example 1
<asx version="3.0">
  <title>SAMI Captions Demo</title>
    <ref href=
Making a connection between the SAMI file and the main media file for Windows Media Player.

Windows Media Player uses a different approach. When delivering captioned Windows Media, the author specifies that a SAMI file should accompany the main media, and the text will be displayed in the caption area of the player window. An ASX file configured like Example 1 will establish the connection between the SAMI file and the main media file for the Windows Media Player.
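As a rough illustration of that connection, the Python sketch below generates a small ASX file in which the caption file is passed as a `sami` parameter on the media URL, which is the usual way Windows Media metafiles point at a SAMI file. The function name and both example.com URLs are hypothetical placeholders; they are not the addresses used in Example 1.

```python
import xml.etree.ElementTree as ET


def build_asx(title, media_url, sami_url):
    """Build an ASX metafile that pairs a media file with a SAMI caption file."""
    asx = ET.Element("asx", version="3.0")
    ET.SubElement(asx, "title").text = title
    entry = ET.SubElement(asx, "entry")
    # The caption file rides along as a "sami" parameter on the media URL.
    ET.SubElement(entry, "ref", href=f"{media_url}?sami={sami_url}")
    return ET.tostring(asx, encoding="unicode")


# Hypothetical URLs, for illustration only.
print(build_asx(
    "SAMI Captions Demo",
    "http://example.com/demo.wmv",
    "http://example.com/demo.smi",
))
```

Generating the metafile from a script is mainly useful when many clips need the same treatment; for a single clip, writing the ASX by hand is just as reasonable.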

RealPlayer uses SMIL to include captions with a presentation. SMIL provides information about layout, timing, and display of different media types for RealPlayer to interpret. In addition, SMIL provides test attributes that check the viewer's player preferences for text display and language choice. For instance, the SMIL code in Example 2 will cause RealPlayer to omit captions if the player preferences don't request them. It will display German captions if the player requests German-language content and captions, and English captions if captions are requested but the language preference is not German.

A significant limitation of text captions is that they depend on the fonts installed on the user's system. RealText avoids this problem by allowing only a limited set of common fonts. QuickTime text tracks and SAMI have no such restriction, which occasionally results in unexpected font substitutions. To resolve this issue, Scalable Vector Graphics (SVG) can be used for caption display.

Example 2
<switch>
  <textstream src="capsde.rt"
   region="textregion" systemLanguage="de"
   systemCaptions="on"/>
  <textstream src="capsen.rt"
   region="textregion" systemCaptions="on"/>
</switch>
This SMIL code forces the RealPlayer to omit captions if player preferences don't ask for them.
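The selection behavior described for Example 2 can be modeled in a few lines of Python. This is a simplified sketch, assuming the two text streams are alternatives inside a SMIL switch element, so the player takes the first one whose test attributes all pass. The class and function names are hypothetical, and language matching is reduced to an exact comparison, which is narrower than what a real player does.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TextStream:
    src: str
    system_captions: Optional[str] = None  # "on", "off", or None if not tested
    system_language: Optional[str] = None  # e.g. "de", or None if not tested


@dataclass
class PlayerPrefs:
    captions_on: bool
    language: str


def passes(stream, prefs):
    """True if every test attribute on the stream matches the preferences."""
    if stream.system_captions is not None:
        if (stream.system_captions == "on") != prefs.captions_on:
            return False
    if stream.system_language is not None:
        # Simplification: exact match only.
        if stream.system_language != prefs.language:
            return False
    return True


def choose(streams, prefs):
    """Return the first stream whose tests pass, or None if none qualifies."""
    for stream in streams:
        if passes(stream, prefs):
            return stream
    return None


streams = [
    TextStream("capsde.rt", system_captions="on", system_language="de"),
    TextStream("capsen.rt", system_captions="on"),
]

for prefs in (PlayerPrefs(True, "de"), PlayerPrefs(True, "fr"), PlayerPrefs(False, "de")):
    chosen = choose(streams, prefs)
    print(prefs, "->", chosen.src if chosen else "no captions")
```

Order matters in this model: the German stream has to come first, because the English stream carries no language test and would otherwise match every viewer who asks for captions.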

As in Flash, fonts can be embedded in the SVG file and rendered in the SVG viewer. SVG captions can be created by transforming a MAGpie 2.01 XML file into SVG, but the only major player with SVG support is the RealPlayer for Windows. Until a more universal solution is available, exercise care when using all but the most common fonts.

Additional Benefits of Captions

Captions benefit more than just people who are deaf or hard of hearing. People working in noisy environments, for example, also need text.

From a technical perspective, caption data can be used to index and search a video collection. Each caption has a time code that indicates when specific words were spoken, so searching video is only as difficult as building or configuring a search tool that answers a query with a link to the appropriate point in the timeline. Caption data in XML can also be repurposed to give a text version of the audio content to users with small devices or low bandwidth.

There are educational benefits associated with captions and subtitles as well. Studies show that students who are learning to read and those who are learning a new language benefit from reading text on the screen.
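The indexing and repurposing ideas can be sketched in Python as follows. The example assumes captions have already been loaded as (start time in milliseconds, text) pairs; the data and function names are hypothetical, and a real search tool would add ranking, stemming, and a way to turn each hit into a link that opens the player at that offset.

```python
def format_time(ms):
    """Format a millisecond offset as hh:mm:ss for display."""
    seconds = ms // 1000
    return f"{seconds // 3600:02d}:{seconds % 3600 // 60:02d}:{seconds % 60:02d}"


def search_captions(captions, query):
    """Return the (start_ms, text) pairs whose text contains the query."""
    needle = query.lower()
    return [(start, text) for start, text in captions if needle in text.lower()]


def transcript(captions):
    """Join caption text into a plain-text version of the audio content."""
    return "\n".join(text for _, text in captions)


captions = [
    (1500, "[door slams]"),
    (4000, "ANNA: Who's there?"),
    (65000, "MARK: It's only the door."),
]

# Each hit carries the time code a player would need to jump to.
for start_ms, text in search_captions(captions, "door"):
    print(format_time(start_ms), text)

print(transcript(captions))
```

The same caption data serves both purposes: the time codes drive search results, and the text alone serves as a transcript for low-bandwidth or small-screen users.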
