Implementing Audio CAPTCHA

Roll Tape

When I created my audio CAPTCHA, my goals were to keep things simple by using only a small amount of code with no outside dependencies. I also wanted to keep the instructions and user interaction as simple as possible. A solution using PHP seemed best for this, with some JavaScript to handle browser independence and keyboard input.

The main file, index.htm (Listing One), starts by setting a PHP session with a call to session_start(). Sessions are used in PHP as a persistence mechanism, letting data be passed from one page to another. (For more information on PHP sessions, see

After setting the session, the PHP script loads an array with four random numbers, from 0 to 9. These numbers serve as the basis for the randomly generated audio and are used to verify the subsequent user entry. The four digits are then concatenated together and stored in the $_SESSION global array variable. Here I make use of PHP's associative array feature, naming this array index "captchaAnswer".


<?php
// Start the session so the answer persists to CaptchaSubmit.php
session_start();

// Get a random number from 0-9
function GetRand()
{
    return rand(0, 9);
}

// the 4 random numbers
$fileName = array();
for ($i = 0; $i < 4; $i++)
{
    $fileName[] = GetRand();
}

// Store the expected answer as a 4-digit string
$_SESSION['captchaAnswer'] = $fileName[0] . $fileName[1] .
    $fileName[2] . $fileName[3];
?>
<html>
<head>
<title>Audio CAPTCHA Test</title>
</head>
<BODY bgcolor="lightblue" onload="Init()">

<SCRIPT language="JavaScript">

// Init the window, by clearing any input from a former try
function Init()
{
    document.getElementById('userAnswer').value = "";
}

var onWindows = true;
var onExplorer = true;
if(navigator.platform.indexOf("Win") == -1)
    onWindows = false;
if(navigator.appName != "Microsoft Internet Explorer")
    onExplorer = false;

// grab user key press
document.onkeyup = KeyCheck;
function KeyCheck(e)
{
    /* for browser independence, IE uses window.event,
       Firefox passes event */
    var keyPressed = (window.event) ? event.keyCode : e.keyCode;
    var hotKey = 80; // p key
    if(keyPressed == hotKey)
    {
        document.getElementById('userAnswer').value = "";
        /* embed for IE, Media player won't show
        iframe for FireFox, embed doesn't play dynamic file */
        if(onWindows && onExplorer)
            // start the embedded player through its object model
            document.getElementById('mediaPlayer').controls.play();
        else
            document.getElementById('writeHere').innerHTML =
     '<iframe src="./PlaySound.php" width="0" height="0"></iframe>';
        // shift focus to the text box so the user can type the digits
        document.getElementById('userAnswer').focus();
    }
}

/* if using IE on Windows, use an embedded Windows Media Player */
if(onWindows && onExplorer)
{
    document.write('<OBJECT id="mediaPlayer" \
    CLASSID="clsid:6BF52A52-394A-11D3-B153-00C04F79FAA6" \
    TYPE="application/x-oleobject" width="0" height="0"> \
    <param name="url" value="./PlaySound.php" /> \
    <param name="autoStart" value="false" /> \
    </OBJECT>');
}
</SCRIPT>

<!-- the form for instructions and user entry -->
<form method="POST" action="CaptchaSubmit.php">
Press the P key to play the sound file.
<br>Then type the 4 numbers you hear and press Enter.
<br><input type="text" id="userAnswer"
    name="userAnswer" maxlength="4" size="4"/>
</form>

<!-- a place to write the frame information when not on IE/Windows-->
<div id="writeHere" style="visibility:hidden"></div>
</BODY>
</html>
Listing One

When the page is loaded, a JavaScript Init() function is called. This simply makes sure that the text box is free of any previous entry. The two checks that follow determine the user's browser and OS. If the user is running Internet Explorer under Windows, I want to use an embedded Windows Media Player to play the sound files.
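The browser and OS checks can be tried outside a browser with a small sketch. The helper name detectEnvironment and the sample navigator values are assumptions made for illustration; only the two comparisons come from the listing.

```javascript
// Hypothetical helper (not in the article): takes a navigator-like object
// so the checks from the listing can run outside a browser.
function detectEnvironment(nav) {
    var onWindows = nav.platform.indexOf("Win") !== -1;
    var onExplorer = nav.appName === "Microsoft Internet Explorer";
    return {
        onWindows: onWindows,
        onExplorer: onExplorer,
        // the embedded Windows Media Player is used only when both are true
        useMediaPlayer: onWindows && onExplorer
    };
}

// Sample navigator values, assumed for illustration only
var ieOnWindows = detectEnvironment({
    platform: "Win32",
    appName: "Microsoft Internet Explorer"
});
var firefoxOnMac = detectEnvironment({
    platform: "MacIntel",
    appName: "Netscape"
});
console.log(ieOnWindows.useMediaPlayer);  // true
console.log(firefoxOnMac.useMediaPlayer); // false
```

In a page, the same function would be called with the real navigator object.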

A JavaScript function KeyCheck() is called when the browser gets an onkeyup message. I use the key up rather than the key down because the key down can be generated more than once if the user holds a key down for a certain length of time.
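To see why the choice of event matters, here is a minimal sketch that counts how often a handler would fire for a held key. The event stream is made up for illustration, and countTriggers is a hypothetical helper, not part of the article's code.

```javascript
// Count how many times a handler bound to one event type would fire
// for a given sequence of keyboard events.
function countTriggers(events, eventType, hotKey) {
    var count = 0;
    for (var i = 0; i < events.length; i++) {
        if (events[i].type === eventType && events[i].keyCode === hotKey) {
            count++;
        }
    }
    return count;
}

// Assumed event stream for a user holding the P key:
// keydown auto-repeats, keyup arrives once on release.
var heldP = [
    { type: "keydown", keyCode: 80 },
    { type: "keydown", keyCode: 80 },
    { type: "keydown", keyCode: 80 },
    { type: "keyup", keyCode: 80 }
];
console.log(countTriggers(heldP, "keydown", 80)); // 3
console.log(countTriggers(heldP, "keyup", 80));   // 1
```

A handler on key down would restart the audio three times in this stream, while a handler on key up starts it once.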

KeyCheck() gets the key code of the key that the user pressed and released. It reads this code from either the window.event object that is built into IE or the event argument that Mozilla-based browsers pass to the handler.

I chose to use the P key to start the audio. There is no need to check for a lowercase P because the browser does "case folding": on a key up event, the key code identifies the key rather than the character typed, so both p and P report the code for uppercase P (80). I could have used almost any key here. I wanted to use the Spacebar, because that is what's used in many audio software packages for starting or stopping playback. But the Spacebar is already a browser hot key for page down, so I settled for P as in "Play."
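The key code lookup and the hot key test can be sketched as below. The function names getKeyCode and isHotKey, and the idea of passing the window object in as a parameter, are assumptions that let the code run outside a browser; the listing itself reads window.event directly.

```javascript
var HOT_KEY = 80; // key code for the P key, as in the listing

// Hypothetical variant of the listing's lookup: the window object is a
// parameter so both browser styles can be exercised without a browser.
function getKeyCode(e, win) {
    return (win && win.event) ? win.event.keyCode : e.keyCode;
}

function isHotKey(e, win) {
    return getKeyCode(e, win) === HOT_KEY;
}

// IE style: no argument is passed, the event lives on window.event
console.log(isHotKey(undefined, { event: { keyCode: 80 } })); // true
// Mozilla style: the event is passed in, window.event is absent
console.log(isHotKey({ keyCode: 80 }, {}));                   // true
// Spacebar has key code 32, so it is not the hot key
console.log(isHotKey({ keyCode: 32 }, {}));                   // false
// The key code for P matches the uppercase character code
console.log("P".charCodeAt(0) === HOT_KEY);                   // true
```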

When the P key up is detected, KeyCheck() clears any data that may be in the text box. Then it plays the audio file in one of two ways depending on the browser and the OS. Under IE/Windows, the embedded Windows Media Player is used. The Media Player has an object model that allows script control. Calling the Player's play method starts the Media Player.

If the user is not on IE/Windows, the script uses whatever the user has set as the default media player. When I tested this, I found that when I used the HTML embed tag, the dynamically created sound file did not play. Using the iframe tag instead, with a 0 size frame, plays the file.
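The two playback paths can be summarized in a short sketch. The helper choosePlayback and its return shape are hypothetical and exist only to make the decision testable; the iframe markup matches what the article describes.

```javascript
// Hypothetical helper: decide how to start playback for the current
// browser and OS, and build the markup the fallback path needs.
function choosePlayback(onWindows, onExplorer, soundUrl) {
    if (onWindows && onExplorer) {
        // IE on Windows: the embedded Windows Media Player is scripted,
        // so no extra markup is required
        return { method: "mediaPlayer", markup: "" };
    }
    // Everything else: load the sound in a zero-size iframe so the
    // user's default media player handles it
    return {
        method: "iframe",
        markup: '<iframe src="' + soundUrl +
                '" width="0" height="0"></iframe>'
    };
}

var fallback = choosePlayback(true, false, "./PlaySound.php");
console.log(fallback.method); // iframe
console.log(fallback.markup);
// <iframe src="./PlaySound.php" width="0" height="0"></iframe>
console.log(choosePlayback(true, true, "./PlaySound.php").method);
// mediaPlayer
```

In the page, the markup string would be assigned to the innerHTML of the hidden writeHere div.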

When the audio has started, I shift the focus to the input text box. This sequence (activating the audio on the P key hit, shifting the focus to the input text box, and accepting the Enter key as a signal that the entry is complete) allows for a mouse-free user experience.

Below KeyCheck() is code that embeds the Windows Media Player if the user is on IE/Windows. The url parameter is given the value of PlaySound.php. This is the file that generates the audio. Also, autoStart is set to false so the audio won't start as soon as the page loads. You may want to change this value to true depending on how you are presenting the CAPTCHA.

After the JavaScript section comes the form. The form takes the user input for the CAPTCHA and posts it to the CaptchaSubmit.php file, which verifies the input. Lastly, I include a div section to contain the iframe that may be generated in KeyCheck().
