Channels ▼

Web Development

Audio-on-Demand With Mr. Voice

April, 2004: Audio-on-Demand With Mr. Voice

H. Wade Minter is a UNIX system administrator and an improv comedian. He lives in Raleigh, NC and can be reached at minter@

We all know how well Perl fits in to solve problems with system administration, text processing, and web applications. There are, however, many less obvious ways that Perl can make a difference. For instance, helping to improve the quality of an improv comedy show.

Let me give you a little background. I joined the ComedySportz improv comedy troupe (now known as ComedyWorx—http:// in 1999. ComedyWorx puts on a competitive, team-on-team improv show, similar to Whose Line Is It Anyway, but with a sports motif. I started out as a player in the show on-stage, but as a geek, I was quickly drawn to the sound booth. The player who provides music, sound effects, and general commentary during the show is known as "Mr. Voice."

When I started working as Mr. Voice, the sound process was very manual. Sound clips were supplied via tapes, CDs, and a MIDI device. Even as well organized as the other Voices had the work area, it still took ages (in improv time) to find the right clip, get it into the player, and start it playing. As things are unfolding onstage, you only have a few seconds to find the right sound clip. Too often, by the time we locate and start playing the music, the moment has been lost. To further complicate matters, consider the delay in moving to the proper CD track or the danger of finding that the last Voice who used the tape you need was inconsiderate and didn't rewind it to the proper point.

I wanted to bring digital audio into the picture for two reasons—ease of use (no need to reposition media, quick access times) and disaster recovery. It was impractical to make lossy dubs of all the tapes in the collection, but the danger of having the only tape holding a particular sound clip break was very real. However, it is trivial to back up perfect copies of audio files onto removable media or a network. I started off by putting some songs into an XMMS playlist and using them that way. It was certainly faster for small numbers of songs, but quickly became unwieldy as the number of songs grew and the need for organization and categorization arose. Scrolling through the playlist started to become as cumbersome as rifling through stacks of CDs.

A search through Google and Freshmeat didn't yield any apps that I felt would solve the problem. I knew that for a situation this specific, I needed to write something myself. I named my project "Mr. Voice" after the job title.

I've never considered myself much of a coder. I did earn a Bachelor of Science in Computer Science at the College of William & Mary, where the curriculum focused primarily on C programming, but I've always been more of a systems guy than a coder. I was pretty comfortable in Perl, though, as it had been my longtime choice for CGI and systems programming work. But I knew that the other Voices would not adopt a command-line solution—I needed to pick a GUI toolkit to make this project work in a way other users would accept.

So I picked up Learning Perl/Tk by Nancy Walsh (O'Reilly & Associates, 2001). I can't recommend this book enough. Within a few weeks, I was able to build a very usable system. I think Perl/Tk was a good choice—even three years later, when I look around at other Perl GUI bindings like wxPerl and Gtk-Perl, the quality of the Perl/Tk documentation stands out. I also recommend Mastering Perl/Tk by Steve Lidie and Nancy Walsh (O'Reilly & Associates, 2001).

With a GUI toolkit in hand, it was time to figure out how things would fit together. I decided to use a simple MySQL database for the metadata back end, storing things like title, artist, category, and filename. While I had some concerns that MySQL might be a little too heavy for the data that I was storing, I concluded that using it would allow for long-term growth. After all, I needed to build a system that was sustainable and could be enhanced as our needs continue to evolve. To actually play the audio files, I went with the venerable XMMS. That meant that my code just had to provide the glue between the two.

An easy-to-use interface was key. I started off by laying out how I wanted the main screen to look. I decided on a fairly vertical layout including (from top to bottom) a menu bar, a search area, a listbox to display the results, and a status bar (see Figure 1). The Pack layout manager seemed to be the right choice to achieve that. Once I wrapped my head around the allocation rectangle system, it turned out that Pack did everything I wanted it to do. And, honestly, the books did a great job of making Pack easy to use.

The justification on the labels and entry fields was achieved by packing each row into its own frame, then packing the label with a set width to the left, followed by the Entry widget to the left.

$title_frame = $mw->Frame()->pack(
    -side => 'top',
    -fill => 'x'
    -text   => "Title contains",
    -width  => 25,
    -anchor => 'w'
)->pack( -side => 'left' );
$title_frame->Entry( -textvariable => \$title )->pack( -side => 'left' );

That way, everything lines up nicely.

Mr. Voice allows users to assign songs to F1 through F12 as hotkeys, so you can preload songs and quickly start music playing at the press of a single button. In my first few releases, the way you assigned a hotkey was pretty manual—you selected a song, hit an "assign hotkey" button, and selected a function key from this list. This process was separate from the Toplevel window that actually listed which songs were assigned. I knew there had to be a better way. The logical answer seemed to be drag and drop.

The drag-and-drop tutorial from (http://www.perltk .org/articles/dnd/dnd.html) got me started. I implemented a system where you can click and drag a song from the main listbox, then drop an icon on one of the hotkey areas in the hotkey window. The selected item will then get assigned to the proper hotkey. I cheated somewhat on the back end, though, because the actual dragged token is meaningless—it's just eye candy. The important thing is which item in the listbox is selected at the time of the drop, as the code below shows:

sub Hotkey_Drop
    my ($fkey_var)  = @_;
    my $widget      = $current_token->parent;
    my (@selection) = $widget->curselection();
    my $id       = get_song_id( $widget, $selection[0] );
    my $filename = get_info_from_id($id)->{filename};
    my $title    = get_info_from_id($id)->{fulltitle};
    $fkeys{$fkey_var}->{id}       = $id;
    $fkeys{$fkey_var}->{filename} = $filename;
    $fkeys{$fkey_var}->{title}    = $title;

The actual drag-and-drop token ($current_token) is only referenced to get the widget ID of the parent, which is one of two listboxes. The listbox is then queried directly to see which item is highlighted. Even if the code isn't the most elegant, it looks good and is easy to use, which also makes the users happy.

This little cheat actually came in quite handy when I added the Holding Tank (an extra Listbox-based Toplevel that is part of the application). Users can drag and drop multiple items from the main listbox into the Holding Tank by way of control-clicking to select multiple items. I'm not sure if I could get the standard drag-and-drop token system to handle multiple items in a single drag. To work around this, when a multiple-item selection is dropped into the Holding Tank, it queries the main listbox, receives an array of indexes, then iterates over them to populate the Holding Tank (see Figure 2).

sub Tank_Drop
    my ($dnd_source) = @_;
    my $parent       = $dnd_source->parent;
    my (@indices)    = $parent->curselection();
    foreach $index (@indices)
        my $entry = $parent->get($index);
        $tankbox->insert( 'end', $entry );
    if ( $#indices > 1 )
        $parent->selectionClear( 0, 'end' );

While we're on the subject of drag and drop, I ran into a functional problem using it within my main listbox. The listbox select mode was set to "extended," which enabled the familiar "Control-click to select multiple items" selections. However, the extended mode also has a feature where you can click-drag to select multiple items. Unfortunately, it led to a race condition where people attempting to drag and drop items ended up with multiple items selected, and they'd drop the wrong thing.

Looking at the Tk::Listbox source, I found that there was a Motion method. To solve my problem, I redefined the method within my code to return immediately without actually doing anything. Of course, it also broke the click-drag selects native to the Listbox widget, but my app didn't really take advantage of that anyway, so I didn't lose anything there.

# Try to override the motion part of Tk::Listbox extended mode.
sub Tk::Listbox::Motion

With those three lines, starting a drag no longer selected multiple items, and one of my biggest complaints disappeared.

I discovered an added bonus in my choice of GUI toolkits: Perl/Tk makes developing a cross-platform application extremely easy. I designed Mr. Voice under Linux, which is how I set up our troupe's computer in Raleigh. However, when other improv clubs started using it, they all ran on Win32. Fortunately, all it took were a handful of checks of $^O and I was able to run the same code on UNIX, Mac OS X (X11 mode), and Win32. An example of this is seen early in the program where I check for the OS and pull in OS-specific modules as needed (see Listing 1).

(The time zone setting is for code where I allow people to query the database for songs that have been added or updated in the last x period of time using methods in Date::Manip, which is discussed later in this article.)

Another example of OS-specific behavior occurs before I query an MP3 or OGG file for metadata—I get the short pathname of the actual audio file if we're on Win32:

if ( $filename =~ /.mp3$/i )
    $filename = Win32::GetShortPathName($filename) if ( $^O eq "MSWin32" );
    my $tag = get_mp3tag($filename);
    $title  = $tag->{TITLE};
    $artist = $tag->{ARTIST};

Figure 3 shows Mr. Voice running on Windows. When it's that easy to make your application cross platform, it's a shame not to do it. With Tk-804 now done, Steve Lidie has indicated that it should now be possible to build-in support for native Aqua widgets under Mac OS X, instead of having to use I'll be watching progress on that front with great interest.

In improv clubs like ours, you normally have a large handful of people who are both qualified to run the sound system and interested in doing so, and people generally take turns. You might be Mr. Voice one weekend, then not be back behind the mic for a few weeks. Meanwhile, the other people all have the ability to add new sound files and modify existing entries. We needed a way to identify changes within the database.

I solved that problem with the help of the Date::Manip module. In Mr. Voice, there is an Advanced Search menu that has several options. Among those are "Show me what has changed in the last 0, 7, 14, or 30 days." The Mr. Voice MySQL database schema is set up with a TIMESTAMP column, which is set to the current time when a row is added or updated. When you choose one of those canned searches, it triggers the following code in the main search function:

if ( ( $_[0] ) && ( $_[0] eq "timespan" ) )
    $date = DateCalc( "today", "- $_[1]" );
    $date =~ /^(\d{4})(\d{2})(\d{2}).*?/;
    $year       = $1;
    $month      = $2;
    $date       = $3;
    $datestring = "$year-$month-$date";

$_[1] is set to "0 days," "7 days," "14 days," or "30 days." DateCalc does the heavy lifting to return the proper date from today, then format it in a MySQL-friendly way. Then, when constructing the MySQL query, we do the following:

$query = $query . "AND modtime >= '$datestring' "
  if ( ( $_[0] ) && ( $_[0] eq "timespan" ) );

This narrows the standard search query to only return rows that have changed in the specified timeframe. So the first thing I do when I get to the computer after time away is to run one of those queries to see what my fellow Voices have been working on. There is also an option to specify absolute start and end dates, which works in a similar fashion.

These are just a few examples of how I simplified my improv life using Perl. I don't consider any of the code I've written for Mr. Voice to be revolutionary. Instead, the magic of Mr. Voice comes from the ease with which Perl lets you put the building blocks together to create your own work of art. There is a well-documented GUI toolkit with an outstandingly helpful user community ( CPAN provides its collection of modules that allowed me to quickly do everything from grabbing title and artist information from OGG files to creating platform-neutral file paths with File::Spec. Of course, Perl itself, with the TMTOWTDI philosophy, lets you use the language rather than having the language use you. All of those pieces let me, a guy with no particular skill in coding or GUI design, put together a very useful application that not only makes my job easier, but allows other people with the same problem to do the same. If you have occasion to use Mr. Voice, I'm always grateful for suggestions as to ways I could improve the application or make my code better. The parts for Mr. Voice, including a web-based CVS repository, are available on my web site for any interested parties.

Finally, a quick note on packaging. As I said before, everyone (that I know of) except me runs Mr. Voice on Win32 systems. Mr. Voice has a fairly large list of module dependencies. I quickly found out that it was unreasonable to expect a group of fairly nontechnical people scattered from Spokane to Los Angeles to Richmond to keep up with installing Perl modules on systems that are not, as a general rule, connected to a network. I originally used Perl2EXE to package my script and its modules into a single .exe (with an icon—Win32 folks love their icons) that I could distribute, which worked well enough. However, I've recently switched to PAR ( Autrijus Tang has done an incredible job with PAR—it can do everything that Perl2EXE can do, and more, plus it's both free and Free. If you're distributing a Perl application of any significant size or complexity, I highly encourage you to check out PAR.

ComedyWorx now runs its shows almost completely on digital audio. It works so well that people come up to us after shows and say, "I can't believe how quickly you were able to start playing music up there!" And since our music is all digital, we're able to take backups of sound files and the database offsite to protect them from accidents. We're able to do more with our shows, now that we have a way to store more audio and get to it quicker—and Perl makes it all happen. If you're ever in Raleigh, stop by and see a show. You'll be able to appreciate it on both an artistic and technical level.

Mr. Voice can be found at


Listing 1

if ( "$^O" eq "MSWin32" )
    our $rcfile = "C:\\mrvoice.cfg";

        if ( $^O eq "MSWin32" )
            require LWP::UserAgent;
            require HTTP::Request;
            require Win32::Process;
            require Tk::Radiobutton;
            require Win32::FileOp;
    $agent = LWP::UserAgent->new;
    $agent->agent("Mr. Voice Audio Software/$0 ");

    # You have to manually set the time zone for Windows.
    my ( $l_min, $l_hour, $l_year, $l_yday ) = ( localtime $^T )[ 1, 2, 5, 7 ];
    my ( $g_min, $g_hour, $g_year, $g_yday ) = ( gmtime $^T )[ 1, 2, 5, 7 ];
    my $tzval =
      ( $l_min - $g_min ) / 60 + $l_hour - $g_hour + 24 *
    $tzval = sprintf( "%2.2d00", $tzval );
    our $homedir = "~";
    $homedir =~ s{ ^ ~ ( [^/]* ) }
              { $1
                   ? (getpwnam($1))[7]
                   : ( $ENV{HOME} || $ENV{LOGDIR}
                        || (getpwuid($>))[7]
    our $rcfile = "$homedir/.mrvoicerc";
Back to article

Related Reading

More Insights

Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.
Dr. Dobb's TV