Sound As a Data Type

Programming for sound has never been simple. QuickTime, Apple's system-wide architecture for handling dynamic data, is a first step towards audio ease-of-coding. Aaron presents a technique for converting traditional Macintosh sound resources to QuickTime sound data samples.

December 01, 1992
URL:http://www.drdobbs.com/database/sound-as-a-data-type/184408892

Figure 1

QuickTime makes it easy

Aaron E. Walsh

Aaron is cofounder and CEO of Fore-Runner Development and Consulting Corp., a Boston-based software firm. He can be reached at 1357 Washington St., West Newton, MA 02135.

Programming for sound has never been simple. In the Macintosh world, for instance, programmers used to face the onerous task of programming the Sound Driver, which was, to be kind, a challenge. Nevertheless, until Apple released the Sound Manager in System 6.0.2, there were no alternatives.

Although difficult to master, the Sound Manager was a dramatic improvement over previous methods of sound programming. Even so, you still needed an intimate knowledge of complex sound-data structures to do anything beyond playing a sound at a preset volume. Playing a sound asynchronously, altering the playback volume, or even playing two different sounds at the same time often required more effort than it was worth.

QuickTime, Apple's system-wide architecture for handling dynamic data--animation, video, and audio--goes beyond the Sound Manager by defining a data type called a movie for dealing with dynamic data. A movie is simply a structure containing zero or more data streams, known as tracks. Each track references (via pointers) a single media, which in turn references the raw data samples comprising a movie. Figure 1 illustrates the basic QuickTime movie structure. (For more information on QuickTime movies, see "Programming QuickTime," DDJ, July 1992.)
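As a rough mental model of that containment, the relationship can be sketched in plain C. The types below (ModelMovie, ModelTrack, ModelMedia) are hypothetical and exist only for illustration; the real Movie, Track, and Media types are opaque and are manipulated only through Movie Toolbox calls.

```c
#include <stddef.h>

/* Hypothetical model types for illustration only. */
typedef struct {
    const unsigned char *samples;     /* raw data samples */
    unsigned long        sampleCount; /* number of samples referenced */
} ModelMedia;

typedef struct {
    ModelMedia *media;                /* each track references one media */
} ModelTrack;

typedef struct {
    ModelTrack *tracks;               /* zero or more tracks */
    size_t      trackCount;
} ModelMovie;

/* Total number of samples reachable from a movie through its tracks. */
unsigned long ModelMovieSampleCount(const ModelMovie *movie)
{
    unsigned long total = 0;
    size_t i;

    for (i = 0; i < movie->trackCount; i++)
        if (movie->tracks[i].media != NULL)
            total += movie->tracks[i].media->sampleCount;
    return total;
}
```

A movie with no tracks is still a valid movie in this model; it simply reaches no samples.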

QuickTime 1.0 is incapable of directly playing the popular sound resource (snd) data type that developers rely on as the Macintosh sound format standard. To play such sounds, the contents of the snd resource data structures must be manually converted to the QuickTime sound format. As of this writing, no documentation exists for converting snd data to the QuickTime sound format. The code and article presented here demonstrate a technique for converting traditional snd sound resources to QuickTime sound data samples.

QuickTime itself does not provide direct support for specific media types. This is the responsibility of media handler components, to which QuickTime provides a number of interface routines. When accessing and manipulating movie data via media handlers, you are not required to have detailed knowledge about the data itself. In this respect, graphic and audio elements are treated as data types for which a number of standard routines are provided.

Compared to the original Sound Driver and Sound Manager routines, QuickTime makes sound management on the Macintosh amazingly simple:

To retrieve the current volume of a given track, use GetTrackVolume().
To change the volume of a sound track, call SetTrackVolume().
To play multiple sound tracks at one time, simply activate the desired tracks. Once activated, the track's media handlers take over: The data samples are loaded from disk, sound buffers are automatically created and maintained, and the sounds are played asynchronously.

When playing QuickTime movies, you don't need to know about the data structures in a movie, only the standard mechanics of playing it. Recording a QuickTime movie is another story, however, because you need to know about those data structures (like sound) you wish to record.

Sound Description Record

To add sound to a QuickTime movie, the sound-media handlers must be given a description of the sound data for which they will be responsible.

A Sound Description record contains information such as sample size, sample rate, compression details, and so on. Figure 2 details the sound-description structure. When you provide raw data samples and a corresponding description of that data, QuickTime can add an audio track to any movie.

Figure 2: QuickTime Sound Description record.

  struct SoundDescription
  {
   long descSize;        /* total size in bytes of this SoundDescription
                               structure  */
   long dataFormat;      /* describes sample encoding technique (i.e.,
                               off-set binary, twos-complement) */
   long resvd1;          /* reserved by Apple.  Set to 0 in your sound
                               descriptions */
   short resvd2;         /* reserved by Apple.  Set to 0 in your sound
                               descriptions */
   short dataRefIndex;   /* reserved.  Set to 1 in your sound descriptions
                               */
   short version;        /* reserved by Apple.  Set to 0 in your sound
                               descriptions */
   short revlevel;       /* reserved by Apple.  Set to 0 in your sound
                               descriptions */
   long vendor;          /* reserved by Apple.  Set to 0 in your sound
                               descriptions */
   short numChannels;    /* number of channels of sound.  Set to 1 for
                               monaural, set to 2 for stereo */
   short sampleSize;     /* number of bits per sample.  Set to 8 for 8-bit
                               sound, 16 for 16-bit sound */
   short compressionID;  /* reserved.  Set to 0 in your sound descriptions
                               */
   short packetSize;     /* reserved.  Set to 0 in your sound descriptions
                               */
   Fixed sampleRate;     /* rate at which samples were obtained.  Field
                               contains an unsigned, fixed point number. */
  };

You're responsible for filling in the sound-description structure. The fields in Table 1 must be set to match the original audio sample; the remaining fields are reserved for use by Apple. Set each reserved field to 0, with the exception of dataRefIndex, which should be set to 1 (see Figure 2).

Table 1: Fields of the sound-description structures.

  Field        Description
  ----------------------------------------------------------------------

  descSize     Total size, in bytes, of the SoundDescription structure.
  dataFormat   Describes sample encoding technique (off-set binary or
                twos-complement, for example).
  numChannels  Indicates the number of sound channels used by the sound
                sample.
  sampleSize   Number of bits in each sound sample.
  sampleRate   Rate at which sound samples were captured.

Listing One details how to create a sound-description record based on the data structures of a sound (snd) resource. This technique is an effective means of bridging the sound-format incompatibility between QuickTime and the Sound Manager.

On a computer, sound data is stored as a series of digital samples, each specifying the amplitude of the sound at a given time. This format is commonly known as pulse-code modulation (PCM).

The amplitude values in a sample are typically encoded in one of two ways, offset-binary or twos-complement:

Offset-binary. Amplitude values are represented by an unsigned number, with the midpoint representing silence. For example, sample values stored in offset-binary for an 8-bit sample would range from 0 to 255 with the midpoint (128) specifying silence (no amplitude). Samples stored as Macintosh sound resources (snd resource types) are stored using this technique.
Twos-complement. Amplitude values are represented using a signed integer, with 0 representing silence. An 8-bit sample stored in twos-complement format would have sample values ranging from -128 to 127, with 0 specifying silence (no amplitude). Audio Interchange File Format (AIFF) sounds used by the Sound Manager are stored using this technique.
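For 8-bit samples, the two encodings differ only in the top bit, so converting between them is a one-line operation. The following is a minimal sketch in portable C; the function name FlipSampleEncoding8 is hypothetical and is not part of the Sound Manager or QuickTime.

```c
#include <stddef.h>

/* Convert 8-bit samples in place between offset-binary and
   twos-complement. Flipping the top bit maps 128 (offset-binary
   silence) to 0 (twos-complement silence) and back again, so the
   same routine converts in either direction. */
void FlipSampleEncoding8(unsigned char *samples, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++)
        samples[i] ^= 0x80;
}
```

Applying the routine twice restores the original data, which makes it easy to verify on a small buffer.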

QuickTime also supports the Macintosh Audio Compression and Expansion (MACE) encoding techniques, yielding compression ratios of up to 6:1. These compression techniques considerably reduce disk-space and RAM consumption, though not without a penalty in playback speed and quality.
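To get a feel for what a 6:1 ratio means in practice, the storage arithmetic can be sketched as follows. Both function names are hypothetical, the sample rate is rounded to whole hertz, and the calculation ignores any header or packet overhead.

```c
/* Bytes needed for uncompressed PCM sound of the given length. */
unsigned long UncompressedSoundBytes(unsigned long samplesPerSecond,
                                     unsigned int bitsPerSample,
                                     unsigned int channels,
                                     unsigned long seconds)
{
    return samplesPerSecond * seconds * channels * (bitsPerSample / 8);
}

/* Approximate size after compression at ratio:1, rounded up.
   A ratio of 0 is treated as "no compression". */
unsigned long CompressedSoundBytes(unsigned long rawBytes, unsigned int ratio)
{
    if (ratio == 0)
        return rawBytes;
    return (rawBytes + ratio - 1) / ratio;
}
```

Ten seconds of 8-bit monaural sound at 22,254 samples per second occupies 222,540 bytes uncompressed and roughly 37,090 bytes at 6:1.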

The number of bits used to encode the amplitude value for each sample is known as the sample size. A larger sample size provides more bits to encode the amplitude value. Hence, the size of a sample corresponds directly to the quality of sound. The standard Macintosh hardware is limited to 8-bit samples, whereas QuickTime supports up to 32-bit samples. When playing larger samples on a standard Macintosh, QuickTime converts the samples to 8-bit format for compatibility.
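The simplest form such a reduction could take is to keep the top 8 bits of each sample. The sketch below converts one 16-bit twos-complement sample to 8-bit offset-binary; it assumes a 16-bit short, the function name is hypothetical, and it is not a description of the conversion QuickTime actually performs internally.

```c
/* Reduce one 16-bit twos-complement sample to 8-bit offset-binary by
   keeping the top 8 bits and re-centering silence on 128. */
unsigned char Sample16ToOffsetBinary8(short sample)
{
    /* Adding 32768 first makes the value non-negative (0..65535),
       so the right shift below is well defined. */
    unsigned long shifted = (unsigned long)((long)sample + 32768L);

    return (unsigned char)(shifted >> 8);   /* 0..255 */
}
```

Silence (0) maps to 128, and the extremes -32768 and 32767 map to 0 and 255 respectively. The low 8 bits of each sample are simply discarded, which is where the loss of quality comes from.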

Also influencing the quality of sound is the sample rate. The sample rate represents the number of samples captured in one second. A higher sample rate can more accurately describe the original sound waveform. The standard Macintosh hardware is capable of an output sampling rate of 22.2545 kHz, whereas QuickTime supports up to 65.535 kHz. QuickTime will convert higher sample rates to accommodate the playback hardware.
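The 65.535 kHz ceiling follows from the sampleRate field being an unsigned fixed-point number with 16 integer bits and 16 fraction bits. A minimal sketch of the conversion is shown here; the UFixed type name and both helpers are hypothetical, and the value 0x56EE8BA3 is the fixed-point encoding of the 22.2545 kHz hardware rate.

```c
/* Hypothetical stand-in for an unsigned 16.16 fixed-point value;
   only the low 32 bits are used. */
typedef unsigned long UFixed;

/* Whole hertz to unsigned 16.16 fixed point (fraction set to 0). */
UFixed HzToUFixed(unsigned long wholeHz)
{
    return (wholeHz & 0xFFFFUL) << 16;
}

/* Integer part of an unsigned 16.16 rate, as Listing One computes
   with ">> 16" when choosing the media time scale. */
unsigned long UFixedToWholeHz(UFixed rate)
{
    return (rate >> 16) & 0xFFFFUL;
}
```

With only 16 integer bits, the largest whole-hertz value the field can hold is 65,535.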

Sound can be added to a QuickTime movie in real time, in which case an audio digitizing device is required to convert the analog sound wave into its digital representation. Alternatively, sound can be added directly from a disk file.

Creating a Sound Track

The code in Listing One demonstrates the technique of taking an existing sound resource (snd resource type) from a disk file and incorporating it into a QuickTime movie as a new sound track. The user is prompted to select a file containing the sound resource from which data samples will be extracted, and prompted again to select the movie file into which the sound will be added.

ConvertSndToMovie() opens the selected movie file and adds a new track for the sound data. Using the sound-description structure created by CreateSndDescriptor(), ConvertSndToMovie() then adds sound samples to the track media.
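ConvertSndToMovie() adds the samples in chunks rather than all at once (Listing One uses a default chunk size of 32,768). The chunking arithmetic on its own can be sketched in portable C as follows; the helper names are hypothetical, and the sketch assumes one byte per sample, as in the listing.

```c
/* Number of samples to hand over in the next call, given how many
   remain. A chunk size of 0 means "everything in one chunk". */
unsigned long NextChunkSamples(unsigned long remaining,
                               unsigned long chunkSamples)
{
    if (chunkSamples == 0 || remaining < chunkSamples)
        return remaining;
    return chunkSamples;
}

/* Number of add-sample calls the loop in the listing would make. */
unsigned long CountChunks(unsigned long totalSamples,
                          unsigned long chunkSamples)
{
    unsigned long chunks = 0;

    while (totalSamples > 0) {
        totalSamples -= NextChunkSamples(totalSamples, chunkSamples);
        chunks++;
    }
    return chunks;
}
```

A 100,000-sample sound, for example, is added as three full chunks of 32,768 samples followed by a final chunk of 1,696.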

CreateSndDescriptor() creates a sound-description structure based on data in the original sound (snd) resource. This routine handles only format 1 snd resources containing uncompressed raw data stored in offset-binary format. While the majority of sound resources fall into this category, you may wish to enhance the code to handle MACE compressed resources.
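The same walk through a format 1 resource can be sketched without the Toolbox, reading big-endian fields from a plain byte buffer. The SndInfo type and function names are hypothetical. The byte offsets assume the usual format 1 layout (6-byte synthesizer entries, 8-byte commands, and a 22-byte standard sound header), and the check of the header's encode byte goes beyond what Listing One does.

```c
#include <stddef.h>

typedef struct {                /* hypothetical result type */
    unsigned long sampleRate;   /* unsigned 16.16 fixed point, as stored */
    unsigned long frameCount;   /* number of sample bytes */
    unsigned long dataOffset;   /* first sample, from start of resource */
} SndInfo;

static unsigned int ReadBE16(const unsigned char *p)
{
    return ((unsigned int)p[0] << 8) | p[1];
}

static unsigned long ReadBE32(const unsigned char *p)
{
    return ((unsigned long)p[0] << 24) | ((unsigned long)p[1] << 16) |
           ((unsigned long)p[2] << 8)  |  (unsigned long)p[3];
}

/* Returns 0 on success, -1 if the buffer is not a format 1 resource
   whose first command points at a standard (uncompressed) header. */
int ParseFormat1Snd(const unsigned char *res, size_t size, SndInfo *info)
{
    size_t pos, header;

    if (size < 6 || ReadBE16(res) != 1)
        return -1;                       /* not format 1 */
    pos = 4 + (size_t)ReadBE16(res + 2) * 6;  /* skip synthesizer entries */
    if (pos + 10 > size || ReadBE16(res + pos) == 0)
        return -1;                       /* no room for, or no, commands */
    pos += 2;                            /* first 8-byte sound command */
    if ((ReadBE16(res + pos) & 0x8000U) == 0)
        return -1;                       /* param2 is not a data offset */
    header = (size_t)ReadBE32(res + pos + 4);
    if (header > size || size - header < 22)
        return -1;                       /* header runs past the buffer */
    if (res[header + 20] != 0)
        return -1;                       /* not a standard sound header */
    info->frameCount = ReadBE32(res + header + 4);
    info->sampleRate = ReadBE32(res + header + 8);
    info->dataOffset = (unsigned long)header + 22;
    return 0;
}
```

Unlike the listing, this version validates every offset against the buffer size before reading, which is worth doing whenever the resource comes from an arbitrary user-selected file.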

Once the sound track has been added to the movie file, execution of the application is terminated. The original sound resource and its associated file remain unchanged, while the QuickTime movie now has a new sound track. The sound samples are saved directly into the movie file itself, although QuickTime allows data samples to be saved as an external file as well.

Manipulating QuickTime Sound

Volume and balance are the two main attributes of sound in QuickTime. Essentially, QuickTime treats sound as a data type. A variety of routines are provided which simplify the interaction between the programmer and the data. Among these are:

GetMovieVolume(), SetMovieVolume(). The current volume setting determines the overall loudness of a movie's sound tracks. The current volume setting is not saved to disk with the movie; it is lost when the movie is closed. Use these routines to retrieve and alter the current volume setting.
GetMoviePreferredVolume(), SetMoviePreferredVolume(). When a movie is opened from disk, its current volume is set to the preferred volume setting. The preferred volume is saved to disk with the movie. Use these routines to retrieve and alter a movie's preferred volume setting.
GetTrackVolume(), SetTrackVolume(). Each track in a movie has an independent volume setting. Use these routines to retrieve and alter the volume setting for a particular track.
GetSoundMediaBalance(), SetSoundMediaBalance(). The balance setting determines the mix of audio between a computer's two speakers. Use these routines to retrieve and alter the audio balance of a particular media. This setting is stored in the movie file when the movie file is saved to disk.
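The volume routines take a 16-bit value. As far as I know, it is interpreted as an 8.8 fixed-point number in which 1.0 (0x0100) is full volume, but treat that as an assumption to check against the QuickTime documentation. Under that assumption, a percentage can be mapped to a volume value as sketched below; PercentToVolume and FULL_VOLUME_8_8 are hypothetical names.

```c
/* 8.8 fixed point: high byte is the integer part, low byte the
   fraction. 0x0100 (1.0) is assumed here to mean full volume. */
#define FULL_VOLUME_8_8 0x0100

/* Map 0..100 percent to an 8.8 fixed-point volume value;
   values above 100 are clamped to full volume. */
short PercentToVolume(unsigned int percent)
{
    if (percent > 100)
        percent = 100;
    return (short)((percent * FULL_VOLUME_8_8) / 100);
}
```

On this reading, the value 255 that Listing One passes to NewMovieTrack() is just under full volume (255/256).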

Conclusion

QuickTime sound support is simple, once the sound data has been successfully saved into the movie format. The current version of QuickTime, Version 1.0, does not provide an API for converting snd sound resources or AIFF sound files to the movie sound format. Ideally, adding sounds to a QuickTime movie should require no more effort than simply specifying the sound file to play. Until this is possible, the developer must devise techniques (such as the one presented here) for converting the non-QuickTime sound formats into the QuickTime format.

With QuickTime, Apple has come a long way towards making the sound-data structure a standard data type on the Macintosh. However, the issue of interformat sound-data exchange remains, though I find it difficult to complain when much of the headache of sound manipulation has been resolved by QuickTime.


[LISTING ONE]



/************************************************************************/
/* AddSoundTrackToMovie.c -- QuickTime program demonstrating how to add */
/*  a sound track to a movie using data found in an snd sound resource. */
/*  by Aaron E. Walsh Developed using Symantec Think C 5.0 and          */
/*  Apple QuickTime headers                                         */
/************************************************************************/

#include <Types.h>
#include <Dialogs.h>
#include <Sound.h>
#include <Movies.h>
#include <MoviesFormat.h>

#define     defaultChunkSize    32768

void    main(void);
OSErr   InitMacManagers(void);
void    doSoundToMovie(void);
OSErr OpenSndFile(short vRef, Str255 fileName, short *rRef,
                                                         Handle *soundHandle);
void    ConvertSndToMovie(short destVolume, Str255 destName, Handle sndH,
                                   long chunkSize, Boolean CreateFile);
OSErr CreateSndDescriptor(Handle sndHandle, SoundDescriptionPtr sndDescriptP,
                 unsigned long *dataOffset,unsigned long *frameCount);

/***** main() -- simple main event loop ******/
void main()
{
    OSErr   error;
    error = InitMacManagers();
    if (!error)
        {
        doSoundToMovie();   /* focus of this article */
        ExitMovies();       /* exit Movie Toolbox when done */
        }
} /*main*/

/***** InitMacManagers() -- Initialize Toolbox Managers used by this app. *****/
OSErr   InitMacManagers()
{
    OSErr   error;
    InitGraf(&qd.thePort);
    FlushEvents(everyEvent,0);
    InitWindows();
    InitDialogs(nil);
    InitCursor();

    MaxApplZone();

    /* attempt to initialize the Movie Toolbox:  a better approach  */
    /* would be to use the Gestalt Manager, as detailed in my   */
    /* July 1992 DDJ article "Programming QuickTime"        */

    error = EnterMovies();
    return (error);
} /*InitMacManagers*/

/**** doSoundToMovie() -- simple interface: select sound file & add it as a */
/* QT sound track ****/
void doSoundToMovie()
{
    Point           where;
    SFReply         userReply;
    Handle          soundResourceHandle = nil;
    short           resourceRefNum = 0;
    OSErr           soundFileSelectErr, movieFileSelectErr;
    SFTypeList      movieFileTypes;


    soundFileSelectErr = movieFileSelectErr = 1; /* initialize error flags */
    SetPt(&where, 100, 100);    /* top-left corner for the file dialogs */

    /* select sound file: */
    SFGetFile(where, "\p", nil, -1, nil, nil, &userReply);
    if (userReply.good)
            soundFileSelectErr = OpenSndFile  (userReply.vRefNum,
                         userReply.fName,
                             &resourceRefNum,
                         &soundResourceHandle
                         );
    /* select movie file: */
    movieFileTypes[0] = 'MooV';
    SFGetFile(where, "\p", nil, 1, movieFileTypes, nil, &userReply);
    movieFileSelectErr = !userReply.good;

    if (!soundFileSelectErr && !movieFileSelectErr)
               ConvertSndToMovie(userReply.vRefNum, userReply.fName,
                                soundResourceHandle, defaultChunkSize, false);
    else
        SysBeep(1); /* beep to indicate conversion wasn't done */
} /*doSoundToMovie()*/

/** OpenSndFile() -- open given file and retrieve first snd resource found **/
OSErr OpenSndFile(short volRefNum, Str255 fileName, short *refNum,
                                                            Handle *soundHand)
{
    FSSpec              fileSpec;
    OSErr               error;
    short               saveResFile;

    /* create file specification record: */
    error = FSMakeFSSpec(volRefNum, 0, fileName, &fileSpec);
    if (error) DebugStr("\pFSMakeFSSpec Failed");

    if (*refNum)
    {
        CloseResFile(*refNum);  /* close file once spec is created */
        if (error = ResError()) DebugStr("\pCloseResFile Failed");
        *refNum = 0;
    }
    *refNum = FSpOpenResFile(&fileSpec, fsRdPerm);  /* open using spec */
    if (*refNum < 0) DebugStr("\pFSpOpenResFile Failed");

    saveResFile = CurResFile(); /* save current resource refnum */
    UseResFile(*refNum);        /* switch to our newly opened file */
    if (error = ResError()) DebugStr("\pUseResFile Failed");

    *soundHand = Get1IndResource('snd ', 1); /* get first snd resource */
    error = ResError();
    UseResFile(saveResFile);

    if (!*soundHand)
        error = -1;  /* return error if sound handle is nil */
    return (error);
} /*OpenSndFile*/

/* ConvertSndToMovie()--create new movie sound track from snd resource data */
void ConvertSndToMovie(short destVolume, Str255 destName, Handle soundHand,
                       long chunkSize, Boolean CreateFile)
{
    FSSpec              destSpec;

    short               resID = 1;
    short               resRefNum;
    OSErr               theErr;
    Track               sndTrack = 0;
    Media               sndMedia = 0;
    Movie               theMovie;
    unsigned long       sndSize, bytes, dataOffset, numSampleFrames;
    unsigned long       totalSamples, chunkSamples, samples;
    unsigned long       bytesPerFrame, samplesPerFrame;
    SoundDescription    **descH;
    SoundDescription    *sndDescP;

    descH = (SoundDescription **) NewHandleClear(sizeof(SoundDescription));
    if (!descH) DebugStr("\pCould not get description handle");

    sndDescP = *descH;

    CreateSndDescriptor(soundHand, sndDescP, &dataOffset, &numSampleFrames);
    bytesPerFrame = 1;      /* 8-bit monaural: one byte per frame */
    samplesPerFrame = 1;

    theErr = FSMakeFSSpec(destVolume, 0, destName, &destSpec);
    if (theErr == fnfErr) theErr = 0;
    if (theErr) DebugStr("\pFSMakeFSSpec Failed");

    theErr = OpenMovieFile(&destSpec, &resRefNum, fsRdWrPerm); /* open existing movie for read/write */
    if (theErr) DebugStr("\pOpenMovieFile Failed");

    resID = 0;
    theErr = NewMovieFromFile(&theMovie,resRefNum,&resID,destName,0,0L );
    if (theErr) DebugStr("\pNewMovieFromFile Failed");

    /* Now put it into a track */
    sndTrack = NewMovieTrack(theMovie, 0, 0, 255);
    if (theErr = GetMoviesError()) DebugStr("\pNewMovieTrack Failed");

    sndMedia = NewTrackMedia(sndTrack, SoundMediaType,
                          ((unsigned long)(*descH)->sampleRate) >> 16, nil, 0);
    sndSize = GetHandleSize(soundHand) - dataOffset;

    bytes = numSampleFrames * bytesPerFrame;  /* number of bytes
                                                   we expect in file */
    if (bytes > sndSize)    /* sample too large */
    {
        SysBeep(1); /* beep to indicate that an error occurred */
        numSampleFrames = sndSize / bytesPerFrame;
    }
    totalSamples = numSampleFrames * samplesPerFrame; /* samples in file */

    if (!chunkSize)
        chunkSamples = totalSamples;    /* all in one chunk */
    else
        chunkSamples = (chunkSize * samplesPerFrame) / bytesPerFrame;
                                             /* get size of chunk in samples */
    theErr = BeginMediaEdits( sndMedia );
    if (theErr) DebugStr("\pBeginMediaEdits Failed");
    while (totalSamples)
    {
        samples = totalSamples;     /* samples in chunk */
        if (samples > chunkSamples)
            samples = chunkSamples;
        bytes = (samples * bytesPerFrame) / samplesPerFrame;
                                                          /* bytes in chunk */
        theErr = AddMediaSample( sndMedia, soundHand, dataOffset,
                               bytes, (TimeValue) 1, (SampleDescriptionHandle)
                               descH, samples, 0, 0);
        if (theErr) DebugStr("\pAddMediaSample Failed");
        dataOffset += bytes;
        totalSamples -= samples;
    }
    theErr = EndMediaEdits( sndMedia );
    if (theErr) DebugStr("\pEndMediaEdits Failed");

    theErr = InsertMediaIntoTrack(sndTrack,0L,0L,
                                          GetMediaDuration(sndMedia), 0x10000);
    if (theErr) DebugStr("\pInsertMediaIntoTrack Failed");

    /* Write out the movie */
    DisposHandle((Handle) descH);
    if (CreateFile)
    {
        theErr = AddMovieResource( theMovie, resRefNum, &resID, destName );
        if (theErr) DebugStr("\pAddMovieResource Failed");
    }
    else
    {
      theErr = UpdateMovieResource( theMovie, resRefNum, resID, destName );
      if (theErr) DebugStr("\pUpdateMovieResource Failed");
    }
    theErr = CloseMovieFile( resRefNum);
    if (theErr) DebugStr("\pCloseMovieFile Failed");
    DisposeMovie(theMovie);
}/*ConvertSndToMovie*/

/**** CreateSndDescriptor() -- create a Sound Descriptor record from  */
/*                             snd resource data ****/
OSErr CreateSndDescriptor(Handle sndHandle, SoundDescriptionPtr sndDescriptP,
        unsigned long *dataOffset,unsigned long *frameCount)
{
    short   *i, synthCount, sndCommandCount;
    SndCommand *SoundCommand;
    SoundHeaderPtr sndHeadPtr;

    HLock   (sndHandle);
    i = (short *) *sndHandle; /* get first word of snd resource */

    if (*i != 1)         /* deal only with format 1 snd resource    */
    {
        HUnlock(sndHandle);  /* don't leave the handle locked on failure */
        return (-1);     /* return error (-1) for other snd types */
    }
    i++;
    synthCount = *i;     /* count of modifiers/synthesizers in resource */
    i++;
    i += (synthCount * (1 + 2)); /* jump over modifiers/synthesizers, */
                                     /* so we can get at the sound commands */
    sndCommandCount = *i;        /* count of sound commands in resource */
    i++;
       SoundCommand = (SndCommand*) i; /* get reference to 1st sound command */

    sndDescriptP->descSize  = sizeof(SoundDescription);
                                                 /*size of SoundDescription*/
    sndDescriptP->resvd1        = 0;     /* reserved by Apple       */
    sndDescriptP->resvd2        = 0;     /* reserved by Apple       */
    sndDescriptP->version       = 0;     /* data version        */
    sndDescriptP->revlevel      = 0;     /* codec version       */
    sndDescriptP->vendor        = 0;     /* codec vendor        */
    sndDescriptP->compressionID     = 0;     /* sound compression 0=none*/
    sndDescriptP->packetSize    = 0;     /* compression packet size,*/
                                                 /* 0 if not compressed */
    sndDescriptP->numChannels   = 1;  /* number of channels of sound*/
    sndDescriptP->sampleSize    = 8;  /* number of bits per sample  */
    sndDescriptP->dataFormat    = 'raw '; /* uncompressed           */

    /* coerce in order to access sample rate, frame count, & data offsets*/
    sndHeadPtr  = (SoundHeaderPtr) (*sndHandle + SoundCommand->param2);
    sndDescriptP->sampleRate    = sndHeadPtr->sampleRate;
                                                     /*sample rate of data*/
    *frameCount = sndHeadPtr->length;        /* number of frames */
    *dataOffset = sndHeadPtr->sampleArea - *sndHandle; /* offset to data */

    HUnlock(sndHandle);
    return (noErr);      /* descriptor filled in successfully */
} /*CreateSndDescriptor*/