


Jan02: C Programming



Plug It In

Al is DDJ's senior contributing editor. He can be contacted at [email protected].


Last month, I wrote about an unsuccessful experiment in digital signal processing. My objective was to encode six logical audio channels into two stereo channels in an MP3 file by using DSP to isolate and silence one logical channel at a time. The project is called "Music Minus Whatever," and its purpose is to let music students practice ensemble performances along with audio recordings where the instrument being practiced is silenced in the playback.

My experiment used an audio file with five discrete logical channels mixed such that each channel occupies a defined, fixed place in the stereo spectrum's pan with no audio crosstalk between channels. To eliminate a selected channel, the algorithm pans the full signal an appropriate amount to bring it to the center of the spectrum and applies center channel elimination digital signal processing, better known as "vocal elimination." I declared the experiment a bust when the audible result was mostly a bunch of garbled echoes instead of silence for the eliminated channel.

Fade back a month. In November, I described an effort that worked better. I encoded three logical channels into two by recording one channel fully panned left, one fully panned right, and the third mixed in dead center with equal amplitudes in both channels. Eliminating one logical channel was simply a matter of eliminating one or the other stereo channel or applying center channel elimination. Acoustics engineer Christopher Morgan suggested an approach sounding a lot like December's failed experiment, which hadn't yet been published when he wrote. When someone who knows the science advises me to try what I've already tried and rejected, I'll return and take another look.

I took a long look at the December code, which no helpful engineers had seen yet, and gave more thought to the theory. I set up another experiment and, to my astonishment, it worked. No more raggedy echoes. The culprit turned out to be MP3 compression. My project needed, I thought, the reduced file sizes that MP3 compression provides. Center channel elimination works okay with MP3s, but not, as it turns out, when the channel is panned to one side or the other in the original. It has to do with how MP3 compression deletes frequencies it thinks you can't hear because other audio in the signal normally drowns them out. The key to logical channel identification was its relative position in the stereo spectrum, information that MP3 compression tends to destroy. My new experiments used uncompressed wave files, and the algorithm worked. Each logical channel I selected to eliminate was indeed silent. But the remaining logical channels suffered a substantial amount of audio degradation having to do with their distributed relative amplitudes after the panning and signal removal operations. I found no way to get acceptable performance by using this technique, so once again, I was back to the November approach.

Reader Bob Sundling then told me that my discovery of three channels in two was the basis for Dolby Surround sound, and that Dolby encoded a fourth logical channel by mixing the fourth channel's audio into the other three with equal amplitudes on both stereo channels, but with one channel's samples inverted. The algorithm for eliminating this logical channel's audio is similar to that for eliminating the center channel, except that instead of subtracting the left sample from the right and doubling the result, the elimination algorithm averages the two samples. Sure enough, that approach worked, and my three-channel elimination algorithm became a four-channel elimination algorithm.
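The four-in-two arithmetic described so far can be sketched in a few lines. This is only an illustration of the idea, not the code from my experiments: the names StereoFrame, encode, and eliminate are made up for the example, the choice of the right stereo channel as the inverted one is an assumption, the samples are floating-point values so that clipping can be ignored, and no gain correction is applied to the results.

```cpp
// One stereo frame of the encoded signal.
struct StereoFrame {
    double left;
    double right;
};

// Mix four logical channels into two (hypothetical encoder).
// a: fully panned left, b: fully panned right,
// c: dead center (equal in both), d: equal in both, inverted on the right.
StereoFrame encode(double a, double b, double c, double d) {
    return { a + c + d, b + c - d };
}

enum class Drop { LeftOnly, RightOnly, Center, Inverted };

// Produce one mono sample in which the chosen logical channel is silent.
double eliminate(const StereoFrame& f, Drop which) {
    switch (which) {
    case Drop::LeftOnly:  return f.right;                   // discard the left stereo channel
    case Drop::RightOnly: return f.left;                    // discard the right stereo channel
    case Drop::Center:    return f.right - f.left;          // c cancels in the difference
    case Drop::Inverted:  return (f.left + f.right) / 2.0;  // d cancels in the average
    }
    return 0.0;
}
```

Note that every result is mono, and that the surviving channels do not come out at equal levels. Eliminating the center channel, for instance, leaves the inverted channel at twice the amplitude of the two panned ones.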

Encoding four instruments into one stereo signal and eliminating any one of the four is closer to Music Minus Whatever for a six-piece band. Close, Monica, but no cigar. I need six logical channels.

I decided to try a combination of both approaches. I added two more instruments to the four-channel mix by increasing the volume of one stereo channel of each by a fixed amount. One instrument was louder in the left channel, and the other was louder in the right. This playback kept these instruments always audible when the other instruments were individually silenced. That might work if you don't care whether you can silence drums and bass, for example. But I care. Relentless, I added the pan-shift algorithm for when one of those two instruments was to be silenced, and the whole thing fell into place. It worked. Not well, but well enough that I can declare last month's failure to be a success. The basic problem is that as certain instruments are eliminated, others suffer volume loss. Such audio degradation is okay for fooling around at home, but not for a professional commercial teaching aid.

I have concluded that achieving my six-track objective and still having acceptable audio quality requires two stereo signals, which, given the capabilities of contemporary audio processing and playback programs, means reading, processing, and mixing two files in real time. It also requires enough space to store two audio files instead of one.

But for now, by using the interim approach, I can work on the other parts of the project — writing arrangements, recording tracks, and so on — without having to write an audio file decoder that reads and mixes multiple audio files. Not yet, at least.

Plug-Ins

All these experiments are made possible by a technology called "plug-ins" — modules by which an application separates itself from the formats of what it reads and writes. In the world of audio, playback applications can be implemented without built-in audio input and output functions, depending instead on plug-in modules that comply with the application's API for plug-ins. Plug-ins can be used for graphical applications, database applications, musical-notation applications, and any application where data storage formats are independent of how information is processed and displayed. There are many ways to represent images, audio, events in musical scores, database tables and indices, spreadsheet cells, and the like, in disk files. But an application that processes one of these kinds of data collections typically has only one way to represent the data in memory. A plug-in is the filter between the storage and memory formats of the data.

The plug-in mechanism itself separates plug-ins from other kinds of component software. The application looks for plug-ins at run time and enables those it finds. Installing new plug-ins is simply a matter of copying their binary executable file into the subdirectory where the application looks for plug-ins.

A plug-in is a dynamic link library (DLL) in Windows and a shared object (SO) library in Linux. Applications query the library for memory addresses of named functions by passing a string with the function's name. The plug-in returns the memory address where the function can be called, or zero if the function does not exist. A plug-in DLL or SO that complies with the application implements all the specific functions the application calls.
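On Windows the query is made with GetProcAddress, and on Linux with dlsym. The sketch below does not load a real library. It is a toy stand-in that keeps a table of names and addresses in the same program, just to show the contract: ask by name, get an address or a null pointer. The names findSymbol, kExports, doubleIt, and flipSign are hypothetical.

```cpp
#include <cstring>

// Every "exported" function in this toy library has the same signature.
using ExportedFn = int (*)(int);

static int doubleIt(int x) { return 2 * x; }
static int flipSign(int x) { return -x; }

// The table a real loader would build from the library's export section.
struct Export {
    const char* name;
    ExportedFn address;
};

static const Export kExports[] = {
    { "double_it", doubleIt },
    { "flip_sign", flipSign },
};

// Stand-in for GetProcAddress or dlsym: returns the function's address,
// or a null pointer if no function with that name is exported.
ExportedFn findSymbol(const char* name) {
    for (const Export& e : kExports) {
        if (std::strcmp(e.name, name) == 0) {
            return e.address;
        }
    }
    return nullptr;
}
```

An application would test the result before calling through it, and would treat a null result for a required function as a sign that the library is not a compliant plug-in.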

For audio applications, plug-ins include input, DSP, and output modules. An input plug-in reads and perhaps decompresses a specific audio file format, returning raw samples that an audio program needs for playback. The application calls DSP plug-in modules to implement DSP filters on the audio stream during playback — EQ, stereo enhancement, echo chambers, and so on. Then the application calls an output plug-in to play the audio stream through one of the operating system's sound systems or write the stream to a specific audio file format.

The dominant Windows audio playback app is Microsoft's Windows Media Player. I don't know whether it supports an open plug-in architecture because I can't find a specification for one on MSDN. Another comprehensive product, Winamp (http://www.winamp.com/), implements plug-ins in an open architecture documented by example programs. Winamp has input plug-ins for virtually every audio file format, output plug-ins for WAV and MP3 files, and playback plug-ins for the Win32 wave and DirectX sound reproduction systems. (A playback plug-in is simply an output plug-in that plays audio rather than writing it to files.)

Linux has Xmms (http://www.xmms.org/), an application that its developers readily admit is a Winamp knockoff. Xmms employs a plug-in mechanism similar to that of Winamp and supports most file formats and Linux sound systems.

Audio applications often include plug-in systems for what they call "visualizations," a way that the program graphically displays the audio signal as it plays.

Other applications use the plug-in approach. Cool Edit Pro (http://www.syntrillium.com/) is an advanced audio processing app that implements the functions of a digital recording studio. Cool Edit calls its input/output plug-ins "filters" and recognizes them by the filetype .flt, a convenient pseudonym for a DLL. Netscape Navigator and Internet Explorer use plug-ins to implement embedded objects in web pages — well, at least Netscape does. More about that later.

The plug-in concept has great potential for developers. Your application can begin by supporting a limited set of data formats. By publishing the plug-in API, you can support third-party plug-ins. By using the API of an existing application, you can leverage the plug-ins that are already available.

Suppose you want to develop a specialized audio application, something like the instrument eliminator I'm working on. You can use an existing application, such as Winamp, to test your audio processing plug-ins by writing them to comply with Winamp's plug-in architecture. Once the plug-ins work, you can write your application to implement the same plug-in architecture as Winamp and bundle your application with your specialized plug-ins. Or you can require users to acquire and install Winamp. With proper licensing, you might even be able to include Winamp with your distribution. If existing plug-ins support the input/output formats you need, you can implement your application by using the existing plug-ins, getting, of course, permissions and licenses to use and distribute them.

Winamp Plug-Ins

Winamp's plug-in architecture is open, and you can download its SDKs for input, output, visualization, and DSP plug-ins. The SDKs are nothing more than example DLLs in source code that implement basic audio processing functions. There is no documentation. You have to figure out the plug-in APIs from the examples, and the examples are incomplete. There is no example DSP module for audio files with 8-bit samples, for example. Using the 16-bit logic except with streams of 8-bit values just generates a lot of noise that threatens to take out your speaker system.

The example input plug-in source code lets you build two plug-ins. The first plug-in generates a tone and sends it to the audio player. The other reads a file of raw PCM recorded as 44,100 samples per second, 16-bit samples, two channels. The first one doesn't work. Its commented documentation tells you to open an imaginary file named tone://nnnn, where nnnn is the frequency of the tone you want to generate. But the Windows File Open dialog never lets the application call the plug-in because the file does not exist.

Winamp uses naming conventions to know what plug-in DLLs are available. They must be located in the winamp\plugins subdirectory. If a DLL's filename begins with "dsp_" Winamp considers it a DSP plug-in. Winamp loads the DLL, queries it for the winampDSPGetHeader2 function and, if the function exists, calls the function to retrieve data relative to the plug-in's DSP modules. The other kinds of Winamp plug-ins use similar name prefixes.
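A host could sort candidate files by prefix along these lines. Only the "dsp_" prefix comes from the discussion above. The "in_", "out_", and "vis_" prefixes, the PluginKind enumeration, and the classify function are assumptions made for the sketch, and the comparison is case sensitive, which a real Windows host would probably not want.

```cpp
#include <string>

enum class PluginKind { Input, Output, Dsp, Visualization, Unknown };

// Guess a plug-in's kind from the start of its filename (no directory part).
PluginKind classify(const std::string& fileName) {
    // rfind with position 0 can only match at the very start of the string.
    auto startsWith = [&fileName](const char* prefix) {
        return fileName.rfind(prefix, 0) == 0;
    };
    if (startsWith("dsp_")) return PluginKind::Dsp;
    if (startsWith("in_"))  return PluginKind::Input;          // assumed prefix
    if (startsWith("out_")) return PluginKind::Output;         // assumed prefix
    if (startsWith("vis_")) return PluginKind::Visualization;  // assumed prefix
    return PluginKind::Unknown;
}
```

The prefix is only a first filter. The host still has to load the DLL and confirm that it exports the expected function before treating it as a plug-in.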

Winamp's plug-in architecture is C-centric because DLLs export function names. C++ mangles function names for purposes of type safety across translation units, so a C++ function cannot be exported in a DLL unless the program calling the function knows the function name's mangling characteristics, which can vary depending on whether the same compiler built the calling program and DLL. Programs such as Winamp obviously cannot expect mangled exported DLL function names.

To implement a Winamp DSP plug-in, the DLL exports the unmangled function winampDSPGetHeader2, which returns to the caller the address of a winampDSPHeader structure object. This structure is the plug-in's header. It contains the Winamp version number, a text string for the plug-in's name, and the address of a function in the plug-in that Winamp calls. Winamp calls the function several times, passing a parameter that starts at zero and increments with each call. The plug-in returns addresses of winampDSPModule structures that define the API for each logical module the plug-in implements. (A DSP plug-in can implement multiple modules. The example DSP plug-in from http://www.winamp.com/ implements echoes, pitch control, and vocal elimination modules, for example.) When there are no more modules to report, the plug-in function returns zero. Winamp uses this information to build a listbox from which users choose a module while the plug-in is activated. Winamp uses the function addresses in the winampDSPModule structures to call functions to initialize and configure the module, to process the audio samples before Winamp sends them to the output, and to clean up when users exit from the module. The module structures each contain a name for the module that Winamp puts in the listbox for the plug-in.

In short, Winamp plug-ins export one function that returns the address of a plug-in header structure that contains the address of another function, which returns the addresses of module headers that contain the addresses of the plug-in's API functions. Couldn't be simpler? Oh, yes it could.
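That chain of indirection is easier to follow in code. The sketch below uses cut-down stand-ins rather than the SDK's own winampDSPHeader and winampDSPModule definitions. The structure layouts, the exported name demoGetHeader, and the two toy modules are all hypothetical, and both sides of the interface live in one file so the example can be compiled by itself.

```cpp
#include <string>
#include <vector>

// Simplified stand-ins for the module and header structures.
struct ModuleInfo {
    const char* description;
    int (*modifySamples)(ModuleInfo* self, short* samples, int count);
};

struct PluginHeader {
    int version;
    const char* description;
    ModuleInfo* (*getModule)(int index);  // returns null past the last module
};

// ---- Plug-in side ----
static int passThrough(ModuleInfo*, short*, int count) { return count; }

static int mute(ModuleInfo*, short* samples, int count) {
    for (int i = 0; i < count; ++i) samples[i] = 0;
    return count;
}

static ModuleInfo gModules[] = {
    { "Pass-through", passThrough },
    { "Mute", mute },
};

static ModuleInfo* getModule(int index) {
    const int n = static_cast<int>(sizeof gModules / sizeof gModules[0]);
    return (index >= 0 && index < n) ? &gModules[index] : nullptr;
}

static PluginHeader gHeader = { 1, "Demo plug-in", getModule };

// The single exported, unmangled entry point.
extern "C" PluginHeader* demoGetHeader() { return &gHeader; }

// ---- Host side ----
// Call getModule with 0, 1, 2, ... until it returns null, collecting names.
std::vector<std::string> listModules(PluginHeader* header) {
    std::vector<std::string> names;
    for (int i = 0; ModuleInfo* m = header->getModule(i); ++i) {
        names.push_back(m->description);
    }
    return names;
}
```

The host never needs to know how many modules a plug-in has. It just keeps asking until the plug-in says there are no more.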

Xmms Plug-Ins

Xmms plug-ins take a simpler approach. Global symbols in SOs are exported by default, and applications that load the library can query for any global symbol's address by name. Plug-ins simply declare as externs all the data and functions the application's API needs. If a plug-in complies with the API, applications can use the plug-in. This approach requires that plug-in API functions be declared inside an extern "C" block, or that the plug-in itself be a C program.

C++ Plug-Ins

The example plug-ins you download from Winamp's and Xmms's developer pages are C programs, but they don't need to be. The exported functions can be defined within an extern "C" block. The functions passed as addresses in structures can be static C++ member functions. The architecture itself cries for a C++ implementation, so I built one for DSP plug-ins to support building C++ Winamp plug-ins as prototypes for my project.
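Here is a minimal sketch of those two techniques together. It is not taken from the framework: HostModule, InitFn, Tremolo, and getHostModule are hypothetical names, and the host structure is reduced to one function pointer and one user-data pointer.

```cpp
// The C-style callback and structure a host might expect (hypothetical).
typedef int (*InitFn)(void* userData);

struct HostModule {
    InitFn init;
    void* userData;  // handed back to the callback on every call
};

class Tremolo {
public:
    // A static member function has no hidden `this` parameter, so its
    // address fits an ordinary function pointer. It forwards to the object.
    static int initThunk(void* userData) {
        return static_cast<Tremolo*>(userData)->init();
    }

    int init() {
        ready_ = true;
        return 0;  // zero means success in this sketch
    }

    bool ready() const { return ready_; }

private:
    bool ready_ = false;
};

static Tremolo gTremolo;
static HostModule gModule = { &Tremolo::initThunk, &gTremolo };

// Defined with C linkage so the exported name is not mangled.
extern "C" HostModule* getHostModule() { return &gModule; }
```

Strictly speaking, a static member function has C++ language linkage even when it is stored in a structure that a C program reads. The compilers I know of use the same calling convention for both, but that is an observation about implementations, not a guarantee from the language.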

Dspplugin.h and dspplugin.cpp (available electronically; see "Resource Center," page 5) define the DSPPlugin class and the winampDSPModule and winampDSPHeader structures that Winamp needs. Dspmodule.h and dspmodule.cpp (also available electronically) define the DSPModule base class. These files are common to all DSP plug-in projects.

To build a DSP plug-in, you write a DLL that instantiates a DSPPlugin object in its DllMain function. Enhancer.cpp (available electronically) defines the DllMain function for a plug-in that uses this framework to implement stereo and echo audio enhancement modules. The plug-in offers nothing new in the way of digital signal processing. Its purpose is to demonstrate the use of the plug-in C++ framework. The DSPPlugin constructor accepts a string initializer for the plug-in's title. This is the name Winamp displays in the dropdown listbox when users activate the plug-in. If you don't provide a name, the framework provides the default value, "(none)". The constructor also accepts a variable argument list of addresses of DSPModule derived objects. The last argument is followed by a zero address to terminate the list.
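A constructor that takes a zero-terminated variable argument list might look something like the following. This is a sketch of the technique, not the DSPPlugin source. PluginSketch and ModuleSketch are hypothetical names, and the default title is left out to keep the example short.

```cpp
#include <cstdarg>
#include <cstddef>
#include <string>
#include <vector>

// Minimal stand-in for a module base class.
struct ModuleSketch {
    std::string name;
};

class PluginSketch {
public:
    // title is followed by any number of ModuleSketch* arguments and then
    // a null ModuleSketch* that marks the end of the list.
    PluginSketch(const char* title, ...) : title_(title) {
        va_list args;
        va_start(args, title);
        while (ModuleSketch* m = va_arg(args, ModuleSketch*)) {
            modules_.push_back(m);
        }
        va_end(args);
    }

    const std::string& title() const { return title_; }
    std::size_t moduleCount() const { return modules_.size(); }
    ModuleSketch* module(std::size_t i) const { return modules_[i]; }

private:
    std::string title_;
    std::vector<ModuleSketch*> modules_;
};
```

A caller would write something like PluginSketch plugin("Enhancer", &echo, &stereo, static_cast<ModuleSketch*>(nullptr)). The cast on the terminator matters. A bare 0 is passed through the ellipsis as an int, which is not what va_arg expects to read and is narrower than a pointer on most 64-bit systems. For the same reason, the address of a derived object should be converted to the base pointer type before it goes into the list.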

Objects in the module list are of classes that you derive from DSPModule. You can also include an object of DSPModule itself, which presents the title "(none)" and does no DSP on the audio samples. You must derive a class from the DSPModule class for each module the plug-in implements. The derived class's constructor may include an initializer for the DSPModule with a string name for the module. The derived class overrides any virtual functions that correspond to the Winamp DSP plug-in API. The functions are initmodule, quitmodule, configmodule, modifysamples, and modifysample. If you do not override some of these functions, the base class provides default behavior. The base class takes care of associating these member functions with the C functions that Winamp expects to call.

Winamp calls the initmodule function when users first select the module from the plug-in's dropdown listbox and the quitmodule function when users select a different module or deselect the plug-in. These functions act as constructor and destructor for the module, and they are where you should put such code. The module object itself is likely to persist outside of the time the user has the module selected. Winamp calls the configmodule function whenever users click the Configure button while the plug-in is selected. Each module can have its own configmodule function, which typically displays a dialog that lets users modify how the plug-in module operates. These modifications occur in real time as Winamp plays or writes the audio stream.

Echo.h, echo.cpp, stereo.h, and stereo.cpp (available electronically) define the Echo and Stereo classes derived from DSPModule. Because the Echo class maintains a sample buffer in which to build the echoes, it overrides the initmodule and quitmodule functions to maintain that buffer. Both classes override the DSPModule::modifysample function.

DSPModule::modifysamples provides default behavior for the function that Winamp calls to process blocks of audio samples. The function iterates the samples and calls one of the virtual DSPModule::modifysample functions, depending on whether the samples are 16 or 8 bits. The default behavior for these functions is to do nothing. The derived module class overrides DSPModule::modifysamples if the module needs to do its own iterating. Otherwise, the class overrides DSPModule::modifysample to process the samples one at a time (one pair at a time for stereo audio streams).
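The split between a block-level function and per-sample functions can be sketched as follows. The function names modifysamples and modifysample follow the framework, but the signatures, the ModuleBase class, and the HalfVolume module are assumptions for illustration. The sketch also visits samples one at a time even in stereo streams, where the framework hands over a left and right pair.

```cpp
#include <cstdint>

class ModuleBase {
public:
    virtual ~ModuleBase() = default;

    // Block-level entry point. numsamples counts frames, nch is the number
    // of channels, and bps is bits per sample. Returns the frame count.
    virtual int modifysamples(void* samples, int numsamples, int bps, int nch) {
        const int total = numsamples * nch;
        if (bps == 16) {
            auto* p = static_cast<std::int16_t*>(samples);
            for (int i = 0; i < total; ++i) modifysample(p[i]);
        } else if (bps == 8) {
            auto* p = static_cast<std::uint8_t*>(samples);
            for (int i = 0; i < total; ++i) modifysample(p[i]);
        }
        return numsamples;
    }

protected:
    // Per-sample hooks. The defaults leave the audio untouched.
    virtual void modifysample(std::int16_t&) {}
    virtual void modifysample(std::uint8_t&) {}
};

// A toy module that overrides only the per-sample hooks.
class HalfVolume : public ModuleBase {
protected:
    void modifysample(std::int16_t& s) override {
        s = static_cast<std::int16_t>(s / 2);  // signed, silence is 0
    }
    void modifysample(std::uint8_t& s) override {
        // 8-bit PCM is unsigned with silence at 128, so scale around 128.
        s = static_cast<std::uint8_t>(128 + (s - 128) / 2);
    }
};
```

The two hooks differ because 16-bit PCM samples are signed and centered on zero, while 8-bit PCM samples are unsigned and centered on 128. Running signed 16-bit arithmetic over a buffer of 8-bit values treats that offset as signal, which produces noise.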

Internet Explorer Unplugged

Browsers use plug-ins, too. An application that displays its data in a specific format can write a plug-in for Internet browsers to display the data on web pages that embed links to objects of the data. Users who install the application's plug-in can use their browsers to view the data objects in the application's format. The API for such plug-ins originated in Netscape Navigator. Other browsers, in the interest of platform-independent web applications, implemented Netscape's plug-in API. There are plug-ins for Apple QuickTime, Macromedia Flash Player and Shockwave, Adobe Acrobat Reader, Real Player, and many more. You can read the specs for browser plug-ins and download the SDK at http://developer.netscape.com/docs/manuals/communicator/plugin/index.htm.

I use a music notation program called "NoteWorthy Composer" (http://www.ntworthy.com/) that includes a freely distributable browser plug-in. It responds to embedded objects of the musical score file format and displays and prints the score in musical notation. It also plays the score on the user's MIDI system. This plug-in lets arrangers and composers share their work with people who read music but do not write it and do not need the full application. Its availability encourages composers and arrangers to use NoteWorthy Composer.

In November 2001, DDJ published a letter to the editor that told us that as of Version 6.0, Internet Explorer removed support for Netscape-style plug-ins. Furthermore, anyone installing Service Pack 2 to IE 5.5 gets the same removal. Likewise, if you buy Windows XP, there's no browser plug-in support. (One wag on the NoteWorthy newsgroup suggested that Microsoft is now in the business of browser "downgrades.")

Removal of browser support for plug-ins derails the software strategies of organizations that build web-based applications and depend on plug-ins to support their users in a browser-independent environment. I can understand releasing a new product that does not support everything, and I can understand Microsoft aggressively promoting its own proprietary ActiveX architecture over platform-independent architectures. I don't agree, but I understand.

On the one hand, plug-ins are indeed browser independent, running on any browser that runs on the particular operating system — Windows in this case. On the other hand, plug-ins are not really platform independent — they are binary dynamic libraries native to a specific CPU and operating system. If you are browsing from a Mac, for example, you only get those plug-ins that someone has found time to port to the Mac. Same with Linux.

That plug-in dependency/nondependency split personality must have piled up some rocks and hard places in the Redmond board room. If you use these plug-ins, you need Windows ("Yeehaw! We're number one!"). If you use these plug-ins, you don't need Internet Explorer ("Oh, no! We're number two!"). Who wins? Neither side. The only thing to do is build a browser that Windows users need but that doesn't support plug-ins. Now how can Microsoft get away with that? What about all those Windows users who don't have any compelling need to upgrade? How can Microsoft force its users to upgrade?

Enter the Nimda virus, which takes advantage of security holes in IE and Outlook to spread itself. To plug those holes, you must install an IE upgrade. Guess what you get in the bargain? You got it, no more plug-in support. Whoever contrived and distributed the Nimda nonsense ought to send Microsoft a big invoice. (Hmm. I wonder? Nah, they wouldn't do that.)

But taking measures to explicitly eliminate something that works just fine is fixing something that ain't broke. How do you justify doing it? I asked a "Microsoft spokesperson" quite specifically why the company found it necessary to remove the feature. Here's the response:

Microsoft made the decision to not support old style Netscape plug-ins in IE 6.0 and IE 5.5. We elected not to support Netscape's new plug-in scheme.

Content creators can continue to create plug-in components that are built on ActiveX technologies, as has been the case since Internet Explorer 3.

Microsoft is working w/ third parties to be sure they understand the implications of these changes, so that they can continue to deliver the functionality consumers are looking for in a browsing experience.

I'm not sure what question the spokesperson was answering, but it sure wasn't the one I asked. Draw your own conclusions. Words fail me in the expression of mine. Well, actually, there are perfectly good words, and I know them all, but this is a family publication. Nice words like "evasive," "predatory," and "unfair" don't quite do the job. Where is Judge Jackson when you really need him?

Some plug-in builders and users might add warnings to web pages that have embedded content: "If you don't hear or see anything, maybe you have the wrong browser. Click here to download Netscape. Be sure and make it your default Internet browser."

In the meantime, I see a great opportunity for enterprising programmers. Build a general-purpose software transparency layer between ActiveX and Netscape-style plug-ins. Build it so that all those orphaned plug-ins work again with the crippled Internet Explorer. Build it, and the world will beat a path to your download site, thumbing their collective noses or flinging some other digit-related salute towards the halls of ivy at Redmond.

DDJ

