
The Pi Transform

Al Williams, Dr. Dobb's Bloggers

August 05, 2013

Last time, I outlined how you could use the GNU Scientific Library — in theory — to decode dual tones (like the ones a TouchTone telephone generates). The code to do it is available online and I want to point out some of the key features of the code and how you might apply it to different problems.

The decoder uses the FFT algorithm — there's no Goertzel in the GSL — and I tried to show a variety of options you might want for an FFT. In the case of TouchTone decoding, you really only need a slow sample rate and a small buffer because you want fast detection more than fine frequency resolution. The spreadsheet I showed last time lets you play "what if" with different sampling frequencies and buffer sizes to see what the detection time and frequency resolution will be for different scenarios.

In this case, there were two limitations: The sound card only supports so many sampling rates and the GSL FFT I wanted to use requires the number of samples to be a power of two. If you look at the code, here's the configuration:

#define BUFSIZE 256     // size of buffer                                                          
#define SAMPRATE 8000   // Sample rate                                                             
#define BUCKET (SAMPRATE/BUFSIZE)  // Computed size of each bucket    

The truth is, you would like a faster total sample time so you could select more samples (which would improve noise rejection). But the sound card I am using won't go lower than 8kHz, and with these parameters, you can't wait for too many samples if you want to read very short bursts of tones. Here's the setup for that:

#define SAMPLE_REPEAT 2   // how many times do we see a tone before emitting #                     
#define GAP 1             // Skip samples after detection     

The GAP parameter forces the code to ignore a sample (in this case) after a successful detection. You can experiment with setting this to 0 (which lets you increase SAMPLE_REPEAT — the sum of these determines the shortest burst you can reliably detect).
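
To put numbers on that: each buffer takes BUFSIZE/SAMPRATE seconds to fill, so SAMPLE_REPEAT plus GAP buffers set the floor on burst length. Here is a quick back-of-the-envelope check (my own throwaway arithmetic, not part of the decoder):

#include <stdio.h>

/* Rough timing check using the defines shown above. This is my own sketch,
   not code from the original decoder. */
#define BUFSIZE 256
#define SAMPRATE 8000
#define SAMPLE_REPEAT 2
#define GAP 1

int main(void)
{
  double buffer_ms = 1000.0 * BUFSIZE / SAMPRATE;       /* 32 ms to fill one buffer */
  double burst_ms = (SAMPLE_REPEAT + GAP) * buffer_ms;  /* about 96 ms minimum burst */
  printf("buffer = %.1f ms, shortest reliable burst is roughly %.1f ms\n",
         buffer_ms, burst_ms);
  return 0;
}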

Because of the Nyquist limit, the result will only be meaningful up to 4kHz. It isn't just that the top half of the buffer will be useless in the results. The GSL library will use the upper half of the buffer to store the imaginary part of the result, so the routine doesn't even try to return the (meaningless) top part of the result. The BUCKET symbol tells you how many Hertz each bucket in the result is worth.

It is annoying to have to compute the bucket number (that is, the index into the result buffer) that corresponds to a particular frequency, so there's a #define for that too:

#define TONE(n) ((int)((n)/BUCKET+0.5))  // Find bucket from tone

Note that the macro rounds the result to the nearest bucket by adding 0.5 before converting to an integer. You can see this being used later in the code to easily define the table of tones to search for:

// What are the tones we are looking for?
int tonemap[]={ TONE(697), TONE(770), TONE(852), TONE(941),
                TONE(1209), TONE(1336), TONE(1477)};
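
With these settings the buckets work out to roughly 31 Hz apiece (the exact spacing is 8000/256 = 31.25 Hz, but BUCKET is computed with integer math, so it truncates to 31). If you want to see where each tone lands, here is a throwaway check of the arithmetic (my own scratch code, not part of the decoder):

#include <stdio.h>

/* Scratch program to check which bucket each DTMF frequency maps to with
   the defines above. Not part of the original decoder. Note that BUCKET is
   an integer, so the division in TONE() truncates before 0.5 is added. */
#define BUFSIZE 256
#define SAMPRATE 8000
#define BUCKET (SAMPRATE/BUFSIZE)
#define TONE(n) ((int)((n)/BUCKET+0.5))

int main(void)
{
  int freqs[] = { 697, 770, 852, 941, 1209, 1336, 1477 };
  for (int i = 0; i < 7; i++)
    printf("%d Hz -> bucket %d\n", freqs[i], TONE(freqs[i]));
  /* Prints buckets 22, 24, 27, 30, 39, 43, and 47 with these settings */
  return 0;
}

By my arithmetic, 770 Hz comes out in bucket 24 even though its nearest bucket center is 781.25 Hz (bucket 25); the AVG option described below averages in the adjacent buckets, which helps with cases like that.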

There are two defines used for setting the thresholds (that is, how loud the signal has to be before the code tries to decode it). The ATHRESHOLD define prevents processing of signals that don't have at least some data louder than a particular amplitude. The way the PortAudio library is set up (see last time), the samples will range from -1.0 to 1.0. In the code, the normalize function finds the maximum amplitude (using fabs so that negative samples count as positive). If the biggest magnitude in the sample set is less than ATHRESHOLD, no further detection is attempted. If it is loud enough and the NORMALIZE flag is set, the code scales each sample in the buffer so that the loudest sample becomes 1 (or -1).
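
The shape of that logic is roughly the following (a sketch of what was just described, not the article's actual source; the buffer type, sample count parameter, and the ATHRESHOLD value here are my own placeholders):

#include <math.h>

/* Sketch of the normalize step described above; not the original code.
   Returns 0 when the buffer is too quiet to bother decoding. The value
   of ATHRESHOLD here is a placeholder. */
#define ATHRESHOLD 0.1
#define NORMALIZE 1

static int normalize(double *buf, int n)
{
  double max = 0.0;
  for (int i = 0; i < n; i++)
    if (fabs(buf[i]) > max)
      max = fabs(buf[i]);
  if (max < ATHRESHOLD)
    return 0;                 /* too quiet; skip detection entirely */
#if NORMALIZE
  for (int i = 0; i < n; i++)
    buf[i] /= max;            /* scale so the loudest sample is +/-1 */
#endif
  return 1;
}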

The other threshold define is THRESHOLD. Any result from the FFT less than that is considered "no tone". The code deals with the magnitude of the complex result, which is simple thanks to the complex data type C99 added to the language (I mentioned this last time). Here's how a particular bucket gets converted from a real part (in fftbuf[tonemap[i]]) and an imaginary part (in fftbuf[tonemap[i]+BUFSIZE/2]):

complex double c0=fftbuf[tonemap[i]]+fftbuf[tonemap[i]+BUFSIZE/2]*I;

Then it is simple to get the magnitude:

stats.probe[i]=cabs(c0);
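
If you haven't used the C99 complex support, that really is all there is to it. A minimal standalone example (mine, not from the decoder):

#include <complex.h>
#include <stdio.h>

int main(void)
{
  /* cabs() returns the magnitude, sqrt(re*re + im*im); link with -lm */
  complex double c0 = 3.0 + 4.0 * I;
  printf("%f\n", cabs(c0));   /* prints 5.000000 */
  return 0;
}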

If you are set up with narrow buckets, you may want to average the bucket you think you are looking for with the adjacent buckets. This also allows you to accommodate senders that are not quite perfect. The AVG define sets that behavior.
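
I won't claim this is exactly how the program does it, but the averaging amounts to something like this sketch (the helper names are mine; the bucket layout is the real/imaginary split described above):

#include <complex.h>

/* Sketch of averaging a target bucket with its neighbors when AVG is set.
   My own illustration of the idea, not the original source. Assumes the
   BUFSIZE define and fftbuf layout shown earlier; the DTMF buckets are
   well away from the ends of the buffer, so idx-1 and idx+1 are safe. */
#define BUFSIZE 256
#define AVG 1

static complex double bucket(double *fftbuf, int idx)
{
  return fftbuf[idx] + fftbuf[idx + BUFSIZE/2] * I;
}

static double tone_magnitude(double *fftbuf, int idx)
{
#if AVG
  /* average the target bucket with the one on either side */
  return (cabs(bucket(fftbuf, idx - 1)) +
          cabs(bucket(fftbuf, idx)) +
          cabs(bucket(fftbuf, idx + 1))) / 3.0;
#else
  return cabs(bucket(fftbuf, idx));
#endif
}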

There are two more defines that control windowing. The Fourier transform is a mathematical wonder. However, computers aren't all that great at math. The transform, as the mathematicians define it, expects us to provide a function that describes the waveform for all time. Our Raspberry Pi can only sample the signal at a few points for a limited period of time. The math doesn't care. From the math point of view, we have a signal that is zero from the beginning of time, suddenly has a brief bit of activity, and then stays zero for all eternity thereafter.

This can skew the results because you have a very abrupt transition from zero to something (and another one from something to zero at the end). To combat that, it is sometimes useful to multiply the input by a window function that gradually introduces the signal and then gradually fades it away at the end. This has its own problems, of course. A "rectangular" window is really no window (for the purposes of this program, anyway). You select this with WINDOW_RECT. You can experiment with a Hamming window by setting WINDOW_HAMMING to 1 (you should only set one of these to 1 at any given time). There are other windows possible, but I didn't even need Hamming. You can read more about window functions on Wikipedia if you are interested.
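
For reference, the Hamming window is just a raised cosine applied across the buffer before the transform. A generic version looks like this (the textbook formula, not the program's exact code):

#include <math.h>

/* Multiply the sample buffer by a Hamming window before running the FFT.
   Standard textbook formula; not lifted from the original program. */
static void hamming(double *buf, int n)
{
  for (int i = 0; i < n; i++)
    buf[i] *= 0.54 - 0.46 * cos(2.0 * M_PI * i / (n - 1));
}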

The code isn't much different from the first PortAudio program I showed. That one gathered some basic statistics about the audio input and communicated them via a structure. The callback for this program is pretty similar, but instead of just measuring amplitude it measures frequencies using the GSL FFT call. However, it still communicates with the rest of the program via a structure:

// communicate with callback                                                                       
typedef struct
  {
    unsigned int count;   // packet #                                                              
    // The probe array will get a value for the magnitude of each tone                             
    double probe[sizeof(tonemap)/sizeof(tonemap[0])];
    // Let the main program see any flags the ISR is reporting                                     
    PaStreamCallbackFlags isrflags;
  } audiostats;

The fields are pretty simple:

  • count — An increasing count so the program can tell when the ISR processed a new sample
  • probe — The magnitudes of the frequencies of interest
  • isrflags — PortAudio reports various things to the callback; this field lets the main program see those reports

Armed with these tools, the rest of the code is pretty straightforward. It waits for something to show up in the probe array and then does simple logic to convert that to a number which it prints. You can read the code online if you want to see exactly how.
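
If you don't want to dig through the source, the logic boils down to picking the strongest row tone and the strongest column tone and looking them up in the standard keypad layout. One way that might look (my own sketch using the probe ordering from tonemap; it is not the article's actual code, and the THRESHOLD value is a placeholder):

/* Sketch of turning probe[] magnitudes into a key. My own illustration, not
   the original decoder. probe[0..3] hold the row tones (697/770/852/941 Hz)
   and probe[4..6] hold the column tones (1209/1336/1477 Hz), matching the
   order of tonemap. */
#define THRESHOLD 1.0   /* placeholder; the real program defines its own value */

static char decode_key(const double *probe)
{
  static const char keys[4][3] = {
    { '1', '2', '3' },
    { '4', '5', '6' },
    { '7', '8', '9' },
    { '*', '0', '#' }
  };
  int row = 0, col = 4;
  for (int i = 1; i < 4; i++)        /* strongest of the four row tones */
    if (probe[i] > probe[row]) row = i;
  for (int i = 5; i < 7; i++)        /* strongest of the three column tones */
    if (probe[i] > probe[col]) col = i;
  if (probe[row] < THRESHOLD || probe[col] < THRESHOLD)
    return 0;                        /* no valid key press detected */
  return keys[row][col - 4];
}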

If you try to compile this code on the Raspberry Pi (or, really, just about any Linux system), you'll want to include all the required libraries. Here's the command line I used:

gcc -g -o ttdecode ttdecode.c -lportaudio -lgsl -lgslcblas

Also, make sure to enable your recording device on the Pi and turn up the gain (alsamixer is your friend). The program works pretty well on the Pi, although it can get some false positives with enough noise. A shorter sampling period would help, but that has its own problems, too.

When I started this series of blogs, I asked: "Can you do signal processing on the Pi?" I think the answer is a resounding yes. The program was up to the task of doing a real-time FFT and I didn't even spend much time trying to optimize it — I just grabbed off-the-shelf resources. These inexpensive and relatively powerful Linux platforms open up many new possibilities for embedded designers.
