
Al Williams

Dr. Dobb's Bloggers

Raspberry Sounds Continued

July 19, 2013

Last time, I looked at using Portaudio to read sound from a USB microphone connected to a Raspberry Pi. This time, I want to reverse the process and create sounds. Since the Pi has audio output and runs Linux, you could, of course, use any of several Linux tools to play a WAV or MP3 file. What I'm interested in, however, is generating sounds in software.

If you recall from last time, Portaudio allowed you to open an input stream and when it had audio data, it called a callback function. Sound generation is very similar. You open an output stream and when the library needs data, it calls a callback so you can provide it.

In this case, I wanted to generate TouchTones (technically known as dual-tone multi-frequency signaling, or DTMF). In other words, if you wanted to use the Pi as part of a phone project (or a ham radio project), how would you generate the tones? Although you could just record twelve or sixteen WAV files and play them, I wanted something more flexible that also shows how to use Portaudio.

In concept, generating DTMF is simple. Each key corresponds to two tones. The designers picked tones that would not interfere with each other and that were easy to detect using 1963 technology. There are even standards for the minimum duration of the tone (50 ms), the space between the tones (45 ms), and the total duration of a digit (at least 100 ms).

Of course, the expectation is that a human is pressing the buttons, so longer is fine (the numbers above are from ANSI T1.401-1988, although there are other standards with slightly different numbers). Most consumer devices use at least 70 ms just to be safe.
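To make those durations concrete at the sample level, here is a minimal sketch that converts milliseconds to a sample count. The helper name ms_to_samples and the example rates are assumptions for illustration; they are not part of the online listings.

```c
#include <assert.h>

/* Hypothetical helper: number of samples needed to cover a duration in
   milliseconds at a given sample rate. Rounds up so the tone is never
   shorter than requested. */
unsigned long ms_to_samples(unsigned ms, unsigned long sample_rate)
{
    assert(sample_rate > 0);
    return (ms * sample_rate + 999UL) / 1000UL;
}
```

At 8,000 samples per second, for instance, the 50 ms minimum tone works out to 400 samples.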

There are only eight frequencies required for DTMF. These eight are divided into four high frequency tones and four low frequency tones. Each digit is a combination of one tone from each set. The mathematically astute might notice that four times four is sixteen, but there are only twelve keys on a typical phone. The standard allows for four control keys (usually called A, B, C, and D) that rarely appear on consumer equipment. My code ignores those keys.
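The key-to-frequency mapping can be sketched as a small lookup table. The frequencies are the standard DTMF values; the function name dtmf_lookup and the table layout are my own assumptions, and unlike the code in the listings, this sketch also accepts the A through D keys.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Standard DTMF row (low group) and column (high group) frequencies in Hz. */
static const int dtmf_low[4]  = { 697, 770, 852, 941 };
static const int dtmf_high[4] = { 1209, 1336, 1477, 1633 };

/* Keypad in row-major order; the fourth column holds the A-D control keys. */
static const char *dtmf_keys = "123A456B789C*0#D";

/* Hypothetical helper: look up the two tones for a key.
   Returns 1 and fills *low and *high on success, 0 for an unknown key. */
int dtmf_lookup(char key, int *low, int *high)
{
    const char *p;
    assert(low != NULL && high != NULL);
    if (key == '\0') return 0;      /* strchr would match the terminator */
    p = strchr(dtmf_keys, key);
    if (p == NULL) return 0;
    *low  = dtmf_low[(p - dtmf_keys) / 4];   /* row selects the low tone */
    *high = dtmf_high[(p - dtmf_keys) % 4];  /* column selects the high tone */
    return 1;
}
```

For example, the 5 key sits in the second row and second column, so it is the sum of 770 Hz and 1336 Hz.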

Most modern equipment can read tones even if they are not exactly correct. For example, the standard requires the level of the high-frequency tone to be within -8 to +4 dB of the low-frequency tone (measured at the receiver). Depending on the application, that means one tone might need to be generated louder than the other to account for increased loss at either high or low frequencies. In practice, most modern equipment does fine without any such adjustment to the level difference (known as the "twist").

You can find the entire code in the online listings. It is very similar to the last Portaudio program, except it opens an output stream and uses floating-point samples instead of integer values. The callback, of course, is extremely different since it is producing data instead of consuming it. Consider making a "pure" tone:

// Audio data comes from this callback
static int paCallback(const void *in, void *out, unsigned long framesPerBuffer,
                      const PaStreamCallbackTimeInfo *timeinfo,
                      PaStreamCallbackFlags statusFlags,
                      void *userdata)
{
  unsigned i;
  float *buf=(float *)out;
  ttdata *data=(ttdata *)userdata;

  // fill frame with sine wave
  // (naive version: frequency is not yet converted to radians per sample)
  for (i=0;i<framesPerBuffer;i++)
    *buf++=sin(data->f1*data->phase1++);
  return paContinue;
}

The sin function returns a number from -1 to 1. You usually express a frequency in Hertz, but sin expects an angle in radians, so you need the current phase times the frequency expressed in radians. To convert Hertz to radians per second, multiply the frequency by 2*pi. The current phase is the sample number (ranging from 0 to the number of samples in one second), so dividing it by the sample rate gives the time in seconds. The argument to sin in the loop above should therefore be 2*PI*f*phase/SAMPLE_RATE, where f is the frequency in Hertz.

However, it is wasteful to compute 2*pi*f each time through the loop. Here are a few lines from the actual callback:

  temp1=PI2N*data->f1;   // PI2N is 2*PI/SAMPLE_RATE
  temp2=PI2N*data->f2;

  // fill frame with sine waves
  for (i=0;i<framesPerBuffer;i++)
    {
      if (data->unmute)  // but only if not on mute
        . . .
      if (++data->phase1==SAMPLE_RATE) data->phase1=0;  // update phase counter
      . . .

The temp1 and temp2 variables do the bulk of the math once per callback. If you were really trying to optimize, you could store the frequency and the temp values away. Then a simple integer compare would detect when the frequency changed, and you would recompute only at that time. That seemed to be overkill for this little project, however.
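For the curious, that optimization might look something like the sketch below. The step_cache structure, the cached_step function, and the 8,000 Hz sample rate are all assumptions made for illustration; none of them appear in the listings.

```c
#include <assert.h>
#include <stddef.h>

#define SAMPLE_RATE 8000   /* assumed rate, in Hz */
#define PI2N (2.0 * 3.14159265358979323846 / SAMPLE_RATE)

/* Hypothetical cache: remembers the last frequency and its scaled value. */
typedef struct {
    int last_f;      /* frequency the cached value was computed for */
    double step;     /* cached PI2N * last_f, in radians per sample */
    int recomputes;  /* counter kept only to make the behavior observable */
} step_cache;

/* Return PI2N * f, recomputing only when f differs from the last call.
   A zero-initialized cache is valid because PI2N * 0 is 0. */
double cached_step(step_cache *c, int f)
{
    assert(c != NULL);
    if (f != c->last_f) {     /* cheap integer compare */
        c->last_f = f;
        c->step = PI2N * f;
        c->recomputes++;
    }
    return c->step;
}
```

The saving is one floating-point multiply per tone per callback, which supports the point that it is probably not worth the extra state here.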

The computation of the sine waves makes two calls to sin (one for each frequency component) and the calculations with TWIST make sure the total adds up to 1.0 (or -1.0). The default TWIST is defined as 0.6 so the high frequency is 60% louder than the low frequency.

Note that the buffer size and the sample rate are probably not the same, so the phase needs to survive between callbacks. When the count reaches the sample rate, the code resets it to zero.

To use the tone generation system, your main code needs to manipulate the ttdata structure. The f1 and f2 members take the two frequencies to generate (with f2 assumed to be the higher frequency). There is a member of the ttdata structure named unmute. If this member is non-zero, the system generates tones. The phase variable is in that structure as well, but you should not need to change it. It is essentially a static variable for the callback.

Not very painful. Even so, I wrote a handful of simple functions to serve as an API to the tone generation system:

  • run — Serves as a "main" for your logic that runs after Portaudio is set up and ready to go.
  • play — Generates tones for a specified duration and then maintains silence for a second duration.
  • tone — Generates a tone for a given digit with a specified duration and then maintains silence for the same duration.
  • us_busy — Generates a fake US-style busy signal for a certain duration.
  • us_ringback — Generates a fake ring signal for 250 ms.
  • us_startdialtone — Starts generating a dial tone.
  • us_stopdialtone — Stops generating a dial tone (or any tone, for that matter).
  • dial — Dials a string.

Here's the example run() function I included with the online listings:

// "main" program other than PA init
void run(ttdata *tt)
{
  . . .
}

Now that is easy! The last two pieces of code give you the tools you need to do audio input and output on the Raspberry Pi (or, actually, any platform that can handle PortAudio). The real question is: Can the Pi do any significant signal processing? I intend to find out. Stay tuned.
