

An Introduction to Linux Audio

by John Littler

Intro to Programming Linux Audio

Linux has come a long way in the last 10 years. At that time, if you were looking through the main audio and music applications on other operating systems, you would have struggled to find comparable, fully developed, apps on Linux. Nowadays, while no one would say the job was done, they could point to an assortment of high-quality applications that are getting real jobs done.

Having said that, there's still work to do on existing apps, and the whole future is open for those who want to try to get it started now; which is to say, there is no fundamental law stating that there shall only be sequencers, loopers, and the like. Whether designing the future or just having a play with sounds, Linux is a nice place to get started, for practical as well as possibly ideological reasons. The practical reasons have to do with the variety of APIs available to let you get into Linux audio programming in a way that suits your ambitions and skills. Once you've acquired some skills, you could join an existing development team for an app you like, or head off into the wilderness, hacking your own trail.

One side issue here is the business model. This is of interest to anyone hoping that what might start as an enjoyable hobby could have the possibility of being a job as well. First of all, there really aren't that many jobs in this field, and there still aren't that many if you include all the commercial audio software houses. They are there though. Academia is a similarly small area, but another possibility. One thing is for sure, project donations will not pay the rent. Consulting work gained as a result of your project work might do so, though. And if you come up with the next Big Thing, well, that's a different story.

Before we head into some specifics, if you're unfamiliar with this general area, there are a couple of podcast chats, one with Fernando Lopez-Lescano of CCRMA at Stanford and another with Paul Davis of the Ardour project, that cover a wide range of topics under this heading. The podcasts are available from the Technical University Berlin.

Now, let's have a look at what we're trying to do and the main options available for doing it.

The three main things to do are capturing (recording) audio, replaying it, and altering it. All of this comes under the heading of Digital Signal Processing (DSP). We'll be looking at the first two options: capturing and replaying.

What we want to do is talk to the sound card in the computer, tell it what to do, what sort of arrangement the data should have (bearing in mind the card's capabilities), and then store it somewhere.

This could be broadly shown as the following (from the Paul Davis tutorial on ALSA programming):

      open the device
      set the parameters of the device
      while (!done) {
           /* one or both of these */
           receive audio data from the device
           deliver audio data to the device
      }
      close the device

A look at the ALSA sound card matrix is a good starting point for learning about cards, but for much of what follows you don't need to be that deep into the machine (depending on what you want to accomplish).

Now, let's have a look at the ways we might get something happening.


Open Sound System (OSS), from 4Front Technologies, was the only supplier of sound card drivers for Linux until 1997 or '98. In those days, there was a free driver set and a commercial set, which offered a lot more. Now, OSS drivers are available free of charge for non-commercial use, and they have also been open source since June 2007.

ALSA (Advanced Linux Sound Architecture) now provides the kernel drivers, and OSS use is deprecated, but there may be circumstances where OSS is useful, including doing work on an existing OSS application. Hannu Savolainen, the man behind OSS who was originally responsible for Linux sound support, writes a blog that provides a fascinating backstory to all of this. He maintains that lots of developers continue to use OSS because they don't like the ALSA API and don't need its added features (and complication). However, there are easier ways into ALSA, which we'll get to in the next section.

If you've ever seen a jokey reference to cat somefile > /dev/dsp then you've seen an OSS interface and, by the way, the results of doing just that with, say, a text file can be extremely ugly.

The OSS API consists of Posix/Unix system calls: open(), close(), read(), write(), ioctl(), select(), poll(), and mmap().

The following is a simple audio playback program from the OSS documentation that plays a continuous 1 kHz sine wave. The well-documented code first gives an example of synthesis, and then goes on to set parameters, open and set up the audio device, and finally write the data to the device.


/* Headers needed by the example */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/soundcard.h>

int fd_out;
int sample_rate = 48000;

static void
write_sinewave (void)
{

This routine is a typical example of an application routine that produces
an audio signal using synthesis. It is actually a very basic "wave
table" algorithm: it uses precomputed sine function values for a
complete cycle of a sine wave, which is much faster than calling
the sin() function once for each sample. In other applications this
routine can simply be replaced by whatever the application needs to do.

  static unsigned int phase = 0;        /* Phase of the sine wave */
  unsigned int p;
  int i;
  short buf[1024];              /* 1024 samples/write is a safe choice */

  int outsz = sizeof (buf) / 2;

  static int sinebuf[48] = {

    0, 4276, 8480, 12539, 16383, 19947, 23169, 25995,
    28377, 30272, 31650, 32486, 32767, 32486, 31650, 30272,
    28377, 25995, 23169, 19947, 16383, 12539, 8480, 4276,
    0, -4276, -8480, -12539, -16383, -19947, -23169, -25995,
    -28377, -30272, -31650, -32486, -32767, -32486, -31650, -30272,
    -28377, -25995, -23169, -19947, -16383, -12539, -8480, -4276
  };

  for (i = 0; i < outsz; i++)
    {

The sinebuf[] table was computed for 48000 Hz, so we use simple
sample rate compensation. We must prevent the phase variable from
growing too large, because that would cause arithmetic overflows
after a certain time. These kinds of error possibilities must be
identified when writing audio programs that could be running for
hours or even months or years without interruption. When computing
(say) 192000 samples each second, the 32-bit integer range can
overflow quickly: a sample counter running at 192 kHz wraps after
about 6 hours.

      p = (phase * sample_rate) / 48000;

      phase = (phase + 1) % 4800;
      buf[i] = sinebuf[p % 48];
    }

Proper error checking must be done when using write(). It's also
important to report the error code returned by the system.

  if (write (fd_out, buf, sizeof (buf)) != sizeof (buf))
    {
      perror ("Audio write");
      exit (-1);
    }
}

The open_audio_device function opens the audio device and initializes
it for the required mode.

static int
open_audio_device (char *name, int mode)
{
  int tmp, fd;

  if ((fd = open (name, mode, 0)) == -1)
    {
      perror (name);
      exit (-1);
    }

Set up the device. Note that it's important to set the sample format,
number of channels, and sample rate exactly in this order; some
devices depend on it.

/* Set the sample format */

  tmp = AFMT_S16_NE;            /* Native 16 bits */
  if (ioctl (fd, SNDCTL_DSP_SETFMT, &tmp) == -1)
    {
      perror ("SNDCTL_DSP_SETFMT");
      exit (-1);
    }

  if (tmp != AFMT_S16_NE)
    {
      fprintf (stderr,
               "The device doesn't support the 16 bit sample format.\n");
      exit (-1);
    }

/* Set the number of channels */

  tmp = 1;
  if (ioctl (fd, SNDCTL_DSP_CHANNELS, &tmp) == -1)
    {
      perror ("SNDCTL_DSP_CHANNELS");
      exit (-1);
    }

  if (tmp != 1)
    {
      fprintf (stderr, "The device doesn't support mono mode.\n");
      exit (-1);
    }

/* Set the sample rate */

  sample_rate = 48000;
  if (ioctl (fd, SNDCTL_DSP_SPEED, &sample_rate) == -1)
    {
      perror ("SNDCTL_DSP_SPEED");
      exit (-1);
    }

No need for error checking here, because we will automatically adjust
the signal based on the actual sample rate. However, most applications
must check the value of sample_rate and compare it to the requested
rate. Small differences between the rates (10% or less) are normal,
and applications should usually tolerate them, but larger differences
cause annoying pitch problems (the "Mickey Mouse" effect).

  return fd;
}

int
main (int argc, char *argv[])
{

Use /dev/dsp as the default device, because the system administrator
may select the device using the ossctl program or some other method.

  char *name_out = "/dev/dsp";

It's recommended to provide some method for selecting a device other
than the default. Here we use a command line argument, but in some
cases an environment variable or a configuration file setting may
be better.

  if (argc > 1)
    name_out = argv[1];

It's mandatory to use O_WRONLY in programs that only do playback.
Other modes may cause increased resource (memory) usage in the driver,
and may also prevent other applications from using the same device
for recording at the same time.

  fd_out = open_audio_device (name_out, O_WRONLY);

  while (1)
    write_sinewave ();

  exit (0);
}

Copyright (C) 4Front Technologies, 2002-2004. Released under GPLv2/CDDL.

