Putting sounds together one little piece at a time.

Most desktop musicians are familiar with the term "additive synthesis," but how many know exactly what it is and how it can be used to create sound? Like other methods of synthesis, such as subtractive, FM, or phase distortion, additive is a common technique that has great potential for synthesizing a wide range of sounds. In this article, I'll talk about the basic principles behind this technique and the different ways it has been implemented over the years.

To put it simply, additive synthesis is a sound generation technique that combines simple waveforms at various frequencies and amplitudes to create more complex, "composite" waveforms. Today's electronic instruments handle this task with ease, but additive synthesis actually dates back to the very beginnings of electrical theory.

Fig. 1: The simplest component of sound is the sine wave. This display shows a sine wave with a frequency of 440 Hz and an amplitude of 0 dB.

## The Fourier Theorem

During the early nineteenth century, a French mathematician named Jean Baptiste Joseph Fourier theorized that any complex sound can be broken down into a series of simple sounds. The inverse of this is also true: any complex sound can be created by using the basic building blocks of sound. These building blocks are known as sine waves, and just as the atom is the smallest known unit of any element, the sine wave is the simplest known unit of sound. A sine wave is a pure, continuous tone with only one specific frequency and amplitude, such as 440 Hz at 0 dB (see Fig. 1). It is produced by any object that vibrates in a very simple pattern of back and forth motion, called simple harmonic motion. (It can also be produced electronically by nearly any synthesizer available today.)

Because the sine wave is so simple, however, it is also a very boring sound to listen to. You know that tone heard over your television at the end of the broadcast day? That's a sine wave. As Fourier's theorem states, the combination of multiple sine waves can create complex sounds, and that is the basis for additive synthesis.
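
Fourier's idea can be demonstrated in a few lines of code. Here is a minimal sketch in Python with NumPy; the sample rate, the `sine` helper, and the choice of partials are my own illustrative assumptions, not taken from any product mentioned in this article:

```python
import numpy as np

SR = 44100  # sample rate in Hz (an assumed, CD-quality value)

def sine(freq, amp, dur=1.0):
    """Generate one partial: a pure sine wave at a given frequency and amplitude."""
    t = np.arange(int(SR * dur)) / SR
    return amp * np.sin(2 * np.pi * freq * t)

# Additive synthesis in its simplest form: sum a few partials
# (frequency in Hz, relative amplitude) into one composite waveform.
partials = [(440, 1.0), (880, 0.5), (1320, 0.33)]
composite = sum(sine(f, a) for f, a in partials)
```

Play back `composite` through any audio library and you hear a single tone, richer than a bare sine wave, whose character depends entirely on which partials were mixed in and how loudly.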

The individual sine wave components that make up a complex sound are called partials, and each partial has its own unique frequency and amplitude. The range of these partials makes up the spectrum of the final sound. The first partial, called the fundamental frequency, is the most important because it determines the overall pitch and loudness of the sound. Each additional partial also influences the pitch and loudness but typically to a lesser degree. More importantly, these additional partials determine the timbre, or tone color, of the sound (see the sidebar, "The Timbre Story").

## The Very First Synth

Unfortunately, during Fourier's day there were no practical means of testing out his theorem. It wasn't until the 1860s that Hermann von Helmholtz proved Fourier's theorem by making the first significant musical use of electricity. Helmholtz built an apparatus that consisted of a number of electrically driven tuning forks, each tuned to a specific partial. When played together, they produced a complex sound. This invention was the first "synthesizer," but it was by no means a musical instrument because it could only produce one specific sound at one specific pitch. (You can read more about the history of electric and electronic instruments in Joel Chadabe's excellent book Electric Sound, published by Prentice Hall, 1996.)

It wasn't long after the Helmholtz experiment, however, that American inventor Thaddeus Cahill created the world's first electrical musical instrument. Cahill's Telharmonium was also an additive synthesis-based instrument that combined simple sine waves to produce more complex sounds. The Telharmonium was a polyphonic instrument with a touch-sensitive keyboard that produced sine waves by using a series of rapidly spinning alternators. The alternators were driven by banks of electric motors that rotated at fixed speeds and controlled the frequency of the alternators and thus the pitch of the sound. The instrument predated the invention of the amplifier, however, and was a monstrosity that weighed over 200 tons and needed six railroad cars to transport. Because of this and other technological problems, the Telharmonium wasn't an enduring success.

## B-4 the B-3

When Laurens Hammond took the rotating-disk system of the Telharmonium and combined it with more modern electrical technology, the first commercially successful electric musical instrument became available. The Hammond organ was patented in 1934 and first reached the public in 1935. It sported an electric motor that rotated a shaft carrying 91 metal disks, each cut with a specific tooth pattern, that generated the note frequencies. Its additive synthesis-related features came in the form of sliding controls called drawbars, each of which corresponded to a specific partial, and which together could be used to produce over 300,000 different sounds.

But the Hammond organ (along with its predecessors) still lacked two very important features for the creation of truly complex sounds. First, the organ produced sounds with nonvarying amplitudes: the volume of each partial was preset, and every note then sounded at that fixed level. The volume settings could be changed, but not while a note was sounding.

More serious, however, was the fact that the range of sounds the Hammond could produce was, in a sense, limited. Let's look at a little more theory to understand why.

## Part and Partial

The partials that make up a sound's spectrum come in two forms: harmonic and inharmonic. Harmonic partials are defined mathematically as whole-number multiples of the fundamental frequency. For example, doubling a fundamental frequency of 440 Hz gives a harmonic partial at 880 Hz, and tripling it produces another at 1,320 Hz; these are known as the second and third partials, respectively. Inharmonic partials, on the other hand, are sine waves whose frequencies are not whole-number multiples of the fundamental. For example, partials at 500 Hz and 900 Hz would be inharmonic relative to a fundamental of 440 Hz. The Hammond organ was limited to harmonic partials, which is why it produced such pure and smooth tones.
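
The whole-number test for harmonicity is easy to express in code. This is a hypothetical helper of my own, not part of any instrument discussed here:

```python
def is_harmonic(freq, fundamental, tol=1e-6):
    """A partial is harmonic if its frequency is a whole-number
    multiple of the fundamental (within a small tolerance)."""
    ratio = freq / fundamental
    return abs(ratio - round(ratio)) < tol

fundamental = 440.0
print(is_harmonic(880.0, fundamental))   # 2nd partial -> True
print(is_harmonic(1320.0, fundamental))  # 3rd partial -> True
print(is_harmonic(500.0, fundamental))   # inharmonic  -> False
```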

To create truly complex sounds, a synthesizer should be able to produce both harmonic and inharmonic partials, and it should be able to combine several dozen to several hundred sine waves. Each of those waves requires its own oscillator, set to a unique frequency and amplitude. Because the loudness of most complex sounds varies over time, the amplitude of every sine wave must also be dynamically controlled by an envelope generator. Each envelope requires at least attack, decay, sustain, and release segments, so even with only 30 sine waves to manage, well over 100 parameters must be generated and controlled in real time to produce a single note. This task was beyond the reach of any instrument in the 1930s; it had to wait for the power of the computer.
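
To get a feel for where all those parameters come from, here is a rough sketch of per-partial envelope control in Python with NumPy. The `adsr` helper and the 30-partial patch are illustrative assumptions on my part; a real additive instrument would use far more refined envelopes:

```python
import numpy as np

SR = 44100  # assumed sample rate in Hz

def adsr(attack, decay, sustain, release, dur):
    """Piecewise-linear ADSR amplitude envelope (times in seconds,
    sustain as a 0-1 level), sampled at SR."""
    n = int(SR * dur)
    a, d, r = int(SR * attack), int(SR * decay), int(SR * release)
    s = max(n - a - d - r, 0)  # remaining samples hold the sustain level
    env = np.concatenate([
        np.linspace(0, 1, a, endpoint=False),        # attack
        np.linspace(1, sustain, d, endpoint=False),  # decay
        np.full(s, sustain),                         # sustain
        np.linspace(sustain, 0, r),                  # release
    ])
    return env[:n]

def additive_note(partials, dur=1.0):
    """Each partial is (freq, amp, (A, D, S, R)): every sine wave
    gets its own oscillator and its own envelope generator."""
    t = np.arange(int(SR * dur)) / SR
    out = np.zeros_like(t)
    for freq, amp, env_params in partials:
        out += amp * adsr(*env_params, dur) * np.sin(2 * np.pi * freq * t)
    return out

# 30 partials x (1 freq + 1 amp + 4 envelope segments) = 180 parameters
# for this one note -- and every parameter here is a crude constant.
partials = [(440 * k, 1 / k, (0.01, 0.1, 0.6, 0.2)) for k in range(1, 31)]
note = additive_note(partials)
```

Even this toy patch shows why the bookkeeping explodes: give each envelope a handful of extra breakpoints, or vary the parameters per note, and the count climbs into the thousands.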

## The Computer as Synth

During the 1950s and '60s, digital mainframe computers found in research institutions were first used to generate complex sounds by manipulating specific partials. Researcher Max Matthews at Bell Labs is credited with developing the first sound programming language. Matthews called his program Music I, and though the first version was a simple, 1-voice generation utility, it quickly evolved into an application that provided an unlimited number of voices. The program didn't work in real time, however. Sound parameters had to be fed into the computer, which then took a certain amount of time for processing, and the results had to be converted into an analog signal before being played.

Then, in the late 1960s, David Luce built a machine that would analyze a set number of partials for any complex sound and display their individual envelopes as plots on an oscilloscope screen in real time. These plots were photographed, and Luce would then manipulate the partials of the sound by redrawing the envelopes and having the machine scan his drawings (using an optical scanner). The machine would then play back the altered sound in real time. This was one of the first demonstrations of what is known as resynthesis.

## Resynthesis

Today, pure additive synthesis is still very scarce even with all of the computing power available on the desktop. That's because the sheer number of parameters that must be set to produce even a fairly complex sound is overwhelming. And to accurately mimic acoustic instruments, you need to set the partials for every note, because the partial structure changes with every fundamental frequency and, to a lesser extent, with loudness.

Fig. 2: To find the different partials that are present in a sound, the sound must be analyzed with a mathematical process known as a Fast Fourier Transform (FFT). Here is a single spoken word as shown in the analysis screen of Steinberg's Wavelab.

Instead of building sounds from scratch, however, you can try a more practical form of additive synthesis, which is resynthesis. With resynthesis, a computer uses the principles of the Fourier theorem and analyzes a complex waveform to find all of its basic partial components. In particular, it tracks the frequency and amplitude envelopes for as many partials as you want. The computer first samples the sound and then puts it through a mathematical function known as a Fast Fourier Transform (FFT). This makes it easy to take apart any sound, like a spoken word, and find the structure of its spectrum (see Fig. 2). Once you know the sound's frequency and amplitude parameters, you can manipulate these building blocks to create a slightly different or perhaps a radically altered new sound.
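
The analysis step can be sketched with NumPy's FFT routines. In this hedged example we build a signal with known partials and then recover them from the spectrum; a real resynthesizer would also track how the spectrum evolves frame by frame, which this one-shot analysis does not:

```python
import numpy as np

SR = 44100  # assumed sample rate in Hz

# Build a test signal with three known partials over one second.
t = np.arange(SR) / SR
signal = (1.00 * np.sin(2 * np.pi * 440 * t)
          + 0.50 * np.sin(2 * np.pi * 880 * t)
          + 0.25 * np.sin(2 * np.pi * 1320 * t))

# FFT analysis: convert the waveform into an amplitude spectrum.
spectrum = np.abs(np.fft.rfft(signal)) / (len(signal) / 2)
freqs = np.fft.rfftfreq(len(signal), d=1 / SR)

# Bins that rise above a small threshold mark the partials present.
peaks = freqs[spectrum > 0.1]
print(peaks)  # -> [ 440.  880. 1320.]
```

Once the partials are identified this way, resynthesis is just the additive step run in reverse: alter the recovered frequencies or amplitudes, then sum new sine waves from the modified list.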

Even with today's available computing power, however, real-time resynthesizers are few and far between. The reason is that both analysis and resynthesis processes are very complex, and creating the software and hardware to handle them in real time can be expensive. One high-end system that handles real-time resynthesis is the Kyma system from Symbolic Sound. Kyma is a modular, software-based synthesis and processing workstation accelerated by DSP hardware. (For a review, see the January 1998 issue of EM.) Sound designers use a graphic signal flow editor on the screen of either a Macintosh or PC to specify how to analyze, process, and resynthesize the sound. The signal-flow diagram is turned into a program for the multiple-DSP Capybara hardware, which connects to the host computer via PCI, NuBus, ISA, or PCMCIA card.

Because of the difficulties associated with generating the hundreds of parameters necessary for true additive synthesis, some synthesizers offer a modified approach to the technique and use more complex waveforms as the building blocks for sound production. And rather than letting you manipulate individual partials, they'll provide a limited number of partials whose parameters can be changed in groups.

One such synthesizer is the Kawai K5000, which uses a technique called Advanced Additive Synthesis. The basic building block of a K5000 sound is a bank of 64 partials called a wave set. The wave set can be either partials 1 to 64 or partials 65 to 128 of the naturally occurring harmonic series. You can adjust the amplitude envelope of each partial but not its frequency, which means you can't include inharmonic partials. In order to make up for this, the instrument provides PCM samples that can be combined with the wave sets to create more complex waveforms. This feature makes for a very powerful system.

Fig. 3: The Generate Tones feature in Syntrillium's Cool Edit Pro allows you to build waveforms by defining the fundamental frequency and up to five additional partials.

## On the Desktop

If you want to get your feet wet in additive synthesis on your home computer, you can use several software-only applications. Audio-editing programs such as Sonic Foundry's Sound Forge and Syntrillium's Cool Edit Pro on the PC provide the means for creating and combining simple sine waves in unlimited numbers. Using the Simple Synthesis function in Sound Forge, you can create a sine wave with the specific frequency and amplitude of your choice. Then you build complex sounds by continuing to mix new sine waves with the original until you hear something you like. Cool Edit Pro's Generate Tones feature provides even more flexibility: simply choose the frequencies for up to five partials (harmonic or inharmonic) above the fundamental and let the program build the waveform for you (see Fig. 3). You can even put simple amplitude envelopes on the partials to vary the sound as it evolves.

Fig. 4: The shareware program Adsyn32 creates complex sounds using pure additive synthesis with an unlimited number of partials.

If you're keen on trying out resynthesis, have a look at smsTools for the PC, developed by Xavier Serra and his research group at the Pompeu Fabra University of Barcelona. This program gives you extensive control over the analysis data in numerous ways before you resynthesize it. For example, you can modify just the frequency or amplitude envelopes, or combine the spectra of two different files. The results can be truly remarkable.

So now that you know a bit more about additive synthesis and its many facets, get out there and start building some cool sounds. Theoretically, it's possible to come up with any sound imaginable using pure additive synthesis. Who knows, you may invent something spectacular to replace the Windows start sound...please!

## Sidebar: The Timbre Story
In sounds that have a clear pitch, the partials above the fundamental have frequencies that are related to the fundamental's frequency by simple ratios. For example, the second partial is twice the frequency of the fundamental, the third partial is three times, the fourth is four times, and so on. Nonpitched sounds, such as percussion, typically contain inharmonic partials, whose frequencies are not whole number multiples of the fundamental. In most sounds, the partials' frequencies remain relatively stable, but the amplitude of each partial can change over time. Static waveforms, for example the square or sawtooth, contain partials whose amplitudes are fixed, which is why these tones have, for the most part, a lifeless quality.
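
The fixed-amplitude partials of a static waveform can be demonstrated directly: a sawtooth is approximated by summing all harmonics with amplitudes falling off as 1/n, and those amplitudes never change over time. A small NumPy sketch (sample rate and function name are my own assumptions):

```python
import numpy as np

SR = 44100  # assumed sample rate in Hz

def sawtooth_additive(freq, n_partials, dur=1.0):
    """Approximate a sawtooth by summing harmonics n = 1, 2, 3, ...
    with fixed 1/n amplitudes. Because the amplitudes never vary,
    the resulting tone is static and lifeless."""
    t = np.arange(int(SR * dur)) / SR
    return sum(np.sin(2 * np.pi * freq * n * t) / n
               for n in range(1, n_partials + 1))

saw = sawtooth_additive(110, 20)
```

A square wave can be built the same way using only the odd harmonics, again with 1/n amplitudes; either way, nothing in the spectrum moves, which is exactly what a natural sound never does.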

Just how much do the partials in a natural sound fluctuate? It's not unusual for the amplitudes of a natural sound to vary every 1/1000 of a second or so. Moreover, certain sounds, such as brass instruments, have a spectrum in which the upper partials enter after the lower ones and disappear sooner. Attempting to recreate the spectrum of a sound "from scratch" is not a trivial task and requires complex envelopes with hundreds or even thousands of breakpoints. Few devices can perform this task in real time.

Given the massive range of frequencies that can appear in a sound and the fact that these frequencies change in strength as the sound evolves, you can see why "sound quality," or timbre, is so difficult to define. Despite attempts by many researchers to classify and organize sound timbres, no widely accepted system of classification has yet appeared.