Convolution reverb is rapidly gaining popularity as a powerful sonic tool. Though there is a lot of information describing the theory (see the excellent article “Convolution Number Nine” in the June 1999 issue of EM and “Trading Spaces” in the October 2004 issue, available online at www.emusician.com), not much has been written about how to record the impulse responses used by convolution reverbs. As a result, many people do not take advantage of convolution's full potential.
In this article, I will provide a condensed overview of how convolution works and focus on the steps needed to make recordings suitable for use with convolution reverbs. Convolution software offers huge potential for sonic manipulation, and as you will see, even those on a small budget can blaze new trails into sonic territory using convolution and their own recordings.
React and Respond
Convolution software uses an impulse response and a dry signal as input. An impulse response (IR) is what results when you feed an impulse to some system (see the sidebar “Impulse Response Glossary” for a list of terms commonly associated with convolution). It is the sonic signature of a microphone, loudspeaker, filter, concert hall, or anything else a sound might pass through. An impulse, or “spike,” is typically an extremely short transient that contains all frequencies, like white noise. In the digital realm, this is approximated as a one-sample-long click at full amplitude. The reverberation that an impulse produces in any acoustic environment is that environment's impulse response.
Convolution, in practice, is the process of multiplying two audio signals in the frequency domain. This involves sending two audio samples through fast Fourier transform (FFT) algorithms, multiplying their spectra, running the product through an inverse FFT (IFFT), and playing back the results. (See “Square One: Look Through Any Window” in the July 2004 issue for more on FFT.) Of course, the phrase “multiplying two audio signals in the frequency domain” doesn't convey the sound of convolution. The results of convolving sounds are completely dependent on the sources, more so than with most effects. The best way to get a sense of what convolution is like is to listen to an example (see Web Clip 1). Imagine convolving a drum loop with a short, breathy flute sound. The result sounds a bit like someone playing a staccato rhythm on a flute (see Web Clip 2).
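For readers who want to see that chain of steps written out, here is a minimal sketch in Python using NumPy. The function name fft_convolve and the toy signals are my own placeholders, not taken from any particular plug-in, and a real convolver would process audio in blocks rather than all at once.

```python
import numpy as np

def fft_convolve(dry, ir):
    """Convolve two 1-D signals by multiplying their spectra."""
    n = len(dry) + len(ir) - 1            # length of the full convolution
    n_fft = 1 << (n - 1).bit_length()     # zero-pad to a power of two so nothing wraps around
    spectrum = np.fft.rfft(dry, n_fft) * np.fft.rfft(ir, n_fft)
    return np.fft.irfft(spectrum, n_fft)[:n]

# Hypothetical toy data: a three-sample "dry" sound and an IR made of
# the direct sound plus one half-strength echo two samples later.
dry = np.array([1.0, 0.5, 0.25])
ir = np.array([1.0, 0.0, 0.5])
wet = fft_convolve(dry, ir)

# A one-sample, full-amplitude click as the IR leaves the dry sound unchanged.
click = np.array([1.0])
unchanged = fft_convolve(dry, click)
```

The last two lines illustrate why a one-sample click stands in for the ideal impulse: convolving with it returns the original signal, so whatever a system adds to the click is purely that system's own response.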
Why So Impulsive?
IR-based reverb is exciting because it sidesteps the central problem in designing reverb units. The true test of a reverb algorithm has always been whether it can convincingly produce the warm sound of a good, spacious concert hall. With IR-based reverb, the sound of the fabled “great concert hall” is captured, and then a plug-in convolves dry audio material with the hall's IR. That results in the dry material sounding as if it were recorded in the concert hall. Convolution plug-ins are commonly billed as being able to place your sounds in a “real acoustic environment.” It's a hefty claim, but the sound they can provide often measures up.
FIG. 1: Christian Knufinke's free SIR Impulse Response Processor runs as a VST effects plug-in and can use any WAV file as an impulse response. You can alter the IR in a variety of ways before it is applied to an audio file.
If you'd like to test the claim for yourself, there are many convolution reverbs on the market, including both software and hardware units (see the sidebar “Manufacturers” for a list of companies mentioned in this article). A perennial favorite of the budget-minded user is Christian Knufinke's free SIR Impulse Response Processor, available in Windows-only VST format (see Fig. 1). On the Mac, Tom Erbe's free Soundhack has long been touted for its convolution capabilities. There are also commercial convolution plug-ins with prices ranging from $12.95 to $800, and corresponding feature sets. It's up to you to decide what software is most appropriate, but for the remainder of this article, I'll assume that you're working with only common, basic features. As with many other types of audio gear, the sounds themselves matter most. Starting with good material will always give better results, regardless of equipment.
“Good” material, however, is not always easy to come by. If you need your guitar to sound like it was recorded in a World War II-era submarine ballast tank, you have limited options. If you're going to use convolution reverb to get the sound, you'll need an IR sample from a ballast tank. The Internet is a good place to start, but often you'll be left cold by low-quality IRs or a lack of anything appropriate. This leaves the would-be guitarist/submariner with the task of recording his or her own IR. Easier said than done, you say? Yes, but luckily not by much.
Truth Is More Practical Than Fiction
Impulse responses are surprisingly simple to record, but attention to detail is critical. There are two general methods of going about it. One uses an explosive sound (“spiking”) and the other a frequency-swept sine wave (“sweeping”). Although you can record the impulse response of nearly anything, the assumption is that the IR is reverberant in nature.
A few principles apply to both methods. First, the signal-to-noise ratio is more important than usual. Minimizing noise is always key, but ambient noise is particularly troublesome when recording IRs. The nature of convolution causes anything that shares frequencies with the input to be emphasized, including background noise. Air-conditioning, heating, road noise, machinery, and wind are things to listen for carefully. Even when using high-quality equipment, unwanted background noise might easily drown out the quiet tail of the reverberation.
When setting up for recording, take a few minutes to listen for any ambient noise in the space and take all feasible steps to reduce it. If the ventilation can be shut off, do it. Noisy appliances? Unplug them. A blanket draped over a loud air vent might work wonders on the noise floor. Similar steps could improve any recording, but especially an IR, since what you're recording is the space itself. You can't take a church hall and put it in a sound isolation booth, so do as much preparation as you can on-site.
The Sound of Silence
Keep in mind that the recording will be used to process your sound later, and any quirk introduced at the recording stage will resurface noticeably during convolution. You're not just taking the impulse response of the room; you're taking the impulse response of the room, the microphones, the cables, the portable recorder, and everything in between. Because every aspect of the recording affects the results, many IR recordists use equipment that is as flat, transparent, and neutral as possible. Though coloration in other types of recording is often desirable, it's common to avoid this when recording an IR. The goal is to capture the space as accurately and neutrally as possible. Not only the quality but also the arrangement of the equipment will come through, so take care with placement as well.
Documenting your setup is another practice that becomes highly important when recording impulse responses. You'll be using the same setup every time you use the IR, so it's a good idea to keep track of what that setup is. Many experienced IR recordists take numerous digital photos while working, especially of the relative positions of microphones, the room itself, and the sound source. Others keep a voice log on tape with the IRs themselves. When it comes time to organize things, documentation is a bit of extra work that you'll be glad you did.
It's also a good idea to record some reference material in the site you're working in. That way, you can take the recorded reference material, convolve a dry version with the IR, and see how well it matches up. Then you can tweak the convolution parameters to get the most realistic effects from your IR.
Just Shoot Me
Whatever your equipment situation, you need an impulse to record. To capture a true acoustic impulse response, you must excite the room at every audible frequency and record the results. Unfortunately, it's not possible to create a sound containing every frequency for use as an impulse, so the next best thing is to use some sort of explosion, which provides a suitable approximation. This has the added advantage of allowing you to capture a wide range of frequencies at the same time. The explosion can be a balloon popping or a starter pistol firing, both of which have broad frequency profiles and are short and loud.
Other advantages of this method are that it's very simple to execute and it requires minimal equipment and preparation. Starter pistols and balloons are cheap, easy to come by, and simple to operate. Also, once the recording is done and the sounds are individually edited, the IRs are ready to drop into a convolver and use.
The disadvantages here are numerous, however. The main technical problem is that any approximation of a “true” impulse (that is, one that spans all frequencies) is bound to be imperfect. Starter pistols and, to a lesser extent, balloons are decent approximations, but they do have their own frequency characteristics that deviate significantly from pure white noise. This in turn will color any material processed with an IR created this way. Low frequencies in particular are often lacking. There are also differences among pistols: a .22 caliber pistol is cheaper but produces much less bass than a .38 caliber. In my experience, many of the cheapest starter pistols are unreliable and can jam or break during use, so remember that you get what you pay for.
Aside from their frequency coloration, balloons are not as suitable for larger spaces because they're simply not loud enough to fully excite the room. When they can be used, problems with consistency arise. A balloon bursting can't be repeated exactly. Even if you could inflate each to the same pressure, significant variations in the balloons themselves would introduce inconsistencies. For example, the amplitude and frequency profiles will vary noticeably. Similar inconsistencies also affect starter pistol blanks.
When using a starter pistol, several nonacoustical problems arise. For instance, there are bans on gun replicas in many areas. If starter pistols are illegal, contact the police and try to get dispensation to use the gun, or switch to another method. Even if starter pistols aren't verboten, notifying the police about your plans beforehand will prevent their having to respond to a neighbor's report of gunfire. If starter pistols are permitted, the sound and sight of a gun may still alarm bystanders if you happen to be recording outdoors. Permission is obviously needed anytime you're shooting a starter pistol indoors, and this may be hard to come by at churches, opera houses, or auditoriums. Also, even though starter pistols can't fire bullets, any explosive device carries some physical risk. Eye protection is always recommended. All things considered, the supposed ease and convenience of using a starter pistol seems to evaporate.
However, despite the problems with the explosive methods, using balloons or starter pistols is a lot cheaper and easier than using a sine-sweep rig, and with care and diligence they can give decent results. First, consider where you'll be recording. The location will determine whether you use a gun or a balloon, whether you use dynamic or condenser mics, and where you place the mics.
From here on, I'll assume that the recording will be done in stereo. (For a good overview of stereo recording, see “Double Your Pleasure” in the June 2000 issue.) It is possible to create mono IR recordings, but monophonic reverb is significantly less realistic than stereo. Once preliminary preparations are squared away, the first choice to make is what microphone to use. (This will apply when using the sweep method as well.)
FIG. 2: A loud impulse, such as a gunshot, can overload a condenser mic and clip, as shown here.
Generally, condenser mics are more appropriate for capturing IRs. Their higher sensitivities are particularly suited to picking up the details in a reverb tail. Although most mic techniques in IR recording apply to both spike and sweep methods, condenser mics are easily overloaded by the high SPL of a gun going off nearby (see Fig. 2 and Web Clip 3). This introduces distortion at the microphone stage, regardless of gain settings. If that occurs, you can move farther away, or try switching to dynamic mics, which can better handle extreme transients. As a rule of thumb, make sure the gun is at least 50 feet away when using condensers.
There are other special considerations when using a gun. While setting levels, it's important to remember that the gun will produce a very loud, fast transient. The peak should have quite a bit of headroom to avoid clipping. But be careful: due to the extremely fast decay of the gunshot, level meters might not give an accurate reading of the peak level. If that is the case, you'll need to set the gain even lower than the meters indicate. The best way to make sure you have enough headroom is to listen closely to a test recording. In smaller spaces, you may need to use dynamic mics or switch methods if clipping can't be avoided.
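Because meters can miss such a fast peak, it can help to inspect the samples of a test recording directly. The sketch below is one way to do that in Python with NumPy. It assumes floating-point samples with full scale at plus or minus 1.0, and the function name peak_report, the 0.999 clip threshold, and the synthetic test take are all my own placeholders.

```python
import numpy as np

def peak_report(samples, clip_level=0.999):
    """Return (peak level in dBFS, number of samples at or above clip_level).

    Assumes floating-point samples scaled so that full scale is +/-1.0.
    """
    magnitude = np.abs(np.asarray(samples, dtype=float))
    peak = magnitude.max()
    peak_dbfs = 20.0 * np.log10(peak) if peak > 0 else float("-inf")
    clipped = int(np.count_nonzero(magnitude >= clip_level))
    return peak_dbfs, clipped

# Hypothetical test take: a decaying burst recorded too hot and flattened at full scale.
t = np.arange(2000)
take = np.clip(1.5 * np.exp(-t / 200.0) * np.cos(0.3 * t), -1.0, 1.0)
peak_dbfs, clipped = peak_report(take)
print(f"peak {peak_dbfs:.1f} dBFS, {clipped} samples at full scale")
```

A peak sitting at 0 dBFS with a run of full-scale samples suggests the converter clipped. Note that this check cannot reveal distortion that happened earlier, at the microphone capsule itself, so listening to the test recording is still necessary.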
Once you've gotten past the pitfalls of distortion, decide what effect you want from the IR. Microphone placement will affect the output when the IR is used, as if the processed material were recorded with that setup. Unless you're looking for a special effect, the mic technique you use will probably be similar to standard distant stereo-recording techniques. With explosive techniques, it is nearly impossible to record much direct signal because of the extreme SPL. However, the goal is to capture reverb, not direct signal, which can be added later at the convolution stage.
FIG. 3: Sound travels in a random path within a space, reflecting off walls on its way to the mics. As a result, the sound arrives at slightly different times at each mic. A near-coincident configuration is best suited to capturing this effect and helps ensure that the recorded sound has a sense of space and directionality.
Start by thinking about how you might mic an instrument in the room. A practical way to find good spots to set up might look a little silly but is worthwhile. Slowly move around the room, loudly clapping your hands each time you change position. Listen carefully to the reflected sound after each clap. This will help in finding a “sweet spot” that has the sound you're looking for. To get a more accurate picture of what the mics will pick up when you're spiking the room from a distance, have a friend stand still and clap while you move around. When using the sweep method, you can play dry recordings over a loudspeaker to get similar information.
When setting up microphones in a smaller room, there aren't many options. Depending on the available space, you might be limited to coincident or near-coincident pairs. Coincident pairs are great in small spaces, but unless you intend to mix to mono, another technique may give more realistic results. Because stereo width is an important perceptual element of reverb realism, XY setups aren't ideal.
FIG. 4: The ORTF (Office de Radiodiffusion Télévision Française) setup gives a reasonably wide stereo image, is fairly mono compatible, and generally sounds good over loudspeakers.
Near-coincident pairs are often better suited than coincident for recording reverb because they allow reproduction of some stereo time delay. One reason for this is that differences in early reflection time help establish perceptual space and directionality (see Fig. 3). There are standardized near-coincident configurations that have proven both popular and effective over the years, notably the ORTF and NOS standards (see Figs. 4 and 5).
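To put rough numbers on that time delay, the sketch below estimates the largest arrival-time difference between the two capsules of each configuration. The capsule spacings (17 cm for ORTF, 30 cm for NOS) are the commonly quoted figures rather than values given in this article, the speed of sound is taken as 343 meters per second, and the calculation assumes the sound arrives from far enough away to be treated as a plane wave.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 degrees C (assumed)

# Commonly quoted capsule spacings in meters; not taken from the article.
SPACING_M = {"ORTF": 0.17, "NOS": 0.30}

def arrival_difference_ms(spacing_m, angle_deg):
    """Far-field arrival-time difference between two capsules, in milliseconds.

    angle_deg is the direction of the incoming sound, measured from straight ahead.
    """
    path_difference = spacing_m * math.sin(math.radians(angle_deg))
    return 1000.0 * path_difference / SPEED_OF_SOUND

for name, spacing in SPACING_M.items():
    delay = arrival_difference_ms(spacing, 90.0)  # a reflection arriving from the side
    print(f"{name}: up to {delay:.2f} ms between capsules")
```

Under these assumptions the difference tops out at roughly 0.5 ms for ORTF and 0.87 ms for NOS. A truly coincident pair, with zero spacing, produces no time difference at all.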
A variation involves turning the mics away from the source, normally toward a rear wall. Using a rear-facing setup is a matter of taste and will increase the amount of indirect sound in the recording. Because IR recording is mainly about capturing indirect sound, this technique is fairly popular. Additionally, it may help when using condensers, since the blast of sound from a pistol will be attenuated somewhat before reaching the capsule. If you have the time, try multiple setups. Place the mics closer to the walls, farther apart, and angle them up or down; always remember that if it sounds good, it is good. Any clean IR recording has the potential to become an interesting effect later on, even if it wasn't recorded using standard techniques.
FIG. 5: The NOS (Nederlandse Omroep Stichting) standard is qualitatively similar to ORTF. When recording impulse responses, choosing the right setup will depend on the room's characteristics and how you want the IR to sound.
The Big Show
Capturing IRs in a larger room like a concert hall or church isn't fundamentally different from the process previously described. There is more flexibility, however, in mic choice and placement. Generally, the mics are placed in a spaced configuration. Rather than centimeters apart, you may separate the microphones by several meters. Again, you'll want to consider the standard stereo setups as starting points. To decide how to place the mics, consider how you'd like the ambience track to sound if you were recording an instrument in place of the gun or loudspeaker. Focused or very wide stereo? Direct or indirect sound? How distant should the sound be?
With a sound in mind, you can go about creating it. Most IR recording is done with either omni or cardioid pickup patterns. Cardioids are great for creating a sound with more pronounced difference across the stereo field. Even in closely spaced setups, the overlap between the pickup patterns is small. This makes cardioids better for setups in which the mics are angled to either side of the source. As a general rule, a wider angle results in a wider stereo field and wetter sound.
Because they pick up sound in all directions and generally have a more neutral frequency response, omnis are particularly suited for IR recording, especially in larger spaces. It's often necessary to separate omnis more than other mics. A setup with the mics on either side of a stage would be appropriate.
FIG. 6: When using two omni mics, more sound from the area between and behind the mics is captured. Cardioids, aside from rejecting sound behind and to the sides, also tend to color the off-axis sound that they pick up, giving the neutrality advantage to omnis as well.
An advantage to using omni mics is that they won't leave as much of a hole in the middle of the stereo field (see Fig. 6). Omni mics are great for capturing ambience and diffuse sound, especially with a wide spacing. With closer spacing they still provide an open sound but may lack stereo separation.
In a large venue, take time to experiment. Even if you've just recorded a great-sounding IR, others are worth capturing: there can be many ways to place mics in a large space and get varied but equally pleasant recordings. Some recordists take this idea very seriously, capturing dozens of spots, from the front row to the upper balconies.
Sweep Me off My Feet
The sweep method of recording IRs is somewhat more complex, but it usually yields better results and, luckily, carries a very low risk of getting you arrested. Rather than generating every frequency at once, you play a long sine tone that sweeps (normally) from 20 Hz to 20 kHz in the space and record it (see Web Clip 4). The recording and the tone are then deconvolved, which is essentially the reverse process of convolution. Software utilities for deconvolution, such as Voxengo's free Deconvolver (Win), can also generate the sine sweeps for you. The end result is an IR that is cleaner and more balanced than one created using the explosive method.
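For the curious, here is a rough sketch of the idea in Python with NumPy. It shows one generic way to build a sweep and to deconvolve by dividing spectra. I am not claiming this is how Voxengo Deconvolver or any other utility works internally. The function names, the small eps safety term, and the six-sample "toy room" that stands in for a real space are all my own assumptions.

```python
import numpy as np

def log_sweep(f_start, f_end, seconds, sample_rate):
    """Sine tone whose frequency rises exponentially from f_start to f_end (Hz)."""
    t = np.arange(int(seconds * sample_rate)) / sample_rate
    growth = np.log(f_end / f_start)
    phase = 2.0 * np.pi * f_start * seconds / growth * (np.exp(t * growth / seconds) - 1.0)
    return np.sin(phase)

def deconvolve(recorded, sweep, ir_length, eps=1e-8):
    """Estimate an IR by dividing the recording's spectrum by the sweep's spectrum."""
    n = max(len(recorded), len(sweep))
    n_fft = 1 << (n - 1).bit_length()      # power of two that holds the whole recording
    rec_f = np.fft.rfft(recorded, n_fft)
    swp_f = np.fft.rfft(sweep, n_fft)
    # eps (an assumed value) keeps bins where the sweep has almost no energy from blowing up.
    ir_f = rec_f * np.conj(swp_f) / (np.abs(swp_f) ** 2 + eps)
    return np.fft.irfft(ir_f, n_fft)[:ir_length]

sample_rate = 44100
sweep = log_sweep(20.0, 20000.0, seconds=2.0, sample_rate=sample_rate)

# Hypothetical stand-in for a real space: direct sound plus two short echoes.
toy_room = np.array([1.0, 0.0, 0.0, 0.5, 0.0, -0.25])
recorded = np.convolve(sweep, toy_room)    # what the mics would capture, minus any noise

estimated_ir = deconvolve(recorded, sweep, ir_length=len(toy_room))
```

In this noise-free simulation the estimate lands very close to the toy room. A real recording adds noise, speaker and mic coloration, and timing offsets, which is why dedicated software and careful editing are still worth the trouble.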
There are issues with the sine-sweep method, aside from the additional expense, but careful preparation can provide solutions. For example, many speakers are not particularly flat throughout their frequency range. If you know the frequency response of the speaker you are using to play the sweep tone, you can use an EQ to flatten it. Frequency plots are often available online from the manufacturer. Although they might not give you the exact plot for your specific unit, they should serve as a rough guide. The same logic applies to microphones. Careful equalization may greatly improve your IR recordings.
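As a starting point for that equalization, you can read a few levels off the published plot and invert them. The sketch below does only that arithmetic. The frequency-response numbers are invented for illustration, the function name is hypothetical, and the 6 dB ceiling on boosts is my own precaution against overdriving a speaker where it is weakest.

```python
def correction_gains(response_db, max_boost_db=6.0):
    """Turn {frequency in Hz: level in dB relative to nominal} into flattening EQ gains.

    Each gain is the inverse of the measured deviation, with boosts capped at max_boost_db.
    """
    return {freq: min(-level, max_boost_db) for freq, level in response_db.items()}

# Invented readings from a manufacturer's plot: weak bass, flat mids, slightly bright top.
published_plot = {50: -10.0, 1000: 0.0, 8000: 3.0}
eq_gains = correction_gains(published_plot)
print(eq_gains)
```

With these made-up readings, the 10 dB dip at 50 Hz gets only the capped 6 dB of boost, and the 3 dB bump at 8 kHz gets a 3 dB cut. Treat the result as a rough guide for setting an EQ by hand, not as a measured correction.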
Another problem is directionality. Many instruments project sound in a 3-D pattern, and a starter pistol or balloon projects sound omnidirectionally. A loudspeaker, on the other hand, mostly projects sound forward. Generally, the speaker (normally only one speaker is used at a time) should face forward, out into the hall, from center stage. Some convolution software, such as Audio Ease Altiverb (Mac), supports a “true-stereo” format, which involves making two stereo recordings per IR, with the speaker first placed to the left and then to the right, symmetrically. In this technique both speaker positions are at the edges of the virtual soundstage. Vertically, the microphones should be placed at the height of the tweeter. The speaker should also be placed above the stage or floor in order to avoid any odd resonances, and away from any walls. The same goes for mics.
Once these two problems are dealt with, recording a sweep is very much like the techniques described earlier. Aside from microphones, though, there are a few technical considerations. Make sure you record according to the specifications of the deconvolution software you intend to use. It may be tricky to edit the recordings, so read the manual beforehand. When deconvolving, the result will be cleaner and less noisy when the recorded sweep is longer. A totally silent room might need a sweep of less than 10 seconds, whereas an IR recorded with road noise in the background might be very noisy unless the sweep is hundreds of seconds or longer. The signal-to-noise ratio tends to improve with sweep length: a longer sweep puts more signal energy into the recording, while the background noise level stays the same.
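As a rough guide to how much a longer sweep buys you, the sketch below uses a common idealization: if the background noise is steady and unrelated to the sweep, the signal-to-noise ratio of the deconvolved IR grows in proportion to sweep length. That assumption is mine, and real rooms with intermittent noise will behave less neatly.

```python
import math

def snr_gain_db(old_seconds, new_seconds):
    """Approximate SNR change from lengthening a sweep, assuming steady, uncorrelated noise."""
    return 10.0 * math.log10(new_seconds / old_seconds)

for new_length in (20, 100, 400):
    gain = snr_gain_db(10, new_length)
    print(f"10 s -> {new_length} s: about {gain:.1f} dB better")
```

Under that assumption, each doubling of sweep length is worth about 3 dB, and going from 10 seconds to 100 seconds is worth about 10 dB. That is why a noisy location can call for a sweep many times longer than a quiet one.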
In addition, silence needs to be recorded after the sweep to accommodate the reverb's full decay time. Most software programs require a period of silence after the tone that is equal to several times the decay time.
Also be aware that deconvolution has the potential to cause serious distortion. If the dry and wet sweeps are not trimmed and aligned properly during the deconvolution process, the resulting impulse may be smeared in time or frequency or may have an incorrect length. After all is said and done, however, the sweep method should provide an eminently usable, high-quality IR.
Think Inside the Box
IR recording is not limited to churches and concert halls. Real is nice, but what about great fake reverb? Many musicians have applied IR techniques to studio gear to capture classic sounds from pricey hardware units. This technique is the simplest yet. Feed a sine sweep directly to the device you're capturing, record the output, and deconvolve it. Everything from a pristine Lexicon unit to a toy tape recorder is fair game (see Web Clip 5).
This article focuses on stereo, but a lot of IR recording is done for multichannel systems. Recording a multichannel IR is conceptually similar to recording in stereo but obviously requires more equipment and expertise.
Take care once your IRs are recorded. Always keep samples at the highest bit depth and sampling rate possible and avoid unnecessary sampling-rate conversions. It is crucial to preserve a low noise floor when working with an IR. If you can't record directly to a computer, be extra careful to minimize noise during the transfer process.
Before you start running every track through a convolver, consider a few of the limitations you may face. All software convolvers introduce some latency in the signal chain, due to the nature of the algorithm. That is usually on the order of a few thousand samples, which works out to tens of milliseconds at common sampling rates. The host application can compensate for this, or you might try a plug-in like Sampleslide (Win) from AnalogX. Also, most convolution plug-ins are static, meaning that the convolution doesn't change based on input. A WAV file of an IR won't change over time, so neither will the effect it produces. This means, for example, a chorus or flange effect would not be properly reproduced by an impulse response, because it depends on constantly changing delay effects. Dynamic convolution does exist, but it lies outside the scope of this article.
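If your convolver reports its latency in samples, converting that figure to time is simple arithmetic. The block sizes and sampling rates below are only example values, not the specifications of any particular plug-in.

```python
def latency_ms(latency_samples, sample_rate):
    """Convert a latency given in samples to milliseconds."""
    return 1000.0 * latency_samples / sample_rate

# Example values only: typical power-of-two block sizes at two common sampling rates.
for samples in (1024, 2048, 4096):
    for rate in (44100, 96000):
        print(f"{samples} samples at {rate} Hz = {latency_ms(samples, rate):.1f} ms")
```

At 44.1 kHz, for instance, a 2,048-sample delay comes to about 46 ms, which is easily audible if the track is not time-compensated against the rest of the mix.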
Don't Keep It Real
Despite the caveats given in this article, recording an IR is often simpler than recording music. If properly recorded, any IR could potentially add that certain something to an otherwise bland track. Even if you don't have recording gear, remember that IRs are samples that can be edited and synthesized like any other. You can apply effects, including crossfading, reversing, and time-stretching, to IRs you might find online, often with excellent results. Experiment and you may discover something wild and fresh.
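Two of those edits, reversing and crossfading, take only a few lines if you treat the IR as an array of samples. The sketch below uses Python with NumPy. The function names are my own, and the two synthetic IRs are placeholders for files you would load from disk.

```python
import numpy as np

def reverse_ir(ir):
    """Flip the IR end to end, so the reverb builds up instead of dying away."""
    return np.asarray(ir, dtype=float)[::-1].copy()

def crossfade(ir_a, ir_b):
    """Blend from ir_a at the start to ir_b at the end with a linear ramp."""
    n = max(len(ir_a), len(ir_b))
    a = np.pad(np.asarray(ir_a, dtype=float), (0, n - len(ir_a)))  # zero-pad the shorter IR
    b = np.pad(np.asarray(ir_b, dtype=float), (0, n - len(ir_b)))
    ramp = np.linspace(0.0, 1.0, n)
    return (1.0 - ramp) * a + ramp * b

# Placeholder IRs: two bursts of decaying noise with different decay times.
rng = np.random.default_rng(0)
short_room = rng.standard_normal(4000) * np.exp(-np.arange(4000) / 500.0)
long_hall = rng.standard_normal(8000) * np.exp(-np.arange(8000) / 2000.0)

backwards = reverse_ir(long_hall)
hybrid = crossfade(short_room, long_hall)
```

The hybrid starts with the small room's early character and hands off to the hall's longer tail. Whether that sounds convincing or merely strange depends entirely on the source IRs, which is half the fun.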
FIG. 7: Voxengo Impulse Modeler allows you to simulate the sound of an imaginary room. The results can range from boring to surprisingly full of character, and they are fairly realistic.
Also consider looking into software that simulates acoustics to create an IR of an arbitrary virtual space, such as Voxengo Impulse Modeler (Win) or Spectra Ramsete (Win) (see Fig. 7 and Web Clips 6 and 7). This type of program lets you create spaces that otherwise might never exist and would be a good starting point for the landlocked guitarist who needs his or her ballast tank.
Whatever you do with your IRs and however you record them, they will be a valuable addition to your effects library. When you set out to record, though, make sure to plan properly and proceed carefully. Finding a location, getting permission, and making test recordings can be tedious, but a project undertaken without careful preparation is bound to be flawed. A good set of IRs is worth its weight in gold, and anyone can strike it rich.
Alex Kemmler attends Northwestern University. He can be reached at firstname.lastname@example.org. Special thanks to Steph Wai for all her help and encouragement.
IMPULSE RESPONSE GLOSSARY
Here are some common terms that you will encounter in the world of IR:
convolution: a mathematical process that when applied to audio involves a simple multiplication of two signals in the frequency domain.
convolution reverb: a type of plug-in that uses convolution and impulse responses of various spaces to simulate the reverb those spaces produce.
deconvolution: when the impulse response is known, a signal convolved with that impulse response can be extracted by deconvolution. In practice, this is used to extract impulse responses from sine-sweep recordings.
dynamic convolution: a technique that attempts to simulate nonlinear systems by taking multiple impulse responses at different amplitudes to capture the nonlinearity.
impulse: also known as a Dirac delta function, an ideal impulse is a signal that has a duration approaching zero and an amplitude approaching infinity.
impulse response: the output of a system when the input is an impulse.
spike: another common name for an impulse.