The Voices Have It

Voice-overs are all about conveying information — think of a documentary movie, an audio book, or a radio spot. When recording spoken-word performances, your job is to craft recordings that sound natural, accurate, and pleasant. Do that and you'll succeed.

Voice-over recording requires, at minimum, a recording space, a microphone, a mic preamp, and a recorder. As is always the case, the better each of these components, the better your finished recordings will sound — at least potentially. I say “potentially” because it's always possible to screw up, even when using the best of tools. The purpose of this article is to help you minimize screw-ups and maximize the quality of your voice-over recordings.


In general, the ideal room for recording voice-overs is small, quiet, and acoustically dead. Quiet means soundproof (that is, free from exterior sound coming through the walls, ceiling, or floor); acoustically dead means free of unwanted room sound (reflections, flutter echoes, boxiness, or what have you) that might adversely affect the recordings.

Many professional studios feature a separate voice-over isolation booth, which typically is treated heavily with absorptive foam rubber. A convenient and tidy solution for the home studio is to use one of the portable sound-isolation enclosures from WhisperRoom (see Fig. 1), which is what voice actor Harlan Hogan chose for his studio. In a home-recording situation, a walk-in closet filled with clothing can serve as a decent makeshift isolation booth.

If you're a one-room recordist, as I am, position the performer far from noisy gear (computer fans being the main culprit). If necessary, use a noise gate or a downward expander to suppress background sounds when the talent isn't speaking. Also, place additional sound-absorbing material around him or her. A simple, inexpensive trick is to hang heavy moving blankets from the ceiling to form a kind of tent around the performer (see Fig. 2). Obviously, this won't prevent the sound of airplanes flying overhead from destroying your recordings, but it will isolate the voice enough from unwanted room sounds to make the recordings sound better.
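If you apply the expander in software rather than with outboard gear, its basic behavior is easy to picture. Here is a minimal Python sketch of a downward expander's static gain curve. The function name, the -50 dB threshold, and the 2:1 ratio are illustrative assumptions, not recommended settings, and a real expander also smooths the gain with attack and release times.

```python
def expander_gain_db(level_db, threshold_db=-50.0, ratio=2.0):
    """Gain change (in dB) a downward expander applies at a given input level.

    Above the threshold the signal passes unchanged. Below it, every 1 dB
    the input falls under the threshold becomes `ratio` dB at the output.
    The default threshold and ratio are placeholders for illustration.
    """
    if level_db >= threshold_db:
        return 0.0
    return (level_db - threshold_db) * (ratio - 1.0)

# Speech at -20 dB is untouched; room noise at -60 dB is pushed down further.
print(expander_gain_db(-20.0))  # 0.0
print(expander_gain_db(-60.0))  # -10.0
```

A noise gate is the extreme case of the same curve: a very high ratio, so anything under the threshold is effectively muted.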

When recording in a one-room studio (or in a control room), monitor through headphones. Don't rely on them entirely, though. First, record a quick run-through so you can set appropriate levels. Then mute the mic and play back the track through your studio monitors to make sure you're getting the best sound. If necessary, make adjustments and record another test. When you're happy with the sound, continue recording while monitoring on headphones.

You can also monitor on the speakers by keeping the level low. Avoid feedback and unwanted leakage by putting some distance between the mic and the monitors and by facing the (unidirectional) mic's capsule away from the speakers. Remember to mute the mic when evaluating takes.


Generally, the best microphones for recording voice are large-diaphragm, unidirectional (cardioid-, supercardioid-, or hypercardioid-pattern) condensers. Unlike omnidirectional mics, which theoretically “hear” equally in all directions, and bidirectional (figure-8-pattern) mics, which hear equally from the front and the rear, unidirectional microphones reject, or at least minimize, off-axis sounds, especially those coming from behind the capsule. In addition, unidirectional mics naturally provide some amount of bass boosting from the proximity effect, which typically adds low-end fullness and authority to voices.

Though unidirectional mics are usually the best choice, condensers may not be, especially bright condensers that accentuate sibilance. Instead of using EQ to reel in those esses, try selecting a different mic or positioning the mic farther away, higher, or slightly off-axis; save the EQ and de-essers for editing and mixing. In the case of overly sibilant voices (more often female than male), you might opt for a darker-sounding condenser, such as certain tube models. I sometimes even use a dynamic, such as the Shure SM58 (the SM57 sometimes adds more unwanted sibilance to certain voices). It all depends on the particular interaction of the voice and the mic, of course, which is why a test run is always recommended.

In pro studios, the first pick for voice-over work is often a Neumann U 87 or U 89. If those are beyond your reach, you can also do a fine job with one of the many less expensive condensers on the market. There are terrific offerings from Shure, AKG, and Røde, for example. Also, I've had great success with the Marshall 2001 and 2003. Douglas Spotted Eagle, the Grammy-winning performer and producer, prefers either the Audio-Technica 4033 or the Audix SCX-25. Many radio stations use the Electro-Voice RE20 or the Sennheiser MD 421 (both dynamics) for that distinctive radio-voice quality.

A closely guarded voice-over secret is to use an interference tube or shotgun mic, specifically the Sennheiser MKH 416 (see Fig. 3). This mic's tight polar pattern and hefty proximity-effect bass boost lend a deep, rich character to voices, especially male ones. “Voices just naturally sound full and punchy on the 416,” says Hogan, who often prefers it to his vintage Neumann U 47.


Minor changes in mic positioning can greatly affect the sound you capture. Start by suspending the microphone from a boom stand, preferably using a shockmount, which will reduce floor rumble and other stand-borne vibrations. Position the capsule so that it is centered and even with the talent's mouth, between four and six inches away. (When using a shotgun mic, position it a little farther back, between six and ten inches from the lips.) Insert a pop filter about halfway between the mouth and the mic capsule.

When testing the mic, listen not only for accentuated sibilance but also for susceptibility to plosives (popped ps, ts, and even chs and tchs). To reduce plosives, try positioning the mic capsule a bit higher than the talent's upper lip and aimed downward slightly. Another approach is to mic the voice from the side, between 20 and 30 degrees off center. (Side miking has the additional benefit of making it easier to position the talent's script.)

Sometimes capsule repositioning alone doesn't solve the problem. For such cases, Spotted Eagle suggests taping a pencil directly over the center of the mic (see Fig. 4). This trick stops popping simply by breaking up the air flow.


The same preamps you use for recording music will usually work fine for recording spoken voice. In general, though, go for clean, low-noise preamps rather than those that provide coloration or a particular sonic signature.

In my studio, I often record straight into the onboard mic preamps of an Edirol UA-5 USB audio interface. The sound is basic, uncolored, and low noise. Hogan uses the mic preamps in his Mackie 1202 VLZ mixer, which he routes to an M-Audio Delta 66 interface. Spotted Eagle relies on his John Hardy M-1 mic preamp.


Any decent recorder, analog or digital, can work well for voice-overs. You can even use 2-track software, such as Sonic Foundry Sound Forge, to record and edit your tracks. Hogan uses Syntrillium Cool Edit Pro and simultaneously records to DAT as a backup. He leaves the DAT running during the session to avoid missing any takes because of computer problems. The always-running DAT also captures warm-ups and rehearsals — possibly useful takes that typically are lost.

When recording to a digital medium, set your levels so you still have plenty of wiggle room should the talent suddenly get louder. Average levels around -10 to -12 dBFS are usually sufficient. (Remember that 0 VU on an analog meter typically lines up with roughly -20 dBFS on a digital one, depending on how the gear is calibrated.) Record in mono at the highest sampling rate and bit depth that you have available. I record, edit, and mix at 24-bit, 48 kHz; later, if necessary, I reduce the bit depth and convert the sampling rate, depending on the client. Some clients want a CD or WAV or AIFF files, which means a 16-bit word length and a 44.1 kHz sampling rate; others want a high-quality MP3; and some prefer to take the 24/48 files on CD-R.
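To check levels on a test recording in software, you can measure the peak and average level yourself. The Python sketch below assumes the samples are floating-point values between -1.0 and 1.0 (where 1.0 is digital full scale) and treats "average level" as RMS, which is my assumption; the function names and the test tone are made up for illustration.

```python
import math

def peak_dbfs(samples):
    """Peak level in dB relative to full scale, for samples in -1.0..1.0."""
    peak = max(abs(s) for s in samples)
    return 20.0 * math.log10(peak) if peak > 0 else float("-inf")

def rms_dbfs(samples):
    """RMS (average) level in dB relative to full scale."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10.0 * math.log10(mean_square) if mean_square > 0 else float("-inf")

# Stand-in for a test take: one second of a 220 Hz tone at 48 kHz,
# with a peak amplitude of 0.35 of full scale.
rate = 48000
take = [0.35 * math.sin(2 * math.pi * 220 * n / rate) for n in range(rate)]

print(f"peak: {peak_dbfs(take):.1f} dBFS, average: {rms_dbfs(take):.1f} dBFS")
```

The gap between the peak reading and 0 dBFS is your remaining headroom. Real speech has a much larger peak-to-average spread than a steady tone, which is why the average needs to sit well below full scale.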

In general, when recording voice-overs, use no EQ, compression, or effects. Engage the highpass filter on any mic that has one, or roll off everything below 80 Hz (at the preamp or mixer, or later during editing or mixdown) to get rid of any rumble and subharmonic junk.
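If you do the 80 Hz rolloff during editing, any editor's highpass filter will serve. For the curious, here is a rough Python sketch of what such a filter does, using a standard second-order (12 dB per octave) design from the widely circulated Audio EQ Cookbook formulas. The function names, the filter slope, and the test tones are my assumptions; your editor's filter may be steeper or gentler.

```python
import math

def highpass_coeffs(cutoff_hz, sample_rate, q=0.7071):
    """Normalized second-order highpass coefficients (Audio EQ Cookbook form)."""
    w0 = 2.0 * math.pi * cutoff_hz / sample_rate
    cos_w0 = math.cos(w0)
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b0 = (1.0 + cos_w0) / 2.0 / a0
    b1 = -(1.0 + cos_w0) / a0
    b2 = b0
    a1 = -2.0 * cos_w0 / a0
    a2 = (1.0 - alpha) / a0
    return (b0, b1, b2), (a1, a2)

def highpass(samples, cutoff_hz=80.0, sample_rate=48000):
    """Run the samples through the highpass filter, one sample at a time."""
    (b0, b1, b2), (a1, a2) = highpass_coeffs(cutoff_hz, sample_rate)
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out

def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def tone(freq_hz, sample_rate=48000, seconds=1.0):
    count = int(sample_rate * seconds)
    return [math.sin(2.0 * math.pi * freq_hz * n / sample_rate) for n in range(count)]

rumble = tone(30.0)        # stand-in for floor rumble
voice_band = tone(1000.0)  # stand-in for energy in the speech range

# Compare levels over the second half only, after the filter has settled.
half = len(rumble) // 2
rumble_ratio = rms(highpass(rumble)[half:]) / rms(rumble[half:])
voice_ratio = rms(highpass(voice_band)[half:]) / rms(voice_band[half:])
print(f"30 Hz kept: {rumble_ratio:.2f}, 1 kHz kept: {voice_ratio:.2f}")
```

The 30 Hz tone comes out at a small fraction of its original level while the 1 kHz tone passes essentially untouched, which is the whole point: the rumble goes, the voice stays.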

Be sure to keep written notes — about good takes, blown lines, and so forth — on a copy of the script. That way, you won't waste precious time later trying to locate what you need. Also, listen back to all tracks on your main monitors before the talent leaves — there may be a glitch you didn't hear during the take, and if you don't fix it immediately, you'll likely have to pay the talent again to rerecord it later. Another tip: after the session ends, immediately burn a backup CD of the raw tracks.


For shorter projects, most spoken-word performers will want to stand while recording. For longer projects, such as an audio book, they will probably need to sit, so make sure to have a chair or stool that doesn't squeak or otherwise make noise. Position the script nearby on a music stand, ideally at eye level so the performer doesn't have to look down or turn away from the mic while speaking.

When working with a professional, you can pretty much hand him or her the script, get a level, and press Record. Most pros know how to work a mic — for example, how to minimize plosives, back off on loud passages, and generally deliver a consistent vocal quality and dynamic range.

Recording amateurs is the greater challenge. Most are uncomfortable recording and, yes, they will almost always tell you how much they hate the sound of their voices. To get the best performance, do everything possible to put them at ease and make them feel comfortable. Depending on the personality, it might even help to explain what you're doing at various stages in the recording process.

Amateurs tend to rush, so look for ways to slow them down. “It's important that they relax,” says Spotted Eagle, “because a tense voice is higher in pitch than a relaxed voice. To some extent, excitement can be created in the mix, but tension cannot be removed, no matter how many tools you have in your arsenal.”

Bad breathing is another common problem among novices. Long sentences and strings of complex nouns often force the talent to take “catch” breaths. These gasps for air sound horrible and can really hurt a take. If you run into this problem, go through the script with the performer to indicate natural pauses for catching a breath. This will help the talent pace his or her breathing.

If you're still not getting a great performance, tactfully suggest ways for the performer to improve delivery of the lines. Of course, when working with a director or producer, it is usually up to him or her to make those suggestions. But as a general safeguard — and especially if you're recording without input from others — try to get at least three takes of each line. That way the client can choose — later — the best takes or performances.

When the performer blows a line, don't redo just the blown words. Mid-sentence pickups, even when expertly edited, rarely sound natural; often the emotion or inflection is wrong, making them stick out. Instead, go back to the beginning of a sentence or some other logical starting point and pick it up from there.


One problem with recording and editing in the same room is that your brain tunes out any background din, which is present, naturally, both in the recording and in the room itself. Before cleaning up and editing takes, listen carefully on your best headphones to find glitches and other background sounds — things that you might miss on your monitors. Headphones can help you focus on the recording and ignore your noisy studio. Of course, when it comes time to edit, you should return to listening on your reference monitors.

Before starting to edit (digitally), always make a copy of the original file and then work on the copy. That way you can go back to the original file if something goes wrong. (The session CD is yet another backup.) Referring to your notes, choose the best takes, fix mistakes, and get rid of unwanted bits such as background noise, chatter between takes, unnecessary breaths, lip smacks and other mouth noises, digital ticks, and so on.

When it comes to editing breaths, note that, if you cut them out completely, the finished version can sound unnatural. On the other hand, excessive breathing can sound bad, too. The trick is finding a balance. Work to eliminate big breaths that add nothing to the performance, as well as distracting “catch” breaths that occur between phrases. Leave the more natural breaths in place (though you may want to play them down somewhat by reducing their volume).

If you're hired only to record the narration, work to get a good basic sound and leave the EQs, compressors, and effects off — those responsible for the final mix can do the creative work. Hogan delivers only his voice tracks to producers, who then add the other elements. They insist he send a clean file, free of special effects. “I might normalize the tracks, but that's about it,” says Hogan. “For auditions, though, I usually use some compression to make my readings sound louder, which helps them stand out from the other actors' submissions.”
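"Normalize" usually means peak normalization: one fixed gain applied to the whole file so that its loudest sample lands at a chosen level. I am assuming that is the meaning here. A minimal Python sketch, with a hypothetical function name and a -1 dBFS target picked only for illustration:

```python
def normalize_peak(samples, target_dbfs=-1.0):
    """Scale samples (floats in -1.0..1.0) so the loudest one hits target_dbfs.

    The same gain is applied to every sample, so the dynamics of the
    performance are unchanged; only the overall level moves.
    """
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # silence: nothing to scale
    gain = 10.0 ** (target_dbfs / 20.0) / peak
    return [s * gain for s in samples]

take = [0.10, -0.25, 0.20]
louder = normalize_peak(take, target_dbfs=-1.0)
print(max(abs(s) for s in louder))
```

Because normalizing leaves the relationship between loud and soft passages alone, it counts as a clean delivery. Compression, by contrast, changes that relationship.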

On the other hand, if you are hired to do the whole project — voice, music, and sound effects — you can be more adventurous. After recording and cleaning up takes in Sound Forge, I load them into Sonic Foundry Vegas Video to arrange the final piece. The editing I do in Vegas is more to perfect timing than to fix mistakes. Vegas is also where I add EQ, compression, and other sweeteners, if needed.


On dull-sounding voice recordings, add a slight EQ boost somewhere between 3 and 4 kHz. Avoid boosting in the 5 to 8 kHz range, though, because that's where sibilance often resides. For a more in-your-face sound, Spotted Eagle suggests also punching up the low mids (150 to 250 Hz) and the extreme highs (8 to 10 kHz).

If sibilance persists, apply some frequency-dependent compression, known as de-essing, to reduce it. This usually works better than EQ alone. Sound Forge includes a Multiband Dynamics plug-in that's ideal for minor sibilance issues (see Fig. 5).

Gentle compression can even out levels and make for a smoother, more natural sound. Spotted Eagle always uses compression to maximize the impact of the voice, whether it's heard alone or over sound effects, music, or walla (nonsynced crowd sounds). He recommends a 4:1 ratio with a fast attack and a slow release as a good starting point. “If you need the level hotter,” he says, “such as for radio or TV, then compress even more.”
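To see what a 4:1 ratio with a fast attack and a slow release actually does to the signal, here is a bare-bones Python sketch of a feed-forward compressor. Only the 4:1 ratio comes from the advice above. The -18 dB threshold, the 5 ms attack, the 200 ms release, and the function name are my assumptions for illustration, and commercial plug-ins are considerably more refined.

```python
import math

def compress(samples, sample_rate, threshold_db=-18.0, ratio=4.0,
             attack_ms=5.0, release_ms=200.0):
    """Very simple compressor for float samples in the range -1.0..1.0."""
    # One-pole smoothing coefficients: fast when the level rises (attack),
    # slow when it falls (release).
    attack = math.exp(-1.0 / (sample_rate * attack_ms / 1000.0))
    release = math.exp(-1.0 / (sample_rate * release_ms / 1000.0))
    envelope = 0.0
    out = []
    for s in samples:
        level = abs(s)
        coeff = attack if level > envelope else release
        envelope = coeff * envelope + (1.0 - coeff) * level
        level_db = 20.0 * math.log10(max(envelope, 1e-9))
        # Only the part of the signal above the threshold is reduced.
        over_db = max(0.0, level_db - threshold_db)
        gain_db = -over_db * (1.0 - 1.0 / ratio)
        out.append(s * 10.0 ** (gain_db / 20.0))
    return out

rate = 48000
# A steady level of 0.5 (about -6 dBFS) is roughly 12 dB over the threshold,
# so at 4:1 it should settle roughly 3 dB over the threshold instead.
loud = compress([0.5] * rate, rate)
print(f"settled level: {20.0 * math.log10(loud[-1]):.1f} dBFS")
```

Compressing "even more" for radio or TV would mean a lower threshold, a higher ratio, or both, followed by makeup gain to bring the overall level back up.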

Jeffrey P. Fisher helps musicians improve their careers through books such as Profiting from Your Music and Sound Project Studio (Allworth Press, 2001). Special thanks to Douglas Spotted Eagle and Harlan Hogan.


Recording narration on location in noisy environments brings another set of challenges. Isolating the performer or using acoustic absorbers may be impractical. Use them if you can, of course. But either way, choose a good unidirectional mic and position it close to the performer.

Later, when you edit the location sound, you may notice tiny jumps in background noise as you compile takes. Using a noise gate often compounds the problem. The solution is to record 30 seconds of room tone (no talking, just the sound of the room), either at the beginning or the end of the recording session. You can then mix in a bit of this background track to help smooth over problem areas.
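In a multitrack editor you would simply lay the room tone on its own track under the compiled takes and pull its fader down. The same idea in a short Python sketch, assuming both recordings are lists of float samples at the same sampling rate; the function name and the -6 dB default are hypothetical.

```python
from itertools import cycle, islice

def add_room_tone(voice, room_tone, tone_gain_db=-6.0):
    """Mix looped room tone underneath an edited voice track.

    The room tone is repeated as often as needed to cover the voice track,
    scaled by tone_gain_db, and summed with it sample by sample.
    """
    if not room_tone:
        return list(voice)
    gain = 10.0 ** (tone_gain_db / 20.0)
    looped = islice(cycle(room_tone), len(voice))
    return [v + gain * t for v, t in zip(voice, looped)]

# Tiny made-up example: five samples of edited "voice" (here, digital silence
# left by a cut) and a two-sample room-tone loop mixed in at full level.
print(add_room_tone([0.0] * 5, [0.1, -0.1], tone_gain_db=0.0))
# [0.1, -0.1, 0.1, -0.1, 0.1]
```

A hard loop like this can click at the repeat point, so in practice you would crossfade the loop ends; the longer your room-tone recording, the less often that matters.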