Vocal Magic


It would be natural to assume that as digital audio technology has become more sophisticated, the job of the engineer and producer has gotten easier with regard to editing vocals. After all, today's digital audio sequencers and editors give you precision that even the most skilled analog engineer — armed with a razor blade and splicing block — could only dream about. But inevitably, as the tools get better the bar gets raised, and now producers and engineers are expected to work miracles — correcting problems with pitch, dynamics, and even phrasing. “Fixing it in the mix” has become a complicated task indeed.

To give you an in-the-trenches perspective on some key editing issues and techniques for both lead and background vocals, I spoke with six successful producer-engineers who shared their expertise on such subjects as making a comp track, controlling dynamics, editing out unwanted noises, EQ'ing vocals, and much more.

Steve Addabbo is the owner of Shelter Island Sound (www.shelterislandsound.com) in New York City and has produced numerous artists, including Suzanne Vega and Shawn Colvin. He recently finished working on a CD for newcomer Sonya Kitchell that was released on the Velour/Starbucks label.

Bob Power (www.bobpower.com) is a producer, engineer, mixer, and songwriter who has worked for a huge list of artists, including Erykah Badu, D'Angelo, Me'shell N'degeOcello, Citizen Cope, the Roots, and Chaka Khan. He's currently working with singer-songwriter Andrea Wittgens.

Rail Jon Rogut (http://railjonrogut.com) recently engineered Ry Cooder's Chavez Ravine CD (Nonesuch, 2005). Other clients have included such artists as Aaron Neville, Mark Lanegan, and the String Cheese Incident. He just completed work on an album with producer Mike Clink and artist Sarah Kelly.

Johnny “Juice” Rosado (www.publicenemy.com) has produced and engineered numerous artists, including Slick Rick, C&C Music Factory, Mandrill, and Mavis Staples. He's the regular producer-engineer for Public Enemy.

Rick Sheppard has been the longtime engineer for R&B megaproducer Dallas Austin. Sheppard has engineered tracks for Michael Jackson, Madonna, Bjork, Macy Gray, Mick Jagger, Gwen Stefani, and Natalie Cole, to name a few.

Ed Stasium's (www.edstasium.com) credit list includes Mick Jagger, Talking Heads, the Ramones, Living Colour, the Smithereens, the Reverend Horton Heat, and many more. When I spoke to him he was finishing up a project for an up-and-coming band called Lourds.

Leveling Out

Before getting into the nuts and bolts of vocal editing, it's worth noting that editing is a lot easier if the vocal levels are properly recorded. In addition to the crucial issues of vocal performance, mic and preamp choice, and mic placement — which are beyond the scope of this story — it's critical that your vocal tracks be recorded hot enough to give you full fidelity, but not so hot that you get distortion.

Several of the interviewees recommended that if you're recording at 24-bit resolution, which has more headroom than 16-bit, it's prudent to shoot for levels that average around -5 dB. They suggested that if the song has wide dynamic swings, you might want to manually ride the mic-preamp gain during recording. There was general agreement that lightly compressing to disk (with compression ratios typically between 2:1 and 4:1) is also a good idea. (For a more in-depth discussion from the six engineers on getting good vocal levels, see Web Clip 1.)
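If you work in the box, the level target the interviewees describe is easy to check programmatically. Here's a rough Python/NumPy sketch — the function name and the example tone are purely illustrative, and it assumes floating-point audio with full scale at ±1.0 — that measures a recording's peak level in dBFS:

```python
import numpy as np

def peak_dbfs(signal):
    """Peak level of a float signal (full scale = 1.0), in dBFS."""
    peak = np.max(np.abs(signal))
    return -np.inf if peak == 0 else 20 * np.log10(peak)

# Example: a 440 Hz sine at half of full scale peaks near -6 dBFS,
# in the ballpark of the roughly -5 dB average suggested above.
t = np.linspace(0, 1, 44100, endpoint=False)
tone = 0.5 * np.sin(2 * np.pi * 440 * t)
level = peak_dbfs(tone)
```

A meter like this reports peak rather than average level; either way, the point is the same — leave headroom at 24-bit rather than chasing 0 dBFS.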

Get It Together

In most cases, the end result of a lead-vocal session is an abundance of takes that need to be sorted through, cut up, and reassembled into a final composite or “comp” take. “To me, one of the marks of a really good producer is how they comp,” says Power. “Sometimes a good producer can get a really compelling performance out of just an okay singer, if the producer's a good comper.”

Before you start recording vocals, it's important to create a system to keep track of the quality of the singer's performance on a line-by-line (or sometimes even word-by-word) basis for each take. Some use a comp sheet, which is a piece of graph paper with the lyrics written out on it. There are vertical numbered columns after each line where you can check off the “keepers” or grade the performance of each line on each take, or whatever works best for you.

Instead of a comp sheet, some people take notes on a lyric sheet or even on blank paper. “I take physical notes on a legal pad with each Playlist number and what I thought of each performance,” says Rogut. (In Digidesign Pro Tools, a track can have unlimited Playlists, which are nested tracks within one track, only one of which can play back at a time.) Rosado, a Cakewalk Sonar user, says, “You can write notes within the program, so I do that.”

However you do it, you'll save yourself time in the comping process by taking notes as the singer is recording. Just don't be too obvious about it, because no singer will like the feeling of being “graded” as they work. “I keep notes on the comp sheet,” says Power, “but not in a way that will be either demoralizing or obtrusive to the artist.”

The How and Why

So what are some of the actual comping methods these engineers use? Their procedures are dictated in part by the architecture of the different sequencers they use.

Rosado uses Sonar's Layers feature for his vocal recording and comping. It allows multiple takes to be recorded into one track; they show up as layers that can be muted, unmuted, and freely manipulated. “I just split the tracks where I want them to be, and I mute the clips [regions] that I don't need,” he says. “Eventually, after I find the parts I really like, I bounce it down to one clip.”

Stasium takes advantage of Pro Tools' Playlist feature. Each vocal take gets recorded to its own Playlist. When it comes time for comping, each Playlist is sequentially numbered (which Pro Tools does automatically) and easy to audition and copy and paste from. “I remember the days of doing vocal composites with multitrack tapes and just rewinding and switching between the tracks,” says Stasium. “Sometimes it would take up to two days to make a vocal comp if you had like 10 or 12 tracks of vocals.”


FIG. 1: This screenshot from a Natalie Cole session shows Sheppard's method for vocal comping in Logic Pro. The vocal-take tracks are assigned to the same audio voice, cut up by phrase, and auditioned one line at a time. The keepers are pasted into the comp track at the bottom.

Sheppard and Power are both Apple Logic Pro users (both use Logic on top of Digidesign TDM hardware, and Power also uses Pro Tools), and they take advantage of that program's feature for recording multiple takes that are assigned to the same audio voice. When auditioning takes, only one can play back at a time, which streamlines the auditioning process. “First, I cut up the vocals in phrases,” says Sheppard, “then I go through each take by unmuting one vocal line at a time. It goes really fast, and it's nice to see everything at once” (see Fig. 1).

“I set up a cycle, a loop, which is really easy and handy,” explains Power, “and then open up one channel, listen to it, open up take 2, listen to it, take 3, take 4, take 5, take 6. And as that goes on, when I hear something I really like, I actually cut it front and back, on the source track.”


FIG. 2: One way to help keep track of the takes you're comping is to color-code them by quality. If your sequencer allows for color-coding of individual regions (such as in this example from Logic Pro), you can even color-code individual phrases.

If it's a complex comp, Power will often color code a region after he separates it. “I have a color coding scheme based on how dynamite the little segments are,” he says. “Red means really good and really exciting. Orange means pretty good and pretty exciting. Green means it's acceptable. Blue means maybe if we really need it, etc. But everybody will come up with their own. The advantage to color coding is that you can put together a comp pretty quickly, and then later on, if you're not sure about a line or a word, you can very quickly look up and say, ‘Oh wow, there's another orange one up there'” (see Fig. 2).

Addabbo takes a different approach. “I've just always been a kind of real-time vocal producer,” he explains, “where I'm building a performance with the artist in a live setting.” He'll start by having the singer do a few complete takes that are close in feel and execution to what he's looking for. Then, instead of continuing to record a series of complete takes, he'll use those first performances as a starting point and will begin punching in bits and pieces live with the singer. (He keeps all the lines he replaces, in case he changes his mind later.)

Here's an example of his procedure: say the vocalist sings a good first verse and chorus, but then the second verse is not spectacular. Addabbo would duplicate the Playlist, creating a copy of what was already recorded. Next, the singer would re-sing the second verse, with Addabbo punching into the duplicated Playlist. The singer would continue on until they reached another part Addabbo was not sure of, and then he'd duplicate the Playlist again, and punch from that spot. “It doesn't mean that I'm not going to have a lot of choices later if I decide I don't like something,” says Addabbo. “But at least I'm getting a cohesive performance that both the singer and I are vibing to together.”


FIG. 3: Many sequencers, including Steinberg's Cubase SX (pictured here), allow you to record vocal takes as nested tracks within a single track. Such features facilitate vocal comping by making it easy to manipulate and audition the various takes.

Although Addabbo does these cumulative comps in Pro Tools, his method could be used in any digital audio sequencer. It's easiest if your sequencer has a nested track feature such as the Takes in MOTU Digital Performer, the Layers in Sonar, or the Lanes in Steinberg Cubase SX (see Fig. 3). But even if it doesn't, you can duplicate your lead vocal track before each punch, and punch into that new track.

Rogut also likes to use Playlists when recording vocal takes in Pro Tools. But afterward he splits them out onto five separate tracks and also creates a new empty track called Vocal Comp above the source tracks. He'll copy and paste the best sections from the source tracks to the Vocal Comp track, keeping their region names intact to make it easy to see which vocal track they came from.


FIG. 4: Rogut's method for auditioning tracks when comping uses Pro Tools HD's “Mute Frees Assigned Voice” preference. In this example, when the transport gets to the “hole” in the Vox Comp track, the audio from the track below will play for that section.

Rogut then assigns the source tracks and the comp track to the same audio voice, and uses Pro Tools|HD's voice-stealing feature — which gives playback priority to the top voice (the Mute Frees Assigned Voice feature should be enabled for this) — allowing him to easily audition switches between different lines or words. By leaving a “hole” in the comp track for the line he's auditioning (see Fig. 4), whichever of the vocal-take tracks he unmutes below the comp track will play during that cutout section.

Fading Away

When the sections you're comping together have natural pauses between them, then the process is mainly just one of identifying the pieces you want to use and assembling them in a new track. But what if you have to graft the first part of one line with the second of another? Or substitute one word at the end of a line? Or even put together a word using the first syllable or sound from one word and the second from another?

When editing two adjoining sections together, first experiment with moving the transition point slightly. A little more to the left or right can make a big difference in the smoothness of the edit.


FIG. 5: Zooming to the sample level and making your edit at a zero-crossing point will help ensure that your edit is free of clicks and pops.

If you're having trouble finding a natural-sounding edit point, zoom down to the sample level and make the cut at a zero-crossing (see Fig. 5) — the point where the waveform crosses the centerline. Because the amplitude is at or near zero on both sides of the cut, the edit is much less likely to produce clicks and pops. Whenever possible, you should make your edits at zero-crossings.
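DAWs do this zero-crossing snap for you, but the logic behind it is simple enough to show in a short, hypothetical Python/NumPy sketch (the function name is mine, not any DAW's): find the sign changes in the waveform and snap the edit point to the nearest one.

```python
import numpy as np

def nearest_zero_crossing(signal, index):
    """Return the sample index of the zero-crossing closest to `index`."""
    signs = np.sign(signal)
    # diff is nonzero wherever the signal changes sign between i and i+1
    crossings = np.where(np.diff(signs) != 0)[0]
    if len(crossings) == 0:
        return index  # no crossing found; leave the edit point alone
    return int(crossings[np.argmin(np.abs(crossings - index))])
```

Given a sine wave, an edit point requested at sample 12 would snap to the crossing at sample 10 — which is exactly the behavior you want when you drag an edit near a centerline crossing.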

Once you've found the best possible edit point, use your sequencer's crossfade function to smooth the transition further. A crossfade gives you a gradual transition from one section to another, rather than an abrupt one. If properly set — you have control over the length and shape — a crossfade can make the edit point much less noticeable. Some digital audio sequencers can be set to automatically generate crossfades at every edit point.

The crossfade's length is an important factor. If the fade is too long, you'll hear unnatural overlapping. Short fades (set when you're zoomed in pretty close on the waveform) are often quite effective, although one that's too short might not have enough of a smoothing effect.
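Under the hood, a crossfade is just two gain ramps multiplied against the overlapping audio and summed. Here's a minimal sketch — assuming NumPy arrays of float samples, with a function name of my own invention — of an equal-power crossfade, one common fade shape (DAWs typically offer linear, equal-power, and S-curve variants):

```python
import numpy as np

def equal_power_crossfade(a, b, fade_len):
    """Splice clip `a` into clip `b` with an equal-power crossfade
    of fade_len samples."""
    ramp = np.linspace(0, 1, fade_len)
    fade_out = np.cos(ramp * np.pi / 2)  # applied to the end of `a`
    fade_in = np.sin(ramp * np.pi / 2)   # applied to the start of `b`
    crossed = a[-fade_len:] * fade_out + b[:fade_len] * fade_in
    return np.concatenate([a[:-fade_len], crossed, b[fade_len:]])
```

The `fade_len` argument is the knob the paragraph above is talking about: a handful of milliseconds' worth of samples for a tight mid-word edit, longer for a transition between phrases.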

“When you're going from take to take, with Pro Tools or any other DAW,” says Stasium, “sometimes you're going to get a little click or a pop or some little thing — especially if you're in a word. You've got to find the right crossfade. I crossfade every piece, no matter what. I try to make it sound as natural as I possibly can.”

Grafting and Crafting

I asked the experts if they often have to splice words together from syllables, letter sounds, or both. “You can get as crazy as you want,” says Addabbo, “and I've certainly been there in terms of chopping up words.” However, the overall consensus was that constructing words through editing was not something to be done unless necessary.

“Have I had to do that? Yes,” says Sheppard. “Do I want to do that? No. Hopefully, you can get a word out of someone, but there have been times when it was a great performance, but at the end of the first time she said ‘this’ and the next time she said ‘thith’ because of whatever. So I can take that ess and throw it on the end of the next one and you won't know the difference.”

Power offered a couple of additional techniques for when you just can't get your edit to sound natural. “I find a lot of times, if a crossfade doesn't work, try fading out on the left segment and fading in on the right. Many times that will fix what a crossfade can't.” That technique is particularly easy to do in Logic, which allows you to type in a fade-in and fade-out value for each audio region, but can be accomplished in other digital audio sequencers or editors, too.

Power's other edit-finessing trick starts with putting the two sections being edited together on separate tracks. “Then lengthen the left piece a little and then fade out on that, and lengthen the right piece a little and fade in on that, so they actually overlap just a bit,” he suggests. “A lot of times you can make it work that way.”

Once you've finished comping and crossfading, you're going to have a track with a lot of regions in it. At that point “you should consolidate the track,” says Rogut, “so it's just one large audio file. This also allows you to remove the source tracks from the session to reduce its size.”

Every Breath You Take

Dealing with breath sounds is another important issue in vocal editing. Although it's tempting to just remove them all, doing so can sound unnatural in many musical situations. “Breaths are a really important part of a vocal performance,” says Addabbo. “I've had people go out and record breaths sometimes, if I don't have the right one [laughs].”

But you have to be careful with them. “I go in and I clean up each line,” says Sheppard. “If breaths are needed or give the song a good feeling like it's real personal, then I'll just leave them, but every once in a while you get a singer gasping for air, and you don't want that.” He says you should make your editing decisions on a breath-by-breath basis. “Does it enhance the track or does it sound like they're drowning?”

You have to be particularly careful of how compression affects the sounds of breaths. “Sometimes you hear stuff on the radio,” says Addabbo, “and you can tell how squashed these vocals are, because the breaths are sometimes almost louder than the words. You can just see the meters moving on the compressor.”


FIG. 6: One way to make breath sounds less obtrusive — without cutting them out entirely (which can sound unnatural) — is to use volume automation.

One way to deal with such problems is to use your sequencer's volume automation to bring the level of the breaths down to where they're not sticking out (see Fig. 6). “You're counterbalancing the tendency of the compressor to open back up when you're below the threshold,” says Power.
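In a DAW you'd draw this as automation, but the underlying operation is just a gain envelope with short ramps at its edges. Here's an illustrative Python/NumPy sketch (function and parameter names are hypothetical; it assumes the ducked region isn't at the very start or end of the file):

```python
import numpy as np

def duck_region(signal, start, end, gain_db, ramp=64):
    """Pull a region (e.g., a breath) down by gain_db, with short
    linear ramps so the gain change itself doesn't click.
    Assumes start >= ramp and end + ramp <= len(signal)."""
    gain = 10 ** (gain_db / 20)
    env = np.ones(len(signal))
    env[start:end] = gain
    env[start - ramp:start] = np.linspace(1, gain, ramp)
    env[end:end + ramp] = np.linspace(gain, 1, ramp)
    return signal.astype(float) * env
```

A -10 or -12 dB dip on a breath, like the automation move described above, leaves the breath audible but keeps the compressor from dragging it up to the level of the words.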

Another issue arises when two breaths collide at the transition between edited sections. In that scenario, you'll have to finesse the edit point to try to find the spot where the breath sounds natural. “You'll have to move the file a little bit left, a little bit right, to catch the breath properly,” says Stasium.

Rogut finds that it generally sounds more natural to keep the breath from the right-hand (later) side of the edit. “Say there's a breath at the end of line 1 and another at the beginning of line 2. I would normally keep the breath of line 2 and cut the one from line 1.”

Rogut says that if he's having a problem getting a breath to sound right around the edit point, he'll sometimes move the breath to its own region, and then time-stretch it to the length he needs to make it sound the most natural.
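The time-stretching Rogut describes would be done with his DAW's own algorithm, which is far more sophisticated than anything shown here. But to illustrate the basic idea, here's a crude overlap-add (OLA) stretch in Python/NumPy — a sketch under the assumption that simple windowed overlap-add is acceptable, which is most plausible on unpitched material like a breath; on pitched material, a plain OLA like this can sound phasey:

```python
import numpy as np

def ola_stretch(signal, factor, frame=1024, hop=256):
    """Crude overlap-add time-stretch: factor > 1 lengthens the audio
    by repositioning overlapping windowed frames."""
    out_len = int(len(signal) * factor)
    out = np.zeros(out_len + frame)
    norm = np.zeros(out_len + frame)
    window = np.hanning(frame)
    n_frames = (len(signal) - frame) // hop + 1
    for i in range(n_frames):
        in_pos = i * hop
        out_pos = int(i * hop * factor)  # frames land farther apart
        out[out_pos:out_pos + frame] += signal[in_pos:in_pos + frame] * window
        norm[out_pos:out_pos + frame] += window
    # normalize by the summed window energy to keep levels consistent
    return out[:out_len] / np.maximum(norm[:out_len], 1e-6)
```

Stretching a breath by, say, a factor of 1.2 to fill a slightly longer gap is exactly the kind of small fix where this class of technique shines.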

Pitching Forward

The six pros generally agreed that pitch correction is a tool to be used only as needed. “I don't like to autotune the entire vocal,” Addabbo says. “I'll go in and say, ‘Okay, that word's a little flat.’ I'll go in and bring up that one line, as opposed to just putting it on stun. It depends. If it's a real poppy, slick record, you may just want to let the thing run through it. But in general, I like to pick and choose what I autotune.”

Addabbo uses Antares Auto-Tune for pitch correction. His standard method is to create a new track called the Fixer track. Onto it, he'll copy whatever sections of the vocal need correction. “Generally, what I'll try to do is to fix the stuff that really hurts my ear or bugs me,” he says. “That could be just the end. Singers always tend to drop their pitch at the end of a line.” He'll then instantiate Auto-Tune as an insert on the Fixer track, capture the track's contents into Auto-Tune's Graphical mode, and correct the problem areas using the various tools provided. After making sure that no clicks or pops have been created by the processing, Addabbo records the Fixer track with Auto-Tune back onto the original vocal track (on a duplicated Playlist).

Rosado has had a lot of success correcting pitch with Celemony Melodyne and lately with the new V-Vocal feature in Sonar 5 (see Fig. 7). He finds V-Vocal both convenient and powerful. “It does its correction with very few artifacts,” he observes.


FIG. 7: The V-Vocal section of Cakewalk's Sonar 5 (developed in conjunction with Roland) gives users built-in pitch correction and the ability to manipulate both pitch and time.

Sheppard often uses Wave Mechanics Pitch Doctor, a TDM plug-in, which has a manual pitch-correction mode that can be controlled with automation. If he only needs to correct a word here or there, however, he will typically use Auto-Tune. “Auto-Tune is great,” he says, “as long as you use it only where necessary.” Like Addabbo, Sheppard will make a new region out of the offending phrase, word, or syllable, apply correction (generally in Graphical mode), and then record it back onto the original track.

Power will sometimes use Auto-Tune when he needs mild fixes, but he's not comfortable with too much pitch correction. “Unfortunately, autotuning has become a part of what people expect to hear when they listen to a record,” he says, “and not even in the Cher exaggerated sense. But when I listen to the radio, I can hear it a mile away. This is what modern records sound like. I hate it. I do love things in tune. Everyone who knows me knows that pitch and time are way up on the scale for me. But I really don't like that laserlike, right-in-the-middle intonation. It doesn't sound right to me.”

In the Background

Background vocals present different editing and mixing challenges than lead vocals do. The ends of lines can be ragged when you have several people singing them, and problems like breaths and pitchiness can be exaggerated. Naturally, the best way to avoid such problems is to get the vocals right in the first place. “When I do group vocals, I make sure that they're very well rehearsed,” says Rosado. “I try to keep the problems to a minimum.”

In addition to a well-rehearsed performance, there are other techniques for making sure your background vocal performances sound tight when they're recorded. “If there are words with esses or tees,” says Addabbo, “a lot of times I'll tell the background singer, ‘Just don't sing it. The lead singer is doing that ess.’ If you have five people singing esses at the end of the line, you're going to hear ‘esssss.’”

Likewise, if all the singers have to sing a hard consonant like a tee sound at the end of a line, getting them all to sing it at precisely the same time can be very tricky. “I have the background singer not really sing the tee,” Addabbo says.

Another issue to consider when recording a lot of layered background parts is that any sonic problems with those tracks are going to be multiplied. Your mic choice is important. You don't want to use a mic that will accentuate an unpleasant frequency. Power gives this example: “If you have, say, 24 tracks of background vocals out of a sort of nasal-sounding singer, or that were recorded through a nasal-sounding mic [or both], you cannot possibly cut enough of that bad thing out of there, because you have 24 instances of it.”

Your decisions about editing breaths will also be different in regard to background vocals. In most situations you should just cut the breaths out. “You've got to be really careful,” points out Addabbo, “because you can't have 6, 8, or 12 breaths there — however many vocals you stack up. There's no reason to have 12 people breathing on your record. I'll definitely clean them up.” (Waves recently released a breath removal and reduction plug-in called DeBreath, which is part of its Vocal Bundle.)

Addabbo adds, “I'll stay in [Pro Tools'] Grid mode in case I want to fly the parts around.” No matter what sequencer you're using, it makes sense to edit the boundaries of your background vocals on the grid (assuming you've recorded to a click), because then it's a lot easier to copy and paste the parts to different sections of the song.

In Sync

If you're editing background vocal tracks on which the phrasing of the singers isn't particularly tight, there are plenty of editing solutions. One key for tightening up the background tracks is to make the singers' phrases all end together.

Sheppard's method for doing that is typically to use volume automation and fade down vocals that hold for too long so that they end at the same time as the other singers'. You can use the same technique to make sure that the background vocals start their lines together. “I always do fade-ups on everything,” says Rogut. If a harmony vocal is too short at the end of a line, Rogut will often use time-stretching to lengthen it.


FIG. 8: Synchro Arts VocAlign unifies the phrasing between vocal tracks.

Several of the engineers also said that they'll sometimes use Synchro Arts VocAlign (see Fig. 8) to tighten up the harmonies. VocAlign is available as a plug-in and as a standalone application for both Mac and Windows. “It actually captures and time-stretches a vocal to make it match the phrasing of the original,” says Rogut. “But you have to be very careful because it's so good that it will make it phasey.” Sometimes after Rogut uses VocAlign, he'll manipulate the vocal further if the plug-in created phasiness. “You can shift it [the vocal phrase] around a little bit, or slightly time-stretch it afterwards,” he says.

If you're going to be time-stretching or doing other destructive edits to your vocal tracks, be sure you're working on a copy, not the original audio file, so you can backtrack if your edits don't work out.

Taming the Beast

Even with their levels compressed on input, your vocals are almost certain to need additional dynamic adjustment in the mix. On this topic, the six engineers offered solutions that included both automated volume rides and compression.

Sheppard recommends evening the levels on a vocal track with volume automation before you mix it, and then writing those changes to disk. “I do that first,” he says. “That way, you won't get that heavy compression on something.” You generally want to keep the level to the compressor relatively even, because higher input volumes trigger more compression, which can change the character of the sound. “If the level difference is really crazy between the verse and chorus,” he says, “I'll put them on separate tracks.”

Power takes a different route to the same destination. “When someone sings soft and low, for example, instead of just turning up the volume on the channel, I will often turn up the input to the compressor. This does two things: one, it turns it up, and two, it keeps the sonic character the same.”

Rogut likes to do vocal-track volume adjustments at the same time he's comping. “While you're doing all the crossfades between all the comp selections, you check that each line matches EQ-wise and levelwise to the rest of the vocal. It's part of the comping process,” he says.

It's hard to generalize what types of compression settings to use on vocal tracks. A lot depends on the song, the singer, and the type of compressor. Typically, however, if you're working on a ballad, you'll set the compression relatively light, to keep the singer sounding natural and dynamic. For an up-tempo song where you want an “in-your-face” vocal sound, you'd compress much more heavily. “For a real driving, dense rock track, you're going to need to squash that vocal pretty hard to keep it powerful and to keep your track powerful,” says Addabbo.


Deciding how to EQ a vocal track depends a lot on the sound of the track and its context in the mix. That said, there was pretty wide agreement among the interviewees that getting rid of low frequencies below the range of the vocal is usually a good idea.

Rosado has found a lot of problems with low-end rumble and noise on vocal tracks recorded in home studios and sent to him for mixdown. “People don't understand what a low-cut filter is,” he says, “or they have mics that don't have low-cut filters. A lot of the very inexpensive condenser mics don't have them. I heard a truck once on a vocal. I said, ‘Is that a truck?’”

Even with well-recorded vocals, it's often useful to get rid of unneeded low frequencies. “I would definitely pull out anything that's not used,” says Sheppard.

“The vocal sits better when you roll off a lot of that bottom that you don't need,” says Stasium. “Depending on the vocalist and the key of the song and 18 other items, I'll roll off somewhere between 40 Hz and sometimes up to 150 Hz. I'll sometimes use a highpass filter.”
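Stasium would do this roll-off with his console or DAW EQ, but the concept is easy to show in code. Here's a sketch of a first-order (6 dB/octave) highpass filter in Python/NumPy — DAW filters are usually steeper, and the function name and sample rate default are my own assumptions:

```python
import numpy as np

def one_pole_highpass(signal, cutoff_hz, sr=44100):
    """First-order highpass: rolls off rumble below cutoff_hz
    at 6 dB per octave."""
    rc = 1.0 / (2 * np.pi * cutoff_hz)
    dt = 1.0 / sr
    alpha = rc / (rc + dt)
    out = np.zeros_like(signal, dtype=float)
    out[0] = signal[0]
    for i in range(1, len(signal)):
        out[i] = alpha * (out[i - 1] + signal[i] - signal[i - 1])
    return out
```

Fed a constant (DC) signal — a stand-in for subsonic rumble — the filter's output decays toward zero, which is exactly the "bottom you don't need" being removed.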

Boosting in the high frequencies is another typical vocal EQ treatment. “I like air in the vocals,” says Stasium, “so frequently, but not all the time, I'll boost it at 18 kHz to get some air into the vocal. I've always kind of done that, even when I was working with SSLs and with Pultecs.”

Power will also boost the really high frequencies. “Sometimes I kick up 20 kHz, even though there's not a lot up there,” he says. “What happens is that the curve extends down, and you end up gently nudging up the entire top end. Conversely, if you use a shelf at 10 kHz — although it may go all the way out to 20 kHz — it might be too harsh.”

Automation can be used not only for volume rides, but also for EQ that changes throughout a song. “I do automated EQ rides throughout the range of the song to even out the vocal sound through the different ranges of the person's voice,” says Power. “A lot of times, something will sound really good in a certain range of a person's voice, but then when they go down deep, all of a sudden 150 Hz comes up like 6 dB, and all of a sudden it sounds kind of tanky and woolly.”

Corrective Action

Sometimes you have to use editing and DSP (or both) to get rid of problems on a vocal track. One of the most common is excessive sibilance, which can be corrected, in most cases, with a de-esser plug-in. A de-esser is a compressor with a sidechain input (see “Silencing Sibilance” in the April 2000 issue of EM, available at www.emusician.com). It allows you to compress a specific sibilant frequency within the vocal.

“The more you compress,” says Power, “the more you need de-essers. The closer you mic, the more you need de-essers. The more humid the day (when you're using condenser mics), the more you need de-essers.”

The sibilance might not become noticeable until you start EQ'ing and processing the track. “Probably you're brightening your vocal a bit when it's in the track because you're competing with instruments, cymbals, and stuff that's all very powerful,” says Addabbo. “And there's this little vocal in the middle of it, and that's the most important part of the record anyway, so you better get that thing right. So you make it bright, you make it loud, and all of a sudden the esses start to become overbearing.”

Most de-esser plug-ins have “Listen” functions that let you hear only the frequencies you're attenuating. “Listening to the sidechain is very important,” Power says. “If you listen to the track for a little while, and you start flipping through the frequencies, you'll really find the areas that — when you put it all back together — sound offensive. In some cases, there's more than one area. Sometimes I'll set a de-esser around 3 kHz and I'll put another one up at 7 kHz. I also ride the threshold on the de-esser. As you know, it sounds okay up to a point, and then it doesn't sound good at all. So I do [automated] rides on the threshold.”
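A real de-esser is, as described above, a compressor keyed from a sidechain filter tuned to the sibilant band. The following Python/NumPy sketch is a deliberately crude caricature of that idea — a first-difference stands in for the sidechain highpass, frame-by-frame gain stands in for the compressor, and the threshold and reduction values are arbitrary illustrations, not settings from any real plug-in:

```python
import numpy as np

def simple_de_esser(signal, threshold=0.1, reduction_db=-8, frame=256):
    """Crude de-esser sketch: attenuate frames whose high-frequency
    energy (estimated with a first difference, a rough highpass)
    exceeds a threshold."""
    gain = 10 ** (reduction_db / 20)
    out = signal.astype(float).copy()
    hf = np.abs(np.diff(signal, prepend=signal[0]))  # sibilance detector
    for start in range(0, len(signal), frame):
        if hf[start:start + frame].mean() > threshold:
            out[start:start + frame] *= gain
    return out
```

Run on a signal that moves from a low vocal-range tone to a bright sibilant-range tone, it leaves the low material alone and pulls the bright section down by the reduction amount, which is the basic transaction every de-esser performs, however much more gracefully.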


FIG. 9: The blue highlighted section shows a sibilant ess sound. Because of their density, such sounds are easy to spot in a waveform display.

Your sequencer's volume automation can also be used for quashing ess sounds. “You go in and pick the spot where that ess is,” Addabbo says, “and you can usually see it because the high-frequency content is easy to spot (see Fig. 9), and you can just dip it there. You'd be amazed sometimes — you can dip it 10 dB and it still sounds natural because the esses are so prominent. You can get away with a lot.”

Besides esses, sometimes you get palate sounds from the singer and other anomalies, like clicks or plosive pops (see Fig. 10), which can mar an otherwise good take. “That's one reason that I do multiple takes,” says Rosado. “I could always crossfade that part out and put another part in.” Addabbo agrees: “Replacing the syllable, word, or phrase from another take would be my first choice.”


FIG. 10: This is how the sound of a plosive P looks in a waveform display.

But if you can't find a suitable replacement for the offending segment, you could zoom down to the sample level of your sequencer or audio editor and try to “draw” out the noise with the pencil tool. “If that doesn't work,” Rogut says, “I would try removing the click or pop by cutting it out completely — and then trying to time-stretch the surrounding audio to fill the gap.”

“Sometimes I'll cut out the very, very peak of that plosive and crossfade the rest in,” Rosado says, “and it will be enough to soften it up. You can't take too much out, because obviously then there will be a gap. But if there's enough happening in the song, you won't miss it.”

Stasium has another trick that he uses for pops: “I also will use a highpass filter, and put it in there with the computer just for that word. I'll automate in the highpass filter. Just roll everything off under 100 Hz or 120 Hz or even 150 Hz during the pop. It's so brief that you don't notice the filter coming in and out.”

Be sure to make a duplicate copy of the audio file in question before you start drawing in waveforms. In many digital audio sequencers, pencil edits (or their equivalents) are destructive.


The subject of vocal editing is too broad to be completely covered in one article. (See Web Clip 2 to learn what the six experts said about ambient effects treatments for vocals.) But hopefully, this look at the techniques and opinions of these engineer-producers will give you insights, inspiration, and plenty of food for thought regarding your own vocal-editing sessions.

Mike Levine is an EM senior editor. Thanks to all the interviewees for their time and cooperation, and to Michael Cooper for additional technical assistance.


  • After comping, smooth out vocal levels using automation or offline plug-ins. This will make mixing easier and allow vocals to be compressed more evenly, which will keep the tone more uniform.
  • Use pitch correction sparingly, only correcting words or notes that are out of tune. Avoid correcting an entire track unless you're going for that “down-the-middle” sound.
  • Consider cutting out low frequencies below the vocal's range to reduce muddiness in the mix.
  • If the vocal track needs more air, try gently boosting the high frequencies (usually between 18 and 20 kHz).
  • Get rid of sibilance using a de-esser plug-in or by lowering the volume of ess sounds with volume automation (or both).
  • For clicks and plosive pops, try finding a word on another take to replace the offending one. If you can't, try zooming in and correcting with the pencil tool, or actually cutting out the offending sound and possibly time-stretching the file to fill the gap.
  • If breaths sound unnaturally loud, reduce their volume with automation or cut them out where appropriate. Cut out breaths on background vocals. Be careful of your compressor's tendency to pull up breath sounds.

When doing a close edit on a vocal (or any other type of audio track), you have a number of options you can try to make the transition sound as natural as possible.

  1. Experiment with lengthening one side of the edit and shortening the other. (Adjust while looping the audio if possible.)
  2. Zoom to the sample level and make the edit at a zero-crossing.
  3. Crossfade at the edit point.
  4. Fade out on the left side and fade in on the right (instead of using a crossfade).
  5. Move one side of the edit to a separate track and overlap it slightly with the original track.