Vocal Tools in the Virtual World: With Today's DAWs, the Sky's the Limit on Processing the Human Voice

As much as your lead guitarist might not want to admit it, it’s usually the vocals that provide a musical focus. In fact it’s been said that the human voice is the ultimate instrument, but it’s also an imperfect one: Unlike the frets that guide you through a guitar or the keyboard keys that constrain you to perfect half-steps, when it comes to making a great vocal sound, you’re on your own.
Or are you? Today’s tools for vocalists are far more evolved than the equalizers and plate reverbs used when dinosaurs roamed the earth: You can change pitch, change gender, change timbre, fix clams, enhance vibrato, add harmonies, and a whole lot more.

We’ve collected a bunch of information on vocal helpers, and also reviewed four software tools that stand out from the pack, to help you work your way through the terrain of recording vocals in the new century. So grab that mic, plug it in, and as it says in the tune immortalized by Benny Goodman’s band . . . “Sing, Sing, Sing!”


All pitch correction tools work pretty similarly: They analyze a note’s pitch, compare it to a scale, then apply the appropriate amount of pitch-shifting to have the note conform to the scale. Nor does this have to be a traditional scale; for example, both Waves’ Tune and Antares’ Auto-Tune Evo let you specify not only standard scales but also highly exotic ones.
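As a rough illustration of that analyze/compare/shift loop, here is a minimal Python sketch. It assumes the pitch detector has already produced a frequency in Hz, that A4 is tuned to 440 Hz, and that the target is a major scale; the function names are hypothetical and are not taken from any of the products mentioned here.

```python
import math

A4_HZ = 440.0  # assumed reference tuning
MAJOR_SCALE = (0, 2, 4, 5, 7, 9, 11)  # semitone offsets from the root


def hz_to_midi(freq_hz):
    """Convert a frequency to a fractional MIDI note number (A4 = 69)."""
    return 69.0 + 12.0 * math.log2(freq_hz / A4_HZ)


def nearest_scale_note(midi_note, root=0, scale=MAJOR_SCALE):
    """Return the scale note (integer MIDI number) closest to midi_note."""
    allowed = {(root + step) % 12 for step in scale}
    center = round(midi_note)
    # Search one tritone either side; every scale has a note in that window.
    candidates = [n for n in range(center - 6, center + 7) if n % 12 in allowed]
    return min(candidates, key=lambda n: abs(n - midi_note))


def correction_in_cents(freq_hz, root=0, scale=MAJOR_SCALE):
    """How far (in cents) the input must be shifted to land on the scale."""
    sung = hz_to_midi(freq_hz)
    target = nearest_scale_note(sung, root, scale)
    return 100.0 * (target - sung)
```

For example, a slightly flat A at 430 Hz in C major comes back as needing a shift of roughly +40 cents, while a spot-on 440 Hz needs none. Swapping in a different tuple for `scale` is all it takes to model an exotic scale.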

You often determine how a device applies this correction. For example, you might want to have the input jump instantly to the new pitch, or slide to it over time instead. There may be various bells and whistles, such as being able to accentuate vibrato, “flatten” the natural vibrato and synthesize a new vibrato, change the formant (vocal characteristics, such as transforming male to female), and the like.
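The "jump instantly or slide over time" choice can be pictured with a small Python sketch. This is a simplified linear ramp on per-frame pitch values, written purely for illustration; the `retune_frames` parameter is a hypothetical stand-in for a retune-speed control, and real products may shape the glide quite differently.

```python
def apply_retune(detected, target, retune_frames):
    """Glide each frame's pitch toward the target note.

    detected: per-frame pitch values (fractional MIDI note numbers)
    target: the scale note to correct toward
    retune_frames: 0 = jump instantly; larger values = slower slide
    """
    corrected = []
    applied = 0.0  # fraction of the full correction applied so far
    for pitch in detected:
        if retune_frames <= 0:
            applied = 1.0
        else:
            applied = min(1.0, applied + 1.0 / retune_frames)
        corrected.append(pitch + applied * (target - pitch))
    return corrected
```

With `retune_frames=0` every frame lands on the target immediately (the hard, robotic sound); with a larger value the correction eases in and the start of the note keeps more of its natural scoop.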

Furthermore, in most correction devices you can turn off “auto-pilot” and see a display that indicates how far off the note is from the scale. Then you can adjust the note for the desired degree of accuracy. As this usually relies on converting notes to MIDI data, manipulating vocals is like manipulating MIDI: You can move the note, transpose it, alter pitch, quantize, and so on. In fact, Celemony Melodyne, Antares Auto-Tune Evo, Waves Tune, and Sonar V-Vocal allow pulling out the MIDI file so you can use it to drive something other than voice—want to sing a trumpet part, then have it played by a trumpet sound? Yes, you can. Celemony’s Melodyne cre8 even installs a synth when you install the program, just to get the point across. (Those with long memories might recall that Opcode’s StudioVision included audio-to-MIDI conversion, but it lacked the sophisticated pitch and formant correction we associate with today’s tools.) Sometimes MIDI works both ways, too, where MIDI input can constrain notes within a scale you play.

Should you use correction to make sure all pitches are “perfect”? No. Not everyone wants to sing exactly on key; hitting a seventh note just a tiny bit flat before swooping up to the tonic can put the “tension” into “tension and release.” Besides, processing a vocal note can produce audible artifacts, and the more you shift, the more audible these artifacts become. Finally, the human ear expects vocals to have slight inconsistencies, and removing those can give vocals an artificial feel. While this works for certain types of music, it’s hard to imagine a sensitive jazz ballad given this kind of treatment.


There are two basic types of harmonization plug-ins. The simpler ones, such as Waves’ UltraPitch, do straight pitch transposition (actually, UltraPitch can generate multiple parallel transposed lines, but it doesn’t generate “intelligent” harmonies). With slight pitch variations, these plug-ins can also produce grandiose chorusing and thickening effects.

The second type breaks audio down into events that you can edit individually. The granddaddy of this genre is Celemony’s Melodyne (Example 1); like all other harmonizing programs so far, it’s restricted to working with monophonic lines, but a polyphonic version is in the pipeline. Originally, Melodyne was a stand-alone program, but it later added a “bridge” program that allowed treating it more like a plug-in for host applications. It’s also possible to ReWire Melodyne into a host, or ReWire clients into Melodyne’s mixer. This is where Melodyne’s ability to act as a stand-alone recorder comes in really handy; for example, if you ReWire Melodyne and Propellerhead’s Reason together, you can record vocals in Melodyne that run in parallel with Reason, but also do all the cool Melodyne editing tricks.

Waves’ Tune works by ReWiring into a host; all edits are non-destructive and saved as part of your host’s project, whereas Antares Auto-Tune Evo is a true plug-in.

However, your DAW might already have some of those tools built in. For example, Cakewalk Sonar appropriated Roland’s VariPhrase technology in the V-Vocal processor (Example 2) bundled with Sonar Producer Edition. Unlike with plug-ins, you first turn a clip into a V-Vocal Clip, at which point the various V-Vocal options for timing, pitch, dynamics, and formant manipulation come into play. As expected, you can render the V-Vocal clip back into a standard audio clip that includes whatever changes you made.

MOTU’s Digital Performer has sophisticated pitch correction options as well (Example 3), as detailed in the 02/08 issue’s Power App Alley. However, it also lets you draw in pitch changes very simply with a pencil tool, so it’s easy to make quick fixes without having to go deeper into the program. Magix Samplitude’s “elastic” window (Example 4) allows for pitch manipulation, again by drawing in new curves.

While all these types of programs have many similarities, they also have differences. For example, Waves Tune (Example 5) handles vibrato extremely elegantly: It highlights vibrato in pink, which you can increase or decrease (or add an attack). And you can substitute synth vibrato instead, with a choice of waveforms as well as depth, predelay, attack, and rate. Like MIDI sequencers, Tune also has a “glue” tool for connecting separate notes, and a scissors tool for splitting notes into multiple notes.


You say you don’t have a harmony synthesizer? During mixdown, everyone decided the lead vocal could use a harmony line, but the singer is on tour in Ulan Bator? You can sing harmonies, but you want them to have a bit of a different timbral quality?

All is not lost, if your DAW has decent pitch-stretching algorithms. However, it’s crucial that you can change pitch without changing duration; with programs that do “Acidized” looping (Sony Acid, Cakewalk Sonar), upon looping a clip you can change pitch without changing length. Ableton Live can do this as well, albeit with a different process.

With programs that use DSP algorithms for pitch-shifting, there will usually be some kind of check box called “Preserve Duration” or “Time Correction” for when you stretch pitch. Make sure this is enabled.
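To see why that check box matters, here is a small Python sketch of the arithmetic. It assumes equal temperament, and it assumes that a shift without time correction behaves like simple resampling (tape-style varispeed); the function names are illustrative only.

```python
def pitch_ratio(semitones):
    """Frequency ratio for a shift of the given number of semitones."""
    return 2.0 ** (semitones / 12.0)


def resampled_duration(seconds, semitones):
    """Clip length after a plain resampling shift with no time correction."""
    return seconds / pitch_ratio(semitones)
```

Under those assumptions, a 10-second phrase shifted up a major third (+4 semitones) would shrink to about 7.9 seconds, and shifted up an octave it would last only 5 seconds, so the harmony would drift out of sync with the lead almost immediately.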

Here’s how to add harmonies with Pro Tools 7 LE; the procedure is similar for other hosts.

1. In Edit view, right-click in the track header of the track you want to harmonize, and select “Duplicate.” In the dialog box that appears, enter 2 for the Number of Duplicates.

2. Click on the first copy to select it, and go to AudioSuite > Pitch Shift. Enter +4 for “Coarse” to create a major 3rd harmony.

3. Make sure “Time Correction” is checked so that the clip duration doesn’t change.

4. Click on “Process.”

5. Click on the second copy to select it, and go to AudioSuite > Pitch Shift. This time, enter +3 for “Coarse” to create a minor 3rd harmony and again, make sure “Time Correction” is checked.

6. Solo the original track and major 3rd harmony. Note which parts sound right together; cut away those pieces of the harmony track that clash with the main vocal because they should have a minor 3rd harmony.

7. Solo the original track and minor 3rd harmony. Cut out any sections of the minor third harmony where there’s already a major third harmony.

8. Continue editing, and keep as much or as little harmonization as you like.

That does it (Example 6). However, with standard DSP-based pitch shifting the harmonies seldom track formant changes, so the greater the amount of pitch shifting, the more unrealistic the sound. Therefore, you’ll probably want to mix the harmonies in the background, and apply some reverb (low diffusion settings work well, as that “thins out” the reverb a bit, and keeps the vocals from “stepping on” other parts).

Also, you’ll need to cut the phrases with a fair amount of precision, as you want to cut in the spaces between words. Zoom in as far as necessary; it helps to turn off snap.

Finally, because you’re shifting the vocals up in pitch, they’ll have more high frequency content. Trim back the highs slightly to “warm up” the vocals.

Although this method will never replace doing harmonies with “real” voices, for quick harmonies—or for those cases where the synthesized harmonies add a special effect—this is a very useful technique.
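The by-ear choice in steps 6 and 7 has a simple music-theory basis: in a given key, some melody notes take a major third above (+4 semitones) and others a minor third (+3). The Python sketch below is only a planning aid under the assumption that the song stays in one major key; the function name and defaults are hypothetical, and your ears still get the final vote.

```python
MAJOR_SCALE = (0, 2, 4, 5, 7, 9, 11)  # semitone offsets from the key's root


def third_above(midi_note, key_root=0, scale=MAJOR_SCALE):
    """Semitone shift (3 or 4) for an in-key third above midi_note.

    Returns None if the note is not in the scale, in which case
    neither copy is guaranteed to fit and you decide by ear.
    """
    offset = (midi_note - key_root) % 12
    if offset not in scale:
        return None
    degree = scale.index(offset)
    upper = scale[(degree + 2) % len(scale)]  # two scale steps up
    return (upper - offset) % 12
```

In C major, for instance, C, F, and G come back as 4 (keep the +4 copy), while D, E, A, and B come back as 3 (keep the +3 copy).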


While software is convenient if you’re working “in the box,” there are plenty of hardware vocal tools available. Two DigiTech vocal processors, the V300 and VX400, model a variety of vocal effects (and even include an expression pedal), but DigiTech’s Big Deal is the Vocalist Live line: VL2, VL4, and VL Pro. What makes them cool is that if you plug in a guitar (VL2/4) or use guitar or keyboard (VL Pro), the box analyzes the key and chord progression, and generates harmonies automatically. You don’t tell it what root and scale you’re using, because your instrument tells it.

But are they really for live use only? I’ve run mic channels to the Vocalist and sent recorded tracks (including guitar, bass, or keyboard) into the Guitar input, and as long as I got the levels correct, everything worked properly. Of course, you have the additional latency of going out through an interface and coming back in again, but that would be true of any “outside of the box” processor.

TC-Helicon also sells pedals that, while billed for live performance, work very well in the studio. Create is basically a vocal strip on steroids, with reverb, modulation effects, delays, and the like. It’s a good way to hit your recorder with an enhanced vocal sound. Double does what you’d suspect, although “Quadruple” would be more accurate, as it can generate four “virtual overdubs.” Correct is very versatile, as it does compression, EQ, de-essing, and pitch correction; most interestingly, it has a function where it analyzes your vocal and applies EQ and dynamics as necessary to bring out what Correct thinks is the best tone. Sometimes it hits it exactly, sometimes it needs a little tweak, but the concept is great.

In addition to these processors, TC-Helicon makes boxes with similar functionality to the DigiTech VL line: Harmony-G (for guitar) and Harmony-M (for MIDI keyboards), as well as more pro-oriented rack units, like VoiceWorks, VoiceWorks Plus, Voice Doubler, and Voice Pro.


For some people, Antares’ Auto-Tune (like MIDI quantization, Beat Detective, single-sample editing, and other “technique helpers”) is cited as a primary cause of the decline and fall of music as we know it. But remember, machines don’t kill music: People do.

You gotta feel a little bit sorry for Antares (no, not because the Spice Girls broke up; I’ve heard from unreliable sources that they bought a lot of Auto-Tunes). When someone applies Auto-Tune intelligently, using it to fix a few errant notes that would have otherwise marred a perfect vocal performance, no one can tell Auto-Tune is in use, so it doesn’t get the credit. But when someone uses Auto-Tune like a sledgehammer to convert a vocal part into a steaming pile of phonemes, Auto-Tune gets the blame. You can’t win.

Unfortunately, some people reach for Auto-Tune as a quick fix rather than a last resort. Why not punch in, or let the singer warm up more? Worse yet, what if the reason for the pitch issues is that there simply isn’t enough vocal in the headphones? If you deal with fundamental problems first, you’ll end up with a better vocal, and one that requires only light pitch correction.

But let’s not forget that pitch correction can also be a cool effect. Sure, the Cher “Believe” thing is a bit old, but on the track “Jaded Love” (by Trona), the singer kept the “real” vocal up front, but doubled it—and then applied pitch correction to only the doubled track. It’s one of the coolest double-tracked vocal effects I’ve heard.

The most important point in using pitch correction is this: Don’t correct based on what you see in the lovely GUI that shows the “real” notes not matching up with the “perfect” notes. Correct based on what you hear. If a note sounds wrong, fix it. If it doesn’t sound wrong, leave it alone.



If any company has made a name for itself with vocal software processing, it’s Antares. From the original Auto-Tune through Kantos to AVOX to Harmony Engine ($349), the company has come up with consistently interesting vocal tools, and you can download demo versions that are fully functional for 10 days. Translation: just long enough to get you hooked.

What it does: Harmony Engine (RTAS, VST, AU; see Figure 1) provides up to four voices of harmonies, and these are “intelligent”—not parallel—harmonies that follow a specified scale. How you specify that scale, though, is where things get interesting. You can simply provide a root note and scale for simple harmonies, play the four harmony voices as if your vocal was recorded in a sampler and you were playing the samples, have Harmony Engine listen to a MIDI track that provides a “chord guide” (like the way DigiTech’s VL-series processors “listen” to a guitar to figure out the harmony), or program a chord progression.

That’s the basic idea, but within the way big GUI (I think the designer may have been paid by the pixel) there is a wealth of other tools and effects. Each of the four voices has pan, interval, level, solo, and mute controls, as well as vibrato. But there’s also a “throat length” parameter that can change the harmony line’s character, from (for lack of a better description) looser to more constricted. You don’t have to include the input signal in the mix; it can be harmonies only.

What’s cool: For the harmonies you’re most likely to use, such as thirds, the sound quality may not stand up to the scrutiny of the solo button, but when blended in behind the main vocal, they sound exceptionally convincing. I was even able to add an octave-above “female” voice that was more than credible (editing the throat length improved the realism; try 0.90, it worked for me). Again, I wouldn’t promote the harmony into a lead vocal, but it worked fine as a background voice. Octave lower effects are harder to pull off, but mixing them considerably in the background gave my voice a nice degree of gravitas. (Or perhaps a timbre more like an NFL linebacker, come to think of it.)

Other thumbs-up features are humanizing options that add variety (pitch, timing, glide, and natural vibrato keep/suppress), a “freeze” effect, and the ability to store 15 different harmony presets that you can call up via automation—when the song modulates, you needn’t panic. You also have multiple outputs if you want to run the vocals through different mixer channels and processing (e.g., reverb), assuming your host supports this. Even cooler: the “Chord By Name” mode where you simply program a song’s chord progression, then use the Register and Spread controls to “arrange” the harmonies by ear. Easy.

Limitations: Harmony Engine is definitely a “garbage in—garbage out” device, and the cleaner the vocal, the better the tracking. It’s important to select the right vocal range, and there’s a tracking control labeled with Trial at the top and Error at the bottom. The labeling is correct: I couldn’t figure out what it was doing, but some settings worked better than others. In extreme cases, you can copy the vocal track, clean it up with de-essing, EQ, compression, etc., process it through the Harmony Engine, and mix it in with the original track. Also, several programs aren’t officially supported—Ableton Live being one of them. So of course I had to try it out, and to my delight, it worked just fine.

Bottom Line: Despite what may appear to be a daunting interface, Harmony Engine is actually quite easy to use. The well-written manual is also very helpful in terms of not just understanding, but applying, the feature set. This harmony tool is no one-trick pony, especially if you feed it with MIDI to change presets and such; I also found that just tuning to unison and adding humanizing could enhance vocals considerably. Harmony Engine does lots of things, does them well, and is reasonably priced . . . it’s tough to beat.



There are plenty of fine hardware/software combos, including Universal Audio’s UAD-2 and TC’s PowerCore. Duende ($399) is SSL’s answer to the concept, and the Vocalstrip runs on both the standard Duende and the less expensive Duende Mini (both support VST, AU, and RTAS).

What it does: Vocalstrip (Figure 2) is a mono plug-in with four vocal-specific sub-sections. The De-Esser has Threshold and Amount controls, with an indicator to show when de-essing takes place. The best feature, though, is the Aud button so you can hear what’s being removed; this really simplifies the adjustment process. A complementary De-Ploser takes out low-frequency plosives, like “P” and “B” sounds, and has an identical complement of controls (although of course, they are tailored for different frequencies).

The EQ includes a low shelf with a slight bump at its cutoff, a hi-Q bandpass/notch filter with 12dB boost/36dB attenuation, and a low-Q high band EQ for adding “air” or intelligibility. The EQ is more limited than a standard parametric (there’s no resonance control for any of the bands), but its virtue is that the filters themselves, and their characteristics, are optimized for finding good vocal sounds as quickly as possible. (If you need a more traditional EQ, the Channel Strip that’s bundled with the Duende hardware does the job.)

The Compander is also vocal-friendly, as it first has an expander to help reduce mic pre hiss, room ambience, low-level mouth sounds, and the like before heading into the compressor, which has the expected controls: Ratio, Threshold, Release, Attack, and Makeup Gain. You also have a choice of hard or soft knee curve.
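The expander-then-compressor arrangement is easier to picture as a static input/output curve. The Python sketch below is a generic hard-knee model with made-up default values; it ignores attack and release entirely and is not a description of how SSL implements Vocalstrip.

```python
def compander_output_db(level_db, exp_threshold=-50.0, exp_ratio=2.0,
                        comp_threshold=-20.0, comp_ratio=4.0, makeup_db=0.0):
    """Static hard-knee curve: output level (dB) for an input level (dB).

    Assumes exp_threshold is below comp_threshold, so running the expander
    first and the compressor second reduces to this piecewise curve.
    """
    out = level_db
    if level_db < exp_threshold:
        # Downward expansion: push hiss and room tone further down.
        out = exp_threshold + (level_db - exp_threshold) * exp_ratio
    elif level_db > comp_threshold:
        # Compression: reduce how far peaks rise above the threshold.
        out = comp_threshold + (level_db - comp_threshold) / comp_ratio
    return out + makeup_db
```

With these illustrative settings, hiss at -60 dB drops to -70 dB, a phrase at -30 dB passes through unchanged, and a peak at -8 dB is pulled down to -17 dB before any makeup gain.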

Cool stuff: The Drive button can definitely add “character” to the voice. This seems to be a saturation stage after the Makeup Gain control, where increasing the gain increases an overdrive-type of distortion. Another interesting feature is you can put the various modules in any order. For example, if the compressor “pops” because you want a really squashed sound, you can put the De-Ploser after it instead of the more traditional pre-compressor position.

But one feature that really differentiates Vocalstrip is the set of displays that monitor the signal. The EQ curves are accurate, while an FFT display can monitor the input (to see if there are anomalies that need to be tamed) or output, which reveals the effect of the EQ. As soon as you touch a compander control, the display changes to show the typical compression curve display on the left, while the right half shows an I/O Difference Display—in short, it makes it easy to see how compression affects the signal.

Of course, you don’t adjust controls with your eyes, but with your ears. But in keeping with Vocalstrip’s “get it done fast” attitude, these visual displays help you zero in faster on optimum sounds.

Limitations: It’s only mono, which makes sense if you’re dealing with vocals—but if you want to do something like use it on a stereo choir track, you’ll lose the imaging. I also wish the EQ were more flexible; although I was able to get the sounds I wanted, I would like a bandwidth control on the mid EQ and the ability to change the low EQ between bandpass and shelf. And it probably goes without saying that you have to have the Duende host hardware.

Bottom line: Vocalstrip’s main competition is actually the EQ and Dynamics Channel Strip that’s bundled with Duende, because it performs very well with vocals and the sidechaining means you can set up de-essing and plosive reduction (albeit with more effort than Vocalstrip). It doesn’t surprise me that quite a few Duende users with limited budgets seem to go for the Drumstrip, Bus Compressor, X-Comp, and X-EQ plug-ins first; but there’s no denying that the Vocalstrip brings a lot to the Duende platform, and also, that it’s an extremely fast way to get solid vocal sounds.



TC’s PowerCore platform supports a lot of plug-ins, some of which are optional at extra cost, but VoiceStrip (Figure 3) comes with most PowerCore bundles and works with VST, AU, and RTAS.

Like any vocal strip, one of the questions is “why bother?” After all, it’s not hard to cobble together a compressor, EQ, and de-esser. However, a strip is convenient because one preset recalls all parameters at once. When you work with different vocalists and different mics, it saves time to be able to jump to an appropriate preset without a lot of fuss.

What it does: As with Duende, the EQ is tailored for voice with Lo, Mid, and High gain; only the Lo and Mid stages have frequency controls. There’s also a Saturation switch, and a low cut filter. While the low cut isn’t as effective as the Duende De-Ploser, it helps considerably in reducing unwanted low-frequency energy.

The compressor has Input Drive (there’s no threshold control; slamming the input harder gives more compression), Output Drive (makeup gain), Ratio, Attack, Release, and a pre-EQ/post-EQ switch for positioning, while the De-Esser serves up Threshold and Frequency controls, along with a monitor so you can audition the sidechain signal coming into the De-Esser. The final piece of the puzzle is a noise gate, with Threshold and Intensity (amount of reduction when gated) controls.

The EQ and Compressor are claimed to model tube circuits, and I can hear that vibe. Comparing it with the Duende Vocalstrip definitely reveals some differences: Overall, I’d say TC’s sound is more precise and neutral, while Duende isn’t shy about imparting its own character to the sound. Which one is better-suited to a particular task is a subjective call, as it depends mostly on the mic and vocalist. For narration and vocals where detail is important, the scale tips more toward TC. When having a vocal cut through a mix is the overriding concern, I think you’d get there faster with Duende.

Cool stuff: VoiceStrip is no slouch when it comes to metering. There are reduction meters for the Compressor, De-Esser, and Gate, as well as good input/output meters. And, VoiceStrip is available in both mono and stereo configurations, so you can use it to process those massed background vocals you bused down to two tracks. The Soft Sat saturation option is welcome too, as it can add some grit and funk to potentially clinical sounds.

Limitations: The EQ bands not only have no resonance controls, the high end frequency is fixed. This seems like a situation where you’re expected to get the sound you want with the mic and mic placement (which you should anyway!), and the EQ serves mainly as a way to “touch up” the sound.

Bottom line: PowerCore’s bundling of multiple desirable plug-ins has been a strong point of theirs, as you get more than just the hardware when you take the box home. As expected, the optional-at-extra-cost ones are where you’ll find the more complex plugs, but VoiceStrip need make no apologies—it’s eminently useful, and the price is right.



TC-Helicon makes three plug-ins for PowerCore: Harmony4 (four line parallel harmony generator), Intonator HS (realtime pitch corrector), and VoiceModeler (Figure 4)—which is definitely my favorite. After all, when it comes to harmonies, Harmony4 seems somewhat dated compared to hardware like the DigiTech VL4 and software like Antares’ Harmony Engine; and Intonator HS works as advertised, so if you have a PowerCore and need realtime pitch correction, you’re set. But VoiceModeler ($249) is a very creative tool that can transform voices in novel ways. It reminds me a bit of Antares’ AVOX, but more streamlined.

What it does: The heart of VoiceModeler is the Effects Section, which lets you impart qualities like Breath (good for adding a diaphanous quality to choir sounds), Resonance (sharp, filtering-type effects that change timbre), Vibrato, Growl (good for sounding like you blew your voice out, so you don’t have to), Spectral (changes character), and Inflection (adds pitch-shifting effects, like scooping up to a note). Each of these offers various “Styles” (typically a couple dozen, each of which sounds seriously different), while a slider controls the amount of the added effect.

A Modulation section can affect the Resonance and Spectral effects with different LFO amounts (although not different parameter values for the two effects). There’s also sync-to-tempo for the LFO, and extra credit for including a random waveform option, which is very useful.

Cool stuff: As you might imagine, there are a lot of options here, and when you find the right combination (believe me, it’s also possible to find wrong combinations!), you’ll want to save it. Fortunately, there’s a convenient preset architecture that allows for easy choice of defaults and the creation of sub-folders, as well as A-B comparisons so you can decide if that edited version really is better before you save it.

Another cool feature, albeit in eye candy-land, is the VoiceModeler display. This has several “lanes” that show the amount of signal contributed by the various effects.

Limitations: There’s no way to get rid of the original vocal sound, which is too bad—sometimes I’d like to hear just the process itself. I tried putting another dry track in parallel, switching it out of phase, and hoping it would cancel out the original vocal component; nope.

Bottom line: Before you actually use this for a mission critical project, experiment with it and get all the “funny sound” experiments out of your system. While you can obtain some novel effects by slamming the Effect sliders all the way to the right (“Hey! I’ve made Dexter sound like a girl!”), some of the best uses of VoiceModeler are subtle ones that add just a little bit of “shading” to the voice. Maybe I’m just a sucker for things that work simply and elegantly, but overall, if you have a PowerCore system and work with vocals, I think you’ll find this an exceptionally useful tool.










