The same gear used to record your music is well suited for audio postproduction for commercials, documentaries, animation, and narrative projects. I recently finished editing and sweetening the sound for an award-winning indie feature, The Craving Heart, and in this article I will share tips gleaned from the experience to provide insight into the process of working with sound for picture.
Getting the Job
The gig came through my colleagues Douglas Spotted Eagle and Mannie Francis at VASST (www.vasst.com), who oversaw postproduction duties for the film. The project arrived in its raw form as a nonlinear editing (NLE) file, and the various media files accompanied it on a single external hard drive. Director-editor Stan Harrington had used Sony Vegas 6 for video editing as well as for placing the initial music and effects (see the sidebar “Stan the Man” online at www.emusician.com). Because I use the same software, I simply plugged in the drive and watched the movie directly from the Vegas timeline.
Although the film was shot in the high-definition HDV format, I worked from a proxy DV version for smoother real-time playback. HDV is difficult for a computer to play in real time when video effects, such as color correction, are involved: performance may drop well below 29.97 frames per second (fps). For sound-design elements to sync properly, the video playback must be rock solid. When the project was finished, my work was married to the full-resolution cut of the film.
This was a long-distance project: I was in Chicago and the director was in Hollywood. To handle work-in-progress approvals, I emailed Harrington my Vegas project files. That worked because we both had the same media (my drive was a copy of his). I still needed to send my sound files, but I avoided using audio plug-ins the director lacked. With a three-week deadline looming, I put in everything I felt was needed and deleted whatever the director disagreed with, rather than wasting time getting approval for every little detail. This approach ultimately saved time and effort.
The soundtrack I received had all the dialog and a few key sound effects. The music cues — including songs by Katy J (see the sidebar “Scoring Craving” online), an instrumental by Douglas Spotted Eagle, and a sparse underscore by the director himself — were present, as were some crucial sound-design elements. My jobs included cleaning up dialog, building convincing backgrounds, finding and adding missing sound effects, sweetening elements, and creating a mix. Fortunately, the film was full of sound possibilities, and my head was swimming with creative ideas.
Always clone the project's hard drive and work on the copy. Regularly back up your changes to the original drive as well as to an off-site location, such as an online storage facility.
Being organized from the beginning is important; otherwise, you'll waste time hunting for missing elements in a highly complex project. For example, Craving had over 520 media files totaling more than 200 GB of data. With a project of that magnitude, you have to develop a methodical workflow and be thoroughly organized on the hard drive and the timeline.
The editor kept his audio tracks to a minimum by placing different types of sounds on the same tracks. However, mingling dialog, music, and sound effects never works because each type of element requires different approaches to EQ, volume, and panning.
FIG. 1: Buses work for premixing similar sound elements into stems and for applying the same audio effects to multiple tracks.
Starting with the original 7 tracks, I moved dialog, sound effects, underscore, and Katy J's songs to dedicated tracks, which brought the track count to 20. As a precaution, I worked on duplicates of the tracks but kept and muted all the originals, moving them to the bottom of my Vegas timeline. I also locked the video into place so it couldn't be inadvertently nudged.
The eventual track count topped 40. I also added buses and routed the tracks to them accordingly (see Fig. 1). Three of these buses were dedicated to the primary soundtrack elements of dialog, music, and effects (also known as DM&E); their premixed outputs are called stems.
There are two important things to remember about audio postproduction. One: dialog rules — what the actors say is always the main focus and what the entire soundtrack should be built around. Two: it's the mix that matters — just like in music production, individual elements may sound horrible on their own but may work within the context of a mix.
Always start with the dialog, because it takes the most time and is the most tedious and noncreative part of a project. We live in a noisy world, and much of that noise gets into dialog recordings. Although audiences can tune out steady noise, when unwanted sounds jump around, they become noticeable. These jumps in presence require extensive smoothing if the dialog track is to sound right.
Noise gates are useless for dialog work because their opening and closing action is too noticeable. Expanders are better for reducing softer noises without completely cutting them off, resulting in a more natural sound. To the dialog bus I added an expander set for 12 to 15 dB of expansion, which was effective 90 percent of the time. Manually drawing volume automation envelopes and using short fades also helped. I prefer manual volume adjustments because they make the dialog sound smoother.
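The downward-expansion idea can be sketched in a few lines of Python. This is a minimal static gain curve, not the actual algorithm of any particular plug-in, and it ignores attack/release smoothing; the threshold and ratio values are illustrative rather than the article's exact settings:

```python
import numpy as np

def expander_gain_db(level_db, threshold_db=-45.0, ratio=2.0):
    """Downward expander: signals below the threshold are attenuated
    further, so soft noise sinks while dialog above threshold passes
    untouched (unlike a gate, which cuts it off abruptly)."""
    level_db = np.asarray(level_db, dtype=float)
    below = level_db < threshold_db
    # At a 2:1 ratio, each dB below the threshold becomes 2 dB below it,
    # i.e. 1 extra dB of attenuation per dB under the threshold.
    gain = np.where(below, (level_db - threshold_db) * (ratio - 1.0), 0.0)
    return gain  # dB of gain applied (0 or negative)

# A -60 dB noise floor is 15 dB under the threshold, so it is pushed
# down a further 15 dB; a -30 dB dialog level is left alone.
print(expander_gain_db(-60.0))  # prints -15.0
print(expander_gain_db(-30.0))  # prints 0.0
```

Because the attenuation is proportional rather than all-or-nothing, soft noise recedes without the pumping on-off action that makes gates unusable on dialog.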
EQ can also be used to clean up noisy dialog tracks. Use a highpass filter with a 24 dB-per-octave rolloff set at around 100 Hz. An equally aggressive lowpass filter set between 12 and 15 kHz is also useful. You will improve speech intelligibility with an EQ bump in the range of the consonants, around 2.5 kHz. Use a slight EQ cut at 600 Hz to overcome the muffled sound of lavalier mics hidden under clothing.
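The presence bump and lav-mic mud cut are both peaking (bell) filters, and a sketch of their coefficients is easy to write down using the well-known Audio EQ Cookbook biquad formulas. This is an illustrative design at assumed gain and Q values, not the settings used on the film; the highpass and lowpass filters mentioned above would come from the same cookbook:

```python
import numpy as np

def peaking_biquad(f0, gain_db, q, fs=48000):
    """Peaking-EQ biquad coefficients (b, a) per the Audio EQ Cookbook."""
    A = 10 ** (gain_db / 40.0)
    w0 = 2 * np.pi * f0 / fs
    alpha = np.sin(w0) / (2 * q)
    b = np.array([1 + alpha * A, -2 * np.cos(w0), 1 - alpha * A])
    a = np.array([1 + alpha / A, -2 * np.cos(w0), 1 - alpha / A])
    return b / a[0], a / a[0]

def mag_db(b, a, f, fs=48000):
    """Magnitude response of a biquad at frequency f, in dB."""
    z = np.exp(-2j * np.pi * f / fs * np.arange(3))
    return 20 * np.log10(abs(np.dot(b, z) / np.dot(a, z)))

# Consonant-intelligibility bump and lav-mic mud cut (example values).
bump_b, bump_a = peaking_biquad(2500, +3.0, 1.0)  # ~+3 dB at 2.5 kHz
cut_b, cut_a = peaking_biquad(600, -3.0, 1.0)     # ~-3 dB at 600 Hz
```

A handy property of this design is that the response at the center frequency equals the requested gain exactly, which makes the filters easy to verify.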
A dedicated noise-reduction tool is a must for serious audio postproduction work, and Sony's Noise Reduction 2.0 plug-in is my go-to tool. Using too much of this type of effect can make a track sound swirly and artificial. You can avoid unwanted artifacts, however, by making several passes on the noisy track and dialing in just a few decibels of noise reduction each time.
Many editors cut the picture and dialog at the same place. Unfortunately, any change in the audio gets magnified when it occurs simultaneously with a picture edit. Good dialog editors overcome this by using split edits, or J and L cuts, along with crossfades. A J cut places the dialog edit before the picture edit. An L cut edits the dialog after the picture and extends the noise from one shot into the next. Cutting on hard consonants and using crossfades will help hide edits too. Be aware that split edits require you to unlink the audio from its video file, and you risk losing lip sync if you move the audio.
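The crossfades that hide those dialog edits are simple to model. Here is a minimal equal-power crossfade sketch in Python (numpy assumed; the 10 ms fade length is just an example, and real editors apply the fade across overlapping clip handles):

```python
import numpy as np

def crossfade(a, b, fade_len):
    """Join two mono dialog clips with an equal-power crossfade over
    fade_len samples, so the splice point doesn't call attention to
    itself the way a hard cut would."""
    t = np.linspace(0.0, np.pi / 2, fade_len)
    out_gain, in_gain = np.cos(t), np.sin(t)  # equal-power fade curves
    head, a_tail = a[:-fade_len], a[-fade_len:]
    b_head, tail = b[:fade_len], b[fade_len:]
    blended = a_tail * out_gain + b_head * in_gain
    return np.concatenate([head, blended, tail])

# Two 1-second clips at 48 kHz joined with a 10 ms (480-sample) fade;
# the overlap shortens the total by fade_len samples.
a, b = np.ones(48000), np.ones(48000)
joined = crossfade(a, b, 480)
```

The cosine/sine pair keeps perceived loudness roughly constant through the overlap, which is why equal-power fades are the usual choice for joining unrelated room tones.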
Although cameras can record in stereo, dialog is usually tracked in mono. Harrington prudently used the camera as a 2-track recorder and isolated the actors on their own tracks. These recordings appear as stereo files on the timeline, with each actor panned hard left and right. I split the tracks and converted them to mono in order to move the dialog back to the center of the stereo field.
Unfortunately, the dialog tracks suffered from phase problems caused by the actors' lines bleeding into each other's mics. As a remedy, I isolated each of their tracks and intercut between them using volume automation.
The hardest problem to fix is distortion caused by overloaded recording levels. If there isn't an alternative take and you are unable to rerecord the distorted lines, you can take the edge off the harshness and make the track more tolerable by using EQ to cut 8 to 10 dB between 8 and 9 kHz.
Backgrounds and Room Tone
FIG. 2: Several stereo car-bys were layered and sent to a flanger and delay to emphasize a scene.
Every location has a sound, called room tone or presence, that gets recorded and is useful for filling in dialog gaps. Because there was no separate room tone included with the project, I had to steal it from ends of phrases and between words.
Background (BG), also called natural or nat sound, shouldn't be confused with room tone. Ambiences, such as traffic noise during a street scene, are usually built entirely in postproduction. Although I created some conventional backgrounds, I also used some nonliteral sounds, such as low rumbles, to underscore emotion and create moods.
With dialog occupying the center position in the stereo field, I gave the backgrounds a wide stereo image using iZotope Ozone 3 (www.izotope.com) to open up the mix. My favorite trick is to combine two mono recordings of different ambiences and hard-pan them. Another approach is to use duplicates of the same sound and hard-pan them, but with one copy offset by half its length.
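The second trick, duplicating a mono ambience and offsetting one copy by half its length, can be sketched in a few lines. In this illustrative version the offset copy wraps around so both channels stay the same length; in practice you would simply slide the duplicate event on the timeline:

```python
import numpy as np

def pseudo_stereo(mono):
    """Fake a wide ambience from a single mono background: hard-pan the
    original left and a half-length-offset copy right. Because the two
    channels are decorrelated, the result sounds wide, not centered."""
    mono = np.asarray(mono, dtype=float)
    offset = np.roll(mono, len(mono) // 2)   # copy offset by half its length
    return np.column_stack([mono, offset])   # column 0 = L, column 1 = R

amb = np.arange(8, dtype=float)              # stand-in for an ambience file
wide = pseudo_stereo(amb)
```

The same function also illustrates why the trick works: each ear hears the same material at different times, so nothing correlates strongly enough to pull the image to the center where the dialog lives.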
Although sound effects often get recorded on location along with dialog, they are usually not of high-enough quality for the finished movie. Rather, they serve as inspiration and help with synchronization. In this project, only a handful of the original effects remained, and the rest were replaced with better recordings.
You don't have to match a sound to every onscreen action. Focus on covering the obvious sound cues. Generally, when nobody is talking, I add more realism to a scene with sound effects, and I back them off during dialog.
I spent a day auditioning sound effects from my libraries for literal sounds, such as car-bys, footsteps, and doors closing. But I searched for nonliteral sound-design elements as well. Next, I used UltimateSoundBank's X-Treme FX (www.ultimatesoundbank.com) soft synth for layering and audio effects to generate more car-bys, thunder, rain, and general backgrounds. I recorded these MIDI performances — if holding down a key for 20 seconds can be considered a performance — using Sony Acid Pro 6 and then rendered the tracks to 24-bit, 48 kHz WAV files. (DV audio's sampling rate is 48 kHz.)
I also did some field recording with an M-Audio Microtrack 24/96 and a pair of inexpensive binaural mics from Core Sound (www.core-sound.com). I prefer grabbing backgrounds, such as traffic and restaurants, in stereo. I used a Marshall Electronics MXL DRK mic for recording close-up mono tracks, like car doors and footsteps.
By the end, I had amassed about a gigabyte of sound effects. Building such a complex sound toolbox takes time, but once it is in place, much of it can be reused. For example, once you build a location's ambience, you can use it again in subsequent scenes taking place at that same location.
Adding everything a scene needs to make it work can take a lot of time. For instance, I spent four hours tweaking minimal production dialog, footsteps, car doors opening and closing, clothes rustling, and a car pulling up, idling, and driving away. Nobody will notice the result, because to the casual viewer the scene simply sounds natural, as if it had been captured perfectly on location. Subtle details, such as a little bell chime whenever a door opened in the coffee shop, further enhanced scenes.
One of my favorite sound moments in the film shows a woman waiting in vain for another person while cars whiz by in front of her. I carefully selected stereo car-bys to match the screen action and checkerboarded them on several tracks. Though it was time-consuming to sync every sound element to every onscreen car, the effect was realistic, if slightly exaggerated.
These particular tracks were sent to a single bus, which had a flanger and stereo multitap delay in series, to add an otherworldly sound to the passing cars (see Fig. 2). The scene begins with normal sounds, and then gradually the effects increase as the wait grows longer (see Web Clip 1). There is also a dark, heavy drone to fill in the low-frequency area of the mix.
Another segment required extra effort: the director was unhappy with the original screams he had used for the final scene, so I replaced them with heavily processed and layered screams that made the scene far more effective (see Web Clip 2).
Music: the Director's Cut
With no budget for underscore, Harrington experimented with Sony Acid Xpress and was immediately hooked by how easy it was to come up with music that worked. “I'm no musician, but I had a good feel for the emotion I was trying to convey,” he says.
Harrington's musical sequences were straightforward and unprocessed, so I dressed them with reverb, EQ, doubling, pitch-shifting, volume envelopes, and panning. Though some of Katy J's music came directly from her Stand Still CD, a few tracks were roughs. I placed iZotope Ozone 3 on the music bus to use its EQ, multiband compression, spatial enhancement, and limiting to bring up the volume and give her tracks and the underscore a final bit of sweetening.
It's in the Mix
I premix many of the tracks into stems as I go along and perform the final mix at the bus level by recording automation in a few passes. I follow the Hollywood standard of a fixed monitor gain, calibrated with pink noise so that a -20 dBFS signal (equal to 0 VU in the analog realm) produces 85 dB SPL.
Soundtracks have a much wider dynamic range than music — about 20 to 30 dB compared with a range of less than 12 dB. The dialog averages -27 dB RMS, which most musicians would find remarkably low. There are opportunities for huge dynamic swings in film soundtracks, which is part of the thrill of working on them.
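Measuring a track's average level the way that -27 figure implies is straightforward. Here is a minimal RMS meter sketch in Python (numpy assumed, float samples with full scale at 1.0); it ignores the weighting and gating a broadcast loudness meter would add:

```python
import numpy as np

def rms_dbfs(samples):
    """Average RMS level of a float signal (full scale = 1.0), in dBFS."""
    rms = np.sqrt(np.mean(np.square(np.asarray(samples, dtype=float))))
    return 20 * np.log10(rms)

# Sanity check: a full-scale sine reads about -3 dBFS RMS on this meter.
# Dialog mixed to the film convention would hover near -27 dBFS RMS.
t = np.arange(48000) / 48000
sine = np.sin(2 * np.pi * 1000 * t)
```

Running `rms_dbfs` over a dialog stem in one-second windows is a quick way to see whether the track really sits around that average, or whether a few lines are spiking far above it.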
Be sure to check the mono compatibility of your mix, because many theaters have less-than-stellar audio playback systems. I also check mixes at a low volume on small, bass-challenged close-field monitors, such as the Avant Electronics Avantone MixCubes. Referencing my mix on such a system gives me a good idea of what it will sound like on a consumer playback system.
Typically, wind and rain are hard to mix because of their noise components. In the opening scene of the film, I used different rain perspectives (close, far, and midfield) and a lot of equalization (high-frequency boost, low cuts, and carving out the middle mud). For thunder I layered loud, close-up crashes with distant rumbles. Some thunderclaps were actually explosions, and I used reverb with a long predelay to thicken them up.
Some of my bass sounds were routed to a bus with a lowpass filter and pitch-shifter set down an octave. The resulting deep rumble was mixed in at key points for low-frequency enhancement.
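A crude version of that octave-down rumble chain can be sketched as follows. This is not the plug-in chain used on the film: it shifts pitch by resampling, which also doubles the duration (harmless for a rumble bed), and it uses a one-pole smoother as a stand-in for a proper lowpass filter:

```python
import numpy as np

def octave_down(x):
    """Crude octave-down: resample at half speed by linear interpolation,
    halving every frequency (and doubling the length, which hardly
    matters for an atonal rumble bed)."""
    x = np.asarray(x, dtype=float)
    old = np.arange(len(x))
    new = np.arange(0, len(x) - 1 + 0.5, 0.5)  # twice the sample points
    return np.interp(new, old, x)

def one_pole_lowpass(x, alpha=0.05):
    """Simple one-pole lowpass to keep only the deep rumble content."""
    y = np.empty_like(np.asarray(x, dtype=float))
    acc = 0.0
    for i, s in enumerate(x):
        acc += alpha * (s - acc)
        y[i] = acc
    return y
```

Chaining the two (`one_pole_lowpass(octave_down(x))`) gives the dark, subsonic version of a sound that can be tucked under key moments for low-frequency weight.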
Another idea came during the mix, in a scene where a character is awakened by a cell-phone ring. I made the first ring sound as if it were off in the distance, the way you might perceive it coming out of a dream, by soaking it with reverb. The second ring had about half as much of the effect, and the final ring, just before the character answers, was dry and up front (see Web Clip 3).
I sent Harrington updates nearly every day, and he made suggestions that were incorporated right away. Because the drive also had the video files, I began making DVDs of the project with the new soundtrack and overnighting them for approval. This became more important when I switched to the newly released Vegas 7 and began using audio effects the director lacked. Near the end of the process, Spotted Eagle added his invaluable insight into the mix, and the audio portion of Craving was finally complete.
All that remained was some video work, such as color correction, and the final swap of proxies for the high-definition HDV media. Spotted Eagle handled the video tweaks while I pulled the project file together for the final rendering, which took more than six hours. I also authored the initial DVD for film festivals.
If you do sound for picture, you have to learn to live with the fact that most of your work goes completely unnoticed. As the saying goes, “Nobody leaves the theater whistling your sound effects.”
Despite the amount of tedious grunt work involved in audio postproduction, creative sound design and mixing can make the process fulfilling. Every project has a unique set of challenges that force you to employ every ounce of your knowledge and creativity. Finally, when you see and hear your work in a darkened theater and experience the audience's reactions, you know all that effort has paid off.
Jeffrey P. Fisher's latest book is Soundtrack Success: A Digital Storyteller's Guide to Audio Post-Production (Thomson Course Technology, 2007). Learn more about his work at www.jeffreypfisher.com.