Make pictures come to life with sound.
If you've been to the movies lately, you've probably seen the name of a sound designer in the credits along with the names of the composer and cinematographer. You're also likely to find a sound designer as part of the production team for every computer and video game you come across, not to mention theatrical productions, Web sites, and even radio dramas. But exactly what is a sound designer, and when did that position reach its current status?
Sound design is as old as talking movies, but it really became a serious full-time endeavor when the use of sound was brought to new heights in films such as Apocalypse Now and Star Wars. Since the mid-1970s, sound design has become an essential part of most major films as well as the majority of computer and video games.
Today, sound-design work is available for virtually every form of visual entertainment. Large game companies often have junior sound-assistant openings that can lead to more senior positions with greater responsibility. Film post-production houses often hire interns, giving them the chance to learn while making coffee and labeling tapes. Freelance studios occasionally have roles for junior and senior sound designers to help with Web and game projects.
The art of sound design has provided fine careers for many musicians who are not interested in traditional job opportunities in music but love audio production and the creativity of the sonic arts. It's a booming profession that requires a combination of skill, patience, and hard work; professional sound designers are highly sought after and can command good fees.
BY ANY NAME
So what is sound design? How is it used in different media, and what is the process by which sonic tableaux are brought to the big and small screens? In the broadest sense, the purpose of sound design is to augment or enhance the telling of a story. In most cases, that involves the creation, manipulation, and organization of nonmusical sonic elements. Those elements can include door slams, cricket chirps, or computer beeps.
Sound design is the process that turns James Earl Jones's deep voice into Darth Vader's evil growl, and it adds the swishes and smacks that pepper the combat scenes in Hong Kong action flicks. Sound design is the sound of lasers firing and ships exploding in science-fiction games or waves lapping gently against creaky docks in a pirate adventure. What do these disparate sonic examples have in common? They reflect the imagination and taste of the sound designer as he or she tries to enhance a story with sound.
The sound-design process is no real mystery. In fact, you can break most jobs down into seven key steps: determine what sounds are needed, collect the raw sonic materials, manipulate and edit the sounds, integrate them into the project, revise until satisfied or time runs out, mix the sounds, and deliver the finished product to the client. This article will look at each of these steps and define a number of common terms. (See www.filmsound.org for a great collection of articles about sound design, including a glossary.) By the end, you should have a solid understanding of the technical and artistic elements that go into successful sound design.
SONIC STRATA
The sonic elements in a project can normally be broken down into several layers that serve different functions. Often, different people work on different layers simultaneously. The sound layers are combined with dialog and music during the mix, which creates the finished presentation. (In the case of interactive media, the mix consists of programming the volume and pan levels of the various elements in code rather than on a mixing console.)
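To make the idea of "mixing in code" concrete, here is a minimal Python sketch of how an interactive project might store volume and pan settings per sound. The element names, the decibel values, and the equal-power pan law are all assumptions chosen for illustration; a real game engine will have its own conventions.

```python
import math

def pan_gains(pan):
    """Return (left, right) gains for pan in [-1.0, 1.0] using an equal-power law."""
    if not -1.0 <= pan <= 1.0:
        raise ValueError("pan must be between -1.0 (hard left) and 1.0 (hard right)")
    angle = (pan + 1.0) * math.pi / 4.0  # maps -1..1 onto 0..pi/2
    return math.cos(angle), math.sin(angle)

def db_to_gain(db):
    """Convert a level in decibels to a linear gain factor."""
    return 10.0 ** (db / 20.0)

# Hypothetical mix table: one entry per sound element (names and values are made up).
MIX = {
    "dock_ambience": {"volume_db": -12.0, "pan": 0.0},
    "foghorn": {"volume_db": -18.0, "pan": -0.6},
    "bottle_smash": {"volume_db": -3.0, "pan": 0.3},
}

def channel_gains(name):
    """Combine an element's volume and pan into final left/right gains."""
    entry = MIX[name]
    left, right = pan_gains(entry["pan"])
    gain = db_to_gain(entry["volume_db"])
    return gain * left, gain * right
```

Keeping the levels in a table like this means the balance can be revised without touching the playback code, which is roughly what a mixing console does for a film.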
Most projects begin with a spotting session, which is attended by the sound designer and film director or game producer. Spotting is the process of watching a scene, making a list of the sonic elements that are needed, and dividing them into their constituent layers. I'll define the layers of sound by spotting the following scene.
It is a foggy midnight near the docks. Lapping against the pier, the waves are restless, and a light breeze is kicking up. In the distance, a foghorn blows. The hero stumbles into the frame, his old leather shoes scuffing and scraping the sidewalk as he struggles to keep his balance. He hears tires squealing behind him and whirls around. A half-empty vodka bottle falls out of his coat pocket and explodes like a grenade on the sidewalk. He turns and runs forward, right into a fruit crate that was home to an alley cat, which yowls in protest and runs off into the night.
Foley. Foley is the process of recording the sounds of human action in a studio to mimic actors' onscreen movements (see Fig. 1). Footsteps, the rustling of clothing, the handling of objects, and other sounds are recorded by the Foley team while watching the picture. Foley is a deep and dedicated art form, and good Foley artists have a rare combination of skills that include amazing reflexes, great physical control and stamina, a rich imagination, and a knack for coaxing the best sounds out of inanimate objects.
For the scene described above, the Foley artists would find a pair of worn leather shoes to imitate the character's and then record all the scuffles and foot scrapes in a Foley pit, a concrete box with sand in the bottom. They would also record clothing rustles to mimic the character's movements, and the sound of the bottle slipping out of his coat pocket. The final recording would be the clattering of the fruit crate.
Hard SFX. Hard or principal SFX (sound effects) are the primary up-front sounds that sync to important events on the screen, highlight the drama, and help tell the story. The lead sound designer typically works on those sounds. In the sample scene, the hard effects are the tires squealing, the bottle smashing, the cat yowling, and possibly the clatter as the hero runs into the crate. The final mix might include the Foley artists' crate clatter, the sound designer's clatter, or a combination of both.
In large-budget film projects, the hard SFX category is further subdivided into editorial and principal effects. Editorial effects are routine, everyday sounds such as doors closing and cars starting. Principal effects are the big, production-specific laser zaps, explosions, and dinosaur footsteps.
For example, in the film Being John Malkovich, different people created each type of effect. Ren Klyce, Malcolm Fife, and I were contracted to provide all the weird sounds, including the tunnel sequences, everything that takes place inside Malkovich's head, and the bizarre restaurant scene. Another team provided the everyday sounds. In the film Fight Club, however, Klyce created a unified aesthetic mood throughout the film by creating all the sounds with his small team. (Klyce was nominated for an Academy Award for that film.)
Production sound. On most live-action projects, the sound at the scene of the shoot, called the production sound, is captured by a production recordist. That person's main job is to record the dialog, but he or she might also capture some of the editorial sound effects as well as backgrounds and room tones. That task is made difficult by the sound of cameras whirring, generators buzzing, and directors screaming, but the production recordings are a gold mine of material. Of course, animated projects such as games have no production recordings.
Background/ambience. Backgrounds (also called atmospheres) are sounds that are not synced to events on screen. Those sounds set the mood and define where something is taking place. Distant city traffic, the ever-present rumble of a starship, or a chorus of birds in the jungle can serve to reinforce the visual image and enhance the story's believability. Backgrounds also come in two types. Ambiences are long, continuous recordings that set a mood without calling attention to themselves. Stingers or specifics are short elements added to the ambience tracks at certain times to spice things up.
In the scene example, the ambience track is a continuous recording of waves lapping against the piers, with a bit of wood creaking and some distant traffic. That tells the audience that the scene takes place in a city, near the water. The stingers include the distant foghorn and a breath of spooky wind placed at just the right moment.
Room tone is a special type of ambience, typically a recording of the atmosphere of an interior space with no specific sound. It's not very dramatic, but room tone is important for creating a subtle undercurrent that ties together the other elements in a scene. It is critical in matching ADR (automated dialog replacement, which is dialog recorded after the fact) with dialog recorded during the shoot. Room tone is mixed with the replacement dialog, which helps conceal the fact that it was recorded elsewhere.
FROM THE TOP
There are various steps in the sound-design process, including ways to find material, process sound, and combine sources. Keep in mind that sound design is a post-production process; that is, it takes place after some or all of the primary production work (filming and animation, for example) is complete.
In determining how to design sound for a project, you need to gather everything related to the project that you can find. That might include scripts, documents, footage dubbed to VHS tape, QuickTime movies of game animations and alpha versions of the game, or rough versions of a Web applet in progress. Whatever content your collaborators can provide is of utmost importance; you need to see what is going on before you can design sound for the project. Doing it any other way guarantees a mismatch between the sound and visual elements and a mess at the end of the project. (See the sidebar "Preliminary Procedures" for additional suggestions about preparing yourself for the task.)
Also, because sound is added at the end of the process, all the delays and late deliveries throughout the production, as well as all budget overruns, will happen before you get to ply your trade. It isn't fair, but that's how it goes.
ACQUIRING RAW SOUND
Once you have decided what sounds you will need, the next step is to acquire the raw sound recordings. Those recordings are the clay from which your sonic sculpture will come, and it is essential to pick the right material from the start. Look for sounds that you think are interesting, rich, and full of life; it's difficult to breathe life into dull or listless raw materials. The right well-recorded sound needs virtually no processing to get the message across.
Field recording. Field recording is the process of taking a portable tape recorder and microphone into the unruly world outside your studio. Field recording is terrific for capturing ambiences, animals, airplanes, or nearly anything else that you can't bring into your studio. There is no part of sound design I love more than field recording. It's supremely satisfying to strap on a bunch of gear and go out into the world to get a sound you need. The results are authentic, unique, and your own - frozen moments in time and space that you have committed to tape.
When traveling, I always take my field recorder (a portable DAT with a stereo mic preamp and twin mics) so I'm ready to capture whatever interesting sounds I come across. I've recorded airport ambience in Beijing, hippopotamus calls in Kenya, traffic on a rainy night in Amsterdam, and the rhythmic chugging of a train cutting through the Italian countryside. For my work on LucasArts Entertainment's pirate-adventure game Escape from Monkey Island, I recorded as many watery locations as I could (see Fig. 2). Field recordings can fill a game with original ambiences that you won't get from a commercial sound-effects library.
Field recording does have one distinctive drawback: the frequent intrusion of unwanted noise, such as wind and human sounds. Putting a microphone in a windy spot doesn't just give the whistling, rustling sound you hear as wind in the movies. When the air blows directly on the diaphragm of the mic, it creates unbelievably loud low-frequency rumbles that ruin your recording. Try blowing directly on a mic and you'll hear what it sounds like.
I recommend three approaches to dealing with wind on field recordings. First, try to avoid it by recording on calm days or at times when the wind is not kicking up. Second, use wind socks on your microphones. These are large furry coverings, such as the Rycote Windjammer (see Fig. 3), that block and diffuse the wind before it hits the mic. Third, be careful with mic placement. If you can place the mic out of direct contact with the wind, perhaps behind a rock or tree, you can record the whoosh of the wind interacting with everything else around it. If you get the occasional buffets of wind across the mic, you can edit them out later and still have a nice, windy recording.
The bigger problem is human noise pollution. Walk out the front door of your home, close your eyes, and listen. Unless you live on a windswept plain in Wyoming, chances are that you can hear traffic, low-flying airplanes, or other human noises. Unfortunately, the microphone is impartial to noise and picks up all the unwanted stuff in addition to what you are looking for.
That problem has two primary solutions. The first is to be where (and when) people aren't. Try to do your exterior field recording after midnight, when there's much less traffic and fewer people around to hassle you (unless you want the effect of an urban background). Bioacoustician and sound pioneer Bernie Krause travels the world trying to record natural habitats without the intrusion of human noise, and he ends up with an extremely low ratio of usable sounds to recorded raw material.
If you are trying to get specific sounds that emanate from one small source, you can fight the problem of excessive noise by using shotgun microphones. Those mics have a tightly directional polar pattern that rejects much of the sound arriving from anywhere other than directly in front of the diaphragm. They are also commonly used in production-dialog recording to pick up the actors' voices on the set while minimizing the sound of cameras and other noise. If you are recording in a busy environment, have patience. Eventually, there will be a lull in the action, and you will get what you want.
IN THE STUDIO
Recording in the field is terrific for getting sounds in the context of their environments, but sometimes you want materials that are as clean, clear, noise-free, and devoid of ambience as possible. For those situations, studio recording is the way to go.
I record sound effects in two different areas of my studio. When I want a dry, clean sound with as little coloration as possible, I close-mic the source in my vocal booth with a large-diaphragm mic between six inches and two feet from the sound source. If I want a more open sound, I record in my larger room, with the mic placed one to six feet from the source. I also experiment with shotgun mics, lavaliere mics, and anything else I have available.
If you are recording in a bedroom-size space, deaden the surfaces around the recording area with acoustic absorbers and turn off any unnecessary computer fans. Use the nicest, most neutral mic and best mic preamp you have. If you're recording voice, use a pop filter.
Voice. The most flexible sound generation tool is the human voice. The voice has been used as the basis for a great deal of sound design, and it's capable of mimicking all sorts of animals and birds, among other strange and surprising sounds. The emotive character of the voice expresses movement and interest no matter how much manipulation it undergoes. Voice sounds great slowed down and reversed, and it can impart a sense of the familiar within a context of nonhuman sounds.
One of my favorite uses of voice was in the creation of a creature known as a Szlachta for the game Vampire: The Masquerade (see Fig. 4). I recorded a baby cooing, grumbling, crying, and whining. That material, slowed down and dropped in pitch, became the disturbing gibbers and moans of a misshapen monster. Our natural empathy toward babies evoked an extra shade of pathos for the hideous creature.
Props. When I'm working on sounds in the studio, I tend to record a wide variety of props. I am always on the lookout for items with interesting sounds to add to my collection. Bits of metal, chunks of wood, tools that slide or slip or ratchet, rough pieces of cloth and Velcro, balloons, nails, and sections of chain all can be recorded and manipulated to great effect.
I often think about the physical components that make up an object when I'm imagining how to create its sound. What's it made of? Does it have bits that rattle? Is it squeaky or smooth? Once I have imagined what the sound should be, I look for props I have that would work as an element. Remember: the prop doesn't need to look like what it represents; it just has to sound like it. If I don't have a suitable prop, I head out to find the right thing. Thrift stores are terrific places to find great props at low prices.
GET THEE TO A LIBRARY
Regardless of how much field recording you do and how big your personal library is, it is always useful to have commercial sound-effects libraries on hand. You'll find general and specific sound libraries that are filled with excellent, well-organized material, both basic and exotic. Those libraries are convenient and useful, but always remember their chief limitation: other people own them, too, and the same effects will be used repeatedly in audio projects of all kinds. Try to use commercial effects sparingly, and give the sounds your unique stamp by editing, layering, and manipulating them in every way you can.
There are two major distributors of royalty-free, professional sound-effects libraries on CD: Sound Ideas and Hollywood Edge (see Fig. 5). Both companies sell high-quality libraries that cover a broad range of commonly used effects. Both also offer libraries that focus on specific areas such as vehicles, explosions, and footsteps, as well as inexpensive starter sets. In particular, Sound Ideas' The Library set and Hollywood Edge's The Edge Edition are both good basic libraries.
Good CD libraries can be pricey, averaging $25 to $50 per disc. The reason is simple: the market is quite small, and each disc represents a lot of labor by some of the best sound designers in the world. I try to offset the cost by purchasing one or two new libraries at the beginning of each major project.
One of the problems with purchasing large sound-effects libraries is that you end up paying for many sounds that you will never use. An alternative to buying complete libraries is to utilize Internet-based services that let you audition specific sound effects, then download and pay for just the sounds you need. The two largest online distributors of individual sound effects are Sound Dogs (www.sounddogs.com) and SFX Gallery (www.sfx-gallery.co.uk).
For example, at the Sound Dogs site, you can search for sounds by category (such as "doors") or keyword (such as "bulldozer") and then go to a page listing all their entries. You can audition low-resolution versions of the sounds and select the ones you want to purchase. The price depends on the length of the sound and format you choose; higher quality means higher prices. When you've made your selection, download the sounds from an FTP site or have a CD-R burned and mailed for a nominal extra fee. Other sites, such as www.ultimatesoundarchive.com, offer a monthly subscription that includes access to whatever sound effects are available on its site.
In addition to online commercial libraries, you can also download repositories of free sound effects and use them legally in your projects. Those free effects are wonderful for starting your collection; however, their resolution and quality vary considerably. Web sites with free sound effects include www.partnersinrhyme.com, www.alcljudprod.se, www.stonewashed.net/sfx.html, and www.wavplace.com. Keep in mind that you get what you pay for.
Synthesizers. Synths aren't generally used for creating real-world sounds, but they can be a great source when you need abstract, otherworldly material. They're well suited for making all sorts of sci-fi, computer, robot, and starship sounds, but avoid cliches. I used a Nord Modular synth to good effect on a recent pair of Star Wars games by modeling my sounds on the ARP 2600-based, ring-modulated style that Ben Burtt used for the droids and machines in the films.
Synths are great for creating filtered white-noise sweeps, which make terrific whooshes and swishes. You can use those sounds to create a sense of motion or action, particularly when they're augmented with panning and Doppler processing. Synths are also useful for creating earthquake and rocket rumbles. One of my favorite techniques is to start with white noise, lower the cutoff frequency of a lowpass filter until there is nothing left but the extreme low end, then use a sample-and-hold LFO to modulate the resonance or filter cutoff. I further enhance the low end through EQ or subharmonic synthesis; the result is instant beef.
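The rumble recipe above can be roughed out in software. The following Python sketch is only an approximation under stated assumptions: it uses a single one-pole lowpass (much gentler than a typical synth filter and with no resonance control), it modulates only the cutoff, and the cutoff range, hold rate, and function name are invented for illustration.

```python
import math
import random

def rumble(duration_s=2.0, sample_rate=44100, hold_hz=8.0,
           cutoff_range=(30.0, 80.0), seed=0):
    """White noise through a one-pole lowpass whose cutoff is stepped
    by a sample-and-hold LFO (a new random cutoff every 1/hold_hz seconds)."""
    rng = random.Random(seed)
    hold_len = max(1, int(sample_rate / hold_hz))  # samples per held value
    out = []
    y = 0.0       # filter state (previous output)
    coeff = 0.0   # smoothing coefficient derived from the current cutoff
    for n in range(int(duration_s * sample_rate)):
        if n % hold_len == 0:
            # Sample-and-hold: pick a new cutoff and keep it until the next step.
            cutoff = rng.uniform(*cutoff_range)
            coeff = 1.0 - math.exp(-2.0 * math.pi * cutoff / sample_rate)
        x = rng.uniform(-1.0, 1.0)   # white noise sample
        y += coeff * (x - y)         # one-pole lowpass
        out.append(y)
    return out
```

The output is quiet because most of the noise energy has been filtered away, so in practice it would be normalized and then beefed up with EQ, as described above.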
GET ORGANIZED
After you have recorded the raw sound materials, transfer your sounds to a computer and organize them into clear categories. I work on a Mac with Pro Tools, but I know sound designers who use many varieties of systems and platforms.
Once you develop a good-size library, accessing and organizing your data becomes decidedly important. I have my entire sound library, some 200 GB as of this writing, online at all times. When I start a new project, I create a new folder that becomes the master library for the project. Within that folder, I create subfolders organized by whatever system makes sense for the project (see Fig. 6). For a film, I might create subfolders by reels ("reel 1," "reel 2," and so on), whereas for a game, I typically use general categories that refer to the way the sounds function (for example, "backgrounds," "footsteps," or "objects"). In both cases, I might also organize by specific categories of sound, such as doors, vehicles, and weapons.
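If you set up the same folder structure for every project, a short script can do it for you. This Python sketch assumes the category names mentioned above; the function names and the project name in the usage example are hypothetical.

```python
from pathlib import Path
import tempfile

# Category names taken from the examples in the text; adjust per project.
GAME_CATEGORIES = ["backgrounds", "footsteps", "objects",
                   "doors", "vehicles", "weapons"]

def film_categories(reel_count):
    """Return subfolder names for a film organized by reel."""
    return [f"reel {n}" for n in range(1, reel_count + 1)]

def create_project_library(root, project_name, categories):
    """Create a master library folder for a project with one subfolder per category."""
    project_dir = Path(root) / project_name
    for category in categories:
        (project_dir / category).mkdir(parents=True, exist_ok=True)
    return project_dir

if __name__ == "__main__":
    # Demonstration in a throwaway directory so nothing is left behind.
    with tempfile.TemporaryDirectory() as tmp:
        library = create_project_library(tmp, "pirate game", GAME_CATEGORIES)
        print(sorted(p.name for p in library.iterdir()))
```

Because the folders are created with `exist_ok=True`, running the script again on an existing project adds any new categories without disturbing what is already there.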
Next, I pull my DAT field recordings into Pro Tools, edit the materials, move them into my library folder, and add descriptive names such as "night cricket river amb 01." Any recordings or synth elements I create in the studio are recorded directly into Pro Tools, then moved into the library. I like to keep ambience recordings stereo and about two minutes long. Point-source sounds, such as door slams, are usually mono. I keep all my master material in Sound Designer II format, the Macintosh/Pro Tools industry standard, at 16-bit/44.1 kHz. If possible, always transfer your source material into your computer digitally. I also recommend having an extra hard drive around to make quick backups.
PROCESS THIS!
Once the raw materials are organized in your system, polish them to suit the project. That manipulation phase can be as simple as a bit of editing or EQ or as radical as mangling them into totally new sounds, utterly unrecognizable from the original. What and how much to do depends on the context of the work and the taste of the sound designer.
If the project is a film involving realistic characters in the present time, the majority of manipulation will focus on careful editing, EQ, and dynamics processing. If the project is an ultrahip techno-cyber sci-fi game, manipulations will involve every weird plug-in and effects processor you can get your hands on. My personal taste leans toward finding a terrifically beautiful sound, recording it really well, and using as little processing as possible. I do rely heavily on pitch-shifting, EQ, and reverb when needed, however.
Now I'll discuss the traditional techniques of sound manipulation, which are appropriate for all sorts of projects, as well as a few tools for producing more radical results. Check out the sidebar "Who Wants to Be a Sound Designer?" for some other projects you can try.
Editing. Editing is simply the process of choosing the part of the sound you like and discarding the rest. I typically record a dozen performances of a given door creak or screw turn and save them one after another in a single file. When examining the file, I listen carefully for the version that best fits the timing and performance of the action. Sometimes the creak of one take fits best with the slam of another take; don't be afraid to mix and match. I've also taken small segments of different sounds and put them together to create entirely new sounds. For example, I once used small snippets of a gun cocking and edited them together with a bicycle gear change to create a metallic tool ratcheting into place. Cutting, pasting, and crossfading are essential tools in your arsenal.
EQ. EQ can be used both correctively and creatively. Rolling off the low end of a wind recording at 85 Hz can eradicate unwanted rumble. But if you move the filter cutoff to 4 kHz, you can eliminate the parts of a sound that provide its basic identity. Try that with the sound of wind and you're left with a wispy, airy sound that could be used for a ghostly ambience. Creating strong resonant peaks in the middle of a sound's frequency spectrum can be great fun as well. Better yet, try moving the peaks over time as the sound plays back.
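For readers who want to see what a low-end rolloff does to a signal, here is a minimal Python sketch of a first-order high-pass filter. It is an assumption-laden stand-in for a real EQ: the slope is only 6 dB per octave, and the helper names (`highpass`, `sine`, `rms`) are made up for this example.

```python
import math

def highpass(samples, cutoff_hz, sample_rate=44100):
    """First-order (6 dB/octave) high-pass filter, modeled on a simple RC circuit."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    out = []
    prev_in = 0.0
    prev_out = 0.0
    for x in samples:
        prev_out = alpha * (prev_out + x - prev_in)
        prev_in = x
        out.append(prev_out)
    return out

def sine(freq_hz, duration_s=1.0, sample_rate=44100):
    """Generate a test tone for checking the filter's effect."""
    count = int(duration_s * sample_rate)
    return [math.sin(2.0 * math.pi * freq_hz * n / sample_rate)
            for n in range(count)]

def rms(samples):
    """Root-mean-square level of a signal."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

if __name__ == "__main__":
    for freq in (20.0, 85.0, 1000.0):
        tone = sine(freq)
        ratio = rms(highpass(tone, 85.0)) / rms(tone)
        print(f"{freq:7.1f} Hz keeps {ratio:.2f} of its level")
```

With the cutoff at 85 Hz, a 20 Hz rumble is strongly reduced while a 1 kHz tone passes almost untouched. Moving the cutoff up to 4 kHz with the same function strips away most of a sound's body, which is the creative use described above.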
Dynamics processing. Level compression can be useful in adding punch or body to a sound, but be careful. It is a common mistake to overcompress sound elements before they are layered and mixed because doing so doesn't leave enough headroom once everything is put together. You have to turn down the sound in the mix to make it fit without peaking, negating your original intent. Hang on to your dynamic range - it is a precious commodity in short supply.
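The headroom problem is easy to see with a little arithmetic. This Python sketch, with hypothetical helper names, measures peak level in dB relative to full scale (where 1.0 is full scale) and shows what happens when several hot layers are summed.

```python
import math

def peak_dbfs(samples):
    """Peak level in dB relative to full scale (1.0 = 0 dBFS)."""
    peak = max(abs(s) for s in samples)
    return float("-inf") if peak == 0.0 else 20.0 * math.log10(peak)

def mix(layers):
    """Sum equal-length layers sample by sample (extra samples in longer layers are dropped)."""
    return [sum(frame) for frame in zip(*layers)]

def trim_needed_db(layers, ceiling_db=0.0):
    """How many dB the summed mix must come down to stay under the ceiling."""
    return max(0.0, peak_dbfs(mix(layers)) - ceiling_db)
```

For example, three layers that each peak at 0.7 (about -3.1 dBFS) and happen to peak at the same instant sum to 2.1, which is about 6.4 dB over full scale. Every layer then has to be pulled down by at least that much in the mix, which undoes the loudness the compression was supposed to buy.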
Pitch and time shifting. Pitch and time shifting are two methods of sonic manipulation that have been around as long as sound recording. By speeding up or slowing down a tape, you raise or lower the pitch and decrease or increase the duration of the sound concurrently. That technique can change a sound in powerful ways - voices and everyday sounds take on a murky, mysterious quality when dropped in pitch and time, and they assume a cartoonlike quality when raised.
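The tape-speed effect, in which pitch and duration change together, amounts to reading through the samples faster or slower. The Python sketch below illustrates that idea with simple linear interpolation; it is not how any particular editor implements it, it does no anti-alias filtering when speeding up, and the function names are assumptions.

```python
def semitones_to_speed(semitones):
    """Playback speed that shifts pitch by the given number of semitones."""
    return 2.0 ** (semitones / 12.0)

def varispeed(samples, speed):
    """Resample as if a tape were played at `speed` times normal rate.

    speed < 1 lowers the pitch and lengthens the sound;
    speed > 1 raises the pitch and shortens it.
    """
    if speed <= 0:
        raise ValueError("speed must be positive")
    if len(samples) < 2:
        return list(samples)
    last = len(samples) - 1
    out_len = int(last / speed) + 1
    out = []
    for n in range(out_len):
        pos = n * speed                  # read position in the source
        i = min(int(pos), last)
        frac = pos - i
        nxt = samples[min(i + 1, last)]
        out.append(samples[i] * (1.0 - frac) + nxt * frac)
    return out
```

Calling `varispeed(sound, semitones_to_speed(-12))` plays the sound at half speed, so it comes out an octave lower and roughly twice as long, much like the slowed-down baby recordings described earlier.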
These days, DSP techniques can change the pitch and temporal components of a sound independently. Those functions are useful, but they often impart artifacts to the sounds that disrupt their clarity, beauty, or impact. Try turning off the preserve duration option in the program you use to see if you prefer the results.
Chorusing, flanging, delay. Standard pitch- and delay-based processing have an important place in sound design. Chorusing and very short delays can be useful for converting a mono file to stereo, just as in music. I like using flangers with noise-based sounds to create jets and starship sounds. I've also used flanging on the tail portion of gun and weapon sounds to create more motion and edge.
Delays are great for making sounds bounce around, particularly if the delay taps are panned around the field, with a bit of pitch shifting thrown in for good measure. Short single-tap delays can simulate the early-reflection portion of a reverb, which can evoke the feeling of a tight space without adding a reverb tail.
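As a rough illustration of panned delay taps, the following Python sketch takes a mono sound and returns a stereo pair with the dry signal in the center and each tap placed somewhere in the field. The tap format, the equal-power pan law, and the function name are assumptions made for this example, and the pitch shifting mentioned above is left out.

```python
import math

def multitap_delay(mono, taps, sample_rate=44100):
    """Return (left, right) lists: centered dry signal plus panned delay taps.

    taps is a list of (delay_ms, gain, pan) tuples, with pan running
    from -1.0 (hard left) to 1.0 (hard right).
    """
    offsets = [int(round(delay_ms * sample_rate / 1000.0))
               for delay_ms, _gain, _pan in taps]
    length = len(mono) + max(offsets, default=0)
    left = [0.0] * length
    right = [0.0] * length

    center = math.cos(math.pi / 4.0)  # equal-power gain for a centered source
    for n, x in enumerate(mono):
        left[n] += x * center
        right[n] += x * center

    for (_delay_ms, gain, pan), offset in zip(taps, offsets):
        angle = (pan + 1.0) * math.pi / 4.0
        gain_l = gain * math.cos(angle)
        gain_r = gain * math.sin(angle)
        for n, x in enumerate(mono):
            left[n + offset] += x * gain_l
            right[n + offset] += x * gain_r
    return left, right
```

Taps such as `[(120, 0.6, -0.8), (240, 0.4, 0.8), (360, 0.25, -0.4)]` make a sound bounce from side to side as it decays, while a single short tap of a few tens of milliseconds gives the early-reflection effect of a tight space.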
Reverb. Reverb has two specific uses in sound design: placing a sound in a space and adding depth and size to a sound. If you have a dry sound and want to give it the illusion of being in a particular type of room, you can use fairly short room reverbs with a high wet/dry mix. I've never found this to be as convincing as recording the sound in an appropriate space, but it works well enough. You can also use longer reverb programs, typically plates and halls, to add weight and drama to sounds such as gigantic dinosaur footsteps and cannon shots. For this type of approach, I use the original sound completely dry and add reverb as a separate layer. I then raise the volume of the reverb at the tail portion of the sound.
Worldizing. Digital reverb units are stocked with enough horsepower and brilliant programming to sound terrific, but to my ear, the digital version never sounds quite like the real thing. The most convincing way to make something sound like it was recorded in a room is to record it in one - but sometimes that's not possible. In addition, you may have several sounds recorded under different circumstances that you want to sound as though they belong together.
The solution to both problems is to worldize the sounds, that is, to play them back in an appropriate space and record the playback. That means lugging around a high-quality sound-playback system along with your recording rig. Place a speaker in a room or location with the desired aural fingerprint and position a microphone some distance from the speaker. Next, play back your original sounds through the speaker and rerecord them on another tape recorder, capturing the sound with all the reverberant characteristics of the space. That requires much time and effort, but when only the most authentic reproduction will do, worldizing can get you there.
Other options. You can find useful processing functions in special-purpose sound-design tools, such as U & I Software's MetaSynth (www.uisoftware.com) and the Kyma System from Symbolic Sound (www.symbolicsound.com; see Fig. 7). A seemingly endless supply of plug-ins is also available for transforming audio in unusual ways. But don't worry about owning one of everything; just master the tools you have and try to get the most out of them.
PUTTING IT ALL TOGETHER
Once you've collected and processed the sounds you like, import them into a multitrack editor and align them with the visual image. If the visual material has been rendered in a computer, it is delivered to the sound designer in QuickTime format; otherwise, it typically comes on a VHS videotape. In the latter case, I import the picture into the computer as a QuickTime file, which Pro Tools plays along with the audio. If possible, it's nice to have a dedicated video card, such as the Aurora Fuse, which takes much of the burden of video playback from the computer CPU.
Sometimes I do the processing and manipulation at this point by letting the visual imagery tell me what the sound needs. I import the sound into the Region bin of Pro Tools and drag it into the edit window, roughly where the sound should be. Then I tweak the timing against the visual and adjust frame by frame until it is perfect. Synchronization is as much an art as a mechanical skill - you can't just assume that the closing-door sound should land exactly on the frame in which the door closes. You have to watch each motion repeatedly and adjust the timing of the sound until it feels right.
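When you nudge a sound frame by frame, it helps to know how much audio a frame represents. This Python sketch converts picture frames to sample positions; the frame rates and sample rates in the usage notes are assumptions, since the right values depend on the project, and the function names are hypothetical.

```python
from fractions import Fraction

def frame_to_sample(frame, fps, sample_rate):
    """Sample position of a picture frame (fps may be an int or a Fraction)."""
    return round(Fraction(frame) * sample_rate / Fraction(fps))

def nudge(start_sample, frames, fps, sample_rate):
    """Move a sound earlier (negative frames) or later by whole frames."""
    return start_sample + frame_to_sample(frames, fps, sample_rate)

# NTSC video runs at 30000/1001 frames per second (about 29.97).
NTSC = Fraction(30000, 1001)
```

At 48 kHz and 24 frames per second, one frame is 2,000 samples; at 44.1 kHz and 30 frames per second, it is 1,470. That is a large enough window that a sound can sit anywhere within the frame and still be technically in sync, which is one reason the final placement comes down to feel.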
SOUND AND VISION
When I'm ready to put sound elements to a visual, I create a new Pro Tools session and roughly organize the tracks I'll need. Keeping the materials organized by track is critical because a high-density two-minute sound cue can include hundreds of individual elements (see Fig. 8). A typical session consists of two stereo pairs of tracks for ambience, a track for each character's dialog, a stereo pair for music, six mono tracks for Foley (two footstep tracks, two clothing tracks, two prop tracks), and four stereo pairs and four additional mono tracks for principal sound effects. At the top of the session I also include one stereo pair and one mono track into which I can load sounds. That lets me quickly pull sounds into the load tracks, adjust their timing, and drag them into an appropriate track.
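The track layout just described can be written down as a simple template. This Python sketch is not tied to Pro Tools or any other program; the track names, the dictionary format, and the assumption that each dialog track is mono are all made up for illustration.

```python
def build_session(characters):
    """Return a track list following the typical layout described in the text."""
    tracks = []

    def add(group, name, channels):
        tracks.append({"group": group, "name": name, "channels": channels})

    # Load tracks sit at the top of the session.
    add("load", "load stereo", 2)
    add("load", "load mono", 1)
    # Two stereo pairs for ambience.
    for n in (1, 2):
        add("ambience", f"ambience {n}", 2)
    # One dialog track per character (assumed mono here).
    for character in characters:
        add("dialog", f"dialog {character}", 1)
    # One stereo pair for music.
    add("music", "music", 2)
    # Six mono Foley tracks: two each for footsteps, clothing, and props.
    for kind in ("footsteps", "clothing", "props"):
        for n in (1, 2):
            add("foley", f"foley {kind} {n}", 1)
    # Principal sound effects: four stereo pairs and four mono tracks.
    for n in range(1, 5):
        add("sfx", f"sfx stereo {n}", 2)
    for n in range(1, 5):
        add("sfx", f"sfx mono {n}", 1)
    return tracks
```

For a scene with two speaking characters, this template yields 21 tracks carrying 29 channels of audio, which gives a sense of how quickly even a modest session fills up.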
Once each sound is loaded, it must be evaluated. That may seem obvious, but you must really listen to each sound element. Listen to how it sounds by itself against the picture and how it sounds with the other sonic elements. Listen deeply to the detail of the sound; then put your "big ears" on and listen objectively to the element as part of the whole. Does the general character of the sound work with the visual? What does it need? Will editing or processing improve it? Does it fill out the visual, or could it benefit from adding another layer?
Conversely, does the sound have too many details? Is the sound so busy that it distracts from the intended object of the audience's focus? Those decisions are made almost instantly and lead you to the options of keeping the sound as is, modifying it, or starting again. The real trick here is to balance the endless options and your desire for perfection with the need to make progress.
Now that you have the process down, repeat it dozens, hundreds, or thousands of times until all your sound elements are lined up and sounding good. You are prepared for the mix!
SPECIAL DELIVERY
Once you've completed your work, you must deliver it to the next person in the chain of project personnel. Formats and delivery standards must be carefully specified and spelled out in advance to minimize confusion and rework. Ask the project leader exactly what format and medium the files need to be delivered in - the all-nighter you prevent may be your own. (See the sidebar "Plan of Attack" for a discussion of additional steps to consider for delivering different types of media.)
Film and video delivery is fairly standardized. The audio standard for broadcast video is 16-bit/48 kHz, but films can be either 16/44.1 or 16/48, depending on the project. If you are responsible for a finished stereo mix of all audio elements, including dialog and music, you can deliver an AIFF interleaved stereo or Sound Designer II split stereo file burned to CD-R or recorded on DAT. If you mix to DAT, add a 2-pop to the beginning of the file. That is a short beep that occurs two seconds before the audio content begins, and it's used by the editor to align the audio with picture.
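As a rough illustration of the 2-pop, the following Python sketch builds a two-second leader that starts with a short beep and is otherwise silent, so the program audio appended after it begins exactly two seconds after the pop. The 1 kHz pitch, the one-frame length, the 30 fps frame rate, and the half-scale level are assumptions based on common practice, not requirements stated here; check with the editor before relying on them.

```python
import math


def two_pop(sample_rate: int = 48_000, fps: float = 30.0,
            tone_hz: float = 1_000.0, level: float = 0.5) -> list[int]:
    """Return 16-bit mono samples: one frame of tone, then silence up to 2 seconds."""
    tone_len = round(sample_rate / fps)      # one video frame of audio (assumed length)
    total_len = 2 * sample_rate              # program audio starts 2 s after the pop
    peak = int(level * 32767)                # scale to the 16-bit range
    tone = [round(peak * math.sin(2 * math.pi * tone_hz * n / sample_rate))
            for n in range(tone_len)]
    return tone + [0] * (total_len - tone_len)


leader = two_pop()
# mix_samples would be the finished mix; it is appended after the leader:
# delivery = leader + mix_samples
```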
Tascam DA-88 tape is the broadcast industry standard. I like it because the time code is stable, and eight tracks provide enough room for a full 5.1 surround mix on the first six tracks and a stereo mix on the last two tracks. Currently, though, it seems that Digidesign's Pro Tools format dominates the film industry; you can typically deliver a hard drive containing your Pro Tools sessions to the mixer. When I do this, I deliver a carefully prepared session that has all the audio elements organized and roughly balanced the way I envision them sounding in the final mix. The idea is that the sound mixer, who is not familiar with the material, should be able to bring all faders up to unity gain and have it sound reasonably good. From that point he or she can tweak levels, pans, and mutes without wasting time.
Game elements are very small and obviously used on computers, so I always deliver them on CD-R. With the advent of broadband connections, I occasionally transfer files over the Internet directly to the people who need them. Usually, though, it's less hassle to simply burn a disc and meet in person to go over the material. In any event, delivering the material in the format and sample rate required by the programmer is of paramount importance. I have found 16-bit/22 kHz AIFF format to be a safe bet for PC-based games, but the situation depends on the platform and the RAM resources available for sound. For example, the Sony PlayStation 2 console runs at 16/48. At the project's end, I deliver all the materials in high-resolution format; if the game gets ported to another platform, the audio elements can be converted from the masters.
Audio for the Web is still in a state of flux, with many different competing formats. RealAudio, MP3, QuickTime, Windows Media, and LiquidAudio are the dominant formats for Web playback, but by the time you read this, there could be more. If you are delivering audio to a developer, you might not need to worry about this. For example, if the audio is part of a Flash or Shockwave presentation, you can deliver stereo WAV files at full resolution, and the application will compress the data for you. However, listen to the results before the audio gets posted. Alternatively, you can get the appropriate software tool to convert your high-res files into a compressed format. Don't be afraid to tweak settings and compression rates until you get the smallest file that still sounds good.
THE MIX
When all the hard work of the sound designers, dialog editors, and composers is done, one task remains. This is the moment that strikes fear into the hearts of all involved, a process that can turn perfectly nice, sane audio professionals into bloodthirsty maniacs. This is The Mix. Actually, the mix can be the greatest source of satisfaction on a project. In the mix, you can hear, under the best listening circumstances, all the elements - music and sound, foreground and background - coalesce into a beautiful partner to the picture.
The mix can also be a session fraught with frustration, difficulty, and disappointment. Problems can crop up in a mix when there is too much to do within the time allotted, when sound elements are inadequate, and when people disagree about the decisions being made. Nearly all this can be chalked up to miscommunication or lack of preparation. How can you make sure that you have done everything possible to contribute to a smooth mixing process?
You never know ahead of time exactly how a mix will sound. But you can rely on experience and guesswork to imagine which elements will be where and use this information to create your tracks accordingly. Scanning the scene for likely dialog and music moments helps you to leave more space in your work for a cooperative mix.
For example, say you're working on a close-up of two soldiers conversing on the front lines while grenades and bombs are going off in the background. Without hearing the dialog, you know it will be the most important sound in the scene, and mixing explosions behind the conversation will greatly impair the intelligibility of the words. That type of foresight lets you select explosion sounds that fit the context: you want low-end sounds that are somewhat dull, generally low-key, and free of frequencies in the range of speech.
Marco D'Ambrosio, a film composer in the San Francisco Bay Area, makes sure never to reach a high point in his music during an explosion. Instead, he leaves room for the sound designer to put in a big boom, and he puts the musical climax afterward. Planning and imagination make the difference between sound components that complement one another and components that compete.
KEEP IT LOGICAL
Make sure your individual elements are logically grouped on as few tracks as necessary to keep the mix from becoming a massive, confusing mess. On the other hand, use as many tracks as necessary to give the mix engineer the flexibility he or she needs to remove or change individual elements. To some degree, that depends on the engineer's style, the amount of time for the mix, and the mixing system's capabilities. I've delivered as little as a stereo pair of hard sound effects and a stereo pair of ambience, and as many as 32 tracks of hard sound effects.
If you handle dialog and music as well as sound effects, never put them on the same tracks - the panning and EQ tend to be totally different for these elements, and combining them creates a conceptual headache. Ask the mix engineer what he or she wants and make it happen. Make sure your sound elements are clean, are well edited, and sync to picture nicely. Time is a precious commodity during a mix, and fixing problems comes at the expense of making the best mix possible.
Always bear in mind that dialog comes first. If a sound effect gets buried or a clarinet part can't be heard, it is unfortunate but not drastic. However, if a spoken line is unintelligible, the audience is robbed of the story. People are tuned to listen to human speech, and they get annoyed when they can't understand the words. Thus, music and effects are always subservient to dialog in a mix.
Dialog is so critical that the standard mixing convention in film is to use the center channel primarily for dialog. Separated from the music and effects that fill the rest of the space, the words are always heard in the middle of the screen. Dialog is sometimes compressed and limited to help it ride above the rest of the mix. You can also use EQ to make sure that other elements don't crowd the frequency range occupied by the words. If a line is still not reading well in the mix, turn everything else down a bit.
How do you make music and effects coexist peacefully? As mentioned earlier, one answer is to ensure that important events don't happen simultaneously. Another approach is to have the textures augment each other through contrast. For example, music that is tight and nicely percussive pairs well with languid, airy, continuous ambiences. Similarly, ambient, Brian Eno-esque soundtracks contrast well with a tighter, sparser, more event-driven approach to sound effects.
Music and effects can also inhabit different parts of the frequency spectrum. Even so, there will be times when the music feels buried or the effects' subtlety is lost. That is all part of the deal - it is most important to tell the story effectively and artfully. If that means whole swaths of sound or music get lowered or removed in the mix, so be it. The creative process of shaping and changing the sound culminates in the mix itself.
Limiters are useful for increasing the overall perceived level of an element or mix. I try to avoid using limiters when creating premixes or stems (partial mixes); I prefer to preserve as much dynamic range as possible. In the final mix, though, limiters are often used to shoehorn everything in. Kent Sparling, a rerecording mixer at Skywalker Sound, believes that limiting in the mix is a necessary evil. "It's a shame to have to make everything louder to compensate for a problem in the storytelling," he says.
MEDIA MATRIX
In terms of overall approach, the mix should always be considered from the listener's perspective. There is no point in creating a gigantic, wide-ranging mix for a game if it doesn't sound good on computer speakers. Game sound designers typically mix using low-end computer speakers and switch back to high-quality speakers only for reference. That ensures the sound designers have the same sonic experience as the end-users. Here are some other issues to consider when creating sound for different types of media.
Film. No doubt about it, mixing for film is a beautiful thing. There are no limitations on frequency response due to low sampling rates, and having a subwoofer ensures that dramatic rumbles and explosions will be felt as well as heard. In mixing for 5.1 surround, the center channel enhances the clarity of the dialog, which leaves more room in the left and right for music and effects. Finally, the surround channels can be used liberally to enhance the mood and apparent environment size by sending ambient elements, reverb washes, and occasional musical elements to the rear (not to mention flyover effects). The mix's dynamic range can be quite wide, keeping intimate scenes quiet and saving the volume for big, dramatic moments.
Game. Audio for games has improved substantially during the past few years. With increased budgets, enhanced audio hardware, and more recognition of the need for high-quality audio as a significant part of the game experience, things are getting better all the time. Yet problems remain. The necessities of data compression lead to compromises in both frequency response and dynamic range. A typical sampling rate of 22 kHz means no frequencies above 11 kHz, resulting in a lack of sheen, air, and crispness. The audio is also compressed, using some scheme of perceptual coding and variable bit depth. That step adds artifacts and reduces clarity, and it's particularly harsh on quiet sounds.
What's more, computers have loud fans, which raise the ambient noise floor and thus shrink the dynamic range the listener can actually hear at the computer. Finally, games tend to be loud and punchy. All those factors lead to mixes and individual sound files that make heavy use of normalization and limiting.
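Peak normalization, the simpler of those two processes, can be sketched in a few lines of Python. This is a minimal illustration that assumes floating-point samples in the range -1.0 to 1.0; the function name is hypothetical, and limiting, which reshapes the dynamics rather than applying one fixed gain, is not shown.

```python
def normalize(samples: list[float], target_peak: float = 1.0) -> list[float]:
    """Scale the whole sound by one gain so its loudest sample hits target_peak."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)          # silence stays silence
    gain = target_peak / peak
    return [s * gain for s in samples]


quiet_hit = [0.1, -0.5, 0.25]
loud_hit = normalize(quiet_hit)       # the loudest sample now sits at full scale
```

Because the gain is the same for every sample, normalization raises the noise along with the signal and leaves the dynamic range of the sound itself unchanged.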
Web. The computer-based limitations that apply to games also apply to Web-based presentations. An additional factor is download speed, which limits durations and calls for heavier data compression than games require. One general suggestion: turn technical limitations into design considerations by using very short sound files and loops.
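To see why short files and loops pay off, it helps to estimate how long a listener will wait for a sound. The Python sketch below is back-of-the-envelope arithmetic only: the 128 kbps encoding rate and 56 kbps connection in the example are assumed figures, and real downloads add protocol overhead that is ignored here.

```python
def compressed_bytes(seconds: float, encode_kbps: float) -> int:
    """Approximate size of a constant-bit-rate file (1 kbps = 1,000 bits per second)."""
    return round(seconds * encode_kbps * 1000 / 8)


def download_seconds(size_bytes: int, link_kbps: float) -> float:
    """Ideal transfer time over a link of the given speed, ignoring overhead."""
    return size_bytes * 8 / (link_kbps * 1000)


size = compressed_bytes(10, 128)      # a 10-second sound encoded at 128 kbps
wait = download_seconds(size, 56)     # fetched over a 56 kbps connection
```

In that example a 10-second sound takes longer to arrive than it does to play, whereas a 2-second loop encoded at the same rate arrives in about a fifth of the time and can repeat for as long as the page is open.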
END GAME
As you can see, the audio arts outside of music have incredible depth and richness of expression. When you open your ears to the sounds around you, the world becomes an unending symphony of footsteps on crunchy gravel, wind slithering through tall grasses, and cars flying by in the night. Good sound design is based on events in the real world, and the more attuned you are to sounds in your environment, the better you will be at providing sound for the virtual world.
Your job is always to enhance the visual elements in the project. Be tasteful and subtle, and your work will best serve its intended purpose.
Because the sound designer is part of a collaborative effort, you'll be working with other creative people who have their own perspectives on the project as well as technical folks who will be integrating your work. Developing good relationships with these people is crucial, and the key to making that happen is communication.
You must have several skills that are just as important as any artistic abilities you bring to the project. The first is listening to the needs of the other team members. You must really hear the creative desires of the project leader. At the outset, discuss his or her ideas and conceptions, starting with the broad viewpoint - "I want a film noir, 1930s New York City-style approach" - and working down to specifics - "It would be great to hear that taxicab with a 1934 Ford engine sound."
Take as much of their time as you can for the initial meetings, and bat ideas around until you all feel that you're on the same page. Take notes. Do more listening than talking but don't be afraid to chime in with your opinions. Be friendly and warm - that is critical to establishing their trust in you. Most importantly, remember that they are the bosses, so listen to them and satisfy their desires to the best of your ability. If you have an opinion that is different from theirs, don't be afraid to voice it - that is what they are paying you for - but if it is overruled, smile and move on. Whining, struggling, or playing the artiste guarantees that you will not work with them again.
Next, be prepared to clearly communicate what you will do, when you will do it, and what you need from others to do your job. The collaborative process fosters personal and artistic growth, as well as great finished material. The tricks to making it work are having a confident, nonarrogant demeanor and projecting the vibe that you really care about the project and respect the people you are working with.
Finally, make friends with the programmers; they are the people who have ultimate control over whether your hard work will sound right in the game or on the Web. Having clear technical conversations up front will save time, rework, and frustration near the end of the project.
Reading about sound design is fine, but to get into it, try it yourself. I've put together four sound-design missions that you can try at home.
Record an ordinary object. The object can be a glass clinking, a ruler vibrating on the edge of a table, or anything you have lying around. Transfer it to your computer and manipulate it by filtering, reversing, speeding up or slowing down, chopping off the attack, fading it in or out, or splitting it into numerous parts and rearranging them at will.
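Most audio editors offer these manipulations as menu commands, but it can be instructive to see how simple they are underneath. The Python sketch below treats a sound as a plain list of floating-point samples; all the function names are hypothetical, the speed change is a naive linear-interpolation resampler that shifts pitch along with duration, and filtering is left out.

```python
def reverse(samples: list[float]) -> list[float]:
    """Play the sound backward."""
    return samples[::-1]


def chop_attack(samples: list[float], n: int) -> list[float]:
    """Discard the first n samples, removing the initial transient."""
    return samples[n:]


def fade_in(samples: list[float], n: int) -> list[float]:
    """Ramp the gain linearly from 0 to 1 over the first n samples."""
    n = min(n, len(samples))
    return [s * (i / n) if i < n else s for i, s in enumerate(samples)]


def fade_out(samples: list[float], n: int) -> list[float]:
    """Ramp the gain linearly down to 0 over the last n samples."""
    return reverse(fade_in(reverse(samples), n))


def change_speed(samples: list[float], factor: float) -> list[float]:
    """Read through the source at `factor` times normal speed (2.0 = an octave up)."""
    out, pos = [], 0.0
    while pos < len(samples) - 1:
        i = int(pos)
        frac = pos - i
        out.append(samples[i] * (1 - frac) + samples[i + 1] * frac)
        pos += factor
    return out
```

Chaining them is where the fun starts: for example, `fade_in(reverse(chop_attack(clink, 200)), 500)` turns a glass clink into a soft swell, where `clink` stands for whatever recording you loaded.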
Make an ambience tape. Grab your portable recorder along with a stereo microphone (or two identical mics in a stereo configuration). Capture the sound of a place and time by recording five minutes from a still position: a busy city street, a restaurant, a cricket-filled twilight, a gentle rainstorm. Breathe quietly and be aware of clothing rustles and hand movements on the microphone or mic cable.
Transfer the recording to your computer, edit the best elements together, and make a two-minute loop. Make sure the loop point doesn't call attention to itself; the loop should be able to run smoothly forever. Remove any events that pop out (for example, a bus's hydraulic door opening, or hissing sounds in your city scene) and save them separately as ambient stingers that can be added in later if desired. If you are recording in nature, notice how often your recording is spoiled by humanity - airplanes flying overhead, distant traffic, and so on.
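One common way to hide a loop point is to crossfade the end of the recording into its beginning. The Python sketch below shows the idea with a linear crossfade on a list of samples; it is an illustration under simple assumptions (mono audio, a hypothetical function name), and for noisy ambiences an equal-power curve may sound smoother than the linear one used here.

```python
def make_loop(samples: list[float], fade_len: int) -> list[float]:
    """Blend the last fade_len samples into the first fade_len so the loop wraps smoothly.

    The result is fade_len samples shorter than the input.
    """
    if fade_len < 1 or 2 * fade_len > len(samples):
        raise ValueError("fade_len must be at least 1 and at most half the sound")
    head = samples[:fade_len]
    tail = samples[-fade_len:]
    blended = []
    for i in range(fade_len):
        t = i / fade_len                      # 0.0 at the loop point, rising toward 1.0
        blended.append(head[i] * t + tail[i] * (1 - t))
    # The loop now begins where the old tail began and fades into the original head.
    return blended + samples[fade_len:-fade_len]


loop = make_loop([float(n) for n in range(10)], fade_len=2)
```

The last sample of the loop is followed, on repeat, by the sample that originally came right after it in the recording, which is why the seam stops calling attention to itself.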
Collect a lot of something. Record multiple versions of the same thing, such as a door opening and closing, a car starting and stopping, or footsteps. Try recording them in different spaces and at different distances from the mic. Notice how much more of the room the mic seems to pick up than your ears do. Also notice how the high-frequency content of the sound tends to roll off as a function of the distance from the microphone. By adding reverb and judiciously filtering the high end of the sound, you can simulate distance in a close-miked recording. You are now on the road to creating your own sound library.
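The filtering half of that distance trick can be sketched with about the simplest low-pass filter there is. This Python example is a one-pole filter followed by a level drop; the 2 kHz cutoff, the 44.1 kHz sample rate, and the gain value are arbitrary assumptions to be tuned by ear, and the reverb is not shown.

```python
import math


def one_pole_lowpass(samples: list[float], cutoff_hz: float,
                     sample_rate: int = 44_100) -> list[float]:
    """Gently roll off content above cutoff_hz (6 dB per octave)."""
    a = math.exp(-2 * math.pi * cutoff_hz / sample_rate)   # feedback coefficient
    out, y = [], 0.0
    for x in samples:
        y = (1 - a) * x + a * y                            # smooth toward the input
        out.append(y)
    return out


def push_back(samples: list[float], cutoff_hz: float = 2_000.0,
              gain: float = 0.5) -> list[float]:
    """Make a close-miked sound seem farther away: duller and quieter."""
    return [s * gain for s in one_pole_lowpass(samples, cutoff_hz)]
```

Lowering the cutoff and the gain together moves the sound farther off; comparing the result with your own distant recordings of the same source is a good way to calibrate both.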
Make it from other things. Replicate the sound of something familiar by using other objects' sounds and editing and layering them together to create the final product. For example, try to create the sound of a 35 mm autowind camera by recording similar sounds that can be put together to make the target sound. Start by analyzing the components of the camera sound: the click of the button, the rapid open and close of the shutter, the whine of the autowind motor whisking the film along, and a final clunk as the motor stops.
Now create your own version without using a camera. The button press might come from a light switch or a button on a blender. The shutter could be a pair of scissors opening and closing quickly. The autowind could be the servomotor on a car sideview-mirror adjuster or an electric razor. The final clunk might come from snapping the lid onto an aspirin bottle. Load it all into a multitrack digital audio program. Then edit, pitch shift, and slip and slide the elements against each other until you have something that mimics the rhythm and general feeling of an autowind camera. Presto!
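Slipping and sliding elements against each other amounts to giving each layer a start offset and summing the results. The Python sketch below shows that step on lists of samples; the function name and the offsets in the example are hypothetical, and the summed output may need its level reduced if the layers overlap heavily.

```python
def layer(elements: list[tuple[int, list[float]]]) -> list[float]:
    """Sum several sounds, each starting at its own offset (in samples)."""
    length = max((offset + len(sound) for offset, sound in elements), default=0)
    mix = [0.0] * length
    for offset, sound in elements:
        for i, s in enumerate(sound):
            mix[offset + i] += s
    return mix


# Tiny stand-ins for the recorded pieces: (start offset in samples, samples).
click = (0, [1.0, 1.0])
shutter = (1, [0.5, 0.5])
camera = layer([click, shutter])
```

Changing an offset by a few hundred samples and listening again is the code equivalent of dragging a region in the edit window.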
In planning the sound-design process, the nature of the delivery medium must be taken into account. Creative, appropriate sound is important in all media, but each medium has overriding principles that define your approach.
Film and video. Film and video are linear media. The audio is always synchronized to the video and plays back the same way every time. The sound quality is high, and the processes used to marry the audio to the picture are time tested and well understood. Technological advances and headaches occur in various parts of the process, but in general, certain standards allow for fairly smooth production. As a result, the primary sound focus for these media is to create the highest quality, most detailed work possible within the time and budget.
Games. Games are about interactivity. Some sections of games are expository, linear presentations that tend to follow a film approach. But the main areas of the game unfold based on user input, changing direction from moment to moment. As a result, the elements are developed in small pieces that are carefully named and organized, and put back together in real time like an ever-morphing jigsaw puzzle. The programmer plays a huge part in carrying out the vision of the sound designer by making sure that the right sounds trigger at the correct times. When programmers have enough time and sensitivity to audio issues, niceties such as pan and volume changes are included, enhancing the experience.
Game technology changes from game to game and platform to platform. Detailed knowledge of the platform's capabilities and programming system is critical for the sound designer's planning, but there is one guarantee: you will not have the resources you want. Graphics always take priority over sound, both in terms of RAM budgets and labor resources. As a result, every sound must be as short as possible while still conveying excitement and interest.
Sounds are also made smaller by decreasing the sampling rate and applying data compression. Lowered sampling rates mean decreased high-frequency response, so qualities such as airiness or sheen in metallic sounds and cymbals disappear. Reverb tails at the end of sounds always get cut, though many gaming platforms now possess built-in DSP effects that can bring some processing back in real time. Fortunately, hardware improvements and greater understanding that sound is a critical part of an immersive, satisfying game experience mean that the situation becomes less daunting each year.
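The trade-off between size and brightness is easy to quantify for uncompressed audio. The Python sketch below computes the storage a sound needs and the highest frequency it can carry at a given sampling rate; the function names are hypothetical, the figures ignore file headers and any data compression, and 22,050 Hz is assumed as the exact value behind the usual "22 kHz" shorthand.

```python
def pcm_bytes(seconds: float, sample_rate: int,
              bit_depth: int = 16, channels: int = 1) -> int:
    """Storage needed for uncompressed PCM audio, excluding any file header."""
    return round(seconds * sample_rate) * (bit_depth // 8) * channels


def highest_frequency(sample_rate: int) -> float:
    """Upper limit of the audio band (half the sampling rate)."""
    return sample_rate / 2


master = pcm_bytes(2.0, 44_100, channels=2)   # two-second stereo master
in_game = pcm_bytes(2.0, 22_050, channels=1)  # same sound, mono at half the rate
```

Halving the rate and folding to mono cuts the RAM cost to a quarter, at the price of everything above roughly 11 kHz, which is where the air and sheen live.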
Web. Interactive audio design for the Web is currently about making the most of a few short, sparse audio events rather than creating an immersive audio experience. That will change over time, of course, with streaming technologies such as QuickTime, streaming MP3, and Windows Media Format. When enough consumers have a steady supply of continuous, reliable data streams, opportunities for creating rich, deep sound experiences on the Web will abound.