Producing Pro Podcasts


Web Clips: Watch a video tutorial that shows you how to edit and clean up audio for Podcasts


Anyone with a USB mic and a computer can record a Podcast, but if you want to produce a product that will hold its own with the radio shows and other top-shelf downloadable content available on the Web, you'll need both studio chops and an attention to detail. Sonic problems such as uneven levels, background noise, and distorted audio will all have a negative impact on your listeners no matter how good the content is.

As the producer of Electronic Musician's twice-monthly “EM Cast,” I've picked up a lot of useful Podcast production techniques. In this article, I'll try to pass on as much advice as I have space for. For additional information, see David Battino's “The Art of Podcasting” in the December 2005 issue of EM (available online). That story covers many general Podcasting issues, such as putting together RSS subscription feeds and promoting your product, that won't be covered here. I will focus on the production side: recording, editing, and mixing.

Your most important tool for Podcast production will be a digital audio sequencer or multitrack audio editor. I recommend fully featured applications like Digidesign Pro Tools, Apple Logic Studio (or Logic Express 8), Steinberg Cubase, MOTU Digital Performer, Cakewalk Sonar, and Ableton Live. You could also do quite well with Apple GarageBand, which has a lot of Podcasting features built in.

Talk to Me

All Podcasts contain at least some voice-overs or other spoken-word elements, and many consist solely of such content. Interviews are one of the best sources of material if you're putting together an informational or topical Podcast.

If you're recording an in-person interview or discussion for your Podcast, I would recommend using a digital 2-track recorder with a stereo mic. There are a number of good, relatively inexpensive models on the market by manufacturers such as M-Audio, Edirol, and Zoom. In a slightly higher price category, you might consider Sony's new PCM-D50, which is a lower-priced alternative to the company's high-end PCM-D1, but which offers similar functionality.

If you have a laptop with recording software and a good USB mic, you could record with that as well. Make sure you can get enough level with the mic you're using; some USB mics do not provide much gain. Another option is to record using a laptop with a portable audio interface and a conventional dynamic or condenser mic.


Fig. A: The Info page in iTunes, where you can edit your ID3 tags.

For both interview and voice-over recording, I recommend using 24-bit resolution whenever possible (at a sampling rate of 44.1 or 48 kHz). Although your final product will end up as an MP3, you want to get the best-quality source recording possible. The increased dynamic range and improved signal-to-noise ratio of 24-bit audio (as compared with 16-bit) give you the luxury of not having to worry as much about levels that are a little too low. But even at 24-bit, you should always strive to capture levels that are as high as possible without clipping. During interviews, be constantly ready to adjust your recording level down, when necessary, to avoid digital overs. Digital distortion cannot be fixed in the mix, so if you have to err with your level setting, do so to the lower side.
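
As a rough back-of-the-envelope check on the 16-bit versus 24-bit comparison, the theoretical dynamic range of an ideal N-bit converter is commonly approximated as 6.02 × N + 1.76 dB. The short Python sketch below simply applies that textbook formula. The function name is made up for illustration, and real-world converters and mic preamps fall well short of these ideal figures.

```python
def ideal_dynamic_range_db(bits):
    """Textbook approximation for an ideal N-bit quantizer driven by a full-scale sine."""
    return 6.02 * bits + 1.76


for bits in (16, 24):
    print(f"{bits}-bit: about {ideal_dynamic_range_db(bits):.0f} dB")
```

The roughly 48 dB of extra theoretical headroom is what lets you record a bit conservatively at 24-bit without burying the voice in quantization noise.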

Try to set up your interview in as quiet an environment as possible. Steady or intermittent background noise can wreck your final product. (Beware of air conditioners!) Although there are some methods for removing noise after the fact (more on this later), by far the best route is to get a clean initial recording.

Also pay attention to the acoustics of the space you're recording the interview in. If it's really reverberant, you can minimize that by holding the mic close to the interviewee, and then bringing it close to you for your questions. Try to move it in between questions and answers so that as little handling noise as possible occurs when you or your interviewee is talking.

If you can, do a test recording before you actually start the interview to make sure everything is sounding good. Bring headphones so that you can accurately judge your test recording. Make sure your settings are correct and that the machine is actually recording when your interview starts. There's nothing worse than finishing an interview and discovering that the recorder was in pause mode, not record, for the entire conversation.

Call Me

Telephone interviews are an important part of many Podcasts. They provide the opportunity to bring in the opinions and thoughts of people from anywhere in the world. But recording phone interviews with good fidelity is not easy. While recording phone-based Podcast interviews for EM, I've experimented with a number of different methods. Here are three that I've used, presented in order of preference.

Miking the speakerphone

I've gotten good results close-miking a speakerphone for the caller's voice while separately miking my own voice, and recording the output of both mics onto separate tracks of my recording software. This allows me to use high-quality mics and get pretty good separation. The recording of the close-miked speakerphone sounds surprisingly good, as long as I get good levels without turning the speaker up to the point of distortion. Overall, it's the best method I've found.


Chuck Dahmer
Fig. 1: This drawing shows the basic setup for the miking-the-speakerphone method of phone-interview recording.

To do this, position yourself facing the speakerphone so that your mic is about 1.5 feet from it (see Fig. 1). If you get too close, you'll have too much bleed between your mic and the speakerphone mic, but if you get too far away, the person on the other end of the line won't be able to hear you talk. (The only way he or she will hear you is through the built-in mic in the speakerphone.)

Point your vocal mic away from the speakerphone to minimize leakage. Use cardioid or supercardioid mics to reduce both bleed and room noise. As when you record voice-overs, a pop screen on your mic is a must for reducing plosives (“popped” p, b, and other consonant sounds). But even with such a device, some exaggerated plosives will inevitably end up on your recording, and you'll have to lessen or remove them when you're editing (more about this later).

Because there's a degree of leakage in this method, it's best to edit out or mute the audio on your track when you're not talking, and do the same on the other party's track. This will clean up the sound immensely. The only places where you may run into trouble due to bleed are those moments when both parties talk at once, because then you can't mute either track. I've never found this to be a deal breaker, but it's one of the downsides of this method, along with the aforementioned editing.


Skype

Another option for telephone-interview recording is to use Skype, an Internet-based phone system that you can download for Mac and Windows. Skype calls can be recorded using optional third-party software. You don't use an actual phone to talk over Skype. Instead, you use a microphone connected to your computer's sound card or audio interface. You listen through your sound card's output, preferably on headphones. The biggest problem with Skype is that its audio quality, though fine for talking, often sounds muffled and processed when recorded. It's also variable: sometimes you get a good connection and sometimes you don't.

You can purchase inexpensive recording software for Skype. On the Mac side, there's Ecamm Network Call Recorder for Mac. It records Skype calls with excellent clarity and virtually no background noise. Call Recorder records to QuickTime format and comes with conversion tools for extracting the audio files from the QuickTime files into various audio formats and onto separate tracks. Windows users can try programs such as Skylook and PowerGramo for recording Skype calls.

Skype-to-Skype calls (calls between Skype users) are free, and a Skype-to-phone plan, with which you can use Skype to call any U.S. or Canadian landline or cell phone, costs $3 per month.

Phone taps

Phone taps are an easy way to record phone calls, although many models have grounding issues that prevent them from being plugged into a computer sound card or interface (or any AC-powered device) without generating unacceptable levels of hum. In those cases, you need a battery-powered portable digital recorder to get clean results.

On the plus side, you can get into the game very inexpensively with units such as the Radio Shack Mini-Recorder Control, which plugs between your handset and phone, and the Rolls Phone Patch II, which connects between the phone and the wall.

Many phone taps output a mono signal through an ⅛-inch mono connector. That in itself is problematic, because virtually all analog inputs on digital recorders are stereo. As a result, it's necessary to use an adapter cable that splits the mono signal and outputs on a stereo connector. The guerrilla work-around is to plug the phone tap's jack in halfway, which provides the same result, albeit with a precarious connection.

Another problem with phone taps is that they output you and the person you're talking to at different levels (generally the other person is significantly quieter). The quality is okay, but you'll have to go in later and even out those levels in a digital audio program.

At the higher end of the phone tap spectrum is the JK Audio Voice Path, which is designed to output into your computer's sound card, thus obviating the need for a hardware digital recorder. Because it's designed for use with a computer, it shouldn't have the hum problems some of the other taps do when connected to AC-powered devices.

More Phone Solutions

Beyond those three methods, there are other phone-recording options. For example, Parliant Phone Valet (Mac) is a system that includes a phone interface and software. It digitally records from the phone (and is also compatible with most VOIP services) and has a built-in limiter that allows it to produce recordings in which both parties on the phone call are at approximately equal volume. The system costs well under $200, and for an additional $50, the company offers a Podcast package that also includes BIAS Soundsoap 2 noise-reduction software and Peak Express, a 2-track editor. I haven't used Phone Valet, so I can't offer opinions on its performance.

If you have a larger budget, the best possible solution for phone recording is probably a broadcast phone interface such as the JK Audio Broadcast Host. It plugs between the phone line and the phone itself and has a number of I/O options (XLR and ⅛-inch). The ⅛-inch jack splits your voice onto one side and the other person's onto the other. You have separate volume controls for each.

Finding Your Voice

Besides interviews, many Podcasts include introductory and transitional voice-overs (VOs) as well as bumpers, which are transitions (usually musical) that go before and after segments.

VOs can be recorded in a multitrack audio program and then placed in their proper spot for the final mix of the Podcast. Record VOs using a good-quality cardioid mic (condenser or dynamic). Use a pop screen to minimize plosives, and place yourself within a few inches of the mic (a little closer than if you were recording yourself singing). Getting close to the mic lets you take advantage of the proximity effect to help give you that larger-than-life “radio voice.”

As with in-person interviews, record your VO in as quiet an environment as possible. Do not use figure-8 or omni polar patterns, because they will pick up a lot of room sound. Set your mic pre so that you have plenty of gain, and record at a healthy level without clipping. You don't want to have to boost the volume too much when mixing, because this can increase noise.

Bumper music can come from a number of sources: original music that you already have on hand, music you compose specifically for the Podcast (bumper music can be very short — under 15 seconds), or royalty-free stock music, for example the “jingle tracks” that come with Apple Logic. Do not use another composer's copyrighted material unless you have express permission, or you'll be violating copyright law.

To give you an example of the construction of a Podcast, here's the sequence of events for a hypothetical two-interview Podcast:

  • Intro music with VO. Music is “ducked” (temporarily reduced in level) when VO starts.
  • Music fades out, then introductory VO tells what will be in the Podcast.
  • Music and VO to introduce interview 1.
  • Interview 1.
  • Bumper music coming out of interview 1.
  • VO to introduce interview 2.
  • Interview 2.
  • Ending music with VO. Music is ducked when VO starts.
  • Ending music fades out.

Of course, your Podcast could be a lot simpler. It's totally up to you.

Hum Bug

If any of your interviews or VOs end up with significant background noise, consider using noise-reduction software. There are a number of products on the market, including plug-ins like Waves X-Noise and the standalone RX from iZotope. One affordable solution is BIAS Soundsoap 2 (Mac/Win), which lets you remove various noise types using automatic or manual settings. Some host programs, like Apple Logic Pro 8 and Sony Creative Software Sound Forge, come with noise-reduction plug-ins built in. These products do an excellent job of detecting and removing noise, but use them judiciously: if turned up too high, software-based noise reduction can make a voice sound very unnatural.

If you don't have noise-reduction software, there are often situations where you can remove problem noise using EQ. If, for instance, the offensive noise is steady at a fixed frequency — say, a 60-cycle hum — you can use an EQ plug-in to notch it out. Set your EQ for as narrow a bandwidth as possible (use the notch filter setting if there is one), set it for 60 Hz, and cut it to the maximum amount allowed. If you're not sure what the frequency of the offensive noise is, set your notch and sweep through the frequencies until you hear the noise lessen. You may not be able to remove all of it this way, but you can probably lessen it.
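
To make the notch idea concrete, here is a minimal Python sketch of a 60 Hz notch built as a standard biquad filter (the coefficient formulas follow the widely used Audio EQ Cookbook form). It is not how any particular EQ plug-in works internally. The function names, the Q value of 30, and the sine-wave test signals are all assumptions chosen for illustration, and samples are assumed to be floats between -1 and 1.

```python
import math


def notch_coefficients(freq_hz, sample_rate, q=30.0):
    """Biquad notch coefficients, normalized so the leading denominator term is 1."""
    w0 = 2.0 * math.pi * freq_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)  # higher Q means a narrower notch
    cos_w0 = math.cos(w0)
    a0 = 1.0 + alpha
    b = (1.0 / a0, -2.0 * cos_w0 / a0, 1.0 / a0)
    a = (-2.0 * cos_w0 / a0, (1.0 - alpha) / a0)
    return b, a


def biquad(samples, b, a):
    """Run a direct-form I biquad over a list of float samples."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[0] * y1 - a[1] * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def sine(freq_hz, sample_rate, seconds):
    count = int(sample_rate * seconds)
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate) for i in range(count)]


SAMPLE_RATE = 44100
b, a = notch_coefficients(60.0, SAMPLE_RATE)

# Stand-ins for hum and for voice-range content.
hum = sine(60.0, SAMPLE_RATE, 2.0)
voice_band = sine(1000.0, SAMPLE_RATE, 2.0)

# Measure only the second half, after the filter has settled.
hum_out = biquad(hum, b, a)[SAMPLE_RATE:]
voice_out = biquad(voice_band, b, a)[SAMPLE_RATE:]

print(f"60 Hz level after notch:  {rms(hum_out):.4f}")
print(f"1 kHz level after notch:  {rms(voice_out):.4f}")
```

A narrow notch like this takes a moment to settle, which is why the sketch measures only the second half of each test tone. Real hum usually has harmonics as well (120 Hz, 180 Hz, and so on), which may need notches of their own.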

Another way to go, if the noise is of the low-frequency variety, is to filter out frequencies below about 100 Hz using an EQ plug-in. Your settings will depend on the frequency of the noise and the effect of the filter on your program material. You may lose a bit of bottom end from the voices in your Podcast, but if it gets rid of annoying noise, it's well worth it. Just don't overdo it and make the voices sound tinny.
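
If you're curious how much bottom end a 100 Hz filter actually costs you, the following Python sketch computes the frequency response of a second-order high-pass biquad (again using the common Audio EQ Cookbook coefficient formulas) at a few frequencies. The function names, the Butterworth-style Q of about 0.707, and the list of frequencies are assumptions for illustration. The filter in your own EQ plug-in may have a steeper or gentler slope.

```python
import cmath
import math


def highpass_coefficients(cutoff_hz, sample_rate, q=1.0 / math.sqrt(2.0)):
    """Second-order high-pass biquad coefficients, normalized so a[0] is 1."""
    w0 = 2.0 * math.pi * cutoff_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    cos_w0 = math.cos(w0)
    a0 = 1.0 + alpha
    b = ((1.0 + cos_w0) / 2.0 / a0, -(1.0 + cos_w0) / a0, (1.0 + cos_w0) / 2.0 / a0)
    a = (1.0, -2.0 * cos_w0 / a0, (1.0 - alpha) / a0)
    return b, a


def gain_db(b, a, freq_hz, sample_rate):
    """Filter gain in dB at one frequency, evaluated on the unit circle."""
    z_inv = cmath.exp(-1j * 2.0 * math.pi * freq_hz / sample_rate)
    numerator = b[0] + b[1] * z_inv + b[2] * z_inv * z_inv
    denominator = a[0] + a[1] * z_inv + a[2] * z_inv * z_inv
    return 20.0 * math.log10(abs(numerator / denominator))


SAMPLE_RATE = 44100
b, a = highpass_coefficients(100.0, SAMPLE_RATE)

for freq in (30, 60, 100, 150, 200, 1000):
    print(f"{freq:>5} Hz: {gain_db(b, a, freq, SAMPLE_RATE):6.1f} dB")
```

With these settings the response is about 3 dB down at the 100 Hz cutoff and essentially flat by 1 kHz, so rumble well below the cutoff is cut hard while most of the voice range is left alone.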

Sweating the Details


Fig. 2: The arrows point to the breaths in this waveform display from an interview track.

When you listen back to interviews, you'll likely notice a lot of glitches, including loud plosives, clicks, overly loud breaths, and inarticulate phrasing (such as “um's,” “uh's,” and excessive “you know's”). Unless you're going for an audio vérité, warts-and-all production, you'll want to edit out most, if not all, such anomalies.

I'll tackle breaths first. There are a number of ways to deal with them, including deleting them completely, or reducing their volume with automation or the Change Gain command (or equivalent). The key is to do it so that the result sounds natural. After a while, you'll be able to quickly spot breaths in the waveform display just by looking at them (see Fig. 2).

One effective method for getting rid of breaths in a natural-sounding way is to copy a little bit of room tone (ambient noise during a pause when nobody is speaking), and then paste it in to replace the breath. That way, the breath is gone, but there's still a pause between the words. If you simply cut the breath out and delete the space where it was, the speaker's phrasing may seem rushed and it may sound like you made an edit, which you want to avoid whenever possible.

If you simply reduce the volume of a breath all the way to silence, you'll hear an unnatural dropout at that point because there won't be any ambient noise. That is why using room tone is so useful (see Web Clip 1). MOTU Digital Performer has a feature called Smooth Audio Edits that places room tone (which you can designate or it can find for you) in between edit points automatically, with crossfades added.
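
For readers who like to see the idea spelled out, here is a minimal Python sketch of the room-tone replacement described above. It is not how Digital Performer or any other program implements it. The function name, the 64-sample fade length, and the representation of audio as a plain list of float samples are all assumptions made for illustration.

```python
def replace_with_room_tone(samples, start, end, room_tone, fade=64):
    """Return a copy of samples with samples[start:end] replaced by looped room tone.

    A short linear crossfade is applied at each boundary so the patch does not click.
    """
    length = end - start
    if length <= 0 or not room_tone:
        return list(samples)
    fade = min(fade, length // 2)

    # Loop the room tone so that it covers the whole breath.
    patch = [room_tone[i % len(room_tone)] for i in range(length)]

    out = list(samples)
    for i in range(length):
        if i < fade:
            mix = (i + 1) / (fade + 1)        # fade from the original into room tone
        elif i >= length - fade:
            mix = (length - i) / (fade + 1)   # fade from room tone back to the original
        else:
            mix = 1.0
        out[start + i] = (1.0 - mix) * samples[start + i] + mix * patch[i]
    return out


# Toy example: quiet ambience, a loud "breath", then quiet ambience again.
track = [0.01] * 100 + [0.5] * 200 + [0.01] * 100
tone = [0.01] * 50
cleaned = replace_with_room_tone(track, 100, 300, tone, fade=10)
```

The pause keeps its original length, so the speaker's phrasing is unchanged. If the copied room tone is much shorter than the breath, the looping can become audible, so it's worth grabbing the longest clean stretch you can find.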

If you want breath removal done automatically, Waves makes a plug-in called DeBreath, which is part of the company's Vocal Bundle. DeBreath is designed to automatically detect breaths and allow you to separate them from the program material.

Pop Goes the P


Fig. 3: The shaded area shows the plosive part of the waveform, which is being reduced using Pro Tools volume automation.

The noises made by loud plosives are distracting and should be reduced. Typically, you can't just cut them out, because the word will not sound right with a consonant sound removed completely. Reducing them is the best strategy. You'll find that it's pretty easy to spot plosives due to their distinctive, dense waveshape (see Fig. 3).

Here are the steps for reducing plosives:

  1. Find the offensive plosive and zoom in on it.
  2. Select the plosive part only (check by auditioning your selection).
  3. Reduce the level of the plosive using either volume automation or the Change Gain command in your recording or editing software. Either way, reducing the level of the plosive by about 6 to 8 dB will usually do the trick, though sometimes you have to reduce it quite a bit more.
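
The arithmetic behind step 3 can be sketched in a few lines of Python. A change in decibels corresponds to multiplying each sample by 10 raised to the power of dB/20, so a 6 dB cut roughly halves the amplitude. The function names and the toy sample values are assumptions for illustration. Your editor's Change Gain command does the equivalent for you.

```python
def db_to_gain(db):
    """Convert a level change in decibels to a linear amplitude multiplier."""
    return 10.0 ** (db / 20.0)


def change_gain(samples, start, end, gain_db):
    """Return a copy of samples with the selection [start, end) scaled by gain_db."""
    factor = db_to_gain(gain_db)
    out = list(samples)
    for i in range(start, end):
        out[i] *= factor
    return out


for cut in (-6.0, -8.0):
    print(f"{cut:+.0f} dB multiplies the amplitude by {db_to_gain(cut):.3f}")

# Toy example: tame a loud "plosive" in the middle of a quiet phrase.
phrase = [0.1, 0.1, 0.9, 0.8, 0.9, 0.1, 0.1]
tamed = change_gain(phrase, 2, 5, -6.0)
```

This sketch changes the level abruptly at the edges of the selection. In practice, volume automation with short ramps on either side gives a smoother result.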

Fig. 4: By cutting out the center (shaded) portion, you'll be joining the left and right segments at a zero-crossing.

No matter what you're editing, always apply short crossfades at the edit point. Often crossfades will smooth out your edits. But if you have a click or pop at your edit point and crossfading doesn't help, you can often clean it up by changing your edit point slightly so that it happens on a zero-crossing, which is the point where the signal crosses from negative into positive or vice versa. Here's what you do: zoom in on the waveform until you find the zero-crossings for both sides of the edit. Then make your cuts so that the two newly joined sections meet at a zero-crossing (see Fig. 4). That will usually take care of the pop.
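
Most editors have a command or snap setting that finds zero-crossings for you, but the underlying search is simple. The Python sketch below looks outward from a proposed edit point for the nearest place where the waveform changes sign. The function name and the sample values are hypothetical, and audio is assumed to be a list of float samples.

```python
def nearest_zero_crossing(samples, index):
    """Return the index closest to `index` where the signal crosses zero.

    A crossing at i means samples[i - 1] and samples[i] have different signs,
    or samples[i] is exactly zero. Returns None if there is no crossing.
    """
    def is_crossing(i):
        if i <= 0 or i >= len(samples):
            return False
        return samples[i] == 0.0 or (samples[i - 1] < 0.0) != (samples[i] < 0.0)

    # Search outward from the requested position, one sample at a time.
    for offset in range(len(samples)):
        for i in (index - offset, index + offset):
            if is_crossing(i):
                return i
    return None


wave = [0.5, 0.3, 0.1, -0.2, -0.4, -0.1, 0.2, 0.4]
left_cut = nearest_zero_crossing(wave, 4)
right_cut = nearest_zero_crossing(wave, 7)
```

For the cleanest join, it also helps if both sides of the edit cross zero in the same direction (both rising or both falling), which this sketch does not check.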

Level Playing Field

Another area that you're likely to have to pay a lot of attention to is levels. Human beings don't talk at perfectly even volume over the course of a conversation, so spoken-word recordings are going to have level variations you need to address. Whether it's voices trailing off at the end of sentences, loud interjections, or other inconsistencies related to volume, you're sure to find plenty of points during the course of an interview or voice-over where you need to make adjustments. You don't want the listeners to have to adjust their volume controls while listening to your Podcast.

A digital audio sequencer's volume automation is a great way to even out levels. You can also use compression for controlling dynamics, but be careful: too much compression can bring up the level of noise on spoken-word tracks. I generally lightly compress my voice-over and interview tracks using a plug-in that doesn't add a lot of color, such as Waves' Renaissance Compressor.
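
To see why heavy compression brings up noise, consider the simplified static model below, sketched in Python. Above the threshold, every extra decibel of input yields only 1/ratio decibels of output, and makeup gain is then added to everything, including the noise floor. This is a textbook simplification, not a model of Renaissance Compressor or any specific plug-in. The threshold, ratio, makeup gain, and level figures are made-up values for illustration.

```python
def compressed_level_db(level_db, threshold_db=-18.0, ratio=4.0, makeup_db=0.0):
    """Output level in dB for a simple hard-knee compressor with makeup gain."""
    if level_db > threshold_db:
        level_db = threshold_db + (level_db - threshold_db) / ratio
    return level_db + makeup_db


loud_word_db = -6.0     # a loud interjection
noise_floor_db = -60.0  # room noise between words

# 4:1 compression with 9 dB of makeup gain to restore the loud word's level.
loud_after = compressed_level_db(loud_word_db, makeup_db=9.0)
noise_after = compressed_level_db(noise_floor_db, makeup_db=9.0)

print(f"Loud word:   {loud_word_db:.0f} dB -> {loud_after:.0f} dB")
print(f"Noise floor: {noise_floor_db:.0f} dB -> {noise_after:.0f} dB")
```

In this example the loud word ends up where it started, but the noise between words is 9 dB louder, which is the trade-off to keep in mind when you're tempted to compress harder.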

Mix It Down


Fig. 5: Place the various elements into your sequencer's timeline and arrange them as you wish. Then record transitional VOs before mixing.

Rather than dealing with the aforementioned editing issues during the final mix of my Podcasts, I find it easier to edit and mix my interviews separately and then import the mixed stereo files into the final Podcast mix and place them where I want in the timeline along with any transitional music elements (see Fig. 5).

I generally wait to do any introductory, ending, or transitional VO segments until I've got all these other elements in place. You have better context for writing or improvising your voice-over once you have the Podcast's structure fully fleshed out.

If you've premixed your interviews, the biggest issues you'll be facing in your final mix will be evening out levels between the various elements. Be particularly careful when you're mixing a voice-over that's going over a piece of music. Use volume automation to duck the music down significantly for the duration of the voice-over, and then bring the level back up if the music continues after the VO. Typically, you'll want to smoothly fade out music elements to end them.
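
The shape of a ducking move can be described as a simple gain envelope, sketched here in Python: full level, a short ramp down just before the voice-over, a reduced level while it plays, and a ramp back up afterward. The function name, the 12 dB of ducking, and the half-second ramps are assumptions for illustration. The right amounts depend on the music and the voice, so let your ears decide.

```python
def duck_gain_db(t, vo_start, vo_end, duck_db=-12.0, ramp=0.5):
    """Music gain in dB at time t (seconds) for a voice-over from vo_start to vo_end."""
    if t <= vo_start - ramp or t >= vo_end + ramp:
        return 0.0                                        # music at full level
    if t < vo_start:
        return duck_db * (t - (vo_start - ramp)) / ramp   # ramping down
    if t > vo_end:
        return duck_db * ((vo_end + ramp) - t) / ramp     # ramping back up
    return duck_db                                        # ducked under the voice


# Voice-over runs from 5 to 15 seconds.
for t in (0.0, 4.75, 5.0, 10.0, 15.25, 16.0):
    print(f"{t:5.2f} s: {duck_gain_db(t, 5.0, 15.0):6.1f} dB")
```

Drawing the same shape with volume automation breakpoints in your sequencer gives you the equivalent result.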

When you think your mix is ready to go, double-check the consistency of the levels between the beginning, middle, and end of your Podcast. I do this by setting the volume of my monitors to a comfortable level for the intro section, and playing back random snippets from various parts of the Podcast (without touching the volume control) to make sure the levels are even.

Overall, you want your Podcast to be at a healthy level without clipping. Pay attention to gain-staging protocol: don't raise your master fader over 0 dB; boost your individual channel faders instead. When you've finally got your mix sounding the way you want, bounce it to disk (still at 24-bit).
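
As a supplement to listening (not a replacement for it), you could also measure the mix. The Python sketch below reports the RMS level of successive chunks of a mix, plus the overall peak, so that large jumps between sections or peaks at full scale stand out. The function names and the 10-second chunk length are assumptions for illustration, samples are assumed to be floats between -1 and 1, and RMS is only a crude stand-in for perceived loudness.

```python
import math


def rms_db(samples):
    """RMS level in dB relative to full scale (1.0)."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10.0 * math.log10(mean_square) if mean_square > 0.0 else float("-inf")


def peak_db(samples):
    """Peak level in dB relative to full scale; 0 dB means the signal touches full scale."""
    peak = max(abs(s) for s in samples)
    return 20.0 * math.log10(peak) if peak > 0.0 else float("-inf")


def segment_levels_db(samples, sample_rate, segment_seconds=10.0):
    """RMS level of each consecutive segment of the mix."""
    size = int(sample_rate * segment_seconds)
    return [rms_db(samples[i:i + size]) for i in range(0, len(samples), size)]


def level_spread_db(levels):
    """Difference between the loudest and quietest non-silent segments."""
    audible = [level for level in levels if level != float("-inf")]
    return max(audible) - min(audible) if audible else 0.0


# Toy example at a 100 Hz "sample rate": a louder intro followed by a quieter interview.
mix = [0.5] * 1000 + [0.25] * 1000
levels = segment_levels_db(mix, 100, segment_seconds=10.0)
print([round(level, 1) for level in levels], round(level_spread_db(levels), 1))
```

Segments that contain long pauses will read lower than they sound, so treat the numbers as a prompt to go back and listen rather than as a verdict.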

One Last Pass

After you've mixed down to stereo from your multitrack, I would suggest taking a break from the material for a couple of hours or, even better, overnight. As with a music mix, that time away from it will allow you to regain your perspective. After you've given it a break, load your file into a 2-track editor application. Check once more for any offensive breaths, pops, or vocal stumbles and mismatched levels. All of those glitches can be addressed in a good 2-track editor.

When you're satisfied, convert it to a 16-bit file and then into MP3 format. (Many applications will let you do this in one step.) MP3 is the universal format for Podcasts and is compatible with both Macs and PCs.

If your Podcast is music centered, I would recommend encoding your MP3 file in stereo at a minimum of 128 Kbps (kilobits per second). If talk is the focus and music is secondary (or nonexistent), you can probably get away with a mono, 64 Kbps or 48 Kbps file. Remember that the higher the bit rate, the better the sound quality, but the longer it will take for users to download or stream the Podcast. Also, make sure to add ID3 tags, which are metadata embedded in MP3 files to identify them (see the sidebar “Tag, You're It”).
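
The trade-off between bit rate and download size is easy to estimate: file size is roughly the bit rate multiplied by the duration, divided by 8 to convert bits to bytes. The Python sketch below runs that arithmetic for a hypothetical 30-minute episode. The function name and the episode length are assumptions for illustration, and the result ignores the small overhead of ID3 tags.

```python
def mp3_size_megabytes(bitrate_kbps, minutes):
    """Approximate size of a constant-bit-rate MP3, in megabytes (1 MB = 1,000,000 bytes)."""
    total_bits = bitrate_kbps * 1000 * minutes * 60
    return total_bits / 8 / 1_000_000


for kbps in (128, 64, 48):
    print(f"{kbps:>3} kbps, 30 minutes: about {mp3_size_megabytes(kbps, 30):.1f} MB")
```

For that half-hour show, dropping from 128 to 64 kbps halves the file from about 28.8 MB to about 14.4 MB.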

On the Web

Taking the time to make the audio in your Podcasts sound as good as possible will pay off in improved enjoyment for your listeners. Of course, content is the most important aspect of a Podcast, but if the sound quality is lousy, with stray noises, jumpy volume levels, and bad-quality interview recordings, people will be less likely to want to download, stream, or subscribe to it again. I try to make my Podcast production as radio-like as possible. If it sounds like it could be on the air, then it's ready for the Web.

Mike Levine is an EM senior editor and the producer of “EM Cast,” the twice-monthly, interview-based Podcast available online.


Tag, You're It

Once you have the converted MP3 file of your Podcast, you have one more step left, which is to add ID3 tags. These embed important metadata into your file, such as the name of the Podcast, the artist, and a description of the contents. If you're setting up your Podcast so that listeners can subscribe to it in a program like Apple iTunes, the ID3 tag is a critical item.

I usually drag the MP3 file of my Podcast into iTunes to edit the ID3 tag. I've tried editing in other audio applications, but sometimes the results aren't satisfactory (entries can get truncated). Standalone ID3 tag editors are also available.

Select your Podcast in iTunes' main window and hit Command-I, and you'll get the Info page (see Fig. A). Choose a standard, descriptive naming convention that you'll use across all your Podcasts and that will also serve as your file name. The reason for standardization is that if you have a number of your Podcasts together in, say, an RSS subscription feed, or even in a list on your Web site, standardized file names will make your list look a lot cleaner and more professional.
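
If you'd rather not type file names by hand, a tiny script can enforce whatever convention you settle on. The Python sketch below builds names from the show name, a zero-padded episode number, and the episode title. This particular pattern, the function names, and the example title are purely hypothetical. The point is consistency, not this specific scheme.

```python
import re


def slugify(text):
    """Lowercase the text and replace runs of other characters with single hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def episode_file_name(show, episode_number, title):
    """Build a standardized MP3 file name: show, three-digit episode number, title."""
    return f"{slugify(show)}-{episode_number:03d}-{slugify(title)}.mp3"


print(episode_file_name("EM Cast", 7, "Recording Phone Interviews"))
# prints: em-cast-007-recording-phone-interviews.mp3
```

Zero-padding the episode number keeps the files in the right order when a folder or list is sorted alphabetically.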

Also, make sure that you put a concise description in the Comments field. When listeners subscribe to your Podcast, the comments are important because they help them choose which episodes to download.










