Sound Design Workshop: Um's the Word

One person's um is another person's groove.

Web Clips: hear audio examples of "um"-like speech transformed into useable vocal pieces

Sometimes the soul of a song emerges by accident. I'd plugged a cheap sampling keyboard into a TV and started recording random shows 6 seconds at a time (the limit of the sampler's memory). When I captured interviewer Charlie Rose interrupting a stammering Henry Kissinger, something caught my ear — the dialog had a pronounced rhythm.

I split the keyboard, layered the sound bite with a drum loop, and got a groove that I use to this day (see Web Clips 1 and 2). In this column, I'll share some techniques I've developed for finding and shaping garbage speech into outrageous grooves.

Badda-Bing, Badda … Um …

When people punctuate their speech with ums and other placeholder sounds, it often produces an interesting rhythm. That makes sense; these involuntary grunts and stutters function as drum fills, buying time for the performer to work up to a more powerful downbeat.

FIG. 1: This composite screen shot shows two approaches to aligning syllables to the beat: slicing in iZotope Phatmatik Pro and using Warp markers in Ableton Live.

Of course, drum fills are the highlight of a groove, whereas ums are an annoyance or an embarrassment, depending on whether you're listening or speaking. I spend a lot of time slicing the ums and false starts out of my podcast interviews to keep the show moving and make both parties sound more articulate. But ever since I realized that vocal glitches were such great groove fodder, that cleanup task has become more of a treasure hunt than a chore. Now instead of pressing the Delete key in my audio editor, I press Cut. If the sound is cool, I paste it into a new file.

Not recording interviews? You can find plenty of stuttering speech online. Try the Library of Congress site and the Internet Archive: Open Source Audio for public-domain recordings. The pronouncements of public figures are fair game, too. You'll find 15,000 meticulously labeled speech excerpts at the George W. Bush Public Domain Audio Archive.

Rhythmic ums frequently sneak in through other channels as well. I once used an answering service that sent me my voice mail as email attachments. One day a message arrived from former EM staffer Matt Gallagher, starting with the riff, “Hi (ahem) — 'scuse me, David …” Not coincidentally, Matt is a drummer, and his throat-clearing salutation plugged perfectly into the beat of a song I was working on (see Web Clip 3).

The Slice Is Right

Sometimes the found rhythms will be spot-on, as with the Kissinger and Gallagher loops. (You may notice in Web Clip 3 that I extended the latter with a synchronized delay to fill out the bar.) A bit of deviation from the beat can sound fine, too, emphasizing the humanness of the rhythm.
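If your delay doesn't offer a tempo-sync switch, you can work out a synchronized delay time by hand: one quarter-note beat lasts 60,000 ms divided by the tempo. The Python sketch below is only an illustration of that arithmetic. The function name and the example tempos are assumptions, not the settings used in Web Clip 3.

```python
def synced_delay_ms(bpm: float, beats: float = 1.0) -> float:
    """Return a delay time in milliseconds lasting `beats` quarter notes at `bpm`.

    Hypothetical helper for illustration; `beats` can be fractional
    (0.5 = eighth note, 0.75 = dotted eighth, 0.25 = sixteenth).
    """
    if bpm <= 0:
        raise ValueError("bpm must be positive")
    return 60_000.0 / bpm * beats


if __name__ == "__main__":
    # Example tempos are made up for the demonstration.
    print(synced_delay_ms(120))        # quarter note at 120 bpm
    print(synced_delay_ms(120, 0.5))   # eighth note at 120 bpm
    print(synced_delay_ms(96, 0.75))   # dotted eighth at 96 bpm
```

Dial the result into the delay's time field, then set the feedback so the echoes die away around the end of the bar.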

But more often, you'll want to tighten the groove by nudging a syllable or two either forward or back in time. To do that, I typically use Ableton Live. Many other beat-slicing programs work just as well, including Sony Creative Software Acid, Propellerhead ReCycle, and iZotope Phatmatik Pro (see Fig. 1).
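Under the hood, nudging a syllable onto the beat amounts to moving its start time toward the nearest grid line. Here is a minimal Python sketch of that idea, assuming you already know each syllable's onset in seconds. The function name, the sixteenth-note grid, and the `strength` parameter are my own illustrative choices, not features of any of the programs named above.

```python
def quantize_onsets(onsets_sec, bpm, steps_per_beat=4, strength=1.0):
    """Move each onset toward the nearest grid line.

    onsets_sec     -- syllable start times in seconds (assumed known)
    bpm            -- tempo of the song
    steps_per_beat -- grid resolution; 4 gives a sixteenth-note grid
    strength       -- 1.0 snaps fully to the grid, 0.0 leaves timing untouched
    """
    if bpm <= 0 or steps_per_beat <= 0:
        raise ValueError("bpm and steps_per_beat must be positive")
    step = 60.0 / bpm / steps_per_beat  # length of one grid step in seconds
    moved = []
    for t in onsets_sec:
        target = round(t / step) * step      # nearest grid line
        moved.append(t + (target - t) * strength)
    return moved


if __name__ == "__main__":
    # Made-up onsets for a three-syllable stammer at 120 bpm.
    print(quantize_onsets([0.02, 0.27, 0.49], bpm=120))
    print(quantize_onsets([0.02, 0.27, 0.49], bpm=120, strength=0.5))
```

A strength below 1.0 tightens the groove while keeping some of the human looseness in the timing.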

Before time-aligning the syllables, I usually beef up the vocals by applying BIAS Peak's Normalize RMS effect (see Web Clip 4), which works like an industrial-strength limiter to maximize the average level of the file. (You can also use a standard limiter or loudness maximizer.) Boosting the level gives the vocal groove more power, which helps it support the tune. Of course, you won't always want such a bombastic effect; thinning the vocal groove with a highpass filter can produce a lighter percussive part.
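To make the idea of maximizing the average level concrete, here is a rough Python sketch. It is not Peak's algorithm, which I can't vouch for. It simply scales the samples to an assumed target RMS and hard-clips anything that overshoots, which is cruder than a real limiter. The function names, the 0.25 target, and the test signal are all illustrative assumptions.

```python
import math


def rms(samples):
    """Root-mean-square (average) level of a list of floats in the range -1..1."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def normalize_rms(samples, target_rms=0.25, ceiling=1.0):
    """Scale samples so their RMS approaches target_rms, hard-clipping at ceiling.

    Hard clipping stands in for a limiter here; it distorts more than a
    real limiter would, and the clipped result lands below the target.
    """
    current = rms(samples)
    if current == 0.0:
        return list(samples)  # silence stays silent
    gain = target_rms / current
    return [max(-ceiling, min(ceiling, s * gain)) for s in samples]


if __name__ == "__main__":
    quiet = [0.1, -0.1, 0.1, -0.1]   # made-up test signal
    louder = normalize_rms(quiet)
    print(rms(quiet), rms(louder))
```

Because the gain is set by the average level and not the peaks, a file with a few loud spikes gets pushed much harder than peak normalization would allow, which is where the bombast comes from.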

If the voice file is noisy, as is often the case with my telephone and Skype recordings, I'll remove the noise with a fast fade-out, noise reduction, or Peak's Silence command before boosting the level.
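A fast fade-out is the simplest of those three fixes to picture: the last stretch of the clip is multiplied by a gain that ramps from full level down to zero, so the noise tail never reaches the next slice. The Python sketch below assumes a linear ramp and a plain list of samples; the function name and fade length are illustrative, and audio editors typically offer other fade curves as well.

```python
def fade_out(samples, fade_len):
    """Return a copy of samples whose last fade_len samples ramp linearly to zero."""
    n = len(samples)
    fade_len = max(0, min(fade_len, n))  # never fade more than the clip holds
    out = list(samples)
    start = n - fade_len
    for i in range(fade_len):
        gain = 1.0 - (i + 1) / fade_len  # ends at exactly 0.0 on the last sample
        out[start + i] *= gain
    return out


if __name__ == "__main__":
    # A made-up six-sample clip of constant hiss, faded over its last four samples.
    print(fade_out([1.0] * 6, 4))
```

At a 44.1 kHz sample rate, a fade of a few hundred samples lasts only a few milliseconds, which is short enough to sound like a clean cutoff.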

You should be able to time-align syllables by eye in any audio editor that displays a rhythm grid. I find, however, that it's faster and more musical to play a drum loop on a separate track and adjust the vocal's warp markers to match (see Web Clip 5).
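Conceptually, a warp marker pins one moment in the recording to one position on the beat grid, and everything between two markers is stretched or squeezed evenly. The Python sketch below models that as a piecewise-linear map from source time to beat position. It is a simplified mental model under my own assumptions, not Ableton's actual implementation, and the marker values are invented.

```python
def warp_to_beats(t, markers):
    """Map a source time (seconds) to a beat position using warp markers.

    markers -- list of (source_sec, beat) pairs sorted by source_sec,
               with at least two entries. Times outside the marked range
               are extrapolated from the nearest segment.
    """
    if len(markers) < 2:
        raise ValueError("need at least two markers")
    # Walk the segments until we find the one that contains t.
    for (s0, b0), (s1, b1) in zip(markers, markers[1:]):
        if t <= s1:
            break
    return b0 + (t - s0) * (b1 - b0) / (s1 - s0)


if __name__ == "__main__":
    # Invented example: the second syllable lands late at 0.55 s, so it is
    # pinned to beat 1; the third syllable at 1.0 s is pinned to beat 2.
    markers = [(0.0, 0.0), (0.55, 1.0), (1.0, 2.0)]
    for t in (0.0, 0.275, 0.55, 1.0):
        print(t, warp_to_beats(t, markers))
```

In this example the first segment is squeezed and the second is stretched, which is the same thing you hear when you drag a marker while the drum loop plays.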

You can make almost any audio file rhythmic by chopping it up, but the natural rhythms of involuntary ums have a special personality (see Web Clip 6). Once you start to listen for spoken rhythms, you'll find a wealth of material. So don't just sit on your ums! One musician's garbage is another's groove.

David Battino is the coauthor of The Art of Digital Music (Backbeat Books, 2004) and audio editor for the O'Reilly Digital Media site.

Additional Resources

You can hear more examples of speech grooves in the Digital Media Insider episode called “Seize the Rhythm.”
