Phone Tag

Writing music for cell-phone games is a specialized skill. Though it may seem simple on the surface, the process is more complex than just dropping an MP3 or MIDI file in an email to a client and getting paid. In its most basic form, it requires the composer to navigate through a dizzying array of formats, work with reduced file sizes and polyphony counts, and make the best possible sound come through an unforgiving mobile-phone speaker.

FIG. 1: This figure shows a common setup for configuring GM playback on a mobile phone. The sequencer being used is Yamaha's XG Works 3.0.

In this article, I'll provide strategies to create bulletproof mobile soundtracks. To illustrate the process, I'll walk through a recent project and describe some of the techniques and tools I used. Though more-advanced mobile audio technologies are available, my focus will be the common protocol for U.S.-based games: playback of a single file using a limited General MIDI sound set. Mastering the basics will ensure a smooth transition to working with more-advanced engines.

Before starting the composition process, it is crucial to fully understand the scope of the project and its requirements. What style will the music be? What formats does the client require? What are the maximum file sizes? What is the polyphony count for each format? My particular project called for a 30-second looping theme no larger than 4K, delivered in four different formats: 4- and 16-voice GM, 16-voice Yamaha SMAF, and monophonic Nokia OTT. I'll describe these format types and the conversion process later; for now I'll focus on the composition.
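A requirements list like this can be written down as data and checked against each deliverable. The following is only a minimal Python sketch: the names `FormatSpec` and `check_delivery` are hypothetical, and reading "4K" as 4,096 bytes is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FormatSpec:
    name: str
    max_voices: int
    max_bytes: int = 4096  # "4K" read as 4,096 bytes (assumption)


# The four deliverables named in the project brief.
SPECS = [
    FormatSpec("GM 4-voice", 4),
    FormatSpec("GM 16-voice", 16),
    FormatSpec("Yamaha SMAF", 16),
    FormatSpec("Nokia OTT", 1),
]


def check_delivery(spec, file_bytes, peak_voices):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if len(file_bytes) > spec.max_bytes:
        problems.append(f"{spec.name}: {len(file_bytes)} bytes exceeds {spec.max_bytes}")
    if peak_voices > spec.max_voices:
        problems.append(f"{spec.name}: {peak_voices} voices exceeds {spec.max_voices}")
    return problems
```

Running the check once per format before delivery catches an oversized file or an extra voice before the client does.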

The Project

Award-winning mobile game developer and publisher Digital Chocolate needed a soundtrack for an upcoming game. The music and the theme of the game were dark and moody. In the game, the user would play the role of a club manager, hiring dancers and staff while keeping the local law enforcement in check. Six main dancers each needed a theme song to play while they danced onstage.

The user's challenge was to figure out which dancer best matched the patrons in the club at the time. The better the match, the more money the user would earn. More cash earned more respect from the manager's bosses, and more respect made new items and staff available for the club. In addition to the six dancer tracks, the developer also required a title theme. The process of composing that theme is the basis of this article.


Because the largest and most complex version was the 16-voice GM theme, I chose to start with that and work down to the mono version later. The platform was a mobile phone with limited sound-playback capabilities, not an Xbox 360 or PS3, so I used only a basic sequencer and limited the sound palette to General MIDI. (Although it's tempting to load up your DAW or digital audio sequencer with the latest plug-ins and effects, that's not a good idea in this case.) I used Yamaha's XG Works 3.0 and set the playback source to my sound card's software GM sound set (see the sidebar “Manufacturer Contacts” for the URLs of the companies mentioned here). This allowed me to hear a representation similar to the final playback medium while I worked.

To streamline work, I used a template that included the necessary header information and track setup for the mobile formats. Track 1 was used as the master track and included two SysEx messages: GM System On and GM Master Volume, set to 127. Because some phones can be finicky, I always reserve the first five ticks to store patch, volume, and pan information on sequential steps (see Fig. 1). On more than one occasion, I have corrected sequences that did not play properly because all the setup data was stacked up on tick 0.
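As an illustration only, the header layout described above might be sketched in Python like this. The two SysEx byte strings are the standard universal messages from the MIDI specification; the `(tick, bytes)` event representation, the function names, the default volume and pan values, and the choice to send Master Volume as MSB 127 with LSB 0 are assumptions made for this sketch, not details of the XG Works template.

```python
# Universal SysEx messages defined by the MIDI specification.
GM_SYSTEM_ON = bytes([0xF0, 0x7E, 0x7F, 0x09, 0x01, 0xF7])
# Master Volume carries a 14-bit value as LSB then MSB; MSB 127 is assumed here.
GM_MASTER_VOLUME = bytes([0xF0, 0x7F, 0x7F, 0x04, 0x01, 0x00, 0x7F, 0xF7])


def channel_setup(channel, program, volume=100, pan=64, first_tick=2):
    """Return (tick, bytes) setup events for one channel on sequential ticks.

    channel and program are 0-based; volume and pan defaults are placeholders.
    """
    program_change = 0xC0 | channel
    control_change = 0xB0 | channel
    return [
        (first_tick, bytes([program_change, program])),          # patch
        (first_tick + 1, bytes([control_change, 7, volume])),    # CC 7: volume
        (first_tick + 2, bytes([control_change, 10, pan])),      # CC 10: pan
    ]


def build_header(channels):
    """Build setup events for (channel, program) pairs across ticks 0 to 4."""
    events = [(0, GM_SYSTEM_ON), (1, GM_MASTER_VOLUME)]
    for channel, program in channels:
        events.extend(channel_setup(channel, program))
    return sorted(events, key=lambda event: event[0])
```

With this layout the SysEx messages sit on ticks 0 and 1, and each kind of channel message gets its own tick, so nothing is stacked on tick 0.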

After going over the creative brief, I began work on the main theme. Keeping a careful eye on the polyphony count, I started by composing a monophonic melody to a click track and finished by filling out the arrangement with the remaining voices. I allocated the backing-track voices to synth bass, two distortion guitars, and a small drum set. In the end, I used all the available voices and had a fairly full-sounding arrangement of the theme.

A Clean File

To ensure proper playback on most mobile audio sequencers, MIDI controller messages must be kept to a minimum. Most mobile sequencers recognize only Pitch Bend and Mod Wheel, so it's pointless to include extra data that will be ignored. This is a critical point because, in some cases, the extra data can cause playback errors on certain phones. I used an input filter so that the sequencer ignored such data to begin with, which saved an extra step down the line. However, I had recently bought a new controller keyboard, and I realized that I had inadvertently recorded Aftertouch messages on all the tracks, so I removed that data afterward in the event editor.
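The same cleanup can be expressed as a filter over an event list. This is a hedged sketch rather than a feature of any particular sequencer: the dictionary-based event format and the function names are hypothetical, and keeping volume (CC 7) and pan (CC 10) alongside Mod Wheel is an assumption made so that the setup data in the header survives.

```python
# Controllers to keep: Mod Wheel, plus volume and pan for the setup data (assumption).
KEEP_CONTROLLERS = {1, 7, 10}
# Event types that always pass through the filter.
KEEP_TYPES = {"note_on", "note_off", "program_change", "pitch_bend", "sysex"}


def is_playable(event):
    """Decide whether an event is worth keeping for a mobile sequencer."""
    kind = event["type"]
    if kind in KEEP_TYPES:
        return True
    if kind == "control_change":
        return event["controller"] in KEEP_CONTROLLERS
    return False  # drops aftertouch and any other message type


def clean_track(events):
    """Return a new list without the events a phone would ignore.

    Events carry absolute ticks, so removing one does not shift the others.
    """
    return [event for event in events if is_playable(event)]
```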

Before moving to the next stage, I double-checked the total polyphony count to make sure there were never more than 16 simultaneous voices. Because XG Works does not include a polyphony-count meter, I counted the notes manually, track by track. Right away I spotted overlapping notes in the melody track (which would count for two voices) and a few doubled notes on the drum track. It was much easier to see the doubled notes when I viewed the drum track in the event-list view.
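A manual count like this can also be automated. The sketch below assumes the notes from all tracks have been exported as `(start_tick, end_tick)` pairs, which is a hypothetical representation, and it counts only sounding notes; it does not model how a given phone handles release tails.

```python
def max_polyphony(notes):
    """Return the peak number of simultaneously sounding notes.

    notes: iterable of (start_tick, end_tick) pairs from all tracks combined.
    """
    edges = []
    for start, end in notes:
        edges.append((start, 1))   # a voice starts
        edges.append((end, -1))    # a voice ends
    # Sorting by (tick, delta) processes a note-off before a note-on at the
    # same tick, so back-to-back notes do not count as an overlap.
    edges.sort()
    current = peak = 0
    for _, delta in edges:
        current += delta
        peak = max(peak, current)
    return peak
```

For example, two melody notes at `(0, 500)` and `(480, 960)` overlap by 20 ticks and count as two voices, while `(0, 480)` and `(480, 960)` count as one.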

Because the GM spec doesn't include start and end points, I used a simple technique to create a seamless loop of the music. Most phones handle looping by using the first Note On as the start point and the last Note Off as the end point of the sequence. To ensure a smooth loop, I extended the length of the last snare hit to reach the end of the last bar. That created a perfectly timed Note Off and end point for the sequence. Another trick to create an end point is to insert a note at the desired ending place in the sequence and set the Velocity to 1, which acts as a silent marker.
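Both loop tricks are easy to express in code. The following sketch assumes notes are stored as dictionaries with `start`, `end`, `pitch`, and `velocity` keys and that the bar length in ticks is known; these names, and the marker's pitch and length, are illustrative choices rather than anything a phone requires.

```python
def extend_last_note(notes, ticks_per_bar):
    """Stretch the note that ends last so it ends exactly on a bar line."""
    if not notes:
        return notes
    last = max(notes, key=lambda note: note["end"])
    # Round the final Note Off up to the next bar boundary (ceiling division).
    bars = -(-last["end"] // ticks_per_bar)
    stretched = dict(last, end=bars * ticks_per_bar)
    return [stretched if note is last else note for note in notes]


def add_silent_marker(notes, loop_end, pitch=60, length=1):
    """Append a Velocity-1 note whose Note Off lands on the loop end."""
    marker = {
        "start": loop_end - length,
        "end": loop_end,
        "pitch": pitch,
        "velocity": 1,
    }
    return notes + [marker]
```

With 1,920 ticks per bar, a final snare hit ending at tick 7,300 would be stretched to end at tick 7,680, the end of the fourth bar.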

Lower Your Voices

After finishing the file cleanup and loop, the next step was to create the 4-voice GM version. Using the 16-voice version as the template, I began the task of reducing the theme to four voices. Because the melody was using only one voice, there were three voices available to finish the arrangement. The bass track also used one voice, so it stayed in as well. With only two voices left, I employed a technique that allowed me to squeeze an apparent three voices out of the drums. Inspired by Charlie Watts's technique of never hitting the high hat and snare simultaneously, I simply removed the high-hat notes whenever the snare was hit. Because both the hat and snare were high-frequency sounds, it was a great way to keep the groove going, and the hat's absence was barely noticeable.
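The high-hat thinning can be sketched as a simple filter. The note numbers come from the General MIDI percussion map; the dictionary note format and the function name are assumptions, and the sketch matches hits by exact tick, so it presumes a quantized drum track.

```python
# General MIDI percussion key numbers.
SNARES = {38, 40}        # Acoustic Snare, Electric Snare
HIGH_HATS = {42, 44, 46}  # Closed, Pedal, and Open Hi-Hat


def thin_high_hats(drum_notes):
    """Drop every high-hat hit that starts on the same tick as a snare hit."""
    snare_ticks = {n["start"] for n in drum_notes if n["pitch"] in SNARES}
    return [
        n for n in drum_notes
        if not (n["pitch"] in HIGH_HATS and n["start"] in snare_ticks)
    ]
```

On an unquantized performance, the exact-tick comparison would need a small tolerance window instead.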

Making the mono version required a bit of extra thought and work. I used the 4-voice GM version as the working template and employed a few techniques to make a convincing 1-voice track. Because the original theme began with drums and bass, there would have been a lot of empty space had the file been reduced to just the melody track. The best way to represent the rhythm was to use the bass track up until the melody came in. I erased all the bass notes when the melody was playing, and when the melody took a break, I used the bass again to fill in the gap. This created a pseudo rhythm track while still using only one voice.
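One way to approximate this reduction in code is to keep every melody note and only those bass notes that do not overlap any melody note. This is a rough sketch under that assumption: the note dictionaries and function names are hypothetical, and working note by note may differ from editing by ear at the phrase level.

```python
def overlaps(a, b):
    """True if two notes sound at the same time for at least one tick."""
    return a["start"] < b["end"] and b["start"] < a["end"]


def mono_merge(melody, bass):
    """Combine melody and bass into one line, letting the melody win."""
    kept_bass = [
        b for b in bass
        if not any(overlaps(b, m) for m in melody)
    ]
    return sorted(melody + kept_bass, key=lambda note: note["start"])
```

The result still needs a listen, because a bass note that ends just as the melody enters can sound abrupt even when it is technically legal.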

Yamaha Considerations

The Yamaha SMAF format works a bit differently than standard 16-voice General MIDI in that each different drum sound is counted as one voice of polyphony. This means a kick, snare, and high hat will count as three voices, even if they are not played simultaneously. Because I had used seven different drum sounds in the 16-voice GM version, I had to be careful not to go over the 16-voice limit. As a result, I ended up combining the two toms into a single tom hit, which reduced the track polyphony to six. With those changes, the SMAF file preparation was complete, and all the files were ready to go for the conversion stage.
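The drum accounting described above amounts to counting distinct drum sounds rather than simultaneous hits. A minimal sketch, assuming drum hits are given as General MIDI percussion key numbers and using a hypothetical seven-piece kit:

```python
def drum_voice_cost(drum_pitches):
    """Voices consumed by drums when each distinct sound costs one voice."""
    return len(set(drum_pitches))


def merge_sounds(drum_pitches, mapping):
    """Replace some drum sounds with others, e.g. fold two toms into one."""
    return [mapping.get(pitch, pitch) for pitch in drum_pitches]


# Hypothetical kit: kick, snare, closed and open high hat, two toms, crash.
kit = [36, 38, 42, 46, 45, 48, 49, 36, 38, 42]
one_tom = merge_sounds(kit, {48: 45})  # play the Hi-Mid Tom part on the Low Tom
```

Here `drum_voice_cost(kit)` is 7 and `drum_voice_cost(one_tom)` is 6, which leaves one more voice for the melodic parts within the 16-voice limit.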

FIG. 2: In this screen shot from Unwiredtec's Ringtone Creator, all the checked formats are note based and the unchecked are PCM.

The next step was to convert the SMAF and Nokia tracks into their respective formats (the 4- and 16-voice GM files were already in their final format). Many specialized tools are available for file format conversion and creation, but the one I chose to use on this project was Unwiredtec's Ringtone Creator. Ringtone Creator allows for conversion to many note-based and PCM formats from a variety of sources (see Fig. 2). The conversion process involved simply loading the prepared file, selecting the appropriate output format, and clicking on a few buttons. I did this for both the SMAF- and Nokia-prepared files and ended up with files that could be played on their respective handsets.

Playback Testing

Playback testing for mobile is very much like mixing and mastering at the same time. Listening to the files on real handsets allowed me to tweak the voicing and balance of the instruments to sound best for each format. Because there was no single handset that could play all four of the formats for this project, I ended up using several to test on. The files were loaded onto the handsets through a data cable using various software and hardware combinations. Although there are other ways to transfer the data (Bluetooth or WAP, for example), all of the phones were set up for data-cable transfers. For the 4- and 16-voice GM versions, I used Motorola Phone Tools and loaded the files onto a Motorola Razr. For the Yamaha SMAF format, I opted to use the Yamaha MA2 hardware developer box with a dummy phone connected to it (see Fig. 3). Finally, the Nokia format was loaded onto a Nokia 6630 phone using Nokia's PC Suite.

Listening to the GM versions, I noticed right away that the bass was inaudible. It was also clear that the high hat was far louder than it had sounded on my PC monitors. The solution was to raise the bass part an octave and reduce the Velocity of the high hats. After going back and forth a few times, I found a balance that worked well for both GM versions on the Razr. Knowing that each handset would sound different, I opted to load the GM versions onto a variety of handsets in my collection. After a few more minor tweaks, the files sounded good across the different phones (see Web Clip 1).
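Both fixes are simple transformations of note data. The sketch below assumes notes are dictionaries with `pitch` and `velocity` keys; the function names and the 0.75 scaling factor are placeholders, since the right amount can only be found by listening on the handset.

```python
def transpose(notes, semitones):
    """Shift every note by a number of semitones, clamped to the MIDI range."""
    return [
        dict(n, pitch=min(127, max(0, n["pitch"] + semitones)))
        for n in notes
    ]


def scale_velocity(notes, pitches, factor):
    """Scale the Velocity of notes whose pitch is in `pitches`."""
    result = []
    for n in notes:
        if n["pitch"] in pitches:
            velocity = max(1, min(127, round(n["velocity"] * factor)))
            n = dict(n, velocity=velocity)
        result.append(n)
    return result


# Example: bass up one octave, high hats (GM keys 42, 44, 46) at 75 percent.
bass = transpose([{"pitch": 36, "velocity": 100}], 12)
drums = scale_velocity(
    [{"pitch": 42, "velocity": 100}, {"pitch": 38, "velocity": 100}],
    {42, 44, 46},
    0.75,
)
```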

FIG. 3: Shown here are some tools of the trade: a Yamaha MA2 developer box with a dummy phone connected to audition the sounds.

The same process was applied to the SMAF and Nokia formats until they sounded right to my ears. Yamaha's hardware developer tools allow for extensive editing of the FM voices, so I spent a good deal of time crafting the actual sounds. An added bonus of the SMAF format is that it provides uniform playback across MA2-enabled phones (see Web Clip 2). The only difference is in the speaker and molding of the various handsets.

The most difficult format to get a good-quality sound from was the mono Nokia OTT. OTT was created more than a decade ago and was not designed to play ringtones the way we know them now. The familiar Nokia ringtone that we've all heard showcases the sound of the format, which can be very piercing but functional. Most of my time was spent making sure the notes were in a pleasing octave range and not overly loud (see Web Clip 3). With that last bit of tweaking, I was finished and ready to deliver the files to the client.
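Keeping a mono line inside a chosen octave range can be sketched as folding each note up or down by octaves until it fits. The range below (MIDI notes 60 to 83) is purely an assumption for illustration; the pleasing range for a given handset has to be found by ear.

```python
def fold_into_range(pitch, low=60, high=83):
    """Move a pitch by whole octaves until it lies within [low, high].

    The range must span at least one octave, or some pitches cannot fit.
    """
    if high - low < 11:
        raise ValueError("range must cover at least 12 semitones")
    while pitch < low:
        pitch += 12
    while pitch > high:
        pitch -= 12
    return pitch


def fold_line(pitches, low=60, high=83):
    """Apply fold_into_range to every pitch in a monophonic line."""
    return [fold_into_range(p, low, high) for p in pitches]
```

Folding preserves each note's pitch class but can change the contour of the line, so leaps across the range boundary are worth checking by ear.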

Who's Calling?

Working on this project was very much like writing a polyphonic ringtone. At this time the technology is growing fast, so expect to see a lot more in the way of interactive soundtracks, multiple tracks of streaming audio, and PCM sound in the future. These trends have already developed in Japan and Europe, and some of the game companies I work with are now using highly advanced engines for mobile games.

Still, there will always be projects that have small audio budgets, so the need to squeeze the most out of every byte will remain for the foreseeable future. It's also a fun and rewarding challenge to work under tight limitations — it forces creative thinking. If done properly, your music can be a joy to hear when playing in a game, even if it's coming through a 10 mm handset speaker.

Steve Ouimette runs 8th Wonder Productions, whose main business is creating soundtracks for mobile, console, and PC games.

Manufacturer Contacts