Lord of the Ringtones

Since the late 1990s, the emergence of the mobile-telephone ringtone market has brought numerous employment opportunities to musicians, programmers, and composers familiar with MIDI sequencing and audio editing. Ringtones were first introduced as a consumer product in 1998 and shortly thereafter became a common feature on mobile phones. Early ringtones were monophonic, typically playing the melody of a popular tune. But soon mobile-phone manufacturers advanced the audio capabilities of their products, and the polyphonic ringtone was born in 1999.

Companies such as Beatnik, Faith, Nokia, and Yamaha began to create mobile MIDI solutions that allowed for more-elaborate-sounding ringtones consisting of an ever-expanding polyphonic note count (see the sidebar “Contact Information” for a list of companies mentioned in this article). Additional features followed, including support for Pitch Bend and controller information such as modulation, as well as text, vibration and LED synchronization, graphics, and real audio.

New mobile handsets arrive on the market at the rate of dozens per month, yet developer support is notoriously poor in this industry, and handset spec sheets are often horribly inaccurate. As a result, understanding the different formats required for different phones and manufacturers can be an onerous task. In this article, I will give an overview of the ringtone market and offer some suggestions on how you can get your foot in the door. And in case you get the call to provide ringtones for a client, I will provide some programming tips that you should find useful.

Here's the good news: nearly all ringtones are created by freelance programmers. Rarely will companies hire programmers as full-time employees. This has several obvious benefits, such as allowing the programmer to work from home, around his or her own schedule. Most interaction is through email, and files are usually transferred via FTP. However, few programmers make a living solely from ringtone work. In most cases, film and TV composers, game-audio programmers, and audio editors create ringtones to supplement their existing income. And keep in mind that mobile-content providers typically aren't interested in original music, so your composition skills will not be a big asset.

From the Top

Programming polyphonic ringtones today is similar to designing game audio in the early days of PCs. Due to the CPU, memory, and bandwidth limitations of modern handsets, ringtone files must be kept clean, efficient, and small. Moreover, no single standard format exists for ringtone file distribution. The most popular polyphonic formats in the United States today are Qualcomm's CMX (Compact Media Extensions), Yamaha's SMAF (Synthetic music Mobile Application Format), and Nokia's SP-MIDI (Scalable Polyphony MIDI). (See the table “Format Free-for-All” for details on the formats mentioned here.) Though nearly all modern mobile phones can use Standard MIDI Files, playback will vary from phone to phone because of differing polyphony, MIDI-controller, and file-size restrictions. The volume levels of General MIDI voices in mobile phones also vary considerably across formats.

Phones created by different manufacturers have vastly different audio capabilities as well. This has a direct and crucial impact on any particular model's ability to produce a convincing ringtone. Therefore, a content provider might need to distribute optimized subformats, which have different specs for file size, polyphony count, audio capabilities, and so on. For example, there are dozens of SMAF handset models in the U.S. market, and exploiting their individual potential would require several different versions of an SMAF ringtone. Short of creating multiple versions, a programmer might simply aim for the lowest common denominator.
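To make the lowest-common-denominator idea concrete, here is a minimal Python sketch. The handset names and numbers are made up for illustration, and `HandsetSpec` and `lowest_common_denominator` are hypothetical helpers, not part of any vendor tool; real limits come from each handset's spec sheet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HandsetSpec:
    name: str
    max_polyphony: int    # simultaneous notes the handset can play
    max_file_bytes: int   # largest ringtone file the handset accepts


def lowest_common_denominator(specs):
    """Return the tightest (polyphony, file size) limits across a set of handsets."""
    if not specs:
        raise ValueError("at least one handset spec is required")
    return (
        min(s.max_polyphony for s in specs),
        min(s.max_file_bytes for s in specs),
    )


# Made-up figures for illustration only.
handsets = [
    HandsetSpec("Handset A", max_polyphony=16, max_file_bytes=20_000),
    HandsetSpec("Handset B", max_polyphony=40, max_file_bytes=100_000),
    HandsetSpec("Handset C", max_polyphony=32, max_file_bytes=30_000),
]
polyphony, file_bytes = lowest_common_denominator(handsets)
print(polyphony, file_bytes)
```

A single file written to those two limits should play on every handset in the list, at the cost of leaving the more capable models underused.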

FIG. 1: Nokia's Sound Converter is used to convert a standard MIDI file into a file in Nokia's SP-MIDI format. It can also determine which channels of the file will be dropped if a phone can't support the file's full polyphonic note count.

To keep up with the industry, a potential ringtone programmer must have a working knowledge of the various formats, a familiarity with the variety of phones on the market, and painstaking attention to minor details when creating MIDI files, things that are not always necessary when working as a gigging player, composer, or producer. Companies are aware of these issues, and it is common for a programmer to receive some training from the client. Moreover, a number of authoring tools are readily available to help the ringtone programmer on his or her way, among them Nokia Sound Converter, which is part of Nokia PC Suite (www.nokia.com/support/phones/6230); see Fig. 1.
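The channel-dropping behavior mentioned in the Fig. 1 caption can be sketched in a few lines of Python. This is only an illustration of the general idea behind scalable polyphony, not Nokia's implementation: it assumes a priority-ordered table in which each entry gives the cumulative number of notes needed to play that channel plus every higher-priority channel. The function names and the example table are hypothetical.

```python
def channels_kept(priority_table, device_polyphony):
    """Return the channels a device can play, highest priority first.

    priority_table: list of (channel, cumulative_notes) pairs in priority order.
    cumulative_notes is the polyphony needed for that channel plus every
    higher-priority channel, so it must never decrease down the list.
    """
    kept = []
    previous = 0
    for channel, cumulative_notes in priority_table:
        if cumulative_notes < previous:
            raise ValueError("cumulative note counts must not decrease")
        previous = cumulative_notes
        if cumulative_notes > device_polyphony:
            break  # this channel and every lower-priority channel are dropped
        kept.append(channel)
    return kept


def channels_dropped(priority_table, device_polyphony):
    """Return the channels that would be muted on a device with the given polyphony."""
    kept = set(channels_kept(priority_table, device_polyphony))
    return [channel for channel, _ in priority_table if channel not in kept]


# Hypothetical arrangement: melody, drums, bass, pad (channel numbers are 1-based).
table = [(1, 2), (10, 6), (2, 10), (3, 16)]
print(channels_kept(table, 8))     # channels an 8-note phone would play
print(channels_dropped(table, 8))  # channels that phone would mute
```

Ordering the table so that the melody and rhythm come first means a low-polyphony phone loses the pad and bass before it loses the tune.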

Note that the myriad free-ringtone sites found on the Net usually offer just a single, unoptimized MIDI file, regardless of the handset to which it's being ported. (It's doubtful that many of the companies offering such files have cleared the rights for their titles.) This is one of the fundamental differences between those ringtones and ringtones purchased from major ringtone companies like Modtones, Yamaha, and Zingy.

Getting in the Game

To get started as a ringtone programmer, try contacting the content providers listed in the sidebar “Contact Information” (or feel free to email me at lholden@yamaha.com). These are the major parent companies in the ringtone business, and most distribute via a variety of consumer mobile applications.

If you're asked to submit work for evaluation, you'll most likely receive several clips to transcribe. The clips will probably vary drastically in style to test your stylistic range. Often the content provider will ask for several versions with differing polyphonic note counts, which show how well you decide what to keep when notes must be cut. The accuracy of the transcription is what's most important to the content company. A “generic” mix (that is, one not fully optimized for a specific format) is acceptable, but incorrectly transcribed notes and rhythms will disqualify you for any ringtone programming work.

Always make sure your MIDI file is clean and well organized. Those evaluating your work will look for orderly arrangements and clearly marked tracks free of any unnecessary controller data. Content providers want their ringtones created by programmers who are quick and reliable. Ensuring that your initial transcriptions are accurate and your MIDI files clean is the best way to land a ringtone programming position.

Start Your Engines

To get started programming, you'll need a sequencer that can export MIDI format 0 and 1 files — most any will do. Be careful that your file doesn't contain any stray SysEx messages, inserted either by you or by your software; these messages can cause the phone to begin playing erroneous notes, fail to loop, or even restart. The next requirement is a good set of ears — every polyphonic ringtone starts with an accurate transcription of the musical piece you're programming. Ringtones vary from hip-hop to classical to TV themes and practically every genre in between, so you should also be familiar with a variety of musical styles.

Initial transcriptions should contain only basic MIDI messages such as Master Volume and General MIDI System On. Channel messages should be limited to Program Change, Pan, and Volume. These messages should be consistent across tracks and inserted at the same point in each track's header. Although modulation and Pitch Bend data are generally acceptable, some aging handsets have issues with these message types.
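As a rough illustration of these checks, the following Python sketch flags messages that an initial transcription shouldn't contain. It assumes the file has already been parsed into raw message bytes; `lint_message` and its warning strings are my own invention, not part of any tool mentioned here. It treats GM System On and Master Volume (both universal SysEx messages, shown with the broadcast device ID 0x7F) as the only acceptable SysEx, so any format-specific SysEx would have to be added to the allow-list.

```python
GM_SYSTEM_ON = bytes([0xF0, 0x7E, 0x7F, 0x09, 0x01, 0xF7])
MASTER_VOLUME_PREFIX = bytes([0xF0, 0x7F, 0x7F, 0x04, 0x01])

ALLOWED_CONTROLLERS = {7, 10}  # CC 7 = Channel Volume, CC 10 = Pan


def lint_message(msg):
    """Return a warning string for a questionable MIDI message, or None if it is fine."""
    if not msg:
        raise ValueError("empty message")
    status = msg[0]
    if status in (0xF0, 0xF7):
        if msg == GM_SYSTEM_ON:
            return None
        if msg.startswith(MASTER_VOLUME_PREFIX) and len(msg) == 8 and msg[-1] == 0xF7:
            return None
        return "stray SysEx message"
    kind = status & 0xF0
    if kind in (0x80, 0x90, 0xC0):  # Note Off, Note On, Program Change
        return None
    if kind == 0xB0:
        controller = msg[1]
        if controller in ALLOWED_CONTROLLERS:
            return None
        if controller == 1:
            return "modulation: may cause problems on older handsets"
        return f"controller {controller} is outside the basic set"
    if kind == 0xE0:
        return "Pitch Bend: may cause problems on older handsets"
    return f"unexpected status byte 0x{status:02X}"


messages = [
    GM_SYSTEM_ON,
    bytes([0xB0, 7, 100]),           # Channel Volume
    bytes([0xB0, 91, 40]),           # reverb send, not in the basic set
    bytes([0xF0, 0x43, 0x10, 0xF7])  # a manufacturer-specific SysEx
]
for m in messages:
    print(m.hex(), lint_message(m))
```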

FIG. 2: Because it contains the same sound chip used in many phones, the Yamaha MA-3 synthesizer is useful for previewing how your ringtones will sound on the actual handset hardware.

One of the most difficult parts of producing ringtones is that direct monitoring is not possible: what end users will hear on their handsets is not likely to be as robust as the sounds you get while working with your native sequencer in your studio. Though emulators exist for SP-MIDI and CMX formats, they are software based and not exactly trustworthy. For this reason, try to monitor using a poor-quality tone generator or a Yamaha MA-3 box (see Fig. 2). The MA-3 contains the actual MA chip that is commonly found in GSM network phones made by LG, Samsung, and other manufacturers. In fact, the MA series of chips, which includes the MA-2, 3, 5, and now 7, is based on scaled-down versions of the Yamaha FB-01 4-op chip. (For more information on the MA chips and SMAF creation, visit www.smaf-yamaha.com.)

You should also monitor on low-cost loudspeakers to better emulate the phone's frequency response. If you do use your main monitors, use EQ to cut everything below 300 Hz sharply, and forget about anything above 10 kHz.

Just for Effect

Another difficulty for ringtone programmers is the near-total absence of DSP effects to enhance the inherently weak GM patches found on cell-phone synthesizers. Reverb, chorus, or delay must be added manually. For instance, you can take a guitar melody line, copy it to a second channel, and detune the two using Pitch Bend to create a chorus effect. You'll get a nice chorus by bending one of the tracks up by a Pitch Bend value of about 200 and the other down by the same amount (roughly 5 cents each way at the default bend range of two semitones). Expand those values to create more-intense chorus. Practically any and all patches, aside from acoustic pianos, can benefit from this technique.
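The detune trick comes down to two opposite Pitch Bend messages. The Python sketch below shows one way to build them. It assumes the default bend range of plus or minus two semitones (200 cents), under which a raw Pitch Bend offset of 200 out of 8,192 works out to roughly 5 cents. The function names are hypothetical, and the detune amount is a parameter you should tune by ear.

```python
PITCH_BEND_CENTER = 8192  # 14-bit value meaning "no bend"


def cents_to_bend_offset(cents, bend_range_cents=200.0):
    """Convert a detune in cents to a signed 14-bit Pitch Bend offset."""
    offset = round(cents / bend_range_cents * 8192)
    return max(-8192, min(8191, offset))  # clamp to the legal 14-bit range


def bend_offset_to_cents(offset, bend_range_cents=200.0):
    """Convert a signed Pitch Bend offset back to cents."""
    return offset / 8192 * bend_range_cents


def pitch_bend_message(channel, offset):
    """Build the 3-byte Pitch Bend message for a 0-based channel."""
    if not 0 <= channel <= 15:
        raise ValueError("channel must be 0..15")
    value = PITCH_BEND_CENTER + offset
    return bytes([0xE0 | channel, value & 0x7F, (value >> 7) & 0x7F])


def chorus_pair(channel_a, channel_b, detune_cents):
    """Return Pitch Bend messages that detune two doubled channels in opposite directions."""
    offset = cents_to_bend_offset(detune_cents)
    return pitch_bend_message(channel_a, offset), pitch_bend_message(channel_b, -offset)


up, down = chorus_pair(0, 1, detune_cents=5)
print(up.hex(), down.hex())
print(round(bend_offset_to_cents(200), 2))  # cents for a raw offset of 200
```

Place each message at the top of its track, after the Program Change, so the detune is in effect before the first note sounds.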

To add reverb to a melody line, copy the channel's data to a blank channel, then time-shift the second line by a 32nd note. Reduce the second channel's volume to about half that of the original (these values will vary depending on the specs of the format you are programming for). Repeat this process for more reverb time. Using a breathy patch like a flute or ocarina for the time-shifted channels can add the airiness typical of reverb tails.

You can create delay effects in a similar manner. Again, copy the original track to a blank channel. Then time-shift the second line by an eighth note and reduce the second channel's volume to about half that of the original. Repeat this process multiple times to increase the “feedback” of the delay. Unlike for reverb, use the same patch for all delayed copies.
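Both the reverb and the delay recipe are the same operation with different settings: copy the line, shift it in time, and lower its volume. The Python sketch below captures that under a few stated assumptions: a sequencer resolution of 480 ticks per quarter note, and successive copies that are each shifted by one more step and set to half the volume of the copy before. `Note`, `Layer`, and `echo_layers` are hypothetical names for illustration, not part of any sequencer's API.

```python
from dataclasses import dataclass, replace

PPQ = 480  # ticks per quarter note; an assumed sequencer resolution


@dataclass(frozen=True)
class Note:
    start: int     # ticks from the start of the track
    duration: int  # ticks
    pitch: int     # MIDI note number
    velocity: int


@dataclass(frozen=True)
class Layer:
    channel: int   # 0-based MIDI channel
    program: int   # GM patch number, 1-based
    volume: int    # Channel Volume (CC 7)
    notes: tuple


def echo_layers(source, free_channels, shift_ticks, volume_ratio=0.5, program=None):
    """Create one time-shifted, quieter copy of `source` per free channel."""
    layers = []
    volume = source.volume
    for step, channel in enumerate(free_channels, start=1):
        volume = max(1, round(volume * volume_ratio))
        shifted = tuple(replace(n, start=n.start + step * shift_ticks) for n in source.notes)
        patch = source.program if program is None else program
        layers.append(Layer(channel, patch, volume, shifted))
    return layers


melody = Layer(channel=0, program=82, volume=100,
               notes=(Note(0, 240, 72, 100), Note(240, 240, 74, 100)))

# Reverb: 32nd-note shifts, with a breathy patch (GM 74, flute) on the copies.
reverb = echo_layers(melody, free_channels=[1, 2], shift_ticks=PPQ // 8, program=74)

# Delay: eighth-note shift, same patch as the original.
delay = echo_layers(melody, free_channels=[3], shift_ticks=PPQ // 2)
```

Each extra channel in `free_channels` adds one more repeat, which is why these effects use up channels so quickly.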

To enhance the sound of a ringtone melody line, you might need up to five tracks. Start with a saw lead (GM patch 82) or a harmonica (GM patch 23) as the melody's main patch. Then create a chorus effect using the technique described earlier. Next, double the melody one octave down to give it more body. Finally, add at least one delay channel to give the line depth and space.

Obviously, this creative method of DSP simulation will eat up channels quickly, and given that you have only 16 channels to work with, you need to decide which parts receive this special treatment. Certain formats, including SMAF and SP-MIDI, may require a dedicated vibration track that will use up another channel. This track is used to synchronize kick drums or bass notes to the vibrating motor of the device, which can help the user “hear” bass that the phone can't actually reproduce. By inserting specific Bank and Program Change messages in the header of those tracks (MSB 121, LSB 6, and GM patch 125 for SP-MIDI format), the Note On and Note Off commands become routed not to play a sample at a specific pitch, but to trigger the device's motor to vibrate the handset. The vibration track's note Velocities and Channel Volume should all be set to minimum. Insert the note D2 in your vibration track to trigger the device's motor. (See the developer's supplementary documentation on the company's download pages.)
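For the SP-MIDI values quoted above, the header of a vibration track would contain three messages: Bank Select MSB, Bank Select LSB, and Program Change. Here is a minimal Python sketch that builds the raw bytes. It assumes that patch numbers are 1-based, as elsewhere in this article, so patch 125 goes out as 124 on the wire, and that channels are numbered from 0. `vibration_track_header` is a hypothetical helper; check the format's own documentation before relying on these values.

```python
def vibration_track_header(channel, bank_msb=121, bank_lsb=6, gm_patch=125):
    """Build Bank Select + Program Change bytes for a vibration track.

    channel is 0-based (0..15); gm_patch is 1-based (1..128).
    """
    if not 0 <= channel <= 15:
        raise ValueError("channel must be 0..15")
    if not 1 <= gm_patch <= 128:
        raise ValueError("gm_patch must be 1..128")
    if not (0 <= bank_msb <= 127 and 0 <= bank_lsb <= 127):
        raise ValueError("bank values must be 0..127")
    control_change = 0xB0 | channel
    return bytes([
        control_change, 0x00, bank_msb,  # CC 0: Bank Select MSB
        control_change, 0x20, bank_lsb,  # CC 32: Bank Select LSB
        0xC0 | channel, gm_patch - 1,    # Program Change (0-based on the wire)
    ])


header = vibration_track_header(channel=15)
print(header.hex(" "))
```

The note that triggers the motor is left out on purpose: the MIDI note number for D2 depends on the octave-naming convention in use (38 if middle C is called C4, 50 if it is called C3), so confirm which one your sequencer and the target format follow.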

Keep in mind that in nearly all cases, transcription is only half the task for ringtone programming. Creating the files in the formats that the myriad modern handsets require is also typically done by the programmer. To stay current with new phones, services, and trends, bookmark the Phone Scoop site (www.phonescoop.com) and check back often.

Keeping It Real

In the United States, recent technological advances have brought increased network speeds and widespread availability of real-tone-capable handsets. A great example of this caliber of handset is Verizon's LG VX8100 (see Fig. 3). This popular multimedia phone has MP3- and CMX-format capabilities in addition to connection speeds fast enough for video downloads and full-track audio content. Thanks to its miniSD card slot, users can store a large amount of data, thrusting phones such as this into competition with standalone MP3 players and iPods.

FIG. 3: Verizon's LG VX8100 is a modern multimedia phone that supports MP3 and CMX format in addition to video downloads.

The predominant format for audio playback on mobile devices is, not surprisingly, MP3. Many handset manufacturers now include MP3 support in their newer models; Motorola, for one, has adopted MP3 across nearly all of its mobile devices. Other phones will play audio via SMAF or CMX format.

If you've ever downloaded real tones, you've probably noticed that the clips are quite short, rarely exceeding 12 seconds or so. This is because many handsets still have limits on the size of files they can store, and download speeds are not universal across handsets and networks, even on the same carrier.
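The relationship between clip length and file size is simple arithmetic for constant-bitrate audio. The Python sketch below shows it. The 64 kbps bitrate and the 100,000-byte limit are assumptions chosen for illustration, not figures from any carrier or handset, and the calculation ignores file headers and tags.

```python
def clip_size_bytes(duration_seconds, bitrate_kbps):
    """Approximate size of a constant-bitrate audio clip, ignoring headers and tags."""
    return int(duration_seconds * bitrate_kbps * 1000 / 8)


def max_clip_seconds(size_limit_bytes, bitrate_kbps):
    """Longest clip, in seconds, that fits under a file-size limit at a given bitrate."""
    return size_limit_bytes * 8 / (bitrate_kbps * 1000)


print(clip_size_bytes(12, 64))        # bytes for a 12-second clip at 64 kbps
print(max_clip_seconds(100_000, 64))  # seconds that fit in 100,000 bytes at 64 kbps
```

Lowering the bitrate buys a longer clip at the same file size, at the cost of audio quality.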

Since real-tone production work is usually handled in-house by content providers, real tones represent the dark, ominous cloud preparing to ruin ringtone programmers' sunny polyphonic day.

Mobile Content's Future

Let's face it: polyphonic ringtones have hit their peak in the United States, and the initial boom is over. But even though demand for programmers has diminished, there still is money to be made in the polyphonic market. Smaller carriers will be slow to support newer, more expensive handsets, and for the foreseeable future, “free” phones given to consumers in exchange for contractual agreements will probably be limited to polyphonic ringtones.

One can look to Japan and Korea to see the future of ringtones. Current offerings in those countries consist of full-track downloads, handsets with incredible stereo imaging, and polyphonic limits approaching those of professional synthesizers. The 3G networks in place allow for quick data retrieval and robust multimedia downloads.

Games are another widely adopted form of mobile entertainment, and seem to be the perfect relief for many who endure bus and subway commutes. Mobile-game companies also have a need for composers, sound designers, and mobile-device contractors. Electronic Arts Mobile Games and Hands-On Mobile (formerly mForma) are two leading mobile-game developers in the United States.

Other areas of mobile-content development include GPS/location-based services, video downloads, up-to-the-minute news and weather info, adult content, and many of your favorite TV shows condensed for mobile viewing. Channels such as Verizon's V-Cast will spearhead this new wave of mobile content.

Specialty niche applications are also beginning to appear. One example is Yamaha's Musician's Companion, which replaces your metronome, pitch pipe, and chord reference book with an all-in-one mobile application for your cell phone (see www.yamaha-wireless.com for more information). Some handsets are even equipped with hardware such as FM tuners and medical devices such as diabetes testers.

In Japan, consumers have begun paying for a subway ride or a bottle of soda at a convenience store with a swipe of their phone past a scanner. Those attending a sporting event or concert can have a virtual ticket stored in their phone and available at the wave of their hand. From such examples, you can see that the cell phone is becoming central to the Japanese lifestyle. Some of these services will inevitably appear in America in the years to come, but until then, ringtones will continue to dominate the mobile-content market.

Luke Holden is product manager and marketing manager of the Wireless Content Department of Yamaha Corporation of America. He also runs Moebius Sound and Recording (www.moebiusrecording.com), a Southern California-based recording and production facility.


Many consumers wonder why their ringtones don't sound as good as their friends'. One reason is that the quality of a phone's tone generator can vary greatly from handset to handset. In the United States, this is hugely dependent on what type of network the phone is supported by. American wireless networks generally use one of two underlying technologies: CDMA (Code Division Multiple Access) and GSM (Global System for Mobile communications). Programming polyphonic ringtones is substantially different when dealing with each technology.

The table below shows the main specs for each of the major polyphonic formats currently in use in the United States.

Major Polyphonic Formats Table


Content Providers
9 Squared (www.9squared.com)

AG Interactive (www.interactive.ag.com)

Electronic Arts Mobile Games (www.ea-mobile.com)

Faith, Inc. (www.faith-inc.com/service/mobile.html)

Hands-On Mobile, formerly mForma (www.mforma.com)

InfoSpace, Inc. (www.infospaceinc.com)

Sony BMG Music Entertainment (www.sonymusicmobile.com)

Tone Player (www.widerthan.com/americas)



Developers' Resources