Musical Profiling

Online technology is uniquely suited to help new artists gain an audience, thanks to music-recommendation services offered by the likes of Amazon, iTunes, RealNetworks, and Napster. As consumers shop for albums and songs, the e-tailer recommends selections they might like based on what they buy.

FIG. 1: Basil Ganglia mapped 128 MIDI notes to the visible color spectrum and programmed the synesthizer to respond to notes by generating signals that mimic the brain's EM patterns when perceiving the corresponding colors. Synesthis is a big hit with the heads at Callosum Corp. U.S.

One fundamental approach to predictive music recommendation is to have human experts identify certain attributes of songs, albums, and artists, and enter that information into a database, where it is compared with subsequent user purchases. This can provide good results, but it's labor-intensive, and there are hundreds of thousands of recorded artists.

Another approach is called collaborative filtering, a statistical method used by Amazon. Collaborative filtering assumes that people naturally fall into “taste clusters”; if you bought album A, you might like album B, because many other people have bought both. This technique can identify connections between recordings that don't necessarily sound much alike. On the downside, you need a huge number of data points (lots of titles and users) to get good results. Amazon's database is certainly big enough to do a good job, especially with popular material, but artists who aren't top-tier may get short shrift in the recommendation department because fewer people buy their stuff.

Gracenote is working on an interesting answer to these problems. The company supplies music-database tools to various music-service providers, who package these tools for their customers. For example, Gracenote's long-standing CDDB database helps consumers catalog their CD collections by recognizing a CD and sending back its metadata (title, artist, label, and so on). Recently, the CDDB service evolved into MusicID, which adds the ability to extract and identify unique “fingerprints” of individual recordings by “listening” to the audio data.

This year, Gracenote expects to release its Discover music-recommendation service, which combines human analysis and collaborative filtering with a new approach that uses DSP algorithms to extract descriptive attributes from audio files. The algorithms used by Gracenote are derived from those found in unrelated fields, such as gene sequencing and radar, which also rely on finding patterns in large, multidimensional data spaces.

The easiest attributes to identify are low-level features, such as the spectrum envelope and zero-crossings. They provide some idea of the song's timbral structure, but they do not correspond to anything humans would consciously identify. High-level attributes, like tempo, instrumentation, melodic structure, and character of the vocals, are more immediately apparent to human listeners, but they are much more difficult to extract from an audio file using DSP algorithms. The first commercial version of Discover will extract low-level attributes and at least one high-level feature — tempo — but even that basic attribute is not easy to identify consistently across all types of recordings. Other high-level attributes, such as dominant melody and vocal character, are expected to come later, as the algorithms are refined (see Fig. 1).
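To give a feel for what low-level features look like, here is a minimal Python sketch that computes a zero-crossing rate and a raw magnitude spectrum (from which a spectrum envelope could be derived by smoothing) for a short frame of audio. It is not Gracenote's algorithm; the function names, the 8 kHz sample rate, and the test tones are assumptions chosen for illustration.

```python
import cmath
import math

def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(samples) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(samples) - 1)

def magnitude_spectrum(samples):
    """Naive O(n^2) DFT magnitudes for bins 0..n/2 (fine for short frames)."""
    n = len(samples)
    return [
        abs(sum(s * cmath.exp(-2j * math.pi * k * t / n)
                for t, s in enumerate(samples)))
        for k in range(n // 2 + 1)
    ]

def sine(freq_hz, sample_rate=8000, n=64):
    """Hypothetical test tone: n samples of a sine wave at freq_hz."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate) for t in range(n)]

low, high = sine(250), sine(1000)
print(zero_crossing_rate(low), zero_crossing_rate(high))

spectrum = magnitude_spectrum(high)
peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
print(peak_bin * 8000 / 64)  # frequency of the strongest bin, in Hz
```

The higher-pitched tone crosses zero more often, and its spectrum peaks at a higher bin. Numbers like these say something about timbre, but nothing a listener would name directly, which is why attributes such as tempo take much more work to extract.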

To enable the Discover service, Gracenote will first run the analysis on millions of recordings and store the resulting attributes in a database. Then, when a music fan requests a recommendation, the service will generate one from any “seed” song, album, or artist in the music service's catalog or the user's personal library. Furthermore, new songs can be added as users rip CDs that are not already in the database. Initially, the service will not be available for already-encoded (compressed) files because the algorithms work best on uncompressed audio.
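The seed-based lookup can be pictured as a nearest-neighbor search over stored attributes. The Python sketch below assumes a tiny made-up database in which each track has three numeric attributes (tempo, zero-crossing rate, and a hypothetical "brightness" score) and ranks other tracks by distance from the seed. The article does not say how Discover actually compares recordings, so treat this as one plausible illustration.

```python
import math

# Hypothetical attribute database: track -> (tempo in BPM, zero-crossing rate, brightness).
catalog = {
    "Track A": (120.0, 0.10, 0.40),
    "Track B": (122.0, 0.11, 0.42),
    "Track C": (70.0, 0.03, 0.15),
    "Track D": (128.0, 0.20, 0.70),
}

def normalize(catalog):
    """Rescale each attribute to 0..1 so no single unit dominates the distance."""
    columns = list(zip(*catalog.values()))
    lows = [min(c) for c in columns]
    spans = [(max(c) - min(c)) or 1.0 for c in columns]
    return {
        name: tuple((v - lo) / span for v, lo, span in zip(vec, lows, spans))
        for name, vec in catalog.items()
    }

def recommend_from_seed(seed, catalog, top_n=2):
    """Return the top_n tracks whose normalized attributes sit closest to the seed's."""
    vectors = normalize(catalog)
    seed_vec = vectors[seed]
    ranked = sorted(
        (math.dist(seed_vec, vec), name)
        for name, vec in vectors.items()
        if name != seed
    )
    return [name for _, name in ranked[:top_n]]

print(recommend_from_seed("Track A", catalog))
```

Because the ranking depends only on the stored attributes, a track that nobody has bought yet can still turn up as a recommendation once it has been analyzed.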

This technology is meant to fill the gaps in what can be done with human analysis and collaborative filtering. It can also be used to recommend items that have little purchase data behind them, especially new artists and recordings, which is good news for aspiring musicians seeking an audience.