Musical Fingerprints

Identify any audio signal by its unique sonic signature

Songwriters and performers have always faced the challenge of protecting their intellectual property, especially in the Internet era of peer-to-peer (P2P) file sharing. One solution is to embed digital watermarks into the audio data to identify copyrighted material, but many believe this results in audible artifacts that intrude on the intended sound.

Audible Magic Corp. is taking a different approach that adds nothing to the sound. Using algorithms developed by Muscle Fish, now a division of Audible Magic, the content-based identification (CBID) system analyzes a segment of audio to determine its unique characteristics, including loudness, pitch, brightness (high-frequency content), and harmonicity (the degree to which the sound's timbre conforms to the harmonic series).

The most important analysis parameters are called mel-filtered cepstral coefficients (MFCCs), which describe the shape of the harmonic spectrum as perceived by the human auditory system. The audio signal is divided into short segments and passed through a Fast Fourier Transform to derive the harmonic power spectrum of each segment. That spectrum is then processed by a mel filter, which warps the spectrum according to the human auditory response as determined by decades of psychoacoustic research. Finally, the mel-filtered spectrum is subjected to a discrete cosine transform, which results in what is called a cepstrum (“spectrum” with the first four letters reversed) consisting of multiple coefficients that represent the mel-adjusted shape of the original spectrum (see Fig. 1).

FIG. 1: Audible Magic's CBID system converts an audio signal (a) into a power spectrum (b), applies a mel filter (c), and converts the result into cepstral coefficients (d).

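The chain described above (short segments, FFT power spectrum, mel filter, discrete cosine transform) can be sketched in a few dozen lines of plain Python. This is only an illustration of the general MFCC technique, not Audible Magic's code. The function names, the frame length, the hop size, the number of filters, and the number of coefficients are all assumptions chosen for the example. The sketch also takes the logarithm of the mel-filtered energies before the cosine transform, a step that is standard in textbook MFCC recipes but is not mentioned in the description above.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def hz_to_mel(hz):
    # A commonly used mel-scale formula (one of several in the literature).
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale."""
    n_bins = n_fft // 2 + 1
    top = hz_to_mel(sample_rate / 2.0)
    edges_hz = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    edges = [hz * n_fft / sample_rate for hz in edges_hz]  # fractional FFT bins
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for b in range(n_bins):
            if lo < b <= mid:
                row.append((b - lo) / (mid - lo))   # rising slope
            elif mid < b < hi:
                row.append((hi - b) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def mfcc_frame(frame, bank, n_coeffs):
    n = len(frame)
    # Hann window to reduce spectral leakage at the segment edges.
    windowed = [s * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    power = [abs(c) ** 2 for c in fft(windowed)[: n // 2 + 1]]
    # Mel-filtered energies, then log (assumed step; small floor avoids log(0)).
    log_mel = [math.log(sum(w * p for w, p in zip(row, power)) + 1e-10)
               for row in bank]
    m = len(log_mel)
    # Unnormalized DCT-II of the log mel spectrum gives the cepstral coefficients.
    return [sum(e * math.cos(math.pi * k * (j + 0.5) / m)
                for j, e in enumerate(log_mel))
            for k in range(n_coeffs)]


def mfcc(signal, sample_rate, frame_len=256, hop=128, n_filters=20, n_coeffs=13):
    """Return one list of cepstral coefficients per overlapping frame."""
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    return [mfcc_frame(signal[start:start + frame_len], bank, n_coeffs)
            for start in range(0, len(signal) - frame_len + 1, hop)]


# Usage: fingerprint a short 440 Hz test tone sampled at 8 kHz.
rate = 8000
tone = [math.sin(2 * math.pi * 440 * t / rate) for t in range(1024)]
coeffs = mfcc(tone, rate)
```

Each entry of `coeffs` corresponds to one segment of the signal, so the full list traces how the spectral shape evolves over time. A production system would use an optimized FFT library rather than this recursive version, which is kept here only so the example needs nothing beyond the standard library.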

The Audible Magic system focuses on the midrange frequencies and cepstral coefficients because they are the most important in identifying an audio signal. Further, the midrange coefficients are relatively immune to changes caused by equalization, dynamic compression, data compression, and time scaling. The system is claimed to be 98 percent accurate, even in the face of such processing.

This technique allows any piece of audio to be identified by its unique MFCC “fingerprint,” facilitating a number of important applications. In particular, any audio data can be compared with other audio data to determine if they are identical, which is one way to detect copyright infringement without adding watermarks. Toward this end, Audible Magic maintains a database of 3.7 million North American songs and other copyrighted material, to which 10,000 new titles are added every week by the RIAA, major record labels, and SESAC, a performing-rights organization much like ASCAP and BMI. The company is also introducing a song-registration program that will accept material from independent and unsigned artists, as well as smaller labels, and add it to the database.
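To make the idea of matching a fingerprint against a database concrete, here is a minimal sketch. It assumes each fingerprint is a list of per-frame coefficient vectors of equal length and that two recordings count as the same when their average frame-to-frame distance falls under a threshold. The function names, the distance measure, the threshold, and the sample data are all hypothetical; the article does not describe how Audible Magic actually performs the comparison.

```python
import math


def fingerprint_distance(a, b):
    """Mean Euclidean distance between two equal-length fingerprints."""
    if len(a) != len(b):
        raise ValueError("fingerprints must cover the same number of frames")
    return sum(math.dist(u, v) for u, v in zip(a, b)) / len(a)


def identify(query, database, threshold):
    """Return the closest registered title, or None if nothing is close enough."""
    best_title, best_dist = None, float("inf")
    for title, reference in database.items():
        d = fingerprint_distance(query, reference)
        if d < best_dist:
            best_title, best_dist = title, d
    return best_title if best_dist <= threshold else None


# Usage with made-up two-coefficient fingerprints.
database = {
    "Song A": [[1.0, 2.0], [3.0, 4.0]],
    "Song B": [[5.0, 5.0], [6.0, 6.0]],
}
slightly_altered = [[1.1, 2.0], [3.0, 3.9]]   # e.g. after mild processing
unregistered = [[9.0, 9.0], [9.0, 9.0]]

match = identify(slightly_altered, database, threshold=0.5)
no_match = identify(unregistered, database, threshold=0.5)
```

The threshold is what lets a slightly altered copy still match while an unrelated recording returns nothing. A linear scan like this would not scale to millions of titles; a real service would need some form of indexing, which is beyond this sketch.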

One application, called RepliCheck, lets a CD manufacturer compare the tracks of a new CD with the database to see if the material is stolen. The system is surprisingly fast, taking only two minutes to check a full-length CD against the entire database. RepliCheck can be installed on a conventional Windows PC with an Internet connection.

Another antipiracy application is called network traffic monitoring, which can be used by P2P hosts and network operators at schools, businesses, and ISPs to block unauthorized transfers. The same type of system also works to track royalties and broadcast content.

Another application is SoundFisher, which lets you search an audio library for items that sound like a specified file. Here, the files are not identical but share certain sonic characteristics, as reflected by their cepstral coefficients and other acoustic features.
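A sound-alike search differs from exact identification in that the files being compared may have different lengths and never line up frame by frame. One simple way to handle that, shown below purely as an assumption and not as SoundFisher's actual method, is to collapse each file's per-frame features into a single summary vector (the mean and spread of each feature) and rank the library by distance to the query's summary. The names `summarize` and `most_similar` and the sample values are hypothetical.

```python
import math
import statistics


def summarize(frames):
    """Collapse per-frame feature vectors into per-feature means and std devs."""
    columns = list(zip(*frames))
    means = [statistics.fmean(c) for c in columns]
    spreads = [statistics.pstdev(c) for c in columns]
    return means + spreads


def most_similar(query_frames, library, top_n=3):
    """Return the names of the top_n library items closest to the query."""
    q = summarize(query_frames)
    ranked = sorted(library.items(),
                    key=lambda item: math.dist(q, summarize(item[1])))
    return [name for name, _ in ranked[:top_n]]


# Usage with made-up features; note that the files differ in length.
library = {
    "door_slam.wav": [[0.0, 0.0], [2.0, 2.0]],
    "sine_sweep.wav": [[10.0, 10.0], [10.0, 10.0], [10.0, 10.0]],
    "hand_clap.wav": [[1.0, 1.0], [1.0, 1.0]],
}
query = [[0.0, 0.0], [2.0, 2.0], [1.0, 1.0]]
ranking = most_similar(query, library, top_n=2)
```

Because the summary discards timing, two sounds with similar overall character rank close together even when their waveforms differ, which is the behavior a sound-alike search wants.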

This is just the tip of the iceberg for Audible Magic's CBID system. I predict a bright future for this technology. Who knows? The song it keeps from being pirated could be your own.