Supreme Being - EMusician

Supreme Being

Restoring audio data discarded during MP3 encoding.

Unless you live in a cave (which is unlikely if you're an electronic musician), you know that MP3 data compression has revolutionized online music distribution. Short for MPEG-1 Audio Layer III, MP3 can encode 16-bit digital-audio files at sampling rates of 32, 44.1, or 48 kHz and output a bitstream at bit rates from 32 to 320 kbps, which represents a compression ratio of as much as 48:1. On the Internet, stereo music files are typically encoded at 128 kbps, which represents a compression ratio of about 11:1 for CD audio.
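
The ratios quoted above follow from simple arithmetic, sketched here in Python. The function names are made up for this illustration. Pairing the 48:1 figure with a 48 kHz stereo source at 32 kbps is an inference from the numbers, not something the text states.

```python
def pcm_bit_rate_kbps(sample_rate_hz: int, bit_depth: int, channels: int) -> float:
    """Bit rate of uncompressed PCM audio in kilobits per second."""
    return sample_rate_hz * bit_depth * channels / 1000


def compression_ratio(pcm_kbps: float, encoded_kbps: float) -> float:
    """How many times smaller the encoded stream is than the PCM source."""
    return pcm_kbps / encoded_kbps


cd_kbps = pcm_bit_rate_kbps(44_100, 16, 2)    # CD audio, 16-bit stereo
max_kbps = pcm_bit_rate_kbps(48_000, 16, 2)   # 48 kHz, 16-bit stereo (assumed pairing)

print(round(compression_ratio(cd_kbps, 128), 1))   # typical Internet MP3
print(round(compression_ratio(max_kbps, 32), 1))   # lowest bit rate, highest sampling rate
```

CD audio works out to 1,411.2 kbps, so a 128 kbps file is roughly one-eleventh the size of its source.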

Like most lossy audio data-compression schemes, MP3 uses a process called perceptual audio coding, in which the audio signal is analyzed according to a psychoacoustic model to determine which parts of the signal are not perceptible to the human ear. Those parts are then discarded, which greatly reduces the storage and bandwidth requirements. At typical bit rates, MP3 encoders also discard most frequencies above about 16 kHz because most adults can't hear much in that range.
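
To get a feel for what discarding everything above 16 kHz means, here is a toy Python sketch. It is not the MP3 psychoacoustic model, which also removes masked content below the cutoff. It simply transforms one short frame to the frequency domain and zeroes every bin above a hard cutoff. The frame length, test tones, and function names are assumptions chosen for the demonstration.

```python
import cmath
import math

SAMPLE_RATE_HZ = 44_100
FRAME_LEN = 441        # bins are spaced 44,100 / 441 = 100 Hz apart
CUTOFF_HZ = 16_000     # cutoff figure taken from the text


def dft(samples):
    """Naive discrete Fourier transform (fine for a short demo frame)."""
    n = len(samples)
    return [
        sum(s * cmath.exp(-2j * math.pi * k * t / n) for t, s in enumerate(samples))
        for k in range(n)
    ]


def discard_above(spectrum, sample_rate_hz, cutoff_hz):
    """Zero every bin whose frequency lies above cutoff_hz."""
    n = len(spectrum)
    kept = []
    for k, value in enumerate(spectrum):
        # For real input, bins k and n - k describe the same frequency.
        freq_hz = min(k, n - k) * sample_rate_hz / n
        kept.append(value if freq_hz <= cutoff_hz else 0j)
    return kept


def tone(freq_hz, amplitude):
    """One frame of a sine wave at the given frequency and level."""
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * t / SAMPLE_RATE_HZ)
        for t in range(FRAME_LEN)
    ]


# Test frame: a strong 1 kHz tone plus a quiet 18 kHz tone.
frame = [a + b for a, b in zip(tone(1_000, 1.0), tone(18_000, 0.1))]
filtered = discard_above(dft(frame), SAMPLE_RATE_HZ, CUTOFF_HZ)

print(abs(filtered[10]) > 1.0)     # the 1 kHz bin survives
print(abs(filtered[180]) == 0.0)   # the 18 kHz bin is gone
```

Once the 18 kHz bin has been zeroed, nothing in the decoded signal says how loud that tone originally was. Any later restoration has to estimate it.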

However, even though frequencies above 16 kHz are mostly imperceptible, listeners often sense that something is missing when those frequencies are absent. They describe a lack of depth or presence in the sound. As a result, many find the audio quality of MP3 files unsatisfactory compared with uncompressed CDs.

To address the problem, consumer-electronics giant Kenwood developed Supreme, a technology that allegedly restores the high frequencies lost during MP3 encoding. Supreme analyzes a bitstream that was decoded from MP3 to PCM and interpolates the missing high-frequency information in real time based on the signal's spectrum (see Fig. 1).
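
The article does not describe how Supreme performs its interpolation, so the Python sketch below is not Kenwood's algorithm. It illustrates one generic bandwidth-extension idea: fill the empty bins above the cutoff with a scaled copy of the band just below it. The function name, the `gain` parameter, and the toy spectrum are all hypothetical.

```python
def extend_high_band(magnitudes, cutoff_bin, gain=0.25):
    """Fill bins from cutoff_bin upward with a scaled copy of the band just below.

    magnitudes -- per-bin spectral magnitudes from 0 Hz up to Nyquist
    cutoff_bin -- first bin that the codec left empty
    gain       -- hypothetical level applied to the copied band
    """
    missing = len(magnitudes) - cutoff_bin
    source_start = cutoff_bin - missing
    if source_start < 0:
        raise ValueError("not enough surviving bandwidth to copy from")
    extended = list(magnitudes)
    for i in range(missing):
        extended[cutoff_bin + i] = gain * magnitudes[source_start + i]
    return extended


# Eight-bin toy spectrum whose top three bins were discarded by the encoder.
decoded = [9.0, 8.0, 6.0, 4.0, 2.0, 0.0, 0.0, 0.0]
print(extend_high_band(decoded, cutoff_bin=5))
# prints [9.0, 8.0, 6.0, 4.0, 2.0, 1.5, 1.0, 0.5]
```

A real system would presumably shape the synthesized band far more carefully. The point is only that the new high band is derived from the spectrum that survived encoding.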

The Supreme Core, the algorithm that performs the interpolation, requires a stereo 16-bit datastream at a sampling rate of 44.1 kHz, so a sample-rate converter might be needed before the input stage. The amount of interpolation is variable and can be optimized for different music sources. Many experiments have demonstrated that the correlation between the original and interpolated frequencies is strong, indicating that the synthesized data closely matches the original, discarded harmonics.
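
The article does not say how that correlation was measured. One common way to quantify such a match is the Pearson correlation coefficient between the original and restored high-band magnitudes, sketched below in Python. The sample values are invented for illustration and are not measurements of Supreme.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Invented high-band magnitudes: original recording vs. restored signal.
original_band = [3.0, 2.5, 1.8, 1.2, 0.7]
restored_band = [2.8, 2.6, 1.6, 1.3, 0.6]

print(round(pearson(original_band, restored_band), 3))
```

A value near 1 means the restored band rises and falls with the original. It does not by itself prove that the two sound alike.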

Supreme is not limited to MP3 audio; it can also be applied to other stereo codecs such as Advanced Audio Coding (AAC, which is part of MPEG-2), Adaptive Transform Acoustic Coding (ATRAC, which is part of the MiniDisc format), and Windows Media Audio (WMA). A Windows-based dynamic-link library (DLL) is also available, making it easy to implement Supreme in software. The minimum system is an MMX Pentium/233 MHz with 32 MB of RAM and a 16-bit sound card, running Windows 95, 98, or ME (but not Windows NT or 2000). The algorithm can also be embedded in firmware on a variety of digital signal processing (DSP) chips.

The technology has many potential applications. Besides running Supreme on a computer for music downloading and streaming, the most obvious application is to build it into an MP3 player with a DSP chip. It is already being used in two products available in Japan: TDK's MP3 Audio Magic encoding and decoding software and Sotec's Afina AV computer. Other possible applications include digital television, wireless handheld devices (such as cell phones and personal digital assistants), and Internet and satellite radio, all of which compress audio with lossy codecs.

The ability to recover data discarded during the encoding process holds considerable appeal for the Internet generation, which has so far endured an inevitable loss of audio quality in exchange for significantly lower bandwidth and storage requirements. If Kenwood can bring Supreme to the market for a reasonable price, it seems destined to become a ubiquitous part of the online-music landscape.