We rightly rely on our ears as the final arbiters of recording and mixing issues. Sometimes, however, identifying the cause of an audible production problem by ear can be difficult. Metering and analysis tools provide quantitative information that allows us to observe an audio signal from various perspectives, which can be the key to avoiding problems and identifying pesky artifacts.
Today, analysis tools have emerged from research laboratories to become affordable and readily available in plug-in suites such as BIAS Reveal (part of the company's Master Perfection Suite) and Roger Nichols Digital Inspector XL. The tools are embedded in editors and processing software like Steinberg WaveLab, iZotope Ozone, and Sony Creative Software Sound Forge. You also can find them in standalone form in the godfather of modern desktop audio analysis, Metric Halo SpectraFoo, as well as in newer entries such as Audiofile Engineering Spectre and SuperMegaUltraGroovy FuzzMeasure Pro. You can even find shareware and freeware analysis tools, including a scaled-down Inspector from RND.
Although the tools are available, how many people really know how to use them? Most users understand level meters and know the difference between peak and RMS (averaged) levels, but far fewer can read a spectrogram or a correlation meter, or even know what these tools are.
To help you understand computer-based metering and analysis tools, I'll discuss what they are, when you would use them, and how to make sense of what they say. I can only scratch the surface here, but hopefully that will be enough to embolden you to try out and learn more about some analysis tools. I'll look primarily at affordable software tools for signal analysis, rather than at tools for system or component measurement, although the line between the two types can be fuzzy. Hardware analyzers and high-end, specialized software such as acoustical-analysis programs have been excluded.
A Few Distinctions
FIG. 1: This BIAS Reveal display shows peak and RMS power histories for each channel. Notice the song fade-out at the right of the display and the roughly 12 dB crest factor.
Mathematically speaking, an audio phenomenon can be seen as a function of time or of frequency, and there are ways to convert between these two domains. Most audio-analysis tools exploit this concept to let you view the same data in multiple ways, each of which yields its own insights. For our purposes, we can break the available audio-analysis tools into the following categories: level, spectral analysis, phase- and stereo-image meters, transfer functions, and code tools and statistics.
Analyzers are often distinguished by the ways in which they handle time. Analysis tools fall into one of two presentation approaches; I will refer to the first as “now,” and the second as “then.” “Now” tools are real-time, and they display an analysis of a signal as it happens. “Then” tools show analysis over time, which provides a historical context for spotting trends or episodes in the audio (see Fig. 1).
Since histories and averaged values represent analysis over time, some time must elapse before a history or an average can be created. In real-time applications, such as live sound, the latency incurred by these processes is annoying at best, and unacceptable at worst. However, in many cases, such as RMS measurement, the time over which the signal is averaged can be short enough to be almost real time.
Another distinction occurs between those analysis tools that are available as plug-ins and those that run standalone. Plug-ins are usable within DAW and audio-editor hosts, which makes them well suited for production tasks such as monitoring signals during recording, overdubbing, or mixing. Standalone programs can be used to observe a live signal — sometimes even multiple signals. But here the basic paradigm is different: only a single instance of an analyzer program is expected to be running, and you generally use the program as a bench-test instrument rather than as a production-monitoring tool.
Analysis tools provide broad capabilities, but people's needs for them are often specific. To accommodate such needs, most analysis tools are highly configurable, offering graphic preferences, variable parameters for processes such as Fast Fourier Transforms (FFTs), and parameters for defining quantities (such as the number of clipped samples that constitute a “clip” condition). The more comfortable you become with these tools, the more you will probably customize.
A good place to start working with analysis tools is by looking at the most important analysis technique for audio, which underlies a number of tools: the FFT. FFTs convert data from the time domain into the frequency domain. (An inverse FFT converts in the other direction.) FFTs allow you to look at the spectrum of a signal.
The SpectraFoo manual says, “The FFT algorithm is an efficient means of computing a Fourier transform on a computer. The Fourier transform, developed between 1804 and 1807 by the mathematician Joseph Fourier as part of a study of heat transfer, converts a continuous record of amplitude versus time into a record of amplitude versus frequency. A modification of the Fourier Transform called the Discrete Fourier Transform (DFT) was developed to deal with sampled, rather than continuous, waveforms. The FFT algorithm was developed as an efficient way of computing the DFT on digital computers.”
The FFT is the basis for everything from spectrum analyzers to transfer functions. The two key parameters that determine FFT performance are block size (the number of samples on which the FFT is performed) and window type (a preconditioning function applied to reduce error in the FFT). For more on FFTs and the impact of these parameters, see the online bonus material at www.emusician.com.
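To make the two parameters concrete, here is a minimal Python sketch. The 48 kHz sample rate, the 256-sample block, and the Hann window are my assumptions for illustration, and the transform is a plain (slow) DFT rather than a true FFT, so that the example needs no libraries. It shows that block size sets the spacing between frequency bins, and that a window reduces the energy that leaks into bins far from the actual tone.

```python
import cmath
import math

SAMPLE_RATE = 48000   # assumed sample rate in Hz
BLOCK_SIZE = 256      # block size in samples (kept small for this demo)

def bin_spacing_hz(sample_rate, block_size):
    """Frequency distance between adjacent bins."""
    return sample_rate / block_size

def hann_window(n):
    """Hann window, one common choice of preconditioning function."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]

def dft_magnitudes(block):
    """Magnitudes of bins 0..N/2 from a plain DFT.
    An FFT gives the same numbers, only faster."""
    n = len(block)
    return [
        abs(sum(block[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n)))
        for k in range(n // 2 + 1)
    ]

def leakage_db(mags, far_bin=60):
    """Level of a bin far from the tone, relative to the strongest bin."""
    return 20 * math.log10(mags[far_bin] / max(mags))

# A 1 kHz sine falls between bins 5 and 6 at this block size.
tone = [math.sin(2 * math.pi * 1000 * i / SAMPLE_RATE) for i in range(BLOCK_SIZE)]
plain = dft_magnitudes(tone)
windowed = dft_magnitudes([s * w for s, w in zip(tone, hann_window(BLOCK_SIZE))])

print(bin_spacing_hz(SAMPLE_RATE, BLOCK_SIZE))   # bin spacing in Hz
print(leakage_db(plain), leakage_db(windowed))   # the windowed figure is lower
```

Doubling the block size halves the bin spacing, at the cost of needing twice as many samples before a result can be shown.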
How Loud Is It?
Measuring level is the most familiar metering function. While people frequently say that a level meter shows how loud a sound is, loudness is actually a perceptual attribute that is almost never directly measured. Level meters show signal amplitude or power in decibels and vary in their response characteristics and demarcation.
The ear responds primarily to the average level of sound. Therefore, averaged metering techniques such as RMS metering are useful when you are concerned with how level is being perceived. Having peak-level metering is more critical for equipment that can overload.
Courtesy Bob Katz
FIG. 2: Mastering engineer Bob Katz devised the K-System scales for level metering and monitoring. This graphic has been reprinted from Katz's book, Mastering Audio: The Art and the Science (Focal Press, 2007).
With music and most other signals (assuming that they have not already been subject to dynamic compression), a substantial difference typically exists between the peak and average levels, often as much as 20 dB. The ratio between the peak and average (RMS) levels of a signal is called the crest factor. Crest factor is important because it determines the amount of headroom that is required. In audio, crest factor is usually expressed in decibels, making crest factor calculations as easy as subtracting a signal's RMS value from its peak value.
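The subtraction can be sketched in a few lines of Python. The function names here are mine, and the test signal is simply a full-scale sine wave, whose crest factor is known to be about 3 dB.

```python
import math

def to_db(value):
    """Convert a linear amplitude to decibels."""
    return 20 * math.log10(value)

def crest_factor_db(samples):
    """Peak level minus RMS level, both in dB."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return to_db(peak) - to_db(rms)

# Ten full cycles of a full-scale sine, 100 samples per cycle
sine = [math.sin(2 * math.pi * i / 100) for i in range(1000)]
print(round(crest_factor_db(sine), 2))  # about 3.01
```

Uncompressed music would typically give a much larger figure than the sine does.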
Mastering engineer Bob Katz's K-System for level metering has been gaining acceptance among engineers and metering-software manufacturers. The K-System takes the idea of crest factor to the next level, defining three meter calibrations with different zero levels that indicate varying amounts of headroom (see Fig. 2). The three calibrations are used because recordings in different genres vary in crest factor. Pop-music recordings, for example, are typically highly compressed, producing a lower crest factor and higher average level, so that less headroom is needed on the meter; classical or audiophile recordings typically have a higher crest factor. Therefore, commercial pop recordings might use the K-12 scale, which sets zero at -12 dBFS and so defines only 12 dB of headroom, while classical recordings might use K-20, which sets zero at -20 dBFS. Whichever scale is used, according to the K-System, your monitors should be calibrated so that zero produces 83 dB SPL in the room. For a more complete explanation of the K-System, see part 2 of the paper “Level Practices” at www.digido.com/bob-katz/level-practices-part2-includes-the-k-system.html, on Bob Katz's Digital Domain Web site.
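The relationship between a full-scale reading and a K-System reading is a simple offset, which the following Python sketch illustrates. The table and function names are mine; the scales listed are K-20 and K-12 plus K-14, the third of Katz's calibrations.

```python
# Headroom in dB above the meter's zero mark for each K-System scale
K_SCALES = {"K-20": 20, "K-14": 14, "K-12": 12}

def dbfs_to_k(level_dbfs, scale="K-20"):
    """Convert a level in dBFS to a reading on the chosen K-System scale."""
    return level_dbfs + K_SCALES[scale]

print(dbfs_to_k(-20, "K-20"))  # 0: the zero mark on a K-20 meter
print(dbfs_to_k(-14, "K-12"))  # -2: the same mix sits lower on a pop-oriented scale
```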
Viewing with VUs
Volume unit (VU) meters and RMS-based meters are intended to indicate average levels, but they use different methods to do so. Mechanical VU meters have long integration (rise and fall) times of 300 ms and use mechanical smoothing to achieve an approximation of averaging. RMS meters, in contrast, perform a root-mean-square calculation to derive an average (mean) power level over a period of time that is called an RMS window. Smaller window sizes make the measurement more responsive to short-duration events and low-level peaks, while larger sizes apply more smoothing but add latency.
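The effect of the RMS window size can be seen in a small Python sketch. It uses non-overlapping windows for simplicity, which is my assumption; real meters often use overlapping or exponentially weighted windows. The test signal is silence followed by a short full-scale burst.

```python
import math

def windowed_rms_db(samples, window):
    """One RMS reading in dBFS for each non-overlapping window of samples."""
    readings = []
    for start in range(0, len(samples) - window + 1, window):
        chunk = samples[start:start + window]
        rms = math.sqrt(sum(s * s for s in chunk) / window)
        readings.append(20 * math.log10(max(rms, 1e-10)))  # floor avoids log(0)
    return readings

# 900 samples of silence, then a 100-sample full-scale burst
signal = [0.0] * 900 + [1.0] * 100
short = windowed_rms_db(signal, 100)    # ten readings; the last one catches the burst
long_ = windowed_rms_db(signal, 1000)   # one reading; the burst is averaged down

print(round(max(short), 1), round(long_[0], 1))
```

The short window reports the burst at full level, while the long window smooths it to a reading 10 dB lower.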
In the VU's heyday, mechanical VU meters were useful because of how difficult it was to create a more accurate averaging meter. Modern software meters are rarely true VU meters, even when they are marked as such. RMS meters provide a more meaningful indication of average level.
Peeking at Peaks
FIG. 3: This screen from iZotope Ozone shows six different metering functions: a histogram to the right of the threshold control, a gain-reduction meter, a DC offset meter, a bit scope, and input and output level meters with clip indicators.
Signal peaks can be much higher than average levels, and they are faster and shorter in duration. The peak program meter (PPM) was created to show signal peaks. The PPM standard (European Broadcasting Union technical document 3205-E) mandates an integration time of about 10 ms, which is fast enough to catch most peaks, but just slow enough to ignore spurious artifacts. Digital peak meters, which are the most common in DAWs, have no integration time; that is, they show the instantaneous peak value of the signal. While that method is truthful and can help to avoid clipping problems, it can lead you to focus disproportionately on peak values rather than average values, which give a better indication of what you are hearing. It is unfortunate that RMS meters aren't as prevalent as peak meters in DAWs and audio editors.
Hold functions highlight the hottest points in a signal by retaining the last high point that the level meter hit for a configurable amount of time (the hold time). Assuming that a higher peak has not already come along and reset the indicator, it will then fall to the current peak and remain there. Many engineers rely on the peak-hold indicator even more than they rely on the peak meter to monitor the maximum levels in the signal.
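A hold function can be modeled roughly as follows. This Python sketch is only one possible behavior; the class name is hypothetical, and I measure hold time in meter updates rather than in milliseconds.

```python
class PeakHold:
    """Hold the highest recent peak for a fixed number of meter updates."""

    def __init__(self, hold_updates):
        self.hold_updates = hold_updates
        self.held = 0.0
        self.age = 0

    def update(self, current_peak):
        # A new maximum, or an expired hold, resets the indicator
        if current_peak >= self.held or self.age >= self.hold_updates:
            self.held = current_peak
            self.age = 0
        else:
            self.age += 1
        return self.held

meter = PeakHold(hold_updates=3)
readings = [meter.update(p) for p in [0.9, 0.2, 0.3, 0.1, 0.4, 0.05]]
print(readings)  # [0.9, 0.9, 0.9, 0.9, 0.4, 0.4]
```

The 0.9 peak stays on display for three further updates, after which the indicator falls to the current peak.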
The Historical Record
FIG. 4: Metric Halo SpectraFoo's FFT spectrum analyzer (called the Spectragraph) and its oscilloscope are both highly configurable.
Level history can be shown in several ways. A peak and/or RMS power-level history gives a continuous record of how the level evolves. That information is useful for, among other things, checking level in different parts of a song to make sure that the level of one verse or chorus is not too much higher or lower than that of other verses and choruses.
Level histograms give a different perspective on history. Histograms show frequency distribution, and so a level histogram indicates how often a given level occurs over time (see Fig. 3). When setting dynamics processors such as compressors and gain maximizers, a level histogram helps by letting you see where most of a song's energy falls.
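The counting behind a level histogram is straightforward, as this Python sketch suggests. The 3 dB bin width and the list of readings are invented for illustration; in practice the readings would come from a peak or RMS meter.

```python
import math
from collections import Counter

def level_histogram(levels_db, bin_db=3.0):
    """Count how many readings fall in each bin, keyed by the bin's lower edge in dB."""
    return Counter(math.floor(level / bin_db) * bin_db for level in levels_db)

# Hypothetical meter readings in dBFS
readings = [-12.5, -13.0, -14.9, -6.1, -30.0]
hist = level_histogram(readings)
for edge in sorted(hist, reverse=True):
    print(f"{edge:6.1f} dB  {'#' * hist[edge]}")
```

Most of the readings here land in the bin that starts at -15 dB, which is the kind of concentration you would look for when choosing a compressor threshold.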
The time-domain waveform displays found in DAWs and audio editors are level histories, too. Oscilloscopes, on the other hand, provide real-time waveform monitoring (see Fig. 4).
Implementing real-time analysis tools for multichannel surround can be a daunting task in several respects. As a result, it's rare to find surround monitoring other than within a DAW host. Programmable Analysis Software's (PAS) shareware Surround Meter is one of the few such tools that shows up in a Google search.
After level, the most common analysis task is spectral analysis. Spectral-analysis tools show the amplitude of each frequency component of a signal. The analysis parameters are key to determining frequency resolution and accuracy across the spectrum. FFT spectrum analyzers, the most common type, are x-y displays that show frequency on the x-axis and amplitude on the y-axis (see Fig. 4). FFT displays can be read intuitively: if you have a lot of energy at or around one frequency, that fact is self-evident from the higher readout shown at that frequency.
That is not to say, however, that spectrum analyzers are only narrowly useful. You can look at both RMS and peak spectral analyses with them and use hold functions in the way that they are used with level metering. Using hold functions can be very helpful when you are trying to pinpoint problem frequencies that aren't easily identified by ear. Since spectrum analyzers usually give a readout of the frequency and level corresponding to the current position of the cursor, you can easily home in on where a problem lies by looking at the hold display and moving the cursor where there are anomalies.
FIG. 5: This graphic shows the A-, B-, and C-weighting curves, derived from Fletcher-Munson equal-loudness curves.
Many spectrum analysis tools offer A-, B-, and C-weighting curves, which make the analyzer read more in the way that sound is perceived. Human hearing response is not linear across frequency, so when a spectrum analyzer shows equal levels of high and low frequencies in a signal, it is likely to sound as though there is more treble than bass. Worse, the frequency response of this filtering effect in human perception changes with level.
Weighting curves apply filters to the signal that is routed to the analyzer in order to bring the readings more in line with how sound is perceived. A-weighting initially approximated the 40 dB Fletcher-Munson equal-loudness curve; B-weighting (rarely used), the 70 dB curve; and C-weighting, the 100 dB curve. In that method, you used the appropriate curve for the source level (see Fig. 5). That idea mutated, though, and there was a movement to standardize level measurement around A-weighting only. Today, weighting curves are often chosen for their appropriateness to the application rather than to the listening level. For instance, A-weighting has long been used in outdoor measurements of ambient sound, in which people easily tune out continuous, low-frequency background sounds.
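For readers who want numbers, the A-weighting curve is commonly computed from a closed-form expression given in the sound-level-meter standards. The Python sketch below uses that expression as I know it, with its usual constants; treat it as an illustration rather than as a description of how any particular analyzer implements weighting.

```python
import math

def a_weighting_db(freq_hz):
    """Approximate A-weighting gain in dB at the given frequency."""
    f2 = freq_hz * freq_hz
    numerator = (12194.0 ** 2) * f2 * f2
    denominator = (
        (f2 + 20.6 ** 2)
        * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    # The 2.00 dB offset normalizes the curve to 0 dB at 1 kHz
    return 20 * math.log10(numerator / denominator) + 2.00

for f in (100, 1000, 10000):
    print(f, round(a_weighting_db(f), 1))
```

The curve reads close to 0 dB at 1 kHz and roughly 19 dB down at 100 Hz, which is why A-weighted readings largely ignore low-frequency content.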
Third-octave analyzers show the spectrum broken down into ISO third-octave bands, a display familiar to those who have used hardware third-octave RTAs (real-time analyzers). Third-octave analyzers are useful for getting a feel for the overall shape of the spectrum. Third-octave bands, however, are not fine enough to pinpoint many problems, and their center frequencies do not closely relate to the harmonic relationships that dominate musical and acoustical signals.
There is an important distinction between FFT spectrum analyzers and third-octave analyzers. FFT analysis produces a linearly spaced response; that is, it breaks the spectrum into bands with spacing that has a fixed number of hertz. White noise, which has equal energy per frequency, shows a flat response when viewed with an FFT analyzer.
In contrast, third-octave analysis produces a logarithmically spaced representation, based on a division of an octave. Pink noise, which has equal energy per octave, shows a flat response when viewed with a third-octave analyzer. White noise viewed on the same analyzer shows more energy in each successively higher band (about 3 dB more per octave), because each band is wider in hertz than the one below it; the result looks like a low-frequency rolloff.
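The arithmetic behind the two noise colors can be checked with a short Python sketch. It works with idealized noise (a perfectly flat density for white noise, a density proportional to 1/f for pink) and with band edges placed a sixth of an octave either side of the center frequency; the function names are mine.

```python
import math

def third_octave_edges(center_hz):
    """Lower and upper edges of a third-octave band around a center frequency."""
    return center_hz * 2 ** (-1 / 6), center_hz * 2 ** (1 / 6)

def white_band_db(center_hz):
    """Band energy for white noise: proportional to the bandwidth in hertz."""
    low, high = third_octave_edges(center_hz)
    return 10 * math.log10(high - low)

def pink_band_db(center_hz):
    """Band energy for pink noise: the integral of 1/f, which is ln(high / low)."""
    low, high = third_octave_edges(center_hz)
    return 10 * math.log10(math.log(high / low))

print(round(white_band_db(2000) - white_band_db(1000), 2))  # about 3.01
print(round(pink_band_db(2000) - pink_band_db(1000), 2))    # about 0
```

White noise gains about 3 dB per octave on a third-octave display, while pink noise reads the same in every band.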
The Colorful Spectrogram
FIG. 6: In RND Inspector XL's spectrogram, note the visibility of cymbal crashes and beats from snare hits.
A spectrogram (sometimes called “sonogram”) shows a spectral history, a continuous 3-axis record of FFTs performed on an incoming signal. The spectrogram shows time along one axis and frequency along the second axis and uses color (the third axis) to show level.
Spectrograms are used heavily in speech research but are useful also for studio work, and they're easy to read once you are accustomed to them. For example, it's not hard to see where the beat is in a typical pop song: sharp, regularly spaced lines along the spectrum indicate transients that are probably the snare, and cymbal crashes can be seen by the smear in the high frequencies that follow some snare hits. Other transient events, such as a cough or a door-close during a live recording, can also be spotted (see Fig. 6). Spectrograms are helpful in comparing the spectra of different songs. Note that the larger the FFT size, the more history is shown in a spectrogram.
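In outline, a spectrogram is a series of spectra computed on successive blocks of the signal. The Python sketch below assumes an 8 kHz sample rate and 64-sample non-overlapping blocks, uses a plain DFT in place of an FFT, and omits windowing to stay short. The test signal switches from 1 kHz to 2 kHz halfway through, and the strongest bin in each column follows the change.

```python
import cmath
import math

SAMPLE_RATE = 8000  # assumed sample rate in Hz
FRAME = 64          # samples per spectrogram column

def frame_magnitudes(frame):
    """Magnitudes of bins 0..N/2 for one block (plain DFT, no window)."""
    n = len(frame)
    return [
        abs(sum(frame[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n)))
        for k in range(n // 2 + 1)
    ]

def spectrogram(samples, frame_size):
    """One column of bin magnitudes per non-overlapping block."""
    return [
        frame_magnitudes(samples[start:start + frame_size])
        for start in range(0, len(samples) - frame_size + 1, frame_size)
    ]

# 1 kHz for the first half, 2 kHz for the second half
signal = [
    math.sin(2 * math.pi * (1000 if i < 128 else 2000) * i / SAMPLE_RATE)
    for i in range(256)
]
columns = spectrogram(signal, FRAME)
peak_bins = [col.index(max(col)) for col in columns]
print(peak_bins)  # [8, 8, 16, 16]; bins are 125 Hz apart
```

A display program would map each magnitude to a color instead of printing the strongest bin.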
Phase and Stereo-Image Meters
Phase and stereo-image meters illustrate the relative time relationships between the left and right channels of the stereo signal. The simplest of these tools is the stereo-balance meter, a horizontal strip showing power distribution between the two channels. The stereo-balance meter can be useful for balancing stereo tracks of sections, such as background vocals and strings, and for checking stereo-miking techniques.
Phase monitoring comes in several forms, the most familiar being the Lissajous display — a simple x-y display in which each axis shows the instantaneous level of one channel. If the display shows a line (or, more often, in practice, a narrow oval) pointing from lower left to upper right, the material in both channels is very much in phase, meaning that the signal should be highly mono compatible. If the display shows a straight line pointing from upper left to lower right, then that means there are identical signals in each channel, but with opposite polarities, which usually is not good. A somewhat fatter oval is the most common, and a circle would indicate that you have as much out-of-phase material as you would want in a mix.
A vector scope works similarly to a Lissajous, except that the display has been rotated 45 degrees counterclockwise, so that in-phase behavior is shown by a line going straight up and down, and out-of-phase material is shown by a horizontal line.
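The rotation is ordinary plane geometry, as a short Python sketch shows. I assume here that the Lissajous display puts the left channel on the x-axis and the right channel on the y-axis; some displays swap them.

```python
import math

def vectorscope_point(left, right):
    """Rotate the Lissajous point (x=left, y=right) 45 degrees counterclockwise."""
    c = math.sqrt(0.5)  # cos(45 degrees) = sin(45 degrees)
    x = (left - right) * c
    y = (left + right) * c
    return x, y

print(vectorscope_point(1.0, 1.0))    # in phase: x is 0, so the point lies on the vertical axis
print(vectorscope_point(1.0, -1.0))   # opposite polarity: y is 0, so it lies on the horizontal axis
```

After rotation, the vertical axis is proportional to the sum of the channels and the horizontal axis to their difference.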
Can You Correlate?
FIG. 7: This phase scope shows a polar plot of amplitude versus perceived location. This material doesn't have much stereo image; note the 13 dB difference between mid and side signals.
Correlation meters measure the similarity between two signals (see Fig. 7). A fully correlated signal is the same in each channel — that is, the signal is mono. But correlation is not the same as phase: if two channels are extremely dissimilar, they may have a lot of out-of-phase information, or they may have a number of hard-panned mono instruments. However, uncorrelated material is usually perceived as stereo material, and highly correlated material as mono, so a correlation meter can give some idea of the overall width of the signal. It is important to look at trends in the meter more than its instantaneous behavior, because moments with a lot of uncorrelated material between channels are common.
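One common way to compute such a reading is a normalized cross-correlation at zero lag, sketched below in Python. Whether a given meter uses exactly this formula, and over what time span, is an assumption on my part.

```python
import math

def correlation(left, right):
    """Normalized correlation between two channels, from -1 to +1."""
    cross = sum(l * r for l, r in zip(left, right))
    energy = math.sqrt(sum(l * l for l in left) * sum(r * r for r in right))
    return cross / energy if energy > 0 else 0.0

tone = [math.sin(2 * math.pi * i / 50) for i in range(500)]
inverted = [-s for s in tone]

# Two hard-panned parts that never sound at the same time
left_only = tone[:250] + [0.0] * 250
right_only = [0.0] * 250 + tone[250:]

print(correlation(tone, tone))             # mono reads close to +1
print(correlation(tone, inverted))         # flipped polarity reads close to -1
print(correlation(left_only, right_only))  # hard-panned parts read 0
```

The third case shows why a low reading does not necessarily mean phase trouble.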
Using mid-side meters is another way of contrasting mono and stereo information in a signal. A mid-side meter typically has two displays: one showing material that appears equally in both channels (mid), the other showing material that is different between them (side). A healthy mix generally has near-equal amounts of both.
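Mid and side signals are usually derived as half the sum and half the difference of the two channels. The Python sketch below uses that convention (the scaling by one-half is an assumption; some tools scale differently) and reports each level in dB.

```python
import math

def rms_db(samples):
    """RMS level in dB, with a floor to avoid taking the log of zero."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(max(rms, 1e-10))

def mid_side_levels_db(left, right):
    """Levels of the mid (sum) and side (difference) signals."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return rms_db(mid), rms_db(side)

# The same tone in both channels, but 6 dB quieter on the right
left = [math.sin(2 * math.pi * i / 50) for i in range(500)]
right = [0.5 * s for s in left]

mid_db, side_db = mid_side_levels_db(left, right)
print(round(mid_db - side_db, 2))  # about 9.54
```

A signal that is only panned slightly off center, as here, shows far more mid than side.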
Interchannel phase can also be displayed on a polar plot, but two polar-phase plots might represent very different information. For instance, SpectraFoo's Phase Torch shows a vector whose length represents frequency and whose angle represents phase angle. Inspector XL's Phase Scope, on the other hand, plots amplitude as the distance from the circumference of the scope and perceived direction of the sound as the angle.
A transfer function is generated with a differential FFT analyzer: that is, FFTs are computed for two signals, and then one FFT is divided by the other, leaving only the difference between them. If one signal is the input to a system and the other its output, the display shows how the signal was altered going through the system. Put a vocal track into one input and an EQ'd version of the same vocal on the output, and you will see the actual EQ being applied (regardless of what the EQ display says).
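The division can be demonstrated with a toy system in Python. Everything here is an assumption made for illustration: the "system" is a two-sample averaging filter applied circularly, the source is seeded random noise, and the transform is a plain DFT. Real analyzers also average many blocks and guard against bins in which the input has no energy.

```python
import cmath
import math
import random

def dft(x):
    """Plain DFT; an FFT would return the same values."""
    n = len(x)
    return [
        sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
        for k in range(n)
    ]

def transfer_function_db(system_in, system_out):
    """Magnitude of output spectrum divided by input spectrum, in dB, bins 0..N/2."""
    spec_in, spec_out = dft(system_in), dft(system_out)
    half = len(system_in) // 2
    return [
        20 * math.log10(max(abs(spec_out[k] / spec_in[k]), 1e-10))
        for k in range(half + 1)
    ]

random.seed(1)
source = [random.uniform(-1, 1) for _ in range(64)]  # any material will do
# Toy system: average each sample with the previous one (wrapping at the start)
processed = [0.5 * (source[n] + source[n - 1]) for n in range(len(source))]

response = transfer_function_db(source, processed)
print(round(response[0], 2), round(response[16], 2))  # about 0 dB and about -3.01 dB
```

The result is the filter's gentle high-frequency rolloff, recovered without knowing anything about the source material.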
Transfer functions have several powerful properties. First, since they are a comparison, the source material could be anything, including music. Second, it is possible to show phase differences as well as spectral and level differences, and even to calculate coherence, which is a good indicator of signal-to-noise ratio.
Transfer functions can be used for any kind of comparison, but they are most commonly used for acoustical measurements, especially in tuning sound systems. You can also check the performance of pieces of equipment or software and compare the outputs of two microphones, among other things. As powerful as transfer functions are, they are rarely used in studio applications. One reason for that is they are CPU intensive, making them difficult to release in plug-in format. Another is that test applications often call for comparison, but music recording and mixing production deal more with what is happening in real time and, in some cases, the recent past.
Code Tools and Statistics
One class of information in digital recording relates to the behavior of the digital audio medium rather than to the source signal; the tools that display it are code-related tools. Test packages such as SpectraFoo have a more extensive set of code tools than do most production-oriented analysis packages.
Clipping counters let you know the frequency with which digital clipping has occurred. Some analysis tools refer to “overs” rather than “clipping,” and Inspector XL distinguishes the two, defining clipping as levels exceeding a user-defined clip threshold, and overs as signals that exceed full-scale. In many cases, one or two clipped samples won't be audible, so if a clip counter shows only a few clips for an entire song, there may not be reason for concern. On the other hand, having dozens of clips might be a concern. Clip counters can generally be programmed to respond to a specified number of consecutive clips.
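A counter that responds only to runs of consecutive clipped samples might look something like this Python sketch. The function name, the full-scale threshold of 1.0, and the default run length of three are all my assumptions.

```python
def count_clips(samples, threshold=1.0, min_run=3):
    """Count runs of at least min_run consecutive samples at or above the threshold."""
    clips = 0
    run = 0
    for s in samples:
        if abs(s) >= threshold:
            run += 1
            if run == min_run:  # count each qualifying run once
                clips += 1
        else:
            run = 0
    return clips

audio = [0.5, 1.0, 1.0, 1.0, 0.2, 1.0, 1.0, 0.1, -1.0, -1.0, -1.0, -1.0]
print(count_clips(audio))             # 2: the run of two samples is ignored
print(count_clips(audio, min_run=1))  # 3: every run counts
```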
Bit scopes are real-time indicators of the instantaneous use of each bit in a digital word. If, for example, there were a plug-in or device that operated at 16-bit resolution but claimed to use 24-bit, it would be obvious on a bit scope, because bits 17 through 24 would never light up to indicate use. Another example: a bit in the middle of the word that never lights up to show that it is used could indicate a problem in a digital audio converter.
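The first example can be imitated in Python by collecting every bit that is ever set across a run of samples. I assume signed integer samples in a 24-bit word and, following the numbering above, list the bits from most significant (bit 1) to least significant (bit 24).

```python
def bits_used(samples, word_length=24):
    """For each bit, most significant first, report whether any sample ever set it."""
    mask = (1 << word_length) - 1
    seen = 0
    for s in samples:
        seen |= s & mask  # view negative values as two's complement words
    return [bool((seen >> (word_length - 1 - i)) & 1) for i in range(word_length)]

# Hypothetical 16-bit audio padded into 24-bit words
padded = [v << 8 for v in (1, -3, 12345, -32768, 32767)]
used = bits_used(padded)
print("".join("1" if bit else "0" for bit in used))  # sixteen 1s followed by eight 0s
```

A real bit scope updates continuously rather than accumulating, but the dead bits at the bottom of the word would be just as obvious.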
The strength of analysis tools is that they give visualizations of quantitative data, which makes understanding the data more intuitive. One limitation of analysis tools is that using more than two or three at a time creates visual clutter. Small analysis windows are not very useful, so with four or five of them open at the same time, even a dual-monitor system must devote substantial space to meters. Constantly updating multiple, real-time graphic readouts is also taxing on a CPU.
There is also the human problem of paying attention. With too many dancing displays going at the same time, the analytical focus that was the impetus for using a meter in the first place becomes dissipated, and you can end up glancing from one meter to the next, trying to catch events that require attention. One way around this is to let the program watch the meters for you, alerting you when specified conditions are met. However, the only package I have found that implements alarms is Inspector XL.
Analysis tools give us useful data, but it is only data, not knowledge about the audio. So, in the final analysis, the ears remain the best and most important source of knowledge about whether something sounds good. But data can be seductive, and people sometimes come to rely more on what they think meters are telling them than on what their ears tell them. That becomes a problem if someone is not metering the appropriate information or if the quantitative data supplied by a meter does not map well to the most closely related perceptual attribute. For instance, level meters mostly represent the power in a signal, but loudness is a perceptual attribute that does not map directly to signal power as shown on a typical meter.
In any event, meters are best treated as supplements to what we hear. If there is a discrepancy between the two, further inquiry may be in order. But it is foolish to assume that the meters must be “right” and that you aren't hearing correctly.
Be a Meter Reader
Meters and analysis tools present various ways of looking at our audio. Getting good results from them depends on three things: understanding the working principles of each tool well enough to know when to use it, being able to read each tool's display, and having the context to interpret what the display is showing and to extrapolate useful knowledge from it.
A large and growing number of analysis packages are priced affordably and are available for all of the major operating systems and plug-in formats (see the sidebar “Some Metering and Analysis Tools”). For a creative artist, making informed decisions isn't always necessary, but for an engineer at any level, making informed decisions can help lead to good results.
Larry the O is a musician, producer, engineer, sound designer, writer, and master soup stirrer in the San Francisco Bay Area. Special thanks to Bob Katz, Joe and B. J. Buchalter of Metric Halo, and Roger Nichols Digital. Tips of the hat to BIAS, Steinberg, iZotope, and Audiofile Engineering.
SOME METERING AND ANALYSIS TOOLS
Many metering and analysis tools are available. Below is a list of the major ones, along with their Web addresses, which will lead you to more information about them.
Audiofile Engineering Spectre www.audiofile-engineering.com/spectre.php
BIAS Reveal (Master Perfection Suite) www.bias-inc.com/products/masterPerfectionSuite
Blue Cat Audio Analysis Pack www.bluecataudio.com/Products/Category_Analysis
iZotope Ozone www.izotope.com/products/audio/ozone
Metric Halo SpectraFoo www.mhlabs.com/metric_halo/products/foo
Programmable Analysis Software (PAS) Surround Meter (shareware) www.audio-software.com
Sony Creative Software Sound Forge www.sonycreativesoftware.com/products/product.asp?pid=431
Steinberg WaveLab www.steinberg.net/128_1.html
SuperMegaUltraGroovy FuzzMeasure Pro www.supermegaultragroovy.com
Troodon Technologies TrooTrace www.troodontechnologies.com/products.htm
Waves PAZ Analyzer www.waves.com/Content.aspx?id=233