By Larry the O | Thu, 13 Oct 2011
BONUS MATERIAL
Learn more about FFTs (Fast Fourier Transforms) and their
impact on data parameters
We rightly rely on our ears as the final arbiters of recording and mixing issues. Sometimes, however, identifying the cause of an audible production problem by ear can be difficult. Metering and analysis tools provide quantitative information that allows us to observe an audio signal from various perspectives, which can be the key to avoiding problems and identifying pesky artifacts.
Today, analysis tools have emerged from research laboratories to become affordable and readily available in plug-in suites such as BIAS Reveal (part of the company's Master Perfection Suite) and Roger Nichols Digital Inspector XL. The tools are embedded in editors and processing software like Steinberg WaveLab, iZotope Ozone, and Sony Creative Software Sound Forge. You also can find them in standalone form in the godfather of modern desktop audio analysis, Metric Halo SpectraFoo, as well as in newer entries such as Audiofile Engineering Spectre and SuperMegaUltraGroovy FuzzMeasure Pro. You can even find shareware and freeware analysis tools, including a scaled-down Inspector from RND.
Although the tools are available, how many people really know how to use them? Most users understand level meters and know the difference between peak and RMS (averaged) levels, but far fewer can read a spectrogram or a correlation meter, or even know what these tools are.
To help you understand computer-based metering and analysis tools, I'll discuss what they are, when you would use them, and how to make sense of what they say. I can only scratch the surface here, but hopefully that will be enough to embolden you to try out and learn more about some analysis tools. I'll look primarily at affordable software tools for signal analysis, rather than at tools for system or component measurement, although the line between the two types can be fuzzy. Hardware analyzers and high-end, specialized software such as acoustical-analysis programs have been excluded.
A Few Distinctions
FIG. 1: This BIAS Reveal display shows peak and RMS power histories for each channel. Notice the song fade-out at the right of the display and the roughly 12 dB crest factor.
Mathematically speaking, an audio phenomenon can be seen as a function of time or of frequency, and there are ways to convert between these two domains. Most audio-analysis tools exploit this concept to let you view the same data in multiple ways, each of which yields its own insights. For our purposes, we can break the available audio-analysis tools into the following categories: level, spectral analysis, phase- and stereo-image meters, transfer functions, and code tools and statistics.
Analyzers are often distinguished by the ways in which they handle time. Analysis tools fall into one of two presentation approaches; I will refer to the first as “now,” and the second as “then.” “Now” tools are real-time, and they display an analysis of a signal as it happens. “Then” tools show analysis over time, which provides a historical context for spotting trends or episodes in the audio (see Fig. 1).
Since histories and averaged values represent analysis over time, some time must elapse before a history or an average can be created. In real-time applications, such as live sound, the latency incurred by these processes is annoying at best, and unacceptable at worst. However, in many cases, such as RMS measurement, the time over which the signal is averaged can be short enough to be almost real time.
Another distinction occurs between those analysis tools that are available as plug-ins and those that run standalone. Plug-ins are usable within DAW and audio-editor hosts, which makes them well suited for production tasks such as monitoring signals during recording, overdubbing, or mixing. Standalone programs can be used to observe a live signal — sometimes even multiple signals. But here the basic paradigm is different: only a single instance of an analyzer program is expected to be running, and you generally use the program as a bench-test instrument rather than as a production-monitoring tool.
Analysis tools provide broad capabilities, but people's needs for them are often specific. To accommodate such needs, most analysis tools are highly configurable, offering graphic preferences, variable parameters for processes such as Fast Fourier Transforms (FFTs), and parameters for defining quantities (such as the number of clipped samples that constitute a “clip” condition). The more comfortable you become with these tools, the more you will probably customize.
PFFT!
A good place to start working with analysis tools is by looking at the most important analysis technique for audio, which underlies a number of tools: the FFT. FFTs convert data from the time domain into the frequency domain. (An inverse FFT converts in the other direction.) FFTs allow you to look at the spectrum of a signal.
The SpectraFoo manual says, “The FFT algorithm is an efficient means of computing a Fourier transform on a computer. The Fourier transform, developed between 1804 and 1807 by the mathematician Joseph Fourier as part of a study of heat transfer, converts a continuous record of amplitude versus time into a record of amplitude versus frequency. A modification of the Fourier Transform called the Discrete Fourier Transform (DFT) was developed to deal with sampled, rather than continuous, waveforms. The FFT algorithm was developed as an efficient way of computing the DFT on digital computers.”
The FFT is the basis for everything from spectrum analyzers to transfer functions. The two key parameters that determine FFT performance are block size (the number of samples on which the FFT is performed) and window type (a preconditioning function applied to reduce error in the FFT). For more on FFTs and the impact of these parameters, see the online bonus material at www.emusician.com.
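To make block size and window type a little more concrete, here is a minimal Python sketch. It uses a plain (slow) DFT in place of an optimized FFT, and the 48 kHz sample rate, 256-sample block, and Hann window are all assumptions chosen for illustration, not settings taken from any of the tools discussed here.

```python
import cmath
import math

def dft_magnitudes(block):
    """Naive DFT (O(N^2)); an FFT gives the same values, only faster."""
    n = len(block)
    return [abs(sum(block[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))) / n
            for k in range(n // 2)]

def hann(n):
    """Hann window, one common choice of window type."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * t / n) for t in range(n)]

SAMPLE_RATE = 48000   # assumed for illustration
BLOCK_SIZE = 256      # assumed for illustration

# Block size sets the spacing between analysis bins (frequency resolution).
bin_spacing_hz = SAMPLE_RATE / BLOCK_SIZE

# A tone that falls between two bins, so its energy leaks into other bins.
tone_hz = 10.5 * bin_spacing_hz
tone = [math.sin(2 * math.pi * tone_hz * t / SAMPLE_RATE)
        for t in range(BLOCK_SIZE)]

plain = dft_magnitudes(tone)
windowed = dft_magnitudes([s * w for s, w in zip(tone, hann(BLOCK_SIZE))])

# Leakage far from the tone (bin 60), relative to each spectrum's own peak.
leak_plain_db = 20 * math.log10(plain[60] / max(plain))
leak_windowed_db = 20 * math.log10(windowed[60] / max(windowed))
print(bin_spacing_hz, round(leak_plain_db), round(leak_windowed_db))
```

A larger block gives finer bin spacing but needs more samples (and therefore more time) per analysis, and the window trades a slightly wider peak for far less leakage into distant bins.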
How Loud Is It?
Measuring level is the most familiar metering
function. While people frequently say that a level meter shows how loud
a sound is, loudness is actually a perceptual attribute that is almost
never directly measured. Level meters show signal amplitude or power in
decibels and vary in their response characteristics and demarcation.
The ear responds primarily to the average
level of sound. Therefore, averaged metering techniques such as RMS
metering are useful when you are concerned with how level is being
perceived. Having peak-level metering is more critical for equipment
that can overload.
Courtesy Bob Katz
FIG. 2: Mastering engineer Bob Katz devised the K-System scales for level metering and monitoring. This graphic has been reprinted from Katz's book, Mastering Audio: The Art and the Science (Focal Press, 2007).
With music and most other signals (assuming
that they have not already been subject to dynamic compression), a
substantial difference typically exists between the peak and average
levels, often as much as 20 dB. The ratio between the peak and average
(RMS) levels of a signal is called the crest factor. Crest
factor is important because it determines the amount of headroom that
is required. In audio, crest factor is usually expressed in decibels,
making crest factor calculations as easy as subtracting a signal's RMS
value from its peak value.
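The subtraction described above can be sketched in a few lines of Python. The function names and the sine-wave test signal are my own, chosen only to illustrate the arithmetic.

```python
import math

def to_db(value):
    """Convert a linear amplitude to decibels."""
    return 20 * math.log10(value)

def crest_factor_db(samples):
    """Peak level minus RMS level, both in dB."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return to_db(peak) - to_db(rms)

# A full-scale sine wave covering a whole number of cycles.
sine = [math.sin(2 * math.pi * 5 * n / 1000) for n in range(1000)]
print(round(crest_factor_db(sine), 2))  # 3.01
```

A pure sine has a crest factor of only about 3 dB; uncompressed music, with its sharp transients, lands much higher.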
Mastering engineer Bob Katz's K-System for
level metering has been gaining acceptance among engineers and
metering-software manufacturers. The K-System takes the idea of crest
factor to the next level, defining three meter calibrations with
different zero levels that indicate varying amounts of headroom (see Fig. 2).
The three calibrations are used because recordings in different genres
vary in crest factor. Pop-music recordings, for example, are typically
highly compressed, producing a lower crest factor and higher average
level, so that less headroom is needed on the meter; classical or
audiophile recordings typically have a higher crest factor. Therefore,
commercial pop recordings might use the K-12 scale, which defines only
12 dB of headroom, while classical recordings might use K-20, which
sets zero at -20 dBFS. Whichever scale is used, according to the
K-System, your monitors should be calibrated so that zero produces 83
dB SPL in the room. For a more complete explanation of the K-System,
see part 2 of the paper “Level Practices” at
www.digido.com/bob-katz/level-practices-part2-includes-the-k-system.html,
on Bob Katz's Digital Domain Web site.
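As a rough sketch of how the different zero points relate to full scale, the conversion is a simple offset. The function name is hypothetical, and this covers only the scale offset, not the meter ballistics or the monitor calibration that the K-System also specifies.

```python
def k_meter_reading(level_dbfs, headroom_db):
    """Reading on a K-scale whose zero sits headroom_db below full scale."""
    return level_dbfs + headroom_db

print(k_meter_reading(-20.0, 20))  # 0.0 on the K-20 scale
print(k_meter_reading(-20.0, 12))  # -8.0 on the K-12 scale
```

The same signal that sits right at zero on K-20 reads 8 dB below zero on K-12, which is why the scale should be chosen to suit the material.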
Viewing with VUs
Volume unit (VU) meters and RMS-based meters
are intended to indicate average levels, but they use different methods
to do so. Mechanical VU meters have long integration (rise and fall)
times of 300 ms and use mechanical smoothing to achieve an
approximation of averaging. RMS meters, in contrast, perform a
root-mean-square calculation to derive an average (mean) power level
over a period of time that is called an RMS window. Smaller window
sizes make the measurement more responsive to short-duration events and
low-level peaks, while larger sizes apply more smoothing but add latency.
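The effect of the RMS window size is easy to see in a small Python sketch. The window lengths here (48 and 2,400 samples, which would be 1 ms and 50 ms at an assumed 48 kHz sample rate) are arbitrary, and real meters usually use overlapping or sliding windows rather than the non-overlapping blocks shown.

```python
import math

def windowed_rms_db(samples, window_size):
    """RMS level in dB for each consecutive, non-overlapping window."""
    levels = []
    for start in range(0, len(samples) - window_size + 1, window_size):
        block = samples[start:start + window_size]
        mean_square = sum(s * s for s in block) / window_size
        # Floor the value so that digital silence does not produce log(0).
        levels.append(10 * math.log10(max(mean_square, 1e-12)))
    return levels

# Silence with one short burst of full-scale samples (a brief transient).
signal = [0.0] * 4800
signal[2400:2448] = [1.0] * 48

short_window = windowed_rms_db(signal, 48)
long_window = windowed_rms_db(signal, 2400)
print(round(max(short_window), 2), round(max(long_window), 2))
```

The short window reports the burst at full level, while the long window smooths it down to roughly -17 dB.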
In the VU's heyday, mechanical VU meters were
useful because of how difficult it was to create a more accurate
averaging meter. Modern software meters are rarely true VU meters, even
when they are marked as such. RMS meters provide a more meaningful
indication of average level.
Peeking at Peaks
FIG. 3: This screen from iZotope Ozone shows six different metering functions: a histogram to the right of the threshold control, a gain-reduction meter, a DC offset meter, a bit scope, and input and output level meters with clip indicators.
Signal peaks can be much higher than average
levels, and they are faster and shorter in duration. The peak program
meter (PPM) was created to show signal peaks. The PPM standard
(European Broadcasting Union technical document 3205-E) mandates an
integration time of about 10 ms, which is fast enough to catch most
peaks, but just slow enough to ignore spurious artifacts. Digital peak
meters, which are the most common in DAWs, have no integration time —
that is, they show the instantaneous peak value of the signal. While
that method is truthful and can help to avoid clipping problems, it can
lead you to focus disproportionately on peak values rather than average
values, which give a better indication of what you are hearing. It is
unfortunate that RMS meters aren't as prevalent as peak meters in DAWs
and audio editors.
Hold functions highlight the hottest points in
a signal by retaining the last high point that the level meter hit for
a configurable amount of time (the hold time). Assuming that a higher
peak has not already come along and reset the indicator, it will then
fall to the current peak and remain there. Many engineers rely on the
peak-hold indicator even more than they rely on the peak meter to
monitor the maximum levels in the signal.
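Here is a minimal sketch of the hold logic just described, working on a list of per-block peak values. Measuring the hold time in metering blocks rather than in milliseconds is a simplification of mine.

```python
def peak_hold(block_peaks, hold_blocks):
    """Return the held-peak indicator value for each metering block."""
    held, age, readings = 0.0, 0, []
    for peak in block_peaks:
        if peak >= held:
            held, age = peak, 0       # a higher peak resets the hold timer
        else:
            age += 1
            if age > hold_blocks:     # hold time has expired
                held, age = peak, 0   # fall to the current peak
        readings.append(held)
    return readings

print(peak_hold([0.2, 0.9, 0.3, 0.3, 0.3, 0.3, 0.5], hold_blocks=3))
# [0.2, 0.9, 0.9, 0.9, 0.9, 0.3, 0.5]
```

The 0.9 peak stays on display for three further blocks before the indicator drops to the current level.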
The Historical Record
FIG. 4: Metric Halo SpectraFoo's FFT spectrum analyzer (called the Spectragraph) and its oscilloscope are both highly configurable.
Level history can be shown in several ways. A
peak and/or RMS power-level history gives a continuous record of how
the level evolves. That information is useful for, among other things,
checking level in different parts of a song to make sure that the level
of one verse or chorus is not too much higher or lower than that of
other verses and choruses.
Level histograms give a different perspective
on history. Histograms show frequency distribution, and so a level
histogram indicates how often a given level occurs over time (see Fig. 3).
When setting dynamics processors such as compressors and gain
maximizers, a level histogram helps by letting you see where most of a
song's energy falls.
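A level histogram can be sketched as a simple tally of block RMS levels. The 3 dB bin width, the block size, and the toy "verse and chorus" signal are all assumptions made for the example.

```python
import math
from collections import Counter

def level_histogram(samples, block_size, bin_width_db=3):
    """Count how many blocks fall into each dB bin (keyed by the bin's floor)."""
    counts = Counter()
    for start in range(0, len(samples) - block_size + 1, block_size):
        block = samples[start:start + block_size]
        rms = math.sqrt(sum(s * s for s in block) / block_size)
        level_db = 20 * math.log10(max(rms, 1e-6))
        counts[math.floor(level_db / bin_width_db) * bin_width_db] += 1
    return dict(counts)

# Three quiet blocks (about -20 dB) followed by one loud block (about -6 dB).
quiet = [0.1, -0.1] * 150
loud = [0.5, -0.5] * 50
histogram = level_histogram(quiet + loud, block_size=100)
print(histogram)  # {-21: 3, -9: 1}
```

Most of this signal's time is spent in the bin that starts at -21 dB, which is the kind of information that helps when choosing a compressor threshold.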
The time-domain waveform displays found in
DAWs and audio editors are level histories, too. Oscilloscopes, on the
other hand, provide real-time waveform monitoring (see Fig. 4).
Implementing real-time analysis tools for
multichannel surround can be a daunting task in several respects. As a
result, it's rare to find surround monitoring other than within a DAW
host. Programmable Analysis Software's (PAS) shareware Surround Meter
is one of the few such tools that shows up in a Google search.
Spectral Analysis
After level, the most common analysis task is
spectral analysis. Spectral-analysis tools show the amplitude of each
frequency component of a signal. The analysis parameters are key to
determining frequency resolution and accuracy across the spectrum. FFT
spectrum analyzers, the most common type, are x-y displays that show frequency on the x-axis
and amplitude on the y-axis (see Fig. 4). FFTs can be read intuitively:
if you have a lot of energy at or around one frequency, that fact is
self-evident from the higher readout shown at that frequency.
That is not to say, however, that spectrum
analyzers are only narrowly useful. You can look at both RMS and peak
spectral analyses with them and use hold functions in the way that they
are used with level metering. Using hold functions can be very helpful
when you are trying to pinpoint problem frequencies that aren't easily
identified by ear. Since spectrum analyzers usually give a readout of
the frequency and level corresponding to the current position of the
cursor, you can easily home in on where a problem lies by looking at
the hold display and moving the cursor where there are anomalies.
Weighty Matters
FIG. 5: This graphic shows the A-, B-, and C-weighting curves, derived from Fletcher-Munson equal-loudness curves.
Many spectrum analysis tools offer A-, B-, and
C-weighting curves, which make the analyzer read more in the way that
sound is perceived. Human hearing response is not linear across
frequency, so when a spectrum analyzer shows equal levels of high and
low frequencies in a signal, it is likely to sound as though there is
more treble than bass. Worse, the frequency response of this filtering
effect in human perception changes with level.
Weighting curves apply filters to the signal
that is routed to the analyzer in order to bring the readings more in
line with how sound is perceived. A-weighting initially approximated
the 40 dB Fletcher-Munson equal-loudness curve; B-weighting (rarely
used), the 70 dB curve; and C-weighting, the 100 dB curve. In that
method, you used the appropriate curve for the source level (see Fig. 5).
That idea mutated, though, and there was a movement to standardize
level measurement around A-weighting only. Today, weighting curves are
often chosen for their appropriateness to the application rather than
to the listening level. For instance, A-weighting has long been used in
outdoor measurements of ambient sound, in which people easily tune out
continuous, low-frequency background sounds.
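For readers who want numbers, the A-weighting curve can be computed from the commonly published analog formula used in sound-level-meter standards. The constants below are the rounded values usually quoted; treat this as a sketch for exploring the curve's shape, not as a certified implementation.

```python
import math

def a_weighting_db(freq_hz):
    """Approximate A-weighting gain in dB at the given frequency."""
    f2 = freq_hz ** 2
    numerator = (12194.0 ** 2) * f2 ** 2
    denominator = ((f2 + 20.6 ** 2)
                   * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
                   * (f2 + 12194.0 ** 2))
    # The 2.0 dB offset normalizes the curve to 0 dB at 1 kHz.
    return 20 * math.log10(numerator / denominator) + 2.0

for freq in (100, 1000, 10000):
    print(freq, round(a_weighting_db(freq), 1))
```

The curve is flat near 1 kHz and cuts 100 Hz by roughly 19 dB, which is why A-weighted readings largely ignore low-frequency rumble.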
Third-Octave Analysis
Third-octave analyzers show the spectrum
broken down into ISO third-octave bands, a display familiar to those
who have used hardware third-octave RTAs (real-time analyzers).
Third-octave analyzers are useful for getting a feel for the overall
shape of the spectrum. Third-octave bands, however, are not fine enough
to pinpoint many problems, and their center frequencies do not closely
relate to the harmonic relationships that dominate musical and
acoustical signals.
There is an important distinction between FFT
spectrum analyzers and third-octave analyzers. FFT analysis produces a
linearly spaced response; that is, it breaks the spectrum into bands
with spacing that has a fixed number of hertz. White noise, which has
equal energy per frequency, shows a flat response when viewed with an
FFT analyzer.
In contrast, third-octave analysis produces a
logarithmically spaced representation, based on a division of an
octave. Pink noise, which has equal energy per octave, shows a flat
response when viewed with a third-octave analyzer. White noise, viewed
the same way, shows about 3 dB more energy in each successively higher
octave, which looks like a low-frequency rolloff.
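The reason white noise tilts upward on a third-octave display is that each band is wider in hertz than the one below it. A short sketch, assuming ideal base-2 band edges and a flat noise density of my choosing:

```python
import math

def third_octave_band_edges(center_hz):
    """Lower and upper edges of an ideal third-octave band (base-2 spacing)."""
    ratio = 2 ** (1 / 6)
    return center_hz / ratio, center_hz * ratio

def white_noise_band_level_db(center_hz, density_per_hz=1.0):
    """Band level for noise with equal power per hertz (white noise)."""
    low, high = third_octave_band_edges(center_hz)
    return 10 * math.log10(density_per_hz * (high - low))

rise_per_octave = (white_noise_band_level_db(2000)
                   - white_noise_band_level_db(1000))
print(round(rise_per_octave, 2))  # 3.01
```

Doubling the center frequency doubles the bandwidth, so the band collects twice the power, or about 3 dB more.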
The Colorful Spectrogram
FIG. 6: In RND Inspector XL's spectrogram, note the visibility of cymbal crashes and beats from snare hits.
A spectrogram (sometimes called “sonogram”)
shows a spectral history, a continuous 3-axis record of FFTs performed
on an incoming signal. The spectrogram shows time along one axis and
frequency along the second axis and uses color (the third axis) to show
level.
Spectrograms are used heavily in speech
research but are useful also for studio work, and they're easy to read
once you are accustomed to them. For example, it's not hard to see
where the beat is in a typical pop song: sharp, regularly spaced lines
along the spectrum indicate transients that are probably the snare, and
cymbal crashes can be seen by the smear in the high frequencies that
follow some snare hits. Other transient events, such as a cough or a
door-close during a live recording, can also be spotted (see Fig. 6).
Spectrograms are helpful in comparing the spectra of different songs.
Note that the larger the FFT size, the more history is shown in a
spectrogram.
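In code, a spectrogram is just a stack of spectra taken from successive blocks of the signal. This sketch uses a naive DFT with no windowing or overlap, and a tiny 64-sample block, purely to keep the example short; real tools apply a window and use much larger FFTs.

```python
import cmath
import math

def spectrogram(samples, fft_size):
    """One row per time slice; each row holds magnitudes for the lower bins."""
    rows = []
    for start in range(0, len(samples) - fft_size + 1, fft_size):
        block = samples[start:start + fft_size]
        rows.append([
            abs(sum(block[t] * cmath.exp(-2j * math.pi * k * t / fft_size)
                    for t in range(fft_size))) / fft_size
            for k in range(fft_size // 2)
        ])
    return rows

FFT_SIZE = 64  # assumed for illustration
# Two blocks of a tone centered on bin 4, then two blocks centered on bin 12.
tone_a = [math.sin(2 * math.pi * 4 * t / FFT_SIZE) for t in range(2 * FFT_SIZE)]
tone_b = [math.sin(2 * math.pi * 12 * t / FFT_SIZE) for t in range(2 * FFT_SIZE)]

rows = spectrogram(tone_a + tone_b, FFT_SIZE)
loudest_bin_per_slice = [row.index(max(row)) for row in rows]
print(loudest_bin_per_slice)  # [4, 4, 12, 12]
```

A display would map each magnitude to a color; here the change of pitch shows up as the loudest bin jumping from 4 to 12 halfway through.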
Phase and Stereo-Image Meters
Phase and stereo-image meters illustrate the
relative time relationships between the left and right channels of the
stereo signal. The simplest of these tools is the stereo-balance meter,
a horizontal strip showing power distribution between the two channels.
The stereo-balance meter can be useful for balancing stereo tracks of
sections, such as background vocals and strings, and for checking
stereo-miking techniques.
Phase monitoring comes in several forms, the
most familiar being the Lissajous display — a simple x-y display in
which each axis shows the instantaneous level of one channel. If the
display shows a line (or, more often, in practice, a narrow oval)
pointing from lower left to upper right, the material in both channels
is very much in phase, meaning that the signal should be highly mono
compatible. If the display shows a straight line pointing from upper
left to lower right, then that means there are identical signals in
each channel, but with opposite polarities, which usually is not good.
A somewhat fatter oval is the most common, and a circle would indicate
that you have as much out-of-phase material as you would want in a mix.
A vector scope works similarly to a Lissajous,
except that the display has been rotated 45 degrees counterclockwise,
so that in-phase behavior is shown by a line going straight up and
down, and out-of-phase material is shown by a horizontal line.
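The rotation that turns a Lissajous display into a vector scope can be written out directly. I am assuming here that the left channel drives the horizontal axis and the right channel the vertical axis before rotation; individual tools may assign them differently.

```python
import math

def vectorscope_point(left, right):
    """Rotate the Lissajous point (left, right) 45 degrees counterclockwise."""
    c = math.sqrt(0.5)  # cos(45 degrees) = sin(45 degrees)
    return (c * (left - right), c * (left + right))

in_phase = vectorscope_point(0.5, 0.5)       # lies on the vertical axis
out_of_phase = vectorscope_point(0.5, -0.5)  # lies on the horizontal axis
print(in_phase, out_of_phase)
```

After the rotation, the vertical coordinate is proportional to the sum of the channels and the horizontal coordinate to their difference.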
Can You Correlate?
FIG. 7: This phase scope shows a polar plot of amplitude versus perceived location. This material doesn't have much stereo image; note the 13 dB difference between mid and side signals.
Correlation meters measure the similarity between two signals (see Fig. 7). A fully correlated signal is the same in each channel — that is, the signal is mono. But correlation is not the same as phase:
if two channels are extremely dissimilar, they may have a lot of
out-of-phase information, or they may have a number of hard-panned mono
instruments. However, uncorrelated material is usually perceived as
stereo material, and highly correlated material as mono, so a
correlation meter can give some idea of the overall width of the
signal. It is important to look at trends in the meter more than its
instantaneous behavior, because moments with a lot of uncorrelated
material between channels are common.
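One common way to compute the number behind a correlation meter is the normalized cross-correlation of the two channels over a block of samples. This is a sketch of that idea; a given product may average or smooth the value differently.

```python
import math

def correlation(left, right):
    """Normalized correlation between two channels, from -1 to +1."""
    cross = sum(l * r for l, r in zip(left, right))
    energy = math.sqrt(sum(l * l for l in left) * sum(r * r for r in right))
    return cross / energy if energy else 0.0

sine = [math.sin(2 * math.pi * n / 100) for n in range(1000)]
cosine = [math.cos(2 * math.pi * n / 100) for n in range(1000)]

print(round(correlation(sine, sine), 3))                 # identical channels
print(round(correlation(sine, [-s for s in sine]), 3))   # inverted polarity
print(round(correlation(sine, cosine), 3))               # 90 degrees apart
```

Identical channels read +1, a polarity-inverted copy reads -1, and a pair of signals 90 degrees apart reads close to zero.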
Using mid-side meters is another way of
contrasting mono and stereo information in a signal. A mid-side meter
typically has two displays: one showing material that appears equally
in both channels (mid), the other showing material that is different
between them (side). A healthy mix generally has near-equal amounts of
both.
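Mid and side signals are derived from the sum and the difference of the two channels. The halving used below is one common scaling convention, and the test signal (the same sine at half level in the right channel) is only an illustration.

```python
import math

def mid_side(left, right):
    """Split a stereo pair into mid (sum) and side (difference) signals."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side

def rms_db(samples):
    """RMS level in dB relative to full scale."""
    return 10 * math.log10(sum(s * s for s in samples) / len(samples))

left = [math.sin(2 * math.pi * n / 100) for n in range(1000)]
right = [0.5 * s for s in left]   # same source, panned toward the left

mid, side = mid_side(left, right)
mid_minus_side_db = rms_db(mid) - rms_db(side)
print(round(mid_minus_side_db, 2))  # 9.54
```

A single source panned partway to one side produces a mid signal about 9.5 dB above the side signal in this case; a source panned dead center would produce no side signal at all.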
Interchannel phase can also be displayed on a
polar plot, but two polar-phase plots might represent very different
information. For instance, SpectraFoo's Phase Torch shows a vector
whose length represents frequency and whose angle represents phase
angle. Inspector XL's Phase Scope, on the other hand, plots amplitude
as the distance from the circumference of the scope and perceived
direction of the sound as the angle.
Transfer Functions
A transfer function is generated with a
differential FFT analyzer: that is, FFTs are computed for two signals,
and then one FFT is divided by the other, leaving only the difference
between them. If one signal is the input to a system and the other its
output, the display shows how the signal was altered going through the
system. Put a vocal track into one input and an EQ'd version of the
same vocal on the output, and you will see the actual EQ being applied
(regardless of what the EQ display says).
Transfer functions have several powerful
properties. First, since they are a comparison, the source material
could be anything, including music. Second, it is possible to show
phase differences as well as spectral and level differences, and even
to calculate coherence, which is a good indicator of signal-to-noise
ratio.
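The division of one spectrum by another can be sketched as follows. The "system" here is a made-up filter that adds each sample to the previous one (wrapping around the block so the math is exact), the source is seeded random noise, and a naive DFT stands in for the FFT.

```python
import cmath
import math
import random

def dft(x):
    """Naive DFT; returns one complex value per bin."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

N = 64
rng = random.Random(0)
source = [rng.uniform(-1, 1) for _ in range(N)]

# Hypothetical system under test: output = current + previous input sample.
output = [source[t] + source[t - 1] for t in range(N)]

spectrum_in, spectrum_out = dft(source), dft(output)
transfer = [y / x for x, y in zip(spectrum_in, spectrum_out)]

# Gain and phase of the system, recovered without knowing the filter.
gain_db = [20 * math.log10(abs(h)) for h in transfer[:N // 2]]
phase_deg = [math.degrees(cmath.phase(h)) for h in transfer[:N // 2]]
print(round(gain_db[0], 2), round(gain_db[N // 4], 2), round(phase_deg[N // 4], 1))
```

The result shows about 6 dB of gain at the lowest bin, falling to about 3 dB with a 45-degree phase lag a quarter of the way up the spectrum, even though the source was noise rather than a test tone.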
Transfer functions can be used for any kind of
comparison, but they are most commonly used for acoustical
measurements, especially in tuning sound systems. You can also check
the performance of pieces of equipment or software and compare the
outputs of two microphones, among other things. As powerful as transfer
functions are, they are rarely used in studio applications. One reason
for that is they are CPU intensive, making them difficult to release in
plug-in format. Another is that test applications often call for
comparison, but music recording and mixing production deal more with
what is happening in real time and, in some cases, the recent past.
Code Tools and Statistics
There is a class of information in digital
recording that relates to the behavior of the digital audio medium
rather than of the source signal, and these are code-related tools.
Test packages such as SpectraFoo have a more extensive set of code
tools than do most production-oriented analysis packages.
Clipping counters let you know the frequency
with which digital clipping has occurred. Some analysis tools refer to
“overs” rather than “clipping,” and Inspector XL distinguishes the two,
defining clipping as levels exceeding a user-defined clip
threshold, and overs as signals that exceed full-scale. In many cases,
one or two clipped samples won't be audible, so if a clip counter shows
only a few clips for an entire song, there may not be reason for
concern. On the other hand, having dozens of clips might be a concern.
Clip counters can generally be programmed to respond to a specified
number of consecutive clips.
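A clip counter with a configurable run length might look something like this sketch. The threshold, the run length, and the choice to count each run only once are assumptions of mine rather than the behavior of any particular product.

```python
def count_clips(samples, threshold=1.0, min_consecutive=3):
    """Count runs of at least min_consecutive samples at or above threshold."""
    clips, run = 0, 0
    for sample in samples:
        if abs(sample) >= threshold:
            run += 1
            if run == min_consecutive:   # count each qualifying run once
                clips += 1
        else:
            run = 0
    return clips

signal = [0.2, 1.0, 1.0, 0.5, 1.0, 1.0, 1.0, 1.0, 0.1, -1.0, -1.0, -1.0]
print(count_clips(signal))                      # 2
print(count_clips(signal, min_consecutive=1))   # 3
```

With a run length of three, the isolated pair of full-scale samples is ignored; with a run length of one, it counts as a clip.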
Bit scopes are real-time indicators of the
instantaneous use of each bit in a digital word. If, for example, there
were a plug-in or device that operated at 16-bit resolution but claimed
to use 24-bit, it would be obvious on a bit scope, because bits 17
through 24 would never light up to indicate use. Another example: a bit
in the middle of the word that never lights up to show that it is used
could indicate a problem in a digital audio converter.
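The 16-bit-in-a-24-bit-word case is easy to simulate. This sketch ORs together integer sample values and reports which bit positions were ever set, numbering bit 1 as the most significant; the sample values are arbitrary.

```python
def bits_used(samples, word_length=24):
    """Report which bits were ever set (index 0 is bit 1, the MSB)."""
    mask = (1 << word_length) - 1
    seen = 0
    for sample in samples:
        seen |= sample & mask   # masking yields the two's-complement pattern
    return [bool((seen >> (word_length - 1 - i)) & 1)
            for i in range(word_length)]

# 16-bit samples padded into 24-bit words: the low 8 bits stay empty.
samples_16 = [1, -1, 12345, -20000]
padded_to_24 = [s << 8 for s in samples_16]

used = bits_used(padded_to_24)
print("".join("1" if bit else "0" for bit in used))
# 111111111111111100000000
```

Bits 17 through 24 never light up, which is exactly what a bit scope would show for a device that is really working at 16-bit resolution.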
Usage Challenges
The strength of analysis tools is that they
give visualizations of quantitative data, which makes understanding the
data more intuitive. One limitation of analysis tools is that using
more than two or three at a time creates visual clutter. Small analysis
windows are not very useful, so with four or five of them open at the
same time, even a dual-monitor system must devote substantial space to
meters. Constantly updating multiple, real-time graphic readouts is
also taxing on a CPU.
There is also the human problem of paying
attention. With too many dancing displays going at the same time, the
analytical focus that was the impetus for using a meter in the first
place becomes dissipated, and you can end up glancing from one meter to
the next, trying to catch events that require attention. One way around
this is to let the program watch the meters for you, alerting you when
specified conditions are met. However, the only package I have found
that implements alarms is Inspector XL.
Analysis tools give us useful data, but it is
only data, not knowledge about the audio. So, in the final analysis,
the ears remain the best and most important source of knowledge about
whether something sounds good. But data can be seductive, and people
sometimes come to rely more on what they think meters are telling them
than on what their ears tell them. That becomes a problem if someone is
not metering the appropriate information or if the quantitative data
supplied by a meter does not map well to the most closely related
perceptual attribute. For instance, level meters mostly give
representations that are of the power in a signal. But loudness is a
perceptual attribute that does not map directly to signal power as
shown on a typical meter. Or the problem may be as simple as not
metering the right parameter.
In any event, meters are best treated as
supplements to what we hear. If there is a discrepancy between the two,
further inquiry may be in order. But it is foolish to assume that the
meters must be “right” and that you aren't hearing correctly.
Be a Meter Reader
Meters and analysis tools present various ways
of looking at our audio. Getting good results from them depends on
three things: understanding the working principles of each tool well
enough to know when to use it, being able to read each tool's display,
and having the context to interpret what the display is showing and to
extrapolate useful knowledge from it.
A large and growing number of analysis
packages are priced affordably and are available for all of the major
operating systems and plug-in formats (see the sidebar “Some Metering
and Analysis Tools”). For a creative artist, making informed decisions
isn't always necessary, but for an engineer at any level, making
informed decisions can help lead to good results.
Larry the O is a musician, producer,
engineer, sound designer, writer, and master soup stirrer in the San
Francisco Bay Area. Special thanks to Bob Katz, Joe and B. J. Buchalter
of Metric Halo, and Roger Nichols Digital. Tips of the hat to BIAS,
Steinberg, iZotope, and Audiofile Engineering.
SOME METERING AND ANALYSIS TOOLS
Many metering and analysis tools are
available. Below is a list of the major ones, along with their Web
addresses, which will lead you to more information about them.
Audiofile Engineering Spectre www.audiofile-engineering.com/spectre.php
BIAS Reveal (Master Perfection Suite) www.bias-inc.com/products/masterPerfectionSuite
Blue Cat Audio Analysis Pack www.bluecataudio.com/Products/Category_Analysis
iZotope Ozone www.izotope.com/products/audio/ozone
Metric Halo SpectraFoo www.mhlabs.com/metric_halo/products/foo
Programmable Analysis Software (PAS) Surround Meter (shareware) www.audio-software.com
Roger Nichols Digital Inspector XL and Inspector (freeware) www.rogernicholsdigital.com/inspectorXL.html, www.rogernicholsdigital.com/inspector.html
Sony Creative Software Sound Forge www.sonycreativesoftware.com/products/product.asp?pid=431
Steinberg WaveLab www.steinberg.net/128_1.html
SuperMegaUltraGroovy FuzzMeasure Pro www.supermegaultragroovy.com
Troodon Technologies TrooTrace www.troodontechnologies.com/products.htm
Waves PAZ Analyzer www.waves.com/Content.aspx?id=233