Illustration: Laura Williams
Loudness is like happiness: we all know when we're experiencing it, and the concept, at first glance, seems like a simple one. But when we start exploring what it is and how to control it, we find that it isn't simple at all. In fact, we can't even really measure it! Despite these difficulties, we must press on, as loudness management is at the heart and soul of good audio and expressive musicianship.
Loudness isn't a physical quantity, but rather a subjective sensation that humans have as part of our hearing — a sensation relating roughly to the size or proximity of a sound source. As a sound gets louder, we sense that it is nearer, bigger, or more powerful. All that probably has its origins in the theory of perception for survival, and it almost certainly predates human evolution.
It would be handy if such sensations simply and easily correlated with physical quantities such as sound-pressure level (SPL), but they don't. Though there is some correlation between loudness and SPL, our casual linking of the two often leads to some serious confusion.
Loudness has the same sort of relationship to amplitude that pitch has to frequency. In both cases, as we change one quality, the other generally changes as well, and we casually use the physically objective quantity to stand in for the subjective one. However, we can change frequency without changing pitch and vice versa, and we can also change loudness without changing amplitude. Furthermore, the loudness of a sound varies dramatically as a function of its spectrum. There is much more to subjective loudness than a simple correlation with amplitude.
In addition, the range of loudness levels that we can perceive, from so soft that we can barely hear it to so loud that we can't stand the pain, involves an amplitude range so huge that it challenges the capability of our technology to approach it.
FIG. 1: This figure is a simplified overview of the auditory system and indicates the various processes that occur as part of hearing.
Our auditory system is a complex array of features that includes a pair of ears, two auditory-nerve bundles, and an elaborate processing network in the brain that is responsible for memory, identification, perception, and action. It's a complicated system, and we don't understand it very well yet (see Fig. 1).
Fig. 1 shows a simplified representation of our hearing system. A key point in the system is the basilar membrane, housed in the cochlea. The basilar membrane is a suspended resonant skin surface, with tens of thousands of nerve endings embedded in it. These nerve endings are one end of the auditory nerve, the bundle of nerves leading to the auditory cortex in the brain, with numerous neural processing points along the way.
The basilar membrane can be thought of as a mechanico-neurological transducer or a vibration-to-neural impulse converter. In plain English, it's the point where the sound energy that enters the ear is converted into neurological impulses.
FIG. 2: This pair of graphics shows multiple sensory zones on a basilar membrane and illustrates the concept of pitch shift related to amplitude change. Gray zones represent distinct vibrating areas of the membrane.
Photo: Courtesy Dave Moulton
Different areas of the membrane vibrate at different frequencies, which explains in part how we derive pitches from complex sounds. When a particular area of the membrane resonates, it causes the nerve endings in that vicinity to fire, that is, to generate the electrical impulses that constitute neural activity. Specific regions on the membrane correlate with sensations of highness and lowness, and taken together lead to the sensation of pitch (see Fig. 2).
Fig. 2 shows a stylized, simplified basilar membrane that is being excited by a sound with multiple frequency components. Each gray zone represents a vibrating area of the membrane. The sense of pitch of the sound arises from the specific collection of vibrating areas, which we perceive as a neurological template for, say, E-flat above middle C.
As the vibrating area gets larger or smaller due to amplitude change, the perceived pitch changes. From that, we speculate that the position of the edge of each vibrating area is what is used to define pitch. And it is this activity on the basilar membrane that leads us to the concept of loudness.
HOW WE HEAR LOUDNESS
Loudness seems to be related to how many nerve endings are firing, which depends on how big the vibrating area is, and to how rapidly those nerve endings are firing. Both of these are related to the magnitude of the sound wave that enters the outer ear and is transmitted to the basilar membrane.
The path from the outer ear to the basilar membrane also has an effect on our sensation of loudness. The eardrum is, in fact, tensioned by a small muscle. When exposed to extremely loud sounds, that muscle tightens, acting like a slow-attack, slow-release audio compressor. Meanwhile, the middle ear houses three bones that mechanically transmit the vibrations at the eardrum to the cochlea. Those bones, in combination, add some mechanical “gain” to the amplitude. They also function as a comparatively fast-acting limiter, slipping apart slightly (but still held together by cartilage) when their motion is extreme.
In combination, the gain limiting can be as much as 40 dB (more on decibels shortly) and for the most part, we are not aware that it is taking place. A kind of neurological feedback loop probably helps to maintain our perceived loudness by adjusting it for the amount of limiting that is occurring at any given time.
OUR HEARING RANGE
The range of amplitudes that we are able to sense is huge. At the middle of the audible spectrum, that range is about 1,000,000:1 (120 dB). The quiet end is called the threshold of hearing (or 0 dB SPL), a fuzzy limit below which we cannot detect acoustic energy. That threshold is so soft that we can detect things like the sound of a single hydrogen molecule bouncing off our eardrum, although we seldom encounter SPLs anywhere near that soft. A really quiet studio might have a noise floor with ten times that sound pressure (20 dB SPL).
The loud end of the range, the threshold of discomfort, occurs at about 120 dB SPL. We experience this level when our ears begin to hurt, which is nature's way of letting us know that if we don't back off we'll damage our hearing. It's best to avoid these levels if at all possible, especially at mid and upper-middle frequencies.
DECIBELS, SONES, AND PHONS
The decibel is a quirky term used to quantify power and amplitude in acoustics and audio. We often use it when discussing loudness, which is a sloppy, if useful, practice. Decibels are ratios — a change of 1 dB is an amplitude change of approximately 12 percent. A change of 6 dB is a doubling or halving of amplitude. A change of 20 dB is a change of 10:1. Conveniently, a 1 dB change is fairly close to the smallest change in loudness that humans can detect. Such a change is called the just noticeable difference.
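The arithmetic behind those figures is easy to check. Here is a minimal sketch, assuming we are talking about amplitude ratios (sound pressure or voltage) rather than power ratios; the function names are mine, chosen for illustration.

```python
import math

def db_to_amplitude_ratio(db):
    """Convert a level change in dB to an amplitude (pressure or voltage) ratio."""
    return 10 ** (db / 20)

def amplitude_ratio_to_db(ratio):
    """Convert an amplitude ratio to a level change in dB."""
    return 20 * math.log10(ratio)

# The three level changes mentioned in the text.
for db in (1, 6, 20):
    ratio = db_to_amplitude_ratio(db)
    percent = (ratio - 1) * 100
    print(f"{db:>2} dB -> ratio {ratio:.3f} ({percent:.0f} percent increase)")
```

Run it and 1 dB comes out as a ratio of roughly 1.12, 6 dB as very nearly 2, and 20 dB as exactly 10. Going the other way, an exact doubling of amplitude works out to about 6.02 dB, which is why "6 dB" is a rounded figure.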
The sone is a term coined to represent loudness. The value of 1 sone is defined as equal to 40 dB SPL at 1 kHz. A value of 2 sones is supposed to sound twice as loud as 1 sone, and to represent a 10 dB change. Similarly, 50 dB SPL is supposed to sound twice as loud as 40 dB SPL and 60 dB SPL four times as loud. In reality, however, those proportions depend on the spectrum of the sounds being measured, as well as on several other factors. I've found that a doubling of loudness in music actually occurs at around 7 to 8 dB and varies widely, depending on level and spectrum.
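The textbook sone rule can be sketched in a few lines. This assumes a 1 kHz tone at or above 40 dB SPL, and the `db_per_doubling` parameter is a hypothetical knob I've added so you can compare the nominal 10 dB figure with other values, such as the 7 to 8 dB range mentioned above. It is an illustration of the rule of thumb, not a loudness model.

```python
def sones_at_1khz(level_db_spl, db_per_doubling=10.0):
    """Rule-of-thumb loudness in sones for a 1 kHz tone.

    1 sone is pinned at 40 dB SPL; loudness doubles for every
    `db_per_doubling` dB above that (10 dB in the textbook definition).
    """
    return 2 ** ((level_db_spl - 40.0) / db_per_doubling)

for level in (40, 50, 60, 70):
    nominal = sones_at_1khz(level)
    steeper = sones_at_1khz(level, db_per_doubling=7.5)  # assumed value, for comparison only
    print(f"{level} dB SPL: {nominal:.1f} sones nominal, {steeper:.1f} with 7.5 dB per doubling")
```

With the nominal setting, 50 dB SPL gives 2 sones and 60 dB SPL gives 4 sones, matching the definition. With the steeper setting, the same level change produces a noticeably bigger jump in loudness, which gives a feel for how much the choice of rule matters.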
The phon is a unit of loudness level that takes frequency into account: a sound at any frequency has a loudness level of N phons if it sounds as loud as a 1 kHz tone at N dB SPL. All sounds designated as having a loudness of 50 phons, for instance, sound equally loud. This point can be illustrated by the equal loudness contour, which is a plot of the amplitudes at which tones of different frequencies sound equally loud. For example, the 50 phon equal loudness contour shows the amplitude that each frequency needs in order to sound as loud as a 1 kHz tone at 50 dB SPL.
FIG. 3: Equal loudness contours were originally conceived by Harvey Fletcher and Wilden Munson in 1933 and are hence often referred to as Fletcher-Munson Curves. The MAF line refers to the measured threshold of audibility at various frequencies.
The frequency response of our hearing isn't flat. Instead, it is gradually rolled off at low frequencies and steeply rolled off at high frequencies, with a peak around 4 kHz and lots of bumps and ripples along the way.
Even worse is that the frequency response of our hearing changes as the level changes (see Fig. 3). Phon lines (as shown in equal loudness contour plots) are the inverse of typical frequency-response curves. Though a complete explanation of that concept is beyond the scope of this article, in simple terms it means that equal amplitudes at different frequencies produce different loudnesses, and that the amount of loudness change for a given level change varies with frequency. As a result, every time you change the level of a musical sound, you are also changing its perceived spectrum, timbre, and EQ.
This phenomenon is pervasive, confounding, and not well understood in the audio business. At the very least, it means you need to do your monitoring at a fixed level, and when you are tweaking a synth patch to get the sound just right, you'd best do it at around the level at which you'll finally mix it.
Loudness and spectrum
As spectrum width increases, so does loudness. A band of pink noise will sound much louder than a sine wave of the same amplitude. If you want something to sound louder, add spectrum to it by boosting high or low frequency elements with EQ. That is, in fact, what harmonic distortion does — it fills up the spectrum with harmonics and adds loudness, even though the circuit can't generate any additional amplitude. That's the reason why distortion pedals are so popular.
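To see the "fills up the spectrum" part in numbers, here is a small sketch using only the Python standard library. It assumes an idealized hard clipper with a ceiling of 1.0 standing in for the distorting circuit, and it measures harmonic amplitudes with single DFT bins. It shows the spectrum filling up while the peak stays put; it does not measure loudness, which remains a listening judgment.

```python
import cmath
import math

N = 1024     # samples in the analysis window
CYCLES = 8   # whole cycles of the fundamental in that window

def sine(amplitude):
    """One window of a sine wave at the fundamental frequency."""
    return [amplitude * math.sin(2 * math.pi * CYCLES * n / N) for n in range(N)]

def hard_clip(signal, ceiling=1.0):
    """Idealized distortion: nothing gets past +/- ceiling."""
    return [max(-ceiling, min(ceiling, s)) for s in signal]

def harmonic_amplitude(signal, harmonic):
    """Amplitude of one harmonic of the fundamental, from a single DFT bin."""
    k = CYCLES * harmonic
    acc = sum(s * cmath.exp(-2j * math.pi * k * n / N) for n, s in enumerate(signal))
    return 2 * abs(acc) / N

clean = sine(1.0)
clipped = hard_clip(sine(4.0))  # about 12 dB of drive into the clipper

for name, sig in (("clean", clean), ("clipped", clipped)):
    peak = max(abs(s) for s in sig)
    levels = ", ".join(f"H{h}={harmonic_amplitude(sig, h):.3f}" for h in (1, 3, 5))
    print(f"{name:8s} peak={peak:.3f}  {levels}")
```

The clean sine has essentially nothing at the third and fifth harmonics. The clipped version has the same peak amplitude but substantial energy at both, which is the extra spectrum that the ear hears as added loudness.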
Loudness and time
FIG. 4: This graph shows various durations of sound and their relative subjective loudnesses.
Another issue relating to loudness has to do with duration. Extremely short sounds (transients) aren't nearly as loud as sustained sounds. A 1 ms noise will sound about 20 dB softer than a 100 ms version of the same noise (see Fig. 4). Therefore, at the same amplitude, sustained sounds will sound louder than transients, and transients have to be considerably higher in level than sustained material in order to sound equally loud.
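One rough way to put numbers on that is sketched below. It assumes a simple rule of about 10 dB per tenfold change in duration, which is my own approximation, picked because it reproduces the 20 dB figure between 1 ms and 100 ms. It also assumes, purely for illustration, that anything lasting 100 ms or longer counts as fully sustained. Treat the measured curve in Fig. 4 as the authority, not this formula.

```python
import math

def transient_level_offset_db(duration_ms, sustained_ms=100.0):
    """Approximate apparent level of a short sound relative to a sustained one.

    Assumed rule: 10 dB per decade of duration below `sustained_ms`
    (a hypothetical cutoff), and no further change above it.
    """
    if duration_ms <= 0:
        raise ValueError("duration_ms must be positive")
    if duration_ms >= sustained_ms:
        return 0.0
    return 10 * math.log10(duration_ms / sustained_ms)

for ms in (1, 10, 50, 100, 500):
    offset = transient_level_offset_db(ms)
    print(f"{ms:>4} ms: {offset:6.1f} dB relative to sustained")
```

Under these assumptions, the negative of each offset is roughly how much extra level a transient of that length would need in order to sound as loud as the sustained version.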
It's very hard to measure loudness. Meters measure amplitude, not loudness. If you really want to find out how loud a sound is, you have to assemble a panel of listeners and get them all to give their opinion, which is how subjective audio measurements are conducted. It is widely assumed that some meters correlate well with loudness, but it's important to keep in mind that that isn't necessarily true (see “Square One: Meter Matters” in the January 2003 issue for a full discussion of meters and metering).
LOUDNESS FOR FUN & PROFIT
How best to use all this information? First, be humble. It's a complicated topic, and it takes years to puzzle out. Second, learn to listen to loudness very carefully. There are ways you can get a signal to sound louder without increasing its amplitude much. For example, try boosting a narrow band of quite audible frequencies while cutting a different band of less important ones, or try adding slight distortion or using an exciter, which does roughly the same thing. Often, low frequencies don't sound very loud, even at high amplitudes. You can cut them, while boosting highs in the signal.
Adding extreme highs can help, too. You can make a sound louder by spreading it out in time and using a little room ambience with predelay to make the sound last. That technique works well for percussion sounds.
A key trick for expressive synthesis is to learn how to manipulate loudness by editing level on a note-by-note basis, and by carefully observing dynamic behaviors of level and filter settings relating to key Velocity. That sort of dynamic action is what great musicians use all the time.
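If you want to see what a Velocity-to-level mapping looks like in decibels, here is a minimal sketch. The squared curve is an assumption on my part, as are the function and parameter names. Real instruments use a variety of Velocity curves, so check your own synth's behavior rather than trusting these numbers.

```python
import math

def velocity_to_gain_db(velocity, curve_exponent=2.0):
    """Map MIDI key Velocity (1-127) to a level in dB relative to full Velocity.

    Assumes amplitude = (velocity / 127) ** curve_exponent, a hypothetical
    curve; an exponent of 1.0 gives a linear amplitude response.
    """
    if not 1 <= velocity <= 127:
        raise ValueError("velocity must be in the range 1..127")
    return 20 * curve_exponent * math.log10(velocity / 127)

for velocity in (127, 100, 64, 32):
    squared = velocity_to_gain_db(velocity)
    linear = velocity_to_gain_db(velocity, curve_exponent=1.0)
    print(f"Velocity {velocity:>3}: {squared:6.1f} dB squared, {linear:6.1f} dB linear")
```

With the squared curve, a Velocity of 64 lands about 12 dB below full Velocity, while the linear curve puts it only about 6 dB down. Comparing the two gives a sense of how strongly the choice of curve shapes the dynamic range you have to play with.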
Finally, remember that loudness management, or dynamics, is one of the most important expressive aspects of music. When you get it right, your recordings become a lot more musical, a lot more expressive, and a lot more powerful. It's definitely worth the effort.
Dave Moulton is an engineer, synthesist, composer, producer, acoustician, teacher, author, and party guy. You can complain to him about anything at his Web site, www.moultonlabs.com.