Phase shift is one of those truly confusing topics in audio, yet along with frequency and amplitude, phase is one of the primary conceptual building blocks of any audio waveform. Why is it so confusing?
The confusion arises in large part because our hearing uses phase in ways that are wondrous but not obvious, and phase issues are inherent to many musical behaviors. In addition, the term phase shift has been misused and abused by many audio engineers for the past century.
THE CONCEPT OF PHASE
Along with a period (that is, the length of time it takes to complete one cycle of wave motion) and an amplitude, every audio wave has a characteristic called phase. Phase is simply a way of talking about time in terms of a wave's period.
FIG. 1: This figure shows a single cycle of a 1 kHz sine wave, with additional iterations at 45-, 90-, 135-, and 180-degrees phase shift, relative to the original wave. Note the period of the wave, which is the reciprocal of the frequency (1,000 cycles per second).
Fig. 1 shows a sine wave with a frequency of 1 kHz. It has a period of 1 millisecond (the period is equal to 1 divided by the frequency). This sine wave starts at some point in time, which we have arbitrarily labeled 0.000 ms. The starting point for the wave is traditionally the point at which the wave begins to build up positive amplitude, beginning at “average” pressure, which is defined as 0 volts in an electrical signal or, in air, one atmosphere of barometric pressure (about 1 bar).
The phase of the wave is a way of expressing a specific point in its time cycle. The entire time cycle of the wave is divided into 360 equal parts, called degrees. The beginning of the cycle's phase is labeled 0 degrees, halfway through the cycle is labeled 180 degrees, and the end of the cycle is labeled 360 degrees (also 0 degrees again). Thus, we can talk about the wave in terms of degrees and use that term as a reference point. For example, you might say, “the positive peak occurs at a phase of 90 degrees and the negative peak at 270 degrees.”
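As a rough illustration of treating phase as time measured in fractions of a period, here is a minimal Python sketch. The function names period_ms and phase_at are made up for this example, and it assumes the wave starts at 0 degrees at time zero.

```python
def period_ms(frequency_hz):
    """Period in milliseconds: 1 divided by the frequency."""
    return 1000.0 / frequency_hz


def phase_at(time_ms, frequency_hz):
    """Phase in degrees (0 up to 360) at a given time after the wave starts.

    Assumes the wave begins its cycle (0 degrees) at time 0.
    """
    cycles = time_ms / period_ms(frequency_hz)
    return (cycles % 1.0) * 360.0


# For the 1 kHz wave of Fig. 1, the positive peak falls a quarter of the
# way through the 1 ms period and the negative peak three quarters through.
print(period_ms(1000))        # 1.0
print(phase_at(0.25, 1000))   # 90.0
print(phase_at(0.75, 1000))   # 270.0
```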
The term phase shift refers to an offset in time of the original wave, a copy of that wave, or a wave of a slightly different frequency. In the first and second cases, some force (latency, for example) causes the wave to be delayed. We can express that delay in time or in degrees (as shown in the drawing for the various iterations). When referring to the relationship of the delayed wave to the original, it's best to express the delay in degrees.
FIG. 2: In this figure, you can see a sine wave with an exact iteration of itself phase-shifted by 180 degrees (a), and a pulse wave with the same period and an exact iteration of itself, also phase-shifted by 180 degrees (b).
For example, a wave with a phase shift of 180 degrees starts halfway through the cycle of the original wave, so that it crosses zero going positive at the same point in time as the original crosses zero going negative. This occurs in the case of sine waves and most artificially generated test signals (see Fig. 2). To recap, phase shift refers to the offset in time relative to the starting point of the original wave.
ON THE BEAT
Beating is an interesting phase-related phenomenon. It occurs when two waves of slightly different frequencies (for example, 1,000 Hz and 999 Hz) sound simultaneously. In such a case, there is continuously varying phase shift between the original wave (1,000 Hz) and the second wave (999 Hz). The second wave has a period of about 1.001 ms, so that for each cycle of the original, the second wave falls behind by one-thousandth of a cycle, or 0.36 degrees (360/1000), trailing the original by an additional 0.36 degrees for each cycle.
The result is that the second wave goes in and out of phase with the original once per second. Halfway through each of those one-second cycles (at 500 ms, 1,500 ms, and so on), it is 180 degrees out of phase with the original.
FIG. 3: Here, the green wave has a longer period than the red wave, by a ratio of 4:3. The red wave goes in and out of phase with the green wave every four cycles. When these are summed, beating will occur.
If these two waves are mixed together (summed), the resulting signal is a sine wave whose amplitude swings from 6 dB louder than the original (when the phase shift is 0 degrees) to complete cancellation (when the phase shift is 180 degrees). This amplitude modulation is called beating, and the rate of beating (called the beat frequency) is equal to the difference in frequency between the original and the second wave — in this case, 1 Hz (see Fig. 3).
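The arithmetic of beating can be checked with a short Python sketch. It assumes two sine waves of equal amplitude that start in phase, and the function names are hypothetical. The level calculation relies on the identity that two equal sines offset by a phase angle sum to a sine whose amplitude is 2 times the cosine of half that angle.

```python
import math


def beat_frequency(f1_hz, f2_hz):
    """Beat rate: the difference in frequency between the two waves."""
    return abs(f1_hz - f2_hz)


def relative_phase_degrees(f1_hz, f2_hz, t_seconds):
    """Phase by which the f2 wave trails the f1 wave at time t (0 up to 360).

    Assumes both waves start in phase at t = 0.
    """
    return (360.0 * (f1_hz - f2_hz) * t_seconds) % 360.0


def summed_level_db(phase_shift_degrees):
    """Level of two summed equal-amplitude sines, relative to one alone."""
    amplitude = 2.0 * abs(math.cos(math.radians(phase_shift_degrees) / 2.0))
    if amplitude < 1e-12:
        return float("-inf")  # treat as complete cancellation
    return 20.0 * math.log10(amplitude)


print(beat_frequency(1000, 999))                # 1
print(relative_phase_degrees(1000, 999, 0.5))   # 180.0
print(round(summed_level_db(0), 2))             # 6.02
print(summed_level_db(180))                     # -inf
```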
In music, we use the presence of beats to tune our instruments (by getting the beating to occur as slowly as we can). The effect of chorusing is a simulation of the beating phenomenon that occurs when a large number of sources sound approximately the same pitch simultaneously. This is extraordinarily important for music, of course.
Although phase shift and time delay both refer to offsets in time, they do so in different ways. A time delay of 1 ms is exactly what it says, regardless of the signal. How many degrees of phase shift equal a 1 ms delay? It depends on the period (and therefore frequency) of the signal.
A 1 ms delay produces a different amount of phase shift at every frequency. A 1 ms delay for a frequency of 250 Hz causes a phase shift of 90 degrees, whereas for 500 Hz it causes a phase shift of 180 degrees. The amount of phase shift is derived from the length of the delay as a fraction of the period of the waveform.
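The conversion from time delay to phase shift is simple enough to sketch in a few lines of Python. The function name phase_shift_degrees is invented for this example, and the result is deliberately not wrapped at 360 degrees so that delays longer than one period remain visible.

```python
def phase_shift_degrees(delay_ms, frequency_hz):
    """Phase shift caused by a fixed delay: the delay as a fraction of the
    period, multiplied by 360 degrees."""
    period_ms = 1000.0 / frequency_hz
    return 360.0 * delay_ms / period_ms


# The same 1 ms delay at three different frequencies.
for frequency_hz in (250, 500, 1000):
    print(frequency_hz, phase_shift_degrees(1.0, frequency_hz))
# 250 90.0
# 500 180.0
# 1000 360.0
```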
THE COMB FILTER
As we have seen, a given time delay yields a range of phase shifts for the various frequencies present in the signal. A delayed signal summed with its original has the following behavior: all phase shifts of 180 degrees (and odd multiples of 180 degrees) yield cancellations, and all phase shifts of 360 degrees (and whole multiples of 360 degrees) yield 6 dB boosts. For a 1 ms delay, the boosts occur at 1 kHz, 2 kHz, 3 kHz … n kHz, while the cancellations occur at 500 Hz, 1,500 Hz, 2,500 Hz … (n + 0.5) kHz. The result is called a comb filter, and it is a highly distinctive pitched sound that is the basis for the entire family of phasing and flanging effects, as well as being a primary stereophonic compatibility issue and a troublesome artifact in audio.
When we vary the delay time, the frequencies and pitch of the comb filter vary, producing the effect of flanging.
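To see where the teeth of the comb fall, here is a minimal Python sketch. It assumes the delayed copy has the same amplitude as the original, and the function names are hypothetical. The peak at 0 Hz is left out of the list to match the frequencies given above.

```python
import math


def comb_gain(frequency_hz, delay_ms):
    """Linear gain of (original + delayed copy) at one frequency.

    2.0 corresponds to a 6 dB boost and 0.0 to complete cancellation.
    """
    return 2.0 * abs(math.cos(math.pi * frequency_hz * delay_ms / 1000.0))


def comb_peaks_and_notches(delay_ms, max_hz):
    """List the boost and cancellation frequencies up to max_hz."""
    spacing_hz = 1000.0 / delay_ms  # distance between adjacent peaks
    peaks, notches = [], []
    n = 0
    while n * spacing_hz <= max_hz:
        if n > 0:  # skip the peak at 0 Hz
            peaks.append(n * spacing_hz)
        notch_hz = (n + 0.5) * spacing_hz
        if notch_hz <= max_hz:
            notches.append(notch_hz)
        n += 1
    return peaks, notches


peaks, notches = comb_peaks_and_notches(1.0, 3000.0)
print(peaks)    # [1000.0, 2000.0, 3000.0]
print(notches)  # [500.0, 1500.0, 2500.0]
```

Calling comb_peaks_and_notches with a different delay_ms moves every peak and notch at once, which is the sweep heard in flanging.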
Polarity reversal (or PolRev) is a term that is often confused with phase but involves no phase shift or time delay. Polarity reversal occurs whenever we “change the sign” of the amplitude values of a signal. In the analog realm this can be done with an inverting amplifier, a transformer, or in a balanced line (by simply switching connections between pins 2 and 3 on one end of the cable). In the digital realm, it is done by simply changing all pluses to minuses and vice versa in the audio-signal data stream.
FIG. 4: Both the sine waves (a) and pulse waves (b) are out of polarity.
The top of Fig. 4 shows two sine waves (which are symmetrical over both the vertical and horizontal axes). Below them are two pulse waves (which are symmetrical over the vertical axis but not the horizontal). Both sets of waves are out of polarity. The two sine waves also appear to be 180 degrees out of phase. However, because there is no time offset, there is no phase shift. Nevertheless, when summed together, they will cancel, just like sine waves that actually are 180 degrees out of phase (such as the ones in Fig. 2a).
Unlike the sine waves, the pulse waves in Fig. 4b do not appear to be 180 degrees out of phase when their polarity is reversed (compare Fig. 4b with Fig. 2b). Like the sine waves, two pulse waves with reversed polarity will cancel each other. However, two pulse waves (or any asymmetric waves) will not cancel when one is simply phase-shifted by 180 degrees.
So, to recap: when two signals are out of polarity, there will be no phase shift and no comb-filtering. The two signals will simply cancel when summed. They are not out of phase, they are out of polarity.
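The difference between polarity reversal and a 180-degree phase shift can be checked numerically. The Python sketch below uses one cycle of an eight-sample sine wave and a made-up narrow pulse (not the exact shapes in the figures). The helper names are hypothetical.

```python
import math


def invert_polarity(samples):
    """Polarity reversal: change the sign of every sample."""
    return [-s for s in samples]


def shift_half_cycle(one_cycle):
    """Delay one cycle of a periodic wave by half its period (180 degrees)."""
    half = len(one_cycle) // 2
    return one_cycle[-half:] + one_cycle[:-half]


def mix(a, b):
    """Sum two signals sample by sample."""
    return [x + y for x, y in zip(a, b)]


def cancels(samples, tolerance=1e-12):
    """True if every sample is effectively zero."""
    return all(abs(s) < tolerance for s in samples)


sine = [math.sin(2.0 * math.pi * n / 8) for n in range(8)]
pulse = [1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  # assumed asymmetric pulse

print(cancels(mix(sine, invert_polarity(sine))))     # True
print(cancels(mix(sine, shift_half_cycle(sine))))    # True
print(cancels(mix(pulse, invert_polarity(pulse))))   # True
print(cancels(mix(pulse, shift_half_cycle(pulse))))  # False
```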
Since at least 1950, many in the recording industry have been incorrectly referring to polarity reversal as being out of phase. On consoles, you'll often find a “phase” button that inverts the polarity. In many cases, the Greek letter phi (Ø) is used to refer to a polarity reversal, when the more formal and correct application of that symbol in physics is to indicate phase shift. Yet PolRev is a special trick that has nothing to do with phase shift.
Polarity reversal is an essential tool in your stereo toolbox and a key element in the various stereophonic miking techniques of the middle-side (MS) family. It can similarly be used to great effect in synthesis for developing powerful and effective pseudostereo voices. Such voices have a middle element and a side element, and the relationships between the two can be modified over time using separate envelope generators. The side element is split into a polarity-reversed pair, while the middle element is mixed in mono with both side elements.
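One way to picture the pseudostereo voice described above is the sketch below. It assumes the simplest possible mix, with the left channel as middle plus side and the right channel as middle plus the polarity-reversed side. The function names and sample values are invented for the example.

```python
def pseudostereo(mid, side):
    """Build left and right channels from a middle and a side element.

    The side element feeds the left channel as is and the right channel
    with its polarity reversed; the middle element goes equally to both.
    """
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


def mono_sum(left, right):
    """Sum the two channels, as a mono playback system would."""
    return [l + r for l, r in zip(left, right)]


mid = [0.5, -0.25, 0.0]
side = [0.125, 0.25, -0.5]
left, right = pseudostereo(mid, side)
print(left)                   # [0.625, 0.0, -0.5]
print(right)                  # [0.375, -0.5, 0.5]
print(mono_sum(left, right))  # [1.0, -0.5, 0.0]
```

In the mono sum, the polarity-reversed side pair cancels and only the middle element (at twice its level) remains, which is why such voices stay mono-compatible.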
Awareness of the existence of polarity reversal, as both a working tool and also a potentially disastrous production problem, is essential to your studio craft. When you are working with multiple iterations of a signal and suddenly the resulting signal disappears or attenuates by 50 dB, you can be pretty sure that one of the signals has its polarity reversed.
Many hardware manufacturers are fairly casual about the actual polarity of the signal; for example, many line-amp topologies invert the signal as a function of their design. I've encountered numerous consoles where various outputs have their polarities reversed. Nor can you count on sound cards to be much better in this regard, although I haven't done a survey recently. Once you know the nature of the problem, it is usually easy to correct.
The concept of phase refers to an offset in time during the cycle of a given wave, and it is a fundamental quality of the audio signal. The concept of polarity refers to the relative “sign” or polarity of two otherwise identical signals. The only real relation between the two is that they share the same jargon.
Dave Moulton has been working on The World's Best Loudspeaker, also known as TWBL. You can complain to him about anything at his Web site, www.moultonlabs.com.