Tracks, Voices, and Channels

Illustration: Jack Desrocher

Working with music gear involves not only learning new concepts but also sorting through a lot of confusing terminology. The terms voice, channel, and track often cause confusion. They come up frequently in discussions of the capabilities of multitrack recorders, digital audio workstations (DAWs), and MIDI sequencers. Here's a look at these terms, including how they apply in various systems and the differences among them.

FIG. 1: An 8-track analog play head records eight audio tracks across the tape's width.

An analog multitrack recorder's tape heads encode and decode audio information on discrete linear segments of tape called tracks (see Fig. 1). You categorize multitrack recorders by the number of magnets their record and play heads contain, which determines the total number of available tracks. Typically, multitrack machines provide 4 to 24 tracks.

Digital tape decks such as DAT (digital audio tape) recorders and modular digital multitracks (MDMs) also store audio in tracks, but unlike analog machines, digital decks store audio as a stream of zeros and ones. Most DAT machines are stereo and provide only two tracks, whereas most MDMs, such as the Alesis ADAT, offer eight tracks. Higher-capacity machines that provide as many as 48 tracks and use open-reel digital tape are available but cost at least several hundred thousand dollars.

Of course, a single track of a digital or an analog recorder can contain more than one type of sound. However, once two sounds are mixed onto a single track, you can't separate them.


Channel is another term associated with audio gear, and it is often misused. For example, some people refer to multitrack recorders as multichannel recorders. Whereas track refers to how audio information is stored, channel pertains to how that information travels from one place to another, in and out of various devices. Channel is often used in reference to mixers and signal paths.

On a standard mixer, the main hardware inputs are the channels, where the primary signals come in. Each channel has a main fader along with other controls (equalizers, sends, and so on) for mixing purposes. Mixers can also have secondary inputs, such as auxiliary returns, but they don't count as channels. For example, a typical 16-channel mixer might have 16 main inputs and 4 auxiliary inputs. It would properly be called a 16-channel, 20-input mixer, rather than a 20-channel mixer.

Standalone modular hard-disk recorders (M-HDRs) have become very popular in music production. Instead of using tape, the systems record digital audio onto a hard drive. However, you don't need to lay out the audio data on the hard drive in a linear fashion as you must when recording to tape. Instead, you can physically scatter data over the drive and then arrange it as a single track for playback. That is possible because an M-HDR is a random-access device: it can use audio segments from anywhere on the hard drive. A track in an M-HDR is simply an arbitrary collection of audio segments strung together, playing one after the other.
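
To make the idea of a random-access track concrete, here is a minimal Python sketch. It is only an illustration, not how any particular recorder stores its data: the `Segment` class, its field names, and the sample offsets are all hypothetical. It shows a single track assembled from three segments that sit in unrelated places on the drive yet play back one after the other.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    disk_offset: int  # hypothetical: where the audio starts on the drive, in samples
    length: int       # how many samples the segment holds


def playback_order(track):
    """Yield (timeline_position, disk_offset, length) for each segment in play order."""
    position = 0
    for seg in track:
        yield position, seg.disk_offset, seg.length
        position += seg.length


# One track built from three segments stored far apart on the drive.
track = [
    Segment(disk_offset=90_000, length=4_000),
    Segment(disk_offset=1_000, length=2_500),
    Segment(disk_offset=50_000, length=3_500),
]

plan = list(playback_order(track))
total_length = sum(seg.length for seg in track)
```

The order of the list, not the position of the data on the drive, decides what the listener hears. Rearranging the list re-edits the track without moving any audio.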

FIG. 2: The Roland VS-880 DAW is a modular hard-disk recorder that offers 8 physical tracks and 128 virtual tracks.

The Roland VS-880 is an example of an M-HDR (see Fig. 2). It is an 8-track device, meaning it can play as many as eight audio streams simultaneously. Random-access devices such as M-HDRs also offer virtual tracks, which are other streams of audio data on the hard drive that are not currently playing. The VS-880, for example, plays only 8 tracks at once but can store and maintain 128 virtual tracks. Virtual tracks are great for storing different guitar solo edits, multiple vocal performances, or alternate language tracks. When needed, you can combine material from different virtual tracks into one or more real tracks, a feat that would be difficult to perform with an analog or digital tape recorder.

Some audio systems run on a host computer and combine dedicated hardware and software. Digidesign's Pro Tools system is an example of this kind of DAW. Pro Tools software runs on Macs and PCs and requires Digidesign hardware to operate. (Digidesign offers a scaled-down version of Pro Tools called Pro Tools Free that does not require special hardware. Check Digidesign's Web site for details.) A system such as Pro Tools is available in several configurations. The high-end versions have multiple computer cards and audio interfaces; simpler configurations consist of a single computer card with audio inputs and outputs (I/O) built in. The configuration you choose determines the number of voices, channels, and tracks you get.

As with M-HDRs, tracks in a DAW are arrangements of audio segments of various sizes that are scattered across the hard drive. Pro Tools calls these segments Regions. Pro Tools organizes its tracks into a master document called a Session, and in a high-end Pro Tools TDM system, a maximum of 128 tracks is available in one Session file.

FIG. 3: This is a Pro Tools Session, with Regions in each track and each track allocated to a Voice. Note that the SFX tracks (SFX 1 and SFX 2) are allocated to the same Voice (A4), because their Regions play at different times.

However, having 128 tracks displayed on the screen doesn't mean they can play simultaneously, because Pro Tools limits the number of Voices you can have. A Voice is a single stream of digital audio playing in a track at a given moment. Pro Tools can play back only as many simultaneous streams as its Voice capability allows. Depending on your hardware, some Pro Tools systems offer 64 Voices, whereas others have 32 or fewer. Therefore, a Pro Tools Session containing, say, 72 audio tracks might be able to play only 32 tracks at once. The other tracks can play as soon as one of the first 32 tracks stops, freeing up that Voice. Fig. 3 shows a Pro Tools Session containing audio Regions arranged into tracks, with a Voice indication shown next to each track (A1, A3, and so on).
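
The way two tracks can share one Voice can be sketched in a few lines of Python. This is a simplified illustration, not Pro Tools' actual allocation logic: the function names, the track names, and the region times (in seconds) are all made up for the example. It uses a simple greedy rule that gives each track the first Voice whose existing tracks never play at the same time as it does.

```python
def regions_overlap(a, b):
    """True if any (start, end) region in track a overlaps in time with one in track b."""
    return any(s1 < e2 and s2 < e1 for s1, e1 in a for s2, e2 in b)


def assign_voices(tracks, voice_limit):
    """Greedy sketch: map each track name to a 1-based Voice number.

    Tracks that cannot be placed within voice_limit map to None.
    """
    voices = []       # voices[i] holds the region lists of tracks sharing Voice i + 1
    assignment = {}
    for name, regions in tracks.items():
        for index, members in enumerate(voices):
            if not any(regions_overlap(regions, other) for other in members):
                members.append(regions)
                assignment[name] = index + 1
                break
        else:
            # No existing Voice is free for this track; open a new one if allowed.
            if len(voices) < voice_limit:
                voices.append([regions])
                assignment[name] = len(voices)
            else:
                assignment[name] = None
    return assignment


# Hypothetical Session: the two SFX tracks never sound at the same time.
tracks = {
    "Vocal": [(0, 60)],
    "Guitar": [(0, 60)],
    "SFX 1": [(5, 10)],
    "SFX 2": [(20, 25)],
}
assignment = assign_voices(tracks, voice_limit=3)
```

Here four tracks fit into three Voices because SFX 1 and SFX 2 end up on the same one, much like the two SFX tracks in Fig. 3. With a limit of two Voices, both SFX tracks would be left without a Voice.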

At some point you will need to send your audio signals out of Pro Tools, perhaps to a digital mixer. Your Digidesign hardware will once more determine the number of physical input and output channels you possess. For example, Digidesign's main audio interface is the 888/24, which has eight mono channels of I/O. In a Pro Tools Session, you can assign each track to a specific output, depending on your system's limitations. Your Session might have 72 tracks with, say, a 32-Voice capability, but on a single audio interface, everything comes out of just eight audio channels. If you need more I/O channels, you can use multiple interfaces. A Pro Tools TDM system can have as many as 72 audio I/O channels. With a less expensive Pro Tools system, such as an Audiomedia III, you get a lot of tracks (24) in your Session but only two I/O channels.
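
Because tracks, Voices, and I/O channels are three separate limits, it can help to check a planned Session against each one in turn. The Python sketch below is a hypothetical helper, not part of any Digidesign software; the class and function names are invented, and the figures are the ones used in this article's example (a 128-track Session limit, 32 Voices, and a single eight-channel interface).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SystemLimits:
    max_tracks: int    # tracks a Session can hold
    max_voices: int    # tracks that can sound at the same moment
    io_channels: int   # physical outputs on the audio interface(s)


def check_session(limits, track_count, peak_simultaneous, outputs_used):
    """Return a list of problems; an empty list means the Session fits the system."""
    problems = []
    if track_count > limits.max_tracks:
        problems.append("too many tracks")
    if peak_simultaneous > limits.max_voices:
        problems.append("not enough Voices")
    if outputs_used > limits.io_channels:
        problems.append("not enough I/O channels")
    return problems


# Figures from the article's example; the helper itself is hypothetical.
limits = SystemLimits(max_tracks=128, max_voices=32, io_channels=8)

fits = check_session(limits, track_count=72, peak_simultaneous=32, outputs_used=8)
too_busy = check_session(limits, track_count=72, peak_simultaneous=40, outputs_used=10)
```

The 72-track Session fits as long as no more than 32 tracks sound at once and everything is routed to eight outputs. Exceeding any one limit is a problem even when the other two are satisfied.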


Many audio programs don't require dedicated hardware but will work with any standard audio and computer configuration. In those cases, the audio device and the computer determine your audio-performance capabilities. The CPU type and speed, the amount of RAM, the hard drive's speed, and the quality of the audio interface and its drivers work together to determine the performance level.

Emagic's Logic Audio is one example of an integrated MIDI and audio program that works with various audio hardware/computer configurations. Like Pro Tools, Logic organizes audio regions into audio tracks. Unlike Pro Tools, however, Logic Audio has no voice limitation. Instead, it has a maximum number of audio tracks that it can handle, and your ability to reach that maximum depends on the power your computer system provides.

For example, Emagic's flagship package, Logic Audio Platinum, has a fixed limit of 64 audio tracks with a single audio interface or 128 audio tracks with multiple audio interfaces. It is likely that only a very powerful computer can reach those limits. With anything less, you might experience intermittent audio playback. (Some software warns you in advance that dropouts might occur.) To avoid this, make sure that your computer system, including all associated hardware, satisfies the recommended (not the minimum) software requirements.

The number of audio I/O channels you get with a digital-audio sequencer also depends on your audio hardware, which can range from the default stereo I/O provided by many new PCs' built-in sound cards to the dozens of channels you get using a high-end Pro Tools hardware setup.


The three terms (track, channel, and voice) also appear in connection with MIDI sequencing, where they have similar meanings. When you record MIDI data into a sequencer, the data is stored in a linear stream called a track. Some sequencers offer a fixed number of tracks, but many modern programs don't set a fixed limit on the number of tracks you can use.

Like an audio recorder, a MIDI sequencer is limited to a fixed number of channel inputs and outputs. However, these are MIDI channels, not audio channels. The MIDI specification stipulates that a single MIDI cable can carry 16 channels, or distinct streams, of MIDI data. Many MIDI interfaces, especially external ones, support multiple streams of 16 channels each. To do that, they must have more than one MIDI In port and MIDI Out port. Each port can handle its 16 MIDI channels, giving you a huge number of MIDI channels to work with. A typical computer rig for MIDI sequencing might have 128 or more MIDI channels available. Just as a computer's audio interface determines the number of audio channels you can have, the MIDI interface determines the number of MIDI channels available.
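
The arithmetic behind those channel counts is simple enough to sketch in Python. The function names and the 1-based numbering are assumptions made for this illustration; only the figure of 16 channels per port comes from the MIDI specification.

```python
CHANNELS_PER_PORT = 16  # fixed by the MIDI specification


def total_midi_channels(port_count):
    """Number of MIDI channels available across port_count output ports."""
    return port_count * CHANNELS_PER_PORT


def locate(global_channel):
    """Map a 1-based channel number counted across all ports to (port, channel).

    Both returned values are 1-based, matching how ports and channels
    are usually labeled on hardware.
    """
    port, channel = divmod(global_channel - 1, CHANNELS_PER_PORT)
    return port + 1, channel + 1


eight_port_rig = total_midi_channels(8)   # an interface with eight MIDI Out ports
first_on_second_port = locate(17)         # the 17th channel overall
```

An eight-port interface therefore yields the 128 channels mentioned above, and the 17th channel overall is channel 1 on the second port.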

Voice is also used in connection with MIDI systems, but voice limitations aren't related to the sequencing software. Rather, the available polyphony of your MIDI sound module determines the maximum number of voices that you can use in your projects. (Often, the term note is used in place of voice when describing the polyphonic limits of a piece of hardware. This avoids confusion with the naming conventions of many manufacturers who use Voice to refer to a specific aspect of their sound architecture.) Modern synths can play from 32 to 128 polyphonic voices (notes), though a single synth patch might use two or more voices.
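
To see how layered patches eat into polyphony, consider this small Python sketch. The function name and the example figures are illustrative assumptions, not the specification of any particular synth.

```python
def max_simultaneous_notes(polyphony, voices_per_note):
    """How many notes can sound at once when each note uses several voices."""
    if voices_per_note < 1:
        raise ValueError("a note needs at least one voice")
    return polyphony // voices_per_note


single_layer = max_simultaneous_notes(polyphony=64, voices_per_note=1)
two_layers = max_simultaneous_notes(polyphony=64, voices_per_note=2)
```

On a hypothetical 64-voice module, a patch that layers two voices per note leaves room for only 32 notes at a time, which is half of what the polyphony figure alone suggests.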


The terms track, channel, and voice are used with many types of music gear and are not in themselves overly difficult. Yet because they are commonly misused, even on some major manufacturers' Web pages, they can cause more confusion than necessary. Look through the user manuals of any gear you plan to buy to make sure its specs meet your needs. Getting a better idea about these concepts will help keep you on the right track.

Jeff Baust is an audio engineer and composer in Boston and New York City. He is owner of Coral Sea Music and a professor of music technology at Berklee College of Music.