Get with the Interaction


FIG. 1: Cycling '74's Max/MSP combines customized MIDI and audio processing. Plain black patch cords carry MIDI-based processes, while black-and-yellow striped patch cords carry audio processes. This example creates simple FM synthesis.

FIG. 2: In Symbolic Sound's Kyma, processes are dragged onto a multitrack timeline. Processes such as waitForLowG (center-left, track 1) are used to tell the system when to start various tasks.

There's a growing trend in the world of electronic-music performance. Next time you go to a show, chances are you'll see a laptop onstage, functioning as a performer. In interactive composition and performance, control of a piece includes a computer that has been programmed to sense significant musical features from a human performer and produce its own music in response.


Nothing about the idea is new: people have been writing and playing interactive works for more than 25 years. But the pioneers worked for institutions that could spend hundreds of thousands of dollars on specialized computer systems. Now that PCs are intertwined with everyday life, interactive music systems have trickled down to the proletarian sphere of individual musicians. In this column, I'll take a brief look at the evolution of interactive music systems and give an overview of some performance approaches that are commonly used. (See the sidebar “References and Recordings” for additional resources.)


By the end of the 1960s, Max Mathews, the father of computer music, was increasingly dissatisfied with the music that computers were producing. Music created from coded scores was dry and lifeless. In an effort to transmit micromodulations (the uncountable variations in embouchure, bow position, breath pressure, and so on, that give live music a dynamic dimension), Mathews began the pursuit of what he called the intelligent machine that would respond to performers' nuances. Conductor was an early system that Mathews designed for Pierre Boulez, who was the musical director of the New York Philharmonic at the time. Boulez was enthusiastic about electronic elements in performance, but felt constrained by having to follow a tape. The Conductor system allowed electronic elements to be dynamically controlled by external devices such as joysticks and percussion instruments.

In 1977, composer Joel Chadabe snatched the first Synclavier off the production line and had it outfitted with special software that created melodies based on predefined parameters such as harmony and interval content. The Synclavier was interfaced with two modified theremins. One antenna controlled the tempo (note durations), while the other controlled relative volumes of four Synclavier voices (in effect, overall timbre). Chadabe wrote that performing with the system was like having “a conversation with a clever friend.” He could do things like cue clarinet sounds to play slowly; but since he did not know which pitches would play, the notes he heard then influenced his next control gesture.

Meanwhile, at Boulez's research brainchild, IRCAM, in Paris, work was under way on a digital signal-processing computer that was capable of any synthesis configuration as well as real-time audio processing. The 4X workstation was completed in the early 1980s and was like nothing the world had ever seen. Miller Puckette created a Macintosh-based interface for the 4X in which processes and controls were represented graphically. Patches could be created by drawing patch cords between modules, and processing algorithms could be switched on and off by various gates. He named the program Max in honor of Mathews.

Max was later ported to the NeXT personal computer, where it could be run with the help of peripheral hardware processors in a configuration called the ISPW (IRCAM Signal Processing Workstation). Though far more economical than the 4X, the ISPW remained a pricey hardware-software combination. Max was then released commercially as a kind of erector set for MIDI input, processing, and output and is now under active development by Cycling '74 (the sidebar “On the Web” provides URLs for all the developers mentioned in this article) for both Mac and Windows computers. The tools for interactivity were now within the means of independent musicians.


The following list includes books and magazine articles, as well as a number of recordings that capture the spirit of a live, interactive performance. Most of the recordings can be purchased online at

Books and Articles

Composing Interactive Music, by Todd Winkler (MIT Press, 1998)

Electric Sound: The Past and Promise of Electronic Music, by Joel Chadabe (Prentice Hall, 1997)

Interactive Music Systems, by Robert Rowe (MIT Press, 1993)

“Language Inventors on the Future of Music Software,” Computer Music Journal 26 (4): Winter 2002

Machine Musicianship, by Robert Rowe (MIT Press, 2001)

Trends in Gestural Control of Music, Marcelo Wanderley and Marc Battier, editors (IRCAM, 2000)


Pierre Boulez, “Répons,” from Répons/Dialogue de l'ombre double (Deutsche Grammophon, 1998)

Joel Chadabe, “Follow Me Softly” and Cort Lippe, “Music for Clarinet and ISPW” from The Composer in the Computer Age VII (CDCM, 1997)

Agostino Di Scipio, “5 Difference-Sensitive Circular Interactions”; Gerhard Eckel and Vincent Royer, “Traverse”; and Cort Lippe, “Music for Hi-Hat and Computer,” from ICMC 2000 (ICMA Recordings, 2000)

Tod Machover, “Bounce” from Tod Machover (Bridge Records, 1993)

Tod Machover, “Bug Mudra” from Flora (Bridge Records, 1990)

Roger Reynolds, The Paris Pieces (Neuma Records, 1995)

Jean-Claude Risset, “Eight Sketches: Duet for One Pianist” from Digital Rewind (MIT Experimental Music Studio, 1998)

Robert Rowe, “Color and Velocity” from Jade Nocturno (Quindecim, 2001)

Robert Rowe, “Flood Gate” from Cultures Electroniques 5: Bourges 1990 Laureats (Mnemosyne, 1990)

Robert Rowe, “Shells” from Tárogató (Romeo Records, 2001)


So what is meant, exactly, by machine responses to a human player? Author-composer Robert Rowe classifies interactions into three broad categories. The first concerns the type of “listening” a computer is doing. The second describes the computer response types. The third describes the nature of the partnership between performer and computer.

As for listening, computers can listen generally or specifically. General listening means that the computer senses general characteristics such as register, loudness, or density. Specific listening can come in two forms. One, score following, involves moment-by-moment estimations of a performer's tempo. One commercial score follower is SmartMusic, a practice aid for music students, by MakeMusic Inc. The program has accompaniments to standard repertoire for most solo instruments. A piece's accompaniment plays along with a soloist, whose tempo is tracked with a microphone. A less rigorous form of listening, score orientation, does not make continual tempo estimations but responds to selected highlights, such as a trigger from a pedal or a high note at a given pitch.
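As a rough illustration of general listening, the Python sketch below summarizes a window of MIDI-style note events by register, loudness, and density. The `NoteEvent` tuple and the `general_listening` function are hypothetical names invented for this example, not part of Max or any other system mentioned here. The sketch also assumes that register can be approximated by the mean note number, loudness by the mean velocity, and density by notes per second.

```python
from typing import NamedTuple


class NoteEvent(NamedTuple):
    """A hypothetical time-stamped note: onset in seconds, MIDI pitch, MIDI velocity."""
    time: float
    pitch: int
    velocity: int


def general_listening(events, window_start, window_length):
    """Summarize the notes whose onsets fall inside one analysis window."""
    window_end = window_start + window_length
    notes = [e for e in events if window_start <= e.time < window_end]
    if not notes:
        return {"register": None, "loudness": None, "density": 0.0}
    return {
        # Assumption: mean note number stands in for register.
        "register": sum(e.pitch for e in notes) / len(notes),
        # Assumption: mean velocity stands in for loudness.
        "loudness": sum(e.velocity for e in notes) / len(notes),
        # Notes per second within the window.
        "density": len(notes) / window_length,
    }


# A short rising phrase: C4, E4, G4, C5, getting louder.
phrase = [
    NoteEvent(0.0, 60, 80),
    NoteEvent(0.5, 64, 90),
    NoteEvent(1.0, 67, 100),
    NoteEvent(1.5, 72, 110),
]
features = general_listening(phrase, window_start=0.0, window_length=2.0)
```

A response algorithm could then watch these three numbers from window to window and react when, say, the density rises or the register drops.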

So much for listening. Now we can consider three forms of response. Transformative responses create variations on a performance. For example, Max can be configured to invert intervals, play a phrase backwards, transpose notes, arpeggiate chords, sense the current harmony and add a bass note, create chords from a melody, and more. Generative responses are based on material that the computer creates on its own, such as algorithmic creation of melodies from a library of pitches and rhythms (see “Game of Chance” in the November 2003 EM for more on algorithmic composition). Sequenced responses consist of stored musical passages that are kept on hand to be played when triggered. For example, in a score-oriented listening system, certain events in a score, such as a long, loud middle A, might trigger a preset melody. The performer might then create variations on the melody using a continuous-control pedal that changes the sequence's tempo or dynamics.
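To make the transformative responses more concrete, here is a minimal Python sketch of three of them (transposition, inversion, and playing a phrase backwards) applied to a list of MIDI note numbers. This is not how Max implements these operations; the function names are made up for illustration, and the sketch ignores rhythm, velocity, and the 0 to 127 limit on MIDI note numbers.

```python
def transpose(pitches, semitones):
    """Shift every note by the same number of semitones."""
    return [p + semitones for p in pitches]


def invert(pitches, axis):
    """Mirror each note around an axis pitch, so rising intervals fall and vice versa."""
    return [2 * axis - p for p in pitches]


def retrograde(pitches):
    """Play the phrase backwards."""
    return list(reversed(pitches))


# C4, D4, E4, G4 as MIDI note numbers (middle C = 60).
phrase = [60, 62, 64, 67]

up_a_fifth = transpose(phrase, 7)   # [67, 69, 71, 74]
mirrored = invert(phrase, axis=60)  # [60, 58, 56, 53]
backwards = retrograde(phrase)      # [67, 64, 62, 60]
```

Because each function takes a list and returns a list, responses can be chained, for example `retrograde(invert(phrase, 60))`.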

Finally, we can think of two roles the computer might play in a performance. In one, the computer extends the player's instrument, augmenting a solo performance with features such as filtering, effects, or pitch doubling. In the second, the computer creates another personality, so that it plays a kind of duet with a musician. Sophisticated implementations of duet partnering may rely on techniques of artificial intelligence to perform tasks such as defining phrase beginnings and endings or sensing changes of scale, mode, or key.


The previous examples described MIDI responses. MIDI is an effective vehicle for interaction, given its discrete, event-based format. Incoming events can be marked with time stamps, easily cataloged, and complemented by stored catalogs of algorithms or sequences. MIDI, however, provides an incomplete representation of a performance. Notably absent is any description of timbral variation. But an extension to Max called MSP adds the ISPW audio-processing modules to the environment, letting today's computer owners explore what was once only possible with the 4X, at less than one one-hundredth of the cost.

While an audio-based system has the advantage of being more closely tied acoustically to a performance, it lacks many of the flexibilities of a MIDI-based system. Responses such as playing a phrase in reverse or inverting all pitches around a given note are easy to implement with MIDI's unambiguous event types, but much more difficult to perform with a stream of audio samples. Polyphony is another issue that is easy for MIDI: a chord is easily recognizable as a set of discrete pitches. This level of analysis is impossible for an acoustic signal, as no one has been able to create a program that can distinguish between simultaneous pitches and overtones of a fundamental pitch. Acoustic systems, then, are typically based on input from a monophonic instrument.

Pitch trackers can identify the fundamental of a monophonic instrument or signal. With a pitch-tracking module, a signal's frequency can be sent to an oscillator to control its pitch, or the signal may be transposed. Other audio-based applications could include using the volume of an acoustic signal to modify the index of a frequency-modulating oscillator, or mapping MIDI controller values to audio processes such as reverb time, filter frequencies, or stereo placement. Analysis modules can do things like analyze incoming speech, separate noisy sibilants from periodic vowels, and process each differently.
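One simple way to estimate a fundamental is autocorrelation: compare the signal with delayed copies of itself and pick the delay at which it lines up best. The Python sketch below shows that idea under loud assumptions: the input is a clean monophonic frame, the function names are hypothetical, and this is not the algorithm used by Max/MSP, Kyma, or any other product named here, whose pitch trackers are considerably more refined.

```python
import math


def estimate_fundamental(samples, sample_rate, min_freq=80.0, max_freq=1000.0):
    """Return a rough fundamental estimate in Hz using plain autocorrelation."""
    min_lag = int(sample_rate / max_freq)  # shortest period considered
    max_lag = int(sample_rate / min_freq)  # longest period considered
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # How well does the frame match itself when delayed by `lag` samples?
        score = sum(samples[n] * samples[n + lag] for n in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


def transpose_hz(frequency, semitones):
    """Shift a frequency by a number of equal-tempered semitones."""
    return frequency * 2 ** (semitones / 12)


# Test signal: a 200 Hz sine, 1000 samples at 8 kHz.
SAMPLE_RATE = 8000
frame = [math.sin(2 * math.pi * 200 * n / SAMPLE_RATE) for n in range(1000)]

tracked = estimate_fundamental(frame, SAMPLE_RATE)
octave_up = transpose_hz(tracked, 12)  # frequency to send to an oscillator
```

The tracked value could drive an oscillator directly, or, as here, be transposed first. Real instrument signals are noisier than a sine wave, so a practical tracker would add smoothing and some protection against octave errors.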

OSC (Open Sound Control) is a protocol introduced by the Center for New Music and Audio Technologies (CNMAT) at the University of California at Berkeley in the late 1990s to enable real-time control of computer-synthesis processes from gestural devices. OSC does not include MIDI messages, but MIDI messages can easily be mapped into OSC, making OSC commands a superset of the MIDI protocol. OSC offers increased resolution and definition of gestures and synthesis parameters, as well as more accurate time control. It is transmitted over networks of computers, which means that it is well suited for broadcast performances of computers and performers interacting with each other from different places. The Gibson guitar company has also developed the MaGIC specification, which sends an electric guitar's acoustic signal over an Ethernet network, giving guitarists the opportunity to participate in these simulcast collaborations.
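The following Python sketch suggests how a MIDI controller value might be mapped into an OSC message. It follows the OSC 1.0 layout for a message with one float argument: an address string and a type-tag string, each null-terminated and padded to a multiple of four bytes, followed by a big-endian 32-bit float. The address scheme `/midi/cc/<number>` and the function names are assumptions made up for this example; OSC itself does not define them.

```python
import struct


def osc_string(text):
    """Encode an OSC string: ASCII, null-terminated, padded to a multiple of 4 bytes."""
    data = text.encode("ascii") + b"\x00"
    return data + b"\x00" * (-len(data) % 4)


def osc_float_message(address, value):
    """Build an OSC message carrying a single 32-bit float argument."""
    return osc_string(address) + osc_string(",f") + struct.pack(">f", value)


def midi_cc_to_osc(controller, value):
    """Map a 7-bit MIDI controller value (0-127) to a float between 0.0 and 1.0."""
    address = f"/midi/cc/{controller}"  # hypothetical address scheme
    return osc_float_message(address, value / 127.0)


packet = midi_cc_to_osc(controller=7, value=127)
```

The resulting bytes would typically be sent as a UDP datagram to whatever host and port the receiving synthesis program listens on. The float argument also hints at the added resolution: the receiver is not limited to MIDI's 128 steps.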


Joel Chadabe probably chose the theremin for his original Synclavier system because that instrument is practically unparalleled in its sensitivity to micromodulations. Ironically, as the sound capabilities of electronic instruments have evolved, their player interfaces have become increasingly rudimentary. Interactive performances often feature experimental-instrument types that push the sensing envelope. Instruments like Don Buchla's Lightning allow movements in space to be translated into MIDI control signals.

Massachusetts Institute of Technology Media Lab composer Tod Machover heads the development of hyperinstruments that generate various control signals. The conducting dataglove translates a conductor's left-hand movements into controls by tracking the angle of each finger relative to the back of the hand, as well as the angle of the joints of each finger. Hyperstrings augment the capabilities of string instruments. One commission by cellist Yo-Yo Ma consisted of sensors that tracked bow angle, bow pressure, wrist angle, and left-hand finger positions. Data from the cello motions and an analysis of the instrument's audio were fed into a computer that generated audio in response.



Max/MSP is the software most commonly used in interactive music applications (see Fig. 1). Its graphical front end facilitates algorithm configuration, while the essential issues of event scheduling and input tracking are kept “under the hood.” This allows users to focus on music rather than computer cycles. The Max environment has also spawned two offshoots. Pd (“pure data” or “public domain”) is a version introduced by Miller Puckette that exists in the public domain. It is free, runs on virtually all hardware platforms, and is under continual development by a community of users. Yet another version, jMax, is written in Java and is available from IRCAM's Web site.

Other systems suited to interactivity include Symbolic Sound's Kyma system, an audio processor and sound-programming language for Macintosh and Windows. Like Max, it is visually oriented, but processing and synthesis modules are arranged on a timeline. Kyma includes pitch and amplitude trackers, and it can be configured to wait for a specific event (such as a middle C) before, for example, running a script to generate notes (see Fig. 2).


James McCartney's SuperCollider, a free program for the Macintosh, is a text-based programming environment. Although the absence of a graphical interface makes SuperCollider harder to learn than some programs, it also permits a greater degree of efficiency and flexibility. For example, the number of active oscillators can be assigned to a variable. Changing the number of oscillators in a patch is simply a matter of changing the value assigned to that variable, rather than adding or removing objects and patch cords from the screen.
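The oscillator-count idea can be sketched in any text-based language. The example below uses Python rather than SuperCollider's own syntax, and the `oscillator_bank` function is a hypothetical stand-in for a patch: it sums a bank of sine oscillators tuned to harmonics of a fundamental, with the size of the bank set by a single argument.

```python
import math


def oscillator_bank(num_oscillators, fundamental, duration, sample_rate=8000):
    """Sum `num_oscillators` sine harmonics of `fundamental` into one signal."""
    num_samples = int(duration * sample_rate)
    signal = []
    for n in range(num_samples):
        t = n / sample_rate
        # Harmonic k sounds at k times the fundamental, at 1/k the amplitude.
        sample = sum(
            math.sin(2 * math.pi * fundamental * k * t) / k
            for k in range(1, num_oscillators + 1)
        )
        # Scale down so that adding oscillators never pushes the level past 1.0.
        signal.append(sample / num_oscillators)
    return signal


# Changing the patch from 2 oscillators to 16 is a one-number edit.
thin = oscillator_bank(num_oscillators=2, fundamental=220.0, duration=0.25)
rich = oscillator_bank(num_oscillators=16, fundamental=220.0, duration=0.25)
```

In a graphical environment, the same change would mean placing and wiring fourteen more oscillator objects by hand.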

Kyma's developer, Carla Scaletti, has pointed out that these programs are computer music languages. Most commercial music software falls into the category of a utility, meaning programs that perform common, well-defined functions. It's true that many utilities are quite complex; your average digital audio sequencer is an example. But they cannot match the open-endedness and flexibility of general-purpose languages that enable users to configure whatever synthesis and audio-processing algorithms they want, nor can they provide the same ability to tailor these processes to customized input and output routings. You can take all the features of your favorite commercial synths and combine them in one custom environment, provided you have the computer memory (and the patience!) to cobble them together. For those wanting individualized performance environments, computer music languages are the only way to fly.


Interactive music raises intriguing questions about musical intelligence, compositional methodology, and collaboration, questions that only become more intriguing as computing power advances. This is a pursuit likely to become an important current of 21st-century music.


www.makemusic.com
Cycling '74 Max/MSP
www.gibsonmagic.com
MIT Hyperinstrument Project

Mark Ballora teaches music technology at Penn State University, where he spends most of his time interacting with computers.