Multi-dimensional Music Representation and Mixing

Columbia University Computer Music Center

Chris Bailey and Brad Garton

The joy of musique concrète is that of making music out of "found sounds": sounds that we hear in our everyday lives, torn out of context and placed in a new, musical one. For example, a melody might be constructed out of a sequence of sounds that you recorded around your house: hitting a few different pans, some clinks of silverware and glasses, a scrape of your fingernail on a wall, and perhaps a motorcycle whizzing by outside to close off the gesture. The trick is to make these sounds cohere musically, so that when you hear the sequence, you think, "Those sounds are very familiar... yet they're making music. Wow!"

One way to do this is to group sounds together by shared characteristics. For example, you might want a percussive gesture, with all the sounds having hard, percussive attacks. Alternatively, you might want a very soft, flowing kind of texture. Or you might want to use a bunch of sounds that all contain the note A#, or that are very noisy, or very pure, or some combination of all of these. Clearly, there are many ways to classify found sounds.

So, for this installation, we began by making a database of several hundred sounds. Every sound comes with a list of parameter values associated with it.

The parameters are:

Duration: How long is the recorded sound fragment?
Pitch(es): Are there any notes that stick out or are prominent? (For example: you record yourself scraping your finger over the cheese grater, and it happens to sound like a low Bb...)
Loudness: This is a tricky one, because of course you can make any sound loud by turning up the volume. But besides that, is the sound "loud" in quality? Does it have a real bite, even at low volume, or is it meek, even at high volume?
Attack Hardness: This is a measure of how much BANG the sound has when it starts. So hitting a metal object has a high attack hardness; saying the syllable "Waaah" has a low attack hardness.
Purity--noisiness: This is fairly self-explanatory: is the sound "noisy", or does it sound bell- or tone-like?
Color: Is the sound tense, high, rich, and therefore "bright"? Or is it low, dull, plain, and therefore "dark"?
Agitation: Some sounds you record are very agitated: record yourself scraping your finger on a washboard, for example. On the other hand, if you hit a metal object and it just rings to silence, that's relatively unagitated.
Material/Category: This is a more psychological/semiotic parameter. What is the sound? Someone talking? A pot, a pan, a piece of wood? Or a recording of crickets in your backyard?
Tessitura: This is like pitch, but less exact. It is simply a matter of whether the sound is relatively high, or low.
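
To make the idea of a database entry concrete, here is a minimal sketch of how one sound and its nine parameters might be stored in Python. The field names, the 0.0 to 1.0 scales, and the example values are all assumptions made for illustration; they are not the installation's actual format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SoundEntry:
    """One database record. All names and scales here are hypothetical."""
    name: str
    duration: float                                   # length in seconds
    pitches: List[str] = field(default_factory=list)  # prominent notes, if any
    loudness: float = 0.5         # perceived "bite", not playback volume
    attack_hardness: float = 0.5  # 0.0 = soft onset, 1.0 = hard BANG
    noisiness: float = 0.5        # 0.0 = pure or tone-like, 1.0 = noisy
    color: float = 0.5            # 0.0 = dark, 1.0 = bright
    agitation: float = 0.5        # 0.0 = rings to silence, 1.0 = very agitated
    category: str = "unknown"     # material or source, e.g. "metal", "voice"
    tessitura: float = 0.5        # 0.0 = low, 1.0 = high


# Example values are invented, chosen by ear as the text describes.
grater = SoundEntry(
    name="cheese_grater_scrape",
    duration=1.2,
    pitches=["Bb"],
    noisiness=0.8,
    agitation=0.9,
    category="kitchen metal",
    tessitura=0.2,
)
```

Treating each parameter as its own field keeps them independent of one another, so any combination of values can be described.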

Once this data has been entered for a large number of sounds, you can have the computer create gestures out of them. For example, you might want a gesture consisting of short sounds, falling from high to low. The computer would first look for sounds with a short duration, and then, among those, for sounds whose "tessitura" goes gradually from high to low.
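
The "short sounds, falling from high to low" gesture can be sketched in a few lines of Python. This is only an illustration: the dictionary keys, the 0.5-second threshold, and the 0.0 to 1.0 tessitura scale are assumptions, not values taken from the installation.

```python
def falling_gesture(sounds, max_duration=0.5):
    """Keep only the short sounds, then order them from high to low tessitura."""
    short = [s for s in sounds if s["duration"] <= max_duration]
    return sorted(short, key=lambda s: s["tessitura"], reverse=True)


# A tiny made-up database (durations in seconds, tessitura 0.0 = low, 1.0 = high).
sounds = [
    {"name": "pan_hit", "duration": 0.3, "tessitura": 0.8},
    {"name": "motorcycle", "duration": 4.0, "tessitura": 0.2},
    {"name": "glass_clink", "duration": 0.2, "tessitura": 0.9},
    {"name": "wall_scrape", "duration": 0.4, "tessitura": 0.3},
]

gesture_names = [s["name"] for s in falling_gesture(sounds)]
print(gesture_names)  # ['glass_clink', 'pan_hit', 'wall_scrape']
```

The motorcycle is dropped because it is too long; the remaining three sounds play from highest to lowest.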

But how do you tell the computer to do this?

Well, you could write nasty, complicated computer programs to do it. But that's boring. So, for your enjoyment, we've created a multi-dimensional "space" that you can fly through. In the "space", you'll see the sounds floating around you. The tricky part is determining how to navigate a 9-dimensional space (one dimension for each of the nine parameters listed above) in a manner that delights both the eye and the ear. We arbitrarily decided that all the parameters would be independent of each other; this allows us to find a sound with any combination of parameter values. We realize that in the Real World many of the parameters we use are in fact linked to each other. To be honest, the list of parameters is rather arbitrary anyhow. We used our ears and our musical intuition to create the descriptions (and values) listed for each sound.
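
One way to picture "flying" through such a space is to treat your current location as a point with one coordinate per parameter and ask which sounds lie closest to it. The sketch below does this with plain Euclidean distance over three of the numeric parameters. The distance measure, the key names, and the values are all assumptions for illustration; the text does not say how the installation decides which sounds are nearby.

```python
import math


def distance(a, b, keys):
    """Euclidean distance between two points, using only the given parameters."""
    return math.sqrt(sum((a[k] - b[k]) ** 2 for k in keys))


def nearest_sounds(position, sounds, keys, count=2):
    """Return the `count` sounds closest to `position` in parameter space."""
    return sorted(sounds, key=lambda s: distance(position, s, keys))[:count]


# Hypothetical parameter names, each assumed to be scaled to 0.0-1.0.
KEYS = ("attack_hardness", "noisiness", "color")

space = [
    {"name": "pan_hit", "attack_hardness": 0.9, "noisiness": 0.4, "color": 0.8},
    {"name": "voice_waaah", "attack_hardness": 0.1, "noisiness": 0.2, "color": 0.5},
    {"name": "washboard", "attack_hardness": 0.6, "noisiness": 0.9, "color": 0.6},
]

# Where the listener currently "is" in the space.
here = {"attack_hardness": 0.8, "noisiness": 0.5, "color": 0.7}

nearby = [s["name"] for s in nearest_sounds(here, space, KEYS)]
print(nearby)  # ['pan_hit', 'washboard']
```

Because every parameter is its own axis, moving along one of them changes which sounds are nearby without touching the others.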

We then mapped (or assigned) each parameter to a different visual attribute seen on a computer screen: the positioning of sounds in a 3-dimensional representation, the color of the visual sound objects, the "spikiness" of each object, and so on. Our goal was to create a representation that would be memorable enough to allow us to revisit a region of 9-space with particularly satisfying sound combinations.
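
A mapping of this kind might look like the following Python sketch. Which parameter drives which visual attribute is a guess made purely for illustration (the text names the attributes but not the pairing), and the parameter values are assumed to lie between 0.0 and 1.0.

```python
def to_visual(sound):
    """Turn a sound's parameter values into hypothetical on-screen attributes."""
    brightness = sound["color"]
    return {
        # Three parameters place the object in the 3-D view.
        "position": (sound["noisiness"], sound["tessitura"], sound["agitation"]),
        # Bright sounds lean red, dark sounds lean blue.
        "rgb": (brightness, brightness * 0.5, 1.0 - brightness),
        # Hard attacks get spikier objects.
        "spikiness": sound["attack_hardness"],
        # Sounds with more "bite" are drawn larger.
        "size": sound["loudness"],
    }


# Invented values for a single sound.
pan_hit = {
    "noisiness": 0.25, "tessitura": 0.75, "agitation": 0.5,
    "color": 0.5, "attack_hardness": 1.0, "loudness": 0.75,
}

visual = to_visual(pan_hit)
print(visual["position"], visual["rgb"], visual["spikiness"])
```

Keeping each parameter tied to one attribute is what makes a region recognizable when you come back to it: the same sounds always look the same.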

The real fun, however, comes from simply 'flying around' in the sound space. Not a bad paradigm at all for composing new music!