Synthesis: Physical Models

Physical models are a relatively new class of synthesis algorithms based upon the use of bi-directional delay lines ("digital waveguides") and digital filters to imitate the physics of Real Live Instruments. They are very powerful and yield a wide range of interesting synthetic sounds and musical behavior, but they are somewhat difficult to use. Such is life.

Links

Most of these links are fairly theoretical in nature, hitting waveguide filter theory, etc. in some degree of detail. This is an emerging area of DSP research, however, and the links below can hopefully function as a guide for people interested in learning more about this fascinating area:

Dartmouth Book-Thing -- chapter on physical models, especially good on Karplus-Strong
Physical Modeling Synthesis -- good overview/tutorial from Harmony Central
Julius Smith Papers and Tutorials -- all about waveguide synthesis, from Julius hisself. Mathematical content
Perry Cook's page -- lots of info here, check out the link to the "SynthToolKit" in particular
Julius Smith's Karplus-Strong Paper -- probably linked from the above-mentioned page, too
Karplus-Strong Algorithm -- by Anne-Marie Burns, McGill; lots of good references

The following links are to downloadable physical model code and packages for experimenting with these models:

the Synthesis ToolKit (STK) -- the must-have for serious phys-model hackers. From Perry Cook and Gary Scavone, works on a lot of platforms
PeRColate -- max/msp port of STK and associated instruments, done right here at the good ole CMC
the RTcmix insts.stk.tar.gz package -- the RTcmix instruments source code, used in the [rtcmix~] object

There are also packages for CSOUND and a few ported models for SuperCollider, but I don't have the links handy...

Applications and Examples

For this class, we used one basic patch with an [rtcmix~] object and loaded it with different RTcmix scripts showing some of the physical modelling instruments available:

week7a-examples.sit -- StuffIt archive with the class max/msp patch and rtcmix scripts

individual patches as text files (for Windows users)
- phys-mod.txt
- simple_pluck.sco
- changing_decay.sco
- changing_squish.sco
- pluckiness.sco
- metaflute.sco
- clar1.sco
- clar2.sco
- mbrass.sco
- mmodalbar.sco
- mshakers.sco
  
  NOTE: the "*.sco" files contain the rtcmix scripts used in the "phys-mod.txt" patch. You will need to download all of them.

You may also want to fool with the extended (Charlie Sullivan) version of the Karplus-Strong algorithm, available as part of the RTcmix STRUM instrument. A fun demo of this instrument is available as the "riff-o-matic" help patch for the [maxlisp] object.

We started the class by exploring the theory and development of the "Karplus-Strong" (or "plucked-string") algorithm, discovered in the early 1980's by Kevin Karplus and Alex Strong at Stanford University. The basic algorithm uses a simple low-pass filter applied recursively to a buffer initially filled with random numbers (i.e. noise).

What does this mean? Well, you start with a buffer (usually not too large, maybe 100 samples or so) filled with random numbers, then you go through the buffer and average each pair of adjacent numbers. You replace each number in the buffer with the new averaged value. Thus the buffer itself gets modified -- in place -- by the low-passed version of itself. Then you do this over and over again, each time replacing the buffer samples with the newly-filtered (averaged) versions. If you write out the buffer numbers from this operation sequentially, the result is a repeating waveform that very quickly gets smoothed and eventually damps to a single value.

The aural result of this operation sounds remarkably like a plucked string. Depending on the initial set of random numbers, the timbral evolution will be slightly different. This means that it is possible to generate a series of 'plucks' and each one will be timbrally unique.
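The recipe above can be sketched in a few lines of Python. This is only a minimal illustration, not code from the class patch: the function name karplus_strong, its parameters, and the use of a seeded random generator are all assumptions made for the example. It averages each sample with the one read just before it, which is one common way of writing the two-point average.

```python
import random

def karplus_strong(buffer_size=100, n_samples=44100, seed=None):
    """Basic plucked-string sketch: noise buffer + recursive two-point average.

    All names and defaults here are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Fill the buffer with random numbers (the "pluck").
    buf = [rng.uniform(-1.0, 1.0) for _ in range(buffer_size)]
    out = []
    pos = 0
    prev = 0.0  # the sample read on the previous step
    for _ in range(n_samples):
        current = buf[pos]
        out.append(current)                 # write the buffer out sequentially
        buf[pos] = 0.5 * (current + prev)   # replace in place with the average
        prev = current
        pos = (pos + 1) % buffer_size       # wrap around: go through it again
    return out
```

Calling karplus_strong with a different seed each time gives a differently shaped noise burst, which is where the pluck-to-pluck timbral variation comes from. The samples would still need to be written to a sound file or an audio device to be heard.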

This very simple algorithm, then, is a very powerful and fairly sophisticated synthesis technique. It is also extremely efficient, using only a small amount of memory and a straightforward filter equation. One of the problems resulting from this approach is related to this simplicity. The length of the buffer determines the periodic repeat rate of the shifting waveform, which means that the pitch of the resulting sound is tied to the size of the buffer. At low frequencies (fairly large buffers), this isn't a big problem. At high frequencies, however, the buffer lengths can be quite short. The problem is that moving from a buffer length of 14 samples to a buffer length of 15 samples will produce a fairly significant jump in frequency. The basic formula for computing the frequency from a buffer size is:

frequency = sampling rate / buffer size

(Think about it -- it makes sense.) So a 14-sample buffer at a sampling rate of 44100 samples/sec. would produce a pitch of 3150 Hz. A 15-sample buffer yields a frequency of 2940 Hz. If you want a pitch of 3000 Hz, you're in trouble.
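To see how coarse the pitch grid gets, here is a small sketch of the arithmetic. The helper names (ks_frequency, neighbouring_pitches, cents_between) are made up for this example.

```python
import math

def ks_frequency(sample_rate, buffer_size):
    """Pitch of the basic algorithm: sampling rate divided by buffer size."""
    return sample_rate / buffer_size

def neighbouring_pitches(sample_rate, target_hz):
    """Return the two whole-number buffer sizes around the ideal (fractional)
    size for target_hz, each paired with the pitch it actually produces."""
    ideal = sample_rate / target_hz
    shorter = int(ideal)      # rounds down for positive values
    longer = shorter + 1
    return ideal, [(shorter, ks_frequency(sample_rate, shorter)),
                   (longer, ks_frequency(sample_rate, longer))]

def cents_between(f_high, f_low):
    """Musical interval between two frequencies, in cents (100 = semitone)."""
    return 1200.0 * math.log2(f_high / f_low)
```

For a 3000 Hz target at 44100 samples/sec. the ideal buffer would be 14.7 samples long, and the two available neighbours (3150 Hz and 2940 Hz) sit roughly 119 cents apart, which is more than a semitone.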

David Jaffe (also at Stanford at the time) proposed a set of extensions to the basic Karplus-Strong algorithm that enabled "fractional" buffer lengths, thus allowing for the production of any desired frequency. David also showed that by modifying the basic filter equation (averaging) slightly, a range of timbral effects could be produced. Many of these effects related to the perceived 'brightness' of the plucked-string.
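One widely used way to get a "fractional" buffer length is to put a first-order allpass filter in the loop and let it supply the missing fraction of a sample. The sketch below assumes that approach; it is not taken from Jaffe's code, and the names tuning_params and tuned_pluck are hypothetical. It also assumes the two-point average adds half a sample of delay, and that the allpass delay is approximately (1 - C) / (1 + C), an approximation that is best at low frequencies.

```python
import random

def tuning_params(freq_hz, sample_rate=44100):
    """Split the desired loop delay into a whole-sample delay line,
    the averaging filter's half sample, and an allpass fraction."""
    loop_delay = sample_rate / freq_hz      # e.g. 14.7 samples for 3000 Hz
    remaining = loop_delay - 0.5            # the average contributes 0.5
    n = int(remaining)                      # whole-sample part
    frac = remaining - n                    # left for the allpass
    if frac < 0.1:                          # avoid a near-zero allpass delay
        n -= 1
        frac += 1.0
    coeff = (1.0 - frac) / (1.0 + frac)     # low-frequency approximation
    return n, frac, coeff

def tuned_pluck(freq_hz, n_samples, sample_rate=44100, seed=None):
    """Plucked string with an allpass in the loop for fine tuning (sketch)."""
    n, _, c = tuning_params(freq_hz, sample_rate)
    rng = random.Random(seed)
    delay = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    pos = 0
    prev_in = prev_avg = prev_ap = 0.0
    out = []
    for _ in range(n_samples):
        x = delay[pos]
        avg = 0.5 * (x + prev_in)                  # low-pass (averaging) stage
        ap = c * avg + prev_avg - c * prev_ap      # first-order allpass stage
        delay[pos] = ap                            # feed back into the line
        pos = (pos + 1) % n
        prev_in, prev_avg, prev_ap = x, avg, ap
        out.append(x)
    return out
```

With these assumptions, a 3000 Hz request at 44100 samples/sec. works out to a 14-sample delay line plus 0.5 samples from the average plus about 0.2 samples from the allpass.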

In the mid 1980's, a Princeton undergraduate named Charles Sullivan added several more significant enhancements to the basic K-S algorithm. First of all, Charlie showed that by altering the characteristics of the initial random-filling of the buffer, it was possible to produce the effect of different hardnesses of the 'virtual plectrum' used to 'pluck' the string. Charlie then added a feedback pathway coupled with a waveshaping distortion algorithm, and voilà! -- instant grunge-o electric guitar.
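The two ideas in that paragraph can be illustrated with a rough sketch. To be clear, this is not Charlie's actual design (see the RTcmix STRUM source for that). It assumes, purely for illustration, that a "softer" plectrum can be imitated by smoothing the initial noise with a one-pole low-pass filter, and that tanh is an acceptable stand-in for the waveshaping distortion. The names plucked_excitation, hardness, waveshape and drive are all invented here.

```python
import math
import random

def plucked_excitation(size, hardness=1.0, seed=None):
    """Initial buffer fill. hardness in (0, 1]: 1.0 leaves the noise raw
    (a hard pick); smaller values smooth it (a softer pick). Assumed mapping."""
    rng = random.Random(seed)
    state = 0.0
    buf = []
    for _ in range(size):
        x = rng.uniform(-1.0, 1.0)
        # One-pole low-pass: blend the new noise sample with the running state.
        state = hardness * x + (1.0 - hardness) * state
        buf.append(state)
    return buf

def waveshape(sample, drive=4.0):
    """Soft-clipping waveshaper: larger drive means heavier distortion.
    The output always stays between -1 and 1."""
    return math.tanh(drive * sample)
```

A buffer from plucked_excitation could be used in place of the plain random fill in a Karplus-Strong loop, and waveshape could be applied to the string output before it is mixed back in through a feedback path.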

Julius O. Smith, a DSP researcher at Stanford, was so impressed by the sonic qualities of the Karplus-Strong algorithm that he reformulated the operation of the algorithm as the action of a bi-directional digital waveguide, effectively re-creating the physics of a real string in a digital simulation. He was able to generalize this idea to produce a new paradigm for the development of synthesis algorithms (see the above links to Julius' papers).
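A bare-bones version of the bi-directional idea looks something like the sketch below: two delay lines ("rails") carry waves travelling in opposite directions along the string, and each end reflects the wave back with its sign flipped. This is a simplified illustration under stated assumptions, not code from Julius' papers. The function name, the triangular initial shape, the single bridge_gain loss factor and the pickup position are all choices made for the example.

```python
from collections import deque

def waveguide_string(rail_length=20, n_samples=200, bridge_gain=0.99, pickup=5):
    """Ideal string as two opposite-travelling delay lines (sketch).

    rail_length should be at least 3 and pickup must be a valid rail index.
    """
    mid = rail_length // 2
    # Triangular initial displacement (a "pluck" near the middle).
    shape = [i / mid if i <= mid else (rail_length - 1 - i) / (rail_length - 1 - mid)
             for i in range(rail_length)]
    # Split the displacement equally between the two travelling waves.
    right = deque(0.5 * s for s in shape)   # moves toward the bridge
    left = deque(0.5 * s for s in shape)    # moves toward the nut
    out = []
    for _ in range(n_samples):
        # String displacement at the pickup is the sum of both waves.
        out.append(right[pickup] + left[pickup])
        at_bridge = right.pop()             # wave arriving at the bridge end
        at_nut = left.popleft()             # wave arriving at the nut end
        right.appendleft(-at_nut)           # inverting reflection at the nut
        left.append(-bridge_gain * at_bridge)  # lossy inverting reflection
    return out
```

In this sketch a round trip takes 2 * rail_length samples, so the fundamental is the sampling rate divided by twice the rail length, and each round trip scales the waveform by bridge_gain. Replacing that single gain with a filter is where the family resemblance to Karplus-Strong shows up.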

Several of Julius' students greatly extended this idea. Most notably, Perry Cook (now at Princeton University) built a number of physical model instrument simulations, beginning with several basic wind instrument (flute, clarinet) and brass models. Perry's work now includes a range of wind/brass instruments as well as a number of percussion instrument models. Perry's recent explorations have gone "meta", using a technique he calls "physically informed stochastic event modelling" (PhISEM) to reproduce the sounds of shaken instruments such as maracas, tambourines, etc.

Although physical models make a lot of conceptual sense, and their efficiency as synthesis algorithms is almost unparalleled, they haven't seen widespread use. The fact is that the models are almost too good -- they are actually "difficult to play" (just like real instruments!). In simulating the physics of actual sounding devices, the non-linearities that make those devices (flutes, trumpets, etc.) work have also been captured. This means that the few parameters that are set in a physical model algorithm all interact in non-linear ways, and producing a desired result can be dependent upon a range of interacting factors. Perry's basic clarinet model will actually 'squeak' if 'played' incorrectly (or correctly, if that is your intended output!).

Why use physical models, then? Well, they actually do produce a wide variety of interesting sounds, and the timbral complexity created by a good physical model is extremely hard to produce using other digital synthesis techniques. I'm also fascinated by the kinds of "handles" available as parameters for physical models. This synthesis approach could be very suggestive when employed as an element in a larger musical scheme (algorithmic composition or data auralization, for example).

Try these models out. Don't be disturbed if you don't get the ultimate Sound You Want, because chances are that the output will at the very least be, um, "interesting".