Week 2



Ok, so we're starting on our Max/MSP/Jitter 'unit' -- Luke has this really cool idea... we're going to work up in class an extended application that will allow us to do wild and wonderful things by combining real-time FFT analysis/resynthesis with Jitter (video matrix processing) and nifty interface stuff. Sounds cool, eh?

Luke went into a bit of detail about how the FFT works. If you'd like more information about the FFT, check out the following web sites:

Also, the following books are really good resources for the math-behind-the-magic with respect to things like the Fourier Transform, etc. We'll be referencing Perry's book later in the term as we get into physical modeling a bit more.

Check out the resources page for Max/MSP/Jitter resources. We forgot to mention in class that you can download Max/MSP/Jitter for a 30-day trial period if you'd like to fool with it before buying any software.

As promised, here are semi-coherent notes taken during Luke's presentation.


The big fun plan -- Use Max/MSP/Jitter to create a SNAZZY application!

What Luke intends to do is use an FFT analysis of a soundfile (Fast Fourier Transform -- see below for more info), turn it into a "waterfall" FFT, then migrate towards a 3-D FFT, all the time keeping it interactive, allowing us to 'traverse' the transform and operate on it to make weird and wonderful sounds.

For those who don't know Max/MSP — Basic Electroacoustics (g6601) (taught by Terry Pender and Johnathan Lee) is presently covering this particular programming environment. Also check out what is available for Max/MSP/Jitter on the resources web page for this class.

If you’d like to jump right in and want to learn Max/MSP/Jitter on your own, there is fairly extensive documentation on our Macintosh computers at the CMC. Find it by:

The Max tutorials are pretty basic, the MSP tutorials are a bit better, and the Jitter tutorial is very useful.


What is Jitter?

jitter:

a collection of objects that all start with "jit".
Event-driven objects; e.g. jit.qt.movie plays QuickTime movies, with frames sent out in response to events ("bangs")

outlets of the objects put out 'jit matrices' (basically video data). The little cat icon in the object tool-dock is the interpreter for this video-matrix output (i.e. it will display the decoded matrix signal)

If you print the output from a jit object's outlet, it will say "jit_matrix" followed by some data garbage.

These matrices can be named for internal processing. A way to keep a handle on the data, etc.

Example of matrix-processing signal chain -- inserting jit.brass after the jit.qt.movie will 'emboss' the video.

Terry asked what frame rate corresponds to reality (for example, how fast should a 'metro' be to play back a QT movie at normal speed?). The bangs only dictate how fast the video matrices are being sent out -- the jit.qt.movie object will interpolate frames from the actual frame rate of the movie.
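
Not Max code, but here's the idea in a tiny Python sketch (the frame rate and metro interval are made-up numbers): the movie's own frame rate decides which frame comes out at each bang, so a faster metro just redraws more often, it doesn't speed the movie up.

```python
movie_fps = 30.0          # frame rate stored in the movie itself
metro_interval_ms = 40.0  # how often we bang (25 bangs per second here)

for bang in range(10):
    elapsed_sec = bang * metro_interval_ms / 1000.0
    frame_index = int(elapsed_sec * movie_fps)   # frame chosen by elapsed time
    print(f"bang {bang}: show movie frame {frame_index}")
```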

Another object, jit.plur, will do odd interpolations across the video matrix. Sending messages like "x_step 39" and "y_step 32" will cause a snazzy blurring effect. These data values can also be loaded automatically by listing the various 'attributes' (parameters) with an "@" sign in front of them... a good way to initialize the video state of the object. No more loadbangs!

"plur" stands for peace love unity rave

The rightmost outlet on each jitter object gives out the state of the object. Usually the leftmost inlets/outlets carry the video (matrix) stream.

oops, I missed some stuff. I had to go and get more chairs.

Each matrix has:

dim (dimensions) (x and y for video)
cells (one at each point in those dimensions, holding the actual data)

So this is a genuinely multi-dimensional matrix. You can store virtually any kind of data in these things; it doesn't have to be video.

The "fps" display object will display lots of info about the matrix (video) stream, including the type, etc.

Each cell is then addressed by x and y, with the cells holding structured data (in "video" cells, four planes: alpha, red, green, blue).

jit.unpack/jit.pack -- will pack and unpack the planes of each cell, puts 'em out through individual outlets/inlets. You can then add effects on individual channels, swap them around, other fun things.
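
If you think of the matrix as a numpy array, the unpack/pack idea looks roughly like this (not Max code, just a sketch with made-up frame sizes and random data):

```python
import numpy as np

# a "video" frame: height x width cells, 4 planes per cell (alpha, r, g, b),
# faked here with random 8-bit ("char") data
frame = np.random.randint(0, 256, size=(240, 320, 4), dtype=np.uint8)

# the jit.unpack idea: split the cells into their individual planes
a, r, g, b = [frame[:, :, p] for p in range(4)]

# mess with one plane, then "jit.pack" everything back together with the
# red and blue planes swapped -- cheap psychedelia
g = 255 - g
processed = np.stack([a, b, g, r], axis=-1)
```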

jit.repos -- does remapping of pixel addresses. Each pixel number can be shifted to a new address. (on each help patch object -- clicking the "view html reference" will bring up an html reference for the object.)
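
Same disclaimer as above: here's a little numpy sketch of the "every pixel gets a new address" idea, using a made-up mirror-plus-wave mapping rather than anything jit.repos actually ships with.

```python
import numpy as np

frame = np.random.randint(0, 256, size=(240, 320, 4), dtype=np.uint8)

# for every output pixel, decide which input pixel it should read from
ys, xs = np.indices((240, 320))
new_xs = 319 - xs                                           # mirror left/right
new_ys = (ys + (8 * np.sin(xs / 20.0)).astype(int)) % 240   # add a wavy offset

remapped = frame[new_ys, new_xs]    # every pixel shifted to a new address
```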

jit.matrix -- stores matrix data, allows you to name it for internal reference. Why name a matrix? Allows you to operate on a video matrix and then replace that matrix with the revised data. This is necessary for feedback effects.

jit.rota -- for example, will rotate a matrix, but you need to keep updating the matrix it is operating upon.
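
Here's a rough Python stand-in (made-up sizes, not the actual patch) for why the write-back matters: each pass reads the stored matrix, rotates and mixes it with the new frame, and stores the result for the next bang.

```python
import numpy as np

# the "named" matrix that survives between passes (square frames so the
# rotation keeps the same shape)
stored = np.zeros((256, 256, 4), dtype=np.float32)

def feedback_pass(new_frame, stored, mix=0.9):
    rotated = np.rot90(stored)                 # crude stand-in for jit.rota
    out = mix * rotated + (1.0 - mix) * new_frame
    stored[:] = out                            # write back for the next pass
    return out

for _ in range(10):                            # each "bang" feeds the loop again
    frame = np.random.rand(256, 256, 4).astype(np.float32)
    out = feedback_pass(frame, stored)
```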


WHAT WE'RE GONNA DO

Back to MSP-land... play with sound. We're going to take an FFT of a sound, store it in the matrix, and then resynthesize it, AFTER we do really fun and snazzy stuff in jit.

[aside -- a good codec for compressing movies for jit is photo-jpeg. Use the QuickTime Player "export" option. Don't use Sorenson encoding: it makes small files, but the computer has to work hard to recreate the movie. Movies do have to be compressed for jit, though.]

[doing a "Get Info" on the movies will show you the data rate, important for figuring out how much nifty stuff you will be able to do with the computer.]

pfft~ -- an MSP object which loads another Max/MSP patch as a subpatch. Audio goes in and audio comes out, but inside the subpatch the representation of the data is changed (it lives in the frequency domain).

Consider a normal audio waveform -- it is represented in the 'time' domain. We all know that Fourier figured out how to decompose the time-domain signal into a series of 'frequency' domain signals (sine waves, basically). The quick-and-dirty way of thinking about how this happens is to imagine a bunch of tiny frequency-centered filters stacked up, each one measuring how much energy goes through at its frequency. So every so often, we grab a chunk of audio and do this transformation into a set of sine-wave frequencies. We can also do the "inverse" FFT, which will reconstruct the audio by generating a bunch of sine waves and adding them all together.
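
If you want to see the round trip without firing up MSP, here's the same idea in numpy (the chunk size and frequencies are just example values):

```python
import numpy as np

sr = 44100
t = np.arange(1024) / sr
chunk = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 880 * t)

spectrum = np.fft.rfft(chunk)           # time domain -> frequency domain
rebuilt = np.fft.irfft(spectrum)        # frequency domain -> back to time

print(np.max(np.abs(chunk - rebuilt)))  # tiny (~1e-16): nothing was lost
```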

Each little "frequency-filter" point is called a "bin" (sometimes called a "channel"; Luke called them "catcher's mitts" for some really weird reason). The number of these bins determines the resolution of the frequency-grabber, but it also determines how big a chunk of sound we have to grab for analysis, because of the way the FFT works. Time/frequency resolution tradeoff: the more "bins" you have, the greater your frequency resolution, but the FFT has to operate on a correspondingly larger chunk of sound, so the timing gets blurred. Vice versa, too.
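
To put some made-up numbers on the tradeoff (assuming a 44.1 kHz sample rate): the bin spacing is the sample rate divided by the FFT size, and the chunk length is the FFT size divided by the sample rate.

```python
sr = 44100.0
for fft_size in (512, 1024, 4096):
    hz_per_bin = sr / fft_size           # frequency resolution
    chunk_ms = fft_size / sr * 1000.0    # how much time each analysis swallows
    print(f"{fft_size:5d}-point FFT: {hz_per_bin:6.1f} Hz per bin, "
          f"{chunk_ms:6.1f} ms of sound per chunk")
```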

One way around this limitation is to advance ("hop") the FFT analysis forward by fewer samples than the chunk size, so successive analyses overlap. But the timing within each chunk is still going to be blurred out.
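
A quick numpy sketch of the "hop" idea (sizes made up; in the real patch pfft~ handles the overlapping for you):

```python
import numpy as np

sr, fft_size, hop = 44100, 1024, 256   # hop < fft_size, so analyses overlap 75%
signal = np.random.randn(sr)           # one second of stand-in audio
window = np.hanning(fft_size)

frames = []
for start in range(0, len(signal) - fft_size, hop):
    chunk = signal[start:start + fft_size] * window
    frames.append(np.fft.rfft(chunk))  # a new spectrum every `hop` samples
frames = np.array(frames)              # shape: (num_frames, fft_size // 2 + 1)
```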

Back to pfft~:

Within the subpatch, instead of inlets and outlets, you use fftin~ and fftout~. fftin~ has three outlets: real (x), imaginary (y), and sync (which tells you where you are in the fft frame -- basically the current bin index). The real and imaginary parts can be converted to amplitude and phase (and from the phase we'll get the frequency we want, see below) by doing some projection math on the unit circle, or by using the Max/MSP cartopol~ object (cartesian to polar).
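
The projection math is just this (one bin's worth, with arbitrary example values):

```python
import numpy as np

# what cartopol~ does per bin: cartesian (real, imaginary) -> polar (amp, phase)
real, imag = 0.3, -0.4
amplitude = np.hypot(real, imag)     # sqrt(real**2 + imag**2) -> 0.5
phase = np.arctan2(imag, real)       # angle around the unit circle, in radians
print(amplitude, phase)
```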

To find the frequency of each bin in the fft, we can start at the bin's center frequency and then look at the phase of the sine wave. If the phase is unchanging from frame to frame, the sine wave for that bin is dead-on the middle. If the phase is increasing or decreasing, we can use the amount that it changes to determine how far above or below the bin's center frequency the component really is. This gives us a relatively exact frequency value for each sine-wave component in the frequency domain.
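
Here's that calculation written out as a small Python function. The function name and the fft-size/hop arguments are mine, not anything from Luke's patch, but the math is the standard phase-vocoder trick for turning phase change into frequency.

```python
import numpy as np

def bin_frequency(k, phase_prev, phase_now, fft_size, hop, sr=44100):
    """Estimate the true frequency of bin k from how its phase moved
    between two successive FFT frames."""
    center = k * sr / fft_size                     # the bin's center frequency
    expected = 2 * np.pi * hop * k / fft_size      # phase advance if dead-on center
    deviation = (phase_now - phase_prev) - expected
    deviation = (deviation + np.pi) % (2 * np.pi) - np.pi   # wrap into [-pi, pi)
    return center + deviation * sr / (2 * np.pi * hop)      # offset from center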

In the pfft~ patch, these values are computed and stored in a jit matrix. Ha ha! Now the fun starts! The matrix has two planes: plane 0 stores the amplitude and plane 1 stores the frequency for each bin of the fft analysis.

This storage is done by having the sync outlet of the fft analysis update the y-index of the jit matrix -- this way we can store each amplitude/frequency pair in the appropriate slot in the matrix.
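
Very roughly (made-up sizes, and obviously the real thing happens inside the pfft~ patch, not in Python), the layout is a two-plane matrix with one row per bin and one column per FFT frame, which is exactly what you'd want for the waterfall display later:

```python
import numpy as np

num_bins, num_frames = 512, 100   # bins down the y-axis, one column per FFT frame

# plane 0 = amplitude, plane 1 = frequency
analysis = np.zeros((num_bins, num_frames, 2), dtype=np.float32)

def store_frame(frame_index, amps, freqs):
    # the sync outlet's job: step the y (bin) index so each pair lands in its slot
    for bin_index in range(num_bins):
        analysis[bin_index, frame_index, 0] = amps[bin_index]
        analysis[bin_index, frame_index, 1] = freqs[bin_index]
```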

We have a second patch, playfft~, that takes the fft data and reconstructs it into a time-domain waveform.
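
Here's the "bunch of sine waves added back together" picture from above as a toy Python function. This is just the conceptual oscillator-bank version, not necessarily how the actual patch does it (inside pfft~ the reconstruction goes through fftout~ and overlapping inverse FFTs).

```python
import numpy as np

def resynthesize(amps, freqs, num_samples=1024, sr=44100):
    """Rebuild a chunk of audio from one frame of (amplitude, frequency)
    pairs by adding up sine waves."""
    t = np.arange(num_samples) / sr
    out = np.zeros(num_samples)
    for amp, freq in zip(amps, freqs):
        out += amp * np.sin(2 * np.pi * freq * t)
    return out

# e.g. two partials: 440 Hz and 880 Hz
chunk = resynthesize([1.0, 0.5], [440.0, 880.0])
```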

 

The last patch we did does a simple fftin~/fftout~ thing. (Get the .sit archive from Luke.)

Because the FFT data is in a jitter matrix, we can now operate on the data using various jit matrix operations.

Which we will do.