Chapter 3: The Frequency Domain
Section 3.2: Phasors
In Chapter 1 we talked about the basic atom of sound, the sine wave, and about the function that describes the sound generated by a vibrating tuning fork. In this chapter we're talking a lot about the frequency domain. If you remember, our basic units, sine waves, only had two parameters: amplitude and frequency. It turns out that these dull little sine waves are going to give us the fundamental tool for the analysis and description of sound, and especially for the digital manipulation of sound. That's the frequency domain: a place where lots of little sine waves are our best friends.
But before we go too far, it's important to fully understand what a sine wave is, and it's also wonderful to know that we can make these simple little curves ridiculously complicated, too. And it's useful to have another model for generating these functions. That model is called a phasor.
Description of a Phasor
Think of a bicycle wheel suspended at its hub. We're going to paint one of the spokes bright red, and at the end of the spoke we'll put a red arrow. We now put some axes around the wheel: the x-axis going horizontally through the hub, and the y-axis going vertically. We're interested in the height of the arrowhead relative to the x-axis as the wheel (our phasor) spins around counterclockwise.
As time goes on, the phasor goes round and round. At each instant, we measure the height of the arrowhead over the x-axis. Let's consider a small example first. Suppose the wheel is spinning at a rate of one revolution per second. This is its frequency (and remember, this means that the period is 1 second per revolution). This is the same as saying that the phasor spins at a rate of 360 degrees per second, or better yet, 2π radians per second (if we're going to be mathematicians, then we have to measure angles in terms of radians). So 2π radians per second is the angular velocity of the phasor.
This means that after 0.25 second the phasor has gone π/2 radians (90 degrees), and after 0.5 second it's gone π radians or 180 degrees, and so on. So, we can describe the amount of angle that the phasor has gone around at time t as a function, which we call θ(t).
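As a quick check of the numbers above, and assuming the phasor starts at angle 0 when t = 0 and spins at one revolution per second, the angle function works out to:

```latex
\theta(t) = 2\pi t, \qquad
\theta(0.25) = \frac{\pi}{2}, \qquad
\theta(0.5) = \pi, \qquad
\theta(1) = 2\pi
```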
Now, let's look at the function given by the height of the arrow as time goes on. The first thing that we need to remember is a little trigonometry.
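The sine and cosine of an angle are measured using a right triangle. For our right triangle, with hypotenuse c, side a opposite the angle θ, and side b adjacent to it, the sine of θ, written sin(θ), is given by the equation:
$$\sin(\theta) = \frac{a}{c}$$
This means that:
$$a = c \sin(\theta)$$
We'll make use of this in a minute, because in this example a is the height of our triangle.
Similarly, the cosine, written cos(θ), is:
$$\cos(\theta) = \frac{b}{c}$$
This means that:
$$b = c \cos(\theta)$$
This will come in handy later, too.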
Now back to our phasor. We're interested in measuring the height at time t, which we'll denote as h(t). At time t, the phasor's arrow is making an angle of θ(t) with the x-axis. Our basic phasor has a radius of 1, so we get the following relationship:
$$h(t) = \sin(\theta(t)) = \sin(2\pi t)$$
We also get this nice graph of a function, which is our favorite old sine curve.
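The basic phasor can be sketched in a few lines of Python. This is only an illustration, assuming the phasor has radius 1, starts at angle 0, and spins at one revolution per second; the function names `theta` and `height` are made up for this example.

```python
import math

def theta(t, revs_per_second=1.0):
    """Angle (in radians) swept by the phasor after t seconds, starting from 0."""
    return 2 * math.pi * revs_per_second * t

def height(t, revs_per_second=1.0):
    """Height of the arrowhead above the x-axis for a phasor of radius 1."""
    return math.sin(theta(t, revs_per_second))

# Sample the height every eighth of a second over one full revolution.
for step in range(9):
    t = step / 8
    print(f"t = {t:.3f} s   theta = {theta(t):.3f} rad   h = {height(t):+.3f}")
```

Plotting the printed heights against time traces out one cycle of the sine curve.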
Then we get this nice curve, which is another kind of sinusoid (bigger!).
Now let's start messing with the frequency, which is the rate of revolution of the phasor. Let's ramp it up a notch and instead start spinning at a rate of five revolutions per second. Now:
$$\theta(t) = 10\pi t$$
This is easy to see since after 1 second we will have gone five revolutions, which is a total of 10π radians. Let's suppose that the radius of the phasor is 3. Again, at each moment we measure the height of our arrow (which we call h(t)), and we get:
$$h(t) = 3\sin(\theta(t)) = 3\sin(10\pi t)$$
Now we get this sinusoid:
In general, if our phasor is moving at a frequency of f revolutions per second and has radius A, then plotting the height of the phasor is the same as graphing this sinusoid:
$$h(t) = A\sin(2\pi f t)$$
Now we're almost done, but there is one last thing we could vary: we could change the place where we start our phasor spinning. For example, we could start the phasor moving at a rate of five revolutions per second with a radius of 3, but start the phasor at an angle of π/4 radians, instead.
Now, what kind of function would this be? Well, at time t = 0 we want to be taking the measurement when the phasor is at an angle of π/4, but other than that, all is as before. So the function we are graphing is the same as the one above, but with a phase shift of π/4. The corresponding sinusoid is:
$$h(t) = 3\sin(10\pi t + \pi/4)$$
Our most general sinusoid of amplitude A, frequency f, and phase shift φ has the form:
$$h(t) = A\sin(2\pi f t + \phi)$$
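Written as code, the general sinusoid A sin(2πft + φ) might look like the sketch below. The function and parameter names are illustrative choices, not part of the original text.

```python
import math

def sinusoid(t, amplitude=1.0, frequency=1.0, phase=0.0):
    """Height at time t of a phasor with the given radius (amplitude),
    rate of rotation in revolutions per second (frequency), and
    starting angle in radians (phase)."""
    return amplitude * math.sin(2 * math.pi * frequency * t + phase)

# The example from the text: radius 3, five revolutions per second,
# starting at an angle of pi/4.
start = sinusoid(0.0, amplitude=3.0, frequency=5.0, phase=math.pi / 4)
print(start)  # 3 * sin(pi/4), roughly 2.12
```

Setting `phase=0.0` and `amplitude=1.0` gives back the basic sine curve from the start of the section.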
A particularly interesting example is what happens when we take the phase shift equal to 90 degrees, or π/2 radians. Let's make it nice and simple, with the frequency equal to one revolution per second and amplitude equal to 1 as well. Then we get our basic sinusoid, but shifted ahead π/2. Does this look familiar? This is the graph of the cosine function!
You can do some checking on your own and see that this is also the graph that you would get if you plotted the displacement of the arrow from the y-axis. So now we know that a cosine is a phase-shifted sine!
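One way to do that checking numerically is sketched below. It assumes a unit phasor at one revolution per second and represents the arrowhead as a complex number x + iy, which is a convenience of this example rather than something the text introduces; the function names are hypothetical.

```python
import cmath
import math

def arrow_tip(t, frequency=1.0):
    """Position of the arrowhead of a unit phasor, as a complex number x + iy."""
    return cmath.exp(1j * 2 * math.pi * frequency * t)

def shifted_sine(t, frequency=1.0):
    """A sine wave started a quarter turn (pi/2 radians) ahead."""
    return math.sin(2 * math.pi * frequency * t + math.pi / 2)

# Compare the horizontal displacement with the shifted sine at a few instants.
for step in range(8):
    t = step / 8
    x = arrow_tip(t).real
    print(f"t = {t:.3f}   x = {x:+.4f}   sin(2*pi*t + pi/2) = {shifted_sine(t):+.4f}")
```

The two columns agree at every instant, up to floating-point rounding.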
Fourier's theorem tells us that any reasonably well-behaved periodic function can be expressed as a sum (possibly with an infinite number of terms!) of sinusoids. (We'll discuss Fourier's theorem more in depth later.) Remember, a periodic function is any function that looks like the infinite repetition of some fixed pattern. The length of that basic pattern is called the period of the function. We've seen a lot of examples of these in Chapter 1.
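In particular, if the function has period T, then this sum looks like:
$$g(t) = A_0 + \sum_{n=1}^{\infty} A_n \sin\left(\frac{2\pi n t}{T} + \phi_n\right)$$
Here A_n and φ_n are the amplitude and phase shift of the n-th sinusoid, and A_0 is a constant offset.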
If T is the period of our periodic function, then we now know that its frequency is 1/T. This is also called the fundamental (frequency) of the periodic function, and we see that all other frequencies that occur (called the partials) are simply integer multiples of the fundamental.
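The relationship between the period and the partials is simple enough to write down directly. The helper below is a hypothetical illustration; it numbers the partials so that partial 1 is the fundamental.

```python
def partial_frequencies(period, count):
    """Frequencies (in Hz) of the first `count` partials of a periodic
    function whose period is `period` seconds. Partial 1 is the fundamental."""
    fundamental = 1.0 / period
    return [n * fundamental for n in range(1, count + 1)]

# A pattern that repeats every 1/100 second has a 100 Hz fundamental.
print(partial_frequencies(0.01, 5))
```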
If you read other books on acoustics and DSP, you will find that partials are sometimes called overtones (from the German word "Oberton") and harmonics. There's often confusion about whether the first overtone is the second partial, and so on. So, to be specific, and also to be more in keeping with modern terminology, we're always going to call the first partial the one with the frequency of the fundamental.
Example: Suppose we have a triangle wave that repeats once every 1/100 second. Then the corresponding fundamental frequency is 100 Hz (it repeats 100 times per second). Triangle waves only contain partials at odd multiples of the fundamental. (The even multiples have no energy. In fact, this is generally true of wave shapes with half-wave symmetry, where the second half of each period is the first half turned upside down, like the triangle wave.) Click on Applet 3.2 and see a triangle wave built by adding one partial after another.
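The same partial-by-partial construction can be sketched in Python. The amplitudes used here (proportional to 1/n² with alternating signs) come from the standard textbook Fourier series of a triangle wave, not from this section, and the function name is hypothetical.

```python
import math

def triangle_partial_sum(t, fundamental=100.0, num_partials=1):
    """Sum of the first `num_partials` nonzero (odd-numbered) partials of a
    triangle wave with peak value 1, evaluated at time t in seconds."""
    total = 0.0
    for k in range(num_partials):
        n = 2 * k + 1                      # odd partial numbers: 1, 3, 5, ...
        sign = -1.0 if k % 2 else 1.0      # signs alternate: +, -, +, ...
        total += sign * math.sin(2 * math.pi * n * fundamental * t) / n**2
    return (8 / math.pi**2) * total

# The ideal triangle wave peaks at 1 a quarter of the way through its period.
quarter_period = 1 / (4 * 100.0)
for count in (1, 2, 5, 50):
    print(count, round(triangle_partial_sum(quarter_period, num_partials=count), 4))
```

As more odd partials are added, the value at the peak creeps up toward 1, which is the corner of the triangle slowly sharpening.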
How should we define an arithmetic of arrows? It sounds funny, but in fact it's a pretty natural generalization of what we already know about adding regular old numbers. When we add a negative number, we go backward, and when we add a positive number, we go forward.
Our regular old numbers can be thought of as arrows on a number line. Adding any two numbers, then, simply means taking the two corresponding arrows and placing them one after the other, tip to tail. The sum is then the arrow from the origin pointing to the place where "adding" the two arrows landed you.
Really, what we are doing here is thinking of numbers as vectors. They have a magnitude (length) and a direction (in this case, positive or negative, or better yet 0 radians or π radians).
Now, to add phasors, we need to enlarge our worldview and allow our arrows to get not just 2 directions, but instead a whole 2π radians' worth of directions! In other words, we allow our arrows to point anywhere in the plane. We add, then, just as before: place the arrows tip to tail, and draw an arrow from the origin to the final destination.
So, to recap: to add phasors, at each instant as our phasors are spinning around, we add the two arrows. In this way, we get a new arrow spinning around (the sum) at some frequency: a new phasor. Now it's easy to see that the sum of two phasors of the same frequency yields a new phasor of the same frequency. We can also see that the sum of a cosine and sine of the same frequency is simply a phase-shifted sine of the same frequency with a new amplitude given by the square root of the sum of squares of the amplitudes of the two original phasors. That's the Pythagorean theorem!
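Tip-to-tail addition of arrows in the plane is exactly what complex-number addition does, so the recap can be sketched with Python's built-in complex type. The amplitudes 3 and 4, the frequency, and the function names are arbitrary choices for this example.

```python
import cmath
import math

def phasor(amplitude, phase):
    """A phasor frozen at time t = 0, stored as a complex number (an arrow in the plane)."""
    return cmath.rect(amplitude, phase)

def height(arrow, frequency, t):
    """Height at time t of an arrow spinning counterclockwise at `frequency` rev/s."""
    return (arrow * cmath.exp(1j * 2 * math.pi * frequency * t)).imag

sine_part = phasor(3.0, 0.0)            # 3 sin(2*pi*f*t)
cosine_part = phasor(4.0, math.pi / 2)  # 4 cos(2*pi*f*t), a sine shifted by pi/2
total = sine_part + cosine_part         # tip-to-tail addition of the two arrows

print(abs(total))          # amplitude of the sum, close to 5 = sqrt(3**2 + 4**2)
print(cmath.phase(total))  # phase shift of the sum, in radians
```

Because the sine arrow and the cosine arrow are at right angles, the length of their sum is the hypotenuse of a right triangle with sides 3 and 4.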
Sampling and Fourier Expansion
The decomposition of a complex waveform into its component phasors (which is pretty much the same as saying the decomposition of an acoustic waveform into its component partials) is called Fourier expansion.
In practice, the main thing that happens is that analog waveforms are sampled, creating a time-domain representation inside the computer. These samples are then converted (using what is called a fast Fourier transform, or FFT) into what are called Fourier coefficients.
Figure 3.17 shows a common way to show timbral information, especially the way that harmonics add up to produce a waveform. However, it can be slightly confusing. By running an FFT on a small time-slice of the sound, the FFT algorithm gives us the energy in various frequency bins. (A bin is a discrete slice, or band, of the frequency spectrum. Bins are explained more fully in Section 3.4.) The x-axis (bottom axis) shows the bin numbers, and the y-axis shows the strength (energy) of each partial.
The slightly strange thing to keep in mind about these bins is that they are not based on the frequency of the sound itself, but on the sampling rate. In other words, the bins evenly divide the sampling frequency (linearly, not exponentially, which can be a problem, as we’ll explain later). Also, this plot shows just a short fraction of time of the sound: to make it time-variant, we need a waterfall 3D plot, which shows frequency and amplitude information over a span of time. Although theoretically we could use the FFT data shown in Figure 3.17 in its raw form to make a lovely, synthetic gamelan sound, the complexity and idiosyncrasies of the FFT itself make this a bit difficult (unless we simply use the data from the original, but that’s cheating).
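To make the idea of evenly spaced bins concrete, here is a small sketch. It assumes the usual convention that an FFT of N samples has bins spaced (sampling rate / N) apart, and it lists only the bins from 0 up to half the sampling rate. The function name and the example settings are hypothetical, not taken from Figure 3.17.

```python
def bin_frequencies(sample_rate, fft_size):
    """Center frequency (Hz) of each bin from 0 up to half the sampling rate,
    for an FFT computed on `fft_size` samples."""
    spacing = sample_rate / fft_size
    return [k * spacing for k in range(fft_size // 2 + 1)]

# Hypothetical settings: 44,100 samples per second and a 1024-sample time-slice.
freqs = bin_frequencies(44100, 1024)
print(freqs[1] - freqs[0])   # spacing between neighboring bins, about 43 Hz
print(freqs[-1])             # the last bin sits at half the sampling rate
```

Notice that the spacing depends only on the sampling rate and the slice length, not on the pitch of the sound being analyzed.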
Figure 3.18 shows a better graphical representation of sound in the frequency domain. Time is running from front to back, height is energy, and the x-axis is frequency. This picture also takes the essentially linear FFT and shows us an exponential image of it, so that most of the "action" happens in the lower 2 kHz, which is correct. (Remember that the FFT divides the frequency spectrum into linear, equal divisions, which is not really how we perceive sound. It's often better to graph this exponentially so that there's not as much wasted space "up top.")
The waterfall plot in Figure 3.18 is stereo, and each channel of sound has its own slightly different timbre.
Here's a fact that will help a great deal: if the highest frequency is B times the fundamental, then you only need 2B + 1 samples to determine the Fourier coefficients. (It's easy to see that you should need at least 2B, since you are trying to get 2B pieces of information: B amplitudes and B phase shifts.)
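The sketch below tries this out on a made-up signal with B = 2. It uses a slow, direct discrete Fourier transform rather than a real FFT, assumes the samples are spread evenly over exactly one period, and assumes there is no constant offset; the amplitudes, phases, and function names are all hypothetical.

```python
import cmath
import math

def sample_signal(partials, num_samples):
    """Sample one period of a sum of sinusoids. `partials` maps each partial
    number n to an (amplitude, phase) pair; the period is taken to be 1."""
    return [
        sum(a * math.sin(2 * math.pi * n * k / num_samples + p)
            for n, (a, p) in partials.items())
        for k in range(num_samples)
    ]

def recover_partial(samples, n):
    """Estimate (amplitude, phase) of partial n with one term of a direct DFT."""
    size = len(samples)
    coeff = sum(x * cmath.exp(-2j * math.pi * n * k / size)
                for k, x in enumerate(samples))
    # For a sine of phase p, the DFT coefficient has angle p - pi/2.
    return 2 * abs(coeff) / size, cmath.phase(coeff) + math.pi / 2

# Hypothetical signal whose highest partial is B = 2 times the fundamental.
B = 2
original = {1: (1.0, 0.3), 2: (0.5, 1.2)}
samples = sample_signal(original, 2 * B + 1)   # only 5 samples
for n in range(1, B + 1):
    print(n, recover_partial(samples, n))
```

With just five samples, the printed amplitudes and phases match the ones used to build the signal, up to floating-point rounding.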
©Burk/Polansky/Repetto/Roberts/Rockmore. All rights reserved.