The Modern Audio Analyzer
Understanding dual-channel FFT audio analyzers
The FFT analyzer can give us amplitude and phase over frequency in more detail than we can ever use. Great. Unfortunately, in its raw form the data is only marginally usable in the practical world of sound system optimization. We had to build in a lot of special features to adapt this tool for our trade. Here are a few examples of the limitations of the basic single-channel FFT analyzer: It creates a linearly spaced frequency response, a mismatch to our log hearing system. It requires a known (and precisely constructed) input signal; try explaining to the musicians which notes they can play, and when. The phase response is like a clock with just a second hand. Am I late or early? This is just a start.
How do we get around this? We throw money at it. We stack up lots of analyzers and run them in series and parallel. A typical modern system runs about 16 FFT analyzers together to make the composite frequency response display you see.
The basic transform flow goes from the sampled waveform to time record to real and imaginary numbers to magnitude (the more mathematical word for “amplitude”) and phase over frequency. First item of business: Forget about real and imaginary numbers. That is for math geeks only. That leaves us with two transitions: input to time record and then on to the frequency domain. The sampled input waveform runs by the same rules as the digital audio we deal with every day (i.e., we sample at >2X the highest frequency we want to use). We will use a 50kHz sample rate as an example, which gives us a usable 22kHz bandwidth.
The 50kHz sample frequency yields time increments of 0.02 milliseconds. Let’s start the counting game. If we take 250 samples, our time buffer will contain 5 milliseconds (250 x 0.02 milliseconds) of data. Notice that 5 milliseconds is the period for the frequency 200Hz, so exactly one cycle of 200Hz will fit into our time buffer. The time/frequency domain transform begins an interrogation of the data in the buffer. Let’s listen in to the conversation. “Did anyone here complete exactly one cycle? If so, (a) how big are you? and (b) what part of the phase cycle were you in when you entered (and left) the time buffer?” The former tells us the magnitude at 200Hz, and the latter tells us the phase. The next step is to ask if anybody completed two cycles, which gives us the status report on 400Hz. The process continues for as long as we like (until we reach the highest frequency allowed by the sample rate). In short, a 5-millisecond time record has us slicing the spectrum into 200Hz increments. These are evenly spaced linear separations but variable separations on the log scale (1 octave between 200Hz and 400Hz, roughly 0.6 octave between 400Hz and 600Hz, roughly 0.4 octave between 600Hz and 800Hz, and so on).
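The counting game above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular analyzer is built: the test tone (a 200Hz cosine with amplitude 0.5 and a 45 degree starting phase) and the helper names `dft_bin` and `magnitude_and_phase` are made up for the example, and the transform is a plain correlation rather than a fast FFT.

```python
import cmath
import math

SAMPLE_RATE = 50_000   # Hz, the example rate used in the text
NUM_SAMPLES = 250      # 250 samples x 0.02 ms = 5 ms time record

def dft_bin(samples, k):
    """Ask the time record: did anyone here complete exactly k cycles?"""
    n = len(samples)
    return sum(x * cmath.exp(-2j * math.pi * k * i / n)
               for i, x in enumerate(samples))

def magnitude_and_phase(samples, k):
    """Return (amplitude, phase in degrees) for bin k of a real signal."""
    value = dft_bin(samples, k)
    return 2 * abs(value) / len(samples), math.degrees(cmath.phase(value))

# Spacing between bins is the reciprocal of the time record length.
bin_spacing = SAMPLE_RATE / NUM_SAMPLES   # 200.0 Hz

# Hypothetical test tone: 200 Hz cosine, amplitude 0.5, starting at 45 degrees.
tone = [0.5 * math.cos(2 * math.pi * 200 * i / SAMPLE_RATE + math.radians(45))
        for i in range(NUM_SAMPLES)]

mag_200, phase_200 = magnitude_and_phase(tone, 1)   # one cycle  -> 200 Hz
mag_400, _ = magnitude_and_phase(tone, 2)           # two cycles -> 400 Hz

# The same 200 Hz step covers fewer and fewer octaves as frequency rises.
octaves = [math.log2(hi / lo) for lo, hi in ((200, 400), (400, 600), (600, 800))]

print(bin_spacing, round(mag_200, 3), round(phase_200, 1), round(mag_400, 3))
print([round(o, 2) for o in octaves])
```

The 200Hz bin reports the tone's amplitude (0.5) and starting phase (45 degrees), while the 400Hz bin comes back essentially empty, since nobody in the buffer completed two cycles.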
Infinity and Beyond
The first challenge we see is linear data in a log world (the one in our heads). The second challenge is more subtle. What if there were data in the waveform from frequencies that are not integer multiples of 200Hz? How do we count 300Hz? Do we spread its amplitude across the 200Hz and 400Hz bins? A 300Hz tone completes one and a half cycles in 5 milliseconds, so the phase values at the beginning and end of the time record don’t match. Which is right? What can we do? If we capture a longer waveform, say 10 milliseconds, we will be counting in 100Hz increments. Now we can get 300Hz, but what about 350Hz? The problem divides down but never goes away. Remember how I told you that the full Fourier transform needed to be computed for infinity? You are seeing it right here.
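To see the problem divide down without going away, here is a small Python sketch. It is a hedged illustration only: the unit-amplitude sine test tones and the helper names `tone` and `bin_magnitudes` are assumptions for the example, and no windowing or averaging is applied.

```python
import cmath
import math

SAMPLE_RATE = 50_000  # Hz, the example rate used in the text

def tone(freq_hz, num_samples):
    """Unit-amplitude sine sampled into a time record (hypothetical test signal)."""
    return [math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
            for i in range(num_samples)]

def bin_magnitudes(samples, num_bins=5):
    """Amplitude reported by the first few bins, keyed by bin frequency in Hz."""
    n = len(samples)
    spacing = SAMPLE_RATE / n
    result = {}
    for k in range(1, num_bins + 1):
        value = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        result[round(k * spacing)] = 2 * abs(value) / n
    return result

short_300 = bin_magnitudes(tone(300, 250))  # 5 ms record: bins at 200, 400, ...
long_300 = bin_magnitudes(tone(300, 500))   # 10 ms record: bins at 100, 200, ...
long_350 = bin_magnitudes(tone(350, 500))   # 350 Hz falls between 300 and 400

for label, bins in (("300 Hz in 5 ms", short_300),
                    ("300 Hz in 10 ms", long_300),
                    ("350 Hz in 10 ms", long_350)):
    print(label, {f: round(m, 3) for f, m in bins.items()})
```

With the 5-millisecond record, the 300Hz tone shows up as a sizeable reading in both the 200Hz and 400Hz bins. The 10-millisecond record gives it a bin of its own, but the 350Hz tone then smears across 300Hz and 400Hz in the same way.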
The price for not measuring to infinity is making an assumption that the finite sample we have is representative of everything between the Big Bang and the end of time. Let’s look at this challenge with an analogy. If our time sample were the complete song “Stairway to Heaven,” we would be operating under the assumption that this has been played continuously back to back forever. Admittedly it feels that way. But the loop must have the whole song, so that the quiet whiny ending meets the quiet whiny beginning. If our time record cut off while Jimmy Page was wailing with his amp set to 11, the restart would be abrupt and we would know the song does not remain the same. Yes, that is a long analogy, but I am saving you from integral mathematics.
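For those who want a number to go with the analogy, the abrupt restart can be measured at the seam of the loop. The Python sketch below is an assumption-laden illustration: it uses a cosine test tone and a made-up helper, `loop_seam_jump`, to compare where the real signal would go after the time record ends with where the repeat-forever assumption says it goes.

```python
import math

SAMPLE_RATE = 50_000   # Hz, the example rate used in the text
NUM_SAMPLES = 250      # 5 ms time record

def loop_seam_jump(freq_hz):
    """Size of the jump where the looped record restarts (cosine test tone)."""
    def sample(i):
        return math.cos(2 * math.pi * freq_hz * i / SAMPLE_RATE)
    true_next = sample(NUM_SAMPLES)   # what the real signal does next
    looped_next = sample(0)           # what the endless loop plays next
    return abs(true_next - looped_next)

seam_200 = loop_seam_jump(200)   # whole number of cycles: end meets beginning
seam_300 = loop_seam_jump(300)   # one and a half cycles: abrupt restart

print(round(seam_200, 6), round(seam_300, 6))
```

The 200Hz tone loops seamlessly because it completes exactly one cycle in the record. The 300Hz tone is cut off mid-cycle, so the loop restarts with a full peak-to-peak jump.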