Apr 2 2009

DFT Magnitude/Power Spectra

I often get confused about the scaling of magnitude and power spectra. I think I’ve worked it out so I’m puttin it here for my own benefit, and maybe the benefit of someone on the internet. If I’m wrong, please tell me.

Let’s take a unit-amplitude sinusoid at 100 Hz, and do a Fourier transform. If we look at the amplitude we’d get two components at 100 Hz and -100 Hz, each with amplitude 1/2. Something like this.

Now, if we do the same with a DFT we will get basically the same thing, except we’ll see the effects of discretization of course. But depending on which variation of the DFT you’re using, the magnitude of the components will be either about 1/2 or about N/2. The difference is, of course, the 1/N factor that you included or left out. In the case of the fft function in Octave (and I assume MATLAB), it’s without the 1/N factor.
max(abs(fft(x,1024)))
ans = 443.23

The power spectrum is the magnitude spectrum squared. abs(fft(x)).^2 or (abs(fft(x))/N).^2. Why would you want to square it? Well there are of course many reasons but the nature of audio signals makes most of the ones I know about moot. Furthermore the relative shape of the power spectrum and the magnitude spectrum is the same.

Try this. Take an audio signal (something more interesting than a single sinusoid) and plot the magnitude spectrum in dB, then plot the power spectrum also in dB, e.g.
plot(20*log10(abs(fft(x))))
figure
plot(20*log10(abs(fft(x)).^2))

They have a different scale, but they have the same shape. If you were doing something like peak-picking, it wouldn’t matter which you used if you’re working in dB.


Jan 21 2009

Log-frequency Spectrograms

Tensai asked me how I made my graphs in my ringtones post. I’d like to blather on about the graphs and why they’re cool and how you can make them, because that’s the sort of thing I’m good at.

In the olden days, the spectrogram was invented.

spectrogram of speech

Originally greyscale, they are now usually portrayed in color, with “hot” colors meaning higher amplitude and “cool” colors meaning lower amplitude.

While there are myriad ways to view and/or generate spectrograms, the most convenient for me right now is to do it in Octave. If you’re not familiar with Octave, it’s a MATLAB clone. Octave is libre, MATLAB is insanely expensive. Obviously, I use Octave. I have used MATLAB previously (I had a beta copy, until it left beta), and for the most part they are quite comparable. Octave is a bit slower for some things (less optimized) but I’ve seen Octave outperform MATLAB on some specific tasks. The biggest impedance mismatch is in user interface stuff (MATLAB has sophisticated dialog support) and graphing. Octave has most of the essential graphing functionality (it uses gnuplot to render graphs).

So how do we generate a spectrogram in Octave? First we need to read in the WAV file, then we generate the spectrogram.


[x,sr] = wavread('logchirp.wav');
specgram(x,8192,sr);

8192 is the size of the FFT. I find 8192 is a nice compromise between time and frequency resolution (and computation time), but other powers of 2 (especially 1024) are common as well. Here’s the chirp spectrogram:

chirp spectrogram

Notice how the chirp is logarithmic. To our ears, this sounds like a steadily-rising tone. For this reason when dealing with music it’s often better to look at a log-frequency spectrogram. Otherwise all the low frequencies are scrunched together and the relationships between different pitches (and harmonics) aren’t constant. Here’s a log-frequency spectrogram of the same chirp:

log-frequency spectrogram of chirp

This was generated by the Octave code

logfsgram(x,8192,sr); title('logchirp.wav');

Notice the bleed on the low frequencies, this is because we need a longer FFT in order to get more frequency resolution at low frequencies. This is a tradeoff in time resolution though, and processor time. Experiment with different FFT lengths for extra credit.

To use these functions you’ll need to put the specgram and logfsgram “m-files” in your octave search path (current directory or whatever else you specify in your ~/.octaverc).


Nov 17 2008

Clojure DSP Longing

I often find myself longing to be able to use Clojure, a very enticing lispy language that runs on the JVM.

I could possibly be using it right now in my dissertation research. It has the promise of dynamic languages, functional programming, almost-as-cool-as-Erlang concurrency, JVM performance, and Java library soup. It could be so awesome. A few months ago I started briefly down this road, unaware that…

Clojure sucks. Not generally, but it sucks for DSP. More specifically, Java and therefore Clojure has no real support for complex numbers. In order to do serious DSP, you need native syntactic, semantic, and performance support for complex numbers. Java has none of the above. Older versions of C didn’t have syntactic or semantic support, but the performance of using arrays was plenty fast. Not so in Java, at least not to the extent necessary to override the lack of syntactic and semantic.

So someday, when I’m writing general purpose code again and not high performance DSP code, I will have an opportunity to use Clojure, and I think that will make me very happy. By then the book will be out of beta. The community will be in full swing. There will be awesome libraries. Children will play in pristine parks with formerly-ravenous ravens.

In the meantime, if anyone sees the scene change, do let me know.


May 4 2006

RTDX OLE API

And now, ladies and gentlemen, I will answer the questions you have been dying to have answered! Right before your very eyes I will demystify The Texas Instruments TMS320C6416DSK RTDX OLE API!

Continue reading


Apr 6 2006

Convergent Rounding

The rounding they taught you in elementary school has an eventual bias. i.e.
over time you find that you’ve rounded up more often than you’ve rounded down.
Bias is a bad thing, especially in DSP and other numerically-intensive computing.

Convergent rounding is one good way to eliminate this bias. It works the same
as regular rounding except that instead of rounding 0.5 up, you round 0.5 to
the nearest even number. So 1.5 goes to 2.0, 2.5 goes to 2.0, etc.

A quick tutorial on short fixed-point programming in C. The decimal point is
between bits 15 and 14, and bit 15 is your sign bit. So, you have a range of
1-2^-15 to -1. So 0.5 is 0×4000, and -0.5 is 0xC000 (two’s complement). Now,
when you multiply two numbers together you get twice as many bits of precision,
so multiplying two shorts gives you an int, and in that int the decimal point
is between bits 30 and 29, and bits 30 and 31 are sign bits. We want to round
at bit 15 so we can fit the result back into a short. Clear as mud?

The point of this post is to get this C code out there, because as simple as it
seems it was a devil to debug:

int cround(int a)
{
    if (a & 0x3fff)
        a += 0x4000;              // normal rounding
    else
        a += (a>>1) & 0x4000;     // convergent rounding

    return a & 0xffff8000;
}

If you don’t see why this works (I didn’t when I first read the “add half of
the lsb uf the upper part” algorithm), then fill out this truth table as an
excercise:

0.00
0.01
0.10
0.11
1.00
1.01
1.10
1.11

Here’s an example of the function in use:

short x,y;
int a;
x = 0x7ffd;     // x = almost 1
y = 0x4000;     // y = 1/2
a = x*y;        // a = 0x1fff4000
a = cround(a);  // a = 0x1fff0000
x = a>>15;      // x = 0x3ffe