Different FFT implementationsThe relative performance (speed of execution) depends on fft size and stereo/mono. The speed depends on cash usage and computer architecture and I do not yet know which routines to select for different situations on different hardware platforms.
On a Pentium 66MHz which is the slowest computer I have acces to the 3 different versions run at equal speed. They all use about 23% of the total cpu time available when sampling is at 20kHz for a single channel 1024FFT with a sin squared window. On a Pentium MMX sampling at 44.1kHz 4096FFT's run 30% faster with the approximate FFT.
Fig.1 and fig.2 show graphs with a real to complex FFT. These graphs represent correctly produced FFT's.
Fig.3 and fig.4 show graphs with a more elaborate algorithm which is sometimes considerably faster. Here the signal is first mixed digitally with 10kHz to produce an IQ pair. The IQ pair is filtered in a IIR filter to remove components above 5kHz and then the number of points is reduced by a factor of two. Finally the IQ pair is transformed by use of a complex FFT. The IIR filter is fast but introduces two errors. Firstly the frequency response is no longer flat which is absolutely no problem - later digital processes will correct the frequency respons of the analog filters preceding the D/A converter and the deviation from flattnes is also absorbed completely. Secondly, the limited steepness of the IIR filter causes an aliasing spur that will degrade performance towards the ends of the spectrum. This is usually no problem either since the analog circuitry outside the computer has to attenuate signals near the spectrum ends to avoid aliasing from the A/D sampling and to avoid the mirror image signal when mixing from RF to audio. The approximate FFT should not be used if more than 90% of the bandwidth from zero to the Nyquist frequency is to be used.