The fast Fourier transform (FFT) is one of the most widely used and important signal processing functions, for example, in applications related to digital communications and image processing. Since the computational complexity of the FFT is O(N log2N), where N the number of inputs, the FFT potentially requires multi-cycle processing, and can become a major bottleneck for overall system performance. To relieve this bottleneck, many commercial FPGA IP blocks provide a streaming form of the FFT with single-cycle-per-sample throughput. This high-throughput form of FFT comes by exploiting the locality of reference and massive parallelism offered by modern FPGA fabrics. Multiple butterflies can be easily implemented on the FPGA. The twiddle factors and the intermediates data samples can be stored either in block RAM or Distributed RAM or a combination of both. This completely eliminates the need for off chip memory access which may slow down the FFT considerably. This report descirbes the implementation of FFT on a Xilinx-6 FPGA.

Download (PDF, 337KB)