Responsible for this page: Oscar Gustafsson , oscarg@isy.liu.se
Page last update: 2009-05-20
The basic building block of an FPGA is a look-up table (LUT) and a flip-flop, which may be connected to the output of the LUT. By configuring the LUT and connecting LUTs and flip-flops the FPGA can be configured to perform the required operation. Many current FPGAs do not only have LUTs and flip-flops, but also complex integrated building blocks, e.g. multipliers. Some FPGAs even have DSP processor cores integrated on to the chip core, apart from the optimized IP-cores, i.e. files to configure the FPGA, supplied from most vendors. With the increasing sizes these FPGAs forms a foundation for implementing systems-on-chips (SoCs) on a single FPGA.
As the basic building blocks of ASICs and FPGAs are different, they have different cost measures in terms of area, speed, and power consumption. Furthermore, for ASICs in modern process technologies the chip area is often determined by the number of input and output connections, and, hence, as long as it is possible to fit the computational core inside the pad frame, the manufacturing cost is independent of the core area. For FPGAs the consumed area is often more critical as excessive area may require a change to the next larger (and more expensive) FPGA model. For speed it is important to be able to place pipelining stages, i.e., registers that divide the critical propagation path, at suitable cuts to efficiently utilize each LUT and flip-flop. Hence, dividing operations into blocks that can be well fitted into an FPGA has been an active research area. For example, it has been shown that bit-serial processing may be advantageous as a bit-serial adder or subtractor fit well in a LUT plus flip-flop. On the other hand, recent FPGAs have an extra block for the fast carry propagation required in bit-parallel arithmetic. This will improve the results for bit-parallel implementation compared with e.g. redundant arithmetic implementation, which is commonly used in ASICs.
The focus of the project will be on algorithms for digital filtering, including sample rate changes, transforms such as the discrete Fourier transform (DFT, often implemented using an FFT algorithm) and discrete cosine transform (DCT), and modulation and demodulation techniques.
It has recently been shown that the use of multiple constant multiplication (MCM) techniques, i.e., expressing the multiplications using shifts and additions/subtractions and utilizing the redundancy between coefficients that are multiplied by the same input data, are efficient when implementing FIR digital filters in FPGAs, often outperforming the vendors tools. This is an area where our research group has a long experience and where we recently have developed one of the best, if not the best, algorithms in terms of complexity. It is expected that it is possible to further optimize the results by considering bit-level optimizations. Furthermore, while most work so far has considered single rate FIR filters, it has been shown that similar approaches can be used for sample rate changes. Some work has also been done on color space converters for digital video using these techniques.
As the previous approaches compute one output sample per clock cycle (for single-rate FIR filters) they are suitable for high sample rate applications. However, for lower sample rates it is often possible to find time-multiplexed architectures that has a lower area complexity. The straightforward approach is to use one of the built-in multipliers if they are available and have suitable wordlength. Another approach related to the MCM problem is to introduce multiplexers among the shifts and additions/subtractions to reduce the circuit area. This may also be advantageous when combining several FIR filters where only one should be used, such as in multi-standard receivers and transmitters.
Another considered area is high-speed implementation of arithmetic functions for use in prototyping ASICs. As FPGAs are inherently slower it is important to find fast enough implementations in FPGAs. For communication systems this may include e.g. sine and cosine functions. Recently, we have proposed a number of methods for elementary function approximation. However, these have been aimed at ASIC implementation and therefore needs to be thoroughly evaluated and possibly modified for FPGA implementation.